Abstract
Natural language is the human expression of thought. The representation of natural language, and thus of thought, is fundamental to the field of artificial intelligence. This thesis explores that representation through weighted sums of embeddings. Embeddings are dense numerical vectors representing natural language components (e.g. words). Their sum is commonly overlooked as being too simple in comparison with sequence or tree representations. However, we find that on numerous real-world problems it is actually superior. This thesis demonstrates this capacity and explains why: the sum of embeddings is a particularly effective dimensionality-reduced representation of the crucial surface features of language.
Original language | English |
---|---|
Qualification | Doctor of Philosophy |
Awarding Institution | |
Supervisors/Advisors | |
Thesis sponsors | |
Award date | 16 Apr 2019 |
DOIs | |
Publication status | Unpublished - 2019 |
Embargo information
- Embargoed from 20/06/2019 to 20/06/2021. Made publicly available on 20/06/2021.