On the surprising capacity of linear combinations of embeddings for natural language processing

Lyndon White

Research output: Thesis › Doctoral Thesis

Abstract

Natural language is the human expression of thought. The representation of natural language, and thus thought, is fundamental to the field of artificial intelligence. This thesis explores that representation through weighted sums of embeddings. Embeddings are dense numerical vectors representing natural language components (e.g. words). Their sum is commonly overlooked as too simple compared with sequence or tree representations. However, we find that on numerous real-world problems it is actually superior. This thesis demonstrates this capacity and explains why: the sum of embeddings is a particularly effective dimensionality-reduced representation of the crucial surface features of language.
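
As a rough illustration of what a weighted sum of embeddings means, the sketch below combines word vectors into a single fixed-size vector. It is not taken from the thesis: the vocabulary, the three-dimensional vectors, the weights, and the helper name `weighted_sum` are all invented for this example. It also shows the property that sets a sum apart from sequence or tree representations, namely that word order is discarded.

```python
# Toy 3-dimensional embeddings. The values are invented for illustration
# and do not come from the thesis or from any trained model.
EMBEDDINGS = {
    "cats": [1.0, 0.0, 2.0],
    "chase": [0.0, 1.0, 1.0],
    "mice": [2.0, 1.0, 0.0],
}


def weighted_sum(tokens, weights=None, embeddings=EMBEDDINGS):
    """Return sum_i weights[i] * embeddings[tokens[i]] as a list of floats.

    Hypothetical helper: with weights=None every token gets weight 1.0,
    which reduces to a plain sum of embeddings.
    """
    if weights is None:
        weights = [1.0] * len(tokens)
    if len(weights) != len(tokens):
        raise ValueError("need exactly one weight per token")
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for token, weight in zip(tokens, weights):
        vector = embeddings[token]
        for k in range(dim):
            total[k] += weight * vector[k]
    return total


sentence = ["cats", "chase", "mice"]
plain = weighted_sum(sentence)                       # unweighted sum
shuffled = weighted_sum(["mice", "chase", "cats"])   # same words, new order
weighted = weighted_sum(sentence, [2.0, 0.0, 1.0])   # arbitrary example weights

print(plain)              # [3.0, 2.0, 3.0]
print(shuffled == plain)  # True: the sum ignores word order
print(weighted)           # [4.0, 1.0, 4.0]
```

The output vector has the same dimensionality regardless of how many words are combined, which is the sense in which the sum acts as a dimensionality-reduced representation.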
Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • The University of Western Australia
Supervisors/Advisors
  • Togneri, Roberto, Supervisor
  • Liu, Wei, Supervisor
  • Bennamoun, Mohammed, Supervisor
Award date: 16 Apr 2019
Publication status: Unpublished - 2019

Embargo information

  • Embargoed from 20/06/2019 to 20/06/2021. Made publicly available on 20/06/2021.
