Compounding-aware Word Embedding for Improved Semantic Representation

Bhat, Shripad Anant

dc.contributor.advisor	Majumder, Prasenjit
dc.contributor.author	Bhat, Shripad Anant
dc.date.accessioned	06-05-2022T06:11:53Z
dc.date.available	2023-02-18T06:11:53Z
dc.date.issued	2021
dc.identifier.citation	Bhat, Shripad Anant (2021). Compounding-aware Word Embedding for Improved Semantic Representation. Dhirubhai Ambani Institute of Information and Communication Technology. viii, 43 p. (Acc.No: T00932)
dc.identifier.uri	http://drsr.daiict.ac.in//handle/123456789/993
dc.description.abstract	Existing word embedding approaches may not adequately capture the inherent complexities of a language, such as word compounding. While a class of data-driven approaches has been shown to be effective at embedding words in languages with relatively simple inflection and compounding characteristics (e.g. English), how to integrate language-specific characteristics into an embedding model remains an open area of investigation. In this work, we explore how words in a highly agglutinative language, e.g. German, can be embedded more effectively by additionally taking into account the contexts around the constituents of a compound word. We propose a word-transformation-based generalization of the skip-gram algorithm to model these relationships between a compound word and its constituents. Our experiments on standard German word-pair similarity datasets and on polarity classification of German compounds confirm our hypothesis that modeling contextual relationships between a compound word and its constituents can improve word representations.
dc.publisher	Dhirubhai Ambani Institute of Information and Communication Technology
dc.subject	word embedding
dc.subject	compound words
dc.classification.ddc	025.04 BHA
dc.title	Compounding-aware Word Embedding for Improved Semantic Representation
dc.type	Dissertation
dc.degree	M. Tech
dc.student.id	201911003
dc.accession.number	T00932

Files in this item

Name: 201911003_SHRIPAD_ANANT_BHAT_M ...
Size: 2.797 MB
Format: PDF

This item appears in the following Collection(s)

M Tech Dissertations [923]
