Towards a Multi-view Language Representation: A Shared Space of Discrete and Continuous Language Features

doi:10.5281/zenodo.3923625

Published June 30, 2019 | Version v1

Conference paper Open

Linguistic typology databases contain valuable knowledge of the distinguishing properties of different languages. Typically they contain sparse discrete features that are difficult to integrate into computational methods, and dense task-learned language vectors have emerged in response. To join both worlds, we compute a shared space between discrete (binary) and continuous features using canonical correlation analysis. We evaluate the new language representation against a concatenation baseline in typological feature prediction and in phylogenetic inference, obtaining promising results to explore further.
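As a rough illustration of the idea in the abstract, the sketch below projects a binary typological view and a continuous vector view of the same languages into a shared space with a regularized canonical correlation analysis. It is not the authors' implementation: the function name `cca_shared_space`, the regularization constant `reg`, the dimensions, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def cca_shared_space(X, Y, k, reg=1e-3):
    """Regularized CCA: map two views of the same rows into k shared dimensions.

    X: (n, p) discrete (binary) features; Y: (n, q) continuous vectors.
    Hypothetical helper for illustration only.
    """
    Xc = X - X.mean(axis=0)                      # center each view
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Covariances, with a small ridge term so sparse binary columns stay invertible
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(C):
        # Inverse matrix square root via the symmetric eigendecomposition
        w, V = np.linalg.eigh(C)
        return V @ np.diag(w ** -0.5) @ V.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the canonical correlations
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    A = Wx @ U[:, :k]                            # projection for the discrete view
    B = Wy @ Vt.T[:, :k]                         # projection for the continuous view
    return Xc @ A, Yc @ B, s[:k]

# Synthetic stand-ins for the two views (assumed data, not from the paper)
rng = np.random.default_rng(0)
n_langs = 60
latent = rng.normal(size=(n_langs, 3))
X_bin = (latent @ rng.normal(size=(3, 12))
         + 0.5 * rng.normal(size=(n_langs, 12)) > 0).astype(float)
Y_cont = latent @ rng.normal(size=(3, 8)) + 0.5 * rng.normal(size=(n_langs, 8))

Zx, Zy, corrs = cca_shared_space(X_bin, Y_cont, k=2)
```

Here `Zx` and `Zy` are the two views expressed in the shared space, and `corrs` holds the (regularized) canonical correlations in descending order. A concatenation baseline like the one mentioned in the abstract would instead stack `X_bin` and `Y_cont` column-wise without learning any projection.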

Files

Name	Size
_TyP_NLP_2019__Towards_a_Multi_view_Language_Representation.pdf (md5:0d407619b9768f55c8b39852eb95e8f1)	190.3 kB

Additional details

GoURMET – Global Under-Resourced MEdia Translation 825299: European Commission

	All versions	This version
Views	46	46
Downloads	16	16
Data volume	3.2 MB	3.2 MB

Resource type

Conference paper

Publisher

Zenodo

Languages

English

Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: June 30, 2020
Modified: July 19, 2024
