Other Open Access

Type-driven distributional semantics for prepositional phrase attachment

Delpeuch, Antonin


DCAT Export

<?xml version='1.0' encoding='utf-8'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:adms="http://www.w3.org/ns/adms#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dct="http://purl.org/dc/terms/" xmlns:dctype="http://purl.org/dc/dcmitype/" xmlns:dcat="http://www.w3.org/ns/dcat#" xmlns:duv="http://www.w3.org/ns/duv#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:frapo="http://purl.org/cerif/frapo/" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:gsp="http://www.opengis.net/ont/geosparql#" xmlns:locn="http://www.w3.org/ns/locn#" xmlns:org="http://www.w3.org/ns/org#" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:prov="http://www.w3.org/ns/prov#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:schema="http://schema.org/" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:vcard="http://www.w3.org/2006/vcard/ns#" xmlns:wdrs="http://www.w3.org/2007/05/powder-s#">
  <rdf:Description rdf:about="https://doi.org/10.5281/zenodo.1299049">
    <rdf:type rdf:resource="http://www.w3.org/ns/dcat#Dataset"/>
    <dct:type rdf:resource="http://purl.org/dc/dcmitype/Text"/>
    <dct:identifier rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">https://doi.org/10.5281/zenodo.1299049</dct:identifier>
    <foaf:page rdf:resource="https://doi.org/10.5281/zenodo.1299049"/>
    <dct:creator>
      <rdf:Description rdf:about="http://orcid.org/0000-0002-8612-8827">
        <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/>
        <dct:identifier rdf:datatype="http://www.w3.org/2001/XMLSchema#string">0000-0002-8612-8827</dct:identifier>
        <foaf:name>Delpeuch, Antonin</foaf:name>
        <foaf:givenName>Antonin</foaf:givenName>
        <foaf:familyName>Delpeuch</foaf:familyName>
      </rdf:Description>
    </dct:creator>
    <dct:title>Type-driven distributional semantics for prepositional phrase attachment</dct:title>
    <dct:publisher>
      <foaf:Agent>
        <foaf:name>Zenodo</foaf:name>
      </foaf:Agent>
    </dct:publisher>
    <dct:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear">2015</dct:issued>
    <dct:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2015-01-01</dct:issued>
    <owl:sameAs rdf:resource="https://zenodo.org/record/1299049"/>
    <adms:identifier>
      <adms:Identifier>
        <skos:notation rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">https://zenodo.org/record/1299049</skos:notation>
        <adms:schemeAgency>url</adms:schemeAgency>
      </adms:Identifier>
    </adms:identifier>
    <dct:isVersionOf rdf:resource="https://doi.org/10.5281/zenodo.1299048"/>
    <dct:description>Combining the strengths of distributional and logical semantics of natural language is a problem that has gained a lot of attention recently. We focus here on the distributional compositional framework of Coecke et al. (2011), which brings syntax-driven compositionality to word vectors. Using type-driven grammars, they propose a method to translate the syntactic structure of any sentence into a series of algebraic operations combining the individual word meanings into a sentence representation. My contribution to these semantics is twofold. First, I propose a new approach to tackle the dimensionality issues this model yields. One of the major hurdles to applying this composition technique to arbitrary sentences is indeed the large number of parameters to be stored and manipulated. This is due to the use of tensors, whose dimensions grow exponentially with the number of types involved in the syntax. Going back to the category-theoretical roots of the model, I show how the use of diagrams can help reduce the number of parameters and adapt the composition operations to new sources of distributional information. Second, I apply this framework to a concrete problem: prepositional phrase attachment. As this form of syntactic ambiguity requires semantic information to be resolved, distributional methods are a natural choice to improve disambiguation algorithms, which usually consider words as discrete units. The attachment decision involves at least four different words, so it is interesting to see whether the categorical composition method can be used to combine their representations into useful information to predict the correct attachment. A byproduct of this work is a new dataset with enriched annotations, allowing for a more fine-grained decision problem than the traditional PP attachment problem.</dct:description>
    <dct:accessRights rdf:resource="http://publications.europa.eu/resource/authority/access-right/PUBLIC"/>
    <dct:accessRights>
      <dct:RightsStatement rdf:about="info:eu-repo/semantics/openAccess">
        <rdfs:label>Open Access</rdfs:label>
      </dct:RightsStatement>
    </dct:accessRights>
    <dcat:distribution>
      <dcat:Distribution>
        <dct:license rdf:resource="https://creativecommons.org/licenses/by/4.0/legalcode"/>
        <dcat:accessURL rdf:resource="https://doi.org/10.5281/zenodo.1299049"/>
      </dcat:Distribution>
    </dcat:distribution>
    <dcat:distribution>
      <dcat:Distribution>
        <dcat:accessURL>https://doi.org/10.5281/zenodo.1299049</dcat:accessURL>
        <dcat:byteSize>678483</dcat:byteSize>
        <dcat:downloadURL>https://zenodo.org/record/1299049/files/article.pdf</dcat:downloadURL>
        <dcat:mediaType>application/pdf</dcat:mediaType>
      </dcat:Distribution>
    </dcat:distribution>
  </rdf:Description>
</rdf:RDF>
61
164
views
downloads
All versions This version
Views 6161
Downloads 164165
Data volume 111.3 MB111.9 MB
Unique views 6161
Unique downloads 152153
