Dataset Open Access

Palmetto position storing Lucene index of Dutch Wikipedia

van der Zwaan, Janneke M.; Marx, Maarten; Kamps, Jaap


DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="DOI">10.5281/zenodo.46377</identifier>
  <creators>
    <creator>
      <creatorName>van der Zwaan, Janneke M.</creatorName>
      <givenName>Janneke M.</givenName>
      <familyName>van der Zwaan</familyName>
      <affiliation>Netherlands eScience Center</affiliation>
    </creator>
    <creator>
      <creatorName>Marx, Maarten</creatorName>
      <givenName>Maarten</givenName>
      <familyName>Marx</familyName>
      <affiliation>University of Amsterdam</affiliation>
    </creator>
    <creator>
      <creatorName>Kamps, Jaap</creatorName>
      <givenName>Jaap</givenName>
      <familyName>Kamps</familyName>
      <affiliation>University of Amsterdam</affiliation>
    </creator>
  </creators>
  <titles>
    <title>Palmetto position storing Lucene index of Dutch Wikipedia</title>
  </titles>
  <publisher>Zenodo</publisher>
  <publicationYear>2016</publicationYear>
  <subjects>
    <subject>topic modeling</subject>
    <subject>topic coherence</subject>
    <subject>Palmetto</subject>
    <subject>Dutch</subject>
    <subject>Wikipedia</subject>
  </subjects>
  <dates>
    <date dateType="Issued">2016-02-22</date>
  </dates>
  <resourceType resourceTypeGeneral="Dataset"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/46377</alternateIdentifier>
  </alternateIdentifiers>
  <rightsList>
    <rights rightsURI="http://creativecommons.org/licenses/by-sa/4.0/legalcode">Creative Commons Attribution Share Alike 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">&lt;p&gt;Dutch-language resource for calculating topic coherence with Palmetto [1, 2]. The dataset is a position-storing Lucene index of the Dutch Wikipedia [3]. It was created in the context of the Netherlands eScience Center Dilipad project [4]. The PDF file contains the results of a case study showing that NPMI is the best topic coherence measure for topics consisting of Dutch nouns.&lt;/p&gt;

&lt;p&gt;More details can be found in the README.&lt;/p&gt;

&lt;p&gt;[1] M. Roeder, A. Both, and A. Hinneburg. Exploring the space of topic coherence measures. In &lt;em&gt;Proceedings of the Eighth ACM International Conference on Web Search and Data Mining&lt;/em&gt;, pages 399&amp;ndash;408, 2015.&lt;/p&gt;

&lt;p&gt;[2] http://aksw.org/Projects/Palmetto.html&lt;/p&gt;

&lt;p&gt;[3] https://dumps.wikimedia.org/nlwiki/20151102/&lt;/p&gt;

&lt;p&gt;[4] https://www.esciencecenter.nl/project/dilipad&lt;/p&gt;</description>
  </descriptions>
</resource>
                   All versions   This version
Views              3,252          3,252
Downloads          49             49
Data volume        6.6 GB         6.6 GB
Unique views       3,239          3,239
Unique downloads   37             37
