Dataset Open Access

Palmetto position storing Lucene index of Dutch Wikipedia

van der Zwaan, Janneke M.; Marx, Maarten; Kamps, Jaap

DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="" xmlns="" xsi:schemaLocation="">
  <identifier identifierType="DOI">10.5281/zenodo.46377</identifier>
      <creatorName>van der Zwaan, Janneke M.</creatorName>
      <givenName>Janneke M.</givenName>
      <familyName>van der Zwaan</familyName>
      <affiliation>Netherlands eScience Center</affiliation>
      <creatorName>Marx, Maarten</creatorName>
      <affiliation>University of Amsterdam</affiliation>
      <creatorName>Kamps, Jaap</creatorName>
      <affiliation>University of Amsterdam</affiliation>
    <title>Palmetto position storing Lucene index of Dutch Wikipedia</title>
    <subject>topic modeling</subject>
    <subject>topic coherence</subject>
    <date dateType="Issued">2016-02-22</date>
  <resourceType resourceTypeGeneral="Dataset"/>
    <alternateIdentifier alternateIdentifierType="url"></alternateIdentifier>
    <rights rightsURI="">Creative Commons Attribution Share Alike 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
    <description descriptionType="Abstract">&lt;p&gt;Dutch language resource for calculating topic coherence with Palmetto [1, 2]. The dataset is a position storing Lucene index of the Dutch Wikipedia [3]. It was created in the context of the Netherlands eScience Center Dilipad project [4]. The PDF file contains the results of a case study showing that NPMI is the best topic coherence measure for topics consisting of Dutch nouns.&lt;/p&gt;

&lt;p&gt;More details can be found in the README.&lt;/p&gt;

&lt;p&gt;[1] M. Roeder, A. Both, and A. Hinneburg. Exploring the space of topic coherence measures. In &lt;em&gt;Proceedings of the Eighth ACM International Conference on Web Search and Data Mining&lt;/em&gt;, pages 399&amp;ndash;408, 2015.&lt;/p&gt;



                  All versions   This version
Views             4,034          4,035
Downloads         186            186
Data volume       22.4 GB        22.4 GB
Unique views      3,989          3,990
Unique downloads  150            150

