Dataset Open Access
Burattin, Andrea
<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/publicdomain/zero/1.0/legalcode</subfield>
    <subfield code="a">Creative Commons Zero v1.0 Universal</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2015-03-16</subfield>
  </datafield>
  <controlfield tag="005">20200124192432.0</controlfield>
  <controlfield tag="001">19187</controlfield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="o">oai:zenodo.org:19187</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a"><p>This file&nbsp;contains two datasets.</p>
      <p><strong>1. Periodical Sudden Drifts</strong></p>
      <p>For this case study, we have generated two synthetic logs (<span class="math-tex">\(\mathcal{L}_1\)</span>&nbsp;and <span class="math-tex">\(\mathcal{L}_2\)</span>) by modeling two variants of the insurance claim process described in [1] in CPN Tools&nbsp;and by simulating the models.&nbsp;<span class="math-tex">\(\mathcal{L}_1\)</span> contains 14,840 events and <span class="math-tex">\(\mathcal{L}_2\)</span> contains 16,438 events.</p>
      <p>We merged the logs (eight alternations of <span class="math-tex">\(\mathcal{L}_1\)</span> and <span class="math-tex">\(\mathcal{L}_2\)</span>) using the <em>Stream Package</em>&nbsp;in ProM&nbsp;(the source code of the package is publicly available at https://svn.win.tue.nl/repos/prom/Packages/Stream/Trunk).&nbsp;The same package has been used to transform the resulting log into an event stream. The event stream contains 250,224 events and has several sudden concept drifts (one for every switch from <span class="math-tex">\(\mathcal{L}_1\)</span> to <span class="math-tex">\(\mathcal{L}_2\)</span>).</p>
      <p><strong>2. Gradual Drifts</strong></p>
      <p>We have considered two variants of the insurance claim process described in [1],&nbsp;<span class="math-tex">\(\mathcal{M}_1'\)</span>&nbsp;(with 21 activities) and <span class="math-tex">\(\mathcal{M}_2'\)</span>&nbsp;(with 19 activities). We have also designed 6 additional models&nbsp;<span class="math-tex">\(\mathcal{M}_a,\dots, \mathcal{M}_f\)</span>&nbsp;to represent the intermediate steps to go from <span class="math-tex">\(\mathcal{M}_1'\)</span>&nbsp;to <span class="math-tex">\(\mathcal{M}_2'\)</span>.&nbsp;Therefore, <span class="math-tex">\(\mathcal{M}_1'\)</span> and <span class="math-tex">\(\mathcal{M}_a\)</span>&nbsp;are very similar and the same happens for <span class="math-tex">\(\mathcal{M}_a\)</span> compared to <span class="math-tex">\(\mathcal{M}_b\)</span>, for <span class="math-tex">\(\mathcal{M}_b\)</span> compared to <span class="math-tex">\(\mathcal{M}_c\)</span>, and so on.&nbsp;We have simulated these models generating 8 logs (<span class="math-tex">\(\mathcal{L}_1', \mathcal{L}_a, \dots,\mathcal{L}_f, \mathcal{L}_2'\)</span>). <span class="math-tex">\(\mathcal{L}_1'\)</span>&nbsp;contains 139,938 events, <span class="math-tex">\(\mathcal{L}_2'\)</span>&nbsp;contains 128,696 events and <span class="math-tex">\(\mathcal{L}_a,\dots,\mathcal{L}_f\)</span>&nbsp;contain 77,231 events (altogether).</p>
      <p>Using the <em>Stream Package</em>, we have generated an event stream containing 345,865 events.</p>
      <p>&nbsp;</p>
      <p><strong>References</strong></p>
      <ol>
        <li>R. J. C. Bose, &ldquo;Process Mining in the Large: Preprocessing, Discovery,&nbsp;and Diagnostics,&rdquo; Ph.D. dissertation, Eindhoven University of&nbsp;Technology, 2012.</li>
      </ol></subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">8008449</subfield>
    <subfield code="z">md5:de14108430e1eddd4c78165c4b314953</subfield>
    <subfield code="u">https://zenodo.org/record/19187/files/2015-TSC.zip</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">University of Innsbruck</subfield>
    <subfield code="a">Burattin, Andrea</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">process mining</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">event log</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">event stream</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">artificial dataset</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.19187</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Artificial datasets for online Declare discovery</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
</record>
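The event counts quoted in the description can be cross-checked with simple arithmetic. The sketch below is not part of the dataset or of the Stream Package. It assumes that each of the eight alternations replays one full copy of L1 followed by one full copy of L2, and that the gradual-drift stream is the plain concatenation of the eight simulated logs. The function and constant names are hypothetical, chosen only for illustration.

```python
# Event counts as stated in the record description.
L1_EVENTS = 14_840
L2_EVENTS = 16_438
ALTERNATIONS = 8  # assumption: one alternation = one full L1 followed by one full L2

# Gradual-drift logs; the six intermediate logs are only reported as a combined total.
GRADUAL_LOGS = {
    "L1'": 139_938,
    "La..Lf (combined)": 77_231,
    "L2'": 128_696,
}


def sudden_drift_stream_size(l1=L1_EVENTS, l2=L2_EVENTS, alternations=ALTERNATIONS):
    """Size of the merged stream if every alternation contains L1 and L2 once."""
    return alternations * (l1 + l2)


def gradual_drift_stream_size(logs=GRADUAL_LOGS):
    """Size of the stream if all simulated logs are concatenated without loss."""
    return sum(logs.values())


print(sudden_drift_stream_size())   # 250224
print(gradual_drift_stream_size())  # 345865
```

Under these assumptions both totals agree with the stream sizes stated in the description (250,224 and 345,865 events).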
Statistic | All versions | This version |
---|---|---|
Views | 163 | 163 |
Downloads | 14 | 14 |
Data volume | 112.1 MB | 112.1 MB |
Unique views | 154 | 154 |
Unique downloads | 14 | 14 |