Technical note Open Access

A Dataset of Enterprise-Driven Open Source Software: Extended Description

Spinellis, Diomidis; Kotti, Zoe; Kravvaritis, Konstantinos; Theodorou, Georgios; Louridas, Panos


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">software engineering economics</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">software ecosystems</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">open source software in business</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Fortune Global 500</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">SEC 10-K</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">SEC 20-F</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">EDGAR</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
  <controlfield tag="005">20200422090657.0</controlfield>
  <controlfield tag="001">3742854</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Athens University of Economics and Business</subfield>
    <subfield code="0">(orcid)0000-0003-3816-9162</subfield>
    <subfield code="a">Kotti, Zoe</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Athens University of Economics and Business</subfield>
    <subfield code="0">(orcid)0000-0002-8889-0612</subfield>
    <subfield code="a">Kravvaritis, Konstantinos</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Athens University of Economics and Business</subfield>
    <subfield code="0">(orcid)0000-0001-5413-2189</subfield>
    <subfield code="a">Theodorou, Georgios</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Athens University of Economics and Business</subfield>
    <subfield code="0">(orcid)0000-0002-3971-4612</subfield>
    <subfield code="a">Louridas, Panos</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">446115</subfield>
    <subfield code="z">md5:f6634681f5c4c15570e25e064331654d</subfield>
    <subfield code="u">https://zenodo.org/record/3742854/files/indoss-extended.pdf</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2020-04-21</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="o">oai:zenodo.org:3742854</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Athens University of Economics and Business</subfield>
    <subfield code="0">(orcid)0000-0003-4231-1897</subfield>
    <subfield code="a">Spinellis, Diomidis</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">A Dataset of Enterprise-Driven Open Source Software: Extended Description</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">825328</subfield>
    <subfield code="a">Fine-Grained Analysis of Software Ecosystems as Networks</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;We present a dataset of open source software developed mainly by enterprises rather than volunteers. The dataset can be used to address known generalizability concerns and to perform research on open source business software development. Based on the premise that an enterprise&amp;#39;s employees are likely to contribute to a project developed by their organization using the email account provided by it, we mine domain names associated with enterprises from open data sources as well as through white- and blacklisting, and use them through three heuristics to identify 17,264 enterprise GitHub projects. We provide these as a dataset detailing their provenance and properties. A manual evaluation of a dataset sample shows an identification accuracy of 89%. Through an exploratory data analysis we found that projects are staffed by a plurality of enterprise insiders, who appear to be pulling more than their weight, and that in a small percentage of relatively large projects development happens exclusively through enterprise insiders.&lt;/p&gt;

&lt;p&gt;This technical note provides an extended description of a paper with the same name to appear in the &lt;em&gt;17th International Conference on Mining Software Repositories&lt;/em&gt; (MSR 2020).&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">compiles</subfield>
    <subfield code="a">10.5281/zenodo.3653878</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">documents</subfield>
    <subfield code="a">10.5281/zenodo.3742973</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isSupplementTo</subfield>
    <subfield code="a">10.1145/3379597.3387495</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">arxiv</subfield>
    <subfield code="i">isSupplementTo</subfield>
    <subfield code="a">arXiv:2002.03927</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isReferencedBy</subfield>
    <subfield code="a">10.1145/3379597.3387495</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">arxiv</subfield>
    <subfield code="i">isReferencedBy</subfield>
    <subfield code="a">arXiv:2002.03927</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.3742853</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.3742854</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">technicalnote</subfield>
  </datafield>
</record>
