Dataset Open Access
Fadel, Ali; Musleh, Husam; Tuffaha, Ibraheem; Al-Ayyoub, Mahmoud; Jararweh, Yaser; Benkhelifa, Elhadj; Rosso, Paolo
<?xml version='1.0' encoding='utf-8'?> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:adms="http://www.w3.org/ns/adms#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dct="http://purl.org/dc/terms/" xmlns:dctype="http://purl.org/dc/dcmitype/" xmlns:dcat="http://www.w3.org/ns/dcat#" xmlns:duv="http://www.w3.org/ns/duv#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:frapo="http://purl.org/cerif/frapo/" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:gsp="http://www.opengis.net/ont/geosparql#" xmlns:locn="http://www.w3.org/ns/locn#" xmlns:org="http://www.w3.org/ns/org#" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:prov="http://www.w3.org/ns/prov#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:schema="http://schema.org/" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:vcard="http://www.w3.org/2006/vcard/ns#" xmlns:wdrs="http://www.w3.org/2007/05/powder-s#"> <rdf:Description rdf:about="https://doi.org/10.5281/zenodo.4059840"> <rdf:type rdf:resource="http://www.w3.org/ns/dcat#Dataset"/> <dct:type rdf:resource="http://purl.org/dc/dcmitype/Dataset"/> <dct:identifier rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">https://doi.org/10.5281/zenodo.4059840</dct:identifier> <foaf:page rdf:resource="https://doi.org/10.5281/zenodo.4059840"/> <dct:creator> <rdf:Description> <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/> <foaf:name>Fadel, Ali</foaf:name> <foaf:givenName>Ali</foaf:givenName> <foaf:familyName>Fadel</foaf:familyName> <org:memberOf> <foaf:Organization> <foaf:name>Jordan University of Science and Technology</foaf:name> </foaf:Organization> </org:memberOf> </rdf:Description> </dct:creator> <dct:creator> <rdf:Description> <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/> <foaf:name>Musleh, Husam</foaf:name> <foaf:givenName>Husam</foaf:givenName> <foaf:familyName>Musleh</foaf:familyName> <org:memberOf> <foaf:Organization> <foaf:name>Jordan University of Science and 
Technology</foaf:name> </foaf:Organization> </org:memberOf> </rdf:Description> </dct:creator> <dct:creator> <rdf:Description> <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/> <foaf:name>Tuffaha, Ibraheem</foaf:name> <foaf:givenName>Ibraheem</foaf:givenName> <foaf:familyName>Tuffaha</foaf:familyName> <org:memberOf> <foaf:Organization> <foaf:name>Jordan University of Science and Technology</foaf:name> </foaf:Organization> </org:memberOf> </rdf:Description> </dct:creator> <dct:creator> <rdf:Description> <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/> <foaf:name>Al-Ayyoub, Mahmoud</foaf:name> <foaf:givenName>Mahmoud</foaf:givenName> <foaf:familyName>Al-Ayyoub</foaf:familyName> <org:memberOf> <foaf:Organization> <foaf:name>Jordan University of Science and Technology</foaf:name> </foaf:Organization> </org:memberOf> </rdf:Description> </dct:creator> <dct:creator> <rdf:Description> <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/> <foaf:name>Jararweh, Yaser</foaf:name> <foaf:givenName>Yaser</foaf:givenName> <foaf:familyName>Jararweh</foaf:familyName> <org:memberOf> <foaf:Organization> <foaf:name>Duquesne University</foaf:name> </foaf:Organization> </org:memberOf> </rdf:Description> </dct:creator> <dct:creator> <rdf:Description> <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/> <foaf:name>Benkhelifa, Elhadj</foaf:name> <foaf:givenName>Elhadj</foaf:givenName> <foaf:familyName>Benkhelifa</foaf:familyName> <org:memberOf> <foaf:Organization> <foaf:name>Staffordshire University</foaf:name> </foaf:Organization> </org:memberOf> </rdf:Description> </dct:creator> <dct:creator> <rdf:Description> <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/> <foaf:name>Rosso, Paolo</foaf:name> <foaf:givenName>Paolo</foaf:givenName> <foaf:familyName>Rosso</foaf:familyName> <org:memberOf> <foaf:Organization> <foaf:name>Universitat Politècnica de València</foaf:name> </foaf:Organization> </org:memberOf> </rdf:Description> </dct:creator> 
<dct:title>Authorship Identification of SOurce COde 2020 (AI-SOCO)</dct:title> <dct:publisher> <foaf:Agent> <foaf:name>Zenodo</foaf:name> </foaf:Agent> </dct:publisher> <dct:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear">2020</dct:issued> <dcat:keyword>authorship identification</dcat:keyword> <dcat:keyword>source code</dcat:keyword> <dcat:keyword>ai-soco</dcat:keyword> <dcat:keyword>fire2020</dcat:keyword> <dcat:keyword>pan2020</dcat:keyword> <dcat:keyword>codeforces</dcat:keyword> <dct:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2020-05-30</dct:issued> <owl:sameAs rdf:resource="https://zenodo.org/record/4059840"/> <adms:identifier> <adms:Identifier> <skos:notation rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">https://zenodo.org/record/4059840</skos:notation> <adms:schemeAgency>url</adms:schemeAgency> </adms:Identifier> </adms:identifier> <dct:isVersionOf rdf:resource="https://doi.org/10.5281/zenodo.4059839"/> <owl:versionInfo>1.0</owl:versionInfo> <dct:description><p>Authorship identification in general is essential for detecting deceptive misuse of others' content and for exposing the owners of anonymous harmful content, in both cases by revealing the author of that content. <strong>A</strong>uthorship <strong>I</strong>dentification of <strong>SO</strong>urce <strong>CO</strong>de (AI-SOCO) focuses on uncovering the author who wrote a given piece of code. This helps address cheating in academic, work, and open source environments. It can also help identify the authors of malware.</p> <p>Detecting cheating in academic communities is important for properly attributing the contribution of each researcher. Likewise, in work environments, credit sometimes goes to people who do not deserve it. Similar plagiarism issues can arise in open source projects that are available on public platforms. 
Similarly, authorship identification could be used in public or private online coding contests, whether held as part of coding interviews or as official training contests, to detect cheating by applicants or contestants. Such a system could also play a major role in tracing the source of anonymous malicious software.</p> <p>The dataset is composed of source codes collected from the open submissions in the <a href="http://www.google.com/url?q=http%3A%2F%2Fcodeforces.com%2F&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNHKGPIjzjl6ujCm0t4EU_waJWvU-Q">Codeforces</a> online judge. Codeforces is an online judge that hosts competitive programming contests, each consisting of multiple problems to be solved by the participants. A Codeforces participant solves a problem by writing a solution in any of the programming languages available on the website and then submitting it through the website. The verdict for a solution can be correct (accepted) or incorrect (wrong answer, time limit exceeded, etc.).</p> <p>In our dataset, we selected 1,000 users and collected 100 source codes from each one, for a total of 100,000 source codes. All collected source codes are correct, bug-free, compile-ready, and written in C++ (various versions of the language). 
For each user, the collected source codes are solutions to distinct problems.</p> <p>Given the pre-defined set of source codes and their authors, the task is to build a system that determines which of these authors wrote a given, previously unseen source code.</p> <p>Dataset website: https://sites.google.com/view/ai-soco-2020.</p></dct:description> <dct:accessRights rdf:resource="http://publications.europa.eu/resource/authority/access-right/PUBLIC"/> <dct:accessRights> <dct:RightsStatement rdf:about="info:eu-repo/semantics/openAccess"> <rdfs:label>Open Access</rdfs:label> </dct:RightsStatement> </dct:accessRights> <dcat:distribution> <dcat:Distribution> <dct:license rdf:resource="https://creativecommons.org/licenses/by/4.0/legalcode"/> <dcat:accessURL rdf:resource="https://doi.org/10.5281/zenodo.4059840"/> </dcat:Distribution> </dcat:distribution> <dcat:distribution> <dcat:Distribution> <dcat:accessURL rdf:resource="https://doi.org/10.5281/zenodo.4059840"/> <dcat:byteSize>83164902</dcat:byteSize> <dcat:downloadURL rdf:resource="https://zenodo.org/record/4059840/files/ai-soco-2020.zip"/> <dcat:mediaType>application/zip</dcat:mediaType> </dcat:Distribution> </dcat:distribution> </rdf:Description> </rdf:RDF>
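The DCAT record above can be read programmatically. The following is a minimal sketch, not an official client: it uses Python's standard `xml.etree.ElementTree` to pull the title, creator names, keywords, and download URL out of such a record. The `summarize` function and the trimmed `SAMPLE` string are hypothetical names introduced for illustration, and the sketch assumes the export is well-formed XML (in particular, that any HTML inside `dct:description` is escaped).

```python
import xml.etree.ElementTree as ET

# Namespace prefixes copied from the record above.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dct": "http://purl.org/dc/terms/",
    "dcat": "http://www.w3.org/ns/dcat#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Trimmed, hypothetical excerpt of the record (two of the seven creators).
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dct="http://purl.org/dc/terms/"
         xmlns:dcat="http://www.w3.org/ns/dcat#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <rdf:Description rdf:about="https://doi.org/10.5281/zenodo.4059840">
    <dct:creator><rdf:Description>
      <foaf:name>Fadel, Ali</foaf:name>
    </rdf:Description></dct:creator>
    <dct:creator><rdf:Description>
      <foaf:name>Rosso, Paolo</foaf:name>
    </rdf:Description></dct:creator>
    <dct:title>Authorship Identification of SOurce COde 2020 (AI-SOCO)</dct:title>
    <dcat:keyword>codeforces</dcat:keyword>
    <dcat:distribution><dcat:Distribution>
      <dcat:byteSize>83164902</dcat:byteSize>
      <dcat:downloadURL rdf:resource="https://zenodo.org/record/4059840/files/ai-soco-2020.zip"/>
    </dcat:Distribution></dcat:distribution>
  </rdf:Description>
</rdf:RDF>
"""


def summarize(rdf_xml: str) -> dict:
    """Extract a few descriptive fields from a DCAT RDF/XML dataset record."""
    root = ET.fromstring(rdf_xml)
    # The dataset is the first rdf:Description directly under rdf:RDF.
    dataset = root.find("rdf:Description", NS)
    # Attributes are addressed in Clark notation: {namespace-uri}local-name.
    resource_attr = f"{{{NS['rdf']}}}resource"
    return {
        "title": dataset.findtext("dct:title", namespaces=NS),
        "creators": [
            name.text
            for name in dataset.findall("dct:creator/rdf:Description/foaf:name", NS)
        ],
        "keywords": [kw.text for kw in dataset.findall("dcat:keyword", NS)],
        "download_urls": [
            url.get(resource_attr)
            for url in dataset.findall(
                "dcat:distribution/dcat:Distribution/dcat:downloadURL", NS
            )
        ],
    }


if __name__ == "__main__":
    for field, value in summarize(SAMPLE).items():
        print(f"{field}: {value}")
```

Passing an explicit prefix map keeps the path expressions readable and independent of whichever prefixes a particular export happens to declare.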
Metric | All versions | This version |
---|---|---|
Views | 303 | 303 |
Downloads | 73 | 73 |
Data volume | 6.1 GB | 6.1 GB |
Unique views | 270 | 270 |
Unique downloads | 55 | 55 |