Project deliverable Open Access
This report constitutes SSHOC Deliverable 9.7 Design and Planning of Knowledge Graph in Electoral Studies. It describes the approach as well as the implementation plan for the SSHOC Pilot project in the field of electoral studies, which will constitute Deliverable 9.9.
After a brief introduction that sets out the purpose of this report, Section 2 elaborates the substantive domain to be covered by the Knowledge Graph (KG) and then develops the concept of a KG and the reasons for developing such infrastructural tools. The substantive domain lies within the field of Electoral Studies, which was described in detail in Deliverable 9.6 Demarcation Report of Electoral Studies User Community. For reasons of practicality, this domain has been further specified in two steps. First it was specified as pertaining to the (sub)field of citizen/voter behavior, and subsequently it was narrowed down to the field of (studies of) electoral participation. These choices imply that the KG to be developed will not cover the entire field of electoral studies, but that it will be centrally located within the wider field, so that its relevance for end-users is not restricted to those who are primarily interested in electoral participation.
The intended functionality of the pilot-KG is discussed in Section 3. The section defines the intended audience of end-users as, in the first instance, scholars engaged in empirical research, but the KG is expected in its later development stages to be of relevance also to journalists, think tanks, government agencies, corporations, political parties and politicians, and individual citizens. The functionalities sought relate to (a) focussed searching based on domain-specific criteria not available in other search tools; (b) teaching; and (c) research.
Section 3 also discusses the main datasets and kinds of publications to be covered by the pilot-KG. The overwhelming majority of these datasets are in the public domain, while a considerable part of the relevant publications is not (at least not for the first years after publication). This poses challenges that will be addressed in the work program of Deliverable 9.9.
Section 4 discusses how the team developing the pilot-KG plans to involve the user community of electoral studies. Volunteers will be recruited via relevant scientific conferences (which are identified in Section 4) and via the authorship of publications in relevant scientific journals; they will be mainly tasked with pre-structured coding and testing. In addition, smaller groups of community members will be personally invited, based on their expertise and their willingness to commit somewhat more of their time, to assist in the development of an ontology and coding schemes and in the design of testing phases.
Section 5 discusses the development of an ontology, which is necessary in view of the de facto absence of ontologies, classification schemes or controlled dictionaries able to classify substantive content within the field of electoral studies. The section discusses approaches to ontology development and how these will be applied in developing the intended pilot-KG. It also discusses the governance and management of the KG and its ontology.
Section 6 presents technical specifications for processes such as data ingestion, data cleaning, data authoring, data linking, data enrichment, data provisioning and data analysis. It also discusses the technical environment of semantic middleware to be used (for which PoolParty was chosen).
Section 7 discusses testing and user-community involvement in that process.
Section 8 discusses post-delivery development issues, and Section 9 presents a plan in terms of tasks and timelines.
Bergman, Michael K. 2009. "The Fundamental Importance of Keeping an ABox and TBox Split", AI3, online at http://www.mkbergman.com/489/ontology-best-practices-for-data-driven-applications-part-2/
Bergman, Michael K. 2014. "Knowledge-based Artificial Intelligence", AI3, online at http://www.mkbergman.com/1816/knowledge-based-artificial-intelligence/
Brant, Kenneth and Svetlana Sicular. 2018. "Hype Cycle for Artificial Intelligence", Gartner Research, online at https://www.gartner.com/en/documents/3883863
Ehrlinger, Lisa and Wolfram Wöß. 2016. "Towards a Definition of Knowledge Graphs." SEMANTiCS (2016), online at http://ceur-ws.org/Vol-1695/paper4.pdf
Gligoric, Nenad and Martin Kaltenböck. 2018. "Enhanced data management techniques for real time logistics planning and scheduling", online at https://ec.europa.eu/research/participants/documents/downloadPublic?documentIds=080166e5c03e0e8b&appId=PPGMS
Pan, Jeff Z., Nico Matentzoglu, Caroline Jay, Markel Vigo and Yuting Zhao. 2017. "Understanding Author Intentions: Test Driven Knowledge Graph Construction". In: Pan, Jeff Z. et al. (eds) Reasoning Web: Logical Foundation of Knowledge Graph Construction and Query Answering. Reasoning Web 2016. Lecture Notes in Computer Science, vol 9885. Springer, Cham, online at https://doi.org/10.1007/978-3-319-49493-7_1
Van der Eijk, Cees. 2020. "SSHOC D9.6 Demarcation Report of Electoral Studies User Community". DOI: 10.5281/zenodo.3725823, online at https://zenodo.org/record/3725823#.XoSbSdMzajg