On using the core public sector vocabulary (CPSV) to publish a "citizen's guide" as linked data

A large number of public authorities worldwide publish descriptions of their administrative procedures (public services) in public service catalogues. These public service descriptions are often based on a national or an ad hoc description template (also referred to as a model). Moreover, they are usually not provided in a machine-readable format. At the end of 2016, the European Commission released CPSV-AP 2.0 (Core Public Service Vocabulary Application Profile), which is a Core Vocabulary for public service modeling and publishing that incorporates Linked Data technologies. In this paper, we present an approach for using CPSV-AP 2.0 to model and publish, as Linked Data, the public service descriptions of a regional public service catalogue. Subsequently, we discuss the challenges that we encountered during the modeling and publishing procedure.

the maturity of their online provision across the Member States [4]. It has been estimated that their online provision across the EU could contribute to high annual savings [5].
Public service (PS) modeling is a cornerstone of PS provision. Usually, PS modeling comprises the first step and the basis for the development of online Public Service Catalogues and/or eGovernment Information Systems.
In practice, the development of eGovernment Information Systems is usually based on ad hoc PS models or models included in national interoperability frameworks (e.g. the Greek eGIF [21]). In academic literature, a number of public service models have also been proposed, e.g. the Governmental Markup Language (GovML) [32], service model of the Governance Enterprise Architecture (GEA) [20], etc. Additionally, standardization bodies such as CEN and W3C promote the definition of commonly agreed standards for public service modeling towards facilitating eGovernment interoperability [12]. Furthermore, W3C has established the W3C EGOV Interest Group that is interested in advancing eGovernment through W3C technologies [13]. The development and use of a common public service model is among the issues of interest of this group [19].
Despite the significance of online PS provision and the plethora of public service models, both applied and theoretical, until recently a universally accepted standard (model) did not exist. The absence of such a standard wastes resources and hinders interoperability [16,28]. Additionally, it has been suggested that the introduction of unified public service models will bring a number of benefits, including reduced development costs, improved software quality, improved user experience and interoperability across different eGovernment Information Systems [31]. Furthermore, it has been argued that sharing common PS models would improve the analysis and development of eGovernment Information Systems, increase the efficiency of online PS provision, facilitate the adoption of public services by citizens and promote the reusability of public services [14,15,18]. Thus, the development and adoption of a relevant European standard is considered a prerequisite for eGovernment interoperability and cross-border pan-European PS provision, facilitating the vision of a Digital Single Market, which is a major priority of the European Union [6].
Identifying the need for a standard PS model, the European Commission (EC) launched in 2012 the Core Public Service Vocabulary (CPSV) initiative [11], in the framework of ISA (Interoperability Solutions for Public Administrations) and its successor ISA 2 (Interoperability solutions for public administrations, businesses and citizens) programmes [29], aiming at developing a simplified, reusable and extensible model that captures the fundamental characteristics of a service offered by public administrations. CPSV was released in 2013 aiming to become the de facto European Standard for public service modeling. Subsequently, at the end of 2016, EC announced the final version of CPSV-AP 2.0 [30], which is an application profile of CPSV. CPSV-AP 2.0 was developed by a working group that included representatives from all European Union (EU) Member States. CPSV-AP 2.0 incorporates Linked Data as underpinning technology.
The "Citizen's Guide of the Region of Epirus", which was initiated in July 2011, is an example of a regional PS catalogue. This catalogue was recognised as a "best practice" in the European Public Sector Award 2015 competition, which was organised by the European Institute of Public Administration (EIPA) with the support of the European Commission. Despite this recognition, the catalogue is based on an ad hoc PS model.
The objective of this paper is to present the use of CPSV-AP 2.0 in order to migrate the "Citizen's Guide" to linked data technology. The motivation is based on the assumption that exploiting linked data technology using a European Standard, namely the CPSV-AP 2.0, will provide value to all stakeholders, including citizens, policy makers, public organizations, etc. For example, PS modeling using Linked Data would enable PS visualizations and application of knowledge management and knowledge representation methodologies. Additionally, the use of a European standard would facilitate semantic interoperability across European public organizations for the provision of pan-European public services.
In the rest of this paper, we present the steps towards using CPSV-AP 2.0 to publish the PS descriptions included in the "Citizen's Guide of the Region of Epirus" as Linked Data. In section 2, we provide some background information, including a short description of CPSV, reflections from the literature on Linked Government Data and a short presentation of CPSV-AP 2.0. In section 3, the "Citizen's Guide" project is briefly presented. Subsequently, in section 4, the approach that was adopted for using CPSV-AP 2.0 is described. In section 5, our effort to model and publish the public services of the Region of Epirus using CPSV-AP 2.0 is summarized. In section 6, we discuss the main challenges, and in section 7 we draw our conclusions and outline plans for future work.

BACKGROUND

Core Public Service Vocabulary
The Core Public Service Vocabulary (CPSV) is a European Standard designed to model PS descriptions and bring consensus on the basic PS metadata across EU Member States. CPSV has not been designed to model every characteristic or property of public services across all European countries, domains and public organizations. Rather, CPSV is proposed as a basis for new PS models and as the linking element between existing PS models for achieving a minimum level of semantic interoperability. This is in line with the position paper of the European Commission [10], which suggests that "A Core Concept is a simplified data model that captures the minimal, global characteristics/attributes of an entity in a generic, country and domain neutral fashion. It can be represented as Core Vocabulary using different formalisms (e.g. XML, RDF, JSON)". Thus, new PS models can be designed based on CPSV, while existing PS models can be mapped to CPSV.

Linked Government Data
Linked data refers to "data published on the Web in such a way that it is machine-readable, its meaning is explicitly defined, it is linked to other external data sets, and can in turn be linked to from external data sets" [2]. Linked Open Data (LOD) technology aspires to realize the Web of Data, where data exist as self-descriptive entities. Here, data are not integrated with applications but rather exist autonomously and can be interlinked through typed links. Linked data technology exploits Uniform Resource Identifiers (URIs) and the Resource Description Framework (RDF) to describe data and their interrelations with other data [9]. In recent years, there has been an increasing trend to publish data as linked data so that they are more easily shareable and reusable [17]. The Web of Data aims at interconnecting isolated data islands (data silos) into a giant distributed dataset, also called the linked open data (LOD) cloud. Thus, LOD technology transforms fragmented datasets, which exist in data silos, into a global data space that can be visualized as a graph. Public sector linked data hubs could contribute to the sharing, reuse, federation and integration of Public Sector Information. Furthermore, LOD technology enables public organizations to publish their data in a structured and modular format, facilitating systematic data and knowledge management of the resulting semantic database. Thus, the use of linked data could potentially be an integration paradigm for public services and organizations, offering significant benefits to citizens (individuals and enterprises) as well as public organizations (including civil servants and policy makers).
However, the benefits of publishing public services descriptions as linked data are counterbalanced by the difficulties of the publishing procedure. The literature suggests that although linked data technology and standards are mature and powerful there is still much work to be done on formalizing the process of publishing linked government data by developing straightforward patterns that could be easily adopted by public organizations or institutes [27]. Additionally, it is pointed out that there are no systematic guidelines for publishing Government Linked Data, including technical details about all the steps of the publishing process [1].
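To make the notion of typed links between URIs more concrete, the following minimal Python sketch represents a public service as RDF-style triples and serializes them as N-Triples. It is an illustration only: every `example.org` and `example.com` URI is hypothetical, the `cpsv` namespace is assumed to be `http://purl.org/vocab/cpsv#`, and the service title is invented for the example.

```python
# Minimal sketch: a public service as (subject, predicate, object) triples.
# All example.org / example.com URIs are hypothetical placeholders.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCT = "http://purl.org/dc/terms/"      # Dublin Core terms
CPSV = "http://purl.org/vocab/cpsv#"   # assumed CPSV namespace
BASE = "http://example.org/epirus/"    # hypothetical base URI


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text, lang="en"):
    """Build a language-tagged literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


service = uri(BASE + "ps/1")
triples = [
    # The service is typed with a class from a shared vocabulary.
    (service, uri(RDF_TYPE), uri(CPSV + "PublicService")),
    # A literal value describing the service (invented title).
    (service, uri(DCT + "title"), literal("Issuing of a licence")),
    # A typed link to a resource in another (hypothetical) dataset.
    (service, uri(DCT + "spatial"), uri("http://example.com/places/epirus")),
]


def to_ntriples(statements):
    """Serialize the triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in statements)


print(to_ntriples(triples))
```

The third triple is what distinguishes linked data from an isolated record: its object is a URI maintained elsewhere, so the description can be followed into, and linked to from, external datasets.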

Core Public Service Vocabulary Application Profile 2.0
CPSV-AP 2.0 is the Application Profile, version 2.0, of CPSV. An Application Profile is "a specification that re-uses terms from one or more base standards, adding more specificity by identifying mandatory, recommended and optional elements to be used for a particular application, as well as recommendations for controlled vocabularies to be used" [30]. CPSV-AP 2.0 was released at the end of 2016 and incorporates Linked Data technology. For example, it reuses many of the classes and properties of well-established (persistent) Semantic Web vocabularies, including other European Commission (EC) ISA Core Vocabularies, e.g. the Core Public Organisation Vocabulary (CPOV). The use of well-known vocabularies to represent data and their real-life meaning is the key factor that turns Linked Data into semantic data [22]. A PS model is compliant with the CPSV-AP 2.0 model if it includes at least information for the mandatory properties of the mandatory classes. In figure 1, the mandatory classes and properties are shown in blue (or dark grey). There are two mandatory classes and seven mandatory properties of these classes. If an optional class is included in a PS model, then its mandatory properties should also be included in the model.
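The compliance rule above (mandatory properties of mandatory classes) can be sketched as a simple check. The class and property names in `MANDATORY` are an assumption made for illustration, chosen so that they add up to two classes and seven properties; the authoritative list is the one given in the CPSV-AP 2.0 specification [30].

```python
# Minimal sketch of the compliance rule. The names below are assumed for
# illustration (two classes, seven properties); consult the specification
# for the authoritative list.
MANDATORY = {
    "PublicService": ["identifier", "name", "description", "hasCompetentAuthority"],
    "PublicOrganisation": ["identifier", "preferredLabel", "spatial"],
}


def missing_mandatory(model, mandatory=MANDATORY):
    """Return a list of problems; an empty list means the check passed.

    `model` maps a class name to a list of instances, each instance being
    a dict from property name to value.
    """
    problems = []
    for cls, properties in mandatory.items():
        instances = model.get(cls, [])
        if not instances:
            problems.append(f"{cls}: no instance")
            continue
        for index, instance in enumerate(instances):
            for prop in properties:
                if not instance.get(prop):
                    problems.append(f"{cls}[{index}]: missing {prop}")
    return problems


model = {
    "PublicService": [{
        "identifier": "ps-1",
        "name": "Issuing of a licence",
        "hasCompetentAuthority": "org-1",
    }],
    "PublicOrganisation": [{
        "identifier": "org-1",
        "preferredLabel": "Region of Epirus",
        "spatial": "Epirus",
    }],
}
print(missing_mandatory(model))
```

In this sample the service has no description, so the check reports exactly one problem. The rule for optional classes could be handled the same way by adding their mandatory properties to the table whenever such a class appears in the model.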

THE CITIZEN'S GUIDE PROJECT
The Citizen's Guide of the Region of Epirus is a structured catalogue of descriptions of public services provided by the Region of Epirus, which is in Greece [3].
The methodology used to develop the guide was as follows. Firstly, a template was created for describing each public service in a structured manner. The information collected using this template constitutes the profile of each service. The template includes fields that contain important information, such as the cost for citizens, the completion time, the relevant legal framework, the steps for the completion of the procedure, the supporting documents, the validity time, etc. All public services are grouped by Directorate General of the Region of Epirus and, at a second level, by thematic category. All information is provided to citizens through a website specifically designed for that purpose. Currently, the Citizen's Guide of the Region of Epirus includes about 250 public services.
The template of a Public Service description is shown in table 1 (the names of the fields have been translated into English from the "Citizen's Guide of the Region of Epirus", where they appear in Greek).

Table 1: Template of a Public Service description
Field | Description
… | The produced certificate(s), document(s) or other output is described.
Validity time of the output document(s) | The time period during which the produced output is in effect is included.
Renewal procedure (if exists) | Optional field. The renewal procedure (if exists) is described.
Competent organizational unit | The organizational unit(s) that are responsible for the execution of the PS are included.
Additional Sources of Information | Optional field. Any sources providing relevant information are included.
Notes | Optional field. Any additional notes, relevant to the PS, are included.
Relevant Files | The citizen's application or any relevant document, for example legal documents, are grouped and uploaded as a .zip file.
Some of the benefits for citizens are the following:
• Direct and easier access to information for all citizens, especially for people with mobility difficulties (e.g. disabled people, residents of inaccessible areas, etc.).
• Reduced commuting for citizens, which saves time and money and also reduces traffic congestion, a major problem mostly for large urban centres.
• Increased transparency and sense of trust in public service provision.
• Increased participation of citizens in assessing and improving administrative procedures, promoted by incorporating a tool for communicating with the Region and a tool for assessing each public service description.

The benefits for the public administration (Region of Epirus) include:
• Reduced congestion at the authority caused by the large volume of people waiting to be served. As a result, the time of civil servants is better utilized and their productivity is increased.
• Increased cooperation between the organizational units of public authorities for improving public services.
• Development of a solid basis to further model and reengineer public services.
• The public service descriptions could be used as core material for the development of relevant transactional applications for these public services.
• Public services become part of the Region's Total Quality Management (TQM) methodology, which could potentially be implemented in the future.
In addition, the Citizen's Guide can be used as a platform for promoting the cooperation between all Regions of the country as many of the administrative processes are common to all regions.
Finally, it is worth noting that the implementation team of Citizen's Guide was composed exclusively of civil servants whereas Free/Open Source software was used. As a result, the implementation, maintenance and support costs were kept very low. The implementation period of the Citizen's Guide was about two and a half years, specifically from the summer of 2011 until the beginning of 2014.

MOVING INTO CPSV AND LINKED DATA: THE APPROACH FOLLOWED
In the framework of the ISA programme and its successor, the ISA 2 programme, a procedure for public service modeling based on CPSV-AP 2.0 and the transformation to linked data has been proposed [23]. To support this procedure, a number of software tools have been developed. From this toolbox, in our work, we have used the following tools:
• CPSV-AP 2.0 Template Spreadsheet [25], which is an Excel template for publishing Linked Data according to the CPSV model.
• CPSV-AP 2.0 RDF skeleton for mapping the sheets and columns of the Template Spreadsheet to the Classes and Properties of CPSV-AP 2.0. This also includes the RDF predicates of the CPSV-AP 2.0 model [24].
• CPSV-AP 2.0 Validator [26], which can be used to validate the compliance of the produced linked data with the CPSV model.
In addition, we have used OpenRefine and its RDF extension for transforming PS descriptions from a tabular format to a machine-readable (RDF/Turtle) format.
The adopted procedure [23] was initially devised for modeling descriptions of interoperability solutions, based on another EC ISA model, namely ADMS-AP. It comprises the following steps:
• Step 1: Fill in the spreadsheet. The CPSV-AP 2.0 Template Spreadsheet consists of one sheet per CPSV-AP 2.0 class, which in turn contains one column per property. Cells contain the values of the properties, i.e. the actual information about a Public Service. Two classes can be interlinked by having properties sharing the same URI. To provide a second value for a property, two rows need to be used. All rows with the same URI in the first column will be merged by OpenRefine.
• Step 2: Import the spreadsheet in OpenRefine. In this step, a new project is created in OpenRefine and the spreadsheet from the previous step is imported into this project.
It should be noted that, to the best of our knowledge, no further instructions are provided for publishing public service descriptions as linked data or for validating the produced linked data.
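The row convention of step 1 (a second value for a property goes on a second row, and rows sharing the URI in the first column are merged) can be illustrated with a few lines of Python. This sketch only imitates the described behaviour; it is not OpenRefine's implementation, and the column names and `example.org` URIs are hypothetical.

```python
def merge_rows(rows, key="URI"):
    """Merge spreadsheet rows that share the same URI in the key column.

    Each property ends up with a list of values, so a property filled in
    on two rows becomes a two-valued property of a single resource.
    """
    merged = {}
    for row in rows:
        record = merged.setdefault(row[key], {})
        for column, value in row.items():
            if column == key or value in (None, ""):
                continue  # skip the key itself and empty cells
            values = record.setdefault(column, [])
            if value not in values:
                values.append(value)
    return merged


# Hypothetical sheet: one service with two input documents, hence two rows.
rows = [
    {"URI": "http://example.org/ps/1", "Name": "Issuing of a licence",
     "HasInput": "http://example.org/doc/application"},
    {"URI": "http://example.org/ps/1", "Name": "",
     "HasInput": "http://example.org/doc/id-card"},
]

services = merge_rows(rows)
print(services)
```

After merging, the sheet yields a single resource whose `HasInput` property carries both document URIs, which is also how two sheets become interlinked: the same URI appears as a value in one sheet and as the first-column key in another.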

PUBLISHING "CITIZEN'S GUIDE" AS LINKED DATA
The publishing procedure started with the selection of a set of public services. The selection was made by the authors in cooperation with domain experts of the Region of Epirus. A basic criterion was PS popularity. As a result, five public services were selected for inclusion in this first pilot phase. To model the selected public services according to CPSV-AP 2.0 and subsequently transform the tabular data to linked data format, we followed the procedure described in the previous section. More specifically, we first extracted the selected public service descriptions from the Citizen's Guide and copied them to the CPSV-AP 2.0 Template Spreadsheet (step 1). The mapping between the properties of the Citizen's Guide of the Region of Epirus and the properties of CPSV-AP 2.0 is depicted in Table 2.
The population of CPSV-AP 2.0 concepts was performed using the CPSV-AP 2.0 Template Spreadsheet (step 1). This is a simple-to-use Excel file with a clear structure that corresponds to the CPSV-AP 2.0 model. Here, we had to map the concepts of the existing template to the Excel template and migrate the content. In the process, we faced syntactic and semantic interoperability obstacles. For example, all documents required as an input to a public service were located in a single text field in the original PS catalogue. These had to be split manually, as in the Excel template each input document should be inserted in a separate row (this corresponds to the value of the property PublicService:HasInput, which contains the URI of a document). Subsequently, we transformed the public service descriptions from a tabular format to a machine-readable format (namely RDF/Turtle) using OpenRefine and the corresponding RDF extension (steps 2 to 4). In this process, we utilised the RDF skeleton (see Fig. 2) to automate the alignment of the tabular data with the CPSV-AP 2.0 schema.
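The role of the RDF skeleton, aligning spreadsheet columns with predicates of the schema, can be approximated by a small column-to-predicate table. The sketch below is not the actual skeleton [24] nor the OpenRefine RDF extension: the column names, the three predicates chosen and the `cpsv` namespace (`http://purl.org/vocab/cpsv#`) are assumptions for illustration. It emits N-Triples lines, which are also valid Turtle.

```python
# Hypothetical "skeleton": spreadsheet column -> (predicate URI, value kind).
SKELETON = {
    "Name": ("http://purl.org/dc/terms/title", "literal"),
    "Description": ("http://purl.org/dc/terms/description", "literal"),
    "HasInput": ("http://purl.org/vocab/cpsv#hasInput", "uri"),
}


def row_to_triples(row, skeleton=SKELETON, subject_column="URI"):
    """Turn one spreadsheet row into N-Triples lines using the skeleton."""
    subject = f"<{row[subject_column]}>"
    lines = []
    for column, (predicate, kind) in skeleton.items():
        value = row.get(column)
        if not value:
            continue  # empty cell: no statement is produced
        if kind == "uri":
            obj = f"<{value}>"
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"{subject} <{predicate}> {obj} .")
    return lines


row = {
    "URI": "http://example.org/ps/1",              # hypothetical service URI
    "Name": "Issuing of a licence",                # invented title
    "HasInput": "http://example.org/doc/id-card",  # hypothetical document URI
}
for line in row_to_triples(row):
    print(line)
```

Keeping the mapping in one table is the point of the design: the spreadsheet can be refilled with further public services without touching the transformation logic.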
The installation and use of OpenRefine and the corresponding RDF extension were straightforward. Care is needed only to ensure that the version of the RDF extension is compatible with the version of OpenRefine. Moreover, the RDF skeleton was very helpful, facilitating the transformation of tabular data to linked data in the form of an RDF file.
Next, we imported the RDF file into the CPSV-AP 2.0 Validator (step 5), which provides a user-friendly interface for checking the produced RDF file for syntax violations or other inconsistencies with the CPSV-AP 2.0 schema.
CPSV-AP 2.0 Validator is a user-friendly tool. Here, we encountered some minor technical issues, which were fixed. More specifically, some false warnings were produced by the tool due to a minor syntactic error in the RDF skeleton.

DISCUSSION
In the first step of the followed approach, namely the modeling of public services based on CPSV-AP 2.0, we encountered a number of challenges. Here, we discuss the most important.
Firstly, the format of the properties of the "Citizen's Guide" and the CPSV-AP 2.0 were not identical. For example, in the original catalogue, the titles of all required (input) documents for a public service were included in one text cell while in CPSV-AP 2.0 model each supporting document must be placed (as a URI) in a separate cell.
Furthermore, all properties of the original catalogue were of string type. Thus, there were no URI values in any field of the model. More importantly, there is no URI policy at National or European level for any property (field) that requires a URI as a value. We believe this is a major drawback of the whole process.
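The two obstacles above (one text cell holding all input documents, and the lack of URI values) can be illustrated together. The following sketch splits such a cell into one row per document and mints a URI for each. The separator, the base URI and the slug-based URI pattern are all assumptions, since no URI policy is prescribed; in our work the splitting was done manually.

```python
import re
import unicodedata

# Hypothetical base URI; no national or European URI policy prescribes it.
DOC_BASE = "http://example.org/epirus/doc/"


def slugify(title):
    """Derive a URL-friendly token from a document title (ASCII only)."""
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def split_inputs(service_uri, documents_cell, separator=";"):
    """Split a single text cell into one row per required document."""
    rows = []
    for title in documents_cell.split(separator):
        title = title.strip()
        if not title:
            continue  # ignore empty fragments, e.g. after a trailing separator
        rows.append({
            "URI": service_uri,
            "HasInput": DOC_BASE + slugify(title),
            "Title": title,
        })
    return rows


cell = "Application form; Copy of ID card ;"
for row in split_inputs("http://example.org/epirus/ps/1", cell):
    print(row["HasInput"], "|", row["Title"])
```

Note that this slug scheme discards non-ASCII characters, so Greek titles would need transliteration or opaque identifiers instead. This is exactly the kind of decision that a URI design policy would have to settle.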
Another difficulty is that the categorization of the public services of the Region of Epirus cannot be directly mapped to the proposed categorization of the CPSV-AP 2.0 (which is reported in the PublicService:Sector property). In addition, the cardinality of some properties of the models is not the same. Finally, some properties of CPSV-AP 2.0 are in TBC (to be confirmed) status, e.g. the property "type" of the "Evidence" class.
Beyond the above challenges, there are also additional challenges that should be addressed. For example, there are no guidelines on how to publish the RDF file or how to evaluate its potential added-value.

CONCLUSIONS AND FUTURE WORK
The evolution of the semantic web and linked open data technologies is promising for providing benefits in eGovernment and public service provisioning [9,27]. In this respect, CPSV-AP 2.0 seems like a promising European standard for modeling and publishing public services, which also incorporates Linked Data technologies.
However, there are still challenges for exploiting CPSV-AP 2.0 at a large scale. One of the most important is the absence of a concrete and tested procedure, including detailed guidelines, that a public organisation could follow to easily model and publish its administrative procedures (public services) using CPSV-AP 2.0. These guidelines should include information about all needed tools, a URI design policy, detailed examples, etc. They would enable us to evaluate whether the potential benefits are worth the effort needed to exploit CPSV-AP 2.0 for public service modeling and publishing.
Our future work includes the introduction of a concrete procedure for using CPSV-AP 2.0. Subsequently, we will apply it to a major part of the "Citizen's Guide" content and evaluate the potential added value produced by its adoption. The evaluation process will follow a number of scenarios for different stakeholders (e.g. citizens, policy makers, public servants, etc.).