IoT as a Digital Game Changer in Rural Areas: the DESIRA Conceptual Approach

Digital transformation is a process encompassing significant changes in both social and economic domains because of the adoption of digital technologies. The EU H2020 DESIRA (Digitisation: Economic and Social Impacts in Rural Areas) project is working on defining a methodology and creating a knowledge base to characterize digital transformation. The goal is to support those in charge of responding to digitization-related challenges in rural areas, especially considering agriculture and forestry. This work presents preliminary activities in the project aiming to identify (i) Digital Game Changers, such as the Internet of Things (IoT), that facilitate the digital transformation; and (ii) a robust set of exemplary Application Scenarios (ASs). This task will support forthcoming activities aiming to assess the socio-economic impact of digital transformation in rural areas.


I. INTRODUCTION
Digital transformation is a process responsible for profound changes in the economy and in society as a result of the uptake and integration of digital technologies. The latter play a central role, disrupting existing mechanisms and thus triggering responses from actors impacted both positively and negatively by such a process [1].
The EU H2020 DESIRA project (project website: http://desira2020.eu) aims at assessing and anticipating the impact of digital transformation in rural areas, with a strong focus on agricultural and forestry activities. A large and multi-disciplinary consortium deals with the creation of a conceptual and analytical framework, with the assessment of past and present game-changing effects of digital technologies, and with the set-up of a methodology to anticipate future effects. The framework and assessment will be evaluated through 20 Living Labs (LLs) in different European geographical areas; each LL is composed of several stakeholders, who are either part of the project consortium or provide external expertise. LLs can be defined as user-centered, open innovation ecosystems based on a systematic user co-creation approach, integrating research and innovation processes in real-life communities and settings. The main purpose of an LL is to assess the past and present situation (system-as-is) in the geographical area regarding its focal question, identifying both drivers and obstacles in the present system, and then to agree on a desired future situation (system-to-be), highlighting the role that the introduction of digital technologies may play in enabling it. LLs will exploit an impact model of digital technologies in exemplary ASs, which we preliminarily introduced in [2], derived from a structured survey with internal experts, interviews with external experts, and literature analysis. An AS can be defined as a clear context in which a set of digital technologies is used to meet specific needs. For instance, the IoTrees project aims at assessing plant growth in forests both frequently and consistently, in order to properly support forest owners through a set of digital tools. It exploits an IoT solution based on sensorised dendrometers. This example is an instance of the AS Digital Forestry.
How to build such an impact model is the main topic of this work, a conceptual contribution towards the assessment of both the digital technologies involved and the socio-economic impacts for ASs in rural areas. As depicted in Figure 1, the main objective is to link socio-economic aspects, in a preliminary way, to digital technologies through ASs. Once the context, the needs, and the digital technologies involved in an AS are identified, the socio-economic impact can be estimated, ultimately referring to the United Nations (UN) Sustainable Development Goals (SDGs).
The rest of this work is organised as follows: Section II provides a brief overview of related works, then the proposed approach is described in Section III. Finally, the conclusions and future plans are in Section IV.

II. RELATED WORK
This section presents relevant contributions linked to IoT-triggered digital transformation. The paper in [3] provides a comprehensive survey in the field of Industry 4.0. The authors refer to the fourth industrial revolution as a phenomenon mainly based on Cyber-Physical Systems (CPSs), integrating computing, communication, and control capabilities with data analytics. The pervasiveness of devices able to collect data from the physical world, fueled by the IoT paradigm and coupled with CPSs, is radically changing industry processes. Several technological enablers are highlighted in [3] in addition to IoT, including Artificial Intelligence (AI), robotics, cloud computing, and others; these make it possible to implement what the authors call central paradigms, for instance the smart product.
When looking closely at the domains of interest for the DESIRA project (i.e., rural areas, explicitly encompassing agriculture and forestry), references to CPSs can be found as well, for instance in [4], which focuses on agricultural machinery. The authors consider both physical and cyber elements, communicating through interfaces, and two external entities: the worker, and the global environment. The former is connected to the CPS via a human-machine interface, while the latter is a source of information through external services, e.g., large-scale monitoring systems such as aerospace systems [5]. On a similar note, the work in [6] looks at agriculture considering its relations with natural and human systems. Just as the work in [3] provides a high-level taxonomy of digital technology in the context of Industry 4.0, reference [6] provides a taxonomy of digital dimensions for Digital Agriculture (or Agriculture 4.0): four areas are presented (smart farming, precision farming, precision agronomy, and precision livestock), supported by IoT, cloud infrastructures storing datasets to feed Decision Support Systems (DSSs), CPSs and autonomous vehicles, Unmanned Aerial Vehicles (UAVs), and the key feature of interoperability. Technological enablers for Digital Agriculture are also discussed in [7], and the work in [8] widens the scope by looking at EU-funded projects working towards a larger adoption (such as the DEMETER, SmartAgriHubs, IoF2020, and ICTAgriFood [9] projects), and at non-technological barriers still hampering it. Considerations on IoT in agriculture can be read in [10], and the increasing use of robotics and autonomous systems is highlighted in [11], [12].
Reference [13] focuses on digital technology in rural areas, classifying it according to its complexity and stage of penetration, as in the following ordered list: (i) mobile devices and social media; (ii) precision agriculture and remote sensing technologies; (iii) big data, cloud, analytics, and cybersecurity; (iv) integration and coordination, for instance blockchain-based products; (v) intelligent systems, based on AI and Machine Learning (ML). The use of digital technologies in rural areas fosters the creation of the so-called smart villages [14], i.e., the revitalising of rural services through digital and social innovation. Rural services, such as health, energy, and transport, can be improved and made more sustainable thanks to Information and Communication Technologies (ICT) tools, and the paper in [15] analyses the ICT role with respect to SDGs. Similar considerations can be made for forestry, or paradigms like digital forestry or smart forestry [16], and the work in [17] highlights the role that big data can play for smart forestry.
When assessing the effects of the digital transformation, the work in [18] provides valuable insights on side effects, especially unintended ones. A key question relates to the ownership and economic value of data, as well as to their use and access, which calls for guidelines on the responsible use of digital data.

III. PROPOSED APPROACH
The approach proposed in this work is depicted in Figure 2. Similarly to [3], a four-layer IoT-inspired framework is considered: the perception layer at the base, the transmission layer on top of it, then the computation layer, and finally the application layer. In such a model, interactions with the physical world take place through sensors and actuators in the perception layer; data collected from sensor nodes are then sent via the transmission layer, using short to long-range connections, to the computation layer for storage and analysis. On top there is the application layer, providing services and applications.
As anticipated in Section I, collecting and then clustering instances in ASs is one of the first tasks in the DESIRA project. The collection is performed through an online questionnaire. The respondents are the DESIRA project partners, each with different expertise in social, economic, technological, and policy-related topics. About 1000 answers are expected. The questionnaire includes three types of questions: (i) open-ended questions; (ii) closed-choice questions; and (iii) Likert scales on socio-economic impact (including environment). In the rest of this work, we focus on a few illustrative examples of those answers, considering only the agricultural field.
The questionnaire enquires about three main aspects: (i) the purpose of the digital solution (e.g., pest control in agriculture); (ii) the technological paradigms in use (e.g., cloud computing, IoT); (iii) the expert's opinion on 12 socio-economic indicators. The latter is a preliminary activity to trigger a discussion on such topics: because of this, responses are more likely to carry actual value when clustered together, thus 'averaging' several experts' opinions around a chosen dimension, like an AS. The 12 topics related to the socio-economic impact are listed in Table I (see also the candlestick chart in Figure 2). To better clarify our approach, we provide an example in Section III-A, then we describe semi-automatic techniques to identify the ASs in Section III-B, and finally we briefly present the digital inventory tool in Section III-C.

Figure 2: A four-layer logical framework for IoT-based applications (top left), collecting data from the physical world and then offering services and applications (on the right). Rural areas are here considered, and three example applications (or instances) are shown (i.e., fire prevention, pest control, and e-health). Instances can be grouped in ASs, the latter then linked to an evaluation of their socio-economic impact (bottom left).

A. Example of applied approach
We provide here an example of an AS, along with its description and evaluation. We analyse three responses to the online questionnaire, centered around precision farming (i.e., the AS under consideration in this work). The first response, which we refer to as R1, provides details about the WEEDELEC project, aiming at automatic detection and removal of weeds in crops. The second one, or R2, is related to the AgriCLOUD system, a cloud-based precision farming management system for efficient and sustainable crop production. The third answer, or R3, describes the VINBOT project, which provides tools to capture and analyse vineyard images to estimate the potential yield. For each of them, Tables II and III provide details on collected answers, the former on technological aspects, the latter on socio-economic aspects. As already stated in Section III, several responses per AS are more likely to carry actual value because they represent the 'aggregated' opinion of several experts. For instance, focusing only on socio-economic questions, Table III shows the aggregated opinion on R1, R2, and R3: the weight of a question is calculated as the number of collected responses to that question divided by the total number of responses. Finally, the score is the average opinion value multiplied by the weight.
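The weight and score computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the function name `aggregate_question` is hypothetical, the opinion values are invented, and we assume that the average is taken over the responses actually given (with `None` standing for 'no response').

```python
from typing import Optional, Sequence, Tuple


def aggregate_question(opinions: Sequence[Optional[int]]) -> Tuple[float, float]:
    """Return (weight, score) for a single socio-economic question.

    `opinions` holds one entry per collected response; None marks
    'no response', while 0 is a valid answer meaning 'no impact'.
    Assumption: the average is computed over the given answers only.
    """
    answered = [value for value in opinions if value is not None]
    if not opinions or not answered:
        return 0.0, 0.0
    # Weight: responses to this question divided by the total number of responses.
    weight = len(answered) / len(opinions)
    # Score: average opinion value multiplied by the weight.
    average = sum(answered) / len(answered)
    return weight, weight * average


# Invented opinions of three responses on one question (not taken from Table III).
weight, score = aggregate_question([3, None, 5])
print(f"weight={weight:.2f}, score={score:.2f}")
```

With these invented values, two of the three responses answer the question, so the weight is 2/3 and the average opinion of 4 is scaled down accordingly.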

B. Semi-automatic Clustering and Retrieval of Application Scenarios
Once enough instances have been collected, a knowledge base (KB) is ready for use, to be clustered into ASs (or different topic-based groups) in a semi-automatic fashion. This can be useful to multiple actors, including policy makers and system analysts. Policy makers can use the tool to get statistical information, and to understand trends about the impact of digital technologies. System analysts can exploit the platform during the early phases of requirements definition to estimate the socio-economic impact of their proposals by looking for similar instances in the KB, the latter described in Section III-C.
To address the needs of the different stakeholders, the envisioned platform supports typical database queries (e.g., retrieve all instances using a certain technology), as well as semi-automatic clustering and retrieval of the instances. To enable clustering and retrieval, each existing instance is represented as a numerical binary vector, and similar instances (i.e., instances associated with vectors that are closer in the vector space) can be clustered together on demand. Furthermore, a novel instance can be represented as a vector in the same space and, through similarity measures, we can retrieve a ranked list of similar instances to take as reference for comparison.
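As an illustration of how a questionnaire response might become a binary vector, the sketch below concatenates a bag-of-words encoding of the open-ended answer with one flag per closed-choice option. The encoding scheme, the function name `encode_instance`, the vocabulary, and the question options are all assumptions made for this example; the source does not specify them.

```python
from typing import Dict, List, Sequence, Set


def encode_instance(
    text: str,
    vocabulary: Sequence[str],
    choices: Dict[str, Set[str]],
    options: Dict[str, Sequence[str]],
) -> List[int]:
    """Encode one questionnaire response as a binary vector (assumed scheme)."""
    words = set(text.lower().split())
    # One component per vocabulary term: 1 if the term occurs in the free text.
    vector = [1 if term in words else 0 for term in vocabulary]
    # One component per option of each closed-choice question, in a fixed order.
    for question in sorted(options):
        selected = choices.get(question, set())
        vector.extend(1 if option in selected else 0 for option in options[question])
    return vector


# Hypothetical vocabulary and closed-choice questions, for illustration only.
VOCABULARY = ["weed", "vineyard", "cloud", "yield"]
OPTIONS = {
    "maturity": ["prototype", "in use"],
    "paradigms": ["IoT", "cloud computing", "robotics"],
}

example = encode_instance(
    "Automatic weed detection and removal in crops",
    VOCABULARY,
    {"maturity": {"prototype"}, "paradigms": {"robotics"}},
    OPTIONS,
)
print(example)
```

Keeping the option order fixed (here, questions sorted by name) is what places every instance in the same vector space, so that components can be compared position by position.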
As mentioned in Section III, instances are derived from the questionnaire. The questionnaire includes three types of questions: (i) open-ended questions, to characterise the instance from an application viewpoint; (ii) $n$ closed-choice questions $q_i$, with $c_{q_i}$ choices for each question; (iii) 12 impact-related questions on a $[-5, 5]_{\mathbb{I}}$ scale (see Table I). Each instance can be represented through a vector $s$ that has $N = |V| + \ldots$ components. This representation entails a vector space in $[0, 1]^{N}_{\mathbb{R}}$. Within the vector space, a metric of similarity among vectors can be defined. A typical similarity metric is the cosine similarity [19]. Given two vectors $a, b \in [0, 1]^{N}_{\mathbb{R}}$, the cosine similarity $\sigma$, taking values in $[0, 1]_{\mathbb{R}}$, is defined as follows:

$$\sigma(a, b) = \frac{\sum_{i=1}^{K} a_i b_i}{\sqrt{\sum_{i=1}^{K} a_i^2}\,\sqrt{\sum_{i=1}^{K} b_i^2}} \qquad (1)$$

The variable $K$ is the number of vector components taken into account when computing the similarity among vectors. From a geometrical standpoint, when the cosine of the angle between the two vectors is close to one, the two vectors are closer, thus their similarity is higher. When the cosine is closer to zero, the two vectors tend to be orthogonal, as they have few components with the same values, and are therefore less similar. For instance, only considering the features in Table II, $\sigma(R1, R2) = 0.55$, $\sigma(R1, R3) = 0.8$, and $\sigma(R2, R3) = 0.73$.
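The cosine similarity over the first K components, and the similarity-based ranking it enables, can be sketched as follows. The function names, the toy knowledge base, and the convention of returning 0 for an all-zero vector are assumptions of this example; the vectors are invented and do not reproduce the values reported for R1, R2, and R3.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def cosine_similarity(
    a: Sequence[float], b: Sequence[float], k: Optional[int] = None
) -> float:
    """Cosine similarity restricted to the first k components (all if k is None)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    k = len(a) if k is None else k
    dot = sum(x * y for x, y in zip(a[:k], b[:k]))
    norm_a = math.sqrt(sum(x * x for x in a[:k]))
    norm_b = math.sqrt(sum(y * y for y in b[:k]))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention: an empty vector is similar to nothing
    return dot / (norm_a * norm_b)


def rank_similar(
    query: Sequence[float],
    knowledge_base: Dict[str, Sequence[float]],
    k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Return the instances of the knowledge base, most similar first."""
    scored = [
        (name, cosine_similarity(query, vector, k))
        for name, vector in knowledge_base.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented binary vectors for three hypothetical instances.
KB = {"A": [1, 1, 0, 0], "B": [1, 0, 1, 1], "C": [0, 0, 0, 1]}
ranked = rank_similar([1, 0, 1, 0], KB)
for name, similarity in ranked:
    print(name, round(similarity, 2))
```

Passing a smaller `k` restricts the comparison to a leading block of components, for example only those describing the application.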
Given the similarity metric in Eq. 1, different clustering algorithms can be used to group the vectors (instances) into clusters (which can be either ASs or different topic-based groups). Typical clustering algorithms include k-means and DBSCAN [19]. Clusters can be created based on a subset of the components. The first clustering foreseen is oriented to identifying ASs. This clustering considers only the textual components (i.e., $K = |V|$ in Eq. 1), since they characterise the application. Instances falling in the same cluster will belong to the same AS, pending a manual verification. The name of each AS will be chosen manually, and each instance can be tagged with it to ease retrieval. Furthermore, additional semantic tags are envisioned through the exploitation of Wikipedia portals, which are structured collections of categories and pages in a thematic area, e.g., Agricultural Technology. In Figure 3, we showcase the Wikipedia subcategories connected to the latter one. In Figure 3b, the main category Agricultural Technology is selected, showing related sub-categories, and therefore potential additional tags.

[Table II (columns: Domain, Maturity, TP1 to TP11). R1: RA, prototype; R2: RA, prototype; R3: RA, in use.]

Table III: Experts' opinions on socio-economic aspects in the three analysed responses ('-' translates into 'no response', and '0' into 'no impact'). Please refer to Table I for details on the questions Q1-Q12, and to Section III-A for more details.

Other clustering approaches are possible through this vector-based representation. For example, one can be interested in excluding the impact-related components, and group only according to the answers to the closed-choice questions. In this way, similar instances, e.g. in terms of used technologies, can be clustered together. Given a cluster, the socio-economic impact can be visualised. The similarity measure directly entails a similarity-based ranking (i.e., a partial order) among instances similar to a given one. System analysts can fill in the questionnaire to describe their project, without completing the impact section, and retrieve similar instances: in doing so, their project will fall within an existing cluster, having a given impact score.
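Clustering on a chosen subset of components can be sketched as below. Note that this is neither k-means nor DBSCAN: to stay dependency-free, the example uses a simple stand-in that links any two instances whose cosine similarity reaches a threshold and returns the resulting connected groups. The function names, the threshold, and the vectors are assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def cluster_by_threshold(
    instances: Dict[str, Sequence[float]],
    indices: Sequence[int],
    threshold: float,
) -> List[List[str]]:
    """Group instances whose similarity on the selected components >= threshold."""
    # Keep only the chosen components (e.g., textual or technology-related ones).
    projected = {n: [v[i] for i in indices] for n, v in instances.items()}
    names = sorted(projected)
    parent = {name: name for name in names}

    def find(name: str) -> str:
        # Follow parent links to the group representative, shortening the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if cosine_similarity(projected[first], projected[second]) >= threshold:
                parent[find(second)] = find(first)

    groups: Dict[str, List[str]] = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


# Invented vectors: components 0-2 are 'textual', components 3-4 'technology'.
INSTANCES = {"P1": [1, 1, 0, 1, 0], "P2": [1, 1, 0, 0, 1], "P3": [0, 0, 1, 1, 0]}
by_text = cluster_by_threshold(INSTANCES, [0, 1, 2], threshold=0.8)
by_technology = cluster_by_threshold(INSTANCES, [3, 4], threshold=0.8)
print(by_text, by_technology)
```

The same three instances group differently depending on which components are selected, which mirrors the idea of clustering either by application or by the technologies in use; a manual check of the resulting groups would still be needed.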

C. Digital Inventory Tool
On the basis of the conceptual framework presented above, DESIRA will develop the presented impact model of digital technologies in relevant ASs. The four-layered approach in Figure 2 is expected to create a substantial amount of knowledge artifacts. DESIRA will develop a KB for representing and exploring the related resources. Users of the KB can access information on ICT tools both applied and applicable to rural areas. The searchable Digital Inventory Tool (DIT) will visualize the information and support the proper annotation of content for efficient retrieval and exploration. Furthermore, it will enable the collaborative handling of the impact model and the authoring of KB items. The DIT aspires to contribute to a better understanding of the various perspectives on ICT and to help users, both technological and non-technological ones, familiarise themselves with the domain.

IV. CONCLUSION
The massive presence and the growing set of ICT technologies, and combinations of them, also create the need for systematization and categorization, in order for them to be effectively deployed. Following on from this, this work has described, in a conceptual manner, the approach proposed in the H2020 DESIRA project to build an impact model aiming at analysing digital tools, such as IoT-based solutions, supporting the on-going process of digitization in rural areas. The core idea is the semi-automatic identification of relevant ASs by analysing responses to an online questionnaire, experts' interviews, and literature analysis. The identified ASs will be characterised in terms of the technological paradigms involved and of a preliminary assessment of their socio-economic impact. A Digital Inventory Tool will be developed to explore the resulting KB, also enabling annotation of the content and collaborative handling of the KB items.