Investigating Platform Characteristics and Their Implications
Authors/Creators
- 1. Dr. Astrid van der Meer and Dr. Julian Lee, Department of Information Studies, University of Amsterdam, The Netherlands; Dr. Liam O'Connor, Research Institute for Sociotechnical Systems, Delft University of Technology, The Netherlands; Dr. Sofia Jensen, SURFnet, Moreelsepark 48, 3511 EP Utrecht, The Netherlands
Description
To help build expertise in IoT technologies, the Netherlands eScience Center (NLeSC) and SURF worked together on a project focusing on IoT applications and platforms. This case study presents the results of NLeSC and SURF's investigation, examining the different features offered by cloud and self-maintained IoT platforms, together with an overall summary of an IoT architecture.

Introduction

The Internet of Things (IoT) is a paradigm shift in which all inanimate and animate 'things' are connected and made intelligent while at the same time being embedded in, and part of, the environment. IoT is an integrated technology composed of collaborative sensing, wireless (opportunistic) networking, pervasive computing, in-situ intelligence, sensor data analytics, and active interaction. Although not an entirely new concept, it has recently gained much popularity, especially because of its adoption in many domains (for example health, real-time monitoring and control, and logistics) and because of new predictions of an explosion in the number of connected devices in the coming years. Unlike their predecessors, i.e., wireless sensor network applications, IoT applications are not application specific but domain specific, and as such bring the challenges of heterogeneity (in technology, use, requirements, etc.), dynamicity, scale, autonomy, and adaptability to a new dimension.

While a number of solutions, architectures and platforms currently exist that support the co-creation of IoT ecosystems, the diversity and heterogeneity of technological solutions, application segments, requirements, and use cases make it difficult to identify which platform is most suitable. The challenge is to select a platform that solves the interoperability and unification problem not only for existing IoT technologies and applications, but also for those yet unforeseen.

This technical note examines the different features offered by cloud and self-maintained IoT platforms, with an overall summary of an IoT architecture.
It is organized as follows: Section 2 describes a generic architecture of an IoT platform and its components. In Section 3, we describe and compare the most promising open-source IoT platforms. We conclude the note with recommendations in Section 4.

The architecture of an IoT platform

The term Internet of Things (IoT) loosely refers to the set of devices (including vehicles and appliances) that are interconnected and exchange data via a so-called IoT platform. A careful approach to the architectural design can ensure proper integration of a large variety of devices. We will now describe the individual components of a generic IoT platform (see the overall architecture in Figure 1) and discuss different options for realizing each component. The interested reader is referred to other surveys [7, 1, 8, 23] and the references therein as examples of studies on IoT architecture and IoT taxonomy.

Figure 1: Schematic depiction of the components of an IoT environment. The blue box depicts the scope of the platform.

Device management

The device management component handles the interaction with the devices. Examples of these interactions are device registration and activation, monitoring, and firmware updates. How to implement this functionality depends heavily on the device types and connectivity, which can make it difficult to provide a generic solution. An advantage is that this component does not interact much with the other components of the platform and is therefore largely independent.

Preprocessing

Preprocessing concerns the transformation of the raw data received from the device before it is stored in the data store. This transformation can be done for various reasons. Three common ones are applying quality control, adding metadata, and data restructuring:

Quality control: Devices might send corrupt data or have sensor malfunctions.
During this step we want to detect these malfunctions where possible and either drop the corrupt data or flag the suspicious records, so that we can decide how to handle them during processing/enrichment.

Adding metadata: Adding additional metadata to the records is very important for data provenance and can be required for reproducible science. Examples are information about the device (identifier, software version), the time received, and the version of the preprocessing software.

Data restructuring: The data format sent by devices is usually optimized for minimal bandwidth and on-device computation requirements. When storing the data, it is more important to have it structured in a way that allows efficient processing, compresses well, and contains schema information/versioning.

There are additional concerns for the preprocessing that depend on the nature of the data transfer between the device and the platform. The first is the grouping of the data: is it received one record at a time, or in batches based on time or size? The second is the data flow: is it constant in volume over time, or can there be sudden peaks or large bursts of records?

Data store

The data store is responsible for the long-term persistent storage of the data. Its two most important aspects are durability (not losing data after it has been stored) and the ability to handle ever-growing data volumes (a form of scalability). Additionally, it is convenient if the data store has good access methods: both efficient querying of subsets of the data and the ability to do parallel reads of large data volumes.

Although we use the singular term data store, this component might include multiple subsystems that each contain either the full dataset or a subset. A common setup uses a scalable file or object store (sometimes called the data lake) for all the raw data as received from the preprocessing component, and one or more databases that contain a subset of the processed/enriched data.
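As a minimal sketch of the preprocessing steps and the two-tier store described above, the following Python fragment flags implausible readings, adds provenance metadata, restructures the compact device record, and then writes every record to a raw store while only clean records reach an application database. The field name `t`, the plausibility range, the version label, and the table layout are all assumptions made for illustration; they do not come from any of the platforms discussed in this note.

```python
import io
import json
import sqlite3
from datetime import datetime, timezone
from typing import TextIO

PREPROCESSOR_VERSION = "0.1.0"  # hypothetical version label for provenance


def preprocess(raw: dict, device_id: str) -> dict:
    """Apply quality control, add metadata, and restructure one raw record."""
    value = raw.get("t")  # assumed compact device field holding a temperature
    is_number = isinstance(value, (int, float))
    # Quality control: flag (rather than drop) missing or implausible values.
    # The range of -90 to 60 degrees Celsius is an illustrative assumption.
    suspicious = (not is_number) or not (-90.0 <= value <= 60.0)
    return {
        "device_id": device_id,
        "received_at": datetime.now(timezone.utc).isoformat(),
        "preprocessor_version": PREPROCESSOR_VERSION,
        "schema_version": 1,
        "temperature_c": value,
        "suspicious": suspicious,
    }


def open_database(path: str) -> sqlite3.Connection:
    """Open the application database and create the readings table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS readings "
        "(device_id TEXT, received_at TEXT, temperature_c REAL)"
    )
    return db


def store(record: dict, lake: TextIO, db: sqlite3.Connection) -> None:
    """Append every record to the raw store; keep only clean ones in the database."""
    lake.write(json.dumps(record) + "\n")  # one JSON document per line
    if not record["suspicious"]:
        db.execute(
            "INSERT INTO readings (device_id, received_at, temperature_c) "
            "VALUES (?, ?, ?)",
            (record["device_id"], record["received_at"], record["temperature_c"]),
        )
        db.commit()


if __name__ == "__main__":
    lake = io.StringIO()  # stands in for a file or object store
    db = open_database(":memory:")
    for raw in ({"t": 21.5}, {"t": 999}):
        store(preprocess(raw, "device-1"), lake, db)
    print(lake.getvalue())
```

Flagging instead of dropping keeps the raw store complete, so the decision on how to treat suspicious records can still be revisited later.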
These databases are designed and optimized for specific applications. When choosing the systems used by the data store, there are a few different options:

File-based: Data is stored in a file-system hierarchy in multiple files. A single file usually contains many records. Data can be accessed via filename, and additional query capabilities are limited. Because of the storage requirements, this is often a distributed file system where the data is stored on multiple servers and accessible by different clients via a network protocol. Some, but not all, offer a POSIX-like interface so that clients can access the data as if it were locally available.

Object-based: Similar to file-based storage, but with a flat instead of a hierarchical namespace, where just the object label is used to access the data. This has limited capabilities (no in-place editing), with no or only a limited POSIX-like interface.

Database (relational): In a relational database the data is stored in tables consisting of rows and columns. Relational databases are useful when all rows (also known as records) have the same structure. In practice, all relational databases are based on SQL.

Database (non-relational): These are sometimes called NoSQL databases. These databases have different object models. Examples are document stores, graph databases, key-value stores, and column-family stores. Often they focus on functioning at a large scale, sacrificing query capabilities or strong consistency to accomplish this.

Additional data sources

Often the data stream from sensors or other IoT devices is combined with 'static' datasets. These datasets can be part of the research project, or could come from an external party. Examples of these datasets are the GPS locations of all the sensors, or weather information. The platform needs to be able to incorporate these additional datasets and either store a copy or interface with the source data.

Processing/Enrichment

Having the raw data available in the data store can be useful, but is often not sufficient.
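For instance, a raw reading becomes more useful once it is joined to a static dataset such as the sensor locations mentioned under "Additional data sources". The following is a minimal Python sketch of that idea; the sensor identifiers, coordinates, and the `EnrichedReading` type are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical static dataset: sensor identifier -> (latitude, longitude).
SENSOR_LOCATIONS: Dict[str, Tuple[float, float]] = {
    "sensor-a": (52.370, 4.895),
    "sensor-b": (52.012, 4.357),
}


@dataclass
class EnrichedReading:
    sensor_id: str
    value: float
    latitude: Optional[float]
    longitude: Optional[float]


def enrich(
    sensor_id: str,
    value: float,
    locations: Dict[str, Tuple[float, float]] = SENSOR_LOCATIONS,
) -> EnrichedReading:
    """Attach the static location to a reading.

    Sensors missing from the static dataset keep None coordinates, so the
    gap stays visible instead of being silently filled.
    """
    latitude, longitude = locations.get(sensor_id, (None, None))
    return EnrichedReading(sensor_id, value, latitude, longitude)
```

Whether the static dataset is copied into the platform or queried at its source, as the text notes, only changes where `locations` comes from; the join itself stays the same.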
To give meaningful results to the end-user, additional processing is needed. This can be simple data processing that only restructures the data, or data enrichment where we refine or enhance the data, for example by combining it with additional data sources. There are a few different aspects to the processing/enrichment component.

Control flow: Control flow can be defined as what triggers the enrichment/processing. There are multiple options, and different ones could make sense for separate parts of the processing. It could be event-driven, triggered by new input data; request-driven, triggered by user/API requests; or periodic. The best option depends on the update frequency and on whether higher latency is acceptable.

Storage: The results of the processing can be stored in a database as part of the data store, or recomputed on every new user request.

Batch/streaming: Depending on the requirements of the application, the processing can be done in large batches or with a streaming system.

Scalability: It can be the case that a single machine cannot keep up with the processing requirements, as new data keeps coming in and results should be delivered within a short time frame. The processing solution should therefore be scalable, in that the work can be distributed over multiple machines. If all records can be processed independently of each other this need not be complicated, but if there are dependencies or aggregations a suitable distributed data processing framework should be used.

Validation: There should be a way to check the validity of the data processing, and processing should be annotated to allow development and improvement in a reproducible fashion.

External gateway

We decided that visualization, dashboards and analysis are out of scope for the core research IoT platform. They are nevertheless very important and need a way to interface with the system. This interface (API) is provided by the external gateway.
This gateway handles requests from the end-user and returns data based on the request. This data can be processed before it is returned. The data could be returned as files that the user downloads and processes off-line, or handled directly by a web application. It is important that the API and the structure of the returned data are properly documented.

User management (external users/researchers)

Research is never done in isolation, so the external gateway should provide access to end-users from different institutes. We do want to apply some access restrictions, so some form of authentication and authorization is required. Ideally, we do not want to force users to create yet another account; they should be able to use the credentials from their home institute. With SURFconext [18], SURF (which is part of the national e-infrastructure for research and education in the Netherlands) offers federated access for academia in the Netherlands. SURFconext enables single sign-on access to web, cloud, and institutional services based on the user's institutional account (thereby re-using the university's identity management user registrations). With millions of authentications per month, SURFconext is a very successful solution for any HTTP-based application.

A limitation of SURFconext is that by default it only handles web-based applications. Rich-client/non-web applications cannot easily make use of it. If the external gateway is only accessed as a web application, this is not an issue. However, we can imagine some cases where there would be a need for non-web access. A solution for this could be the use of an authorization proxy. SURF is currently working on a setup of such a proxy, in a project called the Science Collaboration Zone [17], which includes a solution called COmanage [11].
COmanage is a tool that adds a number of useful features, such as on-boarding researchers to one or more virtual collaborative organizations (groups), and functionality to register SSH keys and generate one-time passwords and application-specific passwords to enable access to non-web-based resources, all after initial on-boarding based on a verified institutional account.

Summary of open source candidates

We started this project with the aim of developing, as a final deliverable, a prototype IoT platform that works with a wide range of IoT applications. We were looking for a scalable solution (i.e., able to serve multiple applications/use cases) that needs minimal changes to the platform, especially with respect to interfaces. To this end, we identified two categories of open-source IoT platforms: cloud-centric and self-maintained.

Table 1: Comparison of the cloud-centric platforms and their components.

Table 2: Comparison of the self-maintained platforms and their components.

Cloud-centric solutions

For the cloud-centric IoT platforms, we refer to the recent detailed comparison by Guth et al. [9]. This includes open-source platforms such as FIWARE, OpenMTC, SiteWhere, and Webinos as well as proprietary solutions such as AWS IoT, IBM's Watson IoT Platform, Microsoft Azure IoT Hub, and Samsung SmartThings. Some of these commercial solutions are surveyed in Table 1.

Self-maintained solutions

For the self-maintained solutions (also known as on-site or on-premise), we identified four promising open-source community projects: Kaa, IoTivity, ThingsBoard and openHAB. The summary of the survey is presented in Table 2. We discarded one of the criteria from the previous comparison table, FaaS: none of the four platforms offers a function platform (FaaS) as part of its software stack. However, there are many on-premise FaaS solutions that one can embed, such as Iron.io (2014), Apache OpenWhisk (2016), Fission (2016), Galactic Fog's Gestalt (2016), OpenLambda (2016), and OpenFaaS (2017).
Figure 2: Kaa IoT platform: conceptual architecture (left), and connecting Kaa to Arbela (right). (Kaa IoT Technologies, https://www.kaaproject.org.)

Kaa [19] is an open-source middleware platform for implementing IoT applications and applications for smart devices. The platform software is easy to install thanks to the Kaa Sandbox, which is a complete virtual machine image. The sandbox comes with a complete Kaa installation, the sandbox environment, sample applications, three types of databases (PostgreSQL, MongoDB, and Cassandra), the Android SDK, and other third-party integrations for different hardware vendors. Fig. 2 (left) depicts a conceptual architecture of Kaa; for more details on the components, we refer to the Kaa documentation (http://kaaproject.github.io/kaa/docs/v0.10.0/Architecture-overview/). Kaa is released under the Apache 2.0 licence via a GitHub repository (https://github.com/kaaproject).

Kaa enables collecting data from devices that use PAN-based protocols such as Bluetooth, ZigBee, and Z-Wave. Kaa endpoint software development kits (SDKs) handle client-server communication, authentication, data marshaling, encryption, persistence and other services provided by the Kaa platform. In principle, Kaa can handle both structured and unstructured data, though it can only manage devices that share the same set of data schemas (Apache Avro-compatible). Kaa supports a framework of pluggable log appenders (e.g., the PubNub Log Appender) to load data into a database. The data can be sent to stream processing or can be made available to custom data processing modules via REST or Apache Flume. The Kaa cluster uses Apache ZooKeeper for the coordination of servers, Kaa node elections, failure mitigation, and load balancing. To enable real-time monitoring, Arbela [22] can be used as a Kaa IoT dashboard using the PubNub channel (see Fig. 2 (right)).

Figure 3: ThingsBoard IoT platform: architecture (left) and an example of the ThingsBoard IoT Gateway for Sigfox devices (right). (ThingsBoard, Inc.
https://thingsboard.io.)

ThingsBoard Community Edition [20] is an open-source IoT platform available from a GitHub repository (https://github.com/thingsboard/thingsboard) under the Apache License version 2.0. The company behind the platform also offers a commercial "professional edition" with additional support and extra platform integrations. The general architecture of ThingsBoard is shown in Fig. 3 (left). Connectivity with devices is handled via different transport components. In addition to the IoT platform there is also the ThingsBoard IoT Gateway, which integrates IoT devices connected to third-party systems with ThingsBoard. An example of the usage of the IoT Gateway can be seen in Fig. 3 (right). Messages received are handled by the rule engine, which allows both processing of the data and triggering of external alerts based on the content of the message. Data can be stored in an external PostgreSQL or Cassandra database. ThingsBoard utilizes Apache ZooKeeper for cluster coordination and Cassandra as a NoSQL database. The core services are responsible for device management, user management and dashboards. The server-side API gateway provides a REST interface that allows access to time-series data and also allows registered users to send commands to devices. ThingsBoard has a plug-in architecture that allows coupling to external components; plug-ins for Apache Kafka and for sending emails are available. Internally, ThingsBoard uses Akka for event-driven message processing.

Figure 4: openHAB 2 conceptual architecture. (openHAB Community, https://github.com/openhab.)

openHAB 2 [15] is an open-source home automation platform, which is used for controlling and monitoring devices in smart homes. It is licensed under the Eclipse Public License 1.0 and uses several Eclipse IoT projects (https://iot.eclipse.org/), mainly the Eclipse SmartHome framework; see the reference architecture in Fig. 4.
This platform has a well-documented, actively maintained GitHub repository (https://github.com/openhab) and provides excellent support for a variety of smart devices. We had initial concerns about its applicability to a mobility use case, namely whether there is a restriction on the number of smart devices that can be connected to the platform. It turns out that scalability is not an issue; however, the security component is entirely missing from the architecture design and would need to be implemented from scratch. This is because of the intrinsic assumption that openHAB is used behind the home router's firewall within one internal network. This ruled out the use of openHAB for our project.

Figure 5: IoTivity 1.2 conceptual architecture. (Open Interconnect Consortium, https://iotivity.org.)

IoTivity [3] started out as a device management platform that enables seamless connectivity between devices. Upon its merger with another project, AllJoyn, the Open Connectivity Foundation (OCF) defined the purpose of IoTivity to be a set of "specifications by OCF to ensure interoperability among connected devices", as well as "a reference implementation of the OCF specifications to the open-source community". IoTivity is an active project with the source code available via a GitHub repository (https://github.com/iotivity). Similar to FIWARE, IoTivity offers a cloud interface at the external gateway; it also supports discovery, messaging and security services within its base layer. This can be used to integrate the platform components with third-party systems. (For more details on the cloud part, we refer to a recent publication [4].) It is worth mentioning that the platform provides a tool called Simulator, which can help developers test their implementations without purchasing real hardware. The project also offers software components for the IoT device side for handshaking, resource registration/discovery, etc. The conceptual architecture of IoTivity is depicted in Fig.
5, and more details on the functionality of each component can be found at https://wiki.iotivity.org/architecture. There are Docker containers to ease the setup procedure, and it can be installed on various Linux distributions and on Android.

Conclusions

This project set out to investigate the available IoT platforms and to look at the different features offered by each. In general, the choice of a suitable platform depends on the applications (use cases) researchers are trying to serve. We identified Kaa and ThingsBoard as candidate solutions based on the following criteria: a permissive license, an actively maintained GitHub repository, a clear architecture, and good documentation. However, if the aim is to have multiple applications served by the IoT platform, then it is best to start with a generic framework for interoperability reasons. In this case, the best-suited platforms are IoTivity and FIWARE (e.g., the smart city use case [13]), although they might require more effort in the implementation.

There are a lot of active developments in this field that researchers should be aware of. For instance, Eclipse has a few IoT projects (https://iot.eclipse.org/) that look promising, including Eclipse Agail [6], and other open-source projects have been reviewed in a recently published technical report [21].

There are advantages and disadvantages to both cloud-centric and self-maintained solutions. A self-maintained platform requires dedicated servers and an administrator who maintains the setup and connection and is responsible for backups. Finding a good hosting platform for self-maintained systems is also a challenge. Arranging this on-premise, inside the academic institution, requires collaboration with the centralized ICT services of the institute. Since the platform has strong requirements for external network availability and access, centralized ICT might be hesitant to support it.
External hosting providers bring additional costs and risks concerning data security and availability. In contrast, a cloud-centric solution comes at a cost determined by the service provider but fully bypasses the aforementioned issues. However, this type of solution means a full dependency on the service provider, including any changes to the API and service costs.

References

[1] Martin Bauer, Mathieu Boussard, Nicola Bui, Jourik De Loof, Carsten Magerkurth, Stefan Meissner, Andreas Nettsträter, Julinda Stefa, Matthias Thoma, and Joachim W. Walewski. IoT Reference Architecture, Chapter 8. Enabling Things to Talk. 2013.
[2] The FIWARE Community. Fiware. https://www.fiware.org, 2018.
[3] Open Interconnect Consortium. IoTivity. https://iotivity.org, 2018.
[4] Thien-Binh Dang, Manh-Hung Tran, Duc Tai Le, and Hyunseung Choo. On evaluating IoTivity cloud platform. In Computational Science and Its Applications - ICCSA 2017 - 17th International Conference, Trieste, Italy, July 3-6, 2017, Proceedings, Part V, Volume 10408 of LNCS, pages 137–147. Springer, 2017.
[5] Netherlands eScience Center. National eScience Symposium 2017. Science in a digital world. https://www.esciencecenter.nl/event/nlesc17, 2017.
[6] Eclipse Foundation. Eclipse Agail. https://projects.eclipse.org/projects/iot.agail, 2018.
[7] Jayavardhana Gubbi, Rajkumar Buyya, Slaven Marusic, and Marimuthu Palaniswami. Internet of Things (IoT): A vision, architectural elements, and future directions. Future Generation Computer Systems, 29:1645–1660, 2013.
[8] J. Guth, U. Breitenbücher, M. Falkenthal, F. Leymann, and L. Reinfurt. Comparison of IoT platform architectures: A field study based on a reference architecture. In 2016 Cloudification of the Internet of Things (CIoT), pages 1–6, Nov 2016.
[9] Jasmin Guth, Uwe Breitenbücher, Michael Falkenthal, Paul Fremantle, Oliver Kopp, Frank Leymann, and Lukas Reinfurt. A Detailed Analysis of IoT Platform Architectures: Concepts, Similarities, and Differences, pages 81–101.
Internet of Everything: Algorithms, Methodologies, Technologies and Perspectives. Springer Singapore, Singapore, 2018.
[10] Intel. IoT Rest API server. https://github.com/intel/iot-rest-api-server, 2018.
[11] Internet2. Comanage. https://www.internet2.edu/products-services/trust-identity/comanage/, 2018.
[12] FIWARE Lab. Fiware. Internet of Things (IoT) services enablement architecture, 2018.
[13] Andrea Gonzalez Mallo. Development of an IOT front-end with the FIWARE platform for SmartCity solutions. Retrieved from https://core.ac.uk/download/pdf/79176696.pdf, 2015.
[14] Daniel Moran. Fi-beer. https://github.com/dmoranj/fi-beer, 2018.
[15] openHAB Community. The open home automation bus. https://github.com/openhab, 2018.
[16] Peter Salhofer. Evaluating the FIWARE platform. A case-study on implementing smart application with FIWARE. In Proceedings of the 51st Hawaii International Conference on System Sciences, pages 5797–5805, 2018.
[17] SURF. Science collaboration zone. https://wiki.surfnet.nl/display/SCZ/Science+Collaboration+Zone+Home, 2018.
[18] SURF. SURFconext. https://www.surf.nl/en/services-and-products/surfconext/index.html, 2018.
[19] KaaIoT Technologies. Kaa IoT development platform. https://www.kaaproject.org, 2017.
[20] ThingsBoard, Inc. Thingsboard IoT platform. https://thingsboard.io, 2018.
[21] Christos Tranoris. Open source software solutions implementing a reference IoT architecture from the things and edge to the cloud. Technical report, University of Patras, 2018.
[22] Walking Tree. Arbela. http://walkingtree.github.io/arbela/, 2018.
[23] Ibrar Yaqoob, Ejaz Ahmed, Ibrahim Abaker Targio Hashem, Abdelmuttlib Ibrahim Abdalla Ahmed, Abdullah Gani, Muhammad Imran, and Mohsen Guizani. Internet of Things Architecture:
Files
JCE-v11-I12-010.pdf (2.0 MB)
md5:4e71818c604584bb012fcbd7265cf0db