There is a newer version of this record available.

Project deliverable Open Access

BigDataGrapes D4.1 - Methods and Tools for Scalable Distributed Processing

Tonellotto, Nicola; Nardini, Franco Maria; Perego, Raffaele; Monteiro de Lira, Vinicius; Mele, Ida; Catena, Matteo; Muntean, Cristina

MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="">
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">grapevine-related assets; Docker container; precision agriculture; geospatial raster data</subfield>
  </datafield>
  <controlfield tag="005">20200120173928.0</controlfield>
  <controlfield tag="001">1481819</controlfield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">11129912</subfield>
    <subfield code="z">md5:073df6efddf5aa1d2c6d1062cd835ad9</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2018-09-28</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="p">user-bigdatagrapes</subfield>
    <subfield code="o"></subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">CNR</subfield>
    <subfield code="a">Tonellotto, Nicola; Nardini, Franco Maria; Perego, Raffaele; Monteiro de Lira, Vinicius; Mele, Ida; Catena, Matteo; Muntean, Cristina</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">BigDataGrapes D4.1 - Methods and Tools for Scalable Distributed Processing</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-bigdatagrapes</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">780751</subfield>
    <subfield code="a">Big Data to Enable Global Disruption of the Grapevine-powered Industries</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u"></subfield>
    <subfield code="a">Creative Commons Attribution Non Commercial 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2"></subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;This accompanying document for deliverable D4.1, Methods and Tools for Scalable Distributed Processing, describes the main mechanisms and tools used in the BigDataGrapes (BDG) platform to support efficient processing of large datasets in the context of grapevine-related assets. The BDG software stack provides efficient and fault-tolerant tools for distributed processing, with the aim of making applications scalable and reliable.&lt;/p&gt;

&lt;p&gt;The document first introduces the overall architecture of the BDG platform and the main technologies currently used in its Persistence and Processing Layers to process extremely large datasets efficiently.&lt;br&gt;
It then introduces and discusses the requirements for running the BigDataGrapes platform, and provides instructions for setting up and launching it. The platform has been built by reusing and customizing the software stack of Big Data Europe (BDE). Besides customizing some existing components, the BigDataGrapes software stack extends the BDE to better support efficient processing and distributed predictive analytics of geospatial raster data in the context of precision agriculture and Farm Management Systems. Furthermore, all the platform components have been designed and built using Docker containers. They thus include everything needed to deploy the BDG platform with guaranteed behavior on any suitable system that can run a Docker engine.&lt;/p&gt;

&lt;p&gt;Finally, to give the reader practical examples of how the current release of the BDG platform is used, we report on two demos already developed on top of it by the project&amp;rsquo;s partner. Specifically, the two demonstrators perform scalable operations on geospatial raster data using the Spark-based GeoTrellis geographic data processing engine provided by the BDG platform. The first demo concerns the tiling of large raster satellite images. Tiling is a mandatory step that splits large raster datasets into manageable pieces that can be processed on parallel and distributed resources. In the second demonstrator, the previously computed tiles are processed to extract two relevant indexes from each tile image. The first index is the Normalized Difference Vegetation Index (NDVI), a graphical indicator that assesses to what degree the observed target contains live green vegetation. The second is the Normalized Difference Water Index (NDWI), which is most appropriate for water body mapping.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.1481818</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.1481819</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">deliverable</subfield>
  </datafield>
</record>
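
The abstract in the record describes two demonstrators: tiling a large raster, then computing NDVI and NDWI for each tile. The following is a minimal, illustrative sketch of those two steps in plain Python. It is not the deliverable's code, which runs on GeoTrellis over Spark. The function names, the list-of-lists raster layout, and the band choices are assumptions made for illustration: NDVI is taken as (NIR - Red) / (NIR + Red), and NDWI as the common water-body formulation (Green - NIR) / (Green + NIR). The record itself does not state which bands are used.

```python
def split_into_tiles(band, tile_size):
    """Split a 2-D raster (list of rows) into tile_size x tile_size blocks.

    Returns a dict keyed by (tile_row, tile_col). Tiles on the right and
    bottom edges may be smaller when the raster size is not a multiple
    of tile_size.
    """
    rows, cols = len(band), len(band[0])
    tiles = {}
    for r in range(0, rows, tile_size):
        for c in range(0, cols, tile_size):
            tiles[(r // tile_size, c // tile_size)] = [
                row[c:c + tile_size] for row in band[r:r + tile_size]
            ]
    return tiles


def normalized_difference(a, b):
    """Per-pixel (a - b) / (a + b); None where the denominator is zero."""
    return [
        [(x - y) / (x + y) if (x + y) != 0 else None for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


def ndvi(nir, red):
    """Vegetation index, assuming near-infrared and red bands."""
    return normalized_difference(nir, red)


def ndwi(green, nir):
    """Water index, assuming green and near-infrared bands."""
    return normalized_difference(green, nir)


def indexes_per_tile(green, red, nir, tile_size):
    """Tile the three bands identically, then compute both indexes per tile."""
    g_tiles = split_into_tiles(green, tile_size)
    r_tiles = split_into_tiles(red, tile_size)
    n_tiles = split_into_tiles(nir, tile_size)
    return {
        key: {"ndvi": ndvi(n_tiles[key], r_tiles[key]),
              "ndwi": ndwi(g_tiles[key], n_tiles[key])}
        for key in n_tiles
    }
```

Because every tile is processed independently, the per-tile step is the part that a distributed engine can spread across workers. In this sketch the tiles are simply processed in a single dictionary comprehension.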
                  All versions   This version
Views             155            50
Downloads         194            39
Data volume       623.5 MB       434.1 MB
Unique views      127            43
Unique downloads  175            37

