Project deliverable Open Access

D5.1 Operator Cost Estimation and Workflow Optimisation Technology V1

Project consortium members

DCAT Export

<?xml version='1.0' encoding='utf-8'?>
<rdf:RDF xmlns:rdf="" xmlns:adms="" xmlns:dc="" xmlns:dct="" xmlns:dctype="" xmlns:dcat="" xmlns:duv="" xmlns:foaf="" xmlns:frapo="" xmlns:geo="" xmlns:gsp="" xmlns:locn="" xmlns:org="" xmlns:owl="" xmlns:prov="" xmlns:rdfs="" xmlns:schema="" xmlns:skos="" xmlns:vcard="" xmlns:wdrs="">
  <rdf:Description rdf:about="">
    <rdf:type rdf:resource=""/>
    <dct:type rdf:resource=""/>
    <dct:identifier rdf:datatype=""></dct:identifier>
    <foaf:page rdf:resource=""/>
        <rdf:type rdf:resource=""/>
        <foaf:name>Project consortium members</foaf:name>
    <dct:title>D5.1 Operator Cost Estimation and Workflow Optimisation Technology V1</dct:title>
    <dct:issued rdf:datatype="">2020</dct:issued>
    <frapo:isFundedBy rdf:resource="info:eu-repo/grantAgreement/EC/H2020/825070/"/>
        <dct:identifier rdf:datatype="">10.13039/501100000780</dct:identifier>
        <foaf:name>European Commission</foaf:name>
    <dct:issued rdf:datatype="">2020-04-30</dct:issued>
    <owl:sameAs rdf:resource=""/>
        <skos:notation rdf:datatype=""></skos:notation>
    <dct:isVersionOf rdf:resource=""/>
    <dct:isPartOf rdf:resource=""/>
    <dct:description>&lt;p&gt;Big Data processing workflows typically span a multitude of execution and storage platforms. Parts of the processing could be pushed to the input sensor level, as in the case of the wavegliders in the Maritime use case, while other, more computationally intensive parts or operators (such as stock correlation functions in the Financial use case, or gene simulations in the Life Sciences use case) could be executed either within one or more (potentially distributed) Big Data platforms or within other clusters (e.g., GPUs) of a supercomputer. Even within a single supercomputer (e.g., BSC’s MareNostrum 4) one often finds several available clusters, with different hardware and processing capabilities, each of which could process a given workflow. Hence, the space of potential plans (a.k.a. physical execution plans) for processing a Big Data workflow could be vast. Finding, in a timely fashion, a plan that is both efficient and cost-effective is not trivial.&lt;/p&gt; &lt;p&gt;This deliverable presents techniques for optimizing the execution of extreme-scale analytics workflows with respect to a set of optimization objectives (e.g., throughput, resource utilization) across different, potentially geo-dispersed computer clusters, each hosting one or more Big Data platforms.&lt;/p&gt; &lt;p&gt;WP5 interacts with WP4, since the Optimizer Component is a fundamental component of the overall INFORE architecture. WP5 receives a logical workflow as JSON-formatted input from the Graphical Editor Component of the architecture via the Manager Component. It ingests statistics collected by the Manager Component to perform cost estimations and judge the performance of alternative execution plans; that is, the Optimizer Component transforms the logical workflow into a physical one to be deployed on the available computer clusters and Big Data platforms. Having performed this mapping, it returns the physical workflow to the Manager Component so that it can be visualized in the Graphical Editor Component of the INFORE architecture and deployed to the available computer clusters. Moreover, WP5 interacts with the Synopses Data Engine Component and the Machine Learning and Data Mining Component of WP6, which provide the physical implementations of the respective logical operators drawn in the Graphical Editor Component during code-free workflow specification. Finally, WP5 optimizes the logical workflows that serve the application needs of the Biological (WP1), Financial (WP2) and Maritime (WP3) use cases.&lt;/p&gt;</dct:description>
    <dct:accessRights rdf:resource=""/>
      <dct:RightsStatement rdf:about="info:eu-repo/semantics/openAccess">
        <rdfs:label>Open Access</rdfs:label>
        <dct:license rdf:resource=""/>
        <dcat:accessURL rdf:resource=""/>
        <dcat:downloadURL> Operator Cost Estimation and Workflow Optimization Technology V1.pdf</dcat:downloadURL>
  <foaf:Project rdf:about="info:eu-repo/grantAgreement/EC/H2020/825070/">
    <dct:identifier rdf:datatype="">825070</dct:identifier>
    <dct:title>Interactive Extreme-Scale Analytics and Forecasting</dct:title>
        <dct:identifier rdf:datatype="">10.13039/501100000780</dct:identifier>
        <foaf:name>European Commission</foaf:name>