Published August 28, 2020 | Version v1
Project deliverable Open

BigDataStack - D3.1 WP 3 Scientific Report and Prototype Description - Y1


This deliverable presents the Scientific Report and Prototype Description for the work carried out in the first year of the BigDataStack project on the Data-Driven Infrastructure Management capability of the BigDataStack platform. It shows how the solution is planned to be delivered through an incremental, iterative methodology with alternating cycles of implementation and experimentation. The document describes:

  1. the high-level assumptions and architecture of the capability, as well as detailed requirements, design and prototypes per component;
  2. the experimental use case scenarios and plans, together with the experimental plan for each component and its mapping to the use case scenarios.



Additional details


BigDataStack – High-performance data-centric stack for big data applications and operations (grant 779747)
European Commission

