Published September 20, 2022 | Version v1
Dataset
Open
hive-24hr
Description
This dataset contains 24 hours of logs generated by a Hive stack (Hive, YARN, MapReduce, and HDFS) driven by HiBench workloads that repeatedly create, query, and delete tables. It was first used in the evaluation of "Non-Intrusive Performance Profiling for Entire Software Stacks Based on the Flow Reconstruction Principle."
Files (128.5 MB)
Name | Size |
---|---|
md5:74cba61062fc9366f8c748a191f96b36 | 128.5 MB |
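The MD5 checksum listed for the file can be used to confirm that a download completed without corruption. The following is a minimal sketch, not part of the record itself: the helper names `md5_of_file` and `verify_download` are hypothetical, and the file path is whatever location the archive was saved to, since the record text here does not give the file name.

```python
import hashlib

# Checksum copied from the Files table of this record.
EXPECTED_MD5 = "74cba61062fc9366f8c748a191f96b36"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()
```

Calling `verify_download("path/to/downloaded-file")` with the actual local path (a placeholder here) should return `True` for an intact copy. Reading in chunks keeps memory use flat regardless of file size.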
Additional details
References
- X. Zhao, K. Rodrigues, Y. Luo, D. Yuan, and M. Stumm. Non-intrusive performance profiling for entire software stacks based on the flow reconstruction principle. In 12th USENIX Symposium on Operating Systems Design and Implementation, OSDI '16, pages 603–618. USENIX Association, 2016.
- S. Huang, J. Huang, J. Dai, T. Xie, and B. Huang. The HiBench benchmark suite: Characterization of the MapReduce-based data analysis. In 26th International Conference on Data Engineering Workshops, ICDEW '10, pages 41–51. IEEE Computer Society, 2010.