hive-24hr

Xu Zhao; Kirk Rodrigues; Yu Luo; Ding Yuan; Michael Stumm

doi:10.5281/zenodo.7094921

Published September 20, 2022 | Version v1

Dataset Open

1. University of Toronto

24 hours of logs generated by a Hive stack (Hive, YARN, MapReduce, and HDFS), driven by HiBench workloads that repeatedly create, query, and delete tables. This dataset was first used in the evaluation of "Non-Intrusive Performance Profiling for Entire Software Stacks Based on the Flow Reconstruction Principle" (OSDI '16).

Files (128.5 MB)

Name	MD5	Size
hive-24hr.tar.gz	74cba61062fc9366f8c748a191f96b36	128.5 MB
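
The MD5 checksum listed for hive-24hr.tar.gz can be used to confirm that a download completed intact. The following is a minimal sketch, not part of the dataset itself: the helper names `md5_of_file` and `verify_download` are hypothetical, and the archive is assumed to have been saved as `hive-24hr.tar.gz` in the current directory.

```python
import hashlib
from pathlib import Path

# Checksum published on the dataset page for hive-24hr.tar.gz.
EXPECTED_MD5 = "74cba61062fc9366f8c748a191f96b36"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the expected checksum."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local download location (hypothetical; adjust as needed).
    archive = Path("hive-24hr.tar.gz")
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

Reading in chunks keeps memory use flat regardless of archive size. MD5 here only guards against transfer corruption; it is not a defense against tampering.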

Additional details

X. Zhao, K. Rodrigues, Y. Luo, D. Yuan, and M. Stumm. Non-intrusive performance profiling for entire software stacks based on the flow reconstruction principle. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, OSDI '16, pages 603–618, 2016.
S. Huang, J. Huang, J. Dai, T. Xie, and B. Huang. The HiBench benchmark suite: Characterization of the MapReduce-based data analysis. In 26th International Conference on Data Engineering Workshops, ICDEW '10, pages 41–51. IEEE Computer Society, 2010.

	All versions	This version
Views	782	779
Downloads	280	279
Data volume	39.2 GB	39.1 GB

DOI: 10.5281/zenodo.7094921
Resource type: Dataset
Publisher: Zenodo
Languages: English

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: September 20, 2022
Modified: September 20, 2022
