Published June 22, 2018 | Version 1.0
Dataset Restricted

DYNAMISM - Postprocessed Execution Traces Of Android Malware and Benign Apps

  • 1. ALaRI, Faculty of Informatics, Università della Svizzera italiana
  • 2. Institute for Informatics and Telematics, National Research Council of Italy (CNR)
  • 3. Institute of Telecommunications, TU Wien

Description

Protection against malware is particularly relevant on systems running the Android operating system, due to its huge user base and, therefore, its potential for monetization by attackers.

Dynamic malware detection has been widely adopted by the scientific community but not yet in practical applications.

We release DYNAMISM (Dynamic Analysis of Malware), a dataset containing execution traces of both benign and malicious applications running on Android OS, to facilitate further research and the adoption of dynamic detection in practice. The dataset contains execution traces from 2,386 benign applications and 2,495 malicious applications taken from the Malware Genome Project repository [http://www.malgenomeproject.org] and from the Drebin dataset [https://www.sec.cs.tu-bs.de/~danarp/drebin/].

Execution records were obtained by running the applications, one at a time, on the Android emulator. For each application, a maximum of 2,000 stimuli were applied, with a maximum execution time of 10 minutes. For most applications, all the stimuli could be applied within this time frame. In some traces, neither of the two limits was reached because of emulator hiccups. The collected features relate to memory and CPU usage, network interaction, and system calls, and they are sampled every two seconds.

The Android emulator of the Android Software Development Kit for Android 4.0 (release 20140702) was used. To guarantee that the system was always in a clean state when a new sample was started, and thus to avoid possible interference from previously run samples (e.g., changed settings, running processes, and modifications of the operating system files), the Android operating system was re-initialized before running each application.

The application execution process was automated by means of a shell script that used the Android Debug Bridge (adb) and ran on a Linux PC. The script used the Monkey application exerciser to generate the aforementioned stimuli. Monkey is a command-line tool that can be run on any emulator instance or on a device; it sends a pseudo-random stream of user events (stimuli) into the system, which acts as a stress test on the application software.
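The per-application run described above can be sketched roughly as follows. This is only an illustrative Python sketch, not the shell script used to build the dataset: the function names, the `runner` parameter, and the package and APK names are hypothetical, and emulator re-initialization and feature collection are omitted. The only external commands assumed are `adb install` and `adb shell monkey -p <package> -v <count>`.

```python
import subprocess

MAX_STIMULI = 2000        # upper bound on Monkey events per app (from the description)
MAX_RUNTIME_S = 10 * 60   # upper bound on execution time per app (from the description)
SAMPLING_PERIOD_S = 2     # feature monitoring period (from the description)


def monkey_command(package, events=MAX_STIMULI):
    """Build the adb command that sends pseudo-random events to one package."""
    return ["adb", "shell", "monkey", "-p", package, "-v", str(events)]


def max_samples(runtime_s=MAX_RUNTIME_S, period_s=SAMPLING_PERIOD_S):
    """Upper bound on the number of feature samples in one trace."""
    return runtime_s // period_s


def run_sample(apk_path, package, runner=subprocess.run):
    """Install one APK and exercise it, stopping at the time limit.

    `runner` is a hypothetical hook so the sketch can be tried without adb.
    """
    runner(["adb", "install", apk_path], check=True)
    try:
        runner(monkey_command(package), check=True, timeout=MAX_RUNTIME_S)
    except subprocess.TimeoutExpired:
        pass  # time limit reached before all stimuli were applied
```

Under these limits, a trace sampled every two seconds for at most 10 minutes would hold at most `max_samples()` = 300 samples; traces cut short by emulator problems would hold fewer.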

In this dataset, we provide both per-app CSV files and unified files, in which the CSV files of the single applications have been concatenated. The CSV files contain the features extracted from the raw execution records. The provided files are listed below:

  • benign-per_app-csv.zip - features obtained by executing benign applications, one CSV per application

  • benign-unified-csv.zip - features obtained by executing benign applications, only one CSV file

  • malicious-per_app-csv.zip - features obtained by executing malicious applications, one CSV per application

  • malicious-unified-csv.zip - features obtained by executing malicious applications, only one CSV file
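As a rough illustration of how the per-app archives listed above might be read and combined, the following Python sketch iterates over the CSV members of one archive and tags each row with its source file and a class label. It assumes that every CSV starts with a header row and is UTF-8 encoded; neither is stated in the description. The function name and the `source_file` and `label` fields are hypothetical additions, not columns of the dataset.

```python
import csv
import io
import zipfile


def read_per_app_rows(zip_path, label):
    """Yield one dict per CSV row, tagged with its source file and class label."""
    with zipfile.ZipFile(zip_path) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".csv"):
                continue  # skip directories and any non-CSV members
            with archive.open(name) as raw:
                # Assumption: UTF-8 text with a header row in every file.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                for row in csv.DictReader(text):
                    row["source_file"] = name
                    row["label"] = label
                    yield row
```

For example, `list(read_per_app_rows("benign-per_app-csv.zip", "benign"))` would collect all benign rows in one list, which is similar in spirit to the unified files but keeps track of which application each row came from.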

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Request access

If you would like to request access to these files, please fill out the form below.

You need to satisfy these conditions in order for this request to be accepted:

The dataset can only be used for research purposes.


Additional details

References

  • Milosevic, J., M. Malek, and A. Ferrante, "Runtime Classification of Mobile Malware for Resource-constrained Devices", Lecture Notes in Communications in Computer and Information Science, vol. 764: Springer International Publishing AG, pp. 195-215, 2017.
  • Milosevic, J., A. Ferrante, and M. Malek, "What Does the Memory Say? Towards the most indicative features for efficient malware detection", CCNC 2016, The 13th Annual IEEE Consumer Communications & Networking Conference, Las Vegas, NV, USA, IEEE Communication Society, 01/2016.
  • Milosevic, J., A. Ferrante, and M. Malek, "MalAware: Effective and Efficient Run-time Mobile Malware Detector", The 14th IEEE International Conference on Dependable, Autonomic and Secure Computing (DASC 2016), Auckland, New Zealand, IEEE Computer Society Press, 08/2016.
  • Milosevic, J., M. Malek, and A. Ferrante, "A Friend or a Foe? Detecting Malware Using Memory and CPU Features", SECRYPT 2016, 13th International Conference on Security and Cryptography, Lisbon, Portugal, SciTePress Digital Library, 07/2016.
  • Ferrante, A., F. Mercaldo, J. Milosevic, and C. Aaron Visaggio, "Spotting the Malicious Moment: Characterizing Malware Behavior Using Dynamic Features", 2016 11th International Conference on Availability, Reliability and Security (ARES), Salzburg, Austria, 08/2016.