EnCoD: Distinguishing Compressed and Encrypted File Fragments

De Gaspari, Fabio; Hitaj, Dorjan; Pagnotta, Giulio; De Carli, Lorenzo; Mancini, Luigi V.

doi:10.1007/978-3-030-65745-1_3

Published November 28, 2020 | Version 1.0

Conference paper | Open access

1. Sapienza University of Rome
2. Worcester Polytechnic Institute

Reliable identification of encrypted file fragments is a requirement for several security applications, including ransomware detection, digital forensics, and traffic analysis. A popular approach uses high measured entropy as a proxy for randomness, and hence for encryption. However, many modern content types (e.g., office documents and media files) are highly compressed for storage and transmission efficiency. Compression algorithms also output high-entropy data, which reduces the accuracy of entropy-based encryption detectors.
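The entropy problem described above can be illustrated with a minimal sketch. It is not the method evaluated in the paper: the function name shannon_entropy, the synthetic input, the use of zlib as the compressor, and the use of os.urandom as a stand-in for ciphertext are all assumptions made for illustration. It computes the empirical Shannon entropy (bits per byte) of three buffers.

```python
import math
import os
import zlib
from collections import Counter


def shannon_entropy(fragment: bytes) -> float:
    """Empirical Shannon entropy of a byte string, in bits per byte (0 to 8)."""
    if not fragment:
        return 0.0
    total = len(fragment)
    counts = Counter(fragment)  # frequency of each byte value
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical low-entropy input: repetitive, text-like records.
plain = b"".join(b"record %d: value=%d\n" % (i, i * i % 9973) for i in range(20000))

# Compressed version of the same data (zlib chosen only for illustration).
compressed = zlib.compress(plain, 9)

# Assumption: uniformly random bytes approximate the output of a strong cipher.
random_like = os.urandom(len(compressed))

for name, data in [("plain", plain), ("compressed", compressed), ("random", random_like)]:
    print(f"{name:>10}: {shannon_entropy(data):.3f} bits/byte over {len(data)} bytes")
```

Under these assumptions, the plain input scores well below 8 bits per byte, while the compressed and random buffers both score close to the maximum. A fixed entropy threshold therefore separates plain text from the other two, but not the compressed buffer from the random one.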

Over the years, a variety of approaches have been proposed to distinguish encrypted file fragments from high-entropy compressed fragments. However, these approaches are typically evaluated over only a few selected data types and fragment sizes, which makes a fair assessment of their practical applicability impossible. This paper aims to close this gap by comparing existing statistical tests on a large, standardized dataset. Our results show that current approaches cannot reliably distinguish encryption from compression, even for large fragment sizes. To address this issue, we design EnCoD, a learning-based classifier that can reliably distinguish compressed and encrypted data, starting with fragments as small as 512 bytes. We evaluate EnCoD against current approaches over a large dataset of different data types, showing that it outperforms the current state of the art for most considered fragment sizes and data types.

Files

Name	Size	Checksum
Gaspari2020_Chapter_EnCoDDistinguishingCompressedA.pdf	675.6 kB	md5:0ed3d62036b738093bb0fd8755d87354

Additional details

European Commission
GEN4OLIVE - Mobilization of Olive GenRes through pre-breeding activities to face the future challenges and development of an intelligent interface to ensure a friendly information availability for end users 101000427

