Published February 6, 2026 | Version v1
Dataset Open

In-House Subset of MFID for AI-Generated Face Detection

  • 1. Southwestern University of Finance and Economics, Tianfu College, Mianyang, China

Description

This repository provides the In-House Subset of MFID (Multidimensional Facial Image Dataset), a curated dataset for research on AI-generated face detection. It contains authentic human face images and synthetic face images produced by several generative models, including the diffusion-based Stable Diffusion 2/3 and Flux 1 as well as other generation pipelines.

The dataset is constructed to support robust and generalizable deepfake detection by covering diverse facial attributes (e.g., age, gender, pose) and multiple synthesis styles. It is intended as a benchmark for evaluating detection models under cross-model generalization, that is, when a detector is tested on images from generators it was not trained on.

This dataset accompanies the paper:

AI-Generated Face Detection Using Multi-Feature Fusion

The proposed approach integrates multi-domain forensic features (spatial RGB signals, statistical texture descriptors, frequency spectrum anomalies, and structural edge cues) to detect subtle artifacts introduced by generative models. Experimental results in the paper demonstrate strong detection performance and generalization ability across heterogeneous generative architectures.
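As a rough illustration of what fusing features from several domains can look like, the sketch below computes one toy descriptor for three of the domains named above (texture statistics, frequency spectrum, edge cues) on a small grayscale grid and concatenates them into a single vector. It is not the paper's implementation: the function names, the choice of descriptors, and the low-frequency cutoff are all assumptions made for this example, and the spatial RGB branch is omitted.

```python
import cmath
import math

# Hypothetical toy descriptors; `img` is a 2-D list of grayscale values.

def texture_stats(img):
    """Mean and variance of pixel intensities (a simple statistical descriptor)."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return [mean, var]

def edge_strength(img):
    """Mean of absolute horizontal plus vertical finite differences."""
    h, w = len(img), len(img[0])
    total, count = 0.0, 0
    for y in range(h - 1):
        for x in range(w - 1):
            total += abs(img[y][x + 1] - img[y][x]) + abs(img[y + 1][x] - img[y][x])
            count += 1
    return [total / count]

def high_freq_ratio(img):
    """Share of spectral energy outside the lowest frequencies (naive 2-D DFT).

    The naive DFT is only practical for tiny grids; it keeps the sketch
    dependency-free.
    """
    h, w = len(img), len(img[0])
    low, total = 0.0, 0.0
    for u in range(h):
        for v in range(w):
            coeff = sum(
                img[y][x] * cmath.exp(-2j * math.pi * (u * y / h + v * x / w))
                for y in range(h)
                for x in range(w)
            )
            energy = abs(coeff) ** 2
            total += energy
            # Frequencies wrap around, so the distance to zero is min(u, h - u).
            if min(u, h - u) <= 1 and min(v, w - v) <= 1:
                low += energy
    return [1.0 - low / total] if total else [0.0]

def fused_features(img):
    """Concatenate the per-domain descriptors into one feature vector."""
    return texture_stats(img) + edge_strength(img) + high_freq_ratio(img)
```

Concatenation is the simplest possible fusion strategy; a real detector would typically learn how to weight and combine the per-domain features.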

The dataset is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. Users are free to share and adapt the dataset for any purpose, including research, provided they give appropriate attribution.

File format:

  • Image files organized by category (real vs. AI-generated)

  • labels.csv provides binary labels for training and evaluation
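A minimal sketch of reading the label file after extracting the archive is given below. The column names `filename` and `label` are assumptions, since the record does not document the CSV header, and which label value denotes real versus AI-generated is likewise unspecified; check the first line of labels.csv and adjust the arguments accordingly.

```python
import csv
from pathlib import Path

def load_labels(csv_path, path_col="filename", label_col="label"):
    """Return a list of (image_path, label) pairs from a labels CSV.

    `path_col` and `label_col` are assumed column names, not documented ones.
    """
    csv_path = Path(csv_path)
    pairs = []
    with csv_path.open(newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # Resolve image paths relative to the folder that holds the CSV.
            pairs.append((csv_path.parent / row[path_col], int(row[label_col])))
    return pairs

def class_counts(pairs):
    """Count how many images carry each label value."""
    counts = {}
    for _, label in pairs:
        counts[label] = counts.get(label, 0) + 1
    return counts
```

Calling `class_counts(load_labels("labels.csv"))` gives a quick check of class balance before training or evaluation.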

If you use this dataset, please cite our paper.

Files

MFID-InHouse.zip (226.0 MB)
md5:a6e0881eed1c5017044cc58af37eb700
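The MD5 checksum listed for the archive can be used to confirm that a download is complete and uncorrupted. The sketch below uses Python's standard `hashlib`; the helper names and the local file path are illustrative, and only the expected digest comes from this record.

```python
import hashlib

# Digest published on this record for MFID-InHouse.zip.
EXPECTED_MD5 = "a6e0881eed1c5017044cc58af37eb700"

def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_archive("MFID-InHouse.zip")` should return `True` for an intact download, assuming the archive sits in the current directory.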

Additional details

Related works

Is supplemented by
Software: https://github.com/TimeLabHub/AuthentiVision