Published August 1, 2025 | Version 1.1
Dataset Open

SCIMD-6: Source Camera Identification — Mobile Devices Dataset

Description

# 📷 SCIMD-6: Source Camera Identification — Mobile Devices Dataset

## 📂 Overview

**SCIMD-6** is a carefully curated image dataset developed at **Bapatla Engineering College** to support research in **source camera identification** using images from **mobile devices**. The dataset contains **6315 RGB images**, acquired from **six different smartphones** under **diverse real-world conditions**.

## 📱 Devices Used

| Mobile Device | Number of Images |
|---|---|
| Moto G64 5G | 1006 |
| Moto G85 5G | 1037 |
| Nothing A001 | 1036 |
| Realme 8 Pro | 1001 |
| Redmi 14C 5G | 1014 |
| Xiaomi M2101K6P | 1221 |
| Total | 6315 |

 

📌 *Note*: Class sizes are slightly imbalanced. Five devices contribute between 1001 and 1037 images each, while the Xiaomi M2101K6P contributes 1221.
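The per-device counts listed above can be turned into a quick imbalance check. The sketch below is only an illustration: the inverse-frequency class weights are one common way to compensate for imbalance during training, not something the dataset authors prescribe.

```python
# Image counts per device, copied from the counts listed above.
counts = {
    "Moto G64 5G": 1006,
    "Moto G85 5G": 1037,
    "Nothing A001": 1036,
    "Realme 8 Pro": 1001,
    "Redmi 14C 5G": 1014,
    "Xiaomi M2101K6P": 1221,
}

total = sum(counts.values())

# Ratio of the largest class to the smallest; 1.0 would be perfectly balanced.
imbalance_ratio = max(counts.values()) / min(counts.values())

# Inverse-frequency weights, normalised so a balanced dataset gives 1.0 per class.
class_weights = {name: total / (len(counts) * n) for name, n in counts.items()}

for name, n in counts.items():
    print(f"{name:<16} {n:>5}  {n / total:6.1%}  weight={class_weights[name]:.3f}")
print(f"total={total}  max/min ratio={imbalance_ratio:.2f}")
```

With these counts the largest class is roughly 1.2 times the size of the smallest, so weighting is optional rather than essential.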

## 🌄 Image Characteristics

- 📐 **Resolution**: All images are resized to **224×224** pixels for compatibility with CNN architectures.

- 🌤️ **Conditions**: Captured in a variety of **uncontrolled environments**, including:

  - Indoor and outdoor

  - Sunny and rainy weather

  - Casual perspectives and variable lighting

- 🤳 **Capture Style**: Framing was deliberately left casual and unconstrained, which adds **real-world complexity** for testing model robustness.

## 📑 Included Files

- 📁 A zip archive containing the folders `Motog64_5G/`, `Motog85_5G/`, ..., `Xiaomi_M2101K6P/`, one per mobile device, each holding that device's 224×224 RGB images.

- 📄 `merged_common.csv`: A metadata file containing **EXIF information** (Exchangeable Image File Format) extracted from all images (e.g., Make, Model, ExposureTime, FocalLength).
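A minimal sketch of how the extracted archive might be indexed, assuming one sub-folder per device and using the folder name as the class label. The image extensions and the extraction path are assumptions, since the description does not state them; only the standard library is used.

```python
import csv
from pathlib import Path

# Assumed extensions; adjust to whatever the archive actually contains.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def index_images(root):
    """Return (path, device_label) pairs, one per image under root/<device>/."""
    samples = []
    for device_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for image_path in sorted(device_dir.iterdir()):
            if image_path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((image_path, device_dir.name))
    return samples


def load_metadata(csv_path):
    """Read the metadata CSV into a list of dicts keyed by column name."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

For example, `index_images("path/to/extracted/archive")` would return one entry per image, and `load_metadata("merged_common.csv")` would return one dict per CSV row with keys such as `Make` and `Model`.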

## 🎯 Intended Use

This dataset is intended for tasks such as:

- 📸 **Source Camera Identification (SCI)**

- 🔬 **Image Forensics and Provenance Analysis**

- 🤖 **Fine-grained Classification and Transfer Learning**

- 🧠 **Deep Learning Model Benchmarking in Forensic Settings**

## 🧪 Benchmark Baseline

We provide a baseline experiment using **ResNet-50**, which achieves an initial test accuracy of **80%** on this dataset. This leaves clear headroom for improvement and suggests the dataset is both **challenging and discriminative**, even though images from different devices look similar.
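To put the 80% figure in context, it helps to compare it with trivial reference points. The sketch below computes uniform-chance and majority-class accuracy from the published counts and includes a simple stratified split helper. The 20% test fraction and the seed are assumptions for illustration; the description does not state how the baseline split was made.

```python
import random
from collections import defaultdict

# Image counts per device, as published in the dataset description.
counts = {
    "Moto G64 5G": 1006,
    "Moto G85 5G": 1037,
    "Nothing A001": 1036,
    "Realme 8 Pro": 1001,
    "Redmi 14C 5G": 1014,
    "Xiaomi M2101K6P": 1221,
}
total = sum(counts.values())

uniform_chance = 1 / len(counts)                  # random guessing over six classes
majority_baseline = max(counts.values()) / total  # always predict the largest class
reported_accuracy = 0.80                          # ResNet-50 figure quoted above


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices into train/test lists, preserving class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = round(len(indices) * test_fraction)
        test.extend(indices[:cut])
        train.extend(indices[cut:])
    return train, test


print(f"uniform chance:    {uniform_chance:.1%}")
print(f"majority baseline: {majority_baseline:.1%}")
print(f"reported ResNet-50: {reported_accuracy:.0%}")
```

Both trivial baselines sit below 20%, so the reported accuracy is far above chance while still leaving room for stronger methods.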

## 📚 Potential Applications of the Dataset

This dataset, although primarily designed for source camera identification using mobile device images, supports a wide range of research directions and practical applications:

1. Source Camera Identification (SCI)
   - Classification of images based on the originating mobile device using intrinsic sensor characteristics.
   - Enables research in PRNU-based techniques and camera model/device fingerprinting.

2. Image Forensics and Metadata Consistency Analysis
   - Verification of metadata integrity using image content.
   - Detection of inconsistencies in EXIF fields such as shutter speed, ISO, focal length, and timestamp.
   - Applicable in detecting tampered or manipulated media.

3. Shutter Speed and ISO Estimation (Regression Tasks)
   - Pixel-to-metadata learning: predicting EXIF fields like ISO speed rating or exposure time directly from the image content.
   - Useful for modeling camera behavior and building metadata synthesis pipelines.

4. Image Quality Assessment (IQA) and Denoising
   - Training and benchmarking denoising models under real-world noise conditions (e.g., high ISO settings).
   - Correlation of EXIF parameters with perceptual quality for no-reference IQA research.

5. Environmental and Scene Classification
   - Scene-type inference (indoor/outdoor, sunny/cloudy, low-light conditions) based on visual content and EXIF cues.
   - Aids in tasks like environmental awareness, adaptive imaging, or low-light enhancement.

6. Image Provenance and Authorship Verification
   - Attribution of images to devices for media forensics and misinformation detection.
   - Combines device classification with temporal and spatial metadata for provenance tracing.

7. Training and Evaluation of Robust Vision Models
   - Offers real-world diversity in lighting, context, and device pipeline characteristics.
   - Supports robustness evaluation of CNNs, Vision Transformers, and vision-language models in uncontrolled environments.
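Two of the directions above, metadata consistency analysis and exposure-time regression, start from the EXIF fields in `merged_common.csv`. The sketch below shows one possible preprocessing step. `Model` and `ExposureTime` are column names mentioned in the file description; the `Device` column and the expected-model mapping are hypothetical, since the description does not say how rows are linked to device folders.

```python
import math
from fractions import Fraction


def parse_exposure_time(value):
    """Convert an EXIF ExposureTime string such as '1/50' or '0.02' to seconds."""
    text = str(value).strip()
    if not text:
        return None
    try:
        return float(Fraction(text))
    except (ValueError, ZeroDivisionError):
        return None  # unparsable or zero-denominator entries are treated as missing


def exposure_regression_target(value):
    """Log2 of the exposure time in seconds, or None if the value is unusable."""
    seconds = parse_exposure_time(value)
    if seconds is None or seconds <= 0:
        return None
    return math.log2(seconds)


def find_model_mismatches(rows, expected_model_by_device, device_column="Device"):
    """Return rows whose EXIF Model differs from the model expected for the device.

    `device_column` and `expected_model_by_device` are hypothetical: supply
    whatever links a CSV row to its device folder in the actual file.
    """
    mismatches = []
    for row in rows:
        expected = expected_model_by_device.get(row.get(device_column))
        actual = (row.get("Model") or "").strip()
        if expected is not None and actual.lower() != expected.lower():
            mismatches.append(row)
    return mismatches
```

Working in log2 units means a one-unit error corresponds to one photographic stop regardless of the absolute exposure, which is one reasonable choice of regression target among several.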

 

The SCIMD-6 dataset is publicly available on multiple trusted platforms for broad accessibility and reproducibility.

 

## 📌 Citation

If you use this dataset in your research, please cite as:

@dataset{chandramohan2025scimd6,
  author       = {B. Chandra Mohan and Ch. Pavan Kumar and K. Sri Harsha and Ch. Nagaraju and Sandhyana T and Suvarna Lakshmi M},
  title        = {SCIMD-6: Source Camera Identification Mobile Devices Dataset},
  year         = {2025},
  publisher    = {Zenodo},
  url          = {https://your-dataset-link-here},
  note         = {A benchmark dataset for source mobile camera identification with diversified conditions and EXIF metadata.}
}

---

 

## 📬 Contact

 

For inquiries or academic collaborations:

**Dr. Chandra Mohan Bhuma** 

Department of Electronics & Communication Engineering 

Bapatla Engineering College 

✉️ chandrabhuma@gmail.com

## 🔒 License

This dataset is released under the **Creative Commons Attribution 4.0 International (CC BY 4.0)** license.

 

 

Files

BECSCIMD-6.zip

Files (66.1 MB)

| Checksum | Size |
|---|---|
| md5:c80da494489b29c4899e800be2a3663f | 64.5 MB |
| md5:bdd272b98e10fe9322129d8b87b1c313 | 1.6 MB |
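After downloading, the MD5 checksums listed above can be used to confirm that the files arrived intact. This is a minimal sketch using only the standard library. Because the listing does not pair each checksum with a file name, the check below only tests whether a file's digest matches either published value.

```python
import hashlib

# MD5 digests copied from the file listing above.
PUBLISHED_MD5 = {
    "c80da494489b29c4899e800be2a3663f",
    "bdd272b98e10fe9322129d8b87b1c313",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path):
    """True if the file's MD5 equals one of the published digests."""
    return md5_of_file(path) in PUBLISHED_MD5
```

For example, `matches_published_checksum("BECSCIMD-6.zip")` should return `True` for an uncorrupted download of the archive.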

Additional details

Dates

Created
2025