styloformer-artfilm-scene-classification

Zhaojun, An

doi:10.5281/zenodo.17214437

Published September 27, 2025 | Version v1

Dataset Open

# Styloformer: Automatic Classification of Art Film Scenes

This repository contains the implementation of **Styloformer**, a multimodal transformer framework for **automatic classification of art film scenes** based on **deep image and audio features**.
The project integrates **visual, auditory, textual, and curatorial signals** into a unified representation space, enabling both predictive performance and art-historical interpretability.

---

## ✨ Key Features

- **Multimodal Fusion**
  Cross-modal attention mechanism dynamically aligns visual and auditory features for robust scene understanding.

- **Styloformer Architecture**
  A transformer-based framework integrating:
  - Stylistic clustering
  - Canonicality estimation
  - Influence prediction
  - Historiographic navigation

- **Historiographic Navigation**
  Novel interpretive module embedding ontological priors and temporal logic for reasoning about artistic influence.

- **State-of-the-Art Performance**
  - **MovieNet dataset**: 91.85% accuracy, 94.31% AUC
  - Outperforms baselines like **CLIP**, **ViT**, and **PANDA**
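
The cross-modal attention named under Multimodal Fusion can be illustrated with plain scaled dot-product attention, where visual tokens act as queries and audio tokens act as keys and values. This is only a dependency-free sketch under stated assumptions: the function names (`softmax`, `cross_attention`), the toy feature vectors, and the absence of learned projection matrices are all illustrative choices, not details taken from the Styloformer implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query row attends over all key rows.

    Hypothetical helper for illustration; a real model would first apply
    learned linear projections to queries, keys, and values.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    weights = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights.append(softmax(scores))
    # Each output row is the attention-weighted average of the value rows.
    attended = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(len(values[0]))]
        for row in weights
    ]
    return attended, weights


# Toy example: 3 visual tokens attend over 2 audio tokens (assumed 2-d features).
visual_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
audio_tokens = [[1.0, 0.0], [0.0, 1.0]]
audio_aligned, attn = cross_attention(visual_tokens, audio_tokens, audio_tokens)
```

Each row of `attn` sums to 1, so every visual token receives a convex combination of the audio features. In a full model the attended audio features would typically be combined with the visual stream before classification.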

---

## 📂 Datasets

Experiments were conducted on several benchmarks:

- **MovieNet** – narrative and stylistic structure in cinema
- **Hollywood2** – action and scene classification
- **MovieGraphs** – graph-based social interaction semantics
- **TACoS** – fine-grained visual-text alignment
- **CineArtSet (new)** – curated art film dataset (1,920 clips, 54 films, 9,458 labeled scenes)

---

## ⚙️ Installation

```bash
# Clone this repo
git clone https://github.com/<your-username>/styloformer.git
cd styloformer

# Create environment
conda create -n styloformer python=3.9
conda activate styloformer

# Install dependencies
pip install -r requirements.txt
```

## 📁 Files

| Name | Size | MD5 |
|---|---|---|
| `data.py` | 18.4 kB | `a3665d55ac0389d246c1cc2dfa303188` |
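
The MD5 checksum listed for `data.py` can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper name `md5_of` and the assumption that the file was saved as `data.py` in the current directory are illustrative.

```python
import hashlib
from pathlib import Path

# MD5 checksum published for data.py in the file listing above.
EXPECTED_MD5 = "a3665d55ac0389d246c1cc2dfa303188"


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    target = Path("data.py")  # assumed download location
    if not target.exists():
        print(f"{target} not found; download it first")
    elif md5_of(target) == EXPECTED_MD5:
        print("checksum OK")
    else:
        print("checksum mismatch; consider re-downloading the file")
```

MD5 is suitable here only as an integrity check against accidental corruption, which is how the record uses it.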
