Published September 27, 2025 | Version v1
Dataset Open

styloformer-artfilm-scene-classification

Description

# Styloformer: Automatic Classification of Art Film Scenes

This repository contains the implementation of **Styloformer**, a multimodal transformer framework for **automatic classification of art film scenes** based on **deep image and audio features**.  
The project integrates **visual, auditory, textual, and curatorial signals** into a unified representation space, supporting both predictive performance and art-historical interpretability.

---

## ✨ Key Features

- **Multimodal Fusion**  
  Cross-modal attention mechanism dynamically aligns visual and auditory features for robust scene understanding.

- **Styloformer Architecture**  
  A transformer-based framework integrating:
  - Stylistic clustering  
  - Canonicality estimation  
  - Influence prediction  
  - Historiographic navigation

- **Historiographic Navigation**  
  Novel interpretive module embedding ontological priors and temporal logic for reasoning about artistic influence.

- **State-of-the-Art Performance**  
  - **MovieNet dataset**: 91.85% accuracy, 94.31% AUC  
  - Outperforms baselines such as **CLIP**, **ViT**, and **PANDA**
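
To make the cross-modal attention idea under **Multimodal Fusion** more concrete, here is a minimal, dependency-free sketch of scaled dot-product attention in which visual tokens act as queries and audio tokens act as keys and values. It is an illustration under stated assumptions, not the Styloformer implementation: the function names, the toy feature vectors, the residual connection, and the omission of learned projection matrices are all hypothetical simplifications.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(visual, audio):
    """Let each visual token attend over all audio tokens.

    visual: list of query vectors (hypothetical per-frame features)
    audio:  list of key/value vectors (hypothetical per-window features)
    Returns (fused, weights). Learned Q/K/V projections are omitted,
    so both modalities must already share the same dimensionality.
    """
    dim = len(visual[0])
    scale = math.sqrt(dim)
    fused, weights = [], []
    for query in visual:
        # Similarity between this visual token and every audio token
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in audio]
        attn = softmax(scores)
        # Attention-weighted average of the audio vectors
        context = [sum(w * value[d] for w, value in zip(attn, audio)) for d in range(dim)]
        # Residual connection (an assumption): keep the visual signal, add audio context
        fused.append([q + c for q, c in zip(query, context)])
        weights.append(attn)
    return fused, weights


# Toy inputs: two visual tokens, three audio tokens, feature size 2
visual_tokens = [[1.0, 0.0], [0.0, 1.0]]
audio_tokens = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]

fused, weights = cross_modal_attention(visual_tokens, audio_tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1 and shows how strongly one visual token draws on each audio token. A practical model would typically add learned projections, multiple heads, and batching through a deep learning framework.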

---

## 📂 Datasets

Experiments were conducted on several benchmarks:

- **MovieNet** – narrative and stylistic structure in cinema  
- **Hollywood2** – action and scene classification  
- **MovieGraphs** – graph-based social interaction semantics  
- **TACoS** – fine-grained visual-text alignment  
- **CineArtSet (new)** – curated art film dataset (1,920 clips, 54 films, 9,458 labeled scenes)

---

## ⚙️ Installation

```bash
# Clone this repo
git clone https://github.com/<your-username>/styloformer.git
cd styloformer

# Create environment
conda create -n styloformer python=3.9
conda activate styloformer

# Install dependencies
pip install -r requirements.txt
```
