There is a newer version of the record available.

Published September 23, 2023 | Version v3
Dataset Open

AzSLD - Azerbaijani Sign Language Dataset

  • 1. French-Azerbaijani University
  • 2. ADA University

Contributors

Data collector:

Description

The Azerbaijani Sign Language Dataset (AzSLD) is a multi-modal dataset for developing and evaluating machine learning models that recognize and translate Azerbaijani Sign Language (AzSL).

AzSLD is the first publicly available dataset focused on Azerbaijani Sign Language. It contributes to the global effort to improve accessibility for the deaf and hard-of-hearing community in Azerbaijan. The dataset aims to bridge the gap between technology and accessibility by providing high-quality data for researchers, developers, and practitioners working on sign language recognition or translation systems.

Dataset Composition

AzSLD is organized into three primary components:

1. AzSLD_Sentences

This component contains video sequences of complete sentences in AzSL. It is designed to capture the fluidity and contextual nature of sign language, providing data for more complex language modeling tasks. It includes over 60 hours of high-definition video recordings, annotated with timestamped glosses for 500 distinct classes, enabling precise analysis and robust model training.

2. AzSLD_Words

This component comprises a collection of 7,230 video samples representing 100 commonly used words in AzSL. 

3. AzSLD_Fingerspelling

This component includes over 14,000 video and image samples of letters of the Azerbaijani alphabet. Each sign is captured from multiple angles to ensure comprehensive coverage of dactylology in AzSL. This component is ideal for tasks involving letter recognition and the integration of fingerspelling into broader sign language recognition systems.

Key Features

Double-View Recordings

The dataset includes 10,104 synchronized video recordings from two camera angles to capture both frontal and side views of hand and body movements, ensuring that the subtle nuances of sign language are well-represented.

Diverse Signers

The dataset features recordings from a diverse group of native AzSL signers, encompassing variations in age, gender, and signing style. This diversity is crucial for training models that are robust to variations in signing.
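One common way to measure robustness to signer variation is a signer-independent split, in which no signer contributes to both the training and the test set. The sketch below is a minimal illustration and not part of the AzSLD tooling. It assumes each sample can be reduced to a (sample_id, signer_id) pair; the function name `split_by_signer` and its parameters are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_by_signer(
    samples: Sequence[Tuple[str, str]],
    test_fraction: float = 0.2,
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Split sample ids so that no signer appears in both subsets.

    `samples` holds (sample_id, signer_id) pairs. At least one signer
    is always held out for testing.
    """
    signers = sorted({signer for _, signer in samples})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(signers)
    n_test = max(1, round(len(signers) * test_fraction))
    test_signers = set(signers[:n_test])
    train = [sid for sid, signer in samples if signer not in test_signers]
    test = [sid for sid, signer in samples if signer in test_signers]
    return train, test


# Made-up sample ids and signer ids, for illustration only.
demo_samples = [(f"clip_{i}_{j}", f"signer_{i}") for i in range(5) for j in range(2)]
demo_train, demo_test = split_by_signer(demo_samples, test_fraction=0.2)
print(len(demo_train), len(demo_test))
```

Splitting on signer IDs rather than on individual clips avoids the case where a model is tested on a person it has already seen during training.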

Detailed Annotations

Each video is annotated with comprehensive metadata, including the sign’s label (dactyl, word, or sentence), signer ID, and timestamped glosses for sentence-level signs. 
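As a rough illustration of how such metadata might be represented in code, the sketch below parses one annotation record into typed objects. The JSON key names (`label`, `category`, `signer_id`, `glosses`, `text`, `start`, `end`) and the example record are assumptions made for this sketch; the actual schema should be checked against the JSON files in the archive.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class Gloss:
    """One timestamped gloss inside a sentence-level video."""
    text: str
    start_sec: float
    end_sec: float


@dataclass
class Annotation:
    """Metadata for a single video (all field names are hypothetical)."""
    label: str
    category: str  # assumed values: "dactyl", "word", or "sentence"
    signer_id: str
    glosses: List[Gloss] = field(default_factory=list)


def parse_annotation(raw: str) -> Annotation:
    """Parse one annotation from a JSON string using the assumed keys."""
    data = json.loads(raw)
    glosses = [
        Gloss(g["text"], float(g["start"]), float(g["end"]))
        for g in data.get("glosses", [])  # only sentence-level records carry glosses
    ]
    return Annotation(
        label=data["label"],
        category=data["category"],
        signer_id=str(data["signer_id"]),
        glosses=glosses,
    )


# Made-up record, not taken from the dataset.
example = """
{
  "label": "example sentence",
  "category": "sentence",
  "signer_id": 3,
  "glosses": [
    {"text": "GLOSS_A", "start": 0.0, "end": 0.8},
    {"text": "GLOSS_B", "start": 0.8, "end": 1.5}
  ]
}
"""
ann = parse_annotation(example)
print(ann.category, ann.signer_id, [g.text for g in ann.glosses])
```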

High-Quality Data Format

The dataset comprises RGB videos in high-definition (HD) resolution at 35 frames per second, accompanied by JSON files containing annotations and metadata. The data is systematically organized into folders by category for ease of navigation.
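A folder-per-category layout like this can be indexed with a short script. The sketch below is only one possible approach: it assumes that each video has a JSON file with the same file stem in the same directory, that the video extensions are among those listed in `VIDEO_SUFFIXES`, and that the top-level folders carry the component names. None of these assumptions is confirmed by the record, so adjust them after inspecting the extracted archive.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Tuple

# Assumed video extensions; adjust to what the archive actually contains.
VIDEO_SUFFIXES = {".mp4", ".avi", ".mov"}


def index_dataset(root: Path) -> Dict[str, List[Tuple[Path, Path]]]:
    """Map each top-level folder name to its (video, annotation) pairs.

    A video is paired with a JSON file that shares its stem and directory.
    Videos without such a file are skipped.
    """
    index: Dict[str, List[Tuple[Path, Path]]] = {}
    for category_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        pairs = []
        for video in sorted(category_dir.rglob("*")):
            if not video.is_file() or video.suffix.lower() not in VIDEO_SUFFIXES:
                continue
            annotation = video.with_suffix(".json")
            if annotation.exists():
                pairs.append((video, annotation))
        index[category_dir.name] = pairs
    return index


# Demonstration on a throwaway directory tree with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    words = root / "AzSLD_Words" / "signer_01"
    words.mkdir(parents=True)
    (words / "clip_0001.mp4").touch()
    (words / "clip_0001.json").write_text("{}")
    (words / "clip_0002.mp4").touch()  # no annotation file, so it is skipped
    demo_index = index_dataset(root)
    demo_counts = {name: len(pairs) for name, pairs in demo_index.items()}

print(demo_counts)
```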

Citation: When using AzSLD in your research, please cite the following paper:

Alishzade, N., Hasanov, J. (2024). AzSLD: Azerbaijani Sign Language Dataset for Dactyl, Word, and Sentence Translation with Baseline Software. [Journal Name], [Volume(Issue)], [Pages]. DOI: [DOI link].

Contact:
For questions, feedback, or contributions, please contact the project team at: slr.project.ada@gmail.com

Files

Name: AzSLD.zip
Size: 45.5 GB
md5: 5b79bed6bca3e17782c7d2d864948f91
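Because the archive is large, it is worth checking the download against the published MD5 checksum before extracting it. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `verify_archive` are introduced here for illustration, and the local file path is up to the user.

```python
import hashlib

# Checksum as listed on this record for AzSLD.zip.
EXPECTED_MD5 = "5b79bed6bca3e17782c7d2d864948f91"


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use flat even for a multi-gigabyte archive.
    """
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Return True if the file at `path` matches the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_archive("AzSLD.zip")` should return `True` for an intact download stored under that name in the current directory; a `False` result suggests the file is incomplete or corrupted and should be downloaded again.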

Additional details

Software

Repository URL: https://github.com/ADA-SITE-JML/azsl_dataloader
Programming language: Python
Development status: Active