AzSLD - Azerbaijani Sign Language Dataset
Description
The Azerbaijani Sign Language Dataset (AzSLD) is a comprehensive, multi-modal dataset designed to facilitate the development and evaluation of machine learning models for the recognition and translation of Azerbaijani Sign Language (AzSL).
AzSLD is the first publicly available dataset focused on Azerbaijani Sign Language. It contributes to the global effort to improve accessibility for the deaf and hard-of-hearing community in Azerbaijan. The dataset aims to bridge the gap between technology and accessibility by providing high-quality data for researchers, developers, and practitioners working on sign language recognition or translation systems.
Dataset Composition
AzSLD is organized into three primary components:
1. AzSLD_Sentences
This component contains video sequences of complete sentences in AzSL. It is designed to capture the fluidity and contextual nature of sign language, providing data for more complex language modeling tasks. It includes over 60 hours of high-definition video recordings, annotated with timestamped glosses for 500 distinct classes, enabling precise analysis and robust model training.
2. AzSLD_Words
This component comprises a collection of 7,230 video samples representing 100 commonly used words in AzSL.
3. AzSLD_Fingerspelling
This component includes over 14,000 video and image samples of letters of the Azerbaijani alphabet. Each sign is captured from multiple angles to ensure comprehensive coverage of dactylology in AzSL. This component is ideal for tasks involving letter recognition and the integration of fingerspelling into broader sign language recognition systems.
Key Features
Double-View Recordings
The dataset includes 10,104 synchronized video recordings captured from two camera angles, frontal and side, so that the subtle hand and body movements of sign language are well represented.
Diverse Signers
The dataset features recordings from a diverse group of native AzSL signers, encompassing variations in age, gender, and signing style. This diversity is crucial for training models that are robust to variations in signing.
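One common way to check robustness to signer variation is to keep the signers in the test set disjoint from those in the training set. The following is a minimal sketch of such a split, not a procedure described by the dataset authors. It assumes each sample can be reduced to a `(sample_id, signer_id)` pair; the function name, the 20% test fraction, and the example IDs are all hypothetical.

```python
import random


def split_by_signer(samples, test_fraction=0.2, seed=0):
    """Split (sample_id, signer_id) pairs so no signer appears in both sets.

    Assumption: signer IDs are hashable and sortable values taken from
    the annotation metadata; the exact ID format is not specified here.
    """
    signers = sorted({signer for _, signer in samples})
    rng = random.Random(seed)          # fixed seed for a reproducible split
    rng.shuffle(signers)
    n_test = max(1, round(len(signers) * test_fraction))
    test_signers = set(signers[:n_test])
    train = [s for s in samples if s[1] not in test_signers]
    test = [s for s in samples if s[1] in test_signers]
    return train, test


# Made-up example: 5 signers with 2 samples each.
samples = [(f"clip_{i}", f"signer_{i % 5}") for i in range(10)]
train, test = split_by_signer(samples)
```

Splitting at the signer level rather than the sample level means reported accuracy reflects performance on people the model has never seen.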
Detailed Annotations
Each video is annotated with comprehensive metadata, including the sign’s label (dactyl, word, or sentence), signer ID, and timestamped glosses for sentence-level signs.
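To illustrate how such metadata might be consumed, here is a minimal sketch that parses one annotation record into typed objects. The JSON schema is an assumption: the key names (`label`, `category`, `signer_id`, `glosses`, `text`, `start`, `end`), the time unit, and the sample values are hypothetical and may not match the actual AzSLD files or the `azsl_dataloader` repository.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Gloss:
    text: str
    start: float  # assumed to be seconds from the start of the video
    end: float


@dataclass
class Annotation:
    label: str
    category: str   # "dactyl", "word", or "sentence"
    signer_id: str
    glosses: list = field(default_factory=list)  # only for sentences


def parse_annotation(raw):
    """Parse one annotation record from a JSON string (hypothetical schema)."""
    data = json.loads(raw)
    glosses = [
        Gloss(g["text"], float(g["start"]), float(g["end"]))
        for g in data.get("glosses", [])
    ]
    return Annotation(
        label=data["label"],
        category=data["category"],
        signer_id=str(data["signer_id"]),
        glosses=glosses,
    )


# Made-up record for illustration only.
example = """
{
  "label": "HELLO WORLD",
  "category": "sentence",
  "signer_id": 3,
  "glosses": [
    {"text": "HELLO", "start": 0.0, "end": 0.8},
    {"text": "WORLD", "start": 0.8, "end": 1.7}
  ]
}
"""
ann = parse_annotation(example)
```

Dactyl and word records would simply carry an empty gloss list under this assumed schema.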
High-Quality Data Format
The dataset comprises RGB videos in high-definition (HD) resolution at 35 frames per second, accompanied by JSON files containing annotations and metadata. The data is systematically organized into folders by category for ease of navigation.
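Given a folder-per-category layout with JSON files alongside the videos, a first step is often to build an index of what is on disk. The sketch below walks an extracted copy of the dataset and groups videos by top-level folder. It rests on several assumptions that should be checked against the real archive: the video file extensions, the idea that each video has a same-named `.json` file next to it, and the root folder name in the usage comment.

```python
from collections import defaultdict
from pathlib import Path

# Assumed extensions; the actual container format is not stated in this page.
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov"}


def index_dataset(root):
    """Map each top-level folder name to a list of (video, annotation) paths.

    The annotation path is None when no same-named .json file exists,
    which is a hypothetical pairing convention.
    """
    root = Path(root)
    index = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in VIDEO_EXTENSIONS:
            continue
        parts = path.relative_to(root).parts
        category = parts[0] if len(parts) > 1 else "."
        sidecar = path.with_suffix(".json")
        index[category].append((path, sidecar if sidecar.exists() else None))
    return dict(index)


# Usage (the folder name is an assumption):
# index = index_dataset("AzSLD")
# for category, items in index.items():
#     print(category, len(items))
```

If the annotations turn out to be stored in one JSON file per folder rather than per video, the pairing step would need to change accordingly.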
Citation: When using AzSLD in your research, please cite the following paper:
Alishzade, N., Hasanov, J. (2024). AzSLD: Azerbaijani Sign Language Dataset for Dactyl, Word, and Sentence Translation with Baseline Software. [Journal Name], [Volume(Issue)], [Pages]. DOI: [DOI link].
Contact:
For questions, feedback, or contributions, please contact the project team at: slr.project.ada@gmail.com
Files
| Name | Size |
|---|---|
| AzSLD.zip (md5:5b79bed6bca3e17782c7d2d864948f91) | 45.5 GB |
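Because the archive is 45.5 GB, it is worth checking the download against the MD5 checksum listed for AzSLD.zip before extracting it. The sketch below streams the file in chunks so it never has to fit in memory. The function names and the local file path in the usage comment are illustrative assumptions; only the checksum value comes from the file listing.

```python
import hashlib

# Checksum published in the file listing for AzSLD.zip.
EXPECTED_MD5 = "5b79bed6bca3e17782c7d2d864948f91"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


# Usage (the local path is an assumption):
# if not verify_archive("AzSLD.zip"):
#     raise SystemExit("Checksum mismatch: re-download the archive.")
```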
Additional details
Software
- Repository URL: https://github.com/ADA-SITE-JML/azsl_dataloader
- Programming language: Python
- Development Status: Active