Hand Gesture Landmark Coordinates Dataset

Triantafyllou, Dimitrios

doi:10.5281/zenodo.18108472

Published December 31, 2025 | Version v1

Dataset Open

Affiliation: Centre for Research and Technology Hellas

# Hand Gesture Landmark Coordinates Dataset

## Description

This dataset contains normalized 3D hand landmark coordinates extracted from real-time video capture for hand gesture recognition tasks. It is designed for training machine learning models to classify different hand gestures based on spatial relationships between hand landmarks.

This dataset focuses on **static hand gestures** captured from single frames, not temporal sequences. Each sample represents a single hand pose from **one hand** at a specific moment in time, making it suitable for frame-by-frame classification rather than multi-frame temporal modeling.

Hand landmarks are captured using MediaPipe's hand tracking solution, which detects and tracks 21 key points on a single hand in real time. Each landmark represents a specific anatomical point on the hand, such as a fingertip, a knuckle, or the wrist.

## Dataset Structure

### File Format

The dataset is provided as a CSV file (`hand-gestures.csv`) with the following structure:

- **Features**: 63 numerical columns representing normalized 3D coordinates

  - `x0` to `x20`: X-coordinates of 21 hand landmarks

  - `y0` to `y20`: Y-coordinates of 21 hand landmarks

  - `z0` to `z20`: Z-coordinates of 21 hand landmarks

- **Label**: 1 categorical column (`label`) indicating the gesture class
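As a minimal sketch of how the file could be read, the function below parses rows into per-landmark `(x, y, z)` tuples plus a label. The names `load_samples`, `AXES`, and `NUM_LANDMARKS` are illustrative and not part of the dataset. Columns are looked up by name, so the sketch makes no assumption about the order in which the columns appear in the file.

```python
import csv

AXES = ("x", "y", "z")
NUM_LANDMARKS = 21


def load_samples(lines):
    """Parse CSV text into a list of (landmarks, label) pairs.

    `lines` is any iterable of CSV text lines, such as an open file.
    Each `landmarks` value is a list of 21 (x, y, z) float tuples,
    indexed by landmark number.
    """
    samples = []
    for row in csv.DictReader(lines):
        landmarks = [
            tuple(float(row[f"{axis}{i}"]) for axis in AXES)
            for i in range(NUM_LANDMARKS)
        ]
        samples.append((landmarks, row["label"]))
    return samples
```

With the file downloaded locally, this would be called as `load_samples(open("hand-gestures.csv", newline="", encoding="utf-8"))`.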

### Hand Landmark Indices

The 21 landmarks follow MediaPipe's hand landmark model:

```
0: Wrist
1: Thumb CMC
2: Thumb MCP
3: Thumb IP
4: Thumb Tip
5: Index Finger MCP
6: Index Finger PIP
7: Index Finger DIP
8: Index Finger Tip
9: Middle Finger MCP
10: Middle Finger PIP
11: Middle Finger DIP
12: Middle Finger Tip
13: Ring Finger MCP
14: Ring Finger PIP
15: Ring Finger DIP
16: Ring Finger Tip
17: Pinky MCP
18: Pinky PIP
19: Pinky DIP
20: Pinky Tip
```
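For code that works with the CSV columns, the index list above can be mirrored as an enumeration. This is only a convenience sketch: the names `Landmark`, `FINGERTIPS`, and `landmark_columns` are hypothetical and not defined by the dataset.

```python
from enum import IntEnum


class Landmark(IntEnum):
    """Landmark indices, in the order listed above."""
    WRIST = 0
    THUMB_CMC = 1
    THUMB_MCP = 2
    THUMB_IP = 3
    THUMB_TIP = 4
    INDEX_FINGER_MCP = 5
    INDEX_FINGER_PIP = 6
    INDEX_FINGER_DIP = 7
    INDEX_FINGER_TIP = 8
    MIDDLE_FINGER_MCP = 9
    MIDDLE_FINGER_PIP = 10
    MIDDLE_FINGER_DIP = 11
    MIDDLE_FINGER_TIP = 12
    RING_FINGER_MCP = 13
    RING_FINGER_PIP = 14
    RING_FINGER_DIP = 15
    RING_FINGER_TIP = 16
    PINKY_MCP = 17
    PINKY_PIP = 18
    PINKY_DIP = 19
    PINKY_TIP = 20


# The five fingertip landmarks, thumb to pinky.
FINGERTIPS = (
    Landmark.THUMB_TIP,
    Landmark.INDEX_FINGER_TIP,
    Landmark.MIDDLE_FINGER_TIP,
    Landmark.RING_FINGER_TIP,
    Landmark.PINKY_TIP,
)


def landmark_columns(landmark):
    """Return the CSV column names holding one landmark's coordinates."""
    return tuple(f"{axis}{int(landmark)}" for axis in "xyz")
```

For example, `landmark_columns(Landmark.INDEX_FINGER_TIP)` yields the column names `x8`, `y8`, and `z8`.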

## Data Collection and Processing

### 1. Landmark Capture

Hand landmarks are extracted from video frames using MediaPipe Hands with the following configuration:

- Maximum of 1 hand tracked per frame

- Minimum detection confidence: 0.5

- Minimum tracking confidence: 0.5

Each captured frame provides 21 landmarks, each with x, y, and z coordinates, which together represent the hand pose.
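The configuration above could be expressed roughly as follows. This sketch assumes the legacy `mediapipe.solutions.hands` Python API, which the description does not confirm was the interface used. `static_image_mode=False` is also an assumption, chosen because the landmarks come from video. The helper names `make_hands_tracker` and `landmarks_to_points` are illustrative.

```python
def make_hands_tracker():
    """Create a MediaPipe Hands tracker with the settings listed above.

    ASSUMPTION: uses the legacy `mediapipe.solutions.hands` API.
    """
    # Third-party dependency, imported lazily so the helper below
    # can be used without MediaPipe installed.
    import mediapipe as mp

    return mp.solutions.hands.Hands(
        static_image_mode=False,       # assumption: video stream, not still images
        max_num_hands=1,               # maximum of 1 hand tracked per frame
        min_detection_confidence=0.5,
        min_tracking_confidence=0.5,
    )


def landmarks_to_points(hand_landmarks):
    """Convert one detected hand into a list of 21 (x, y, z) tuples."""
    return [(lm.x, lm.y, lm.z) for lm in hand_landmarks.landmark]
```

With that API, a frame would be processed by calling `results = tracker.process(rgb_frame)` on an RGB image array and, when `results.multi_hand_landmarks` is not empty, passing its first element to `landmarks_to_points`.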

### 2. Normalization

Raw landmarks undergo a multi-step normalization process to ensure invariance to hand position, size, and orientation:

#### Step 1: Translation (Centering)

All landmarks are translated so that the wrist (landmark 0) is positioned at the origin by subtracting the wrist position from all coordinate values.

#### Step 2: Scale Normalization

The hand is scaled to unit size based on the Euclidean distance in the XY plane between the wrist (landmark 0) and the middle finger MCP (landmark 9). All coordinates, including Z, are divided by this hand size measurement.

#### Step 3: Rotation Alignment

The hand is rotated in the XY plane to align with a canonical orientation, using the middle finger MCP (landmark 9) as the reference point.

This normalization ensures that gestures are recognized regardless of:

- Hand position in the frame

- Hand size or distance from camera

- Hand rotation in the image plane
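The three normalization steps can be sketched as below. This is an illustration under stated assumptions, not the code used to build the dataset. It takes the middle finger MCP to be landmark 9, as in the landmark index list. The description does not say what the canonical orientation is, so the sketch assumes the middle finger MCP is rotated onto the direction `(0, -1)`, which points up in image coordinates. `normalize_landmarks` and `CANONICAL_DIRECTION` are hypothetical names.

```python
import math

WRIST = 0
MIDDLE_MCP = 9  # middle finger MCP in the landmark index list

# ASSUMPTION: the target direction is not stated in the description.
CANONICAL_DIRECTION = (0.0, -1.0)


def normalize_landmarks(points):
    """Normalize a list of 21 (x, y, z) tuples; returns a new list."""
    # Step 1: translate so the wrist sits at the origin.
    wx, wy, wz = points[WRIST]
    centered = [(x - wx, y - wy, z - wz) for x, y, z in points]

    # Step 2: divide every coordinate by the XY distance between
    # the wrist and the middle finger MCP.
    mx, my, _ = centered[MIDDLE_MCP]
    hand_size = math.hypot(mx, my)
    if hand_size == 0.0:
        raise ValueError("wrist and middle finger MCP coincide in the XY plane")
    scaled = [(x / hand_size, y / hand_size, z / hand_size)
              for x, y, z in centered]

    # Step 3: rotate about the Z axis so the middle finger MCP lands
    # on the canonical direction. Z values are left unchanged.
    mx, my, _ = scaled[MIDDLE_MCP]
    tx, ty = CANONICAL_DIRECTION
    angle = math.atan2(ty, tx) - math.atan2(my, mx)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [(cos_a * x - sin_a * y, sin_a * x + cos_a * y, z)
            for x, y, z in scaled]
```

Under these assumptions, the wrist always ends up at the origin and the middle finger MCP at unit distance from it in the XY plane, which is why the result does not depend on where the hand was in the frame, how large it appeared, or how it was rotated in the image plane.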

### 3. Data Augmentation

To increase dataset diversity and improve model robustness, each captured gesture is augmented 5 times using:

#### Scale Jitter

Random isotropic scaling is applied to all landmarks by multiplying coordinates with a scale factor uniformly sampled between 0.9 and 1.1 (±10% variation).

#### Additive Noise

Small Gaussian noise with zero mean and standard deviation of 0.02 is added independently to each landmark coordinate.

Each original capture results in 6 samples total: 1 original + 5 augmented versions.
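A rough sketch of this augmentation follows. It assumes that each augmented copy receives both transformations, scale jitter first and noise second, which the description implies but does not state explicitly. The function name `augment` and its parameters are illustrative.

```python
import random


def augment(points, rng, copies=5, scale_range=(0.9, 1.1), noise_std=0.02):
    """Return the original sample followed by `copies` augmented versions.

    `points` is a list of (x, y, z) tuples and `rng` is a
    `random.Random` instance, passed in so results are reproducible.
    """
    samples = [list(points)]
    for _ in range(copies):
        # One isotropic scale factor per augmented copy.
        scale = rng.uniform(*scale_range)
        samples.append([
            # Independent zero-mean Gaussian noise per coordinate.
            tuple(c * scale + rng.gauss(0.0, noise_std) for c in point)
            for point in points
        ])
    return samples
```

Calling `augment(points, random.Random(0))` on one normalized capture returns 6 samples, matching the 1 original + 5 augmented count above.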

## Dataset Statistics

- **Format**: CSV (Comma-Separated Values)

- **Encoding**: UTF-8

- **Features per sample**: 63 (21 landmarks × 3 coordinates)

- **Data type**: Float32

- **Missing values**: None

- **Augmentation ratio**: 6 samples per original capture (1 original + 5 augmented)
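The properties listed above can be checked on a downloaded copy with a short script such as the following sketch. `check_csv` is a hypothetical helper. It verifies the feature count, the presence of the `label` column, and the absence of missing or non-finite values, but it does not check the float precision.

```python
import csv
import math


def check_csv(lines, n_features=63):
    """Validate CSV text and return the number of data rows.

    Raises ValueError if the header or any cell violates the
    stated properties. `lines` is any iterable of CSV text lines.
    """
    reader = csv.DictReader(lines)
    fields = reader.fieldnames or []
    feature_cols = [name for name in fields if name != "label"]
    if "label" not in fields or len(feature_cols) != n_features:
        raise ValueError(
            f"expected {n_features} feature columns plus 'label', got {len(fields)} columns"
        )

    count = 0
    # Data starts on line 2, after the header.
    for line_no, row in enumerate(reader, start=2):
        for name in feature_cols:
            value = row[name]
            if value is None or value.strip() == "":
                raise ValueError(f"line {line_no}: missing value in {name}")
            if not math.isfinite(float(value)):
                raise ValueError(f"line {line_no}: non-finite value in {name}")
        if not row["label"]:
            raise ValueError(f"line {line_no}: missing label")
        count += 1
    return count
```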

## References

- MediaPipe Hand Landmarker: https://ai.google.dev/edge/mediapipe/solutions/vision/hand_landmarker

Files

- `hand-gestures.csv` (1.0 MB), md5: `0b7cea54fab40806eecc3241fd1eaa47`
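A downloaded copy can be compared against the published MD5 checksum with a few lines of standard-library code. The sketch below reads the file in chunks. `file_md5` is an illustrative name, and the local path is whatever location the file was saved to.

```python
import hashlib

# Checksum published in the file listing for hand-gestures.csv.
EXPECTED_MD5 = "0b7cea54fab40806eecc3241fd1eaa47"


def file_md5(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The download is intact if `file_md5("hand-gestures.csv") == EXPECTED_MD5` holds.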
