Published September 22, 2019 | Version v1
Dataset Open

BHAAV (भाव) - A Text Corpus for Emotion Analysis from Hindi Stories

  • 1. Adobe, New Delhi
  • 2. Bloomberg LP
  • 3. NSIT, New Delhi
  • 4. USICT, New Delhi
  • 5. IIIT, New Delhi

Description

BHAAV (भाव), which means "emotions" in Hindi, is presented as the first and largest Hindi text corpus for analyzing the emotions that a writer expresses through their characters in a story, as perceived by a narrator or reader. The corpus consists of 20,304 sentences collected from 230 short stories spanning 18 genres, such as प्रेरणादायक (Inspirational) and रहस्यमयी (Mystery). Each sentence has been annotated with one of five emotion categories (anger, joy, suspense, sad, and neutral) by three native Hindi speakers with at least ten years of formal education in Hindi.

Files

Datasets-20190922T151602Z-001.zip

Size: 15.9 MB
md5:cdcaf1997a422f18ee12d713a17ffdea
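
The md5 value listed for the archive can be used to check that a download is complete and uncorrupted. The following is a minimal sketch, assuming the archive has been saved in the current directory under the file name shown in this record. The helper names `md5_of_file` and `archive_is_intact` are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksum as listed in this record's file table.
EXPECTED_MD5 = "cdcaf1997a422f18ee12d713a17ffdea"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Compare the file's MD5 digest with the expected value (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path: the archive saved under its listed name.
    archive = Path("Datasets-20190922T151602Z-001.zip")
    if archive.exists():
        print("checksum OK" if archive_is_intact(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

Reading in chunks keeps memory use small regardless of archive size. MD5 is adequate here for detecting accidental corruption, though it is not a safeguard against deliberate tampering.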