Published July 12, 2017 | Version v1
Dataset Open

Sentence representations generated by the Inner Attention model (arXiv: 1707.03103)

  • 1. The University of Tokyo

Description

600-dimensional sentence vector representations created by the model described in the paper "Refining Raw Sentence Representations for Textual Entailment Recognition via Attention".

The dataset is in tab-delimited format: ID\tSENTENCE_TYPE\tVECTOR. ID is the id of the sentence pair as specified in the RepEval 2017 test datasets for the matched and mismatched evaluations, available at https://inclass.kaggle.com/c/multinli-matched-evaluation/download/multinli_0.9_test_matched_unlabeled.jsonl and https://inclass.kaggle.com/c/multinli-mismatched-evaluation/download/multinli_0.9_test_mismatched_unlabeled.jsonl (you will probably have to create an account to download them).

SENTENCE_TYPE is either p, meaning the sentence is the premise, or h, meaning it is the hypothesis.

VECTOR is a space-delimited 600-dim vector.
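
As an illustration of the format described above, here is a minimal Python sketch that parses one record into its three fields. The function names parse_line and iter_records are hypothetical and not part of the dataset. The sketch assumes each line holds exactly three tab-separated fields, and the name of the text file inside the archive is not stated here, so the caller is expected to supply the lines.

```python
from typing import Iterable, Iterator, List, Tuple

EXPECTED_DIM = 600  # dimensionality stated in the dataset description


def parse_line(line: str, expected_dim: int = EXPECTED_DIM) -> Tuple[str, str, List[float]]:
    """Parse one ID\\tSENTENCE_TYPE\\tVECTOR record (hypothetical helper)."""
    # Assumption: exactly three tab-separated fields per line.
    sent_id, sent_type, vector_field = line.rstrip("\r\n").split("\t")
    if sent_type not in ("p", "h"):
        raise ValueError(f"unexpected SENTENCE_TYPE: {sent_type!r}")
    # VECTOR is space-delimited; split() with no argument tolerates repeated spaces.
    vector = [float(value) for value in vector_field.split()]
    if len(vector) != expected_dim:
        raise ValueError(f"expected {expected_dim} values, got {len(vector)}")
    return sent_id, sent_type, vector


def iter_records(lines: Iterable[str]) -> Iterator[Tuple[str, str, List[float]]]:
    """Yield parsed records from an iterable of lines, skipping blank lines."""
    for line in lines:
        if line.strip():
            yield parse_line(line)
```

Passing an open text file handle to iter_records streams the records one at a time, which avoids holding every vector in memory at once.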

Files

sentence_representations.zip (72.5 MB)
md5:c66e38aaeb88963b2d293ea9a14ae38d
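
The MD5 checksum listed for sentence_representations.zip can be used to check that a download is intact. A minimal sketch using Python's standard hashlib module follows. The function name md5_of_stream and the local file path in the usage note are assumptions for illustration only.

```python
import hashlib
from typing import BinaryIO

# Checksum as listed on this page for sentence_representations.zip.
EXPECTED_MD5 = "c66e38aaeb88963b2d293ea9a14ae38d"


def md5_of_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a binary stream, read in chunks (hypothetical helper)."""
    digest = hashlib.md5()
    # Read fixed-size chunks so a large archive is never loaded fully into memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()
```

For example, assuming the archive was saved in the current directory, open it with open("sentence_representations.zip", "rb"), pass the handle to md5_of_stream, and compare the result with EXPECTED_MD5.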