Dataset Open Access

# The SICK (Sentences Involving Compositional Knowledge) dataset for relatedness and entailment

Marco Marelli; Stefano Menini; Marco Baroni; Luisa Bentivogli; Raffaella Bernardi; Roberto Zamparelli

### Citation Style Language JSON Export

{
"publisher": "Zenodo",
"DOI": "10.5281/zenodo.2787612",
"language": "eng",
"title": "The SICK (Sentences Involving Compositional Knowledge) dataset for relatedness and entailment",
"issued": {
"date-parts": [
[
2014,
5,
26
]
]
},
"abstract": "<p>The SICK data set consists of about 10,000 English sentence pairs, generated starting from two existing sets: the&nbsp;<a href=\"http://nlp.cs.illinois.edu/HockenmaierGroup/data.html\">8K ImageFlickr data set</a>&nbsp;and the&nbsp;<a href=\"http://www.cs.york.ac.uk/semeval-2012/task6/index.php?id=data\">SemEval 2012 STS MSR-Video Description data set</a>. We randomly selected a subset of sentence pairs from each of these sources and we applied a 3-step generation process: first, the original sentences were normalized to remove unwanted linguistic phenomena; the normalized sentences were then expanded to obtain up to three new sentences with specific characteristics suitable to CDSM evaluation; as a last step, all the sentences generated in the expansion phase were paired with the normalized sentences in order to obtain the final data set.</p>\n\n<p>Each sentence pair was annotated for relatedness and entailment by means of crowdsourcing techniques. The&nbsp;<strong>sentence relatedness score</strong>&nbsp;(on a 5-point rating scale) provides a direct way to evaluate CDSMs, insofar as their outputs are meant to quantify the degree of semantic relatedness between sentences; the categorizations in terms of the&nbsp;<strong>entailment relation between the two sentences</strong>&nbsp;(with&nbsp;<em>entailment, contradiction</em>, and&nbsp;<em>neutral</em>&nbsp;as gold labels) is also a crucial aspect to consider, since detecting the presence of entailment is one of the traditional benchmarks of a successful semantic system.</p>\n\n<p>In the final set, gold scores for relatedness and entailment were distributed as follows: the relatednes scoring resulted in 923 pairs within the [1,2) range, 1373 pairs within the [2,3) range, 3872 pairs within the [3,4) range, and 3672 pairs within the [4,5] range; the entailment annotation led to 5595&nbsp;<em>neutral</em>&nbsp;pairs, 1424&nbsp;<em>contradiction</em>&nbsp;pairs, and 
2821&nbsp;<em>entailment</em>&nbsp;pairs.</p>\n\n<p><strong>Files</strong></p>\n\n<ul>\n\t<li>SICK.zip (main file)</li>\n\t<li>SICK_Annotated.zip (a&nbsp;version of the data set annotated for the expansion rule which was used in each case)</li>\n\t<li>SICK_subsets.zip (a&nbsp;Indexes specifying further classifications, used in the JLRE 2016 publication)</li>\n</ul>\n\n<p>&nbsp;</p>",
"author": [
{
"family": "Marelli",
"given": "Marco"
},
{
"family": "Menini",
"given": "Stefano"
},
{
"family": "Baroni",
"given": "Marco"
},
{
"family": "Bentivogli",
"given": "Luisa"
},
{
"family": "Bernardi",
"given": "Raffaella"
},
{
"family": "Zamparelli",
"given": "Roberto"
}
],
"type": "dataset",
"id": "2787612"
}
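
The gold-label distribution quoted in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the counts are copied from the abstract, while the names `relatedness_bins`, `entailment_labels`, and `summarize` are assumptions introduced here and are not part of the SICK distribution.

```python
# Counts copied from the abstract of this record.
relatedness_bins = {"[1,2)": 923, "[2,3)": 1373, "[3,4)": 3872, "[4,5]": 3672}
entailment_labels = {"neutral": 5595, "contradiction": 1424, "entailment": 2821}


def summarize(counts):
    """Return the total and each category's share of it, in percent."""
    total = sum(counts.values())
    shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
    return total, shares


rel_total, rel_shares = summarize(relatedness_bins)
ent_total, ent_shares = summarize(entailment_labels)

# Both annotations cover the same set of sentence pairs,
# so the two totals should agree.
print("relatedness pairs:", rel_total, rel_shares)
print("entailment pairs: ", ent_total, ent_shares)
```

Both totals come to 9,840 pairs, consistent with the "about 10,000" figure in the abstract, and a little over half of the pairs carry the neutral label.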