Conference paper Open Access

A Data-Driven Metric of Hardness for WSC Sentences

Nicos Isaak; Loizos Michael

The Winograd Schema Challenge (WSC), the task of resolving pronouns in certain sentences where shallow parsing techniques seem not to be directly applicable, has been proposed as an alternative to the Turing Test. According to Levesque, having access to a large corpus of text would likely not help much in the WSC. Among a number of attempts to tackle this challenge, one particular approach has demonstrated the plausibility of using commonsense knowledge automatically acquired from raw text in English Wikipedia. Here, we present the results of a large-scale experiment that shows how the performance of that particular automated approach varies with the availability of training material. We compare the results of this experiment with two studies: one from the literature that investigates how adult native speakers tackle the WSC, and one that we design and undertake to investigate how teenager non-native speakers tackle the WSC. We find that the performance of the automated approach correlates positively with the performance of humans, suggesting that the performance of the particular automated approach could be used as a metric of hardness for WSC instances.

This work has been partly supported by the project that has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 739578 (RISE – Call: H2020-WIDESPREAD-01-2016-2017-TeamingPhase2) and the Government of the Republic of Cyprus through the Directorate General for European Programmes, Coordination and Development. © The authors
  • Adam Richard-Bollans, L Gomez Alvarez, and Anthony G Cohn. The Role of Pragmatics in Solving the Winograd Schema Challenge. In Proceedings of 13th International Symposium on Commonsense Reasoning (Commonsense-2017). CEUR Workshop Proceedings, 2017.

  • Altaf Rahman and Vincent Ng. Resolving Complex Cases of Definite Pronouns: The Winograd Schema Challenge. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, EMNLP-CoNLL '12, pages 777–789, Stroudsburg, PA, USA, 2012. Association for Computational Linguistics.

  • Arpit Sharma, Nguyen H Vo, Somak Aditya, and Chitta Baral. Towards Addressing the Winograd Schema Challenge - Building and Using a Semantic Parser and a Knowledge Hunting Module. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI, pages 25–31, 2015.

  • Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. The Stanford CoreNLP natural language processing toolkit. In Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 55–60, 2014.

  • Dan Bailey, Amelia Harrison, Yuliya Lierler, Vladimir Lifschitz, and Julian Michael. The Winograd Schema Challenge and Reasoning about Correlation. In Working Notes of the Symposium on Logical Formalizations of Commonsense Reasoning, 2015.

  • David Bender. Establishing a Human Baseline for the Winograd Schema Challenge. In MAICS, pages 39–45, 2015.

  • Eric Bengtson and Dan Roth. Understanding the Value of Features for Coreference Resolution. In EMNLP, October 2008.

  • Evan Ackerman. Winograd Schema Challenge Results: AI Common Sense Still a Problem, for Now. Spectrum, 2016.

  • Haoruo Peng, Daniel Khashabi, and Dan Roth. Solving Hard Coreference Problems. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 809–819, 2015.

  • Hector J. Levesque. The Winograd Schema Challenge. In AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning, number SS-11-06. American Association for Artificial Intelligence, 2011.

  • Leslie G. Valiant. Knowledge Infusion. In Proceedings of the 21st National Conference on Artificial Intelligence - Volume 2, AAAI'06, pages 1546–1551. AAAI Press, 2006.

  • Loizos Michael and Leslie G. Valiant. A First Experimental Demonstration of Massive Knowledge Infusion. In Proceedings of the 11th International Conference on Principles of Knowledge Representation and Reasoning (KR 2008), pages 378–388. AAAI Press, September 2008.

  • Loizos Michael. Machines with Websense. In Proc. of 11th International Symposium on Logical Formalizations of Commonsense Reasoning (Commonsense 13), 2013.

  • Loizos Michael. Partial observability and learnability. Artif. Intell., 174(11):639–669, 2010.

  • Loizos Michael. Reading Between the Lines. In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI 2009), pages 1525–1530, July 2009.

  • Nicos Isaak and Loizos Michael. Tackling the Winograd Schema Challenge Through Machine Logical Inferences. In David Pearce and Helena Sofia Pinto, editors, STAIRS, volume 284 of Frontiers in Artificial Intelligence and Applications, pages 75–86. IOS Press, 2016.

  • Nicos Isaak and Loizos Michael. Using the Winograd Schema Challenge as a CAPTCHA. In Proceedings of the 4th Global Conference on Artificial Intelligence (GCAI 2018). EasyChair, 2018.

  • Tejas Ulhas Budukh. An intelligent co-reference resolver for Winograd schema sentences containing resolved semantic entities, 2013.
