Leader: 00000nmm##2200000uu#4500
Record ID: 2630551
DOI: 10.5281/zenodo.2630551
OAI identifier: oai:zenodo.org:2630551
Community: user-eu
Author: Kruszewski, Germán (University of Trento)
Author: Lazaridou, Angeliki (University of Trento)
Author: Pham, Quan Ngoc (University of Trento)
Author: Bernardi, Raffaella (University of Trento)
Author: Pezzelle, Sandro (University of Trento)
Author: Baroni, Marco (University of Trento)
Author: Boleda, Gemma (University of Trento)
Author: Fernández, Raquel (University of Amsterdam)
Title: The LAMBADA dataset
Author: Paperno, Denis (University of Trento)
Access rights: info:eu-repo/semantics/openAccess
License: Creative Commons Attribution 4.0 International (cc-by-4.0, spdx), https://creativecommons.org/licenses/by/4.0/legalcode
Abstract: We introduce LAMBADA, a dataset to evaluate the capabilities of computational models for text understanding by means of a word prediction task. LAMBADA is a collection of narrative passages sharing the characteristic that human subjects are able to guess their last word if they are exposed to the whole passage, but not if they only see the last sentence preceding the target word. To succeed on LAMBADA, computational models cannot simply rely on local context, but must be able to keep track of information in the broader discourse. We show that LAMBADA exemplifies a wide range of linguistic phenomena, and that none of several state-of-the-art language models reaches accuracy above 1% on this novel benchmark. We thus propose LAMBADA as a challenging test set, meant to encourage the development of new models capable of genuine understanding of broad context in natural language text.
The LAMBADA paper can be found at http://anthology.aclweb.org/P/P16/P16-1144.pdf.
Publisher: Zenodo
Publication date: 2016-08-07
Community: user-eu
Resource type: info:eu-repo/semantics/other
Grant: 283554, Compositional Operations in Semantic Space
Grant: 655577, Linking Objects to Vectors in distributional semantics: A framework to anchor corpus-based meaning representations to the external world
Timestamp: 20200124192617.0
File: https://zenodo.org/records/2630551/files/rejected-data1.tar.gz (1717206 bytes, md5:6d9fcfc38c2068a360597ea63b814045)
File: https://zenodo.org/records/2630551/files/lambada-dataset.tar.gz (334527694 bytes, md5:8014f6ba29b80dd27fb853a7373af7c3)
Access: open
Related identifier: 10.5281/zenodo.2630550 (isVersionOf, doi)
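
The MD5 checksums listed for the two archives can be used to check a downloaded copy. The following is a minimal sketch in Python, assuming the archive has already been saved to a local path. The names `md5_of_file` and `matches_record` are illustrative helpers, not part of this record or of any Zenodo tooling.

```python
import hashlib

# Checksums copied from the file entries in this record.
EXPECTED_MD5 = {
    "rejected-data1.tar.gz": "6d9fcfc38c2068a360597ea63b814045",
    "lambada-dataset.tar.gz": "8014f6ba29b80dd27fb853a7373af7c3",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, name):
    """True if the local file at `path` has the checksum listed for `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

Under these assumptions, `matches_record("lambada-dataset.tar.gz", "lambada-dataset.tar.gz")` would return `True` for an intact copy saved under its original name in the working directory.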