Published August 9, 2019 | Version v3
Dataset Open

A Computational Theory for the Emergence of Grammatical Categories in Cortical Dynamics

  • 1. University of Buenos Aires
  • 2. Argonne National Laboratory
  • 3. Loyola University Chicago and Argonne National Laboratory
  • 4. Instituto de Ciencias Humanas, Centro Científico Tecnológico-CONICET, Mendoza, Argentina
  • 5. Uppsala University, Angstrom Laboratory

Description

The file Corpora.txt contains the corpus used to train the model and the different instances of the classifier. It is a plain-text file with one sentence per line, derived from the original corpus file test.tsv available at https://github.com/google-research-datasets/wiki-split.git. We removed punctuation marks and special characters from the original file and placed each sentence on its own line.
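The cleaning step described above can be illustrated with a minimal sketch. The description does not state exactly which characters were treated as punctuation or special characters, so the rule used here (keep only ASCII letters, digits and whitespace) is an assumption, and the function name clean_sentence is hypothetical. It is not the script used to produce Corpora.txt.

```python
import re

# Assumption: everything except ASCII letters, digits and whitespace is
# treated as punctuation or a special character. The dataset description
# does not specify the exact rule that was applied.
_NON_WORD = re.compile(r"[^A-Za-z0-9\s]+")


def clean_sentence(text: str) -> str:
    """Replace punctuation/special characters with spaces and collapse whitespace."""
    without_marks = _NON_WORD.sub(" ", text)
    # split() with no arguments drops leading, trailing and repeated whitespace.
    return " ".join(without_marks.split())


if __name__ == "__main__":
    print(clean_sentence("Hello, world! It's 2019."))
```

Under this rule an apostrophe splits a word in two ("It's" becomes "It s"), and accented letters are dropped. Whether Corpora.txt behaves the same way should be checked against the file itself.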

Enju_Output.txt holds the output generated by Enju in -so mode (stand-off output format) with Corpora.txt as input. It contains a per-sentence parse of the English text, produced with a wide-coverage probabilistic HPSG grammar.

The file Supervision.txt contains the grammatical tags of the corpus, with one tag per word and one tag per line. Tags for words in the same sentence occupy adjacent lines, and sentences are separated by one empty line.

The file Word_Category.txt carries the coarse-grained word-category information that the model needs and receives through apical dendrites. Each word in the corpus has a word-category tag, which provides constraints in addition to those provided by lateral dendrites. The file contains one tag per word and one tag per line. Tags for words in the same sentence occupy adjacent lines, and sentences are separated by one empty line.
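The layout shared by Supervision.txt and Word_Category.txt (one tag per line, sentences separated by an empty line) can be read with a short parser. The sketch below is illustrative only: the function names are hypothetical, the tags in the example are placeholders rather than real tags from the dataset, and pairing tags with words by splitting a Corpora.txt line on whitespace is an assumption about how the files line up.

```python
from typing import Iterable, List, Tuple


def read_tag_sentences(lines: Iterable[str]) -> List[List[str]]:
    """Group one-tag-per-line input into sentences separated by empty lines."""
    sentences: List[List[str]] = []
    current: List[str] = []
    for raw in lines:
        tag = raw.strip()
        if tag:
            current.append(tag)
        elif current:
            # An empty line closes the current sentence.
            sentences.append(current)
            current = []
    if current:
        # The last sentence may not be followed by an empty line.
        sentences.append(current)
    return sentences


def align_words_with_tags(sentence: str, tags: List[str]) -> List[Tuple[str, str]]:
    """Pair whitespace-separated words with tags (assumed tokenization)."""
    words = sentence.split()
    if len(words) != len(tags):
        raise ValueError(f"{len(words)} words but {len(tags)} tags")
    return list(zip(words, tags))


if __name__ == "__main__":
    # Placeholder tags, not taken from the dataset.
    demo = ["TAG_A\n", "TAG_B\n", "\n", "TAG_C\n"]
    print(read_tag_sentences(demo))
```

For the real files, an open file object can be passed directly, for example read_tag_sentences(open("Supervision.txt", encoding="utf-8")). The length check in align_words_with_tags flags any sentence where whitespace tokenization does not match the tag count.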

The file SynSemTests.xlsx contains all the grammar classification results, together with the statistical analysis of the classification tests.

Files

Files (112.8 MB)

Checksum                              Size
md5:fdcb80d7affb09ff9529e7333269bb21  1.8 MB
md5:05cf39d1e6a64e0d0f6eedff24e46e4f  2.7 kB
md5:b4f1d7c433811c8671e2203593f8b5e9  107.2 MB
md5:679ba37769254bfba93a09a42859b313  204.8 kB
md5:c43c01dd892649b7e1d6123d24bb5845  38.4 kB
md5:f23b401cf2d9de17eca46940190a6e94  130.6 kB
md5:f100551ad364d19e657ef77ceccfa17f  1.5 kB
md5:eef32667038a07a32cde6c1c4434b166  1.8 MB
md5:f933f1623c3f4b1131821a7d0a426d2c  24.6 kB
md5:8a395f94246ecd0a2203d7335f17f070  1.6 MB
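
A downloaded file can be checked against the MD5 values listed above. The sketch below uses Python's standard hashlib module; the helper names md5_of_file and matches_listing are hypothetical. The listing here does not show which checksum belongs to which file name, so the caller has to supply the pairing.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use low for the larger files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file against a listed value such as 'md5:<hex digest>'."""
    # Accept the value with or without the 'md5:' prefix.
    expected = listed.strip().split(":", 1)[-1].lower()
    return md5_of_file(path) == expected
```

A call such as matches_listing(local_path, listed_value) returns True only when the downloaded file is byte-for-byte identical to the deposited one.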