Dataset Open Access
Davis, Forrest; van Schijndel, Marten
This repository contains the raw results (by-word information-theoretic measures for the experimental stimuli) and the LSTM models analyzed in "Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment." The models from the synthetic experiments are provided in the synthetic archive, along with the script that generates their training data. An included README gives further details on recreating and evaluating the results of those experiments.
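As a quick orientation, here is a minimal sketch of loading one of the pretrained models. It assumes the models are full pickled PyTorch modules (as in the Gulordava et al. codebase, whose model class definitions must be importable for unpickling to succeed); the filename is a hypothetical example built from the naming convention below.

```python
# Minimal sketch: loading one of the pretrained LSTM models.
# Assumes the models were saved as full pickled modules, so the
# originating codebase's model classes must be on the import path.
import torch

model = torch.load(
    "models/en_hidden650_batch128_dropout0.2_lr20_0.pt",  # hypothetical name
    map_location="cpu",
)
model.eval()  # disable dropout for evaluation
```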
The naming convention for each model in the models directory is (see the parsing sketch after this list):

`[Language]_hidden[Hidden Units]_batch[Batch Size]_dropout[Dropout Rate]_lr[Learning Rate]_[Model Number].pt`

- Language: `en` for English, `es` for Spanish
- Hidden Units: all models have two layers with 650 hidden units per layer
- Batch Size: the training batch size (128 for English, 64 for Spanish)
- Dropout Rate: all models used a dropout rate of 0.2
- Learning Rate: all models used a learning rate of 20
- Model Number: identifier of the model (English model 0 is the best model from Gulordava et al. (2018))
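The following sketch parses a model filename into the fields above. The regex mirrors the stated convention, but the exact textual format of each field (e.g. whether the learning rate is written `20` or `20.0`) is an assumption; adjust the pattern to the actual filenames.

```python
# Sketch: parsing the model filename convention into its fields.
import re

PATTERN = re.compile(
    r"(?P<language>en|es)"
    r"_hidden(?P<hidden_units>\d+)"
    r"_batch(?P<batch_size>\d+)"
    r"_dropout(?P<dropout>[\d.]+)"
    r"_lr(?P<learning_rate>[\d.]+)"
    r"_(?P<model_number>\d+)\.pt"
)

def parse_model_name(filename: str) -> dict:
    """Return the naming-convention fields of a model filename."""
    match = PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"unrecognized model filename: {filename}")
    return match.groupdict()

# Example (hypothetical filename built from the convention):
print(parse_model_name("en_hidden650_batch128_dropout0.2_lr20_0.pt"))
# {'language': 'en', 'hidden_units': '650', 'batch_size': '128',
#  'dropout': '0.2', 'learning_rate': '20', 'model_number': '0'}
```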
Name | MD5 | Size
---|---|---
models.tar.gz | 01ef23e1a95175f01bc32c55be3513fd | 1.6 GB
raw_results.tar.gz | c62dc12445cbed42a4aaad3d02254632 | 380.3 MB
synthetic.tar.gz | 29d2366dab3c2b2c1194cadfdf022b50 | 3.4 GB
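Since the archives are several gigabytes, it is worth checking the downloads against the MD5 checksums in the table before extracting. A minimal sketch, assuming the archives sit in the current directory:

```python
# Sketch: verify downloaded archives against the listed MD5 checksums.
# Chunked reading keeps memory use small for the multi-GB files.
import hashlib

EXPECTED = {
    "models.tar.gz": "01ef23e1a95175f01bc32c55be3513fd",
    "raw_results.tar.gz": "c62dc12445cbed42a4aaad3d02254632",
    "synthetic.tar.gz": "29d2366dab3c2b2c1194cadfdf022b50",
}

def md5sum(path: str, chunk_size: int = 1 << 20) -> str:
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

for name, expected in EXPECTED.items():
    status = "OK" if md5sum(name) == expected else "MISMATCH"
    print(f"{name}: {status}")
```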
 | All versions | This version
---|---|---
Views | 70 | 70
Downloads | 19 | 19
Data volume | 28.3 GB | 28.3 GB
Unique views | 66 | 66
Unique downloads | 9 | 9