Published May 9, 2022 | Version 1
Dataset | Open Access

Trained Models from "General Cross-Architecture Distillation of Pretrained Language Models into Matrix Embeddings"

Affiliations

  1. Max Planck Institute for Psycholinguistics
  2. University of Ulm

Description

Trained models from the paper:

Lukas Galke, Isabell Cuber, Christoph Meyer, Henrik Ferdinand Noelscher, Angelina Sonderecker, and Ansgar Scherp: General Cross-Architecture Distillation of Pretrained Language Models into Matrix Embeddings, in: International Joint Conference on Neural Networks (IJCNN), 2022.

  • File seq2mat_hybrid_bidirectional_sbertlike-100p-bsz512 contains the pretrained model.
  • File ws2020_transformer_final_models contains the fine-tuned models, one per task of the GLUE benchmark (a loading sketch follows this list).
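
How the checkpoints inside the archives are serialized is not documented on this page. The sketch below therefore only extracts the pretraining archive and lists its contents; the actual loading step is left as a hedged comment, since standard PyTorch checkpoint files are an assumption, not something this record confirms:

    import zipfile

    # Extract the pretraining archive and inspect its layout before loading.
    # The directory structure inside the archive is not documented on this record.
    ARCHIVE = "seq2mat_hybrid_bidirectional_sbertlike-100p-bsz512.zip"

    with zipfile.ZipFile(ARCHIVE) as zf:
        print("\n".join(zf.namelist()[:20]))  # peek at the first entries
        zf.extractall("seq2mat_pretrained")

    # If the extracted files turn out to be standard PyTorch checkpoints
    # (an assumption), they could then be loaded along these lines:
    #   import torch
    #   state = torch.load("seq2mat_pretrained/<checkpoint-file>", map_location="cpu")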

Files (26.3 GB total)

  • seq2mat_hybrid_bidirectional_sbertlike-100p-bsz512.zip (974.6 MB, md5:0a543e81efafb7b688f15d613b76b0ea)
  • ws2020_transformer_final_models.zip (25.3 GB, md5:e3b8125b6f3dde41790a032c7f09d5ca)
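
To check that the large downloads are intact, the published checksums can be verified locally. A minimal sketch in Python, assuming the archives sit in the current directory and that the name-to-checksum pairing in the list above (including the assumed .zip extension of the second archive) is correct:

    import hashlib
    from pathlib import Path

    # Checksums as published on this record; the pairing of names to hashes
    # follows the file list above, and the .zip extension of the second
    # archive is an assumption.
    EXPECTED = {
        "seq2mat_hybrid_bidirectional_sbertlike-100p-bsz512.zip": "0a543e81efafb7b688f15d613b76b0ea",
        "ws2020_transformer_final_models.zip": "e3b8125b6f3dde41790a032c7f09d5ca",
    }

    def md5sum(path: Path, chunk_size: int = 1 << 20) -> str:
        """Stream the file through MD5 so multi-GB archives fit in memory."""
        digest = hashlib.md5()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    for name, expected in EXPECTED.items():
        path = Path(name)
        if not path.exists():
            print(f"{name}: not downloaded")
        elif md5sum(path) == expected:
            print(f"{name}: OK")
        else:
            print(f"{name}: CHECKSUM MISMATCH")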