MLPerf Inference quantized BERT PyTorch model on SQuAD v1.1 dataset

Ian Tramble; Patrick Judd

doi:10.5281/zenodo.4792496

Published May 24, 2021 | Version v2

Software Open

Affiliation: NVIDIA

This model is fine-tuned and quantized from a pretrained Hugging Face BERT model.

Quantization is per-tensor and symmetric, with zero_point=0, and is performed with NVIDIA's quantization toolkit on top of PyTorch. The quantized model achieves f1_score=90.633%.
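To illustrate what "per-tensor, symmetric, zero_point=0" means, here is a minimal sketch in plain Python. It is not the code in quant_trainer.py and does not use NVIDIA's toolkit. The 8-bit width, the max-absolute-value calibration, and the function names are assumptions made for illustration; the record itself does not state them.

```python
from typing import List, Tuple


def quantize_per_tensor_symmetric(
    values: List[float], num_bits: int = 8
) -> Tuple[List[int], float]:
    """Quantize a flat tensor with a single shared scale and zero_point = 0.

    Assumptions (not taken from the record): signed integers of `num_bits`
    bits, and a scale calibrated from the largest absolute value.
    """
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8 bits
    amax = max(abs(v) for v in values)      # one statistic for the whole tensor
    scale = amax / qmax if amax > 0 else 1.0
    # Symmetric range [-qmax, qmax]; real 0.0 always maps to integer 0.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Map integers back to approximate real values (no zero-point offset)."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [-1.0, 0.0, 0.5, 1.0]
    q, scale = quantize_per_tensor_symmetric(weights)
    print(q, scale)
    print(dequantize(q, scale))
```

Because the range is symmetric and the zero point is fixed at 0, dequantization is a single multiplication by the scale, and each value is recovered to within half a quantization step.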

README.md describes the quantization steps. All code needed to reproduce the model is included in the upload: Dockerfile, run_squad.py, quant_trainer.py, and modeling_bert.patch. The PyTorch model itself is pytorch_model.bin.

Files (1.3 GB)

Name	MD5	Size
Dockerfile	acd8a4f652d1d4653de33ee130761d0c	1.2 kB
modeling_bert.patch	af17111f17c622268f864f28544ae99a	8.0 kB
pytorch_model.bin	0734c580cb53b4b56a3f400771ffcb7c	1.3 GB
quant_trainer.py	cb534ea4fdd4c186d9a1d0983179ddc0	9.4 kB
README.md	53a1fd283ff0e3871bbb524eaf85d3ba	3.7 kB
run_squad.py	d71ca747e7f0fc077bbb0c295b446b66	38.6 kB
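The MD5 checksums listed for the files can be used to check downloads for corruption. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `verify_downloads` are hypothetical, and the script assumes the files were saved under their listed names in a single directory.

```python
import hashlib
from pathlib import Path
from typing import Dict, Union

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "Dockerfile": "acd8a4f652d1d4653de33ee130761d0c",
    "modeling_bert.patch": "af17111f17c622268f864f28544ae99a",
    "pytorch_model.bin": "0734c580cb53b4b56a3f400771ffcb7c",
    "quant_trainer.py": "cb534ea4fdd4c186d9a1d0983179ddc0",
    "README.md": "53a1fd283ff0e3871bbb524eaf85d3ba",
    "run_squad.py": "d71ca747e7f0fc077bbb0c295b446b66",
}


def md5_of_file(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so the 1.3 GB model is never fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(
    directory: Union[str, Path], expected: Dict[str, str] = EXPECTED_MD5
) -> Dict[str, bool]:
    """Return {file name: True if present and checksum matches, else False}."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == checksum
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads(".").items():
        print(f"{name}: {'OK' if ok else 'MISSING OR MISMATCH'}")
```

Run from the download directory, the script prints one status line per file. A mismatch on pytorch_model.bin most likely indicates an interrupted download.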

	All versions	This version
Views	1,839	1,457
Downloads	1,711	1,629
Data volume	1.7 TB	1.6 TB

Resource type: Software

Publisher: Zenodo

License: Apache License 2.0

A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.

Technical metadata

Created: May 25, 2021
Modified: May 26, 2021
