---
license: apache-2.0
datasets:
- wikitext
- ptb_text_only
language:
- en
metrics:
- perplexity
pipeline_tag: text-generation
model-index:
- name: distilgpt2
  results:
  - task:
      type: text-generation
    dataset:
      name: penn_treebank
      type: ptb_text_only
    metrics:
    - name: perplexity@distilgpt2:BASELINE
      type: dmx-perplexity
      value: 63.45857238769531
    - name: perplexity@distilgpt2:BASIC
      type: dmx-perplexity
      value: 64.36720275878906
  - task:
      type: text-generation
    dataset:
      name: wikitext2
      type: wikitext-2-raw-v1
    metrics:
    - name: perplexity@distilgpt2:BASELINE
      type: dmx-perplexity
      value: 46.05925369262695
    - name: perplexity@distilgpt2:BASIC
      type: dmx-perplexity
      value: 46.570838928222656
---
This is a d-Matrix functional reference of the GPT2 model family, with the following *revisions*:
- [`distilgpt2`](https://huggingface.co/distilbert/distilgpt2)
- [`gpt2`](https://huggingface.co/openai-community/gpt2)
- [`gpt2-medium`](https://huggingface.co/openai-community/gpt2-medium)
- [`gpt2-large`](https://huggingface.co/openai-community/gpt2-large)
- [`gpt2-xl`](https://huggingface.co/openai-community/gpt2-xl)
The reference provides the following functional *configurations*:

Configuration | Explanation
:-- | :--
**`BASELINE`** | a reference functionally equivalent to the original model
**`BASIC`** | all linear algebraic operands quantized to `BFP16-64`, and all other operations transformed to approximate kernel simulations
### Usage
Install the d-Matrix [Dmx_Compressor](https://github.com/d-matrix-ai/dmx-compressor) package first:
```sh
pip install dmx_compressor
```
The following example loads a model variant and evaluates its perplexity.
```python
from dmx.compressor.dmx import pipeline

pipe = pipeline(
    task="text-generation",
    model="d-matrix/gpt2",
    revision="gpt2-xl",  # see above for other revisions
    dmx_config="BASELINE",  # see above for other configurations
)

results = pipe.evaluate(
    metric="d-matrix/dmx_perplexity",
    dataset="wikitext",
    dataset_version="wikitext-2-raw-v1",
)
```
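The `results` object can be inspected directly. As a minimal sketch (assuming `pipe` behaves like a standard Hugging Face `text-generation` pipeline, which this card does not state explicitly), the same object can also be used for inference:

```python
# Print the perplexity metric returned by evaluate(); the exact structure of
# `results` depends on the d-matrix/dmx_perplexity metric implementation.
print(results)

# Assumption: the dmx pipeline is callable like a regular Hugging Face
# text-generation pipeline, so it can generate text directly.
output = pipe("d-Matrix builds", max_new_tokens=20)
print(output[0]["generated_text"])
```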
### Evaluation results
- `perplexity` on `penn_treebank`

  Revision \ Configuration | **`BASELINE`** | **`BASIC`**
  :-- | --: | --:
  `distilgpt2` | 63.46 | 64.13
  `gpt2` | 35.77 | 35.93
  `gpt2-medium` | 27.06 | 27.10
  `gpt2-large` | 23.03 | 23.04
  `gpt2-xl` | 21.01 | 21.02

- `perplexity` on `wikitext2`

  Revision \ Configuration | **`BASELINE`** | **`BASIC`**
  :-- | --: | --:
  `distilgpt2` | 46.06 | 46.44
  `gpt2` | 29.94 | 30.08
  `gpt2-medium` | 21.71 | 21.73
  `gpt2-large` | 19.42 | 19.43
  `gpt2-xl` | 17.40 | 17.40

- `perplexity` on `wikitext103`

  Revision \ Configuration | **`BASELINE`** | **`BASIC`**
  :-- | --: | --:
  `distilgpt2` | 46.06 | 46.44
  `gpt2` | 29.94 | 30.08
  `gpt2-medium` | 21.71 | 21.73
  `gpt2-large` | 19.43 | 19.43
  `gpt2-xl` | 17.40 | 17.40
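
As a hedged sketch (not part of the original card), the tables above could in principle be reproduced by sweeping over revisions and configurations with the same API shown in the Usage section; the dataset identifiers for `penn_treebank` and `wikitext103` are not given here, so only the `wikitext2` case is shown.

```python
# Sketch only: sweeps revisions and configurations using the pipeline API from
# the Usage section, evaluating perplexity on wikitext-2-raw-v1.
from dmx.compressor.dmx import pipeline

revisions = ["distilgpt2", "gpt2", "gpt2-medium", "gpt2-large", "gpt2-xl"]
configurations = ["BASELINE", "BASIC"]

for revision in revisions:
    for dmx_config in configurations:
        pipe = pipeline(
            task="text-generation",
            model="d-matrix/gpt2",
            revision=revision,
            dmx_config=dmx_config,
        )
        results = pipe.evaluate(
            metric="d-matrix/dmx_perplexity",
            dataset="wikitext",
            dataset_version="wikitext-2-raw-v1",
        )
        print(revision, dmx_config, results)
```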