kailasps committed on
Commit
dc8ad0c
1 Parent(s): 4f2b730

Update README.md

Files changed (1)
  1. README.md +7 -12
README.md CHANGED
@@ -4,7 +4,6 @@ base_model: gpt2
 tags:
 - generated_from_trainer
 - code
- - not-for-all-audiences
 model-index:
 - name: codeparrot-ds
   results: []
@@ -23,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->

 # GPT2-Codeparrot

- This model is trained from scratch using random initilization [gpt2](https://huggingface.co/gpt2) on the validaition set of codeparrot.
+ Generative Pre-trained Transformer 2 (GPT-2) is a large language model from OpenAI, available on the Hugging Face Hub as [gpt2](https://huggingface.co/gpt2). It is a decoder-only Transformer trained with a causal language modeling (CLM) objective, meaning the model learns to predict the next token in a sequence given the previous tokens. GPT-2 models are known for generating realistic, coherent text, which makes them useful for a variety of natural language processing tasks such as text generation, translation, and question answering.

 ## Model description

- More information needed
-
+ This model uses the base GPT-2 architecture with [insert number] parameters, trained from scratch (random initialization). It was trained on the `huggingface-course/codeparrot-ds-valid` dataset, a small validation split of the CodeParrot corpus of Python source files rather than the WebText data used for the original GPT-2. Due to the limited training data, it may not perform as well as other pre-trained GPT-2 models available on Hugging Face.
 ## Intended uses & limitations

- More information needed
-
+ This model is intended for personal learning and exploration of the GPT-2 architecture. Due to its limited training data, it may not be suitable for real-world applications.
 ## Training and evaluation data

- More information needed
+ This model was trained using the Transformers library with the following specifications:

- ## Training procedure
+ - Training Data: `huggingface-course/codeparrot-ds-valid`
+ - Training Script: [Training_a_causal_language_model_from_scratch](https://github.com/kailas711/HugginFace-NLP-Course/blob/af464abed3f79fe7434f3310ceb97bfb68cddcef/Training_a_causal_language_model_from_scratch.ipynb)
+ -

 ### Training hyperparameters

@@ -54,10 +53,6 @@ The following hyperparameters were used during training:
 - num_epochs: 1
 - mixed_precision_training: Native AMP

- ### Training results
-
-
-
 ### Framework versions

 - Transformers 4.42.4
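The new card text describes training a causal language model from scratch with the 🤗 Transformers `Trainer`. A minimal sketch of that setup is shown below; only `num_epochs: 1` and Native AMP mixed precision come from the card, while the context length, batch size, learning rate, and the truncation-based tokenization are illustrative assumptions, and the linked course notebook remains the authoritative training script.

```python
from datasets import load_dataset
from transformers import (
    AutoConfig,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    Trainer,
    TrainingArguments,
)

context_length = 128  # assumption; not stated in the card

# Training data named in the card.
raw = load_dataset("huggingface-course/codeparrot-ds-valid", split="validation")

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 defines no pad token

def tokenize(batch):
    # Truncate each file to the context length; a fuller pipeline would
    # concatenate and chunk documents instead of truncating.
    return tokenizer(batch["content"], truncation=True, max_length=context_length)

tokenized = raw.map(tokenize, batched=True, remove_columns=raw.column_names)

# Randomly initialized GPT-2, i.e. trained "from scratch" as the card says.
config = AutoConfig.from_pretrained(
    "gpt2", vocab_size=len(tokenizer), n_ctx=context_length
)
model = GPT2LMHeadModel(config)

# mlm=False makes the collator build next-token (causal LM) labels.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="codeparrot-ds",
    num_train_epochs=1,              # from the card
    fp16=True,                       # "Native AMP" mixed precision, from the card
    per_device_train_batch_size=32,  # assumption
    learning_rate=5e-4,              # assumption
)

trainer = Trainer(
    model=model,
    args=args,
    tokenizer=tokenizer,
    data_collator=collator,
    train_dataset=tokenized,
)
trainer.train()
trainer.save_model()  # writes the final weights (and tokenizer) to "codeparrot-ds"
```

Setting `mlm=False` on the collator is what makes this a causal (next-token) objective rather than masked language modeling.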
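For the intended personal-learning use, the saved checkpoint can be exercised with the standard text-generation pipeline to observe the next-token prediction behaviour described above. In this sketch, `"codeparrot-ds"` is assumed to be the local output directory written by the training sketch (or a Hub repo id once the model is pushed), and the prompt is an arbitrary Python snippet.

```python
from transformers import pipeline

# "codeparrot-ds" is assumed to be the local output_dir saved above,
# or the Hub repo id the checkpoint is pushed to.
generator = pipeline("text-generation", model="codeparrot-ds")

prompt = "# load a csv file into a pandas dataframe\nimport pandas as pd\n"
outputs = generator(prompt, max_new_tokens=40, num_return_sequences=1)
print(outputs[0]["generated_text"])
```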