kailasps committed on
Commit
dc8ad0c
1 Parent(s): 4f2b730

Update README.md

Files changed (1)
  1. README.md +7 -12
README.md CHANGED
@@ -4,7 +4,6 @@ base_model: gpt2
 tags:
 - generated_from_trainer
 - code
- - not-for-all-audiences
 model-index:
 - name: codeparrot-ds
   results: []
@@ -23,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->

 # GPT2-Codeparrot

- This model is trained from scratch using random initilization [gpt2](https://huggingface.co/gpt2) on the validaition set of codeparrot.
+ Generative Pre-trained Transformer 2 (GPT-2) is a large language model from OpenAI, available on the Hugging Face Hub as [gpt2](https://huggingface.co/gpt2). It is a decoder-only Transformer trained with a causal language modeling (CLM) objective, meaning the model learns to predict the next token in a sequence given the previous tokens. GPT-2 models are known for generating realistic, coherent text, which makes them useful for a variety of natural language processing tasks such as text generation, translation, and question answering.

 ## Model description

- More information needed
-
+ This model uses the base GPT-2 architecture with [insert number] parameters, trained from scratch (random initialization). It was trained on the `huggingface-course/codeparrot-ds-valid` dataset, a small validation split of the CodeParrot corpus of Python source files rather than the WebText data used for the original GPT-2. Due to the limited training data, it may not perform as well as other pre-trained GPT-2 models available on Hugging Face.
 ## Intended uses & limitations

- More information needed
-
+ This model is intended for personal learning and exploration of the GPT-2 architecture. Due to its limited training data, it may not be suitable for real-world applications.
 ## Training and evaluation data

- More information needed
+ This model was trained using the Transformers library with the following specifications:

- ## Training procedure
+ - Training Data: `huggingface-course/codeparrot-ds-valid`
+ - Training Script: [Training_a_causal_language_model_from_scratch](https://github.com/kailas711/HugginFace-NLP-Course/blob/af464abed3f79fe7434f3310ceb97bfb68cddcef/Training_a_causal_language_model_from_scratch.ipynb)
+ -

 ### Training hyperparameters

@@ -54,10 +53,6 @@ The following hyperparameters were used during training:
 - num_epochs: 1
 - mixed_precision_training: Native AMP

- ### Training results
-
-
-
 ### Framework versions

 - Transformers 4.42.4
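The new card text describes training a causal language model from scratch with the 🤗 Transformers `Trainer`. A minimal sketch of that setup is shown below; only `num_epochs: 1` and Native AMP mixed precision come from the card, while the context length, batch size, learning rate, and the truncation-based tokenization are illustrative assumptions, and the linked course notebook remains the authoritative training script.

```python
from datasets import load_dataset
from transformers import (
    AutoConfig,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    Trainer,
    TrainingArguments,
)

context_length = 128  # assumption; not stated in the card

# Training data named in the card.
raw = load_dataset("huggingface-course/codeparrot-ds-valid", split="validation")

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 defines no pad token

def tokenize(batch):
    # Truncate each file to the context length; a fuller pipeline would
    # concatenate and chunk documents instead of truncating.
    return tokenizer(batch["content"], truncation=True, max_length=context_length)

tokenized = raw.map(tokenize, batched=True, remove_columns=raw.column_names)

# Randomly initialized GPT-2, i.e. trained "from scratch" as the card says.
config = AutoConfig.from_pretrained(
    "gpt2", vocab_size=len(tokenizer), n_ctx=context_length
)
model = GPT2LMHeadModel(config)

# mlm=False makes the collator build next-token (causal LM) labels.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="codeparrot-ds",
    num_train_epochs=1,              # from the card
    fp16=True,                       # "Native AMP" mixed precision, from the card
    per_device_train_batch_size=32,  # assumption
    learning_rate=5e-4,              # assumption
)

trainer = Trainer(
    model=model,
    args=args,
    tokenizer=tokenizer,
    data_collator=collator,
    train_dataset=tokenized,
)
trainer.train()
trainer.save_model()  # writes the final weights (and tokenizer) to "codeparrot-ds"
```

Setting `mlm=False` on the collator is what makes this a causal (next-token) objective rather than masked language modeling.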
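For the intended personal-learning use, the saved checkpoint can be exercised with the standard text-generation pipeline to observe the next-token prediction behaviour described above. In this sketch, `"codeparrot-ds"` is assumed to be the local output directory written by the training sketch (or a Hub repo id once the model is pushed), and the prompt is an arbitrary Python snippet.

```python
from transformers import pipeline

# "codeparrot-ds" is assumed to be the local output_dir saved above,
# or the Hub repo id the checkpoint is pushed to.
generator = pipeline("text-generation", model="codeparrot-ds")

prompt = "# load a csv file into a pandas dataframe\nimport pandas as pd\n"
outputs = generator(prompt, max_new_tokens=40, num_return_sequences=1)
print(outputs[0]["generated_text"])
```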