Update README.md
README.md
CHANGED
@@ -1,12 +1,12 @@
 ---
 license: apache-2.0
 base_model: Qwen/Qwen2-1.5B
-tags:
-- generated_from_trainer
 metrics:
 - accuracy
 datasets:
 - BEE-spoke-data/stepbasin-books
+language:
+- en
 ---
 
 [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/pszemraj/long-generation-tests/runs/ethp25f9)
@@ -15,52 +15,9 @@ datasets:
 > [!IMPORTANT]
 > this was finetuned at 16384 context length
 
-This model is a fine-tuned version of [Qwen/Qwen2-1.5B](https://huggingface.co/Qwen/Qwen2-1.5B) on
+This model is a fine-tuned version of [Qwen/Qwen2-1.5B](https://huggingface.co/Qwen/Qwen2-1.5B) on https://github.com/stepbasin/books/tree/master/books
+
 It achieves the following results on the evaluation set:
 - Loss: 2.8110
 - Accuracy: 0.4298
-- Num Input Tokens Seen: 44040192
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
-
-### Training hyperparameters
-
-The following hyperparameters were used during training:
-- learning_rate: 3e-05
-- train_batch_size: 1
-- eval_batch_size: 1
-- seed: 80085
-- gradient_accumulation_steps: 32
-- total_train_batch_size: 32
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: cosine
-- lr_scheduler_warmup_ratio: 0.05
-- num_epochs: 3.0
-
-### Training results
-
-| Training Loss | Epoch | Step | Validation Loss | Accuracy | Input Tokens Seen |
-|:-------------:|:------:|:----:|:---------------:|:--------:|:-----------------:|
-| 2.7792 | 0.9967 | 28 | 2.8183 | 0.4287 | 14729216 |
-| 2.6971 | 1.9933 | 56 | 2.8112 | 0.4297 | 29458432 |
-| 2.7116 | 2.9900 | 84 | 2.8110 | 0.4298 | 44040192 |
-
-
-### Framework versions
-
-- Transformers 4.42.4
-- Pytorch 2.3.1+cu121
-- Datasets 2.20.0
-- Tokenizers 0.19.1
+- Num Input Tokens Seen: 44040192
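
For reference, the hyperparameter list removed in this commit maps one-to-one onto `transformers` `TrainingArguments`. Below is a minimal sketch of that mapping, assuming the run used the standard `Trainer` (the training script itself is not part of this commit, and `output_dir` is a placeholder). Note that the effective batch size of 32 comes from a per-device batch of 1 with 32 gradient-accumulation steps.

```python
# Sketch: the old card's hyperparameter list expressed as TrainingArguments.
# Assumption: the run used the standard HF Trainer; only values listed in
# the removed section are set here. output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2-1.5b-stepbasin-books",  # hypothetical
    learning_rate=3e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=32,  # total train batch size: 1 * 32 = 32
    seed=80085,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=3.0,
    adam_beta1=0.9,                  # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```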
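The card does not include a usage snippet. A minimal generation sketch follows, assuming a standard causal-LM checkpoint on the Hub; the repo id below is hypothetical, since the commit does not name the final repo, but any Qwen2-1.5B fine-tune loads the same way.

```python
# Minimal sketch of loading and sampling from the fine-tuned model.
# NOTE: the model_id is a placeholder/assumption, not named in this commit.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pszemraj/qwen2-1.5b-stepbasin-books"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 inference
    device_map="auto",
)

prompt = "It was a dark and stormy night"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# The card notes fine-tuning at 16384 context, so longer generations
# than the 512 tokens sketched here should also be in scope.
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```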