Update README.md
README.md
CHANGED
@@ -26,28 +26,27 @@ EduMixtral is a Mixture of Experts (MoE) made with the following models using [M

## Usage

+It is recommended to load the model in 8-bit or 4-bit quantization.
+
```python
-from transformers import AutoTokenizer, AutoModelForCausalLM
-import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("AdamLucek/EduMixtral-4x7B")
model = AutoModelForCausalLM.from_pretrained(
    "AdamLucek/EduMixtral-4x7B",
    device_map="cuda",
-
+    quantization_config=BitsAndBytesConfig(load_in_8bit=True)
)

# Prepare the input text
-input_text = "
+input_text = "Math problem: Xiaoli reads a 240-page story book. She reads (1/8) of the whole book on the first day and (1/5) of the whole book on the second day. How many pages did she read in total in two days?"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

# Generate the output with specified parameters
outputs = model.generate(
    **input_ids,
    max_new_tokens=256,
-    temperature=0.7,
-    top_p=0.9,
    num_return_sequences=1
)

@@ -55,6 +54,19 @@ outputs = model.generate(
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

+**Output:**
+
+>Solution:
+>To find the total number of pages Xiaoli read in two days, we need to add the number of pages she read on the first day and the second day.
+>On the first day, Xiaoli read 1/8 of the book. Since the book has 240 pages, the number of pages she read on the first day is:
+>\[ \frac{1}{8} \times 240 = 30 \text{ pages} \]
+>On the second day, Xiaoli read 1/5 of the book. The number of pages she read on the second day is:
+>\[ \frac{1}{5} \times 240 = 48 \text{ pages} \]
+>To find the total number of pages she read in two days, we add the pages she read on the first day and the second day:
+>\[ 30 \text{ pages} + 48 \text{ pages} = 78 \text{ pages} \]
+>Therefore, Xiaoli read a total of 78 pages in two days.
+>Final answer: Xiaoli read 78 pages in total
+
## 🧩 Configuration

```yaml
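The updated snippet loads the model in 8-bit via `BitsAndBytesConfig(load_in_8bit=True)`. Since the note recommends 8-bit *or* 4-bit quantization, a minimal 4-bit variant could look like the sketch below (it needs the `bitsandbytes` package installed; the NF4 quant type and bfloat16 compute dtype are illustrative choices, not values taken from the model card):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# Illustrative 4-bit (NF4) configuration -- these settings are assumptions, not from the model card
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("AdamLucek/EduMixtral-4x7B")
model = AutoModelForCausalLM.from_pretrained(
    "AdamLucek/EduMixtral-4x7B",
    device_map="cuda",
    quantization_config=bnb_config,
)
```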
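The edit also drops `temperature=0.7` and `top_p=0.9` from the `generate` call, so decoding falls back to the model's default strategy (typically greedy unless the generation config enables sampling). If sampled outputs are wanted, a sketch along these lines should work, reusing `model`, `tokenizer`, and `input_ids` from the snippet above; the parameter values are illustrative, not prescribed by the repo:

```python
# Sampling-based generation -- parameter values are illustrative, not from the model card
outputs = model.generate(
    **input_ids,
    max_new_tokens=256,
    do_sample=True,       # required for temperature/top_p to have any effect
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```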
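As a quick check of the arithmetic in the sample output, the fractions do work out to 78 pages:

```python
from fractions import Fraction

# Verify the sample output's arithmetic: 1/8 and 1/5 of a 240-page book
day1 = 240 * Fraction(1, 8)      # 30 pages
day2 = 240 * Fraction(1, 5)      # 48 pages
print(day1, day2, day1 + day2)   # 30 48 78
```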