prompteus committed on
Commit
e588049
1 Parent(s): 2068669

Update README.md

Files changed (1)
  1. README.md +23 -61
README.md CHANGED
@@ -1,6 +1,4 @@
 ---
-# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
-# Doc / guide: https://huggingface.co/docs/hub/model-cards
 datasets:
 - MU-NLPC/Calc-gsm8k
 - MU-NLPC/Calc-aqua_rat
@@ -9,51 +7,19 @@ datasets:
 metrics:
 - exact_match
 - rouge
-model-index:
-- name: calc-t5-xl
-  results:
-  - task:
-      type: question-answering
-      name: Question Answering
-    dataset:
-      type: gsm8k
-      name: GSM8K
-      split: validation
-    metrics:
-    - type: exact_match
-      value: 0.420
-    - type: rouge
-      value: 0.627
-  - task:
-      type: question-answering
-      name: Question Answering
-    dataset:
-      type: aqua_rat
-      name: AQUA-RAT
-      split: validation
-    metrics:
-    - type: exact_match
-      value: 0.06
-    - type: rouge
-      value: 0.323
 license: apache-2.0
 language:
 - en
 ---
 
-# Model Card for calc-t5-xl
-
-<!-- Provide a quick summary of what the model is/does. -->
+# Model Card for calcformer-t5-xl
 
 This model generates reasoning chains over mathematical questions while **using an external tool: Sympy calculator**.
 
-## Model Details
-
-### Model Description
+## Model Description
 
-<!-- Provide a longer summary of what this model is. -->
-
-With the idea to offload a symbolic reasoning from the stochastic language model,
+With the idea to offload the symbolic computation from the stochastic language model,
 we train this model to utilize a calculator **for all applicable numeric operations**.
 This is achieved by training the model to construct calls to the tool's API in this format:
@@ -64,25 +30,28 @@ This is achieved by training the model to construct calls to the tool's API in t
 where `<gadget>` segment triggers a call of the tool,
 which is subsequently served by extending model's decoder input context by adding the output of the tool within the `<output>` segment.
 
-- **Developed by:** Anonymous
+- **Developed by:** Calcformer team
 - **Model type:** Autoregressive Encoder-Decoder
 - **Language(s):** en
-- **Finetuned from:** t5-3b
+- **Finetuned from:** t5-xl
 
-### Model Sources
-
-<!-- Provide the basic links for the model. -->
-
-- **Repository:** https://github.com/emnlp2023sub/gadgets
-- **Paper:** Stay tuned!
+## Sources
+
+- **Repository:** <https://github.com/prompteus/calc-x>
+- **Paper:** <https://arxiv.org/abs/2305.15017>
+- [**Calcformer model family on HF**](https://huggingface.co/collections/MU-NLPC/calcformers-65367392badc497807b3caf5)
+- [**Calc-X dataset collection on HF**](https://huggingface.co/collections/MU-NLPC/calc-x-652fee9a6b838fd820055483)
 
 ## Usage
 
 Additionally to conventional generation, using Tool-augmented generation requires
 (1) implementation of the tool(s) and
-(2) a customization of generate() method augmenting input context on-demand with the outputs of the tools.
+(2) a customization of `generate()` method augmenting input context on-demand with the outputs of the tools.
 
-You can find these two components implemented in the **gadgets/models.py** and **gadgets/gadget.py** in the project's [home repo](https://github.com/emnlp2023sub/gadgets).
+You can find these two components implemented in the attached **gadgets/model.py** and **gadgets/gadget.py** in this model's repo
+and the project's [home repo](https://github.com/prompteus/calc-x).
 
 After adding these two scripts to your directory, you can use the model as follows:
@@ -93,8 +62,9 @@ from gadgets.model import gadget_assisted_model
 from gadgets.gadget import Calculator
 
 GadgetAssistedT5 = gadget_assisted_model(T5ForConditionalGeneration)
-model = GadgetAssistedT5.from_pretrained("emnlp2023/calc-t5-xl")
-tokenizer = T5Tokenizer.from_pretrained("emnlp2023/calc-t5-xl")
+model_name = "MU-NLPC/calcformer-t5-xl"
+model = GadgetAssistedT5.from_pretrained(model_name)
+tokenizer = T5Tokenizer.from_pretrained(model_name)
 
 model.prepare_for_generate(tokenizer,
                            enabled_gadgets=[Calculator()],
@@ -110,7 +80,9 @@ inputs = tokenizer(query, return_tensors="pt")
 output_ids = model.generate(**inputs)
 tokenizer.decode(output_ids[0], spaces_between_special_tokens=False)
 ```
+
 This returns:
+
 ```html
 According to the ratio, for every 5 parts that Johnson gets, Mike gets 2 parts Since Johnson got $2500,
 each part is therefore $2500/5 = $<gadget id="calculator">2500/5</gadget><output>500</output> 500
@@ -119,26 +91,16 @@ After buying the shirt he will have $1000-$200 = $<gadget id="calculator">1000-2
 Final result is<result>800</result></s>
 ```
 
-### Out-of-Scope Usage
-
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+## Out-of-Scope Usage
 
 Note that given the limited scope of the exercises' complexity in the training, this model will not work well for tasks requiring
 more complex algebraic operations, including equations, variables and operations outside the scope of (+-*/).
 
-## Training Details
-
-### Training Data
-
-<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+## Training
 
-This model was trained on our Calculator-augmented set of
-- [Calc Ape210k](https://huggingface.co/datasets/emnlp2023/Calc-ape210k) ([original Ape210k on github](https://github.com/Chenny0808/ape210k))
-- [Calc MathQA](https://huggingface.co/datasets/emnlp2023/Calc-math_qa) ([original MathQA on HF](https://huggingface.co/datasets/math_qa))
-- [Calc GSM8K](https://huggingface.co/datasets/emnlp2023/Calc-gsm8k) ([original GSM8K on HF](https://huggingface.co/datasets/gsm8k))
-- [Calc Aqua-RAT](https://huggingface.co/datasets/emnlp2023/Calc-aqua_rat) ([original Aqua-RAT on HF](https://huggingface.co/datasets/aqua_rat))
-
-in a standard auto-regressive setup i.e. for a conditional next-token prediction with teacher-forced prefix.
+This model was trained on [Calc-X](https://huggingface.co/collections/MU-NLPC/calc-x-652fee9a6b838fd820055483), a collection of math problem datasets which we converted into CoT with calculator interactions.
+We used a standard auto-regressive transformer training, i.e. a conditional next-token prediction with cross-entropy loss. For more detail about data, training or evaluation, see the [Calc-X and Calcformers paper](https://arxiv.org/abs/2305.15017).
 
 ## Cite
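The `<gadget>`/`<output>` round trip that the card describes can be sketched in a few lines. This is an illustrative stand-in, not the repository's implementation: the real `Calculator` gadget in `gadgets/gadget.py` wraps sympy, and the context extension happens inside the customized `generate()` of `gadget_assisted_model`. Here, a small `ast`-based evaluator (restricted to the `+ - * /` scope the card mentions) serves a pending calculator call by appending an `<output>` segment; the function names are hypothetical:

```python
import ast
import operator
import re

# Stand-in for the sympy-backed Calculator gadget: safely evaluates a plain
# arithmetic expression limited to + - * / and unary minus.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def calculate(expr: str) -> str:
    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.operand))
        raise ValueError(f"unsupported expression: {expr!r}")
    result = ev(ast.parse(expr, mode="eval"))
    # Render whole numbers without a trailing ".0", as in the sample transcript.
    return str(int(result)) if float(result).is_integer() else str(result)

# A decoded prefix that stops right after a closing </gadget> tag.
GADGET_CALL = re.compile(r'<gadget id="calculator">(.*?)</gadget>$')

def serve_gadget_call(generated_so_far: str) -> str:
    """If decoding halted on a closing </gadget> tag, evaluate the call and
    extend the decoder context with the tool's <output>...</output> segment."""
    match = GADGET_CALL.search(generated_so_far)
    if match is None:
        return generated_so_far  # no pending tool call
    return generated_so_far + f"<output>{calculate(match.group(1))}</output>"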