dotw committed on
Commit
29dd675
1 Parent(s): b72b898

Update README.md

  1. README.md +36 -18
README.md CHANGED
@@ -78,7 +78,7 @@ Users (both direct and downstream) should be made aware of the risks, biases and
78
 
79
  Use the code below to get started with the model.
80
 
81
- [Todo: Insert Code Here]
82
 
83
  ## Training Details
84
 
@@ -113,11 +113,11 @@ SEA LION 3B was trained on 980B tokens of RefinedWeb (English) and mC4 (Chinese,
113
 
114
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
115
 
116
- #### Preprocessing [optional]
117
 
118
- [More Information Needed]
119
 
120
- SEA LION 3B was trained on 256 A100 40GB GPUs, using MosaicML Composer.
121
 
122
  #### Training Hyperparameters
123
 
@@ -146,23 +146,23 @@ The training took 14 days to complete.
146
 
147
  <!-- This should link to a Dataset Card if possible. -->
148
 
149
- [More Information Needed]
150
 
151
  #### Factors
152
 
153
  <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
154
 
155
- [More Information Needed]
156
 
157
  #### Metrics
158
 
159
  <!-- These are the evaluation metrics being used, ideally with a description of why. -->
160
 
161
- [More Information Needed]
162
 
163
  ### Results
164
 
165
- [More Information Needed]
166
 
167
  #### Summary
168
 
@@ -202,7 +202,6 @@ SEA LION 3B is a decoder model using the MPT architecture.
202
 
203
  ### Compute Infrastructure
204
 
205
-
206
  #### Hardware
207
 
208
  SEA LION 3B was trained on AWS EC2 cluster comprising 32 p4d.24xlarge instances, using a total of 256 A100 40GB GPUs.
@@ -217,28 +216,47 @@ SEA LION 3B was trained using MosaicML Composer using PyTorch FullyShardedDataPa
217
 
218
  **BibTeX:**
219
 
220
- [More Information Needed]
221
 
222
  **APA:**
223
 
224
- [More Information Needed]
225
 
226
  ## Glossary [optional]
227
 
228
  <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
229
 
230
- [More Information Needed]
231
 
232
  ## More Information [optional]
233
 
234
- [More Information Needed]
235
-
236
- ## Model Card Authors [optional]
237
-
238
- [More Information Needed]
239
 
240
  ## Model Card Contact
241
 
242
- [More Information Needed]
243
 
244
 
 
78
 
79
  Use the code below to get started with the model.
80
 
81
+ [ Todo: Insert Code Here ]
82
 
83
  ## Training Details
84
 
 
113
 
114
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
115
 
116
+ SEA LION 3B was trained on 256 A100 40GB GPUs, using MosaicML Composer.
117
 
118
+ #### Preprocessing [optional]
119
 
120
+ N/A
121
 
122
  #### Training Hyperparameters
123
 
 
146
 
147
  <!-- This should link to a Dataset Card if possible. -->
148
 
149
+ _Coming soon_
150
 
151
  #### Factors
152
 
153
  <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
154
 
155
+ _Coming soon_
156
 
157
  #### Metrics
158
 
159
  <!-- These are the evaluation metrics being used, ideally with a description of why. -->
160
 
161
+ _Coming soon_
162
 
163
  ### Results
164
 
165
+ _Coming soon_
166
 
167
  #### Summary
168
 
 
202
 
203
  ### Compute Infrastructure
204
 
 
205
  #### Hardware
206
 
207
  SEA LION 3B was trained on AWS EC2 cluster comprising 32 p4d.24xlarge instances, using a total of 256 A100 40GB GPUs.
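The hardware figures quoted in this README can be cross-checked with a little arithmetic. The sketch below is a rough sanity check, not part of the commit. It takes the instance count, GPU memory size, 980B training tokens, and 14-day training time from the README text visible in this diff. The figure of 8 GPUs per p4d.24xlarge instance comes from AWS's published instance specification, not from the README. The throughput figure assumes all 256 GPUs were busy for the full 14 days, so it should be read as an approximate average at best.

```python
# Back-of-the-envelope check of the cluster figures quoted in the README.
INSTANCES = 32            # p4d.24xlarge instances (from the README)
GPUS_PER_INSTANCE = 8     # assumption: AWS spec for p4d.24xlarge, not stated in the README
GPU_MEMORY_GB = 40        # A100 40GB (from the README)
TRAINING_DAYS = 14        # "The training took 14 days to complete." (from the README)
TRAINING_TOKENS = 980e9   # "trained on 980B tokens" (from the README)

# Total accelerators and aggregate GPU memory across the cluster.
total_gpus = INSTANCES * GPUS_PER_INSTANCE
total_gpu_memory_gb = total_gpus * GPU_MEMORY_GB

# GPU-hours, assuming every GPU ran for the whole 14 days.
gpu_hours = total_gpus * TRAINING_DAYS * 24

# Average tokens processed per GPU per second under the same assumption.
tokens_per_gpu_second = TRAINING_TOKENS / (gpu_hours * 3600)

print(f"total GPUs:            {total_gpus}")
print(f"total GPU memory (GB): {total_gpu_memory_gb}")
print(f"GPU-hours:             {gpu_hours}")
print(f"tokens / GPU / second: {tokens_per_gpu_second:.0f}")
```

The first result matches the README's stated total of 256 A100 GPUs, which suggests the instance count and GPU count in the Hardware section are consistent with each other.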
 
216
 
217
  **BibTeX:**
218
 
219
+ N/A
220
 
221
  **APA:**
222
 
223
+ N/A
224
 
225
  ## Glossary [optional]
226
 
227
  <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
228
 
229
+ N/A
230
 
231
  ## More Information [optional]
232
 
233
+ N/A
234
+
235
+ ## The Team
236
+
237
+ Hamsawardhini Rengarajan
238
+ Holy Lovenia
239
+ Lam Clarence
240
+ Leong Weiqi
241
+ Li Yier
242
+ Ng Raymond
243
+ Ngui Jian Gang
244
+ Railey Montalan
245
+ Tai Ngee Chia
246
+ Tan Choon Meng
247
+ Thanh Ngan Nguyen
248
+ Teo Jin Howe
249
+ Teo Wei Yi
250
+ Yeo Yeow Tong
251
+ Yong Xianbin
252
+ Yosephine
253
+ William Tjhi
254
+ Ong Tat-Wee David
255
+ Darius Liu
256
+ Leslie Teo
257
 
258
  ## Model Card Contact
259
 
260
+ [ Todo: Get AISG Contact ]
261
 
262