lnxdx committed on
Commit 3fb5390
1 Parent(s): 2917ab9

Update README.md

Files changed (1)
  1. README.md +51 -2
README.md CHANGED
@@ -129,11 +129,60 @@ The following hyperparameters were used during training:
  | 0.8238 | 11.88 | 1900 | 0.6735 | 0.3297 |
  | 0.7618 | 12.5 | 2000 | 0.6728 | 0.3286 |

- #### Choosing the best model
+ #### Hyperparameter Tuning
  Several models with different hyperparameters were trained. The following figures show the training process for three of them.
  ![wer](wandb-wer.png)
  ![loss](wandb-loss.png)
- As you can see this model performs better on evaluation set.
+ '20_2000_1e-5_hp-mehrdad' is the current model; its hyperparameters are:
+ ```python
+ model = Wav2Vec2ForCTC.from_pretrained(
+     model_name_or_path if not last_checkpoint else last_checkpoint,
+     # hp-mehrdad: Hyperparams of 'm3hrdadfi/wav2vec2-large-xlsr-persian-v3'
+     attention_dropout=0.05316,
+     hidden_dropout=0.01941,
+     feat_proj_dropout=0.01249,
+     mask_time_prob=0.04529,
+     layerdrop=0.01377,
+     ctc_loss_reduction='mean',
+     ctc_zero_infinity=True,
+ )
+
+ learning_rate = 1e-5
+ ```
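Note that `learning_rate` is not an argument of `Wav2Vec2ForCTC.from_pretrained`; it belongs to the training configuration. Below is a minimal sketch of how such a value is typically passed through the 🤗 Transformers `TrainingArguments`/`Trainer` API. The output directory, batch size, epoch count, and the `train_dataset`, `eval_dataset`, and `processor` objects are placeholders, not taken from this repository's training script.

```python
from transformers import Trainer, TrainingArguments

# Hypothetical wiring of the learning rate into the training loop.
# Everything except learning_rate is a placeholder, not taken from this repository.
training_args = TrainingArguments(
    output_dir="./20_2000_1e-5_hp-mehrdad",  # placeholder run directory
    learning_rate=1e-5,                      # the value listed for this configuration
    per_device_train_batch_size=8,           # placeholder
    num_train_epochs=20,                     # placeholder
    evaluation_strategy="steps",
    eval_steps=100,                          # placeholder
)

trainer = Trainer(
    model=model,                      # the Wav2Vec2ForCTC instance configured above
    args=training_args,
    train_dataset=train_dataset,      # placeholder train split
    eval_dataset=eval_dataset,        # placeholder validation split
    tokenizer=processor.feature_extractor,  # placeholder Wav2Vec2Processor
    # a CTC data collator would also be passed in a real run
)
trainer.train()
```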
+ The hyperparameters of '19_2000_1e-5_hp-base' are:
+ ```python
+ model = Wav2Vec2ForCTC.from_pretrained(
+     model_name_or_path if not last_checkpoint else last_checkpoint,
+     # hp-base: Hyperparams similar to 'facebook/wav2vec2-large-xlsr-53' / 'facebook/wav2vec2-xls-r-300m'
+     attention_dropout=0.1,
+     hidden_dropout=0.1,
+     feat_proj_dropout=0.1,
+     mask_time_prob=0.075,
+     layerdrop=0.1,
+     ctc_loss_reduction='mean',
+     ctc_zero_infinity=True,
+ )
+
+ learning_rate = 1e-5
+ ```
+
+ And the hyperparameters of '22_2000_1e-5_hp-masoud' are:
+ ```python
+ model = Wav2Vec2ForCTC.from_pretrained(
+     model_name_or_path if not last_checkpoint else last_checkpoint,
+     # hp-masoud: Hyperparams of 'masoudmzb/wav2vec2-xlsr-multilingual-53-fa'
+     attention_dropout=0.2,
+     hidden_dropout=0.2,
+     feat_proj_dropout=0.1,
+     mask_time_prob=0.2,
+     layerdrop=0.2,
+     ctc_loss_reduction='mean',
+     ctc_zero_infinity=True,
+ )
+
+ learning_rate = 1e-5
+ ```
+ As you can see, this model performs better than the other configurations in terms of WER on the validation (evaluation) set.
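A minimal sketch of how such a WER comparison can be reproduced with the `evaluate` library is shown below. The `model`, `processor`, and `validation_set` objects, as well as the `audio` and `sentence` column names, are assumptions for illustration, not code from this repository.

```python
import torch
import evaluate

# Word error rate metric from the `evaluate` library.
wer_metric = evaluate.load("wer")

def transcribe(batch):
    # Assumes `processor` is a Wav2Vec2Processor and `model` the fine-tuned Wav2Vec2ForCTC.
    inputs = processor(batch["audio"]["array"], sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    pred_ids = torch.argmax(logits, dim=-1)
    batch["prediction"] = processor.batch_decode(pred_ids)[0]
    return batch

# `validation_set` is a placeholder for the held-out split used in the comparison above.
results = validation_set.map(transcribe)
print("WER:", wer_metric.compute(
    predictions=results["prediction"],
    references=results["sentence"],
))
```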

  #### Framework versions