Update README.md
README.md
CHANGED
@@ -80,8 +80,9 @@ C(CCCCCCCCO)=CCC=C
 
 ## Training Procedure
 - **Batch Size**: 64
+- **Num Epoch for Each Chunk**: 1
 - **Learning Rate**: 1.5e-5
-- **Optimizer**: Ranger21 (MADGRAD-Lookahead-AdaBelief with gradient centralization, gradient clipping, and weight decay)
+- **Optimizer**: Ranger21 (MADGRAD-Lookahead-AdaBelief with gradient centralization, linear warm up (22%), gradient clipping, and L2 weight decay)
 
 ## Training Logs
 
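For readers unfamiliar with the "linear warm up (22%)" entry added in this change, the following is a minimal sketch of what such a schedule could look like. It is not Ranger21's implementation. It assumes the 22% refers to a fraction of the total optimizer steps and that the rate is held at the README's base value of 1.5e-5 afterwards. The function name `linear_warmup_lr` and its parameters are hypothetical, chosen only for illustration.

```python
def linear_warmup_lr(
    step: int,
    total_steps: int,
    base_lr: float = 1.5e-5,        # learning rate listed in the README
    warmup_fraction: float = 0.22,  # assumed meaning of "linear warm up (22%)"
) -> float:
    """Return the learning rate at a 0-indexed step under linear warm-up only.

    Illustrative sketch: any warm-down or other scheduling the real
    optimizer may apply is deliberately left out.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Number of steps over which the rate ramps up linearly (at least one).
    warmup_steps = max(1, round(total_steps * warmup_fraction))
    if step < warmup_steps:
        # Ramp from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # After warm-up, hold the base rate.
    return base_lr
```

With `total_steps=100`, the rate would rise over the first 22 steps and stay at the base value from then on.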