Update README.md
README.md
CHANGED
@@ -7,7 +7,7 @@ tags:
 ---
 # chemfie-gpt-experiment-1

-On-going training (

 ## Model Details
 - **Model Type**: GPT-2
@@ -72,7 +72,8 @@ C(CCCCCCCCO)=CCC=C

 ## Training Data
 - **Source**: Curated and merged from COCONUTDB (Sorokina et al., 2021), ChemBL34 (Zdrazil et al., 2023), and SuperNatural3 (Gallo et al., 2023) databases
-- **Total**: 2,
 - **Validation**: 293,336 samples
 - **Per chunk**: 586,670 train, 73,334 validation, 73,334 test
 - **Random seed for split**: 42
@@ -85,12 +86,12 @@ C(CCCCCCCCO)=CCC=C
 ## Training Logs


-| Chunk | Training Loss | Validation Loss | Status |
-| :---: |
-| I |
-| II |
-| III |
-| IV |


 ## Evaluation Results
@@ -106,8 +107,8 @@ C(CCCCCCCCO)=CCC=C
 - The information and model provided are for academic purposes only. They are intended for educational and research use and should not be used for any commercial or legal purposes. The author does not guarantee the accuracy, completeness, or reliability of the information.

 ## Additional Information
-- Part of
-- Serves as a baseline for future experiments with further curated datasets and architectural modifications

 ## Citation
 ### BibTeX
@@ -7,7 +7,7 @@ tags:
 ---
 # chemfie-gpt-experiment-1

+On-going training (2/4)

 ## Model Details
 - **Model Type**: GPT-2
@@ -72,7 +72,8 @@ C(CCCCCCCCO)=CCC=C

 ## Training Data
 - **Source**: Curated and merged from COCONUTDB (Sorokina et al., 2021), ChemBL34 (Zdrazil et al., 2023), and SuperNatural3 (Gallo et al., 2023) databases
+- **Total**: 2,933,355 samples
+- **Total Train**: 2,346,680 samples
 - **Validation**: 293,336 samples
 - **Per chunk**: 586,670 train, 73,334 validation, 73,334 test
 - **Random seed for split**: 42
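
The split sizes listed above can be cross-checked with a little arithmetic. The sketch below is not part of the commit; it assumes the data is divided into four equal chunks (I to IV, inferred from the training log table), and the constant and function names are illustrative only.

```python
# Split sizes as reported in the README's "Training Data" section.
TOTAL = 2_933_355
TOTAL_TRAIN = 2_346_680
TOTAL_VALIDATION = 293_336
PER_CHUNK = {"train": 586_670, "validation": 73_334, "test": 73_334}

# Assumption: four chunks (I-IV), inferred from the training log table.
NUM_CHUNKS = 4


def chunk_totals(per_chunk, num_chunks):
    """Multiply each per-chunk count by the number of chunks."""
    return {split: count * num_chunks for split, count in per_chunk.items()}


totals = chunk_totals(PER_CHUNK, NUM_CHUNKS)

# Samples in the reported total that the chunked splits do not cover.
leftover = TOTAL - sum(totals.values())

print(totals)
print("leftover:", leftover)
```

Under that four-chunk assumption, the per-chunk train and validation counts multiply out to the reported "Total Train" and "Validation" figures. A small number of samples in the reported total is not covered by the chunked splits; the README does not say how these are handled.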
@@ -85,12 +86,12 @@ C(CCCCCCCCO)=CCC=C
 ## Training Logs


+| Chunk | Chunk's Training Loss | Chunk's Validation Loss | Status |
+| :---: | :-------------------: | :---------------------: | :-------: |
+| I | 1.346400 | 1.065180 | Done |
+| II | 1.123500 | 0.993118 | Done |
+| III | | | Ongoing |
+| IV | | | Scheduled |


 ## Evaluation Results
@@ -106,8 +107,8 @@ C(CCCCCCCCO)=CCC=C
 - The information and model provided are for academic purposes only. They are intended for educational and research use and should not be used for any commercial or legal purposes. The author does not guarantee the accuracy, completeness, or reliability of the information.

 ## Additional Information
+- Part of experimental chemfie-gpt/T5 project
+- Serves as a baseline for future experiments with further curated datasets, training, and architectural modifications

 ## Citation
 ### BibTeX