gbyuvd/chemfie-gpt-experiment-1

gbyuvd committed on Aug 15

Commit d093493 • 1 Parent(s): 3f23a38

Update README.md

Files changed (1)

README.md +11 -10

README.md CHANGED

@@ -7,7 +7,7 @@ tags:
 ---
 # chemfie-gpt-experiment-1
-On-going training (1/4)
 ## Model Details
 - **Model Type**: GPT-2
@@ -72,7 +72,8 @@ C(CCCCCCCCO)=CCC=C
 ## Training Data
 - **Source**: Curated and merged from COCONUTDB (Sorokina et al., 2021), ChemBL34 (Zdrazil et al., 2023), and SuperNatural3 (Gallo et al. 2023) database
-- **Total**: 2,346,680 samples
 - **Validation**: 293,336 samples
 - **Per chunk**: 586,670 train, 73,334 validation, 73,334 test
 - **Random seed for split**: 42
@@ -85,12 +86,12 @@ C(CCCCCCCCO)=CCC=C
 ## Training Logs
-| Chunk | Training Loss | Validation Loss |  Status   |
-| :---: | :-----------: | :-------------: | :-------: |
-|   I   |   1.346400    |    1.065180     |   Done    |
-|  II   |               |                 |  Ongoing  |
-|  III  |               |                 | Scheduled |
-|  IV   |               |                 | Scheduled |
 ## Evaluation Results
@@ -106,8 +107,8 @@ C(CCCCCCCCO)=CCC=C
 - The information and model provided is for academic purposes only. It is intended for educational and research use, and should not be used for any commercial or legal purposes. The author do not guarantee the accuracy, completeness, or reliability of the information.
 ## Additional Information
-- Part of the chemfie-gpt/T5 project
-- Serves as a baseline for future experiments with further curated datasets and architectural modifications
 ## Citation
 ### BibTeX

 ---
 # chemfie-gpt-experiment-1
+On-going training (2/4)
 ## Model Details
 - **Model Type**: GPT-2
 ## Training Data
 - **Source**: Curated and merged from COCONUTDB (Sorokina et al., 2021), ChemBL34 (Zdrazil et al., 2023), and SuperNatural3 (Gallo et al. 2023) database
+- **Total**: 2,933,355 samples
+- **Total Train**: 2,346,680 samples
 - **Validation**: 293,336 samples
 - **Per chunk**: 586,670 train, 73,334 validation, 73,334 test
 - **Random seed for split**: 42
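
The sample counts listed above can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check, not part of the project's code. It assumes four equally sized chunks (taken from chunks I to IV in the Training Logs table), and the constant and function names are hypothetical.

```python
# Figures copied from the Training Data list in the updated README.
TOTAL = 2_933_355
TOTAL_TRAIN = 2_346_680
TOTAL_VALIDATION = 293_336
PER_CHUNK = {"train": 586_670, "validation": 73_334, "test": 73_334}

# Assumption: four equally sized chunks (I to IV in the Training Logs table).
N_CHUNKS = 4


def totals_from_chunks(per_chunk: dict, n_chunks: int) -> dict:
    """Scale the per-chunk split sizes up to whole-dataset split sizes."""
    return {split: count * n_chunks for split, count in per_chunk.items()}


totals = totals_from_chunks(PER_CHUNK, N_CHUNKS)
accounted = sum(totals.values())
unaccounted = TOTAL - accounted

print(totals)
print(f"accounted for: {accounted:,} of {TOTAL:,} (difference: {unaccounted})")
```

Under that assumption, the per-chunk train and validation sizes multiply out to exactly the listed Total Train and Validation figures. The three splits together sum to 2,933,352, which is 3 fewer than the listed Total. The README does not say where those 3 samples went.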
 ## Training Logs
+| Chunk | Chunk's Training Loss | Chunk's Validation Loss |  Status   |
+| :---: | :-------------------: | :---------------------: | :-------: |
+|   I   |       1.346400        |        1.065180         |   Done    |
+|  II   |       1.123500        |        0.993118         |   Done    |
+|  III  |                       |                         |  Ongoing  |
+|  IV   |                       |                         | Scheduled |
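
To put the two completed rows in perspective, the relative change in validation loss between chunks can be computed directly from the table. This is a minimal sketch with hypothetical names. Since each chunk appears to have its own validation split, the comparison is indicative rather than a like-for-like measurement.

```python
# Validation losses copied from the Training Logs table (chunks I and II).
VAL_LOSS = {"I": 1.065180, "II": 0.993118}


def relative_change(before: float, after: float) -> float:
    """Fractional change from `before` to `after`; negative means a decrease."""
    return (after - before) / before


change = relative_change(VAL_LOSS["I"], VAL_LOSS["II"])
print(f"Validation loss change from chunk I to chunk II: {change:.2%}")
```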
 ## Evaluation Results
 - The information and model provided is for academic purposes only. It is intended for educational and research use, and should not be used for any commercial or legal purposes. The author do not guarantee the accuracy, completeness, or reliability of the information.
 ## Additional Information
+- Part of experimental chemfie-gpt/T5 project
+- Serves as a baseline for future experiments with further curated datasets, training, and architectural modifications
 ## Citation
 ### BibTeX