hiroshi-matsuda-rit committed on
Commit
d9559d4
1 Parent(s): f838186

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -48,7 +48,7 @@ This model is trained on the Japanese texts extracted from the [mC4](https://hug
  We used the [Sudachi](https://github.com/WorksApplications/Sudachi) to split texts into sentences, and also applied a simple rule-based filter to remove nonlinguistic segments of mC4 multilingual corpus.
  The extracted texts contains over 600M sentences in total, and we used approximately 200M sentences for pretraining.

- We used [huggingface/transformers RoBERTa implementation](https://github.com/huggingface/transformers/tree/v4.21.0/src/transformers/models/roberta) for pretraining. The time required for the pretrainig was about 300 hours using GCP A100 8gpu instance with enabling Automatic Mixed Precision.
+ We used [huggingface/transformers RoBERTa implementation](https://github.com/huggingface/transformers/tree/v4.21.0/src/transformers/models/roberta) for pretraining. The time required for the pretrainig was about 700 hours using GCP A100 8gpu instance with enabling Automatic Mixed Precision.

  ## Licenses
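The changed README line describes pretraining with Automatic Mixed Precision (AMP). As background, here is a minimal sketch of what an AMP training step looks like in PyTorch; the model and data are hypothetical stand-ins (the actual run used the linked huggingface/transformers RoBERTa implementation on GCP A100 GPUs, not this toy linear model):

```python
import torch
import torch.nn.functional as F

# Hypothetical toy model; the actual pretraining used RoBERTa from
# huggingface/transformers v4.21.0.
model = torch.nn.Linear(8, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(4, 8)          # dummy input batch
y = torch.randint(0, 2, (4,))  # dummy labels

# Automatic Mixed Precision: the forward pass runs in a lower-precision
# dtype inside the autocast context. On an A100 this would be
# device_type="cuda" (typically paired with torch.cuda.amp.GradScaler);
# bfloat16 on CPU keeps the sketch runnable without a GPU.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    loss = F.cross_entropy(model(x), y)

loss.backward()  # gradients accumulate in full precision
opt.step()
```

AMP of this kind reduces memory use and speeds up matrix multiplies on tensor-core GPUs, which is why it is commonly enabled for large pretraining runs like the one described above.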