alanakbik committed on
Commit
ab9a59c
1 Parent(s): 26d2568

update readme

Files changed (1)
  1. README.md +12 -8
README.md CHANGED
@@ -9,11 +9,11 @@ datasets:
  inference: false
  ---
 
- ## English NER in Flair (Ontonotes defeault model)
 
  This is the 18-class NER model for English that ships with [Flair](https://github.com/flairNLP/flair/).
 
- F1-Score: **89.3** (Ontonotes)
 
  Predicts 18 tags:
 
@@ -51,7 +51,7 @@ from flair.data import Sentence
  from flair.models import SequenceTagger
 
  # load tagger
- tagger = SequenceTagger.load("flair/ner-english")
 
  # make example sentence
  sentence = Sentence("On September 1st George Washington won 1 dollar.")
@@ -77,7 +77,7 @@ Span [4,5]: "George Washington" [− Labels: PERSON (0.9604)]
  Span [7,8]: "1 dollar" [− Labels: MONEY (0.9837)]
  ```
 
- So, the entities "*September 1st*" (labeled as a **date**), "*George Washington*" (labeled as a **person**) and "*1 dollar*" (labeled as a **money**) are found in the sentence "*George Washington went to Washington*".
 
 
  ---
@@ -91,8 +91,12 @@ from flair.data import Corpus
  from flair.datasets import CONLL_03
  from flair.embeddings import WordEmbeddings, StackedEmbeddings, FlairEmbeddings
 
- # 1. get the corpus
- corpus: Corpus = CONLL_03()
 
  # 2. what tag do we want to predict?
  tag_type = 'ner'
@@ -104,7 +108,7 @@ tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)
  embedding_types = [
 
  # GloVe embeddings
- WordEmbeddings('glove'),
 
  # contextual string embeddings, forward
  FlairEmbeddings('news-forward'),
@@ -130,7 +134,7 @@ from flair.trainers import ModelTrainer
  trainer = ModelTrainer(tagger, corpus)
 
  # 7. run training
- trainer.train('resources/taggers/ner-english',
  train_with_dev=True,
  max_epochs=150)
  ```
 
  inference: false
  ---
 
+ ## English NER in Flair (Ontonotes default model)
 
  This is the 18-class NER model for English that ships with [Flair](https://github.com/flairNLP/flair/).
 
+ F1-Score: **89.27** (Ontonotes)
 
  Predicts 18 tags:
 
 
  from flair.models import SequenceTagger
 
  # load tagger
+ tagger = SequenceTagger.load("flair/ner-english-ontonotes")
 
  # make example sentence
  sentence = Sentence("On September 1st George Washington won 1 dollar.")
 
  Span [7,8]: "1 dollar" [− Labels: MONEY (0.9837)]
  ```
 
+ So, the entities "*September 1st*" (labeled as a **date**), "*George Washington*" (labeled as a **person**) and "*1 dollar*" (labeled as a **money**) are found in the sentence "*On September 1st George Washington won 1 dollar*".
 
 
  ---
 
  from flair.datasets import CONLL_03
  from flair.embeddings import WordEmbeddings, StackedEmbeddings, FlairEmbeddings
 
+ # 1. load the corpus (Ontonotes does not ship with Flair, you need to download and reformat into a column format yourself)
+ corpus: Corpus = ColumnCorpus(
+ "resources/tasks/onto-ner",
+ column_format={0: "text", 1: "pos", 2: "upos", 3: "ner"},
+ tag_to_bioes="ner",
+ )
 
  # 2. what tag do we want to predict?
  tag_type = 'ner'
 
  embedding_types = [
 
  # GloVe embeddings
+ WordEmbeddings('en-crawl'),
 
  # contextual string embeddings, forward
  FlairEmbeddings('news-forward'),
 
  trainer = ModelTrainer(tagger, corpus)
 
  # 7. run training
+ trainer.train('resources/taggers/ner-english-ontonotes',
  train_with_dev=True,
  max_epochs=150)
  ```
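
The new corpus-loading step expects Ontonotes to be reformatted by hand into a column format with `text`, `pos`, `upos` and `ner` in columns 0 to 3. As a rough illustration of what such a file could look like, here is a minimal sketch that writes and re-reads one, assuming the usual column-corpus layout of one token per line, whitespace-separated columns, and a blank line between sentences. The helper names, the `train.txt` file name, and the sample POS and NER values are hypothetical and only serve the example; they are not taken from the commit. Note also that the import line shown in the diff still names `CONLL_03`, while the new code calls `ColumnCorpus`, so the import would presumably need to become `from flair.datasets import ColumnCorpus` for the snippet to run.

```python
from pathlib import Path

# Hypothetical sample data: (text, pos, upos, ner) per token, with BIO-style NER tags.
# The README passes tag_to_bioes="ner", which asks Flair to convert these to BIOES.
SAMPLE_SENTENCES = [
    [
        ("On", "IN", "ADP", "O"),
        ("September", "NNP", "PROPN", "B-DATE"),
        ("1st", "JJ", "ADJ", "I-DATE"),
        ("George", "NNP", "PROPN", "B-PERSON"),
        ("Washington", "NNP", "PROPN", "I-PERSON"),
        ("won", "VBD", "VERB", "O"),
        ("1", "CD", "NUM", "B-MONEY"),
        ("dollar", "NN", "NOUN", "I-MONEY"),
        (".", ".", "PUNCT", "O"),
    ],
    [
        ("He", "PRP", "PRON", "O"),
        ("won", "VBD", "VERB", "O"),
        (".", ".", "PUNCT", "O"),
    ],
]


def format_sentence(rows):
    """Render one sentence as token-per-line text with space-separated columns."""
    return "\n".join(" ".join(columns) for columns in rows)


def write_column_file(sentences, path):
    """Write all sentences to `path`, separating sentences with a blank line."""
    text = "\n\n".join(format_sentence(rows) for rows in sentences) + "\n"
    Path(path).write_text(text, encoding="utf-8")


def read_column_file(path):
    """Parse a column file back into a list of sentences (lists of column tuples)."""
    sentences, current = [], []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if line.strip():
            current.append(tuple(line.split()))
        elif current:
            # A blank line closes the sentence collected so far.
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences
```

A file written this way, for example with `write_column_file(SAMPLE_SENTENCES, "resources/tasks/onto-ner/train.txt")`, would sit in the folder that the README passes to `ColumnCorpus`. Whether the real Ontonotes conversion needs extra handling (document boundaries, multi-word tokens) is not covered by this sketch.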