flair/ner-english-ontonotes

Token Classification

sequence-tagger-model


alanakbik committed on Feb 21, 2021

Commit ab9a59c • 1 Parent(s): 26d2568

update readme

Files changed (1)

README.md +12 -8

@@ -9,11 +9,11 @@ datasets:
 inference: false
 ---
-## English NER in Flair (Ontonotes defeault model)
 This is the 18-class NER model for English that ships with [Flair](https://github.com/flairNLP/flair/).
-F1-Score: **89.3** (Ontonotes)
 Predicts 18 tags:
@@ -51,7 +51,7 @@ from flair.data import Sentence
 from flair.models import SequenceTagger
 # load tagger
-tagger = SequenceTagger.load("flair/ner-english")
 # make example sentence
 sentence = Sentence("On September 1st George Washington won 1 dollar.")
@@ -77,7 +77,7 @@ Span [4,5]: "George Washington" [− Labels: PERSON (0.9604)]
 Span [7,8]: "1 dollar" [− Labels: MONEY (0.9837)]
 ```
-So, the entities "*September 1st*" (labeled as a **date**), "*George Washington*" (labeled as a **person**) and "*1 dollar*" (labeled as a **money**) are found in the sentence "*George Washington went to Washington*".
 ---
@@ -91,8 +91,12 @@ from flair.data import Corpus
 from flair.datasets import CONLL_03
 from flair.embeddings import WordEmbeddings, StackedEmbeddings, FlairEmbeddings
-# 1. get the corpus
-corpus: Corpus = CONLL_03()
 # 2. what tag do we want to predict?
 tag_type = 'ner'
@@ -104,7 +108,7 @@ tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)
 embedding_types = [
  # GloVe embeddings
- WordEmbeddings('glove'),
  # contextual string embeddings, forward
  FlairEmbeddings('news-forward'),
@@ -130,7 +134,7 @@ from flair.trainers import ModelTrainer
 trainer = ModelTrainer(tagger, corpus)
 # 7. run training
-trainer.train('resources/taggers/ner-english',
  train_with_dev=True,
  max_epochs=150)
 ```

 inference: false
 ---
+## English NER in Flair (Ontonotes default model)
 This is the 18-class NER model for English that ships with [Flair](https://github.com/flairNLP/flair/).
+F1-Score: **89.27** (Ontonotes)
 Predicts 18 tags:
 from flair.models import SequenceTagger
 # load tagger
+tagger = SequenceTagger.load("flair/ner-english-ontonotes")
 # make example sentence
 sentence = Sentence("On September 1st George Washington won 1 dollar.")
 Span [7,8]: "1 dollar" [− Labels: MONEY (0.9837)]
 ```
+So, the entities "*September 1st*" (labeled as a **date**), "*George Washington*" (labeled as a **person**) and "*1 dollar*" (labeled as a **money**) are found in the sentence "*On September 1st George Washington won 1 dollar*".
 ---
 from flair.datasets import CONLL_03
 from flair.embeddings import WordEmbeddings, StackedEmbeddings, FlairEmbeddings
+# 1. load the corpus (Ontonotes does not ship with Flair, you need to download and reformat into a column format yourself)
+corpus: Corpus = ColumnCorpus(
+ "resources/tasks/onto-ner",
+ column_format={0: "text", 1: "pos", 2: "upos", 3: "ner"},
+ tag_to_bioes="ner",
+ )
 # 2. what tag do we want to predict?
 tag_type = 'ner'
 embedding_types = [
  # GloVe embeddings
+ WordEmbeddings('en-crawl'),
  # contextual string embeddings, forward
  FlairEmbeddings('news-forward'),
 trainer = ModelTrainer(tagger, corpus)
 # 7. run training
+trainer.train('resources/taggers/ner-english-ontonotes',
  train_with_dev=True,
  max_epochs=150)
 ```
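
The updated step 1 asks the reader to reformat Ontonotes into a column file themselves, so a small illustration of the expected layout may help. The sketch below uses only the Python standard library and is not Flair's own reader. It assumes one token per line with whitespace-separated columns in the order given by `column_format`, blank lines between sentences, and BIO tags in the `ner` column. The sample rows, including their POS values, and the helper names `read_columns` and `bio_to_bioes` are hypothetical. The second helper shows the kind of BIO-to-BIOES relabeling that `tag_to_bioes="ner"` requests. Note also that the hunk shows only `from flair.datasets import CONLL_03` as context, so `ColumnCorpus` would presumably need to be imported from `flair.datasets` as well.

```python
from typing import Dict, List

# Column layout copied from the ColumnCorpus call in the diff.
COLUMN_FORMAT: Dict[int, str] = {0: "text", 1: "pos", 2: "upos", 3: "ner"}

# Hypothetical sample rows in the assumed layout (not taken from Ontonotes).
SAMPLE = """\
On IN ADP O
September NNP PROPN B-DATE
1st NN NOUN I-DATE
George NNP PROPN B-PERSON
Washington NNP PROPN I-PERSON
won VBD VERB O
1 CD NUM B-MONEY
dollar NN NOUN I-MONEY
. . PUNCT O
"""


def read_columns(text: str, column_format: Dict[int, str]) -> List[List[Dict[str, str]]]:
    """Split column-formatted text into sentences of {column name: value} tokens."""
    sentences: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split()
        current.append({name: fields[idx] for idx, name in column_format.items()})
    if current:
        sentences.append(current)
    return sentences


def bio_to_bioes(tags: List[str]) -> List[str]:
    """Relabel well-formed BIO tags as BIOES (S- for single-token spans, E- for span ends)."""
    converted: List[str] = []
    for i, tag in enumerate(tags):
        if tag == "O":
            converted.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        span_continues = next_tag == "I-" + label
        if prefix == "B":
            converted.append(("B-" if span_continues else "S-") + label)
        else:
            converted.append(("I-" if span_continues else "E-") + label)
    return converted


sentences = read_columns(SAMPLE, COLUMN_FORMAT)
ner_tags = bio_to_bioes([token["ner"] for token in sentences[0]])
for token, tag in zip(sentences[0], ner_tags):
    print(token["text"], tag)
```

If the reformatted files already follow this layout, the `ColumnCorpus` call in the diff should be able to read them from `resources/tasks/onto-ner`; the exact file names it expects there are not shown in this commit.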