mkulkarni24 committed
Commit 7e3196d
1 Parent(s): d939379

Update README.md

Files changed (1)
  1. README.md +105 -3

README.md CHANGED
---
license: cc-by-4.0
---

# KeyBART
KeyBART, as described in Learning Rich Representation of Keyphrases from Text (https://arxiv.org/pdf/2112.08547.pdf), pre-trains a BART-based architecture to produce a concatenated sequence of keyphrases in the CatSeqD format.

We provide some examples of downstream evaluation setups and also show how KeyBART can be used for text-to-text generation in a zero-shot setting.

## Downstream Evaluation

### Keyphrase Generation
```
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("bloomberg/KeyBART")
model = AutoModelForSeq2SeqLM.from_pretrained("bloomberg/KeyBART")

from datasets import load_dataset

# KP20k benchmark used for keyphrase generation evaluation
dataset = load_dataset("midas/kp20k")
```
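
The snippet above only loads the checkpoint and the evaluation data. Below is a minimal generation sketch reusing the `tokenizer` and `model` objects from that snippet; the decoding hyperparameters and the sample `text` are illustrative stand-ins, not the evaluation settings used in the paper.

```
# Reuses `tokenizer` and `model` loaded above; decoding settings are illustrative only.
text = "Keyphrase generation aims to produce a set of phrases that summarize the main topics of a document."

inputs = tokenizer(text, max_length=512, truncation=True, return_tensors="pt")
outputs = model.generate(**inputs, num_beams=5, max_length=64, early_stopping=True)
decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)

# KeyBART emits a single ";"-separated sequence of keyphrases (CatSeq-style output)
keyphrases = [kp.strip() for kp in decoded.split(";") if kp.strip()]
print(keyphrases)
```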

Reported Results:

#### Present Keyphrase Generation
| | Inspec | | NUS | | Krapivin | | SemEval | | KP20k | |
|---------------|--------|-------|-------|-------|----------|-------|---------|-------|-------|-------|
| Model | F1@5 | F1@M | F1@5 | F1@M | F1@5 | F1@M | F1@5 | F1@M | F1@5 | F1@M |
| catSeq | 22.5 | 26.2 | 32.3 | 39.7 | 26.9 | 35.4 | 24.2 | 28.3 | 29.1 | 36.7 |
| catSeqTG | 22.9 | 27.0 | 32.5 | 39.3 | 28.2 | 36.6 | 24.6 | 29.0 | 29.2 | 36.6 |
| catSeqTG-2RF1 | 25.3 | 30.1 | 37.5 | 43.3 | 30.0 | 36.9 | 28.7 | 32.9 | 32.1 | 38.6 |
| GANMR | 25.8 | 29.9 | 34.8 | 41.7 | 28.8 | 36.9 | N/A | N/A | 30.3 | 37.8 |
| ExHiRD-h | 25.3 | 29.1 | N/A | N/A | 28.6 | 34.7 | 28.4 | 33.5 | 31.1 | 37.4 |
| Transformer (Ye et al., 2021) | 28.15 | 32.56 | 37.07 | 41.91 | 31.58 | 36.55 | 28.71 | 32.52 | 33.21 | 37.71 |
| BART* | 23.59 | 28.46 | 35.00 | 42.65 | 26.91 | 35.37 | 26.72 | 31.91 | 29.25 | 37.51 |
| KeyBART-DOC* | 24.42 | 29.57 | 31.37 | 39.24 | 24.21 | 32.60 | 24.69 | 30.50 | 28.82 | 37.59 |
| KeyBART* | 24.49 | 29.69 | 34.77 | 43.57 | 29.24 | 38.62 | 27.47 | 33.54 | 30.71 | 39.76 |
| KeyBART* (Zero-shot) | 30.72 | 36.89 | 18.86 | 21.67 | 18.35 | 20.46 | 20.25 | 25.82 | 12.57 | 15.41 |

#### Absent Keyphrase Generation
| | Inspec | | NUS | | Krapivin | | SemEval | | KP20k | |
|---------------|--------|------|------|------|----------|------|---------|------|-------|------|
| Model | F1@5 | F1@M | F1@5 | F1@M | F1@5 | F1@M | F1@5 | F1@M | F1@5 | F1@M |
| catSeq | 0.4 | 0.8 | 1.6 | 2.8 | 1.8 | 3.6 | 1.6 | 2.8 | 1.5 | 3.2 |
| catSeqTG | 0.5 | 1.1 | 1.1 | 1.8 | 1.8 | 3.4 | 1.1 | 1.8 | 1.5 | 3.2 |
| catSeqTG-2RF1 | 1.2 | 2.1 | 1.9 | 3.1 | 3.0 | 5.3 | 2.1 | 3.0 | 2.7 | 5.0 |
| GANMR | 1.3 | 1.9 | 2.6 | 3.8 | 4.2 | 5.7 | N/A | N/A | 3.2 | 4.5 |
| ExHiRD-h | 1.1 | 2.2 | N/A | N/A | 2.2 | 4.3 | 1.7 | 2.5 | 1.6 | 3.2 |
| Transformer (Ye et al., 2021) | 1.02 | 1.94 | 2.82 | 4.82 | 3.21 | 6.04 | 2.05 | 2.33 | 2.31 | 4.61 |
| BART* | 1.08 | 1.96 | 1.80 | 2.75 | 2.59 | 4.91 | 1.34 | 1.75 | 1.77 | 3.56 |
| KeyBART-DOC* | 0.99 | 2.03 | 1.39 | 2.74 | 2.40 | 4.58 | 1.07 | 1.39 | 1.69 | 3.38 |
| KeyBART* | 0.95 | 1.81 | 1.23 | 1.90 | 3.09 | 6.08 | 1.96 | 2.65 | 2.03 | 4.26 |
| KeyBART* (Zero-shot) | 1.83 | 2.92 | 1.46 | 2.19 | 1.29 | 2.09 | 1.12 | 1.45 | 0.70 | 1.14 |

### Abstractive Summarization
```
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("bloomberg/KeyBART")
model = AutoModelForSeq2SeqLM.from_pretrained("bloomberg/KeyBART")

from datasets import load_dataset

# CNN/DailyMail summarization benchmark; load_dataset requires a config name such as "3.0.0"
dataset = load_dataset("cnn_dailymail", "3.0.0")
```
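
As above, the snippet only loads the model and data. A minimal sketch of generating a summary for one CNN/DailyMail article follows, reusing the objects loaded above; the generation parameters are illustrative, and KeyBART needs summarization fine-tuning before it reaches the scores reported below.

```
# Reuses `tokenizer`, `model`, and `dataset` from above; settings are illustrative only.
# KeyBART is a pre-trained checkpoint and must be fine-tuned on CNN/DailyMail
# to reproduce the reported summarization results.
article = dataset["test"][0]["article"]

inputs = tokenizer(article, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = model.generate(**inputs, num_beams=4, min_length=56, max_length=142)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```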

Reported Results:

| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
|--------------|-------|-------|-------|
| BART (Lewis et al., 2019) | 44.16 | 21.28 | 40.90 |
| BART* | 42.93 | 20.12 | 39.72 |
| KeyBART-DOC* | 42.92 | 20.07 | 39.69 |
| KeyBART* | 43.10 | 20.26 | 39.90 |

## Zero-shot settings
```
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("bloomberg/KeyBART")
model = AutoModelForSeq2SeqLM.from_pretrained("bloomberg/KeyBART")
```

Alternatively, use the Hosted Inference API console provided at https://huggingface.co/bloomberg/KeyBART

Sample zero-shot result:

```
Input: In this work, we explore how to learn task specific language models aimed towards learning rich representation of keyphrases from text documents.
We experiment with different masking strategies for pre-training transformer language models (LMs) in discriminative as well as generative settings.
In the discriminative setting, we introduce a new pre-training objective - Keyphrase Boundary Infilling with Replacement (KBIR),
showing large gains in performance (upto 9.26 points in F1) over SOTA, when LM pre-trained using KBIR is fine-tuned for the task of keyphrase extraction.
In the generative setting, we introduce a new pre-training setup for BART - KeyBART, that reproduces the keyphrases related to the input text in the CatSeq
format, instead of the denoised original input. This also led to gains in performance (upto 4.33 points in F1@M) over SOTA for keyphrase generation.
Additionally, we also fine-tune the pre-trained language models on named entity recognition (NER), question answering (QA), relation extraction (RE),
abstractive summarization and achieve comparable performance with that of the SOTA, showing that learning rich representation of keyphrases is indeed beneficial
for many other fundamental NLP tasks.

Output: language model;keyphrase generation;new pre-training objective;pre-training setup;

```
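
The sample above can also be reproduced programmatically. Below is a minimal sketch using the transformers text2text-generation pipeline; it relies on the pipeline's default decoding settings, so the generated phrases may differ slightly from the sample output, and the shortened `abstract` string is illustrative.

```
from transformers import pipeline

# Zero-shot keyphrase generation via the text2text-generation pipeline
generator = pipeline("text2text-generation", model="bloomberg/KeyBART")

abstract = "In this work, we explore how to learn task specific language models aimed towards learning rich representation of keyphrases from text documents."
result = generator(abstract, max_length=64)

# Prints a ";"-separated keyphrase sequence, e.g. "language model;keyphrase generation;..."
print(result[0]["generated_text"])
```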

Please direct all questions to [email protected]