
xsum_108_5000000_2500000_test

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("KingKazma/xsum_108_5000000_2500000_test")

topic_model.get_topic_info()
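Beyond listing the topics, a loaded model can usually assign topics to unseen documents. The following is a minimal sketch, assuming the repository stores enough information for BERTopic to embed new text at load time; the helper name `assign_topics` and the `REPO_ID` constant are illustrative and not part of the model card.

```python
from typing import List, Tuple

# Repository id taken from the usage snippet in this card.
REPO_ID = "KingKazma/xsum_108_5000000_2500000_test"


def assign_topics(docs: List[str], repo_id: str = REPO_ID) -> List[Tuple[str, int]]:
    """Return (document, topic_id) pairs for unseen documents."""
    # Imported lazily so this module can be imported without bertopic installed.
    from bertopic import BERTopic

    # Downloads the model from the Hugging Face Hub on first use.
    topic_model = BERTopic.load(repo_id)
    # transform() returns the predicted topic ids and, when available, probabilities.
    topics, _probs = topic_model.transform(docs)
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]
```

Calling `assign_topics(["Hamilton takes pole position for Mercedes"])` would return the headline paired with a topic ID that can be looked up in the topic overview; a topic ID of -1 marks a document as an outlier.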

Topic overview

  • Number of topics: 14
  • Number of training documents: 11334
| Topic ID | Topic Keywords | Topic Frequency | Label |
|---|---|---|---|
| -1 | said - win - first - one - time | 13 | -1_said_win_first_one |
| 0 | said - mr - would - people - also | 1003 | 0_said_mr_would_people |
| 1 | win - game - league - goal - right | 7868 | 1_win_game_league_goal |
| 2 | race - olympic - sport - gold - team | 1707 | 2_race_olympic_sport_gold |
| 3 | england - cricket - wicket - test - captain | 225 | 3_england_cricket_wicket_test |
| 4 | race - hamilton - mercedes - f1 - lap | 192 | 4_race_hamilton_mercedes_f1 |
| 5 | match - murray - konta - seed - set | 62 | 5_match_murray_konta_seed |
| 6 | round - birdie - shot - par - bogey | 59 | 6_round_birdie_shot_par |
| 7 | fight - boxing - champion - ali - title | 49 | 7_fight_boxing_champion_ali |
| 8 | yn - ar - ei - yr - wedi | 48 | 8_yn_ar_ei_yr |
| 9 | unsupported - updated - playback - media - device | 33 | 9_unsupported_updated_playback_media |
| 10 | world - champion - osullivan - event - snooker | 29 | 10_world_champion_osullivan_event |
| 11 | fifa - blatter - football - platini - fifas | 25 | 11_fifa_blatter_football_platini |
| 12 | ebola - sierra - leone - outbreak - people | 21 | 12_ebola_sierra_leone_outbreak |
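The frequencies listed above can be checked against the reported number of training documents, and they show how unevenly the topics are populated. The short sketch below uses only the standard library; the counts are copied from the topic overview, and the names `TOPIC_FREQUENCY` and `topic_shares` are illustrative.

```python
# Topic frequencies copied from the topic overview (topic id -> document count).
TOPIC_FREQUENCY = {
    -1: 13, 0: 1003, 1: 7868, 2: 1707, 3: 225, 4: 192, 5: 62,
    6: 59, 7: 49, 8: 48, 9: 33, 10: 29, 11: 25, 12: 21,
}


def topic_shares(freq):
    """Return each topic's share of all documents as a fraction of 1."""
    total = sum(freq.values())
    return {topic: count / total for topic, count in freq.items()}


total_docs = sum(TOPIC_FREQUENCY.values())
shares = topic_shares(TOPIC_FREQUENCY)

print(total_docs)            # 11334, matching the number of training documents
print(f"{shares[1]:.1%}")    # 69.4%, the share of topic 1
print(f"{shares[-1]:.1%}")   # 0.1%, the share of outliers (topic -1)
```

Topic 1 alone covers roughly two thirds of the training documents, while only 13 documents were left unassigned as outliers.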

Training hyperparameters

  • calculate_probabilities: True
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
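The hyperparameters above correspond to keyword arguments of the `BERTopic` constructor. The sketch below shows how a similarly configured model might be instantiated. It is not expected to reproduce the exact topics of this model, because the card does not list the embedding model or the UMAP and HDBSCAN settings; the names `HYPERPARAMETERS` and `build_model` are illustrative.

```python
# Hyperparameters copied from the list in this card.
HYPERPARAMETERS = {
    "calculate_probabilities": True,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
}


def build_model():
    """Create an untrained BERTopic model with the hyperparameters above."""
    # Imported lazily so this module can be imported without bertopic installed.
    from bertopic import BERTopic

    # Sub-models (embedding, UMAP, HDBSCAN) fall back to BERTopic's defaults here,
    # which is an assumption; the card does not say which ones were used.
    return BERTopic(**HYPERPARAMETERS)
```

A model built this way would then be trained with `build_model().fit_transform(docs)` on a list of document strings.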

Framework versions

  • Numpy: 1.22.4
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.3
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.2.2
  • Transformers: 4.31.0
  • Numba: 0.57.1
  • Plotly: 5.13.1
  • Python: 3.10.12