Published April 26, 2021 | Version v0.7.0 | Software | Open
MaartenGr/BERTopic: Major Release v0.7
Description
The two main features are (semi-)supervised topic modeling and several backends to use instead of Flair and SentenceTransformers!
Highlights:
- (Semi-)supervised topic modeling by leveraging supervised options in UMAP:
  model.fit(docs, y=target_classes)
- Backends:
  - Added spaCy, Gensim, USE (TFHub)
  - Use a different backend for document embeddings and word embeddings
  - Create your own backends with bertopic.backend.BaseEmbedder
  - See the BERTopic documentation for an overview of all new backends
- Calculate and visualize topics per class
  - Calculate:
    topics_per_class = topic_model.topics_per_class(docs, topics, classes)
  - Visualize:
    topic_model.visualize_topics_per_class(topics_per_class)
- Several tutorials were updated and added
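
For the (semi-)supervised option above, the labels passed as `y` usually need to be integers. The sketch below is one way to prepare them, assuming UMAP's convention that `-1` marks an unlabeled document. The helper names `encode_targets` and `fit_semi_supervised` are hypothetical and not part of BERTopic; only the final `model.fit(docs, y=...)` call comes from the notes above.

```python
from typing import Dict, List, Optional, Sequence, Tuple

UNLABELED = -1  # assumption: UMAP treats -1 as "no label" in semi-supervised mode


def encode_targets(labels: Sequence[Optional[str]]) -> Tuple[List[int], Dict[str, int]]:
    """Map string labels to integer ids; None becomes UNLABELED (-1)."""
    mapping: Dict[str, int] = {}
    encoded: List[int] = []
    for label in labels:
        if label is None:
            encoded.append(UNLABELED)
            continue
        if label not in mapping:
            # Assign ids in order of first appearance.
            mapping[label] = len(mapping)
        encoded.append(mapping[label])
    return encoded, mapping


def fit_semi_supervised(docs: Sequence[str], labels: Sequence[Optional[str]]):
    """Fit BERTopic with partial labels (requires `pip install bertopic`)."""
    # Imported lazily so encode_targets can be used without BERTopic installed.
    from bertopic import BERTopic

    if len(docs) != len(labels):
        raise ValueError("docs and labels must have the same length")
    target_classes, _ = encode_targets(labels)
    model = BERTopic()
    model.fit(list(docs), y=target_classes)  # same call as in the release notes
    return model
```

With every document labeled this amounts to supervised topic modeling; leaving some labels as `None` gives the semi-supervised case.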
Fixes:
- Fixed issues with the Torch requirement
- Prevent saving term frequency matrix in CTFIDF class
- Fixed DTM not working when reducing topics (#96)
- Moved visualization dependencies to base BERTopic:
  pip install bertopic[visualization] becomes pip install bertopic
- Allow precomputed embeddings in bertopic.find_topics() (#79):
  model = BERTopic(embedding_model=my_embedding_model)
  model.fit(docs, my_precomputed_embeddings)
  model.find_topics(search_term)
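
One plausible reading of why `embedding_model` is still passed alongside precomputed embeddings is that the search term itself has to be embedded in the same vector space before it can be compared with topics. The sketch below illustrates that kind of comparison with plain cosine similarity. The function names and toy vectors are hypothetical, and this is an illustration of the idea rather than BERTopic's internal code.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_topics(
    query_vec: Sequence[float],
    topic_vecs: Dict[int, Sequence[float]],
    top_n: int = 5,
) -> List[Tuple[int, float]]:
    """Return up to top_n (topic_id, similarity) pairs, most similar first."""
    scored = [
        (topic, cosine_similarity(query_vec, vec))
        for topic, vec in topic_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy data: three topic vectors and one embedded search term.
toy_topics = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
toy_query = [1.0, 0.1]
best = rank_topics(toy_query, toy_topics, top_n=2)
```

Here the query vector and the topic vectors must come from the same embedding model for the similarities to be meaningful.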
Files (6.1 MB)
Name | Size |
---|---|
MaartenGr/BERTopic-v0.7.0.zip (md5:b245d5a13e401c8100e5d0fed5e2cc20) | 6.1 MB |
Additional details
Related works
- Is supplement to: https://github.com/MaartenGr/BERTopic/tree/v0.7.0 (URL)