✨ Awesome - A curated list of amazing Topic Models (implementations, libraries, and resources)

103 GitHub stars · 187 curated resources · 10 categories · last refreshed 4 hours ago
Libraries & Toolkits · Models · Probabilistic Programming Languages (PPL) (a.k.a. Build your own Topic Model) · Research Implementations · Popular Implementations (but not maintained anymore) · Learning Implementations (hopefully easy to understand) · Visualizations · Dirichlet hyperparameter optimization techniques · Resources · Related awesome lists

What's inside

Models

  • AliasLDA [Latent Dirichlet Allocation (LDA)]

    C++ implementation using Metropolis-Hastings and

  • Anchored CorEx [Embedding based Topic Models]

    Hierarchical Topic Modeling with Minimal Domain Knowledge

  • BayesPA [Miscellaneous topic models]

    Python interface for streaming implementation of MedLDA, maximum entropy discrimination LDA (max-margin supervised topic model)

  • BERTopic [Embedding based Topic Models]

    BERTopic supports guided, (semi-) supervised, and dynamic topic modeling and visualization

  • BIDMach [Non-Negative Matrix Factorization (NMF or NNMF)]

    CPU and GPU-accelerated Scala implementation with L2 loss

  • BIDMach [Truncated Singular Value Decomposition (SVD) / Latent Semantic Analysis (LSA) / Latent Semantic Indexing (LSI)]

    Scala implementation of a scalable approximate SVD using subspace iteration
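
For readers unfamiliar with the technique named in the last entry, here is a minimal pure-Python sketch of approximate truncated SVD by subspace (orthogonal) iteration. It is not BIDMach's code or API: the function names, the toy document-term matrix, and the choice to form the Gram matrix explicitly are all assumptions made for illustration, and the sketch assumes the leading singular values are distinct and the matrix has rank at least k.

```python
import math
import random


def transpose(a):
    """Transpose a dense matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def orthonormalize(q):
    """Orthonormalize the columns of q with modified Gram-Schmidt."""
    basis = []
    for v in transpose(q):
        for u in basis:
            dot = sum(x * y for x, y in zip(u, v))
            v = [x - dot * y for x, y in zip(v, u)]
        norm = math.sqrt(sum(x * x for x in v))
        basis.append([x / norm for x in v])
    return transpose(basis)


def truncated_svd(a, k, iters=200, seed=0):
    """Approximate the top-k singular values and right singular vectors of a.

    Repeatedly multiplies a random k-column basis by A^T A and
    re-orthonormalizes it, so the columns drift toward the leading
    right singular vectors. Forming A^T A explicitly is a toy-scale
    shortcut (an assumption of this sketch), not a scalable choice.
    """
    rng = random.Random(seed)
    n_cols = len(a[0])
    gram = matmul(transpose(a), a)
    q = orthonormalize([[rng.gauss(0, 1) for _ in range(k)] for _ in range(n_cols)])
    for _ in range(iters):
        q = orthonormalize(matmul(gram, q))
    # The i-th singular value is the length of A v_i for the i-th column v_i.
    av_cols = transpose(matmul(a, q))
    sigmas = [math.sqrt(sum(x * x for x in col)) for col in av_cols]
    return sigmas, transpose(q)  # each row of the second value is one vector


# Hypothetical 4-document x 4-term count matrix with two obvious "topics".
TOY_MATRIX = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 3],
]

sigmas, vectors = truncated_svd(TOY_MATRIX, k=2)
print("singular values:", sigmas)
print("right singular vectors:", vectors)
```

In LSA terms, each returned vector weights the terms of one latent dimension. A production implementation would avoid the explicit Gram matrix and work with sparse data.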

Libraries & Toolkits

  • BIDMach

    CPU and GPU-accelerated machine learning library

  • BigARTM

    Fast topic modeling platform

  • gensim

    Python library for topic modelling

  • OCTIS

    Python package to integrate, optimize and evaluate topic models

  • RMallet

    R package to interface with the Java machine learning tool MALLET

  • scikit-learn

    Python library for machine learning

Research Implementations

  • ctm-c [Correlated Topic Model (CTM) a.k.a. logistic-normal topic models]

    C implementation of the correlated topic model by David Blei

  • ctr

    C++ implementation of collaborative topic models by Chong Wang

  • cvbLDA

    Python C extension implementation of collapsed variational Bayesian inference for LDA

  • diln

    C implementation of Discrete Infinite Logistic Normal (with HDP option) by John Paisley

  • dtm

    C implementation of dynamic topic models by David Blei & Sean Gerrish

  • fast

    A Fast And Scalable Topic-Modeling Toolbox (Fast-LDA, CVB0) by Arthur Asuncion and colleagues

Resources

  • David Blei

    David Blei's Homepage with introductory materials

Visualizations

  • dfr-browser

    Explore Mallet's topic models of texts in a web browser

  • dtmvisual

    Python package for visualizing DTM (trained with gensim)

  • LDAvis

    R package for interactive topic model visualization

  • Mallet-GUI

    GUI for creating and analyzing topic models produced by MALLET

  • pyLDAvis

    Python library for interactive topic model visualization

  • scalaLDAvis

    Scala port of pyLDAvis

Dirichlet hyperparameter optimization techniques

Probabilistic Programming Languages (PPL) (a.k.a. Build your own Topic Model)

  • edward

    A PPL built on TensorFlow

  • edward2

    Simple PPL with core utilities in the NumPy and TensorFlow ecosystem

  • PyMC3

    Python package for Bayesian statistical modeling and probabilistic machine learning

  • pyro

    PPL built on PyTorch

  • Stan

    Platform for statistical modeling and high-performance statistical computation

  • TFP

    Probabilistic reasoning and statistical analysis in TensorFlow

Showing a sample of 187 resources. View the full list on GitHub →