awesome-nlg
github.com/accelerated-text/awesome-nlg
A curated list of resources dedicated to Natural Language Generation (NLG)
What's inside
Papers and Articles
- 2016: Natural Language Generation enhances human decision-making with uncertain information
- 2017: Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
- 2019: A Closer Look at Recent Results of Verb Selection for Data-to-Text NLG
- 2019: A Personalized Data-to-Text Support Tool for Cancer Patients
- 2019: Controlling Contents in Data-to-Document Generation with Human-Designed Topic Labels
- 2019: Generated Texts Must Be Accurate!
Products
- Accelerated Text
Automatically generate multiple natural language descriptions of your data varying in wording and structure.
- RosaeNLG
An open-source library for Node.js or client-side (browser) execution, based on the Pug template engine, that generates texts in English, French, German and Italian.
- Twine
An open-source tool for telling interactive, nonlinear stories.
Neural Natural Language Generation
- aitextgen
A robust Python tool for text-based AI training and generation using GPT-2.
- graph-2-text
Graph-to-sequence model implemented in PyTorch, combining graph convolutional networks and OpenNMT-py.
- Image Caption Generator
A neural-network-based generative model for captioning images using TensorFlow.
- lightnlg
A minimalistic codebase for finetuning and interacting with NLG models using PyTorch Lightning.
- PaperRobot: Incremental Draft Generation of Scientific Ideas
PaperRobot acts as an automatic research assistant.
- PPLM
Plug and Play Language Model implementation. Allows steering the topic and attributes of GPT-2 models.
Datasets
- Alex Context NLG Dataset
A dataset for NLG in dialogue systems in the public transport information domain.
- Box-score data
This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.
- E2E
This shared task focuses on recent end-to-end (E2E), data-driven NLG methods, which jointly learn sentence planning and surface realisation from non-aligned data.
- Neural-Wikipedian
The repository contains the code along with the required corpora that were used in order to build a system that "learns" how to generate English biographies for Semantic Web triples.
- The Schema-Guided Dialogue Dataset
The Schema-Guided Dialogue (SGD) dataset consists of over 20k annotated multi-domain, task-oriented conversations between a human and a virtual assistant.
- The Wikipedia company corpus
Company descriptions collected from Wikipedia. The dataset contains semantic representations, short descriptions, and long descriptions for 51K companies in English.
Evaluation
- BLEURT: a Transfer Learning-Based Metric for Natural Language Generation
- compare-mt
A tool for holistic analysis of language generation systems.
- GEM
A benchmark environment for NLG with a focus on its Evaluation, both through human annotations and automated Metrics.
- NLG-eval
Evaluation code for various unsupervised automated metrics for Natural Language Generation.
- VizSeq
A Visual Analysis Toolkit for Text Generation Tasks.
Templating Languages
- calyx
A Ruby library for generating text with recursive template grammars.
- nalgene
Natural language generation language.
- StringTemplate
Java template engine (with ports for C#, Objective-C, JavaScript, Scala) for generating source code, web pages, emails, or any other formatted text output.
Grammar
- CCG Lab
All combinators, common grammar format, parsing to logical form, parameter estimation for probabilistic CCG.
- CCGweb
A Web platform for parsing and annotation.
- EasyCCG
CCG: All combinators, common grammar format, parsing to logical form, parameter estimation for probabilistic CCG.
- GrammaticalFramework
A programming language for multilingual grammar applications.
- OpenCCG
OpenCCG library for parsing and realization with CCG.
Dialog
- Chatito
Generate datasets for AI chatbots, NLP tasks, named entity recognition or text classification models using a simple DSL!
- NNDIAL
NNDial is an open source toolkit for building end-to-end trainable task-oriented dialogue models.
- Plato
This is the Plato Research Dialogue System, a flexible platform for developing conversational AI agents.
- RNNLG
RNNLG is an open source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains.
- TGen
Statistical NLG for spoken dialogue systems.
Showing a sample of 74 resources. View the full list on GitHub.