
More than 50 Thai Natural Language Processing libraries and resources, collected in one list. Updated daily.

GitHub Stars: 393
Curated Resources: 58
Categories: 6
Last Refreshed: 23 hours ago
Libraries/Services | Corpus and Dataset | Pre-trained Language Models | Benchmarks | Tools | Acknowledgements

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me corpus extractors resources from nlp_thai_resources"

Installation instructions →

What's inside

Acknowledgements

Tools

  • BEST2010 cooker (Corpus extractors)

    A tool for extracting segmented words from the word-segmented Thai BEST2010 corpus.
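
To illustrate what extracting words from a segmented corpus involves, here is a minimal Python sketch. It is not the BEST2010 cooker tool itself. It assumes that words in a corpus line are delimited by "|" and that named entities and abbreviations are wrapped in <NE> and <AB> tags; the function name extract_words and the sample line are hypothetical.

```python
import re

# Assumption: markup tags such as <NE>...</NE> and <AB>...</AB> wrap some words.
TAG_PATTERN = re.compile(r"</?(?:NE|AB)>")

def extract_words(line):
    """Return the words of one pipe-delimited corpus line, without markup tags."""
    text = TAG_PATTERN.sub("", line)
    # Split on the assumed "|" delimiter and drop empty or whitespace-only tokens.
    return [word for word in text.split("|") if word.strip()]

sample = "<NE>ประเทศไทย</NE>|มี|ประชากร|"  # hypothetical sample line
words = extract_words(sample)
print(words)
```

Whitespace-only tokens are discarded in this sketch; keep them if spaces should count as tokens in your use case.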

Libraries/Services

  • Chart-parser (Syntactic Parsing & Tools)

    Extracts syntactic structure from a POS-tagged sentence.

  • Chart-POS (Part of Speech Tagging)

    Thai POS Tagger

  • CutKum (Word Segmentation)

    Thai word segmentation with deep learning (RNN) in TensorFlow.

  • CutThai (Word Segmentation)

    Thai word segmentation written in CoffeeScript.

  • DeepCut (Word Segmentation)

    A Thai word tokenization library using a deep neural network (CNN).

  • Grammar Processing (Syntactic Parsing & Tools)

    Converts labelled brackets to context-free grammars (CFGs).
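
The labelled-brackets-to-CFG conversion mentioned above can be sketched in a few lines of Python. This is an illustrative sketch, not the listed tool's implementation: it assumes Penn-Treebank-style brackets such as "(S (NP ...) (VP ...))", and the names parse_brackets and productions, as well as the sample sentence, are hypothetical.

```python
from collections import Counter

def parse_brackets(text):
    """Parse a labelled-bracket string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse())      # nested constituent
            else:
                children.append(tokens[pos])  # terminal word
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return parse()

def productions(tree):
    """List the CFG rules (lhs, rhs) used in a tree, in top-down order."""
    label, children = tree
    # Subtrees contribute their label; leaf words become quoted terminals.
    rhs = tuple(c[0] if isinstance(c, tuple) else f'"{c}"' for c in children)
    rules = [(label, rhs)]
    for child in children:
        if isinstance(child, tuple):
            rules.extend(productions(child))
    return rules

tree = parse_brackets("(S (NP (N แมว)) (VP (V กิน) (NP (N ปลา))))")
rules = productions(tree)
grammar = Counter(rules)  # rule -> number of times it was used

for (lhs, rhs), count in grammar.items():
    print(f"{lhs} -> {' '.join(rhs)}  [{count}]")
```

Counting rule occurrences, rather than only collecting the distinct rules, leaves the door open to estimating rule probabilities later.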

Corpus and Dataset

Pre-trained Language Models

  • fastText

    Skip-Gram model trained on Wikipedia using fastText

  • thai2fit

    ULMFiT trained on Wikipedia. Perplexity of 46.80959 with 60,002 embeddings.

  • thbert

    Yet another pre-trained BERT model, built specifically for Thai.

Showing a sample of 58 resources. View the full list on GitHub →