A curated list of awesome resources for Danish language technology

GitHub Stars: 195
Curated Resources: 123
Categories: 5
Last Refreshed: 5 hours ago
Data | Tools | Competitions | Benchmarks | Resources about resources

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me dictionaries and ontologies resources from awesome-danish"

Installation instructions →

What's inside

Data

  • 1,290,000 lexemes [Dictionaries and ontologies]

    Official dump of the lexeme-only part of Wikidata.

  • Ælæctra [Neural text models]

    Malte Højmark-Bertelsen's Danish Gigaword-trained Electra-based model

  • AFINN [Dictionaries and ontologies]

    Danish lexicons annotated for sentiment.

  • Alvenir Wav2vec2 [Neural speech models]

    Pretrained Danish neural model.

  • A-ttack [Neural text models]

    Ælæctra-based model for detection of "textual attacks" developed by

  • Byte-Pair Encoding embedding [Embeddings]

    Gensim-based subword embedding. A large number of Danish embeddings are available. They differ in vocabulary size (from 1,000 to 200,000) and in subspace dimensions (from 25 to 300).

Tools

  • afinn [Sentiment analysis]

    Python package with AFINN Danish lexicon annotated for sentiment, also installable with

  • Amazon Polly [Speech Synthesis (text-to-speech)]

    Commercial Web-based text-to-speech synthesis for a number of languages, including Danish. Part of Amazon's commercial AWS services. Female and male voices are available as examples. Limited unregistered free service available at

  • Babelfy [Entity linking]

    Web app and service for linking words and entities.

  • bornholmsk [Fundamental processing]

    Datasets and embeddings for the Bornholmsk dialect.

  • cstlemma [Lemmatization]

    lemmatiser.

  • dacy [Fundamental processing]

    Danish spaCy pipeline.

Resources about resources

Benchmarks

  • Danoliterate

    Overview of the performance of language models on a range of individual benchmarks.

  • ScandEval

    Overview of the performance of language models on a range of individual benchmarks, for Danish as well as other Germanic languages.

Competitions

Showing a sample of 123 resources. View the full list on GitHub →