awesome-audio-speech

An awesome list of resources for audio, speech, and DSP (digital signal processing)

Curated Resources

Audacity
A cross-platform audio editor and recorder that supports many formats and provides a user-friendly interface.
DeepSpeech
An open-source speech-to-text engine developed by Mozilla Research, based on Baidu's Deep Speech research.
librosa
A Python library for audio and music analysis, with functions for computing features such as MFCCs, chroma, and beat-related features.
PulseAudio
A cross-platform sound server for Linux, Unix, and Windows systems that sits between applications and the audio hardware, mixing and routing their sound streams.
PyTorch Audio
A PyTorch library (torchaudio) for common audio operations, such as audio pre-processing, spectrogram computation, and spectrogram-based feature extraction.
SoX
A cross-platform audio processing tool that provides a command-line interface for converting, editing, and playing audio files.

Fully Supervised Speaker Diarization
A paper that proposes a fully supervised approach to speaker diarization, the unbounded interleaved-state recurrent neural network (UIS-RNN).
NVIDIA's Speaker Diarization
NVIDIA's advanced approach to speaker diarization.
Speaker Diarization with LSTM
A paper on speaker diarization that combines LSTM-based speaker embeddings (d-vectors) with spectral clustering.

Showing a sample of 59 resources. View the full list on GitHub →