awesome-hpc
github.com/dstdev/awesome-hpc
A collection of Awesome HPC software and tools
What's inside
Environment Management
- Anaconda
Anaconda is a Python and R distribution for use in computational science
- Environment Modules
Environment Modules provides dynamic modification of a user's environment
- Lmod
Lmod is an environment module system based on Lua that reads Tcl modules and supports a software hierarchy
- Mamba
Mamba is a reimplementation of the conda package manager in C++
Containers
- Apptainer
Apptainer is an open source container system
- Charliecloud
Charliecloud provides user-defined software stacks (UDSS) for high-performance computing (HPC) centers
- Docker
Docker is a set of platform-as-a-service products that use OS-level virtualization to deliver software in packages called containers
- HPC Container Maker
HPC Container Maker is an open source tool to make it easier to generate container specification files.
- Sarus
An OCI-compatible container engine for HPC
- Shifter
Shifter provides Linux containers for HPC
Parallel Computing
Provisioning
- Base Command Manager
Base Command Manager allows administrators to quickly build and manage heterogeneous clusters
- BlueBanquise
BlueBanquise is an open source cluster deployment and management stack built on Python and Ansible
- Cobbler
Cobbler is a Linux installation server that allows for rapid setup of network installation environments
- Grendel
Bare metal provisioning system for HPC Linux clusters
- Rocks
A Linux distribution for developing Linux clusters
- Scyld
Scyld ClusterWare is developed based on the continuing evolution of Beowulf clusters, first developed at NASA in the 1990s
Parallel Filesystems
- BeeGFS
BeeGFS is a hardware-independent POSIX parallel file system developed with a strong focus on performance and designed for ease of use, simple installation, and management
- Ceph
Ceph is a distributed object, block, and file storage platform
- GPFS
GPFS is high-performance clustered file system software developed by IBM
- Lustre/EXAScaler
Lustre is an open-source, distributed parallel file system software platform designed for scalability, high performance, and high availability
- MooseFS
Moose File System is an open-source, POSIX-compliant distributed file system developed by Core Technology
- OrangeFS
OrangeFS is a next generation parallel file system for Linux clusters
Conferences
- CCGrid
IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing.
- ESPM2 Workshop
International Workshop on Extreme Scale Programming Models and Middleware.
- ESSA
Workshop on Extreme-Scale Storage and Analysis.
- Hot Chips
Semiconductor industry's leading conference on high-performance microprocessors and related circuits.
- Hot Interconnects
IEEE conference on software architectures and implementations for interconnection networks of all scales.
- HPC Carpentry
Teaching basic skills for high-performance computing.
Monitoring
- Cgroup Exporter (Prometheus Based)
A Prometheus exporter for cgroup-level metrics
- Cgroup Exporter (Prometheus Based)
Produces metrics from cgroups
- DCGM Exporter (Prometheus Based)
NVIDIA GPU metrics exporter for Prometheus leveraging DCGM
- GPFS Exporter (Prometheus Based)
The GPFS exporter collects metrics from the GPFS filesystem
- Infiniband Exporter (Prometheus Based)
The InfiniBand exporter collects counters from InfiniBand switches and HCAs
- Lustre Exporter (Prometheus Based)
Prometheus exporter for use with the Lustre parallel filesystem
Showing a sample of 115 resources. View the full list on GitHub →