awesome-serialization
github.com/maximveksler/awesome-serialization
Curated list of data serialization formats: API, ML, Agentic AI, Big Data, Configuration, and beyond
What's inside
Agentic
- A2A
Agent2Agent Protocol. Google's open protocol for agent-to-agent communication and interoperability. JSON based. Textual.
- BAML
Boundary AI Markup Language. Domain-specific language for defining LLM function signatures with type-safe structured output. Textual.
- Markdown
Lightweight markup widely used as the native "language" of LLM input/output. Highly token-efficient vs HTML/XML. Textual.
- MCP
Model Context Protocol. Anthropic's open standard for connecting LLM agents to tools and data sources. JSON-RPC based. Textual.
- TOON
Token-Oriented Object Notation. Compact, schema-aware JSON alternative achieving 30–60% token savings for LLM prompts. Textual.
- YAML
Indentation-based format often more token-efficient than JSON for LLM contexts due to lack of braces/quotes. Textual.
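
The MCP entry above describes the protocol as JSON-RPC based. As a rough illustration of what that means on the wire, the sketch below builds a JSON-RPC 2.0 request shaped like an MCP `tools/call` message and round-trips it through JSON text. The tool name `search_formats` and its arguments are hypothetical placeholders, and the helper `make_tool_call` is not part of any SDK; a real client would normally use an MCP library rather than hand-built dictionaries.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request shaped like an MCP `tools/call` message."""
    return {
        "jsonrpc": "2.0",        # protocol version marker required by JSON-RPC 2.0
        "id": request_id,        # lets the caller match the response to this request
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
request = make_tool_call(1, "search_formats", {"category": "Agentic"})

# Serialize to compact JSON text, then parse it back.
wire = json.dumps(request, separators=(",", ":"))
decoded = json.loads(wire)

print(wire)
```

Because the envelope is plain JSON, any JSON parser can read it, which is what the "Textual" tag on the entry refers to.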
Workflow
- Apache Airflow DAGs
Python-based Directed Acyclic Graphs for workflows.
- Common Workflow Language (CWL)
Specification for describing analysis workflows and tools so that they are portable and scalable across software and hardware environments.
- Cromwell
Scientific workflow management, compatible with WDL and CWL.
- Nextflow
Scalable and reproducible scientific workflows.
- Relational Algebra and Datalog for Graphs
Coursera course on graph data manipulation.
- WDL
Workflow Description Language for genomics and scientific workflows.
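
Several entries in this section describe workflows as directed acyclic graphs of tasks. The sketch below is a minimal illustration of that idea, not the syntax of Airflow, CWL, or WDL: a hypothetical pipeline is stored as a mapping from each task to the tasks it depends on, serialized to JSON, restored, and put into a valid execution order with the standard library's `graphlib`. The task names are invented for the example.

```python
import json
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the list of tasks it depends on.
workflow = {
    "fetch": [],
    "align": ["fetch"],
    "call_variants": ["align"],
    "report": ["call_variants", "fetch"],
}

# The graph itself is just data, so it serializes like any other document.
serialized = json.dumps(workflow, sort_keys=True)
restored = json.loads(serialized)

# TopologicalSorter expects node -> predecessors, which matches the mapping above.
order = list(TopologicalSorter(restored).static_order())

print(order)
```

Real workflow languages add inputs, outputs, and runtime requirements on top of this dependency structure, but the ordering problem they solve is the same.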
Big Data
- Arrow
Cross-language columnar data format optimized for analytics workloads. Binary.
- Avro
Schema embedded with the data; rich, dynamic data structures. Textual/Binary.
- Delta Lake
Transactional storage layer for big data workflows. Binary.
- FlatBuffers
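Google's zero-copy serialization library. Similar to Protocol Buffers, but data is read in place without a parsing step, which suits larger datasets. Binary.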
- Iceberg
Open table format for large datasets. Binary.
- Ion
Amazon's richly typed, self-describing superset of JSON. Row-oriented, with skip-scan parsing. Textual/Binary.
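
The entries above mix row-oriented and columnar designs. The sketch below illustrates the difference with the standard library only; it is a toy layout under stated assumptions, not Arrow's actual memory format (which also defines validity bitmaps, alignment, and metadata). The `(id, score)` records are invented for the example.

```python
import struct

# Hypothetical (id, score) records.
rows = [(1, 9.5), (2, 7.25), (3, 8.0)]

# Row-oriented: the fields of one record sit next to each other.
# "<id" = little-endian 4-byte int followed by 8-byte double, no padding.
row_bytes = b"".join(struct.pack("<id", rec_id, score) for rec_id, score in rows)

# Column-oriented: all values of one field sit next to each other.
n = len(rows)
id_column = struct.pack(f"<{n}i", *(rec_id for rec_id, _ in rows))
score_column = struct.pack(f"<{n}d", *(score for _, score in rows))

# An aggregate over one field only has to read that field's buffer.
scores = struct.unpack(f"<{n}d", score_column)
mean_score = sum(scores) / len(scores)

print(len(row_bytes), len(id_column), len(score_column), mean_score)
```

Both layouts hold the same 36 bytes of data; the columnar one lets an analytics query scan 24 contiguous bytes of scores without touching the ids, which is the property the Arrow entry refers to as "optimized for analytics workloads".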
Scientific
- ASDF
Advanced Scientific Data Format for astronomy and beyond. Binary/Textual.
- HDF5®
n-dimensional datasets, complex objects, with schema. Efficient I/O. Binary.
- NetCDF
Self-describing, machine-independent data format for scientific data. Binary.
- npy
NumPy arrays with a small metadata header (dtype, shape, memory order). Binary.
- Zarr
Scalable storage of n-dimensional arrays. Binary.
API
- AsyncAPI
OpenAPI equivalent for event-driven and message-driven architectures. Textual.
- BSON
Schemaless binary encoding of JSON-like documents. Binary.
- Cap'n Proto
High-performance, schema-based data interchange format. Binary.
- CBOR
Concise Binary Object Representation. Schema-free. Binary.
- CloudEvents
CNCF specification for describing event data in a common way. Textual/Binary.
- Connect
Modern RPC framework compatible with gRPC, with HTTP/1.1, JSON, and browser support. Binary/Textual.
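
To make the "schema-free, binary" description of formats such as CBOR more concrete, here is a minimal encoder sketch for a small subset of CBOR (integers up to 64 bits, text strings, arrays, maps, booleans, and null), following the initial-byte layout of RFC 8949. The function names `_head` and `encode` are made up for this example; it is not a complete or canonical implementation, and production code should use an established CBOR library.

```python
import json
import struct


def _head(major, value):
    """Encode the initial byte (major type + argument) for a non-negative value."""
    if value < 24:
        return bytes([(major << 5) | value])          # argument fits in the initial byte
    if value < 1 << 8:
        return bytes([(major << 5) | 24, value])      # 1-byte argument follows
    if value < 1 << 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 1 << 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", value)  # fails above 64 bits


def encode(obj):
    """Encode a small subset of Python values as CBOR bytes."""
    if obj is None:
        return b"\xf6"
    if isinstance(obj, bool):                         # check before int: bool is an int subclass
        return b"\xf5" if obj else b"\xf4"
    if isinstance(obj, int):
        # Major type 0 for non-negative, major type 1 encodes -1 - n for negatives.
        return _head(0, obj) if obj >= 0 else _head(1, -1 - obj)
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        return _head(3, len(data)) + data
    if isinstance(obj, list):
        return _head(4, len(obj)) + b"".join(encode(item) for item in obj)
    if isinstance(obj, dict):
        return _head(5, len(obj)) + b"".join(
            encode(key) + encode(value) for key, value in obj.items()
        )
    raise TypeError(f"unsupported type: {type(obj).__name__}")


doc = {"a": 1, "b": [2, 3]}
cbor_bytes = encode(doc)
json_bytes = json.dumps(doc, separators=(",", ":")).encode("utf-8")

print(cbor_bytes.hex(), len(cbor_bytes), len(json_bytes))
```

For this small document the CBOR encoding is 9 bytes against 17 bytes of compact JSON. The types travel inside the bytes themselves, so no external schema is needed to decode them.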
Programming
- avsc
JavaScript implementation of Apache Avro. Textual.
- bincode
High-performance binary serialization for Rust.
- BSON.js
BSON serializer. Binary.
- Dart Object Serialization
RAM to Disk serialization. Dart-specific. Binary.
- GOB
Go's built-in serialization format for arbitrary data structures. Binary.
- Java Object Serialization
RAM to Disk serialization. Binary.
Academic
- Category theory
General theory of functions. Axiomatic foundation for mathematics, as an alternative to set theory.
- Efficient Serialization in Distributed Systems
Study of efficient serialization techniques for scalability.
- Graph Compression Techniques
Research on optimizing graph serialization.
- Type theory
Studies types, which informally are attributes that objects can possess.
Machine Learning
- CoreML
Apple's on-device ML model format. Binary.
- GGUF
Quantized model format for llama.cpp/ggml. The de facto standard for local LLM inference. Binary.
- GraphDef
TensorFlow computation graphs, serialized as Protocol Buffers. Binary.
- MLIR
Multi-Level Intermediate Representation. Compiler infrastructure from the LLVM project, widely used for machine learning computations. Textual/Binary.
- MLX format
Apple's ML framework format, safetensors-based. Optimized for Apple Silicon. Binary.
- ONNX
Open Neural Network Exchange. Interoperability focused. Binary.
Showing a sample of 85 resources. View the full list on GitHub →