📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

GitHub Stars: 5.3k
Curated Resources: 1
Categories: 1
Last Refreshed: 1 hour ago
Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me 📖 news 🔥🔥 resources from awesome-llm-inference"

Installation instructions →

What's inside

📖 News 🔥🔥

Showing a sample of 1 resource. View the full list on GitHub →