Papers and related resources worth reading on attention mechanisms, the Transformer, and pretrained language models (PLMs) such as BERT.
131 GitHub Stars · 349 Curated Resources · 3 Categories · Last refreshed 2 hours ago
Attention · Transformer · Pretrained Language Model
What's inside
Pretrained Language Model
- A Fair Comparison Study of XLNet and BERT with Large Models (English Blog)
- All The Ways You Can Compress BERT (English Blog)
- Andy Yang / BERT 瘦身之路:Distillation,Quantization,Pruning [The Road to Slimming BERT: Distillation, Quantization, Pruning] (Chinese Blog)
- bojone / albert_zh (Repository)
Converts brightmart's ALBERT weights to the Google format
- bojone / bert4keras (Repository)
bojone's (苏神, "Master Su") Keras implementation of BERT
- [book] (Tutorial & Survey)
Transformer
- andreamad8 / Universal-Transformer-Pytorch (Repositories)
PyTorch implementation of the Universal Transformer
- DongjunLee / transformer-tensorflow (Repositories)
TensorFlow implementation of the Transformer
- Google / Constructing Transformers For Longer Sequences with Sparse Attention Methods (English Blog)
- Google / Moving Beyond Translation with the Universal Transformer (English Blog)
- Google / Transformer-XL: Unleashing the Potential of Attention Models (English Blog)
- Harvard NLP / The Annotated Transformer (English Blog)
Attention
- Illustrated: Self-Attention (English Blog)
- JayLou / NLP中的Attention注意力机制+Transformer详解 [A Detailed Explanation of the Attention Mechanism and the Transformer in NLP] (Chinese Blog)
Showing a sample of 349 resources. View the full list on GitHub →