awesome-pretrained-models-for-information-retrieval

github.com/ict-bigdatalab/awesome-pretrained-models-for-information-retrieval

A curated list of awesome papers related to pre-trained models for information retrieval (a.k.a., pretraining for IR).

GitHub Stars: 677
Curated Resources: 205
Categories: 8
Last Refreshed: 20 hours ago

Survey Papers, First Stage Retrieval, Re-ranking Stage, Jointly Learning Retrieval and Re-ranking, Model-based IR System, LLM and IR, Multimodal Retrieval, Other Resources

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me multi-stream architecture applied on input resources from awesome-pretrained-models-for-information-retrieval"

What's inside

Multimodal Retrieval

12-in-1: Multi-Task Vision and Language Representation Learning. (Multi-stream Architecture Applied on Input)
Dynamic Modality Interaction Modeling for Image-Text Retrieval. (Unified Single-stream Architecture)
ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph. (Multi-stream Architecture Applied on Input)
Learning Transferable Visual Models From Natural Language Supervision. (Multi-stream Architecture Applied on Input)
M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training. (Multi-stream Architecture Applied on Input)
M6-v0: Vision-and-Language Interaction for Multi-modal Pretraining. (Multi-stream Architecture Applied on Input)

LLM and IR

First Stage Retrieval

A Contrastive Pre-training Approach to Learn Discriminative Autoencoder for Dense Retrieval. (Dense Retrieval)
Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval. (Dense Retrieval)
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. (Dense Retrieval)
Augmenting Document Representations for Dense Retrieval with Interpolation and Perturbation. (Dense Retrieval)
BERT-based Dense Retrievers Require Interpolation with BM25 for Effective Passage Retrieval. (Hybrid Retrieval)
COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List. (Sparse Retrieval)

Survey Papers

Jointly Learning Retrieval and Re-ranking

Model-based IR System

Re-ranking Stage

Are Neural Ranking Models Robust? (Other Topics)
A Unified Pretraining Framework for Passage Ranking and Expansion. (Other Topics)
Axiomatically Regularized Pre-training for Ad hoc Search. (Other Topics)
BERT-QE: Contextualized Query Expansion for Document Re-ranking. (Other Topics)
Beyond 512 Tokens: Siamese Multi-depth Transformer-based Hierarchical Encoder for Long-Form Document Matching. (Long Document Processing Techniques)
Beyond [CLS] through Ranking by Generation. (Basic Usage)

Other Resources

BERT-related-papers (Other Resources About Pre-trained Models in NLP)
Efficient Transformers: A Survey. (Surveys About Efficient Transformers)
Faiss: a library for efficient similarity search and clustering of dense vectors (Some Retrieval Toolkits)
MatchZoo: a library consisting of many popular neural text matching models (Some Retrieval Toolkits)
Pre-trained Language Model Papers from THU-NLP (Other Resources About Pre-trained Models in NLP)
Pre-trained Models for Natural Language Processing: A Survey. (Other Resources About Pre-trained Models in NLP)

Showing a sample of the 205 resources. The full list is available on GitHub.