
TIL about cross encoders and ColBERT while reading up on how to introduce joint attention between the query and candidates, to prepare for my interview.

Cross Encoders:

  • process query and document pairs together through a single transformer
  • compute relevance scores directly without creating separate embeddings
  • highly accurate but computationally expensive for large document collections
  • cannot pre-compute document representations, requiring full processing for each query
  • use case: re-ranking top-k search results from a first-pass retrieval system (see the sketch after this list)
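
a minimal re-ranking sketch using the sentence-transformers CrossEncoder class, assuming a public MS MARCO checkpoint; the query and candidates are made-up toy examples:

```python
from sentence_transformers import CrossEncoder

# one publicly available cross-encoder checkpoint trained on MS MARCO
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "what is late interaction in retrieval?"
candidates = [
    "ColBERT scores query and document tokens with MaxSim at query time.",
    "Red dates and longan are common ingredients in herbal soups.",
    "Cross encoders concatenate the query and document into a single input.",
]

# one forward pass per (query, candidate) pair: accurate but expensive at scale,
# which is why cross encoders usually re-rank a small top-k set
scores = model.predict([(query, doc) for doc in candidates])
for doc, score in sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True):
    print(f"{score:.3f}  {doc}")
```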

ColBERT:

  • uses late interaction architecture with separate encodings for queries and documents
  • creates contextualized embeddings for each token rather than a single vector
  • performs efficient token-level interactions between query and document representations
  • enables both pre-computation of document representations and fast retrieval
  • use case: semantic search over millions of documents with better accuracy than bi-encoders (see the sketch after this list)
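
a toy numpy sketch of the ColBERT-style MaxSim late interaction score; the random arrays stand in for the per-token contextualized embeddings a real ColBERT encoder would produce:

```python
import numpy as np

def maxsim_score(query_embs: np.ndarray, doc_embs: np.ndarray) -> float:
    """Late interaction: for each query token, take its best-matching document
    token (max cosine similarity), then sum those maxima over the query tokens."""
    q = query_embs / np.linalg.norm(query_embs, axis=1, keepdims=True)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    sim = q @ d.T                        # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())  # MaxSim per query token, summed

# toy stand-ins for the per-token embeddings a trained encoder would output
rng = np.random.default_rng(0)
query_embs = rng.standard_normal((4, 128))  # 4 query tokens, 128-dim
doc_embs = {f"doc{i}": rng.standard_normal((60, 128)) for i in range(3)}

# document embeddings can be pre-computed offline; only MaxSim runs at query time
ranked = sorted(doc_embs, key=lambda k: maxsim_score(query_embs, doc_embs[k]), reverse=True)
print(ranked)
```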

some herbs that made me very heaty today, which i'll avoid in the future, especially when i'm stressed:

  1. 淮山 (huái shān) - Chinese Yam
  2. 党参 (dǎng shēn) - Codonopsis Root
  3. 北祈 (běi qí) - likely a typo for 北芪, Astragalus Root
  4. 龙眼肉 (lóng yǎn ròu) - Longan Fruit
  5. 首乌 (shòu wū) - Fo-Ti Root
  6. 当归 (dāng guī) - Angelica Sinensis
  7. 熟地 (shú dì) - Prepared Rehmannia Root
  8. 黄精 (huáng jīng) - Polygonatum
  9. 川芎 (chuān xiōng) - Ligusticum Wallichii
  10. 构纪子 - likely a typo for 枸杞子 (gǒu qǐ zǐ), Goji Berries
  11. 大枣 (dà zǎo) - Red Dates/Jujube