
TIL about cross encoders and ColBERT while reading up on how to introduce joint attention between the query and candidates, to prepare for my interview.

Cross Encoders:

  • process query and document pairs together through a single transformer
  • compute relevance scores directly without creating separate embeddings
  • highly accurate but computationally expensive for large document collections
  • cannot pre-compute document representations, requiring full processing for each query
  • use case: re-ranking top-k search results from a first-pass retrieval system (see the sketch after this list)
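
a minimal re-ranking sketch using the sentence-transformers CrossEncoder class, assuming a public MS MARCO checkpoint; the query and candidates are made-up toy examples:

```python
from sentence_transformers import CrossEncoder

# one publicly available cross-encoder checkpoint trained on MS MARCO
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "what is late interaction in retrieval?"
candidates = [
    "ColBERT scores query and document tokens with MaxSim at query time.",
    "Red dates and longan are common ingredients in herbal soups.",
    "Cross encoders concatenate the query and document into a single input.",
]

# one forward pass per (query, candidate) pair: accurate but expensive at scale,
# which is why cross encoders usually re-rank a small top-k set
scores = model.predict([(query, doc) for doc in candidates])
for doc, score in sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True):
    print(f"{score:.3f}  {doc}")
```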

ColBERT:

  • uses late interaction architecture with separate encodings for queries and documents
  • creates contextualized embeddings for each token rather than a single vector
  • performs efficient token-level interactions between query and document representations
  • enables both pre-computation of document representations and fast retrieval
  • use case: semantic search over millions of documents with better accuracy than bi-encoders (see the sketch after this list)
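
a toy numpy sketch of the ColBERT-style MaxSim late interaction score; the random arrays stand in for the per-token contextualized embeddings a real ColBERT encoder would produce:

```python
import numpy as np

def maxsim_score(query_embs: np.ndarray, doc_embs: np.ndarray) -> float:
    """Late interaction: for each query token, take its best-matching document
    token (max cosine similarity), then sum those maxima over the query tokens."""
    q = query_embs / np.linalg.norm(query_embs, axis=1, keepdims=True)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    sim = q @ d.T                        # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())  # MaxSim per query token, summed

# toy stand-ins for the per-token embeddings a trained encoder would output
rng = np.random.default_rng(0)
query_embs = rng.standard_normal((4, 128))  # 4 query tokens, 128-dim
doc_embs = {f"doc{i}": rng.standard_normal((60, 128)) for i in range(3)}

# document embeddings can be pre-computed offline; only MaxSim runs at query time
ranked = sorted(doc_embs, key=lambda k: maxsim_score(query_embs, doc_embs[k]), reverse=True)
print(ranked)
```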

some herbs that made me very heaty today, which i'll avoid in the future, especially when i'm stressed:

  1. 淮山 (huái shān) - Chinese Yam
  2. 党参 (dǎng shēn) - Codonopsis Root
  3. 北祈 (běi qí) - likely a typo for 北芪, Astragalus Root
  4. 龙眼肉 (lóng yǎn ròu) - Longan Fruit
  5. 首乌 (shòu wū) - Fo-Ti Root
  6. 当归 (dāng guī) - Angelica Sinensis
  7. 熟地 (shú dì) - Prepared Rehmannia Root
  8. 黄精 (huáng jīng) - Polygonatum
  9. 川芎 (chuān xiōng) - Ligusticum Wallichii
  10. 构纪子 - likely a typo for 枸杞子 (gǒu qǐ zǐ), Goji Berries
  11. 大枣 (dà zǎo) - Red Dates/Jujube