Intro to LLMs

Andrej Karpathy, the GOAT who gifted us the Neural Networks from Zero To Hero videos, released a 1hr Intro to LLMs video (slides)

Below are my takeaways.

  • LLMs are neutal networks with billions of parameters dispersed through them, we know how to iteratively adjust the parameters to make them better, but we don't know how they collaborate to do it
  • training LLMs = lossy compression (100x compression ratio), there is close relationship between compression and performance
  • hallucinations: LLMs dreams, think of them as mostly inscrutable artifacts, develop correspondingly sophisticated evaluations
  • how LLMs are build
    • 1) pre-training for knowledge
      • get large amounts of text (~10 TB), and get expensive compute (~6000 GPUs)
      • compress text into neural network (pay ~$2m and wait ~12 days)
      • get a base model
    • 2) fine-tuning for alignment
      • write labelling instructions
      • hire people (scale.ai) to collect 100k high quality ideal Q&A instructions or comparisons
      • finetune base model on this data (wait ~1 day)
      • obtain assistant model
      • run lots of evaluations
      • deploy
      • monitor, collect misbehaviours, go back to step 1
    • 3) RLHF with comparisons data, train model on good/bad outputs
  • scaling laws:
    • performance of LLMs is a smooth, well-behaved, predictable function of
      • N: no. of parameters in network
      • D: amount of text
    • expect a lot of "general capability" across all areas of knowledge
  • system 2 thinking
    • LLMs currently only have system 1 fast thinking, just next word prediction
    • the goal is system 2, where they take time to think through a problem, providing more accurate answers
    • how? create tree of thought and reflect on question before answering
  • self-improvement
    • what is equivalent of AlphaGO self-play for LLMs?
    • main challenge is the lack of reward criterion (language is a large space and not well defined)
  • LLM OS
    • "LLMs is the kernel process of an emergent operating system"
    • RAM = working memory = context window
    • OS systems had closed and open source (llama2 vs gpt4)
  • llm security
    • LLMs are vulnerable to jailbreaks, prompt injection, data poisoning

Reading list

11/27/2023