Building an LLM
The Internet is the Dataset
Intro to pre-training; downloading/processing internet data (e.g., FineWeb); biases in medical sources like PubMed vs. forums.
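As a rough illustration of what "downloading the internet" involves, here is a minimal sketch that streams a slice of the FineWeb dataset and applies a toy quality filter; the `datasets` call is standard, but the thresholds below are illustrative assumptions, not FineWeb's actual filtering pipeline.

```python
# Minimal sketch: stream a slice of a web-scale text dataset and filter it.
# Assumes the `datasets` library and the public "HuggingFaceFW/fineweb" dataset;
# the quality heuristics below are toy placeholders, not the real FineWeb filters.
from datasets import load_dataset

stream = load_dataset("HuggingFaceFW/fineweb", split="train", streaming=True)

def looks_usable(doc):
    text = doc["text"]
    # Toy heuristics: long enough and mostly alphabetic characters.
    return len(text) > 500 and sum(c.isalpha() for c in text) / len(text) > 0.7

kept = []
for doc in stream:
    if looks_usable(doc):
        kept.append(doc["text"])
    if len(kept) >= 100:  # stop after a small sample for the demo
        break

print(f"kept {len(kept)} documents; first one starts with:\n{kept[0][:200]}")
```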
From Text to Tokens
Tokens: The Lego Blocks of LLMs
Tokenization process; byte-pair encoding; med example: Tokenizing patient symptoms or rare disease terms.
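A minimal sketch of byte-pair encoding in action, assuming the `tiktoken` library; the encoding name and the disease term are just illustrative choices:

```python
# Minimal sketch: how a byte-pair-encoding tokenizer splits clinical text.
# Assumes the `tiktoken` library; "cl100k_base" is one publicly available encoding.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for phrase in ["chest pain radiating to the left arm",
               "Ehlers-Danlos syndrome"]:          # a rare-disease term
    ids = enc.encode(phrase)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{phrase!r} -> {len(ids)} tokens: {pieces}")
# Common words map to single tokens; rare medical terms fragment into several.
```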
How AI Learns to Write
Neural Network I/O and Internals
Input/output of neural nets; Transformer architecture; how LLMs "compress" medical knowledge like diagnostic patterns.
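To ground the architecture discussion, a minimal sketch of a single pre-norm Transformer block in PyTorch; the widths and head counts are toy values, not any production model's configuration:

```python
# Minimal sketch of one pre-norm Transformer block, the unit stacked dozens of
# times inside an LLM. Dimensions are toy values chosen for illustration.
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, x):
        T = x.size(1)
        # Causal mask: each position may only attend to itself and earlier tokens.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out                 # residual connection around attention
        x = x + self.mlp(self.ln2(x))    # residual connection around the MLP
        return x

# Input and output keep the same shape: (batch, sequence length, model width).
x = torch.randn(2, 10, 64)
print(TransformerBlock()(x).shape)  # torch.Size([2, 10, 64])
```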
Unlocking AI: From Training to Talking
Inference and Base Model Examples
Generating text (inference); base models like GPT-2/Llama; med tie: Predicting next symptoms in a case history.
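A minimal sketch of inference with a small open base model, assuming the Hugging Face `transformers` library; the clinical prompt and sampling settings are illustrative:

```python
# Minimal sketch: next-token inference with a small open base model.
# Assumes the `transformers` library; "gpt2" is the original 124M-parameter model.
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Patient presents with fever, productive cough, and"
inputs = tok(prompt, return_tensors="pt")

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=30, do_sample=True,
                         temperature=0.8, top_k=50,
                         pad_token_id=tok.eos_token_id)
print(tok.decode(out[0]))
# A base model only continues text; it is not yet an assistant, so the output
# reads like an internet document rather than an answer to a question.
```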
LLM Training: From Simulator to Assistant
Pretraining to Post-Training Transition
Shift from pre-training to fine-tuning; overview of stages; med: Building general knowledge before clinical specialization.
Training ChatGPT's Behavior
Post-Training Data: Conversations and SFT
Supervised fine-tuning on conversations; human labelers; med: Curating ideal responses for patient Q&A or ethics.
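A minimal sketch of how a labeled conversation becomes SFT training data; the `<|user|>` and `<|assistant|>` markers are hypothetical placeholders, not any model's real chat template:

```python
# Minimal sketch of supervised fine-tuning data: a human-written ideal answer is
# rendered into one training string, and loss is only computed on the assistant's
# span. The special markers below are illustrative, not a real chat template.
conversation = [
    {"role": "user",      "content": "What are common side effects of metformin?"},
    {"role": "assistant", "content": "Gastrointestinal upset is most common; ..."},
]

def render(conv):
    text, assistant_spans = "", []
    for turn in conv:
        header = f"<|{turn['role']}|>"
        start = len(text) + len(header)
        text += header + turn["content"] + "<|end|>"
        if turn["role"] == "assistant":
            assistant_spans.append((start, start + len(turn["content"])))
    return text, assistant_spans

text, spans = render(conversation)
# During training, only the assistant spans (after tokenization, their tokens)
# contribute to the cross-entropy loss; the user's words serve as context only.
print(text)
print("train on:", [text[a:b] for a, b in spans])
```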
LLM Hallucinations
Hallucinations: The #1 Risk in Medicine
Causes/mitigations for hallucinations; tool use; med: Risks like invented drug interactions or false diagnoses.
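A minimal sketch of the tool-use mitigation, where a claim is grounded in a lookup rather than the model's memory; the interaction table and `check_interaction` helper are hypothetical stand-ins for a real drug-interaction database:

```python
# Minimal sketch of the tool-use mitigation for hallucinations: instead of
# stating a drug interaction from memory (where a model may confabulate),
# the claim is grounded in a lookup. The table is a tiny hypothetical stand-in.
KNOWN_INTERACTIONS = {
    ("aspirin", "warfarin"): "increased bleeding risk",
}

def check_interaction(drug_a: str, drug_b: str) -> str:
    pair = tuple(sorted((drug_a.lower(), drug_b.lower())))
    return KNOWN_INTERACTIONS.get(pair, "no interaction found in source")

def answer_with_tool(drug_a: str, drug_b: str) -> str:
    evidence = check_interaction(drug_a, drug_b)
    return f"{drug_a} + {drug_b}: {evidence}"

print(answer_with_tool("warfarin", "aspirin"))
print(answer_with_tool("aspirin", "metformin"))
```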
Engineering the AI Self
Self-Knowledge Tricks & Limitations
Model's lack of self-awareness (e.g., knowledge cutoffs); med: Prompting for transparency in evidence-based advice.
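Because a model has no innate self-knowledge, identity and limitations are typically supplied in the system prompt; the wording below is an illustrative example, not any vendor's actual system message:

```python
# Minimal sketch: self-knowledge (cutoff date, limitations) is engineered via
# the system prompt rather than learned. The wording is illustrative only.
system_prompt = (
    "You are an assistant with a training-data cutoff; you do not know events "
    "after that date. When giving medical information, say whether a claim "
    "comes from your training data or from a provided source, and answer "
    "'I don't know' rather than guessing."
)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What is the latest guideline on statin dosing?"},
]
# These messages would be rendered with the chat template and sent to the model.
print(messages)
```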
Tokens to Think, Tools to Work
Models Need Tokens to Think
Chain-of-thought reasoning; tokens as "thinking space"; med: Step-by-step differentials to avoid errors.
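A minimal sketch of the "tokens to think" idea: the same case asked two ways, where only the second prompt gives the model room to spread its reasoning across many tokens; the prompts are illustrative:

```python
# Minimal sketch of "tokens to think": each generated token carries a bounded
# amount of computation, so forcing an immediate answer squeezes the reasoning
# into too few tokens, while asking for steps spreads it out. Prompts are toy examples.
case = "55-year-old with acute chest pain, diaphoresis, and ST elevation in II, III, aVF."

direct_prompt = f"{case}\nAnswer with the single most likely diagnosis, one word only:"

stepwise_prompt = (
    f"{case}\n"
    "List the key findings, then the differential diagnoses they support, "
    "and only after that state the single most likely diagnosis."
)
# The second prompt lets the model 'show its work' across many tokens before
# committing to an answer, which tends to reduce reasoning errors.
print(direct_prompt)
print(stepwise_prompt)
```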
The Jagged Edges of LLM Performance
Tokenization Revisited: Spelling and Jagged Intelligence
Tokenization flaws (e.g., spelling/math issues); jagged capabilities; med: Miscounting lab values or syndrome names.
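A minimal sketch of why spelling and counting are fragile, assuming the `tiktoken` library; the model sees token chunks, not characters:

```python
# Minimal sketch of why spelling tasks are hard for LLMs: the model sees token
# ids, not characters. Assumes `tiktoken`; the encoding name is illustrative.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
word = "Guillain-Barre"
ids = enc.encode(word)

print("characters the task is about:", list(word))
print("tokens the model actually sees:", [enc.decode([i]) for i in ids])
# Counting letters requires reasoning across chunked tokens, which is why
# character-level questions (and digit-level arithmetic) are error-prone.
```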
Architecting AI Assistants
From Supervised Fine-Tuning to Reinforcement Learning
Transition to RL; analogies to med training (e.g., practice problems); med: Aligning for accurate consultations.
LLM Training: Self-Discovery
Reinforcement Learning Process and Examples
RL basics; trial-and-error; med: Optimizing for better diagnostic simulations or treatment plans.
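A minimal sketch of the trial-and-error loop in a simple rejection-sampling flavour: sample many candidate solutions, score them with an automatic checker, and keep the winners as new training targets; `sample_solution` is a stand-in for querying a real model:

```python
# Minimal sketch of the reinforcement-learning idea used in LLM post-training:
# sample many candidate solutions, score each with an automatic checker, and
# keep the successful ones as new training data (a simple rejection-sampling
# flavour of RL). `sample_solution` stands in for actually sampling a model.
import random

def sample_solution(question):
    # Stand-in for model sampling: returns a worked answer ending in a number.
    guess = random.choice([110, 112, 118, 120])
    return f"step-by-step work ... therefore the answer is {guess}", guess

def reward(predicted, correct):
    return 1.0 if predicted == correct else 0.0

question, correct_answer = "12 * 10 - 2 = ?", 118
candidates = [sample_solution(question) for _ in range(16)]
winners = [text for text, ans in candidates if reward(ans, correct_answer) > 0]

# The winning completions would be fed back as training targets, nudging the
# model toward whatever reasoning style reliably reaches correct answers.
print(f"{len(winners)} of {len(candidates)} rollouts earned reward 1.0")
```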
Emergent Reasoning
DeepSeek-R1 and Advanced RL Models
Examples like DeepSeek-R1, AlphaGo; emergent strategies; med: Potential for novel research insights.
RLHF: Aligning Models to Be Good Doctors
Reinforcement Learning from Human Feedback and Gaming Issues
RL from human feedback; gaming issues; med: Ensuring helpful, harmless responses in high-stakes care.
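A minimal sketch of the pairwise objective at the heart of RLHF reward-model training, pushing the score of the human-preferred answer above the rejected one; the linear `score` function is a toy stand-in for a real reward model:

```python
# Minimal sketch of the pairwise loss used to train an RLHF reward model:
# given human rankings, push the score of the preferred answer above the
# rejected one. `score` is a toy linear scorer standing in for a real model.
import torch
import torch.nn.functional as F

def score(embedding: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    return embedding @ w  # toy reward: a linear score over a fixed answer embedding

w = torch.zeros(8, requires_grad=True)
chosen = torch.randn(8)    # embedding of the answer labelers preferred
rejected = torch.randn(8)  # embedding of the answer they ranked lower

opt = torch.optim.SGD([w], lr=0.1)
for _ in range(100):
    loss = -F.logsigmoid(score(chosen, w) - score(rejected, w))
    opt.zero_grad(); loss.backward(); opt.step()

print("chosen:", score(chosen, w).item(), "rejected:", score(rejected, w).item())
# A policy optimized against this learned reward can also learn to "game" it,
# which is why RLHF runs are monitored and stopped early to limit reward hacking.
```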
LLM Frontiers
Decoding LLMs: Modalities, Agents, and the Ecosystem
Multimodality, agents, test-time training; med: Roadmap for diagnostics, supervision, ethical integration.