BPE Regex
live
-
Whole Word Tokenization Problem
live
-
Why BPE Wins
live
A mathematical breakdown of why BPE is superior to whole-word tokenization.
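The core of BPE's advantage can be seen in its merge loop: starting from characters, it repeatedly fuses the most frequent adjacent pair, so frequent words become single tokens while rare words still decompose into known subwords instead of falling out of vocabulary. A minimal sketch of that loop, using a hypothetical toy corpus (real implementations add end-of-word markers and pre-tokenization):

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across the corpus (BPE's merge criterion)."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of the chosen pair with one merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if symbols[i:i + 2] == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Hypothetical tiny corpus: word (split into characters) -> frequency.
words = {tuple("lower"): 5, tuple("lowest"): 2, tuple("low"): 7}
for _ in range(2):
    words = merge_pair(words, most_frequent_pair(words))
print(list(words))
```

After two merges the frequent stem "low" is a single token, while "lowest" is still representable as known subwords; a whole-word tokenizer would treat any unseen word as a single out-of-vocabulary symbol.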
-
Wordnet Reference
live
WordNet organizes words into synsets based on semantic similarity and shared meaning, using criteria like part of speech, contextual usage, and lexical relations. It maps complex semantic hierarchies—including hypernymy, hyponymy, meronymy, and troponymy—to represent how concepts relate as generalities, parts, or specific manners of action.
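The hypernymy relation described above can be pictured as a chain from a specific synset up through ever more general concepts. A minimal sketch of that structure, using hypothetical illustrative data rather than real WordNet content:

```python
from dataclasses import dataclass

@dataclass
class Synset:
    lemmas: list                        # synonymous word forms sharing one meaning
    pos: str                            # part of speech, e.g. "n" for noun
    hypernym: "Synset | None" = None    # link to the more general concept

    def hypernym_chain(self):
        """Walk upward through hypernyms to ever more general synsets."""
        node, chain = self.hypernym, []
        while node is not None:
            chain.append(node)
            node = node.hypernym
        return chain

# Toy hierarchy: dog -> animal -> entity (hypothetical, not real WordNet data).
entity = Synset(["entity"], "n")
animal = Synset(["animal", "creature"], "n", hypernym=entity)
dog = Synset(["dog", "domestic_dog"], "n", hypernym=animal)

print([s.lemmas[0] for s in dog.hypernym_chain()])  # ['animal', 'entity']
```

The real WordNet carries many more relation types (hyponymy, meronymy, troponymy) as separate links per synset; this sketch shows only the upward generalization path.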
-
AI vs Human Cognition: The Zoo Scenario Case Study
live
A deep technical dive into how AI world models handle physical and social boundaries compared to human causal logic. Covers embedding table updates, "Surprise" math, and handling out-of-distribution (OOD) emergency events.
-
Symbolic Logic & Axiomatic Fine-tuning
new
The shift from statistical guessing to rule-based logic gates and symbolic maps in embedding tables.
-
LLM Training Basics
new
Foundational concepts of large language model training, architecture, and optimization techniques.
-
LLM Diff Concept
new
Advanced features of large tokenizers: a Google search analysis.