1 May 2026
Linkpost: Sanity-Checking "Incompressible Knowledge Probes"
Day 30 of Inkhaven: 30 Days of Posts
A short writeup with LawrenceC on a recent paper that claimed to reverse-engineer the parameter counts of frontier AI models (GPT-5.5 at 9.7T, etc.). We dug in and found that the scoring method actually used diverged from the one the paper describes, and that ~9.4% of the questions in its datasets were ambiguous or wrong. These issues inflate the paper's parameter estimates by up to 10x.
Read it on LessWrong: Sanity-checking "Incompressible Knowledge Probes"