Unpacking the multilayer perceptrons in a transformer, and how they may store facts
Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support
An equally valuable form of support is to share the videos.
AI Alignment Forum post from the DeepMind researchers referenced at the video's start:
https://www.alignmentforum.org/posts/iGuwZTHWb6DFY3sKB/fact-finding-attempting-to-reverse-engineer-factual-recall
Anthropic posts about superposition referenced near the end:
https://transformer-circuits.pub/2022/toy_model/index.html…