Twitter thinks they killed MLPs. But what are Kolmogorov-Arnold Networks?
What if all your weights were functions?
Are MLPs (Multi-Layer Perceptrons) — the foundation of modern deep learning — on the brink of obsolescence? One of this week’s top papers introduces a novel architecture that promises to outperform MLPs in both accuracy and interpretability.
In this post, we’ll dive into the research behind this new architecture. We’ll walk through its core ideas, technical details, and implications, and also look at its limitations and some directions for further research. Let’s go!
Grounding: What are Multi-Layer Perceptrons (MLPs)?
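Before we get to KANs, it helps to recall what an MLP actually computes: alternating linear maps (fixed scalar weights on the edges) and pointwise nonlinearities on the nodes. Here's a minimal numpy sketch of a two-layer forward pass — the layer sizes and the ReLU activation are illustrative choices on my part, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Fixed scalar weights on the edges -- these are what KANs replace
# with learnable 1D functions.
W1 = rng.normal(size=(4, 3))   # input dim 3 -> hidden dim 4
b1 = np.zeros(4)
W2 = rng.normal(size=(2, 4))   # hidden dim 4 -> output dim 2
b2 = np.zeros(2)

def mlp_forward(x):
    h = relu(W1 @ x + b1)      # linear map, then fixed nonlinearity on the nodes
    return W2 @ h + b2         # linear readout

y = mlp_forward(np.array([1.0, -0.5, 2.0]))
print(y.shape)  # (2,)
```

The key point for what follows: all the learning lives in the scalar entries of `W1` and `W2`, while the nonlinearity is fixed — exactly the design choice KANs invert.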
The new approach we’re talking about is called the KAN — Kolmogorov-Arnold Network. KANs got a ton of play on Twitter; two examples are shown below: