TL;DR: In a hurry? Jump to the interactive guide: data-stores.sharmaprateek.com
Raise your hand if you’ve ever nodded along to someone explaining “Lakehouse architecture” while secretly thinking, “Isn’t that just a database with extra steps?”
Yeah, me too.
Data engineering is weird. One minute you’re happily writing SQL in a relational database, and the next you’re being told you need a “Modern Data Stack” involving five different SaaS tools, an S3 bucket, a compute engine that sounds like a Transformer (hello, Spark), and something called “Vector Search” because, apparently, our databases need to hallucinate now.
It’s overwhelming. The buzzwords come at you fast: CAP Theorem, PACELC, Columnar Storage, Separation of Compute and Storage… it’s enough to make you want to go back to CSV files and a calculator.
So, I decided to build a map.
I wanted to create something that didn’t just tell you how these things work, but actually showed you. Not with dry textbook definitions, but with things that move, glow, and click.
Introducing Data Stores Evolution (name pending, but it sounds official, right?).
What is this thing?
It’s an interactive journey through the history and future of data storage. Think of it as a museum tour, but instead of dusty bones, we have:
- The Big Data Explosion: Watch what happens when you try to shove petabytes of data into a single server. (Spoiler: it doesn’t end well.)
- The Spark Era: See how we learned to process data in parallel without losing our minds.
- The Lakehouse: Finally understand why we’re dumping everything into a “lake” and building a “house” on top of it.
- Vector & AI: The new frontier where we turn words into math so robots can understand us.
Why did I build it?
Because I learn by seeing. I wanted to build the resource I wish I had when I was first trying to figure out why NoSQL was a big deal or why everyone suddenly cares about “Iceberg” (and not the lettuce).
What can you do?
- Play with the CAP Theorem: Adjust the sliders and watch your distributed system cry as you try to have Consistency, Availability, and Partition Tolerance all at once.
- Visualize Vector Search: See how “King - Man + Woman ≈ Queen” looks in multi-dimensional space (with cool glowing dots!).
- Explore the Modern Data Stack: Click through the layers of a modern architecture without having to pay for a single license.
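If you want a taste of the “King - Man + Woman” trick before you click through, here is a minimal sketch in Python. Fair warning: the three-dimensional vectors are made up by hand for illustration (real embedding models learn hundreds of dimensions from data), and `EMBEDDINGS`, `cosine`, and `analogy` are hypothetical names for this example, not anything taken from the interactive guide.

```python
import math

# Hand-made toy vectors (an assumption for illustration only).
# Rough meaning of each dimension: [royalty, masculine vs. feminine, fruitiness]
EMBEDDINGS = {
    "king":  [0.9,  0.8, 0.1],
    "queen": [0.9, -0.8, 0.1],
    "man":   [0.1,  0.8, 0.0],
    "woman": [0.1, -0.8, 0.0],
    "apple": [0.0,  0.0, 0.9],
}


def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, embeddings=EMBEDDINGS):
    """Return the word whose vector is closest to a - b + c.

    The three input words are skipped so the answer is a new word.
    """
    target = [
        x - y + z
        for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    candidates = {w: v for w, v in embeddings.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))


print(analogy("king", "man", "woman"))  # prints "queen"
```

With real embeddings the result only lands near “queen” rather than exactly on it, which is why the nearest-neighbor lookup does the final step instead of an equality check.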
Come take a look!
If you’re a student, a developer, or just someone who likes clicking on things that animate, give it a spin. It’s my attempt to turn the chaos of data engineering into something you can actually wrap your head around.
And hey, if you still prefer CSV files after this… well, at least they’re consistent.