Why every Software Engineer should read Designing Data-Intensive Applications

· 3 min read
Evan Tay

Picking up this book was one of the best decisions I made for my fledgling software engineering career. Its insights enabled me to make well-reasoned software design decisions, and confidently communicate them, in spite of my relative professional inexperience. Given how helpful it has been, I’m here today to share more about the impression it has left on me, and convince you that it is a must-read if you are a software engineer.

Designing Data-Intensive Applications

I kickstarted my engineering career back in January 2021, as a full stack engineer at Padlet. During the onboarding process, my (amazing) mentor, Brian, imparted a great deal of guidance to me. One of his tips was that I should take a look at Kleppmann’s Designing Data-Intensive Applications. Thankfully, we had two copies of the book in the office, purchased by my (also amazing) boss, Shu Yang, who recommended that I read it too. I’m thankful I ended up taking their advice, because I was able to glean so many insights from Kleppmann, which happened to be highly applicable to the infrastructure and full stack projects I was developing.

"This book should be required reading for software engineers." - Kevin Scott, Chief Technology Officer at Microsoft

Like Brian, Shu Yang and Kevin, I now believe that all software engineers working on distributed, cloud or data-intensive systems will greatly benefit from reading the book. It provides a fundamental framework for thinking about these systems, as well as the vocabulary to communicate such thoughts. Together, these will empower you to make better design decisions and convey them effectively, even if you lack prior experience in the problem domain.

Kleppmann also compares the fundamental ideas behind the broad range of popular data systems out there today, discussing their advantages, limitations and trade-offs rather than diving deep into the intricacies of each tool. This is ideal given that the book's objective is to help us choose the right tool for the right occasion, a choice for which these characteristics matter more than the intricacies of any one tool.

If you lack the time (or will) to pore over the entire book, you should at least check out the opening chapter. In it, Kleppmann gives a comprehensive yet succinct overview of what I mentioned above, and provides a clear, detailed explanation of the three key principles in designing data-intensive system architecture: Reliability, Scalability and Maintainability. Reading this first chapter alone was beneficial to me, as it left me better able to understand and discuss architectural concerns with my team.

If you're still unsure whether to invest your time in this book, you can check out a summary I've written of the first chapter, where I’ve condensed Kleppmann’s opening discourse on Reliability, Scalability and Maintainability. I’m certain it’ll provide a glimpse into the many lessons that Designing Data-Intensive Applications has to share, and if you do read the book, definitely let me know what you think!

Special thanks to Vanessa Tay for editing this!