Cursor

Simon Eskildsen on scaling Shopify, building turbopuffer, and the future of databases

Cursor·May 15, 2026

OVERVIEW

This episode features Simon Eskildsen, known for his work scaling Shopify's infrastructure and founding turbopuffer. The discussion covers the challenges and lessons learned from rapidly scaling a major e-commerce platform, delves into the philosophy behind robust software design, and explores the historical evolution and future trends of database technologies.

KEY TOPICS

  • Scaling Shopify's infrastructure in the 2010s
  • Specific Shopify outages and their root causes (e.g., MySQL stalls, flash sales)
  • The evolution of Shopify's infrastructure team structure and "DevOps" adoption
  • Key software engineering principles for building resilient systems
  • The story behind Logrus, Simon's open-source logging library
  • The concept of "Napkin Math" for estimating system performance and understanding fundamental properties
  • Detailed analysis of a TCP windowing issue and its impact on web performance
  • The historical evolution of databases driven by new workloads and storage architectures
  • Inspiration from SQLite, Google Cloud Storage, and other databases
  • The role of AI in coding and engineering, and how to "RL" models for good code
  • The characteristics and traits of "P99" (top 1%) engineers
  • The future of databases, including GPU databases and the lasting impact of object storage
  • Interviewing techniques for finding great engineers

MAIN TAKEAWAYS

  • Scaling a company like Shopify through the 2010s involved overcoming numerous infrastructure challenges, from unexpected MySQL stalls caused by PHP cron jobs to handling massive traffic spikes from celebrity flash sales.
  • Simplicity is a paramount engineering principle; software that ages well is often simple and easy to debug, which matters most when you are on call for critical systems where an outage affects millions of dollars per minute.
  • Benchmarks alone are insufficient for understanding system performance; a first-principles understanding of fundamental properties (e.g., memory movement, network round trips) is crucial to close the gap between theoretical limits and actual performance.
  • Databases evolve approximately every 10-15 years, driven by the emergence of new workloads (e.g., web, OLAP, AI) and the availability of fundamentally new storage architectures (e.g., object storage, NVMe SSDs) that incumbent databases cannot easily adapt to.
  • MySQL scaling during the 2010s was primarily about reducing connection counts and offloading work to caching layers like NGINX Lua, because large numbers of individual per-process connections carried high overhead.
  • Early collaboration between companies like Shopify, GitHub, and Zendesk was vital for sharing knowledge and developing open-source tools to tackle common scaling problems.
  • Designing software with graceful failure modes and high recoverability is essential; the ability to shut down every server and lose no data is a property Simon highly values.
  • "P99" engineers are characterized by their speed, ability to bend complex systems to their will, and an insatiable drive to learn and constantly seek higher levels of understanding and performance.
  • Tools like Cursor for AI-assisted coding can significantly boost productivity for smaller, routine tasks, but making global, architecturally optimal decisions still heavily relies on human expertise and deep domain knowledge, especially in critical areas like databases.
  • The future of databases will likely be shaped by new hardware platforms like GPUs and continued advancements in storage, demanding new architectures that can take advantage of massive parallelism and speculative execution.
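
As a rough illustration of the "Napkin Math" takeaway above, the sketch below estimates a first-principles lower bound for an operation from memory movement and network round trips, then compares it with a measured result to size the gap. The bandwidth and latency constants, the function names, and the measured value are all assumptions chosen for illustration; none of them come from the episode.

```python
# Napkin-math sketch. Every constant here is an assumed, order-of-magnitude
# figure for illustration, not a number quoted in the episode.
MEMORY_BANDWIDTH_BYTES_PER_S = 10e9  # assumed sequential read speed, one core
NETWORK_ROUND_TRIP_S = 0.0005        # assumed same-region network round trip


def estimate_seconds(bytes_scanned: float, round_trips: int) -> float:
    """First-principles estimate: time to move the bytes plus time spent on round trips."""
    memory_time = bytes_scanned / MEMORY_BANDWIDTH_BYTES_PER_S
    network_time = round_trips * NETWORK_ROUND_TRIP_S
    return memory_time + network_time


def gap_factor(measured_seconds: float, estimated_seconds: float) -> float:
    """How many times slower the measured system is than the estimate."""
    return measured_seconds / estimated_seconds


if __name__ == "__main__":
    estimate = estimate_seconds(bytes_scanned=1e9, round_trips=4)
    measured = 1.2  # hypothetical benchmark result, in seconds
    print(f"estimate: {estimate * 1000:.0f} ms")
    print(f"gap: {gap_factor(measured, estimate):.1f}x slower than the estimate")
```

A large gap factor does not say what is wrong; it only signals that the benchmark result should not be trusted as a conclusion until the difference between the estimate and the measurement is explained.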

NOTABLE QUOTES

"Simplicity surprise you, and complexity has to be deserved."
"The only thing that I just keep coming back to... is just to make it as simple as possible."
"The gaps between the first principle understanding of the system and how the system is actually performing... We have to close that gap before we can conclude anything."
"The thing that breaks when you scale a website is generally the database."

Summarized with DriftNote — AI-powered podcast summaries
