Posts

The demise of coding is greatly exaggerated

NVIDIA CEO Jensen Huang recently made very controversial remarks: "Over the course of the last 10 years, 15 years, almost everybody who sits on a stage like this would tell you that it is vital that your children learn computer science, and everybody should learn how to program. And in fact, it’s almost exactly the opposite. It is our job to create computing technology such that nobody has to program and that the programming language is human. Everybody in the world is now a programmer. This is the miracle of artificial intelligence." I am not going to wisecrack and say that this is power poisoning, and that this is what happens when your company valuation more than triples in a year and surpasses Amazon and Google. (Although I don't discount this effect completely.) Jensen is very smart and also has some great wisdom, so I think we should give this the benefit of the doubt and try to respond in a thoughtful manner. A response is warranted because this statement got a lot of p

Checking Causal Consistency of MongoDB

This paper declares the Jepsen testing of MongoDB for causal consistency a bit lacking in rigor, and goes on to test MongoDB against three variants of causal consistency (CC, CCv, and CM) under node failures, data movement, and network partitions. The authors also add TLA+ specifications of causal consistency and of their checking algorithms, verify them using the TLC model checker, and discuss how the TLA+ specifications can be related to Jepsen testing. This is a journal paper, so it is long. The writing could have been better, but it is apparent that a lot of effort went into the paper. One thing I didn't like in the paper was the wall-of-formalism around defining causal consistency in Section 2: Preliminaries. This was hard to follow. And I was upset about the use of operational definitions such as "bad patterns" for testing. Why couldn't they define this in a state-centric manner, as in the client-centric database isolation paper? It turns out that Section 2 did not intr
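As a rough illustration of what history-based consistency checking looks like (this is not the paper's CC/CCv/CM checking algorithm), here is a minimal Python sketch that checks two session guarantees implied by causal consistency, read-your-writes and monotonic reads, over a toy history of single-key operations. The history format and the function name are hypothetical.

```python
# A minimal sketch (not the paper's CC/CCv/CM checker): check two session
# guarantees implied by causal consistency on a single key whose writes carry
# globally increasing version numbers. The history format is hypothetical:
# a list of (session_id, op, version) tuples in per-session program order,
# where op is "w" (write) or "r" (read).

def check_session_guarantees(history):
    last_written = {}   # session -> highest version this session has written
    last_read = {}      # session -> highest version this session has read
    violations = []

    for idx, (session, op, version) in enumerate(history):
        if op == "w":
            last_written[session] = max(last_written.get(session, 0), version)
        else:  # op == "r"
            # Monotonic reads: versions observed by a session never go backwards.
            if version < last_read.get(session, 0):
                violations.append((idx, "monotonic-reads", session, version))
            # Read-your-writes: a session sees at least its own latest write.
            if version < last_written.get(session, 0):
                violations.append((idx, "read-your-writes", session, version))
            last_read[session] = max(last_read.get(session, 0), version)

    return violations

# Example: session "a" writes version 2, then reads back version 1.
history = [("a", "w", 1), ("a", "w", 2), ("b", "r", 2), ("a", "r", 1)]
print(check_session_guarantees(history))   # [(3, 'read-your-writes', 'a', 1)]
```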

Transaction Processing Monitors (Chapter 5. Transaction processing book)

"Transaction Processing Monitors" is chapter 5 as part of our transaction processing book reading journey.  This chapter had been the hardest to chapter to read and understand. It went into implementation concerns in contrast to the previous chapters which discussed design principles and concepts. It turns out 1980s were a very different time, and it is very hard to relate to the systems and infrastructure at that time. The people in our reading group also were lost, and found this chapter very hard to engage with.  1980s Ok, we are back at the year 1990 when this book is being written. This is when the internet and the web were still obscure.  Client-server processing is the modern thing then, and had just started to replace for time-sharing for databases and transaction processing. This chapter talks about transaction oriented processing. If I squint at it, I can see the concepts of SLAs and shared responsibility models between a cloud customer and provider in these descrip

Why I blog

My blog has been going for 14 years now, and has just passed 4 million pageviews. Yay! I remember the 1 million pageviews moment in 2017! The main reason I was able to persist for so long is that I blog for selfish reasons. Let me try to unpack why I blog, and why I keep blogging. I write for myself. The audience I have in mind is myself. I blog to clarify my understanding and thinking about a topic. Reading a research/technical paper is already time-consuming. I can't do it in less than 4 hours. Period. I love learning. And I am fortunate that I get to read research papers as part of my work. I double-dip on this effort to blog about them, to improve my understanding of these papers. Writing a blog post is the final step in my pipeline for reading a paper. I think my blog reviews of papers hit a good niche. Research papers are written for the wrong audience (or rather, maybe the right audience but for the wrong reason): they are written to please 3 specific expert reviewers

Transaction models (Chapter 4. Transaction processing book)

Atomicity does not mean that something is executed as one instruction at the hardware level, with some magic in the circuitry preventing it from being interrupted. Atomicity merely conveys the impression that this is the case, for it has only two outcomes: the specified result or nothing at all, which means in particular that it is free of side effects. Ensuring atomicity becomes trickier when faults and failures are involved. Consider the disk write operation, which comes in four quality levels. Single disk write: when something goes wrong, the outcome of the action is neither all nor nothing, but something in between. Read-after-write: this implementation of the disk write first issues a single disk write, then rereads the block from disk and compares the result with the original block. If the two are not identical, the sequence of writing and rereading is repeated until the block is successfully written. This has problems due to no abort path, no termination guarantee, and no partial ex
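To make the read-after-write level concrete, here is a toy Python sketch of the write-then-verify loop, with a hypothetical in-memory block device standing in for real disk I/O. The endless retry loop is the point: it shows the missing abort path and termination guarantee.

```python
# Toy sketch of the "read-after-write" quality level (not code from the book):
# write a block, reread it, and retry until the reread matches the original.

class InMemoryDevice:
    """Trivial stand-in for a block device (no injected faults)."""
    def __init__(self):
        self.blocks = {}

    def write_block(self, block_no, data):
        self.blocks[block_no] = bytes(data)

    def read_block(self, block_no):
        return self.blocks.get(block_no)

def read_after_write(device, block_no, data):
    # As noted above, this has no abort path and no termination guarantee:
    # if the device keeps corrupting the block, the loop never exits.
    while True:
        device.write_block(block_no, data)
        if device.read_block(block_no) == data:
            return  # verified: the block is on disk as intended

dev = InMemoryDevice()
read_after_write(dev, 7, b"payload")
print(dev.read_block(7))  # b'payload'
```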

Recent reads (Feb 2024)

Here are the three books I have read recently. Argh, I wish I had taken some notes while going through these books. It feels hard to write these reviews in retrospect. The Culture Code: The Secrets of Highly Successful Groups (2018) "What do Pixar, Google and the San Antonio Spurs basketball team have in common?" That was the pitch for the Culture Code book when it came out in 2018. That didn't age well, for Google's case at least. Well, the Google example was not about teamwork, but rather Jeff Dean fixing search for AdWords over a weekend, so this is neither here nor there. We can forgive the book for trying to choose sensational examples. I did like the book overall. It identifies three things to get the culture right: 1) creating belonging, 2) sharing vulnerability, and 3) establishing purpose. Creating belonging is about safety/security. Maslow's hierarchy emphasizes safety and security as fundamental human needs. In a work environment where we feel judged or constantly ne

TLA+ modeling of MongoDB logless reconfiguration

Here we do a walkthrough of the TLA+ specs for the MongoDB logless reconfiguration protocol we reviewed recently. The specs are available at the https://github.com/will62794/logless-reconfig repo provided by Will Schultz, Siyuan Zhou, and Ian Dardik. This is the protocol model for managing logless reconfiguration. Let's call this the "config state machine" (CSM). This is the protocol model for the static MongoDB replication protocol based on Raft. Let's call this the "oplog state machine" (OSM). Finally, this model composes the above two protocols so they work in a superimposed manner. I really admire how these specs provided a modular composition of the reconfiguration protocol and the Raft-based replication protocol. I figured I would explain how this works here, since walkthroughs of advanced/intermediate TLA+ specifications, especially for composed systems, are rare. I will cover the structure of the two protocols (CSM and OSM) briefly, before diving
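As a rough sketch of the composition idea, in Python rather than TLA+ and with hypothetical action names rather than the repo's actual CSM/OSM actions: the composed model exposes a next-state step that nondeterministically fires an action from either sub-protocol, with both operating on shared state, mirroring a disjunction of the two protocols' next-state relations.

```python
import random

# Toy sketch of the superimposition idea (hypothetical actions, not the actual
# CSM/OSM actions in the repo): both sub-protocols operate on shared state, and
# the composed next-state step fires one action drawn from either protocol.

state = {"config_version": 1, "config_term": 1, "log_len": 0}

def csm_reconfig(s):
    # Config state machine action: install the next configuration version.
    s["config_version"] += 1

def osm_append(s):
    # Oplog state machine action: append an entry to the oplog.
    s["log_len"] += 1

def composed_step(s):
    # Nondeterministically pick a CSM or an OSM action to execute.
    random.choice([csm_reconfig, osm_append])(s)

for _ in range(5):
    composed_step(state)
print(state)
```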

Adapting TPC-C Benchmark to Measure Performance of Multi-Document Transactions in MongoDB

This paper appeared in VLDB 2019. Benchmarks are a necessary evil for database evaluation. Benchmarks often focus on narrow aspects and specific workloads, creating a misleading picture for broader real-world applications/workloads. However, for a quick comparative performance snapshot, they still remain a crucial tool. Popular benchmarks like YCSB, designed for simple key-value operations, fall short in capturing MongoDB's features, including secondary indexes, flexible queries, complex aggregations, and even multi-statement multi-document ACID transactions (since version 4.0). Standard RDBMS benchmarks haven’t been a good fit for MongoDB either, since they require a normalized relational schema and SQL operations. Consider TPC-C, which simulates a commerce system with five types of transactions involving customers, orders, warehouses, districts, stock, and items, represented with data in nine normalized tables. TPC-C requires a specific relational schema and prescribed SQL statements
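To make the schema mismatch concrete, here is a hypothetical sketch (not the paper's actual adaptation) of how a TPC-C order and its order lines, stored as separate rows in the normalized relational schema, could be folded into a single denormalized MongoDB document, so a New-Order transaction touches one document instead of several rows across tables.

```python
from datetime import datetime, timezone

# Hypothetical sketch (not the paper's actual schema adaptation): a TPC-C order
# and its order-line rows, normally separate normalized tables, folded into one
# denormalized MongoDB document.
order_doc = {
    "_id": {"w_id": 1, "d_id": 3, "o_id": 3001},   # warehouse / district / order key
    "c_id": 42,                                     # customer placing the order
    "entry_d": datetime.now(timezone.utc),
    "carrier_id": None,                             # filled in later by the Delivery transaction
    "order_lines": [                                # embedded instead of a separate table
        {"ol_number": 1, "i_id": 501, "supply_w_id": 1, "quantity": 5, "amount": 44.95},
        {"ol_number": 2, "i_id": 777, "supply_w_id": 1, "quantity": 1, "amount": 9.99},
    ],
}

# With pymongo this would be a single write, e.g.:
#   db.orders.insert_one(order_doc)
print(order_doc["_id"], "with", len(order_doc["order_lines"]), "order lines")
```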

Fault tolerance (Transaction processing book)

This is Chapter 3 of the Transaction Processing book by Gray and Reuter (1992). Why does the fault-tolerance discussion come so early in the book? We haven't even started talking about transactional programming styles, concurrency theory, or concurrency control. The reason is that the book uses dealing with failures as a motivation for adopting transaction primitives and a transactional programming style. I will highlight this argument now, and outline how the book builds to that crescendo in about 50 pages. The chapter starts with an astounding observation (I'm continuously astounded by the clarity of thinking in this book): "The presence of design faults is the ultimate limit to system availability; we have techniques that mask other kinds of faults." In the coming sections, the book introduces the concepts of faults, failures, availability, and reliability, and discusses hardware fault-tolerance through redundancy. It celebrates wins in hardware reliability through several examp
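For a rough sense of the availability vocabulary, steady-state availability comes down to two numbers, mean time to failure (MTTF) and mean time to repair (MTTR), via availability = MTTF / (MTTF + MTTR). A quick back-of-the-envelope in Python, with made-up sample numbers:

```python
# Steady-state availability from MTTF and MTTR; the sample numbers are made up.
def availability(mttf_hours, mttr_hours):
    return mttf_hours / (mttf_hours + mttr_hours)

# e.g., a module that fails about once a year and takes 4 hours to repair:
a = availability(mttf_hours=8760, mttr_hours=4)
print(f"availability = {a:.5f}  (~{(1 - a) * 8760:.1f} hours of downtime per year)")
```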
