Designing data-intensive applications — Storage

This is going to be a series of post taken from my learnings of “Designing data-intensive applications” by “Martin Kleppmann

picture from

Bare minimal

echo "$1,$2" >> database
grep "^$1," database | sed -e "s/^$1,//" | tail -n 1



Read: The db_get has a bit of performance problem when a large number of records are written this takes time to lookup for the key scanning the entire file for each read, cost of reading is O(n)

To make effective reads we could go for a data structure called index, lets take a look at an implementation of this with Hash indexes

Hash indexes


However, we are bottlenecked with the fact that our keys being stored as an in-memory needs to be able to fit in the RAM. This also fails in performance over any range-queries for data.

Real word analysis

Crash recovery: In such cases, rather than rebuilding the segment to generate the hash map a better idea would be to go for having snapshots of each segment’s hash map to be stored on disk.

Partially written records: Adding a checksum alongside the write could address this issue.

Concurrency: Being sequential in writes, it allows only one write thread but supports multiple read threads concurrently.

SSTable (Sorted s̶e̶g̶m̶e̶n̶t̶e̶d String Table)


To maintain the sorted structure in-memory, we can make use of the AVL tree(memtable), which provides key insertion in any order but reads back in sorted order. To handle the case of a DB crash, we can have a backup mechanism of our initial approach working in parallel on disk, keeping a separate unsorted log of every writes that is immediately appended.

Real word analysis


Rather than having to break the database into segments of variable size, the B tree breaks down the database into fixed-size blocks or Pages of approx 4kb in size to performs read and write. Each page has an address and one location in it can refer to another page on disk. one page will be made as a root of the B tree and can refer to number of child pages. When there is no space for a new key in a page, it splits into two half-filled pages and updates the parent page. B tree with a n keys will have a depth of O(log n).



LSM trees are faster for writes, B trees are faster for reads.

#devloper ,likes art, origami and instrumental music

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store