Scaling Memcached

Created on April 14, 2021

Article summary for Melodies Sim's Medium Article: Lessons Learnt from Scaling Memcached in Production


Memcached is:

  • An in-memory cache typically used for look-aside caching
    • Look-aside versus look-through caching: see slides 14 and 15 of Princeton COS 316 Intro to Caching
    • Look-aside caching: client requests item from backing store (database) on cache miss; simpler implementation, more flexible, slower on misses
    • Look-through caching: cache requests item from backing store on cache miss; harder implementation, less flexible, better consistency
  • A simple implementation with no built-in high-availability features
  • Used by planet-scale companies such as Facebook and Twitter
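The look-aside pattern above can be sketched in a few lines. This is a toy illustration, not a Memcached client: plain dicts stand in for the cache and the backing database, and the function name `lookaside_get` is made up for this example.

```python
db = {"user:1": "alice"}   # stands in for the backing database
cache = {}                 # stands in for Memcached

def lookaside_get(key):
    """Client checks the cache first; on a miss it reads the backing
    store and populates the cache itself (the cache never talks to
    the DB -- that is what makes this look-aside, not look-through)."""
    value = cache.get(key)
    if value is None:       # cache miss
        value = db[key]     # client fetches from the backing store
        cache[key] = value  # client fills the cache for later reads
    return value

print(lookaside_get("user:1"))  # miss: reads the DB, fills the cache
print(lookaside_get("user:1"))  # hit: served straight from the cache
```

In a look-through design, the `db[key]` fetch would live inside the cache service instead of the client, which is why that variant is harder to implement but easier to keep consistent.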

The article discusses strategies for building caches with high availability and strong data consistency.

Key takeaways

  • There is a trade-off between consistency (the C in ACID) and availability
  • Memcached can be used as Authoritative Storage via Remote Marker mechanism
    • The marker minimizes the probability of reading stale data from DB replicas
    • When the client writes to the master DB, there is a replication delay between it and its replicas
    • The client will write a marker to Memcached after writing to the master DB
    • Marker will have a short TTL - duration depending on DB replication delay
    • If marker exists, read from the master DB; otherwise read from DB replica
  • Writes do not commute (their order matters); deletes are idempotent and order-independent.
    • If clients naively write to the DB and then set the cache, concurrent writes race: the cache sets can land in a different order than the DB writes
      • The cache can then hold a stale value indefinitely, with no path to eventual consistency
    • To avoid this issue, use delete instead: after writing to DB, do cache invalidation by deleting the data item from the cache
      • Multiple deletes sent to the cache are harmless: deleting an already-deleted key is a no-op, and the order of deletes is irrelevant
      • On the next cache miss, the client fetches the data item from the database and sets the latest value in Memcached
    • There is still a small risk of stale data if a delete arrives between a cache read and the subsequent set (see Melodies's article for an illustration)
      • Facebook addresses this with a check-and-set approach via a lease mechanism
      • On cache miss, client receives a lease token for the requested key
      • On cache set, client verifies the lease token before setting the key
      • If Memcached receives a delete request for the key, the lease is invalidated
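The Remote Marker flow can be sketched as follows. This is a minimal simulation, not production code: dicts stand in for Memcached, the master DB, and a lagging replica, and the `REPLICATION_DELAY` value and helper names are assumptions for the example.

```python
import time

cache = {}                       # stands in for Memcached: key -> (value, expiry)
master_db, replica_db = {}, {}   # replica lags behind the master

REPLICATION_DELAY = 0.05         # assumed replication lag; the marker TTL
                                 # should comfortably exceed it

def cache_set(key, value, ttl):
    cache[key] = (value, time.time() + ttl)

def cache_get(key):
    entry = cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]
    cache.pop(key, None)         # drop expired entries
    return None

def write(key, value):
    master_db[key] = value
    # Remote marker: a short-lived flag saying "replicas may be stale"
    cache_set("marker:" + key, 1, ttl=2 * REPLICATION_DELAY)

def read(key):
    if cache_get("marker:" + key) is not None:
        return master_db[key]    # marker present: replica may lag, use master
    return replica_db.get(key)   # marker expired: replica is safe to read

write("user:1", "alice")
print(read("user:1"))  # marker still live, so the read goes to the master
```

Once the marker's TTL elapses (i.e. replication has had time to catch up), reads fall through to the cheaper replica path.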
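The write-ordering race, and why delete-based invalidation repairs it, can be shown concretely. Again a toy sketch with dicts standing in for the cache and DB; the interleaving is staged by hand to represent two concurrent writers.

```python
cache, db = {}, {}

# Naive "write DB, then set cache": two concurrent writers A and B.
# The DB serializes them as A then B, but their cache sets arrive in
# the opposite order -- and there is no mechanism to ever fix it.
db["k"] = "A"; db["k"] = "B"        # DB order: A, then B (B wins)
cache["k"] = "B"; cache["k"] = "A"  # cache sets race: A's set lands last
assert db["k"] != cache["k"]        # cache is stale indefinitely

# Delete-based invalidation: both writers just delete the key.
# Deletes are idempotent and order-independent, so the race is harmless.
cache.pop("k", None); cache.pop("k", None)
value = cache.get("k")
if value is None:                   # next reader repopulates from the DB
    value = db["k"]
    cache["k"] = value
print(value)  # "B" -- the cache converges to the DB's latest value
```

The key point is that the cache never stores a writer-supplied value; it only ever stores what a reader just fetched from the DB.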
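The lease mechanism described above can be sketched as a small class. This is an illustrative model of the idea, not Memcached's actual protocol: the class name, token scheme, and method signatures are all invented for the example.

```python
import itertools

class LeasedCache:
    """Toy model of Memcached-style leases: a cache miss hands out a
    token; a later delete invalidates it, so a stale set is rejected."""
    def __init__(self):
        self.data = {}
        self.leases = {}
        self._tokens = itertools.count(1)

    def get(self, key):
        if key in self.data:
            return self.data[key], None
        token = next(self._tokens)   # miss: hand out a lease token
        self.leases[key] = token
        return None, token

    def set(self, key, value, token):
        if self.leases.get(key) != token:
            return False             # lease invalidated: reject stale set
        del self.leases[key]
        self.data[key] = value
        return True

    def delete(self, key):
        self.data.pop(key, None)
        self.leases.pop(key, None)   # invalidate any outstanding lease

c = LeasedCache()
_, token = c.get("k")              # miss: client receives a lease token
c.delete("k")                      # a concurrent write invalidates the lease
print(c.set("k", "stale", token))  # False: the stale set is refused
```

Without the lease, the client's `set` would have re-inserted a value that the delete had just invalidated, which is exactly the read/set race described above.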