Week 06: Cache Strategy

“There are only two hard things in Computer Science: cache invalidation and naming things.” — Phil Karlton

Tags: system-design cache redis alex-xu devops security
Student: Hieu
Prerequisite: Tuan-02-Back-of-the-envelope · Tuan-05-Load-Balancer
Related: Tuan-03-Networking-DNS-CDN · Tuan-07-Database-Sharding-Replication · Tuan-08-Message-Queue · Tuan-13-Monitoring-Observability

1. Context & Why

Everyday analogy: the fridge in the kitchen

Hieu, imagine you are cooking in your kitchen. Every time you need vegetables, meat, or eggs, you have to run to the market. Each trip takes 30 minutes. Cooking a four-dish meal means 4 trips, or 2 hours spent just fetching ingredients.

The fix? Buy a fridge (the cache). Go to the market once at the weekend and buy enough ingredients for the whole week. When you cook, you just open the fridge: 5 seconds instead of 30 minutes.

Everyday life	System Design
Market (far, slow)	Database (disk I/O, network latency)
Fridge (near, fast)	Cache (in-memory, local or Redis)
Food past its expiry date	Cache expiration (TTL)
Fridge is full, old items must go	Cache eviction (LRU, LFU)
Old items bought by mistake and still sitting in the fridge	Stale data
Fridge is broken, back to the market	Cache miss → fallback to DB
Freezer (rarely used, long storage)	CDN / Cold cache layer
Mini fridge on the kitchen counter	L1 local cache (Caffeine/Guava)

A cache in system design is a fast temporary store that sits between the application and a data source (DB, external API, file system). It reduces latency and takes load off the backend.

Why does Alex Xu place Cache right after Load Balancer?

Because once you know how to distribute requests (Load Balancer), the next question is: how do you reduce the number of requests that need full processing? Caching is the answer. Recall the latency table from Tuan-02-Back-of-the-envelope:

Operation	Latency
Memory reference	100 ns
SSD random read	16 us
Network round trip (same DC)	0.5 ms
DB query (simple, indexed)	1-5 ms
DB query (complex join)	50-500 ms

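Against the figures above, a cache hit costs either a memory reference (~100 ns, local cache) or one network round trip (~0.5 ms, Redis in the same DC), while a DB query costs 1-500 ms. A cache hit is therefore anywhere from a few times to several orders of magnitude faster than the query it replaces, which is why nearly every large system has a cache layer.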

When to use a cache?

Use it when	Avoid it when
Read-heavy workload (read:write > 5:1)	Write-heavy workload
Data changes rarely (product catalog, user profile)	Data changes constantly (real-time stock prices)
DB queries are expensive (complex joins, aggregations)	Data needs 100% consistency (bank balance)
Many users get the same response (homepage, trending)	Data is unique per request (heavily personalized search results)
Tight latency requirement (< 10 ms)	Data is too large to fit in memory

2. Deep Dive — Cache Patterns & Architecture

2.1 Cache-Aside (Lazy Loading)

Description: The application manages the cache itself. It reads the cache first; on a miss it reads the DB and then writes the result into the cache.

Flow:

1. App receives a request → checks the cache
2. Cache HIT → return the data immediately (fast path)
3. Cache MISS → query the DB → write the result into the cache → return the data
4. When the data changes (write) → invalidate the cache entry

Pros:

Caches only data that is actually accessed (no wasted memory)
A cache failure does not take the system down (fall back to the DB)
Flexible: the app fully controls the caching logic

Cons:

The first access to each key is always a slow miss (cold start)
Stale data is possible if invalidation is done incorrectly
More complex application code (it must handle both cache and DB logic)
Three network calls per cache miss: check cache + query DB + write cache

When to use: Read-heavy systems whose data does not change too often. This is the most common pattern; Amazon, Netflix, and Facebook all use it.
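The four steps of the flow above can be sketched in a few lines of Python. This is a minimal illustration, not a production client: `TTLCache` is a hypothetical in-memory stand-in for Redis, the "database" is a plain dict, and the key format and the 3600-second TTL are assumptions made for the example.

```python
import time
from typing import Optional


class TTLCache:
    """Hypothetical in-memory stand-in for Redis: get / set-with-expiry / delete."""

    def __init__(self) -> None:
        self._data = {}  # key -> (value, expires_at)

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # lazily drop the expired entry
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


class UserService:
    def __init__(self, cache: TTLCache, db: dict) -> None:
        self.cache = cache
        self.db = db        # assumption: a dict plays the role of the database
        self.db_reads = 0   # counts how often a read falls through to the DB

    def get_profile(self, user_id: str) -> Optional[str]:
        key = f"user:{user_id}:profile"
        cached = self.cache.get(key)       # step 1: check the cache
        if cached is not None:
            return cached                  # step 2: hit, fast path
        self.db_reads += 1
        value = self.db.get(user_id)       # step 3: miss, query the DB
        if value is not None:
            self.cache.set(key, value, ttl_seconds=3600)
        return value

    def update_profile(self, user_id: str, value: str) -> None:
        self.db[user_id] = value                       # write the source of truth
        self.cache.delete(f"user:{user_id}:profile")   # step 4: invalidate


service = UserService(TTLCache(), {"1001": '{"name":"Hieu"}'})
first = service.get_profile("1001")    # miss: reads the DB and fills the cache
second = service.get_profile("1001")   # hit: served from the cache
service.update_profile("1001", '{"name":"Hieu","role":"dev"}')
third = service.get_profile("1001")    # miss again after invalidation
```

Note that the application code owns both the cache and the DB calls, which is exactly the extra complexity listed under the cons.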

2.2 Read-Through

Description: The cache layer reads the DB automatically on a miss. The application talks only to the cache and does not need to know about the DB.

Flow:

1. App receives a request → reads the cache
2. Cache HIT → return the data
3. Cache MISS → the cache library/proxy AUTOMATICALLY queries the DB → stores the result in the cache → returns the data

Difference from Cache-Aside: In Cache-Aside, the application is responsible for querying the DB and writing the cache. In Read-Through, the cache layer handles both.

Pros:

Simple application code: just call cache.get(key)
Consistent logic, fewer bugs
Easy to adopt with a library that supports it (such as a Caffeine loader)

Cons:

The cache layer must know how to query the DB (tight coupling)
Hard to customize complex query logic
Few libraries support it for distributed caches (Redis has no built-in read-through)

When to use: With a local cache library (Caffeine, Guava) that ships a loader mechanism, or when you want to abstract caching logic out of the business code.

2.3 Write-Through

Description: On every write, the data is written to the cache AND the DB synchronously. The cache always holds the latest data.

Flow:

1. App writes data → writes to the cache
2. The cache writes to the DB as part of the same synchronous operation
3. When both succeed → return the response

Pros:

The cache stays consistent with the DB (no stale data)
A read after a write always sees the latest data
Simplifies cache invalidation (nothing to invalidate because the data is always fresh)

Cons:

Higher write latency (must wait for both cache and DB)
The cache also holds data that is rarely read (wasted memory)
If the cache is down, writes fail too (a SPOF unless handled)

When to use: When data consistency matters more than write performance. Often combined with Cache-Aside: write-through provides consistency, cache-aside handles reads.

2.4 Write-Behind (Write-Back)

Description: The app writes to the cache first, and the cache writes to the DB later, asynchronously. The fastest pattern, and the riskiest.

Flow:

1. App writes data → writes to the cache → returns the response IMMEDIATELY
2. The cache queues the writes in batches → writes them to the DB later (async, batched)

Pros:

Very low write latency (memory write only)
Batched writes reduce DB load
Well suited to write-heavy workloads

Cons:

Data loss risk: if the cache crashes before flushing to the DB → the data is lost
Eventual consistency: the DB can lag behind the cache
Complex: retries, idempotency, and ordering all have to be handled

When to use: Write-heavy systems that can accept eventual consistency. Examples: page view counters, analytics events, logging. Never use it for financial transactions.
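The page view counter case can be sketched as follows. This is a simplified illustration under stated assumptions: `WriteBehindCache` is a hypothetical class, the "DB" is a dict, and `flush()` is called by hand where a real system would run it from a timer or background worker.

```python
from collections import OrderedDict


class WriteBehindCache:
    """Hypothetical write-behind cache: writes land in memory, the DB is updated later."""

    def __init__(self, db: dict, batch_size: int = 100) -> None:
        self.db = db                  # assumption: a dict stands in for the database
        self.cache = {}
        self.dirty = OrderedDict()    # pending writes, oldest first
        self.batch_size = batch_size
        self.db_batches = 0           # how many batched DB writes were issued

    def write(self, key: str, value: int) -> None:
        self.cache[key] = value
        self.dirty[key] = value       # later writes to the same key overwrite earlier ones
        # Returns immediately: the DB has not been touched yet.

    def read(self, key: str):
        return self.cache.get(key)

    def flush(self) -> int:
        """Push pending writes to the DB in batches; returns the number of keys written."""
        flushed = 0
        while self.dirty:
            batch = []
            while self.dirty and len(batch) < self.batch_size:
                batch.append(self.dirty.popitem(last=False))  # oldest entry first
            self.db.update(batch)     # one batched DB write
            self.db_batches += 1
            flushed += len(batch)
        return flushed


db = {}
cache = WriteBehindCache(db, batch_size=100)
for views in range(1, 6):
    cache.write("page:home:views", views)   # five fast in-memory writes

db_before_flush = dict(db)   # still empty: this is the data-loss window
written = cache.flush()      # five writes coalesce into a single DB write
```

Everything in `dirty` is lost if the process dies before `flush()` runs, which is the data loss risk named above.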

2.5 Cache Patterns Compared

Pattern	Read Latency	Write Latency	Consistency	Data Loss Risk	Complexity
Cache-Aside	Miss: high, Hit: low	N/A (app writes to the DB directly)	Eventual	Low	Medium
Read-Through	Miss: high, Hit: low	N/A	Eventual	Low	Low
Write-Through	Low (hit for any key written through the cache)	High (sync to 2 places)	Strong	Low	Medium
Write-Behind	Low (hit for any key written through the cache)	Very low (async)	Eventual	High	High

2.6 Cache Eviction Policies

When the cache is full, it has to decide which entry to drop to make room for a new one. Three common strategies:

LRU — Least Recently Used

Logic: Drop the entry that has gone the longest without being accessed
Data structure: Doubly Linked List + HashMap → O(1) get/put
Pros: Works well with temporal locality (recently used data is likely to be used again)
Cons: Scan pollution, where a single full scan can push all the hot data out
Use when: Most cases (the default choice). Redis uses an approximate LRU.
Redis config: maxmemory-policy allkeys-lru
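A minimal LRU sketch in Python, for illustration only. It leans on `collections.OrderedDict`, which keeps insertion order and supports O(1) `move_to_end` and `popitem`, so it plays the role of the linked list plus hash map described above. This is an exact LRU; Redis samples keys and only approximates it.

```python
from collections import OrderedDict


class LRUCache:
    """Exact LRU cache with O(1) get/put, built on an ordered hash map."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # least recently used first, most recent last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)   # mark as most recently used
        return self._items[key]

    def put(self, key, value) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now the most recently used key
cache.put("c", 3)    # the cache is full, so "b" is evicted
```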

LFU — Least Frequently Used

Logic: Drop the entry that is accessed least often (by counting accesses)
Pros: Keeps data that is genuinely popular (hot data)
Cons: Old frequency counts can make it hard for new data to stay in the cache; the counters cost extra memory
Use when: The workload has a clear split between hot and cold data (e-commerce trending products)
Redis config: maxmemory-policy allkeys-lfu

TTL — Time To Live

Logic: Each entry has a fixed lifetime and is deleted automatically when it expires
Pros: Guarantees freshness, since data is never stale for longer than the TTL
Cons: Hot data is also deleted when its TTL runs out → a burst of cache misses
Use when: The data has a natural "shelf life" (session token: 30 minutes, exchange rate: 60 seconds)
Redis command: SET key value EX 3600 (TTL of 1 hour)

Best practice: Combine LRU + TTL. Each entry gets a TTL to guarantee freshness, and LRU takes over when memory fills up before the TTL runs out. This is a common way to configure Redis as a cache.

2.7 Redis vs Memcached

Feature	Redis	Memcached
Data structures	String, Hash, List, Set, Sorted Set, HyperLogLog, Stream, Bitmap	String only (key-value)
Persistence	RDB snapshots + AOF	None (pure in-memory)
Replication	Master-Replica built-in	None (client-side)
Clustering	Redis Cluster (auto-sharding)	Client-side consistent hashing
Pub/Sub	Yes	No
Lua scripting	Yes	No
Multithreading	Single-thread command execution (I/O threads since 6.0)	Multi-threaded
Memory efficiency	Worse (overhead per key ~50-70 bytes)	Better (slab allocator)
Max value size	512 MB	1 MB (default)
Transactions	MULTI/EXEC (optimistic locking via WATCH)	CAS (Compare-And-Swap)

When to choose Redis:

You need rich data structures (sorted set for a leaderboard, hash for a user session)
You need persistence (you do not want to lose the cache on restart)
You need pub/sub or streaming
You need complex atomic operations (Lua scripts)
In most cases → choose Redis

When to choose Memcached:

You only need a simple key-value cache
You want better use of multi-core CPUs
Memory efficiency is the top priority (caching billions of small objects)
A legacy system already uses Memcached

2.8 Redis Data Structures — Deep Dive

String

Used for: Simple key-value caching, counters, distributed locks
Commands: SET, GET, INCR, DECR, SETNX (set if not exists), SETEX (set with expiry)
Max size: 512 MB per value
Use case: Cache API response, session token, rate limit counter

SET user:1001:profile '{"name":"Hieu","role":"dev"}' EX 3600
GET user:1001:profile
INCR page:home:views    # Atomic counter

Hash

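Used for: Objects with many fields (instead of serializing the whole object into a string)
Commands: HSET, HGET, HMSET (deprecated since Redis 4.0; HSET accepts multiple fields), HGETALL, HINCRBY
Pros: Update a single field without reading or writing the whole object. Small hashes are memory-efficient: by default, up to 512 fields with values up to 64 bytes use a compact encoding (ziplist, listpack since Redis 7.0), and both limits are configurable.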
Use case: User profile, product info, session data

HSET user:1001 name "Hieu" role "dev" login_count 42
HGET user:1001 name              # → "Hieu"
HINCRBY user:1001 login_count 1  # Atomic increment 1 field
HGETALL user:1001                # Get all fields

Sorted Set (ZSet)

Used for: Ranking, leaderboard, priority queue, time-series index
Commands: ZADD, ZRANGE, ZREVRANGE, ZRANK, ZREVRANK, ZRANGEBYSCORE
Structure: Skip list + hash table → O(log N) insert/delete, O(1) score lookup
Use case: Game leaderboard, trending posts, delayed job scheduling

ZADD leaderboard 9500 "player:hieu"
ZADD leaderboard 8700 "player:nam"
ZADD leaderboard 9800 "player:linh"
ZREVRANGE leaderboard 0 9 WITHSCORES   # Top 10 players
ZREVRANK leaderboard "player:hieu"      # Hieu's rank, 0-based from the highest score (ZRANK counts from the lowest)

HyperLogLog

Used for: Counting unique elements (cardinality) with very little memory
Commands: PFADD, PFCOUNT, PFMERGE
Memory: At most 12 KB per key, even when counting billions of unique elements
Error: Standard error ~0.81%
Use case: Unique visitors, unique search queries, unique IPs

PFADD page:home:visitors "user:1001" "user:1002" "user:1001"
PFCOUNT page:home:visitors  # → 2 (unique)
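# 1 billion unique users still costs only 12 KB!

Aha Moment: Counting 1 billion unique visitors with a Redis Set needs at least ~8 GB for the IDs alone (8 bytes each), and more once per-element overhead is included. HyperLogLog needs only 12 KB. Trade-off: a 0.81% error, which is accurate enough for analytics.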

2.9 Redis Cluster vs Redis Sentinel

Redis Sentinel — High Availability

Purpose: Automatic failover for a master-replica setup
How it works: Sentinel processes monitor the master. If the master goes down → a replica is promoted to master automatically
Good fit when: The data fits on one node (< 25-50 GB) and you need HA but not horizontal scaling
Limitation: No auto-sharding, so all data lives on one master

                    ┌─────────┐
                    │Sentinel │
                    │ Quorum  │
                    │ (3 min) │
                    └────┬────┘
                         │ monitors
              ┌──────────┼──────────┐
              ▼          ▼          ▼
         ┌────────┐ ┌────────┐ ┌────────┐
         │Master  │ │Replica │ │Replica │
         │(write) │ │ (read) │ │ (read) │
         └────────┘ └────────┘ └────────┘

Redis Cluster — Horizontal Scaling + HA

Purpose: Auto-sharding data across multiple nodes + built-in failover
How it works: 16,384 hash slots are divided among the master nodes. Each master has one or more replicas. Clients are redirected to the right node with MOVED/ASK.
Good fit when: Data > 25 GB, you need horizontal write scaling, or QPS exceeds 100K ops/s
Limitation: No multi-key operations across slots (unless you use hash tags); a Lua script must run on one node, so all its keys must be in the same slot
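How a key maps to one of the 16,384 slots, and how hash tags force related keys into the same slot, can be sketched in Python. This assumes the scheme described in the Redis Cluster specification: CRC16 (XMODEM variant) of the key modulo 16384, hashing only the text between the first `{` and the next `}` when that text is non-empty. Python's `binascii.crc_hqx` with an initial value of 0 is used here as that CRC16; the function name `key_hash_slot` is made up for the example.

```python
import binascii

HASH_SLOTS = 16384


def key_hash_slot(key: str) -> int:
    """Return the cluster hash slot for a key, honoring {hash tags}."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:   # tag exists and is not empty
            data = data[start + 1:end]       # hash only the tag content
    return binascii.crc_hqx(data, 0) % HASH_SLOTS


# Keys sharing a hash tag land in the same slot, so multi-key
# commands and Lua scripts over them stay on one node.
profile_slot = key_hash_slot("{user:1001}:profile")
orders_slot = key_hash_slot("{user:1001}:orders")
same_slot = profile_slot == orders_slot

# Without a tag, related keys usually end up in different slots.
untagged = (key_hash_slot("user:1001:profile"), key_hash_slot("user:1001:orders"))
```

On a real cluster, `CLUSTER KEYSLOT <key>` reports the slot and is the authoritative check.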

┌──────────────────────────────────────────────────┐
│                 Redis Cluster                     │
│                                                   │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │ Master 1 │  │ Master 2 │  │ Master 3 │       │
│  │Slots 0-  │  │Slots     │  │Slots     │       │
│  │5460      │  │5461-10922│  │10923-16383│      │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘       │
│       │              │              │              │
│  ┌────▼─────┐  ┌────▼─────┐  ┌────▼─────┐       │
│  │ Replica  │  │ Replica  │  │ Replica  │       │
│  │   1a     │  │   2a     │  │   3a     │       │
│  └──────────┘  └──────────┘  └──────────┘       │
└──────────────────────────────────────────────────┘

Feature	Sentinel	Cluster
Auto-failover	Yes	Yes
Auto-sharding	No	Yes (16,384 slots)
Max data size	1 node memory	Sum of all masters
Write scaling	1 master only	Multiple masters
Multi-key ops	Full support	Same slot only
Min nodes	3 Sentinel + 1 master + 1 replica	6 (3 masters + 3 replicas)
Client complexity	Low	Medium (redirect handling)

2.10 Cache Stampede / Thundering Herd

Problem: When a popular cache entry expires, thousands of concurrent requests all miss the cache and query the DB together → the DB is overloaded.

Timeline:
T=0: Cache entry "trending_posts" expires (TTL runs out)
T=0.001s: 5,000 concurrent requests → all cache MISS
T=0.002s: 5,000 DB queries fired simultaneously
T=0.005s: DB CPU 100%, connection pool exhausted
T=0.01s: DB timeout → cascading failure

Solution 1: Mutex Lock (Singleflight)

Only one request is allowed to query the DB; the others wait
See the code example in Section 6

Solution 2: Early Expiration (Staggered TTL)

Add random jitter to the TTL: TTL = base_ttl + random(0, base_ttl * 0.1)
Keeps many entries from expiring at the same moment

Solution 3: Background Refresh

A background job refreshes the cache before the TTL runs out
Requests always read from the cache and never miss

Solution 4: “Never expire” + Logical Expiration

The cache entry has no real TTL
A timestamp is stored with the value, and the application checks whether the entry has expired
If it has → return the old (stale) data + trigger an async refresh
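Solution 4 can be sketched as below. This is an illustrative model, not a reference implementation: `LogicalExpiryCache` and its `refresh_queue` are hypothetical, the queue stands in for a real async worker, and the clock is injectable only so the example runs without sleeping.

```python
import time
from dataclasses import dataclass


@dataclass
class Entry:
    value: str
    logical_expires_at: float   # stored with the value; there is no real TTL


class LogicalExpiryCache:
    def __init__(self, loader, ttl_seconds: float, clock=time.monotonic) -> None:
        self.loader = loader            # function that reads the source of truth
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}
        self.refresh_queue = []         # assumption: stands in for an async worker queue

    def get(self, key: str) -> str:
        entry = self.entries.get(key)
        if entry is None:
            return self._load(key)      # only the very first access loads inline
        expired = self.clock() >= entry.logical_expires_at
        if expired and key not in self.refresh_queue:
            self.refresh_queue.append(key)   # schedule one refresh, do not block
        return entry.value              # possibly stale, but never a miss

    def run_refresh_worker(self) -> None:
        """In a real system this runs in the background."""
        while self.refresh_queue:
            self._load(self.refresh_queue.pop(0))

    def _load(self, key: str) -> str:
        value = self.loader(key)
        self.entries[key] = Entry(value, self.clock() + self.ttl)
        return value


now = [0.0]                                 # fake clock for the demo
versions = iter(["v1", "v2"])               # what the "DB" returns on each load
cache = LogicalExpiryCache(lambda key: next(versions), ttl_seconds=60,
                           clock=lambda: now[0])

a = cache.get("trending_posts")   # first load from the source
now[0] = 61.0                     # the logical TTL has passed
b = cache.get("trending_posts")   # stale value returned, refresh queued
cache.run_refresh_worker()
c = cache.get("trending_posts")   # fresh value after the background refresh
```

However many requests arrive while the entry is stale, only one refresh is queued, so the DB sees a single query instead of a stampede.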

2.11 Cache Penetration

Problem: Requests ask for data that exists in neither the cache nor the DB. The cache always misses, and the DB is queried every time and returns nothing → the cache never warms up for those keys.

Example: An attacker sends millions of requests with random IDs (user:999999999). The data does not exist, so every request hits the DB.

Solution 1: Cache null/empty results

GET user:999999999 → cache MISS → DB query → not found
SET user:999999999 "NULL" EX 300  # Cache the NULL marker with a short TTL (5 minutes)

Solution 2: Bloom Filter

A Bloom filter sits in front of the cache and quickly answers “MIGHT this key exist?”
If the Bloom filter says “no” → return 404 immediately without querying the cache or DB
False positive rate of about 1% at very low memory cost (tunable through filter size and hash count)
Redis has the RedisBloom module for this: BF.ADD, BF.EXISTS
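To make the "might exist" check concrete, here is a small Bloom filter sketch in pure Python. It is a teaching model rather than a replacement for RedisBloom: the bit-array size, the number of hashes, and the use of salted SHA-256 digests as hash functions are all assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Set-membership filter: no false negatives, a small rate of false positives."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 5) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


known_users = BloomFilter()
for user_key in ("user:1001", "user:1002", "user:1003"):
    known_users.add(user_key)       # populated from the DB when the filter is built


def should_query_backend(key: str) -> bool:
    """Gate in front of the cache/DB: skip the lookup when the key cannot exist."""
    return known_users.might_contain(key)


existing = should_query_backend("user:1001")   # always True for an added key
```

A key that was added always passes the gate. A key that was never added is almost always rejected, and the rare false positive only costs one ordinary cache/DB lookup.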

Solution 3: Input Validation

Validate the format before querying: the ID must be a valid UUID, a positive integer, and so on
Reject invalid requests at the API gateway layer

2.12 Cache Avalanche

Problem: Many cache entries expire at the same time (or the cache server crashes) → a sudden spike hits the DB → the DB is overloaded → the whole system goes down.

Difference from Cache Stampede: a stampede is one hot key expiring; an avalanche is many keys expiring together or the entire cache going down.

Solution 1: Staggered TTL

TTL = base_ttl + random(0, max_jitter)
# Example: base of 1 hour + random 0-10 minutes
# → entries expire spread out over time, not all at once
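The formula above translates directly into a helper. A minimal sketch: the function name `ttl_with_jitter` is made up, and the seeded `random.Random` is there only to make the demo repeatable.

```python
import random


def ttl_with_jitter(base_ttl: int, max_jitter: int, rng: random.Random) -> int:
    """TTL = base_ttl + random(0, max_jitter), both in seconds."""
    return base_ttl + rng.randint(0, max_jitter)


rng = random.Random(42)   # seeded only so the demo is repeatable
# 1,000 entries written in the same instant: base 1 hour + 0-10 minutes of jitter.
ttls = [ttl_with_jitter(3600, 600, rng) for _ in range(1000)]

earliest, latest = min(ttls), max(ttls)
distinct_expiry_times = len(set(ttls))   # spread over many seconds, not one
```

The result would be passed as the expiry when writing each key, for example as the `EX` argument of `SET`.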

Solution 2: Redis HA (Sentinel or Cluster)

Make sure the cache tier never goes down entirely
Automatic failover if the master crashes

Solution 3: Circuit Breaker + Fallback

When the cache is detected as down → the circuit breaker opens → return stale data or a default response
Do not let every request fall through to the DB

Solution 4: Cache Warming

Before a known traffic peak (Black Friday, Tet) → pre-populate the cache
A background job refreshes hot data ahead of peak hours

2.13 Distributed Cache Consistency

When many application instances read and write the cache, consistency becomes a major challenge:

Race condition (the double-write problem):

Time    Thread A                        Thread B              Cache     DB
T1      Cache miss, read DB: value=10                         (empty)   10
T2                                      Update DB: value=20   (empty)   20
T3                                      Delete cache          (empty)   20
T4      Write cache: value=10                                 10(!)     20
→ The cache is stale! Cache=10, DB=20

Solution: delete the cache entry instead of updating it

On a DB write → delete the cache entry (do not update it)
The next read misses → reads the DB (which already has the new data) → writes it back into the cache
This narrows the race window considerably, although the interleaving above can still occur when a slow reader overlaps a write

Advanced solution: Delayed Double Delete

1. Delete the cache entry
2. Update the DB
3. Sleep 500 ms (wait for concurrent reads to finish)
4. Delete the cache entry AGAIN

Strongest solution: event-driven invalidation

Use CDC (Change Data Capture) to read the DB binlog/WAL
When a data change is detected → publish an event → invalidate the cache
Tools: Debezium, Maxwell (MySQL), Kafka Connect
See also: Tuan-08-Message-Queue

2.14 Local Cache vs Distributed Cache

Local Cache (In-Process) — Caffeine / Guava Cache

Where it lives: In the memory of the application process (JVM heap, Node.js heap)
Latency: ~100 ns (memory access, no network)
Pros: Extremely fast, no network overhead, no SPOF
Cons: Each instance holds its own copy → inconsistency between instances. Cache size is limited by instance memory. A restart loses the whole cache.
Good fit: Config data, static reference data, data that changes very rarely

// Java: Caffeine (recommended over Guava Cache)
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import java.time.Duration;

LoadingCache<String, UserProfile> cache = Caffeine.newBuilder()
    .maximumSize(10_000)                      // At most 10K entries
    .expireAfterWrite(Duration.ofMinutes(5))  // TTL: 5 minutes
    .refreshAfterWrite(Duration.ofMinutes(1)) // Async refresh, triggered on access once an entry is older than 1 minute
    .recordStats()                            // Enable hit/miss metrics
    .build(key -> userRepository.findById(key)); // Loader, called on a miss

UserProfile user = cache.get("user:1001"); // Loads automatically on a miss

Distributed Cache — Redis / Memcached

Where it lives: Separate server(s), accessed over the network
Latency: ~0.5-1 ms (network round trip within the same DC)
Pros: Shared across all instances → consistency. Scales independently. Survives application deploys.
Cons: Network latency. One more infrastructure component to manage. A SPOF unless deployed with HA.
Good fit: Session data, DB query results, API responses, most use cases

2.15 Multi-Level Caching — L1 + L2 + L3

Large systems do not rely on a single cache layer. They use several tiers, each with a different trade-off:

┌─────────────────────────────────────────────────────┐
│                    Client/Browser                    │
│              ┌──────────────────────┐                │
│              │ Browser Cache (HTTP) │  ← L0          │
│              └──────────┬───────────┘                │
└─────────────────────────┼───────────────────────────┘
                          ▼
┌─────────────────────────────────────────────────────┐
│                      CDN                             │
│              ┌──────────────────────┐                │
│              │ CDN Edge Cache       │  ← L3          │
│              │ (CloudFront, CF)     │                │
│              └──────────┬───────────┘                │
└─────────────────────────┼───────────────────────────┘
                          ▼
┌─────────────────────────────────────────────────────┐
│                 Application Server                   │
│  ┌─────────────────────────────┐                     │
│  │ L1: Local Cache (Caffeine)  │  ← ~100 ns          │
│  │ Size: 100 MB - 1 GB        │                     │
│  └──────────────┬──────────────┘                     │
│                 ▼ miss                               │
│  ┌─────────────────────────────┐                     │
│  │ L2: Redis (Distributed)     │  ← ~1 ms            │
│  │ Size: 10 GB - 1 TB         │                     │
│  └──────────────┬──────────────┘                     │
│                 ▼ miss                               │
│  ┌─────────────────────────────┐                     │
│  │ Database (Source of Truth)   │  ← 5-50 ms          │
│  └─────────────────────────────┘                     │
└─────────────────────────────────────────────────────┘

Layer	Technology	Latency	Size	TTL	Use Case
L0	Browser HTTP Cache	0 ms	Limited	Cache-Control headers	Static assets, API responses
L1	Caffeine / Node LRU	~100 ns	100 MB - 1 GB	30s - 5 min	Hot config, frequently accessed data
L2	Redis Cluster	~1 ms	10 GB - 1 TB	5 min - 24h	DB query results, session, general cache
L3	CDN (CloudFront)	~5-50 ms	Unlimited	1h - 7 days	Static content, API responses at edge
Source	PostgreSQL / MySQL	5-500 ms	Unlimited	N/A	Source of truth

The L1 TTL must be shorter than the L2 TTL. Otherwise L1 keeps serving stale data even after L2 has been updated.

Invalidation flow when data changes:

1. Update the DB
2. Delete from L2 (Redis) → other instances will miss L2 and fetch from the DB
3. L1 (local) expires on its own thanks to the short TTL (or use pub/sub to broadcast invalidations)
4. Invalidate the CDN via its API (if needed)
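The L1 → L2 → DB read path and the invalidation flow above can be modeled with three dicts. This is a deliberately simplified sketch: `TwoLevelCache` is hypothetical, TTLs are left out for brevity, and the `trace` list exists only to show which tier served each read.

```python
class TwoLevelCache:
    """Read path: L1 (local) → L2 (shared) → DB. TTLs omitted for brevity."""

    def __init__(self, db: dict) -> None:
        self.l1 = {}       # assumption: per-process cache (Caffeine in the text)
        self.l2 = {}       # assumption: dict standing in for Redis
        self.db = db       # assumption: dict standing in for the database
        self.trace = []    # which tier served each read

    def get(self, key: str):
        if key in self.l1:
            self.trace.append("L1")
            return self.l1[key]
        if key in self.l2:
            self.trace.append("L2")
            self.l1[key] = self.l2[key]   # promote into L1 for the next read
            return self.l1[key]
        self.trace.append("DB")
        value = self.db[key]
        self.l2[key] = value              # fill both tiers on the way back
        self.l1[key] = value
        return value

    def update(self, key: str, value) -> None:
        self.db[key] = value       # 1. update the DB
        self.l2.pop(key, None)     # 2. delete from L2
        self.l1.pop(key, None)     # 3. this instance's L1 only; other instances
                                   #    rely on a short L1 TTL or a pub/sub broadcast


store = TwoLevelCache({"product:42": "v1"})
store.get("product:42")            # served by the DB, fills L2 and L1
store.get("product:42")            # served by L1
store.l1.clear()                   # simulate a restart: L1 is lost, L2 survives
store.get("product:42")            # served by L2
store.update("product:42", "v2")
latest = store.get("product:42")   # both tiers were invalidated, so the DB serves it
```

Step 3 is the weak point: `update` can only clear the L1 of the instance that handled the write, which is why the L1 TTL has to stay short.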

3. Estimation — Cache Impact Analysis

3.1 Cache Hit Ratio Impact on Latency

Assume:

Cache hit latency: $t_{hit} = 1$ ms (Redis lookup)
Cache miss latency: $t_{miss} = 50$ ms (DB query + cache write)
Hit ratio: $h$

$$t_{avg} = h \times t_{hit} + (1 - h) \times t_{miss}$$

Hit Ratio ( $h$ )	Average Latency	Improvement vs No Cache
0% (no cache)	$0 \times 1 + 1 \times 50 = 50$ ms	Baseline
50%	$0.5 \times 1 + 0.5 \times 50 = 25.5$ ms	49% faster
80%	$0.8 \times 1 + 0.2 \times 50 = 10.8$ ms	78% faster
90%	$0.9 \times 1 + 0.1 \times 50 = 5.9$ ms	88% faster
95%	$0.95 \times 1 + 0.05 \times 50 = 3.45$ ms	93% faster
99%	$0.99 \times 1 + 0.01 \times 50 = 1.49$ ms	97% faster

Aha Moment: Going from an 80% to a 95% hit rate cuts latency from 10.8 ms to 3.45 ms, roughly a 3x improvement. Going from 95% to 99% adds only about another 2x but needs much more memory. Diminishing returns.
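The table can be reproduced with a few lines of Python, which also makes it easy to plug in your own hit and miss latencies. The function name and its default arguments (1 ms hit, 50 ms miss) simply mirror the assumptions above.

```python
def average_latency_ms(hit_ratio: float, hit_ms: float = 1.0, miss_ms: float = 50.0) -> float:
    """t_avg = h * t_hit + (1 - h) * t_miss"""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


baseline = average_latency_ms(0.0)   # no cache: every request pays the miss cost

for h in (0.0, 0.5, 0.8, 0.9, 0.95, 0.99):
    avg = average_latency_ms(h)
    improvement = (1 - avg / baseline) * 100
    print(f"h={h:.2f}  avg={avg:.2f} ms  improvement={improvement:.0f}%")

# The two ratios quoted in the Aha Moment:
gain_80_to_95 = average_latency_ms(0.80) / average_latency_ms(0.95)
gain_95_to_99 = average_latency_ms(0.95) / average_latency_ms(0.99)
```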

3.2 Memory Sizing for Cache — 80/20 Rule (Pareto)

Pareto principle: 20% of the data generates 80% of the traffic. You do not need to cache the whole DB; caching the 20% hot data already gets roughly an 80% hit rate.

Example: E-commerce Product Cache

Assumptions:

Total products: 10 million
Average product JSON size: 2 KB
Read QPS: 50,000/s
Hot products (top 20%): 2 million

$$\text{Total data size} = 10\text{M} \times 2\ \text{KB} = 20\ \text{GB}$$

$$\text{Hot data (20\%)} = 2\text{M} \times 2\ \text{KB} = 4\ \text{GB}$$

$$\text{Redis overhead } (\sim 2\times \text{ for key metadata}) = 4\ \text{GB} \times 2 = 8\ \text{GB}$$

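Conclusion: A single Redis node with 16 GB of memory is enough to hold all the hot data. Estimated cost: about $50/month on AWS (cache.r6g.large). Treat the price as a rough figure, since actual pricing depends on region and commitment.

If you want a 95%+ hit rate (also cache the “warm” data, the top 50%):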

$$\text{Warm data (50\%)} = 5\text{M} \times 2\ \text{KB} = 10\ \text{GB}$$

$$\text{Redis with overhead} = 10\ \text{GB} \times 2 = 20\ \text{GB}$$

This needs a Redis cluster or a single cache.r6g.xlarge node (32 GB, about $100/month by the same rough estimate).
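The sizing arithmetic above is easy to wrap in a helper for trying other assumptions. A small sketch: the function name is made up, sizes use decimal units (1 GB = 1,000,000 KB) to match the worked numbers, and the 2x overhead factor is the document's own rule of thumb.

```python
def cache_memory_gb(total_items: int, avg_item_kb: float, cached_fraction: float,
                    overhead_factor: float = 2.0) -> float:
    """Memory needed to cache a fraction of the dataset, including metadata overhead."""
    cached_items = total_items * cached_fraction
    raw_gb = cached_items * avg_item_kb / 1_000_000   # KB → GB (decimal)
    return raw_gb * overhead_factor


total_gb = cache_memory_gb(10_000_000, 2, 1.0, overhead_factor=1.0)  # raw dataset size
hot_gb = cache_memory_gb(10_000_000, 2, 0.2)    # top 20%, with 2x overhead
warm_gb = cache_memory_gb(10_000_000, 2, 0.5)   # top 50%, with 2x overhead

print(f"total={total_gb:.0f} GB  hot={hot_gb:.0f} GB  warm={warm_gb:.0f} GB")
```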

3.3 Cost Comparison: Cache vs DB Reads

Metric	Direct DB (no cache)	With Redis Cache (90% hit)
Read QPS	50,000 → all to DB	5,000 to DB + 45,000 to Redis
DB instances needed	3-5 read replicas (10-15K QPS each)	1 primary + 1 read replica
DB cost (RDS)	5 x db.r6g.xlarge = $5,000/mo	2 x db.r6g.large = $1,000/mo
Redis cost	$0	1 x cache.r6g.large = $50/mo
Total cost	$5,000/mo	$1,050/mo
Savings	Baseline	$3,950/mo (79% savings)

$$\text{Annual savings} = \$3{,}950 \times 12 = \$47{,}400/\text{year}$$

Aha Moment: Spending 50 USD/month on Redis yields a net saving of 3,950 USD/month on the database bill. Caching is one of the highest-ROI investments in infrastructure.

3.4 Cache Memory Quick Formula

$$\text{Cache Memory} = \frac{QPS_{read} \times 86{,}400 \times avg\_object\_size \times hot\_data\_pct}{dedup\_factor}$$

Where:

$hot\_data\_pct$ = 0.2 (Pareto 80/20)
$dedup\_factor$ = 5-10 (many requests hit the same key)
Multiply by a further 2x for Redis metadata overhead

4. Security — Cache cũng cần bảo mật

4.1 Cache Poisoning

Problem: An attacker injects malicious data into the cache. Every user after that receives the malicious data from the cache without it passing through validation.

Example attack vector:

The attacker crafts a request that makes the backend return a response containing a malicious script
The response is cached
Every subsequent user receives the cached malicious response (XSS via cache)

Or, Web Cache Poisoning (HTTP):

The attacker sends a request with an unkeyed header (for example X-Forwarded-Host) carrying a malicious value
The CDN/reverse proxy caches the response keyed on the URL, but the response contains the reflected malicious header value
Every user requesting the same URL → receives the poisoned response

Mitigations:

Validate output, not just input: escape HTML/JS before caching
The cache key must include every factor that affects the response (including relevant headers)
Do not cache unauthenticated error pages (they can be abused)
Sign cached values if they are critical: HMAC with a secret key
Review the cache key strategy and make sure no unkeyed input influences the response
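Signing cached values with an HMAC, as suggested in the list above, might look like the sketch below. It uses only the standard library. The `seal`/`unseal` names, the `tag:payload` layout, and the hard-coded key are assumptions for illustration; a real key would come from a secret store.

```python
import hashlib
import hmac
import json
from typing import Optional

SECRET_KEY = b"demo-only-secret"   # assumption: load this from a secret manager in practice


def seal(value: dict) -> str:
    """Serialize a value and prefix it with an HMAC-SHA256 tag before caching."""
    payload = json.dumps(value, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{tag}:{payload}"


def unseal(blob: str) -> Optional[dict]:
    """Return the value if the tag verifies, or None if the entry was tampered with."""
    tag, _, payload = blob.partition(":")   # the hex tag never contains ":"
    expected = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):   # constant-time comparison
        return None                              # treat as a cache miss
    return json.loads(payload)


sealed = seal({"price": 100})             # this string is what gets stored in the cache
roundtrip = unseal(sealed)                # verifies and returns the original value

tampered = sealed[:-4] + "999}"           # attacker rewrites the cached price
rejected = unseal(tampered)               # verification fails
```

An entry that fails verification is handled like a miss: the application reloads the value from the source of truth instead of serving the poisoned copy.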

4.2 PII in Cache

Problem: Caches are usually not encrypted at rest (the Redis default). If the cache holds PII (Personally Identifiable Information) such as national ID (CCCD), email, or address, that can violate GDPR/CCPA.

Principles:

Classify data before caching it:

Data Type	Cacheable?	Notes
Public product info	Yes	Safe to cache
User preferences (non-PII)	Yes	Theme, language
User email, phone	With caution	Encrypt or tokenize
Credit card, CCCD	NEVER	Do not cache sensitive PII
Session token	Yes	With a short TTL + encryption
Health records (HIPAA)	NO	Prohibited by compliance

If caching PII is unavoidable:
- Encrypt the value before SET (AES-256-GCM)
- Use a short TTL (15-30 minutes at most)
- Implement the RIGHT TO BE FORGOTTEN: when a user requests deletion → the cache must be purged too
- Log cache access for an audit trail
Cache keys can leak PII as well:
- Bad: cache:user:hieu@email.com:profile
- Good: cache:user:hash(hieu@email.com):profile or cache:user:uuid:profile

4.3 Redis AUTH + TLS

By default, Redis has NO authentication and NO encryption. Anyone who can reach it over the network can read and write data.

Redis AUTH (password):

# redis.conf
requirepass "YourStr0ng!P@ssw0rd#2024"
 
# ACL (Redis 6.0+) — Granular access control
user hieu-app on >app_password ~cache:* +get +set +del -@admin
user admin on >admin_password ~* +@all

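# Connect with a password (single quotes stop an interactive shell from expanding "!")
redis-cli -a 'YourStr0ng!P@ssw0rd#2024'
 
# Or send AUTH explicitly; client libraries issue the same command on connect
redis-cli AUTH 'YourStr0ng!P@ssw0rd#2024'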

Redis TLS (encryption in transit):

# redis.conf — Enable TLS
tls-port 6380
port 0                          # Disable non-TLS port
tls-cert-file /path/redis.crt
tls-key-file /path/redis.key
tls-ca-cert-file /path/ca.crt
tls-auth-clients yes            # Require client certificates (mTLS)

# Connect with TLS
redis-cli --tls --cert /path/client.crt --key /path/client.key --cacert /path/ca.crt -p 6380

Redis security checklist:

Enable requirepass or ACLs
Enable TLS (especially if Redis sits on a separate network)
Disable dangerous commands: rename-command FLUSHALL "", rename-command FLUSHDB "", rename-command CONFIG "" (one directive per line in redis.conf)
Bind a specific interface: bind 10.0.1.5 (do not bind 0.0.0.0)
Place Redis in a private subnet; do not expose it to the internet
Enable protected-mode yes (the default; it refuses external connections when no password is set)

4.4 Cache Timing Attacks

Problem: An attacker measures response times to infer cache state, and from that, sensitive information.

Example:

Request GET /api/users/hieu → 2ms (cache hit) → user “hieu” exists and was recently active
Request GET /api/users/nonexistent → 50ms (cache miss + DB) → the user does not exist or is rarely active
The attacker enumerates all usernames by measuring latency

Mitigations:

Constant-time response: add an artificial delay to cache hits so response times are uniform
Cache negative results: cache “not found” too → timings look the same
Rate limiting: limit enumeration attempts → Tuan-09-Rate-Limiter
Random padding: add a small random delay (0-5 ms jitter) to every response

5. DevOps — Redis Operations

5.1 Redis Docker Compose Cluster Setup

# docker-compose.redis-cluster.yml
# Redis Cluster 6 nodes: 3 masters + 3 replicas
version: "3.8"
 
services:
  redis-node-1:
    image: redis:7.2-alpine
    container_name: redis-node-1
    command: >
      redis-server
      --port 6379
      --cluster-enabled yes
      --cluster-config-file nodes.conf
      --cluster-node-timeout 5000
      --appendonly yes
      --requirepass "${REDIS_PASSWORD}"
      --masterauth "${REDIS_PASSWORD}"
      --maxmemory 256mb
      --maxmemory-policy allkeys-lru
    ports:
      - "6381:6379"
    volumes:
      - redis-data-1:/data
    networks:
      - redis-cluster-net
    deploy:
      resources:
        limits:
          memory: 512M
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "${REDIS_PASSWORD}", "ping"]
      interval: 10s
      timeout: 5s
      retries: 3
 
  redis-node-2:
    image: redis:7.2-alpine
    container_name: redis-node-2
    command: >
      redis-server
      --port 6379
      --cluster-enabled yes
      --cluster-config-file nodes.conf
      --cluster-node-timeout 5000
      --appendonly yes
      --requirepass "${REDIS_PASSWORD}"
      --masterauth "${REDIS_PASSWORD}"
      --maxmemory 256mb
      --maxmemory-policy allkeys-lru
    ports:
      - "6382:6379"
    volumes:
      - redis-data-2:/data
    networks:
      - redis-cluster-net
    deploy:
      resources:
        limits:
          memory: 512M
 
  redis-node-3:
    image: redis:7.2-alpine
    container_name: redis-node-3
    command: >
      redis-server
      --port 6379
      --cluster-enabled yes
      --cluster-config-file nodes.conf
      --cluster-node-timeout 5000
      --appendonly yes
      --requirepass "${REDIS_PASSWORD}"
      --masterauth "${REDIS_PASSWORD}"
      --maxmemory 256mb
      --maxmemory-policy allkeys-lru
    ports:
      - "6383:6379"
    volumes:
      - redis-data-3:/data
    networks:
      - redis-cluster-net
    deploy:
      resources:
        limits:
          memory: 512M
 
  redis-node-4:
    image: redis:7.2-alpine
    container_name: redis-node-4
    command: >
      redis-server
      --port 6379
      --cluster-enabled yes
      --cluster-config-file nodes.conf
      --cluster-node-timeout 5000
      --appendonly yes
      --requirepass "${REDIS_PASSWORD}"
      --masterauth "${REDIS_PASSWORD}"
      --maxmemory 256mb
      --maxmemory-policy allkeys-lru
    ports:
      - "6384:6379"
    volumes:
      - redis-data-4:/data
    networks:
      - redis-cluster-net
    deploy:
      resources:
        limits:
          memory: 512M
 
  redis-node-5:
    image: redis:7.2-alpine
    container_name: redis-node-5
    command: >
      redis-server
      --port 6379
      --cluster-enabled yes
      --cluster-config-file nodes.conf
      --cluster-node-timeout 5000
      --appendonly yes
      --requirepass "${REDIS_PASSWORD}"
      --masterauth "${REDIS_PASSWORD}"
      --maxmemory 256mb
      --maxmemory-policy allkeys-lru
    ports:
      - "6385:6379"
    volumes:
      - redis-data-5:/data
    networks:
      - redis-cluster-net
    deploy:
      resources:
        limits:
          memory: 512M
 
  redis-node-6:
    image: redis:7.2-alpine
    container_name: redis-node-6
    command: >
      redis-server
      --port 6379
      --cluster-enabled yes
      --cluster-config-file nodes.conf
      --cluster-node-timeout 5000
      --appendonly yes
      --requirepass "${REDIS_PASSWORD}"
      --masterauth "${REDIS_PASSWORD}"
      --maxmemory 256mb
      --maxmemory-policy allkeys-lru
    ports:
      - "6386:6379"
    volumes:
      - redis-data-6:/data
    networks:
      - redis-cluster-net
    deploy:
      resources:
        limits:
          memory: 512M
 
  # Cluster initializer: run once to create the cluster
  redis-cluster-init:
    image: redis:7.2-alpine
    container_name: redis-cluster-init
    depends_on:
      - redis-node-1
      - redis-node-2
      - redis-node-3
      - redis-node-4
      - redis-node-5
      - redis-node-6
    command: >
      sh -c "sleep 5 &&
      redis-cli -a ${REDIS_PASSWORD} --cluster create
      redis-node-1:6379 redis-node-2:6379 redis-node-3:6379
      redis-node-4:6379 redis-node-5:6379 redis-node-6:6379
      --cluster-replicas 1 --cluster-yes"
    networks:
      - redis-cluster-net
 
volumes:
  redis-data-1:
  redis-data-2:
  redis-data-3:
  redis-data-4:
  redis-data-5:
  redis-data-6:
 
networks:
  redis-cluster-net:
    driver: bridge

# Start up (single quotes keep an interactive shell from treating "!" as history expansion)
REDIS_PASSWORD='MyStr0ng!Pass' docker-compose -f docker-compose.redis-cluster.yml up -d
 
# Verify cluster
docker exec redis-node-1 redis-cli -a 'MyStr0ng!Pass' cluster info
docker exec redis-node-1 redis-cli -a 'MyStr0ng!Pass' cluster nodes

5.2 Prometheus Redis Exporter

# docker-compose.monitoring.yml (add to the stack above)
services:
  redis-exporter:
    image: oliver006/redis_exporter:latest
    container_name: redis-exporter
    environment:
      - REDIS_ADDR=redis://redis-node-1:6379
      - REDIS_PASSWORD=${REDIS_PASSWORD}
    ports:
      - "9121:9121"
    networks:
      - redis-cluster-net
 
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
    networks:
      - redis-cluster-net
 
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin  # default credential: change it before exposing port 3000
    ports:
      - "3000:3000"
    volumes:
      - grafana-data:/var/lib/grafana
    networks:
      - redis-cluster-net
 
volumes:
  grafana-data:

# prometheus.yml
global:
  scrape_interval: 15s
 
scrape_configs:
  - job_name: 'redis'
    static_configs:
      - targets: ['redis-exporter:9121']
    metrics_path: /metrics

5.3 Grafana Dashboard — Key Panels

Panel	PromQL Query	Alert Threshold
Hit Rate	`redis_keyspace_hits_total / (redis_keyspace_hits_total + redis_keyspace_misses_total)`	Warning < 80%, Critical < 60%
Hit Rate (rate)	`rate(redis_keyspace_hits_total[5m]) / (rate(redis_keyspace_hits_total[5m]) + rate(redis_keyspace_misses_total[5m]))`	More accurate for dashboards: reflects the last 5 minutes instead of the lifetime average
Memory Used	`redis_memory_used_bytes / redis_memory_max_bytes`	Warning > 80%, Critical > 90%
Memory Fragmentation	`redis_mem_fragmentation_ratio`	Warning > 1.5 (fragmented), Alert < 1.0 (swapping!)
Connected Clients	`redis_connected_clients`	Warning > 80% of `maxclients`
Evicted Keys	`rate(redis_evicted_keys_total[5m])`	Warning > 0 (means cache too small)
Ops/sec	`rate(redis_commands_processed_total[1m])`	Monitor trend
Slow Log	`redis_slowlog_length`	Warning > 10
Blocked Clients	`redis_blocked_clients`	Warning > 0
Replication Health	`redis_connected_slaves` + `redis_replication_backlog_active`	Alert if a replica is disconnected (these metrics show connectivity, not lag)
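The two hit-rate rows in the table differ only in what they average over. A minimal Python sketch of that arithmetic, assuming two readings of the `keyspace_hits` / `keyspace_misses` counters; the class and function names are illustrative and not part of any exporter API:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyspaceSample:
    """One reading of the keyspace_hits / keyspace_misses counters."""
    hits: int
    misses: int


def lifetime_hit_ratio(sample: KeyspaceSample) -> float:
    """hits / (hits + misses) accumulated since the counters started."""
    total = sample.hits + sample.misses
    return sample.hits / total if total else 0.0


def windowed_hit_ratio(earlier: KeyspaceSample, later: KeyspaceSample) -> float:
    """Hit ratio for the traffic between two samples (the idea behind rate())."""
    hit_delta = later.hits - earlier.hits
    miss_delta = later.misses - earlier.misses
    total = hit_delta + miss_delta
    return hit_delta / total if total else 0.0


if __name__ == "__main__":
    before = KeyspaceSample(hits=900_000, misses=100_000)
    # During the window the cache served 2,000 hits and 3,000 misses.
    after = KeyspaceSample(hits=902_000, misses=103_000)
    print(f"lifetime ratio: {lifetime_hit_ratio(after):.3f}")
    print(f"window ratio:   {windowed_hit_ratio(before, after):.3f}")
```

In this made-up example the lifetime ratio still looks healthy while the recent window has dropped to 40%, which is why the rate-based query is the one used for alerting. The sketch ignores counter resets after a restart, which Prometheus `rate()` handles on its own.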

# prometheus-alerts.yml — Redis alerts
groups:
  - name: redis_alerts
    rules:
      - alert: RedisHitRateLow
        expr: >
          rate(redis_keyspace_hits_total[5m])
          / (rate(redis_keyspace_hits_total[5m]) + rate(redis_keyspace_misses_total[5m]))
          < 0.8
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Redis cache hit rate below 80% ({{ $value | humanizePercentage }})"
          description: "Investigate: cold start? key pattern change? insufficient memory?"
 
      - alert: RedisMemoryHigh
        expr: redis_memory_used_bytes / redis_memory_max_bytes > 0.9
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Redis memory usage above 90%"
          description: "Evictions will start. Scale up memory or review TTL policies."
 
      - alert: RedisEvictions
        expr: rate(redis_evicted_keys_total[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Redis is evicting keys — cache is full"
 
      - alert: RedisDown
        expr: redis_up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Redis instance is down!"

5.4 Redis-CLI Debugging Commands

# === General information ===
redis-cli INFO                      # Full server info
redis-cli INFO memory               # Memory details
redis-cli INFO stats                 # Hit/miss stats
redis-cli INFO replication           # Master/replica status
redis-cli INFO keyspace              # Number of keys per DB

# === Memory debugging ===
redis-cli MEMORY USAGE mykey         # Memory used by one specific key
redis-cli MEMORY DOCTOR              # Redis memory health check
redis-cli DBSIZE                     # Total number of keys in the current DB

# === Performance debugging ===
redis-cli SLOWLOG GET 10             # 10 most recent slow log entries (not the 10 slowest)
redis-cli SLOWLOG RESET              # Reset slow log
redis-cli LATENCY LATEST             # Latest latency events (needs latency-monitor-threshold > 0)
redis-cli LATENCY HISTORY event      # History for one latency event name
redis-cli --latency                  # Realtime latency monitoring
redis-cli --latency-history          # Continuous latency sampling over time
redis-cli --bigkeys                  # Scan for the largest keys (per type)
redis-cli --memkeys                  # Scan for the keys using the most memory
redis-cli --hotkeys                  # Most frequently accessed keys (requires an LFU maxmemory-policy)

# === Cluster debugging ===
redis-cli CLUSTER INFO               # Cluster state
redis-cli CLUSTER NODES              # All nodes + role + slot range
redis-cli CLUSTER SLOTS              # Slot distribution (deprecated since Redis 7.0; prefer CLUSTER SHARDS)
redis-cli CLUSTER KEYSLOT mykey      # Which slot does this key map to?

# === Per-key inspection ===
redis-cli OBJECT ENCODING mykey      # Encoding type (listpack, hashtable, etc.; ziplist before Redis 7)
redis-cli OBJECT FREQ mykey          # Access frequency (only with an LFU policy)
redis-cli OBJECT IDLETIME mykey      # Idle time in seconds (only with a non-LFU policy)

# === Dangerous: use ONLY when debugging dev/staging ===
redis-cli MONITOR                    # Realtime stream of ALL commands (heavy, do NOT use in production)
redis-cli DEBUG SLEEP 0              # Blocks the server for N seconds (0 = no-op round trip)

# === SCAN instead of KEYS (production safe) ===
redis-cli SCAN 0 MATCH "user:*" COUNT 100   # Iterate keys in small batches (non-blocking)
# DO NOT USE: redis-cli KEYS "user:*"       # BLOCKS the server, DANGEROUS
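The CLUSTER KEYSLOT command listed above can be reproduced offline. Redis Cluster maps a key to CRC16(key) mod 16384 and hashes only the hash-tag part when the key contains a non-empty `{...}` section. The sketch below assumes Python's `binascii.crc_hqx` with an initial value of 0 as the CRC16 variant; the function names are illustrative:

```python
import binascii

TOTAL_SLOTS = 16384


def hashed_part(key: bytes) -> bytes:
    """Return the hash-tag content if the key has a non-empty {...}, else the whole key."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:  # "{}" with nothing inside is ignored
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    """Slot number in [0, 16383] that a Redis Cluster would assign to this key."""
    return binascii.crc_hqx(hashed_part(key.encode("utf-8")), 0) % TOTAL_SLOTS


if __name__ == "__main__":
    print(key_slot("foo"))                      # 12182
    # Keys sharing a hash tag land in the same slot, so multi-key commands work on them.
    print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))  # True
```

Hash tags are the usual way to keep related keys (for example everything belonging to one user) on the same node, at the cost of less even slot distribution.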

Rule #1: NEVER run KEYS * in production. Use SCAN instead. KEYS is O(N) and blocks all of Redis while it runs.

Rule #2: NEVER leave MONITOR running in production. It streams EVERY command, which can cut throughput by 50% or more.
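Rule #1 in application code: a sketch of the cursor loop that SCAN expects, assuming a redis-py style client whose `scan(cursor=..., match=..., count=...)` returns a `(next_cursor, keys)` pair. `iter_keys` is a hypothetical helper name, not a library function:

```python
from typing import Any, Iterator


def iter_keys(client: Any, pattern: str, batch_hint: int = 100) -> Iterator[str]:
    """Yield keys matching `pattern` one SCAN batch at a time.

    Each call to scan() does a small, bounded amount of work on the server,
    so other clients are served between batches. The iteration is complete
    when the server hands back cursor 0.
    """
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=batch_hint)
        yield from keys
        if cursor == 0:
            break
```

With redis-py this could be called as `iter_keys(redis.Redis(decode_responses=True), "user:*")`. COUNT is only a hint for the batch size, and SCAN may return a key more than once while the keyspace is changing, so callers that need uniqueness should de-duplicate.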

6. Code Examples

6.1 Node.js — Cache-Aside Pattern with Redis

// cache-aside.js — Production-ready cache-aside pattern
const Redis = require('ioredis');
const { Pool } = require('pg');
 
// === Redis connection with retry ===
const redis = new Redis({
  host: process.env.REDIS_HOST || 'localhost',
  port: parseInt(process.env.REDIS_PORT || '6379'),
  password: process.env.REDIS_PASSWORD,
  retryStrategy(times) {
    const delay = Math.min(times * 50, 2000);
    return delay; // Linear backoff: 50ms more per attempt, capped at 2s
  },
  maxRetriesPerRequest: 3,
  enableReadyCheck: true,
  lazyConnect: true,
});
 
// === PostgreSQL connection pool ===
const db = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,
});
 
// === Cache-Aside Implementation ===
class CacheAside {
  /**
   * @param {string} prefix - Cache key prefix (e.g., "product", "user")
   * @param {number} ttlSeconds - Time-to-live in seconds
   * @param {number} ttlJitter - Random jitter to prevent cache avalanche
   */
  constructor(prefix, ttlSeconds = 3600, ttlJitter = 300) {
    this.prefix = prefix;
    this.ttlSeconds = ttlSeconds;
    this.ttlJitter = ttlJitter;
  }
 
  _key(id) {
    return `${this.prefix}:${id}`;
  }
 
  _ttl() {
    // Staggered TTL: base + random jitter, so keys written together do not all expire together (cache avalanche)
    return this.ttlSeconds + Math.floor(Math.random() * this.ttlJitter);
  }
 
  /**
   * GET — Cache-Aside read pattern
   * 1. Check cache → 2. If miss, query DB → 3. Populate cache
   */
  async get(id, dbQueryFn) {
    const key = this._key(id);
 
    // Step 1: Check cache
    try {
      const cached = await redis.get(key);
      if (cached !== null) {
        // Cache HIT
        return JSON.parse(cached);
      }
    } catch (err) {
      // Cache failure → fallback to DB (cache should not be SPOF)
      console.error(`Cache read error for ${key}:`, err.message);
    }
 
    // Step 2: Cache MISS → Query DB
    const data = await dbQueryFn(id);
 
    if (data === null || data === undefined) {
      // Cache null result to prevent cache penetration
      // Shorter TTL for null results
      try {
        await redis.set(key, JSON.stringify(null), 'EX', 300);
      } catch (err) {
        console.error(`Cache write null error for ${key}:`, err.message);
      }
      return null;
    }
 
    // Step 3: Populate cache
    try {
      await redis.set(key, JSON.stringify(data), 'EX', this._ttl());
    } catch (err) {
      console.error(`Cache write error for ${key}:`, err.message);
    }
 
    return data;
  }
 
  /**
   * INVALIDATE — Delete cache entry when data changes
   */
  async invalidate(id) {
    const key = this._key(id);
    try {
      await redis.del(key);
    } catch (err) {
      console.error(`Cache invalidate error for ${key}:`, err.message);
    }
  }
 
  /**
   * WRITE — Update DB then invalidate cache (not update!)
   */
  async write(id, data, dbWriteFn) {
    // Step 1: Write to DB first (source of truth)
    await dbWriteFn(id, data);
 
    // Step 2: Invalidate cache (delete, NOT update)
    // Next read will fetch fresh data from DB
    await this.invalidate(id);
  }
}
 
// === Usage Example ===
const productCache = new CacheAside('product', 3600, 300);
 
// GET product — Cache-Aside
async function getProduct(productId) {
  return productCache.get(productId, async (id) => {
    const result = await db.query(
      'SELECT * FROM products WHERE id = $1',
      [id]
    );
    return result.rows[0] || null;
  });
}
 
// UPDATE product — Invalidate cache
async function updateProduct(productId, updateData) {
  await productCache.write(productId, updateData, async (id, data) => {
    await db.query(
      'UPDATE products SET name = $1, price = $2, updated_at = NOW() WHERE id = $3',
      [data.name, data.price, id]
    );
  });
}
 
module.exports = { CacheAside, getProduct, updateProduct };
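The effect of the jitter in `_ttl()` above is easier to see with numbers. A small Python sketch, assuming 10,000 keys written at the same moment with a 3600-second base TTL and up to 300 seconds of jitter; the function names are illustrative:

```python
import random
from collections import Counter
from typing import Iterable


def jittered_ttl(base_seconds: int, jitter_seconds: int, rng: random.Random) -> int:
    """Base TTL plus a uniform random offset in [0, jitter_seconds)."""
    if jitter_seconds <= 0:
        return base_seconds
    return base_seconds + rng.randrange(jitter_seconds)


def peak_expirations_per_second(ttls: Iterable[int]) -> int:
    """Largest number of keys that expire within the same second."""
    return max(Counter(ttls).values())


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    fixed = [3600] * 10_000
    spread = [jittered_ttl(3600, 300, rng) for _ in range(10_000)]
    print("fixed TTL, worst second:   ", peak_expirations_per_second(fixed))
    print("jittered TTL, worst second:", peak_expirations_per_second(spread))
```

Without jitter all 10,000 keys expire in the same second and the DB sees every miss at once. With jitter the same expirations are spread over a five-minute window, averaging about 33 per second.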

6.2 Python — Write-Through Example

"""
write_through_cache.py: Write-Through pattern implementation.
On every write, data goes to both the DB and the cache within the same synchronous call.
"""

import json
import logging
import os
from typing import Callable, Optional

import redis
import psycopg2
from psycopg2.extras import RealDictCursor

logger = logging.getLogger(__name__)

# === Connections ===
# Credentials come from the environment (same variable names as the Node.js example).
redis_client = redis.Redis(
    host="localhost",
    port=6379,
    password=os.environ.get("REDIS_PASSWORD"),
    decode_responses=True,
    socket_connect_timeout=5,
    socket_timeout=2,
    retry_on_timeout=True,
)

db_conn = psycopg2.connect(
    dsn=os.environ.get("DATABASE_URL", "postgresql://user:pass@localhost:5432/mydb"),
    cursor_factory=RealDictCursor,
)
 
 
class WriteThroughCache:
    """
    Write-Through: every write goes to both the DB and the cache.
    Reads hit the cache except on cold start, TTL expiry, or eviction.
    """
 
    def __init__(
        self,
        prefix: str,
        ttl_seconds: int = 3600,
    ):
        self.prefix = prefix
        self.ttl_seconds = ttl_seconds
 
    def _key(self, entity_id: str) -> str:
        return f"{self.prefix}:{entity_id}"
 
    def get(self, entity_id: str, db_fallback: Callable) -> Optional[dict]:
        """Read: cache first, fallback to DB on miss."""
        key = self._key(entity_id)
 
        try:
            cached = redis_client.get(key)
            if cached is not None:
                logger.debug(f"Cache HIT: {key}")
                return json.loads(cached)
        except redis.RedisError as e:
            logger.warning(f"Cache read failed: {e}")
 
        # Cache miss → query DB
        logger.debug(f"Cache MISS: {key}")
        data = db_fallback(entity_id)
 
        if data is not None:
            try:
                redis_client.setex(key, self.ttl_seconds, json.dumps(data, default=str))
            except redis.RedisError as e:
                logger.warning(f"Cache write failed: {e}")
 
        return data
 
    def write(
        self,
        entity_id: str,
        data: dict,
        db_write_fn: Callable,
    ) -> dict:
        """
        Write-Through: write to the DB, then to the cache, in the same call.
        If the DB write fails, the exception propagates and the cache is NOT written.
        If the cache write fails, a warning is logged; the DB still holds the correct data.
        """
        key = self._key(entity_id)

        # Step 1: Write to DB FIRST (source of truth)
        db_write_fn(entity_id, data)
        logger.info(f"DB write success: {entity_id}")

        # Step 2: Write to cache (synchronous, but a failure here is non-critical)
        try:
            redis_client.setex(key, self.ttl_seconds, json.dumps(data, default=str))
            logger.info(f"Cache write success: {key}")
        except redis.RedisError as e:
            # A cache write failure does not fail the request.
            # Caveat: if an older entry for this key is still cached, reads return
            # that stale value until its TTL expires. Otherwise the next read
            # misses, queries the DB, and repopulates the cache.
            logger.warning(f"Cache write failed (non-critical): {e}")
 
        return data
 
    def delete(self, entity_id: str, db_delete_fn: Callable) -> None:
        """Delete from both DB and cache."""
        db_delete_fn(entity_id)
 
        key = self._key(entity_id)
        try:
            redis_client.delete(key)
        except redis.RedisError as e:
            logger.warning(f"Cache delete failed: {e}")
 
 
# === Usage Example ===
user_cache = WriteThroughCache(prefix="user", ttl_seconds=1800)
 
 
def get_user(user_id: str) -> Optional[dict]:
    def db_fallback(uid: str) -> Optional[dict]:
        with db_conn.cursor() as cur:
            cur.execute("SELECT * FROM users WHERE id = %s", (uid,))
            return cur.fetchone()
 
    return user_cache.get(user_id, db_fallback)
 
 
def update_user(user_id: str, name: str, email: str) -> dict:
    data = {"id": user_id, "name": name, "email": email}
 
    def db_write(uid: str, d: dict):
        with db_conn.cursor() as cur:
            cur.execute(
                "UPDATE users SET name = %s, email = %s WHERE id = %s",
                (d["name"], d["email"], uid),
            )
            db_conn.commit()
 
    return user_cache.write(user_id, data, db_write)
 
 
if __name__ == "__main__":
    # Write-through: writes to both DB and cache
    update_user("1001", "Hieu", "hieu@example.com")

    # Read: served from the cache, because the write-through just populated it
    user = get_user("1001")
    print(f"User: {user}")

6.3 Cache Stampede Prevention — Mutex Lock (Node.js)

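// cache-stampede-mutex.js
// Prevent the thundering herd: only one request queries the DB; the rest wait.

const Redis = require('ioredis');
const { Pool } = require('pg');

const redis = new Redis({ host: 'localhost', port: 6379 });
const db = new Pool({ connectionString: process.env.DATABASE_URL }); // used by the usage example at the bottom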
 
/**
 * Cache get with mutex lock to prevent stampede.
 *
 * On a cache miss:
 * - The first request acquires the lock → queries the DB → populates the cache → releases the lock
 * - Other requests see the lock → wait → retry reading the cache
 *
 * @param {string} key - Cache key
 * @param {Function} fetchFn - Function that queries the DB
 * @param {number} ttl - Cache TTL in seconds
 * @param {number} lockTtl - Lock TTL in seconds (safety net)
 * @param {number} maxRetries - Max times to retry reading cache
 * @param {number} retryDelay - Delay between retries in ms
 */
async function cacheGetWithMutex(
  key,
  fetchFn,
  ttl = 3600,
  lockTtl = 10,
  maxRetries = 20,
  retryDelay = 100
) {
  // Step 1: Try cache
  const cached = await redis.get(key);
  if (cached !== null) {
    return JSON.parse(cached);
  }
 
  // Step 2: Cache miss → try to acquire lock
  const lockKey = `lock:${key}`;
  // Unique lock owner token. pid + timestamp can collide across containers (pid 1 is common), so use a UUID.
  const lockValue = require('crypto').randomUUID();
  const acquired = await redis.set(lockKey, lockValue, 'EX', lockTtl, 'NX');
 
  if (acquired === 'OK') {
    // === LOCK ACQUIRED: This thread queries DB ===
    try {
      const data = await fetchFn();
 
      // Populate cache
      if (data !== null && data !== undefined) {
        await redis.set(key, JSON.stringify(data), 'EX', ttl);
      } else {
        // Cache null to prevent penetration
        await redis.set(key, JSON.stringify(null), 'EX', 300);
      }
 
      return data;
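    } finally {
      // Release the lock only if we still own it. GET followed by DEL is not atomic:
      // the lock could expire and be re-acquired by another request in between,
      // so the compare-and-delete runs as a single Lua script on the server.
      await redis.eval(
        "if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('del', KEYS[1]) else return 0 end",
        1,
        lockKey,
        lockValue
      );
    }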
  }
 
  // === LOCK NOT ACQUIRED: Another thread is fetching ===
  // Wait and retry reading cache
  for (let i = 0; i < maxRetries; i++) {
    await new Promise((resolve) => setTimeout(resolve, retryDelay));
 
    const retryResult = await redis.get(key);
    if (retryResult !== null) {
      return JSON.parse(retryResult);
    }
  }
 
  // Fallback: if lock holder crashed, fetch directly
  // (Lock will auto-expire due to lockTtl)
  const fallbackData = await fetchFn();
  if (fallbackData != null) { // != null also excludes undefined, matching the check in the lock path
    await redis.set(key, JSON.stringify(fallbackData), 'EX', ttl);
  }
  return fallbackData;
}
 
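// === Alternative: Redlock for a distributed mutex ===
// A lock held on a single Redis master can be lost if that master fails over
// before the lock key reaches its replica. The Redlock algorithm reduces this
// risk by acquiring the lock on a majority of independent Redis masters
// (redisA/redisB/redisC below are hypothetical clients for three such masters):
//
// const Redlock = require('redlock');
// const redlock = new Redlock([redisA, redisB, redisC], { retryCount: 3 });
// const lock = await redlock.acquire([`lock:${key}`], lockTtl * 1000);
// try { ... } finally { await lock.release(); }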
 
// === Usage ===
async function getPopularProduct(productId) {
  return cacheGetWithMutex(
    `product:${productId}`,
    async () => {
      // Simulate DB query
      const result = await db.query(
        'SELECT * FROM products WHERE id = $1',
        [productId]
      );
      return result.rows[0] || null;
    },
    3600,   // cache TTL: 1 hour
    10,     // lock TTL: 10 seconds (safety)
    20,     // max retries
    100     // retry delay: 100ms
  );
}
 
module.exports = { cacheGetWithMutex };
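The same "one fetcher, everyone else waits" idea can be tried without a Redis server. The sketch below is an in-process analogue using a per-key `threading.Lock` and a double-check after acquiring it. It is not the Redis lock from the code above, it has no TTL, and `SingleFlightCache` is a hypothetical name:

```python
import threading
import time
from typing import Any, Callable, Dict


class SingleFlightCache:
    """In-memory cache where concurrent misses on one key trigger a single fetch."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        self._key_locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()  # protects _key_locks itself

    def _lock_for(self, key: str) -> threading.Lock:
        with self._guard:
            return self._key_locks.setdefault(key, threading.Lock())

    def get(self, key: str, fetch: Callable[[], Any]) -> Any:
        if key in self._values:           # fast path: cache hit
            return self._values[key]
        with self._lock_for(key):         # one thread per key passes at a time
            if key in self._values:       # filled by another thread while we waited
                return self._values[key]
            value = fetch()               # only the first thread reaches the "DB"
            self._values[key] = value
            return value


if __name__ == "__main__":
    db_calls = []

    def slow_db_query() -> dict:
        time.sleep(0.05)                  # stand-in for a slow query
        db_calls.append(1)
        return {"id": 1001, "name": "iPhone"}

    cache = SingleFlightCache()
    workers = [
        threading.Thread(target=cache.get, args=("product:1001", slow_db_query))
        for _ in range(20)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print("DB queries:", len(db_calls))   # 1
```

Inside one process this replaces polling with blocking on the lock. Across several application servers the lock has to live in shared storage, which is what the `SET ... NX EX` approach above provides.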

7. System Design Diagrams

7.1 Cache-Aside Flow Diagram

flowchart TD
    Client([Client Request]) --> App[Application Server]

    App --> CacheCheck{Check Cache<br/>GET key}

    CacheCheck -->|HIT| ReturnCached[Return cached data<br/>⚡ ~1ms]
    ReturnCached --> Response([Response to Client])

    CacheCheck -->|MISS| QueryDB[Query Database<br/>🐢 ~5-50ms]
    QueryDB --> DataExists{Data exists?}

    DataExists -->|Yes| WriteCache[Write to Cache<br/>SET key value EX ttl]
    WriteCache --> ReturnFresh[Return fresh data]
    ReturnFresh --> Response

    DataExists -->|No| CacheNull[Cache NULL<br/>SET key NULL EX 300<br/>Prevent penetration]
    CacheNull --> Return404[Return 404]
    Return404 --> Response

    subgraph "Write Path (Invalidation)"
        WriteReq([Write Request]) --> UpdateDB[Update Database]
        UpdateDB --> InvalidateCache[DELETE cache key]
        InvalidateCache --> WriteResp([Write Response])
    end

    style ReturnCached fill:#4caf50,stroke:#333,color:#fff
    style QueryDB fill:#ff9800,stroke:#333,color:#fff
    style CacheNull fill:#f44336,stroke:#333,color:#fff
    style InvalidateCache fill:#e91e63,stroke:#333,color:#fff

7.2 Multi-Level Cache Architecture

flowchart TD
    Browser([Browser / Mobile App])

    Browser --> CDN{L3: CDN Edge Cache<br/>CloudFront / Cloudflare<br/>TTL: 1h - 7d}

    CDN -->|HIT| CDNResponse([Response ~5-20ms<br/>from nearest edge])
    CDN -->|MISS| LB[Load Balancer]

    LB --> AppServer[Application Server]

    AppServer --> L1{L1: Local Cache<br/>Caffeine / Node LRU<br/>TTL: 30s - 5min<br/>Size: 100MB - 1GB}

    L1 -->|HIT| L1Response([Response ~0.1ms<br/>in-process memory])
    L1 -->|MISS| L2{L2: Redis Cluster<br/>Distributed Cache<br/>TTL: 5min - 24h<br/>Size: 10GB - 1TB}

    L2 -->|HIT| L2Response[Response ~1ms<br/>network + memory]
    L2Response --> PopulateL1[Populate L1]
    PopulateL1 --> L1Response

    L2 -->|MISS| DB[(Database<br/>Source of Truth<br/>PostgreSQL / MySQL)]

    DB --> DBResponse[Response ~5-50ms]
    DBResponse --> PopulateL2[Populate L2 Redis]
    PopulateL2 --> PopulateL1_2[Populate L1]
    PopulateL1_2 --> FinalResponse([Response to Client])

    subgraph "Invalidation Flow"
        direction LR
        DataChange[Data Changed] --> DeleteL2[Delete L2 Redis]
        DeleteL2 --> L1Expires[L1 expires via short TTL<br/>or Pub/Sub broadcast]
        L1Expires --> CDNPurge[CDN Purge API<br/>if applicable]
    end

    style CDN fill:#2196f3,stroke:#333,color:#fff
    style L1 fill:#4caf50,stroke:#333,color:#fff
    style L2 fill:#ff9800,stroke:#333,color:#fff
    style DB fill:#9c27b0,stroke:#333,color:#fff

7.3 Cache Stampede — Mutex Lock Flow

sequenceDiagram
    participant R1 as Request 1
    participant R2 as Request 2
    participant R3 as Request 3
    participant Cache as Redis Cache
    participant Lock as Redis Lock
    participant DB as Database

    Note over Cache: Cache key expired (TTL=0)

    R1->>Cache: GET product:1001
    Cache-->>R1: MISS

    R2->>Cache: GET product:1001
    Cache-->>R2: MISS

    R3->>Cache: GET product:1001
    Cache-->>R3: MISS

    R1->>Lock: SET lock:product:1001 NX EX 10
    Lock-->>R1: OK (acquired!)

    R2->>Lock: SET lock:product:1001 NX EX 10
    Lock-->>R2: FAIL (already locked)

    R3->>Lock: SET lock:product:1001 NX EX 10
    Lock-->>R3: FAIL (already locked)

    Note over R2,R3: Wait & retry...

    R1->>DB: SELECT * FROM products WHERE id=1001
    DB-->>R1: {name: "iPhone", price: 999}

    R1->>Cache: SET product:1001 {...} EX 3600
    R1->>Lock: DEL lock:product:1001

    R2->>Cache: GET product:1001 (retry)
    Cache-->>R2: HIT! {name: "iPhone", price: 999}

    R3->>Cache: GET product:1001 (retry)
    Cache-->>R3: HIT! {name: "iPhone", price: 999}

    Note over R1,R3: Only 1 DB query instead of 3!

8. Aha Moments & Pitfalls

Aha Moments

#1: Cache invalidation is one of the hardest problems in CS. The famous line comes from Phil Karlton, who worked with Marc Andreessen at Netscape. The reason: every instance and every cache layer has to stay consistent, in a distributed system with network partitions, race conditions, and clock skew. There is no perfect solution, only trade-offs.

#2: “Caching everything” is an anti-pattern. If you cache 100% of the data, cache size equals DB size and you pay for the dataset twice, the second time in RAM. Pareto 80/20: caching the hottest 20% of the data typically yields about an 80% hit rate. Add cache memory only when monitoring shows the hit rate below 80%.

#3: Stale data is better than no data. In many use cases (social media feeds, product catalogs, news), returning data that is 30 seconds old beats a 500 error or a 5-second latency. Eventual consistency is acceptable for most systems.

#4: Cache is not only for the DB. A cache can sit in front of any slow operation: third-party API calls, complex computations (ML inference), file system reads, DNS lookups. If an operation takes more than 10 ms and its result is reusable, consider caching it.

#5: Cold start problem. After a fresh deploy or a cache crash, the hit rate is 0%, all traffic lands on the DB, and that can trigger a cascading failure. Solutions: run a cache warming script before routing traffic, or roll out gradually (shift 10% of traffic first, wait for the cache to warm, then ramp up).

#6: Redis is single-threaded but fast enough. Redis executes commands on a single thread and still handles 100K+ ops/s on one core, thanks to I/O multiplexing (epoll) and in-memory operations. The bottleneck is usually the network, not the CPU. A cluster is only needed when the data exceeds one node's memory or you need more than 100K ops/s.
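The cache warming script mentioned in #5 can be as small as a loop over the known hot IDs. A sketch, assuming the cache behaves like a mapping and the list of hot IDs comes from somewhere such as access logs; `warm_cache` and its parameters are hypothetical names:

```python
from typing import Any, Callable, Hashable, Iterable, MutableMapping, Optional


def warm_cache(
    cache: MutableMapping[str, Any],
    hot_ids: Iterable[Hashable],
    load_from_db: Callable[[Hashable], Optional[Any]],
    key_prefix: str = "product",
) -> int:
    """Pre-load hot entries before traffic arrives. Returns how many were added."""
    warmed = 0
    for entity_id in hot_ids:
        key = f"{key_prefix}:{entity_id}"
        if key in cache:              # already warm, skip the DB round trip
            continue
        row = load_from_db(entity_id)
        if row is not None:           # do not store placeholders for missing rows
            cache[key] = row
            warmed += 1
    return warmed


if __name__ == "__main__":
    fake_db = {1: {"name": "iPhone"}, 2: {"name": "Pixel"}}
    cache: dict = {"product:1": {"name": "iPhone"}}
    added = warm_cache(cache, hot_ids=[1, 2, 3], load_from_db=fake_db.get)
    print(added, sorted(cache))
```

A real script would write to Redis with a TTL and throttle its DB reads so the warm-up does not itself overload the database.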

Pitfalls: Common Mistakes

Pitfall 1: Updating the cache instead of deleting

Wrong: when the DB changes, update the cache entry. Right: delete the cache entry, so the next read fetches fresh data. Why: updating the cache has a race condition (see Section 2.13). Deleting is safer because the worst case is a single cache miss.
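The race behind Pitfall 1 can be replayed deterministically. The sketch below hard-codes one unlucky interleaving of two writers (A's cache step arrives after B's) and compares the two strategies. Plain dicts stand in for the DB and the cache, and all names are illustrative:

```python
from typing import Dict, Tuple


def replay(strategy: str) -> Tuple[int, int]:
    """Replay writers A (price=100) and B (price=200); return (db_value, value_read)."""
    db: Dict[str, int] = {}
    cache: Dict[str, int] = {"price": 50}  # old cached value

    def touch_cache(value: int) -> None:
        if strategy == "update":
            cache["price"] = value         # write the new value into the cache
        else:
            cache.pop("price", None)       # invalidate only

    # Unlucky interleaving: B commits last, but A's cache step runs last.
    db["price"] = 100      # writer A: DB write
    db["price"] = 200      # writer B: DB write
    touch_cache(200)       # writer B: cache step
    touch_cache(100)       # writer A: cache step (delayed)

    # Cache-aside read that follows the two writes.
    if "price" not in cache:
        cache["price"] = db["price"]
    return db["price"], cache["price"]


if __name__ == "__main__":
    print("update:", replay("update"))   # (200, 100)
    print("delete:", replay("delete"))   # (200, 200)
```

With "update" the cache ends up holding A's value although the DB holds B's, and it stays wrong until the TTL expires. With "delete" the order of the two cache steps does not matter. Delete-based invalidation still has a narrower race between a concurrent reader and a writer, which is why a TTL remains useful as a backstop.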

Pitfall 2: Cache without TTL

Wrong: SET key value (no TTL). Right: SET key value EX 3600 (always set a TTL). Why: without a TTL an entry lives until it is evicted or explicitly deleted, so one missed invalidation leaves stale data in place indefinitely. TTL is the last safety net for consistency.

Pitfall 3: Not handling cache failure gracefully

Wrong: cache down, whole application crashes. Right: cache down, fall back to the DB (slower but still working). Why: a cache is an optimization, not a requirement. The application must be able to run without it, only slower.

Pitfall 4: KEYS command in production

Wrong: redis-cli KEYS "user:*" to find keys. Right: redis-cli SCAN 0 MATCH "user:*" COUNT 100. Why: KEYS is O(N) and blocks Redis completely. With 10M keys it can block for 5-10 seconds, and every other operation times out.

Pitfall 5: Over-caching (caching data that should not be cached)

Wrong: cache everything, including data that changes every second. Right: cache only data with a high read:write ratio (> 5:1). Why: if data changes faster than the TTL, the cache is always stale and the invalidation overhead outweighs the benefit.

Pitfall 6: Ignoring serialization cost

Wrong: caching large objects (10 MB JSON) without considering serialize/deserialize cost. Right: profile serialization time, and use an efficient format (MessagePack, Protobuf) instead of JSON for large objects. Why: JSON.parse/stringify on 10 MB can take 50-100 ms, on par with a DB query, at which point the cache no longer pays off.
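"Profile serialization time" takes only a few lines. A sketch using Python's standard `json` and `time.perf_counter`; the payload shape is invented for illustration, and the timings depend entirely on the machine and the data:

```python
import json
import time
from typing import Any, Dict


def measure_json_round_trip(obj: Any) -> Dict[str, float]:
    """Time json.dumps and json.loads for one object; report payload size in bytes."""
    t0 = time.perf_counter()
    payload = json.dumps(obj)
    t1 = time.perf_counter()
    json.loads(payload)
    t2 = time.perf_counter()
    return {
        "bytes": len(payload.encode("utf-8")),
        "dumps_ms": (t1 - t0) * 1000,
        "loads_ms": (t2 - t1) * 1000,
    }


if __name__ == "__main__":
    # Hypothetical catalog with 100,000 rows, a few megabytes once serialized.
    catalog = [
        {"id": i, "name": f"product-{i}", "price": 9.99, "tags": ["a", "b", "c"]}
        for i in range(100_000)
    ]
    stats = measure_json_round_trip(catalog)
    print(f"{stats['bytes'] / 1e6:.1f} MB, "
          f"dumps {stats['dumps_ms']:.1f} ms, loads {stats['loads_ms']:.1f} ms")
```

If the round trip comes close to the latency of the query being cached, the usual options are caching smaller slices of the object or switching to a more compact format.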

Pitfall 7: Not monitoring cache metrics

Wrong: deploy Redis and forget about it. Right: continuously monitor hit rate, memory, evictions, and latency (see Tuan-13-Monitoring-Observability). Why: a cache is a living system. Traffic patterns change and data grows, so the cache config has to adapt.

9. Internal Links

Link	Relationship
Tuan-01-Scale-From-Zero-To-Millions	Cache is a core component when scaling from one server to many
Tuan-02-Back-of-the-envelope	Cache sizing estimation, hit ratio calculation, cost analysis
Tuan-03-Networking-DNS-CDN	CDN is the L3 cache layer for static content; DNS caching
Tuan-05-Load-Balancer	The load balancer distributes requests; the cache reduces load on the backend
Tuan-07-Database-Sharding-Replication	Read replicas reduce read load; a cache reduces it more effectively
Tuan-08-Message-Queue	Event-driven cache invalidation via message queue (CDC)
Tuan-09-Rate-Limiter	Redis used for rate limiting (INCR + EXPIRE); helps guard against cache penetration
Tuan-10-Consistent-Hashing	Redis Cluster uses hash slots (similar to consistent hashing)
Tuan-13-Monitoring-Observability	Monitor cache hit rate, memory, evictions with Prometheus + Grafana
Tuan-15-Data-Security-Encryption	Encryption at rest for cached PII, Redis AUTH + TLS
Tuan-16-Design-URL-Shortener	URL shortener uses a Redis cache for hot URLs (real-world example)
Tuan-17-Design-Chat-System	Chat system uses a cache for user presence and recent messages

References

Alex Xu, System Design Interview, Chapter 6: Design a Key-Value Store (cache patterns)
Alex Xu, System Design Interview Vol. 2 (cache strategies throughout)
Redis Documentation: redis.io/docs
Martin Kleppmann, Designing Data-Intensive Applications, Chapter 5 (Replication), Chapter 12 (The Future of Data Systems, in Part III: Derived Data)
AWS ElastiCache Best Practices
Facebook TAO paper: TAO: Facebook’s Distributed Data Store for the Social Graph
Tuan-05-Load-Balancer: previous component in the series
Tuan-07-Database-Sharding-Replication: next component in the series

Next week: Tuan-07-Database-Sharding-Replication. When one database is not enough, divide and conquer.
