Case Study: Design a Unique ID Generator in Distributed Systems
“Every citizen has a unique CMND/CCCD (national ID) number. Every event in a distributed system likewise needs a unique ID --- no duplicates, no conflicts, even when millions of servers generate them at the same time. That is the Unique ID Generator problem.”
Tags: system-design case-study unique-id snowflake distributed-systems alex-xu interview Student: Hieu Prerequisite: Tuan-01-Scale-From-Zero-To-Millions · Tuan-02-Back-of-the-envelope · Tuan-07-Database-Sharding-Replication Related: Tuan-10-Consistent-Hashing · Tuan-16-Design-URL-Shortener · Tuan-20-Design-Key-Value-Store
1. Context & Why --- Why do we need a Unique ID Generator?
1.1 Analogy: a national ID number for the system
Hieu, think of it this way: every Vietnamese citizen has a unique CMND/CCCD number. That number lets the state distinguish 100 million people --- no two are ever the same. In a distributed system, every event, every record, every transaction also needs a "national ID" of its own --- a globally unique identifier.
| Real world | Distributed system |
|---|---|
| National ID number (CMND/CCCD) | Unique ID |
| Issued by the police authority | ID Generator service |
| No duplicates across provinces | No duplicates across datacenters |
| Can be looked up by time | Sortable by time |
| Fixed format (12 digits) | Fixed format (64-bit integer) |
1.2 Why not use the database's AUTO_INCREMENT?
When the system has a single database, AUTO_INCREMENT is enough. But once you scale out to multiple DB servers and multiple datacenters:
| Problem | Explanation |
|---|---|
| Single point of failure | One DB down = no IDs can be generated |
| Bottleneck | Every request must call the DB to get an ID --- high latency |
| No horizontal scaling | 10K+ IDs/sec will overload the DB |
| Doesn't work across datacenters | 2 DBs in 2 DCs can generate duplicate IDs |
| Not sortable by time | AUTO_INCREMENT only counts upward; the ID carries no time information |
Aha Moment:
AUTO_INCREMENT is an anti-pattern in distributed systems. You need a mechanism that generates IDs without any coordination between nodes.
2. Deep Dive --- Alex Xu 4-Step Framework
Overview --- Alex Xu's four steps
| Step | Name | Time (45-minute interview) |
|---|---|---|
| 1 | Understand the Problem & Establish Design Scope | ~5 min |
| 2 | Propose High-Level Design | ~15 min |
| 3 | Design Deep Dive | ~20 min |
| 4 | Wrap Up | ~5 min |
Step 1 --- Understand the Problem & Establish Design Scope
2.1 Functional Requirements
| Question for the interviewer | Answer / Assumption |
|---|---|
| Must IDs be globally unique? | Yes, never duplicated |
| Digits only, or letters too? | Digits only (numerical values) |
| Must IDs be sortable by time? | Yes, an ID created later must be larger than one created earlier (roughly) |
| ID length? | 64-bit integer (fits a SQL BIGINT) |
| How many IDs per second must the system generate? | 10,000+ IDs/sec |
| Does the system span multiple datacenters? | Yes, many datacenters and many machines |
Functional requirements summary:
- Globally unique --- never duplicated, even across datacenters
- 64-bit integer --- fits a BIGINT, works as a primary key
- Sortable by time --- later ID > earlier ID (roughly ordered)
- High throughput --- 10,000+ IDs/sec per system
2.2 Non-Functional Requirements
| Requirement | Target | Explanation |
|---|---|---|
| High Availability | 99.99% uptime | ID generation can never go down --- every service depends on it |
| Low Latency | < 1ms per ID | ID generation sits on the critical path of every write operation |
| Scalability | Linear scaling with machine count | More machines = more throughput, with no coordination required |
| Ordering | Time-sortable | IDs created within the same millisecond on the same machine must be ordered |
| No coordination | No central authority | Each node generates IDs independently |
Step 2 --- Propose High-Level Design: 4 Approaches
Hieu, there are four main approaches to generating unique IDs in distributed systems. Each comes with its own trade-offs.
Approach 1: Multi-Master Replication (DB auto-increment with a stride)
Idea: use the DB's AUTO_INCREMENT, but give each server a different offset and a shared stride.
Server 1: 1, 3, 5, 7, 9, ... (start=1, increment=2)
Server 2: 2, 4, 6, 8, 10, ... (start=2, increment=2)
In general, with N servers:
Server k: k, k+N, k+2N, k+3N, ...
| Pros | Cons |
|---|---|
| Simple; uses a built-in DB feature | Hard to scale: adding or removing a server means changing the increment on EVERY server |
| No extra service required | IDs are not sortable by time (Server 1 can generate ID 7 AFTER Server 2 generates ID 8) |
| Reasonably reliable | Doesn't work across datacenters (high latency) |
| | Throughput limited by DB write speed |
When to use: small systems with few servers that don't need time-sorted IDs.
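The striding above can be simulated in a few lines. The `IdServer` class is ours, standing in for MySQL's `auto_increment_offset` / `auto_increment_increment` settings:

```python
class IdServer:
    """Simulates one DB node using AUTO_INCREMENT with an offset and a stride."""

    def __init__(self, start: int, step: int):
        self.next_id = start
        self.step = step

    def generate(self) -> int:
        current = self.next_id
        self.next_id += self.step  # jump by N so servers never collide
        return current

# Two servers, stride = 2: odd IDs on server 1, even IDs on server 2
s1, s2 = IdServer(start=1, step=2), IdServer(start=2, step=2)
print([s1.generate() for _ in range(5)])  # [1, 3, 5, 7, 9]
print([s2.generate() for _ in range(5)])  # [2, 4, 6, 8, 10]
```

Note how the two streams interleave without coordination, but nothing about the numeric order reflects real creation time across servers.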
Approach 2: UUID (Universally Unique Identifier)
Idea: each machine independently generates a 128-bit UUID; no coordination is needed.
UUID v4 Example: 550e8400-e29b-41d4-a716-446655440000
                 All bits random except two fixed fields: the "4" opening
                 the third group is the version; the fourth group starts
                 with the variant bits.
| Pros | Cons |
|---|---|
| Simplest option; every language supports it | 128-bit (doesn't fit the 64-bit requirement) |
| No coordination between nodes | Not sortable by time |
| No single point of failure | Poor DB performance: random UUIDs fragment the B+ tree index |
| Unlimited throughput (each node generates locally) | Long (36-char string) --- wastes storage |
UUID v4 collision probability:
With 10^18 UUIDs (1 quintillion), the birthday bound over 122 random bits gives a collision probability of roughly 9%.
At realistic volumes the collision probability is vanishingly small, but it is not zero, and it is not the deterministic uniqueness guarantee that Snowflake provides.
When to use: when you don't need sortability, aren't constrained to 64 bits, and want something quick to deploy (e.g. session IDs, correlation IDs, idempotency keys).
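Both claims --- trivial generation and the birthday-bound estimate --- can be checked with the standard library (the `collision_probability` helper is ours, a standard birthday approximation):

```python
import math
import uuid

# UUID v4: 128 bits, of which 122 are random (6 are fixed version/variant bits)
u = uuid.uuid4()
print(u.version)    # 4
print(len(str(u)))  # 36 characters as a string; 16 bytes (128 bits) raw

def collision_probability(n: int, random_bits: int = 122) -> float:
    """Birthday-bound estimate: P(at least one collision) after n random IDs."""
    return 1.0 - math.exp(-n * (n - 1) / (2 * 2 ** random_bits))

print(round(collision_probability(10 ** 18), 3))  # ~0.09 at one quintillion UUIDs
```

Even at one quintillion UUIDs the risk stays below 10%, which is why UUID v4 is considered safe in practice despite lacking a hard guarantee.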
Approach 3: Ticket Server (Flickr Approach)
Idea: use one (or a few) central servers with AUTO_INCREMENT as a "ticket machine". Each service calls the Ticket Server to get the next ID.
+------------------+
| Ticket Server |
| AUTO_INCREMENT |
| current_id: 1000|
+--------+---------+
|
+-----+-----+-----+
| | | |
Svc A Svc B Svc C Svc D
id=1001 id=1002 id=1003 id=1004
| Pros | Cons |
|---|---|
| Simple, easy to implement | Single Point of Failure (SPOF) |
| IDs are numeric, 64-bit, monotonically increasing | Bottleneck: every request must call the Ticket Server |
| Sortable by time (roughly) | Latency rises when calls cross datacenters |
| Flickr ran this successfully for years | Replicating the Ticket Server for HA adds complexity |
Flickr's solution: run 2 Ticket Servers with increment = 2 --- the same trick as Multi-Master, but dedicated purely to ID generation.
When to use: mid-sized systems that accept a single point of coordination and need simple increasing IDs.
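A toy version of the ticket machine fits in a few lines using sqlite3. Flickr did this with MySQL's REPLACE INTO on a one-row table; this sketch mirrors that trick, with a lock standing in for the DB's own serialization:

```python
import sqlite3
import threading

class TicketServer:
    """Minimal ticket-server sketch (inspired by Flickr's MySQL approach)."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:", check_same_thread=False)
        # AUTOINCREMENT guarantees the rowid never reuses an old value
        self.conn.execute(
            "CREATE TABLE tickets "
            "(id INTEGER PRIMARY KEY AUTOINCREMENT, stub CHAR(1) UNIQUE)"
        )
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:  # serialized access: this IS the bottleneck/SPOF
            # REPLACE deletes the conflicting row and inserts a fresh one,
            # so the table stays at one row while the id keeps climbing
            cur = self.conn.execute("REPLACE INTO tickets (stub) VALUES ('a')")
            return cur.lastrowid

server = TicketServer()
print([server.next_id() for _ in range(4)])  # [1, 2, 3, 4]
```

Every caller funnels through one lock and one table --- which is exactly why this approach tops out quickly and needs replication for HA.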
Approach 4: Snowflake (Twitter Approach) --- Recommended
Idea: each ID is a 64-bit integer composed of several parts: a timestamp, a datacenter ID, a machine ID, and a sequence number. Each machine generates IDs independently.
+---+-------------------------------------------+---------+---------+----------------+
| 0 | 41-bit Timestamp | 5-bit | 5-bit | 12-bit |
| | (milliseconds since epoch) |Datacenter| Machine | Sequence |
+---+-------------------------------------------+---------+---------+----------------+
1 bit 41 bits 5 bits 5 bits 12 bits = 64 bits
| Pros | Cons |
|---|---|
| 64-bit integer --- meets the requirement | Depends on the machine's clock |
| Sortable by time (timestamp embedded in the ID) | Clock skew between machines produces out-of-order IDs |
| No coordination needed | Datacenter IDs and machine IDs must be managed |
| Very high throughput: 4,096 IDs/ms/machine | Time-sorting is only "rough" --- not absolute |
| Timestamp easily extracted from the ID | Epoch overflow after ~69 years |
Aha Moment: Snowflake is the most widely used approach in practice. Twitter, Discord, Instagram, and Baidu all run variants of Snowflake.
Comparison of all approaches
| Criterion | Multi-Master | UUID | Ticket Server | Snowflake | ULID | KSUID | MongoDB ObjectId |
|---|---|---|---|---|---|---|---|
| Bit size | 64 | 128 | 64 | 64 | 128 | 160 | 96 |
| Globally unique | Yes (with correct config) | Yes (near-certain) | Yes | Yes | Yes | Yes | Yes |
| Sortable by time | No | No (v4) | Yes | Yes | Yes | Yes | Yes |
| No coordination | No | Yes | No | Yes | Yes | Yes | Yes |
| Throughput | Low (DB-bound) | Unlimited | Low (bottleneck) | 4,096/ms/machine | Unlimited | Unlimited | 16M/sec/process |
| Complexity | Low | Very low | Medium | Medium | Low | Low | Low |
| DB index friendly | Yes | No (random) | Yes | Yes | Yes | Yes | Yes |
| Used in practice | MySQL replication | Session IDs, correlation IDs | Flickr | Twitter, Discord, Instagram | Shopify, CockroachDB | Segment | MongoDB |
| Fits 64-bit? | Yes | No | Yes | Yes | No | No | No |
Other Variants (beyond Alex Xu)
ULID (Universally Unique Lexicographically Sortable Identifier):
- 128-bit: 48-bit timestamp (milliseconds) + 80 bits of randomness
- Encoded as a 26-character Crockford Base32 string
- Sortable by time, no coordination required
- Used by: Shopify, CockroachDB
KSUID (K-Sortable Unique Identifier):
- 160-bit: 32-bit timestamp (seconds) + 128-bit random payload
- Encoded as a 27-character Base62 string
- Epoch: 2014-05-13 --- overflows after ~136 years
- Used by: Segment
MongoDB ObjectId:
- 96-bit (12 bytes): 32-bit timestamp (seconds) + 40-bit random + 24-bit counter
- Sortable by time (second resolution)
- The counter allows 16,777,216 unique IDs per second per process
Step 3 --- Design Deep Dive: Snowflake ID Generator
3.1 Detailed Bit Layout
0 | 00000000 00000000 00000000 00000000 00000000 0 | 00000 | 00000 | 000000000000
^ |                    41 bits                     | 5 bits| 5 bits|   12 bits
|                        |                              |       |         |
Sign bit             Timestamp                     Datacenter Machine  Sequence
(always 0)    (ms since custom epoch)                  ID       ID      Number
| Component | Bits | Range | Explanation |
|---|---|---|---|
| Sign bit | 1 | Always 0 | Keeps the ID positive |
| Timestamp | 41 | 0 --- 2^41 - 1 ms | Milliseconds since the custom epoch |
| Datacenter ID | 5 | 0 --- 31 | Up to 32 datacenters |
| Machine ID | 5 | 0 --- 31 | Up to 32 machines per datacenter |
| Sequence | 12 | 0 --- 4095 | Counter, reset every millisecond |
3.2 Generating an ID (the algorithm)
Every time a service requests an ID:
1. Read current_timestamp (ms since epoch)
2. If current_timestamp == last_timestamp:
   sequence = (sequence + 1) AND 0xFFF   // 12-bit mask
   If sequence == 0:                     // overflow! 4096 IDs already generated this ms
   Wait for the next millisecond
3. If current_timestamp > last_timestamp:
   sequence = 0                          // reset the counter
4. If current_timestamp < last_timestamp:
   CLOCK SKEW DETECTED! -> handle the error
5. last_timestamp = current_timestamp
6. Compose the ID:
   id = (timestamp << 22) | (datacenter_id << 17) | (machine_id << 12) | sequence
7. Return id
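The steps above can be sketched as a small, thread-safe Python class. This is a minimal single-process sketch, not Twitter's original implementation: the 2025-01-01 epoch is an assumption, overflow is handled by spin-wait, and a backwards clock raises an error (Twitter's choice):

```python
import threading
import time

EPOCH_MS = 1735689600000  # custom epoch: 2025-01-01T00:00:00Z (assumption)
DATACENTER_BITS, MACHINE_BITS, SEQUENCE_BITS = 5, 5, 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1  # 4095

class SnowflakeGenerator:
    def __init__(self, datacenter_id: int, machine_id: int):
        assert 0 <= datacenter_id < 32 and 0 <= machine_id < 32
        self.datacenter_id = datacenter_id
        self.machine_id = machine_id
        self.last_timestamp = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def _now_ms(self) -> int:
        return time.time_ns() // 1_000_000

    def next_id(self) -> int:
        with self.lock:
            ts = self._now_ms()
            if ts < self.last_timestamp:
                # Clock skew: fail loudly, like Twitter's implementation
                raise RuntimeError("clock moved backwards; refusing to generate")
            if ts == self.last_timestamp:
                self.sequence = (self.sequence + 1) & MAX_SEQUENCE
                if self.sequence == 0:  # overflow: >4096 IDs this millisecond
                    while ts <= self.last_timestamp:  # spin-wait for next ms
                        ts = self._now_ms()
            else:
                self.sequence = 0
            self.last_timestamp = ts
            # (timestamp << 22) | (dc << 17) | (machine << 12) | sequence
            return (
                ((ts - EPOCH_MS) << (DATACENTER_BITS + MACHINE_BITS + SEQUENCE_BITS))
                | (self.datacenter_id << (MACHINE_BITS + SEQUENCE_BITS))
                | (self.machine_id << SEQUENCE_BITS)
                | self.sequence
            )

gen = SnowflakeGenerator(datacenter_id=1, machine_id=1)
ids = [gen.next_id() for _ in range(3)]
assert ids == sorted(ids) and len(set(ids)) == 3  # unique and increasing
```

The lock makes the generator safe for many threads on one machine; uniqueness across machines comes purely from distinct (datacenter_id, machine_id) pairs.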
3.3 Epoch Selection
Snowflake does NOT use the Unix epoch (1970-01-01). Why?
With the Unix epoch, the 41 timestamp bits last 2^41 ms ≈ 69.7 years, so they run out around 1970 + 69.7 ≈ 2039.
That is far too close! Instead, choose a custom epoch at the system's go-live date:
| Custom Epoch | Overflow Date | Remaining (from 2026) |
|---|---|---|
| 2010-01-01 (Twitter) | ~2079 | ~53 years |
| 2015-01-01 (Discord) | ~2084 | ~58 years |
| 2020-01-01 | ~2089 | ~63 years |
| 2025-01-01 (recommended for new systems) | ~2094 | ~68 years |
| Unix epoch 1970-01-01 | ~2039 | ~13 years |
Aha Moment: choose a custom epoch close to your deploy date to maximize the system's lifetime. Discord chose 2015-01-01 --- Discord's first day.
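The overflow dates in the table can be derived directly from the 41-bit budget:

```python
from datetime import datetime, timedelta

# 41 timestamp bits cover 2^41 milliseconds
lifespan = timedelta(milliseconds=2 ** 41)
print(round(lifespan.days / 365.25, 1))  # 69.7 years

# Overflow year for each candidate epoch (matches the table above)
for year in (1970, 2010, 2015, 2025):
    print(year, "->", (datetime(year, 1, 1) + lifespan).year)
# 1970 -> 2039, 2010 -> 2079, 2015 -> 2084, 2025 -> 2094
```

The arithmetic makes the one-way nature of the decision obvious: every year the epoch lags behind go-live is a year of timestamp range thrown away.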
3.4 Clock Sync and NTP (Network Time Protocol)
The core problem: Snowflake depends on timestamps --- but the clocks of different machines are NEVER perfectly synchronized.
NTP (Network Time Protocol) synchronizes clocks between machines:
- Stratum 0: atomic clocks, GPS receivers (~nanosecond accuracy)
- Stratum 1: servers connected directly to Stratum 0 (~microsecond)
- Stratum 2: servers synced to Stratum 1 (~millisecond)
- Stratum 3+: each additional tier adds ~1-10ms of drift
GPS Satellite / Atomic Clock
|
[Stratum 0]
|
[Stratum 1] <-- Google, Amazon NTP servers
|
[Stratum 2] <-- Datacenter NTP servers
|
[Stratum 3] <-- Application servers (your machines)
Real-world clock drift:
- A typical server drifts ~100ms/day without NTP
- With NTP syncing every 30 seconds: drift < 1ms
- Google TrueTime (Spanner): uncertainty < 7ms (using GPS + atomic clocks)
3.5 Clock Skew Handling
Scenario: Machine A's clock runs 5ms behind Machine B's.
- Machine A generates an ID with timestamp = 1000
- Machine B generates an ID with timestamp = 1005
- In reality both IDs were created at the same instant --- but B's ID "looks" like it came later
Handling options:
| Strategy | Description | Trade-off |
|---|---|---|
| Wait | If current_time < last_time, wait until current_time >= last_time | Simple, but can block for a long time if the clock jumps far backwards |
| Reject | If clock skew > a threshold (e.g. 5ms), reject the request and raise an alarm | Safe, but IDs are briefly unavailable |
| Logical clock | Use Lamport timestamps or a Hybrid Logical Clock (HLC) to guarantee causality | More complex, but guarantees ordering |
| Tolerance window | Accept out-of-order IDs within a small window (e.g. 10ms) | Most pragmatic; what most systems do |
How Twitter Snowflake handles it: if current_timestamp < last_timestamp, throw an exception and log a warning. The operations team then investigates the NTP issue.
Aha Moment: clock skew is unavoidable in distributed systems. Snowflake does not guarantee strict global ordering --- only "roughly sorted". If you need strict ordering, use a single-machine approach (Ticket Server) or a Hybrid Logical Clock.
3.6 Sequence Overflow Scenarios
When one machine generates more than 4,096 IDs within a single millisecond:
Millisecond T=1000:
ID #0: ...1000...0000 00000000
ID #1: ...1000...0000 00000001
...
ID #4095: ...1000...1111 11111111 <- MAX sequence
ID #4096: OVERFLOW! sequence wraps back to 0
          -> must WAIT for T=1001
Handling strategies:
| Strategy | Description | When to use |
|---|---|---|
| Spin-wait | Busy-loop until the next millisecond | The Snowflake default; fine for short bursts |
| Sleep | Thread.sleep(1) | Saves CPU but loses precision |
| Pre-generate pool | Generate a pool of IDs ahead of time and serve from it | High-throughput services; smooths latency spikes |
| Borrow from the future | Issue an ID with timestamp = T+1 and sequence = 0 | Dangerous --- can produce duplicates if a clock correction occurs |
3.7 Ordering Guarantees
| Scope | Guaranteed? | Explanation |
|---|---|---|
| Same machine, same millisecond | Yes, strict ordering | The sequence number increases monotonically |
| Same machine, different milliseconds | Yes, strict ordering | Larger timestamp = larger ID |
| Different machines, same datacenter | Not guaranteed | Clock skew can reorder IDs |
| Different datacenters | Not guaranteed | Clock skew + network latency |
Note: this means two IDs from two different machines can carry the same timestamp, and their comparison order then depends on datacenter_id and machine_id --- not on actual creation time.
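That caveat is easy to demonstrate by composing and decoding IDs by hand. The helper names are ours; the epoch constant is Twitter's published Snowflake epoch:

```python
TWEPOCH_MS = 1288834974657  # Twitter's published Snowflake epoch (2010-11-04)

def encode(ts_ms: int, dc: int, machine: int, seq: int) -> int:
    return ((ts_ms - TWEPOCH_MS) << 22) | (dc << 17) | (machine << 12) | seq

def decode(snowflake_id: int) -> dict:
    """Split a 64-bit Snowflake ID back into its four fields."""
    return {
        "timestamp_ms": (snowflake_id >> 22) + TWEPOCH_MS,
        "datacenter_id": (snowflake_id >> 17) & 0b11111,  # 5 bits
        "machine_id": (snowflake_id >> 12) & 0b11111,     # 5 bits
        "sequence": snowflake_id & 0xFFF,                 # 12 bits
    }

# Two IDs minted in the same millisecond on different machines:
a = encode(1742300000123, dc=1, machine=5, seq=0)
b = encode(1742300000123, dc=2, machine=1, seq=0)
assert a < b                        # order decided by datacenter_id, not real time
assert decode(a)["machine_id"] == 5
assert decode(b)["timestamp_ms"] == 1742300000123
```

With equal timestamps, `a < b` purely because datacenter 1 < datacenter 2 --- the comparison says nothing about which event actually happened first.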
3. Back-of-the-Envelope Estimation
3.1 IDs per Millisecond per Machine
12 sequence bits give 2^12 = 4,096 IDs per millisecond per machine.
3.2 IDs per Second per Machine
4,096 IDs/ms x 1,000 ms = 4,096,000 ≈ 4.1 million IDs/sec per machine.
3.3 Total Capacity with N Machines
With the maximum configuration (32 datacenters x 32 machines = 1,024 machines):
4,096,000 x 1,024 ≈ 4.2 billion IDs/sec across the fleet.
3.4 How many machines for 10K IDs/sec?
A single machine covers 10K IDs/sec with enormous headroom (~4.1M IDs/sec capacity). In practice we deploy multiple machines for High Availability, not for throughput.
3.5 Timestamp Overflow
With a custom epoch of 2025-01-01: 2^41 ms ≈ 69.7 years, so the timestamp overflows around 2094.
3.6 Total IDs over 69 years
63 usable bits (41 + 5 + 5 + 12) allow up to 2^63 ≈ 9.2 quintillion distinct IDs.
3.7 Storage Estimation
Each ID is 8 bytes (64 bits), so 1 billion IDs take about 8 GB.
Compared with binary UUIDs (16 bytes) or UUID strings (36 bytes):
| Format | Size per ID | 1 Billion IDs | DB Index Impact |
|---|---|---|---|
| Snowflake (64-bit) | 8 bytes | 8 GB | Sequential --- B+ tree insert O(1) amortized |
| UUID binary (128-bit) | 16 bytes | 16 GB | Random --- B+ tree split O(log n) |
| UUID string (36 char) | 36 bytes | 36 GB | Random + larger index pages |
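These estimates can be re-derived in a few lines of plain arithmetic, using the bit widths from the layout above:

```python
# Per-machine throughput from the 12 sequence bits
ids_per_ms = 2 ** 12                    # 4,096 IDs per millisecond
ids_per_sec = ids_per_ms * 1000         # 4,096,000 per second per machine

# Fleet capacity at the maximum configuration
machines = 2 ** 5 * 2 ** 5              # 32 datacenters x 32 machines = 1,024
fleet_per_sec = ids_per_sec * machines  # 4,194,304,000 (~4.2 billion IDs/sec)

# Storage: one Snowflake ID is 8 bytes
storage_gb = 10 ** 9 * 8 / 1e9          # 8.0 GB per billion IDs

print(ids_per_sec, fleet_per_sec, storage_gb)
```

Seeing 4.1M IDs/sec from one machine makes the "deploy for HA, not throughput" point concrete.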
4. Security First --- Securing the ID Generator
4.1 ID Predictability Attacks
Threat: if an attacker understands the ID structure, they can:
| Attack | Description | Impact |
|---|---|---|
| Enumeration attack | Guess other records' IDs by incrementing/decrementing a known ID | Unauthorized data access |
| Timing attack | Extract the timestamp from an ID to learn when a record was created | Information leakage |
| Volume estimation | Compare two IDs to estimate how many records lie between them | Business-intelligence leak |
| Infrastructure mapping | Extract datacenter_id and machine_id | Reveals server counts and datacenter topology |
Enumeration example:
An attacker sees order_id = 7100424390656 00001 00001 000000000001
                                          ^DC    ^Machine
They now know: datacenter 1, machine 1
They increment the sequence: 000000000002, 000000000003, ...
-> They can walk through every order issued by that machine
4.2 Information Leakage from Snowflake IDs
| Component | Information leaked | Severity |
|---|---|---|
| Timestamp (41 bits) | The exact (ms) creation time of the record | Medium --- the attacker learns when a user registered or placed an order |
| Datacenter ID (5 bits) | System topology | Low --- but useful for targeted attacks |
| Machine ID (5 bits) | Server count and placement | Low --- could inform DDoS targeting |
| Sequence (12 bits) | Traffic volume at that moment | Medium --- business intelligence |
4.3 Mitigation Strategies
| Strategy | Description | Trade-off |
|---|---|---|
| Don't expose internal IDs | Use a separate public-facing ID (a UUID or an encrypted ID) in APIs; keep the Snowflake ID internal | One more mapping layer; extra storage |
| Encrypt IDs before exposing them | Use Format-Preserving Encryption (FPE) to encrypt the 64-bit ID into a 64-bit ciphertext | Size preserved, but sortability lost |
| Shuffle bits | Permute the bit positions so the timestamp no longer sits in a fixed place | Sortability lost; requires a lookup table |
| Add random bits | Replace the datacenter+machine bits (10 bits) with random bits | IDs can no longer be traced to their source, but are safer |
| Hash-based public ID | public_id = HMAC-SHA256(secret_key, snowflake_id)[:8] | One-way, irreversible, but requires a DB lookup |
| Rate-limit the API | Cap requests per user/IP to block enumeration | Doesn't fix the root cause, but limits the impact |
Recommended approach for most systems:
Internal: Snowflake ID (for the DB and inter-service communication)
External: encrypted Snowflake ID or a UUID mapping (for the public API)
Database:
+------------------+------------------+
| snowflake_id (PK)| public_id (UNIQUE)|
| 71004243906560001 | a8f3k2m9x1 |
+------------------+------------------+
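The hash-based public ID from the table can be sketched with the standard library. The key value, the 16-hex-character truncation, and the function name are illustrative choices, not a fixed spec:

```python
import hashlib
import hmac

SECRET_KEY = b"load-me-from-a-secret-store"  # assumption: never hard-code in production

def public_id(snowflake_id: int) -> str:
    """One-way public identifier derived from an internal Snowflake ID.

    Truncating the digest (here to 16 hex chars) keeps URLs short but makes
    collisions possible in principle, so back it with a UNIQUE constraint.
    """
    mac = hmac.new(SECRET_KEY, snowflake_id.to_bytes(8, "big"), hashlib.sha256)
    return mac.hexdigest()[:16]

# Deterministic mapping: the same internal ID always yields the same public ID,
# but the public ID reveals nothing about timestamp, datacenter, or machine.
print(public_id(71004243906560001))
```

Because HMAC is keyed and one-way, the only path from public ID back to internal ID is the DB lookup --- which is exactly the trade-off listed in the table.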
4.4 OWASP Considerations
| OWASP Risk | Relevance to the ID generator | Mitigation |
|---|---|---|
| A01 Broken Access Control | An attacker uses a predicted ID to access someone else's resource | Authorization check ON EVERY REQUEST --- never rely on ID secrecy alone |
| A04 Insecure Design | Exposing Snowflake IDs on a public API | Use a public-ID mapping or FPE encryption |
| A09 Security Logging | Not logging who requested which ID | Log every ID-generation request with the caller's identity |
Aha Moment: the ID generator itself has no security flaw --- the flaw lies in how you expose the IDs. Always keep a layer of abstraction between the internal ID and the public-facing ID.
5. DevOps --- Operating the ID Generator
5.1 Monitoring Metrics
| Metric | Description | Alert Threshold |
|---|---|---|
| id_generation_rate | IDs generated per second per machine | Unusual spike > 2x baseline |
| id_generation_latency_p99 | Latency to generate one ID | > 1ms (normally < 0.01ms) |
| sequence_overflow_count | Sequence overflows (4,096/ms hit) | > 0 per minute = add machines or review traffic |
| clock_skew_detected | Occurrences of current_time < last_time | > 0 = NTP issue, investigate immediately |
| ntp_offset_ms | Offset between the local clock and the NTP server | > 10ms = critical alert |
| ntp_sync_failures | Failed NTP syncs | > 3 consecutive failures = critical |
| machine_id_conflicts | Two machines sharing datacenter_id + machine_id | > 0 = IMMEDIATE action (duplicate IDs!) |
5.2 Clock Drift Detection
# prometheus-alerts.yml
groups:
- name: id_generator_alerts
rules:
# Clock skew detected
- alert: ClockSkewDetected
expr: increase(clock_skew_detected_total[5m]) > 0
for: 1m
labels:
severity: critical
annotations:
summary: "Clock skew detected on {{ $labels.instance }}"
description: "Machine {{ $labels.machine_id }} detected backward clock movement. IDs may be out of order."
# NTP offset too high
- alert: NTPOffsetHigh
expr: abs(ntp_offset_milliseconds) > 10
for: 5m
labels:
severity: warning
annotations:
summary: "NTP offset > 10ms on {{ $labels.instance }}: {{ $value }}ms"
# NTP sync failure
- alert: NTPSyncFailure
expr: increase(ntp_sync_failures_total[15m]) > 3
for: 1m
labels:
severity: critical
annotations:
summary: "NTP sync failing on {{ $labels.instance }}"
description: "NTP has failed to sync 3+ times in 15 minutes. Clock drift will increase."
# Sequence overflow
- alert: SequenceOverflow
expr: rate(sequence_overflow_total[1m]) > 0
for: 1m
labels:
severity: warning
annotations:
summary: "Sequence overflow on machine {{ $labels.machine_id }}"
description: "Machine is generating > 4096 IDs/ms. Consider adding more machines."
# ID generation latency spike
- alert: IDGenerationLatencyHigh
expr: histogram_quantile(0.99, rate(id_generation_duration_seconds_bucket[5m])) > 0.001
for: 5m
labels:
severity: warning
annotations:
summary: "P99 ID generation latency > 1ms on {{ $labels.instance }}"
5.3 NTP Monitoring Best Practices
| Practice | Description |
|---|---|
| Use multiple NTP sources | Configure at least 3 NTP servers (e.g. time.google.com, time.aws.com, pool.ntp.org) |
| Monitor NTP offset continuously | Export an ntp_offset_ms metric via node_exporter or a custom exporter |
| Tune the NTP sync interval | Default is 64-1024 seconds; drop to 16-32 seconds on ID-generator machines |
| Use chrony instead of ntpd | chrony handles drift better, starts faster, and is more accurate |
| Avoid NTP stepping | Configure NTP to slew (adjust gradually) instead of stepping (jumping) the clock |
| Hardware timestamping | Use NICs with hardware timestamping support to reduce jitter |
5.4 Machine ID Management
| Strategy | Description | When to use |
|---|---|---|
| Static config | Set datacenter_id and machine_id in a config file or environment variable | Small systems that rarely change |
| ZooKeeper/etcd | Each machine registers with ZooKeeper to receive a unique machine_id | Large systems that need auto-assignment |
| Cloud metadata | Derive machine_id from cloud instance metadata (AWS instance ID, GCP zone) | Cloud-native systems |
| MAC address hash | Hash the MAC address into 10 bits (datacenter + machine) | Simple, but can collide |
Pitfall: when a machine restarts or is replaced, its machine_id MUST be preserved, or de-registered before being assigned to a new machine. If two running machines share the same ID --- duplicate IDs will follow.
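As a sketch of the static-config strategy with a hostname-hash fallback (the env-var names and the fallback scheme are assumptions for illustration; the fallback has exactly the collision risk the table warns about):

```python
import hashlib
import os
import socket

def machine_identity() -> tuple:
    """Resolve (datacenter_id, machine_id) once at startup.

    Prefers explicit env vars (static config). Falls back to hashing the
    hostname into 10 bits, which is simple but can collide --- treat the
    fallback as dev-only.
    """
    if "DATACENTER_ID" in os.environ and "MACHINE_ID" in os.environ:
        dc = int(os.environ["DATACENTER_ID"])
        machine = int(os.environ["MACHINE_ID"])
    else:
        h = int(hashlib.sha256(socket.gethostname().encode()).hexdigest(), 16)
        dc, machine = (h >> 5) & 0b11111, h & 0b11111
    assert 0 <= dc < 32 and 0 <= machine < 32, "each ID must fit in 5 bits"
    return dc, machine

print(machine_identity())
```

Whatever the source, the identity must be resolved exactly once per process and held for the generator's lifetime; reassigning it live is how duplicate IDs happen.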
5.5 Grafana Dashboard Panels
| Panel | Query | Visualization |
|---|---|---|
| ID Generation Rate | sum(rate(ids_generated_total[1m])) by (machine_id) | Time series, stacked |
| Latency P50/P95/P99 | histogram_quantile(0.99, rate(id_gen_duration_bucket[5m])) | Time series, multi-line |
| Sequence Overflow Events | increase(sequence_overflow_total[1h]) | Stat panel (should be 0) |
| NTP Offset per Machine | ntp_offset_milliseconds | Time series, threshold line at 10ms |
| Active Machines | count(up{job="id-generator"} == 1) | Gauge |
| Clock Skew Events (24h) | increase(clock_skew_detected_total[24h]) | Stat panel (should be 0) |
6. System Design Diagrams
6.1 Snowflake Bit Layout
block-beta columns 4 block:header:4 A["Snowflake ID: 64-bit Layout"] end block:bits:4 B["0"] C["Timestamp (41 bits)"] D["DC(5) + Machine(5)"] E["Sequence (12 bits)"] end block:detail:4 F["Sign<br/>(always 0)"] G["Milliseconds since<br/>custom epoch<br/>(~69.7 years)"] H["Datacenter: 0-31<br/>Machine: 0-31<br/>(1,024 nodes max)"] I["0-4095 per ms<br/>(4.096M IDs/sec<br/>per machine)"] end style B fill:#6c757d,color:#fff style C fill:#0d6efd,color:#fff style D fill:#198754,color:#fff style E fill:#dc3545,color:#fff style F fill:#6c757d,color:#fff style G fill:#0d6efd,color:#fff style H fill:#198754,color:#fff style I fill:#dc3545,color:#fff
6.2 Multi-Datacenter ID Generation Architecture
graph TB subgraph "Client Services" SvcA["Order Service"] SvcB["User Service"] SvcC["Payment Service"] SvcD["Inventory Service"] end subgraph "Load Balancer Layer" LB["Load Balancer<br/>(Route to nearest DC)"] end subgraph "Datacenter 1 (US-East) — dc_id=00001" direction TB subgraph "ID Generator Cluster DC1" M1_1["Machine 1<br/>machine_id=00001<br/>Snowflake Generator"] M1_2["Machine 2<br/>machine_id=00010<br/>Snowflake Generator"] M1_3["Machine 3<br/>machine_id=00011<br/>Snowflake Generator"] end NTP1["NTP Server DC1<br/>(Stratum 2)<br/>Sync: time.google.com"] ZK1["ZooKeeper Node 1<br/>(Machine ID Registry)"] end subgraph "Datacenter 2 (EU-West) — dc_id=00010" direction TB subgraph "ID Generator Cluster DC2" M2_1["Machine 1<br/>machine_id=00001<br/>Snowflake Generator"] M2_2["Machine 2<br/>machine_id=00010<br/>Snowflake Generator"] end NTP2["NTP Server DC2<br/>(Stratum 2)<br/>Sync: time.aws.com"] ZK2["ZooKeeper Node 2<br/>(Machine ID Registry)"] end subgraph "Datacenter 3 (AP-Southeast) — dc_id=00011" direction TB subgraph "ID Generator Cluster DC3" M3_1["Machine 1<br/>machine_id=00001<br/>Snowflake Generator"] M3_2["Machine 2<br/>machine_id=00010<br/>Snowflake Generator"] end NTP3["NTP Server DC3<br/>(Stratum 2)<br/>Sync: pool.ntp.org"] ZK3["ZooKeeper Node 3<br/>(Machine ID Registry)"] end subgraph "Monitoring" Prometheus["Prometheus<br/>(Metrics Collection)"] Grafana["Grafana<br/>(Dashboard)"] AlertManager["AlertManager<br/>→ Slack / PagerDuty"] end SvcA & SvcB & SvcC & SvcD --> LB LB --> M1_1 & M1_2 & M1_3 LB --> M2_1 & M2_2 LB --> M3_1 & M3_2 NTP1 -.->|sync| M1_1 & M1_2 & M1_3 NTP2 -.->|sync| M2_1 & M2_2 NTP3 -.->|sync| M3_1 & M3_2 ZK1 -.->|assign machine_id| M1_1 & M1_2 & M1_3 ZK2 -.->|assign machine_id| M2_1 & M2_2 ZK3 -.->|assign machine_id| M3_1 & M3_2 ZK1 <-->|replicate| ZK2 <-->|replicate| ZK3 M1_1 & M1_2 & M1_3 & M2_1 & M2_2 & M3_1 & M3_2 --> Prometheus Prometheus --> Grafana Prometheus --> AlertManager style M1_1 fill:#0d6efd,color:#fff style M1_2 
fill:#0d6efd,color:#fff style M1_3 fill:#0d6efd,color:#fff style M2_1 fill:#198754,color:#fff style M2_2 fill:#198754,color:#fff style M3_1 fill:#dc3545,color:#fff style M3_2 fill:#dc3545,color:#fff style NTP1 fill:#f9a825,stroke:#333 style NTP2 fill:#f9a825,stroke:#333 style NTP3 fill:#f9a825,stroke:#333 style ZK1 fill:#6f42c1,color:#fff style ZK2 fill:#6f42c1,color:#fff style ZK3 fill:#6f42c1,color:#fff style Prometheus fill:#e65100,color:#fff style Grafana fill:#388e3c,color:#fff
6.3 Clock Sync Flow
sequenceDiagram participant AC as Atomic Clock<br/>(Stratum 0) participant S1 as NTP Stratum 1<br/>(time.google.com) participant S2 as NTP Stratum 2<br/>(DC NTP Server) participant IDGen as ID Generator<br/>(Machine) participant Mon as Monitoring<br/>(Prometheus) Note over AC,S1: Stratum 0 → 1: ~microsecond accuracy AC->>S1: Time reference signal Note over S1,S2: Stratum 1 → 2: ~millisecond accuracy S2->>S1: NTP Request (T1) S1-->>S2: NTP Response (T2, T3) S2->>S2: Calculate offset<br/>offset = ((T2-T1) + (T3-T4)) / 2 Note over S2,IDGen: Stratum 2 → 3: ~1-10ms accuracy loop Every 16-32 seconds IDGen->>S2: NTP Request S2-->>IDGen: NTP Response IDGen->>IDGen: Adjust clock (slew, not step) end IDGen->>Mon: Export ntp_offset_ms metric alt NTP offset > 10ms Mon->>Mon: Fire NTPOffsetHigh alert Mon-->>IDGen: Alert → Ops team investigates end alt Clock goes backward IDGen->>IDGen: DETECT: current_time < last_time IDGen->>Mon: Increment clock_skew_detected counter IDGen->>IDGen: WAIT until current_time >= last_time Mon->>Mon: Fire ClockSkewDetected alert end
6.4 ID Generation Request Flow
sequenceDiagram participant Svc as Client Service participant LB as Load Balancer participant IDGen as ID Generator<br/>(Machine #5, DC #2) participant Clock as System Clock Svc->>LB: Request new ID LB->>IDGen: Forward to nearest generator IDGen->>Clock: Get current_timestamp_ms Clock-->>IDGen: timestamp = 1742300000123 alt timestamp == last_timestamp IDGen->>IDGen: sequence++ alt sequence > 4095 (overflow) IDGen->>IDGen: WAIT for next millisecond IDGen->>Clock: Get current_timestamp_ms Clock-->>IDGen: timestamp = 1742300000124 IDGen->>IDGen: sequence = 0 end else timestamp > last_timestamp IDGen->>IDGen: sequence = 0 else timestamp < last_timestamp (CLOCK SKEW!) IDGen->>IDGen: Log warning + wait end IDGen->>IDGen: Compose ID:<br/>= (timestamp << 22)<br/>| (dc_id << 17)<br/>| (machine_id << 12)<br/>| sequence IDGen-->>LB: Return ID = 7300811265438109697 LB-->>Svc: ID = 7300811265438109697 Note over Svc: Extract info from ID:<br/>timestamp: 2025-03-18T12:00:00.123Z<br/>datacenter: 2<br/>machine: 5<br/>sequence: 1
7. Aha Moments & Common Pitfalls
Aha Moments --- key takeaways
#1 --- Clock skew is unavoidable in distributed systems: there is no way to synchronize machine clocks perfectly. Snowflake accepts "roughly sorted" instead of "strictly sorted". If you need strict global ordering --- use Lamport Clocks, Hybrid Logical Clocks, or Google TrueTime.
#2 --- UUID vs Snowflake is not either/or: UUIDs suit idempotency keys, session IDs, and correlation IDs (no sorting or compactness needed). Snowflake suits primary keys, event IDs, and message IDs (sorting needed, must fit 64 bits). Many systems use both, for different purposes.
#3 --- DB index performance with random vs sequential IDs: this is the biggest reason not to use UUIDs as primary keys in a relational DB. B+ tree indexes favor sequential inserts --- each new record appends to the last leaf node. Random UUIDs cause constant page splits, cutting write throughput 2-5x and increasing disk I/O.
#4 --- One Snowflake machine already generates 4.1M IDs/sec: throughput is not the reason to deploy multiple machines. The real reason is High Availability --- if one machine dies, the others keep generating IDs.
#5 --- The custom epoch is a one-time, irreversible decision: an epoch too far in the past wastes timestamp bits. An epoch in the future produces negative IDs. Epoch = go-live date is best.
#6 --- Snowflake IDs can be reverse-engineered: anyone holding an ID can extract its timestamp, datacenter, machine, and sequence. Never expose raw Snowflake IDs on a public API if you care about security.
Common Pitfalls
Pitfall 1: Not handling clock skew
Wrong: "The machine clock is always accurate, nothing to worry about." Right: clock skew ALWAYS happens. NTP steps, leap seconds, VM migrations, hardware faults --- all of them can move the clock backwards. You must have logic to detect and handle it.
Pitfall 2: Using UUIDs as primary keys, then wondering why the DB is slow
Wrong: "UUIDs are unique, just use them as the PK." Right: random UUIDs fragment the B+ tree. If you must use UUIDs, use UUID v7 (time-ordered) or ULID. Or use a Snowflake ID as the PK and a UUID as the public-facing ID.
Pitfall 3: Hard-coding machine_id
Wrong: machine_id = 1 in a config file, deployed to 5 servers. Right: 5 servers all running with machine_id = 1 = duplicate IDs. You need a mechanism that assigns unique machine_ids --- ZooKeeper, etcd, or cloud metadata.
Pitfall 4: Not monitoring sequence overflow
Wrong: "4096 IDs/ms is plenty, don't worry." Right: a traffic spike, retry storm, or bug can burst past 4096/ms. Sequence overflow = the generator must wait = latency spike. Monitor it and alert on it.
Pitfall 5: Forgetting to compute the epoch overflow date
Wrong: using the 1970 Unix epoch for a system deployed in 2025. Right: 1970 + 69.7 years ≈ 2039 --- only 13 years left! Choose a 2025 custom epoch to get the full 69-year runway.
Pitfall 6: Assuming Snowflake guarantees global ordering
Wrong: "A larger ID was always created later." Right: that only holds on a single machine. Across two machines, clock skew can make a smaller ID get created after a larger one. Snowflake guarantees roughly time-sorted, not strictly time-sorted.
Pitfall 7: No plan for epoch overflow
Wrong: "That's 69 years away, not my problem." Right: document the overflow date. Set a calendar reminder. Plan the migration: grow the timestamp to 42+ bits (shrinking the machine/sequence bits), or pick a new epoch and migrate.
8. Wrap Up --- Summary for the Interviewer
| Step | Key content |
|---|---|
| Step 1: Scope | 64-bit, globally unique, time-sortable, 10K+ IDs/sec, multi-datacenter |
| Step 2: High-Level | 4 approaches: Multi-Master, UUID, Ticket Server, Snowflake (chosen) |
| Step 3: Deep Dive | Bit layout (1+41+5+5+12), clock sync (NTP), clock-skew handling, sequence overflow |
| Step 4: Wrap Up | Monitoring (clock drift, overflow), security (ID enumeration), alternatives (ULID, KSUID) |
Interview tip: when an interviewer asks you to "Design a Unique ID Generator", they want to hear that:
- You ask clarifying questions about requirements (you don't jump straight to Snowflake)
- You present several options and choose one with reasons
- You understand the trade-offs (clock skew, ordering guarantees, security)
- You know how to operate the system (monitoring, alerting, NTP)
9. Internal Links
| Topic | Link | Relevance |
|---|---|---|
| Scaling | Tuan-01-Scale-From-Zero-To-Millions | Why scaling demands distributed IDs |
| Capacity Estimation | Tuan-02-Back-of-the-envelope | IDs/sec formulas, overflow calculation |
| Database Sharding | Tuan-07-Database-Sharding-Replication | IDs as shard keys; sequential vs random impact |
| Consistent Hashing | Tuan-10-Consistent-Hashing | Distributing ID generation across nodes |
| Monitoring | Tuan-13-Monitoring-Observability | Clock-drift monitoring, alerting |
| Security | Tuan-14-AuthN-AuthZ-Security · Tuan-15-Data-Security-Encryption | ID enumeration, information leakage |
| URL Shortener | Tuan-16-Design-URL-Shortener | Snowflake IDs as the basis for short codes |
| Key-Value Store | Tuan-20-Design-Key-Value-Store | IDs as keys in a distributed KV store |
References
- Alex Xu, System Design Interview --- Chapter 7: Design a Unique ID Generator in Distributed Systems
- sdi.anhvy.dev --- Vietnamese System Design Reference
- Twitter Snowflake (GitHub) --- Original Snowflake implementation
- Announcing Snowflake (Twitter Engineering Blog, 2010)
- Discord Snowflake IDs --- Discord’s Snowflake variant
- Instagram Sharding & IDs --- Instagram’s ID generation
- ULID Spec --- Universally Unique Lexicographically Sortable Identifier
- KSUID (Segment) --- K-Sortable Unique Identifier
- Baidu uid-generator --- Baidu’s Snowflake variant
- Sony Sonyflake --- Sony’s Snowflake variant (focus on machine ID from IP)
- NTP RFC 5905 --- Network Time Protocol v4
- Google TrueTime (Spanner Paper) --- Globally-consistent distributed database
Related case studies: Tuan-16-Design-URL-Shortener (Snowflake IDs used to generate short codes) · Tuan-20-Design-Key-Value-Store (IDs as keys in a distributed store)