Code and Life: Lessons for Big Systems

17 August 2007

Lessons

Take load from the DB

In the end, the DB is the bottleneck.
There is only one DB (cluster), but there can be hundreds of CPUs (web servers) and caches (memcache servers).
Let the CPUs work. 10 web server CPU cycles are better than 1 DB CPU cycle.
Aim for 0.1 DB operations per web page on average.
Make it F5-safe. No DB operations for page reloads. No DB for views.
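
A minimal sketch of the F5-safe idea, not a definitive implementation. It assumes a plain dict stands in for a memcache client, and `load_profile_from_db` is a hypothetical placeholder for a real query. Only a cache miss touches the DB, so reloading the page costs nothing on the DB side.

```python
# Sketch only: a dict stands in for a memcache client, and
# load_profile_from_db is a hypothetical placeholder for a real query.
cache = {}
db_calls = 0

def load_profile_from_db(user_id):
    global db_calls
    db_calls += 1
    return {"id": user_id, "name": f"user{user_id}"}

def view_profile(user_id):
    key = f"profile:{user_id}"
    profile = cache.get(key)
    if profile is None:  # cache miss: the only path that touches the DB
        profile = load_profile_from_db(user_id)
        cache[key] = profile
    return profile

# Ten reloads (F5) of the same page cost one DB operation in total,
# which is the 0.1 DB operations per page mentioned above.
for _ in range(10):
    view_profile(42)
```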

Avoid SQL

Keep all live data in memory.
Store only for persistence, not for report generation.
Use fast storage. Storing 50,000 items per second is possible.
DB != SQL. There are faster interfaces.
The index is always in memory. That's what SQL DBs are good for.
But there are other indexes as well.

External IDs

Do not use DB IDs externally. Map all IDs.
Use memcache to map external IDs to internal (often DB) IDs.
Use memcache as a huge hashtable.
External IDs may be strings. After the mapping continue with numbers internally.
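
The mapping can be sketched like this, assuming a dict stands in for memcache used as a huge hashtable. The names `publish` and `resolve` are made up for illustration, and random tokens are only one possible way to build external IDs.

```python
import secrets

# Sketch only: a dict stands in for memcache used as a big hashtable.
external_to_internal = {}

def publish(internal_id):
    """Create an opaque external string ID for an internal (DB) number."""
    external_id = secrets.token_urlsafe(8)
    external_to_internal[external_id] = internal_id
    return external_id

def resolve(external_id):
    """Map an external string ID back to the internal number, or None."""
    return external_to_internal.get(external_id)

ext = publish(1337)          # hand this string to the outside world
internal = resolve(ext)      # continue with the number internally
```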

DB search loves numbers

Everything you search for must be indexed.
Avoid indexes on TEXT and VARCHAR columns. An INSERT into an indexed text column takes significantly longer.
You may store text in the DB, but do not search for it.
You may spend some CPU to map text IDs to numbers for the DB.
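
One possible way to spend that CPU, as a sketch: hash the text to a number on the web server and index only the number. CRC32 and the table layout are assumptions for illustration. Collisions are possible, so the stored text is compared in code after the lookup.

```python
import sqlite3
import zlib

def text_key(text):
    # CRC32 is one possible text-to-number mapping (an assumption here).
    return zlib.crc32(text.encode("utf-8"))

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name_key INTEGER, name TEXT)")
# The index is on the number only; the text column is stored but not indexed.
db.execute("CREATE INDEX idx_users_name_key ON users (name_key)")

def add_user(name):
    db.execute("INSERT INTO users (name_key, name) VALUES (?, ?)", (text_key(name), name))

def find_user(name):
    rows = db.execute(
        "SELECT id, name FROM users WHERE name_key = ?", (text_key(name),)
    ).fetchall()
    # Resolve rare hash collisions in code, not in the DB.
    return [row_id for row_id, stored in rows if stored == name]

add_user("alice")
add_user("bob")
```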

100,000 concurrent

Imagine 1% of your users doing the same thing at the same instant.
If it affects online users, then each task is x 100,000.
If it affects all users, then everything is x 1 to 10 million.
Every operation must run at a rate of at least 1,000 per second.
Do maintenance all the time. There will never be a time of day when the load is so low that you can clean something up. Clean up continuously.
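
Continuous cleanup could look like this sketch: every request removes a small, bounded batch of expired items instead of waiting for a quiet hour. The session queue, the fixed TTL and the batch size of 10 are assumptions for illustration.

```python
from collections import deque

# Sketch only: (expires_at, session_id) pairs, oldest first.
# With one fixed TTL, append order equals expiry order.
sessions = deque()

def add_session(session_id, ttl, now):
    sessions.append((now + ttl, session_id))

def cleanup_step(now, max_items=10):
    """Remove at most max_items expired sessions; call this on every request."""
    removed = 0
    while sessions and removed < max_items and sessions[0][0] <= now:
        sessions.popleft()
        removed += 1
    return removed

for i in range(25):
    add_session(i, ttl=10, now=0)
```

Each call does a small, bounded amount of work, so no single request pays for the whole cleanup.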

Memcache every business object

No object is constructed from the DB.
Everything is buffered by the cache.
Code with real interfaces, which can be cache-enabled later.
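
A sketch of what a "real interface" buys you: the caller sees only `UserStore`, so the cache can be wrapped around the DB-backed store later without touching calling code. All class names are hypothetical, and a dict stands in for memcache.

```python
from typing import Protocol

class UserStore(Protocol):
    def get(self, user_id: int) -> dict: ...

class DbUserStore:
    """Placeholder for the real DB-backed store; counts its reads."""
    def __init__(self):
        self.reads = 0

    def get(self, user_id: int) -> dict:
        self.reads += 1
        return {"id": user_id}

class CachedUserStore:
    """Same interface, with a dict standing in for memcache."""
    def __init__(self, inner: UserStore):
        self.inner = inner
        self.cache = {}

    def get(self, user_id: int) -> dict:
        key = f"user:{user_id}"
        if key not in self.cache:
            self.cache[key] = self.inner.get(user_id)
        return self.cache[key]

backend = DbUserStore()
store: UserStore = CachedUserStore(backend)  # callers only know UserStore
store.get(7)
store.get(7)  # served from the cache, no second DB read
```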

Code for the speed

Code for the cache. It is there. It is essential. Do not pretend it is not there just for the "beauty" of the code.
Write beautiful cache-aware code.

Memcache frontend data

Parsing templates costs a lot of CPU.
Cache generated HTML fragments.
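
A small sketch of fragment caching, assuming a dict stands in for memcache and Python's `string.Template` stands in for whatever template engine is in use. The version number in the key is a made-up invalidation scheme: bump it when the data changes.

```python
from string import Template

fragment_cache = {}  # stands in for memcache
render_count = 0

ROW = Template("<li>$name ($score)</li>")

def render_highscores(scores):
    """The expensive part: run the template for every row."""
    global render_count
    render_count += 1
    rows = "".join(ROW.substitute(name=n, score=s) for n, s in scores)
    return f"<ul>{rows}</ul>"

def highscores_html(version, scores):
    key = f"highscores:v{version}"
    html = fragment_cache.get(key)
    if html is None:  # render only on a cache miss
        html = render_highscores(scores)
        fragment_cache[key] = html
    return html

scores = [("ann", 90), ("ben", 75)]
first = highscores_html(1, scores)
second = highscores_html(1, scores)  # served from the cache
```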

Do not overload the cache

No more than 10 memcache requests per script.
If you expect many items, say a mailbox with many messages, then put a summary into a list (mailbox) object even though the same information is in the individual messages.
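
The mailbox case, sketched with a dict standing in for memcache. The key names and fields are assumptions. The listing fields are duplicated into the mailbox object on purpose, so showing 200 messages takes one cache request instead of 200.

```python
cache = {}      # stands in for memcache
requests = 0    # counts cache round trips

def cache_get(key):
    global requests
    requests += 1
    return cache.get(key)

def store_message(mailbox_id, msg_id, sender, subject, body):
    cache[f"msg:{msg_id}"] = {"sender": sender, "subject": subject, "body": body}
    # Duplicate the listing fields into the mailbox summary object.
    summary = cache.setdefault(f"mailbox:{mailbox_id}", [])
    summary.append({"id": msg_id, "sender": sender, "subject": subject})

def list_mailbox(mailbox_id):
    """One cache request, no matter how many messages there are."""
    return cache_get(f"mailbox:{mailbox_id}") or []

for i in range(200):
    store_message(1, i, f"user{i}", f"subject {i}", "body text")

listing = list_mailbox(1)
```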

No statistics on the live system

Occasionally someone wants statistics. Do not compute them live.
Take snapshots, take the backup. Process it somewhere else.
Make statistics offline.

Simple SELECTs

Use only simple SELECTs on indexed columns.
Forbidden keywords: JOIN, ORDER BY
Structure and code must guarantee small DB results.
Sort in the code, not in the DB.
If you really need aggregated data, then maintain the aggregates continuously. Do not aggregate on demand.
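
These rules can be sketched with SQLite, used here only because it ships with Python. The tables and column names are made up. The SELECT is a plain lookup on an indexed column, sorting happens in the web server code, and the count is maintained at write time instead of being computed on demand.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, created INTEGER)")
db.execute("CREATE INDEX idx_posts_author ON posts (author_id)")
db.execute("CREATE TABLE post_counts (author_id INTEGER PRIMARY KEY, n INTEGER)")

def add_post(author_id, created):
    db.execute("INSERT INTO posts (author_id, created) VALUES (?, ?)", (author_id, created))
    # Maintain the aggregate on every write instead of COUNT(*) on demand.
    db.execute("INSERT OR IGNORE INTO post_counts (author_id, n) VALUES (?, 0)", (author_id,))
    db.execute("UPDATE post_counts SET n = n + 1 WHERE author_id = ?", (author_id,))

def posts_newest_first(author_id):
    # Simple SELECT on an indexed column: no JOIN, no ORDER BY.
    rows = db.execute(
        "SELECT id, created FROM posts WHERE author_id = ?", (author_id,)
    ).fetchall()
    return sorted(rows, key=lambda row: row[1], reverse=True)  # sort in code

def post_count(author_id):
    row = db.execute("SELECT n FROM post_counts WHERE author_id = ?", (author_id,)).fetchone()
    return row[0] if row else 0

add_post(5, 100)
add_post(5, 300)
add_post(5, 200)
```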

Basics and Trivialities:

Distribute everything

Do not rely on a single server for a task.

Check all input

Check ALL input.
Not only query params are input.
Cookies, HTTP header fields are also input.
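
A sketch of checking cookies and header fields against a whitelist before using them. The session ID format and the language list are assumptions for illustration; a real site would use its own rules.

```python
import re

# Hypothetical format: 16 to 64 URL-safe characters.
SESSION_RE = re.compile(r"[A-Za-z0-9_-]{16,64}")

def valid_session_cookie(value):
    """Accept only values that match the expected session ID format."""
    return isinstance(value, str) and SESSION_RE.fullmatch(value) is not None

def valid_language_header(value):
    """Accept only languages the site actually serves (made-up list)."""
    return value in {"en", "de", "fr"}
```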

SQL injection

SQL-escape all data in SQL strings.
Use prepared statements with bound variables.
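
A minimal sketch of a bound variable, using SQLite because it ships with Python. The table is made up. The `?` placeholder passes the value as data, so a hostile string cannot change the SQL.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

def find_by_name(name):
    # The value is bound to the placeholder, never pasted into the SQL text.
    return db.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()

# A classic injection attempt is treated as an ordinary (non-matching) name.
hostile = "x' OR '1'='1"
hostile_result = find_by_name(hostile)
```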

Framework

Use a real programming language.
Use a compiled language, because the compiler catches many errors before deployment.
You will have errors that wake you at night. So reduce errors by any means, even if you like scripting languages.
The simple deployment of scripting languages will not last in the long run anyway, because you will switch on caching and then have to invalidate the script cache on every deployment.
