With MySQL, people generally do what is called application-based sharding.
In a nutshell, you have the same database structure on multiple database servers, but each server holds a different slice of the data.
So for example:
Users 1 - 10000: server A
Users 10001 - 20000: server B
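To make the range example concrete, here is a minimal sketch of how an application could pick the server for a user ID. The names `SHARD_UPPER_BOUNDS`, `SHARD_SERVERS` and `server_for_user` are made up for illustration, and a real setup would probably load the ranges from configuration rather than hard-code them.

```python
import bisect

# Hypothetical shard map mirroring the ranges above:
# user IDs 1-10000 live on server A, 10001-20000 on server B.
SHARD_UPPER_BOUNDS = [10000, 20000]   # highest user ID on each shard, ascending
SHARD_SERVERS = ["server A", "server B"]


def server_for_user(user_id: int) -> str:
    """Return the name of the server that holds the given user ID."""
    if user_id < 1:
        raise ValueError(f"invalid user id: {user_id}")
    # bisect_left finds the first shard whose upper bound is >= user_id.
    index = bisect.bisect_left(SHARD_UPPER_BOUNDS, user_id)
    if index == len(SHARD_UPPER_BOUNDS):
        raise LookupError(f"no shard configured for user id {user_id}")
    return SHARD_SERVERS[index]
```

The application would call something like `server_for_user(10001)` before every query and then open (or reuse) a connection to the server it gets back.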
Sharding is (of course) not a backup technique; it's meant to distribute reads and writes across a cluster.
One tool used for sharding is MySQL Proxy. This is nothing that HScale invented; it's more or less a simple Lua script that distributes reads and writes to different backend servers. There should be plenty of examples on the MySQL Forge.
Another tool (based on MySQL Proxy) is SpockProxy, which is completely tailored towards sharding. They also got rid of Lua and worked on various things to make it faster than the proxy. So far, I have only tested SpockProxy and have never run it in production.
Aside from those proxies, you can also implement sharding yourself. You would need a master table that maps each user to a server, e.g.:
-------------------
| userA | server1 |
| userB | server2 |
| userC | server1 |
-------------------
Then look up the server for each user and direct your reads and writes to it. Not very pretty, but it works. The next obstacle is making it more fault tolerant. So for example, server1, server2 and server3 should each be a small cluster.
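Here is a rough sketch of the do-it-yourself approach with a master table. Everything in it is an assumption for illustration: the dictionaries stand in for a real master table and cluster configuration, the host names are invented, and "small cluster" is interpreted as one primary plus read replicas, which is only one way to do it.

```python
import random

# Hypothetical contents of the master table: user -> shard.
# In practice this would be a table in a small, separate database.
USER_TO_SHARD = {
    "userA": "server1",
    "userB": "server2",
    "userC": "server1",
}

# Hypothetical layout where each shard is a small cluster
# (assumed here: one primary for writes, replicas for reads).
SHARD_CLUSTERS = {
    "server1": {"primary": "server1-primary",
                "replicas": ["server1-replica1", "server1-replica2"]},
    "server2": {"primary": "server2-primary",
                "replicas": ["server2-replica1"]},
}


def host_for(user: str, for_write: bool) -> str:
    """Pick the host to connect to for this user's query."""
    shard = USER_TO_SHARD[user]        # raises KeyError for unknown users
    cluster = SHARD_CLUSTERS[shard]
    if for_write or not cluster["replicas"]:
        return cluster["primary"]      # writes always go to the primary
    return random.choice(cluster["replicas"])  # spread reads over replicas
```

For example, `host_for("userA", for_write=True)` gives the primary of server1, while a read for the same user may land on any of its replicas. Keep in mind that reads from a replica can lag behind the primary, so reads that must see the latest write should go to the primary too.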
And last but not least, another interesting approach to partitioning data and indices across servers is Digg's IDDB. I'm not sure if they ever released its code, but their blog posts give great detail on what it does.
Let me know if this helps!