Staging Area
Distribution / Replication
09.01.13 (jmkogut) As I was thinking today about distribution I realized that under the current model (single server, single process) it would be difficult to distribute in any way at all. The following is what I’ve come up with.
The erwik cloud can be anything from a single erwik instance on one machine to N erwik instances spread across the Internet. Let’s say you start with a single erwik instance. We’ll call this E1. E1 starts by itself and serves the normal wiki content on E1:8000.
E1 starts to get overloaded, so we create a new instance on a new host (we’ll call this E2). E2 and E1 know nothing about each other, so we’ll have to intervene and tell E2 about E1. E2 tries to form a relationship with E1 (over Erlang’s process message passing), and if it succeeds the two have formed a network with each other.
Users can modify content interchangeably between the two nodes, but the data itself isn’t replicated. Let’s add another node (E3) and form a mesh network, where each node knows about all the other nodes.
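A minimal sketch of the "tell the new node about an existing one" step, assuming plain distributed Erlang is used for the relationship. The module name erwik_join and the return values are hypothetical; net_adm:ping/1 and nodes/0 are standard OTP calls. It also assumes every instance was started with a node name and the same cookie. By default, distributed Erlang connects nodes transitively, so E3 pinging E1 should also end up connected to E2, which gives the mesh described above.

```erlang
-module(erwik_join).
-export([join/1]).

%% Try to attach this node to an existing erwik node (the seed).
%% net_adm:ping/1 returns pong when a distribution connection could be
%% set up and pang otherwise.
-spec join(node()) -> {ok, [node()]} | {error, unreachable}.
join(Seed) ->
    case net_adm:ping(Seed) of
        pong ->
            %% nodes() lists the visible nodes we are connected to right
            %% now; transitive connections may take a moment to appear.
            {ok, nodes()};
        pang ->
            {error, unreachable}
    end.
```

On E2 this would be called as erwik_join:join('erwik@e1host'), where the node name is only a placeholder.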
A user on E1 changes the page titled ‘MyPantsAreMelting’ and saves it. E1 commits that change, then sends a commit notice (with the commit checksum) to all the other nodes in its network (E2, E3). Each of them first checks whether the checksum is a new commit and, if it is, pulls that commit from E1.
Scenario: The connection between E1 and E2 is blazing fast, and the connection between E2 and E3 is blazing fast, while the connection between E1 and E3 is artificially limited to ~2kbps. E1 sends the commit notice to both E2 and E3, but E2 can receive it, pull the commit, and forward the notice to E3 before E1’s own notice ever reaches E3. E3 then discards E1’s notice since it has already pulled the commit from E2.
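The checksum check can be sketched as a pure function over the set of checksums a node has already pulled. The module erwik_notice and its function names are hypothetical, and the sketch assumes the pull itself succeeds (it records the checksum as soon as it decides to pull). demo/0 replays the slow-link scenario from E3’s point of view.

```erlang
-module(erwik_notice).
-export([new/0, handle_notice/3, demo/0]).

%% State: the set of commit checksums this node already holds.
new() -> sets:new().

%% Decide what to do with a commit notice for Checksum sent by From.
%% Returns {pull, From, Seen1} for an unseen commit, or {discard, Seen}
%% when the checksum is already known.
handle_notice(Checksum, From, Seen) ->
    case sets:is_element(Checksum, Seen) of
        true  -> {discard, Seen};
        false -> {pull, From, sets:add_element(Checksum, Seen)}
    end.

%% E3 hears about the commit from E2 first, then from E1 over the slow link.
demo() ->
    C = "bd093b2ba8f5b82c6997bfbe8024c190efb3ba1c",
    {pull, e2, Seen1} = handle_notice(C, e2, new()),
    {discard, _Seen2} = handle_notice(C, e1, Seen1),
    ok.
```

Because the decision depends only on the checksum, the order in which notices arrive does not matter: whichever neighbor gets there first is the one the commit is pulled from.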
Routing
Essentially each node has a complete routing table at any one time. It’ll listen for commit, new_neighbor, and dead_neighbor events and pass them on to everyone in the pool. Each event carries a checksum which is passed around with the update.
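One way to keep the routing table current is to fold each incoming event into the pool. The sketch below uses the {event, Type, Args} tuple shape from the event definition in this section; the module name erwik_pool and the choice of an ordset for the pool are assumptions, not part of the design.

```erlang
-module(erwik_pool).
-export([apply_event/2]).

%% Pool is an ordset of node names (ordsets:new() to start).
%% new_neighbor adds a node, dead_neighbor removes it, and any other
%% event (for example commit) leaves the routing table untouched.
apply_event({event, new_neighbor, Args}, Pool) ->
    {node, Node} = lists:keyfind(node, 1, Args),
    ordsets:add_element(Node, Pool);
apply_event({event, dead_neighbor, Args}, Pool) ->
    {node, Node} = lists:keyfind(node, 1, Args),
    ordsets:del_element(Node, Pool);
apply_event({event, _Other, _Args}, Pool) ->
    Pool.
```

Using a set means a new_neighbor event that arrives twice (once directly, once forwarded) does not create a duplicate entry.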
Here’s an event definition:
    {event, Type, [{argument, Value}, {argument2, Value2}]}

and a few sample events:

    {event, new_neighbor, [{node, 'erwik@dottru.net'}]}
    {event, dead_neighbor, [{node, 'erwik@dottru.net'}]}
    % Commit: http://github.com/zuwiki/erwik/commit/bd093b2ba8f5b82c6997bfbe8024c190efb3ba1c
    {event, commit, [{checksum, "bd093b2ba8f5b82c6997bfbe8024c190efb3ba1c"}, {exclude, ['erwik@dottru.net', 'erwik@zuwiki.net']}]}
    % Exclude is a list of nodes to not forward to

If E1 has a pool of E2, E4, E7, and E9, and a new instance joins E1, then E1 sends the entire pool to that new instance to populate the routing table.
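A rough sketch of how a node might use the exclude list when forwarding a commit event, and how it might hand its pool to a newcomer. Everything here is an assumption layered on the notes: the module erwik_forward, the rule that a node adds only itself to the exclude list before forwarding (so redundant paths like E1 to E2 to E3 still work), and the idea of sending the pool as one new_neighbor event per member.

```erlang
-module(erwik_forward).
-export([forward_plan/3, welcome/1]).

%% Work out who should receive a commit event and what to send them.
%% Self is this node, Pool its routing table, and the third argument
%% the commit event that was received (or created locally).
forward_plan(Self, Pool, {event, commit, Args}) ->
    {checksum, Checksum} = lists:keyfind(checksum, 1, Args),
    Exclude = proplists:get_value(exclude, Args, []),
    %% Skip ourselves and every node already on the exclude list.
    Targets = [N || N <- Pool, N =/= Self, not lists:member(N, Exclude)],
    %% Assumption: we add ourselves so nobody forwards the event back.
    NewExclude = lists:usort([Self | Exclude]),
    Out = {event, commit, [{checksum, Checksum}, {exclude, NewExclude}]},
    {Targets, Out}.

%% Events a node could send to a newly joined instance so that it can
%% populate its routing table from the existing pool.
welcome(Pool) ->
    [{event, new_neighbor, [{node, N}]} || N <- Pool].
```

With a pool of E1 and E3, E2 receiving a commit event that excludes E1 would forward it only to E3, with the exclude list grown to E1 and E2. E3 then has nobody left to forward to.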
