It's been fun describing the feeds framework we use at Square. Today we'll dive into a concrete problem:
- We'll stick with the feed-published kv table.
- We want two instances of some application to bidirectionally synchronize their writes: whatever gets written on one instance should show up on the other.
- Eventual consistency is OK.
First, a bit of justification, though: I use this KV table to remember things I might have to look up without the usual context, like my motorcycle's license plate number, or that one weird Python snippet I can never remember. I also have a whole slew of them at work -- random representative IDs for a bunch of things in our systems that I use from time to time. I also use a bunch of these as to-do items at work, but that happens to work differently and is a topic for a future blog post :-)
The end goal here is that I should be able to partition my keys into:
- work-only things (which don't get replicated to my personal machine at all)
- work-appropriate personal things (which are bidirectionally synchronized between my personal machines and my work laptop)
- personal things that I don't want on my work machine
In practice, I don't really have any of the third kind. But the work vs. personal laptop split is a good case study for a system that has frequent network partitions.
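To make the three-way partition concrete, here's a minimal sketch of a replication filter. Everything in it is my own assumption for illustration: the key-prefix convention ("work/", "shared/", "personal/"), the `Scope` enum, and the `should_replicate` helper are hypothetical names, not part of the feeds framework.

```python
from enum import Enum


class Scope(Enum):
    WORK_ONLY = "work"          # stays on work machines
    SHARED = "shared"           # synchronized everywhere
    PERSONAL_ONLY = "personal"  # stays on personal machines


def scope_of(key):
    """Read the scope from a hypothetical 'scope/name' key prefix."""
    prefix, sep, _ = key.partition("/")
    if not sep:
        return None  # no prefix at all
    try:
        return Scope(prefix)
    except ValueError:
        return None  # unknown prefix


def should_replicate(key, dest_kind):
    """Decide whether `key` may be copied to a machine of kind
    `dest_kind`, which is assumed to be "work" or "personal"."""
    scope = scope_of(key)
    if scope is None:
        return False  # conservative default: unscoped keys stay put
    if scope is Scope.SHARED:
        return True
    return scope.value == dest_kind
```

Defaulting unscoped keys to "don't replicate" is a judgment call; it just seemed safer to leak nothing by accident than to sync everything by accident.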
The basic approach here is to have several kv instances served by feeds that each consume each other.
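Here's a rough in-memory sketch of what "instances that consume each other" could look like. It is a toy model, not the real framework: `KVInstance`, `Entry`, `entries_after`, and `consume` are hypothetical names, the feed is just a Python list, and `feed_sync_id` is assigned inline as a simple counter. Conflicts are deliberately left aside here; a key that already exists locally is skipped.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    feed_sync_id: int
    k: str
    v: str


@dataclass
class KVInstance:
    name: str
    data: dict = field(default_factory=dict)
    feed: list = field(default_factory=list)     # append-only log of writes
    cursors: dict = field(default_factory=dict)  # peer name -> last id consumed

    def put(self, k, v):
        """Write locally and publish the write on this instance's feed."""
        self.data[k] = v
        self.feed.append(Entry(len(self.feed) + 1, k, v))

    def entries_after(self, cursor):
        """Serve every feed entry newer than the consumer's cursor."""
        return [e for e in self.feed if e.feed_sync_id > cursor]

    def consume(self, peer):
        """Pull the peer's new entries and apply the ones we don't have."""
        cursor = self.cursors.get(peer.name, 0)
        for e in peer.entries_after(cursor):
            if e.k not in self.data:
                self.put(e.k, e.v)
            cursor = e.feed_sync_id
        self.cursors[peer.name] = cursor


work = KVInstance("work")
home = KVInstance("home")
work.put("plate", "ABC123")
home.put("snippet", "sorted(d, key=d.get)")
home.consume(work)
work.consume(home)
# Both instances now hold both keys.
```

Note that a consumed write is republished on the consumer's own feed, so it travels back to the peer that originated it. The loop still terminates because the originator already has the key and skips it, and the cursor means each entry is only looked at once.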
The only other thing we need is some method to resolve conflicts when a given k already exists but the associated v is different.
One easy approach is to just insert the conflicting value under a new key, like:
def safe_k(kv, source): return f"{kv.k}-Conflicted-{source}-{kv.feed_sync_id}"
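To show where `safe_k` would fit, here's a small sketch of applying one consumed entry to a local table. The `KV` namedtuple, the plain dict standing in for the table, and the `apply_remote` helper are all assumptions of mine for illustration; `safe_k` is repeated so the block runs on its own.

```python
from collections import namedtuple

# Hypothetical shape of an entry consumed from a peer's feed.
KV = namedtuple("KV", ["k", "v", "feed_sync_id"])


def safe_k(kv, source):
    return f"{kv.k}-Conflicted-{source}-{kv.feed_sync_id}"


def apply_remote(local, kv, source):
    """Apply one entry from `source` to the dict `local`.

    Returns the key that was written, or None if nothing changed.
    """
    if kv.k not in local:
        local[kv.k] = kv.v       # new key: take it as-is
        return kv.k
    if local[kv.k] == kv.v:
        return None              # same value already here: nothing to do
    conflict_key = safe_k(kv, source)
    local[conflict_key] = kv.v   # keep both values side by side
    return conflict_key


local = {"plate": "ABC123"}
written = apply_remote(local, KV("plate", "XYZ789", 7), "home")
# written is "plate-Conflicted-home-7"; local["plate"] is still "ABC123"
```

One nice property of building the key from the source and `feed_sync_id` is that it is deterministic: if the same entry gets delivered twice, it lands on the same conflict key instead of piling up copies. The local value is never overwritten, so sorting out which one is right is left to me.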