cross-dc sync with feed published KV

traviscj

July 10, 2018

It’s been fun describing the feeds framework we use at Square. Today we’ll dive into a concrete problem:

We’ll stick with the feed-published kv table again.
We want two instances of some application code to synchronize bidirectionally: each instance replicates the writes it received to the other.
Eventual consistency is OK.

First, a bit of justification, though: I use this KV table to remember things I might have to look up without the usual context, like my motorcycle’s license plate number, or that one weird Python snippet I can never remember. I also have a whole slew of them at work: random representative IDs for a bunch of things in our systems that I use from time to time. I also use a bunch of these as todo items at work, but that happens to work differently and is a topic for a future blog post :-)

The end goal here is that I should be able to partition my keys into:

1. work-only things (which don’t get replicated to my personal machine at all)
2. work-appropriate personal things (which are bidirectionally synchronized between my personal machines and my work laptop)
3. personal things that I don’t want on my work machine

In practice, I don’t really have any of item (3). But the split between work and personal laptops is a good case study for a system that has frequent network partitions.
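One way to express that three-way partition is a small replication policy. This is only a sketch under an assumption the post does not make: that a key's category is encoded as a prefix on the key itself. `Scope`, `PREFIXES`, `scope_of`, and `should_replicate` are hypothetical names invented for illustration.

```python
from enum import Enum


class Scope(Enum):
    WORK_ONLY = "work-only"          # never leaves the work laptop
    SHARED = "shared"                # synchronized in both directions
    PERSONAL_ONLY = "personal-only"  # never reaches the work laptop


# Hypothetical convention: the scope is encoded as a key prefix.
# Keys without a known prefix are treated as shared.
PREFIXES = {
    "work/": Scope.WORK_ONLY,
    "personal/": Scope.PERSONAL_ONLY,
}


def scope_of(key: str) -> Scope:
    """Classify a key by its prefix, defaulting to SHARED."""
    for prefix, scope in PREFIXES.items():
        if key.startswith(prefix):
            return scope
    return Scope.SHARED


def should_replicate(key: str) -> bool:
    """Only shared keys are ever offered to the other instance."""
    return scope_of(key) is Scope.SHARED
```

With a policy like this, each instance would filter its outgoing feed through `should_replicate`, so work-only and personal-only keys never leave the machine they were written on.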

The basic approach here is to have several kv instances served by feeds

WORK     PERSONAL

that each consume each other’s feeds.
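Here is a minimal sketch of that topology. It assumes, without confirmation from this post, that each instance exposes its writes as a feed ordered by `feed_sync_id` and that a consumer remembers a cursor per peer. `Entry`, `KvInstance`, `fetch`, and `consume` are hypothetical names, and an in-memory list stands in for the real table.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    feed_sync_id: int
    k: str
    v: str


@dataclass
class KvInstance:
    name: str
    entries: list = field(default_factory=list)  # the feed, ordered by feed_sync_id
    cursors: dict = field(default_factory=dict)  # last feed_sync_id consumed, per peer

    def put(self, k, v):
        # Appending assigns the next feed_sync_id, which "publishes" the write.
        self.entries.append(Entry(len(self.entries) + 1, k, v))

    def get(self, k):
        # The most recent local write for a key wins.
        for e in reversed(self.entries):
            if e.k == k:
                return e.v
        return None

    def fetch(self, after, limit=100):
        # Serve the feed: entries with feed_sync_id > after, in order.
        return [e for e in self.entries if e.feed_sync_id > after][:limit]

    def consume(self, peer):
        # Pull the peer's new entries and advance our cursor for that peer.
        cursor = self.cursors.get(peer.name, 0)
        for e in peer.fetch(after=cursor):
            if self.get(e.k) is None:  # keys we already hold are left alone here
                self.put(e.k, e.v)
            cursor = e.feed_sync_id
        self.cursors[peer.name] = cursor


work, personal = KvInstance("WORK"), KvInstance("PERSONAL")
work.put("plate", "ABC123")
personal.put("snippet", "print('hi')")
work.consume(personal)
personal.consume(work)
```

Because a consumed entry is re-published on the consuming side, each write echoes back once. The echo is harmless in this sketch: the originating instance already holds the key and skips it, so the exchange settles instead of looping. A network partition just means `consume` runs later, starting from the saved cursor.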

The only other thing we need is a way to resolve conflicts: cases where a given k already exists locally but the associated v differs. One easy approach is to insert the incoming value under a new key, like

def safe_k(kv, source):
  # Formatting (rather than "+") also works when feed_sync_id is an integer.
  return f"{kv.k}-Conflicted-{source}-{kv.feed_sync_id}"
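To show where a helper like this might sit, here is a hedged sketch of applying one remote entry to a local store. The `KV` record shape, the dict-backed `store`, and `apply_remote` are assumptions made for illustration. Only the key-renaming idea comes from the snippet above.

```python
from collections import namedtuple

# Hypothetical record shape; field names follow the snippet above.
KV = namedtuple("KV", ["feed_sync_id", "k", "v"])


def safe_k(kv, source):
    # Derive a key that cannot collide with the original key.
    return f"{kv.k}-Conflicted-{source}-{kv.feed_sync_id}"


def apply_remote(store, kv, source):
    """Merge one remote entry into a dict-backed store; return the key used."""
    if kv.k not in store:
        store[kv.k] = kv.v  # new key: take it as-is
        return kv.k
    if store[kv.k] == kv.v:
        return kv.k  # same value already present: nothing to do
    conflicted = safe_k(kv, source)
    store[conflicted] = kv.v  # keep both values, the remote one under a new key
    return conflicted


store = {"plate": "ABC123"}
key = apply_remote(store, KV(7, "plate", "XYZ789"), "WORK")
# key is "plate-Conflicted-WORK-7"; store["plate"] is still "ABC123"
```

Since the conflicted key is derived deterministically from the entry and its source, applying the same remote entry twice writes the same key with the same value, so reprocessing after a crash or a reset cursor does no harm. The cost is that nothing is ever merged automatically: both values stay around until someone cleans up by hand.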