I realized I’ve left out a major part from my sequence of previous feed-related posts: a justification for why we should bother with a separate feed_sync_id
.
So let’s give it a shot!
The fundamental problem is:
AUTO_INCREMENT
ids are assigned in insertion order, but become visible to other threads in commit order.
To see how this causes a problem, consider the interactions and visibilities between three transactions to the same database:
t0: TRX0: BEGIN; INSERT INTO kv (ns, k, v) VALUES ("-", "k0", "v0"); COMMIT;
t1: TRX1: BEGIN; INSERT INTO kv (ns, k, v) VALUES ("-", "k1", "v1");
t2: TRX2: BEGIN; INSERT INTO kv (ns, k, v) VALUES ("-", "k2", "v2");
t3: TRX0: SELECT MAX(id) FROM kv;
t4: TRX2: COMMIT;
t5: TRX0: SELECT MAX(id) FROM kv;
t6: TRX1: COMMIT;
t7: TRX0: SELECT MAX(id) FROM kv;
Here, we have two transactions that both insert a new kv
record.
The database has to assign an id
value to each of those records, because we might be creating other associations to those records in our application code.
But other threads – TRX0
in this case – shouldn’t be able to see those records until we COMMIT
, and so indeed the SELECT
at t=t3
might return 1
.