There Is No Now – Problems with Simultaneity in Distributed Systems (acm.org)
215 points by seancribbs on March 11, 2015 | 71 comments


Hallelujah, we need more articles like this. I work in distributed systems, and people hate hate hate hearing the truth. Let's summarize the article:

1. You cannot beat the speed of light.

2. Machines break. Even the most reliable ones.

3. Networks are unreliable. Even local area networks.

4. It is an exciting time for distributed systems: CRDTs, Hybrid Logical Clocks, Zookeeper, etc.

These are things I feel like I've been preaching a lot, but I get upset responses like "Well, certainly globally consistent systems work, most databases are!" Lies, as @rdtsc mentions:

"Keeping an always consistent state in a large distributed [system] you are fighting against the laws of physics."

Next up: that is why master-master replication is important, because at some point your primary will go down (or at least the network to it). I started an interesting thought experiment that turned into a full-on open source project: what if we were to build a database in the worst possible environment (the browser, aka JavaScript and unreliability)? What algorithms would we need to use to make such a system still survive and work?

This is why I chose and worked on solutions involving Hybrid Logical Clocks and CRDTs, which are at the core of my http://github.com/amark/gun database. An AP system with eventual consistency and no notion of "now", as every replica runs in its own special-relativity "state machine" view of the world.
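
For readers who haven't seen one, here is a rough sketch of the usual hybrid logical clock update rules (illustrative only, not necessarily gun's implementation; the wall-clock source and names are assumptions):

    import time

    class HLC:
        """Hybrid Logical Clock: pairs a physical timestamp with a logical counter."""
        def __init__(self):
            self.l = 0  # highest physical time seen so far
            self.c = 0  # logical counter to break ties within the same l

        def _pt(self):
            return int(time.time() * 1000)  # assumed wall-clock source, in ms

        def now(self):
            """Advance the clock for a local or send event."""
            pt = self._pt()
            if pt > self.l:
                self.l, self.c = pt, 0
            else:
                self.c += 1
            return (self.l, self.c)

        def update(self, ml, mc):
            """Merge a timestamp (ml, mc) received from a remote replica."""
            pt = self._pt()
            old_l = self.l
            self.l = max(old_l, ml, pt)
            if self.l == old_l and self.l == ml:
                self.c = max(self.c, mc) + 1
            elif self.l == old_l:
                self.c += 1
            elif self.l == ml:
                self.c = mc + 1
            else:
                self.c = 0
            return (self.l, self.c)

The point is that timestamps stay close to physical time but still respect causal order even when clocks drift.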

These are all interesting concepts, and the article was a good one. I recommend it.


Nice to see another example of applying CRDTs in the browser; I only knew about swarm.js. http://swarmjs.github.io/ They have a few good posts on why you would want to do this.


Keeping an always consistent state in a large distributed system, you are fighting against the laws of physics.

As mentioned, Google did it with their F1/Spanner SQL database. But that also means GPS receivers with antennas on the roofs of data centers.

Which is yet another thing that can fail, and that failure, either by itself or in a cascade of other failures, will lead to unspecified and possibly undesirable behavior.

Recently I see a lot of advocates of dropping NoSQL databases and moving back to Postgres or other SQL databases.

The problem is that SQL and schemas are not the only reason NoSQL databases became popular; they also became popular because they started to have default, better-defined behavior with respect to replication and distribution.

Most solutions don't need that and sticking with a solid single database works very well. But those that need distributed operation have a pretty hard task ahead of them.

One heuristic you can look at is if and how distribution is implemented. If it is something that is bolted on top, like you download some proxy or addon, or added as an afterthought, be very careful. Those things should be baked into the core. For example, does it support CRDTs? Does it have master-to-master replication, and so on. If it claims consistency instead of availability, figure out how it is implemented: Paxos, Raft, or something else.

So far I think Riak is probably the database that thought the hardest and did the most reasonable job. The other, simpler database is CouchDB; it has well-specified conflict-resolution behavior and master-to-master replication, but you usually have to build your own cluster topology. There are probably others, but those are the two I know of first hand.


I'm not an expert, and it sounds like you are, so I appreciate your feedback here:

What do you even mean by a consistent state? Even in theory, a person initiating a new record in Auckland, New Zealand at the same time somebody initiates a change in Gibraltar or London (which are antipodal to the former[1]), 66 milliseconds away, cannot have a confirmation in less than 120 milliseconds, right? So do you just wait for that before declaring 'consistency'? Do you literally add 120 milliseconds to each and every request? (And this is assuming you have a damned good solution to the two generals problem.) I mean, suppose the database tracks something as simple as the number of web page hits. It's a counter. You now distribute it, and have a stochastic process of counter hits between 0 and 5 per second in your largest cities, distributed throughout the world. How can that database ever be consistent?

If there are ten new records per second in New Zealand and ten records per second in the UK, and they potentially depend on each other in some way, are you going to just make everyone wait until everything has been committed and confirmed to be consistent? Or is "a foolish consistency the hobgoblin of little minds", and you really can accept out-of-date data and deal with merging conflicts later?

I just don't understand why we would expect consistency to rank up there, when we deal with a worldwide real-time system where the difference between getting served by a local database in 40 milliseconds and one far away in 250 milliseconds is both staggering and incredibly noticeable. Why be consistent? What is consistency?

[1] http://www.findlatitudeandlongitude.com/antipode-map/


> Do you literally add 120 milliseconds to each and every request?

Yep, you have to add all the delay until you get confirmation from the members of the cluster that the write was written. And you have to have linearizability, so that anyone reading after that (and one can argue what 'that' is and where 'that' is) will now see the new state. If one of the members has failed you could potentially be stuck forever waiting. Now you also have to define how cluster membership and connectivity work, and what the possible states and transitions during membership changes are, coupled with network partitions, coupled with hardware failures.

In other words you are fighting against the laws of physics. It is expensive and hard to do.

In the case of the counters, one should ask whether it is worth it, or whether a CRDT-based counter (that will eventually converge) is good enough.
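
To make that concrete, here is a minimal sketch of a grow-only counter CRDT (a G-Counter); the class and replica names are illustrative:

    class GCounter:
        """Grow-only counter CRDT: each replica increments only its own slot."""
        def __init__(self, replica_id):
            self.replica_id = replica_id
            self.counts = {}  # replica_id -> count

        def increment(self, n=1):
            self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

        def value(self):
            return sum(self.counts.values())

        def merge(self, other):
            """Element-wise max; merging is commutative, associative, and idempotent."""
            for rid, n in other.counts.items():
                self.counts[rid] = max(self.counts.get(rid, 0), n)

    # Two replicas count page hits independently and converge after a merge.
    a, b = GCounter("auckland"), GCounter("london")
    a.increment(3); b.increment(2)
    a.merge(b); b.merge(a)
    assert a.value() == b.value() == 5

No coordination is needed for the writes; the replicas converge whenever they happen to exchange state.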

Even banks are eventually consistent. They choose to be available first. So you can withdraw $100 in New Zealand and then $100 in New York within a short period of time, even if you only have $100 in your account. The inconsistency is handled later, when you get a letter that your account is overdrawn.


I've spent too much of my life on this problem :).

The short version is that you can expose the different read/write modes to the client, and let them decide on what works best for their use case.

For spanner, table 2 in the paper[1] highlights the different interaction modes (read/write transaction, read wait for pending transactions, read skip pending transactions, read at some past timestamp). Each operation gives you a "consistent" view of the data, where "consistent" means that you don't see partial transactions.

It turns out that, for most high-qps web applications, you want a recent consistent view of the data, but you don't care about pending transactions. That means you can read at "now", or at "now - 1 second" without any latency penalty.

Clients doing read/write will care about pending transactions, but you can be opportunistic if the transactions do not overlap. Settling transactions does require quorum synchronization, so you are limited to max(median+1(latency)) milliseconds. These databases are usually keyed by something that minimizes transaction overlap (e.g. you can shard a counter, that kind of thing).

Regional migration of quorums (e.g. a counter for Asia, a counter for Europe) can also be done. So the counter for "Asia" consists of a "segment" (not the actual terminology, but we'll go with it) where the quorum lives mostly in distinct Asia regions. Metadata for the segment (which everything reads at startup, and which is kept up to date) will tell you which servers are responsible.

[1] http://static.googleusercontent.com/media/research.google.co...


There are many systems which can work just fine in an eventually-consistent manner. A database of people (customers, users, etc) is a classic example of such a thing.

In general I think consistency is overvalued. There are plenty of cases where it is important, but lots of people are brainwashed in college to think that all data must be consistent all the time, and that's just not necessary.


>Lots of people are brainwashed in college to think that all data must be consistent all the time, and that's just not necessary.

I knew it! So a foolish consistency is the hobgoblin of little minds.


While I agree strict consistency is probably overkill in many if not most situations, the problem with not having consistency is that it potentially makes the application logic much more complicated.

Take the database of customers: if you don't have consistency, what happens when someone changes the company address and another person simultaneously requests a delivery of something? Do you risk ending up with half of the old address and half of the new one on the parcel?

Note you can certainly have this problem in a consistent system too, e.g. if you make a UI without a save button where the address is changed one field at a time.

Concurrency is just intrinsically hard.


Note that the real world operates like this too: before computers, if someone changes their address and simultaneously sends a package, the package probably will end up at the wrong address. We have a number of mechanisms in place to mitigate this when it occurs (address forwarding, return-to-sender, customer support, credit card chargebacks), but they still don't always work, and sometimes packages just get lost.

The real world solution to this is the acceptance that yes, sometimes bad things happen for no reason at all. I suspect that the computer world will eventually move to this as well, with consumers becoming more tolerant of machines that simply give the wrong answer some of the time, as long as they give the wrong answer less frequently than a human would.


"Lack of consistency" here doesn't have to imply lack of atomicity; In the case of your example, an eventually consistent system could return either the old address or the new address at some point in time X, but never a mixture of both. In your situation, the parcel could end up getting delivered to the old address, but there would be no corruption.


Couldn't agree more. Most people don't know that many critical systems, e.g. banking, are eventually consistent and rely on other methods.

The concepts of CQRS and Event Sourcing really need to be taught at universities.


Given a set of perfectly synchronized distributed clocks you may not even have to wait the 66ms. Incoming transactions (both local and remote) go into the write-ahead log, and their ordering is given by the timestamps (which are consistent, because the clocks are synchronized). Periodic heartbeats from other nodes give you a green light to commit or abort parts of the write-ahead log into the permanent database. All nodes will make the same decisions with respect to each transaction.

You will only have to wait for the committed/aborted response, which cannot arrive faster than 2*66=132ms (and this system can come arbitrarily close to that by increasing the heartbeat frequency).

There is no need to wait any time before running a subsequent transaction, though. Confirmations will flow with a 132ms delay, but there is no limit on transaction concurrency.
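
A minimal sketch of that thought experiment, assuming perfectly synchronized clocks and no failures (assumptions the replies below rightly attack); the names and heartbeat scheme are illustrative:

    import heapq

    class TimestampOrderedLog:
        """Commit write-ahead-log entries once every node's heartbeat has passed them."""
        def __init__(self, nodes):
            self.pending = []                        # min-heap of (timestamp, txn)
            self.heartbeat = {n: 0 for n in nodes}   # latest timestamp heard from each node
            self.committed = []

        def append(self, timestamp, txn):
            heapq.heappush(self.pending, (timestamp, txn))

        def on_heartbeat(self, node, timestamp):
            self.heartbeat[node] = max(self.heartbeat[node], timestamp)
            self._commit_ready()

        def _commit_ready(self):
            # Safe watermark: no node will send anything with an older timestamp.
            watermark = min(self.heartbeat.values())
            while self.pending and self.pending[0][0] <= watermark:
                self.committed.append(heapq.heappop(self.pending))

    log = TimestampOrderedLog(["auckland", "london"])
    log.append(10, "txn-a"); log.append(12, "txn-b")
    log.on_heartbeat("auckland", 15)
    log.on_heartbeat("london", 11)   # watermark = 11, so only txn-a commits
    assert [t for _, t in log.committed] == ["txn-a"]

As noted below, a single missing heartbeat stalls the watermark, which is exactly where the failure-handling pain begins.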


Warning, this is a very dangerous practice. The whole point of the article is that machines do fail and that networks are unreliable. It has a specific section in there noting that even Spanner has a 7ms uncertainty on time.

Therefore: You CANNOT trust timestamps or your clock.


Yes, I know. This is a thought experiment, as there is really no such thing as perfectly synchronized clocks.

Also, despite the need for futuristic clocks, the system lacks resistance to failure (no heartbeat from an arbitrary node -> no transaction gets committed and no transaction gets aborted -> a deadlock). Maybe this is fixable with Paxos, I'm not sure.


Nice points, have to mention Aerospike as another that did a lot of the right stuff with regard to distributed actions and clustering.


> Recently I see a lot of advocates of dropping NoSQL databases and moving back to Postgres or other SQL databases.

This is just from a vocal minority on HN. You just need to look at the facts.

Companies like Mongo, Datastax, Aerospike etc are growing bigger by the day, with increasingly higher valuations. Old school database companies like Teradata are now all about datalakes incorporating Hadoop and Mongo. And technologies like Spark, Impala are now on the front line for many data analytics and processing work.

In the enterprise at least SQL databases are increasingly being relegated to a small part of the whole data pipeline i.e. storing the consolidated, integrated data model.


The advocates of dropping NoSQL and returning to SQL databases are indeed a minority — as the majority never went away from SQL databases, especially for the purposes they fulfil well.

Right tool for the job and all that stuff.


Again, the facts simply don't agree with you. Companies have been moving away from SQL databases in droves compared to the 1990s. And why wouldn't they? Vendors like Oracle, Microsoft, IBM etc have been screwing them over for ages. Low-cost data lakes are the new norm.

Of course I am talking about middle to enterprise companies. I am sure for individuals a typical LAMP setup is still going to be fine. But then again many of those are running their apps in the cloud and hence want a database that is resilient in the face of node outages. Getting this right with MySQL and PostgreSQL is a PITA compared to almost every NoSQL database.


> Again the facts simply don't agree with you. Companies have been moving away from SQL databases in droves compared to the 1990s.

Do you have any links to said 'facts'? Otherwise your comments are just hearsay and anecdote.


Indeed. I suppose it depends on the circles in which one runs but there are still companies actively writing COBOL (or were as of 2011 as far as I can personally verify).

And they were making millions of dollars from that one small segment of the company.

I'd wager that in the enterprise (whatever that means), where NoSQL is used in companies more than, say, 10 years old, it is generally for non-critical, exploratory one-off projects. I don't even think NoSQL is close to 50% market share among what I'll call the silent majority of more conservative, more enterprisey tech companies/IT departments. I have no figures to back that intuition up with, however.


Rear Admiral Grace Hopper (one of the most important pioneers in our field, whose achievements include creating the first compiler) used to illustrate this point by giving each of her students a piece of wire 11.8 inches long, the maximum distance that electricity can travel in one nanosecond.

I made download and print versions of this: http://blog.jgc.org/2012/10/a-downloadable-nanosecond.html
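
The arithmetic behind the wire, as a quick sanity check (a tiny illustrative snippet, not from the article):

    # Distance light travels in one nanosecond, in vacuum.
    c = 299_792_458            # metres per second
    d = c * 1e-9               # metres per nanosecond
    print(d, "m =", d / 0.0254, "inches")   # ~0.30 m, ~11.8 inches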


Thanks, this is a helpful introduction to the history and literature of concepts related to time in distributed systems. Most people's concept of time is quite simple and they need to be broken loose of some intuitively held, but unhelpful beliefs before they can really do engineering with respect to time.

Is there a paper somewhere that new folk should read first? One that includes:

- A tutorial to describe all of the things we think of as 'time', e.g. order of sequence, etc. and their dependence on each other.

- The idea that time as it occurs in the physical world is probabilistic - requiring such descriptors as precision (what is the smallest difference we can discern), and error bounds or probability distributions (how accurately can we describe it).

- And for the concrete thinkers who 'get' that true simultaneity is impossible, an easy to understand example of how we succeed in observing logical coherence, from the scale of a single CPU chip (internally non-coherent, externally consistent) to cross-continent compute clusters?


There was a great presentation given by Tom Van Baak on measuring time, precision, accuracy, etc. at this year's FOSDEM. Abstract and video are available at https://fosdem.org/2015/schedule/event/precise_time/, and he has lots of additional time-related information on his website http://leapsecond.com/.


This article underscores a point I always end up explaining to people who look at cloud computing (implemented as "the lambda architecture" in best-of-breed scenarios) as a golden hammer: it's a good technology, but only for a certain class of problems. You can do things like monitor trends over time and even act on them with soft deadlines using the cloud. But you will never have a cloud technology control the anti-lock braking system on your car. It's a basic consequence of the CAP theorem, really.


Well you could have locally accessed dynamically provisioned computing capacity (cloud technology) in your car doing that. Remote/shared hosting is a different and older thing than cloud technology (though it's a popular use of cloud technology.)


> locally accessed dynamically provisioned

This is a mission critical safety system. There are some standards which are unhappy with dynamically allocating memory in that situation, let alone dynamically allocating the entire compute resource.


That's a good point, though it is different consideration from the response time issue imposed by physical limits of communications round-trip time.


You already do. "Multitasking" covered that decades ago. I'm not sure that's a useful metaphor for cloud technology, though, most especially because multiprocessing systems are exactly where we got our wrong impressions about reliability and the lack of network issues in the first place...


I would be shocked if any ABS system in a car actually dynamically shared resources with anything else.


That is not uncommon at all, and not unsafe if done right. Hard realtime systems exist for exactly that reason. Since you need to dimension everything to worst-case scenarios, the gains tend to not be as big as with soft realtime systems, but there you have it: dynamically shared resources in critical systems.


I can see Vehicle Stability Assistance systems that also do ABS braking, but that's not really dynamic resource allocation so much as a new solution. Ditto for sending information over the car's local network.

Can you give an example where something not related to traction is run on the same CPU as the ABS/traction system?


With the way the world looks now, and the way it's shaping up to be in the future, we should be thinking about designing systems where there is no "now" but rather a notion of "everything syncs soon." There can be a robust consensus reality in such a system, but it has to exist a short interval of time in the past.

I'm currently working on a multiplayer game design on these principles.


Having played a multiplayer mobile game that uses an "everything syncs soon" protocol, while the main mechanic relies on timing your characters' moves against your opponents, I can say it often makes the game highly unpredictable and jittery, with "alternate timeline warps" when it resyncs that nullify important moves.


> Having played a multiplayer mobile game that uses an "everything syncs soon" protocol, while the main mechanic relies on timing your characters' moves against your opponents, I can say it often makes the game highly unpredictable and jittery, with "alternate timeline warps" when it resyncs that nullify important moves.

Nope. Most designers pick a particular mechanic and try to make that exact mechanic work over the network. If you abandon the particular mechanic and aim for meta-goals, one of which might be no jitters, resyncs, or (visible) alternate timelines, then you can eliminate everything you just mentioned above.


If you'd like to share I'd love to hear how you approach solving sync issues with multiplayer time sensitive games.

Starcraft/Warcraft RTS games seem to force every client in the same "room" to slow down if other clients report back bad/slow sync clocks; HL-engine FPS games like CS will penalize single clients if they lag; and the worst I've seen was a mobile game by SNK which seems to rely on, and give too much information to, the client, which has resulted in many users abusing the protocol.


I may do that in coming weeks if my current experiments work out.


If you want to let a mature open source sync system do the heavy lifting, we've recently added Unity 3D support to Couchbase Lite. http://developer.couchbase.com/mobile/unity/

The hello-world example is a game where players drop shapes into a world and they automatically sync / appear on other devices.

Sorry about the plug -- great article Justin! I sent it to my team because I think it does a great job stepping back from the buzzwords and actually talking about the problems distributed databases have to confront.


Interesting. However I'm not just looking for syncing stats and inventory. What I'm talking about is designing a new game mechanic around the realities of distributed gaming, then designing a more robust server around that. The server will be finalizing entity positions, damage, and collisions a half second behind "real-time," and the game mechanic will be designed around this in a way that preserves player autonomy and immediate response. (Basically, deemphasizing aiming while emphasizing tactical dodging where the player seeks to minimize received damage over time.)

> great job stepping back from the buzzwords

Those pesky buzzwords. You were just caught out being misled by one. (sync -- which typically means what you thought it did, but also has a generalized meaning)


In competitive video games, if you give the client a large open window of freedom sooner or later you'll notice that window being used as a door by swaths of your userbase.


I thought long and hard about this. I would love it if there were tons of kamikaze tactics exploiting the additional half second of firing you have after you died. Maybe people could use this to defeat an otherwise powerful and untenable enemy.

Great!

That would mean groups of people were playing my game and cooperating to overcome huge odds in creative ways. (Players could use alts to do this, but there are already plans for a facility to let players build kamikaze drone ships without making an alt, rigged to make this more economical.)


@stcredzero I'm interested in building simulation systems for distributed systems, and trying to "gamify" them to teach people about these concepts in a tactile-playable way. Would you be interested in such work? It is less building a "entertainment" game and more an "educational" fun game. Shoot me an email: mark@gunDB.io


> One of the most important results in the theory of distributed systems is an impossibility result, showing one of the limits of the ability to build systems that work in a world where things can fail.

Layman question. I wonder how important this result really is. This is an impossibility result in a certain model, where processes are deterministic. It's certainly a nice theoretical result, but in practice there are probabilistic algorithms that solve this problem.

I don't know what the probabilistic bounds of probabilistic consensus algorithms are, but if the failure probability is arbitrarily low, the impossibility result for deterministic processes is irrelevant, isn't it?

After all, if we can live with a super low probability of a meteorite destroying the planet, so can we with a good probabilistic consensus algorithm.


This is an excellent overview.

Simultaneity isn't just for machines -- it is necessary for people being connected together online as well. Time sync is a huge part of creating a shared experience, and this will become more widely appreciated as virtual reality develops socially.

We solved this in a very limited way at rapt.fm for our timed rap battles. We maintained a shared clock tick (with adjustment), allowing UI and in-game events -- e.g. the beat kicking in -- to happen somewhat simultaneously across browsers. This helped make up for the latency of video, and created a feeling that people were together at the same "place".


:) for the dude eating from a bowl of cereal at the end of the rapt.fm landing page background vid


Yes, that is me! We had a lot of good times in downtown Det.


I really enjoy the perspective that "now" is often a useful abstraction for certain types of processes; the trouble is that "now" turns out to be one hellishly leaky abstraction. My perspective coming from biology is that for many systems the only meaningful type of "clock" is a logical clock. The important thing is not "when": when "when" is used as a proxy for the assumed state of a remote part of the system (even if remote is only 10cm away), logical clocks are the only source that can guarantee that the state of the system is what you expect it to be, so that it will perform as expected. Thanks to the many hardware guys who have spent years working out the underlying logic for this, we mostly get to ignore it for things like processors. Now we just need to solve it for arbitrarily large finite delays! This also reminds me of a very funny (or depressing) read on systems engineering by James Mickens [1].

1. http://research.microsoft.com/en-us/people/mickens/thenightw...


Another fun exercise is dealing with players who say "OMG teh netcode sux" for online first-person shooters, especially when they have patently unrealistic expectations for how well the software should break the speed of light.

Sometimes the hardest part is getting them to understand exactly how much of what they take for granted is an illusion... Often even before any packets leave their machine.


It's less the speed of light at fault than other factors, though. Bufferbloat is terrible, and it's hard to make netcode that reacts well when a few percent of packets are lost or slow.


Sure, other factors predominate, but even in a significantly-more-ideal world, a signal from LA to NY is still going to be a 40ms round-trip. (3940km one-way along the surface of the earth, 200,000km/s signal speed in the glass.)

That's still more than enough time to require algorithmic trickery from games in order to provide the illusion of "real-time" gaming over the internet.
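
A quick back-of-the-envelope check of those numbers (illustrative snippet only):

    distance_km = 3940       # LA to NY along the surface, roughly
    speed_km_s = 200_000     # light in fiber travels at ~2/3 of c
    one_way_ms = distance_km / speed_km_s * 1000
    print(one_way_ms, "ms one-way,", 2 * one_way_ms, "ms round-trip")  # ~19.7 ms / ~39.4 ms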


Think about how many console games run at 30fps. Then consider that a fully framebuffered game running at 30fps, even under ideal conditions, has 70ms of latency by the time a frame finishes displaying. 40ms isn't a big deal.


Even assuming a perfect computer with zero input latency and zero display latency the network part still matters when it comes to knitting an area like "North America" into a game region. It feeds into longer causal chains.

For example, consider two players in LA both using an NY server over the dedicated zero-overhead fiber-optic network mentioned above, with a relatively modern game-networking stack.

(1) Player A raises an energy-shield t=0 ms. (Client-prediction.)

(2) The server agrees as of t=20.

(3) Player B shot an instant-travel hitscan weapon at t=39, while the victim was still exposed on his screen.

(4) The server gets the shot-message at t=59, and honors it (Latency Compensation), sending a damage/death message out to Player A.

(5) Player A receives the news of his damage/death at t=79.

Even in that world of unattainably-good equipment, that's 80ms of "wait, that doesn't look right".
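
To make step (4) concrete, here is a minimal server-side rewind sketch of latency compensation (illustrative only; the class name, history window, and tick values are assumptions, not any particular engine's API):

    class LagCompensator:
        """Server keeps a short history of positions and rewinds to what the shooter saw."""
        def __init__(self, history_ms=500):
            self.history_ms = history_ms
            self.snapshots = []  # list of (server_time_ms, {player_id: (x, y)})

        def record(self, t, positions):
            self.snapshots.append((t, dict(positions)))
            # Drop snapshots older than the history window.
            self.snapshots = [(ts, p) for ts, p in self.snapshots if t - ts <= self.history_ms]

        def rewind(self, t):
            """Return the newest snapshot at or before time t (what the shooter saw)."""
            older = [(ts, p) for ts, p in self.snapshots if ts <= t]
            return max(older, key=lambda s: s[0])[1] if older else {}

    # Validate the shot against the world the shooter saw one-way-latency ago.
    comp = LagCompensator()
    comp.record(0, {"A": (0, 0)}); comp.record(20, {"A": (5, 0)})
    seen_by_shooter = comp.rewind(39 - 20)   # shot fired at t=39, one-way latency ~20ms
    assert seen_by_shooter["A"] == (0, 0)    # server honors the hit on the old position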


Lag compensation has flaws, but you don't have to lag compensate. You can delay player input until the server has processed it and responded. Even better, if you're clever you can delay player input by half the ping time.

Edit: Also don't double your ping by putting relay servers only on one edge of the country.


One piece of warning from the article regarding "last write wins":

>..., this is really a "many writes, chosen unpredictably, will be lost" policy—but that wouldn't sell as many databases, would it?


Spanner is externally serializable so you do get a 'now'. You just don't know what the agreed now was until after the write.

The idea that there is no 'now' is, of course, preposterous; we have very strict laws of physics supporting the concept of now (sans relativity), and eventually our engineering will be able to track that very accurately. Spanner is a step on that journey.

These kinds of 'impossible' articles will appear very dated in 10 years' time, as they are really exaggerating the rules of thumb of the previous 10 years.


> we have very strict laws of physics supporting the concept of now (sans relativity)

Laws of physics "sans relativity" aren't the actual laws of physics in our universe. There very much is no now except the now that is also here. It's quite accurate to say that simultaneity does not exist in distributed systems, and simultaneity is less valid as even an approximation the more widely distributed a system is.


To add some context & numbers to dragonwriter's point, the speed-of-light delay from New York to San Francisco is about 21ms [1]. This is about 5 disk seeks, 1200 random SSD reads, 200K main memory reads (without caching), or 10M CPU cycles [2]. Speed of light delays absolutely matter in a distributed system.

[1] http://chimera.labs.oreilly.com/books/1230000000545/ch01.htm...

[2] http://www.eecs.berkeley.edu/~rcs/research/interactive_laten...


That's transmission delay, not relativistic effects. The idea of Spanner is that you have a timestamp of when it was decided to commit. The quorum members know they can't contact each other quickly, but they trust each other's timestamps and resolve conflicts based on the trusted commit times (which are accurate through hardware). Transmission delays don't undermine the fact that there is a very real concept of ordered time in the physical world which is exploitable[1]. Spanner exploits it faster than transmission delays, but with a clock error of 10 ms or something. We have better clocks than that, so it's probably going to improve...

[1] sans relativity effects which are TINY, and not the limiting factor at the moment.


Furthermore, at the timescales (nanoseconds) and distances (meters to thousands of kilometers) at which a distributed application operates, relativity is a much better approximation to reality than the intuitive Newtonian physics.

If 3D chip designs let us take a couple racks of hardware into a few centimeters, then perhaps there will be a more Newtonian "now" possible, but it seems that once we can do that we'll just want to build even bigger systems that can analyze more data, and get back to distributed systems again.


Delays do not affect ordering.

Relativity means that observers with different velocities view timings differently.

Mere distances do not matter.

Pick a reference frame for your protocol, and relativity stops being a problem. (Hint: If all endpoints are on Earth you probably won't have enough precision to even need to compensate for relativistic effects.)


Delays alone do affect the ordering perceived by receivers at different locations.


Only for a very misleading definition of ordering. Delays affect the order in which receivers see events, but the receivers can compensate for transmission factors and calculate the exact same times for all events.

Except when relativity kicks in from the observers moving at different speeds. Now events that are outside each other's light cones have no objective order.


That definition of ordering is of course the standard, everyday definition. And to guarantee correct compensation for transmission factors in the same frame, do we not require perfectly synchronized local clocks: impossible in general?


>That definition of ordering is of course the standard, everyday definition.

I would disagree and say the standard definition is the order things happen in, not when you feel the effects. There's rarely a difference in practice, though, because people usually experience things via sight at a distance of miles at most. And in one of the few places where it differs, astronomy, people talk all the time about how events 'actually' happened a massive amount of time in the past.

But that doesn't matter. We're talking about relativity. Order, when talking about relativity, clearly means the timing of the events themselves, not the reception of the events.

You need such a definition before you can understand the ways even that can vary per observer. And importantly, the specific ways it can't vary per observer.

>And to guarantee correct compensation for transmission factors in the same frame, do we not require perfectly synchronized local clocks

Not at all. We do need to agree on a same reference frame (because different reference frames give different rates of time and different distances). Thankfully everyone on earth is in the same reference frame to a very very precise degree. And if that's not good enough you can do a few calculations to get nearly perfect compensation for different altitudes and latitudes and such.

Our clocks don't have to be in the same century to say that light A went on before light B. You just take the point in time you saw each fiber light up and subtract the fiber's length divided by the propagation speed. And then you could set both clocks to be in sync by using A or B as a reference point, if you wanted.
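
A tiny illustration of that compensation (the fiber lengths and times are made-up values):

    v = 2.0e8  # propagation speed in fiber, m/s (~2/3 of c)

    def event_time(t_observed, fiber_length_m):
        """Recover when the far end actually lit up, given when we saw it."""
        return t_observed - fiber_length_m / v

    # We saw B first, but once transmission delay is removed, A actually happened first.
    t_a = event_time(t_observed=0.0150, fiber_length_m=2_000_000)  # 10 ms of fiber delay
    t_b = event_time(t_observed=0.0140, fiber_length_m=500_000)    # 2.5 ms of fiber delay
    assert t_a < t_b   # 5.0 ms < 11.5 ms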


The error bars on relativistic effects we can expect to experience on earth are picoseconds. So we can pack millions of transactions per second if we get our clock syncs correct and reliable. Future DBs will not be like the ones we have now.


> Another such area of work is logical time, manifest as vector clocks, version vectors, and other ways of abstracting over the ordering of events. This idea generally acknowledges the inability to assume synchronized clocks and builds notions of ordering for a world in which clocks are entirely unreliable.

Hardware is unreliable. Software is possibly less reliable. We have known that for a long time. The author talks on a conceptual level about logical time, but this concept isn't enough to understand the real challenges and possible solutions for keeping interactions in your system logically ordered in the dimension of time[0].

> You can think of coordination as providing a logical surrogate for "now." When used in that way, however, these protocols have a cost, resulting from something they all fundamentally have in common: constant communication. For example, if you coordinate an ordering for all of the things that happen in your distributed system, then at best you are able to provide a response latency no less than the round-trip time (two sequential message deliveries) inside that system.

Consensus protocols don't provide a logical surrogate for 'now'; a log does that. The silver bullet for assuring that your transactions are ordered correctly is immutability[1]: "If two identical, deterministic processes begin in the same state and get the same inputs in the same order, they will produce the same output and end in the same state.[0]" It's important, from the perspective of the implementor, to understand that there are multiple pieces to this puzzle, and that each protocol has very specific details that can make or break the reliability and performance of a distributed system. This is similar to how a small bug in your cryptography code can expose the entire system to threat. Paxos itself can be implemented in a myriad of ways, and each decision the implementor makes must be well researched.

[0]http://engineering.linkedin.com/distributed-systems/log-what...

[1]http://basho.com/clocks-are-bad-or-welcome-to-distributed-sy...
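
The determinism principle quoted above can be seen in a few lines (an illustrative sketch, not from either linked article):

    # Two replicas that start identical and apply the same ordered log end up identical.
    def apply(state, event):
        key, delta = event
        state = dict(state)                 # pure function: never mutate the input
        state[key] = state.get(key, 0) + delta
        return state

    log = [("x", 1), ("y", 2), ("x", 3)]    # an immutable, totally ordered log of inputs

    replica_a, replica_b = {}, {}
    for event in log:
        replica_a = apply(replica_a, event)
    for event in log:
        replica_b = apply(replica_b, event)

    assert replica_a == replica_b == {"x": 4, "y": 2}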


I suggest reading this article for some insight if you haven't built out a distributed system. Good hands on, practical information: http://engineering.linkedin.com/distributed-systems/log-what...


Reminds me of this (absolutely spectacular) talk from 2009 by Rich Hickey: http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hic...



Doesn't appear to mention Lamport Timestamps (https://en.wikipedia.org/wiki/Lamport_timestamps) (Yes, THAT Lamport), which are one of the most elegant mechanisms to deal with some of the discussed problems.
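
For anyone who hasn't seen one, a Lamport clock fits in a few lines (an illustrative sketch; the method names are my own):

    class LamportClock:
        """Logical clock: a counter that captures the happened-before relation."""
        def __init__(self):
            self.time = 0

        def tick(self):
            """Local event or message send."""
            self.time += 1
            return self.time

        def receive(self, remote_time):
            """On receiving a message stamped with the sender's clock."""
            self.time = max(self.time, remote_time) + 1
            return self.time

    a, b = LamportClock(), LamportClock()
    sent = a.tick()     # a's event gets timestamp 1
    b.receive(sent)     # b jumps to 2, preserving the causal order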


> Another such area of work is logical time, manifest as vector clocks, version vectors, and other ways of abstracting over the ordering of events.

Vector clocks and version vectors are variations on the Lamport clock concept (and that quote is from a paragraph that mentions Paxos, another Lamport invention, cited in the bibliography).



