How do you partition geospatial data for horizontal scalability? It seems the best option is less to partition and more to define geographic regions and then duplicate the data, so that a query runs against a single region. Otherwise you get awkward borders: a query can land where four tiles meet, so a single geospatial query would have to hit four nodes to collect the data from all four tiles. I wonder how the Google Places API and similar services handle this sort of problem.
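To make the border case concrete, here is a minimal sketch that lists which tiles a bounding-box query touches. It assumes a fixed grid of 1-degree tiles with one tile per node; the function name `tiles_for_bbox` and the `tile_deg` parameter are made up for illustration, and wrap-around at the antimeridian is ignored.

```python
import math

def tiles_for_bbox(min_lat, min_lon, max_lat, max_lon, tile_deg=1.0):
    """Return the set of (row, col) tile ids that a bounding box overlaps.

    Assumes a simple fixed grid (hypothetical layout, not any product's scheme).
    """
    row_lo = math.floor(min_lat / tile_deg)
    row_hi = math.floor(max_lat / tile_deg)
    col_lo = math.floor(min_lon / tile_deg)
    col_hi = math.floor(max_lon / tile_deg)
    return {(r, c)
            for r in range(row_lo, row_hi + 1)
            for c in range(col_lo, col_hi + 1)}

# A query well inside one tile touches a single node.
inside = tiles_for_bbox(10.2, 20.2, 10.4, 20.4)

# The same size query centred on a tile corner touches four nodes.
corner = tiles_for_bbox(9.95, 19.95, 10.05, 20.05)
```

With a small query box, four is the worst case; a box larger than a tile touches more.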
The other potential solution is to overlap data, so that a node also holds the tiles along its edges that belong to the neighbouring nodes. I'm not 100% sure how to handle this, or what the best technology is.
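One way to picture the overlap idea is a fixed margin around each tile: a point is written to every tile whose extent, grown by that margin, contains it. The sketch below assumes 1-degree tiles and a 0.1-degree margin; `tiles_storing_point` and both parameters are hypothetical names chosen for illustration.

```python
import math

def tiles_storing_point(lat, lon, tile_deg=1.0, margin_deg=0.1):
    """Return every (row, col) tile that should hold a copy of the point.

    A tile keeps the point if the point lies inside the tile's extent
    grown by margin_deg on each side (assumed overlap scheme).
    """
    rows = range(math.floor((lat - margin_deg) / tile_deg),
                 math.floor((lat + margin_deg) / tile_deg) + 1)
    cols = range(math.floor((lon - margin_deg) / tile_deg),
                 math.floor((lon + margin_deg) / tile_deg) + 1)
    return {(r, c) for r in rows for c in cols}

interior = tiles_storing_point(10.5, 20.5)    # far from any edge: one copy
near_edge = tiles_storing_point(10.05, 20.5)  # close to one edge: two copies
near_corner = tiles_storing_point(10.05, 20.05)  # close to a corner: four copies
```

The trade-off under this scheme: a query whose radius is no larger than the margin can be answered by the single node owning the query centre, at the cost of extra writes and storage for points near edges.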
Any recommendations welcome. I'm probably looking at the problem wrong. For example, the partition key in a wide-column database (e.g. Cassandra) might be the floored lat and long integers, with a column range over the less significant digits. But maybe there is another way of looking at the data/problem space?
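The key split described above can be sketched like this. It is only the key derivation, not a schema or driver call: the coordinate is converted to fixed-point integers, the whole-degree part becomes the partition key, and the remainder becomes the value a column range would scan. The function name and the four-decimal precision are assumptions for illustration.

```python
def split_key(lat, lon, scale=10_000):
    """Split a coordinate into (partition key, in-partition offset).

    scale=10_000 keeps four decimal places (an arbitrary choice here).
    Floor division keeps negative coordinates in the correct cell.
    """
    lat_fixed = round(lat * scale)
    lon_fixed = round(lon * scale)
    partition = (lat_fixed // scale, lon_fixed // scale)  # floored degrees
    offset = (lat_fixed % scale, lon_fixed % scale)       # less significant digits
    return partition, offset

partition, offset = split_key(37.7749, -122.4194)
```

Note that the offset is a pair, so a range over it is only contiguous along one axis at a time; the four-tile border case still applies at whole-degree boundaries.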
http://www.se-radio.net/2009/07/episode-141-second-life-and-...
My naive intuition is that sharding on two or more axes with some denormalization makes sense: e.g. sharding on both geospatial location and information layers. Infrequently modified elements that overlap several geospatial regions could be stored alongside each. This implies eventual consistency and high availability. On the other hand, some elements might need higher consistency and therefore have lower availability.
Which is to say that the proper architecture is one that allows accurate metrics and high levels of tuning based on actual use and application requirements.
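As a rough sketch of the two-axis idea: the shard key below combines an information layer with a grid tile, and an element whose extent overlaps several tiles is copied into each of them. The 1-degree grid, the in-memory `shards` dict standing in for real nodes, and all names are assumptions for illustration.

```python
import math
from collections import defaultdict

def shard_keys(layer, min_lat, min_lon, max_lat, max_lon, tile_deg=1.0):
    """Return one (layer, row, col) shard key per tile the extent overlaps."""
    rows = range(math.floor(min_lat / tile_deg),
                 math.floor(max_lat / tile_deg) + 1)
    cols = range(math.floor(min_lon / tile_deg),
                 math.floor(max_lon / tile_deg) + 1)
    return [(layer, r, c) for r in rows for c in cols]

# Stand-in for the cluster: shard key -> list of element ids stored there.
shards = defaultdict(list)

def store(element_id, layer, bbox):
    """Write the element to every shard its bounding box overlaps (denormalized)."""
    for key in shard_keys(layer, *bbox):
        shards[key].append(element_id)

# A river segment crossing a tile boundary is stored twice.
store("river-1", "hydrology", (10.8, 20.2, 11.3, 20.4))
# A point feature lives in exactly one shard.
store("cafe-9", "places", (10.5, 20.5, 10.5, 20.5))
```

Updating a duplicated element means writing to several shards, which is where the eventual-consistency trade-off comes from.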
Good luck.