Where do Github users live? WebGL visualization (aasen.in)
164 points by hawkharris on Sept 30, 2013 | 88 comments


I've been fiddling around with a realtime(ish) geo visualization of github updates. Fans of the movie WarGames might enjoy the theme:

http://streams.robscanlon.com/github

It's a work in progress!


I love that you're using IRC as the messaging protocol.


Way more fun than redis. I actually like to sit in the channel and watch the data fly by (it's color-coded, so it's actually quite pretty). /connect irc.robscanlon.com /join #github or /join #wikipedia. You can create your own channel and stream your own data as well (it'll automatically create a URL on my site for you with that WarGames theme). I'm working on other themes that you can choose from.

And please don't bug the web-* bots that are hanging out in the rooms... they are the actual web servers and are a bit busy at the moment ;-)


Maybe a button that adds some ominous music :)

http://www.youtube.com/watch?feature=player_detailpage&v=uCW...


Your US labels are displayed on the map of Australia and vice-versa unless I'm misreading it.


Unfortunately I didn't have a chance to finish getting this completely working properly in FF. I assume that's the browser you were using (sorry about that).


Oh, no problem. It's a great web app. I just thought you probably accidentally reversed a couple of lines of code and would want to know about it. And, yes, I was using FF 24.


This is a cool take on the data!


This is really cool!


cool!


It stacks what I assume to be all unresolved Russian locations to a column the size of Saint Petersburg in the middle of Siberia.

Same thing with China with a large spike in the middle of Gobi desert.

Same with Canada - apparently some lumberjacks code when it snows too hard and they can't get out!

I wonder if there is a similar column in the USA and where is it located.

India is surprisingly bare. Come on, you can do better than that.

P. S. Went to update my Location: in github.


Author here, sorry about the spikes where people do not actually live. A lot of users put only their country and not city, and I didn't want to throw that data away.

I probably should've checked the specificity of the location and only thrown out the data for large countries like Russia and China, or distributed it across all other data points in the country.

Interesting project if you want to fork it!
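The second option above, spreading country-only entries across the data points already known in that country, could be sketched roughly like this in Python. This is not the code behind the visualization: the function name `redistribute`, the dictionary layout, and the sample numbers are all made up for illustration. It weights each city by its existing count instead of splitting evenly, which is an assumption on my part.

```python
from collections import defaultdict


def redistribute(city_counts, country_only_counts):
    """Spread country-only user counts over that country's known cities.

    city_counts: {(country, city): users}
    country_only_counts: {country: users who gave only the country}
    Each city receives a share proportional to its existing count.
    """
    # Total users per country that already resolved to a city.
    totals = defaultdict(int)
    for (country, _city), n in city_counts.items():
        totals[country] += n

    result = {}
    for (country, city), n in city_counts.items():
        extra = country_only_counts.get(country, 0)
        share = n / totals[country] if totals[country] else 0.0
        result[(country, city)] = n + extra * share
    return result


# Hypothetical numbers, not taken from the real data set.
cities = {("RU", "Moscow"): 300, ("RU", "Saint Petersburg"): 100, ("CN", "Beijing"): 200}
country_only = {"RU": 400}
adjusted = redistribute(cities, country_only)
```

Because the shares are proportional, the relative heights of the cities inside a country stay the same. Countries that have only country-level entries and no city at all are simply dropped by this sketch.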


Distributing it across should work well. Throwing it out, not so much, because you'd lose a large and varying chunk of the data.


It would be nice if there were an easy way to hit the population center instead of geographic center of a nation in webgl-globe, at least. That big spike in the middle of Siberia puzzled me at first.
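As far as I know webgl-globe has no such option, but a population-weighted center is easy to compute yourself if you have a list of cities with populations. A rough Python sketch, where `weighted_center` and the sample points are hypothetical; it averages unit vectors rather than raw latitude/longitude so that countries straddling the 180° meridian do not break it:

```python
import math


def weighted_center(points):
    """Weighted center of (lat_deg, lon_deg, weight) points on a sphere.

    Converts each point to a 3D unit vector, sums the vectors scaled by
    weight, and converts the sum back to latitude/longitude in degrees.
    """
    x = y = z = 0.0
    for lat, lon, weight in points:
        phi, lam = math.radians(lat), math.radians(lon)
        x += weight * math.cos(phi) * math.cos(lam)
        y += weight * math.cos(phi) * math.sin(lam)
        z += weight * math.sin(phi)
    lon_c = math.degrees(math.atan2(y, x))
    lat_c = math.degrees(math.atan2(z, math.hypot(x, y)))
    return lat_c, lon_c


# Two made-up "cities" on the equator; the heavier one pulls the center toward it.
cities = [(0.0, 0.0, 3.0), (0.0, 90.0, 1.0)]
lat, lon = weighted_center(cities)
```

With equal weights the result would sit halfway between the two points; with the weights above it lands much closer to the first one.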


This is the same spot Google Maps uses for its pins. For example, if you just type "Quebec", it hits the same area, hundreds of miles from the nearest town.


Take all the unknowns and spread them evenly among all the known cities.


This would falsely flatten the results.


Ah, I was wondering why there was a large column in the middle of Australia. It actually lines up reasonably well with the only real population center there (Alice Springs), but at ~25k people, it seemed unlikely.


I was more surprised by the smaller spike in the middle of Western Australia.


The "unresolved" column for the USA appears to be in Wichita, Kansas, or around that area.



How'd you get the state borders? That's exactly what I need.


http://www.gadm.org/

That has the administrative boundaries for every country.

The relevant project from the US government itself is here: https://www.census.gov/geo/maps-data/data/tiger.html

You might want to run a simplifying algorithm on the shapes first, though: they're the official boundary coordinates, so there's an absurd number of points on coastlines and the like, which makes point-in-polygon lookups very slow.

It has the stuff for doing your own geocoding as well. You can install it as a postgis extension and run geocoding SQL queries in postgres. It's an absolute joy.
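To make the simplify-then-lookup idea concrete, here is a small pure-Python sketch. The comment above doesn't name an algorithm; Douglas-Peucker is just one common choice, and the function names, the tolerance, and the sample shape are made up. It also treats coordinates as flat x/y, which is only an approximation for longitude/latitude. In PostGIS you would more likely reach for something like ST_Simplify.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b, in planar coordinates."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [first, last]
    # Keep the farthest point and recurse on both halves.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


def point_in_polygon(point, ring):
    """Ray-casting test; ring is a list of (x, y) vertices, implicitly closed."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# A made-up square whose bottom edge has two nearly collinear extra points.
detailed = [(0, 0), (1, 0.01), (2, -0.01), (3, 0), (3, 3), (0, 3)]
simplified = simplify(detailed, tolerance=0.05)
hit = point_in_polygon((1.5, 1.5), simplified)
miss = point_in_polygon((5, 1.5), simplified)
```

The lookup cost grows with the number of vertices, so dropping the near-collinear points is where the speedup comes from. Points very close to a border can change sides after simplification, so the tolerance is a trade-off.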


Probably an easier solution, but I edited globe.js in the JS console to use an image with the state borders[1] that I found used on another globe

[1]: https://s3.amazonaws.com/SocialHarvest/world.jpg


submit a PR?


I demand Kenya province borders too, then.


Ah! This makes sense. My first thought was "wow, impressive showing in St. Louis..."


Is there a big AT&T datacenter there or something? I've shown up as coming out of there on so many Geo-IP sites over the years, DSL, cellular, T1's...


I'd wager it's just the geographical center of the US.


This also prompted me to add my location in github.

It would be interesting to see if there's a non-negligible difference to run this again in a few days, after everyone who gets word of it also adds in their location.


Here's a quick 2d visualization of the data using D3.js. The advantage of a 2d projection is you can see all parts of the globe at once and have quantitative encodings that aren't distorted by projection effects. In this case, color+radius encode the magnitude variable, but there is overlap of adjacent circles.

http://bl.ocks.org/syntagmatic/6769077


Very cool. Also, I didn't know about bl.ocks.org!


It takes a minute, for my brain at least, to realize that the darker bits are the landmasses.

I unconsciously expected the opposite.


I had the same problem. Looked around for about 10 seconds wondering why I couldn't recognise anything.


Oh man, I thought it was just me!! Sat here for a while in quiet shame after I realized it and thought there must be something wrong with me :P


I think it's because the seas are textured while the land is all the same color, which is the opposite of how it is in reality.


Definitely poorly designed.


Good 3d visualization is tricky, and this is a good example of a 3d fail. The 3d here does more harm than good: it makes it harder to see the whole data set and harder to compare values.


As someone else mentioned, there's a 2D version as well (http://bl.ocks.org/syntagmatic/6769077).

Personally I loved the 3D version. I found it interesting and fun.


Hard to tell which are continents, and which are oceans.


It would be very exciting to normalize the GitHub user numbers by the population of the area they map to. Right now, especially in Europe, the map seems to pretty much track the population of cities.
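The normalization itself is a one-liner once you have population figures. A minimal Python sketch, with a hypothetical function name and placeholder numbers rather than real city data:

```python
def users_per_100k(user_counts, populations):
    """GitHub users per 100,000 inhabitants for each place.

    Places with no (or zero) population figure are left out rather
    than guessed at.
    """
    rates = {}
    for place, users in user_counts.items():
        population = populations.get(place)
        if population:
            rates[place] = users * 100_000 / population
    return rates


# Placeholder figures for illustration only.
users = {"city_a": 500, "city_b": 500, "city_c": 40}
population = {"city_a": 1_000_000, "city_b": 250_000}
rates = users_per_100k(users, population)
ranked = sorted(rates, key=rates.get, reverse=True)
```

Two cities with the same raw count end up far apart once population is taken into account, which is the point of the exercise.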



Is it really? If you look past Western Europe and North America it is definitely not. And even there you can see things, like that spike in Bay Area. Except for population size, there seem to be at least two more major factors at play: wealth and English proficiency. Even in Europe they are clearly visible: Oslo and Sofia are about the same population, Rome is at least three times more populated than Edinburgh. So, no, not really.


+1

It does correlate quite a bit to city population, although, the spike in what I presume to be Austin, relative to Dallas or Houston is quite telling.


That was the biggest surprise to me. I know Houston has over twice as many people but I didn't expect the numbers to be that large of a difference.


I thought the same when looking at the US, but then I noticed Los Angeles. L.A. is the 2nd largest city in the US, and the smaller spike, compared to its population, is consistent with what I've heard about the tech scene there.


It took a while until I realized that the black parts were land and gray was water.

Am I the only one who feels like it should be the reverse?


No, it was the same for me. Seas are dark blue while landmasses are white near the poles, green in most other places and sandy yellow in deserts - light colours. It makes me expect lands to be light and seas to be dark, even on a stylised monochrome globe.


Nope. I was wondering what kind of planet it was.


Any thoughts on what the European location southwest of Berlin is? It seems to be between Hannover and Leipzig, but there are no major cities there, and it's strange that it's bigger than Berlin.

EDIT: Could be people who just wrote "Germany".


Well, it could be my coworkers here in Göttingen, but given that this lovely little town is pretty much in the middle of the country, I'm going with your post-edit hypothesis. Thanks for the hint, though, I was scratching my head about some of the other points I could not attribute to any city known to me.


>EDIT: Could be people who just wrote "Germany".

I think that's it - there's a similar peak right in the middle of Australia with the nearest town being Alice Springs, but I highly doubt commits on the level of Brisbane come from there.


Hmm, would have been nice if the great lakes were still in place in North America...they're very useful landmarks for us frozen chosen in the American tundra (read: New York, Michigan, Wisconsin, Minnesota, South Ontario).


Hmm, I'm a little suspicious of the accuracy of these numbers...

I looked at Japan, and there's a very high bar for Tokyo, which is expected (and a shorter line right next to it which is probably Yokohama)—and then there's a slightly shorter but still pretty extreme bar that looks like it's smack dab in the middle of Nagano prefecture, which is relatively speaking, the boonies. There seems to be almost nothing for Osaka and other big Japanese cities....


I find it very hard to read because you cannot see the height of the bar from above. And when you can see the bar, then you can't see the country.


This is pretty cool! However, I'm going to put on my Edward Tufte hat and point out that it's probably best to serve up this data in a static table. It always disappoints me to see sparse, categorical data visualized on a map since there's no additional context one achieves from the geospatial components. OP: maybe you could explore visualizing continuous fields, such as the Earth gravity field: http://www.csr.utexas.edu/grace/gallery/gravity/ggm01_asia_f...


While it looks cool, it only takes you a second to realize that this is kind of a worthless visualization because to see the height of the bar you have to spin the globe so the line is parallel to the view plane. If you are looking straight down, you only get the color.

Remember, 3d isn't 3d, it's 3d projected on a 2d plane. And if this were real stereoscopic 3d, then each of these datapoints would be like the 3d gimmicks in movies, like a spear right to your face.

That said, I've got this bookmarked because I think at its base it could be useful for other visualizations.
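The point about bar height can be put in numbers. Assuming an orthographic camera (the real demo most likely uses a perspective camera, so this is only an approximation), a bar's on-screen length is its true length times the sine of the angle between the bar and the viewing direction. A tiny Python sketch with a made-up function name:

```python
import math


def projected_bar_length(bar_height, angle_from_view_axis_deg):
    """On-screen length of a bar under orthographic projection.

    angle_from_view_axis_deg is the angle between the bar and the
    viewing direction: 0 means looking straight down the bar.
    """
    return bar_height * math.sin(math.radians(angle_from_view_axis_deg))


head_on = projected_bar_length(2.0, 0)    # bar points at the camera
side_on = projected_bar_length(2.0, 90)   # bar parallel to the view plane
```

Looking straight down a bar, the projected length is zero and only the color is left; the full height is visible only when the bar is parallel to the view plane.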


I also used the same data in a 3D visualization with the Oculus Rift. You can check out a demo here: http://youtu.be/dgMOdzfoPgs?t=31s


Wow - nice work! Are there really that many contributors in Alice Springs or is the big spike in the middle of the country just a generic "Australia" stat?


Yeah it seems odd that there are more people in Alice Springs than in Perth. Must just be "Australia".


I don't think it's Alice Springs. It's probably all the users who wrote "Australia" as their location, plus Alice Springs is actually a little further north than this bar.


If you're wondering why Andreessen Horowitz felt github was worth the 100M dollar investment, look no further than India and China. These massive populations are yet to be tapped for the most part leaving a great opportunity for future growth. For a company that has been profitable from the get-go, things are only going to get a lot better in the long run.

Awesome job on the visualization btw.


I know we're all guessing here, because I assume you don't work at Andreessen Horowitz, but I don't agree with the bet you're making. When it comes to online services, China has a tendency to adopt their own versions of services started elsewhere. For example, Google is dying hard in China [1]. The players in the social media scene are similarly unfamiliar to users outside of China, with Weibo leading (I'm lacking citation here).

If Github sees China as an opportunity, they'd better move quick, and look very deeply at what has allowed services like Baidu and Weibo to trounce American-born competitors in their market.

[1] http://www.techinasia.com/china-baidu-qihoo-google-search-ma...


It looks like this awesome visualization was done with Three.js, but I wanted to flag Cesium[1] to anyone wanting to do 3D 'map stuff' in a browser. It has some great features, like built-in support for coordinate reference systems (hard) and a WMS/WMTS client. We got a lot done with it really fast.

[1]http://cesium.agi.com/


My GitHub data challenge entry was in a similar vein. The data is stale now, but it does break it down by project and language, which can form some interesting hotspots.

The globe is a really nice touch and you did a good job on canonicalization/grouping/binning.

http://davidfischer.github.io/gdc2/


Seems curious to me that (at least by visual comparison) London looks to be about 30% larger than NYC, and NYC appears to be about the same as Paris. By population, you'd expect NYC = London > Paris. Is NYC really still that far behind in its tech scene? Does Paris have that large a tech scene? Or am I just reading the chart poorly?


It's not showing my data from Antarctica.


It only shows the 1000 most common locations. Sorry!
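For anyone curious what a "top N locations" cutoff looks like in practice, here is a minimal Python sketch. It is not the author's code: `top_locations` is a hypothetical name, and lowercasing plus trimming whitespace is just an assumed way of grouping spellings of the same place.

```python
from collections import Counter


def top_locations(raw_locations, limit=1000):
    """Count free-text location strings and keep the most common ones.

    Empty and missing values are skipped; the rest are trimmed and
    lowercased so that "Berlin" and "berlin " count as one place.
    """
    counts = Counter(
        loc.strip().lower() for loc in raw_locations if loc and loc.strip()
    )
    return counts.most_common(limit)


# Made-up profile locations, including an empty and a missing one.
sample = ["Berlin", "berlin ", "Tokyo", "", None, "Antarctica"]
top_two = top_locations(sample, limit=2)
```

Anything below the cutoff, such as a lone entry from Antarctica, never reaches the globe.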


The real story this map tells is about the Digital Divide between the First World and Third World.


The usual problem with this kind of visualization is that, as you add more data, you get what is basically just a population density map.

What would be interesting is to see where there are more Github users than you would expect by pure density of internet users.
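One way to express "more users than you would expect" is the ratio of observed users to the number you would get if users were spread in proportion to internet users. A rough Python sketch; the function name and the figures are invented for illustration, and it assumes you have an internet-user estimate for every place:

```python
def over_representation(user_counts, internet_users):
    """Ratio of observed GitHub users to the count expected from
    each place's share of internet users.

    A ratio above 1 means more users than the internet population
    alone would predict; below 1 means fewer.
    """
    total_users = sum(user_counts.values())
    total_online = sum(internet_users[place] for place in user_counts)
    ratios = {}
    for place, users in user_counts.items():
        expected = total_users * internet_users[place] / total_online
        ratios[place] = users / expected
    return ratios


# Invented figures: two places with equally many people online.
users = {"place_a": 300, "place_b": 100}
online = {"place_a": 1_000_000, "place_b": 1_000_000}
ratios = over_representation(users, online)
```

Plotting these ratios instead of raw counts would flatten the places that are merely big and highlight the ones that are unusually active.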


This is not the case. Examples: NYC vs SF and Madrid vs Berlin. Well, Madrid is bigger and more dense than most of the cities with more github users in this map.


Great visualization, but it really needs labels or some other means of figuring out what city the columns are referring to.

What is the third large spike on the East Coast? I assume the two more Northern spikes are Boston and New York.


Washington, DC


Lane, I don't know if you remember me, but I saw your name and instantly thought "Of course Lane would build something this cool."

AWESOME job with this, it's both fascinating and well laid out!


Thanks Leland!


More visualizations of the same type can be found here:

http://www.chromeexperiments.com/globe (some interesting ones) :)


Very awesome visualization work! Though it shows bars in the middle of nowhere (see France, UK...China too maybe), probably users that haven't mentioned any city.


Low quality texture map with seams and bad filtering at the poles. Big perceptible delay on camera movements. WebGL doesn't make a subpar gfx demo cool.


This is neat. Would it be possible to add boundary lines? It's difficult to determine where Louisville, Kentucky is without them.


As with all similar such counts, it would be nice to be able to see them normalized against the size of the local population.


This might be the most beautiful visualization I've ever seen ... well done.


For some reason the rotation animation and control is incredibly satisfying.


Nice! Would be cool to have a tooltip over the lines to show the location.


Cool stuff. That's some gnarly JPEG around the coasts, though.


Makes me want to commit some code in antarctica...


I guess London is the tech centre of Europe.


Antarctica you are disappoint :*(


Very awesome and interesting.



