
There was a point in history when the total amount of digital data stored worldwide reached 1 TiB for the first time. It is extremely likely that day was within the last sixty years.

And here we are, moving that amount of data every second on the servers of a fairly random entity. We’re not talking about a nation state or a supranational research effort.




That reminds me of a calculation I did which showed that my desktop PC would be more powerful than all of the computers on the planet combined in like 1978 :D
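For the curious, here is a back-of-the-envelope version of that kind of comparison; every figure below is a rough assumption of mine, not the original calculation:

    # All figures are rough, order-of-magnitude assumptions.
    desktop_flops = 40e12              # ~40 TFLOPS FP32: a current desktop GPU
    cray1_flops = 160e6                # Cray-1 (1976) peak: ~160 MFLOPS
    world_1978 = 10_000 * cray1_flops  # very generous: 10,000 Cray-1 equivalents worldwide
    print(desktop_flops / world_1978)  # -> 25.0, the desktop still wins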


My phone has more computation than anything I would have imagined owning, and I sometimes turn on the screen just to use as a quick flashlight.


Haha.. imagine taking it back to 1978 and showing how it has more computing power than the entire planet and then telling them that you mostly just use it to find that thing you lost under the couch :D


It was at least 20ish years ago: I remember an old sysadmin talking about managing petabytes before 2003


Must be much more than 20ish years: some 2400 ft reels in the ’60s stored only a few megabytes each, so you’d need hundreds of thousands of those to reach a terabyte. https://en.wikipedia.org/wiki/IBM_7330

> a single 2400-foot tape could store the equivalent of some 50,000 punched cards (about 4,000,000 six-bit bytes).

In 1964, with the introduction of System/360, you go an order of magnitude higher: https://www.core77.com/posts/108573/A-Storage-Cabinet-Based-...

> It could store a maximum of 45MB on 2,400 feet

At this point you only need a few tens of thousands of reels in existence to reach a terabyte. So I strongly suspect the "terabyte point" was some time in the 1960s.
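In numbers, using the capacities quoted above:

    TB = 10**12  # one (decimal) terabyte, in bytes

    reel_7330 = 4_000_000   # ~4 MB per 2400 ft reel (IBM 7330 era)
    reel_s360 = 45_000_000  # ~45 MB per 2400 ft reel (System/360 era)

    print(TB / reel_7330)   # -> 250000.0: hundreds of thousands of reels
    print(TB / reel_s360)   # -> ~22222: a few tens of thousands of reels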


Those numbers seem reasonable in that context. I first started using BitTorrent around that time as well, and it wasn't uncommon to see many users long-term seeding multiple hundreds of gigabytes of Linux ISOs alone.

Here’s another usage scenario with data usage numbers I found a while back.

> A 2004 paper published in ACM Transactions on Programming Languages and Systems shows how Hancock code can sift calling card records, long distance calls, IP addresses and internet traffic dumps, and even track the physical movements of mobile phone customers as their signal moves from cell site to cell site.

> With Hancock, "analysts could store sufficiently precise information to enable new applications previously thought to be infeasible," the program authors wrote. AT&T uses Hancock code to sift 9 GB of telephone traffic data a night, according to the paper.

https://web.archive.org/web/20200309221602/https://www.wired...
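Hancock itself was a domain-specific language built on C, so this is purely an illustration of the kind of nightly sifting the quote describes; the file layout and field name here are hypothetical, not from the paper:

    import csv
    from collections import defaultdict

    # Illustrative sketch: stream a nightly dump of call detail records
    # and count calls per caller. The "caller" field name is invented.
    def sift_call_records(path):
        calls = defaultdict(int)
        with open(path, newline="") as f:
            for row in csv.DictReader(f):  # row by row; never load 9 GB at once
                calls[row["caller"]] += 1
        return calls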


Yeah, at the other end of the scale, it sounds like Apple is now managing exabytes: https://read.engineerscodex.com/p/how-apple-built-icloud-to-...

This is pretty mind-boggling to me.


I archived Hancock here over a decade ago, stumbled upon it via HN at the time if I’m not mistaken: https://github.com/mqudsi/hancock


That’s pretty cool. I remember someone on that repo from a while back and was surprised to see their name pop up again. Thanks for archiving this!

Corinna Cortes et al. wrote the papers on Hancock and also the Communities of Interest paper referenced in the Wired article I linked to. She’s apparently a pretty big deal and went on to work at Google after her prestigious work at AT&T.

Hancock: A Language for Extracting Signatures from Data

https://scholar.google.com/citations?view_op=view_citation&h...

Hancock: A Language for Analyzing Transactional Data Streams

https://scholar.google.com/citations?view_op=view_citation&h...

Communities of Interest

https://scholar.google.com/citations?view_op=view_citation&h...


I raised this on Retrocomputing Stack Exchange, and https://retrocomputing.stackexchange.com/a/28322/3722 notes that a TiB of digital data was likely reached in the 1930s with punch cards.
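The card arithmetic is easy to check: a standard 80-column, 12-row card has 960 punch positions, so a TiB comes out to roughly ten billion cards:

    # Standard punch card: 80 columns x 12 rows of punch positions.
    bits_per_card = 80 * 12              # 960 bits
    bytes_per_card = bits_per_card // 8  # 120 bytes
    tib = 2**40                          # one tebibyte, in bytes
    print(tib / bytes_per_card)          # ~9.2e9: about ten billion cards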



