
In a great example of the Pareto Principle (80/20), or actually something even more extreme, let's apply this Zopfli optimization only if a package's weekly download total is 1GiB or more (from the Weekly Traffic in GiB column of the Top 5000 Weekly by Traffic tab of the Google Sheets file from the reddit post).

For reference, total bandwidth used by all 5000 packages is 4_752_397 GiB.

Packages >= 1GiB bandwidth/week: that turns out to be 437 packages (there's a header row, so it's rows 2-438), which together use 4_205_510 GiB.

So 88% of the top 5000's bandwidth is consumed by downloading the top 8.7% (437) of packages.

5% of that is about 210_276 GiB, or roughly 205 TiB (at 1024 GiB per TiB).

Limiting to the top 100 packages by bandwidth gives 3_217_584 GiB: 68% of total bandwidth, used by 2% of the packages.

5% of that is about 160_879 GiB, or roughly 157 TiB (at 1024 GiB per TiB).



Packages with >= 20GiB bandwidth: 47 packages totaling 2,536,902.81 GiB/week.

Less than 1% of the top 5000 packages account for 53% of the bandwidth.

5% of that would be about 126,845 GiB, or roughly 124 TiB (rounded up, at 1024 GiB per TiB).
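
For anyone who wants to re-check the arithmetic, here is a minimal Python sketch that recomputes the shares from the totals quoted in this comment. The names (`TIERS`, `summarize`, `GIB_PER_TIB`) are made up for illustration, and the only inputs are the package counts and GiB totals given above, not the spreadsheet itself. It assumes 1 TiB = 1024 GiB; dividing by 1000 instead gives slightly larger TiB figures.

```python
TOTAL_PACKAGES = 5000
TOTAL_GIB = 4_752_397   # weekly bandwidth of all 5000 packages, as quoted above

# (label, package count, weekly GiB) for each cutoff discussed in the comment
TIERS = [
    (">= 1GiB", 437, 4_205_510),
    ("top 100", 100, 3_217_584),
    (">= 20GiB", 47, 2_536_902.81),
]

SAVINGS = 0.05          # the 5% figure used in the comment
GIB_PER_TIB = 1024      # assumption: binary units, 1 TiB = 1024 GiB


def summarize(count, gib):
    """Return package share, bandwidth share, and 5% of the tier's bandwidth."""
    saved_gib = gib * SAVINGS
    return {
        "package_share": count / TOTAL_PACKAGES,
        "bandwidth_share": gib / TOTAL_GIB,
        "saved_gib": saved_gib,
        "saved_tib": saved_gib / GIB_PER_TIB,
    }


for label, count, gib in TIERS:
    s = summarize(count, gib)
    print(
        f"{label:>9}: {s['package_share']:.1%} of packages, "
        f"{s['bandwidth_share']:.0%} of bandwidth, "
        f"5% = {s['saved_gib']:,.0f} GiB ({s['saved_tib']:.0f} TiB)"
    )
```

The percentages (88%, 68%, 53%) do not depend on the unit choice; only the GiB-to-TiB step does.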



