Roughly 2000 blogs running on a $7/month dyno
95 points by HermanMartinus on Sept 23, 2020 | 63 comments
Bear Blog (https://bearblog.dev) has just hit 2000 blogs.

Due to optimized DB calls, text-only content (no static files), and Cloudflare's awesome CDN, my single Heroku dyno is running at only 31% capacity.

Full stack:

- Django/Python running on a Hobby dyno
- All HTML content is generated using Django templates
- Postgres, also on a Hobby dyno
- Cloudflare CDN (free)
- SendGrid for confirmation emails
- Let's Encrypt SSL certs
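For a rough idea of what the DB-call optimization looks like in a Django view like this, here is a minimal sketch (the Post model, the blog relation, the view name, and the cache timeout are all assumptions for illustration, not code from the Bear Blog repo):

    # Illustrative sketch only -- model, view, and cache timeout are assumptions,
    # not code from the Bear Blog repo.
    from django.shortcuts import get_object_or_404, render
    from django.views.decorators.cache import cache_page

    from .models import Post  # hypothetical model


    @cache_page(60 * 5)  # repeat hits within 5 minutes never touch the database
    def post_detail(request, slug):
        # select_related joins the blog row into the same query as the post,
        # so rendering a page costs a single SELECT instead of several.
        post = get_object_or_404(Post.objects.select_related("blog"), slug=slug)
        return render(request, "post_detail.html", {"post": post})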

Thought this might be interesting :)



I would have thought that the number of blogs/total amount of content any site can serve is theoretically infinite.

It’s the rate of requests it can handle that matters, which may explain why I’m getting an “Application error” page at the moment


There are, I imagine, some tricks for lots of virtual hosts. Like not running out of file descriptors if they each have their own log. Or watching for bloat in config files. Apache, at least, used to bog down on startup with thousands of VirtualHost directives. Or perhaps an acme cert renewal script that runs serially and takes a looong time to finish.


mod_vhost_alias has existed for something like 20 years: https://httpd.apache.org/docs/2.4/mod/mod_vhost_alias.html. Back in the day, ISPs like Demon were hosting thousands of customer websites off single large machines (on the hardware of 20 years ago!) using this technique.


Sure, though that page you linked contains some tips on how to avoid performance issues with the directory substitution. My point was that some thought is required to do thousands of vhosts.


I would pick nginx in the first place.


Nginx may need tuning for high numbers of server blocks, so it's not completely immune: http://nginx.org/en/docs/http/server_names.html#optimization


If you wanted it to stay at 31% then it was a mistake to post on HN.


It's a dare!


LOL


Looks like the code is here for anybody interested. https://github.com/HermanMartinus/bearblog

Also down for me currently... interested to know which resource was exhausted :)


This is a great achievement! What kind of traffic does that translate to, if you don't mind sharing?


It's been sitting at about 300k pageviews a month, which is not that much, all things considered. If any high-profile blogs start running on it, I may have to consider upgrading.


That’s an average of about one request every 8 seconds. That that should be 31% capacity, I find alarming: it’d suggest that each page view is consuming more than two seconds of available system resources, which… wow. (My interpretation of your stated “31% capacity” could be way off base, but if I try to be as generous as I can imagine, this figure would still be comfortably within a decimal order of magnitude.)

I would expect something fairly basic with no obvious errors to be able to cope with ten times this load with perfect equanimity, and something that claimed to be pretty efficient while still being Django/Postgres to support at least a hundred times this load, perhaps even a thousand; and if it behaved like a static site generator and produced stuff nginx could serve directly, it’d probably handle between ten and a hundred thousand times the load. (I say all this with no knowledge of the performance/load characteristics of Heroku, but assuming that each Hobby dyno is roughly equivalent to the average $5/month VPS. I could be completely wrong.)

These figures I’m estimating also assume that all requests make it to the origin server; if you add Cloudflare to the mix and set your caching properly—and this site should be extremely amenable to aggressive caching—you should be able to drastically reduce the number of requests that make it to the origin server, especially on popular articles.

In short: these numbers aren’t adding up to me, so I suspect that either I’ve misunderstood or misinterpreted something, or there’s some simple low-hanging performance fruit to pick, some simple misconfiguration of e.g. template or data caching or DB connection pooling or something.
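To make the arithmetic above concrete, a quick back-of-the-envelope check (treating "31% capacity" as 31% of one dyno-second per second, which is an assumption about what that figure actually means):

    # Back-of-the-envelope numbers from the parent comment; the interpretation of
    # "31% capacity" as 31% of one core's time is an assumption.
    pageviews_per_month = 300_000
    seconds_per_month = 30 * 24 * 3600            # ~2.59 million seconds

    requests_per_second = pageviews_per_month / seconds_per_month
    seconds_between_requests = 1 / requests_per_second

    # If 31% of the dyno's time goes to serving this load, each request is
    # consuming roughly this much system time on average:
    seconds_of_capacity_per_request = 0.31 / requests_per_second

    print(f"{requests_per_second:.3f} req/s")                       # ~0.116 req/s
    print(f"one request every {seconds_between_requests:.1f} s")    # ~8.6 s
    print(f"~{seconds_of_capacity_per_request:.1f} s per request")  # ~2.7 s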


You know that joke: Bill Gates walks into a bar and suddenly everyone's average net worth is >$1 million? Hehehe.

My guess is that those 300k pageviews mostly follow daylight hours, probably in North America only. So you probably have peaks around 21:00-23:00 and almost nothing between 01:00 and 08:00.

And of course, if you wrote it in C these numbers would look awful, but I'd say for run-of-the-mill Django they are quite good. I'm willing to bet most Django sites you find on the Heroku free tier don't handle this kind of pressure.


I accounted for that very matter in the order-of-magnitude flexibility, since I’m not sure what “31% capacity” actually means.

Even being as generous as I can imagine, that’s still each request consuming over 200ms of system resources, which for something like this would be terrible. It’s some years since I’ve done any Django, but I remember measuring a fairly simple django CMS installation (which is waaaay more complex than this) in 2013 taking 50–80ms to serve most pages, with the slowest page averaging something like 180ms; my recollection is that I took those measurements before doing any deliberate performance tuning (I don’t think I’d put the cached template loader in yet—it wasn’t in the default config back then—and I think it may not even have been doing connection pooling).

Given the scope of this app, I figure it should be doing one or two simple and well-indexed SELECT queries only, and maybe an INSERT or UPDATE for analytics, but nothing more with its database. I would think 20ms should be oodles of time to do everything even in Django.
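For what it's worth, here is a hedged sketch of the two settings mentioned above (cached template loading and persistent DB connections) on the Django side; whether Bear Blog already has them is unknown, and modern Django enables the cached loader by default when DEBUG is off:

    # Illustrative settings sketch; modern Django turns on the cached loader by
    # default when DEBUG is False, but it's worth verifying on older versions.
    TEMPLATES = [
        {
            "BACKEND": "django.template.backends.django.DjangoTemplates",
            "DIRS": [],
            "OPTIONS": {
                # Cache compiled templates in memory instead of re-parsing
                # them from disk on every request.
                "loaders": [
                    (
                        "django.template.loaders.cached.Loader",
                        [
                            "django.template.loaders.filesystem.Loader",
                            "django.template.loaders.app_directories.Loader",
                        ],
                    ),
                ],
            },
        },
    ]

    # Persistent DB connections avoid paying the Postgres connection setup cost
    # on every request (not full pooling, but the low-hanging fruit).
    DATABASES = {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            # ... credentials elided ...
            "CONN_MAX_AGE": 60,
        }
    }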


Traffic is peaky. On sites that I manage, traffic peaks in the afternoon at about 4x the traffic around midnight. Also, pageviews don't include static assets, which can be as many as 100 static files per pageview on an unoptimized website. 31% capacity during peak is believable to me.


For this particular site, it has no JavaScript and all CSS is inline. Each page load is three requests: one for the HTML, one for a simple behavioural analytics tracker:

  <style>
    body:hover {
      background: url("/hit/12345");
    }
  </style>
And one more because the author was careless with the URLs, and /hit/12345 does a 301 → /hit/12345/ which is the actual thing.

I was taking this into account in my figures.
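For what it's worth, that extra 301 usually comes from Django's APPEND_SLASH behaviour: the CSS references /hit/12345 but the URL pattern is defined with a trailing slash. A sketch of the fix, with hypothetical names (the URL name, view, and parameter below are not taken from the repo):

    # Hypothetical names for illustration -- the real Bear Blog URLs may differ.
    # urls.py
    from django.urls import path

    from . import views

    urlpatterns = [
        # Defined with the trailing slash, so /hit/12345 triggers an
        # APPEND_SLASH 301 before reaching this view.
        path("hit/<int:post_id>/", views.hit, name="hit"),
    ]

Emitting the canonical URL in the template (e.g. via {% url 'hit' post.id %}, which includes the trailing slash) would let the browser go straight to the tracker endpoint and make the redirect disappear.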


Just learned HN lets you click on links in the original post now; I think before you had to highlight the URL. Also, the site seems to be returning a 520, but after a refresh it works. I'm guessing it's getting more traffic than usual today. :)


It must be getting hugged to death at the moment.


You can probably get even higher performance for your money by switching to a VPS like a Linode (and on a single machine with limited RAM, SQLite might perform better for read-heavy workloads than running a database server).
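On the Django side, the SQLite suggestion above is just a settings change; a sketch follows (whether SQLite actually wins for this workload would need benchmarking):

    # Sketch of the SQLite suggestion above; benchmark before committing to it.
    from pathlib import Path

    BASE_DIR = Path(__file__).resolve().parent.parent

    DATABASES = {
        "default": {
            "ENGINE": "django.db.backends.sqlite3",
            # The database is a file on local disk, so reads skip the network
            # hop to a separate Postgres instance entirely.
            "NAME": BASE_DIR / "db.sqlite3",
        }
    }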


I hope you post traffic statistics here as a comment when you do a post-mortem of the traffic that brought it down. I'm interested in testing a site under a load similar to front-page HN traffic.


Not the OP, but that’s what I got a few months ago when my blog was on the front page: https://twitter.com/dgellow/status/1284435410777776128

(It didn't crash though, but it's also not 2k blog instances. It runs on the cheapest DigitalOcean offering.)


Wow, under 8% peak CPU usage. Are you serving static files?


The blog itself is dynamically generated; I have an admin interface with an editor where I can create and modify content and settings, but pages are cached, so the content served is static, yes. That's common for blogs, as the content doesn't change that often; I would expect the same for the OP.


If you're already on Heroku, you may want to autoscale your web dyno. It's pretty sweet and keeps the bills low unless you get a surge of traffic :)


Gotcha, just turned that on. I wish I'd done that this morning. I didn't think this post would make it to the front page of HN.


Beautiful. I hope this setup serves you well, and it doesn't seem like you need Kubernetes or any of that overkill-complex setup.

Just ship on Heroku and you're done.


He may have spoken too soon :)


RIP $7 dyno


The site seems to be throwing an application error from Heroku. Now correct me if I'm wrong, but isn't Cloudflare supposed to be a caching proxy? And if I'm correct, why am I even getting to the actual server? Does this dude not have Cloudflare configured correctly? If anyone can explain this, I would love to know so I don't make the same mistake in the future.


Some server-side software actively thwarts Cloudflare with things like Pragma: no-cache or Cache-Control: no-cache, or bad Expires settings for static assets. Spend some time learning cache headers, and then do some testing to see what actually produces a cache "hit" versus a request that passes through to the web server behind the cache, and how long the various caches (browser, Cloudflare, app, etc.) last.
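As far as I know, Cloudflare also won't cache HTML at all on its default settings without a "Cache Everything" page rule, and even then it respects the origin's cache headers. A hedged sketch of setting those headers from Django (the view name is hypothetical, not taken from Bear Blog's code):

    # Hypothetical view for illustration; not taken from Bear Blog's code.
    from django.http import HttpResponse
    from django.views.decorators.cache import cache_control


    @cache_control(public=True, max_age=3600)  # lets browsers and the CDN cache for an hour
    def post_page(request, slug):
        # ... render the post as usual ...
        return HttpResponse("rendered page")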


If the upstream returns an error over HTTP, the CDN won’t hide it.


Looks like it went down after being posted here.


This would be free on render.com or Netlify as a static site, and it wouldn't be suffering the HN kiss of death.


Well, this has become an... ironic post. *laughs sadly*

It looks like it's back up now.


Fantastic alternative to Medium.

If I get into blogging again I might try this.


Congrats!

How did you market the project for other people to use?


I used to run a WordPress site with 5M views per month on a $20 DigitalOcean instance. That was ~7 years ago.


And now it's not running at only 31% :) HN hug of death.


Until today :)


It's down. Mistake to post it on HN :P


It's a mistake to run on only one dyno. You need at least two to enable the autoscaling feature, and your single dyno cycles (reboots) once a day, which creates downtime you probably didn't know about.


>It's a mistake to run on only one dyno

Unless it's a hobby site offering a free service, which this seems to be.

Projects like this are awesome, and the barrier to making them should be low — gatekeeping by the community, as if this project needs to have a five-nines SLA, is frankly counterproductive.


I'm literally talking about the difference between 7 and 14 dollars per month.


Which, for some people, is significant.


No one spending $7 a month on a pure hobby would balk at spending $14. The difference is between nothing and anything.


That's definitely not true. I've bought several dozen of these in the last decade for various projects, and the price was definitely an important factor for some of them (especially the ones that I hope to keep running for decades).


Seriously, for some people $7 is the difference between eating for a week and not eating, and without knowing where OP is, it's hard to know.


Then why would he be spending $7 already on a hobby!??!


because "you have to spend money to make money" and building a resource people value for free despite the fact that you loose money, and then charging or asking for money is a proven technique for earning way more than $7 a month. It could be a financial stretch that has them on the edge, in hopes of earning a significant reward later.


It seems like you’re grasping at straws to defend your original statement.


The difference is between $7 and $50. If you want (auto)scaling, even manual, you need "Production-tier dynos", the cheapest being $25. Two of those is $50.


Yeah, I'm realizing my mistake now with this spike in traffic. It is a free project which generally has 1/100th of the traffic it's receiving from HN right now, but the autoscaling is worth the extra $14.

Thanks :)


Looks like you need to optimize the DB more https://i.imgur.com/ghbKj9I.png


Yeah, it turns out it was the throughput that was exhausted. I've had to add autoscaling to deal with HN.


Working great now. I do appreciate the uncluttered default look, and the discovery feed is interesting. Looks like HN has bombed your site before :) https://herman.bearblog.dev/the-hacker-news-hug/


Thanks :) Yeah, I should have learnt by now. I didn't even know I was on the front page today until someone from my local tech Slack asked me how the firefighting was going... about 3 hours too late.


[flagged]


Huh? Looks like it's still holding up?


It was down at the time; plenty of other comments mention this as well.


I do appreciate an efficient setup, but irony is even better.

"leopards eating my face" - HermanMartinus


Dunning-Kruger effect, maybe?

Seriously, there have been literally hundreds of blog-hosting websites. If you really thought none of them were backed by a DB, I’m not sure what to say. It’s not like “optimizing DB calls” is an original idea; it’s probably something 50% of all software engineers have spent some time on. Whether one works for Google, Facebook, Amazon, Apple, or Microsoft, I can guarantee you that if they’ve been on the job for long enough, at some point in their career they had to “optimize expensive calls”.

Now, I don’t want to just be an asshole. I think it’s really cool to start something from scratch and push it out there. I, for one, never really got myself to do it. And I think the grit is harder to acquire than the technical skills. Just be careful not to overestimate your real skills; it can be a great turn-off for any half-competent professional who might be considering joining you in a future venture.


There's really no need for such a salty comment. OP just described their stack and the current server load. I didn't detect any arrogance.


Hey stranger, just push something out there! Just do it, ok?



