Roughly 2000 blogs running on a $7/month dyno
95 points by HermanMartinus on Sept 23, 2020 | 63 comments
Bear Blog (https://bearblog.dev) has just hit 2000 blogs.

Due to optimized DB calls, text-only content (no static files), and Cloudflare's awesome CDN, my single Heroku dyno is running at only 31% capacity.

Full stack:

- Django/Python running on a Hobby dyno
- All HTML content is generated using Django templates
- Postgres, also on a Hobby dyno
- Cloudflare CDN (free)
- SendGrid for confirmation emails
- Let's Encrypt SSL certs
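For a rough idea of what the DB-call optimization looks like in a Django view like this, here is a minimal sketch (the Post model, the blog relation, the view name, and the cache timeout are all assumptions for illustration, not code from the Bear Blog repo):

    # Illustrative sketch only -- model, view, and cache timeout are assumptions,
    # not code from the Bear Blog repo.
    from django.shortcuts import get_object_or_404, render
    from django.views.decorators.cache import cache_page

    from .models import Post  # hypothetical model


    @cache_page(60 * 5)  # repeat hits within 5 minutes never touch the database
    def post_detail(request, slug):
        # select_related joins the blog row into the same query as the post,
        # so rendering a page costs a single SELECT instead of several.
        post = get_object_or_404(Post.objects.select_related("blog"), slug=slug)
        return render(request, "post_detail.html", {"post": post})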

Thought this might be interesting :)



I would have thought that the number of blogs/total amount of content any site can serve is theoretically infinite.

It’s the rate of requests it can handle that matters, which may explain why I’m getting an “Application error” page at the moment


There are, I imagine, some tricks for lots of virtual hosts. Like not running out of file descriptors if they each have their own log. Or watching for bloat in config files. Apache, at least, used to bog down on startup with thousands of VirtualHost directives. Or perhaps an acme cert renewal script that runs serially and takes a looong time to finish.


mod_vhost_alias has existed for something like 20 years: https://httpd.apache.org/docs/2.4/mod/mod_vhost_alias.html. Back in the day, ISPs like Demon were hosting thousands of customer websites off single large machines (on the hardware of 20 years ago!) using this technique.


Sure, though that page you linked contains some tips on how to avoid performance issues with the directory substitution. My point was that some thought is required to do thousands of vhosts.


I would pick nginx in the first place.


Nginx may need tuning for high numbers of server blocks, so it's not completely immune: http://nginx.org/en/docs/http/server_names.html#optimization


If you wanted it to stay at 31% then it was a mistake to post on HN.


It's a dare!


LOL


Looks like the code is here for anybody interested. https://github.com/HermanMartinus/bearblog

Also down for me currently... interested to know which resource was exhausted :)


This is a great achievement! What kind of traffic does that translate to, if you don't mind sharing?


It's been sitting at about 300k pageviews a month, which is not that much, all things considered. If any high-profile blogs start running on it, I may have to consider upgrading.


That’s an average of about one request every 8 seconds. That that should be 31% capacity, I find alarming: it’d suggest that each page view is consuming more than two seconds of available system resources, which… wow. (My interpretation of your stated “31% capacity” could be way off base, but if I try to be as generous as I can imagine, this figure would still be comfortably within a decimal order of magnitude.)

I would expect something fairly basic with no obvious errors to be able to cope with ten times this load with perfect equanimity, and something that claimed to be pretty efficient while still being Django/Postgres to support at least a hundred times this load, perhaps even a thousand; and if it behaved like a static site generator and produced stuff nginx could serve directly, it’d probably handle between ten and a hundred thousand times the load. (I say all this with no knowledge of the performance/load characteristics of Heroku, but assuming that each Hobby dyno is roughly equivalent to the average $5/month VPS. I could be completely wrong.)

These figures I’m estimating also assume that all requests make it to the origin server; if you add Cloudflare to the mix and set your caching properly—and this site should be extremely amenable to aggressive caching—you should be able to drastically reduce the number of requests that make it to the origin server, especially on popular articles.

In short: these numbers aren’t adding up to me, so I suspect that either I’ve misunderstood or misinterpreted something, or there’s some simple low-hanging performance fruit to pick, some simple misconfiguration of e.g. template or data caching or DB connection pooling or something.
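To make the arithmetic above concrete, a quick back-of-the-envelope check (treating "31% capacity" as 31% of one dyno-second per second, which is an assumption about what that figure actually means):

    # Back-of-the-envelope numbers from the parent comment; the interpretation of
    # "31% capacity" as 31% of one core's time is an assumption.
    pageviews_per_month = 300_000
    seconds_per_month = 30 * 24 * 3600            # ~2.59 million seconds

    requests_per_second = pageviews_per_month / seconds_per_month
    seconds_between_requests = 1 / requests_per_second

    # If 31% of the dyno's time goes to serving this load, each request is
    # consuming roughly this much system time on average:
    seconds_of_capacity_per_request = 0.31 / requests_per_second

    print(f"{requests_per_second:.3f} req/s")                       # ~0.116 req/s
    print(f"one request every {seconds_between_requests:.1f} s")    # ~8.6 s
    print(f"~{seconds_of_capacity_per_request:.1f} s per request")  # ~2.7 s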


You know that joke: Bill Gates walks into a bar and suddenly everyone's average net worth is >$1 million? Hehehe.

My guess is that those 300k pageviews mostly follow daylight hours, probably in North America only. So you probably have peaks around 21:00-23:00 and almost nothing between 01:00 and 08:00.

And of course, if you wrote it in C these numbers would look awful, but I'd say for run-of-the-mill Django they are quite good. I'm willing to bet most Django sites you find on the Heroku free tier don't handle this kind of pressure.


I accounted for that very matter in the order-of-magnitude flexibility, since I’m not sure what “31% capacity” actually means.

Even being as generous as I can imagine, that’s still each request consuming over 200ms of system resources, which for something like this would be terrible. It’s some years since I’ve done any Django, but I remember measuring a fairly simple django CMS installation (which is waaaay more complex than this) in 2013 taking 50–80ms to serve most pages, with the slowest page averaging something like 180ms; my recollection is that I took those measurements before doing any deliberate performance tuning (I don’t think I’d put the cached template loader in yet—it wasn’t in the default config back then—and I think it may not even have been doing connection pooling).

Given the scope of this app, I figure it should be doing one or two simple and well-indexed SELECT queries only, and maybe an INSERT or UPDATE for analytics, but nothing more with its database. I would think 20ms should be oodles of time to do everything even in Django.
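For what it's worth, here is a hedged sketch of the two settings mentioned above (cached template loading and persistent DB connections) on the Django side; whether Bear Blog already has them is unknown, and modern Django enables the cached loader by default when DEBUG is off:

    # Illustrative settings sketch; modern Django turns on the cached loader by
    # default when DEBUG is False, but it's worth verifying on older versions.
    TEMPLATES = [
        {
            "BACKEND": "django.template.backends.django.DjangoTemplates",
            "DIRS": [],
            "OPTIONS": {
                # Cache compiled templates in memory instead of re-parsing
                # them from disk on every request.
                "loaders": [
                    (
                        "django.template.loaders.cached.Loader",
                        [
                            "django.template.loaders.filesystem.Loader",
                            "django.template.loaders.app_directories.Loader",
                        ],
                    ),
                ],
            },
        },
    ]

    # Persistent DB connections avoid paying the Postgres connection setup cost
    # on every request (not full pooling, but the low-hanging fruit).
    DATABASES = {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            # ... credentials elided ...
            "CONN_MAX_AGE": 60,
        }
    }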


Traffic is peaky. On sites that I manage, traffic peaks in the afternoon at about 4x the traffic around midnight. Also, pageviews don't include static assets, which can be as many as 100 static files per pageview on an unoptimized website. 31% capacity during peak is believable to me.


For this particular site, it has no JavaScript and all CSS is inline. Each page load is three requests: one for the HTML, one for a simple behavioural analytics tracker:

  <style>
    body:hover {
      background: url("/hit/12345");
    }
  </style>
And one more because the author was careless with the URLs, and /hit/12345 does a 301 → /hit/12345/ which is the actual thing.

I was taking this into account in my figures.
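For what it's worth, that extra 301 usually comes from Django's APPEND_SLASH behaviour: the CSS references /hit/12345 but the URL pattern is defined with a trailing slash. A sketch of the fix, with hypothetical names (the URL name, view, and parameter below are not taken from the repo):

    # Hypothetical names for illustration -- the real Bear Blog URLs may differ.
    # urls.py
    from django.urls import path

    from . import views

    urlpatterns = [
        # Defined with the trailing slash, so /hit/12345 triggers an
        # APPEND_SLASH 301 before reaching this view.
        path("hit/<int:post_id>/", views.hit, name="hit"),
    ]

Emitting the canonical URL in the template (e.g. via {% url 'hit' post.id %}, which includes the trailing slash) would let the browser go straight to the tracker endpoint and make the redirect disappear.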


Just learned HN lets you click on links in the original post now; I think before you had to highlight the URL. Also, the site seems to be returning a 520, but after a refresh it works. I'm guessing it's getting more traffic than usual today. :)


It must be getting hugged to death at the moment.


You can probably get even higher performance for your money by switching to a VPS like a Linode (and on a single machine with limited RAM, SQLite might perform better for read-heavy workloads than running a database server).
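On the Django side, the SQLite suggestion above is just a settings change; a sketch follows (whether SQLite actually wins for this workload would need benchmarking):

    # Sketch of the SQLite suggestion above; benchmark before committing to it.
    from pathlib import Path

    BASE_DIR = Path(__file__).resolve().parent.parent

    DATABASES = {
        "default": {
            "ENGINE": "django.db.backends.sqlite3",
            # The database is a file on local disk, so reads skip the network
            # hop to a separate Postgres instance entirely.
            "NAME": BASE_DIR / "db.sqlite3",
        }
    }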


I hope you post traffic statistics here as a comment when you do a post-mortem of the traffic that brought it down. I'm interested in testing a site under a load similar to front-page HN traffic.


Not the OP, but that’s what I got a few months ago when my blog was on the front page: https://twitter.com/dgellow/status/1284435410777776128

(It didn't crash though, but it's also not 2k blog instances. It runs on the cheapest DigitalOcean offering.)


Wow, under 8% peak CPU usage. Are you serving static files?


The blog itself is dynamically generated; I have an admin interface with an editor where I can create and modify content and settings, but pages are cached, so the content served is static, yes. That's common for blogs, as the content doesn't change that often; I would expect the same for the OP.


If you're already on Heroku, you may want to autoscale your web dyno. It's pretty sweet and keeps the bills low unless you get a surge of traffic :)


Gotcha, just turned that on. I wish I'd done that this morning. I didn't think this post would make it to the front page of HN.


Beautiful. I hope this setup serves you well, and it doesn't seem like you need Kubernetes or any of that overkill-complex setup.

Just ship on Heroku and you're done.


He may have spoken too soon :)


RIP $7 dyno


The site seems to be throwing an application error from Heroku. Now correct me if I'm wrong, but isn't Cloudflare supposed to be a caching proxy? And if I'm correct, why am I even getting to the actual server? Does this dude not have Cloudflare configured correctly? If anyone can explain this, I would love to know so I don't make the same mistake in the future.


Some server-side software actively thwarts Cloudflare with things like Pragma: no-cache or Cache-Control: no-cache, or bad Expires settings for static assets. Spend some time learning cache headers, and then do some testing to see what actually produces a cache "hit" versus a request that passes through to the web server behind the cache, and how long the various caches (browser, Cloudflare, app, etc.) last.
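As far as I know, Cloudflare also won't cache HTML at all on its default settings without a "Cache Everything" page rule, and even then it respects the origin's cache headers. A hedged sketch of setting those headers from Django (the view name is hypothetical, not taken from Bear Blog's code):

    # Hypothetical view for illustration; not taken from Bear Blog's code.
    from django.http import HttpResponse
    from django.views.decorators.cache import cache_control


    @cache_control(public=True, max_age=3600)  # lets browsers and the CDN cache for an hour
    def post_page(request, slug):
        # ... render the post as usual ...
        return HttpResponse("rendered page")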


If the upstream returns an error over HTTP, the CDN won’t hide it.


Looks like it went down after being posted here.


This would be free on render.com or Netlify as a static site, and it wouldn't be suffering the HN kiss of death.


Well, this has become an... ironic post. *laughs sadly*

It looks like it's back up now.


Fantastic alternative to Medium.

If I get into blogging again I might try this.


Congrats!

How did you market the project for other people to use?


I used to run a WordPress site with 5M views per month on a $20 DigitalOcean instance. That was ~7 years ago.


And now it's not running at only 31% :) HN hug of death.


Until today :)


It's down. Mistake to post it on HN :P


It's a mistake to run on only one dyno. You need at least two to enable the autoscaling feature, and your single dyno cycles (reboots) once a day, which creates downtime you probably didn't know about.


>It's a mistake to run on only one dyno

Unless it's a hobby site offering a free service, which this seems to be.

Projects like this are awesome, and the barrier to making them should be low — gatekeeping by the community, as if this project needs to have a five-nines SLA, is frankly counterproductive.


I'm literally talking about the difference between 7 and 14 dollars per month.


Which, for some people, is significant.


No one spending $7 a month on a pure hobby would balk at spending $14. The difference is between nothing and anything.


That's definitely not true. I've bought several dozen of these in the last decade for various projects, and the price was definitely an important factor for some of them (especially the ones that I hope to keep running for decades).


Seriously, for some people $7 is the difference between eating for a week and not eating, and without knowing where OP is, it's hard to know.


Then why would he be spending $7 already on a hobby!??!


because "you have to spend money to make money" and building a resource people value for free despite the fact that you loose money, and then charging or asking for money is a proven technique for earning way more than $7 a month. It could be a financial stretch that has them on the edge, in hopes of earning a significant reward later.


It seems like you’re grasping at straws to defend your original statement.


The difference is between $7 and $50. If you want (auto)scaling, even manual, you need "Production-tier dynos", the cheapest being $25. Two of those is $50.


Yeah, I'm realizing my mistake now with this spike in traffic. It is a free project which generally has 1/100th of the traffic it's receiving from HN right now, but the autoscaling is worth the extra $14.

Thanks :)


Looks like you need to optimize the DB more https://i.imgur.com/ghbKj9I.png


Yeah, it turns out it was the throughput that was exhausted. I've had to add autoscaling to deal with HN.


Working great now. I do appreciate the uncluttered default look, and the discovery feed is interesting. Looks like HN has bombed your site before :) https://herman.bearblog.dev/the-hacker-news-hug/


Thanks :) Yeah, I should have learnt by now. I didn't even know I was on the front page today until someone from my local tech Slack asked me how the firefighting was going... about 3 hours too late.


[flagged]


Huh? Looks like it's still holding up?


It was down at the time; plenty of other comments mention this as well.


I do appreciate an efficient setup, but irony is even better.

"leopards eating my face" - HermanMartinus


Dunning-Kruger effect, maybe?

Seriously, there have been literally hundreds of blog-hosting websites. If you really thought none of them were backed by a DB, I’m not sure what to say. It’s not like “optimizing DB calls” is an original idea; it’s probably something 50% of all software engineers have spent some time on. Whether one works for Google, Facebook, Amazon, Apple, or Microsoft, I can guarantee you that if they’ve been on the job for long enough, at some point in their career they had to “optimize expensive calls”.

Now, I don’t want to just be an asshole. I think it’s really cool to start something from scratch and push it out there. I, for one, never really got myself to do it. And I think the grit is harder to acquire than the technical skills. Just be careful not to overestimate your real skills; it can be a great turn-off for any half-competent professional who might be considering joining you in a future venture.


There's really no need for such a salty comment. OP just described their stack and the current server load. I didn't detect any arrogance.


Hey stranger, just push something out there! Just do it, ok?



