I really wonder how an in-place `pg_upgrade` of such small amounts of data would take 30+ minutes.
My experience from a less mission-critical situation, where 5 minutes of maintenance are absolutely acceptable, is that an in-place `pg_upgrade` with `--link` of an 8 TB database takes less than a minute and will not accidentally lose data or fail to properly configure schema search paths or cause whatever other mess the article was talking about.
I understand that 30 minutes of downtime is not acceptable. But if it's 5 minutes or less, I would seriously consider an offline upgrade using `pg_upgrade`.
And if it takes 30 minutes to hard-link less than 1 TB of data files, you should seriously consider changing hosts because that's absolutely unacceptable performance.
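For the curious, here's a minimal sketch of what that kind of offline `--link` upgrade looks like. The paths and version numbers below are placeholders, not anything from this thread:

```sh
# Stop the old cluster first; pg_upgrade needs exclusive access to both data dirs.
pg_ctl -D /var/lib/postgresql/14/main stop

# Dry run: verify the two clusters are compatible without touching any data.
pg_upgrade \
  --old-bindir=/usr/lib/postgresql/14/bin \
  --new-bindir=/usr/lib/postgresql/16/bin \
  --old-datadir=/var/lib/postgresql/14/main \
  --new-datadir=/var/lib/postgresql/16/main \
  --link --check

# The same command without --check performs the actual upgrade. With --link it
# hard-links the data files instead of copying them, so runtime is largely
# independent of database size.
pg_upgrade \
  --old-bindir=/usr/lib/postgresql/14/bin \
  --new-bindir=/usr/lib/postgresql/16/bin \
  --old-datadir=/var/lib/postgresql/14/main \
  --new-datadir=/var/lib/postgresql/16/main \
  --link
```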
The Lyft team reported 30 minutes for their 30TB database. Our db took about 15 minutes. In the essay we wrote:
> So we cloned our production database and tested an in-place upgrade. Even with our smaller size, it took about 15 minutes for the clone to come back online.
I don't think `pg_upgrade` takes the whole time. Some of it is overhead from the AWS managed database service, which is creating a snapshot before and after, applying new config, and spinning for no apparent reason.
Yeah, we just did it with the `--link` option on a 6 TB database and it took like 30 seconds. Something has to be off with their OS settings or disk speeds.
The main challenge with that is running an ANALYZE on all the tables, though; that took like 30 minutes, during which time the DB was unusable.
We did use the `--analyze-in-stages` option; I think our data model is just not optimal. We have a lot of high-frequency queries hitting very large tables of 0.5 to 1 billion rows. Proper indexing makes them fast, but until all the stats are there, the frontend is unusable.
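For context, the post-upgrade statistics rebuild is typically just the `vacuumdb` invocation that `pg_upgrade` suggests. A minimal sketch, with the database name, table name, and job count made up for illustration:

```sh
# Rebuild planner statistics in three passes (minimal, medium, then full
# statistics targets), so the planner has something usable early on;
# --jobs analyzes several tables in parallel.
vacuumdb --all --analyze-in-stages --jobs=4

# Optionally ANALYZE the largest, most frequently queried tables up front so
# the hot query paths recover before the staged run gets to them
# (app_db and big_events_table are hypothetical names):
psql -d app_db -c 'ANALYZE big_events_table;'
```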