> for example... don't commit DB transactions, send out emails or post transactions onto a blockchain until you know everything went through. Exceptions mean rollback, a lot of the time.
But what if you need to send emails AND record it in a DB?
I had the same question, actually; it is very common to perform multiple point-of-no-return IO operations in a workflow, so deferring all IO to one specific spot does not, in practice, bring any advantages.
It does. You queue ALL of these side effects (simply, tasks whose exceptions don't roll back your own task) until the end. Then you can perform them all, in parallel if you wish.
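To make the pattern concrete, here is a minimal sketch of queuing side effects until the end. The names (`run_workflow`, `send_email`) and the in-memory "DB" with its snapshot-based rollback are illustrative stand-ins, not a real API:

```python
# Stand-ins for an SMTP gateway and a real database.
sent_emails = []
db = {"balance": 100}

def send_email(to, body):
    sent_emails.append((to, body))

def run_workflow(charge, fail=False):
    deferred = []                 # side effects queued, not performed
    snapshot = dict(db)           # stands in for BEGIN TRANSACTION
    try:
        db["balance"] -= charge   # fallible, rollback-able work
        if fail:
            raise RuntimeError("subtask failed")
        # Queue the email instead of sending it now.
        deferred.append(lambda: send_email("user@example.com", "charged"))
    except Exception:
        db.clear()
        db.update(snapshot)       # ROLLBACK: the email was never sent
        return False
    for task in deferred:         # point of no return: flush the queue
        task()
    return True
```

If any fallible step raises, the rollback runs and the queued email is simply dropped; only a fully successful run reaches the flush.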
> If any of the required subtasks fail, you don’t do the side effects. You ROLLBACK.
I'm afraid I am still not seeing the advantage here: the subtasks that can fail are IO almost exclusively. If the email is already sent when the DB update fails, that email can't be recalled.
Other than hash/signature verification, just what sort of subtasks did you have in mind that can fail and aren't IO?
If your task fails, yet you sent an email that it succeeded, that is bad.
You should wait until all your subtasks finish before sending the email.
Async subtasks typically ARE IO, whether over a network or not.
The email shouldn't be sent if the DB update fails. The whole point is you wait until everything succeeds, before sending the email.
If your subtasks cause you to write some rows as if they succeeded, but subsequent subtasks fail, that is bad. You have to roll back your changes and not commit them.
If you charge a user even though they didn't get the thing they paid for, that's bad. Yes you can refund them after a dispute, but it's better not to have charged them in the first place.
The point is this: any subtasks that can cause your main task to fail should be processed BEFORE any subtasks that cannot cause your main task to fail.
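The ordering rule above can be sketched as a tiny scheduler. Everything here (`execute`, the `irreversible` flag) is made up for illustration; the point is only that irreversible work runs strictly after all fallible work has passed:

```python
def execute(subtasks):
    """subtasks: list of (callable, irreversible) pairs.

    Runs every subtask that can fail (and thus force a rollback)
    before any irreversible one.
    """
    fallible = [t for t, irreversible in subtasks if not irreversible]
    points_of_no_return = [t for t, irreversible in subtasks if irreversible]
    results = [t() for t in fallible]  # any exception here aborts everything
    for t in points_of_no_return:      # reached only if all checks passed
        t()
    return results
```

An exception in any fallible subtask propagates before the first irreversible one is touched, so there is nothing to "recall".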
> If your task fails, yet you sent an email that it succeeded, that is bad. You should wait until all your subtasks finish before sending the email.
A common sequence is "send email, then update the DB with the new count of emails sent". It doesn't matter which way you order them: there is no advantage to queuing those two tasks to run at the end, because if the first succeeds and the second fails you have still done half of an atomic task.
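A minimal sketch of that counter-example, with made-up names and an in-memory stand-in for the DB. Whichever of the two steps goes first, a failure between them leaves the pair half done, and queuing both at the end does not change that:

```python
emails_sent = []              # stands in for an SMTP gateway
db_count = {"emails": 0}      # stands in for a DB row

def send_then_record(fail_record=False):
    emails_sent.append("hello")        # irreversible IO succeeds...
    if fail_record:
        raise RuntimeError("DB down")  # ...then the DB write fails
    db_count["emails"] += 1            # never reached on failure
```

After a failed run, the email has gone out but the count was never recorded: two irreversible-or-transactional steps with no common rollback.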
> The point is this: any subtasks that can cause your main task to fail should be processed BEFORE any subtasks that cannot cause your main task to fail.
Do you have any examples that aren't IO? Because IO can always fail, and I am still wondering what sort of workflow (or sequence of tasks) you have in mind where you will see any advantage to queuing all the IO to run at the end.
If you have pure computation subtasks (such as checking a hash or signature), then sure, do the IO only after you have verified the check. Have you any idea how rare that workflow is other than for checksumming?
What workflow have you in mind, where we see a practical advantage from queuing IO to run at the end?