I think the problem might be that you didn't extend WordPress. PHP will gladly do all of the above through WordPress plugins and cron jobs. I built something similar and ended up going that way, and the system is still running to this day. Sure, it's not hip, but it gets the job done with minimal fuss and makes money.
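For anyone curious, roughly what I mean, as a minimal sketch rather than my actual code (the hook name, feed URL, and schedule are placeholders):

    // Register a recurring WP-Cron event from a plugin.
    add_action( 'myplugin_check_feeds_event', 'myplugin_check_feeds' );
    if ( ! wp_next_scheduled( 'myplugin_check_feeds_event' ) ) {
        wp_schedule_event( time(), 'hourly', 'myplugin_check_feeds_event' );
    }

    function myplugin_check_feeds() {
        // fetch_feed() is the SimplePie wrapper that ships with WordPress.
        $feed = fetch_feed( 'https://example.com/podcast.rss' );
        if ( is_wp_error( $feed ) ) {
            return;
        }
        foreach ( $feed->get_items( 0, 20 ) as $item ) {
            // Dedupe against existing posts, then insert new episodes here.
        }
    }

One caveat: WP-Cron only fires on page loads by default, so for a feed crawler you'd want a real system cron entry hitting wp-cron.php on a schedule.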
You're right that PHP can do the feed-checking part, but I wanted something with easy async/concurrency out of the box (and I wanted to learn Elixir instead of using Node).
One hard technical limit is that with 50k podcasts and 4 million+ episodes, search definitely doesn't work well. Not just in WP, but in SQL itself. Hence Elasticsearch. I also plan to work on recommendations, etc., so I'll probably need to export the SQL data into other systems anyway to build the "people who liked this also liked that" kind of features.
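To give a feel for what Elasticsearch buys you at that scale, this is roughly the kind of query it handles well (sketched with the official elasticsearch-php client; the 'episodes' index and field names are illustrative, not my real mapping):

    require 'vendor/autoload.php';

    use Elasticsearch\ClientBuilder;

    // Defaults to a node on localhost:9200.
    $client = ClientBuilder::create()->build();

    // Full-text relevance search over millions of episode documents,
    // the sort of thing MySQL LIKE queries fall over on.
    $results = $client->search([
        'index' => 'episodes',
        'body'  => [
            'query' => [
                'multi_match' => [
                    'query'  => 'true crime',
                    'fields' => [ 'title^2', 'description' ],
                ],
            ],
        ],
    ]);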
Also, I kinda lied about using the WP API: that's how I built the system initially (and I'll switch back to it moving forward), but to import the first few million posts from the feed content, I just ran wp_insert_post over the new entries that Elixir had fetched into the DB (I posted the code I used here: http://wordpress.stackexchange.com/a/233786/30906).
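The import loop boiled down to something of this shape for each staged row (simplified here; the linked answer has the real details, and the $episode field names are made up):

    // $episode is one row from the staging table the Elixir fetcher filled.
    $post_id = wp_insert_post( array(
        'post_type'    => 'episode',                 // hypothetical custom post type
        'post_title'   => $episode['title'],
        'post_content' => $episode['description'],
        'post_status'  => 'publish',
        'post_date'    => $episode['published_at'],  // 'Y-m-d H:i:s'
    ), true ); // second arg: return a WP_Error instead of 0 on failure

    if ( is_wp_error( $post_id ) ) {
        error_log( $post_id->get_error_message() );
    }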
I also plan to write the whole front-end in React (including server-side rendering), so I'll have to figure out how to get that done. I'd probably use the WP API with a Node.js app in front of it, and I'll look into hypernova from Airbnb. So, probably more use of the WP API accessed by another service...
I hope you are not doing all of this alone. I'd try and keep things as simple as possible within a monolith and then improve as needs increase. Good luck :)
I generally write what I consider monolithic Django apps. I would add in Haystack (a search module for Django) and configure it to use Elasticsearch to overcome the problems you describe.
It doesn't sound like microservices are needed, just adding the appropriate tech for the job.
That's an incisive question. My impression, which may be mistaken, is that a cronjob would be used to move data (pages compiled from templates, chart images, etc.) into the PHP host on a "batch" basis. To me, that implied the existence of other systems that handle the data in their own way, but I guess in this thread the salient difference between micro and mono is that the former connects components via a web stack. Are there more agile interfaces available for cronjobs? If instead we're only considering transformations of data already resident on the host (as what, flat files?), I don't imagine that cronjobs are the best solution available.