Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Modern websites execute JavaScript that render DOM nodes that are displayed on the browser.

For example if you look at this site on the browser https://pota.quack.uy/ and do `curl https://pota.quack.uy/` do you see any of the text that is rendered in the browser as output of the curl command?

You don't, because curl doesn't execute JavaScript, and that text comes from JavaScript. One way to fix this problem, is by having a Node.js instance running that does SSR, so when your curl command connects to the server, a node instance executes JavaScript that is streamed/served to curl. (node is running a web server)

Another way, without having to execute JavaScript in the server is to crawl yourself, let's say in localhost, (you do not even need to deploy) then upload the result to a web server that could serve the files.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: