Except in the cloud. HDFS is only useful if you own the hardware. HDFS is still useful for these workloads, but calling the stack "Hadoop" when Spark is the execution engine doesn't make sense: you are using Spark, which depends on HDFS only for on-premises installs. And if you can be in the cloud, you should be.
It is also worth noting that many people run Spark against a database like Cassandra rather than HDFS, so HDFS isn't even universal for on-premises installs.
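For concreteness, here is a minimal sketch of that setup using the DataStax spark-cassandra-connector; the connection host, keyspace, and table names are placeholders, not anything from a real deployment:

    import org.apache.spark.sql.SparkSession

    object CassandraExample {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("spark-on-cassandra")
          // Point the connector at the Cassandra cluster -- no HDFS involved.
          .config("spark.cassandra.connection.host", "127.0.0.1")
          .getOrCreate()

        // Load a table through the connector's DataSource.
        // "ks" and "events" are placeholder keyspace/table names.
        val df = spark.read
          .format("org.apache.spark.sql.cassandra")
          .options(Map("keyspace" -> "ks", "table" -> "events"))
          .load()

        df.show()
        spark.stop()
      }
    }

Nothing in that job touches HDFS; the connector talks to Cassandra directly.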
I was a Hadoop evangelist, but its time has passed. It is a foundation, not the tool you use to get work done.