One of the "buzziest" topics in the Big Data scene this past year has been Spark. The powerful open source processing engine, first developed in 2009, has gained real traction, and many have covered the story of its community adoption and growth.
As one would expect, though, some of the buzz spun out of control, so much so that some went on to write that "Hadoop was dead" or that "Spark would eventually replace Hadoop".
Such reports are misinformed and overly sensational. If you have a few minutes, I suggest watching the two-minute video below on the key role that Spark plays and its relationship with Hadoop. In this interview, we talk about security, speed and other key items that make Hadoop relevant to business users.
For a deeper understanding of the topic, check out the great piece that our VP of Product Management, Josh Klahr, authored for ReadWrite last week. My favorite passage is below (the full piece is here):
"We need to stop playing Spark and Hadoop off each other and understand how they will coexist. Hadoop will continue to be used as a platform for scale-out data storage, parallel processing, and clustered workload management. Spark will continue to be used for both batch-oriented and interactive scale-out data-processing needs."