The plot thickens in the Apache Spark ecosystem as IBM enters it at full speed. IBM is dedicating 3,500 developers to Apache Spark and has announced a number of partnerships with analytics companies.
What does IBM’s strong entry mean for the Spark community?
IBM Partners with Data Modeling and Visualization Companies
IBM announced several partners for Spark cooperation in modeling and visualization:
These partnerships indicate that Spark will be integrated with self-service analytics and data visualization tools.
Spark Gets Integrated in Analytics Pipeline
The data analytics pipeline will be affected in various ways. For example, Arcadia Data, a recent startup, adds data visualization on top of Hadoop and now plans to do the same for Spark. The same applies to ClearStory Data.
Datameer, together with Tableau, will have a shot at self-service analytics integrated with Spark. This is an important announcement that extends Spark usage to business users. Interactive analytics will eventually become a reality.
Spark development tools are also covered, as IBM announced a partnership with TypeSafe. TypeSafe provides an application development platform for Scala, which is the main language for Spark.
Spark Education Spreads Out Rapidly
Spark education will get a boost as IBM partners with Spark educators. These organizations and companies already provide, or will provide, Spark-related courses:
Due to strong interest in Spark, universities are also stepping up their Spark courses. BerkeleyX runs a MOOC (Massive Open Online Course) on Spark. This particular course has attracted over 67,000 students.
IBM’s investment definitely draws more attention to Apache Spark. Spark will move into the mainstream from its current niche position, in a field where Hadoop has so far dominated big data analytics. Spark will be a strong competitor to Hadoop.
What remains to be seen is the response from Amazon, Google and Microsoft. These companies invest heavily in analytics and machine learning services. Just recently, Amazon announced that Spark is available out of the box in Amazon Elastic MapReduce (EMR).
As more companies focus on integrating Spark into their offerings, Spark may eventually become ubiquitous. It will run under the hood and no longer be a visible component of its own. Spark is on its way to becoming a common platform like Linux or Hadoop.