IBM Invests Heavily In ‘Important’ Open Source Apache Spark Big Data Project

IBM has pledged significant resources and up to 3,500 researchers to the Apache Spark big data platform, which the company calls “the most important new open source project in a decade that is being defined by data.”

Spark was originally developed in 2009 at the AMPLab at UC Berkeley, of which IBM is a founding partner, and has gained popularity because of its perceived ease of use and efficient memory management.

Its supporters claim Spark is 100 times faster than Hadoop’s MapReduce when analysing data in memory and ten times faster on disk. Spark had 465 contributors as of 2014, making it the most active project in the Apache Software Foundation and the most active open source big data project.

IBM Spark

IBM says this commitment from the open source community means Spark is in a constant state of improvement, and the company wants to aid the platform’s development with its own contributions.

“IBM has been a decades long leader in open source innovation. We believe strongly in the power of open source as the basis to build value for clients, and are fully committed to Spark as a foundational technology platform for accelerating innovation and driving analytics across every business in a fundamental way,” said Beth Smith, general manager of IBM’s analytics platform. “Our clients will benefit as we help them embrace Spark to advance their own data strategies to drive business transformation and competitive differentiation.”

Spark is to be built into IBM’s analytics and commerce platforms, and the company will offer Spark as a cloud service through Bluemix. Watson Health Cloud will use the engine to help medical researchers analyse population health data, and IBM’s own SystemML machine learning technology is to be open sourced to aid Spark’s development.

Up to 3,500 researchers and developers will work on Spark-related projects across the world, and IBM has committed to educating more than one million data scientists and data engineers about the platform.

High-profile users of Spark include NASA and the SETI Institute, which are analysing terabytes of complex deep-space radio signals for evidence of extra-terrestrial life.

Steve McCaskill

Steve McCaskill is editor of TechWeekEurope and ChannelBiz. He joined as a reporter in 2011 and covers all areas of IT, with a particular interest in telecommunications, mobile and networking, along with sports technology.
