IBM Invests Heavily In ‘Important’ Open Source Apache Spark Big Data Project

IBM has pledged significant resources and up to 3,500 researchers to the Apache Spark big data platform, which the company calls “the most important new open source project in a decade that is being defined by data.”

Spark was originally developed in 2009 at the AMPLab at UC Berkeley, of which IBM is a founding partner, and has gained popularity because of its perceived ease of use and efficient memory management.

Its supporters claim Spark is up to 100 times faster than Hadoop’s MapReduce when analysing data in memory, and ten times faster on disk. Spark had 465 contributors as of 2014, making it the most active project in the Apache Software Foundation and the most active open source Big Data project.

IBM Spark

IBM says this commitment from the open source community means Spark is in a constant state of improvement, and the company wants to aid the platform’s development with its own contributions.

“IBM has been a decades long leader in open source innovation. We believe strongly in the power of open source as the basis to build value for clients, and are fully committed to Spark as a foundational technology platform for accelerating innovation and driving analytics across every business in a fundamental way,” said Beth Smith, general manager of IBM’s analytics platform. “Our clients will benefit as we help them embrace Spark to advance their own data strategies to drive business transformation and competitive differentiation.”

Spark is to be built into IBM’s analytics and commerce platforms, and the company will offer Spark as a cloud service through Bluemix. Watson Health Cloud will use the engine to help medical researchers analyse population health data, and IBM’s own SystemML machine learning technology is to be open sourced to aid Spark’s development.

Up to 3,500 researchers and developers will work on Spark-related projects across the world and IBM has committed to educating more than one million data scientists and data engineers about the platform.

High-profile users of Spark include NASA and the SETI Institute, which are analysing terabytes of complex deep-space radio signals for evidence of extra-terrestrial life.

Steve McCaskill

Steve McCaskill is editor of TechWeekEurope and ChannelBiz. He joined as a reporter in 2011 and covers all areas of IT, with a particular interest in telecommunications, mobile and networking, along with sports technology.
