
IBM Invests Heavily In ‘Important’ Open Source Apache Spark Big Data Project

IBM has pledged significant resources and up to 3,500 researchers to the Apache Spark big data platform, which the company calls “the most important new open source project in a decade that is being defined by data.”

Spark was originally developed in 2009 at the AMPLab at UC Berkeley, of which IBM is a founding partner, and has gained popularity because of its perceived ease of use and efficient memory management.

Its supporters claim Spark runs up to 100 times faster than Hadoop’s MapReduce when analysing data in memory, and ten times faster on disk. Spark had 465 contributors as of 2014, making it the most active project in the Apache Software Foundation and the most active open source Big Data project.

IBM Spark

IBM says this commitment from the open source community keeps Spark in a constant state of improvement, and that it wants to aid the development of the platform with its own contributions.

“IBM has been a decades long leader in open source innovation. We believe strongly in the power of open source as the basis to build value for clients, and are fully committed to Spark as a foundational technology platform for accelerating innovation and driving analytics across every business in a fundamental way,” said Beth Smith, general manager of IBM’s analytics platform. “Our clients will benefit as we help them embrace Spark to advance their own data strategies to drive business transformation and competitive differentiation.”

Spark is to be built into IBM’s analytics and commerce platforms, and the company will offer Spark as a cloud service through BlueMix. Watson Health Cloud will use the engine to help medical researchers analyse population health data, and IBM’s own SystemML machine learning technology is to be open sourced to aid Spark’s development.

Up to 3,500 researchers and developers will work on Spark-related projects across the world and IBM has committed to educating more than one million data scientists and data engineers about the platform.

High-profile users of Spark include NASA and the SETI Institute, which are analysing terabytes of complex deep-space radio signals for evidence of extra-terrestrial life.


Steve McCaskill

Steve McCaskill is editor of TechWeekEurope and ChannelBiz. He joined as a reporter in 2011 and covers all areas of IT, with a particular interest in telecommunications, mobile and networking, along with sports technology.
