Categories: CloudDatacentre

Amazon Launches Hadoop Cloud Computing Beta

Amazon Web Services is using the open-source Apache Hadoop distributed computing technology to make it easier for users to access large amounts of computing power to run data-intensive tasks.

AWS (Amazon Web Services) announced on 2 April the public beta of its Amazon Elastic MapReduce initiative, a service designed for businesses, researchers and analysts with large number-crunching projects such as Web indexing, data mining, financial analysis and scientific simulations, according to AWS officials.

Using a hosted Hadoop framework, users can instantly provision as much compute capacity as they need from Amazon’s EC2 (Elastic Compute Cloud) platform to perform the tasks, and pay only for what they use.

Hadoop, an open-source implementation of Google’s MapReduce programming model, is already being used by such companies as Yahoo and Facebook. Google itself uses MapReduce only internally.


There are efforts underway to increase the use of Hadoop in enterprise data centers. Most recently, a startup, Cloudera—which calls itself the commercial Hadoop company—announced on 16 March the availability of its first product, the Cloudera Distribution for Hadoop. The product lets users store and process petabytes of data that is often distributed among thousands of servers.

Cloudera also created a portal to help users install and use the company’s free product.

“Cloudera is advancing Hadoop technology to make it easier for everyone to store and process the same types of big data that large Web companies are successfully using in their businesses,” Christophe Bisciglia, the founder of Cloudera and former manager of Google’s Hadoop cluster, said in a statement at the time of Cloudera’s announcement.

According to AWS officials, running Hadoop and other MapReduce-based clusters on the Amazon EC2 cloud computing platform was previously a difficult task that forced users to do their own setup, management and cluster tuning. With Amazon Elastic MapReduce, those tasks are less time-consuming and more affordable, enabling users to spin up and tear down Hadoop-based clusters on EC2 in moments.
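To give a sense of what "spinning up a cluster in moments" looks like in practice, here is a hypothetical sketch of the kind of request a modern AWS SDK (boto3) would send to the Elastic MapReduce RunJobFlow API. The field names follow that API, but the bucket paths, instance types and job JAR are illustrative assumptions, not anything described in the article.

```python
# Hypothetical sketch of an Elastic MapReduce cluster request.
# Field names follow the EMR RunJobFlow API; all values (bucket names,
# instance types, the job JAR) are illustrative assumptions.
job_flow_params = {
    "Name": "word-count-demo",                 # arbitrary cluster name
    "LogUri": "s3://my-bucket/emr-logs/",      # where EMR writes Hadoop logs
    "Instances": {
        "MasterInstanceType": "m5.xlarge",     # one master node
        "SlaveInstanceType": "m5.xlarge",      # worker nodes
        "InstanceCount": 3,                    # total EC2 instances to provision
        "KeepJobFlowAliveWhenNoSteps": False,  # tear the cluster down when done
    },
    "Steps": [
        {
            "Name": "word-count",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                # A user-supplied MapReduce job, reading input from S3
                # and writing results back to S3.
                "Jar": "s3://my-bucket/jobs/wordcount.jar",
                "Args": ["s3://my-bucket/input/", "s3://my-bucket/output/"],
            },
        }
    ],
}

# With AWS credentials configured, this dict would be submitted as:
#   boto3.client("emr").run_job_flow(**job_flow_params)
```

Because `KeepJobFlowAliveWhenNoSteps` is false, the cluster terminates itself once the step finishes, so the user pays only for the compute actually consumed.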

AWS also is offering sample applications and tutorials to help users get more comfortable with the new service. Amazon Elastic MapReduce automatically deploys and configures the number of EC2 instances users ask for, then launches a Hadoop implementation of the MapReduce tool. MapReduce then loads the data from Amazon S3 (Simple Storage Service) and divides it so it can be processed in a parallel fashion. The data is then recombined after processing, with the end results put back into S3.
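The divide-process-recombine flow described above is the essence of the MapReduce model. As a rough local illustration (not AWS code), the sketch below runs a word count with Python's `multiprocessing`: the input is split into chunks, each chunk is mapped in parallel, and the partial results are reduced into a final answer — the same phases that Elastic MapReduce distributes across EC2 instances with data held in S3.

```python
from collections import defaultdict
from multiprocessing import Pool

def map_chunk(lines):
    """Map phase: emit (word, 1) pairs for one chunk of input."""
    pairs = []
    for line in lines:
        for word in line.split():
            pairs.append((word.lower(), 1))
    return pairs

def reduce_pairs(all_pairs):
    """Reduce phase: recombine the partial results into final counts."""
    counts = defaultdict(int)
    for pairs in all_pairs:
        for word, n in pairs:
            counts[word] += n
    return dict(counts)

def word_count(lines, workers=2):
    # Divide the data so it can be processed in parallel,
    # as the article describes.
    chunks = [lines[i::workers] for i in range(workers)]
    with Pool(workers) as pool:
        mapped = pool.map(map_chunk, chunks)
    # Recombine the mapped output into the final result.
    return reduce_pairs(mapped)

if __name__ == "__main__":
    text = ["the quick brown fox", "the lazy dog", "the fox"]
    print(word_count(text))
```

On a real Hadoop cluster the framework also handles shuffling, fault tolerance and data locality, which this toy version omits.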

“Some researchers and developers already run Hadoop on Amazon EC2, and many of them have asked for even simpler tools for large-scale data analysis,” Adam Selipsky, vice president of product management and developer relations at AWS, said in a statement.

Jeffrey Burt

Jeffrey Burt is a senior editor for eWEEK and contributor to TechWeekEurope
