Amazon Hopes AWS Macie Machine Learning Tool Will Stem Cloud Data Loss
AWS SUMMIT: Amazon has released Macie, an AI-powered tool aimed at securing data held in the cloud, following a string of inadvertent data leaks
Amazon has unveiled a machine learning-based tool aimed at securing sensitive data held in the cloud, after a number of high-profile data leaks involving customers of Amazon Web Services (AWS).
The tool, called Macie, was announced at the AWS New York Summit event along with an automated extract, transform and load (ETL) service and a unified repository of AWS’ data migration tools.
Data insecurity
The announcement follows several data breaches in which major companies were found to have stored sensitive data on AWS Simple Storage Service (S3) in a way that left it publicly accessible.
Last month it was disclosed that Verizon had exposed data on about 6 million customers in this way. Similar incidents exposed voter information held by the Republican National Committee (RNC) and customer data held by wrestling entertainment company WWE.
The RNC breach, disclosed in June, affected more than 198 million people, or about 61 percent of the US population, and was the country’s largest-ever voter data exposure.
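For readers wondering what "publicly accessible" means in practice, the sketch below checks a bucket access control list (ACL) for grants to S3's predefined public groups. It is a minimal illustration, not a complete audit: the dictionary mirrors the general shape of an S3 bucket ACL, the function name `public_grants` and the sample data are hypothetical, and bucket policies, which can also make data public, are not examined.

```python
# URIs S3 uses for its predefined groups covering anonymous users
# and any authenticated AWS account.
PUBLIC_GROUP_URIS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}


def public_grants(acl):
    """Return the permissions an ACL document grants to public groups."""
    found = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") in PUBLIC_GROUP_URIS:
            found.append(grant.get("Permission"))
    return found


# Hypothetical ACL for illustration only: the owner has full control,
# and everyone on the internet has read access.
example_acl = {
    "Grants": [
        {
            "Grantee": {"Type": "CanonicalUser", "ID": "owner-id-placeholder"},
            "Permission": "FULL_CONTROL",
        },
        {
            "Grantee": {
                "Type": "Group",
                "URI": "http://acs.amazonaws.com/groups/global/AllUsers",
            },
            "Permission": "READ",
        },
    ]
}

print(public_grants(example_acl))  # prints ['READ']
```

A non-empty result means at least one permission is open to a public group and the bucket deserves a closer look.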
Macie, a fully managed service, scans users’ data repositories for sensitive data including personal information or intellectual property and uses machine learning to establish a baseline for how it’s typically accessed. The system then generates alerts when it detects unauthorised access or inadvertent data leaks.
It’s currently available for S3, with support for other AWS data stores scheduled for later this year. It can be enabled from the AWS management console, with users paying only for the storage classified and events analysed.
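Amazon has not published how Macie builds its baseline, but the general idea of comparing current activity against a learned norm can be sketched with a simple statistical check. The example below is an assumption-laden toy, not Macie's method: the function name `is_anomalous`, the threshold of three standard deviations and the sample access counts are all hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, today, threshold=3.0):
    """Flag today's access count if it sits far above the historical baseline."""
    if len(history) < 2:
        return False  # not enough data to establish a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly steady history: any increase counts as unusual.
        return today > baseline
    return (today - baseline) / spread > threshold


# Hypothetical daily read counts for one data store.
history = [10, 12, 11, 9, 13, 10, 11]

print(is_anomalous(history, 40))  # prints True
print(is_anomalous(history, 12))  # prints False
```

A production system would weigh far more signals, such as who is accessing the data, from where and what the content is, but the baseline-then-alert pattern is the same.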
Amazon said the systems used to safeguard data in the cloud tend to require manual processes that can be cumbersome, particularly when large amounts of data are involved.
“By using machine learning to understand the content and user behaviour of each organisation, Amazon Macie can cut through huge volumes of data with better visibility and more accurate alerts,” stated AWS chief information security officer Stephen Schmidt.
AWS Glue
The ETL service, called AWS Glue, was first announced by Amazon chief technology officer Werner Vogels at the AWS re:Invent conference in December, and is now accessible via the AWS management console. So far it has only rolled out to AWS’ US East region, with more regions to follow in the coming months.
It can be used to quickly prepare and load data into S3 storage or a number of databases running on Amazon’s cloud service.
Amazon said up to 75 percent of the time spent on analytics projects can be taken up with extracting data, normalising it and loading it into data stores, including the labour-intensive hand-coding of ETL scripts.
The process can also involve dedicated hardware that sits idle between jobs. Amazon said AWS Glue is “serverless”, meaning users pay only for the computing resources they use during jobs.
“We developed AWS Glue to eliminate much of the undifferentiated heavy lifting involved with ETL,” stated Raju Gulabani, AWS’ vice president of databases, analytics and AI.
Migration Hub
A third announcement is AWS Migration Hub, which brings together a number of migration systems introduced in recent years, including AWS Application Discovery Service, AWS Server Migration Service and AWS Database Migration Service.
The platform also includes third-party offerings from specialists including CloudEndure and Racemi.
In addition to putting all the tools in one place, Migration Hub guides users through the migration process and tracks the status of each migration, AWS said.