AWS re:Invent 2022: Data and Machine Learning

On the second day of Amazon Web Services (AWS) re:Invent, Swami Sivasubramanian, vice president of AWS Data and Machine Learning (ML), revealed the latest innovations during his keynote.

To start, Sivasubramanian announced the launch of Amazon Athena for Apache Spark, which he said will provide organizations with a more intuitive way to run complex data analytics. He noted that Apache Spark will run three times faster on AWS.

Next, he announced the general availability of Amazon DocumentDB Elastic Clusters, a fully managed solution for quickly scaling document workloads of any size. Elastic Clusters integrates with other AWS services in the same way as Amazon DocumentDB.

Amazon SageMaker now supports Geospatial ML, giving access to multiple new kinds of data. A demo of the updates showed how it could help save lives in natural disasters by predicting dangerous road conditions caused by rising flood waters, and how it can guide first responders to the best routes for sending emergency supplies and evacuating people as fast as possible.

High-resolution satellite imagery, supplied by third-party data providers within SageMaker, shows which roads are fully submerged, helping to keep emergency responders up to date.

During the keynote, Sivasubramanian emphasized the importance of reliability and security for all organizations. To deliver this, AWS announced a new Amazon Redshift Multi-AZ feature that offers high availability and reliability for workloads.

Other security announcements included an extension to Amazon GuardDuty, a threat detection service that continuously monitors AWS accounts and workloads for malicious activity. The extension, Amazon GuardDuty RDS Protection, uses ML to identify threats and suspicious activity against data stored in Aurora databases.

To address ML governance challenges, AWS is launching three new capabilities for SageMaker: ML Governance Role Manager, Model Cards, and Model Dashboard. According to Sivasubramanian, these capabilities should make using ML a more seamless experience.

He also announced Amazon DataZone, which aims to help users organize, share, and govern data across organizations.

“I have had the benefit of being an early customer of DataZone,” he said. “I leverage DataZone to run the AWS weekly business review meeting where we assemble data from our sales pipeline and revenue projections to inform our business strategy.”

During the keynote, Shikha Verma, head of product for Amazon DataZone, led a demo showing how organizations can use the product to create more effective advertising campaigns and get the most out of their data.

“Every enterprise is made up of multiple teams that own and use data across a variety of data stores. Data people have to pull this data together but do not have an easy way to access, or even have visibility to this data. Amazon DataZone fills this gap,” Verma said.

According to Verma, DataZone provides a unified environment where everyone in an organization, from data producers to consumers, can access and share data in a governed manner.

Other product and feature updates announced during the keynote include a new auto-copy feature for loading data from Amazon S3 into Amazon Redshift, which makes it easier to create and maintain simple data ingestion pipelines.

The company is also encouraging ML education in schools, supporting community colleges with an AWS Machine Learning University program that trains educators. In addition, AWS is building an AI and ML scholarship program that will award a total of US$10 million to 2,000 selected students.
