$Amazon.com(AMZN)$ is holding its annual conference, re:Invent, this week, and for infra nerds like me, it's fun to dig into all of the announcements! Here's a quick summary of what they announced in their keynotes across:
1) Compute
EC2 was a foundational product for AWS, and the OG compute instance. Today, AWS has ~850 compute instance types across 126 different families (for example, they have ~14 instance types just for Nvidia GPUs). From 1 instance type to ~850 today! They also started building their own chips (Graviton in 2018, Inferentia in 2019 and Trainium in 2022). They disclosed a cool stat: in 2019 AWS was a $35b business. Today, there is as much compute running on Graviton in the AWS fleet as there was across all of AWS in 2019!
This week they launched a few new compute instance types (and chips!):
A) EC2 P6 Instances: These will feature Nvidia Blackwell chips and be available next year
B) EC2 Trn2 Instances: This compute instance has 16 Trainium2 chips interconnected with NeuronLink
C) EC2 Trn2 UltraServers: This compute instance combines 4 Trn2 Instances into one with NeuronLink (so 4 x 16 = 64 Trainium2 chips)
D) Trainium3 chip - coming next year!
2) Storage
If EC2 was the OG compute product, its counterpart in storage is S3! AWS largely has 5 different tiers of S3: Standard, Standard-Infrequent Access, Intelligent-Tiering, Express One Zone, and Glacier.
Over the last few years we've seen the emergence of table formats like Delta, Iceberg and Hudi, which sit on top of different tiers of S3 (or other object stores).
This week AWS announced a new S3 tier aimed specifically at Iceberg tables called S3 Table Buckets.
They also announced S3 Metadata, a service that helps you discover and manage metadata in S3.
3) Databases
Databases have long been another core component of AWS. RDS is commonly thought of as their flagship database product. Then they launched DynamoDB, which some say spawned the NoSQL movement. Then came Aurora, ElastiCache, DocumentDB, Neptune, MemoryDB, Keyspaces and Timestream as purpose-built databases.
Aurora is one of their more popular databases, with full Postgres and MySQL compatibility. This week they announced a new product called Aurora DSQL, which is a distributed, multi-region version of Aurora (think of this as a similar product to CockroachDB or Google Spanner).
4) Foundation Models
If there was a theme of the conference, it was "Choice, Choice, Choice!" They believe there will never be one model to rule them all (I agree), but instead a number of models purpose-built for different use cases. Might the number of models follow the same trajectory as the number of AWS compute instance types, which is now at ~850?? AWS wants to expose their customers to every model possible.
This week they launched 6 of their own foundation models under the Nova family:
A) Nova Micro is their text-in, text-out model
B) Nova Lite / Pro / Premier are their multi-modal models (input text, images and video, output text) with varying levels of size / performance. Lite is comparable to GPT-4o mini, while Pro is comparable to GPT-4o. Premier comes out next year
C) Nova Canvas is their image generation model
D) Nova Reel is their video generation model
5) SageMaker
AWS breaks down their GenAI stack into roughly three buckets. The first is infrastructure for foundation model training and inference (SageMaker, compute instances, custom chips, etc). The second is tools to build with LLMs (Bedrock). And the third is applications that leverage foundation models (Amazon Q). Let's start with their new product announcements in SageMaker.
First off - my impression after the Day 1 keynote is they're really starting to bucket a lot of services under SageMaker now (like Redshift). This week they announced:
A) Amazon SageMaker Lakehouse. AWS has officially entered the Lakehouse race! Their lakehouse aims to be the single place to unify analytics and AI: an open, secure data lakehouse compatible with Iceberg tables (and their S3 Table Buckets I mentioned above).
B) Amazon SageMaker Unified Studio: everything you need for fast analytics, AI, data processing, search, data prep, etc.
C) Zero ETL for third party apps
D) SageMaker HyperPod Flexible Training Plans. This helps make training models more efficient. Training Plans help you create a customized schedule that optimizes the provisioning and allocation of accelerated compute instances to train large AI models within a specified timeline, budget and set of compute requirements
E) SageMaker HyperPod Task Governance. Dynamically allocate accelerated compute resources across inference, fine-tuning, training and other tasks so that high-priority tasks are completed on time. You can also monitor and audit how compute is allocated.
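To make the "high priority tasks get compute first" idea from item E a bit more concrete, here's a minimal sketch of priority-based allocation. This is my own toy illustration, not how HyperPod actually schedules anything: the `Task` class, the `allocate` function, the task names and the "higher number = higher priority" convention are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int       # assumed convention: higher number = more important
    accelerators: int   # number of accelerators the task asks for


def allocate(tasks: list[Task], capacity: int) -> dict[str, int]:
    """Greedy sketch: serve tasks in priority order until capacity runs out."""
    grants: dict[str, int] = {}
    remaining = capacity
    for task in sorted(tasks, key=lambda t: t.priority, reverse=True):
        granted = min(task.accelerators, remaining)
        grants[task.name] = granted
        remaining -= granted
    return grants


# Hypothetical workload competing for 16 accelerators.
tasks = [
    Task("nightly-fine-tune", priority=1, accelerators=8),
    Task("prod-inference", priority=3, accelerators=4),
    Task("pretraining-run", priority=2, accelerators=16),
]
plan = allocate(tasks, capacity=16)
print(plan)
```

In this toy run the inference task is fully served, the pretraining run gets whatever is left, and the low-priority fine-tune waits. A real scheduler would presumably also handle preemption, quotas and deadlines, which this sketch ignores.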
6) Bedrock
Bedrock, mentioned above, is a service that facilitates the development and scaling of generative AI applications. This week AWS announced the following new products available in Bedrock:
A) Model Distillation in Bedrock. This allows customers to take a large model and distill it down to a smaller model based on their enterprise data. Distilled models are generally much smaller, cheaper to run, and more customized to your own enterprise data.
B) Automated Reasoning Checks in Bedrock. This product is aimed at preventing factual errors caused by hallucinations
C) Bedrock Agents Support for Multi-Agent Collaboration. This allows agents to support more complex workflows
D) Bedrock Marketplace. A marketplace with access to more than 100 emerging foundation models
E) Prompt Caching on Bedrock. Models charge per input and output token, but oftentimes companies are running similar prompts. Prompt caching "stores" similar queries / prompts to avoid redundant token expenses (and reduce latency)
F) Intelligent Prompt Routing on Bedrock. AWS is all about choice, but what if you can't decide which model to use? Bedrock will automatically route prompts to different models to optimize response quality and costs
G) Structured Data Retrieval in Bedrock Knowledge Bases. Seamlessly integrate structured data for RAG. Can use natural language to query data in SageMaker Lakehouse, Redshift, S3 tables with Iceberg, etc.
H) Support for GraphRAG in Bedrock Knowledge Bases. Generate more relevant responses for generative AI applications using knowledge graphs
I) Bedrock Data Automation. Transform unstructured multi-modal content (documents, video, images, audio) with a single API for GenAI applications
J) Multimodal Toxicity Detection. Safeguards for image generation
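To see why prompt caching (item E above) saves money, here's a rough toy cost model. It is not the Bedrock API: the `PrefixCache` class, the word-count "tokenizer", the price and the cached-token multiplier are all hypothetical, and I'm assuming the cache works on a shared prompt prefix that gets billed at a lower rate when it repeats.

```python
import hashlib
from dataclasses import dataclass, field

# Hypothetical numbers for illustration only; not real Bedrock pricing.
PRICE_PER_INPUT_TOKEN = 1.0      # arbitrary cost units
CACHED_TOKEN_MULTIPLIER = 0.25   # assume cached tokens cost a quarter of full price


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


@dataclass
class PrefixCache:
    """Remembers which shared prompt prefixes have already been processed."""
    seen: set = field(default_factory=set)

    def input_cost(self, shared_prefix: str, question: str) -> float:
        key = hashlib.sha256(shared_prefix.encode("utf-8")).hexdigest()
        prefix_cost = count_tokens(shared_prefix) * PRICE_PER_INPUT_TOKEN
        if key in self.seen:
            # Cache hit: the repeated prefix is billed at the discounted rate.
            prefix_cost *= CACHED_TOKEN_MULTIPLIER
        else:
            # Cache miss: pay full price once and remember the prefix.
            self.seen.add(key)
        return prefix_cost + count_tokens(question) * PRICE_PER_INPUT_TOKEN


cache = PrefixCache()
system_prompt = "You are a claims assistant. " * 50  # long shared context (250 words)
first = cache.input_cost(system_prompt, "Summarize claim 17")
second = cache.input_cost(system_prompt, "Summarize claim 18")
print(first, second)
```

The first call pays full price for the long shared context, while the second only pays full price for the new question. The longer and more repetitive the shared context, the bigger the savings.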
7) Applications that leverage foundation models
For AWS, this is largely their Q Series of products. Amazon Q is broken down into Amazon Q Developer and Amazon Q Business.
Amazon Q Developer is their AI powered assistant for software development.
Amazon Q Business is their assistant for leveraging companies' internal data (internal search). Think of this as enterprise search across internal apps, wikis, messaging apps, databases, etc
AWS announced a number of new products across both Q Developer and Q Business:
A) Q Developer Agents to generate unit tests, code reviews and documentation. Writing code is just one part of a developer's workflow! There are other tedious parts Q Developer will now help automate. Q Developer is also now available in GitLab
B) Q Developer Agents to transform .NET applications from Windows to Linux. This expands on their prior transformation agent that helped upgrade Java code
C) Q Developer Agent to transform mainframe applications to accelerate cloud migrations. Another code transformation agent.
D) Q Developer Agent to investigate issues across AWS environment. Think of this as a copilot for AWS CloudWatch
E) Q Developer available in SageMaker Canvas. Now you can build ML models from natural language
F) Amazon Q Business Apps. This one was pretty cool, and seems to take a shot at old school RPA. This product can monitor a human executing a workflow (like insurance claims processing), and then generate an Agent (workflow) app that will mirror the same workflow. All without writing any code to build this internal app / workflow
G) Integrating Q Business with QuickSight. Now you can use Q Business to search across your structured data in databases.