AWS Machine Learning Blog

Empowering everyone with GenAI to rapidly build, customize, and deploy apps securely: Highlights from the AWS New York Summit

See how AWS is democratizing generative AI with innovations like Amazon Q Apps to make AI apps from prompts, Amazon Bedrock upgrades to leverage more data sources, new techniques to curtail hallucinations, and AI skills training.

Search enterprise data assets using LLMs backed by knowledge graphs

In this post, we present a generative AI-powered semantic search solution that empowers business users to quickly and accurately find relevant data assets across various enterprise data sources. In this solution, we integrate large language models (LLMs) hosted on Amazon Bedrock backed by a knowledge base that is derived from a knowledge graph built on Amazon Neptune to create a powerful search paradigm that enables natural language-based questions to integrate search across documents stored in Amazon Simple Storage Service (Amazon S3), data lake tables hosted on the AWS Glue Data Catalog, and enterprise assets in Amazon DataZone.

Embodied AI Chess with Amazon Bedrock

In this post, we demonstrate Embodied AI Chess with Amazon Bedrock, bringing a new dimension to traditional chess through generative AI capabilities. Our setup features a smart chess board that can detect moves in real time, paired with two robotic arms executing those moves. Each arm is controlled by different FMs—base or custom. This physical implementation allows you to observe and experiment with how different generative AI models approach complex gaming strategies in real-world chess matches.

Efficiently train models with large sequence lengths using Amazon SageMaker model parallel

In this post, we demonstrate how the Amazon SageMaker model parallel library (SMP) addresses this need through support for new features such as 8-bit floating point (FP8) mixed-precision training for accelerated training performance and context parallelism for processing large input sequence lengths, expanding the list of its existing features.

Getting started with Amazon Bedrock Agents custom orchestrator

In this post, we explore how Amazon Bedrock Agents simplify the orchestration of generative AI workflows, particularly with the introduction of the custom orchestrator feature. You can use the custom orchestrator to fine-tune and optimize agentic workflows that align more closely with specific business and operational needs. We outline the feature’s key benefits, including full control over orchestration, real-time adjustments, and reusability, followed by a breakdown of how it manages state transitions and contract-based interactions between Amazon Bedrock Agents and AWS Lambda.

Use Amazon Bedrock Agents for code scanning, optimization, and remediation

For enterprises in the realm of cloud computing and software development, providing secure code repositories is essential. As sophisticated cybersecurity threats become more prevalent, organizations must adopt proactive measures to protect their assets. Amazon Bedrock offers a powerful solution by automating the process of scanning repositories for vulnerabilities and remediating them. This post explores how you can use Amazon Bedrock to enhance the security of your repositories and maintain compliance with organizational and regulatory standards.

Create a generative AI assistant with Slack and Amazon Bedrock

Seamless integration of customer experience, collaboration tools, and relevant data is the foundation for delivering knowledge-based productivity gains. In this post, we show you how to integrate the popular Slack messaging service with AWS generative AI services to build a natural language assistant where business users can ask questions of an unstructured dataset.

Unleash your Salesforce data using the Amazon Q Salesforce Online connector

In this post, we walk you through configuring and setting up the Amazon Q Salesforce Online connector. Thousands of companies worldwide use Salesforce to manage their sales, marketing, customer service, and other business operations. The Salesforce cloud-based platform centralizes customer information and interactions across the organization, providing sales reps, marketers, and support agents with a unified 360-degree view of each customer. With Salesforce at the heart of their business, companies accumulate vast amounts of customer data within the platform over time. This data is incredibly valuable for gaining insights into customers, improving operations, and guiding strategic decisions. However, accessing and analyzing the blend of structured data and unstructured data can be challenging. With the Amazon Q Salesforce Online connector, companies can unleash the value of their Salesforce data.

Flow diagram of custom hallucination detection and mitigation : The user's question is fed to a search engine (with optional LLM-based step to pre-process it to a good search query). The documents or snippets returned by the search engine, together with the user's question, are inserted into a prompt template - and an LLM generates a final answer based on the retrieved documents. The final answer can be evaluated against the reference answer from the dataset to get a custom hallucination score. Based on a pre-defined empirical threshold, a customer service agent is requested to join the conversation using SNS notification

Reducing hallucinations in large language models with custom intervention using Amazon Bedrock Agents

This post demonstrates how to use Amazon Bedrock Agents, Amazon Knowledge Bases, and the RAGAS evaluation metrics to build a custom hallucination detector and remediate it by using human-in-the-loop. The agentic workflow can be extended to custom use cases through different hallucination remediation techniques and offers the flexibility to detect and mitigate hallucinations using custom actions.

Deploy Meta Llama 3.1-8B on AWS Inferentia using Amazon EKS and vLLM

In this post, we walk through the steps to deploy the Meta Llama 3.1-8B model on Inferentia 2 instances using Amazon EKS. This solution combines the exceptional performance and cost-effectiveness of Inferentia 2 chips with the robust and flexible landscape of Amazon EKS. Inferentia 2 chips deliver high throughput and low latency inference, ideal for LLMs.

Serving LLMs using vLLM and Amazon EC2 instances with AWS AI chips

The use of large language models (LLMs) and generative AI has exploded over the last year. With the release of powerful publicly available foundation models, tools for training, fine tuning and hosting your own LLM have also become democratized. Using vLLM on AWS Trainium and Inferentia makes it possible to host LLMs for high performance […]