How to build AI solutions (the easy way) with AWS AI Services
AI is big business – and that shouldn’t be a surprise. As organizations become more and more reliant on interdependent systems and exponentially growing volumes of data, artificial intelligence (AI) is becoming a crucial tool for navigating this complexity.
Artificial intelligence (AI) and machine learning (ML) are the keys to mastering your data and using it to gain productivity and efficiency. They can enable you to unlock insights or effortlessly extract information from a chaotic haystack of mixed data sources.
What this means for you as a developer or IT professional is that you can expect to be asked to build this kind of capability. Although AI expertise is highly specialized (and hard to come by), any developer with a good background in AWS can build highly capable AI-powered solutions.
Building AI solutions with AWS Services
AWS provides a substantial selection of 22 AI Services and dozens of Machine Learning Services. These are relatively easy to configure and use to build truly unique AI solutions that deliver substantial value. AWS AI Services are pre-trained ‘out of the box,’ so all you need to do is customize them for your specific needs and integrate them.
These aren’t exactly ‘plug and play’ solutions, but they’re as close as you could reasonably hope for, and they massively reduce the potential development time for this kind of sophisticated solution.
In this blog, we’re going to look at two key AI Services from AWS: Textract and Lex.
These two key services have a broad range of application scenarios. They’re easy to set up and they can generate solid value by increasing productivity and creating better user experiences.
What are AWS Textract and Lex?
AWS Textract is a pre-trained AI service that provides automated document processing and data extraction from scanned documents. This can be harnessed to replace manual processes and accelerate data capture from documents. This data can then be used in a variety of processes.
Textract gives developers the ability to add customized, automated document processing to any workflow or combination of processes.
AWS Lex is a powerful AI chatbot service. It can use text or speech to trigger defined processes or retrieve information from a variety of sources. While ChatGPT may have captured the public’s attention, it’s important to remember that Amazon has been working on chatbots and Large Language Models (LLMs) for a lot longer, with a proven track record of success.
Since Alexa launched in 2014, Amazon has continued to develop its capabilities in the LLM domain, and it has a strong advantage when it comes to supporting infrastructure and a stable, tried-and-tested product that can be highly customized.
Both of these AI services can be used separately, or combined with other AWS AI Services to build highly complex functionalities. Let’s look closer at what each can do.
Using AWS Textract for AI-driven data capture and document processing
AWS Textract takes standard optical character recognition (OCR) and gives it added intelligence.
Scanned documents are analyzed by the AI which identifies key data and other elements. It then extracts these from the document and forwards the data to your selected processes.
In most cases, organizations will use an automated document processing capability to import data from physical documents into their ERPs. This way, it can be used in digital processes or migrated into a compliant data storage solution. However, there’s actually a lot of scope for using this functionality to drive other AI-supported processes including automated data-driven decision-making.
Textract can handle documents as imported S3 objects in PDF, TIFF, JPG, and PNG formats, with a recommended minimum resolution of 150 DPI.
For every document it processes, Textract identifies the key data and highlights its location with a bounding box in the original document. For each piece of data, Textract assigns a confidence score from 0 to 100, which indicates how certain the service is that it has correctly recognized the data.
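To make this concrete, here’s a minimal sketch of calling Textract from Python with boto3, assuming your AWS credentials are configured and the document has already been uploaded to S3 (the bucket and object names below are placeholders):

```python
import boto3

# Textract client; region and credentials come from your AWS configuration.
textract = boto3.client("textract")

# Analyze a scanned document stored in S3, asking Textract to also extract
# key-value pairs (FORMS) and tables (TABLES).
response = textract.analyze_document(
    Document={"S3Object": {"Bucket": "my-documents-bucket", "Name": "scans/invoice-001.png"}},
    FeatureTypes=["FORMS", "TABLES"],
)

# Each detected element comes back as a 'Block' with a confidence score
# and a bounding box describing where it sits on the page.
for block in response["Blocks"]:
    if block["BlockType"] == "LINE":
        box = block["Geometry"]["BoundingBox"]
        print(
            f"{block['Text']!r} "
            f"(confidence {block['Confidence']:.1f}, "
            f"left={box['Left']:.2f}, top={box['Top']:.2f})"
        )
```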
Customizing AWS Textract
Although the AI is pre-trained, it will require some tweaking until it performs reliably, just the way you want.
You can customize Textract so that confidence scores below a certain threshold are flagged. This is incredibly important for refining performance and verification, as it makes it clear when improvements are needed.
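For example, continuing from the response in the previous sketch, a simple threshold check can route anything Textract is less sure about to a human review queue (the 90% threshold is just an illustrative starting point):

```python
CONFIDENCE_THRESHOLD = 90.0  # illustrative value; tune it for your documents

def split_by_confidence(blocks, threshold=CONFIDENCE_THRESHOLD):
    """Separate Textract blocks into auto-accepted and needs-review lists."""
    accepted, needs_review = [], []
    for block in blocks:
        if "Confidence" not in block:  # e.g. PAGE blocks carry no score
            continue
        if block["Confidence"] >= threshold:
            accepted.append(block)
        else:
            needs_review.append(block)
    return accepted, needs_review

# Usage with the response from the previous sketch:
# accepted, needs_review = split_by_confidence(response["Blocks"])
# print(f"{len(needs_review)} blocks flagged for manual verification")
```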
There are two core APIs for general document processing: the Detect Document Text API, which detects printed text and handwriting in documents, and the Analyze Document API, which extracts specific data types and relationships (such as forms and tables) from a document.
In addition to these, there are three APIs for specific tasks: the Analyze Expense API, which extracts data from invoices and receipts; the Analyze ID API, which extracts key data from identity documents; and the Analyze Lending API, which classifies and extracts information from mortgage lending documents.
There’s also considerable scope for customization by using Queries tailored to your organization’s specific document types and processing needs, along with layout analysis that identifies structural elements such as titles, headers, and lists, making it easier to feed documents into downstream processes such as summarization.
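As a rough illustration, here’s how Queries can be passed to the Analyze Document API with boto3; the questions, aliases, and S3 location are assumptions for a hypothetical invoice workflow:

```python
import boto3

textract = boto3.client("textract")

response = textract.analyze_document(
    Document={"S3Object": {"Bucket": "my-documents-bucket", "Name": "scans/invoice-001.png"}},
    FeatureTypes=["QUERIES"],
    QueriesConfig={
        "Queries": [
            {"Text": "What is the invoice number?", "Alias": "INVOICE_NUMBER"},
            {"Text": "What is the total amount due?", "Alias": "TOTAL_DUE"},
        ]
    },
)

# QUERY blocks repeat the question; QUERY_RESULT blocks hold the answers.
for block in response["Blocks"]:
    if block["BlockType"] == "QUERY_RESULT":
        print(block["Text"], f"(confidence {block['Confidence']:.1f})")
```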
Important features of the AWS Textract AI Service
- Extracts various data types including handwritten text, typed data, and signatures.
- Deciphers messy, unclear pages (although this can affect the confidence score).
- Identifies ‘key-value pairs’ from forms and tables, so data from these are accurately captured and processed.
- Capable of extracting highly specific data types using Queries.
Textract is quite straightforward to set up, and it can be used to boost the efficiency of myriad data-capture processes that ordinarily take a lot of time, and are error-prone when performed manually.
Using the AWS Lex AI Service for speech-driven processes
Now, let’s look at Lex. With AWS Lex you can build, test, and deploy AI-powered chatbots with an incredibly short time-to-market and at a low cost.
These chatbots can use either text or speech inputs for queries, and AWS Lex enables you to offer chatbot functionality in two different ways: ‘Request and Response’ and ‘Streaming Conversation.’ These two approaches are better suited to quite different situations, and choosing the right one will make a big difference to the user experience and running costs.
With the ‘Request and Response’ type of conversation, each question or query is handled separately with its own API call, meaning that each question is processed as if it’s totally unrelated to the previous or subsequent one. This type of conversation is best suited to ‘one-off’ queries where quick answers are needed, or where a single process needs to be triggered by speech or text.
By contrast, the ‘Streaming Conversation’ type is more like a real conversation. In this mode, Lex will listen continuously and even offer ‘natural’ proactive responses that anticipate user intent and contexts. Streaming Conversation queries are processed in a single streaming API call, and are better suited to situations where there are more potential outcomes, where multiple contexts are in effect, or where multiple processes could be triggered.
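As a minimal sketch, a ‘Request and Response’ interaction with a deployed Lex V2 bot looks like this with boto3; the bot ID, alias ID, and session ID below are placeholders you’d replace with the values from your own bot:

```python
import boto3

# Runtime client for talking to a published Lex V2 bot.
lex = boto3.client("lexv2-runtime")

response = lex.recognize_text(
    botId="ABCDEFGHIJ",            # placeholder bot ID
    botAliasId="TSTALIASID",       # placeholder bot alias ID
    localeId="en_US",
    sessionId="demo-session-001",  # ties related turns to the same session
    text="I'd like to check my order status",
)

# The bot's reply (or replies) and the intent Lex matched.
for message in response.get("messages", []):
    print("Bot:", message["content"])
print("Matched intent:", response["sessionState"]["intent"]["name"])
```

Streaming conversations use the separate StartConversation operation instead, which relies on HTTP/2 bidirectional streaming and is only available in a subset of the AWS SDKs.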
AWS Lex’s integrated chatbot builder
One of the great things about Lex for developers is that it comes with an integrated automated chatbot designer, and it can learn from your own transcripts of typical conversations.
As a result, you can get a proof-of-concept chatbot up and running in just minutes, so long as you’re prepared. There are several key elements you need to build a chatbot, and it can save a lot of time if you can prepare these in advance.
6 key elements you need to build a Lex chatbot
- Learning material (transcripts)
- Selected language (or multiple languages)
- Defined intents (these are the expected outcome/s of a conversation)
- Defined statements or ‘utterances’ (these identify the user intent)
- Defined ‘slots’ (key information a user must provide to define outcomes; see the sketch after this list)
- Interaction flows
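To show how intents, utterances, and slots fit together, here’s a rough sketch using the Lex V2 model-building API with boto3. The bot ID, intent name, and prompts are illustrative assumptions; in practice many teams define these in the Lex console instead:

```python
import boto3

# Model-building client (separate from the runtime client used to chat with a bot).
lex_models = boto3.client("lexv2-models")

BOT_ID = "ABCDEFGHIJ"  # placeholder: the ID of a bot you've already created
BOT_VERSION = "DRAFT"
LOCALE = "en_US"

# 1. An intent describes the outcome the user wants to achieve.
intent = lex_models.create_intent(
    botId=BOT_ID,
    botVersion=BOT_VERSION,
    localeId=LOCALE,
    intentName="CheckOrderStatus",
    sampleUtterances=[  # utterances are the phrases that identify this intent
        {"utterance": "Where is my order"},
        {"utterance": "I want to check my order status"},
    ],
)

# 2. A slot captures key information the user must provide to fulfil the intent.
lex_models.create_slot(
    botId=BOT_ID,
    botVersion=BOT_VERSION,
    localeId=LOCALE,
    intentId=intent["intentId"],
    slotName="OrderNumber",
    slotTypeId="AMAZON.AlphaNumeric",  # built-in slot type
    valueElicitationSetting={
        "slotConstraint": "Required",
        "promptSpecification": {
            "messageGroups": [
                {"message": {"plainTextMessage": {"value": "What is your order number?"}}}
            ],
            "maxRetries": 2,
        },
    },
)
```

From here you’d typically set the slot’s priority on the intent, build the bot locale, and test it in the console before publishing an alias.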
While you can build a working chatbot in a very short time, more complex ones will involve more work. For more complex chatbots, the assembly process can take some time – especially for the utterances and defined slots, which may be highly varied and very specific to different processes. Thankfully, there are a few ways you can accelerate this process.
For example, you can use AWS Bedrock generative AI (another AI Service), which can generate sample utterances automatically and assist with defining and matching slots.
Another thing to consider is using the ‘Network of Bots’ feature to break up complex processes or queries into smaller, specialized ‘chunks.’ This can be especially valuable when different processes might use superficially similar utterances or slots, as each specialized bot will handle its own part of the process or request.
How can you use AWS Lex AI chatbots?
AI chatbots have a clear use-case in customer service and other contact scenarios, as well as for sales, marketing, and internal processes such as information retrieval. With the right integrations in place with your secure cloud environment, Lex can collect or dispense data and/or trigger processes in many ways.
You can integrate Lex with predefined integrations like Slack and Facebook Messenger, or use the SDKs, which are available for all the major platforms and languages: Android, iOS, JavaScript, Java, .NET, Ruby, PHP, Python, and mobile web.
Optimizing AWS Lex AI Chatbots
If you’re using the Streaming Conversation type, then there are also options for improving user experiences with the Wait, Continue, and Interrupt features. These create a more ‘natural feeling’ conversation that keeps people engaged and reduces stress when there are natural pauses.
If a customer needs to look up some information, for example, your bot can say something like, “Take your time, I’m happy to wait.”
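Under the hood, these behaviors are configured on a slot’s value elicitation settings. Here’s a hedged sketch of what the Wait and Continue configuration might look like via the Lex V2 model-building API; the message texts and timings are illustrative assumptions:

```python
def plain_text(value):
    """Helper to build the nested message structure the Lex V2 API expects."""
    return {"message": {"plainTextMessage": {"value": value}}}

# Wait and Continue settings for a slot, to be passed as
# valueElicitationSetting["waitAndContinueSpecification"] in a
# create_slot or update_slot call on the "lexv2-models" client.
wait_and_continue = {
    "active": True,
    "waitingResponse": {  # played when the user asks the bot to wait
        "messageGroups": [plain_text("Take your time, I'm happy to wait.")],
        "allowInterrupt": True,
    },
    "continueResponse": {  # played when the user is ready to carry on
        "messageGroups": [plain_text("Great, let's carry on. What's your order number?")],
        "allowInterrupt": True,
    },
    "stillWaitingResponse": {  # periodic check-in during a long pause
        "messageGroups": [plain_text("I'm still here whenever you're ready.")],
        "frequencyInSeconds": 30,
        "timeoutInSeconds": 300,
        "allowInterrupt": True,
    },
}
```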
The final part of the UX optimization comes from extensive testing, which is essential to ensure your bot behaves reliably before you deploy it.
What will you build with AI Services?
You can generate unique value by combining AWS AI Services in many different ways, and this is enhanced by integrating them with your own software or processes.
The important thing to know is that you don’t need any AI or ML experience to get started: just a few pointers about the pitfalls to avoid and a solid background in AWS.