Hello world! This is the summer edition of the AWS Natural Language Processing (NLP) newsletter, covering everything related to NLP at AWS. Feel free to leave comments and share it on your social networks.
NLP@AWS Customer Success Story
How Mantium achieves low-latency GPT-J inference
Mantium helps customers build AI applications that incorporate state-of-the-art language models in minutes and manage them at scale through its low-code cloud platform. Mantium supports access to model APIs from AI providers as well as open-source models like GPT-J that are trained using the SageMaker distributed model parallel library. To ensure users get best-in-class performance from such open-source models, Mantium used DeepSpeed to optimize inference of GPT-J deployed on SageMaker inference endpoints.
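To give a flavour of the optimization, here is a minimal sketch (not Mantium's production code) of wrapping a GPT-J text-generation pipeline with DeepSpeed's inference engine. The checkpoint, precision and DeepSpeed arguments are assumptions, and in the real deployment this would sit inside a SageMaker inference container.

```python
import torch
import deepspeed
from transformers import pipeline

# Load GPT-J with a standard Hugging Face text-generation pipeline (assumed checkpoint).
generator = pipeline(
    "text-generation",
    model="EleutherAI/gpt-j-6B",
    torch_dtype=torch.float16,
    device=0,
)

# Replace the model's transformer blocks with DeepSpeed's fused inference kernels.
generator.model = deepspeed.init_inference(
    generator.model,
    mp_size=1,                       # tensor-parallel degree (single GPU here)
    dtype=torch.float16,
    replace_with_kernel_inject=True,
)

print(generator("Low-latency inference matters because", max_new_tokens=32))
```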
How eMagazines utilizes Amazon Polly to voice articles for school-aged kids
Research has shown that developing brains need to hear language even before learning to talk, and that this early exposure is a prerequisite for learning to read. eMagazines used Amazon Polly to help TIME For Kids automate audio synthesis as content was added dynamically on a daily basis, without involving an audio artist. With Amazon Polly, they were also able to support new features like text highlighting and scrolling as an article is read aloud, as well as collect and analyse usage data in real time.
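As a rough illustration of the kind of automation involved, the sketch below uses the Amazon Polly synthesize_speech API to produce both the audio for an article and word-level speech marks, which is one way to drive the read-along text highlighting described in the post; the voice, text and file handling are assumptions.

```python
import boto3

polly = boto3.client("polly")
article_text = "Developing brains need to hear language long before they learn to talk."

# 1) Synthesize the article audio for playback.
audio = polly.synthesize_speech(
    Engine="neural", VoiceId="Joanna", OutputFormat="mp3", Text=article_text
)
with open("article.mp3", "wb") as f:
    f.write(audio["AudioStream"].read())

# 2) Request word-level speech marks (timestamps + character offsets)
#    that a player can use to highlight and scroll text as it is read aloud.
marks = polly.synthesize_speech(
    Engine="neural", VoiceId="Joanna", OutputFormat="json",
    SpeechMarkTypes=["word"], Text=article_text
)
for line in marks["AudioStream"].read().decode("utf-8").splitlines():
    print(line)  # e.g. {"time": 6, "type": "word", "start": 0, "end": 10, "value": "Developing"}
```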
AI Language Services
Break through language barriers with AWS AI Services
The ability to communicate directly in a multilingual context on demand, without the need for a human translator, can be applied in many areas, such as media, medicine, education and hospitality. This blog post shows how you can combine three fully managed AWS AI services (Amazon Transcribe, Amazon Translate, and Amazon Polly) to create a near-real-time speech-to-speech translator that can quickly turn a source speaker’s live voice input into spoken, accurate speech in the target language.
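The sketch below shows the Amazon Translate and Amazon Polly half of that pipeline; the live Amazon Transcribe streaming step is omitted, and the transcript, language pair and voice are assumptions.

```python
import boto3

translate = boto3.client("translate")
polly = boto3.client("polly")

# Assume this text came from Amazon Transcribe streaming on the speaker's live audio.
transcript = "Where is the nearest train station?"

# Translate the transcript into the target language (English -> Spanish here).
translated = translate.translate_text(
    Text=transcript, SourceLanguageCode="en", TargetLanguageCode="es"
)["TranslatedText"]

# Synthesize the translated text with a matching neural voice.
speech = polly.synthesize_speech(
    Engine="neural", VoiceId="Lupe", LanguageCode="es-US",
    OutputFormat="mp3", Text=translated
)
with open("translated.mp3", "wb") as f:
    f.write(speech["AudioStream"].read())
```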
Using NLP to gain insights from customer tickets
Amazon Comprehend is a natural language processing (NLP) service that uses machine learning (ML) to uncover valuable insights and connections in text. This blog shares how Amazon Managed Services (AMS) used Amazon Comprehend custom classification to categorise inbound requests by resource and operation type based on the customer’s description of the issue. This allowed AMS to build workflows that recommend automated solutions for the issues described in the tickets and to generate classification analysis reports using Amazon QuickSight.
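Once a custom classifier has been trained and a real-time endpoint created, routing a ticket is a single API call, as in this minimal sketch (the endpoint ARN, ticket text and labels are placeholders):

```python
import boto3

comprehend = boto3.client("comprehend")

ticket_text = "Please increase the size of the EBS volume attached to instance i-0abc1234."

response = comprehend.classify_document(
    Text=ticket_text,
    # Placeholder ARN for a custom classifier endpoint trained on historical tickets.
    EndpointArn="arn:aws:comprehend:us-east-1:111122223333:document-classifier-endpoint/ams-tickets",
)

# Pick the highest-confidence resource/operation label to drive the automation workflow.
top = max(response["Classes"], key=lambda c: c["Score"])
print(top["Name"], round(top["Score"], 3))
```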
Enable your contact centre with live call analytics and agent assist using AI Services
One way to raise the bar on caller experience in your contact centre is to give supervisors the ability to assess the quality of caller interactions through call analytics and respond decisively before the call ends. Furthermore, assisting agents with proactive, contextual guidance can greatly enhance their ability to deliver a great caller experience. This blog shows how to build a solution for live call analytics and real-time agent assist using AWS AI services such as Amazon Comprehend, Amazon Lex, Amazon Kendra and Amazon Transcribe.
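As a small fragment of such a solution (not the blog's full architecture), the sketch below scores a single live transcript segment with Amazon Comprehend so that a supervisor dashboard could flag a deteriorating call; the transcript text is invented.

```python
import boto3

comprehend = boto3.client("comprehend")

# Assume this segment was just produced by Amazon Transcribe on a live call.
segment = "I've been on hold for forty minutes and nobody can fix my billing issue."

sentiment = comprehend.detect_sentiment(Text=segment, LanguageCode="en")
entities = comprehend.detect_entities(Text=segment, LanguageCode="en")

# Surface negative sentiment and key entities to the supervisor / agent-assist view.
print(sentiment["Sentiment"], round(sentiment["SentimentScore"]["Negative"], 3))
print([(e["Text"], e["Type"]) for e in entities["Entities"]])
```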
NLP on SageMaker
Text classification for online conversations with machine learning on AWS
The explosion of online conversations in modern digital life has led to widespread non-traditional usage of language. One phenomenon is the use of constantly evolving, domain-specific vocabularies. Another is the adoption, both intentional and accidental, of spellings and word forms that deviate from standard English. Traditional NLP techniques do not perform well when analysing such online conversations. The authors of this blog post from the Amazon ML Solutions Lab discuss two different model approaches to predicting toxicity and subtype labels such as obscene, threat, insult, identity attack and sexually explicit, and evaluate them on the Jigsaw Unintended Bias in Toxicity Classification dataset.
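The blog builds its own models, but as a quick illustration of multi-label toxicity scoring, here is a minimal sketch using an off-the-shelf Hugging Face classifier trained on the Jigsaw data; the model id, example comment and label set are assumptions, not the authors' models.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "unitary/toxic-bert"  # example Jigsaw-trained multi-label classifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

comment = "u r such an idiot, nobody wants u here"
inputs = tokenizer(comment, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label problem: each toxicity subtype gets an independent sigmoid score.
scores = torch.sigmoid(logits)[0]
for label_id, label in model.config.id2label.items():
    print(f"{label}: {scores[label_id].item():.3f}")
```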
Text summarization with Amazon SageMaker and Hugging Face
Modern digital services and communication generate data that is growing at zettabyte scale. Text summarization is a helpful technique for understanding large amounts of text data because it creates a subset of contextually meaningful information from the source documents. Hugging Face and AWS have partnered to seamlessly integrate the Hugging Face transformers library with Amazon SageMaker, enabling developers and data scientists to get started with NLP on AWS more easily. Hugging Face’s 400 pretrained text summarization models can be deployed easily using the Hugging Face transformers summarization pipeline API. This blog post is a good starting point for learning how to quickly experiment with and select suitable Hugging Face text summarization models and deploy them on Amazon SageMaker.
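For example, here is a minimal sketch of that pipeline API using one of the many summarization checkpoints on the Hub; the model id and text are assumptions, and the blog goes on to deploy such models on SageMaker.

```python
from transformers import pipeline

# Example checkpoint; any Hub summarization model id can be substituted.
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

article = (
    "Hugging Face and AWS have partnered to integrate the transformers library with "
    "Amazon SageMaker so that developers and data scientists can train and deploy NLP "
    "models more easily. Text summarization condenses long documents into a short, "
    "contextually meaningful subset of the source text."
)

summary = summarizer(article, max_length=60, min_length=15, do_sample=False)
print(summary[0]["summary_text"])
```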
Build a news-based real-time alert system with Twitter, Amazon SageMaker, and Hugging Face
In sectors such as insurance, law enforcement, first response and government, the ability to process news and social media feeds in near real time can allow organisations to respond immediately as events unfold. If you are thinking about such a use case, this blog post can guide you through building a real-time alert system on AWS that consumes news alerts from social media and classifies them using a pre-trained model from the Hugging Face Hub deployed on Amazon SageMaker.
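Here is a minimal sketch of the classification half of such a system, deploying a pre-trained Hub model to a SageMaker endpoint and scoring one incoming alert; the model id, task, container versions, instance type and candidate labels are assumptions, not the blog's exact configuration.

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # assumes this runs inside SageMaker

# Serve a pre-trained zero-shot classifier straight from the Hugging Face Hub.
model = HuggingFaceModel(
    env={"HF_MODEL_ID": "facebook/bart-large-mnli", "HF_TASK": "zero-shot-classification"},
    role=role,
    transformers_version="4.26",  # assumed versions; use a supported DLC combination
    pytorch_version="1.13",
    py_version="py39",
)
predictor = model.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")

# Classify a post pulled from the social media stream against alert categories.
tweet = "Major flooding reported downtown, several roads are already closed."
print(predictor.predict({
    "inputs": tweet,
    "parameters": {"candidate_labels": ["natural disaster", "crime", "politics", "sports"]},
}))
```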
NLP@Community
The World’s Largest Open Multilingual Language Model
BLOOM is the first multilingual large language model (LLM) trained in complete transparency, with the goal of making LLMs accessible to academia, nonprofits, and smaller research labs. With its 176 billion parameters, BLOOM is able to generate text in 46 natural languages and 13 programming languages; for almost all of these languages, it is the first LLM with over 100 billion parameters. It is hoped that BLOOM will be the seed for a living family of models that the community will grow over time.
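Because the checkpoints are openly available, anyone can experiment with the BLOOM family; the sketch below generates text with the small 560M-parameter sibling, which stands in here for the full 176B model (serving that requires multi-GPU infrastructure).

```python
from transformers import pipeline

# The 560M-parameter member of the BLOOM family, small enough to run locally.
generator = pipeline("text-generation", model="bigscience/bloom-560m")

# BLOOM is multilingual, so prompts in any of its 46 natural languages work.
print(generator("El aprendizaje automático es", max_new_tokens=30)[0]["generated_text"])
```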
The Curious Case of LaMDA, the AI that Claimed to Be Sentient
In this article, the author examines the recent controversial claim that Google’s LaMDA may be sentient by scrutinising the suppositions behind the claim.