AI-Powered Text Analysis Using AWS Comprehend with a Flask Web Interface

Zahraa Jawad - Nov 5 - Dev Community

Introduction

In this work, we will build a Text Analysis application using AWS AI services. The application will allow users to enter text and then process it with Amazon Comprehend to perform sentiment analysis, entity recognition, and key phrase detection.

Amazon Comprehend
Amazon Comprehend uses natural language processing (NLP) to extract insights about the content of documents. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document. Use Amazon Comprehend to create new products based on understanding the structure of documents. For example, using Amazon Comprehend you can search social networking feeds for mentions of products or scan an entire document repository for key phrases.
You can access Amazon Comprehend document analysis capabilities using the Amazon Comprehend console or using the Amazon Comprehend APIs. You can run real-time analysis for small workloads or you can start asynchronous analysis jobs for large document sets. You can use the pre-trained models that Amazon Comprehend provides, or you can train your own custom models for classification and entity recognition.
Amazon Comprehend uses a pre-trained model to examine and analyze a document or set of documents to gather insights about it. This model is continuously trained on a large body of text so that there is no need for you to provide training data.

Key benefits of our work:

  • Easily analyze text data: This application enables us to quickly analyze texts to extract sentiment (positive, negative, mixed), entities (such as names and places), and important keywords.

  • Using AWS cloud services: You can explore and use AWS AI services such as Amazon Comprehend to provide advanced analytics in a simple and effective way.

  • Displaying results in an understandable and organized way: The results are displayed as clear, readable tables both in the terminal (e.g., over an SSH session) and on a web page, which makes the analysis easier to understand.

  • Interactive web interface: The web interface presents the analysis results in an organized, easy-to-understand way, making them easy to share or show to others.

  • Learning and using Flask: The project is an opportunity to learn how to build a simple web application with Flask, a useful skill for developing web applications in Python.

The steps we need are as follows:

  • Launch an instance (your machine)
  • Setting up a Python virtual environment
  • Configure AWS Credentials
  • Write Python code to display results

Step (1) - Launch an instance (your machine)
The steps for launching and connecting to an instance are described in the article:
https://dev.to/zahraajawad/building-a-jupyter-notebook-environment-in-docker-for-data-analysis-on-aws-ec2-376i

Note: Make sure your security group allows the following ports:
- Port 5000: the default port that Flask uses to serve the web application.

- Port 22: required for SSH access to your instance.


After updating the system with the command
sudo apt update && sudo apt upgrade -y
we install Python with the command:
sudo apt install python3 python3-venv python3-pip -y


Step (2) - Setting up a Python virtual environment
We create a new folder for the project and then create the virtual environment inside it, using the following commands:

- Create a project folder by the command mkdir my_project


- Then change the current directory by the command cd my_project


- Now we create our virtual environment with the command python3 -m venv venv


- We activate the virtual environment with the command source venv/bin/activate

The virtual environment is successfully built and activated.

- Now we need to make sure that boto3, Flask, and pandas are installed in our virtual environment, which is done with the command:
pip install boto3 pandas Flask


Step (3) - Configure AWS Credentials
We now configure AWS credentials so that Amazon Comprehend can be accessed securely. Credentials verify the identity of a user or application and grant it the permissions needed to use the required services and resources. This is done as follows:

  • We install awscli in the Python virtual environment through the command: pip install awscli

Then we verify that the installation succeeded with the command:
aws --version

After installation, we configure the credentials using the command:
aws configure
where we will be prompted to enter our AWS account credentials:

  • Access Key.
  • Then the Secret Access Key.
  • And the Default region name - such as us-west-2 or ap-south-1.
  • Finally the Default output format - which can be left blank or set to json.

Step (4) - Write Python code to display results

We will create two scripts to display the results of text analysis using Amazon Comprehend:

  • One that prints the results in the terminal.
  • One that displays them on a web page via a simple Flask web application.

Displaying text analysis results in the terminal

We create a new file called text_analysis.py. Below we explain, piece by piece, the functions and code it contains, which we will use to run the text analysis and print the results in the terminal:

  • To import the necessary libraries, boto3 and pandas, we use:
import boto3
import pandas as pd
  • To create an AWS Comprehend client using the boto3 library we use:

comprehend = boto3.client('comprehend')
  • text - The written text will be used for sentiment analysis, entity extraction, and key phrase detection:
text = "AWS is a great service provider for AI applications."

  • For Sentiment Analysis, we use:

- comprehend.detect_sentiment: Calls AWS Comprehend to perform analysis on the text.

- LanguageCode='en': Specifies the language of the text (here, English).

- sentiment_response: Stores the AWS Comprehend response, which includes sentiment details and scores.

- sentiment_data: Contains the scores for each sentiment (positive, negative, neutral, mixed).

sentiment_response = comprehend.detect_sentiment(Text=text, LanguageCode='en')
sentiment_data = sentiment_response['SentimentScore']
  • To display the sentiment analysis results:

The sentiment scores are converted to a pandas DataFrame, and to_string(index=False) displays them in an organized manner without index numbers.

print("\nSentiment Analysis Results:")
sentiment_df = pd.DataFrame([sentiment_data], columns=['Positive', 'Negative', 'Neutral', 'Mixed'])
print(sentiment_df.to_string(index=False))
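For reference, the JSON that detect_sentiment returns has the shape sketched below. The response dict here is hand-written for illustration (the scores are made up, not real AWS output), but the parsing is exactly the same as in text_analysis.py and can be tried without an AWS call:

```python
import pandas as pd

# Illustrative (made-up) response in the shape detect_sentiment returns.
sentiment_response = {
    'Sentiment': 'POSITIVE',
    'SentimentScore': {
        'Positive': 0.985, 'Negative': 0.004,
        'Neutral': 0.008, 'Mixed': 0.003,
    },
}

# Same parsing as in text_analysis.py, with no AWS call needed.
sentiment_data = sentiment_response['SentimentScore']
sentiment_df = pd.DataFrame([sentiment_data],
                            columns=['Positive', 'Negative', 'Neutral', 'Mixed'])
print(sentiment_df.to_string(index=False))
```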
  • To extract entities, comprehend.detect_entities calls the AWS Comprehend service to detect entities in the text. entities_data contains the list of extracted entities (such as names and organizations), their types, and their confidence scores.
entities_response = comprehend.detect_entities(Text=text, LanguageCode='en')
entities_data = entities_response['Entities']
  • To create and display data for entities, a DataFrame is created from the extracted data and displayed in a structured form.
entities_df = pd.DataFrame(entities_data)[['Text', 'Type', 'Score']]
print("\nEntities Recognition Results:")
print(entities_df.to_string(index=False))
  • To discover key phrases, the following is used:

comprehend.detect_key_phrases: Calls the AWS Comprehend service to discover key phrases in the text.
key_phrases_data: Contains a list of key phrases and their accuracy scores.

key_phrases_response = comprehend.detect_key_phrases(Text=text, LanguageCode='en')
key_phrases_data = key_phrases_response['KeyPhrases']
  • The DataFrame here is created from the key phrases and displayed in an organized form.
key_phrases_df = pd.DataFrame(key_phrases_data)[['Text', 'Score']]
print("\nKey Phrases Detection Results:")
print(key_phrases_df.to_string(index=False))
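Similarly, the shapes of the detect_entities and detect_key_phrases responses can be sketched offline. The two dicts below are hand-written examples (the entities, offsets, and scores are illustrative, not real AWS output); the parsing mirrors the code above:

```python
import pandas as pd

# Illustrative (made-up) responses in the shapes returned by
# detect_entities and detect_key_phrases.
entities_response = {
    'Entities': [
        {'Text': 'AWS', 'Type': 'ORGANIZATION', 'Score': 0.99,
         'BeginOffset': 0, 'EndOffset': 3},
    ],
}
key_phrases_response = {
    'KeyPhrases': [
        {'Text': 'a great service provider', 'Score': 0.99,
         'BeginOffset': 7, 'EndOffset': 31},
    ],
}

# Keep only the columns we display, as in text_analysis.py.
entities_df = pd.DataFrame(entities_response['Entities'])[['Text', 'Type', 'Score']]
key_phrases_df = pd.DataFrame(key_phrases_response['KeyPhrases'])[['Text', 'Score']]
print(entities_df.to_string(index=False))
print(key_phrases_df.to_string(index=False))
```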

So the entire text_analysis.py file is:

import boto3
import pandas as pd

# Create AWS Comprehend Client
comprehend = boto3.client('comprehend')

# Text to be analyzed
text = "AWS is a great service provider for AI applications."

# Sentiment Analysis
sentiment_response = comprehend.detect_sentiment(Text=text, LanguageCode='en')
sentiment_data = sentiment_response['SentimentScore']

# Show sentiment analysis results
print("\nSentiment Analysis Results:")
sentiment_df = pd.DataFrame([sentiment_data], columns=['Positive', 'Negative', 'Neutral', 'Mixed'])
print(sentiment_df.to_string(index=False))

# Entity extraction
entities_response = comprehend.detect_entities(Text=text, LanguageCode='en')
entities_data = entities_response['Entities']

# Create a DataFrame for entities
entities_df = pd.DataFrame(entities_data)[['Text', 'Type', 'Score']]
print("\nEntities Recognition Results:")
print(entities_df.to_string(index=False))

# Extract keywords
key_phrases_response = comprehend.detect_key_phrases(Text=text, LanguageCode='en')
key_phrases_data = key_phrases_response['KeyPhrases']

# Create a DataFrame for Keywords
key_phrases_df = pd.DataFrame(key_phrases_data)[['Text', 'Score']]
print("\nKey Phrases Detection Results:")
print(key_phrases_df.to_string(index=False))

Now we create the text_analysis.py file with vim or nano and paste in the code above; here we will use nano:
nano text_analysis.py


Then paste the code:


Then exit the editor with Ctrl+X, press Y (to save the file), and press Enter.

Now we run the code and check the results by executing the following command:
python3 text_analysis.py

The results will appear as formatted tables in the terminal:


Displaying the results on a web page using a simple Flask web application

To display the results on a web page, we create a new file called app.py. Below we explain, piece by piece, the functions and code it contains:

  • Below we import the libraries needed for the application:
from flask import Flask, render_template_string
import boto3
import pandas as pd
  • To initialize and create a new Flask application object, we use:
app = Flask(__name__)

  • The code below creates the home page route:
@app.route('/')
def home():
  • To create an AWS Comprehend client:
comprehend = boto3.client('comprehend')
  • The text that will be analyzed by AWS Comprehend (here our text is "AWS is a great service provider for AI applications."):
 text = "AWS is a great service provider for AI applications."
  • To analyze sentiment we use detect_sentiment, which calls the AWS Comprehend service to analyze the sentiment of the text. sentiment_response stores the service's response, and the results are then converted to a DataFrame for organized display:
sentiment_response = comprehend.detect_sentiment(Text=text, LanguageCode='en')
    sentiment_data = sentiment_response['SentimentScore']
    sentiment_df = pd.DataFrame([sentiment_data], columns=['Positive', 'Negative', 'Neutral', 'Mixed'])
  • The detect_entities function calls the AWS Comprehend service to extract entities from the text, then the results are converted to a DataFrame:
entities_response = comprehend.detect_entities(Text=text, LanguageCode='en')
    entities_data = entities_response['Entities']
    entities_df = pd.DataFrame(entities_data)[['Text', 'Type', 'Score']]
  • To discover key phrases, the detect_key_phrases function is used, which calls the AWS service to discover key phrases and then converts the results to a DataFrame:
key_phrases_response = comprehend.detect_key_phrases(Text=text, LanguageCode='en')
    key_phrases_data = key_phrases_response['KeyPhrases']
    key_phrases_df = pd.DataFrame(key_phrases_data)[['Text', 'Score']]
  • An HTML template is used to render the result tables on the web page:
html_template = """
    <h2>Sentiment Analysis Results:</h2>
    {{ sentiment_table | safe }}
    <h2>Entities Recognition Results:</h2>
    {{ entities_table | safe }}
    <h2>Key Phrases Detection Results:</h2>
    {{ key_phrases_table | safe }}
    """
  • to_html(index=False) converts each DataFrame to an HTML table without index numbers:
sentiment_table = sentiment_df.to_html(index=False)
    entities_table = entities_df.to_html(index=False)
    key_phrases_table = key_phrases_df.to_html(index=False)
  • To render the HTML and display the tables on the page, the render_template_string function is used:
return render_template_string(html_template, sentiment_table=sentiment_table,
                                  entities_table=entities_table, key_phrases_table=key_phrases_table)
  • Finally, we run the Flask application: app.run() starts the app; host='0.0.0.0' makes it reachable on all of the host's IP addresses; port=5000 sets the port (here 5000); debug=True runs the app in debug mode to provide error information.
if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=True)

So the entire app.py file is:

from flask import Flask, render_template_string
import boto3
import pandas as pd

app = Flask(__name__)

@app.route('/')
def home():
    # Create AWS Comprehend Client
    comprehend = boto3.client('comprehend')

    # Text to be analyzed
    text = "AWS is a great service provider for AI applications."

    # Sentiment Analysis
    sentiment_response = comprehend.detect_sentiment(Text=text, LanguageCode='en')
    sentiment_data = sentiment_response['SentimentScore']
    sentiment_df = pd.DataFrame([sentiment_data], columns=['Positive', 'Negative', 'Neutral', 'Mixed'])

    # Entity extraction
    entities_response = comprehend.detect_entities(Text=text, LanguageCode='en')
    entities_data = entities_response['Entities']
    entities_df = pd.DataFrame(entities_data)[['Text', 'Type', 'Score']]

    # Extract keywords
    key_phrases_response = comprehend.detect_key_phrases(Text=text, LanguageCode='en')
    key_phrases_data = key_phrases_response['KeyPhrases']
    key_phrases_df = pd.DataFrame(key_phrases_data)[['Text', 'Score']]

    # HTML template to display results
    html_template = """
    <h2>Sentiment Analysis Results:</h2>
    {{ sentiment_table | safe }}
    <h2>Entities Recognition Results:</h2>
    {{ entities_table | safe }}
    <h2>Key Phrases Detection Results:</h2>
    {{ key_phrases_table | safe }}
    """

    # Convert DataFrames to HTML
    sentiment_table = sentiment_df.to_html(index=False)
    entities_table = entities_df.to_html(index=False)
    key_phrases_table = key_phrases_df.to_html(index=False)

    return render_template_string(html_template, sentiment_table=sentiment_table,
                                  entities_table=entities_table, key_phrases_table=key_phrases_table)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=True)

Now, in the same way, we create the app.py file using vim or nano and paste in the code above; here we will also use nano:

nano app.py


Then paste the code:

Then exit the editor with Ctrl+X, press Y (to save the file), and press Enter.

The last step is to run the Flask application, which is done with the command:

python3 app.py


When you run app.py, the terminal should display a URL as shown below:


Now, to view the results, we open a browser on our device and build the URL through the following steps:

  • Go back to the instance in the EC2 console and select it using the checkbox.
  • Go to the Details tab and copy the Public IPv4 address.


  • Then paste the public IPv4 address followed by port 5000 (http://<public-IPv4>:5000) into the browser and press Enter.


The web page works successfully, and all the results (sentiment analysis, extracted entities, key phrases) can be seen on the page, organized neatly thanks to the Flask library.
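As a possible next step (this sketch is our own extension, not part of the article's app.py), the hard-coded text can be replaced with an HTML form so that users enter their own text, as promised in the introduction. The create_app factory below takes the Comprehend client as a parameter, so the route can be tried with a stub client and Flask's test client, without AWS; with real credentials you would pass boto3.client('comprehend'):

```python
from flask import Flask, render_template_string, request
import pandas as pd

# Template with a text form; the results table appears after a POST.
PAGE = """
<form method="post">
  <textarea name="text" rows="4" cols="60">{{ text }}</textarea><br>
  <input type="submit" value="Analyze">
</form>
{% if sentiment_table %}
<h2>Sentiment Analysis Results:</h2>
{{ sentiment_table | safe }}
{% endif %}
"""

def create_app(comprehend):
    app = Flask(__name__)

    @app.route('/', methods=['GET', 'POST'])
    def home():
        text = request.form.get('text', '')
        sentiment_table = None
        if request.method == 'POST' and text:
            # Same parsing as in app.py, now on user-submitted text.
            response = comprehend.detect_sentiment(Text=text, LanguageCode='en')
            df = pd.DataFrame([response['SentimentScore']],
                              columns=['Positive', 'Negative', 'Neutral', 'Mixed'])
            sentiment_table = df.to_html(index=False)
        return render_template_string(PAGE, text=text,
                                      sentiment_table=sentiment_table)

    return app

# With real credentials:
# create_app(boto3.client('comprehend')).run(host='0.0.0.0', port=5000)
```

The same pattern extends to the entities and key phrases tables.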

