Introduction
Ever wondered what a Machine Learning (ML) model would say about a Big Mac or fries? Let’s find out. McDeepNet is a playful ML experiment trained on a dataset of McDonald’s reviews.
Check out the source code and the app:
The Dataset Overview
The Kaggle dataset (McDonald’s Store Reviews) comprises 20,000 reviews of various McDonald’s stores.
The Technology Powering McDeepNet
- TensorFlow and Keras: framework and functionalities needed for model building and training.
- Pandas and NumPy: data processing and manipulation, enabling efficient handling of the review dataset.
- Streamlit: used to convert our RNN model into an interactive web application, allowing for easy demonstration and user interaction.
- Plotly Express: visualizing the insights derived from our model, making the data analysis both accessible and engaging.
Unraveling McDeepNet's Functionality
Let’s break it down:
a. Input Layer: Represents the initial input of text data (e.g., McDonald's reviews).
b. Tokenizer: This stage converts the input text into numerical sequences, making it understandable for the neural network.
c. Pre-trained RNN Model: This is the core of McDeepNet, where the sequential data is actually processed. It consists of:
c.1 An Embedding Layer: Converts the sequences into dense vectors of fixed size.
c.2 LSTM Layers: Part of the RNN architecture, responsible for learning from the sequence data.
d. Output Layer: Generates the final output, which could be the transformed text or predictions based on the input.
e. Output: The final result produced by the model, such as generated text or analysis results.
f. Application Layer: The Streamlit web app in McDeepNet's case.
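To make the tokenizer stage (step b) concrete, here is a minimal pure-Python sketch of the general idea: rank words by frequency, give each an integer index, and map text to index sequences. The function names `fit_word_index` and `texts_to_sequences` and the sample reviews are illustrative assumptions, not McDeepNet's code; the project itself relies on the Keras `Tokenizer`, which adds its own filtering rules.

```python
from collections import Counter

def fit_word_index(texts, num_words=None):
    """Map each word to an integer, most frequent word first (index 1)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Sort by descending frequency, then alphabetically so ties are deterministic
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    if num_words is not None:
        ranked = ranked[:num_words]
    return {word: index for index, (word, _) in enumerate(ranked, start=1)}

def texts_to_sequences(texts, word_index):
    """Replace each known word with its index; unknown words are dropped."""
    return [[word_index[w] for w in text.lower().split() if w in word_index]
            for text in texts]

# Made-up reviews, used only to show the mapping
sample_reviews = ["the fries were cold", "the burger was great", "cold fries again"]
word_index = fit_word_index(sample_reviews)
sequences = texts_to_sequences(["the fries were great"], word_index)
print(word_index)
print(sequences)
```

Index 0 is left unused here, which matches the common convention of reserving it for padding.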
Training Phase of McDeepNet
1.1 Crafting the Model Architecture:
The cornerstone of McDeepNet is its RNN model, tailored to handle sequences of up to 442 tokens (words) in length.
Code Snippet:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense
# Set the maximum sequence length
max_length = 442
# Specify the vocabulary size (unique words in the dataset)
vocab_size = 10000 # Example value
# Define the embedding dimension
embedding_dim = 256 # Example value
# Constructing the RNN model architecture
model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_length=max_length))
model.add(LSTM(128, return_sequences=True))
model.add(LSTM(128))
model.add(Dense(64, activation='relu'))
model.add(Dense(vocab_size, activation='softmax'))
# Compiling the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Summarizing the model
model.summary()
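As a rough check on what `model.summary()` should report, the parameter counts can be worked out by hand. The sketch below assumes the example values above (`vocab_size = 10000`, `embedding_dim = 256`) and the standard LSTM layout of four gates, each with input weights, recurrent weights, and a bias; the helper names are made up for illustration.

```python
def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights, and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # One weight per input/unit pair plus one bias per unit
    return input_dim * units + units

vocab_size, embedding_dim = 10000, 256  # example values from the snippet above
layer_params = {
    "embedding": vocab_size * embedding_dim,
    "lstm_1": lstm_params(embedding_dim, 128),
    "lstm_2": lstm_params(128, 128),
    "dense_relu": dense_params(128, 64),
    "dense_softmax": dense_params(64, vocab_size),
}
total_params = sum(layer_params.values())
for name, count in layer_params.items():
    print(f"{name:>14}: {count:,}")
print(f"{'total':>14}: {total_params:,}")
```

With these values, most of the roughly 3.5 million parameters sit in the embedding layer and the final softmax layer, so the vocabulary size is the main lever on model size.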
1.2 Preparing the Training Data:
Before training, the 20,000 McDonald's reviews go through three preprocessing steps: cleaning, tokenization, and padding to a uniform length.
Code Snippet for Data Preparation:
import re
import pandas as pd
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
# Importing the dataset
df = pd.read_csv('mcdonalds_reviews.csv') # Replace this with the path to your dataset
reviews = df['review'].dropna().astype(str).tolist() # Assuming the reviews are stored in a column named 'review'; missing rows are dropped so .lower() is safe
# Text preprocessing steps
# Cleaning involves lowercasing and removing non-alphabetic characters
cleaned_reviews = [re.sub(r'[^a-zA-Z\s]', '', review.lower()) for review in reviews]
# Tokenizer initialization and configuration
tokenizer = Tokenizer(num_words=10000) # Setting the limit to the top 10,000 words
tokenizer.fit_on_texts(cleaned_reviews)
# Transforming text into integer sequences
sequences = tokenizer.texts_to_sequences(cleaned_reviews)
# Padding sequences to ensure uniform length
max_length = 442
padded_sequences = pad_sequences(sequences, maxlen=max_length, padding='post')
# The padded sequences are now ready to be fed into the RNN model
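To see what the cleaning and padding steps do to a single review, here is a small pure-Python sketch. `clean` mirrors the regular expression used above, while `pad_post` is a hypothetical stand-in that imitates `pad_sequences(..., padding='post')` for one sequence, including Keras's default of truncating from the front; the sample text and token ids are made up.

```python
import re

def clean(review):
    # Lowercase, then drop everything except letters and whitespace
    return re.sub(r'[^a-zA-Z\s]', '', review.lower())

def pad_post(sequence, maxlen, value=0):
    # Keep the last `maxlen` items (front truncation, as Keras does by default),
    # then fill the remaining positions at the end with `value`
    trimmed = sequence[-maxlen:]
    return trimmed + [value] * (maxlen - len(trimmed))

cleaned = clean("Best fries EVER!!! 10/10, would return.")
padded_short = pad_post([12, 7, 45], maxlen=6)
padded_long = pad_post([1, 2, 3, 4, 5, 6, 7, 8], maxlen=6)
print(cleaned.split())
print(padded_short)
print(padded_long)
```

Note that the cleaning step removes digits as well as punctuation, so a rating such as "10/10" disappears from the training text.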
1.3 The Model Training Process:
In this stage, the model learns to predict the next word of a review from the words that precede it.
Code Snippet for Model Training:
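from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical
# Assuming sequences, max_length, vocab_size, and model have been previously defined
# The model outputs one distribution over the vocabulary, so each training example
# is a prefix of a review paired with the word that follows it
prefixes, next_words = [], []
for seq in sequences:
    for i in range(1, len(seq)):
        prefixes.append(seq[max(0, i - max_length):i])
        next_words.append(seq[i])
# Pad the prefixes (same 'post' padding as the preparation step) and one-hot encode the targets
# One-hot targets are memory-hungry; integer targets with
# sparse_categorical_crossentropy are a lighter alternative
X = pad_sequences(prefixes, maxlen=max_length, padding='post')
y = to_categorical(next_words, num_classes=vocab_size)
# Dividing the data into training and validation subsets
# Here, we use an 80% training and 20% validation split as an example
train_size = int(len(X) * 0.8)
X_train, X_val = X[:train_size], X[train_size:]
y_train, y_val = y[:train_size], y[train_size:]
# Setting the number of epochs for the training process
epochs = 10 # This is an illustrative figure
# Commencing the training
history = model.fit(X_train, y_train, epochs=epochs, validation_data=(X_val, y_val))
# Option to save the model post-training
model.save('text_generation_model.h5')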
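Because the network ends in a softmax over the vocabulary, a training example for this kind of model is typically a prefix of a review paired with the single word that follows it. The sketch below shows that construction on a toy sequence in plain Python; `make_next_word_pairs` is an illustrative name and the token ids are invented.

```python
def make_next_word_pairs(sequence, max_length):
    """Return (prefix, next_word) pairs for one tokenized review."""
    pairs = []
    for i in range(1, len(sequence)):
        # Use at most max_length words of context before position i
        prefix = sequence[max(0, i - max_length):i]
        pairs.append((prefix, sequence[i]))
    return pairs

toy_sequence = [3, 2, 8, 6]  # made-up token ids for a four-word review
pairs = make_next_word_pairs(toy_sequence, max_length=2)
for prefix, target in pairs:
    print(prefix, "->", target)
```

A review of n words yields n - 1 training examples, so the number of examples grows with the total word count rather than with the number of reviews.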
Model Loading Phase
2.1 Retrieving the Trained Model:
After training, the saved model can be reloaded from disk whenever it is needed for inference.
Code Snippet for Model Loading:
from tensorflow.keras.models import load_model
# Specify the file path of the saved model
model_path = 'text_generation_model.h5'
# Reloading the pre-trained model
model = load_model(model_path)
# The model is now primed for inference tasks
# You can employ methods like model.predict() for your specific needs
Reloading the Tokenizer
2.2 Accessing the Pre-Trained Tokenizer:
The tokenizer, fitted on the training reviews, converts new text into the integer sequences the model expects, so inference must use the same tokenizer that was used during training.
Code Snippet for Tokenizer Loading:
import pickle
from tensorflow.keras.models import load_model
# Reinstating the model (optional if already loaded)
model = load_model('text_generation_model.h5')
# Opening and loading the saved tokenizer
with open('tokenizer.pickle', 'rb') as handle:
    tokenizer = pickle.load(handle)
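The loading snippet assumes that `tokenizer.pickle` was written during training, a step the training snippets do not show. A minimal sketch of that round trip follows; a plain dictionary stands in for the fitted Keras `Tokenizer` so the example runs without TensorFlow, and the file goes to a temporary directory rather than the project folder.

```python
import os
import pickle
import tempfile

# Stand-in for the fitted tokenizer (assumption: the real object is a Keras
# Tokenizer, which can be pickled with the same two calls)
tokenizer_stand_in = {"word_index": {"the": 1, "fries": 2, "cold": 3}}

path = os.path.join(tempfile.mkdtemp(), "tokenizer.pickle")

# Save right after fitting, alongside model.save(...)
with open(path, "wb") as handle:
    pickle.dump(tokenizer_stand_in, handle, protocol=pickle.HIGHEST_PROTOCOL)

# Load later, before inference
with open(path, "rb") as handle:
    restored = pickle.load(handle)

print(restored == tokenizer_stand_in)
```

Pickle files can execute code when loaded, so only unpickle tokenizer files you created yourself.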
Preparing for Inference
3.1 Inference Readiness of the Model:
With both the model and the tokenizer loaded, the setup is ready to generate text from new inputs. All that remains is a UI.
Developing a Streamlit Web Application Interface
Crafting a user-friendly interface with Streamlit is a straightforward process.
This interface will allow users to input initial text (seed text), adjust generation settings, and then receive custom-generated reviews.
Code Snippet for Streamlit Interface:
import streamlit as st
# Setting up the title and subtitle for the web application
st.title("🍔 McDeepNet 🍔")
st.subheader("Experience AI-Generated McDonald's Reviews")
# Creating a form for user inputs
with st.form(key='user_input_form'):
    seed_text = st.text_input(label='Enter Seed Text for Review Generation')
    num_words = st.number_input('Select Number of Words to Generate', min_value=1, max_value=100, value=5)
    temperature = st.slider('Adjust Generation Creativity (Temperature)', min_value=0.1, max_value=3.0, value=1.0, step=0.1)
    submit_button = st.form_submit_button(label='Generate Review')
# Additional instructions or user guidance can be added here
3.2 Visualizing the Generated Results
Once the user inputs their preferences, McDeepNet produces a unique review.
Code Snippet for Result Generation and Visualization:
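from collections import Counter
import pandas as pd
import plotly.express as px
import streamlit as st
# model, tokenizer, and max_length come from the loading steps above;
# generate_sentence and create_tree_diagram are helper functions not shown in this article
# Set up the UI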
st.title("🍔 McDeepNet 🍔")
st.subheader("Trained on 20k McDonald's Reviews")
st.write("Welcome to McDeepNet! This project uses a Machine Learning (ML) model trained on 20,000 McDonald's reviews. It's an interesting application that employs Recurrent Neural Networks (RNNs) to learn patterns from these reviews and, subsequently, generates a unique review of its own. The model can produce varying types of output based on a seed text and a temperature parameter provided by the user.")
st.markdown("""
- [Checkout my GitHub](https://github.com/zanepearton)
- [My dev.to Article](https://dev.to/zanepearton/mcdeepnet-training-tensorflow-on-mcdonalds-reviews-21e)
""")
# Form to take user inputs
with st.form(key='my_form'):
    seed_text = st.text_input(label='Enter the seed text for sentence completion')
    num_words = st.number_input(label='Enter the number of words to generate', min_value=1, max_value=100, value=50)
    temperature = st.slider(label='Set temperature', min_value=0.1, max_value=3.0, value=1.0, step=0.1)
    submit_button = st.form_submit_button(label='Generate Text')
# Generate and display the output on form submission
if submit_button:
    sentence, word_probs = generate_sentence(model, tokenizer, max_length, seed_text, num_words, temperature)
    st.text_area("Generated Text", value=sentence, height=150)
    # Count word frequencies
    word_freq = Counter(sentence.split())
    # Create and display the tree diagram
    fig_tree = create_tree_diagram(word_probs)
    st.plotly_chart(fig_tree)
    # Create a DataFrame for the frequencies
    freq_df = pd.DataFrame(list(word_freq.items()), columns=['Word', 'Frequency'])
    # Create a Plotly Express scatter plot
    fig_scatter = px.scatter(freq_df, x='Word', y='Frequency', size='Frequency', title='Word Frequencies',
                             hover_name='Word', size_max=60)
    st.plotly_chart(fig_scatter)
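`generate_sentence` is not shown in this article. As a hedged sketch of how such a helper commonly works, the code below rescales a probability distribution by the temperature, samples the next word, and repeats. Everything here is an assumption made for illustration: `predict_next` is a hypothetical stand-in for tokenizing, padding, and calling `model.predict`, and the tiny vocabulary and probabilities are invented. The real function also returns `word_probs` for the tree diagram, which this sketch leaves out.

```python
import math
import random

def apply_temperature(probs, temperature):
    """Sharpen (T < 1) or flatten (T > 1) a probability distribution."""
    logits = [math.log(max(p, 1e-12)) / temperature for p in probs]
    peak = max(logits)
    weights = [math.exp(l - peak) for l in logits]  # subtract the max for stability
    total = sum(weights)
    return [w / total for w in weights]

def generate_sentence_sketch(predict_next, vocab, seed_text, num_words, temperature, rng):
    """Append num_words sampled words to the seed text."""
    words = seed_text.split()
    for _ in range(num_words):
        probs = apply_temperature(predict_next(words), temperature)
        words.append(rng.choices(vocab, weights=probs, k=1)[0])
    return " ".join(words)

vocab = ["fries", "were", "cold", "great"]  # invented vocabulary

def predict_next(words):
    # Hypothetical stand-in for the trained model: a fixed distribution
    return [0.1, 0.2, 0.3, 0.4]

cool = apply_temperature([0.1, 0.2, 0.3, 0.4], temperature=0.5)
warm = apply_temperature([0.1, 0.2, 0.3, 0.4], temperature=2.0)
sentence = generate_sentence_sketch(predict_next, vocab, "the burger", 5, 1.0, random.Random(0))
print([round(p, 3) for p in cool])
print([round(p, 3) for p in warm])
print(sentence)
```

Low temperatures make the sampler pick the most likely words almost every time, while values near the slider's maximum of 3.0 flatten the distribution and produce more varied, less coherent text.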
Project Repository
For more details, code, and updates, visit the McDeepNet GitHub repository: McDeepNet on GitHub.
Catch me on my Linktree 🔗