Name: How to Build Flet Chat App with Barcode and Gemini APIs
Rating: 2 (1858 reviews)
Author: yushulx

Gemini is Google's latest AI model at the time of writing. It can be used for free with a limit of 60 queries per minute, and it can recognize text in images. 1D barcodes are usually accompanied by human-readable text, which can be used to verify barcode recognition results. In this article, we use the Flet Python API to build a desktop chat app that integrates both barcode and Gemini APIs. The app reads barcodes from images with Dynamsoft Barcode Reader and performs OCR on the same images with Gemini.

Installation

pip install -U google-generativeai dbr flet

Prerequisites

Flet Python API for Desktop Applications

Flet empowers developers to create desktop applications using Python. It offers a crash course for constructing a real-time chat application, which serves as an excellent starting point.

Our application features a list view for displaying chat messages, a text input field, a button for uploading images, a button for sending messages, and a button to clear the chat history.

Chat messages:


chat = ft.ListView(
    expand=True,
    spacing=10,
    auto_scroll=True,
)

Text input field:

new_message = ft.TextField(
    hint_text="Write a message...",
    autofocus=True,
    shift_enter=True,
    min_lines=1,
    max_lines=5,
    filled=True,
    expand=True,
    on_submit=send_message_click,
)

Button to load an image:


def pick_files_result(e: ft.FilePickerResultEvent):
    global image_path
    image_path = None
    if e.files is not None:
        image_path = e.files[0].path
        # TODO

def pick_file(e):
    pick_files_dialog.pick_files()

pick_files_dialog = ft.FilePicker(on_result=pick_files_result)
page.overlay.append(pick_files_dialog)

ft.IconButton(
    icon=ft.icons.UPLOAD_FILE,
    tooltip="Pick an image",
    on_click=pick_file,
)

Button to send a message:

def on_message(message: Message):
    if message.message_type == "chat_message":
        m = ChatMessage(message)

        chat.controls.append(m)
        page.update()

page.pubsub.subscribe(on_message)

def send_message_click(e):
    global image_path
    if new_message.value != "":
        page.pubsub.send_all(
            Message("Me", new_message.value, message_type="chat_message"))

        question = new_message.value

        new_message.value = ""
        new_message.focus()
        page.update()

        page.pubsub.send_all(
            Message("Gemini", "Thinking...", message_type="chat_message"))

        # TODO

ft.IconButton(
    icon=ft.icons.SEND_ROUNDED,
    tooltip="Send message",
    on_click=send_message_click,
)

Flet's PubSub provides asynchronous communication between page sessions. The subscribe method registers a handler for broadcast messages, and the send_all method sends a message to every active session, including the sender's own. Whenever a message arrives, the on_message handler appends it to the list view and refreshes the page.
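To make the subscribe/send_all flow concrete, here is a minimal in-memory stand-in written in plain Python. It is not Flet's implementation: TinyPubSub is a hypothetical class, and the Message fields (user_name, text, message_type, is_image) are assumptions inferred from how Message is called in this article.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    # Field names are assumptions inferred from the calls in this article.
    user_name: str
    text: str
    message_type: str
    is_image: bool = False


class TinyPubSub:
    """Hypothetical stand-in that mimics subscribe/send_all semantics."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[Message], None]] = []

    def subscribe(self, handler: Callable[[Message], None]) -> None:
        # Register a callback that receives every broadcast message.
        self._handlers.append(handler)

    def send_all(self, message: Message) -> None:
        # Deliver the message to every subscriber, including the sender.
        for handler in list(self._handlers):
            handler(message)


# Stand-in for chat.controls: the handler appends plain strings to a list.
chat_log: List[str] = []


def on_message(message: Message) -> None:
    if message.message_type == "chat_message":
        chat_log.append(f"{message.user_name}: {message.text}")


pubsub = TinyPubSub()
pubsub.subscribe(on_message)
pubsub.send_all(Message("Me", "hello", message_type="chat_message"))
pubsub.send_all(Message("Gemini", "Thinking...", message_type="chat_message"))
```

In the real app, the handler appends a ChatMessage control to the list view and calls page.update() instead of writing to a list.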

Button to clear the chat history:

def clear_message(e):
    global image_path
    image_path = None
    chat.controls.clear()
    page.update()

ft.IconButton(
    icon=ft.icons.CLEAR_ALL,
    tooltip="Clear all messages",
    on_click=clear_message,
)

Integrating the Dynamsoft Barcode Reader

Dynamsoft Barcode Reader is a library for barcode scanning. To add barcode scanning to the app, follow these steps:

Import the Dynamsoft Barcode Reader library and initialize a barcode reader instance using your license key.

from dbr import BarcodeReader, BarcodeReaderError
license_key = "LICENSE-KEY"
BarcodeReader.init_license(license_key)
reader = BarcodeReader()

Decode the barcode from the uploaded image and send the result to the chat.

def pick_files_result(e: ft.FilePickerResultEvent):
    global image_path, barcode_text
    barcode_text = None
    image_path = None
    if e.files is not None:
        image_path = e.files[0].path
        # Show the selected image in the chat.
        page.pubsub.send_all(
            Message("Me", image_path, message_type="chat_message", is_image=True))

        text_results = None
        try:
            text_results = reader.decode_file(image_path)
        except BarcodeReaderError as bre:
            print(bre)

        # Report the first decoded barcode, if any was found.
        if text_results:
            barcode_text = text_results[0].barcode_text
            page.pubsub.send_all(
                Message("DBR", barcode_text, message_type="chat_message"))
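The handler above reports only the first decoded result. As a sketch of how every result could be listed, the helper below formats each one on its own line. The name format_results is hypothetical, and the code assumes each result exposes barcode_format_string and barcode_text attributes, as the dbr result objects are expected to. Stand-in objects are used so the snippet runs without the SDK or a license.

```python
from types import SimpleNamespace
from typing import Iterable, List, Optional


def format_results(results: Optional[Iterable]) -> List[str]:
    """Return one 'FORMAT: text' line per decoded barcode."""
    if not results:
        # Covers both None and an empty list.
        return []
    return [f"{r.barcode_format_string}: {r.barcode_text}" for r in results]


# Stand-ins for decoded results; the values are made up for illustration.
fake_results = [
    SimpleNamespace(barcode_format_string="EAN_13", barcode_text="9780201379624"),
    SimpleNamespace(barcode_format_string="CODE_128", barcode_text="ABC-123"),
]
lines = format_results(fake_results)
```

Each line could then be sent to the chat with page.pubsub.send_all in the same way as the single result.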

Utilizing Google's Gemini AI for Text Recognition

Gemini can extract text from images. Once you've decoded a barcode, you can employ Gemini to verify the accuracy of the text decoded from the barcode. Here are the steps to use Gemini:

Set up the API key for Gemini.

import pathlib

import google.generativeai as genai
import google.ai.generativelanguage as glm

genai.configure(api_key='API-KEY')
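Hard-coding keys in the script makes them easy to leak. One option, sketched below, is to read them from environment variables. The variable names GEMINI_API_KEY and DBR_LICENSE_KEY and the helper read_secret are assumptions for illustration, not names required by either SDK.

```python
import os


def read_secret(name: str) -> str:
    """Return the value of an environment variable or fail with a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Set the {name} environment variable before starting the app")
    return value


# Demo value only, so the snippet runs on its own;
# export the real key in your shell instead.
os.environ.setdefault("GEMINI_API_KEY", "API-KEY")
api_key = read_secret("GEMINI_API_KEY")
```

The returned values could then be passed to genai.configure(api_key=...) and BarcodeReader.init_license(...).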

Initialize the text and vision models. The vision model takes both text and images as input.

model_text = genai.GenerativeModel('gemini-pro')
chat_text = model_text.start_chat(history=[])
model_vision = genai.GenerativeModel('gemini-pro-vision')
chat_vision = model_vision.start_chat(history=[])

Define a custom :verify command that asks the vision model to recognize the text printed around the barcode in the uploaded image.

def send_message_click(e):
    global image_path
    if new_message.value != "":
        ...

        if question == ":verify":
            question = "recognize text around the barcode"
            response = model_vision.generate_content(
                glm.Content(
                    parts=[
                        glm.Part(
                            text=question),
                        glm.Part(
                            inline_data=glm.Blob(
                                mime_type='image/jpeg',
                                data=pathlib.Path(
                                    image_path).read_bytes()
                            )
                        ),
                    ],
                ))

            text = response.text
            page.pubsub.send_all(
                Message("Gemini", text, message_type="chat_message"))
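The request above hard-codes image/jpeg and reads image_path without checking that an image was picked. A possible refinement, sketched with the standard library only, is to guess the MIME type from the file name and skip the request when no usable image is available. The helper load_image is hypothetical.

```python
import mimetypes
import pathlib
import tempfile
from typing import Optional, Tuple


def load_image(image_path: Optional[str]) -> Optional[Tuple[str, bytes]]:
    """Return (mime_type, data) for an image file, or None if it cannot be used."""
    if not image_path:
        return None
    path = pathlib.Path(image_path)
    if not path.is_file():
        return None
    # Guess the type from the file extension, e.g. .png -> image/png.
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None or not mime_type.startswith("image/"):
        return None
    return mime_type, path.read_bytes()


# Demo with a throwaway file; the bytes are placeholders, not a real image.
with tempfile.TemporaryDirectory() as tmp:
    sample = pathlib.Path(tmp) / "barcode.png"
    sample.write_bytes(b"placeholder")
    loaded = load_image(str(sample))
```

The returned pair could feed glm.Blob(mime_type=..., data=...), and a None result could trigger a chat message asking the user to pick an image first.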

Verifying the Barcode Decoding Results with the Accompanying Text

Now we can check whether the text decoded from the barcode appears in the text that Gemini recognized in the image. Because the recognized text may contain spaces between digit groups, remove them before comparing.

if barcode_text is None:
    return

text = text.replace(" ", "")
if barcode_text in text:
    page.pubsub.send_all(
        Message("Gemini", barcode_text + " is correct ✓", message_type="chat_message"))
else:
    page.pubsub.send_all(
        Message("Gemini", barcode_text + " may not be correct", message_type="chat_message"))
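The comparison above strips only space characters. OCR output may also contain line breaks or tabs, so a slightly more tolerant check removes all whitespace from both strings before testing for containment. This is a sketch, and contains_barcode is a hypothetical helper name.

```python
import re


def squash(text: str) -> str:
    """Remove every whitespace character (spaces, tabs, line breaks)."""
    return re.sub(r"\s+", "", text)


def contains_barcode(ocr_text: str, barcode_text: str) -> bool:
    """Check whether barcode_text appears in ocr_text, ignoring whitespace."""
    needle = squash(barcode_text)
    # An empty barcode string would match anything, so treat it as a failure.
    return bool(needle) and needle in squash(ocr_text)


# Made-up OCR strings for illustration.
ok = contains_barcode("ISBN 978-0-201\n9 780201 379624", "9780201379624")
bad = contains_barcode("9 780201 379625", "9780201379624")
```

Here ok is True because the digits match once whitespace is removed, and bad is False because the last digit differs.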

Launch the desktop application and test it with some images that contain 1D barcodes:

flet run chatbot.py

Source Code

https://github.com/yushulx/python-barcode-qrcode-sdk/tree/main/examples/official/9.x/flet_chat_gemini