Introduction
LLM applications are becoming increasingly popular. However, there are many models, each with its own differences, and handling streaming output can be complex, especially for developers who are new to front-end work.
Thanks to the AI SDK developed by Vercel, implementing an LLM chat with streaming output in Next.js has become much easier. In this post, I'll walk step by step through integrating Ollama into your front-end project.
Install Ollama
Ollama is a popular tool for running LLMs locally. It downloads models for you and exposes an HTTP API that your backend can call. If you want lower latency or better privacy by running models locally, Ollama is an excellent choice. On Linux, install it with the following command:
curl -fsSL https://ollama.com/install.sh | sh
If you're using a different OS, follow the installation instructions on the Ollama website (https://ollama.com).
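Before moving on, it can help to confirm that the Ollama server is reachable and that the model used later in this tutorial has been downloaded (for example with ollama pull llama3:8b). The sketch below assumes Ollama is listening on its default address, http://localhost:11434, and reads its /api/tags endpoint, which lists the locally installed models. The helper names hasModel and checkOllama are made up for this example.

```typescript
// Shape of the part of the /api/tags response that this sketch reads.
interface TagsResponse {
  models: { name: string }[];
}

// Returns true if a model with exactly this name (including its tag) is installed.
function hasModel(tags: TagsResponse, modelName: string): boolean {
  return tags.models.some((model) => model.name === modelName);
}

// Assumption: Ollama runs locally on its default port, 11434.
async function checkOllama(
  modelName: string,
  baseUrl = "http://localhost:11434",
): Promise<boolean> {
  const response = await fetch(`${baseUrl}/api/tags`);
  if (!response.ok) {
    throw new Error(`Ollama responded with HTTP ${response.status}`);
  }
  const tags = (await response.json()) as TagsResponse;
  return hasModel(tags, modelName);
}
```

Calling checkOllama("llama3:8b") from a small script should resolve to true once the model has been pulled. If the call rejects, the server is probably not running.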
Create a New Next.js Project
To create a new Next.js project, run the following command:
npx create-next-app@latest your-new-project
Make sure you choose the App Router when prompted. After that, start the development server from the project folder:
npm run dev
Then open localhost:3000 in your preferred browser to verify that the new project is set up correctly.
Next, you need to install the AI SDK:
npm install ai
The AI SDK is built around a provider abstraction, which lets you plug in different LLM backends or implement your own provider. For this tutorial, you only need to install the community-maintained Ollama provider:
npm install ollama-ai-provider
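To make the provider idea concrete, here is a deliberately simplified sketch of the pattern. It does not reproduce the AI SDK's real interfaces: the TextModel type, createEchoProvider, and collectText are hypothetical names used only for illustration. The point is that calling code depends on a small common interface, so one model backend can be swapped for another.

```typescript
// Hypothetical, simplified stand-in for a model interface; not the AI SDK's real types.
interface TextModel {
  modelId: string;
  streamText(prompt: string): AsyncIterable<string>;
}

// A provider is a function that turns a model id into a model object.
type Provider = (modelId: string) => TextModel;

// Toy provider that "generates" text by echoing the prompt word by word.
function createEchoProvider(): Provider {
  return (modelId) => ({
    modelId,
    async *streamText(prompt: string) {
      for (const word of prompt.split(" ")) {
        yield `${word} `;
      }
    },
  });
}

// Calling code only sees TextModel, so it works with any provider.
async function collectText(model: TextModel, prompt: string): Promise<string> {
  let text = "";
  for await (const delta of model.streamText(prompt)) {
    text += delta;
  }
  return text.trim();
}
```

In the real SDK, the ollama function imported from ollama-ai-provider plays the role of the provider, and streamText from the ai package plays the role of the calling code.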
Server-Side Code
Now that all the prerequisites for your LLM application are in place, create a new file named actions.ts in the app folder:
"use server";

import { ollama } from "ollama-ai-provider";
import { streamText } from "ai";
import { createStreamableValue } from "ai/rsc";

export interface Message {
  role: "user" | "assistant";
  content: string;
}

export async function continueConversation(history: Message[]) {
  "use server";

  // Streamable value that the client reads while the response is generated.
  const stream = createStreamableValue();
  const model = ollama("llama3:8b");

  // Run generation in the background so the function can return immediately.
  (async () => {
    const { textStream } = await streamText({
      model: model,
      messages: history,
    });

    // Forward each generated piece of text to the client.
    for await (const text of textStream) {
      stream.update(text);
    }

    stream.done();
  })().then(() => {});

  return {
    messages: history,
    newMessage: stream.value,
  };
}
Here is a brief explanation of this code.
- interface Message is a shared interface that defines the structure of a message. It has two properties: role (either "user" or "assistant") and content (the actual text of the message).
- The continueConversation function is a Server Action that uses the conversation history to generate the assistant's response. It calls the Ollama model (here llama3:8b, but you can replace it with any model of your choice) and streams the generated text back.
- The streamText function is part of the AI SDK. It creates a text stream that yields the assistant's response piece by piece as it is generated.
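One detail that is easy to miss is that continueConversation returns before the model has finished generating: the async block keeps pushing text into the stream in the background. The sketch below imitates that pattern without the AI SDK. createSimpleStream and startReply are hypothetical helpers written for this illustration, and they are not how createStreamableValue is implemented internally.

```typescript
// Hypothetical stand-in for a streamable value: a producer pushes, a consumer iterates.
function createSimpleStream<T>() {
  const queue: T[] = [];
  let finished = false;
  let wake: (() => void) | null = null;

  // Wake the consumer if it is currently waiting for data.
  const notify = () => {
    if (wake) {
      wake();
      wake = null;
    }
  };

  return {
    update(value: T) {
      queue.push(value);
      notify();
    },
    done() {
      finished = true;
      notify();
    },
    async *read(): AsyncGenerator<T> {
      while (true) {
        while (queue.length > 0) {
          yield queue.shift() as T;
        }
        if (finished) return;
        // Nothing buffered yet: sleep until the producer calls update() or done().
        await new Promise<void>((resolve) => {
          wake = resolve;
        });
      }
    },
  };
}

// Returns the stream immediately; the words arrive later, one at a time.
function startReply(words: string[]) {
  const stream = createSimpleStream<string>();

  // Fire-and-forget producer, like the async block in continueConversation.
  (async () => {
    for (const word of words) {
      await new Promise((resolve) => setTimeout(resolve, 10)); // simulated latency
      stream.update(word);
    }
    stream.done();
  })();

  return stream;
}
```

A consumer can iterate with for await (const delta of startReply(words).read()) and will receive each word as soon as it is produced, which is the same shape as the client loop over readStreamableValue.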
Client-Side Code
Next, replace the contents of page.tsx with the following code:
"use client";

import { useState } from "react";
import { continueConversation, Message } from "./actions";
import { readStreamableValue } from "ai/rsc";

export default function Home() {
  const [conversation, setConversation] = useState<Message[]>([]);
  const [input, setInput] = useState<string>("");

  return (
    <div>
      <div>
        {conversation.map((message, index) => (
          <div key={index}>
            {message.role}: {message.content}
          </div>
        ))}
      </div>

      <div>
        <input
          type="text"
          value={input}
          onChange={(event) => {
            setInput(event.target.value);
          }}
        />
        <button
          onClick={async () => {
            // Send the history plus the new user message to the server.
            const { messages, newMessage } = await continueConversation([
              ...conversation,
              { role: "user", content: input },
            ]);

            // Accumulate the streamed deltas and re-render after each one.
            let textContent = "";
            for await (const delta of readStreamableValue(newMessage)) {
              textContent = `${textContent}${delta}`;
              setConversation([
                ...messages,
                { role: "assistant", content: textContent },
              ]);
            }
          }}
        >
          Send Message
        </button>
      </div>
    </div>
  );
}
This is a very simple UI, but you can now chat with the LLM. Some important points:
- The input field captures the user's input. It is controlled by a React state variable that is updated every time the input changes.
- The button has an onClick handler that calls the continueConversation function. The handler appends the user's new message to the current conversation history, sends it to the server, and waits for the assistant's response.
- The conversation array holds the history of the conversation. Each message is displayed on the screen, and new messages are appended at the end. By using readStreamableValue from the AI SDK, we can read the streamed value returned by the Server Action and update the conversation in real time.
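The state updates inside the click handler can also be expressed as two small pure functions, which makes them easier to reason about and test. This is only a sketch: buildHistory and withAssistantText are hypothetical helpers, not part of the AI SDK, and skipping blank input is an extra assumption that the original handler does not make.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Builds the history to send: the existing conversation plus the new user message.
// Returns null for blank input so the caller can skip the request.
function buildHistory(conversation: Message[], input: string): Message[] | null {
  const content = input.trim();
  if (content === "") return null;
  return [...conversation, { role: "user", content }];
}

// Returns the conversation to render once textSoFar of the reply has arrived.
function withAssistantText(history: Message[], textSoFar: string): Message[] {
  return [...history, { role: "assistant", content: textSoFar }];
}
```

Inside the handler, each delta read from readStreamableValue would be appended to a running string, and setConversation would be called with withAssistantText(messages, textContent). Because the assistant message is rebuilt from the full text each time, the list never ends up with duplicate partial replies.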
Let’s Test Now
I typed "who are you" into the input field.
Here is the output of llama3:8b served by Ollama. You'll notice that the output is printed in a streaming manner.
References
- Documentation for the AI SDK: https://sdk.vercel.ai/docs/introduction
- Ollama GitHub: https://github.com/ollama/ollama
- Find more models supported by Ollama: https://ollama.com/library