Testing Frameworks
For testing this project, I chose the following tools:
- Jest:
  - Why?: Jest is a widely used testing framework for JavaScript projects, and since my project was written in JavaScript, it was a natural choice. Jest provides a simple API for unit and integration testing, built-in mocking, and snapshot testing. It is well suited to both small and large projects.
  - Link: Jest Documentation
- Nock:
  - Why?: Nock is a powerful library that allows you to mock HTTP requests. It works well for testing scenarios where external API calls (like LLM integrations) are involved. By using Nock, I can simulate LLM responses without actually hitting the live API, which makes the tests faster and more reliable.
  - Link: Nock Documentation
Testing Configuration
Installing Dependencies
The first step was to install Jest and Nock:
npm install --save-dev jest nock
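Jest runs with sensible defaults, so no extra configuration is strictly required. If the project does want explicit settings, a minimal jest.config.js might look like the sketch below; treat the file and its options as an assumption rather than the project's actual configuration:

// jest.config.js — a hypothetical minimal configuration (the project may rely on Jest's defaults)
module.exports = {
  testEnvironment: "node", // the tests exercise Node code, not a browser DOM
  collectCoverage: true, // write coverage reports
  coverageDirectory: "coverage", // matches the coverage/ folder ignored by ESLint below
};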
Configuring Jest with ESLint
Since Jest injects global variables like describe, test, and expect, I had to configure ESLint so it would not flag them as undefined globals. To do this, I added the following configuration to my eslint.config.mjs file:
import globals from "globals";
import pluginJs from "@eslint/js";
import pluginJest from "eslint-plugin-jest";

export default [
  {
    files: ["**/*.js"],
    languageOptions: { sourceType: "commonjs" },
  },
  {
    ignores: [
      "build/",
      "coverage/",
      "node_modules/",
      ".env",
      "*.config.js",
      "*.config.mjs",
      "examples/",
    ],
  },
  { languageOptions: { globals: { ...globals.browser, ...globals.jest } } },
  pluginJs.configs.recommended,
  {
    plugins: {
      jest: pluginJest,
    },
    rules: {
      ...pluginJest.configs.recommended.rules,
    },
  },
];
This tells ESLint that these global variables are provided by Jest, and it shouldn't flag them as errors.
Setting Up Nock for Mocking HTTP Requests
In the test files, I imported Nock and used it to mock HTTP requests:
const nock = require("nock");
Then, I defined mock HTTP responses for the LLM API like so:
const mockResponse = {
  id: "chatcmpl-0b910ec8-e9d9-4095-99ed-1311b8efcf39",
  object: "chat.completion",
  created: 1730865897,
  model: "llama3-8b-8192",
  choices: [
    {
      index: 0,
      message: {
        role: "assistant",
        content: "This is a test response from the LLM.",
      },
      logprobs: null,
      finish_reason: "stop",
    },
  ],
  usage: {
    queue_time: 0.003401526000000002,
    prompt_tokens: 300,
    prompt_time: 0.079328471,
    completion_tokens: 500,
    completion_time: 0.215833333,
    total_tokens: 800,
    total_time: 0.295161804,
  },
  system_fingerprint: "fp_a97cfe35ae",
  x_groq: { id: "req_01jbztb85cf4nv71cyjd6pms9w" },
};

nock("https://api.groq.com/openai")
  .post("/v1/chat/completions")
  .reply(200, mockResponse);
This way, whenever my code makes a matching request to that endpoint, Nock intercepts it and returns the mocked response instead of hitting the live API.
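To show how this fits into an actual Jest test, here is a minimal sketch. The function promptLLM, its module path, and the assumption that it returns just the completion text are placeholders for illustration, not the project's real identifiers:

// Hypothetical test sketch: promptLLM and its module path are placeholders.
const nock = require("nock");
const { promptLLM } = require("../src/llm");

describe("promptLLM", () => {
  afterEach(() => {
    // Remove any interceptors a test did not consume
    nock.cleanAll();
  });

  test("returns the mocked completion text", async () => {
    // Intercept the chat completions call and reply with a canned payload
    nock("https://api.groq.com/openai")
      .post("/v1/chat/completions")
      .reply(200, {
        choices: [
          {
            index: 0,
            message: { role: "assistant", content: "This is a test response from the LLM." },
            finish_reason: "stop",
          },
        ],
      });

    const answer = await promptLLM("Say hello");
    expect(answer).toBe("This is a test response from the LLM.");
  });
});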
Challenges that I faced
One of the major challenges I faced during this process was testing how the LLM handles streaming responses. This type of response is returned differently than a standard JSON object, which made mocking it a bit tricky. Here's what I learned:
- Streaming Responses Are Different: The LLM streaming API sends the response in "chunks", which is fundamentally different from a regular response that arrives as a single payload. This required me to change my approach to testing, specifically how I simulated the stream-like behavior.
- Nock Can't Mock Streams Directly: Initially, I tried using Nock to mock the stream, but I quickly realized that it was designed to mock HTTP responses, not streams. Instead, I simulated the stream using an array of mocked chunks of data, as in the test below:
test("should return an object with response and tokenInfo properties", async () => {
// Mock the stream
const stream = [
{ choices: [{ delta: { content: "Hello" } }] },
{
choices: [{ delta: { content: " World" } }],
x_groq: {
usage: {
queue_time: 0.001,
prompt_tokens: 300,
prompt_time: 0.046,
completion_tokens: 500,
completion_time: 0.203,
total_tokens: 800,
total_time: 0.25,
},
},
},
];
// Call the readStream function
const result = await readStream(stream);
// Assert the result
expect(result.response).toBe("Hello World");
expect(result.tokenInfo).toEqual({
completionToken: 500,
promptToken: 300,
totalToken: 800,
});
});
Testing uncovered some edge cases
- Empty Chunks: One potential edge case I uncovered was how the system should behave when the streaming chunks are empty or incomplete. Handling empty chunks gracefully is crucial, as real-world API responses can sometimes contain empty or unexpected data.
- Missing Token Information: I had to ensure that my code correctly handled cases where the token information (i.e., usage data) wasn't present in every chunk. This could happen if the stream is not fully sent or if the usage data is split across multiple chunks.
These edge cases helped refine my implementation and ensure that my code could handle a variety of real-world situations.
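For reference, here is a rough sketch of what a readStream helper along those lines might look like. The project's actual implementation isn't shown in this post, so the function body below, including the optional-chaining guards for empty chunks and missing usage data, is an assumption reconstructed from the test above:

// Hypothetical sketch of readStream, based on the chunk shape used in the test above.
async function readStream(stream) {
  let response = "";
  let tokenInfo = null;

  for await (const chunk of stream) {
    // Guard against empty or incomplete chunks: only append content that exists
    const content = chunk?.choices?.[0]?.delta?.content;
    if (content) {
      response += content;
    }

    // Token usage may only appear on the final chunk (or not at all)
    const usage = chunk?.x_groq?.usage;
    if (usage) {
      tokenInfo = {
        promptToken: usage.prompt_tokens,
        completionToken: usage.completion_tokens,
        totalToken: usage.total_tokens,
      };
    }
  }

  return { response, tokenInfo };
}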
What I learnt
This was a highly educational experience. Here's what I took away from it:
- Testing is Essential: Although I had written tests before, this project was my first time writing tests for an API integration, particularly for something as complex as a language model stream. I now see just how important it is to test all parts of the system, including edge cases and error handling.
- Mocking is Powerful: I learned the importance of mocking external dependencies. Using Nock to mock the HTTP responses saved me from having to rely on a live connection to the LLM, which could have been slow or unreliable. Mocking also allowed me to simulate different responses easily.
- Confidence in Code: Writing tests made me feel more confident that my code was functioning as expected. The tests helped me catch potential issues early on, and they provide a safety net for future changes or refactoring.