Hello everyone! In this post I'd like to share my recent work: my very own tiny wrapper for the OpenAI API.
Motivation
Nowadays, every programmer who wants to build a production-ready LLM application inevitably stumbles upon libraries such as LangChain, Semantic Kernel, Guidance, etc. Many developers then quickly grow frustrated and disappointed in their stack choice. Why? Because all these tools introduce unnecessary abstractions, add complicated logic, and hide the prompts they use. As a result, development becomes a tedious task even for simple applications.
In fact, you don't need anything extra to work with the existing OpenAI API. You don't even need wrappers: all you need to get started is your own logic, your reasoning, and naked API calls in your preferred programming language. I created this wrapper based on my experience and my current tasks, and I thought it might be useful to a larger audience.
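To show what I mean by naked API calls, here is a minimal sketch using the pre-v1 openai Python package (the interface contemporary with gpt-3.5-turbo-0613); the key is a placeholder you replace with your own:
import openai

openai.api_key = "sk-..."  # placeholder: your own API key

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=[{"role": "user", "content": "Say hello!"}],
)
print(response["choices"][0]["message"]["content"])
That's the whole ceremony; everything a framework does for you is layered on top of this one call.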
Concept
I had some requirements in mind which I tried to convey in code:
- No hard-coded prompts. Prompts are just text; if you feel yours aren't working, thousands of them are available for free all over the internet
- Conversation history should be stored by default. This is a chat model, after all, so it's a good idea to keep the history as a list of messages for a consistent experience. It's easier to implement a method that clears the history than to make the developer construct it by hand (see the sketch after this list)
- Token count validation. Tokens should be counted, and if the amount exceeds the available quota, the program should trim the messages or at least produce a warning or an error
- No hard-coded complicated logic for chains/agents etc. The wrapper should implement only basic methods, while anything else is left to the developer. Below I will show that this is quite easy
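To make the history and token-counting points concrete, here is a hedged sketch of the concept (this is not lloom's actual internals, just an illustration of the requirements above), with token counting via tiktoken:
import tiktoken

class TinyChat:
    """Not lloom's real implementation, just the idea."""
    def __init__(self, model="gpt-3.5-turbo-0613", max_tokens=4096):
        self.model = model
        self.max_tokens = max_tokens
        self.encoder = tiktoken.encoding_for_model(model)
        self.history = []  # list of {"role": ..., "content": ...} dicts

    def count_tokens(self):
        # Rough count: ignores the few extra tokens of per-message overhead
        return sum(len(self.encoder.encode(m["content"])) for m in self.history)

    def add_message(self, role, content):
        self.history.append({"role": role, "content": content})
        # Validate the token count before the next API call: trim or warn
        while self.count_tokens() > self.max_tokens and len(self.history) > 1:
            self.history.pop(0)  # drop the oldest message

    def clear_history(self):
        self.history = []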
Lloom
With the concepts listed above, I implemented and released "lloom" version 0.0.1: https://github.com/zakharsmirnoff/lloom. It's available via pip, and the documentation for the methods can be found on GitHub. Here I would like to give some examples in comparison with LangChain.
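Assuming the pip package shares the repository's name, installation should be as simple as:
pip install lloom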
Examples
One of the main use cases for GPT-powered applications is summarization. I studied this page in the LangChain docs and implemented the same logic using my wrapper:
from lloom import LloomConfig, Lloom
from pypdf import PdfReader
# I'm using the pypdf package for processing PDF docs; the PDF here is a paper available on arXiv
reader = PdfReader("virtual_injection.pdf")
config = LloomConfig(your_api_key, model="gpt-3.5-turbo-0613")  # your_api_key holds your OpenAI key
lloom = Lloom(config)
answers = []
# A minimal helper function to achieve a stuff chain. I know it's not exactly the stuff chain from the LangChain documentation, but the concept is similar: putting the whole document into the prompt. I just split the PDF by pages
def one_shot(text):
    res = lloom.generate(f"Write a concise summary of the following: {text}")
    answers.append(res)
    lloom.clear_history()
# The code below took approximately 1.5 minutes to execute; as a result we get a list of GPT-generated summaries. I chose only the first 11 pages, since the rest is examples and references
for page in reader.pages[0:11]:
    text = page.extract_text()
    one_shot(text)
# This single line of code along with the function above implements a map-reduce chain
summary = lloom.generate(f"Write a short and concise summary for these pieces of text: {', '.join(answers)}")
print(summary) # it gave me this:
'''
The papers introduce the concept of Virtual Prompt Injection (VPI) as a method to manipulate the behavior of Large Language Models (LLMs) without directly modifying the model input.
VPI allows an attacker to control the model's responses by specifying a virtual prompt, leading to biased views and potentially harmful outcomes.
The papers propose a method for performing VPI by poisoning the model's instruction tuning data and demonstrate its effectiveness in steering the LLM's behavior.
They emphasize the importance of ensuring the integrity of instruction tuning data and suggest data filtering as a defense against poisoning attacks.
The effectiveness of VPI is evaluated in various scenarios such as sentiment steering and code injection, with comparisons to baseline methods and different model scales.
The papers also discuss defense mechanisms and the need for further research in this area to develop better defense mechanisms against VPI attacks.
The limitations of the study are acknowledged, and the authors emphasize the importance of studying vulnerabilities in instruction-tuned language models to enhance security measures.
'''
You can check other examples of summarization in the repo (they are still to be added, actually).
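In the meantime, here is a hedged sketch of a refine-style summarization (the analogue of LangChain's refine chain), reusing the same assumed Lloom API and the reader object from the example above:
summary = ""
for page in reader.pages[0:11]:
    text = page.extract_text()
    summary = lloom.generate(
        f"Current summary: {summary}\n"
        f"Refine it with this new text: {text}\n"
        "Return only the updated concise summary."
    )
    lloom.clear_history()
print(summary)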
Let's move on to the agent example. Below is an implementation of a D&D game for one player:
from lloom import LloomConfig, Lloom
protagonist_name = "Storm Ryder"
storyteller_name = "Captain Quill"
quest = '''Set sail on the "Marauder's Dream" with your unique crew, seeking the fragmented map leading to the legendary "Mythic Isles" and the coveted "Seafarer's Heart" artifact. Battle fierce rivals, sea monsters, and unravel hidden histories while forging unbreakable bonds. A thrilling quest inspired by One Piece awaits, with treasure beyond gold: adventure, camaraderie, and freedom.'''
protagonist_description = "Storm Ryder, a daring and enigmatic adventurer, possesses the heart of a true pirate. Guided by an unyielding sense of justice and fueled by an insatiable thirst for adventure, Storm sails the uncharted seas, leaving a legacy of courage and camaraderie in their wake."
storyteller_description = "Captain Quill, a weathered and charismatic storyteller, carries tales of ancient legends and forgotten myths in their ink-stained logbook. With a twinkle in their eye and a voice that mesmerizes, they narrate the epic quest of Storm Ryder and the fabled Mythic Isles, inspiring awe and wonder in all who listen."
# I took the ready-made system message from the LangChain docs, but the same functionality of specifying the description could be implemented in 3 lines of code if needed, as sketched below
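# For instance, a hedged sketch (assuming the same Lloom API as above) of
# generating the description instead of hard-coding it:
# desc_bot = Lloom(LloomConfig(your_api_key, model="gpt-3.5-turbo-0613"))
# protagonist_description = desc_bot.generate(f"Describe the pirate {protagonist_name} in 50 words")
# desc_bot.clear_history()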
player_sysmsg = f'''
Here is the topic for a Dungeons & Dragons game: {quest}.
There is one player in this game: the protagonist, {protagonist_name}.
The story is narrated by the storyteller, {storyteller_name}.
Never forget you are the protagonist, {protagonist_name}, and I am the storyteller, {storyteller_name}.
Your character description is as follows: {protagonist_description}.
You will propose actions you plan to take and I will explain what happens when you take those actions.
Speak in the first person from the perspective of {protagonist_name}.
For describing your own body movements, wrap your description in '*'.
Do not change roles!
Do not speak from the perspective of {storyteller_name}.
Do not forget to finish speaking by saying, 'It is your turn, {storyteller_name}.'
Do not add anything else.
Remember you are the protagonist, {protagonist_name}.
Stop speaking the moment you finish speaking from your perspective.
'''
master_sysmsg = f'''
Here is the topic for a Dungeons & Dragons game: {quest}.
There is one player in this game: the protagonist, {protagonist_name}.
The story is narrated by the storyteller, {storyteller_name}.
Never forget you are the storyteller, {storyteller_name}, and I am the protagonist, {protagonist_name}.
Your character description is as follows: {storyteller_description}.
I will propose actions I plan to take and you will explain what happens when I take those actions.
Speak in the first person from the perspective of {storyteller_name}.
For describing your own body movements, wrap your description in '*'.
Do not change roles!
Do not speak from the perspective of {protagonist_name}.
Do not forget to finish speaking by saying, 'It is your turn, {protagonist_name}.'
Do not add anything else.
Remember you are the storyteller, {storyteller_name}.
Stop speaking the moment you finish speaking from your perspective.
'''
master_config = LloomConfig(your_api_key, temperature=1.0, logging=False, model="gpt-3.5-turbo-0613", system_message=master_sysmsg)
player_config = LloomConfig(your_api_key, temperature=1.0, logging=False, model="gpt-3.5-turbo-0613", system_message=player_sysmsg)
master = Lloom(master_config)
player = Lloom(player_config)
max_iter = 3
initial_message = "I'm ready to start!"
n = 0
# As the history is stored by default, we don't need to implement any additional classes or methods
while n < max_iter:
    master_message = master.generate(initial_message)
    print(f"Master message: {master_message}")
    player_message = player.generate(master_message)
    initial_message = player_message
    print(f"Player message: {player_message}")
    n += 1
If you check the LangChain docs, you will see that their implementation is at least as long, and usually requires more code. The same is true for Semantic Kernel. Thank you for reading! If you like my work, please star the repository and share your use cases; I would be excited to learn new ways to use it!
P.S. By no means am I trying to compete with LangChain and similar libraries, and there's no hate either! LangChain and the others might be a good choice in some cases; I just found that a simpler implementation works best for me.