Adding Large Language Model (LLM) capabilities to cloud applications has become increasingly accessible. This article walks you through integrating an LLM into your cloud project using WebAssembly (WASM) on the open-source 8bit.ws cloud.

LLM Capabilities via 8bit.ws Cloud
The 8bit.ws cloud exposes LLM features through its Satellite system, which lets serverless functions run model predictions. Here’s how to use these capabilities in your cloud functions.

LLAMA Satellite & its SDK
Our Cloud Computing Network provides LLM capabilities through what we call a Satellite. It does so by exporting llama.cpp capabilities to the Taubyte Virtual Machine, which powers Serverless Functions (or DFunctions, as per Taubyte's terminology). The source code for the Satellite can be found here.

Satellites export low-level functions that aren't very intuitive to use directly. Fortunately, it's possible to address that with a user-friendly SDK. As of today, we offer a Go SDK. The source code can be found here.

Building a Predictive Function
Let’s construct a function that uses an LLM to process prompts.

Before proceeding, let's ensure you have a project and a DFunction ready to go. If not, please refer to "Create a Function".

If you followed the steps from Taubyte's Documentation, your basic function should look something like this:

package lib

import (
    "github.com/taubyte/go-sdk/event"
)

func ping(e event.Event) uint32 {
    h, err := e.HTTP()
    if err != nil {
        return 1
    }

    h.Write([]byte("PONG"))

    return 0
}

We will turn this ping function into a predict function that takes a prompt from the POST request body and returns a prediction.

Incorporating LLM Predictions
Here's how to modify your function to perform LLM predictions with the LLAMA SDK:

package lib

import (
    "io"

    "github.com/samyfodil/taubyte-llama-satellite/sdk"
    "github.com/taubyte/go-sdk/event"
)

// predict reads a prompt from the POST body and streams the LLM's tokens back.
func predict(e event.Event) uint32 {
    h, err := e.HTTP()
    if err != nil {
        return 1
    }
    defer h.Body().Close()

    // The whole request body is the prompt.
    prompt, err := io.ReadAll(h.Body())
    if err != nil {
        return 1
    }

    // Start a prediction with the chosen sampling and batch options.
    p, err := sdk.Predict(
        string(prompt),
        sdk.WithTopK(90),
        sdk.WithTopP(0.86),
        sdk.WithBatch(512),
    )
    if err != nil {
        return 1
    }

    // Stream each token to the client as soon as it is produced.
    for {
        token, err := p.Next()
        if err == io.EOF {
            break // prediction finished
        } else if err != nil {
            return 1
        }
        h.Write([]byte(token))
        h.Flush()
    }

    return 0
}

Explanation
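This function reads the POST body as the prompt for the LLM. It calls sdk.Predict to send the prompt to the model, then loops over p.Next, writing and flushing each token to the HTTP response as it arrives. The loop ends when p.Next returns io.EOF. The options passed to sdk.Predict appear to mirror llama.cpp's sampling and batching parameters: top-k restricts sampling to the 90 most likely tokens, top-p to the smallest set of tokens whose cumulative probability reaches 0.86, and the batch size sets how many prompt tokens are processed at once.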

Deploying Your Function
Once your function is ready, deploy it to the 8bit.ws cloud. Make sure the function's entry point is set to predict and its HTTP method to POST, since the prompt is read from the request body.

Conclusion
By combining LLMs with WASM in the 8bit.ws cloud, serverless functions can do more than react to requests: they can generate predictions on demand and stream them back to clients. As open-source initiatives, 8bit.ws and Taubyte invite developers to build, share, and improve upon a collective vision for the future of cloud computing.

Harnessing LLM with WebAssembly