Turbocharge your application development using WebAssembly with SingleStoreDB

Akmal Chaudhri - Oct 4 '22 - - Dev Community

Abstract

WebAssembly (Wasm) is a binary instruction format for a stack-based virtual machine. Wasm enables developers to use existing code from programming languages, such as C, C++ and Rust, as part of their application development process. However, Wasm is not just for the web, and today is moving in exciting new directions. For example, one use-case would be to run Wasm code in a database system on the data already stored in the database system - an example of co-locating computation with data. Using Wasm to extend the capabilities of a database system opens up opportunities to develop many new applications. SingleStoreDB supports Wasm through Code Engine and, in this article, we'll see how to build a Wasm UDF to perform sentiment analysis on data already stored in SingleStoreDB.

Introduction

We'll need to perform a few steps to prepare our development environment, and the following sections will show how to do this. We'll also use Rust to create our Wasm UDF. Our code example will build upon the example described in the SingleStore blog post [r]evolution Summer 2022: Bring Application Logic to Your Data With SingleStoreDB Code Engine for Wasm.

Create a SingleStoreDB Cloud account

A previous article showed the steps required to create a free SingleStoreDB Cloud account. We'll use Wasm Demo Group as our Workspace Group Name and wasm-demo as our Workspace Name. We'll make a note of our password and host name.

Setup local Wasm development environment

We can quickly create a local Wasm development environment using a few steps. However, SingleStore also provides a Docker Container as an alternative.

Install the software

First, we'll download the wasi-sdk. We'll use wasi-sdk-16.0-linux.tar.gz, the latest version available when writing this article. We'll unpack the file to the /opt directory, as follows:



sudo tar xzvf /path/to/wasi-sdk-16.0-linux.tar.gz -C /opt


Enter fullscreen mode Exit fullscreen mode

We'll replace /path/to/ with the actual path where we downloaded the file. We'll also need to ensure that we add the bin directory to our PATH variable, as follows:



export PATH=/opt/wasi-sdk-16.0/bin:$PATH


Enter fullscreen mode Exit fullscreen mode

Second, we'll download and install the Rust toolchain, as follows:



curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh


Enter fullscreen mode Exit fullscreen mode

To configure the current shell, we'll need to run:



source "$HOME/.cargo/env"


Enter fullscreen mode Exit fullscreen mode

Finally, we'll install wit-bindgen, as follows:



cargo install --git https://github.com/bytecodealliance/wit-bindgen wit-bindgen-cli


Enter fullscreen mode Exit fullscreen mode

To deploy our Wasm module to SingleStoreDB, we'll use the pushwasm tool. First, we'll clone the GitHub repo to a convenient location:



git clone https://github.com/singlestore-labs/pushwasm


Enter fullscreen mode Exit fullscreen mode

Next, we'll change to the pushwasm directory and build the code, as follows:



cd pushwasm

cargo build --release


Enter fullscreen mode Exit fullscreen mode

A new file called pushwasm should be written to target/release, and this directory should be added to our PATH variable.

NOTE: We may also need to run the following to ensure a successful pushwasm build:



sudo apt install libssl-dev


Enter fullscreen mode Exit fullscreen mode

Initialise the source tree

Next, let's create a new directory called workdir in our home folder:



cd

mkdir workdir

cd workdir


Enter fullscreen mode Exit fullscreen mode

From the workdir, we'll now create a skeletal Rust source tree, as follows:



cargo init --vcs none --lib


Enter fullscreen mode Exit fullscreen mode

and we'll need to add wasm32-wasi to the Rust toolchain as it is not installed by default:



rustup target add wasm32-wasi


Enter fullscreen mode Exit fullscreen mode

Create the .wit file

The code we'll use below for our Wasm UDF is also available on GitHub.

In our workdir, we'll now create a file called sentimentable.wit that contains the interface definition. In this file, we'll add the following:



record polarity-scores {
    compound: float64,
    positive: float64,
    negative: float64,
    neutral: float64,
}

sentimentable: func(input: string) -> list<polarity-scores>


Enter fullscreen mode Exit fullscreen mode

We'll define a function sentimentable that will take a string, perform sentiment analysis on that string and return a list of polarity scores.

Implement and compile

In our workdir we'll replace the existing contents of Cargo.toml with the following code:



[package]
name = "sentimentable"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
wit-bindgen-rust = { git = "https://github.com/bytecodealliance/wit-bindgen.git", rev = "60e3c5b41e616fee239304d92128e117dd9be0a7" }
vader_sentiment = { git = "https://github.com/ckw017/vader-sentiment-rust" }
lazy_static = "1.4.0"

[lib]
crate-type = ["cdylib"]


Enter fullscreen mode Exit fullscreen mode

Now we need to add the code for sentimentable, so we'll navigate to the src directory in our workdir and locate the lib.rs file. In the lib.rs file, we'll replace the existing contents with the following code:



wit_bindgen_rust::export!("sentimentable.wit");
use crate::sentimentable::PolarityScores;
struct Sentimentable;
impl sentimentable::Sentimentable for Sentimentable {

    fn sentimentable(input: String) -> Vec<PolarityScores> {
        lazy_static::lazy_static! {
            static ref ANALYZER: vader_sentiment::SentimentIntensityAnalyzer<'static> =
                vader_sentiment::SentimentIntensityAnalyzer::new();
        }

        let scores = ANALYZER.polarity_scores(input.as_str());
        vec![PolarityScores {
            compound: scores["compound"],
            positive: scores["pos"],
            negative: scores["neg"],
            neutral: scores["neu"],
        }]
    }
}


Enter fullscreen mode Exit fullscreen mode

Our code uses VADER (Valence Aware Dictionary and sEntiment Reasoner). VADER is a lexicon and rule-based sentiment analysis tool that can interpret and classify emotions.

Next, we'll go back up one directory level:



cd ..


Enter fullscreen mode Exit fullscreen mode

We'll now build the Wasm module:



cargo build --target wasm32-wasi --release


Enter fullscreen mode Exit fullscreen mode

A new Wasm file should be written to target/wasm32-wasi/release/sentimentable.wasm.

Deploy

From SingleStoreDB Cloud, we'll create a new database:



CREATE DATABASE IF NOT EXISTS demo;


Enter fullscreen mode Exit fullscreen mode

Next, from the command line, we'll use the pushwasm tool to push our Wasm module into SingleStoreDB, as follows:



pushwasm tvf --force --conn "mysql://admin:<password>@<host>:3306/demo" --wit ./sentimentable.wit --wasm ./target/wasm32-wasi/release/sentimentable.wasm --name sentimentable


Enter fullscreen mode Exit fullscreen mode

We'll replace the <password> and <host> with the values from our SingleStoreDB Cloud account.

After a short time, we should see the following message:



Wasm function was created successfully.


Enter fullscreen mode Exit fullscreen mode

From SingleStoreDB Cloud, we can also check that the function was created, as follows:



USE demo;

SHOW FUNCTIONS;


Enter fullscreen mode Exit fullscreen mode

Run in the database

We can quickly test our function, as follows:



SELECT * FROM sentimentable('The movie was great');


Enter fullscreen mode Exit fullscreen mode

The result should be similar to the following:



+------------------+------------------+----------+-----------------+
| compound         | positive         | negative | neutral         |
+------------------+------------------+----------+-----------------+
| 0.62489332693894 | 0.57746478873239 |        0 | 0.4225352112676 |
+------------------+------------------+----------+-----------------+
1 row in set (0.10 sec)


Enter fullscreen mode Exit fullscreen mode

VADER can consider capitalisation, so we can try:



SELECT * FROM sentimentable('The movie was GREAT!');


Enter fullscreen mode Exit fullscreen mode

The result should be similar to the following:



+------------------+------------------+----------+-----------------+
| compound         | positive         | negative | neutral         |
+------------------+------------------+----------+-----------------+
| 0.72902590497990 | 0.63076923076923 |        0 | 0.3692307692307 |
+------------------+------------------+----------+-----------------+
1 row in set (0.11 sec)


Enter fullscreen mode Exit fullscreen mode

We can see that the values have changed, showing a stronger positive sentiment expressed by capitalisation.

Bonus: Use the Wasm UDF with real sentiment data

We can try a larger-scale test of the Wasm UDF using real sentiment data. One popular dataset is the Large Movie Review Dataset:



@InProceedings{maas-EtAl:2011:ACL-HLT2011,
  author    = {Maas, Andrew L.  and  Daly, Raymond E.  and  Pham, Peter T.  and  Huang, Dan  and  Ng, Andrew Y.  and  Potts, Christopher},
  title     = {Learning Word Vectors for Sentiment Analysis},
  booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
  month     = {June},
  year      = {2011},
  address   = {Portland, Oregon, USA},
  publisher = {Association for Computational Linguistics},
  pages     = {142--150},
  url       = {http://www.aclweb.org/anthology/P11-1015}
}


Enter fullscreen mode Exit fullscreen mode

We'll download a compressed file from GitHub and extract the CSV file. The CSV file contains 25,000 rows consisting of two columns:

  1. Text
  2. Sentiment (0 = positive, 1 = negative)

In SingleStoreDB Cloud, we'll create a table:



USE demo;

CREATE TABLE imdb_reviews (
    text TEXT,
    sentiment INT
);


Enter fullscreen mode Exit fullscreen mode

We'll now connect using a MySQL Client:



mysql -u admin -h <host> -P 3306 --default-auth=mysql_native_password -p<password> --local-infile


Enter fullscreen mode Exit fullscreen mode

We'll replace the <host> and <password> with the values from our SingleStoreDB Cloud account.

Now we'll load the data into SingleStoreDB:



USE demo;

LOAD DATA LOCAL INFILE '/path/to/imdb-reviews.csv'
INTO TABLE imdb_reviews
IGNORE 1 LINES
COLUMNS TERMINATED BY ','
ENCLOSED BY '"';


Enter fullscreen mode Exit fullscreen mode

We'll replace /path/to/ with the actual path to the CSV file.

Once the data are loaded, we can run queries such as the following:



SELECT SUBSTRING(i.text, 1, 30) AS text,
    FORMAT(s.compound, 3) cpd,
    FORMAT(s.positive, 3) pos,
    FORMAT(s.negative, 3) neg,
    FORMAT(s.neutral, 3) ntrl
FROM imdb_reviews i, sentimentable(i.text) s
LIMIT 10;


Enter fullscreen mode Exit fullscreen mode

The result should be similar to the following:



+--------------------------------+--------+-------+-------+-------+
| text | cpd | pos | neg | ntrl |
+--------------------------------+--------+-------+-------+-------+
| This is an interesting left tu | 0.996 | 0.126 | 0.060 | 0.813 |
| I don't think any movie of Van | 0.976 | 0.242 | 0.105 | 0.653 |
| This has to be THE WORST film | -0.572 | 0.103 | 0.129 | 0.768 |
| When I saw this trailer on TV | 0.968 | 0.155 | 0.014 | 0.831 |
| As an adult I really did enjoy | 0.896 | 0.209 | 0.046 | 0.744 |
| Another example of the unique | 0.979 | 0.280 | 0.000 | 0.720 |
| This film was a waste of time, | 0.678 | 0.126 | 0.073 | 0.801 |
| I thought this was a splendid | 0.988 | 0.142 | 0.085 | 0.774 |
| I can only believe that Garson | 0.968 | 0.244 | 0.119 | 0.637 |
| Gods...where to start. I was o | -0.956 | 0.092 | 0.131 | 0.777 |
+--------------------------------+--------+-------+-------+-------+
10 rows in set (0.28 sec)

Enter fullscreen mode Exit fullscreen mode




Next steps

The Code Engine - Powered by Wasm website is a great place to learn more about Wasm support in SingleStoreDB.

SingleStore offers a free Wasm training course. The course will show the participant how to set up a development environment using a Container. The course will also guide the participant through various examples that are also available through the SingleStore Wasm Toolkit.

A Wasm tutorial is also available that works through several examples in various programming languages.

The SingleStore YouTube channel also contains a video about Code Engine Powered by Wasm from the Summer 2022 launch event.

Summary

Wasm support in SingleStoreDB Cloud is available today and can be used by anyone. SingleStore is one of the first database vendors to provide Wasm support. Through the worked example in this article, we have seen that Wasm provides great potential to extend the capabilities of the database engine in a safe and controlled way. It also opens up the possibilities to reuse code that developers may have built in other programming languages.

Contact us at SingleStore and let us know how you are using Wasm in SingleStoreDB. We'd love to hear about all the great use cases!

Acknowledgements

I thank my colleague Peter Vetere at SingleStore for his assistance and patience during the development of the code example in this article.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Terabox Video Player