Quick tip: Using the built-in notebooks with SingleStoreDB

Akmal Chaudhri - May 20 '23 - - Dev Community

Abstract

SingleStoreDB Cloud now supports notebooks. These provide a convenient way to perform Data Science directly with SingleStoreDB without installing external tools. In this short article, we'll use a simple example to demonstrate some of the benefits of this new built-in integration.

The notebook file used in this article is available on GitHub.

Introduction

Jupyter-style notebooks are a great interactive way to test code, analyse data and visualise results. In this short article, we'll demonstrate some of the capabilities of the SingleStoreDB Cloud built-in notebooks environment using the Iris flower data set.

Create a SingleStoreDB Cloud account

A previous article showed the steps to create a free SingleStoreDB Cloud account. We'll use the following settings:

  • Workspace Group Name: Iris Demo Group
  • Cloud Provider: AWS
  • Region: US East 1 (N. Virginia)
  • Workspace Name: iris-demo
  • Size: S-00
  • Advanced Settings: MarTech Application deselected

New notebook

From the left navigation pane, we'll select DEVELOP > Data Studio.

In the top right of the web page, we'll select New Notebook > New Notebook, as shown in Figure 1.

Figure 1. New Notebook.

Figure 1. New Notebook.

We'll call the notebook iris_demo, and select a Blank notebook template from the available options.

Fill out the notebook

Create Table

First, we'll create a new table, as follows:

CREATE DATABASE IF NOT EXISTS iris_db;

USE iris_db;

DROP TABLE IF EXISTS iris;
CREATE TABLE IF NOT EXISTS iris (
    sepal_length FLOAT,
    sepal_width FLOAT,
    petal_length FLOAT,
    petal_width FLOAT,
    species VARCHAR(20)
);
Enter fullscreen mode Exit fullscreen mode

Load Data

We'll now load the data into the table, as follows:

USE iris_db;
INSERT INTO iris VALUES
(5.1,3.5,1.4,0.2,'Iris-setosa'),
(4.9,3,1.4,0.2,'Iris-setosa'),
(4.7,3.2,1.3,0.2,'Iris-setosa'),
(4.6,3.1,1.5,0.2,'Iris-setosa'),
(5,3.6,1.4,0.2,'Iris-setosa'),
(5.4,3.9,1.7,0.4,'Iris-setosa'),
(4.6,3.4,1.4,0.3,'Iris-setosa'),
(5,3.4,1.5,0.2,'Iris-setosa'),
(4.4,2.9,1.4,0.2,'Iris-setosa'),
(4.9,3.1,1.5,0.1,'Iris-setosa'),
(5.4,3.7,1.5,0.2,'Iris-setosa'),
(4.8,3.4,1.6,0.2,'Iris-setosa'),
(4.8,3,1.4,0.1,'Iris-setosa'),
(4.3,3,1.1,0.1,'Iris-setosa'),
(5.8,4,1.2,0.2,'Iris-setosa'),
(5.7,4.4,1.5,0.4,'Iris-setosa'),
(5.4,3.9,1.3,0.4,'Iris-setosa'),
(5.1,3.5,1.4,0.3,'Iris-setosa'),
(5.7,3.8,1.7,0.3,'Iris-setosa'),
(5.1,3.8,1.5,0.3,'Iris-setosa'),
(5.4,3.4,1.7,0.2,'Iris-setosa'),
(5.1,3.7,1.5,0.4,'Iris-setosa'),
(4.6,3.6,1,0.2,'Iris-setosa'),
(5.1,3.3,1.7,0.5,'Iris-setosa'),
(4.8,3.4,1.9,0.2,'Iris-setosa'),
(5,3,1.6,0.2,'Iris-setosa'),
(5,3.4,1.6,0.4,'Iris-setosa'),
(5.2,3.5,1.5,0.2,'Iris-setosa'),
(5.2,3.4,1.4,0.2,'Iris-setosa'),
(4.7,3.2,1.6,0.2,'Iris-setosa'),
(4.8,3.1,1.6,0.2,'Iris-setosa'),
(5.4,3.4,1.5,0.4,'Iris-setosa'),
(5.2,4.1,1.5,0.1,'Iris-setosa'),
(5.5,4.2,1.4,0.2,'Iris-setosa'),
(4.9,3.1,1.5,0.1,'Iris-setosa'),
(5,3.2,1.2,0.2,'Iris-setosa'),
(5.5,3.5,1.3,0.2,'Iris-setosa'),
(4.9,3.1,1.5,0.1,'Iris-setosa'),
(4.4,3,1.3,0.2,'Iris-setosa'),
(5.1,3.4,1.5,0.2,'Iris-setosa'),
(5,3.5,1.3,0.3,'Iris-setosa'),
(4.5,2.3,1.3,0.3,'Iris-setosa'),
(4.4,3.2,1.3,0.2,'Iris-setosa'),
(5,3.5,1.6,0.6,'Iris-setosa'),
(5.1,3.8,1.9,0.4,'Iris-setosa'),
(4.8,3,1.4,0.3,'Iris-setosa'),
(5.1,3.8,1.6,0.2,'Iris-setosa'),
(4.6,3.2,1.4,0.2,'Iris-setosa'),
(5.3,3.7,1.5,0.2,'Iris-setosa'),
(5,3.3,1.4,0.2,'Iris-setosa'),
(7,3.2,4.7,1.4,'Iris-versicolor'),
(6.4,3.2,4.5,1.5,'Iris-versicolor'),
(6.9,3.1,4.9,1.5,'Iris-versicolor'),
(5.5,2.3,4,1.3,'Iris-versicolor'),
(6.5,2.8,4.6,1.5,'Iris-versicolor'),
(5.7,2.8,4.5,1.3,'Iris-versicolor'),
(6.3,3.3,4.7,1.6,'Iris-versicolor'),
(4.9,2.4,3.3,1,'Iris-versicolor'),
(6.6,2.9,4.6,1.3,'Iris-versicolor'),
(5.2,2.7,3.9,1.4,'Iris-versicolor'),
(5,2,3.5,1,'Iris-versicolor'),
(5.9,3,4.2,1.5,'Iris-versicolor'),
(6,2.2,4,1,'Iris-versicolor'),
(6.1,2.9,4.7,1.4,'Iris-versicolor'),
(5.6,2.9,3.6,1.3,'Iris-versicolor'),
(6.7,3.1,4.4,1.4,'Iris-versicolor'),
(5.6,3,4.5,1.5,'Iris-versicolor'),
(5.8,2.7,4.1,1,'Iris-versicolor'),
(6.2,2.2,4.5,1.5,'Iris-versicolor'),
(5.6,2.5,3.9,1.1,'Iris-versicolor'),
(5.9,3.2,4.8,1.8,'Iris-versicolor'),
(6.1,2.8,4,1.3,'Iris-versicolor'),
(6.3,2.5,4.9,1.5,'Iris-versicolor'),
(6.1,2.8,4.7,1.2,'Iris-versicolor'),
(6.4,2.9,4.3,1.3,'Iris-versicolor'),
(6.6,3,4.4,1.4,'Iris-versicolor'),
(6.8,2.8,4.8,1.4,'Iris-versicolor'),
(6.7,3,5,1.7,'Iris-versicolor'),
(6,2.9,4.5,1.5,'Iris-versicolor'),
(5.7,2.6,3.5,1,'Iris-versicolor'),
(5.5,2.4,3.8,1.1,'Iris-versicolor'),
(5.5,2.4,3.7,1,'Iris-versicolor'),
(5.8,2.7,3.9,1.2,'Iris-versicolor'),
(6,2.7,5.1,1.6,'Iris-versicolor'),
(5.4,3,4.5,1.5,'Iris-versicolor'),
(6,3.4,4.5,1.6,'Iris-versicolor'),
(6.7,3.1,4.7,1.5,'Iris-versicolor'),
(6.3,2.3,4.4,1.3,'Iris-versicolor'),
(5.6,3,4.1,1.3,'Iris-versicolor'),
(5.5,2.5,4,1.3,'Iris-versicolor'),
(5.5,2.6,4.4,1.2,'Iris-versicolor'),
(6.1,3,4.6,1.4,'Iris-versicolor'),
(5.8,2.6,4,1.2,'Iris-versicolor'),
(5,2.3,3.3,1,'Iris-versicolor'),
(5.6,2.7,4.2,1.3,'Iris-versicolor'),
(5.7,3,4.2,1.2,'Iris-versicolor'),
(5.7,2.9,4.2,1.3,'Iris-versicolor'),
(6.2,2.9,4.3,1.3,'Iris-versicolor'),
(5.1,2.5,3,1.1,'Iris-versicolor'),
(5.7,2.8,4.1,1.3,'Iris-versicolor'),
(6.3,3.3,6,2.5,'Iris-virginica'),
(5.8,2.7,5.1,1.9,'Iris-virginica'),
(7.1,3,5.9,2.1,'Iris-virginica'),
(6.3,2.9,5.6,1.8,'Iris-virginica'),
(6.5,3,5.8,2.2,'Iris-virginica'),
(7.6,3,6.6,2.1,'Iris-virginica'),
(4.9,2.5,4.5,1.7,'Iris-virginica'),
(7.3,2.9,6.3,1.8,'Iris-virginica'),
(6.7,2.5,5.8,1.8,'Iris-virginica'),
(7.2,3.6,6.1,2.5,'Iris-virginica'),
(6.5,3.2,5.1,2,'Iris-virginica'),
(6.4,2.7,5.3,1.9,'Iris-virginica'),
(6.8,3,5.5,2.1,'Iris-virginica'),
(5.7,2.5,5,2,'Iris-virginica'),
(5.8,2.8,5.1,2.4,'Iris-virginica'),
(6.4,3.2,5.3,2.3,'Iris-virginica'),
(6.5,3,5.5,1.8,'Iris-virginica'),
(7.7,3.8,6.7,2.2,'Iris-virginica'),
(7.7,2.6,6.9,2.3,'Iris-virginica'),
(6,2.2,5,1.5,'Iris-virginica'),
(6.9,3.2,5.7,2.3,'Iris-virginica'),
(5.6,2.8,4.9,2,'Iris-virginica'),
(7.7,2.8,6.7,2,'Iris-virginica'),
(6.3,2.7,4.9,1.8,'Iris-virginica'),
(6.7,3.3,5.7,2.1,'Iris-virginica'),
(7.2,3.2,6,1.8,'Iris-virginica'),
(6.2,2.8,4.8,1.8,'Iris-virginica'),
(6.1,3,4.9,1.8,'Iris-virginica'),
(6.4,2.8,5.6,2.1,'Iris-virginica'),
(7.2,3,5.8,1.6,'Iris-virginica'),
(7.4,2.8,6.1,1.9,'Iris-virginica'),
(7.9,3.8,6.4,2,'Iris-virginica'),
(6.4,2.8,5.6,2.2,'Iris-virginica'),
(6.3,2.8,5.1,1.5,'Iris-virginica'),
(6.1,2.6,5.6,1.4,'Iris-virginica'),
(7.7,3,6.1,2.3,'Iris-virginica'),
(6.3,3.4,5.6,2.4,'Iris-virginica'),
(6.4,3.1,5.5,1.8,'Iris-virginica'),
(6,3,4.8,1.8,'Iris-virginica'),
(6.9,3.1,5.4,2.1,'Iris-virginica'),
(6.7,3.1,5.6,2.4,'Iris-virginica'),
(6.9,3.1,5.1,2.3,'Iris-virginica'),
(5.8,2.7,5.1,1.9,'Iris-virginica'),
(6.8,3.2,5.9,2.3,'Iris-virginica'),
(6.7,3.3,5.7,2.5,'Iris-virginica'),
(6.7,3,5.2,2.3,'Iris-virginica'),
(6.3,2.5,5,1.9,'Iris-virginica'),
(6.5,3,5.2,2,'Iris-virginica'),
(6.2,3.4,5.4,2.3,'Iris-virginica'),
(5.9,3,5.1,1.8,'Iris-virginica');
Enter fullscreen mode Exit fullscreen mode

Install Libraries

Next, we'll install some libraries, as follows:

!pip install matplotlib --quiet
!pip install scikit-learn --quiet
!pip install seaborn --quiet
Enter fullscreen mode Exit fullscreen mode

Connect to Database

We'll now connect to our database, as follows:

from sqlalchemy import *

db_connection = create_engine(connection_url)
Enter fullscreen mode Exit fullscreen mode

There are no complicated connection strings. The connection_url contains everything we need.

Read from Database

We'll now read the table contents into a Pandas Dataframe, as follows:

iris_df = pd.read_sql_query(
    "SELECT * FROM iris",
    db_connection
)
Enter fullscreen mode Exit fullscreen mode

Perform Data Analysis

We can now perform some Exploratory Data Analysis (EDA), such as:

# https://plotly.com/python/pca-visualization/

X = iris_df[[
    "sepal_length",
    "sepal_width",
    "petal_length",
    "petal_width"
]]

pca = PCA(n_components = 2)
components = pca.fit_transform(X)

pca_fig = px.scatter(
    components,
    x = 0,
    y = 1,
    color = iris_df["species"]
)

pca_fig.show()
Enter fullscreen mode Exit fullscreen mode

The output should be similar to Figure 2.

Figure 2. PCA Visualisation.

Figure 2. PCA Visualisation.

And further analysis, as follows:

# More about correlations:
# https://www.w3schools.com/python/pandas/pandas_correlations.asp

sns.heatmap(
    iris_df.corr(numeric_only = True),
    cmap = "OrRd",
    annot = True
)

plt.title("Correlations")
plt.plot()
Enter fullscreen mode Exit fullscreen mode

The output should be similar to Figure 3.

Figure 3. Correlations.

Figure 3. Correlations.

The SingleStoreDB Cloud notebook environment provides many more capabilities. Check out the documentation for further details.

Summary

This short article used one approach to load data into SingleStoreDB Cloud. We also performed some data analysis of the data stored in our database, and created several visualisations. All of this was achieved using the built-in notebook environment.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Terabox Video Player