Introduction to GBase 8a MPP Cluster Data Migration

Cong Li - Sep 30 - - Dev Community

GBase 8a MPP Cluster provides multiple data migration tools that are easy to use, catering to different scenario needs. Today, we will introduce a few methods for data migration.

1. Database Object Structure Export

Overview

The gcdump tool allows you to export the structure of database objects, including:

  • Table structures
  • Stored procedures
  • Custom functions

Description

The gcdump tool is located in the $GCLUSTER_HOME/bin directory. The parameter gbase_show_ident_case_sensitive controls whether column names are case-sensitive. By default, the case matches the source table's structure.

Syntax:

gcdump [OPTIONS] database [tables]
gcdump [OPTIONS] --databases [OPTIONS] DB1 [DB2 DB3...]
gcdump [OPTIONS] --all-databases [OPTIONS]
Enter fullscreen mode Exit fullscreen mode

Parameter Explanation

Parameter Name Description
-u Cluster login username
-p Cluster login password
-R Export stored procedures and functions
-B Export multiple databases
-W Specify VC name

Example:

$ $GCLUSTER_BASE/server/bin/gcdump -uroot -p****** -B -R ssbm > /home/gbase/ssbm.sql
Enter fullscreen mode Exit fullscreen mode

2. Data Loading

Overview

GBase 8a MPP Cluster provides a SQL interface for loading data, supporting:

  • Local file loading
  • Data fetching from a general data server
  • Multiple protocols such as FTP, HTTP, HDFS, SFTP
  • Parallel loading to maximize performance
  • Loading of plain text, gzip, snappy, and lzo compressed files
  • Real-time loading status and information query
  • Error tracing for identifying problematic data in the source file
  • Scalable loading performance as the cluster expands

Syntax:

LOAD DATA INFILE 'file_list' INTO TABLE [vcname.] [dbname.]tbl_name [options]
Enter fullscreen mode Exit fullscreen mode

Explanation of file_list

  • Cluster Local Data Source Loading:

    1. Supports loading from local files on multiple data nodes. Use the format file://host+abs_path, separating multiple files with commas.
    2. Supports concurrent loading from files on each node. Use file://+abs_path, separating multiple files with commas.
  • Loading from General File Servers:

    1. Set up ftp/http/hdfs/sftp services on the file server and place the data files in the configured path.
    2. Use a URL to specify the file path on the server, using commas to separate multiple files or directories.

Examples

LOAD DATA INFILE 'ftp://gbase:gbase@127.0.0.1/data/a.tbl' INTO TABLE test.t DATA_FORMAT 3;
LOAD DATA INFILE 'http://127.0.0.1/data/b.tbl.gz' INTO TABLE test.t DATA_FORMAT 3;
LOAD DATA INFILE 'hdp://gbase@127.0.0.1:50070/data/a.tbl.snappy' INTO TABLE test.t DATA_FORMAT 3;
LOAD DATA INFILE 'ftp://192.168.0.1/pub/lineitem.tbl,http://192.168.0.2/lineitem.tbl' INTO TABLE test.lineitem FIELDS TERMINATED BY '|' ENCLOSED BY '"' LINES TERMINATED BY '\n';
Enter fullscreen mode Exit fullscreen mode

Monitoring Load Status

gbase> use information_schema;
gbase> select * from load_status;
gbase> show load logs 100 limit 1,5;
gbase> show gcluster load logs 101;
gbase> select * from information_schema.cluster_load_result;
Enter fullscreen mode Exit fullscreen mode

3. Data Export

Overview

GBase 8a MPP Cluster provides data export functionality through the SQL syntax SELECT ... INTO OUTFILE .... It supports:

  • Exporting data to the cluster's server, an FTP/SFTP server, or a Hadoop cluster, in text or compressed formats (gzip/snappy/lzo)
  • Remote export of data from the cluster to the client's machine in text format

Syntax

SELECT ... INTO OUTFILE 'file_name' [OPTION] FROM ...;
SELECT... FROM... INTO OUTFILE 'file_name' [OPTION];
rmt:SELECT... FROM... INTO OUTFILE 'file_path' [OUTFILE_OPTION];
Enter fullscreen mode Exit fullscreen mode

Example

gbase> select * from aa into outfile '/home/davies/out.txt' fields escaped by '' terminated by '|' double_enclosed by '"' null_value 'null';
Enter fullscreen mode Exit fullscreen mode

4. Data Migration

4.1 orato8a Data Extraction Tool

Overview

orato8a is a dedicated tool for quickly and efficiently extracting data from Oracle databases, saving the data to files or directly migrating it to GBase 8a MPP Cluster. It supports both query-based export and full table export.

Deployment

orato8a is a standalone tool that must be deployed on a machine that can access Oracle, such as a machine with an Oracle client or the Oracle server itself.

Syntax

./orato8a parameter_1 parameter_2 ... parameter_n
Enter fullscreen mode Exit fullscreen mode

Example

$ ./orato8a --user='ct1/ct1ct1@orcl' --query="select LO_ORDERKEY, LO_LINENUMBER FROM lineorder_test" --file='/opt/orato8a_output/lineorder.txt ' --field=";" --format=3
Enter fullscreen mode Exit fullscreen mode

4.2 db2to8a Data Extraction Tool

Overview

db2to8a is a dedicated tool for quickly and efficiently extracting data from DB2 databases and saving it to a specified file.

Deployment

db2to8a must be deployed on a machine that can access DB2, such as a DB2 client or the DB2 server itself.

Syntax

./db2to8a parameter_1 parameter_2 ... parameter_n
Enter fullscreen mode Exit fullscreen mode

Example

$ ./db2to8a -D'test' -u'db2inst1' -p'db2inst1' -q"select * from t" -f'data1.txt' -m'3' -e'|' -l'\n' -s'h'
Enter fullscreen mode Exit fullscreen mode

That's all for today. Thank you for reading!

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Terabox Video Player