Introduction
This article introduces the configuration of the GBFS dedicated file server for GBase 8a MPP Cluster data loading and the methods for monitoring load status.
Related References:
- FTP File Server Configuration
- HTTP File Server Configuration
- HDFS Server Configuration
- SFTP Server Configuration
1. Introduction to GBFS Dedicated File Server
The GBFS dedicated file server is a binary executable program designed specifically for data loading in GBase 8a MPP Cluster databases. It is typically provided to users in the form of a package, such as gbfs-9.5.3.22-redhat7.3.tar.bz2. Users can simply extract this package using the following command and run it directly.
Using gbfs-9.5.3.22-redhat7.3.tar.bz2 as an example:
# tar xvf gbfs-9.5.3.22-redhat7.3.tar.bz2
After extraction, a gbfs folder will be created in the current directory, containing the main gbfs program and BUILDINFO (compilation information). You can view the help information of the gbfs program using the command gbfs -?.
[root@rhel73-1 gbfs]# ./gbfs -?
./gbfs ver 9.5.3.22.126635 for unknown-linux-gnu on x86_64
Copyright 2004-2021 General Data Technology Co.Ltd.
GBase File Server
Usage: ./gbfs [OPTIONS]
-V, --version Get version info.
-?, --help Get help info.
-P, --port Port number to use for connection or 6666 for default,
valid range: [1025,65535] order of preference.
-H, --home-dir The GBase file server home dir, default: current user home dir.
-L, --log-dir The GBase file server logs dir, default: /tmp/.
The help information includes the version and usage of gbfs. The parameters are introduced as follows:
- -P & --port: The port number for the GBFS server to listen on. The default is 6666.
- -H & --home-dir: The GBFS home directory, similar to the FTP home directory feature. The default is the home directory of the current user starting the server. This parameter is mainly used to support the relative path feature of gbfs.
  For example: If the gbase user is running the server, the default GBFS home directory would be /home/gbase/. If the user data is stored under /home/gbase/data/, the user can use the following URL to load the file:
  gbfs://192.168.146.20/data/test.tbl
  The equivalent absolute path URL would be:
  gbfs://192.168.146.20//home/gbase/data/test.tbl
  Users can configure this parameter according to their actual scenarios.
- -L & --log-dir: The directory for storing GBFS log files. After the GBFS server starts, it will create a gbfs_port.log file in this directory. The default directory is /tmp/.
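The relative and absolute URL forms described for the home directory can be sketched as a small shell function. This is an illustrative helper, not part of gbfs: the name gbfs_resolve_path is hypothetical, and the rule it encodes (a single slash after the host means a path relative to the home directory, a double slash means an absolute path) is inferred from the two example URLs.

```shell
# Hypothetical helper: map a gbfs:// URL to the file path on the server,
# following the relative/absolute convention shown in the examples.
# Arguments: <gbfs URL> <GBFS home dir>
gbfs_resolve_path() (
    url=$1
    home_dir=$2
    rest=${url#gbfs://}      # drop the scheme, leaving host/path
    file_path=${rest#*/}     # drop everything up to the first slash
    case $file_path in
        /*) printf '%s\n' "$file_path" ;;                       # absolute path
        *)  printf '%s/%s\n' "${home_dir%/}" "$file_path" ;;    # relative to home dir
    esac
)

gbfs_resolve_path 'gbfs://192.168.146.20/data/test.tbl' '/home/gbase/'
gbfs_resolve_path 'gbfs://192.168.146.20//home/gbase/data/test.tbl' '/home/gbase/'
# Both calls print /home/gbase/data/test.tbl
```

The function runs in a subshell body so its variables do not leak into the calling shell.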
It is recommended to run the GBFS dedicated file server in the background:
[gbase@rhel73-1 gbfs]$ ./gbfs &
[1] 23302
[gbase@rhel73-1 gbfs]$ IPv6 is available.
gbfs is ready for connections. home dir:/home/gbase/, log dir:/tmp/, port:6666.
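If the defaults do not fit, the options listed in the help output can be passed explicitly. The following is a minimal sketch, assuming gbfs accepts option values as separate arguments (for example -P 6666); the help text lists the options but does not show the exact value syntax. The function name start_gbfs and the paths in the example call are hypothetical.

```shell
# Hypothetical wrapper: start gbfs in the background with explicit options.
# Arguments: <gbfs binary> <port> <home dir> <log dir>
start_gbfs() {
    gbfs_bin=$1
    gbfs_port=$2
    gbfs_home=$3
    gbfs_log_dir=$4

    # Create the log directory if it does not exist yet.
    mkdir -p "$gbfs_log_dir" || return 1

    # nohup keeps the server running after the terminal is closed.
    # Assumption: option values are passed as separate arguments.
    nohup "$gbfs_bin" -P "$gbfs_port" -H "$gbfs_home" -L "$gbfs_log_dir" \
        >/dev/null 2>&1 &

    # Keep the PID so the server can be stopped later with kill.
    GBFS_PID=$!
}

# Example call (paths are placeholders):
# start_gbfs ./gbfs 6666 /home/gbase/data /home/gbase/gbfs_logs
```

Because standard output is discarded here, the "ready for connections" message is not shown on the terminal; check the log directory instead.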
Example:
To load the part.tbl file located on the GBFS server, using the default line delimiter and | as the column delimiter:
gbase> load data infile 'gbfs://192.168.146.20//opt/ssbm/part.tbl' into table part data_format 3 FIELDS TERMINATED BY '|';
2. Load Status Monitoring
Function Description
After the load task starts, you can check the status information of the current load task through SQL.
Syntax Format
SELECT * FROM information_schema.load_status;
The status information table records the status information of all running load tasks.
Each field in the status information table is defined as follows:
Field Name | Description |
---|---|
SCN | SCN number |
DB_NAME | Database name |
TB_NAME | Table name |
IP | Load machine IP address |
STATE | Load state |
START_TIME | Load start time |
ELAPSED_TIME | Elapsed load time |
AVG_SPEED | Average load speed |
PROGRESS | Load progress |
TOTAL_SIZE | Total file size |
LOADED_SIZE | Amount of data loaded |
LOADED_RECORDS | Number of records loaded |
SKIPPED_RECORDS | Number of records skipped |
DATA_SOURCE | Data source |
SQL_CMD | Load task SQL |
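When several loads run at once, it can help to narrow the status query to one table and a few fields. The sketch below builds such a statement in shell. The function name load_status_query is hypothetical, the field names are taken from the field list above, and it is assumed (not confirmed here) that information_schema.load_status can be filtered with an ordinary WHERE clause. The database name ssbm in the example is a placeholder.

```shell
# Hypothetical helper: print a status query for a single table.
# Arguments: <database name> <table name>
# Note: the names are inserted as-is, so only pass trusted values.
load_status_query() (
    db_name=$1
    tb_name=$2
    printf "SELECT TB_NAME, STATE, PROGRESS, AVG_SPEED, LOADED_RECORDS, SKIPPED_RECORDS\n"
    printf "FROM information_schema.load_status\n"
    printf "WHERE DB_NAME = '%s' AND TB_NAME = '%s';\n" "$db_name" "$tb_name"
)

load_status_query ssbm part
```

The printed statement can be pasted into the gbase> prompt or passed to whichever SQL client you use.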
That's all for today. Thank you for reading!