1 Overview
1.1 Overview
Deploying GBase 8a MPP Cluster on high-configuration servers and NUMA (Non-Uniform Memory Access) architecture servers often results in underutilization of hardware resources when only one database instance is deployed per server. For example:
- When the server memory exceeds 300GB, a single database instance struggles to utilize all the memory effectively.
- When the server CPU has more than 40 logical cores, a single database instance cannot achieve linear performance improvement with an increase in cores.
- When using NUMA architecture servers with multiple NUMA nodes, frequent cross-node memory access by a single database instance leads to suboptimal performance.
- A single database instance cannot fully leverage the capabilities of new hardware such as SSDs and NVMe drives.
GBase 8a MPP Cluster V9.5.3 officially supports multi-instance deployment. Deploying multiple database instances on a single server addresses these resource utilization issues and improves cluster performance. In actual tests, multi-instance deployment on NUMA architecture and high-configuration servers significantly enhanced cluster performance, with an improvement of more than 1x over single-instance deployment.
This document introduces the installation process, configuration recommendations, and management methods for GBase 8a MPP Cluster V9.5.3 in multi-instance deployment scenarios.
1.2 Terminology
The terms used in this document are explained as follows:
Term | Meaning |
---|---|
Multi-instance | Multi-instance deployment refers to deploying multiple data cluster nodes on a single physical server. Each data cluster node is referred to as a database instance. |
NUMA | Non-Uniform Memory Access |
gcware node | Cluster management node, used for sharing information between gcluster nodes. |
gcluster node | Cluster scheduling node, responsible for SQL parsing, SQL optimization, distributed execution plan generation, and execution scheduling. |
data node | Data cluster node, also known as a gnode; the storage and computation unit of the cluster. |
2 Multi-Instance Installation Deployment
2.1 Deployment Plan
In a multi-instance deployment, GBase 8a MPP Cluster installs multiple data nodes on each server. Each data node must be configured with a unique IP address, and nodes are distinguished by these IP addresses. A maximum of one gcluster node and one gcware node can be installed per physical server.
Before deploying the cluster, the following tasks need to be planned and completed:
1) Evaluate and determine the number of instances to be deployed on each server:
- Based on the number of NUMA nodes, memory size, cluster size, and business scenarios (load), evaluate the number of database instances to be deployed on each server. It is generally recommended to deploy no more than 4 database instances on a physical server, with each instance having at least 64GB of available memory.
2) IP address resource planning:
- Apply for an IP address for each database instance for internal communication within the cluster. It is recommended to configure multiple network cards on the physical server and bind them in load-balancing mode.
3) Disk planning:
- Different database instances should use different disk groups for disk I/O isolation. For example, configure a RAID5 disk group for each database instance.
4) Determine cluster architecture:
- It is recommended to have an odd number of gcware nodes and gcluster nodes, with a maximum of one gcware node and one gcluster node per physical server. It is suggested to deploy gcware and gcluster nodes on a single NUMA node, separate from data nodes.
5) Ensure server and OS environment meet GBase 8a cluster installation requirements:
- Refer to the GBase 8a MPP Cluster product manual.
2.2 Cluster Installation
Example server IPs used in the installation below:
- Server 1:
  - IP1: 192.168.146.20
  - IP2: 192.168.146.40
- Server 2:
  - IP3: 192.168.146.21
  - IP4: 192.168.146.41
Current Cluster Version Limitations
- The installation package only checks for RedHat, SUSE, and CentOS systems. Other systems need manual adjustments to bypass related checks.
- Supports Python versions 2.6 and 2.7, not Python 3.
- A physical machine can have only one coordinator and one gcware, and they must share the same IP.
2.2.1 Configure Multiple IPs
For servers with multiple 10Gbps network cards, bind multiple network cards to ensure high network availability and maximum bandwidth.
For each database instance, configure an IP address on a single network card or bound network cards. Example configuration:
vim /etc/sysconfig/network-scripts/ifcfg-p6p2
In the interface configuration file, IPADDR1 is the first virtual IP address and NETMASK1 is the subnet mask for the first virtual IP; follow the same pattern (IPADDR2, NETMASK2, and so on) for subsequent virtual IPs. Each NETMASKn must match the NETMASK of the physical IP. If this parameter is omitted, the subnet allocated for the virtual IP might differ from that of the physical IP.
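A minimal sketch of the additions, using the Server 1 addresses from this document (the device name follows the example above; the 255.255.255.0 mask is an assumed value, use your network's actual mask):
# existing physical address
IPADDR=192.168.146.20
NETMASK=255.255.255.0
# added virtual address for the second instance
IPADDR1=192.168.146.40
NETMASK1=255.255.255.0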
- When multiple network cards are not bound, configure one network card for each instance, as shown in the example below:
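Under the same addressing assumptions, a sketch with one instance IP per physical interface (the interface names are illustrative):
/etc/sysconfig/network-scripts/ifcfg-p6p1:
IPADDR=192.168.146.20
NETMASK=255.255.255.0
/etc/sysconfig/network-scripts/ifcfg-p6p2:
IPADDR=192.168.146.40
NETMASK=255.255.255.0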
Restart the network service to apply changes:
service network restart
# or
systemctl restart network
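After the restart, you can confirm that the addresses are active on the interface (interface name as in the example above):
ip addr show p6p2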
2.2.2 Prepare for Installation
Follow the GBase 8a cluster installation steps as outlined in the product manual. Before formal installation:
- Create a gbase user on each server:
useradd gbase
passwd gbase
- Upload and extract the installation files:
tar xjf GBase8a_MPP_Cluster-NoLicense-9.5.3.17-redhat7.3-x86_64.tar.bz2
chown -R gbase:gbase gcinstall
- Copy and execute SetSysEnv.py on all servers to configure environment variables:
scp SetSysEnv.py root@192.168.146.21:/opt
python SetSysEnv.py --installPrefix=/opt --dbaUser=gbase
- Adjust the permissions of the installation path to allow the gbase user to write:
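For example, assuming the installation prefix is /opt as in the commands above:
chown gbase:gbase /opt
ls -l / | grep opt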
drwxr-x---. 6 gbase gbase 157 Jan 28 18:59 opt
- Modify the installation configuration file demo.options:
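A sketch of the key demo.options entries for the four-instance plan used in this document (parameter names follow the demo.options template shipped in the gcinstall directory; confirm them against your package and replace the passwords with real values):
installPrefix = /opt
# one coordinator and one gcware per physical server, sharing the same IP
coordinateHost = 192.168.146.20,192.168.146.21
gcwareHost = 192.168.146.20,192.168.146.21
# every database instance IP (two per physical server) is listed as a data node
dataHost = 192.168.146.20,192.168.146.40,192.168.146.21,192.168.146.41
dbaUser = gbase
dbaGroup = gbase
dbaPwd = 'gbase'
rootPwd = '<root password of the servers>'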
2.2.3 Execute Installation
As the gbase user, run the installation:
python gcinstall.py --silent=demo.options
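After the installer finishes, you can confirm that all instances were installed by switching to the gbase user and viewing the cluster state (gcadmin with no arguments prints the node list and status):
su - gbase
gcadmin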
2.2.4 Obtain License
If installing the no-license version, skip this section.
1) Fingerprint Collection:
Use any one instance IP from each multi-instance server to obtain the fingerprint file:
./gethostsid -n 192.168.146.20,192.168.146.21 -u gbase -p gbase -f hostsfingers.txt
2) Generate License:
Email the hostsfingers.txt file to the vendor to obtain the license.
3) Import License:
Import the license to all instances:
./License -n 192.168.146.20,192.168.146.21,192.168.146.40,192.168.146.41 -u gbase -p gbase -f gbase.lic
2.2.5 Cluster Initialization
Configure distribution and execute initialization as per the product manual.
gcadmin createvc vc.xml
gcadmin distribution gcChangeInfo.xml p 1 d 1 vc vc1
Ensure primary and standby data slices are on different physical servers for data high availability.
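To keep standby slices off the server that holds their primaries, the two instances of each physical server can be grouped into the same rack in gcChangeInfo.xml. A sketch for the two servers in this document (the rack/node layout follows the gcadmin distribution examples in the product manual; verify the element names against your version):
<?xml version="1.0" encoding="utf-8"?>
<servers>
  <rack>
    <node ip="192.168.146.20"/>
    <node ip="192.168.146.40"/>
  </rack>
  <rack>
    <node ip="192.168.146.21"/>
    <node ip="192.168.146.41"/>
  </rack>
</servers>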
2.3 NUMA Binding
For high-configuration servers with NUMA architecture, it is recommended to evenly allocate NUMA nodes to different GBase instances. For example, on a server with 8 NUMA nodes running two GBase instances, bind 4 NUMA nodes to each instance.
2.3.1 View NUMA Groups
For servers with NUMA architecture, you need to install numactl in advance on each server as follows:
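For example, on the RedHat/CentOS systems supported by the installer (the package name or installer may differ on SUSE):
yum -y install numactl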
Note: numactl must be installed; it is required when starting the cluster services.
Use the numastat command to view the NUMA groups of the server. The following examples show configurations for 4 NUMA nodes and 8 NUMA nodes. Depending on the number of NUMA nodes, you can allocate 1 NUMA node per instance (IP), 2 NUMA nodes per instance (IP), or 4 NUMA nodes per instance (IP), etc.
4 NUMA nodes:
8 NUMA nodes:
2.3.2 Bind GBase 8a Instances to NUMA Nodes
For servers that deploy only data nodes (GBase instances), it's recommended to distribute the server's NUMA nodes (CPU and memory) evenly among the data nodes. For servers that deploy both data nodes (GBase instances) and gcluster/gcware nodes, it's advisable to deploy the gcluster and gcware nodes on the same NUMA node.
The binding relationship between GBase 8a instances and NUMA nodes is configured by adjusting the gcluster_services script. After multiple instances are installed, the cluster service startup command gcluster_services resolves to the gcluster_services script under one of the instances (gnode or gcluster), so you can add the binding commands to the gcluster_services file of a specific instance. For example, modify the gcluster_services file under IP/gnode/server/bin to add the bindings.
There are two methods to start the database service thereafter:
Method 1: Use the modified gcluster_services file every time you start the database service.
cd IP/gnode/server/bin
./gcluster_services all start
Method 2: Copy the modified gcluster_services file over the gcluster_services files under all instances (IP/gnode/server/bin/gcluster_services and IP/gcluster/server/bin/gcluster_services). Subsequently, use the regular cluster startup command:
gcluster_services all start
Example: For a server with 8 NUMA nodes, where each instance is bound to 4 NUMA nodes:
Choose any instance's gnode/server/bin/gcluster_services file and modify three sections:
1) Original section around line 410:
Modify to:
Note:
- numactl --membind=nodes program (nodes specifies the NUMA nodes from which memory is allocated, e.g., 0, 1, or other node numbers; program can be an absolute path or a service startup script)
- numactl --cpunodebind=nodes program (nodes specifies the NUMA nodes whose CPUs the program is bound to, followed by the program)
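The script contents at the referenced lines vary between versions, so they are not reproduced here; each change follows the same pattern: prefix the daemon startup command with the numactl binding chosen for that instance. A hypothetical sketch (the variable name and node numbers are illustrative, not the actual script contents):
# original (illustrative): the gbased daemon is started directly
"$BIN_PATH"/gbased &
# modified (illustrative): bind this instance's CPU and memory to NUMA nodes 0-3
numactl --cpunodebind=0-3 --membind=0-3 "$BIN_PATH"/gbased &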
2) Original section around line 500: the code executed for gcluster_services all start.
Modify to:
3) Original section around line 450: the code executed for gcluster_services gbase|syncserver start.
Modify to:
After making these changes to the files above, restart the cluster service:
cd IP/gnode/server/bin
./gcluster_services all start
You can verify the NUMA binding effect using the following command:
numastat `pidof gbased`
For example, to view the effect on a server with 2 NUMA nodes running 2 instances, with 1 NUMA node bound per instance:
[root@pst_w61 config]$ numastat `pidof gbased`