GBase 8c Distributed Cluster TPC-C Performance Tuning Practical Guide

Cong Li - Aug 29

By optimizing the utilization of server, network, and other resources, the performance of GBase 8c database in TPC-C standard tests can be maximized. This article provides a practical guide to TPC-C performance tuning using GBase 8c V5 3.0.0 as an example.

1. Server Configuration

(1) Servers

  • Use 4 servers with IP addresses ranging from 10.100.100.1 to 10.100.100.4.

(2) Operating System Version

  • This example uses Kylin OS. Check the exact version with the cat /etc/os-release command:

    [root@GBase-NODE4 ~]# cat /etc/os-release
    NAME="Kylin Linux Advanced Server"
    VERSION="V10 (Sword)"
    ID="kylin"
    VERSION_ID="V10"
    PRETTY_NAME="Kylin Linux Advanced Server V10 (Sword)"
    ANSI_COLOR="0;31"
    

(3) CPU Information

  • Use the lscpu command to view CPU information:

    [gbase@GBase-NODE1 ~]$ lscpu 
    Architecture:                    aarch64
    CPU op-mode(s):                  64-bit
    Byte Order:                      Little Endian
    CPU(s):                          96
    On-line CPU(s) list:             0-95
    Thread(s) per core:              1
    Core(s) per socket:              48
    Socket(s):                       2
    NUMA node(s):                    4
    Vendor ID:                       HiSilicon
    Model:                           0
    Model name:                      Kunpeng-920
    Stepping:                        0x1
    CPU max MHz:                     2600.0000
    CPU min MHz:                     200.0000
    BogoMIPS:                        200.00
    L1d cache:                       6 MiB
    L1i cache:                       6 MiB
    L2 cache:                        48 MiB
    L3 cache:                        96 MiB
    NUMA node0 CPU(s):               0-23
    NUMA node1 CPU(s):               24-47
    NUMA node2 CPU(s):               48-71
    NUMA node3 CPU(s):               72-95
    
  • Key information includes CPU(s) and NUMA node(s). In this environment, each server has 96 CPU cores and 4 NUMA nodes.
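  • To cross-check the NUMA layout that lscpu reports, the topology can also be read with numactl or straight from sysfs. A minimal sketch (numactl is assumed to be installed; the node and CPU ranges above are specific to this environment):

    # Show NUMA nodes, their CPUs, and per-node memory (requires the numactl package).
    numactl --hardware

    # Or read the same CPU-to-node mapping from sysfs without extra tools:
    for node in /sys/devices/system/node/node[0-9]*; do
        echo "$(basename "$node"): CPUs $(cat "$node"/cpulist)"
    done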

(4) Memory Information

  • Use the free -h command to check available memory. It is recommended to reserve at least 16GB:

    [root@GBase-NODE4 ~]# free -h
                 total        used        free      shared  buff/cache   available
    Mem:          509Gi        18Gi       414Gi       5.1Gi        77Gi       415Gi
    Swap:         4.0Gi          0B       4.0Gi
    

(5) Storage Information

  • Use the lsscsi command to view storage information:

    [gbase@GBase-NODE1 ~]$ lsscsi 
    [5:0:65:0]   enclosu HUAWEI   Expander 12Gx16  131   -        
    [5:2:0:0]    disk    AVAGO    HW-SAS3508       5.06  /dev/sda 
    [5:2:1:0]    disk    AVAGO    HW-SAS3508       5.06  /dev/sdb
    
  • This shows two disks. In this environment, the operating system resides on /dev/sda, while the database data directory (/data) sits on the larger /dev/sdb.

  • Use the lsblk command to view the storage structure:

    [root@GBase-NODE4 ~]# lsblk 
    NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
    sda               8:0    0 557.9G  0 disk 
    ├─sda1            8:1    0   600M  0 part /boot/efi
    ├─sda2            8:2    0     1G  0 part /boot
    └─sda3            8:3    0 556.3G  0 part 
     ├─rootvg-root 252:0    0     5G  0 lvm  /
     ├─rootvg-swap 252:1    0     4G  0 lvm  [SWAP]
     ├─rootvg-usr  252:2    0    10G  0 lvm  /usr
     ├─rootvg-tmp  252:4    0     4G  0 lvm  /tmp
     ├─rootvg-opt  252:5    0     5G  0 lvm  /opt
     ├─rootvg-var  252:6    0     5G  0 lvm  /var
     └─rootvg-home 252:7    0    10G  0 lvm  /home
    sdb               8:16   0  17.5T  0 disk 
    └─datavg-datalv 252:3    0  17.5T  0 lvm  /data
    

(6) Network Information

  • Use the ip addr command to check network information:
[gbase@GBase-NODE1 ~]$ ip addr 
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
   link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
   inet 127.0.0.1/8 scope host lo
      valid_lft forever preferred_lft forever
   inet6 ::1/128 scope host 
      valid_lft forever preferred_lft forever
2: enp125s0f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
   link/ether 44:a1:91:9f:0c:02 brd ff:ff:ff:ff:ff:ff
3: enp125s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
   link/ether 44:a1:91:9f:0c:03 brd ff:ff:ff:ff:ff:ff
4: enp125s0f2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
   link/ether 44:a1:91:9f:0c:04 brd ff:ff:ff:ff:ff:ff
5: enp125s0f3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
   link/ether 44:a1:91:9f:0c:05 brd ff:ff:ff:ff:ff:ff
6: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master mgmt state UP group default qlen 1000
   link/ether 44:a1:91:3a:5a:f9 brd ff:ff:ff:ff:ff:ff
7: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master mgmt state UP group default qlen 1000
   link/ether 44:a1:91:3a:5a:f9 brd ff:ff:ff:ff:ff:ff
8: enp5s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
   link/ether 44:a1:91:3a:5a:fb brd ff:ff:ff:ff:ff:ff
9: enp6s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
   link/ether 44:a1:91:3a:5a:fc brd ff:ff:ff:ff:ff:ff
10: enp131s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master app state UP group default qlen 1000
   link/ether 44:a1:91:3a:5c:49 brd ff:ff:ff:ff:ff:ff
11: enp132s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master app state UP group default qlen 1000
   link/ether 44:a1:91:3a:5c:49 brd ff:ff:ff:ff:ff:ff
12: enp133s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
   link/ether 44:a1:91:3a:5c:4b brd ff:ff:ff:ff:ff:ff
13: enp134s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
   link/ether 44:a1:91:3a:5c:4c brd ff:ff:ff:ff:ff:ff
14: enp137s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
   link/ether 44:a1:91:3a:76:d9 brd ff:ff:ff:ff:ff:ff
   inet 192.20.122.1/24 brd 192.20.122.255 scope global noprefixroute enp137s0
      valid_lft forever preferred_lft forever
   inet6 fe80::939f:da48:c828:9e05/64 scope link noprefixroute 
      valid_lft forever preferred_lft forever
15: enp138s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
   link/ether 44:a1:91:3a:76:da brd ff:ff:ff:ff:ff:ff
   inet 192.20.123.1/24 brd 192.20.123.255 scope global noprefixroute enp138s0
      valid_lft forever preferred_lft forever
   inet6 fe80::98c5:33ca:65e7:fb68/64 scope link noprefixroute 
      valid_lft forever preferred_lft forever
……
  • Identify the 10G NICs, enp137s0 and enp138s0. Check details with the ethtool command:
[gbase@GBase-NODE1 ~]$ ethtool enp137s0
Settings for enp137s0:
       Supported ports: [ FIBRE ]
       Supported link modes:   10000baseKR/Full 
                               25000baseCR/Full 
                               25000baseKR/Full 
       Supported pause frame use: Symmetric
       Supports auto-negotiation: No
       Supported FEC modes: Not reported
       Advertised link modes:  10000baseKR/Full 
                               25000baseCR/Full 
                               25000baseKR/Full 
       Advertised pause frame use: Symmetric
       Advertised auto-negotiation: No
       Advertised FEC modes: Not reported
       Speed: 10000Mb/s
       Duplex: Full
       Port: FIBRE
       PHYAD: 0
       Transceiver: internal
       Auto-negotiation: off
Cannot get wake-on-lan settings: Operation not permitted
       Current message level: 0x00000045 (69)
                              drv link rx_err
       Link detected: yes
[gbase@GBase-NODE1 ~]$ ethtool enp138s0
Settings for enp138s0:
       Supported ports: [ FIBRE ]
       Supported link modes:   10000baseKR/Full 
                               25000baseCR/Full 
                               25000baseKR/Full 
       Supported pause frame use: Symmetric
       Supports auto-negotiation: No
       Supported FEC modes: Not reported
       Advertised link modes:  10000baseKR/Full 
                               25000baseCR/Full 
                               25000baseKR/Full 
       Advertised pause frame use: Symmetric
       Advertised auto-negotiation: No
       Advertised FEC modes: Not reported
       Speed: 10000Mb/s
       Duplex: Full
       Port: FIBRE
       PHYAD: 0
       Transceiver: internal
       Auto-negotiation: off
Cannot get wake-on-lan settings: Operation not permitted
       Current message level: 0x00000045 (69)
                              drv link rx_err
       Link detected: yes
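  • To quickly spot which interfaces are running at 10 Gb/s without reading the full ethtool output on every node, the negotiated speed can also be scanned from sysfs. A small sketch (interface names differ per host):

    # Print the link speed (Mb/s) of every interface that currently has a carrier.
    for nic in /sys/class/net/*; do
        speed=$(cat "$nic/speed" 2>/dev/null)
        [ -n "$speed" ] && [ "$speed" != "-1" ] || continue   # skip down/virtual interfaces
        printf '%-12s %s Mb/s\n' "$(basename "$nic")" "$speed"
    done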

2. Operating System Tuning

(1) Disable IRQ Balancing

service irqbalance stop
#service sysmonitor stop
service rsyslog stop
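On a systemd-based system such as Kylin V10, the same can be done with systemctl; disabling the irqbalance unit also keeps it from restarting after a reboot (a sketch; keep rsyslog running if you still need local logging):

systemctl stop irqbalance rsyslog    # stop for the duration of the test
systemctl disable irqbalance         # prevent irqbalance from returning at boot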

(2) Disable NUMA Balancing

echo 0 > /proc/sys/kernel/numa_balancing
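The write to /proc lasts only until the next reboot. The same switch is exposed as a sysctl key, which makes it easy to persist (a sketch; the file name under /etc/sysctl.d is an arbitrary choice):

sysctl -w kernel.numa_balancing=0                                      # apply immediately
echo 'kernel.numa_balancing = 0' >> /etc/sysctl.d/99-gbase-tpcc.conf   # persist across reboots
sysctl --system                                                        # reload sysctl files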

(3) Disable Transparent Huge Pages

echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag
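These writes are also lost on reboot. One common way to make the setting permanent is to pass transparent_hugepage=never on the kernel command line; the grubby invocation below is an assumption and applies only to grubby-managed boot loaders such as the one shipped with Kylin V10:

grubby --update-kernel=ALL --args="transparent_hugepage=never"   # persist across reboots
cat /sys/kernel/mm/transparent_hugepage/enabled                  # expect: always madvise [never]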

(4) Optimize Services

tuned-adm profile throughput-performance
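To confirm that the profile took effect and that the running system still matches it:

tuned-adm active    # shows the currently active profile
tuned-adm verify    # checks current settings against the profile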

3. Network Configuration Tuning

(1) Database Cluster Network Adjustment

  • Since multiple NICs are present, it is recommended to separate the management and data interfaces:

    New Plan:

    • MGMT: For management
    • SERVICE: For data

    Adjusted yml Deployment File:

gha_server:
 - gha_server3:
     host: 10.100.100.1
     port: 20001
 - gha_server4:
      host: 10.100.100.2
      port: 20001
dcs:
 - host: 10.100.100.1
   port: 2379
 - host: 10.100.100.3
   port: 2379
 - host: 10.100.100.4
   port: 2379
gtm:
 - gtm3:
     host: 10.185.103.1
     agent_host: 10.100.100.1
     role: primary
     port: 6666
     agent_port: 8003
     work_dir: /data/mpp/gbase/data/gtm3
 - gtm4:
     host: 10.185.103.2
     agent_host: 10.100.100.2
     role: standby
     port: 6666
     agent_port: 8004
     work_dir: /data/mpp/gbase/data/gtm4
 - gtm5:
     host: 10.185.103.4
     agent_host: 10.100.100.4
     role: standby
     port: 6666
     agent_port: 8005
     work_dir: /data/mpp/gbase/data/gtm5
coordinator:
 - cn3:
     host: 10.185.103.1
     agent_host: 10.100.100.1
     role: primary
     port: 5432
     agent_port: 8008
     work_dir: /data/gbase/mpp/data/coord/cn3
 - cn4:
     host: 10.185.103.2
     agent_host: 10.100.100.2
     role: primary
     port: 5432
     agent_port: 8009
     work_dir: /data/gbase/mpp/data/coord/cn4
datanode:
 - dn1:
     - dn1_3:
         host: 10.185.103.1
         agent_host: 10.100.100.1
         role: primary
         port: 20010
         agent_port: 8032
         work_dir: /data/mpp/gbase/data/dn1/dn1_3
     - dn1_4:
         host: 10.185.103.2
         agent_host: 10.100.100.2
         role: standby
         port: 20010
         agent_port: 8032
         work_dir: /data/mpp/gbase/data/dn1/dn1_4
 - dn2:
     - dn2_3:
         host: 10.185.103.2
         agent_host: 10.100.100.2
         role: primary
         port: 20020
         agent_port: 8033
         work_dir: /data/mpp/gbase/data/dn2/dn2_3
         # numa:
         #   cpu_node_bind: 0,1
         #   mem_node_bind: 0,1
     - dn2_4:
         host: 10.185.103.1
          agent_host: 10.100.100.1
         role: standby
         port: 20020
         agent_port: 8033
         work_dir: /data/mpp/gbase/data/dn2/dn2_4
         # numa:
         #   cpu_node_bind: 2
         #   mem_node_bind: 2
 - dn3:
     - dn3_3:
         host: 10.185.103.3
         agent_host: 10.100.100.3
         role: primary
         port: 20030
         agent_port: 8033
         work_dir: /data/mpp/gbase/data/dn3/dn3_3
     - dn3_4:
         host: 10.185.103.4
         agent_host: 10.100.100.4
         role: standby
         port: 20030
         agent_port: 8033
         work_dir: /data/mpp/gbase/data/dn3/dn3_4
 - dn4:
     - dn4_3:
         host: 10.185.103.4
         agent_host: 10.100.100.4
         role: primary
         port: 20040
         agent_port: 8034
         work_dir: /data/mpp/gbase/data/dn4/dn4_3
     - dn4_4:
         host: 10.185.103.3
         agent_host: 10.100.100.3
         role: standby
         port: 20040
         agent_port: 8034
         work_dir: /data/mpp/gbase/data/dn4/dn4_4
env:
 # cluster_type allowed values: multiple-nodes, single-inst, default is multiple-nodes
 cluster_type: multiple-nodes
 pkg_path: /data/mpp/gbase/gbase_pkg_b78
 prefix: /data/mpp/gbase/gbase_db
 version: V5_S3.0.0B78
 user: gbase
 port: 22
# constant:
# virtual_ip: 100.0.1.254/24
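Before installing with this plan, it is worth confirming that every node can reach both the management subnet (the agent_host addresses, 10.100.100.x) and the data/service subnet (the host addresses, 10.185.103.x). A minimal sketch using the addresses from the yml above:

# Management (MGMT) plane
for ip in 10.100.100.1 10.100.100.2 10.100.100.3 10.100.100.4; do
    ping -c 1 -W 1 "$ip" >/dev/null && echo "MGMT $ip ok" || echo "MGMT $ip UNREACHABLE"
done

# Data (SERVICE) plane
for ip in 10.185.103.1 10.185.103.2 10.185.103.3 10.185.103.4; do
    ping -c 1 -W 1 "$ip" >/dev/null && echo "SERVICE $ip ok" || echo "SERVICE $ip UNREACHABLE"
done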

(2) Interrupt Core Binding

The argument 16 means that 16 of the 96 cores are used for NIC interrupt binding, i.e. the last 4 cores of each of the 4 NUMA nodes:

sudo bind_net_irq.sh 16

Interrupt core binding requires SMMU to be disabled. If SMMU is enabled in the BIOS, the script's built-in check exits with an error, so either disable SMMU in the BIOS or manually comment out that check, as shown below. The script is located at /data/mpp/install/gbase/app/bin/bind_net_irq.sh.

function check_os()
{
   echo "Platform number of cores:" $core_no
   if [ $numa_node -ne 4 ]
   then
        echo "Warning: Number of NUMA nodes is not matched, please Disable DIE interleave in BIOS"
   fi
#    smmu=$(ps aux|grep smmu)
#    if [[ $smmu == *"arm-smmu"* ]]
#    then
#        echo $smmu
#        echo "Error: SMMU enabled, please Disable SMMU in BIOS"
#        exit
#    fi
}
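After running the script, you can confirm that the NIC interrupts really landed on the intended cores by reading the affinity of each interrupt. A sketch for enp137s0 (repeat for enp138s0; this assumes the driver names its queues after the interface in /proc/interrupts):

for irq in $(grep enp137s0 /proc/interrupts | awk '{sub(":", "", $1); print $1}'); do
    echo "IRQ $irq -> CPUs $(cat /proc/irq/$irq/smp_affinity_list)"
done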

4. Database Parameter Optimization

Perform GUC parameter optimization on the CN, GTM, and DN nodes respectively:

Parameter Modifications

CN (Coordinator Node):

gs_guc reload -N all -I all -Z coordinator -c "max_process_memory = 20GB"
gs_guc reload -N all -I all -Z coordinator -c "max_connections = 2048"
gs_guc reload -N all -I all -Z coordinator -c "max_prepared_transactions = 2048"
gs_guc reload -N all -I all -Z coordinator -c "shared_buffers = 10GB"
gs_guc reload -N all -I all -Z coordinator -c "work_mem = 4MB"
gs_guc reload -N all -I all -Z coordinator -c "fsync = off"
gs_guc reload -N all -I all -Z coordinator -c "synchronous_commit = off"
gs_guc reload -N all -I all -Z coordinator -c "maintenance_work_mem = 64MB"
gs_guc reload -N all -I all -Z coordinator -c "checkpoint_segments = 512"
gs_guc reload -N all -I all -Z coordinator -c "thread_pool_attr = '96,2,(nodebind:1,2)'"
gs_guc reload -N all -I all -Z coordinator -c "thread_pool_attr = '812,4,(cpubind:1-19,24-43,48-67,72-91)'"
gs_guc reload -N all -I all -Z coordinator -c "enable_memory_limit = on"
gs_guc reload -N all -I all -Z coordinator -c "cstore_buffers = 1GB"
gs_guc reload -N all -I all -Z coordinator -c "temp_buffers = 64MB"
gs_guc reload -N all -I all -Z coordinator -c "checkpoint_timeout = 30min"
gs_guc reload -N all -I all -Z coordinator -c "random_page_cost = 1.1"
gs_guc reload -N all -I all -Z coordinator -c "max_pool_size = 4096"
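Note that thread_pool_attr is set twice above (and again in the DN block below); the second call simply overwrites the first, so issue only the variant you intend to use, either NUMA node binding (nodebind) or an explicit CPU list (cpubind). To confirm that a value actually reached the coordinator, it can be queried with gsql; a sketch assuming a local CN on port 5432 and the default postgres database:

gsql -d postgres -p 5432 -c "SHOW thread_pool_attr;"
gsql -d postgres -p 5432 -c "SHOW shared_buffers;"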

DN (Data Node):

gs_guc reload -N all -I all -Z datanode -c "max_process_memory = 180GB"
gs_guc reload -N all -I all -Z datanode -c "max_connections = 4096"
gs_guc reload -N all -I all -Z datanode -c "max_prepared_transactions = 4096"
gs_guc reload -N all -I all -Z datanode -c "shared_buffers = 80GB"
gs_guc reload -N all -I all -Z datanode -c "work_mem = 4MB"
gs_guc reload -N all -I all -Z datanode -c "fsync = off"
gs_guc reload -N all -I all -Z datanode -c "synchronous_commit = off"
gs_guc reload -N all -I all -Z datanode -c "maintenance_work_mem = 64MB"
gs_guc reload -N all -I all -Z datanode -c "checkpoint_segments = 512"
gs_guc reload -N all -I all -Z datanode -c "thread_pool_attr = '96,2,(nodebind:1,2)'"
gs_guc reload -N all -I all -Z datanode -c "thread_pool_attr = '812,4,(cpubind:1-19,24-43,48-67,72-91)'"
gs_guc reload -N all -I all -Z datanode -c "enable_memory_limit = on"
gs_guc reload -N all -I all -Z datanode -c "cstore_buffers = 1GB"
gs_guc reload -N all -I all -Z datanode -c "temp_buffers = 64MB"
gs_guc reload -N all -I all -Z datanode -c "checkpoint_timeout = 30min"
gs_guc reload -N all -I all -Z datanode -c "random_page_cost = 1.1"

GTM (Global Transaction Manager):

gs_guc reload -N all -I all -Z gtm -c "max_process_memory = 20GB"
gs_guc reload -N all -I all -Z gtm -c "max_connections = 2048"
gs_guc reload -N all -I all -Z gtm -c "max_prepared_transactions = 2048"
gs_guc reload -N all -I all -Z gtm -c "shared_buffers = 2GB"
gs_guc reload -N all -I all -Z gtm -c "work_mem = 4MB"
gs_guc reload -N all -I all -Z gtm -c "fsync = off"
gs_guc reload -N all -I all -Z gtm -c "synchronous_commit = off"
gs_guc reload -N all -I all -Z gtm -c "maintenance_work_mem = 64MB"
gs_guc reload -N all -I all -Z gtm -c "checkpoint_segments = 512"
gs_guc reload -N all -I all -Z gtm -c "enable_memory_limit = on"
gs_guc reload -N all -I all -Z gtm -c "cstore_buffers = 1GB"
gs_guc reload -N all -I all -Z gtm -c "temp_buffers = 64MB"
gs_guc reload -N all -I all -Z gtm -c "checkpoint_timeout = 30min"
gs_guc reload -N all -I all -Z gtm -c "random_page_cost = 1.1"
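Several of these parameters (shared_buffers, max_connections, max_process_memory, max_prepared_transactions, thread_pool_attr) are restart-level settings in openGauss-derived kernels, so gs_guc reload records them but they only take effect after the instances are restarted. Assuming your gs_guc build supports the check action, as openGauss does, the recorded values can be verified across all instances before restarting:

# Inspect what gs_guc wrote for each instance type before restarting the cluster.
gs_guc check -N all -I all -Z coordinator -c "shared_buffers"
gs_guc check -N all -I all -Z datanode -c "shared_buffers"
gs_guc check -N all -I all -Z datanode -c "max_process_memory"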

By following these steps, GBase 8c can be brought to its best TPC-C performance in this environment. If you have any other suggestions, please feel free to leave a comment.
