TPC-C, created by the Transaction Processing Performance Council (TPC), is the industry's most recognized and complex online transaction processing (OLTP) benchmark. It exercises a wide range of database functionality, including queries, updates, and queue-style small-batch transactions. TPC-C measures commercial transaction throughput in transactions per minute (tpmC) in a simulated order-entry and sales environment: specifically, the number of New-Order transactions executed per minute while the four other transaction types (Payment, Order-Status, Delivery, and Stock-Level) run concurrently.
This article primarily discusses the key system-level tuning of the GBase 8c database to achieve optimal tpmC performance.
1. Setting Up the Performance Testing Environment
1.1 TPC-C Hardware Requirements
- Number of Servers: 2 servers (1 for the database server, 1 for the TPC-C client).
- Server Hardware Specifications:
- Memory: Greater than 512GB.
- Free Disk: 4 NVMe drives, each with a capacity of more than 1TB.
- Network Card: Hi1822 gigabit network card, fiber interconnection.
1.2 TPC-C Software Requirements
- Operating System Requirements: aarch64 architecture.
- Software Dependencies: numactl, jdk, ant, htop.
1.3 CPU Core Binding
Scheduling threads onto specific cores of a multi-core CPU can yield higher performance. In multi-threaded programming, the tasks handled by different threads vary in priority; threads with strict real-time requirements, such as main threads, can be bound to a particular CPU core at creation so that their tasks are processed faster. This is particularly beneficial for work involving direct user interaction, where shorter response times improve user experience. Currently, GBase 8c supports core binding only on ARM server platforms, not on x86 servers.
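Before binding cores, it helps to confirm the server's NUMA topology so that the core ranges used in the steps below match the actual layout. A minimal check with standard Linux tools (not specific to GBase 8c) might look like this:
lscpu | grep -i numa          # sockets and NUMA node CPU ranges
numactl --hardware            # per-node CPU lists and memory sizes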
[Steps for Setting Core Binding]
(1) Install the GBase 8c database (refer to the installation and deployment manual for detailed operations).
(2) Stop the database using the command: gha_ctl stop all -l dcslist
(3) Modify the database port, IP address, and other settings using the gs_guc tool (refer to the tool reference manual for details).
(4) Set the following parameters using gs_guc (an example of applying them is sketched after the list):
advance_xlog_file_num = 100
numa_distribute_mode = 'all'
thread_pool_attr = '464,4,(cpubind:1-27,32-59,64-91,96-123)'
xloginsert_locks = 16
wal_writer_cpu = 0
wal_file_init_num = 20
xlog_idle_flushes_before_sleep = 500000000
pagewriter_sleep = 10ms
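As an illustration, one of these parameters could be applied as follows. This is a sketch that assumes the openGauss-style gs_guc syntax (gs_guc set -D <datadir> -c "name = value") also applies to GBase 8c, and that /data1/gaussdata is the data directory used in step (5):
gs_guc set -D /data1/gaussdata -c "xloginsert_locks = 16"          # most of these parameters require a restart to take effect
gs_guc set -D /data1/gaussdata -c "thread_pool_attr = '464,4,(cpubind:1-27,32-59,64-91,96-123)'"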
(5) Start the server database in core binding mode:
numactl -C 1-27,32-59,64-91,96-123 gaussdb --single_node -D /data1/gaussdata -p 3625 &
Here, core 0 is used for wal_writer, cores 1-27, 32-59, 64-91, and 96-123 are used to run the database serving the TPC-C workload, and the remaining 16 cores handle the server's network interrupts.
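To confirm that the binding took effect, the CPU affinity of the running gaussdb process can be inspected with standard tools (illustrative check, not part of the original procedure):
pgrep -f 'gaussdb --single_node' | xargs -I{} taskset -pc {}      # prints the allowed CPU list for the process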
(6) On the client, bind the network card interrupt queues to 48 CPU cores:
sh bind_net_irq.sh 48
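The bind_net_irq.sh script ships with the test kit and is not reproduced here; as a rough illustration, pinning a NIC's interrupt queues across a fixed set of cores generally looks like the following sketch (the device name eth0 and the round-robin policy are assumptions):
# Spread the interrupts of NIC "eth0" over cores 0-47, one queue per core in round-robin order
i=0
for irq in $(grep eth0 /proc/interrupts | awk '{sub(":","",$1); print $1}'); do
    echo $((i % 48)) > /proc/irq/$irq/smp_affinity_list
    i=$((i + 1))
done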
2. TPC-C Performance Tuning Parameter Recommendations
2.1 Operating System Parameter Tuning
[Optimize operating system configuration]
- Disable irqbalance to prevent interrupt handling from contending with the database processes for CPU, which would otherwise lead to uneven CPU usage:
service irqbalance stop
- Disable transparent huge pages:
echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag
- Set the I/O scheduler of the NVMe data disks to none:
echo none > /sys/block/nvme*n*/queue/scheduler
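A quick way to confirm that these settings took effect (plain sysfs reads, shown only as a sanity check):
cat /sys/kernel/mm/transparent_hugepage/enabled     # "never" should be the bracketed value
cat /sys/kernel/mm/transparent_hugepage/defrag
cat /sys/block/nvme*n*/queue/scheduler              # "none" should be the selected scheduler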
[Recommended parameters]
(1) File Handles:
ulimit -n 1000000
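ulimit -n only affects the current shell session; to make the limit persistent across logins, an entry in /etc/security/limits.conf along the following lines is commonly used (sketch; "omm" is a hypothetical database OS user, adjust to your environment):
omm  soft  nofile  1000000
omm  hard  nofile  1000000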
(2) System Parameters Configuration (sysctl.conf):
vm.max_map_count = 999999
kernel.shmall = 65894144
kernel.shmmax = 161941448294
kernel.shmmni = 4096
kernel.sem = 501000 641280000 501000 12800
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 4194304
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048586
fs.file-max = 6815744
fs.aio-max-nr = 1048576
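After editing /etc/sysctl.conf, reload it so the kernel picks up the new values:
sysctl -p                     # apply /etc/sysctl.conf
sysctl net.core.rmem_max      # spot-check a single value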
2.2 Key Database Performance Parameters
[shared_buffers Parameter]
Optimization Explanation: The size of the shared memory buffer used by the database. Larger values reduce disk I/O pressure. It is recommended to set this value to 25%-40% of the machine's memory.
Parameter Description: Sets the size of the shared memory buffer used by GBase 8c. Increasing this value requires more System V shared memory than the system default.
Range: Integer, 16 ~ 1073741823, in units of 8KB. shared_buffers should be set as a multiple of BLCKSZ (currently 8KB).
Default Value: 8MB
[Setting Recommendations]
- It is recommended to set shared_buffers to less than 40% of memory, sizing it separately for row store and column store: higher for row store, lower for column store.
- If a large shared_buffers value is set, also increase checkpoint_segments, because writing the larger volume of new and modified data takes more time.
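As a concrete illustration, on a server with 512GB of memory the 25%-40% guideline puts shared_buffers roughly in the 128GB-200GB range. A hedged example of setting it, assuming the gs_guc syntax sketched earlier and the /data1/gaussdata data directory:
gs_guc set -D /data1/gaussdata -c "shared_buffers = 128GB"        # about 25% of 512GB; requires a restart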
[work_mem Parameter]
Optimization Explanation: Declares the working memory available for internal sort and hash operations. Increase it when the workload performs many sorts, and decrease it under high concurrency, since the maximum total usage is max_connections * work_mem.
Parameter Description: Sets the memory size for internal sort operations and hash tables before writing to temporary disk files. Sorting operations are used in ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash joins, hash-based aggregation, and hash-based IN subqueries.
Range: Integer, 64~2147483647, in KB units.
Default Value: 64MB
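The upper bound is simple arithmetic: for example, with max_connections = 500 and work_mem = 64MB, sort and hash operations alone could consume up to 500 * 64MB = 32000MB (about 31GB), which is why work_mem should be lowered when concurrency is high.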
[maintenance_work_mem Parameter]
Optimization Explanation: Memory used during CREATE INDEX and VACUUM operations. These operations are infrequent but consume large amounts of memory. When sizing it, keep in mind that total consumption can reach roughly the number of concurrent vacuum processes per data node (DN) times the maintenance_work_mem value.
Parameter Description: Sets the maximum memory available for maintenance operations (e.g., VACUUM, CREATE INDEX, ALTER TABLE ADD FOREIGN KEY). Affects the efficiency of VACUUM, VACUUM FULL, CLUSTER, and CREATE INDEX operations.
Range: Integer, 1024~INT_MAX, in KB units.
Default Value: 16MB
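As a worked example of this budgeting, if up to 3 vacuum processes run concurrently on a DN and maintenance_work_mem is set to 2GB, maintenance operations alone can consume about 3 * 2GB = 6GB of memory on that node, which must be budgeted alongside shared_buffers and work_mem.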
2.3 Other Database Settings Optimization Suggestions
[Data Disk Handling]
- If there are multiple disks, place XLOG and the tablespaces tablespace2 and tablespace3 created by the benchmark on separate disks, and create soft links to these partitions (see the sketch below).
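A hedged sketch of moving the XLOG directory onto a separate NVMe disk and soft-linking it back; the paths and the WAL directory name (pg_xlog) are assumptions, and the database must be stopped first:
mv /data1/gaussdata/pg_xlog /data2/pg_xlog        # /data2 is an otherwise idle NVMe disk
ln -s /data2/pg_xlog /data1/gaussdata/pg_xlog     # soft link back into the data directory
# tablespace2 and tablespace3 created by the benchmark can be relocated the same way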
[Disable High Availability Processes]
- To reduce resource usage, the high availability processes can be stopped and the data node then started directly with gs_ctl under numactl core binding:
gha_ctl stop all -l http://ip:port
numactl -C 0-27,32-59,64-91,96-123 gs_ctl start -D /mnt/data1/data/dn1/dn1_1/