1. Solution Overview
When supporting database products in the field, the operating system often has to be upgraded, for example because the old OS version is no longer supported or because new hardware requires a newer one. The database product must then be reinstalled on the new OS as part of the same operation. This GBase 8a MPP Cluster OS upgrade solution describes the precautions and steps before and after the OS upgrade that keep the database and its data working normally. The 8a cluster V952 version is used as the example throughout.
2. Precautions
- Ensure that the GBase 8a installation package used after the OS upgrade is built for the new OS and that its 8a version is exactly the same as the version running before the upgrade.
- Ensure important files within the operating system are backed up before upgrading.
- Ensure that the server IP address and MAC address do not change after the OS upgrade.
3. Process Overview
The upgrade process is as shown in the following diagram:
4. Upgrade Steps
The upgrade process consists of 11 steps: the first 4 steps are completed before the system upgrade, step 5 is the OS upgrade, and steps 6-11 are operations under the new OS.
(1) Check Cluster Status
All nodes must be in a normal state, and cluster initialization must be completed before performing the version upgrade. If there are issues, they need to be resolved before proceeding.
- View cluster overview information:
gcadmin
- View data recovery information generated by DDL operations in the cluster:
gcadmin showddlevent
- View data recovery information generated by DML operations in the cluster:
gcadmin showdmlevent
- View complete recovery information of table partitions in the current cluster:
gcadmin showdmlstorageevent
- Display all failover information currently retained in gcware:
gcadmin showfailover
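The checks above can be collected in one pass. The following is a minimal Python sketch, not part of the GBase tooling: it assumes a Python 3.7+ interpreter and that gcadmin is on the PATH of the user running it, and the helper names run_gcadmin and collect_status are hypothetical. It only gathers the raw output for a person to review and makes no attempt to interpret it.

```python
import subprocess

# gcadmin subcommands listed in the text above; an empty tuple means plain `gcadmin`.
CHECKS = [
    ("cluster overview", ()),
    ("DDL recovery events", ("showddlevent",)),
    ("DML recovery events", ("showdmlevent",)),
    ("DML storage events", ("showdmlstorageevent",)),
    ("failover records", ("showfailover",)),
]


def run_gcadmin(args):
    """Run gcadmin with the given arguments and return its standard output."""
    result = subprocess.run(
        ["gcadmin", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def collect_status(runner=run_gcadmin):
    """Return {label: output} for every check, in the order listed above."""
    return {label: runner(args) for label, args in CHECKS}
```

Calling collect_status() on a cluster node returns a dictionary whose values can be printed or saved next to the backup, so the pre-upgrade state is on record. The runner parameter exists only so the sequence can be exercised without a cluster.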
(2) Stop All Cluster Node Services and Flush REDOLOG
- Stop cluster services:
gcluster_services all stop
- After the cluster services have stopped on all nodes, start only the gcware service on every node:
gcluster_services gcware start
- On one gcware node, run Python:
import gcware
gcware.flush_statemachine()
- Wait two minutes so that the REDOLOG is flushed on all gcware nodes.
- Stop the gcware service on all nodes:
gcluster_services gcware stop
(3) Backup Cluster Related Files
As with a physical backup and restore, everything except the program binaries must be backed up: data, configuration files, runtime parameters, and so on. The guiding principle is that the OS upgrade must not destroy the backed-up data. The methods below are ordered from safest to least safe:
- The safest method is to back up to external storage.
- Next is to back up to another non-root mount partition on the local system.
- The least safe is to move (mv) the files to another directory on the same mount partition (across partitions, mv copies the data just as cp does). Whether you use cp or mv, the target must be a local directory that survives the OS upgrade or external storage. If unsure, back up to external storage.
The files that need to be backed up include those under the following directories:
- gcware configuration files: /install_directory/gcware/config
- gcware data files: /install_directory/gcware/data
- gcluster configuration files: /install_directory/gcluster/config
- gcluster data files: /install_directory/gcluster/userdata
- gnode configuration files: /install_directory/gnode/config
- gnode data files: /install_directory/gnode/userdata
- C-UDF (user-defined functions written in C) files: /install_directory/gcluster/server/lib/gbase/plugin/ and /install_directory/gnode/server/lib/gbase/plugin/
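The directory list above can be copied in one pass. The following is a minimal Python sketch under stated assumptions: it runs on Python 3 with all cluster services already stopped, the function name backup_cluster_files and its parameters are hypothetical, and install_dir stands for the real installation directory. It is an illustration of the backup step, not a tool shipped with GBase 8a.

```python
import shutil
from pathlib import Path

# Paths relative to the installation directory, as listed above.
BACKUP_DIRS = [
    "gcware/config",
    "gcware/data",
    "gcluster/config",
    "gcluster/userdata",
    "gnode/config",
    "gnode/userdata",
    "gcluster/server/lib/gbase/plugin",
    "gnode/server/lib/gbase/plugin",
]


def backup_cluster_files(install_dir, backup_root):
    """Copy each listed directory to backup_root, keeping the relative paths.

    Returns the relative paths that do not exist under install_dir
    (for example, a node that runs no gcluster service).
    """
    install_dir, backup_root = Path(install_dir), Path(backup_root)
    missing = []
    for rel in BACKUP_DIRS:
        src = install_dir / rel
        if not src.is_dir():
            missing.append(rel)
            continue
        # copytree creates parent directories and raises FileExistsError if
        # the target already exists, so an earlier backup is never overwritten.
        shutil.copytree(src, backup_root / rel, symlinks=True)
    return missing
```

Note that shutil.copytree preserves permissions and timestamps but not file ownership, so on a real cluster a copy made with cp -a or rsync under the appropriate user may be the better choice. Whatever the method, backup_root should point at external storage or a partition that the OS upgrade will not touch.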
(4) Uninstall or Remove the Original Cluster
Uninstall using the uninstall.py that comes with the installation package, or directly delete the gcware, gcluster, and gnode folders under the installation directory.
(5) Upgrade the Operating System
Ensure that the previously backed-up data is not lost during the upgrade. The GBase 8a tools known so far, such as gcadmin, require Python 2.7 and do not support Python 3+. If Python 2.7 was removed by the OS upgrade, reinstall it.
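Whether a Python 2.7 interpreter is still present after the OS upgrade can be checked with a short script. This is a minimal sketch, assuming it is run with Python 3.7+ and that the interpreter, if installed, is reachable on the PATH under one of the usual command names; the function names and the candidate list are hypothetical.

```python
import shutil
import subprocess


def is_python27(version_output):
    """Return True if the text of `python --version` reports a 2.7 release."""
    return version_output.strip().startswith("Python 2.7")


def find_python27(candidates=("python2.7", "python2", "python")):
    """Return the path of the first candidate that reports version 2.7, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path is None:
            continue
        # Python 2 prints its version to stderr, Python 3 to stdout; check both.
        proc = subprocess.run([path, "--version"], capture_output=True, text=True)
        if is_python27(proc.stdout + proc.stderr):
            return path
    return None
```

If find_python27() returns None, Python 2.7 needs to be reinstalled before gcadmin and the other tools are used.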
(6) Install the New Cluster
Reinstall the corresponding version of the cluster following the standard cluster installation steps, including checking the installation directory permissions and configuring environment variables; refer to the product manual. If the OS is not a mainstream one (AlmaLinux8, for example), you can modify the gcinstall.py file and comment out the CheckOSType() function to skip the OS check. Once the automatic installation script gcinstall.py has finished, proceed to the next steps.
(7) Stop Cluster Services
Stop cluster services:
gcluster_services all stop
(8) Restore Cluster Configuration and Data Files
Before restoring the configuration files, copy the gcware_file_info file from /install_directory/gcware/config of the new installation to another directory, because this newly generated file must still be in use after the restore. Then restore, in order, all files backed up before the upgrade, overwriting the new cluster's configuration and data files, and finally overwrite gcware_file_info with the copy saved at the start of this step. Note: restore only the user-defined C-UDF plugins and do not overwrite the plugins built into the cluster. The cp command's -i option prompts before overwriting an existing file (answer n to keep it); the -n option skips existing files without prompting.
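The restore order described above can be sketched as follows. This is an illustration under stated assumptions, not a GBase tool: it needs Python 3.8+ (for dirs_exist_ok), expects the backup to mirror the layout of the installation directory, must run while all cluster services are stopped, and the function names and the gcware_file_info.saved file name are hypothetical.

```python
import shutil
from pathlib import Path

OVERWRITE_DIRS = [
    "gcware/config", "gcware/data",
    "gcluster/config", "gcluster/userdata",
    "gnode/config", "gnode/userdata",
]
PLUGIN_DIRS = [
    "gcluster/server/lib/gbase/plugin",
    "gnode/server/lib/gbase/plugin",
]
FILE_INFO = "gcware/config/gcware_file_info"


def copy_missing_only(src_dir, dst_dir):
    """Copy files from src_dir that do not already exist under dst_dir."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    copied = []
    for src in sorted(src_dir.rglob("*")):
        if not src.is_file():
            continue
        dst = dst_dir / src.relative_to(src_dir)
        if dst.exists():
            continue  # built-in file shipped by the new install: keep it
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied


def restore_cluster_files(backup_root, install_dir):
    """Restore config and data, keep the new gcware_file_info and built-in plugins."""
    backup_root, install_dir = Path(backup_root), Path(install_dir)
    file_info = install_dir / FILE_INFO
    saved = install_dir / "gcware_file_info.saved"  # outside the restored dirs
    if file_info.exists():
        shutil.copy2(file_info, saved)
    for rel in OVERWRITE_DIRS:
        if (backup_root / rel).is_dir():
            # Files with the same name are overwritten by the backed-up version.
            shutil.copytree(backup_root / rel, install_dir / rel, dirs_exist_ok=True)
    if saved.exists():
        shutil.copy2(saved, file_info)  # put the new install's file back
    for rel in PLUGIN_DIRS:
        if (backup_root / rel).is_dir():
            copy_missing_only(backup_root / rel, install_dir / rel)
```

The plugin directories are handled separately from the other directories on purpose: configuration and data are overwritten with the backup, while plugins are only added when the new installation has no file of the same name, which matches the behaviour of cp -n.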
(9) Start Cluster Services
Start cluster services:
gcluster_services all start