Introduction to GBase 8a MPP Cluster SQL Standards

Cong Li - Aug 21 - - Dev Community

Today, I would like to introduce the SQL standards of the GBase 8a database, including identifier syntax rules, comment rules, user variables, reserved words, and more.

1.1 Identifier Syntax Rules

The names of objects such as databases, tables, columns, and aliases are referred to as identifiers. This section will describe the syntax rules allowed for identifiers in the GBase 8a MPP Cluster. The following table provides an overview of the maximum length and allowable characters for each type of identifier.

Identifier Maximum Length (Characters) Allowed Characters
Database English: 48
Chinese: 48
a-z, A-Z, 0-9, underscore. Must start with a letter or underscore. Supports Chinese databases.
Table English: 56
Chinese: 21
a-z, A-Z, 0-9, hyphen, underscore, Chinese. Must start with a letter or underscore. Supports Chinese tables.
View English: 56
Chinese: 50
a-z, A-Z, 0-9, underscore. Must start with a letter or underscore. Supports Chinese views.
Column English: 64
Chinese: 64
a-z, A-Z, 0-9, underscore, Chinese. Must start with a letter or underscore. Supports Chinese columns.
Alias English: 256
Chinese: 85
a-z, A-Z, 0-9, underscore. Must start with a letter or underscore. Supports Chinese aliases.
Stored Procedure English: 64
Chinese: 64
a-z, A-Z, 0-9, underscore. Must start with a letter or underscore. Supports Chinese stored procedures.
User Variable No maximum length specified Composed of a-z, A-Z, 0-9, underscore. Must start with a letter or underscore. User variable names are case-insensitive.

Notes:

1) Apart from the restrictions listed in the table, identifiers cannot contain ASCII(0) or ASCII(255). Database, table, and column names should not end with a space.
2) If an identifier is a reserved word or contains special characters, it must always be quoted with backticks (` `) when used, e.g., SELECT * FROM `select`.id > 100.
3) If an identifier exceeds the maximum length, commands for databases, tables, columns, views, and stored procedures will generate an error, while aliases will be truncated to 256 characters for display.

In practical application systems, identifiers must not use GBase 8a MPP Cluster reserved words or contain special characters. For a list of reserved words supported by GBase 8a MPP Cluster, refer to the reserved words section below.

1.2 Identifier Qualifiers

GBase 8a MPP Cluster allows names to consist of one or more identifiers. The components of a composite name should be separated by the period character ".". The initial part of a composite name serves as a qualifier, influencing the interpretation of the subsequent identifiers in context. In GBase 8a MPP Cluster, users can reference a column using any of the following methods:

Column Reference Meaning
col_name Column col_name from any table used in the query.
table_name.col_name Column col_name from table table_name in the current database.
database_name.table_name.col_name Column col_name from table table_name in database database_name.
vc_name.database_name.table_name.col_name Column col_name from table table_name in database database_name in virtual cluster vc_name.
`column_name` The field is a keyword or contains special characters.

If a composite identifier needs to be quoted, each part of the identifier must be quoted separately, not as a whole. For example: `gs-table`.`gs-column` is valid, while `gs-table.gs-column` is not.

In a column reference within a statement, it is unnecessary to explicitly specify a table_name, database_name.table_name, or vc_name.database_name.table_name prefix unless the reference is ambiguous. For example:

1) If both t1 and t2 tables contain a column c, when selecting c from a query using both t1 and t2, the c field is ambiguous since it is not unique to the query's tables. Therefore, it must be specified as t1.c or t2.c.
2) When retrieving data from table t in database db1 and table t in database db2, the user must specify which database's table is being referenced using db1.t.col_name and db2.t.col_name.
3) When retrieving data from table t in database db1 in virtual cluster vc1 and table t in database db2 in virtual cluster vc2, the user must specify which virtual cluster's table is being referenced using vc1.db1.t.col_name and vc2.db2.t.col_name.

1.3 Comment Syntax

GBase 8a MPP Cluster supports three styles of comments:

  • #: Single-line comment.
  • --: Single-line comment starting with -- to the end of the line. Note that the -- comment requires at least one space after the second dash.
  • /* comment content */: This comment style supports single-line or multi-line comments and can be placed in the middle of a line. The /**/ enclosed sequence does not need to be on the same line, allowing for multi-line comments.

Note:
The -- comment style requires at least one space after the second dash. This syntax differs slightly from standard SQL comment styles.

Examples

Example 1: Using # for comments.

gbase> SELECT 1+1 FROM t;# This comment continues to the END of line
+-----+
| 1+1 |
+-----+
|   2 |
+-----+
1 row in set
Enter fullscreen mode Exit fullscreen mode

Example 2: Using -- for comments.

gbase> SELECT 1+1 FROM t;-- This comment continues to the END of line
+-----+
| 1+1 |
+-----+
|   2 |
+-----+
1 row in set
Enter fullscreen mode Exit fullscreen mode

Example 3: Using /* single-line */ for comments.

gbase> SELECT 1 /* this is an in-line comment */ + 1 FROM t;
+--------+
| 1  + 1 |
+--------+
|      2 |
+--------+
1 row in set
Enter fullscreen mode Exit fullscreen mode

Example 4: Using /* multi-line */ for comments.

gbase> SELECT 1+
  /* 
   this is a
   multiple-line comment
  */
   1 FROM t;
+--------------+
| 1    +     1 |
+--------------+
|            2 |
+--------------+ 
1 row in set
Enter fullscreen mode Exit fullscreen mode

1.4 User Variables

GBase 8a MPP Cluster supports user variables. The lifecycle of a user variable is session-based, and it is not visible to other sessions. When the user exits, all user variables for that user are automatically released.

User variables are written as @var_name.

Variables are defined and assigned values using the SET syntax:

SET @var_name = expr [, @var_name = expr] 
Enter fullscreen mode Exit fullscreen mode

The = is the assignment operator. The expr value assigned to each variable can be a real number, string, or NULL.

You can view the value of a user variable using the SELECT syntax:

SELECT @var_name [, @var_name] ... 
Enter fullscreen mode Exit fullscreen mode

Example

Example 1: Assigning values to variables using the SET statement.

gbase> SET @t1='abc',@t2=null,@t3=4;
Query OK, 0 rows affected
gbase> SELECT @t1,@t2,@t3;
+------+------+------+
| @t1  | @t2  | @t3  |
+------+------+------+
| abc  | NULL |    4 |
+------+------+------+
1 row in set
Enter fullscreen mode Exit fullscreen mode

User variables can be used anywhere an expression is allowed. If a user variable is not initialized, its value is NULL.

Note: Variables cannot be used in the context of constants, such as in the LIMIT clause of a SELECT.

1.5 Reserved Words

The following table lists the SQL reserved words supported by GBase 8a MPP Cluster. Please refer to it as needed.

Reserved words Reserved words Reserved words
ACCESSIBLE ADD ALL
ALTER ANALYZE AND
AS ASC ASENSITIVE
BEFORE BETWEEN BIGINT
BINARY BIT_AND BIT_OR
BIT_XOR BLOB BOTH
BY
CALL CASCADE CASE
CAST CHANGE CHAR
CHARACTER CHECK CLUSTER
COLLATE COLUMN COMPRESS
CONDITION CONNECT CONSTRAINT
CONTINUE CONVERT COUNT
CREATE CROSS CURDATE
CURDATETIME CURRENT_DATE CURRENT_DATETIME
CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER
CURSOR CURTIME
DATABASE DATABASES DATE_ADD
DATE_SUB DAY_HOUR DAY_MICROSECOND
DAY_MINUTE DAY_SECOND DEC
DECIMAL DECLARE DEFAULT
DELAYED DELETE DENSE_RANK
DESC DESCRIBE DETERMINISTIC
DISTINCT DISTINCTROW DISTRIBUTE
DISTRIBUTED DIV DOUBLE
DROP DUAL
EACH ELSE ELSEIF
ENCLOSED ESCAPED EXISTS
EXIT EXPLAIN EXTRACT
EXCEPT
FALSE FETCH FLOAT
FLOAT4 FLOAT8 FOR
FORCE FOREIGN FROM
FULL FULLTEXT
GCEXPORT GCIMPORT GCLOCAL
GCLUSTER GCLUSTER_LOCAL GET
GRANT GROUP GROUPED
GROUP_CONCAT HAVING HIGH_PRIORITY
HOUR_MICROSECOND HOUR_MINUTE HOUR_SECOND
IF IGNORE IN
INDEX INFILE INITNODEDATAMAP
INNER INOUT INPATH
INSENSITIVE INSERT INT
INT1 INT2 INT3
INT4 INT8 INTEGER
INTERSECT INTERVAL INTO
IS ITERATE JOIN
KEY KEYS KILL
LAG LEAD LEADING
LEAVE LEFT LEVEL
LIKE LIMIT LIMIT_STORAGE_SIZE
LINEAR LINES LINK
LOAD LOCALTIME LOCALTIMESTAMP
LOCK LONG LONGBLOB
LONGTEXT LOOP LOW_PRIORITY
MASTER_SSL_VERIFY_SERVER_CERT MATCH MAX
MEDIUMBLOB MEDIUMINT MEDIUMTEXT
MERGE MID MIDDLEINT
MIN MINUS MINUTE_MICROSECOND
MINUTE_SECOND MOD MODIFIES
NATURAL NOCOPIES NOCYCLE
NOT NOW NO_WRITE_TO_BINLOG
NULL NUMERIC
ON OPTIMIZE OPTION
OPTIONALLY OR ORDER
ORDERED OUT OUTER
OUTFILE OVER
POSITION PRECEDING PRIMARY
PRIOR PROCEDURE PURGE
RANGE RANK READ
READS READ_WRITE REAL
REFERENCES REFRESH REFRESHNODEDATAMAP
REGEXP RELEASE RENAME
REPEAT REPLACE REQUIRE
RESTRICT RETURN REVERT
REVOKE RIGHT RLIKE
SCHEMA SCHEMAS SCN_NUMBER
SECOND_MICROSECOND SEGMENT_ID SELECT
SELF
SENSITIVE SEPARATOR SET
SHOW SMALLINT SORT
SPATIAL SPECIFIC SQL
SQLEXCEPTION SQLSTATE SQLWARNING
SQL_BIG_RESULT SQL_CALC_FOUND_ROWS SQL_SMALL_RESULT
SSL START STARTING
STD STDDEV STDDEV_POP
STDDEV_SAMP STRAIGHT_JOIN
TABLE TARGET TERMINATED
THEN TINYBLOB TINYINT
TINYTEXT TO TRAILING
TRIGGER TRIM TRUE
UNDO UNION UNIQUE
UNLOCK UNSIGNED UPDATE
USAGE USE USING
UTC_DATE UTC_DATETIME UTC_TIME
UTC_TIMESTAMP
VALUES VARBINARY VARCHAR
VARCHARACTER VARYING VAR_SAMP
WHEN WHERE WHILE
WITH WRITE
XOR
YEAR_MONTH
ZEROFILL
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Terabox Video Player