When deploying YugabyteDB on Kubernetes, you will have two StatefulSets: YB-Master and YB-TServer. You also start these two services when deploying on VMs.
You want to scale out the YB-TServers (tablet servers) to more than one per failure zone for elasticity. This is where your application connects, SQL is processed, and data is stored (in tablets). You can scale to more pods when you have more connections, a higher workload, or more data. Small databases with a moderate load can scale down to as many TServers as the replication factor, for example three, to remain resilient to one failure.
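On Kubernetes, for example, scaling out is a matter of changing the replica count of the YB-TServer StatefulSet. Here is a sketch assuming the default names from the YugabyteDB Helm chart; adjust the namespace and replica count to your deployment:
kubectl scale statefulset yb-tserver --replicas=6 -n yb-demo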
The YB-Master doesn't need to scale to more than one per fault zone. It stores the cluster metadata, including the PostgreSQL catalog, and orchestrates the cluster's automated operations, like tablet rebalancing. The name "Master" is misleading: the TServers can function for a while without it, and there's no disruption if the leader steps down and one of the followers is elected after a few seconds. Regarding network traffic, the TServers call the Master on new SQL connections to get the catalog information they keep in their cache, and they refresh that cache when DDL occurs. Besides that, the network traffic between the Master and the TServers is limited to heartbeats.
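You can check which YB-Master is the leader and which are its followers with yb-admin (the master addresses here are hypothetical; replace them with yours):
yb-admin --master_addresses yb-master-0:7100,yb-master-1:7100,yb-master-2:7100 list_all_masters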
TCP traffic to YB Master
To illustrate this, I use tcpdump to trace the traffic to port 7100, where the yb-master leader runs:
tcpdump -nni any dst port 7100 -A |
# display every 10 seconds the count of lines with ".yb."
awk -F . '
/ IP [0-9.]+[.][0-9.]+ > [0-9.]+[.]7100: /{t1=$1}
/[.]yb[.]/{c[$0]=c[$0]+1} # count messages with ".yb."
substr(t1,1,7)>substr(t0,1,7){ # print every 10 seconds
for (i in c) printf "%10d %-s\n",c[i],i;print t1;delete(c)
}
{t0=t1}
'
The AWK script counts the lines with ".yb.", which are YugabyteDB protobuf messages, and shows the counts every 10 seconds.
Idle server
When there is no activity, the YB-Master receives only heartbeats and consensus updates. In this three-node cluster with hundreds of tables, I see around 100 RPCs per 10 seconds:
19:48:00
20 .yb.consensus.ConsensusService..UpdateConsensus.....: f48f17d6633c46f7bda6b0fc0d34e4ba
4 .yb.master.MasterService..GetTableSchema..'.
6 .yb.master.MasterService..TSHeartbeat..u...
20 .yb.consensus.ConsensusService..UpdateConsensus.....: 5502c42867c649dabe00ed71e7242c7e
4 .yb.master.MasterService..GetTableLocations..'.
34 .yb.master.MasterService..TSHeartbeat..u..
19:48:10
20 .yb.consensus.ConsensusService..UpdateConsensus.....: f48f17d6633c46f7bda6b0fc0d34e4ba
4 .yb.master.MasterService..GetTableSchema..'.
6 .yb.master.MasterService..TSHeartbeat..u...
20 .yb.consensus.ConsensusService..UpdateConsensus.....: 5502c42867c649dabe00ed71e7242c7e
4 .yb.master.MasterService..GetTableLocations..'.
34 .yb.master.MasterService..TSHeartbeat..u..
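The TSHeartbeat messages are the periodic heartbeats that each TServer sends to the Master leader, by default every second. This interval is controlled by a yb-tserver flag, shown here with its default value:
--heartbeat_interval_ms=1000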
Connections
I start PgBench with ten connections:
pgbench -nc 10 -T 10 -f /dev/stdin <<<"select pg_sleep(1)"
I see 132 read requests from the TServers when those new connections read the critical information from the PostgreSQL catalog.
19:54:00
11 .yb.master.MasterService..GetNamespaceInfo....&
19 .yb.consensus.ConsensusService..UpdateConsensus.....: f48f17d6633c46f7bda6b0fc0d34e4ba
4 .yb.master.MasterService..GetTableSchema..'.
132 .yb.tserver.TabletServerService..Read...$..
6 .yb.master.MasterService..TSHeartbeat..u...
19 .yb.consensus.ConsensusService..UpdateConsensus.....: 5502c42867c649dabe00ed71e7242c7e
4 .yb.master.MasterService..GetTableLocations..'.
34 .yb.master.MasterService..TSHeartbeat..u..
19:54:10
You can see that those reads use the "tserver" protobuf because they read from the catalog tablet 0000000000000000000000, which is hosted on the Master but works like any other tablet stored in the TServers.
Data Definition Language
I create the PgBench tables:
pgbench -niIdtpfg
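For reference, the same command with long options spells out the initialization steps:
# -I dtpfg: d=drop existing tables, t=create tables, p=primary keys, f=foreign keys, g=generate data
pgbench --no-vacuum --initialize --init-steps=dtpfg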
This runs multiple DDL statements to create the tables and add primary and foreign keys. It reads from and writes to the catalog, generating much higher activity on the YB-Master:
20:00:50
4 .yb.master.MasterService..GetTableLocations..'.
2 .yb.consensus.ConsensusService..UpdateConsensus....m: 5502c42867c649dabe00ed71e7242c7e
1 .yb.consensus.ConsensusService..UpdateConsensus....j: 5502c42867c649dabe00ed71e7242c7e
2 .yb.master.MasterService..IsAlterTableDone....L
4 .yb.master.MasterService..IsAlterTableDone....M
1 .yb.consensus.ConsensusService..UpdateConsensus....v: 5502c42867c649dabe00ed71e7242c7e
6 .yb.master.MasterService..TSHeartbeat..u...
3 .yb.consensus.ConsensusService..UpdateConsensus....u: f48f17d6633c46f7bda6b0fc0d34e4ba
10 .yb.tserver.TabletServerService..Write...$.
30 .yb.master.MasterService..IsCreateTableDone....$
4 .yb.master.MasterService..TruncateTable...."
600 .yb.consensus.ConsensusService..UpdateConsensus.....: 5502c42867c649dabe00ed71e7242c7e
600 .yb.consensus.ConsensusService..UpdateConsensus.....: 5502c42867c649dabe00ed71e7242c7e
264 .yb.master.MasterService..TSHeartbeat..u..
136 .yb.master.MasterService..GetTableSchema....$
2 .yb.consensus.ConsensusService..UpdateConsensus....m: f48f17d6633c46f7bda6b0fc0d34e4ba
1 .yb.consensus.ConsensusService..UpdateConsensus....j: f48f17d6633c46f7bda6b0fc0d34e4ba
30 .yb.tserver.TabletServerService..Write...$.
882 .yb.tserver.TabletServerService..Read...$..
13 .yb.tserver.TabletServerService..UpdateTransaction..'p
3 .yb.master.MasterService..IsObjectPartOfXRepl...."
13 .yb.master.MasterService.
4 .yb.master.MasterService..GetTableSchema..'.
12 .yb.consensus.ConsensusService..UpdateConsensus.... : 5502c42867c649dabe00ed71e7242c7e
7 .yb.master.MasterService..GetTableLocations....(
1 .yb.consensus.ConsensusService..UpdateConsensus....v: f48f17d6633c46f7bda6b0fc0d34e4ba
3 .yb.master.MasterService..DeleteTable....&
136 .yb.master.MasterService..GetTableLocations.....
597 .yb.consensus.ConsensusService..UpdateConsensus.....: f48f17d6633c46f7bda6b0fc0d34e4ba
12 .yb.master.MasterService..IsTruncateTableDone...."
22 .yb.master.MasterService..IsAlterTableDone....$
143 .yb.tserver.TabletServerService..Write...$..
1 .yb.tserver.TabletServerService..Write...$.0
7 .yb.master.MasterService..CreateTable......
1 .yb.master.MasterService..GetNamespaceInfo....&
4 .yb.master.MasterService..ListTabletServers.......
2 .yb.tserver.TabletServerService..Write...$.5
3 .yb.consensus.ConsensusService..UpdateConsensus....u: 5502c42867c649dabe00ed71e7242c7e
7 .yb.master.MasterService..IsDeleteTableDone...."
4 .yb.tserver.TabletServerService..Write...$.9
12 .yb.consensus.ConsensusService..UpdateConsensus.... : f48f17d6633c46f7bda6b0fc0d34e4ba
20:01:00
The updates on the YB-Master leader are replicated to its Raft followers, and the catalog version change is also sent to the TServers through the heartbeats; the TServers then call back the Master to get the schema and refresh their cache.
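The catalog version that is bumped by DDL and propagated through heartbeats is visible from YSQL (a quick check, assuming ysqlsh is on your path and connects to one of the TServers):
ysqlsh -c "select * from pg_yb_catalog_version;"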
Data Manipulation Language
I start a PgBench read/write workload:
pgbench -nc 10 -T 60 -P 10 -N
Once the clients are connected, there are no additional calls to the Master beyond the idle heartbeats and consensus updates.
Frequent Connections
I can add the -C option to PgBench to re-connect for each transaction (which is one of the worst practices with most databases - don't do that):
pgbench -nc 10 -T 60 -P 10 -N -C
With ten clients reconnecting, the number of calls to the Master increases.
Frequent re-connections are not scalable, especially in a multi-region cluster with high latency from the YB-TServers to the YB-Master. You must use a connection pool (or the YugabyteDB Connection Manager, which is a database-resident connection pool) so that physical connections keep the metadata cache state when serving new logical connections.
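As a sketch, the YugabyteDB Connection Manager is enabled with yb-tserver flags (a preview feature at the time of writing, so check the documentation for your version):
--allowed_preview_flags_csv=enable_ysql_conn_mgr
--enable_ysql_conn_mgr=true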
I use this level of tracing to get more details, but in a production cluster, you can monitor the number of operations on the Master instead. Detailed metrics are available in the Prometheus endpoint of the tablet servers:
$ curl -Ls arm.pachot.net:9000/prometheus-metrics | awk '{sub(/{/," ")}/^proxy_response_bytes_yb_master_Master/ && $(NF-1)>0{printf "%10d %-s\n", $(NF-1),$1}' | sort -n
4 proxy_response_bytes_yb_master_MasterDdl_IsCreateNamespaceDone
12 proxy_response_bytes_yb_master_MasterReplication_IsObjectPartOfXRepl
16 proxy_response_bytes_yb_master_MasterClient_ReservePgsqlOids
34 proxy_response_bytes_yb_master_MasterDdl_CreateNamespace
44 proxy_response_bytes_yb_master_MasterAdmin_WaitForYsqlBackendsCatalogVersion
48 proxy_response_bytes_yb_master_MasterDdl_IsTruncateTableDone
84 proxy_response_bytes_yb_master_MasterDdl_IsDeleteTableDone
396 proxy_response_bytes_yb_master_MasterDdl_BackfillIndex
408 proxy_response_bytes_yb_master_MasterDdl_IsAlterTableDone
431 proxy_response_bytes_yb_master_MasterDdl_ListNamespaces
488 proxy_response_bytes_yb_master_MasterDdl_IsCreateTableDone
514 proxy_response_bytes_yb_master_MasterDdl_GetBackfillStatus
910 proxy_response_bytes_yb_master_MasterDdl_DeleteTable
1088 proxy_response_bytes_yb_master_MasterClient_GetTransactionStatusTablets
1761 proxy_response_bytes_yb_master_MasterClient_GetTabletLocations
1870 proxy_response_bytes_yb_master_MasterDdl_CreateTable
53124 proxy_response_bytes_yb_master_MasterCluster_ListTabletServers
143572 proxy_response_bytes_yb_master_MasterDdl_GetNamespaceInfo
7835913 proxy_response_bytes_yb_master_MasterClient_GetTableLocations
17771529 proxy_response_bytes_yb_master_MasterDdl_GetTableSchema
424897254 proxy_response_bytes_yb_master_MasterHeartbeat_TSHeartbeat
You probably don't need this level of detail, but if you observe too many remote procedure calls (RPCs) to the YB-Master, these metrics tell you where to look.
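To see which calls are growing, you can sample the same endpoint in a loop and watch a counter increase over time, for example the GetTableSchema responses (reusing the host and metric name from above):
while sleep 10; do
 date +%T
 curl -Ls arm.pachot.net:9000/prometheus-metrics |
  awk '{sub(/{/," ")}/^proxy_response_bytes_yb_master_MasterDdl_GetTableSchema/{print $(NF-1), $1}'
done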