YB-Master, the YugabyteDB Universe control plane

Franck Pachot - May 24 - - Dev Community

When deploying YugabyteDB on Kubernetes, you will have two Stateful Sets: YB-Master and YB-TServer. You also start those two on VMs.

You want to scale out the YB-TServers (tablet servers) to more than one per failure zone for elasticity. This is where your application connects, SQL is processed, and data is stored (in tablets). You can scale to more pods when you have more connections, a higher workload, or more data. Small databases with a moderate load can scale down to the replication factor, such as three, to be resilient to one failure.

The YB-Master doesn't need to scale to more than one per fault zone. It stores the cluster metadata, including the PostgreSQL catalog, and orchestrates the cluster's automated operations, like tablet rebalancing. The name "Master" is misleading. TServers can function for a while without the Master. There's no disruption if the leader steps down and one of the followers is elected after a few seconds. Regarding network traffic, the TServers call it on new SQL connections to get the catalog information to keep in the cache. When DDL occurs, they refresh their cache. Besides that, the network traffic between the Master and TServers is limited to heartbeats.

TCP traffic to YB Master

To illustrate this, I use tcpdump to trace the traffic to the port 7100 where the yb-master leader runs:



tcpdump -nni any dst port 7100 -A |
# display every 10 seconds the count of lines with ".yb."
 awk -F . '
 / IP [0-9.]+[.][0-9.]+ > [0-9.]+[.]7100: /{t1=$1}
 /[.]yb[.]/{c[$0]=c[$0]+1}         # count messages with ".yb."
 substr(t1,1,7)>substr(t0,1,7){    # print every 10 seconds
  for (i in c) printf "%10d %-s\n",c[i],i;print t1;delete(c)
 }
 {t0=t1}
' 


Enter fullscreen mode Exit fullscreen mode

The AWK script counts the lines with .yb., which are YugabyteDB Protobufs, and shows the count every 10 seconds.

Idle server

When there is no activity, YB Master receives heartbeats and consensus updates. In this three-node cluster with hundreds of tables, I see around 100 RPCs per 10 seconds:



19:48:00
        20 .yb.consensus.ConsensusService..UpdateConsensus.....: f48f17d6633c46f7bda6b0fc0d34e4ba
         4 .yb.master.MasterService..GetTableSchema..'.
         6 .yb.master.MasterService..TSHeartbeat..u...
        20 .yb.consensus.ConsensusService..UpdateConsensus.....: 5502c42867c649dabe00ed71e7242c7e
         4 .yb.master.MasterService..GetTableLocations..'.
        34 .yb.master.MasterService..TSHeartbeat..u..
19:48:10
        20 .yb.consensus.ConsensusService..UpdateConsensus.....: f48f17d6633c46f7bda6b0fc0d34e4ba
         4 .yb.master.MasterService..GetTableSchema..'.
         6 .yb.master.MasterService..TSHeartbeat..u...
        20 .yb.consensus.ConsensusService..UpdateConsensus.....: 5502c42867c649dabe00ed71e7242c7e
         4 .yb.master.MasterService..GetTableLocations..'.
        34 .yb.master.MasterService..TSHeartbeat..u..


Enter fullscreen mode Exit fullscreen mode

Connections

I start PgBench with ten connections:



pgbench -nc 10 -T 10 -f /dev/stdin <<<"select pg_sleep(1)"


Enter fullscreen mode Exit fullscreen mode

I see 132 read requests from the TServers when those new connections read the critical information from the PostgreSQL catalog.



19:54:00
        11 .yb.master.MasterService..GetNamespaceInfo....&
        19 .yb.consensus.ConsensusService..UpdateConsensus.....: f48f17d6633c46f7bda6b0fc0d34e4ba
         4 .yb.master.MasterService..GetTableSchema..'.
       132 .yb.tserver.TabletServerService..Read...$..
         6 .yb.master.MasterService..TSHeartbeat..u...
        19 .yb.consensus.ConsensusService..UpdateConsensus.....: 5502c42867c649dabe00ed71e7242c7e
         4 .yb.master.MasterService..GetTableLocations..'.
        34 .yb.master.MasterService..TSHeartbeat..u..
19:54:10


Enter fullscreen mode Exit fullscreen mode

You can see that those reads are "tserver" protobuf because they read from the catalog tablet 0000000000000000000000, which is stored in the Master but is similar to other tablets stored in the TServers.

Data Definition Language

I create the PgBench tables:



pgbench -niIdtpfg


Enter fullscreen mode Exit fullscreen mode

This runs multiple DDL statements to create the tables and add primary and foreign keys. It reads and writes to the catalog with higher activity calling the YB-Master:



20:00:50
         4 .yb.master.MasterService..GetTableLocations..'.
         2 .yb.consensus.ConsensusService..UpdateConsensus....m: 5502c42867c649dabe00ed71e7242c7e
         1 .yb.consensus.ConsensusService..UpdateConsensus....j: 5502c42867c649dabe00ed71e7242c7e
         2 .yb.master.MasterService..IsAlterTableDone....L
         4 .yb.master.MasterService..IsAlterTableDone....M
         1 .yb.consensus.ConsensusService..UpdateConsensus....v: 5502c42867c649dabe00ed71e7242c7e
         6 .yb.master.MasterService..TSHeartbeat..u...
         3 .yb.consensus.ConsensusService..UpdateConsensus....u: f48f17d6633c46f7bda6b0fc0d34e4ba
        10 .yb.tserver.TabletServerService..Write...$.
        30 .yb.master.MasterService..IsCreateTableDone....$
         4 .yb.master.MasterService..TruncateTable...."
       600 .yb.consensus.ConsensusService..UpdateConsensus.....: 5502c42867c649dabe00ed71e7242c7e
       600 .yb.consensus.ConsensusService..UpdateConsensus.....: 5502c42867c649dabe00ed71e7242c7e
       264 .yb.master.MasterService..TSHeartbeat..u..
       136 .yb.master.MasterService..GetTableSchema....$
         2 .yb.consensus.ConsensusService..UpdateConsensus....m: f48f17d6633c46f7bda6b0fc0d34e4ba
         1 .yb.consensus.ConsensusService..UpdateConsensus....j: f48f17d6633c46f7bda6b0fc0d34e4ba
        30 .yb.tserver.TabletServerService..Write...$.
       882 .yb.tserver.TabletServerService..Read...$..
        13 .yb.tserver.TabletServerService..UpdateTransaction..'p
         3 .yb.master.MasterService..IsObjectPartOfXRepl...."
        13 .yb.master.MasterService.
         4 .yb.master.MasterService..GetTableSchema..'.
        12 .yb.consensus.ConsensusService..UpdateConsensus....  : 5502c42867c649dabe00ed71e7242c7e
         7 .yb.master.MasterService..GetTableLocations....(
         1 .yb.consensus.ConsensusService..UpdateConsensus....v: f48f17d6633c46f7bda6b0fc0d34e4ba
         3 .yb.master.MasterService..DeleteTable....&
       136 .yb.master.MasterService..GetTableLocations.....
       597 .yb.consensus.ConsensusService..UpdateConsensus.....: f48f17d6633c46f7bda6b0fc0d34e4ba
        12 .yb.master.MasterService..IsTruncateTableDone...."
        22 .yb.master.MasterService..IsAlterTableDone....$
       143 .yb.tserver.TabletServerService..Write...$..
         1 .yb.tserver.TabletServerService..Write...$.0
         7 .yb.master.MasterService..CreateTable......
         1 .yb.master.MasterService..GetNamespaceInfo....&
         4 .yb.master.MasterService..ListTabletServers.......
         2 .yb.tserver.TabletServerService..Write...$.5
         3 .yb.consensus.ConsensusService..UpdateConsensus....u: 5502c42867c649dabe00ed71e7242c7e
         7 .yb.master.MasterService..IsDeleteTableDone...."
         4 .yb.tserver.TabletServerService..Write...$.9
        12 .yb.consensus.ConsensusService..UpdateConsensus....  : f48f17d6633c46f7bda6b0fc0d34e4ba
20:01:00


Enter fullscreen mode Exit fullscreen mode

The updates on the YB-Master Leader are replicated to its Raft Followers, and the catalog version change is also sent to the TServers through the heartbeats, which calls back the Master to get the schema to refresh their cache.

Data Manipulation Language

I started PgBench read/write workload:



pgbench -nc 10 -T 60 -P 10 -N


Enter fullscreen mode Exit fullscreen mode

Once the clients are connected, there are no additional calls to the Master:
Image description

Frequent Connections

I can add the -C option to PgBench to re-connect for each transaction (which is one of the worst practices with most databases - don't do that):



pgbench -nc 10 -T 60 -P 10 -N -C


Enter fullscreen mode Exit fullscreen mode

With 10 clients reconnecting, the number of calls to the Master increases:
Image description
Frequent re-connections are not scalable, especially in a multi-region cluster with high latency from the YB-TServers to the YB-Master. You must use a connection pool (or the YugabyteDB Connection Manager, which is a database-resident connection pool) so that physical connections keep the state of the metadata cache when grabbing new logical connections.

I use this level of tracing to get more details, but in a production cluster, you can monitor the number of operations on the Master:
Image description

There are detailed metrics available in the Prometheus Endpoint of the tablet servers:



$ curl -Ls arm.pachot.net:9000/prometheus-metrics | awk '{sub(/{/," ")}/^proxy_response_bytes_yb_master_Master/ && $(NF-1)>0{printf "%10d %-s\n", $(NF-1),$1}' | sort -n
         4 proxy_response_bytes_yb_master_MasterDdl_IsCreateNamespaceDone
        12 proxy_response_bytes_yb_master_MasterReplication_IsObjectPartOfXRepl
        16 proxy_response_bytes_yb_master_MasterClient_ReservePgsqlOids
        34 proxy_response_bytes_yb_master_MasterDdl_CreateNamespace
        44 proxy_response_bytes_yb_master_MasterAdmin_WaitForYsqlBackendsCatalogVersion
        48 proxy_response_bytes_yb_master_MasterDdl_IsTruncateTableDone
        84 proxy_response_bytes_yb_master_MasterDdl_IsDeleteTableDone
       396 proxy_response_bytes_yb_master_MasterDdl_BackfillIndex
       408 proxy_response_bytes_yb_master_MasterDdl_IsAlterTableDone
       431 proxy_response_bytes_yb_master_MasterDdl_ListNamespaces
       488 proxy_response_bytes_yb_master_MasterDdl_IsCreateTableDone
       514 proxy_response_bytes_yb_master_MasterDdl_GetBackfillStatus
       910 proxy_response_bytes_yb_master_MasterDdl_DeleteTable
      1088 proxy_response_bytes_yb_master_MasterClient_GetTransactionStatusTablets
      1761 proxy_response_bytes_yb_master_MasterClient_GetTabletLocations
      1870 proxy_response_bytes_yb_master_MasterDdl_CreateTable
     53124 proxy_response_bytes_yb_master_MasterCluster_ListTabletServers
    143572 proxy_response_bytes_yb_master_MasterDdl_GetNamespaceInfo
   7835913 proxy_response_bytes_yb_master_MasterClient_GetTableLocations
  17771529 proxy_response_bytes_yb_master_MasterDdl_GetTableSchema
 424897254 proxy_response_bytes_yb_master_MasterHeartbeat_TSHeartbeat


Enter fullscreen mode Exit fullscreen mode

You probably don't need this level of information, but if you observe too many remote calls (RPC) to the YB-Master, you should look at it.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Terabox Video Player