I am experimenting a lot with our YugabyteDB database, with both the YSQL (PostgreSQL compatible) and the YCQL (Cassandra compatible) APIs, based on prospect and user requests and questions. That means I try things: sensible things, but also things for which I don't know whether they will work, which might mean a failure scenario, an overload scenario, et cetera. You get it.
One of the recurring things I need to know is: what is the current state of my YugabyteDB cluster? Is something, or someone, still actively using a database connection? Are there any database connections at all?
This is not impossible to figure out, but having two different database APIs makes it a bit more involved to get an answer, and having a cluster of machines makes it even more so. Luckily, there is now a solution that makes this easier: yb_stats --print-rpcs.
This is how yb_stats --print-rpcs looks on an idle (test) cluster:
➜ yb_stats --print-rpcs
------------------------------------------------------------------------------------------------------------------------
192.168.66.80; port: 12000, 0/2 act/tot; port: 13000, 0/1 act/tot; port: 7000, 0/40 act/tot; port: 9000, 0/40 act/tot
------------------------------------------------------------------------------------------------------------------------
192.168.66.81; port: 13000, 0/1 act/tot; port: 7000, 0/8 act/tot; port: 9000, 0/40 act/tot
------------------------------------------------------------------------------------------------------------------------
192.168.66.82; port: 13000, 0/1 act/tot; port: 7000, 0/8 act/tot; port: 9000, 0/40 act/tot
------------------------------------------------------------------------------------------------------------------------
- There are 3 nodes in the cluster: 192.168.66.80, 192.168.66.81 and 192.168.66.82.
- The first node has connections on the following ports: 12000, 13000, 7000 and 9000.
- The other two nodes have connections on the ports 13000, 7000 and 9000.
YugabyteDB exposes information about YSQL on port 13000 and about YCQL on port 12000 (by default). This means that to learn more about YSQL and YCQL connections, you can filter the --print-rpcs request down to these ports:
➜ yb_stats --print-rpcs --hostname-match '(12000|13000)'
------------------------------------------------------------------------------------------------------------------------
192.168.66.80; port: 12000, 0/2 act/tot; port: 13000, 0/1 act/tot
------------------------------------------------------------------------------------------------------------------------
192.168.66.81; port: 13000, 0/1 act/tot
------------------------------------------------------------------------------------------------------------------------
192.168.66.82; port: 13000, 0/1 act/tot
------------------------------------------------------------------------------------------------------------------------
So for YSQL there are no (zero) active connections and one idle connection per node; that idle connection is the checkpointer process. For YCQL there are no connections on the nodes 192.168.66.81 and 192.168.66.82, and two idle connections to node 192.168.66.80.
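Under the hood, yb_stats gathers these figures from the /rpcz endpoint that each of these ports serves. As a rough sketch of how the act/tot numbers could be derived from such a response, the snippet below counts connections in a YSQL-style /rpcz JSON document. The field names used here (connections, backend_status) are assumptions modeled on a typical response, so verify them against your own cluster (for example with curl http://<node>:13000/rpcz):

```python
import json

# Example /rpcz-style payload as served on the YSQL webserver port (13000).
# The field names are assumptions; check the actual response on your cluster.
sample = json.loads("""
{
  "connections": [
    {"backend_type": "checkpointer", "backend_status": ""},
    {"backend_type": "client backend", "backend_status": "idle",
     "db_name": "yugabyte"},
    {"backend_type": "client backend", "backend_status": "active",
     "db_name": "yugabyte",
     "query": "insert into t select 1;"}
  ]
}
""")

def act_tot(connections):
    """Return (active, total) the way the summary row counts them."""
    active = sum(1 for c in connections if c.get("backend_status") == "active")
    return active, len(connections)

active, total = act_tot(sample["connections"])
print(f"{active}/{total} act/tot")  # prints: 1/3 act/tot
```

Note that the checkpointer counts toward the total, which is why an otherwise idle node still reports 0/1 act/tot on port 13000.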
If a connection becomes active, yb_stats will show the active connection:
YSQL:
➜ yb_stats --print-rpcs --hostname-match '(12000|13000)'
------------------------------------------------------------------------------------------------------------------------
192.168.66.80; port: 13000, 1/2 act/tot
------------------------------------------------------------------------------------------------------------------------
192.168.66.80:13000 <-127.0.0.1:47024 active 4097 ms db:yugabyte, q:insert into t select id, repeat('x',500) from generate_series(1,1000000) id;
------------------------------------------------------------------------------------------------------------------------
192.168.66.81; port: 13000, 0/1 act/tot
------------------------------------------------------------------------------------------------------------------------
192.168.66.82; port: 13000, 0/1 act/tot
------------------------------------------------------------------------------------------------------------------------
The summary row shown earlier now reports one active connection, and the active connection itself is printed below it, with its elapsed time, database, and query text.
YCQL:
➜ yb_stats --print-rpcs --hostname-match '(12000|13000)'
------------------------------------------------------------------------------------------------------------------------
192.168.66.80; port: 12000, 1/9 act/tot; port: 13000, 0/1 act/tot
------------------------------------------------------------------------------------------------------------------------
192.168.66.80:12000 <-127.0.0.1:36166 OPEN # 200 45 ms BATCH (20) ks:test, q: INSERT INTO test.t2006 (year, month, dayofmonth, dayofweek, deptime, crsdeptime, arrtime, crsarrtime, uniquecarrier, flightnum, tailnum, actualelapsedtime, crselapsedtime, airtime, arrdelay, depdelay, origin, dest, distance, taxiin, taxiout, cancelled, cancellationcode, diverted, carrierdelay, weatherdelay, nasdelay, securitydelay, lateaircraftdelay) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
45 ms BATCH (20) ks:test, q: INSERT INTO test.t2006 (year, month, dayofmonth, dayofweek, deptime, crsdeptime, arrtime, crsarrtime, uniquecarrier, flightnum, tailnum, actualelapsedtime, crselapsedtime, airtime, arrdelay, depdelay, origin, dest, distance, taxiin, taxiout, cancelled, cancellationcode, diverted, carrierdelay, weatherdelay, nasdelay, securitydelay, lateaircraftdelay) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
45 ms BATCH (20) ks:test, q: INSERT INTO test.t2006 (year, month, dayofmonth, dayofweek, deptime, crsdeptime, arrtime, crsarrtime, uniquecarrier, flightnum, tailnum, actualelapsedtime, crselapsedtime, airtime, arrdelay, depdelay, origin, dest, distance, taxiin, taxiout, cancelled, cancellationcode, diverted, carrierdelay, weatherdelay, nasdelay, securitydelay, lateaircraftdelay) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
46 ms BATCH (20) ks:test, q: INSERT INTO test.t2006 (year, month, dayofmonth, dayofweek, deptime, crsdeptime, arrtime, crsarrtime, uniquecarrier, flightnum, tailnum, actualelapsedtime, crselapsedtime, airtime, arrdelay, depdelay, origin, dest, distance, taxiin, taxiout, cancelled, cancellationcode, diverted, carrierdelay, weatherdelay, nasdelay, securitydelay, lateaircraftdelay) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
------------------------------------------------------------------------------------------------------------------------
192.168.66.81; port: 12000, 0/3 act/tot; port: 13000, 0/1 act/tot
------------------------------------------------------------------------------------------------------------------------
192.168.66.82; port: 12000, 0/3 act/tot; port: 13000, 0/1 act/tot
------------------------------------------------------------------------------------------------------------------------
This is a single 'COPY FROM' command, which spreads work over the cluster. The output shows the multiple statements in flight, as well as the batching (20 rows per batch).
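The BATCH (20) in the output means the loader groups 20 prepared INSERT statements into a single batch request before sending it. A minimal, hypothetical sketch of that chunking step (not the actual loader code):

```python
def chunks(rows, size=20):
    """Yield successive batches of at most `size` rows."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

rows = list(range(45))  # pretend we have 45 rows to load
batch_sizes = [len(batch) for batch in chunks(rows)]
print(batch_sizes)  # prints: [20, 20, 5]
```

Each batch then travels to the YCQL endpoint as one request, which is why several BATCH (20) statements can be in flight at the same time.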
This makes yb_stats --print-rpcs a really handy tool to quickly check activity and connections on the YSQL and YCQL endpoints.
If you want to see the idle connections as well, you can add the --details-enable switch:
➜ yb_stats --print-rpcs --hostname-match '(12000|13000)' --details-enable
------------------------------------------------------------------------------------------------------------------------
192.168.66.80; port: 12000, 0/2 act/tot; port: 13000, 0/1 act/tot
------------------------------------------------------------------------------------------------------------------------
192.168.66.80:13000 <-background:checkpointer 0 ms db:, q:
192.168.66.80:12000 <-127.0.0.1:34726 OPEN # 13
192.168.66.80:12000 <-127.0.0.1:34736 OPEN # 2
------------------------------------------------------------------------------------------------------------------------
192.168.66.81; port: 13000, 0/2 act/tot
------------------------------------------------------------------------------------------------------------------------
192.168.66.81:13000 <-127.0.0.1:56646 idle 0 ms db:yugabyte, q:
192.168.66.81:13000 <-background:checkpointer 0 ms db:, q:
------------------------------------------------------------------------------------------------------------------------
192.168.66.82; port: 13000, 0/1 act/tot
------------------------------------------------------------------------------------------------------------------------
192.168.66.82:13000 <-background:checkpointer 0 ms db:, q:
------------------------------------------------------------------------------------------------------------------------
If you want to know more about the YugabyteDB cluster performing requests, you can remove the filtering, which will show all the active RPCs:
➜ yb_stats --print-rpcs
------------------------------------------------------------------------------------------------------------------------
192.168.66.80; port: 13000, 1/2 act/tot; port: 7000, 0/43 act/tot; port: 9000, 2/43 act/tot
------------------------------------------------------------------------------------------------------------------------
192.168.66.80:13000 <-127.0.0.1:46906 active 8238 ms db:yugabyte, q:insert into t select id, repeat('x',500) from generate_series(1,1000000) id;
192.168.66.80:9000 <-127.0.0.1:52912 OPEN # 96 247 ms READY yb.tserver.PgClientService:Perform (timeout: 120000 ms)
331 ms READY yb.tserver.PgClientService:Perform (timeout: 120000 ms)
397 ms READY yb.tserver.PgClientService:Perform (timeout: 120000 ms)
192.168.66.80:9000 ->192.168.66.81:9100 OPEN # 23515 218 ms SENT yb.tserver.TabletServerService:Write (timeout: 119965 ms)
------------------------------------------------------------------------------------------------------------------------
192.168.66.81; port: 13000, 0/2 act/tot; port: 7000, 0/8 act/tot; port: 9000, 4/44 act/tot
------------------------------------------------------------------------------------------------------------------------
192.168.66.81:9000 <-192.168.66.80:40516 OPEN # 23514 123 ms READY yb.tserver.TabletServerService:Write (timeout: 119987 ms)
192.168.66.81:9000 <-192.168.66.80:40526 OPEN # 23514 63 ms READY yb.tserver.TabletServerService:Write (timeout: 119973 ms)
192.168.66.81:9000 ->192.168.66.80:9100 OPEN # 22848 16 ms SENT yb.consensus.ConsensusService:UpdateConsensus (timeout: 3000 ms)
192.168.66.81:9000 ->192.168.66.82:9100 OPEN # 22568 30 ms SENT yb.consensus.ConsensusService:UpdateConsensus (timeout: 3000 ms)
------------------------------------------------------------------------------------------------------------------------
192.168.66.82; port: 13000, 0/1 act/tot; port: 7000, 0/8 act/tot; port: 9000, 3/41 act/tot
------------------------------------------------------------------------------------------------------------------------
192.168.66.82:9000 <-192.168.66.80:54442 OPEN # 22834 21 ms READY yb.consensus.ConsensusService:UpdateConsensus (timeout: 3000 ms)
192.168.66.82:9000 <-192.168.66.81:57118 OPEN # 22567 7 ms READY yb.consensus.ConsensusService:UpdateConsensus (timeout: 3000 ms)
192.168.66.82:9000 <-192.168.66.80:54406 OPEN # 22835 57 ms READY yb.tserver.TabletServerService:Write (timeout: 119997 ms)
------------------------------------------------------------------------------------------------------------------------
Here you see a batch insertion initiated from YSQL.
The batch insertion executed the typical 3 PgClientService:Perform RPC requests to its local tablet server (which are executed over the same connection). The other active RPCs are the TabletServerService Write requests and the ConsensusService UpdateConsensus requests.
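The tablet server figures come from the /rpcz endpoint on port 9000. As a sketch of how the active RPC lines could be extracted from such a response, the snippet below walks an /rpcz-style JSON document; the structure used here (inbound_connections, calls_in_flight, header.remote_method) is an assumption modeled on a typical tablet server response, so verify it against your own cluster:

```python
import json

# Example tablet server /rpcz-style payload; field names are assumptions.
sample = json.loads("""
{
  "inbound_connections": [
    {"remote_ip": "192.168.66.80:40516", "state": "OPEN",
     "calls_in_flight": [
       {"header": {"remote_method": {"service_name": "yb.tserver.TabletServerService",
                                     "method_name": "Write"}},
        "elapsed_millis": 123}
     ]}
  ]
}
""")

def active_rpcs(doc):
    """List (remote, service:method, elapsed ms) for every call in flight."""
    result = []
    for conn in doc.get("inbound_connections", []):
        for call in conn.get("calls_in_flight", []):
            m = call["header"]["remote_method"]
            result.append((conn["remote_ip"],
                           f'{m["service_name"]}:{m["method_name"]}',
                           call["elapsed_millis"]))
    return result

for remote, method, ms in active_rpcs(sample):
    print(f"<-{remote} {ms} ms {method}")
```

Connections with an empty calls_in_flight list are the idle ones, which is why the act count is usually much lower than the tot count.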
This allows you to see the active RPC requests in a YugabyteDB cluster in great detail.