History and Description
--by Jeff Victor
In order to better understand the possibilities for monitoring the resource consumption of workloads in Solaris Containers, I created a Perl script I call 'zonestat'. Zonestat summarizes resource usage of Containers (Zones). I consider this script a prototype, not intended for production use. On the other hand, for a small number of zones, it seems to be pretty handy and moderately robust.
Its output looks like this:
|----Pool-----|------CPU-------|----------------Memory----------------|
|---|--Size---|-----Pset-------|---RAM---|---Shm---|---Lkd---|---VM---|
Zonename| IT| Max| Cur| Cap|Used|Shr|S%| Cap| Use| Cap| Use| Cap| Use| Cap| Use
-------------------------------------------------------------------------------
global 0D 66K 2 0.1 1 25 986M 139K 18E 2M 18E 754M
db01 0D 66K 2 0.1 2 50 1G 122M 536M 536M 0 1G 135M
web02 0D 66K 2 0.42 0.0 1 25 100M 11M 20M 20M 0 268M 8M
The 'Pool' columns provide information about the Dynamic Resource Pool in which the zone's processes are running. In the 'IT' column, the 'I' part displays the pool ID (number) and the 'T' part indicates the type of pool: 'D' for 'default', 'P' for 'private' (using the dedicated-cpu feature of zonecfg) or 'S' for 'shared'. The two 'Size' columns ('Max' and 'Cur') show the quantity of CPUs assigned to the pool in which the zone is running.
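To make the 'IT' encoding concrete, here is a minimal Python sketch that splits a value such as '0D' into a pool ID and a pool type. The function name `decode_pool_field` and the 'unknown' fallback are my own assumptions for illustration; they are not part of zonestat.

```python
# Pool type letters as described in the text above.
POOL_TYPES = {"D": "default", "P": "private", "S": "shared"}


def decode_pool_field(field):
    """Split an 'IT' value such as '0D' into (pool_id, pool_type).

    Hypothetical helper: the last character is taken as the type letter
    and everything before it as the numeric pool ID.
    """
    field = field.strip()
    pool_id = int(field[:-1])
    pool_type = POOL_TYPES.get(field[-1], "unknown")
    return pool_id, pool_type


print(decode_pool_field("0D"))
```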
The 'CPU Pset' columns show each zone's CPU usage and any caps that have been set. The first two columns ('Cap' and 'Used') show CPU quantities, which are vCPUs on CMT systems. The other two ('Shr' and 'S%') show the number of FSS shares assigned to the zone and the percentage of the total shares in that zone's pool that those shares represent. In the example above, all the zones share the default pset and hold four shares in total; the zone 'db01' has two of them, so it should receive 50% of the CPU power of the pool as a guaranteed minimum.
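The share arithmetic can be sketched in a few lines of Python. This is an illustration of the calculation described above, not the code zonestat itself uses; the function name `share_percentages` and the tuple layout are assumptions.

```python
from collections import defaultdict


def share_percentages(zones):
    """zones: list of (zonename, pool_id, shares) tuples.

    Returns {zonename: percent of the shares in that zone's pool}.
    """
    # Sum the shares of all zones running in each pool.
    pool_totals = defaultdict(int)
    for _, pool_id, shares in zones:
        pool_totals[pool_id] += shares

    result = {}
    for name, pool_id, shares in zones:
        total = pool_totals[pool_id]
        # Guard against a pool whose zones hold no shares at all.
        result[name] = round(100 * shares / total) if total else 0
    return result


# Share values taken from the sample output above.
zones = [("global", 0, 1), ("db01", 0, 2), ("web02", 0, 1)]
print(share_percentages(zones))
```

With the shares from the sample output (1, 2 and 1 in pool 0), this yields 25, 50 and 25, matching the 'S%' column.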
The 'Memory' columns show the caps and usage for RAM, shared memory, locked memory and virtual memory.
The syntax of zonestat is very similar to the other *stat tools:
zonestat [-l] [interval [count]]
The output shown above is generated with the -l flag, which means "show the limits (caps) that have been set." Without -l, only usage columns are displayed.
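For readers who want to experiment with the same command-line shape, the following Python sketch parses `[-l] [interval [count]]` with `argparse`. It only mirrors the syntax shown above; the attribute name `show_limits` is my own choice, and zonestat itself is a Perl script whose argument handling may differ.

```python
import argparse


def parse_args(argv):
    """Parse a vmstat-like command line: [-l] [interval [count]]."""
    parser = argparse.ArgumentParser(prog="zonestat")
    parser.add_argument("-l", dest="show_limits", action="store_true",
                        help="show the limits (caps) that have been set")
    # Both positionals are optional; count only makes sense after interval.
    parser.add_argument("interval", nargs="?", type=int, default=None,
                        help="seconds between reports")
    parser.add_argument("count", nargs="?", type=int, default=None,
                        help="number of reports to print")
    return parser.parse_args(argv)


args = parse_args(["-l", "5", "3"])
print(args.show_limits, args.interval, args.count)
```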
Example of Usage
Here is more output, showing some of the conclusions that can be drawn from the data. I have added parenthetical numbers in the right-hand margin in order to refer to specific lines of output.
|----Pool-----|------CPU-------|----------------Memory----------------|
|---|--Size---|-----Pset-------|---RAM---|---Shm---|---Lkd---|---VM---|
Zonename| IT| Max| Cur| Cap|Used|Shr|S%| Cap| Use| Cap| Use| Cap| Use| Cap| Use
-------------------------------------------------------------------------------
global 0D 66K 2 0.1 1 HH 983M 139K 18E 2M 18E 752M
--------
global 0D 66K 2 0.1 1 HH 983M 139K 18E 2M 18E 752M
Note that none of the non-global zones are running. Because the global zone is the only zone running in its pool, its one FSS share is 100% of the shares in its pool. To save a column of output, I indicate that with 'HH' instead of '100'.
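The two-character trick can be illustrated with a small Python sketch. The function name `format_share_pct` is hypothetical, and the real script may format this column differently.

```python
def format_share_pct(percent):
    """Render an integer share percentage in a two-character field.

    100 does not fit in two characters, so it is shown as 'HH'.
    """
    return "HH" if percent >= 100 else f"{percent:2d}"


for value in (100, 50, 5):
    print(repr(format_share_pct(value)))
```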
global 0D 66K 2 0.1 1 50 984M 139K 18E 2M 18E 753M
z3 0D 66K 2 0.1 1 50 1G 30M 536M 536M 0 1G 27M
Another zone has booted. It has caps for RAM, shared memory, locked memory, and VM. The default pool now has a total of two shares: one for each zone. Therefore, each zone has 50% of the shares in that pool.
global 0D 66K 2 0.1 1 50 984M 139K 18E 2M 18E 753M
z3 0D 66K 2 0.3 1 50 1G 93M 536M 536M 0 1G 95M
--------
global 0D 66K 2 0.1 1 50 981M 139K 18E 2M 18E 753M
z3 0D 66K 2 0.4 1 50 1G 122M 536M 536M 0 1G 135M
The zone 'z3' is still booting and is using 0.4 CPUs' worth of CPU cycles.
global 0D 66K 2 0.1 1 50 984M 139K 18E 2M 18E 753M
z3 0D 66K 2 0.3 1 50 1G 122M 536M 536M 0 1G 135M
--------
global 0D 66K 2 0.1 1 50 984M 139K 18E 2M 18E 753M
z3 0D 66K 2 0.2 1 50 1G 122M 536M 536M 0 1G 135M
--------
global 0D 66K 2 0.1 1 33 986M 139K 18E 2M 18E 754M
z3 0D 66K 2 0.1 1 33 1G 122M 536M 536M 0 1G 135M
web02 0D 66K 2 0.42 0.0 1 33 100M 11M 20M 20M 0 268M 8M
A third zone has booted. This zone has a CPU-cap of 0.42 CPUs. It also has memory caps, including a RAM cap that is less than the amount of memory that zone 'z3' is using. Let's see what happens...
--------
global 0D 66K 2 0.1 1 33 985M 139K 18E 2M 18E 754M
z3 0D 66K 2 0.1 1 33 1G 122M 536M 536M 0 1G 135M
web02 0D 66K 2 0.42 0.1 1 33 100M 29M 20M 20M 0 268M 36M
--------
global 0D 66K 2 0.1 1 33 984M 139K 18E 2M 18E 754M
z3 0D 66K 2 0.1 1 33 1G 122M 536M 536M 0 1G 135M
web02 0D 66K 2 0.42 0.2 1 33 100M 63M 20M 20M 0 268M 138M
|----Pool-----|------CPU-------|----------------Memory----------------|
|---|--Size---|-----Pset-------|---RAM---|---Shm---|---Lkd---|---VM---|
Zonename| IT| Max| Cur| Cap|Used|Shr|S%| Cap| Use| Cap| Use| Cap| Use| Cap| Use
-------------------------------------------------------------------------------
global 0D 66K 2 0.1 1 33 985M 139K 18E 2M 18E 754M
z3 0D 66K 2 0.0 1 33 1G 122M 536M 536M 0 1G 135M
web02 0D 66K 2 0.42 0.2 1 33 100M 87M 20M 20M 0 268M 185M
--------
global 0D 66K 2 0.1 1 33 985M 139K 18E 2M 18E 754M
z3 0D 66K 2 0.0 1 33 1G 122M 536M 536M 0 1G 135M
web02 0D 66K 2 0.42 0.2 1 33 100M 100M 20M 20M 0 268M 112M
--------
global 0D 66K 2 0.1 1 33 984M 139K 18E 2M 18E 754M
z3 0D 66K 2 0.0 1 33 1G 122M 536M 536M 0 1G 135M
web02 0D 66K 2 0.42 0.3 1 33 100M 112M 20M 20M 0 268M 117M
As expected, web02 exceeds its RAM cap. Now rcapd should address the problem.
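Spotting a zone that is over its RAM cap can also be done mechanically. The Python sketch below compares the 'RAM Cap' and 'RAM Use' values of the two non-global zones from the last sample. Two things are assumptions on my part: the helper names, and the reading of the K/M/G/E suffixes as decimal multiples (the comparison gives the same answer here either way).

```python
# Assumption: suffixes are decimal multiples; zonestat may scale differently.
SUFFIXES = {"K": 10**3, "M": 10**6, "G": 10**9,
            "T": 10**12, "P": 10**15, "E": 10**18}


def parse_size(text):
    """Convert a value such as '112M' or '1G' to an approximate byte count."""
    text = text.strip()
    if text and text[-1] in SUFFIXES:
        return float(text[:-1]) * SUFFIXES[text[-1]]
    return float(text)


def over_ram_cap(samples):
    """samples: list of (zonename, ram_cap, ram_use); an empty cap means none."""
    return [name for name, cap, use in samples
            if cap and parse_size(use) > parse_size(cap)]


# RAM Cap and RAM Use for the non-global zones in the last sample above.
samples = [("z3", "1G", "122M"), ("web02", "100M", "112M")]
print(over_ram_cap(samples))
```

For this sample only 'web02' is reported, since 112M exceeds its 100M cap while 'z3' is well under 1G.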
--------
global 0D 66K 2 0.1 1 33 981M 139K 18E 2M 18E 754M
z3 0D 66K 2 0.0 1 33 1G 119M 536M 536M 0 1G 135M
web02 0D 66K 2 0.42 0.3 1 33 100M 111M 20M 20M 0 268M 127M
One of two things has happened: either a process in web02 freed up memory, or rcapd caused pageouts. rcapstat(1M) will tell us which it is. Also, the increase in VM usage indicates that more memory was allocated than freed, so it's more likely that rcapd was active during this period.
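The reasoning in the previous paragraph can be written down as a simple heuristic: if a zone's RAM usage falls between two samples while its VM usage rises, paging by rcapd is the more likely explanation. The Python sketch below is only a hint of that kind, with a hypothetical function name and values in megabytes; rcapstat(1M) remains the authoritative source.

```python
def likely_rcapd_pageout(prev, curr):
    """prev/curr: dicts with 'ram_use' and 'vm_use' in megabytes for one zone.

    Returns True when resident memory shrank while virtual memory grew,
    which suggests pages were pushed out rather than freed by the process.
    """
    ram_fell = curr["ram_use"] < prev["ram_use"]
    vm_rose = curr["vm_use"] > prev["vm_use"]
    return ram_fell and vm_rose


# web02 in the two most recent samples above: RAM 112M -> 111M, VM 117M -> 127M.
before = {"ram_use": 112, "vm_use": 117}
after = {"ram_use": 111, "vm_use": 127}
print(likely_rcapd_pageout(before, after))
```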
--------
global 0D 66K 2 0.1 1 33 981M 139K 18E 2M 18E 754M
z3 0D 66K 2 0.0 1 33 1G 119M 536M 536M 0 1G 135M
web02 0D 66K 2 0.42 0.2 1 33 100M 110M 20M 20M 0 268M 133M
--------
global 0D 66K 2 0.1 1 33 978M 139K 18E 2M 18E 754M
z3 0D 66K 2 0.0 1 33 1G 116M 536M 536M 0 1G 135M
web02 0D 66K 2 0.42 0.2 1 33 100M 91M 20M 20M 0 268M 133M
At this point 'web02' is safely under its RAM cap. If this zone began to do 'real' work, it would continually be under memory pressure, and the value in 'Memory:RAM:Use' would fluctuate around 100M.