LatencyTOP Quick Start

This Quick Start describes the features of the LatencyTOP utility and how to use them. If you have questions or comments, see the LatencyTOP discussion archives.

Start LatencyTOP

The pre-built LatencyTOP tool has only one binary file and does not need any external configuration file. To start LatencyTOP, type:

# <path>/latencytop

Note: You must have DTrace privileges to run LatencyTOP, because LatencyTOP uses DTrace to collect latency statistics.

Understand the Data

When running, the LatencyTOP window looks like this:

Screen shot with annotation

1 This line shows the heading of each column for the rows in 2 and 4.

2 These are the “system wide” statistics. The values are combined from latencies that have the same cause from all processes in the system. For example, if 2 processes are sleeping, the count of “nanosleep” is the sum of the values from both processes, and the maximum is the larger value from both processes.

For example, if 16 processes each called sleep() exactly 1 time in the past several seconds, and process 1 slept for 45 seconds, process 2 slept for 1.8 seconds, and processes 3 through 16 slept for 1 second each, the values shown are calculated in the following way:

  • Count is 1 time for each of 16 processes = 16.
  • Total (not shown) is 45 + 1.8 + 14 x 1 = 60.8 seconds.
  • Maximum is 45 seconds (from process 1).
  • Average is 60.8/16 = 3.8 seconds.
  • The “percent” field is the total (60.8 seconds) divided by the grand total of all latencies. If the percent shows 23.8%, then the grand total of all latencies that occurred in the system is 60.8/0.238 = 255.5 seconds.
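The arithmetic above can be reproduced with a short Python sketch. The sleep times and the 23.8% figure are the ones from the example; the variable names are illustrative and are not part of LatencyTOP.

```python
# Sleep durations in seconds, one entry per process: process 1 slept 45 s,
# process 2 slept 1.8 s, and processes 3 through 16 slept 1 s each.
sleeps = [45.0, 1.8] + [1.0] * 14

count = len(sleeps)        # each process slept exactly once
total = sum(sleeps)        # not shown on screen
maximum = max(sleeps)
average = total / count

# If the percent column shows 23.8%, the grand total of all latencies
# in the system follows from total / 0.238.
percent = 23.8
grand_total = total / (percent / 100.0)

print(f"count={count} total={total:.1f} max={maximum:.1f} "
      f"average={average:.1f} grand_total={grand_total:.1f}")
```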

By default, entries are sorted in descending order by the percent field, which is equivalent to sorting by the “total” value. Pressing the "c" (count), "m" (maximum), "a" (average), or "p" (percentage) key changes the sort field. The sort order is always descending.
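As a rough illustration of the sort keys, the following Python sketch orders a few rows the way the display is described to. The rows, the cause names, and the sort_rows helper are hypothetical; only the key letters and the descending order come from the text above.

```python
from operator import itemgetter

# Hot key letter -> field used for sorting.
SORT_KEYS = {"c": "count", "m": "maximum", "a": "average", "p": "percent"}

# Hypothetical rows; "cause A" reuses the numbers from the sleep example.
rows = [
    {"cause": "cause A", "count": 16, "maximum": 45.0, "average": 3.8, "percent": 23.8},
    {"cause": "cause B", "count": 900, "maximum": 0.2, "average": 0.05, "percent": 17.6},
    {"cause": "cause C", "count": 40, "maximum": 9.0, "average": 1.2, "percent": 18.8},
]

def sort_rows(rows, key="p"):
    """Return rows sorted by the chosen field, always in descending order."""
    return sorted(rows, key=itemgetter(SORT_KEYS[key]), reverse=True)

by_percent = [row["cause"] for row in sort_rows(rows)]
by_count = [row["cause"] for row in sort_rows(rows, "c")]
print(by_percent)
print(by_count)
```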

3 This line, along with 4, shows data from one process (or thread). The line has the format:

Process <execname> (<PID>) Total: <grand_total_of_latencies> from <number_of_threads> threads [C | S | L] [P | T]

The grand total is calculated within the process. When there is more than one thread, the grand total might be larger than the sample interval.

[C | S | L] is the label for the current view:

C    Latencies are grouped by their causes from the translation table.
S    Only special latencies such as lock blocking or scheduler queue are shown.
L    Latencies are grouped by the synchronization objects.

[P | T] is a flag indicator:

P    The data are collected on the process level.
T    The data are collected on the thread level.

To toggle between process view and thread view, press the "t" key.

4 These are the “per process (or thread)” statistics. Values have the same meaning and are calculated in the same way as in 2.

5 This is a list of all processes that have data in LatencyTOP so far. A recently created process might not show in the list because it has not generated any latency yet. Once it generates latency, it remains in the list, even if it did not generate any latency in the past period; 4 is blank in this case. When the process terminates, it disappears from the list.

The process displayed in 3 and 4 is highlighted. Pressing the < and > keys moves the cursor and switches to another process. The arrow on the left or right edge of this line means there are more processes to display in that direction.

6 This line displays help messages, for example about hot keys, and error messages.

The data on screen is updated periodically. Every 5 seconds, LatencyTOP clears old data, reads the latencies that occurred during these 5 seconds from DTrace, and displays the latency values on the screen. To change the interval, use the option -t seconds when you start LatencyTOP. Press the "r" key to force an immediate refresh.

Latencies are sampled when they complete, for example, when a process wakes from a sleep. If the sleep spans multiple sampling periods, the whole value is counted only in the last period. As a result, latency statistics collected at 5-second intervals might occasionally show a maximum value greater than 5 seconds.
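The effect of sampling at completion can be sketched in Python. This is a simplified model, not LatencyTOP code: the bucket_by_completion helper and the (start, end) timestamps are hypothetical, and only the rule "count the whole latency in the period where it ends" is taken from the text.

```python
INTERVAL = 5.0  # seconds, the default refresh interval

def bucket_by_completion(latencies, interval=INTERVAL):
    """Group (start, end) latencies by the period in which they end."""
    buckets = {}
    for start, end in latencies:
        period = int(end // interval)
        buckets.setdefault(period, []).append(end - start)
    return buckets

# A sleep from t=2 s to t=14 s spans three 5-second periods, but it is
# counted once, in period 2 (10 s to 15 s), with its full duration.
buckets = bucket_by_completion([(2.0, 14.0), (11.0, 12.0)])
maxima = {period: max(values) for period, values in buckets.items()}
print(maxima)
```

In this model the maximum reported for period 2 is longer than the period itself, which matches the behavior described above.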

When you finish monitoring data, press the "q" key to quit LatencyTOP gracefully. You can also press Ctrl+C to exit LatencyTOP.

Advanced Features

Logging

LatencyTOP can save the raw data it captures from DTrace probes to a log file. This feature is useful for offline analysis or for improving the LatencyTOP code itself. To enable logging, add -k level to the command line when you start LatencyTOP. A level of 1 logs only the causes that are not translated by the embedded translation rules, that is, causes unknown to LatencyTOP. A level of 2 logs only causes that are translated. A level of 3 logs all causes. By default, LatencyTOP writes the log to /var/log/latencytop.log. To change the log file, specify -o file on the command line.

The log file looks like:

# Log generated <date> by LatencyTOP for OpenSolaris, version 0.1
# List of processes
PID, CMD
<pid>, "<cmd_args>"
# Statistics
TOTAL, COUNT, MAX, PID, KSTACK
<total>, <count>, <max>, <PID>, "<kernel_stack_trace>"

Note that some causes, like waiting in scheduler queue and lock spinning, are not logged even with a level of 2.
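For offline analysis, a log in this layout could be read with a small Python sketch like the one below. It assumes that comment lines start with "#", that the two header rows appear exactly as shown, and that fields follow CSV quoting rules; the parse_latencytop_log helper and the sample values are hypothetical, and real logs may differ in details such as units or multi-line stack traces.

```python
import csv
import io

def parse_latencytop_log(text):
    """Split a LatencyTOP log into process rows and statistics rows."""
    # Drop comment lines before handing the rest to the CSV reader.
    lines = (line for line in io.StringIO(text) if not line.startswith("#"))
    processes, statistics = [], []
    current = None
    for row in csv.reader(lines, skipinitialspace=True):
        if not row:
            continue
        if row == ["PID", "CMD"]:
            current = processes
        elif row == ["TOTAL", "COUNT", "MAX", "PID", "KSTACK"]:
            current = statistics
        elif current is not None:
            current.append(row)
    return processes, statistics

# Hypothetical sample; the values and symbol names are made up.
sample = '''# Log generated (date elided) by LatencyTOP for OpenSolaris, version 0.1
# List of processes
PID, CMD
101, "/usr/bin/example -x"
# Statistics
TOTAL, COUNT, MAX, PID, KSTACK
6000, 3, 4000, 101, "mymod`func_a mymod`func_b"
'''

processes, statistics = parse_latencytop_log(sample)
print(processes)
print(statistics)
```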

Unless you use the -l option, the log is written only when LatencyTOP exits (either by pressing the "q" key or by pressing Ctrl+C). For long benchmark runs this can be inconvenient, because it can be difficult to distinguish data from different test items. Use -l time to override the default behavior and write to the log file periodically. Note that this period is different from the screen refresh interval.

Showing Only Interesting Data

LatencyTOP usually detects many latencies, because every second many LWPs block or unblock for various reasons. The large number of latencies detected can obscure the real problem. The appropriate view of the data depends on what you are looking for: system level tuning or application bottleneck. When the default view seems insufficient, there are a number of modifications you can make to display different data in LatencyTOP.

  • Append the option -f filter when you start LatencyTOP. This option filters out “wakable” latencies that are larger than a certain value (default 5ms, defined in the LatencyTOP D script). This is an attempt to eliminate user-induced latencies.
  • Append the option -f sched when you start LatencyTOP. This causes LatencyTOP to track activities in the sched process. By default, LatencyTOP does not track them, and focuses on applications only.
  • Add more entries or hide existing entries in the translation table. See the next section.

Modifying Translation for Specific Purposes

LatencyTOP uses a translation table to map latencies to different causes. The table is defined in latencytop.trans under the source code folder. You can modify this table if you want to focus on a specific topic, for example, to find and tune latencies from a specific kernel module. To load the modified table, use the -c option when you start LatencyTOP.

The table has the following format:

#comment
<priority>    <module>`<function>    <cause>
;disable_category <cause>

When module and function match an entry in the kernel call stack, the latency can be mapped to cause. When more than one match is found, the priority controls which cause is used. By adding rules with suitable priorities to the translation table, you can focus on a particular area in the kernel.

The disable_category command removes all latencies mapped to the cause from the LatencyTOP display and from the calculation of the grand total. A typical use of this command is to hide expected waits and sleeps in daemon threads.
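The matching described above can be sketched in Python. This is a simplified model and not the LatencyTOP implementation: the parse_translation and translate helpers, the sample rules, and the module names are hypothetical. In particular, the text does not say whether a higher or a lower number wins, so the sketch assumes that the larger priority value takes precedence.

```python
def parse_translation(text):
    """Parse translation table text into rules and disabled causes."""
    rules, disabled = {}, set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith(";"):
            command, _, argument = line[1:].partition(" ")
            if command == "disable_category":
                disabled.add(argument.strip())
            continue
        # <priority> <module>`<function> <cause>; the cause may contain spaces.
        priority, symbol, cause = line.split(None, 2)
        rules[symbol] = (int(priority), cause)
    return rules, disabled

def translate(stack, rules):
    """Return the cause for a kernel stack, or None if nothing matches."""
    matches = [rules[frame] for frame in stack if frame in rules]
    if not matches:
        return None
    # Assumption: the larger priority value wins.
    return max(matches, key=lambda match: match[0])[1]

# Hypothetical table; the symbols and causes are made up.
table = """#comment
10    mymod`my_wait    Generic wait
20    mymod`my_read    My read
;disable_category Generic wait
"""

rules, disabled = parse_translation(table)
cause = translate(["mymod`my_wait", "mymod`my_read"], rules)
print(cause, cause in disabled)
```

A caller would then drop any latency whose cause is in the disabled set before computing the grand total.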

Quick Reference

Synopsis

latencytop [-t <seconds>] [-o <file>] [-k 0|1|2|3] [-f [no]<feature>,...] [-c <file>] [-l <seconds>]
latencytop -h

Command Line Options

-t, --interval time    Set refresh interval to time seconds. Valid range [1...60]. Default = 5.
-o, --output-log-file file    Output kernel log to file. Default = /var/log/latencytop.log.
-k, --kernel-log-level level    Set kernel log level to level. 0 (default) = None, 1 = Unmapped, 2 = Mapped, 3 = All.
-f, --feature [no]feature1,[no]feature2,...    Enable or disable features in LatencyTOP.
-c file    Load a translation table from file.
-l, --log-period time    Write and restart log every time seconds, where time > 60.
-h, --help    Print command line option help and exit.

Features Supported for the -f Option

[no]filter    Filter large interruptible latencies, e.g. sleep. Default: off
[no]sched    Monitor sched (PID=0). Default: off
[no]sobj    Monitor synchronization objects. Default: on
[no]low    Lower overhead by sampling small latencies. Enabling this feature lowers CPU utilization by estimating small latencies statistically. Use it for heavy workloads such as a very busy web server. Default: off
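When LatencyTOP is started from a test harness, the documented ranges can be checked before the command is launched. The following Python sketch only builds an argument list from the options listed above; the build_latencytop_argv helper and its parameter names are hypothetical, and the sketch does not run LatencyTOP.

```python
def build_latencytop_argv(path="latencytop", interval=None, log_file=None,
                          log_level=None, features=(), trans_file=None,
                          log_period=None):
    """Return an argument list for latencytop, checking documented ranges."""
    argv = [path]
    if interval is not None:
        if not 1 <= interval <= 60:
            raise ValueError("-t must be between 1 and 60 seconds")
        argv += ["-t", str(interval)]
    if log_file is not None:
        argv += ["-o", log_file]
    if log_level is not None:
        if log_level not in (0, 1, 2, 3):
            raise ValueError("-k must be 0, 1, 2, or 3")
        argv += ["-k", str(log_level)]
    if features:
        # Features are passed as one comma-separated list, e.g. nosched,low.
        argv += ["-f", ",".join(features)]
    if trans_file is not None:
        argv += ["-c", trans_file]
    if log_period is not None:
        if not log_period > 60:
            raise ValueError("-l must be greater than 60 seconds")
        argv += ["-l", str(log_period)]
    return argv

argv = build_latencytop_argv(interval=10, log_level=3,
                             features=["nosched", "low"])
print(" ".join(argv))
```

The resulting list could be passed to a process launcher by a user with DTrace privileges.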

Hot Keys

<    Move to previous process/thread.
>    Move to next process/thread.
q    Exit.
r    Refresh.
t    Toggle process/thread mode.
c    Sort by count.
a    Sort by average.
m    Sort by maximum.
p    Sort by percent.
1    Show list by causes.
2    Show list of special entries.
3    Show list by synchronization objects.
h    Show this help.
last modified by alta on 2009/10/27 18:38