Heads-up: mechanism for finding STREAMS memory leaks
Date: Thu, 12 Feb 2009 02:00:57 -0500
From: Peter Memishian <peter.memishian at sun dot com>
To: on-all at eng dot sun dot com, onnv-gate at onnv dot eng dot sun dot com
Subject: Heads-up: mechanism for finding STREAMS memory leaks
My recent integration for:
6646362 STREAMS flow trace should be enhanced to aid memory leak detection
... introduces a mechanism for finding STREAMS memory leaks.
Specifically, if you place the following two settings in /etc/system:
set str_ftnever=0
set str_ftstack=1
(or equivalent via mdb -kw), then any dumps will have additional information
in each leaked message that can help you isolate the source of that leak.
For illustrative purposes, I've intentionally introduced a leak into a
STREAMS module and forced a dump.
1. Find all leaks originating with allocb():
> ::findleaks ! grep allocb
00000300001ae020 18260 00000300201ce3b0 allocb+0x4c
0000030014a86020 18211 00000300210e64d8 allocb+0xcc
2. Examine the bufctl_audit records -- e.g., for the first one above:
> 00000300201ce3b0$<bufctl_audit
       ADDR     BUFADDR   TIMESTAMP      THREAD
                  CACHE     LASTLOG    CONTENTS
300201ce3b0 6001507f880  98a0066914 3003b1d2e20
            300001ae020 3000672a800 3000d4dcd70
            kmem_cache_alloc+0x90
            allocb+0x4c
            allocb_tryhard+0x18
            putnextctl1+0x30
            strioctl+0x2a38
            spec_ioctl+0x80
            fop_ioctl+0x58
            ioctl+0x164
> 6001507f880::print dblk_t db_mblk | ::mblk
ADDR FL TYPE LEN BLEN RPTR DBLK
00000600181126e0 0 flush 1 16 000006001507f8f0 000006001507f880
[ Until now, this is where the trail would dead-end: we could see that
an M_FLUSH had been allocated by the stream head and sent downstream,
and that something had lost it. But there was no way to figure out
what that something was. ]
3. Use the BUFADDR value and extract the STREAMS flow trace information:
> 6001507f880::walk strftblk | ::strftevent
ADDR        Q/CALLER            QNEXT    STACK               DATA EVENT
6001861a590 allocb_tryhard+0x18 --       allocb+0x118           1 allocb
                                         allocb_tryhard+0x18
                                         putnextctl1+0x30
                                         strioctl+0x2a38
                                         spec_ioctl+0x80
                                         fop_ioctl+0x58
                                         ioctl+0x164
6001861a5b8 strwhead            ttcompat putnext+0x378          0 putnext|W
                                         putnextctl1+0x5c
                                         strioctl+0x2a38
                                         spec_ioctl+0x80
                                         fop_ioctl+0x58
                                         ioctl+0x164
6001861a5e0 ttcompat            ldterm   putnext+0x378          0 putnext|W
                                         putnext+0x390
                                         putnextctl1+0x5c
                                         strioctl+0x2a38
                                         spec_ioctl+0x80
                                         fop_ioctl+0x58
                                         ioctl+0x164
6001861a608 ldterm              ptem     putnext+0x378          0 putnext|W
                                         ldtermwput+0xb4
                                         putnext+0x390
                                         putnext+0x390
                                         putnextctl1+0x5c
                                         strioctl+0x2a38
                                         spec_ioctl+0x80
                                         fop_ioctl+0x58
                                         ioctl+0x164
6001861a630 ptem                pts      putnext+0x378          0 putnext|W
                                         putnext+0x390
                                         ldtermwput+0xb4
                                         putnext+0x390
                                         putnext+0x390
                                         putnextctl1+0x5c
                                         strioctl+0x2a38
                                         spec_ioctl+0x80
                                         fop_ioctl+0x58
                                         ioctl+0x164
Here, we can see that after the stream head allocated the M_FLUSH, it
passed it down to ttcompat, which in turn passed it down to ldterm,
then to ptem, and finally to pts. Since the facility also tracks other
core STREAMS events (such as putq() and getq()), and no such events are
present, we know this message was lost shortly after it was passed to
pts. Indeed, the leak I introduced for this example is in ptswput():
diff -r 379ec9d30f5c usr/src/uts/common/io/pts.c
--- a/usr/src/uts/common/io/pts.c Tue Feb 10 20:25:26 2009 -0500
+++ b/usr/src/uts/common/io/pts.c Thu Feb 12 01:47:19 2009 -0500
@@ -589,10 +589,8 @@
         if (*mp->b_rptr & FLUSHR) {
                 ASSERT(RD(qp)->q_first == NULL);
                 DBG(("qreply(qp) turning FLUSHR around\n"));
                 qreply(qp, mp);
-        } else {
-                freemsg(mp);
         }
         break;
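
The reasoning used in step 3 can also be stated mechanically: replay the trail, and whoever the last event hands the message to is the prime suspect, unless that last event is a free. The following is a minimal user-space sketch of that rule, not kernel code. The types and functions (msg_t, ft_event_t, ft_log(), pts_flush(), send_down(), last_holder()) are hypothetical and much simpler than the real flow-trace structures; only the FLUSHR/FLUSHW values and the module names are taken from the real system and from the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define	FLUSHR	0x01	/* same values as in <sys/stream.h> */
#define	FLUSHW	0x02
#define	MAXEV	8

/* Hypothetical, much-reduced stand-ins for the kernel's types. */
typedef enum { EV_ALLOCB, EV_PUTNEXT, EV_FREEB } ev_kind_t;

typedef struct {
	ev_kind_t	kind;
	const char	*q;	/* queue (or caller) that logged the event */
	const char	*qnext;	/* destination queue, EV_PUTNEXT only */
} ft_event_t;

typedef struct {
	unsigned char	flags;		/* first data byte of the M_FLUSH */
	ft_event_t	ev[MAXEV];	/* per-message event trail */
	size_t		nev;
} msg_t;

/* Append one event to the message's trail. */
void
ft_log(msg_t *mp, ev_kind_t kind, const char *q, const char *qnext)
{
	assert(mp->nev < MAXEV);
	mp->ev[mp->nev].kind = kind;
	mp->ev[mp->nev].q = q;
	mp->ev[mp->nev].qnext = qnext;
	mp->nev++;
}

/* Model of the M_FLUSH case above; `leaky` drops the else branch. */
void
pts_flush(msg_t *mp, int leaky)
{
	if (mp->flags & FLUSHR) {
		/* qreply(): modeled as handing the message back upstream */
		ft_log(mp, EV_PUTNEXT, "pts", "ptem");
	} else if (!leaky) {
		ft_log(mp, EV_FREEB, "pts", NULL);	/* freemsg() */
	}
	/* leaky and no FLUSHR: nothing is logged; the message is lost */
}

/* Allocate at the stream head and pass the message down to pts. */
void
send_down(msg_t *mp, unsigned char flags, int leaky)
{
	static const char *path[] =
	    { "strwhead", "ttcompat", "ldterm", "ptem", "pts" };
	size_t n = sizeof (path) / sizeof (path[0]);

	memset(mp, 0, sizeof (*mp));
	mp->flags = flags;
	ft_log(mp, EV_ALLOCB, "allocb_tryhard", NULL);
	for (size_t i = 0; i + 1 < n; i++)
		ft_log(mp, EV_PUTNEXT, path[i], path[i + 1]);
	pts_flush(mp, leaky);
}

/* Who held the message last? NULL means the trail ends in a free. */
const char *
last_holder(const msg_t *mp)
{
	const ft_event_t *last;

	if (mp->nev == 0)
		return (NULL);
	last = &mp->ev[mp->nev - 1];
	if (last->kind == EV_FREEB)
		return (NULL);
	return (last->kind == EV_PUTNEXT ? last->qnext : last->q);
}
```

Calling send_down(&m, FLUSHW, 1) leaves a trail whose final event is the putnext from ptem to pts, so last_holder(&m) names pts, matching the conclusion drawn from the mdb output. With leaky set to 0 the trail ends in a free and last_holder() returns NULL.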
While the above is obviously a manufactured example, this mechanism has
already enabled us to root out similar subtle leaks that have been resident
in core functionality for more than a decade. Hopefully you will find it
equally useful during your own development.
--
meem