TCP packets coming from the network (in STREAMS terminology, from the read side of the queue) are translated into mblk_t structures. Every incoming mblk carrying data is passed to src/uts/common/inet/tcp/tcp.c:tcp_rput_data(). The tcp_t data structure, which represents each TCP connection (or, more precisely, each open TCP stream), has a tcp_kssl_ctx member that holds the per-connection kssl context. This is one level further down than the per-socket kssl context. The kssl-related members of tcp_t are the following:
    /*
     * Kernel SSL session information
     */
    boolean_t   tcp_kssl_pending;       /* during an SSL handshake */
    boolean_t   tcp_kssl_inhandshake;   /* during SSL handshake */
    kssl_ent_t  tcp_kssl_ent;           /* SSL table entry */
    kssl_ctx_t  tcp_kssl_ctx;           /* SSL session */
tcp_rput_data() is a huge function (it occupies 26 double-printed pages), but the real kssl-related processing is near its end. Normally, tcp_rput_data() would pass the data further along via tcp_rcv_enqueue() or putnext(). SSL packets need to be processed first, so any packet belonging to a connection with a kssl context takes a small detour through tcp_kssl_input().
The per-connection context is created in tcp_kssl_input() via src/uts/common/inet/kssl/ksslapi.c:kssl_init_context():
    /* First time here, allocate the SSL context */
    if (tcp->tcp_kssl_ctx == NULL) {
        ASSERT(tcp->tcp_kssl_pending);

        if (kssl_init_context(tcp->tcp_kssl_ent,
            tcp->tcp_ipha->ipha_dst, tcp->tcp_mss,
            &(tcp->tcp_kssl_ctx)) != KSSL_STS_OK) {
            tcp->tcp_kssl_pending = B_FALSE;
            kssl_release_ent(tcp->tcp_kssl_ent, NULL,
                KSSL_NO_PROXY);
            tcp->tcp_kssl_ent = NULL;
            goto no_can_do;
        }
        tcp->tcp_kssl_inhandshake = B_TRUE;

        /* we won't be needing this one after now */
        kssl_release_ent(tcp->tcp_kssl_ent, NULL, KSSL_NO_PROXY);
        tcp->tcp_kssl_ent = NULL;
    }
This is a safe thing to do because tcp_kssl_input() is only entered if a per-socket kssl context already exists or if tcp_kssl_pending is set:
    if (tcp->tcp_kssl_pending) {
        tcp_kssl_input(tcp, mp);
    } else {
        tcp_rcv_enqueue(tcp, mp, seg_len);
    }
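The two listings above can be modelled in user space. This is a minimal sketch under stated assumptions, not the kernel code: conn_t, ent_t, ctx_t and every sketch_* function are hypothetical stand-ins for tcp_t, kssl_ent_t, kssl_ctx_t, kssl_init_context(), tcp_kssl_input() and the tail of tcp_rput_data(), and the reference count is reduced to a plain integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for kssl_ent_t and kssl_ctx_t. */
typedef struct ent { int refcnt; } ent_t;
typedef struct ctx { int records_seen; } ctx_t;

/* Hypothetical model of the kssl-related members of tcp_t. */
typedef struct conn {
	bool	pending;	/* models tcp_kssl_pending */
	bool	inhandshake;	/* models tcp_kssl_inhandshake */
	ent_t	*ent;		/* models tcp_kssl_ent */
	ctx_t	*ctx;		/* models tcp_kssl_ctx */
	int	plain_enqueued;	/* segments that bypassed kssl */
} conn_t;

/* Models kssl_init_context(): returns 0 on success, -1 on failure. */
static int
sketch_init_context(ent_t *ent, ctx_t **ctxp)
{
	(void) ent;
	*ctxp = calloc(1, sizeof (ctx_t));
	return (*ctxp == NULL ? -1 : 0);
}

/* Models tcp_kssl_input(): the context is allocated on first use. */
static void
sketch_kssl_input(conn_t *c)
{
	if (c->ctx == NULL) {
		assert(c->pending);
		if (sketch_init_context(c->ent, &c->ctx) != 0) {
			c->pending = false;
			c->ent->refcnt--;
			c->ent = NULL;
			return;
		}
		c->inhandshake = true;
		/* The table entry is not needed once the context exists. */
		c->ent->refcnt--;
		c->ent = NULL;
	}
	c->ctx->records_seen++;
}

/* Models the dispatch at the end of tcp_rput_data(). */
static void
sketch_rput_data(conn_t *c)
{
	if (c->pending)
		sketch_kssl_input(c);
	else
		c->plain_enqueued++;	/* the tcp_rcv_enqueue() path */
}
```

Calling sketch_rput_data() twice on a pending connection allocates the context only once and drops the inherited entry reference only once; a connection without the pending flag never touches the SSL path.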
A set tcp_kssl_pending flag (B_TRUE) means that an SSL handshake is in progress. The flag is set in src/uts/common/inet/tcp/tcp.c:tcp_conn_create_v4() (or tcp_conn_create_v6(), although kssl is not IPv6 capable) like this:
    /* Inherit the listener's SSL protection state */
    if ((tcp->tcp_kssl_ent = ltcp->tcp_kssl_ent) != NULL) {
        kssl_hold_ent(tcp->tcp_kssl_ent);
        tcp->tcp_kssl_pending = B_TRUE;
    }
A non-NULL tcp_kssl_ent means that there is a kssl table entry. This member is set via bcopy() in src/uts/common/inet/tcp/tcp.c:tcp_wput_proto(), which processes the bind() request:
    case T_SSL_PROXY_BIND_REQ:  /* an SSL proxy endpoint bind request */
        /*
         * save the kssl_ent_t from the next block, and convert this
         * back to a normal bind_req.
         */
        if (mp->b_cont != NULL) {
            ASSERT(MBLKL(mp->b_cont) >= sizeof (kssl_ent_t));

            if (tcp->tcp_kssl_ent != NULL) {
                kssl_release_ent(tcp->tcp_kssl_ent, NULL,
                    KSSL_NO_PROXY);
                tcp->tcp_kssl_ent = NULL;
            }
            bcopy(mp->b_cont->b_rptr, &tcp->tcp_kssl_ent,
                sizeof (kssl_ent_t));
            kssl_hold_ent(tcp->tcp_kssl_ent);
            freemsg(mp->b_cont);
            mp->b_cont = NULL;
        }
        tprim->type = T_BIND_REQ;
T_SSL_PROXY_BIND_REQ is set by kssl_check_proxy(), which is called when the bind() syscall is handled.
This means that a new connection destined for a socket handled by kssl (the tcp_t structure for a new incoming connection is created via tcp_conn_create_v4()) inherits the kssl flag, so that it can be bootstrapped.
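The bind-then-inherit sequence can be sketched as follows. This is an illustrative model only: conn_t, ent_t, ent_hold(), ent_release(), sketch_proxy_bind() and sketch_conn_create() are hypothetical names standing in for tcp_t, kssl_ent_t, kssl_hold_ent(), kssl_release_ent() and the two kernel code paths quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ent { int refcnt; } ent_t;	/* hypothetical table entry */

typedef struct conn {
	ent_t	*ent;		/* models tcp_kssl_ent */
	bool	pending;	/* models tcp_kssl_pending */
} conn_t;

static void ent_hold(ent_t *e) { e->refcnt++; }
static void ent_release(ent_t *e) { e->refcnt--; }

/* Models the T_SSL_PROXY_BIND_REQ case: the listener takes a reference. */
static void
sketch_proxy_bind(conn_t *listener, ent_t *e)
{
	if (listener->ent != NULL)
		ent_release(listener->ent);	/* drop a previous binding */
	listener->ent = e;
	ent_hold(e);
}

/* Models tcp_conn_create_v4(): the new connection inherits the entry. */
static void
sketch_conn_create(conn_t *eager, const conn_t *listener)
{
	eager->pending = false;
	if ((eager->ent = listener->ent) != NULL) {
		ent_hold(eager->ent);
		eager->pending = true;
	}
}
```

Each connection created from a kssl-bound listener holds its own reference to the entry, which is why the entry can later be released per connection without affecting the listener.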
src/uts/common/inet/tcp/tcp_kssl.c:tcp_kssl_input() is basically a wrapper for src/uts/common/inet/kssl/ksslapi.c:kssl_input(). When an mblk reaches kssl_input(), it is queued via the KSSL_ENQUEUE_MP macro. So kssl_input() is first of all the point where data is queued, but it is also the place where the SSL handshake is handled. The data carried by the SSL stream (as opposed to SSL control messages) is handled in a different way.
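To illustrate the queueing step, here is a minimal sketch of a tail-append on a head/tail pair. The field names rec_ass_head and rec_ass_tail are borrowed from the summary figure later in this document; msg_t, rec_queue_t and both functions are hypothetical, and whether KSSL_ENQUEUE_MP is implemented exactly this way is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced message block; the real mblk_t has more. */
typedef struct msg {
	struct msg	*next;
	int		len;
} msg_t;

/* Hypothetical reassembly queue with a head and a tail pointer. */
typedef struct rec_queue {
	msg_t	*rec_ass_head;
	msg_t	*rec_ass_tail;
} rec_queue_t;

/* Append mp at the tail so that fragments keep their arrival order. */
static void
sketch_enqueue(rec_queue_t *q, msg_t *mp)
{
	mp->next = NULL;
	if (q->rec_ass_tail == NULL)
		q->rec_ass_head = mp;
	else
		q->rec_ass_tail->next = mp;
	q->rec_ass_tail = mp;
}

/* Total number of bytes currently waiting in the queue. */
static int
sketch_queued_bytes(const rec_queue_t *q)
{
	int total = 0;

	for (const msg_t *mp = q->rec_ass_head; mp != NULL; mp = mp->next)
		total += mp->len;
	return (total);
}
```

Keeping a tail pointer makes the append constant-time, and the byte count is the kind of information a consumer would need to decide whether a complete SSL record has arrived.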
In sum, two main functions process incoming data: kssl_handle_record() and kssl_handle_any_record().
Both functions are similar in that they process an incoming SSL record. The main difference is their purpose: kssl_handle_record() handles mostly application_data records, while kssl_handle_any_record() handles SSL handshake and SSL alert messages.
Also, kssl_handle_record() processes a chain of mblk structures in a loop, while kssl_handle_any_record() processes only a single mblk.
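The distinction between the record kinds comes from the SSL record header. The sketch below parses the 5-byte header defined by the SSL 3.0 / TLS record layer (content type, two version bytes, 16-bit big-endian length) and classifies the record. The content type values are from the protocol specifications; rec_hdr_t, sketch_parse_hdr() and sketch_is_data_record() are hypothetical names and do not reflect how kssl itself structures this code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SSL/TLS record content types (values from the protocol specifications). */
enum {
	CT_CHANGE_CIPHER_SPEC = 20,
	CT_ALERT = 21,
	CT_HANDSHAKE = 22,
	CT_APPLICATION_DATA = 23
};

#define	REC_HDR_LEN	5	/* type(1) + version(2) + length(2) */

typedef struct rec_hdr {
	uint8_t		type;
	uint8_t		major, minor;
	uint16_t	length;		/* payload length, host byte order */
} rec_hdr_t;

/* Parse the record header; returns 0 on success, -1 if buf is too short. */
static int
sketch_parse_hdr(const uint8_t *buf, size_t len, rec_hdr_t *h)
{
	if (len < REC_HDR_LEN)
		return (-1);
	h->type = buf[0];
	h->major = buf[1];
	h->minor = buf[2];
	h->length = (uint16_t)((buf[3] << 8) | buf[4]);
	return (0);
}

/* Hypothetical routing: nonzero for the data path, zero for everything else. */
static int
sketch_is_data_record(const rec_hdr_t *h)
{
	return (h->type == CT_APPLICATION_DATA);
}
```

A record with type 23 would take the application data path; handshake (22) and alert (21) records would go to the more general handler.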
The application_data processing is done via a hook mechanism. The hooks are attached to the STREAM head, so they are as close to the application as possible.
The reason for having the strsock_kssl_input()/strsock_kssl_output() hooks is resource observability, i.e. we want the user process to be charged for the CPU cost of doing the SSL crypto. These hooks enable that.
It might also seem that the strsock_kssl_* hooks were designed to defer SSL processing until the data is actually exchanged with the application, in case the application cannot process incoming (or send outgoing) data for some reason. This is not the case: flow control is taken care of by the STREAMS framework and by TCP itself, i.e. TCP can throttle the flow if the receiver is slow.
The hooks are set up in $SRC/uts/common/fs/sockfs/socktpi.c:sotpi_accept() like this:
    /*
     * If the transport sent up an SSL connection context, then attach
     * it the new socket, and set the (sd_wputdatafunc)() and
     * (sd_rputdatafunc)() stream head hooks to intercept and process
     * SSL records.
     */
    if (ctxmp != NULL) {
        /*
         * This kssl_ctx_t is already held for us by the transport.
         * So, we don't need to do a kssl_hold_ctx() here.
         */
        nso->so_kssl_ctx = *((kssl_ctx_t *)ctxmp->b_rptr);
        freemsg(ctxmp);
        mp->b_cont = NULL;
        strsetrwputdatahooks(nvp, strsock_kssl_input,
            strsock_kssl_output);
    }
sotpi_accept() is called when a process issues the accept() syscall. The purpose of the code above is to attach the SSL processing hooks to the STREAM head if kssl was configured. From then on, whenever the application sends data to the network, the data is processed by strsock_kssl_output(), which calls kssl_build_record(); this constructs an SSL record in an mblk and returns it. The mblk is later sent on the wire. The ingress path works similarly, with strsock_kssl_input() calling kssl_handle_record().
Let's stop here for a moment. strsock_kssl_input() works a bit differently from strsock_kssl_output(). It calls kssl_handle_record(), which processes data coming from the network. If the data is OK, kssl_handle_record() returns KSSL_CMD_DELIVER_PROXY, which makes strsock_kssl_input() return the mblk processed (decrypted and verified) by kssl_handle_record(); that mblk then goes to the application. If kssl_handle_record() returns KSSL_CMD_SEND, an SSL record needs to be sent back to the network. This can happen, for example, when the MAC verification of an SSL record fails and an SSL alert message has to be sent back to the SSL client. In that case strsock_kssl_input() takes the 'out' parameter (an mblk_t) and calls putnext(9F) on it, which sends the mblk down the write side of the STREAM, towards the network.
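The decision described above can be sketched as a small dispatch function. Everything here is a hypothetical model: cmd_t stands in for the kssl command codes, msg_t for mblk_t, and sketch_putnext() merely counts calls instead of passing a message down a STREAM. Returning NULL in the send case is an assumption, chosen to match the "bp == NULL, retry" check in kstrgetmsg().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the command codes named in the text. */
typedef enum {
	CMD_NONE,		/* nothing to do yet */
	CMD_DELIVER_PROXY,	/* models KSSL_CMD_DELIVER_PROXY */
	CMD_SEND		/* models KSSL_CMD_SEND */
} cmd_t;

typedef struct msg { int id; } msg_t;	/* stand-in for mblk_t */

/* Counts messages pushed towards the network (models putnext()). */
static int sent_to_network;

static void
sketch_putnext(msg_t *out)
{
	(void) out;
	sent_to_network++;
}

/*
 * Models the decision in strsock_kssl_input(): return the message that
 * should reach the application, or NULL if nothing is to be delivered.
 */
static msg_t *
sketch_input_hook(cmd_t cmd, msg_t *decrypted, msg_t *out)
{
	switch (cmd) {
	case CMD_DELIVER_PROXY:
		return (decrypted);
	case CMD_SEND:
		sketch_putnext(out);	/* e.g. an SSL alert */
		return (NULL);
	default:
		return (NULL);
	}
}
```

The point of the sketch is that the read-side hook has two outlets: upwards to the application via its return value, and downwards to the network via the write side.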
On the ingress (read) side, kstrgetmsg() calls strsock_kssl_input(). The relevant part of kstrgetmsg() looks like this:
    if ((stp->sd_rputdatafunc != NULL) && (DB_TYPE(bp) == M_DATA) &&
        (!(DB_FLAGS(bp) & DBLK_COOKED))) {

        bp = (stp->sd_rputdatafunc)(
            stp->sd_vnode, bp, NULL,
            NULL, NULL, NULL);

        if (bp == NULL)
            goto retry;

        DB_FLAGS(bp) |= DBLK_COOKED;
    }
The hooks are unplugged in src/uts/common/fs/sockfs/sockvnops.c:socktpi_close(), which is called when the close() syscall is issued. This is also the place where the per-socket KSSL context (so->so_kssl_ctx) is destroyed by calling kssl_release_ctx().
All mblks that go into kssl are of type M_DATA, so no STREAMS control message should make it into kssl. Nevertheless, some parts of kssl check for the M_DATA type to be sure.
XXX
All the paths that lead to kssl_handle_any_record() are shown in the following figure:
    +-----------------+    +----------------------+    +----------------------+
    | tcp_rput_data() +--->+ tcp_kssl_input()     +--->+ kssl_input()         +---.
    +-----------------+    +----------------------+    +----------------------+   |
                                                                                  |
                           sockfs                                                 |
    +-----------------+    +----------------------+    +----------------------+   |
    | kstrgetmsg()    +--->+ strsock_kssl_input() +--->+ kssl_handle_record() +---+
    +-----------------+    +----------------------+    +----------------------+   |
                                                                                  |
                                             +--------------------------+         |
                                             | kssl_handle_any_record() +<--------'
                                             +--------------------------+
XXX
During testing of CR 6556443 it was discovered that some data could again leak into kssl_handle_any_record() and cause a record size mismatch. The cause was identified as the DBLK_COOKED flag (defined in src/uts/common/sys/stream.h) not being set on the dblk associated with an mblk while SSL records were processed in kssl_handle_record().
During investigation it was realized that:
The following code in src/uts/common/os/streamio.c:kstrgetmsg() shows how the strsock_kssl_input() hook is called:
    if ((stp->sd_rputdatafunc != NULL) && (DB_TYPE(bp) == M_DATA) &&
        (!(DB_FLAGS(bp) & DBLK_COOKED))) {

        bp = (stp->sd_rputdatafunc)(
            stp->sd_vnode, bp, NULL,
            NULL, NULL, NULL);

        if (bp == NULL)
            goto retry;

        DB_FLAGS(bp) |= DBLK_COOKED;
    }
Had DBLK_COOKED not been set by kssl_handle_record(), the same mblk could be processed twice.
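The role of the flag can be demonstrated with a small user-space model. This is a sketch under stated assumptions: blk_t, SK_COOKED (with an arbitrary value), sketch_hook() and sketch_getmsg() are hypothetical stand-ins for the dblk, DBLK_COOKED, the SSL input hook and the check in kstrgetmsg().

```c
#include <assert.h>
#include <stddef.h>

#define	SK_COOKED	0x01	/* hypothetical value; models DBLK_COOKED */

typedef struct blk {
	unsigned	flags;
	int		times_processed;
} blk_t;

typedef blk_t *(*rput_hook_t)(blk_t *);

/* Models the SSL input hook: counts how often a block is "decrypted". */
static blk_t *
sketch_hook(blk_t *bp)
{
	bp->times_processed++;
	return (bp);
}

/* Models the check in kstrgetmsg(): skip blocks that are already cooked. */
static blk_t *
sketch_getmsg(blk_t *bp, rput_hook_t hook)
{
	if (hook != NULL && !(bp->flags & SK_COOKED)) {
		bp = hook(bp);
		if (bp == NULL)
			return (NULL);	/* the caller would retry */
		bp->flags |= SK_COOKED;
	}
	return (bp);
}
```

Passing the same block through sketch_getmsg() twice runs the hook only once; without the flag, the second pass would try to decrypt data that is already plaintext.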
Now that all the parts are in place, let's summarize them:
Here's how it works:
                  +--------------------+
                  |                    |
                  |      process       |
                  |      (httpd)       |
                  |                    |
                  +--------------------+
                            | read()
                            |                                userland
    ------------------------+----------------------------------------
                            |                                kernel
                            v
                       kstrgetmsg()
                            |
                            |-> kssl_handle_record()
                            |
                            `-> struiocopyout()

                            ^                   +--------------------+
                            |                   |                    |
                   tcp->tcp_rq->q_next          |    STREAM head     |
                            ^                   |                    |
                            | putnext(          +--------------------+
                            |   complete
                            |   unprocessed
                            |   SSL record)     +--------------------+
                            |                   |                    |
                    ssl->rec_ass_head           |    KSSL module     |
                         rec_ass_tail           |   (IP perimeter)   |
                            ^                   |                    |
                            |                   |                    |
                       kssl_input()             |                    |
                            ^                   |                    |
                     tcp_kssl_input()           +--------------------+
                           ...
                                                +--------------------+
                                                |     NIC driver     |
                                                +--------------------+
kstrgetmsg() reads the data from the STREAM head queue. It uses the hooks set by KSSL to process the data (demarshal, decrypt and verify the SSL records).
kstrgetmsg() works basically like this:
See CR 6614159 and related escalations for more details.
XXX
    queue                          | processing function
    -------------------------------+-------------------------------------
    TCP segment reassembly queue   | tcp_rput_data()
    SSL fragment queue             | kssl_input()
    stream head queue              | kstrgetmsg(), now kssl_handle_mblk()
XXX
Q: What are so_tail / sd_tail for? I see that they were added with the Greyhound integration but fail to see what exactly they are used for.
A: They are there to reserve space at the end of an outgoing message. When KSSL does HMAC + encrypt on the message the output is larger than the input (+20 bytes for hmac-sha1 and some more for block ciphers like AES). We avoid the cost of allocating a bigger mblk and copying to it later by doing this up front.
Also, see the comment block in $SRC/uts/common/inet/tcp/tcp.c:tcp_accept_finish():
    /*
     * If this is endpoint is handling SSL, then reserve extra
     * offset and space at the end.
     * Also have the stream head allocate SSL3_MAX_RECORD_LEN packets,
     * overriding the previous setting. The extra cost of signing and
     * encrypting multiple MSS-size records (12 of them with Ethernet),
     * instead of a single contiguous one by the stream head
     * largely outweighs the statistical reduction of ACKs, when
     * applicable. The peer will also save on decyption and verification
     * costs.
     */
    if (tcp->tcp_kssl_ctx != NULL) {
        stropt->so_wroff += SSL3_WROFFSET;

        stropt->so_flags |= SO_TAIL;
        stropt->so_tail = SSL3_MAX_TAIL_LEN;

        stropt->so_maxblk = SSL3_MAX_RECORD_LEN;
    }
kssl_build_single_record() assumes that the mblk_t it receives has the extra so_tail/sd_tail bytes allocated at the end.
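To make the size argument concrete, the following sketch computes how many bytes a record grows by and how many TCP segments a full record spans. It is arithmetic only and makes several assumptions: minimal CBC padding as in SSL 3.0 (padding plus a one-byte padding-length field brings the encrypted part up to a multiple of the block size), a 20-byte HMAC-SHA1, a 16-byte AES block, a 16384-byte maximum record payload and a 1460-byte Ethernet MSS. sketch_tail_len() and sketch_segments() are hypothetical helpers; the actual values of SSL3_MAX_TAIL_LEN and SSL3_MAX_RECORD_LEN are not derived here.

```c
#include <assert.h>	/* used by the usage checks */
#include <stddef.h>

/*
 * Extra bytes appended to data_len bytes of payload: the MAC and, for
 * block ciphers, the padding plus the one-byte padding-length field.
 * Pass block_size <= 1 for a stream cipher.
 */
static size_t
sketch_tail_len(size_t data_len, size_t mac_len, size_t block_size)
{
	size_t tail = mac_len;

	if (block_size > 1) {
		size_t unpadded = data_len + mac_len + 1;
		size_t pad = (block_size - unpadded % block_size) % block_size;

		tail += pad + 1;
	}
	return (tail);
}

/* Number of MSS-sized TCP segments needed to carry rec_len bytes. */
static size_t
sketch_segments(size_t rec_len, size_t mss)
{
	return ((rec_len + mss - 1) / mss);
}
```

For 100 bytes of payload with HMAC-SHA1 and AES, sketch_tail_len(100, 20, 16) gives 28, so the encrypted part is 128 bytes; with a stream cipher the tail is just the 20-byte MAC. sketch_segments(16384, 1460) gives 12, which is consistent with the "12 of them with Ethernet" remark in the comment above.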
so_wroff is also used for sendfile - see CR 6568266 (kssl doesn't get along with sendfile) for details.