KSSL record processing

Record processing

Incoming data processing

 TCP packets coming from the network (or, in STREAMS terminology, from the read side of the queue) are translated into mblk_t structures. Every incoming mblk carrying data is passed to src/uts/common/inet/tcp/tcp.c:tcp_rput_data(). The tcp_t data structure, which represents each TCP connection (or, more precisely, each open TCP stream), has a tcp_kssl_ctx member that holds the per-connection kssl context. This is one level below the per-socket kssl context. The kssl-related members of tcp_t are the following:


    /*
     * Kernel SSL session information
     */
    boolean_t      tcp_kssl_pending; /* during an SSL handshake */
    boolean_t      tcp_kssl_inhandshake; /* during SSL handshake */
    kssl_ent_t      tcp_kssl_ent;   /* SSL table entry */
    kssl_ctx_t      tcp_kssl_ctx;   /* SSL session */

tcp_rput_data() is a huge function (it occupies 26 double-printed pages), but the real kssl-related processing is near the end of it. Normally, tcp_rput_data() would pass the data further along via tcp_rcv_enqueue() or putnext(), but SSL packets need to be processed first, so they take a small detour via tcp_kssl_input(), which is called for packets belonging to a connection with a kssl context.

 The per-connection context is created in tcp_kssl_input() via src/uts/common/inet/kssl/ksslapi.c:kssl_init_context():


    /* First time here, allocate the SSL context */
    if (tcp->tcp_kssl_ctx == NULL) {
       ASSERT(tcp->tcp_kssl_pending);

       if (kssl_init_context(tcp->tcp_kssl_ent,
           tcp->tcp_ipha->ipha_dst, tcp->tcp_mss,
           &(tcp->tcp_kssl_ctx)) != KSSL_STS_OK) {
          tcp->tcp_kssl_pending = B_FALSE;
          kssl_release_ent(tcp->tcp_kssl_ent, NULL,
              KSSL_NO_PROXY);
          tcp->tcp_kssl_ent = NULL;
          goto no_can_do;
       }
       tcp->tcp_kssl_inhandshake = B_TRUE;

       /* we won't be needing this one after now */
       kssl_release_ent(tcp->tcp_kssl_ent, NULL, KSSL_NO_PROXY);
       tcp->tcp_kssl_ent = NULL;

    }

 This is a safe thing to do because we only enter tcp_kssl_input() if there is already a per-socket kssl context or if tcp_kssl_pending is set:


    if (tcp->tcp_kssl_pending) {
       tcp_kssl_input(tcp, mp);
    } else {
       tcp_rcv_enqueue(tcp, mp, seg_len);
    }

 A set tcp_kssl_pending flag (B_TRUE; it is a boolean_t, not a pointer) means that an SSL handshake is in progress. This flag is set in src/uts/common/inet/tcp/tcp.c:tcp_conn_create_v4() (or tcp_conn_create_v6(), although kssl is not IPv6 capable) like this:


    /* Inherit the listener's SSL protection state */
    if ((tcp->tcp_kssl_ent = ltcp->tcp_kssl_ent) != NULL) {
       kssl_hold_ent(tcp->tcp_kssl_ent);
       tcp->tcp_kssl_pending = B_TRUE;
    }

 A non-NULL tcp_kssl_ent means that there is a kssl table entry. This member is set via bcopy() in src/uts/common/inet/tcp/tcp.c:tcp_wput_proto(), which processes the bind() request:


    case T_SSL_PROXY_BIND_REQ:   /* an SSL proxy endpoint bind request */
       /*
        * save the kssl_ent_t from the next block, and convert this
        * back to a normal bind_req.
        */
       if (mp->b_cont != NULL) {
          ASSERT(MBLKL(mp->b_cont) >= sizeof (kssl_ent_t));

          if (tcp->tcp_kssl_ent != NULL) {
             kssl_release_ent(tcp->tcp_kssl_ent, NULL,
                 KSSL_NO_PROXY);
             tcp->tcp_kssl_ent = NULL;
          }
          bcopy(mp->b_cont->b_rptr, &tcp->tcp_kssl_ent,
              sizeof (kssl_ent_t));
          kssl_hold_ent(tcp->tcp_kssl_ent);
          freemsg(mp->b_cont);
          mp->b_cont = NULL;
       }
       tprim->type = T_BIND_REQ;

T_SSL_PROXY_BIND_REQ is set by kssl_check_proxy(), which is called when handling the bind() syscall.

 This means that a new connection destined for a socket handled by kssl (the tcp_t structure for a new incoming connection is created via tcp_conn_create_v4()) inherits the kssl flag so that it can be bootstrapped.
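
 The reference counting described so far (bind takes a hold on the table entry, each new connection inherits it with another hold, and the first pass through tcp_kssl_input() drops the connection's hold once the context exists) can be sketched as a small user-space model. All names prefixed with model_ are hypothetical stand-ins, not kernel types or functions, and the sketch assumes context creation always succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kssl_ent_t and tcp_t; NOT the kernel types. */
struct model_ent { int refcnt; };

struct model_conn {
    struct model_ent *ent;  /* models tcp_kssl_ent */
    bool pending;           /* models tcp_kssl_pending */
    bool have_ctx;          /* models tcp_kssl_ctx != NULL */
};

static void ent_hold(struct model_ent *e) { e->refcnt++; }

static void ent_release(struct model_ent *e)
{
    assert(e->refcnt > 0);
    e->refcnt--;
}

/* Models the T_SSL_PROXY_BIND_REQ case: the listener takes a hold. */
static void model_bind(struct model_conn *listener, struct model_ent *e)
{
    if (listener->ent != NULL)
        ent_release(listener->ent);
    listener->ent = e;
    ent_hold(e);
}

/* Models tcp_conn_create_v4(): inherit the listener's entry. */
static void model_conn_create(struct model_conn *c,
    const struct model_conn *listener)
{
    c->ent = listener->ent;
    c->pending = false;
    c->have_ctx = false;
    if (c->ent != NULL) {
        ent_hold(c->ent);
        c->pending = true;
    }
}

/*
 * Models the receive-side dispatch plus the first entry into
 * tcp_kssl_input(). Returns true if the data takes the kssl path.
 * Assumption: context creation never fails in this model.
 */
static bool model_input(struct model_conn *c)
{
    if (!c->pending)
        return false;           /* plain path, i.e. tcp_rcv_enqueue() */
    if (!c->have_ctx) {
        c->have_ctx = true;     /* stands in for kssl_init_context() */
        ent_release(c->ent);    /* entry no longer needed by this conn */
        c->ent = NULL;
    }
    return true;
}
```

 In this model the entry's reference count goes 1 (bind), 2 (new connection), and back to 1 after the connection's first input, while the listener keeps its own hold.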

src/uts/common/inet/tcp/tcp_kssl.c:tcp_kssl_input() is basically a wrapper for src/uts/common/inet/kssl/ksslapi.c:kssl_input(). When an mblk reaches kssl_input() it is queued via the KSSL_ENQUEUE_MP macro. So, first of all, kssl_input() is the point where data are queued. But it is also the place where the SSL handshake is handled. The data carried by the SSL stream (as opposed to SSL control messages) are handled in a different way.

 In sum, there are two main functions that process incoming data:

  • kssl_handle_record()
    • generic SSL data processing, can call kssl_handle_any_record()
  • kssl_handle_any_record()
    • handshake and SSL alerts

 Both functions are similar: each processes an incoming SSL record. The main difference is their purpose: kssl_handle_record() handles mostly application_data records, while kssl_handle_any_record() handles SSL handshake and SSL alert messages.

 Also, kssl_handle_record() processes a chain of mblk structures in a loop, while kssl_handle_any_record() processes only a single mblk.
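
 The split between the two functions hinges on the content type carried in the SSL record header. As a rough illustration, the following user-space sketch parses the 5-byte SSL 3.0/TLS record header (type, version, length) and classifies the record. The content type numbers come from the SSL/TLS record layer specification; the struct and function names are made up for this example and are not the kssl ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Content types defined by the SSL 3.0 / TLS record layer. */
enum {
    REC_CHANGE_CIPHER_SPEC = 20,
    REC_ALERT              = 21,
    REC_HANDSHAKE          = 22,
    REC_APPLICATION_DATA   = 23
};

#define REC_HDR_LEN 5   /* type(1) + version(2) + length(2) */

/* Hypothetical parsed header; not a kssl structure. */
struct rec_hdr {
    uint8_t  type;
    uint8_t  major;
    uint8_t  minor;
    uint16_t length;    /* payload length, big-endian on the wire */
};

/* Returns 0 on success, -1 if fewer than REC_HDR_LEN bytes are available. */
static int parse_rec_hdr(const uint8_t *buf, size_t len, struct rec_hdr *out)
{
    if (len < REC_HDR_LEN)
        return -1;
    out->type = buf[0];
    out->major = buf[1];
    out->minor = buf[2];
    out->length = (uint16_t)((buf[3] << 8) | buf[4]);
    return 0;
}

/* Nonzero for application_data, zero for handshake, alert, etc. */
static int is_application_data(const struct rec_hdr *h)
{
    return h->type == REC_APPLICATION_DATA;
}
```

 A header that is split across mblks would have to be reassembled before such a check; the sketch ignores that and assumes the five bytes are contiguous.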

The hooks

 The application_data processing is done via a mechanism of hooks. The hooks are attached to the STREAM head so that they are as near to the application as possible.

 The reason for having the strsock_kssl_input()/strsock_kssl_output() hooks is resource observability: we want the user process to be charged for the CPU cost of doing SSL crypto. These hooks enable that.

 It might also seem that the strsock_kssl_* hooks were designed to defer SSL processing to the point where data really has to be passed from/to the application, in case the application is not able to consume incoming (or produce outgoing) data for some reason. This is not the case; that is taken care of by the STREAMS framework and TCP itself, i.e. TCP can throttle the flow if there is a slow receiver.

 The hooks are set up in $SRC/uts/common/fs/sockfs/socktpi.c:sotpi_accept() like this:


    /*
     * If the transport sent up an SSL connection context, then attach
     * it the new socket, and set the (sd_wputdatafunc)() and
     * (sd_rputdatafunc)() stream head hooks to intercept and process
     * SSL records.
     */
    if (ctxmp != NULL) {
       /*
        * This kssl_ctx_t is already held for us by the transport.
        * So, we don't need to do a kssl_hold_ctx() here.
        */
       nso->so_kssl_ctx = *((kssl_ctx_t *)ctxmp->b_rptr);
       freemsg(ctxmp);
       mp->b_cont = NULL;
       strsetrwputdatahooks(nvp, strsock_kssl_input,
           strsock_kssl_output);
    }

sotpi_accept() is called when a process issues the accept() syscall. The purpose of the code above is to attach the SSL processing hooks to the STREAM head if kssl was configured. After that, whenever the application sends data to the network, the data are processed by strsock_kssl_output(), which calls kssl_build_record(); that function constructs an SSL record in an mblk and returns it. This mblk is later sent on the wire. The ingress path works similarly with strsock_kssl_input(), which calls kssl_handle_record().

 Let's stop here for a moment. strsock_kssl_input() works a bit differently from strsock_kssl_output(). It calls kssl_handle_record(), which processes data coming from the network. If the data are OK, kssl_handle_record() returns KSSL_CMD_DELIVER_PROXY, which makes strsock_kssl_input() return the mblk processed (decrypted and verified) by kssl_handle_record(), so that it goes to the application. If kssl_handle_record() returns KSSL_CMD_SEND, an SSL record needs to be sent back to the network. This can happen, for example, when SSL record MAC verification fails and an SSL alert message needs to be sent back to the SSL client. In that case strsock_kssl_input() takes the 'out' parameter (an mblk_t) and does a putnext(9F) with it, which sends the mblk to the write side of the STREAM, i.e. to the network.

 For the ingress (read) side, kstrgetmsg() calls strsock_kssl_input(). The relevant part of kstrgetmsg() looks like this:
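
 The two outcomes described above can be modelled in a few lines of user-space C. This is only a sketch of the control flow: the MODEL_CMD_* values, the model_* types and the two sample handlers are invented for illustration, and "sending to the network" is reduced to remembering the message in a struct field instead of calling putnext(9F).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command codes; they mirror KSSL_CMD_* in name only. */
enum model_cmd { MODEL_CMD_DELIVER_PROXY, MODEL_CMD_SEND, MODEL_CMD_NONE };

struct model_msg { const char *payload; };

struct model_stream {
    struct model_msg *sent_to_network;  /* last message pushed downstream */
};

/* Stand-in signature for kssl_handle_record(). */
typedef enum model_cmd (*record_fn)(struct model_msg *in,
    struct model_msg **out);

/* Models strsock_kssl_input(): returns what the application should see. */
static struct model_msg *
model_input_hook(struct model_stream *s, struct model_msg *in, record_fn handle)
{
    struct model_msg *out = NULL;

    switch (handle(in, &out)) {
    case MODEL_CMD_DELIVER_PROXY:
        return out;                 /* verified data goes up to the app */
    case MODEL_CMD_SEND:
        s->sent_to_network = out;   /* e.g. an alert; models putnext() */
        return NULL;                /* nothing for the application */
    default:
        return NULL;
    }
}

/* Sample handlers used to exercise both branches. */
static struct model_msg alert_msg = { "alert: bad_record_mac" };

static enum model_cmd handle_ok(struct model_msg *in, struct model_msg **out)
{
    *out = in;
    return MODEL_CMD_DELIVER_PROXY;
}

static enum model_cmd handle_bad_mac(struct model_msg *in,
    struct model_msg **out)
{
    (void)in;
    *out = &alert_msg;
    return MODEL_CMD_SEND;
}
```

 Calling model_input_hook() with handle_ok hands the message back to the caller, while handle_bad_mac leaves the caller with NULL and the alert recorded on the stream.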


    if ((stp->sd_rputdatafunc != NULL) && (DB_TYPE(bp) == M_DATA) &&
        (!(DB_FLAGS(bp) & DBLK_COOKED))) {

       bp = (stp->sd_rputdatafunc)(
           stp->sd_vnode, bp, NULL,
           NULL, NULL, NULL);

       if (bp == NULL)
          goto retry;

       DB_FLAGS(bp) |= DBLK_COOKED;
    }

 The hooks are unplugged in src/uts/common/fs/sockfs/sockvnops.c:socktpi_close(), which happens when the close() syscall is issued. This is also the place where the per-socket KSSL context (so->so_kssl_ctx) is destroyed by calling kssl_release_ctx().

 All mblks that go into kssl are of type M_DATA. This means that no STREAMS control message should make it into kssl. However, some parts of kssl check for the M_DATA type to be sure.

SSL handshake processing

 XXX

 All the ways to end up in kssl_handle_any_record() are shown in the following figure:


  +-----------------+   +-----------------+       +--------------+
  |                 |   |                 |       |              |
  | tcp_rput_data() +-->+ tcp_kssl_input()+------>+kssl_input()  |-------.
  +-----------------+   +-----------------+       +--------------+        `.
                                                                            \
   sockfs                                                                    \
  +-----------------+   +----------------------+  +---------------------+     :
  |                 |   |                      |  |                     +.    :
  |kstrgetmsg()     +-->+ strsock_kssl_input() +->+kssl_handle_record() | \   :
  +-----------------+   +----------------------+  +---------------------+  :   |
                                                                           |   ;
                                                  +------------------------+ ,'
                                                  |                        +'
                                                  |kssl_handle_any_record()|
                                                  +------------------------+

SSL application_data processing

 XXX

kssl_get_next_record()

The DBLK_COOKED flag

 During testing of CR 6556443 it was discovered that some data can leak again to kssl_handle_any_record() and cause a record size mismatch. The cause was identified as the DBLK_COOKED flag (defined in src/uts/common/sys/stream.h) not being set on the dblk associated with an mblk while SSL records were processed in kssl_handle_record().

 During investigation it was realized that:

  • DBLK_COOKED was introduced via PSARC 2005/625 just for the purpose of kssl proxy but was made a generic STREAMS mechanism
  • there is a curious comment near the beginning of the definition of kssl_handle_record() about flagging mblks as DBLK_COOKED so by adding it to kssl_handle_record() we are actually doing something which was done before only in comments

 The following code in src/uts/common/os/streamio.c:kstrgetmsg() shows how the strsock_kssl_input() hook is called:


    if ((stp->sd_rputdatafunc != NULL) && (DB_TYPE(bp) == M_DATA) &&
        (!(DB_FLAGS(bp) & DBLK_COOKED))) {

       bp = (stp->sd_rputdatafunc)(
           stp->sd_vnode, bp, NULL,
           NULL, NULL, NULL);

       if (bp == NULL)
          goto retry;

       DB_FLAGS(bp) |= DBLK_COOKED;
    }

 Had DBLK_COOKED not been set by kssl_handle_record(), the same mblk could be processed twice.
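
 The guard can be reduced to a tiny user-space model to show why the flag matters: the hook runs only on blocks that are not yet marked, and the block is marked once the hook has processed it. MODEL_COOKED, struct model_blk and the functions below are hypothetical simplifications; in the kernel the flag lives on the dblk and the hook takes more arguments.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_COOKED 0x1    /* hypothetical bit standing in for DBLK_COOKED */

/* Hypothetical block; decrypt_count records how often the hook ran. */
struct model_blk {
    unsigned flags;
    int      decrypt_count;
};

typedef struct model_blk *(*rput_hook)(struct model_blk *);

/* Stand-in for the SSL input hook: "decrypts" by bumping a counter. */
static struct model_blk *model_decrypt_hook(struct model_blk *bp)
{
    bp->decrypt_count++;
    return bp;
}

/* Mirrors the shape of the kstrgetmsg() excerpt. */
static struct model_blk *model_getmsg(struct model_blk *bp, rput_hook hook)
{
    if (hook != NULL && !(bp->flags & MODEL_COOKED)) {
        bp = hook(bp);
        if (bp == NULL)
            return NULL;            /* the kernel code retries here */
        bp->flags |= MODEL_COOKED;  /* never hand this block to the hook again */
    }
    return bp;
}
```

 Passing the same block through model_getmsg() twice runs the hook only once; without the flag test the second pass would try to decrypt already decrypted data.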

Big picture of incoming SSL record processing

 Now that we have all parts in place let's summarize them:

  • incoming mblks are queued in kssl_input(), called from tcp_rput_data() via tcp_kssl_input()
  • if the SSL record type (taken from the SSL header) is other than application_data, the record is processed by kssl_handle_any_record(); otherwise kssl_input() lets it sit in the queue and passes KSSL_CMD_DELIVER_PROXY to tcp_kssl_input()
    • also, only the initial SSL handshake is passed to kssl_handle_any_record() from kssl_input() (this is the reason why kssl_handle_record() can see non-application_data mblks)
  • mblks in the queue are processed by kssl_handle_record(), called from strsock_kssl_input()
    • the reason why SSL alerts etc. are not processed directly in kssl_input() once the SSL connection is established is again resource observability (accounting): after the SSL handshake is done, all SSL processing is accounted to the application process
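
 One way to read the summary above is as a two-input decision made in kssl_input(): whether the initial handshake has finished, and whether the record is application_data. The sketch below encodes that reading in user-space C. It is an interpretation of the bullets, not a transcription of the kernel code, and the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define REC_APPLICATION_DATA 23  /* SSL/TLS record content type */

/* Hypothetical outcomes of the receive-side decision. */
enum model_action {
    MODEL_HANDLE_NOW,   /* kssl_handle_any_record(), in TCP context */
    MODEL_LEAVE_QUEUED  /* left for kssl_handle_record() at the stream head */
};

/*
 * Assumed rule: only non-application_data records seen before the
 * initial handshake completes are handled immediately; everything
 * else stays queued and is charged to the application later.
 */
static enum model_action
model_kssl_input(bool handshake_done, uint8_t content_type)
{
    if (!handshake_done && content_type != REC_APPLICATION_DATA)
        return MODEL_HANDLE_NOW;
    return MODEL_LEAVE_QUEUED;
}
```

 Under this rule an alert (type 21) arriving after the handshake is left queued, which matches the accounting argument in the last bullet.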

STREAM head processing

 Here's how it works:


                                +-------------------+
                                |                   |
                                | process           |
                                | (httpd)           |
                                |                   |
                                +-------------------+
                                 read()
                  ,-------------/                       userland
   ------------,-'--------------------------------------------------
              /                                         kernel
             v
          kstrgetmsg()
            \
             |-> kssl_handle_record()-.
             :                        :
              `-> struiocopyout()     ;
                                /   ,'
                         ,-----' ,-'
                       ,'  ,----'
                      ;   /    +--------------------+
                      |  /     |                    |
         tcp->tcp_rq->q_next   | STREAM head        |
                         ^     |                    |
       putnext(          /     |                    |
          complete      |      +--------------------+
          unprocessed   /
          SSL record)  |
                       /       +--------------------+
            ssl->rec_ass_head  |                    |
                 rec_ass_tail  | KSSL module        |
                       |       | (IP perimeter)     |
                       ;       |                    |
                      /        |                    |
                kssl_input()   |                    |
                      ^        |                    |
            tcp_kssl_input()   +--------------------+

                                        ...

                               +--------------------+
                               | NIC driver         |
                               |                    |
                               |                    |
                               +--------------------+

kstrgetmsg() reads the data from the STREAM head queue. It uses the hooks set by KSSL to process the data (unmarshal, decrypt and verify the SSL records).

kstrgetmsg() works basically like this:

  • first it takes care of locking
  • then it calls the hooks (if there are any and the data have not been processed yet; the DBLK_COOKED flag tells the latter)
  • then it determines what kind of message it has (we are interested in M_DATA in our case)
  • finally it copies the data out to userland via struiocopyout()
    • this is done using the I/O vector, which gives the destination and the number of bytes wanted

 See CR 6614159 and related escalations for more details.

Queues

 XXX


 queue                          | processing function
--------------------------------+-------------------------------------
 TCP segment reassembly queue   | tcp_rput_data()
 SSL fragment queue             | kssl_input()
 stream head queue              | kstrgetmsg(), now kssl_handle_mblk()

Outgoing data processing

 XXX

SO_TAIL flag

 Q: What is so_tail / sd_tail for? I see that it was added with the Greyhound integration but fail to see what exactly it is used for.

 A: They are there to reserve space at the end of an outgoing message. When KSSL does HMAC + encrypt on the message the output is larger than the input (+20 bytes for hmac-sha1 and some more for block ciphers like AES). We avoid the cost of allocating a bigger mblk and copying to it later by doing this up front.

 Also, see the comment block in $SRC/uts/common/inet/tcp/tcp.c:tcp_accept_finish():


    /*
     * If this is endpoint is handling SSL, then reserve extra
     * offset and space at the end.
     * Also have the stream head allocate SSL3_MAX_RECORD_LEN packets,
     * overriding the previous setting. The extra cost of signing and
     * encrypting multiple MSS-size records (12 of them with Ethernet),
     * instead of a single contiguous one by the stream head
     * largely outweighs the statistical reduction of ACKs, when
     * applicable. The peer will also save on decyption and verification
     * costs.
     */
    if (tcp->tcp_kssl_ctx != NULL) {
       stropt->so_wroff += SSL3_WROFFSET;

       stropt->so_flags |= SO_TAIL;
       stropt->so_tail = SSL3_MAX_TAIL_LEN;

       stropt->so_maxblk = SSL3_MAX_RECORD_LEN;
    }
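
 The "12 of them with Ethernet" figure in the comment can be checked with a ceiling division. The sketch below assumes a maximum record payload of 2^14 = 16384 bytes (the SSL 3.0 plaintext fragment limit) and an MSS of 1460 bytes (1500-byte Ethernet MTU minus 40 bytes of IPv4 and TCP headers). Both constants are assumptions made for this example; the actual value of SSL3_MAX_RECORD_LEN in the source is not shown here.

```c
#include <assert.h>

/* Assumed values, see the paragraph above; not taken from kssl headers. */
#define MODEL_MAX_RECORD_LEN 16384u   /* 2^14 bytes */
#define MODEL_ETHERNET_MSS   1460u    /* 1500 - 20 (IPv4) - 20 (TCP) */

/* Number of MSS-sized pieces needed to cover record_len bytes. */
static unsigned segments_per_record(unsigned record_len, unsigned mss)
{
    return (record_len + mss - 1) / mss;    /* ceiling division */
}
```

 With these assumptions, eleven segments carry 16060 bytes, which is short of 16384, so a twelfth is needed.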

kssl_build_single_record() presumes that it gets an mblk_t which has the extra so_tail/sd_tail bytes allocated at the end.
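
 The idea behind the write offset and the tail reservation can be illustrated with a user-space buffer that is allocated once with headroom and tailroom, so that a record header and a MAC can later be added in place without reallocating or copying the payload. struct model_mblk and the model_* functions are hypothetical, and the sizes used (5 bytes of headroom for a record header, 20 bytes of tailroom for an HMAC-SHA1 digest) are illustrative assumptions, not the values of SSL3_WROFFSET or SSL3_MAX_TAIL_LEN.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_WROFF 5    /* assumed headroom: one SSL record header */
#define MODEL_TAIL  20   /* assumed tailroom: one HMAC-SHA1 digest */

/* Hypothetical buffer loosely shaped like an mblk's pointers. */
struct model_mblk {
    unsigned char *base;   /* start of allocation */
    unsigned char *rptr;   /* first valid byte */
    unsigned char *wptr;   /* one past the last valid byte */
    unsigned char *lim;    /* end of allocation */
};

/* Allocate once, leaving room in front of and behind the payload. */
static int model_alloc(struct model_mblk *m, const void *data, size_t len)
{
    m->base = malloc(MODEL_WROFF + len + MODEL_TAIL);
    if (m->base == NULL)
        return -1;
    m->rptr = m->base + MODEL_WROFF;
    memcpy(m->rptr, data, len);
    m->wptr = m->rptr + len;
    m->lim = m->wptr + MODEL_TAIL;
    return 0;
}

/* Append a trailer (e.g. a MAC) in place; fails if the tail is too small. */
static int model_append_tail(struct model_mblk *m, const void *tail, size_t len)
{
    if ((size_t)(m->lim - m->wptr) < len)
        return -1;
    memcpy(m->wptr, tail, len);
    m->wptr += len;
    return 0;
}

/* Prepend a header in place using the reserved write offset. */
static int model_prepend_hdr(struct model_mblk *m, const void *hdr, size_t len)
{
    if ((size_t)(m->rptr - m->base) < len)
        return -1;
    m->rptr -= len;
    memcpy(m->rptr, hdr, len);
    return 0;
}
```

 After model_alloc(), both the header and the trailer fit into the original allocation; a further append fails instead of overrunning the buffer, which is the condition a real implementation would have to handle by allocating a new block.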

so_wroff is also used for sendfile - see CR 6568266 (kssl doesn't get along with sendfile) for details.

Created by admin on 2009/10/26 12:15
Last modified by admin on 2009/10/26 12:15
