Subliminal Messaging in Codecs (SubliME)

OpenFortress.nl

Haarlebrink 5 Enschede Overijssel 7544 WP The Netherlands rick@openfortress.nl

The backbone for telephony consists of a digital network that is chiefly used for audio using the G.711 codec, which is also widely supported in VoIP telephony. This specification defines Subliminal Messaging as a general facility for opportunistic data exchange in the noise levels of audio codecs, with profiles for the G.711, G.722 and clearmode codecs.

Telephony continues to be an area of communication that is mostly separate from the Internet. The addition of data transport, even if just opportunistic in nature, can resolve that. Telephony was traditionally an analog sound carrier, but has evolved into a digital backbone dedicated to the G.711 codec. Since this is digital, any content would be replicated accurately, giving the opportunity to carry data.

Codecs make use of sound properties to transmit compressed forms of the original audio signal, but may still have bits that carry mostly noise. An extended interpretation of those noise bits may be used to pass digital data. This would be opportunistic, and care must be taken to validate any such data to distinguish it from noise in unsupporting devices. The G.711 codec passes floating-point numbers, and the volume of the mantissa bits is not always the same. A cut-off level may be defined, and bits under that level can be replaced with data. Such noise is audible, roughly to the level of a long-distance analog call, to yield bitrates around 28800 bits/second with peaks up to 32000 bits/second. For the higher quality G.722 codec, the lower 2 bits may be used for data to yield a constant 16000 bits/second data rate.

The design of Subliminal Messaging is based on synchronous HDLC frames, initially by stealing noise bearer bits from a codec and possibly later by stealing the complete byte flow of a codec. HDLC is used to recognise support beyond reasonable doubt. The address field in HDLC is used to pass many kinds of data to a variety of services. It is possible to transmit data in one direction only, such as in the early media of a remote ringing cadence or alongside a voice menu prompt. Among the facilities that can run over HDLC is encryption, for which keys may be derived from common Internet security mechanisms, and which can then be applied as encryption and/or signatures for voice and/or data streams.
Before keying, encryption is absent and the Frame Check Sequence in HDLC uses a fixed key, reducing from origin integrity checking to mere transport integrity checking. Along a telephony path, codecs may be translated. A supportive translator may recognise Subliminal Messaging, take out the HDLC data and insert it into the translation. This generally destroys the ability to take over the entire codec byte stream, and there may be a need to use HDLC flow control on the translator. But when HDLC content is otherwise unmodified, cryptographic assurances may pass along, and provide end-to-end security for the data portions (but not the sound portions) of a call. TODO: HDLC header encryption has impact. This approach for inclusion of data in a codec may be referred to as SubliMe, and it may be pronounced as "sublime". This includes the future extensions that insert the bit streams into different codecs.

At the start of a connection, the channel does not assume that HDLC frames are being sent. For that to be considered, the first pattern to recognise is BREAK or 1111111, so seven or more consecutive 1 bits. It should be followed by a FLAG 01111110 marker to start the first HDLC frame. After the last FLAG, a BREAK can be sent without FLAG to return to the initial situation. The switch from bit-stealing mode to byte-stealing mode or back is determined by HDLC commands, as specified below. The same is true for cryptography. Neither of these changes takes immediate effect; instead, they are deferred to the next BREAK. After that has been sent, the channel makes the switch. A new BREAK should then be sent, which may be followed by a FLAG and the first HDLC frame in the new channel mode.

HDLC frames are inserted into codecs, initially using bit-stealing mode, but with the option to take over the complete codec byte stream in what will be called byte-stealing mode. The interpretation of HDLC frames is the same in both modes, but codecs may represent them completely differently.

Every HDLC frame is surrounded by FLAG markers. Two consecutive HDLC frames may share one FLAG marker 01111110 as their separator. It is also permitted to have no HDLC frame between two FLAG markers, so the FLAG can be used as filling when no HDLC frames are ready to be sent.

Before the first FLAG is accepted, the channel must send a BREAK marker 1111111. After this, the first FLAG marker may follow. After the last FLAG marker, another BREAK can be sent to return to the state where HDLC is essentially not transmitting. When new HDLC frames must be sent, another BREAK and FLAG can be sent. When BREAK occurs in the middle of an HDLC frame, it is silently ignored.

Codecs that inject data bits for SubliMe into their media flow are said to work in bit-stealing mode. Encoding passes in a flow of bits, possibly with variable timing. The decoder produces the same flow of bits. The bits are not NRZI-encoded, because the codec does not constitute the actual transmission channel. The codec may incorporate other encodings, though. These bit flows are usually organised as synchronous HDLC, with bit-stuffing to turn data 11111 into 111110, thereby making 01111110 usable as a FLAG and 1111111... as a BREAK marker.
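The bit-stuffing rule can be illustrated with a minimal sketch. It represents the bit flow as a Python string of "0" and "1" characters purely for readability; the function names `stuff`, `unstuff` and `frame_bits` are hypothetical and not part of this specification.

```python
FLAG = "01111110"  # frame delimiter; six 1 bits never occur in stuffed data


def stuff(bits: str) -> str:
    """Insert a 0 bit after every run of five consecutive 1 bits."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            out.append("0")  # the stuffed bit
            run = 0
    return "".join(out)


def unstuff(bits: str) -> str:
    """Remove the 0 bit that follows every run of five consecutive 1 bits."""
    out, run, i = [], 0, 0
    while i < len(bits):
        b = bits[i]
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            i += 1  # move to the bit that must be a stuffed 0
            if i < len(bits) and bits[i] != "0":
                raise ValueError("six consecutive 1 bits: FLAG or BREAK, not data")
            run = 0
        i += 1
    return "".join(out)


def frame_bits(payload_bits: str) -> str:
    """Surround stuffed payload bits with FLAG markers."""
    return FLAG + stuff(payload_bits) + FLAG
```

Because stuffed data never holds more than five consecutive 1 bits, neither the FLAG nor the BREAK pattern can appear inside a frame by accident.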

It is possible to completely switch the byte stream of a codec to an HDLC byte sequence. Any desire to pass audio must now be inserted via an HDLC service. This allows use of the full bandwidth for data, passing audio only at a lower priority or in more compressed forms. The environment that negotiated the original codec is not made aware of this change; messaging with SubliMe is subliminal. In byte-stealing mode, HDLC is organised as an asynchronous byte flow. The FLAG is the fixed byte 0x7e, and any occurrence of that byte in the data flow is escaped, as is the escape character and anything else deemed useful. Escaping inserts the byte value 0x7d and then passes the escaped byte XOR 0x40. Data bytes 0x7e are passed as 0x7d 0x3e and the data bytes 0x7d are passed as 0x7d 0x3d.
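As a minimal sketch of the escaping rule in byte-stealing mode, the following functions escape only the FLAG and escape bytes. The text allows further bytes to be escaped when "deemed useful"; the optional `extra` parameter models that, and all names here are hypothetical.

```python
FLAG = 0x7E
ESC = 0x7D
XOR_MASK = 0x40  # as stated in the text: 0x7e -> 0x7d 0x3e, 0x7d -> 0x7d 0x3d


def escape(payload: bytes, extra: bytes = b"") -> bytes:
    """Escape FLAG, ESC and any additional bytes listed in `extra`."""
    special = {FLAG, ESC, *extra}
    out = bytearray()
    for b in payload:
        if b in special:
            out += bytes([ESC, b ^ XOR_MASK])
        else:
            out.append(b)
    return bytes(out)


def unescape(data: bytes) -> bytes:
    """Reverse `escape`; the input must not contain a FLAG byte."""
    out = bytearray()
    it = iter(data)
    for b in it:
        if b == ESC:
            try:
                b = next(it) ^ XOR_MASK
            except StopIteration:
                raise ValueError("dangling escape byte") from None
        out.append(b)
    return bytes(out)
```

A frame would then be sent as `bytes([FLAG]) + escape(body) + bytes([FLAG])`, with the receiver splitting on 0x7e before calling `unescape`.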

HDLC frames send a one-byte Command to a one-byte Address. Depending on the Command field, there may be a use for a non-empty Information field. The frame ends with a Check. Since it is transmitted between FLAG sequences, the context provides framing and there is no need for an explicit Length field.

Addresses are used by SubliMe to determine which service shall process the HDLC frame. Addresses 0x00 through 0x7f are standardised in a register, and 0x80 through 0xfe can be called for with a service-specific UUID code. The other end rejects unknown addresses. Address 0xff is generally used as "you-know-who" address for the remote end, but is more generally considered a broadcast address. Address 0x00 is used to communicate "meta" aspects of SubliMe.

Commands are standardised, and classify as I-frames that pass information with confirmation of reception, S-frames that pass supervisory information for confirmations and flow control, and U-frames that pass various kinds of unconfirmed information.

Information may be empty, and many HDLC frames require that. It is the carrier for user data, in a form that should be defined for the Address when registered, or for a UUID used to dynamically allocate an Address. There is a maximum size for the Information field, which means that fragmentation may be needed when passing user data in HDLC frames.

Check (formally the Frame Check Sequence) validates whether the frame was passed without errors. The size of this field may be 16 bits or 32 bits. SubliMe uses 32 bits for I-frames and TODO: 16 bits for other frames. The number of bits in the Check field imposes a maximum size on the Information field.
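The frame layout and the address ranges can be sketched as follows. The sketch takes a frame body that has already been delimited by FLAG markers and unstuffed or unescaped. Because the size of the Check field is still marked TODO above, the caller passes it in explicitly; the names `Frame`, `split_frame` and `classify_address` are hypothetical.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    address: int
    command: int
    information: bytes
    check: bytes


def split_frame(body: bytes, check_len: int) -> Frame:
    """Split a delimited frame body into Address, Command, Information, Check."""
    if check_len not in (2, 4):
        raise ValueError("Check is 16 or 32 bits")
    if len(body) < 2 + check_len:
        raise ValueError("frame too short")
    return Frame(body[0], body[1], body[2:-check_len], body[-check_len:])


def classify_address(addr: int) -> str:
    """Name the address range that a one-byte Address falls in."""
    if not 0 <= addr <= 0xFF:
        raise ValueError("Address is one byte")
    if addr == 0x00:
        return "meta"        # SubliMe's own control service
    if addr <= 0x7F:
        return "registered"  # standardised in a register
    if addr <= 0xFE:
        return "dynamic"     # allocated through a service UUID
    return "broadcast"       # 0xff, the "you-know-who" address
```

No Length field is needed in `split_frame`: the Information field is simply whatever remains between the two header bytes and the trailing Check.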

The sender and receiver handle a window of up to 8 HDLC frames per Address, and indicate the progress on each end in various header fields. These window updates work as acknowledgements, and are sent in the header of an I-frame or S-frame. They are not included in U-frames.

After connecting to a SubliMe service Address in async/balanced mode, a receiver can actively send feedback. This is done with S-frames named RR (receiver ready), RNR (receiver not ready), REJ (reject) to request resends from the given window counter or SREJ (selective reject) to request that a given frame is sent once more.

Finally, an I-frame can pass a Poll flag to explicitly ask for windowing progress. This can be useful to free window buffers in the sender. This works like passing a token, and the sender must not send another Poll flag unless it is a resend of a prior attempt. A timeout without response, or an unexpected response, raises an exceptional condition. Normally, the receiver will respond at the earliest possibility with a window counter and pass back the token by setting the Final flag in it.

Confirmation of arrival is only possible when a reverse channel of HDLC frames exists. This is not always the case, and it is still possible to do meaningful things as long as (1) it is possible to resend the data from the start because it is idempotent, and (2) there is no Poll bit anywhere. Some situations make it attractive to send things blindly, for instance opportunistic attempts to share data.

The Check field in HDLC can be used as a 16-bit or 32-bit checksum. It normally has a specific CRC attached, but cryptography will use this field differently, and we also freely choose CRC polynomials. HDLC framing reveals its size, and the smallest frames carry no information, and can be well-protected with CRC-16. Other frames carry larger Information fields, and will be protected in 32 bits. TODO: What is the theoretic optimum Hamming distance for a hash? I think it would be 2^-31 for a 32-bit hash value. Perfect Error Correction, https://www.callibrity.com/blog/coding-theory-2-of-3 TODO: 16-bit CRC, 32-bit CRC, 32-bit signature, what/when? Keep in mind that a translator may want to send RR/RNR. Keep in mind that U-frames and S-frames cannot use counter mode.
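The split between a 16-bit and a 32-bit Check could look like the sketch below. The polynomials are assumptions for illustration only, since this specification has not fixed them: the 16-bit case uses CRC-16/X-25 (the customary HDLC FCS) and the 32-bit case uses the CRC-32 from Python's `zlib` module. The function names are hypothetical, and the rule "empty Information gets 16 bits" follows the paragraph above rather than a settled decision.

```python
import zlib
from typing import Tuple


def crc16_x25(data: bytes) -> int:
    """CRC-16/X-25: reflected polynomial 0x8408, initial value and final XOR 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF


def frame_check(header: bytes, information: bytes) -> Tuple[int, int]:
    """Return (number of Check bits, Check value) for one frame.

    Assumption: frames with an empty Information field get 16 bits,
    all other frames get 32 bits.
    """
    data = header + information
    if information:
        return 32, zlib.crc32(data) & 0xFFFFFFFF
    return 16, crc16_x25(data)
```

Once keys are in place, the cryptographic mode would replace these CRC values with signatures of the same width.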

Aside from its tasks relating to data link management, HDLC carries user data. In doing so, it aims to respect the chunk sizes in which this occurs. The maximum size of the Information field may call for fragmentation of a chunk of user data over multiple HDLC frames.

There may be an outer codec and/or an inner codec. Outer means that it is transported outside of HDLC (more accurately, that it contains the HDLC bits) and inner means that the codec is transported in HDLC frames. There is a choice between outer and inner cryptography, again relative to the HDLC frames. Outer cryptography covers the entire byte stream and protects both the outer codec and the HDLC frames. Inner cryptography covers the contents of HDLC frames, including any inner codecs but excluding the outer codec, if any. In bit-stealing mode, these differences can be meaningful. In byte-stealing mode, the difference is less pronounced, although HDLC headers are not fully protected by inner cryptography but they are with outer cryptography. But the more dramatic point is that inner cryptography does not protect an outer codec.

HDLC is expressive and, like most uses, SubliMe uses only a subset. Any commands not defined are answered with FRMR (flag W). In some cases where this benefits protocol understanding, this is detailed below.

XID frames are exchanged for service negotiation at the given Address. The remote end should respond with an XID frame before reception can be assumed. An empty XID response indicates explicit and final rejection, an XID message with the same or no UUID indicates acceptance, an XID message with a higher UUID can be ignored, and an XID message with a lower UUID indicates a current override with implicit temporary rejection.

XID frames use a format local to SubliMe, consisting of the following byte sequence:

- One byte with the following general flag bits, low to high:
    - 2 bits with the XID sender's role (01=client, 10=server, 11=either, 00=peer)
    - 1 bit to announce a UUID for the service
    - 1 bit to announce a version field for this service
    - 1 bit to announce an MRU different from 1500
    - 2 bits reserved (send zero, do not interpret)
    - 1 bit fixated to 0 to signal a local XID format
- One byte with service-specific flags.
- The service's UUID in 16 bytes. This is used for Addresses 0x00 and 0x80..0xfe.
- One byte with the major version in the high nibble and the minor version in the low nibble. These are protocol versions, not software versions, so they should be slow-changing. The major version signifies incompatibility and the minor version is backward compatible. Software may support multiple versions, as long as it avoids degradation attacks to an insecure version.
- The MRU as an unsigned 32-bit integer in network byte order, with 2 top bits whose values 0..3 indicate a maximum HDLC Information field size of 16, 64, 256 or 994 bytes. The MRU value 0 indicates a continuous flow, which the other side would replicate.
- The service may define its own XID bytes, some of which only when announced in the Service Flags, which are appended from here.

Once an XID is sent and not explicitly or implicitly rejected, it should be noted on the other end.
This may involve the allocation of a dynamic Address, the indication that the Address can be used for connections to its service, and the processing of any flags and settings against its local structures. This rule breaks any desire to keep passing XID messages between peers. An XID may offer facilities that are not useful to the peer, and that may even be incompatible with its XID. This generally means that both sides tone down their expectations of the service. This does not need to be relayed; the design of XID facilities is such that each end derives the conjunction on its local end.
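The XID byte sequence can be sketched as a packing function. Several points are assumptions that the text does not settle: that the UUID, version and MRU fields are present only when their announcement bit is set, that the UUID is sent in its usual big-endian byte order, and that the Information size code shares the 32-bit word with a 30-bit MRU. The name `pack_xid` and its parameters are hypothetical.

```python
import struct
import uuid
from typing import Optional, Tuple

ROLES = {"peer": 0b00, "client": 0b01, "server": 0b10, "either": 0b11}
INFO_SIZES = (16, 64, 256, 994)  # selected by the top 2 bits of the MRU word


def pack_xid(role: str,
             service_flags: int = 0,
             service_uuid: Optional[uuid.UUID] = None,
             version: Optional[Tuple[int, int]] = None,
             mru: Optional[int] = None,
             info_size: int = 994) -> bytes:
    """Build the Information field of a SubliMe XID frame (sketch)."""
    if not 0 <= service_flags <= 0xFF:
        raise ValueError("service flags are one byte")
    general = ROLES[role]                  # bits 0..1: sender's role
    body = bytearray()
    if service_uuid is not None:
        general |= 1 << 2                  # bit 2: UUID announced
        body += service_uuid.bytes         # 16 bytes
    if version is not None:
        major, minor = version
        if not (0 <= major <= 15 and 0 <= minor <= 15):
            raise ValueError("version nibbles must be 0..15")
        general |= 1 << 3                  # bit 3: version announced
        body.append((major << 4) | minor)
    if mru is not None:
        if not 0 <= mru < 1 << 30:
            raise ValueError("MRU must fit in 30 bits")
        general |= 1 << 4                  # bit 4: MRU differs from 1500
        word = (INFO_SIZES.index(info_size) << 30) | mru
        body += struct.pack("!I", word)    # network byte order
    # bits 5..6 stay reserved (zero); bit 7 stays 0 for the local XID format
    return bytes([general, service_flags]) + bytes(body)
```

For example, `pack_xid("server", mru=0, info_size=256)` announces a continuous flow with Information fields of at most 256 bytes.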

Since this is an opportunistic protocol, it must first be discovered. This is first done by looking for HDLC in bit-stealing mode for some codecs and/or byte-stealing mode for others. First, a BREAK must be detected, then a FLAG and an HDLC frame followed by another FLAG and possibly more frames. Each HDLC frame must have the right Check values. When this fails at any point, it needs to restart. Once established, such restarts may not be necessary.

As soon as HDLC framing comes up, an XID command must be sent to Address 0x00 to indicate to the peer that SubliMe is supported. The format adaptations for this service follow. It is possible to continue, but only after the peer sends its own XID to Address 0x00 can there be certainty through acknowledged communication.

The XID message sent to Address 0x00 is more specialised. The UUID and the Service Version must be inserted in any message sent to Address 0x00. The UUID must be set to 45e2b4e8-018b-4efd-8f05-137317273293 and the Service Version for this specification is 0x00. The Service Flags from low to high are:

- 1 bit to indicate support for outer cryptography; codec translators reset this bit
- 1 bit to indicate support for inner cryptography
- 1 bit to insist on outer cryptography; codec translators cannot be in the codec path
- 1 bit to request byte-stealing mode; codec translators may support this but may need active flow control with RR and RNR
- 4 bits reserved (send zero, do not interpret)

TODO: The Service Parameters specify key agreement / encryption if inner and/or outer encryption is supported.
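The Service Flags for Address 0x00 map naturally onto a flag type. The sketch below also shows one possible reading of the codec translator rules from the list above: a translator clears the outer-cryptography bit, and declines to relay an XID that insists on outer cryptography. The names `MetaFlags` and `translate_flags` are hypothetical, and returning `None` for "do not relay" is an illustrative choice.

```python
from enum import IntFlag
from typing import Optional


class MetaFlags(IntFlag):
    OUTER_CRYPTO = 0x01   # support for outer cryptography
    INNER_CRYPTO = 0x02   # support for inner cryptography
    INSIST_OUTER = 0x04   # insist on outer cryptography
    BYTE_STEALING = 0x08  # request byte-stealing mode


RESERVED_MASK = 0xF0      # send zero, do not interpret


def parse_flags(service_flags: int) -> MetaFlags:
    """Decode the Service Flags byte, ignoring the reserved bits."""
    return MetaFlags(service_flags & 0xFF & ~RESERVED_MASK)


def translate_flags(service_flags: int) -> Optional[int]:
    """Flags a codec translator would relay, or None when it cannot be in the path."""
    flags = int(parse_flags(service_flags))
    if flags & MetaFlags.INSIST_OUTER:
        return None
    return flags & ~int(MetaFlags.OUTER_CRYPTO)
```

An endpoint that sees the outer-cryptography bit cleared in the peer's XID can thus infer that outer cryptography is unavailable on this path, without learning whether a translator or the peer itself cleared it.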

Even when services use protocols that can pass data bidirectionally, either as a part of their encoding (such as in Kerberos) or as a possibly useful facility (such as in LDAP), it may still be impossible to use both roles on any given address for a remote peer. Through XID interactions, this may be established to avoid application-level error messages.

Connections to a service are always made with SABM. This connects two equal peers, without predetermined client or server roles. This means that each side can start sending I-frames if the command succeeds. If the service is available, the response to SABM will be UA, even if the connection is already open. If the service is unknown, the response will be FRMR (flag W); if the service is somehow unusable, the response will be DM.

Connections can be closed from either end using DISC. The peer should respond with UA to confirm this, even for a connection that was already closed. Until the UA is received, the peer may still submit commands like I and UP to the connection. The peer will send FRMR (flag W) if no service is defined at the targeted Address, not even due to dynamic Address allocation.

The SABM and DISC commands are idempotent; that is, they will be quietly accepted when they confirm the active state. This may happen when both ends take the same initiative at the same time. Connections to Address 0x00 are special, as discussed below.

Connection attempts with the other mode setting commands SM, SNRM, SARM, SNRME, SARME and SABME are currently answered with FRMR (flag W) because they are not supported in this version. The same is done for the RD command, which is unused because both ends may call DISC.
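The responses to connection commands can be summarised in a small sketch. It models commands and responses as strings and tracks only whether a connection is open; the class name `Endpoint` and its `services` mapping (Address to "usable" flag) are hypothetical, and the special handling of Address 0x00 is left out.

```python
from typing import Dict, Set

UNSUPPORTED = {"SM", "SNRM", "SARM", "SNRME", "SARME", "SABME", "RD"}


class Endpoint:
    """Tracks open connections and answers SABM and DISC (sketch)."""

    def __init__(self, services: Dict[int, bool]) -> None:
        self.services = dict(services)  # Address -> service currently usable?
        self.open: Set[int] = set()

    def handle(self, address: int, command: str) -> str:
        if command in UNSUPPORTED:
            return "FRMR(W)"            # not supported in this version
        if command not in ("SABM", "DISC"):
            raise ValueError("only connection commands are sketched here")
        if address not in self.services:
            return "FRMR(W)"            # no service defined at this Address
        if command == "SABM":
            if not self.services[address]:
                return "DM"             # service known but unusable
            self.open.add(address)      # idempotent: fine if already open
            return "UA"
        self.open.discard(address)      # idempotent: fine if already closed
        return "UA"
```

Because both commands are idempotent, two peers that send SABM to each other at the same moment simply both receive UA and end up with one open connection.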

To switch from bit-stealing mode to byte-stealing mode, send SABM to Address 0x00; to switch back, send DISC to Address 0x00. When the switch is confirmed with UA, the original requester sends a BREAK after its final FLAG, then makes the switch to the other mode, sends another BREAK and a FLAG, after which it continues; once the other side adapts it will go through the same sequence. Codecs can help by avoiding spurious FLAG signaling in the intermediate time, as they might offer to do when HDLC is off. Codec translators may not be able to switch to byte-stealing mode, or perhaps they are unwilling to engage in bit-stealing mode when SubliMe is detected. They may send appropriate commands to signal this while still in the plaintext phase. Note that BREAK and FLAG signaling is also present in an encrypted flow. As a result, the switch can be made under encryption. This would only work when outer crypto is available, so when no codec translator has interfered. Outer crypto has no use in byte-stealing mode.

Most of the data passed through as HDLC frames arrives as network frames over a protocol such as UDP, TCP or SCTP. Sometimes, there is a notion of message to these communications. This notion is preserved when passing. HDLC frames have a maximum length, based on a good Hamming distance for the Check field. Insofar as transmission state syncing is concerned, each communication peer has a "Poll token" that it may transmit once (and perhaps retry after a timeout) in an I-frame until it receives it back as part of an S-frame or U-frame. The windowing feedback is transported in the N(R) bitfield of an S-frame or, piggybacked, of an I-frame.

As a general rule, an HDLC frame with a maximum-sized Information field indicates that another frame will continue the user data. The last frame always carries less than the maximum size in the Information field. When the user data happens to be a multiple of the maximum size, then a frame with an empty Information field is added.

In a continuous flow, the user data may be shipped without delay to the service indicated in its address. In some cases, it may be necessary to collect data up to the maximum permissible by the MRU for a channel. When an HDLC I-frame delivers an Information field with less than the maximum size, then this is considered to end the frame before the MRU is reached, and the collected bytes with the new extension will be relayed to the connected service as indicated by the address. An I-frame with empty Information field is one possible way of sending less than the maximum size, and so it triggers transmission of the collected bytes to the service. TODO:CONCURRENCY_AS_AN_OPTION??? Finally, when a sequence causes a change of address, then too will the user data be transmitted.
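The fragmentation rule can be sketched in a few lines. The functions below work on Information fields only and ignore the MRU limit, address changes and continuous flows; `fragment` and `reassemble` are hypothetical names, and `max_info` stands for the negotiated maximum Information field size.

```python
from typing import Iterable, List


def fragment(chunk: bytes, max_info: int) -> List[bytes]:
    """Split one chunk of user data into Information fields.

    Every field except the last is exactly max_info bytes; the last is
    always shorter, so an empty field is added when the chunk length is
    a multiple of max_info.
    """
    fields = [chunk[i:i + max_info] for i in range(0, len(chunk), max_info)]
    if not fields or len(fields[-1]) == max_info:
        fields.append(b"")
    return fields


def reassemble(fields: Iterable[bytes], max_info: int) -> List[bytes]:
    """Collect Information fields back into chunks of user data."""
    chunks, current = [], bytearray()
    for field in fields:
        current += field
        if len(field) < max_info:  # a short field ends the chunk
            chunks.append(bytes(current))
            current = bytearray()
    return chunks
```

For instance, with `max_info=4` the chunk `b"abcdefgh"` becomes `[b"abcd", b"efgh", b""]`, where the trailing empty field tells the receiver that the chunk is complete.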

As long as traffic is bidirectional, there are many opportunities for feedback from the receiver to the sender, to acknowledge progress of the window pointer up to N(R). This allows the sending window pointer N(S) to move to this later position, thus freeing up sending buffers. Positive acknowledgement is given with RR, rejection of anything beyond a window pointer is given with REJ. For individual missing frames, selective rejection with SREJ may be sent, though that would usually be done as soon as decoding of an HDLC frame fails.

The standard behaviour of the TEST command in HDLC is to send back a TEST with the same Information field. This makes it a suitable candidate to see if the channel is mangled (and needs escapes on certain codes). When mangling occurs, the Check will not match the Information field and the response is not received. This means that a few experiments can be sent, and the responses indicate bytes that do not require escaping. To test the codec, the TEST command is sent to Address 0x00. This should be done in byte-stealing mode to obtain meaningful results. Note that the TEST command may also be defined on other addresses, which may relay it in a service-specific manner.

UI frames are sent to a different kind of address; the field is the same HDLC field as before, but it is interpreted to indicate an RTP payload type. This allows the transmission of codec data, inside of HDLC, which may be considered an "inner codec", regardless of the "outer codec" from which bits or bytes are stolen. An SDP message from the remote peer provides the peer's acceptable RTP payload types. These can be used as the address for a UI frame. In byte-stealing mode, audio payloads can be played to the user; in bit-stealing mode, audio payloads will be TODO:REALLY-DROP-THE-SECURE-ALTERNATIVE??? ignored to avoid confusion and perhaps even abuse. Non-audio payloads such as DTMF events will be processed in either mode. TODO: Insert some counter value in the beginning? N(S) and/or N(R) ? TODO: Given a continuous flow for the inner codec, the best timing to send a UI is such that it arrives between 25% and 75% of the time to process it, especially when the size is roughly the same as the one received. This soft guidance is intended to TODO:WHAT?UNNUMBERED!?!

The window size in each direction is 8 entries, since the N(S) and N(R) counters are 3 bits each, per direction. The percentage 12.5% * ( ( N(S) - N(R) ) mod 8 ) shows how much of the window arrived but was not acknowledged yet. Some care for the window timing in terms of this percentage of the window is useful to keep data flowing:

- At any time, when a full information frame is available, it should be sent as an I-frame. HDLC framing then helps to simplify the processing of user data messages, and this is broken by combining those into one HDLC frame, except for continuous flows.
- Between 25% and 50% of the window received, the timing is suggesting to send back an I-frame for continuous flows. This results in a reply rate that is a quarter up to half of the sending rate, suggesting that the interaction may slow down if not hastened by other factors.
- At 50% of the window received, the receiver may want to offload the sender, and send feedback with RR. This would only be used when no other interaction has taken care of it yet.
- With 75% of the window sent, the sender may get a little agitated, and request feedback with UP. This can help to receive active feedback before the window is full.
- At any time that a frame cannot be recognised, the recipient may want to send SREJ. When the HDLC frame is encrypted, it will have to guess that the Address matches the last frame that was properly received, and the N(R) one higher than that frame. Since HDLC frames do not change order, this can still be sufficient information in situations where an Address change occurred, and redirect the resend to the address following it. (That is not standard in HDLC, but SubliMe happens to cover multiple addresses in one endpoint, even with shared encryption.)
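The window percentage and the timing guidance above can be expressed as a short sketch. The thresholds are those named in the list; the function names and the returned strings are hypothetical, and SREJ handling is left out because it depends on frame recognition rather than on window fill.

```python
def window_fill(ns: int, nr: int) -> float:
    """Percentage of the 8-entry window sent but not yet acknowledged."""
    return 12.5 * ((ns - nr) % 8)


def suggestion(ns: int, nr: int, continuous: bool = False) -> str:
    """Map the window fill onto the soft timing guidance (sketch)."""
    fill = window_fill(ns, nr)
    if fill >= 75:
        return "sender: request feedback with UP"
    if fill >= 50:
        return "receiver: send RR"
    if fill >= 25 and continuous:
        return "receiver: reply with an I-frame"
    return "no action"
```

The modulo keeps the result correct when the 3-bit counters wrap around: with N(S)=1 and N(R)=6, three frames are outstanding and `window_fill(1, 6)` yields 37.5.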

The cryptographic framework adds authenticated encryption to HDLC. This is used for end-to-end security, involving assurance of the remote peer and integrity of its data.

New connections, as well as ones that need to start from scratch, fall back to null cryptography. This does not encrypt the data and uses a null key for signing. This cryptographic mode is defined in a companion specification. Since the null key is public, anyone can pose as a legitimate peer. There can be no security guarantees from null cryptography, but it does help to protect HDLC frames in transit against transport issues such as noise and bursty lines. Null cryptography has one special use-case, namely between a codec translator and an endpoint. When HDLC frames pass from one codec into another, the available bandwidth may be lower. For this reason, the codec translator may inject RNR and RR indications, sent to Address 0x00 to cover the entire flow, with N(R) and Final set to 0. This frame is always signed with null cryptography, even when encryption is active between the endpoints. This implies a denial-of-service risk, but that is always a problem with an opportunistically probing protocol.

When codec translation is required on the communication path, then not all security properties can be resolved. Note however that switching between A-law and μ-law can be handled without this damage, by taking mangling into account.

Outer cryptography involves encryption and signing of the entire codec. The signature is passed in HDLC frames, both in bit-stealing mode and byte-stealing mode. Signatures are 32 bits long, so they do not have cryptographic assurance on their own, but chaining of I-frames works like a thread that increases assurance up to 128 bits in the last frame. It is possible to send a few extra frames at the end of a connection to reach good security for the whole connection. TODO:MOVE_SOME_DOWN_TO_CONNECTIONS.

Inner cryptography involves encryption and signing for HDLC frames. This is always possible, even with codec translation in place. It does not protect the codec into which bit-stealing mode injects, however, and that should be signaled to the user. TODO: Using UI frames, it is possible to send inner sound as part of HDLC, so it is secure.

The choice between inner and outer cryptography is made while bootstrapping SubliMe with XID to Address 0x00. This is independently done for both directions. The flag to insist on outer cryptography always causes outer crypto, even if this breaks a codec translator, because anything else would be supportive of a downgrade attack. When both sides agree to byte-stealing mode, that switch is made first. Codec translators should not pass the XID to Address 0x00 when it insists on outer crypto and either it does not ask for byte-stealing mode or the codec translator cannot offer that mode. Codec translators can safely pass the inner cryptography flag, provided that they take the HDLC frames out from the incoming codec and inject them into the outgoing codec.

In his theory of special relativity, Einstein explains that there can be no notion of two things happening at the same time in different locations. The SABM connections cause precisely this kind of problem, making it generally impossible to order the frames sent from either end. The two directions of frames therefore send independent frame sequences. Accommodating this, the crypto used is independently set for each of the directions. Bootstrapping with XID to Address 0x00 has established what cryptographic modes may be used, and connections to a key agreement Address guide the choice of key exchange, but keys may be set up separately in each direction. It is possible to use one key exchange phase to derive two keys separately, but it is also possible to start another key exchange, for instance to extend single-sided authentication into mutual authentication. Having independent crypto in each direction helps to switch it on or off more elegantly. This coincides with the (theoretic) option to use different codecs in each direction. It also matches the idea that codecs independently switch between bit-stealing mode and byte-stealing mode after agreeing on it with SABM and DISC commands sent to Address 0x00.

There is no length field in HDLC because it surrounds frames with FLAG markers and escapes or bit-stuffs similar patterns inside them. The frame boundaries remain in plaintext, and the same is true for bit-stuffing in bit-stealing mode, and for escaping in byte-stealing mode. This means that before encryption and after decryption, the HDLC frame has no internal escaping for SubliMe to take care of. These are transport modifications only, and indeed dependent on whether the transport is based on bit-stealing mode or byte-stealing mode. HDLC also defines a BREAK, which is evaded by the same practice of bit-stuffing or escaping. This also sits outside of the encryption, and it is used to switch the codec to the desired mode, if it was recently changed by commands exchanged inside the encryption layer.

The cryptographic mode organises most of the nuts and bolts of encryption. This specification only clarifies the areas that are encrypted, and how traffic in transit may connect to encryption. Encryption can be implemented as inner or outer cryptography. While bootstrapping with XID to Address 0x00, the options are set to choose the variant that will be used. They are not both employed, but the choice is made separately for the uplink and downlink.

Inner encryption applies to the HDLC frame format only. It does not protect the outer codec. It can be used in bit-stealing mode as well as in byte-stealing mode. Encryption starts with the Address field to protect against leaking which application is in use. This means that a need arises to find the right signer. This is not as difficult as it may seem; at any time, the number of possible responses is limited to open connections, and the one used most recently is the most likely receiver, so a backward search is useful. Address 0x00 must also be tried.

Parts of the Command field are not changed by encryption, somewhat dependent on their value. This helps to test quickly to distinguish RR and RNR frames, other S-frames, U-frames and I-frames. For RNR and RR frames, the first attempt should be whether they match fixed messages from a codec translator, asking for the entire HDLC flow to be paused or resumed. Furthermore, S-frames are sent under response chaining, while I-frames and U-frames are signed by the peer. I-frames are always part of a connection chain, while only the U-frames DISC and UP are part of such a chain. Encryption modes can make this kind of searching for a match easier by predicting the first two bytes in the response. This usually means that a few more bytes are encrypted than just the message itself. TODO: NEED TO SEE IF MATCHING IS PRACTICAL, OR THAT IT IS BETTER TO REVEAL THE ADDRESS. MAY STILL CONCEAL SERVICES VIA DYNAMIC ADDRESS ALLOCATION.

Outer encryption applies to the codec bytes. It is incompatible with codec translation, and may be disabled by such an intermediate. To make that impossible, there is a flag to insist on outer encryption. If this creates an impasse, then byte-stealing mode can be chosen. In byte-stealing mode, outer encryption is the same as inner encryption. In bit-stealing mode, outer encryption involves encryption of the codec as well as the bits stolen to pass HDLC frames. Outer encryption is a stream cipher applied to the codec bytes between the transmission channel and the detection of FLAG and BREAK signals, as well as HDLC frame bits. Outer encryption will be updated to the desired setting after a BREAK is sent out via the codec in the old format. After the BREAK, the switch is made and scanning for a FLAG continues in the new format. It is recommended to send an extra BREAK in the new format to facilitate synchronisation. TODO: Maybe require a 0 bit after the BREAK to learn that the whole 11111... bit sequence has arrived? Is that a general property for BREAK detection?

The cryptographic mode organises most of the nuts and bolts of signing. This specification only clarifies the areas that are signed, and how traffic in transit may connect to signatures. All HDLC frames need a Check, and this constitutes an inner signature, which is always present. When outer crypto is used, there will additionally be UI frames sent to Address 0x00 with the signature for the outer codec.

Certain HDLC commands may trigger a response, which should be part of the security framework, both for reasons of privacy (encryption of the Address and connection progress) and for authentication (actually being entitled to respond). To accommodate this, a cryptographic link must be made between the command and the matching response(s). This results in a response hierarchy between HDLC frames. The UID command never produces a response. The XID and TEST commands do not produce a response at the HDLC level, but send an independent frame signed by the sending peer. The SABM command returns either UA or DM or FRMR with flag W. There may be any number of RR or RNR frames for flow control. There are 8 variations of both RR and RNR, due to their inclusion of N(R), which originates in the responder. The DISC command returns either UA or DM. The I command may trigger a matching RR, REJ or SREJ response. It may also lead to another I command being sent, but that is not an HDLC level response, and it is signed by the sending peer. The UP command may trigger a matching RR, SREJ or REJ response.

All these responses at the HDLC level are chained to the signature for the command. That is, the data taken into account in the hashing algorithm continues from the previous form. The manner in which signatures are chained is defined per cryptographic mode. There may also be a need to allocate counter values for this purpose.

Besides response chaining, there is also a signature chain per connection. This starts with the SABM command, and adds any I, UP and DISC command frames and the UA response to the DISC frame. This form of chaining permits cryptographic assurance to build up beyond the 32-bit level at the end of each HDLC frame. The fourth HDLC frame in a chain certifies the first. Certainty is reduced towards the end of a connection, but ending with a UP command to assure arrival of the last frame, followed by DISC and its response, adds the extra 96 bits needed to reach the 128-bit assurance level. The other side may want to send three UP commands to reach the 128-bit level of assurance before sending the UA response to a DISC command from its peer. TODO: Other side may incorporate the same data -- SABM on the connected-to side, and DISC on the disconnected-from side. That would make this easier to use.

Aside from the response to the final DISC, connection chaining does not incorporate the responses in the chain; response chaining works like forks from the connection chain. Resends are also concealed from the connection chain, which is possible as long as the encrypted HDLC frame is literally sent in the same manner. The result is a connection chain that indicates the setup, data exchange and teardown of an entire connection, as seen from one side. Connections use the same chaining mechanism from the cryptographic mode as used for responses. There will be no need to allocate counter values for connection chaining.
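The relation between a connection chain and the response chains that fork off it can be illustrated with a running hash. The following is only a sketch under assumptions: the actual chaining is defined per cryptographic mode, whereas this example uses an unkeyed SHA-256, a 2-byte length prefix per frame and a Check truncated to 4 bytes. The names Chain, add, fork and assurance_bits are hypothetical, and the frame contents are placeholders.

```python
import hashlib

CHECK_BYTES = 4  # 32-bit Check at the end of each HDLC frame


class Chain:
    """Running hash over a sequence of frames (hypothetical helper)."""

    def __init__(self, state=None):
        # Assumption: unkeyed SHA-256; a real cryptographic mode would be keyed.
        self._hash = state if state is not None else hashlib.sha256()

    def add(self, frame: bytes) -> bytes:
        """Absorb one frame and return the truncated 32-bit Check for it."""
        # Assumption: a length prefix keeps frame boundaries unambiguous.
        self._hash.update(len(frame).to_bytes(2, "big") + frame)
        return self._hash.digest()[:CHECK_BYTES]

    def fork(self) -> "Chain":
        """Branch off a response chain; the parent chain is not affected."""
        return Chain(self._hash.copy())


def assurance_bits(later_frames: int) -> int:
    """Assurance for a frame that is followed by later_frames chained frames."""
    return 32 * (1 + later_frames)


# The connection chain, as seen from the side that sends the commands.
connection = Chain()
sabm_check = connection.add(b"SABM")

# The response forks off the chain; it does not feed back into it.
response = connection.fork()
ua_check = response.add(b"UA")

i_check = connection.add(b"I: first data frame")
```

With three later frames in the chain, assurance_bits(3) gives the 128 bits mentioned above: the fourth frame certifies the first.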

Inner signatures add the Check value to HDLC frames. This is always done. During outer cryptography, the unencrypted HDLC frames are signed before they are encrypted. During inner cryptography, the HDLC frame is encrypted and may be signed in encrypted or plaintext form, as defined by the cryptographic mode. For null cryptography, this makes no difference because the encrypted content equals the plaintext. Signing starts together with encryption, unless there is a reason for chaining, namely connection chaining or response chaining. Note that response chains may branch off from a connection chain, as described above.

Outer signatures are sent if and only if outer cryptography is used, as a result of XID bootstrapping via Address 0x00. It coincides with outer encryption. Null cryptography does not count as "cryptography is used", and no outer cryptography is applied. This protects from permissible phenomena during the setup process, such as modifications by codec translators and synchronisation problems at the start of communication. Outer signatures start with the first byte that is also subjected to outer encryption, so right after detection of a BREAK marker. At regular intervals, a signature is sent to Address 0x00 in a UI frame with an empty Information field. This forks off the continued outer codec, in the same manner as responses fork off their signatures, so that the outer codec is signed with a continually advancing hash. The byte range that is signed runs up to the byte that sent the ending 0 of the FLAG preceding the UI frame with the signature. TODO: Timing can get screwed up, choose the frame before instead, or count the frames, or...? TODO: Consider UIH to allow the counter to be inserted after signing. Also, UI is already taken! TODO: This signing process does not recover from a burp on the line. It may instead be better to allow signing periods, tightly joined or with some overlap, to avoid a failure to persist forever.

TODO:WRITE Overlapping signatures, for match-making. Reduced end-of-transmission security, by default. Denial of service: Outer crypto, byte-stealing mode, inner/outer codec overlap.

IANA is requested to set up a registry named Subliminal Messaging Service Addresses, with initial entries from the tables in TODO:REF, and split into the same categories. New entries must reference a frame format in stable documentation, are subject to expert review and rough consensus on a publicly accessible IANA mailing list, and are handed out with a care that matches the small address space. To make allocation of an address likely, it must be generic in nature and allow many inner uses and have a (possibly lighter) procedure for registering new entries.

Codecs usually pack analog or complex data in a lossy manner in accurately transported byte sequences. By playing with the noise level, the bit-stealing mode can be added. And when requested, the entire byte flow of a codec might be claimed for HDLC frame transport. The manner in which this is done is specific to a codec. This appendix is normative.

Where available, the TODO:xref:RFC4040 RTP type for audio/clearmode may be used to negotiate a completely undisturbed 64 kb/s channel. The link to PSTN is described in ITU Q.1912.5 so telecom providers may indeed facilitate it.

Clearmode channels are not subjected to mangling, and so the provisions that skip the lowest bit will not be used. In fact, since there are no defined sound semantics, the bandwidth can be completely used for HDLC transport. Although the channel starts in bit-stealing mode and considers that a different setting from byte-stealing mode, there is no practical difference. The full 8 bits per byte are available, without mangling, so its implementation of bit-stealing does not work with bit stuffing but with the same escaping mechanism (for FLAG, BREAK and escape bytes within HDLC frames) as for byte-stealing mode. Clearmode offers no outer codec, but is open to inner codecs. Bytes are XOR-ed with 0x55 for transport. This helps to set lots of transitions in zero content, and since this is an interface to a digital telephony backbone that is useful.
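The transport mask can be illustrated with a few lines of Python. This is a minimal sketch; the names TRANSPORT_MASK and transport_xor are hypothetical and not part of this specification.

```python
TRANSPORT_MASK = 0x55  # 01010101, as stated above


def transport_xor(data: bytes) -> bytes:
    """Apply the transport mask; applying it twice restores the input."""
    return bytes(b ^ TRANSPORT_MASK for b in data)


# Zero content becomes 01010101 on the wire, so it is full of transitions.
masked = transport_xor(b"\x00\x00\x00")
```

Because XOR is its own inverse, the same function serves both the sending and the receiving side.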

In byte-stealing mode, the entire G.722 codec would be considered a transport layer for bytes, just like CLEARMODE. It would escape bytes for FLAG, BREAK and escape inside HDLC frames. In bit-stealing mode, the least-valued 2 bits of each sample are used for data. This is explicitly permitted by the G.722 specification, without indications of a particular purpose. These bits represent finer details that may be replaced by arbitrary data. They are used as HDLC bits, where the higher-valued bit comes before the lower-valued bit.

It is not sufficient to combine 4 consecutive blocks of 2 bits to form a byte; firstly because synchronisation of the byte start would be difficult, and secondly because bit stuffing would end up being awkward. Instead of taking this approach, bit-stealing mode for G.722 will consider the least-valued 2 bits in every sample just like for the G.711 form with 2 data bits. This may also improve code sharing. Mangling does not occur in G.722 because it runs over a CLEARMODE channel, as ITU Q.1912.5 suggests. This means that no provisioning is needed for the lower words. Indeed, mangling is a G.711 phenomenon. Bytes are XOR-ed with 0x55 for transport. This helps to set lots of transitions in zero content, and since this is an interface to a digital telephony backbone that is useful.
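As an illustration of bit-stealing for G.722, the sketch below places two HDLC bits in each codec byte, with the higher-valued bit first, and recovers them on reception. It assumes that the stolen bits are the two least significant bits of each codec byte as it is handed to the transport, and it leaves out bit stuffing, FLAG and BREAK handling and the transport XOR mask. The function names are hypothetical.

```python
from typing import List

# G.722 at 64 kb/s carries 8000 codec bytes per second, so 2 bits per byte
# gives the constant data rate of 16000 bit/s.
G722_DATA_RATE = 8000 * 2


def steal_bits(codec_bytes: bytes, bits: List[int]) -> bytes:
    """Replace the 2 lowest bits of every codec byte with HDLC bits."""
    if len(bits) != 2 * len(codec_bytes):
        raise ValueError("need exactly 2 bits per codec byte")
    out = bytearray()
    for i, byte in enumerate(codec_bytes):
        # The first bit of each pair lands in the higher-valued position.
        pair = (bits[2 * i] << 1) | bits[2 * i + 1]
        out.append((byte & 0xFC) | pair)
    return bytes(out)


def recover_bits(codec_bytes: bytes) -> List[int]:
    """Read the HDLC bit flow back, higher-valued bit first."""
    bits = []
    for byte in codec_bytes:
        bits.append((byte >> 1) & 1)
        bits.append(byte & 1)
    return bits
```

Since the bit flow is taken two bits at a time without any byte alignment, the usual HDLC bit stuffing and FLAG detection can run on the output of recover_bits unchanged.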

Even though ISDN is going or gone for subscriber lines, it still forms a vital part of the telephony backbone and, because its codecs are rigidly enforced, the more flexible VoIP systems have all adapted to include A-law and μ-law, the two forms defined in G.711. The G.711 codec used in SubliMe is A-law only. There are predefined translations between A-law and μ-law and back, and no matter how often this is used the mangling of codec bytes that it causes is known and constant. The reason to choose A-law only is that it retains detail in the lower bits during such translations, which is where we can transmit most of our data. Mangling of samples occurs in the higher values, but this is less damaging to the transmission of data in the codec.

In byte-stealing mode, the codec carries HDLC frame bytes directly as codec data, but it should be mindful that some of the A-law bytes may translate to one μ-law byte and back to one A-law byte; one of the original A-law values is then mangled. These mangled values should be sent with TODO:escaping if it is unknown whether this may occur on the communications channel. This may be tested for explicitly, by sending a TEST command to Address 0x00 and checking the response.

In bit-stealing mode, the exponent determines how far the mantissa is shifted. The insertion of a fixed point in the actual samples shows where SubliMe makes a cut-off between audio content above and under the noise level. The bits under the noise level are stolen to carry the HDLC bit flow, in the direction from most to least significant bit.

         25  |  0x4c -> 0x4d  |  s100.110q
   28 -> 27  |  0x4e -> 0x4f  |  s100.111q
   30 -> 29  |  0x48 -> 0x49  |  s100.100q     Legend:
   32 -> 31  |  0x4a -> 0x4b  |  s100.101q     s = Sign Bit
   45 -> 46  |  0x79 -> 0x78  |  s1111.00q     . = Noise level separator
   47 -> 48  |  0x7b -> 0x7a  |  s1111.01q     q = Possibly mangled bit
   63 -> 64  |  0x6b -> 0x6a  |  s11010.1q
   80 -> 79  |  0x1a -> 0x1b  |  s001101q.

TODO:CAPTION: A-law bytes that may be mangled on the wire have an uncertain least significant bit. Codec bytes are shown without the transport XOR mask 01010101.

Just before stealing the least significant bit for data, it may become clear that it will be part of a mangled pair of codec values. In this case, this bit cannot carry data. It should instead be set so the total byte becomes a mangled value. Upon reception, the occurrence of this same value is a sign that no mangling has taken place.

As an efficiency measure, when no HDLC frames are being transmitted, the bit-stealing mode may switch off by sending a BREAK after the last FLAG, and then set the lowest data bit to 0 where data could be. In case of a mangled pair of codec values, the next-to-lowest data bit would be set to 0 instead. Since no more than 4 data bits are carried in any codec byte, this lower 0 bit enables an efficient test that the byte can be skipped. The "off" mode therefore becomes a chase for a lowest bit set to 1 and then continues to match for BREAK and FLAG marks. This lower bit is not detectable to the ear, but the more significant bits can be heard as noise, so leaving them unaltered improves the sound quality while no bits are stolen for the transmission of HDLC frames.

The A-law codec in G.711 already ensures that all bytes are XOR-ed with 0x55 for transport. This helps to set lots of transitions in zero content, which is useful on an interface to a digital telephony backbone.

TODO: Describe a structure with desired data per UUID, and list these as examples in a separate document. Frame format and boundaries, its definition within a protocol or data format. Service Flags and Parameters for XID, along with their impact. Frames with an MRU or continuous flow? This section discusses ranges of the Address values that SubliMe supports. The only required Address to support is 0x00 for SubliMe itself. Services are opened with SABM and accepted with UA or rejected with DM or FRMR (with flag W). Some services may not be available until encryption is established. The descriptions below indicate a brief purpose and protocol, plus the protocol messages that determine the frame formats and their boundaries. These boundaries are predominantly of interest for UDP and SCTP protocols; for TCP protocols, it is always possible to stream data continuously instead of collecting a full frame. In general, an implementation of SubliMe should aim to preserve frame boundaries, even when they arrive as a block in a continuous byte stream. This appendix is normative.

Some protocols use HDLC themselves, and embedding HDLC into HDLC requires a few special remarks. One inner HDLC frame maps to one outer HDLC frame or, if it is too large, a consecutive sequence of HDLC fragments that combine to the original frame. This even includes the (somewhat superfluous) inner Frame Check Sequence. Since the outer HDLC frame provides framing, there is no further need for FLAG recognition, bit stuffing or escape sequences. These are therefore removed from the inner HDLC frame.

These protocols can be used for authentication and key agreement. The last key exchanged can be set up for SubliMe encryption and signing by setting up salt, initialisation vector, counter, ...

TODO: It may be interesting to incorporate OpenSSH support and perhaps even ZRTP, MIKEY, ... The precise definition of these mechanisms for key exchange and authentication is deferred to an extension to this specification.

The Hayes AT indicates command interfacing to a modem, which is not usually available to a remote peer, but there may be uses still. The V.150.1, T.38 and T.140 protocols all express telephony signals in a descriptive form, rather than as timing-sensitive audio. In the absence of realtime guarantees on the Internet, these tend to work more reliably. MM4 is the MMS-passing protocol between MMS Centres. It packs (possibly large) media in an SMTP mail message, but it does not need SMTP semantics for delivery. One mail may however exceed practical HDLC user data sizes, so the application follows the format of the DATA command in SMTP, where the mail is terminated with a CR-LF-dot-CR-LF sequence and any line that starts with a dot will be prefixed with another dot. One or more consecutive HDLC frames combine to one MM4 frame. The purpose of assigning an Address to T.140 even when it also works over RTP is that it can now rely on HDLC flow mechanisms, unlike RTP which builds it from scratch. It is preferred to use this form.
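The MM4 framing described above can be sketched as follows. This is an illustration only: the function names and the max_info fragment size are hypothetical, and the mail is assumed to use CR-LF line endings with a CR-LF after its last line.

```python
from typing import List

TERMINATOR = b"\r\n.\r\n"  # CR-LF-dot-CR-LF, as in the SMTP DATA command


def mm4_encode(mail: bytes) -> bytes:
    """Dot-stuff a mail and terminate it like the SMTP DATA command."""
    lines = mail.split(b"\r\n")
    if lines and lines[-1] == b"":
        lines.pop()  # the final CR-LF becomes part of the terminator
    stuffed = [b"." + ln if ln.startswith(b".") else ln for ln in lines]
    return b"\r\n".join(stuffed) + TERMINATOR


def mm4_decode(wire: bytes) -> bytes:
    """Strip the terminator and undo the dot-stuffing."""
    if not wire.endswith(TERMINATOR):
        raise ValueError("MM4 frame is not terminated")
    lines = wire[: -len(TERMINATOR)].split(b"\r\n")
    plain = [ln[1:] if ln.startswith(b".") else ln for ln in lines]
    return b"\r\n".join(plain) + b"\r\n"


def fragments(wire: bytes, max_info: int = 256) -> List[bytes]:
    """Cut one MM4 frame into consecutive HDLC Information fields."""
    return [wire[i:i + max_info] for i in range(0, len(wire), max_info)]
```

The receiver concatenates the Information fields of consecutive HDLC frames until the terminator is seen, and only then passes the result to mm4_decode.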

None defined yet. Intended for open standards frameworks, not singular mechanisms (regardless of their size) with lock-in potential. They can define a UUID for their purposes.

After a peer announces support for SubliMe, it can make an attempt to allocate an Address in the range 0x80 through 0xfe by sending an XID to it. This will fail if the Address has been taken, and as discussed above there may also be clashes. Transmitting data to the Address before the other side has confirmed it can easily lead to problems, ranging from not being heard, through receiving an error, to causing errors due to reaching another service than the intended one. It is however an option for transmitting data to a dynamically allocated service over a unidirectional SubliMe path. This mechanism makes it easy to add new services. The required ingredients are a fresh UUID, preferably derived from a DNS name, and a specification that is known at least to the user base. Specifications tend to be useful for much longer than one might imagine, so it is often good to make them live in a lasting place, not just on a privately owned website.
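This specification does not say how a UUID is derived from a DNS name. One common way, shown here as an assumption rather than a requirement, is a name-based version 5 UUID in the DNS namespace of RFC 4122. The function name and the example domain are hypothetical.

```python
import uuid


def service_uuid(dns_name: str) -> uuid.UUID:
    """Derive a stable, name-based (version 5) UUID from a DNS name."""
    return uuid.uuid5(uuid.NAMESPACE_DNS, dns_name)


# The same name always yields the same UUID, so both peers can compute it.
example = service_uuid("service.example.org")
```

A name-based UUID has the advantage that the owner of the DNS name can publish the service specification under that same name.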

This address is traditionally used in HDLC for broadcasting, as well as sending to the remote peer without specifying its address. It is not currently used in SubliMe.

This work was supported by NLnet.nl as one element of the Subliminal Messaging project, along with KIP-secured SIP connection management, and Wireshark VPN connectivity configuration over SIP and/or telephone links. I also owe gratitude to my father, Harry van Rein, who brought me as a kid an endless supply of telephony waste from his repair job to tinker with.