The document provides an overview of the Stream Control Transmission Protocol (SCTP). SCTP is a connection-oriented transport layer protocol that offers reliable data transfer over IP networks. It supports features like multihoming for network fault tolerance, multi-streaming to minimize delay, and congestion control. The document discusses SCTP's architecture, features, security mechanisms, and error handling. It is intended to help application developers write programs using SCTP socket APIs.
Table of Contents
About This Document
    Intended Audience
    Document Organization
    Typographical Conventions
    Related Information
    HP Encourages Your Comments
1 Introduction
    SCTP Overview
    Limitations of TCP and UDP
        Limitations of TCP
        Limitations of UDP
    SCTP Architecture
        SCTP in the IP Stack
        Connection Setup in SCTP
        SCTP Packet
        Congestion Control in SCTP
            Slow Start and Congestion Avoidance Algorithms
            Fast Retransmit and Fast Recovery
    SCTP Features
        Multihoming
        Multistreaming
        Conservation of Data Boundaries
        SCTP Graceful Shutdown Feature
        SCTP Support for IPv4 and IPv6 Addresses
        SCTP Data Exchange Features
        Support for Dynamic Address Reconfiguration
        Reporting Packet Drops to an Endpoint
        Support for ECN-Nonces in SCTP
        SCTP Support for Partially Reliable Data Transmission
    Error Handling in SCTP
        Retransmission of DATA Chunks
        HEARTBEATs to Identify Path Failures
        HEARTBEATs to Identify Endpoint Failure
    SCTP Security
        Cookie Mechanism
        Verification Tag
2 SCTP Socket APIs
    Overview
    Socket API Versus SCTP Socket API
    Different Socket API Styles
        One-to-One Socket APIs
        Basic One-to-One Call Flow Sequence
            The socket() Socket API
            The bind() Socket API
            The listen() Socket API
            The accept() Socket API
            The connect() Socket API
            The close() Socket API
            The shutdown() Socket API
            The sendmsg() and recvmsg() Socket APIs
            The getpeername() Socket API
        One-to-Many Socket APIs
        Basic One-to-Many Call Flow Sequence
            The socket() Socket API
            The bind() Socket API
            The listen() Socket API
            The sendmsg() and recvmsg() Socket APIs
            The close() Socket API
            The connect() Socket API
    API Options to Modify Socket Behavior
    Common Socket Calls
        The send(), sendto(), recv(), and recvfrom() Socket Calls
        The setsockopt() and getsockopt() Socket Calls
        The read() and write() Socket Calls
        The getsockname() Socket Call
    SCTP Events and Notifications
    SCTP Ancillary Data Structures
        SCTP Initiation Structure (SCTP_INIT)
        SCTP Header Information (SCTP_SNDRCV)
    SCTP-Specific Socket APIs
        The sctp_bindx() SCTP Socket API
        The sctp_peeloff() SCTP Socket API
        The sctp_getpaddrs() SCTP Socket API
        The sctp_freepaddrs() SCTP Socket API
        The sctp_getladdrs() SCTP Socket API
        The sctp_freeladdrs() SCTP Socket API
        The sctp_sendmsg() SCTP Socket API
        The sctp_recvmsg() SCTP Socket API
        The sctp_connectx() SCTP Socket API
        The sctp_send() SCTP Socket API
        The sctp_sendx() SCTP Socket API
3 Compiling and Running Applications that Use the SCTP Socket APIs
    Compiling Applications that Use the SCTP APIs
    Running Sample Applications that Use the SCTP APIs
4 Migrating TCP Applications to SCTP
A SCTP Sample Programs
    Sample Server Programs
        One-to-One Server Program
        One-to-Many Server Program
    Sample Client Programs
        One-to-One Client Program
        One-to-Many Client Program
Glossary
Index
List of Figures
1-1 The Internet Protocol Stack
1-2 Three-Way Handshake in TCP
1-3 Four-Way Handshake in SCTP
1-4 SCTP Packet Format
1-5 A Single-Homed Connection
1-6 A Multihomed Connection
1-7 Multistreaming in an SCTP Association
1-8 Shutdown in TCP and SCTP
List of Tables
1-1 Chunk Types
1-2 Comparison Between SCTP, TCP, and UDP
2-1 Data Structures in the recvmsg() and sendmsg() Calls
List of Examples
3-1 Sample Commands to Compile the Server and Client Programs
3-2 Sample Command to Run the Server Application
3-3 Sample Command to Run the Client Application
About This Document
This document describes how to write, compile, and run applications using Stream
Control Transmission Protocol (SCTP) socket APIs on systems running HP-UX 11i v2.
HP's implementation of SCTP conforms to the RFCs and RFC drafts listed in “Related
Information” (page 14).
The document printing date and part number indicate the document’s current edition.
The printing date will change when a new edition is printed. Minor changes may be
made at reprint without changing the printing date. The document part number will
change when extensive changes are made.
The latest version of the document will be available at: http://www.docs.hp.com
Document updates can be issued between editions to correct errors or document product
changes. To ensure that you receive the updated or new edition, subscribe to the
appropriate support service.
Contact your HP sales representative for details.
Intended Audience
This document is intended for application developers who write programs using SCTP
socket APIs. Application developers are expected to be familiar with SCTP, C, UNIX®,
TCP, UDP, networking concepts, and operating system concepts. Application developers
should read the relevant SCTP RFCs for detailed information on SCTP.
This document is not a tutorial.
Document Organization
The SCTP Programmer's Guide is organized as follows:
Chapter 1 (page 17)    Introduces SCTP. It also discusses the SCTP architecture,
                       the message format, congestion control, fault management,
                       SCTP security, and error handling.
Chapter 2 (page 41)    Describes the different socket API styles, SCTP events and
                       notifications, common socket options, common socket calls,
                       SCTP ancillary data structures, and the new SCTP-specific
                       socket APIs.
Chapter 3 (page 69)    Describes how to compile and run applications that use the
                       SCTP APIs.
Chapter 4 (page 73)    Describes how to migrate existing TCP applications to SCTP.
                       It also discusses the benefits of migrating TCP applications
                       to SCTP.
Typographical Conventions
This document uses the following typographical conventions:
audit(5)       An HP-UX manpage. The name of the manpage is audit and 5 is the
               section in the HP-UX Reference. On the web and on the Instant
               Information CD, it may be a link to the manpage itself. From the
               HP-UX command line, you can enter “man audit” or “man 5
               audit” to view the manpage. See man(1).
Book Title     The title of a book. On the web and on the Instant Information CD,
               it may be a link to the book itself.
KeyCap         The name of a keyboard key. Note that Return and Enter both refer
               to the same key.
Emphasis       Text that is emphasized.
Emphasis       Text that is strongly emphasized.
Term           The defined use of an important word or phrase.
ComputerOut    Text displayed by the computer.
UserInput      Commands and other text that you type.
Command        A command name or qualified command phrase.
Variable       The name of a variable that you may replace in a command or
               function or information in a display that represents several possible
               values.
[]             The contents are optional in formats and command descriptions.
{}             The contents are required in formats and command descriptions. If
               the contents are a list separated by |, you must choose one of the
               items.
...            The preceding element may be repeated an arbitrary number of
               times.
|              Separates items in a list of choices.
Related Information
The following related documents are available for the SCTP product:
• SCTP Administrator's Guide at:
http://docs.hp.com/en/netcom.html
• SCTP Release Notes at:
http://docs.hp.com/en/netcom.html
• Request for Comments (RFC) documents:
— RFC 2960 (Stream Control Transmission Protocol) at:
http://www.ietf.org/rfc/rfc2960.txt?number=2960
— RFC 3286 (An Introduction to the Stream Control Transmission Protocol (SCTP))
at:
http://www.ietf.org/rfc/rfc3286.txt?number=3286
— RFC 3873 (Stream Control Transmission Protocol (SCTP) Management Information
Base (MIB)) at:
http://www.ietf.org/rfc/rfc3873.txt?number=3873
— RFC 3309 (Stream Control Transmission Protocol (SCTP) Checksum Change) at:
http://www.ietf.org/rfc/rfc3309.txt?number=3309
— RFC 3758 (Stream Control Transmission Protocol (SCTP) Partial Reliability Extension)
at:
http://www.ietf.org/rfc/rfc3758.txt?number=3758
— RFC 4460 (Stream Control Transmission Protocol (SCTP) Specification Errata and
Issues) at:
http://www.ietf.org/rfc/rfc4460.txt?number=4460
• Draft RFCs:
— draft-ietf-tsvwg-sctpsocket-10.txt at:
http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-sctpsocket/draft-ietf-tsvwg-sctpsocket-10.txt
— draft-ietf-tsvwg-addip-sctp-10.txt (Stream Control Transmission Protocol (SCTP)
Dynamic Address Reconfiguration) at:
http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-addip-sctp/draft-ietf-tsvwg-addip-sctp-10.txt
— draft-stewart-sctp-pktdrprep-02.txt (Stream Control Transmission Protocol (SCTP)
Packet Drop Reporting) at:
http://tools.ietf.org/html/draft-stewart-sctp-pktdrprep-02
— draft-ladha-sctp-nonce-01.txt (ECN Nonces for Stream Control Transmission
Protocol (SCTP)) at:
http://tools.ietf.org/html/draft-ladha-sctp-nonce-05
HP Encourages Your Comments
HP encourages your comments concerning this document. We are committed to
providing documentation that meets your needs. Send any errors found, suggestions
for improvement, or compliments to:
feedback@fc.hp.com
Include the document title, manufacturing part number, and any comment, error found,
or suggestion for improvement you have concerning this document.
1 Introduction
This chapter introduces Stream Control Transmission Protocol (SCTP). It also discusses
the SCTP architecture, the features that SCTP supports, the security features that SCTP
offers, and error handling.
This chapter addresses the following topics:
• “SCTP Overview” (page 17)
• “Limitations of TCP and UDP” (page 18)
• “SCTP Architecture” (page 19)
• “SCTP Features” (page 27)
• “Error Handling in SCTP” (page 37)
• “SCTP Security” (page 38)
SCTP Overview
SCTP is a connection-oriented transport layer protocol that enables reliable transfer of
data over IP-based networks. In an IP stack, it exists at a level equivalent to that of
Transmission Control Protocol (TCP) and User Datagram Protocol (UDP). SCTP offers
all the features that are supported by TCP and UDP. It also overcomes certain limitations
in TCP and adopts the beneficial features of UDP.
SCTP offers the following features:
• Network-level fault tolerance through support for multihoming
• Minimized delay in data delivery by sending data in multiple streams
• Acknowledged, error-free, non-duplicated transfer of data
• Data fragmentation to conform to discovered maximum transmission unit (MTU)
size
• Sequenced delivery of user messages within multiple streams
• Optional bundling of multiple user messages into an SCTP packet
• Improved SYN-flood protection
• Preservation of message boundaries
SCTP also includes mechanisms, such as checksums, sequence numbers, and selective
retransmission of data, to detect data corruption, loss of data, and duplication of data.
In addition, it contains different congestion control algorithms to minimize data loss
in an unstable network. SCTP supports improved error handling methods to avoid
unnecessary retransmission of data. The security methods implemented in SCTP enable
the endpoints of an association to avoid SYN-flooding, and to identify stale or unwanted
data packets.
Initially, the features of SCTP were designed to transport telephone signaling messages
over IP networks. Other applications that require similar features can also use SCTP.
NOTE: In SCTP, the term “stream” refers to a sequence of user messages that are
delivered in sequence, with respect to other messages within the same stream. In TCP,
“stream” refers to a sequence of bytes.
HP's implementation of SCTP conforms to the following RFCs and draft RFCs:
• RFC 3286 (An Introduction to the Stream Control Transmission Protocol (SCTP))
• RFC 2960 (Stream Control Transmission Protocol)
• RFC 3873 (Stream Control Transmission Protocol (SCTP) Management Information Base
(MIB))
• RFC 4460 (Stream Control Transmission Protocol (SCTP) Specification Errata and Issues)
• RFC 3309 (Stream Control Transmission Protocol (SCTP) Checksum Change)
• RFC 3758 (Stream Control Transmission Protocol (SCTP) Partial Reliability Extension)
• draft-ladha-sctp-nonce-01.txt (ECN Nonces for Stream Control Transmission Protocol
(SCTP))
• draft-ietf-tsvwg-addip-sctp-10.txt (Stream Control Transmission Protocol (SCTP)
Dynamic Address Reconfiguration)
• draft-stewart-sctp-pktdrprep-02.txt (Stream Control Transmission Protocol (SCTP)
Packet Drop Reporting)
Limitations of TCP and UDP
TCP and UDP are the most widely used transport layer protocols. However, the data
transfer services offered by these protocols are inadequate to meet the requirements
of a wide range of commercial applications, such as real-time multimedia and
telecommunication applications. These applications require a robust protocol, which
provides the flexibility of UDP and reliability of TCP, for transferring data between
two endpoints.
This section discusses the limitations of the TCP and UDP protocols, which led to the
development of SCTP.
This section addresses the following topics:
• “Limitations of TCP” (page 18)
• “Limitations of UDP” (page 19)
Limitations of TCP
Following are the limitations of TCP:
• TCP provides reliable data transfer, but it transmits data in a sequence. However,
some applications may need reliable data transfer, though not necessarily in a
strict sequence. These applications prefer partial ordering of data, wherein ordering
is maintained only within subflows of data. The strict sequence maintenance in
TCP not only makes partial ordering of data impossible, it also causes unnecessary
delay in the overall data delivery. Moreover, if a single packet is lost, delivery of
subsequent packets is blocked until the lost TCP packet is delivered. This causes
head-of-line (HOL) blocking.
• TCP transmits data in a stream. This requires that applications add their own
record marking, to delineate their messages. Applications must use the PUSH flag
in the TCP header, to ensure that a complete message is transferred in reasonable
time.
• In a TCP connection, each host includes a single network interface, and a connection
is established between the network interfaces of the two hosts. As a result, if the
connection breaks because of a path failure, data becomes unavailable until the
connection is re-established.
• TCP is vulnerable to denial of service (DoS) attacks, such as SYN flood attacks. A
SYN flood occurs when a malicious host forges IP packets with fake source addresses
and sends a large number of TCP SYN messages to the victim host. Each time the TCP
stack, on the victim host, receives a new SYN message, the TCP stack allocates
kernel resources to service the new SYN message. When the TCP stack is flooded
with multiple SYN messages, the victim host can run out of resources and fail to
service the new legitimate SYN messages.
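The record-marking burden described in the second limitation can be illustrated with a minimal sketch. It assumes one common approach, a 4-byte big-endian length prefix in front of each message. The function names frame_message() and next_message() are hypothetical and are not part of any TCP or SCTP API. An application that uses SCTP does not need this kind of code, because SCTP preserves message boundaries.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical framing helper: prefix a message with a 4-byte big-endian
 * length so that the receiver can find where it ends in the byte stream.
 * Returns the number of bytes written to out, or 0 if out is too small. */
size_t frame_message(const void *msg, uint32_t len, uint8_t *out, size_t cap)
{
    if (cap < 4 || cap - 4 < len)
        return 0;
    out[0] = (uint8_t)(len >> 24);
    out[1] = (uint8_t)(len >> 16);
    out[2] = (uint8_t)(len >> 8);
    out[3] = (uint8_t)len;
    memcpy(out + 4, msg, len);
    return (size_t)len + 4;
}

/* Hypothetical de-framing helper: if buf holds at least one complete record,
 * store its payload pointer and length and return the bytes consumed.
 * Returns 0 when more bytes must be read from the stream first. */
size_t next_message(const uint8_t *buf, size_t avail,
                    const uint8_t **payload, uint32_t *len)
{
    if (avail < 4)
        return 0;
    uint32_t n = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                 ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
    if (avail - 4 < n)
        return 0;
    *payload = buf + 4;
    *len = n;
    return (size_t)n + 4;
}
```

A receiver would call next_message() in a loop on the bytes read so far, and read more data whenever it returns 0.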
Limitations of UDP
Following are the limitations of UDP:
• In UDP, the transfer of data is unreliable. UDP is a connectionless protocol that
does not acknowledge or retransmit data, so an application cannot verify whether
a packet has reached the destination.
• UDP does not contain a built-in congestion control mechanism to detect path
congestion. As a result, more data may be injected into an already congested
network. This results in data loss.
• If stringent rules for reliable data transfer are implemented in applications that
use UDP, the implementation causes additional overhead and complexity in the
applications.
SCTP Architecture
SCTP is designed to address the shortcomings in TCP. It uses mechanisms, such as a
four-way handshake, to prevent DoS attacks. The SCTP architecture defines a packet
format that contains additional fields, such as cookie and verification tag, to avoid SYN
flooding. The SCTP architecture includes improved congestion control algorithms that
are effective in controlling congestion in unstable networks.
This section addresses the following topics:
• “SCTP in the IP Stack” (page 20)
• “Connection Setup in SCTP” (page 21)
• “SCTP Packet” (page 23)
• “Congestion Control in SCTP” (page 26)
SCTP in the IP Stack
Figure 1-1 illustrates a typical IP stack and denotes the layer in which SCTP is located.
Figure 1-1 The Internet Protocol Stack
An Internet protocol stack contains several layers and each layer provides a specific
functionality. Following are the layers in an IP stack and their functionalities:
• The physical layer defines the physical means of sending data over network devices.
• The data link layer transfers data between network entities, and detects and corrects
errors that can occur in the physical layer.
• The network layer routes data packets from the sender to the receiver in the
network. The most common network layer protocol is IP.
• The transport layer enables transfer of data between endpoints using the services
of the network layer. This layer has two primary protocols, the Transmission
Control Protocol (TCP) and the User Datagram Protocol (UDP). TCP supports
reliable and sequential packet delivery through error recovery and flow control
mechanisms. UDP is a simple message-based connectionless protocol compared
to TCP. SCTP is yet another transport layer protocol that application developers
can use to transmit data between endpoints.
• The socket layer provides the interface between the transport layer and the
application layer. It contains a set of APIs that applications use to access
transport layer services.
• The application layer provides application programs with an interface to
communicate and transfer data across the network. All application layer protocols
use the socket layer as their interface to interact with the transport layer protocol.
Connection Setup in SCTP
This section discusses the connection setup between two endpoints in TCP and SCTP.
It also discusses how the connection setup in SCTP helps prevent SYN flood DoS attacks.
Both TCP and SCTP initiate a new connection with a packet handshake. TCP uses a
three-way handshake to set up a new connection, whereas SCTP uses a four-way
handshake to set up a new connection.
Figure 1-2 illustrates the three-way handshake in TCP.
Figure 1-2 Three-Way Handshake in TCP
The following steps describe the three-way handshake in TCP:
1. Host A sends a Synchronize (SYN) packet to Host B.
2. Upon receiving the SYN packet, Host B allocates resources for the connection and
sends a Synchronize-Acknowledge (SYN-ACK) packet to Host A.
3. Host A sends an ACK packet to confirm the receipt of the SYN-ACK packet.
The connection is set up between Host A and Host B, and Host A can now start
sending data to Host B.
Figure 1-3 illustrates the four-way handshake in SCTP.
Figure 1-3 Four-Way Handshake in SCTP
The following steps describe the four-way handshake in SCTP:
1. Host A initiates an association by sending an INIT packet to Host B.
2. Host B responds with an INIT-ACK packet that contains the following fields:
• A Verification tag
• A Cookie
The TCP SYN-ACK packet does not contain these fields. The cookie contains the
necessary state information, which the server uses to allocate resources for the
association. The cookie field includes a signature for authenticity and a timestamp
to prevent replay attacks using old cookies. Unlike TCP, Host B in SCTP does not
allocate resources at this point in the connection. The verification tag provides a
key that enables Host A to verify that the SCTP packet belongs to the current
association.
3. Host A sends the COOKIE-ECHO packet to Host B. If Host A has a forged IP address,
it never receives the INIT-ACK chunk. This prevents Host A from sending the
COOKIE-ECHO packet. As a result, the conversation ends without the server
allocating any resources for the connection.
4. Host B responds with a COOKIE-ACK chunk and allocates resources for the
connection.
The connection is now established between Host A and Host B. Host A can now
start sending data to Host B.
The four-way handshake may seem less efficient than a three-way handshake, because
the additional exchange can delay the transfer of data. To reduce this delay, SCTP
permits DATA chunks to be bundled with the COOKIE-ECHO and COOKIE-ACK
chunks.
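The four-way handshake described above can be sketched as a small state machine. This is a simplified illustration, not HP's implementation. The state names follow RFC 2960; the event names and the function setup_next_state() are hypothetical. Timers, retransmissions, and cookie validation are not modelled. The point to note is that Host B stays in the CLOSED state after it receives INIT, and holds no per-association state until a COOKIE-ECHO arrives.

```c
/* Association states used during setup (names follow RFC 2960). */
enum assoc_state { CLOSED, COOKIE_WAIT, COOKIE_ECHOED, ESTABLISHED };

/* Events that drive the setup. EV_ASSOCIATE is a hypothetical name for the
 * application's request to open an association. */
enum setup_event {
    EV_ASSOCIATE,
    EV_RECV_INIT,
    EV_RECV_INIT_ACK,
    EV_RECV_COOKIE_ECHO,
    EV_RECV_COOKIE_ACK
};

/* Returns the state that follows state s when event e occurs. */
enum assoc_state setup_next_state(enum assoc_state s, enum setup_event e)
{
    switch (s) {
    case CLOSED:
        if (e == EV_ASSOCIATE)
            return COOKIE_WAIT;     /* Host A sends INIT */
        if (e == EV_RECV_INIT)
            return CLOSED;          /* Host B replies INIT-ACK, keeps no state */
        if (e == EV_RECV_COOKIE_ECHO)
            return ESTABLISHED;     /* Host B: cookie assumed valid, sends COOKIE-ACK */
        break;
    case COOKIE_WAIT:
        if (e == EV_RECV_INIT_ACK)
            return COOKIE_ECHOED;   /* Host A sends COOKIE-ECHO */
        break;
    case COOKIE_ECHOED:
        if (e == EV_RECV_COOKIE_ACK)
            return ESTABLISHED;     /* Host A: association is up */
        break;
    case ESTABLISHED:
        break;
    }
    return s; /* events not modelled here leave the state unchanged */
}
```

In this model, a host that sends INIT from a forged address never sees the INIT-ACK, so Host B never receives a COOKIE-ECHO and never leaves CLOSED.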
SCTP Packet
SCTP transmits user messages in SCTP packets. A large message can be fragmented
across several packets, and several small messages can be bundled into one packet.
Figure 1-4 illustrates an SCTP packet format.
Figure 1-4 SCTP Packet Format
An SCTP packet contains a common header, and one or more chunks. The SCTP
common header contains the following information:
• Source and destination port numbers to enable multiplexing of different SCTP
associations at the same address.
• A 32-bit verification tag that guards against the insertion of an out-of-date or false
message into the SCTP association.
• A 32-bit checksum for error detection. The checksum can be either a
32-bit CRC checksum or Adler-32 checksum.
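The common header fields listed above can be read from a received buffer as shown in the following sketch. It assumes the 12-byte layout defined in RFC 2960: two 16-bit ports, then the verification tag, then the checksum. The names sctp_common_hdr, parse_common_hdr(), and crc32c() are illustrative and are not part of the SCTP socket API. The crc32c() function shows only the CRC-32c algorithm itself. Zeroing the checksum field before the computation, and the byte order in which the result is stored in the header, are specified in RFC 3309 and are not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields of the 12-byte SCTP common header, converted to host byte order. */
struct sctp_common_hdr {
    uint16_t src_port;
    uint16_t dst_port;
    uint32_t verification_tag;
    uint32_t checksum;          /* raw field, read here as big-endian */
};

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, or -1 if the buffer is shorter than the header. */
int parse_common_hdr(const uint8_t *pkt, size_t len, struct sctp_common_hdr *h)
{
    if (len < 12)
        return -1;
    h->src_port = rd16(pkt);
    h->dst_port = rd16(pkt + 2);
    h->verification_tag = rd32(pkt + 4);
    h->checksum = rd32(pkt + 8);
    return 0;
}

/* Bitwise CRC-32c (reflected polynomial 0x82F63B78). Slow but compact;
 * production code would normally use a table or a hardware instruction. */
uint32_t crc32c(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}
```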
A chunk can be either a control chunk or a DATA chunk. A control chunk incorporates
different flags and parameters, depending on the chunk type. The DATA chunk
incorporates flags to control segmentation and reassembly, and parameters for the
transmission sequence number (TSN), Stream Identifier (SID) and Stream Sequence Number
(SSN), and a Payload Protocol ID. The DATA chunk contains the actual data payload.
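The DATA chunk fields named above can be sketched in C as follows. The sketch assumes the RFC 2960 layout: a 16-byte chunk header (type 0, flags, length, TSN, SID, SSN, and Payload Protocol ID) followed by the user data. The structure and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define DATA_FLAG_E 0x01u   /* ending fragment of a user message */
#define DATA_FLAG_B 0x02u   /* beginning fragment of a user message */
#define DATA_FLAG_U 0x04u   /* unordered delivery */

/* Decoded view of one DATA chunk (illustrative, not an API structure). */
struct data_chunk {
    uint8_t flags;
    uint32_t tsn;            /* transmission sequence number */
    uint16_t sid;            /* stream identifier */
    uint16_t ssn;            /* stream sequence number */
    uint32_t ppid;           /* payload protocol ID */
    const uint8_t *payload;  /* points into the caller's buffer */
    size_t payload_len;
};

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decodes the DATA chunk that starts at c. Returns 0 on success, or -1 if
 * the chunk is not a DATA chunk or its length field is inconsistent. */
int parse_data_chunk(const uint8_t *c, size_t avail, struct data_chunk *d)
{
    if (avail < 16 || c[0] != 0)    /* chunk type 0 identifies DATA */
        return -1;
    uint16_t length = rd16(c + 2);  /* covers the header and the user data */
    if (length < 16 || length > avail)
        return -1;
    d->flags = c[1];
    d->tsn = rd32(c + 4);
    d->sid = rd16(c + 8);
    d->ssn = rd16(c + 10);
    d->ppid = rd32(c + 12);
    d->payload = c + 16;
    d->payload_len = (size_t)length - 16;
    return 0;
}
```

A chunk that carries a complete, unfragmented message has both the B and E flags set.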
Each control and data chunk in the SCTP packet contains the following information:
Chunk Type This field identifies the type of information contained in the Chunk
Data field. The value of the chunk field ranges from 0 to 254. The
value 255 is reserved for future use, as an extension field. SCTP
consists of one DATA chunk and 12 control chunks.
Table 1-1 lists the definitions and parameters of the different chunk
types.
Table 1-1 Chunk Types
Chunk                             Definition
Payload Data (DATA)               Used for data transfer.
Initiation (INIT)                 Initiates an SCTP association between two
                                  endpoints.
Initiation Acknowledgement        Acknowledges the receipt of an INIT chunk.
(INIT ACK)                        The receipt of the INIT ACK chunk establishes
                                  an association.
Selective Acknowledgement         Acknowledges the receipt of the DATA chunks
(SACK)                            and also reports gaps in the data.
Cookie Echo (COOKIE ECHO)         Used during the initiation process. The
                                  endpoint initiating the association sends the
                                  COOKIE ECHO chunk to the peer endpoint.
Cookie Acknowledgement            Acknowledges the receipt of the COOKIE
(COOKIE ACK)                      ECHO chunk. The COOKIE ACK chunk must
                                  take precedence over any DATA chunk or SACK
                                  chunk sent in the association. The COOKIE
                                  ACK chunk can be bundled with DATA chunks
                                  or SACK chunks.
Heartbeat Request                 Tests the connectivity of a specific destination
(HEARTBEAT)                       address in the association.
Heartbeat Acknowledgement         Acknowledges the receipt of the HEARTBEAT
(HEARTBEAT ACK)                   chunk.
Abort Association (ABORT)         Informs the peer endpoint to close the
                                  association. The ABORT chunk also informs the
                                  receiver of the reason for aborting the
                                  association.
Operation Error (ERROR)           Reports error conditions. The ERROR chunk
                                  contains parameters that determine the type
                                  of error.
Shutdown Association              Triggers a graceful shutdown of an association
(SHUTDOWN)                        with a peer endpoint.
Shutdown Acknowledgement          Acknowledges the receipt of the SHUTDOWN
(SHUTDOWN ACK)                    chunk at the end of the shutdown process.
Shutdown Complete                 Concludes the shutdown procedure.
(SHUTDOWN COMPLETE)
Chunk Flag This field contains the flags, such as U (unordered bit), B (beginning
fragment bit), and E (ending fragment bit). Usage of this field
depends on the chunk type specified in the chunk type field. Unless
otherwise specified, SCTP sets this field to 0 while transmitting the
packet and ignores the chunk flag on receipt of the packet.
Chunk Length This field represents the size of the fields chunk type, chunk flag,
chunk length, and chunk value, in bytes.
Chunk Data This field contains the actual information to be transferred in the
chunk. The usage and format of this field depends on the chunk
type.
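The four chunk fields described above can be walked in a loop. This is a sketch only, assuming the RFC 2960 encoding: a 1-byte type, 1-byte flags, and a 2-byte length that covers the 4-byte chunk header plus the value, with each chunk padded to a 4-byte boundary. The padding rule comes from the RFC, not from the text above, and `iter_chunks` is a hypothetical name.

```python
import struct

# Chunk header: type (8 bits), flags (8 bits), length (16 bits).
CHUNK_HEADER = struct.Struct("!BBH")


def iter_chunks(payload):
    """Yield (chunk_type, chunk_flags, chunk_value) for each chunk.

    `payload` is everything that follows the 12-byte common header.
    """
    offset = 0
    while offset + CHUNK_HEADER.size <= len(payload):
        ctype, flags, length = CHUNK_HEADER.unpack_from(payload, offset)
        # The length field includes the 4 header bytes, so it can never
        # be smaller than 4 or run past the end of the packet.
        if length < CHUNK_HEADER.size or offset + length > len(payload):
            raise ValueError("malformed chunk length")
        yield ctype, flags, payload[offset + CHUNK_HEADER.size:offset + length]
        # Assumption from RFC 2960: chunks are padded to a multiple of 4.
        offset += (length + 3) & ~3
```

For example, a packet body that bundles a 5-byte DATA value with an empty chunk of type 3 yields two tuples, one per chunk.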
The number of chunks in an SCTP packet is determined by the MTU size of the
transmission path. Multiple chunks can be bundled into one SCTP packet except the
INIT, INIT ACK, and SHUTDOWN COMPLETE chunks. The SCTP packet size must not
be more than the MTU size.
The SCTP packet format supports bundling of multiple DATA and control chunks into
a single packet, to improve transport efficiency. An application can control bundling,
to avoid bundling during initial transmission. Bundling occurs on retransmission of
DATA chunks, to reduce the possibility of congestion. If the user data does not fit into
one packet, SCTP fragments data into multiple chunks.
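As a rough illustration of the fragmentation just described, the sketch below splits one user message into DATA chunk fragments and sets the B and E bits. It assumes the RFC 2960 flag values (E = 0x01, B = 0x02, U = 0x04) and consecutive TSNs per fragment; `fragment_message` and `max_payload` are hypothetical names, not part of an SCTP API.

```python
FLAG_E = 0x01  # ending fragment bit
FLAG_B = 0x02  # beginning fragment bit
FLAG_U = 0x04  # unordered bit


def fragment_message(data, max_payload, first_tsn, unordered=False):
    """Return a list of (tsn, flags, fragment) tuples for one user message."""
    if not data or max_payload <= 0:
        raise ValueError("need non-empty data and a positive fragment size")
    pieces = [data[i:i + max_payload] for i in range(0, len(data), max_payload)]
    fragments = []
    for index, piece in enumerate(pieces):
        flags = FLAG_U if unordered else 0
        if index == 0:
            flags |= FLAG_B          # first fragment of the message
        if index == len(pieces) - 1:
            flags |= FLAG_E          # last fragment of the message
        tsn = (first_tsn + index) % 2**32  # TSNs are 32-bit and wrap
        fragments.append((tsn, flags, piece))
    return fragments
```

A message that fits into one chunk carries both B and E, which is how a receiver recognizes an unfragmented message.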
For more information on the SCTP packet format, see RFC 2960 (Stream Control
Transmission Protocol).
Congestion Control in SCTP
SCTP uses various congestion control algorithms to effectively handle network failures
or unexpected traffic surges, and ensures quick recovery from data congestion. SCTP
and TCP support the same set of congestion control algorithms. SCTP supports the
following congestion control algorithms:
• Slow Start and Congestion Avoidance
• Fast Retransmit and Fast Recovery
However, in SCTP, the congestion control algorithms are modified to suit the
protocol-specific requirements.
For information on the TCP congestion control algorithms, see RFC 2581 (TCP Congestion
Control).
This section addresses the following topics:
• “Slow Start and Congestion Avoidance Algorithms”
• “Fast Retransmit and Fast Recovery”
Slow Start and Congestion Avoidance Algorithms
The slow start and congestion avoidance algorithms are used to control the amount of
outstanding data being injected into the network. SCTP uses the slow start algorithm
at the beginning of the transmission, when the network condition is unknown, and
also in repairing loss detected by the retransmission timer. SCTP slowly probes the
network to determine the available capacity of the network to avoid congestion in the
network. If SCTP detects a congestion in the network, it switches to the congestion
avoidance algorithm to manage the congestion.
The slow start and congestion avoidance algorithms use the following congestion
control variables:
Congestion window (cwnd)          Specifies the limit on the amount of data the
                                  sender can transmit through the network, before
                                  receiving an acknowledgement. This variable is
                                  maintained for each destination address.
Receiver window (rwnd)            Specifies the receiver’s limit on the amount of
                                  outstanding data.
                                  NOTE: The smaller of the cwnd and rwnd values
                                  determines the amount of data that can be
                                  transmitted.
Slow start threshold (ssthresh)   Determines whether the slow start or congestion
                                  avoidance algorithm must be used to control data
                                  transmission.
Partial Bytes Acknowledged        Facilitates the adjustment of the cwnd variable.
(partial_bytes_acked)
In an SCTP connection, the sender uses the slow start algorithm if the value of cwnd
is less than the ssthresh value. If the value of cwnd is greater than the ssthresh
value, the sender uses the congestion avoidance algorithm. If the values for cwnd and
ssthresh are same, the sender can use either the slow start or congestion avoidance
algorithm. Unlike TCP, an SCTP sender must store the cwnd, ssthresh, and
partial_bytes_acked congestion control variables for each destination address of
the peer. However, the sender needs to store only one rwnd value for the whole
association, irrespective of whether the peer is multihomed or contains only one address.
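The interaction of these variables can be sketched as follows. This is a simplified illustration, not a complete implementation: the increase rules are taken from RFC 2960 (slow start grows cwnd by at most one MTU per acknowledgement; congestion avoidance grows it by one MTU once partial_bytes_acked reaches cwnd), the equal case is treated as slow start, and the `PathState` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PathState:
    """Per-destination-address congestion state (one instance per path)."""
    mtu: int
    cwnd: int
    ssthresh: int
    partial_bytes_acked: int = 0


def sendable_bytes(path, rwnd, outstanding):
    """The smaller of cwnd and rwnd bounds the data in flight."""
    return max(0, min(path.cwnd, rwnd) - outstanding)


def on_bytes_acked(path, acked):
    """Grow cwnd after `acked` new bytes are acknowledged on this path."""
    if path.cwnd <= path.ssthresh:
        # Slow start: increase by the bytes acknowledged, at most one MTU.
        path.cwnd += min(acked, path.mtu)
    else:
        # Congestion avoidance: accumulate acknowledged bytes and add one
        # MTU each time a full window's worth has been acknowledged.
        path.partial_bytes_acked += acked
        if path.partial_bytes_acked >= path.cwnd:
            path.partial_bytes_acked -= path.cwnd
            path.cwnd += path.mtu
```

Note that rwnd is passed in separately because it is kept once per association, while a `PathState` exists for every destination address.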
Fast Retransmit and Fast Recovery
The fast retransmit congestion control algorithm is used to intelligently retransmit
missing segments of information in an SCTP association. When a receiver in an SCTP
connection receives a DATA chunk out of sequence, the receiver sends a SACK chunk
that reports the gap in the TSN sequence to the sender. The fast retransmit algorithm
treats four SACK chunks that report the same gap as an indication of data loss, and
retransmits the DATA chunk without waiting for the
retransmission timer to time out. After the fast retransmit algorithm sends the DATA
that appears to be missing, the fast recovery algorithm controls the transmission of
new data until all the lost segments are retransmitted.
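The four-SACK rule described above amounts to a per-TSN counter. The sketch below assumes that each incoming SACK has already been reduced to the set of TSNs it reports as missing; the class and its threshold constant are illustrative names only.

```python
MISS_THRESHOLD = 4  # the four-SACK rule described in the text


class FastRetransmit:
    """Count how often each TSN is reported missing by incoming SACKs."""

    def __init__(self):
        self.miss_counts = {}  # TSN -> number of SACKs reporting it missing

    def on_sack(self, missing_tsns):
        """Return the TSNs that should be retransmitted right now."""
        due = []
        for tsn in sorted(missing_tsns):
            count = self.miss_counts.get(tsn, 0) + 1
            self.miss_counts[tsn] = count
            if count == MISS_THRESHOLD:
                due.append(tsn)  # fire once, without waiting for the timer
        # Forget TSNs that are no longer reported missing (they arrived).
        for tsn in list(self.miss_counts):
            if tsn not in missing_tsns:
                del self.miss_counts[tsn]
        return due
```

Firing only when the count equals the threshold keeps the same chunk from being fast-retransmitted again on every later SACK.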
SCTP Features
The Signaling Transport (SIGTRAN) Working Group in IETF developed SCTP to
address the limitations in TCP and UDP. Though the development of SCTP was directly
motivated by the need to transfer Public Switched Telephone Network (PSTN) signaling
messages across the IP network, SIGTRAN ensured that the design meets the
requirements of other applications with similar requirements.
Table 1-2 compares features of SCTP, TCP, and UDP.
Table 1-2 Comparison Between SCTP, TCP, and UDP
Feature                                         SCTP   TCP       UDP
State required at each endpoint                 yes    yes       no (1)
Reliable data transfer                          yes    yes       no
Congestion control and avoidance                yes    yes       no
Message boundary conservation                   yes    no (2)    yes
Path MTU discovery and message fragmentation    yes    yes (2)   no
Message bundling                                yes    yes (2)   no
Multi-homed hosts support                       yes    no        no
Multi-stream support                            yes    no        no
Unordered data delivery                         yes    no        yes
Security cookie against SYN flood attack        yes    no        no
Built-in heartbeat (reachability check)         yes    no (3)    no
1 In UDP, a node can communicate with another node without going through a setup procedure, or
without changing any state information. However, each UDP packet contains the required state
information to form a connection, so that an ongoing state need not be maintained at each endpoint.
2 TCP does not preserve any message boundaries. It treats all the data passed from its upper layer as a
formatless stream of data bytes. However, because TCP transfers data in sequence of bytes, it can
automatically resize all the data into new TCP segments that are suitable for the Path MTU, before
transmitting them.
3 TCP implements a keep-alive mechanism, which is similar to the SCTP HEARTBEAT chunk. In TCP,
however, the keep-alive interval is, by default, set to two hours for state cleanup. In SCTP, the HEARTBEAT
chunk is used to facilitate fast failover.
This section addresses the following topics:
• “Multihoming” (page 28)
• “Multistreaming” (page 30)
• “Conservation of Data Boundaries” (page 31)
• “SCTP Graceful Shutdown Feature” (page 31)
• “SCTP Support for IPv4 and IPv6 Addresses” (page 32)
• “SCTP Data Exchange Features” (page 32)
• “Support for Dynamic Address Reconfiguration” (page 33)
• “Reporting Packet Drops to an Endpoint” (page 33)
• “Support for ECN-Nonces in SCTP” (page 34)
• “SCTP Support for Partially Reliable Data Transmission” (page 35)
Multihoming
Multihoming is the ability of a single SCTP endpoint to contain multiple interfaces with
different IP addresses. In a single-homed connection, an endpoint contains only one
network interface and one IP address.
Figure 1-5 illustrates the single-homed connection in TCP.
Figure 1-5 A Single-Homed Connection
In Figure 1-5, Host A contains a single network interface (NIA1) and Host B contains a
single network interface (NIB1). NIA1 is the only interface for Host A to interact with
Host B.
When a network or path failure occurs in a single-homed connection, the endpoint is
completely isolated from the network. Multihoming in SCTP ensures better chances of
survival if a network failure
occurs, when compared to TCP. The built-in support for multi-homed hosts in SCTP
enables a single SCTP association to run across multiple links or paths, to achieve link
or path redundancy. This enables an SCTP association to achieve faster failover from
one link or path to another, with minimum interruption in the data transfer service.
Figure 1-6 illustrates the multi-homed connection in SCTP.
Figure 1-6 A Multihomed Connection
In this figure, Host A contains multiple network interfaces to interact with Host B,
which also has multiple interfaces.
SCTP selects a single address as the “primary” address and uses it as the destination
for all DATA chunks for normal transmission. All the other addresses are considered
as alternate IP addresses. SCTP uses these alternate IP addresses to retransmit DATA
chunks and to improve the probability of reaching the remote endpoint. Retransmission
may occur because of continued failure to send DATA to the primary address. As a
result, all DATA chunks are transmitted to the alternate address until the HEARTBEAT
chunks have re-established contact with the primary address.
During the initiation of an association, the SCTP endpoints exchange the list of IP
addresses, so that each endpoint can receive messages from any of the addresses
associated with the remote endpoint. For security reasons, SCTP sends response
messages to the source address in the message that prompted the response.
An endpoint can receive messages that are out of sequence or with different address
pairs, because multi-homing supports multiple IP addresses. To overcome this problem,
SCTP incorporates procedures to resolve parallel initiation attempts into a single
association.
Multistreaming
Multistreaming enables data to be sent in multiple, independent streams in parallel,
so that data loss in one stream does not affect or stop the delivery of data in other
streams. Each stream in an SCTP association uses two sets of sequence numbers, namely
a Transmission Sequence Number (TSN) that governs the transmission of messages
and the detection of message loss, and the Stream ID/Stream Sequence Number
(SID/SSN) pair that determines the sequence of delivery of the received data.
TCP transmits data sequentially in the form of bytes in a single stream and ensures
that all the bytes are delivered in a particular order. Therefore, a later byte is
delivered to the receiving application only after all the earlier bytes have safely
reached the destination. The sequential delivery of
data causes delay when a message loss or sequence error occurs within the network.
An additional delay occurs when TCP stops sending data until the correct sequencing
is restored, either upon receiving an out-of-sequence message or by retransmitting a
lost message.
The strict preservation of message sequence in TCP poses a limitation for certain
applications. These applications require sequencing of messages that affect the same
resource (such as the same call or the same channel), so that messages are loosely
correlated and delivered without maintaining the overall sequence integrity.
The multistreaming feature in SCTP, in which reliable data transmission and data
delivery are independent of each other, overcomes this problem. This feature also
avoids head-of-line (HOL) blocking. This independence improves the flexibility of an application, by
allowing it to define semantically different streams of data inside the overall SCTP
message flow, and by enforcing message ordering only within each of the streams. As
a result, message loss in one particular stream does not affect the delivery of messages
in a different stream. The receiver can immediately determine if there is a gap in the
transmission sequence (for example, caused by message loss), and also can determine
whether messages received following the gap are within the affected stream. If SCTP
receives a message that belongs to the affected stream, a corresponding gap occurs in
SSN. The receiver can continue to deliver messages to the unaffected streams while
buffering messages in the affected stream until retransmission occurs.
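The per-stream delivery rule can be sketched on the receiving side as follows. This is an illustration only: it assumes that every stream starts at SSN 0 and that SSNs are 16-bit values that wrap (as in RFC 2960), and it ignores TSN handling entirely. `StreamReceiver` is a hypothetical name.

```python
from collections import defaultdict


class StreamReceiver:
    """Deliver messages in SSN order within each stream, independently."""

    def __init__(self):
        self.next_ssn = defaultdict(int)   # SID -> next SSN to deliver
        self.pending = defaultdict(dict)   # SID -> {SSN: buffered message}

    def receive(self, sid, ssn, message):
        """Accept one message; return the messages now deliverable."""
        self.pending[sid][ssn] = message
        delivered = []
        # Only the stream identified by `sid` is examined, so a gap in
        # one stream never holds back messages of another stream.
        while self.next_ssn[sid] in self.pending[sid]:
            delivered.append(self.pending[sid].pop(self.next_ssn[sid]))
            self.next_ssn[sid] = (self.next_ssn[sid] + 1) % 65536
        return delivered
```

In this sketch a message with SSN 1 on stream 0 stays buffered until SSN 0 arrives, while a message on stream 1 is delivered at once.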
Figure 1-7 illustrates how multi-streaming works in an SCTP association.
Figure 1-7 Multistreaming in an SCTP Association
NOTE: By default, SCTP contains two streams. SCTP uses stream 0 as the default
stream to transmit data. Applications can modify the number of streams through which
SCTP transmits data.
Conservation of Data Boundaries
In SCTP, a sending application can construct a message out of a block of data bytes
and instruct SCTP to transport the message to a receiving application. SCTP guarantees
the delivery of this message (data block) in its entirety. It also indicates both the
beginning and the end of the data block to the receiver. This is called conservation of
message boundaries. TCP does not conserve data boundaries. It treats all the data
passed to it from the sending application as a sequence or stream of data bytes. It
delivers all the data bytes to the receiver in the same sequential order as they were
passed from the application. Because TCP does not conserve data boundaries, the
receiver cannot pass data from packets that arrive out of sequence to the application.
It has to wait until the missing packets arrive, from the first unreceived packet up to
the received out-of-sequence packet.
SCTP Graceful Shutdown Feature
SCTP does not support a “half-open” connection, which can occur in TCP. In a half-open
connection, even though an endpoint indicates that it has no more data to send, the
other endpoint continues to send data indefinitely. SCTP, on the other hand, assumes
that when the shutdown procedure begins, both the endpoints will stop sending new
data across the association. It also assumes that it needs only to clear up
acknowledgements of the previously sent data.
The SCTP shutdown feature uses a three-message procedure to gracefully shut down
the association, in which each endpoint has confirmed the receipt of the DATA chunks
before completing the shutdown process. When an immediate shutdown is required,
SCTP sends an ABORT message to an endpoint.
Figure 1-8 illustrates graceful shutdown in SCTP and the half-closed state in TCP.
Figure 1-8 Shutdown in TCP and SCTP
SCTP Support for IPv4 and IPv6 Addresses
SCTP supports both IPv4 and IPv6 address parameters in an SCTP packet, as defined
in RFC 2960 (Stream Control Transmission Protocol). When an association is set up, the
SCTP endpoints exchange the list of addresses of the endpoints in the INIT and
INIT-ACK chunks. The address of the endpoint is represented by the following
parameters: an IPv4 address parameter with parameter type 5 and an IPv6 address
parameter with parameter type 6. The INIT chunks can contain multiple addresses,
each of which can be an IPv4 or IPv6 address.
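The two address parameters can be encoded as shown in the following sketch. It assumes the RFC 2960 type-length-value layout, in which a 16-bit parameter type and a 16-bit length (covering the 4-byte parameter header plus the address) precede the raw address bytes; the function name is hypothetical.

```python
import ipaddress
import struct

IPV4_PARAM_TYPE = 5
IPV6_PARAM_TYPE = 6


def encode_address_parameter(address):
    """Encode one IPv4 or IPv6 address as an INIT/INIT ACK parameter."""
    ip = ipaddress.ip_address(address)
    param_type = IPV4_PARAM_TYPE if ip.version == 4 else IPV6_PARAM_TYPE
    value = ip.packed  # 4 bytes for IPv4, 16 bytes for IPv6
    # The length field counts the 4 header bytes plus the address bytes.
    return struct.pack("!HH", param_type, 4 + len(value)) + value
```

A multihomed endpoint would append one such parameter per local address to its INIT or INIT ACK chunk.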
SCTP Data Exchange Features
This section discusses the enhanced features in SCTP that ensure reliable data exchange
between endpoints.
Following are the data exchange features in SCTP:
• In SCTP, data is transmitted in the form of packets. Each packet can contain DATA
chunks and control chunks. An SCTP endpoint acknowledges the receipt of a DATA
chunk by sending a SACK chunk to the other endpoint. The SACK chunk indicates
the range of cumulative TSNs and non-cumulative TSNs, if any. The
non-cumulative TSNs indicate gaps in the received TSN sequence. When SCTP
identifies gaps in the TSN sequence, it resends the missing DATA chunks to the
other endpoint. SCTP uses the “delayed ack” method to send the SACK chunks.
In this method, SACK is sent for every second packet, but with an upper limit on
the delay between SACKs. The frequency of sending SACKs increases to one per
received packet if gaps are detected in the TSN sequence.
For information on an SCTP packet, see “SCTP Packet” (page 23).
• SCTP contains various congestion control algorithms, such as slow start, congestion
avoidance, fast recovery, and fast retransmit, to control the flow and retransmission
of data. For information on these congestion control algorithms see, “Congestion
Control in SCTP” (page 26). In these algorithms, the receiver advertises the receive
window and a sender advertises a per-path congestion window to handle
congestion. The receiver window indicates buffer occupancy of the receiver. The
per-path congestion window manages the packets in flight. The congestion control
algorithms in SCTP are similar to those of TCP, except that the endpoints in an SCTP
connection manage the conversion between bytes sent and received, and TSNs
sent and received. This is because a TSN is attached only to a chunk.
• An HP-UX application can specify a lifetime for the data to be transmitted. If the
lifetime of the data has expired and the data has not been transmitted, the data,
such as time-sensitive signalling messages, can be discarded. If the lifetime of the
data has expired and the data has been transmitted, data must be delivered to
avoid a hole in the TSN sequence.
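The SACK contents described in the first item of the list above (a cumulative TSN plus the gaps in the received sequence) can be derived as in the following sketch. It assumes that gap blocks are expressed as start and end offsets relative to the cumulative TSN, as in RFC 2960, and it ignores 32-bit TSN wraparound for brevity; `build_sack` is a hypothetical name.

```python
def build_sack(cumulative_tsn, received_tsns):
    """Return (new_cumulative_tsn, gap_blocks) for the TSNs seen so far.

    Each gap block is a (start_offset, end_offset) pair that describes a
    run of TSNs received beyond a hole, relative to the cumulative TSN.
    """
    received = set(received_tsns)
    cum = cumulative_tsn
    # Advance the cumulative TSN over every consecutively received TSN.
    while cum + 1 in received:
        cum += 1
    gap_blocks = []
    for tsn in sorted(t for t in received if t > cum):
        offset = tsn - cum
        if gap_blocks and gap_blocks[-1][1] + 1 == offset:
            gap_blocks[-1] = (gap_blocks[-1][0], offset)  # extend the run
        else:
            gap_blocks.append((offset, offset))           # start a new run
    return cum, gap_blocks
```

With a cumulative TSN of 10 and TSNs 11, 12, 14, 15, and 17 received, the sketch reports a cumulative TSN of 12 and two gap blocks; TSNs 13 and 16 are the holes that the sender must fill.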
Support for Dynamic Address Reconfiguration
SCTP enables an endpoint to reconfigure the IP address information dynamically for
an existing association. When the endpoints exchange information during association
startup, the usability of SCTP also improves without modifying the SCTP protocol.
This feature is useful in computational and networking applications that add or remove
physical interface cards dynamically and need the IP address of the interface to be
changed dynamically. This feature also enables an endpoint to set the primary
destination address of a remote peer so that when the primary address of an endpoint
is deleted, the remote peer is informed of the address to which the data must be sent.
To enable SCTP to reconfigure IP addresses dynamically, an SCTP packet contains the
following chunk types:
Address Configuration Change      The ASCONF chunk communicates the
Chunk (ASCONF)                    configuration change requests that must be
                                  acknowledged, to the remote endpoint.
Address Configuration             The ASCONF-ACK chunk is used by the receiver
Acknowledgment (ASCONF-ACK)       of an ASCONF chunk to acknowledge the
                                  reception of the ASCONF chunk.
Reporting Packet Drops to an Endpoint
When a packet drop occurs because of an error other than congestion, an endpoint can
mistakenly interpret the packet drop as an indication of congestion in the network. The
misinterpretation can cause an SCTP sender to stop sending packets. This results in
under-utilization of the network link. Depending on the severity of the error, the sender
can remain in a state of congestion, which affects the performance of the association.
SCTP contains the PKTDROP chunk that reports packets that are dropped because of
errors other than congestion. By sending the PKTDROP chunk, an SCTP endpoint
can inform its peer that it has received an SCTP packet with an incorrect CRC32C or
Adler-32 checksum. The peer can then retransmit the SCTP packet without modifying
the congestion window.
For information on packet drop scenarios, see draft-stewart-sctp-pktdrprep-02.txt
(Stream Control Transmission Protocol (SCTP) Packet Drop Reporting) at:
http://tools.ietf.org/html/draft-stewart-sctp-pktdrprep-02
Support for ECN-Nonces in SCTP
With the increased deployment of real-time applications and transport services that
are sensitive to the delay and loss of packets, relying on packet loss alone as indicative
of congestion is not sufficient. SCTP's congestion management algorithms have built-in
techniques, such as Fast Retransmit and Fast Recovery, to minimize the impact of losses.
These mechanisms consider the network as a black box and continue to send packets
till packets are dropped because of congestion. However, these mechanisms are not
intended to help applications that are sensitive to the delay or loss of one or more
individual packets.
With the inclusion of active queue management techniques in the Internet infrastructure,
routers can assist in managing congestion. When a congestion occurs and the sender
continues to send packets, the number of packets in the queue in the router increases
and causes a bottleneck in the router. In such a case, the router marks the packets with
congestion experienced (CE) bits and sends them to the receiver to indicate congestion,
instead of dropping the packets. Explicit Congestion Notification (ECN) is a congestion
management algorithm that uses a similar method to handle congestion. ECN uses the
ECN field and the congestion experienced (CE) field in the IP header to mark the packets.
The ECN field contains the ECN-Capable Transport (ECT) field, which is set by the
data sender to indicate that the endpoints are ECN-capable. The CE bit is set by the
router to indicate congestion. The ECN field has two ECT code points: 10, for ECT(0),
and 01, for ECT(1). Senders use
the ECT(0) or ECT(1) code point to indicate ECT for each packet.
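The ECN code points can be read from the two low-order bits of the IPv4 TOS byte (or the IPv6 traffic class byte). The sketch below assumes the RFC 3168 assignments (00 Not-ECT, 01 ECT(1), 10 ECT(0), 11 CE); the function names are illustrative.

```python
# Code point assignments assumed from RFC 3168.
ECN_CODEPOINTS = {0b00: "Not-ECT", 0b01: "ECT(1)", 0b10: "ECT(0)", 0b11: "CE"}


def ecn_codepoint(tos_byte):
    """Name the ECN code point carried in the two low-order bits."""
    return ECN_CODEPOINTS[tos_byte & 0b11]


def mark_congestion(tos_byte):
    """Model a congested ECN-aware router marking a packet.

    An ECN-capable packet gets the CE code point instead of being
    dropped. A Not-ECT packet is returned unchanged here; a real router
    would have to drop it to signal congestion.
    """
    if tos_byte & 0b11 == 0b00:
        return tos_byte
    return tos_byte | 0b11
```

Because CE overwrites both bits, the receiver can no longer tell whether the sender chose ECT(0) or ECT(1), which is the property that the ECN-nonce relies on.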
ECN uses the following information to provide congestion notifications:
• Negotiation between the endpoints during connection setup to determine whether
they are both ECN-capable.
• An ECN-Echo (ECNE) indication in the transport protocol (a header flag in TCP, an
ECNE chunk in SCTP), which enables the data receiver to inform the data sender
when a CE packet is received.
• A congestion window reduced (CWR) indication in the transport protocol (a header
flag in TCP, a CWR chunk in SCTP), which enables the data sender to inform the
data receiver that the congestion window has been reduced.
The drawback in ECN is that a poorly implemented receiver or an intermediate network
element, such as a router, firewall, or intrusion detection system, can erase the ECNE flag
that provides the congestion signal to the sender. This is because ECN does not contain
mechanisms to prevent network elements from clearing the ECNE flag. Moreover, ECN
requires the cooperation of the receiver to return congestion experienced signals to the
sender. If the receiver erases the congestion signals to conceal congestion and does not
send these signals to the sender, the receiver gains a performance advantage at the
expense of competing connections that behave properly.
SCTP supports the ECN method and is exposed to misbehaving receivers that conceal
congestion signals. The misbehavior includes concealment of ECNE signals that may
cause an SCTP sender to be aggressive and unfair to compliant flows. SCTP supports
ECN-nonce to prevent misbehaving receivers from concealing congestion signals.
ECN-nonce also protects senders from other forms of misbehavior, such as optimistic
acknowledgements and false duplicate TSN notifications.
The ECN-nonce is a modification of the ECN signaling mechanism. It improves the
congestion control by preventing receivers from exploiting ECN to gain an unfair share
of network bandwidth. ECN-nonce improves the robustness of ECN by preventing
receivers from concealing marked or dropped packets. Like ECN, ECN-nonce uses the
ECT(0) and ECT(1) code points, the IP header flag, the cwr, and the ECNE bits.
The ECN-nonce uses two bits of the IP header called the ECT bits. The sender randomly
generates a single bit nonce and encodes it in the ECT codepoints, ECT(0) or ECT(1).
To indicate congestion in the network, routers overwrite the ECT codepoints with the
CE code point. The nonce sum (NS) is a cumulative one-bit addition of the nonces
received by the receiver. The receiver calculates the nonce sum and returns it in the NS flag
of the SACK chunk. The sender verifies the value of the NS flag in the SACK chunk.
An incorrect nonce sum implies that one or more nonces are missing at the receiver,
because all the nonces are required to calculate the correct nonce sum. If an incorrect
nonce sum is received by the sender without ECNE signals, the sender can infer that
the receiver is concealing congestion notifications.
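The nonce sum check can be sketched in a few lines. The example assumes that the one-bit addition is an exclusive-or (addition modulo 2), as in RFC 3540, and that the sender remembers the nonce of every packet covered by the SACK; both function names are hypothetical.

```python
def nonce_sum(nonces):
    """One-bit cumulative sum (exclusive-or) of a sequence of nonces."""
    total = 0
    for nonce in nonces:
        total ^= nonce & 1
    return total


def receiver_is_suspect(sent_nonces, reported_ns, ecne_seen):
    """Sender-side check of the NS flag returned in a SACK chunk.

    If the receiver reported congestion (ECNE), a wrong sum is expected,
    because a CE mark destroys the nonce. Without ECNE, a wrong sum
    suggests that a mark or a loss is being concealed.
    """
    if ecne_seen:
        return False
    return nonce_sum(sent_nonces) != reported_ns
```

A receiver that hides a CE mark has to guess the missing nonce bit, so in this model each concealed mark is detected with a probability of one half.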
The ECN-nonce support in SCTP includes the following:
• A single nonce-supported parameter in the INIT or INIT-ACK chunk that is
exchanged during the association establishment, to indicate to the peer whether
ECN-nonce is supported at both endpoints.
• A single bit flag in the SACK chunk called the Nonce Sum (NS).
SCTP Support for Partially Reliable Data Transmission
SCTP supports partially reliable data transmission service (PR-SCTP) that enables an
SCTP sender to signal the receiver that it must not expect data from the SCTP sender.
PR-SCTP enables ordered and unreliable data transfer service between endpoints, in
addition to unordered and unreliable data transfer (similar to UDP). PR-SCTP employs
the same congestion control and congestion avoidance algorithms as SCTP, for both
reliable and partially reliable data traffic.
The communication failure detection and protection capabilities of reliable SCTP data
traffic are also applicable to partially reliable data traffic. PR-SCTP enables an endpoint
to detect a failed destination address quickly and to fail over to an alternate destination
address. It also notifies the upper layer when the destination address becomes unreachable.
The chunk bundling capability in SCTP enables reliable and unreliable messages to be
multiplexed over a single PR-SCTP association. Multiplexing enables a single protocol
(that is SCTP) to be used to transmit different types of messages, instead of using
separate protocols.
SCTP includes the following parameter and chunk to support the partially reliable data
transmission service:
The Forward-TSN-Supported         This is an optional parameter in the INIT and
parameter                         INIT ACK chunks. When an association is
                                  initialized, the SCTP sender must include this
                                  parameter in the INIT or INIT ACK chunk to
                                  inform its peer that it supports partially reliable
                                  data service.
The Forward Cumulative TSN        The receiver sends this chunk to a sender to
(FORWARD TSN) chunk               indicate its support for PR-SCTP. An SCTP sender
                                  uses this chunk to inform the receiver to move
                                  its cumulative received TSN forward, because
                                  the missing TSNs are associated with data chunks
                                  that must not be transmitted or retransmitted by
                                  the sender.
The timed-reliability service is an example of a partially reliable service that SCTP
provides to the upper layer using PR-SCTP. This service enables the service user to
indicate a limit on the duration of time that the sender must try to transmit or retransmit
the message.
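The timed-reliability idea can be sketched from the sender's point of view. The example assumes that the sender tracks an expiry time for each unacknowledged TSN and, once the lifetime of the next expected chunk has passed, skips over it; the resulting value is what a FORWARD TSN chunk would carry as the new cumulative TSN. The function name and data layout are hypothetical.

```python
def advance_ack_point(cumulative_ack, expiry_by_tsn, now):
    """Return the TSN up to which the receiver may move forward.

    `expiry_by_tsn` maps each unacknowledged TSN to the absolute time at
    which the sender stops trying to deliver it. Only a contiguous run
    of expired chunks directly after the cumulative ack point is skipped.
    """
    point = cumulative_ack
    while point + 1 in expiry_by_tsn and expiry_by_tsn[point + 1] <= now:
        point += 1
    return point
```

If the returned value is larger than the cumulative ack point, the sender would abandon the skipped chunks and send a FORWARD TSN chunk instead of retransmitting them.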
If an SCTP endpoint supports the FORWARD TSN chunk, it can include the
Forward-TSN-supported parameter in the INIT chunk to indicate support for FORWARD
TSN chunk to its peer. If an endpoint chooses not to include the Forward-TSN-Supported
parameter, it cannot send or process a FORWARD TSN chunk anytime during the lifetime
of an association. Instead, it must behave as if it does not support the FORWARD TSN
chunk and return an error to the peer upon the receipt of any FORWARD TSN chunk.
When a receiver of an INIT or INIT ACK chunk detects a Forward-TSN-Supported
parameter and does not support the Forward-TSN chunk type, the receiver may
optionally respond with the Unsupported Parameters parameter, as defined in
Section 3.3.3 of RFC 2960.
A receiver can perform the following tasks if it receives an INIT chunk that does not
contain the Forward-TSN-Supported parameter:
• Include the Forward-TSN-Supported parameter in INIT-ACK.
• Record the information that the peer does not support the FORWARD TSN chunk.
• Refrain from sending a FORWARD TSN chunk at any time during the lifetime of
an association.
• Inform the upper layer, if the upper layer has requested such a notification, that the
peer endpoint does not support the Forward-TSN-Supported parameter.
Error Handling in SCTP
The network traffic in the Internet is unpredictable. Sudden network failures and traffic
surges can occur, which result in non-reachability of an endpoint. Such a network is
error prone and a sending application must be cautious while transmitting or
retransmitting data, because the receiving endpoint may be unavailable to receive data.
The unavailability of the endpoint is caused either by a path failure or an endpoint
failure.
SCTP offers appropriate error handling methods, to overcome this problem. Before
transmitting data, SCTP sends chunks of information to verify whether a destination
is active. Before using a different path to reach a destination or closing an
association, SCTP verifies that the destination address is unreachable or inactive.
SCTP uses the following error handling methods:
• Retransmission of DATA chunks
• HEARTBEATs to identify path failures
• HEARTBEATs to identify endpoint failures
This section addresses the following topics:
• “Retransmission of DATA Chunks” (page 37)
• “HEARTBEATs to Identify Path Failures” (page 38)
• “HEARTBEATs to Identify Endpoint Failure” (page 38)
Retransmission of DATA Chunks
SCTP uses DATA chunks to exchange information between two addresses. Upon
receiving a DATA chunk, the receiving address sends an acknowledgement to the
sending address. If the receiving address does not receive the DATA chunk properly,
it sends a SACK packet that triggers the sending address to retransmit the DATA chunk.
The sending address also retransmits the DATA chunk when the retransmission timer
times out.
SCTP limits the rate of retransmission of DATA chunks, to reduce chances of congestion.
It modifies the retransmission timeout (RTO) value, based on the estimates of the round
trip delay and reduces the transmission rate exponentially when the message loss
increases.
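The RTO adjustment described above can be sketched as follows. The constants are the values suggested in RFC 2960 (alpha 1/8, beta 1/4, initial RTO 3 seconds, minimum 1 second, maximum 60 seconds) and are assumptions here, not values stated in this document; the class name is hypothetical.

```python
class RtoEstimator:
    """Track the smoothed round-trip time and derive the RTO from it."""

    ALPHA = 1 / 8  # weight of a new sample in the smoothed RTT
    BETA = 1 / 4   # weight of a new sample in the RTT variation

    def __init__(self, initial=3.0, minimum=1.0, maximum=60.0):
        self.minimum = minimum
        self.maximum = maximum
        self.rto = initial
        self.srtt = None
        self.rttvar = None

    def on_measurement(self, rtt):
        """Update the RTO from one round-trip time sample (seconds)."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.rto = min(self.maximum, max(self.minimum, self.srtt + 4 * self.rttvar))

    def on_timeout(self):
        """Back off exponentially: double the RTO, up to the maximum."""
        self.rto = min(self.rto * 2, self.maximum)
```

Each expiry of the retransmission timer doubles the RTO, so repeated loss quickly slows the retransmission rate until a new round-trip measurement resets it.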
In an active SCTP association with constant DATA transmission, SACKs are more likely
to cause retransmission than the retransmission timeout. To reduce unnecessary
retransmission of data, SCTP uses the four SACK rule, so that SCTP retransmits a DATA
chunk only after receiving the fourth SACK, which indicates a missing DATA chunk.
SCTP also uses the four SACK rule to avoid retransmission caused by normal
occurrences, such as packets received out of sequence.
HEARTBEATs to Identify Path Failures
SCTP periodically sends HEARTBEAT chunks to idle destinations, or alternate addresses,
to identify a path failure. SCTP maintains a counter to store the number of heartbeats
that are sent to the idle destination without receiving a corresponding HEARTBEAT
ACK chunk. When the counter reaches the specified maximum value, SCTP declares
the destination address as inactive. SCTP notifies the application about the inactive
destination address and starts using an alternate address for sending the DATA chunks.
However, SCTP continues to send heartbeats to the inactive destination address until
it receives a HEARTBEAT ACK chunk. On receipt of a HEARTBEAT ACK chunk, SCTP
considers the destination address as active again. The rate at which SCTP sends
heartbeats depends on the sum of the RTO value and the delay parameter, which
allows heartbeat traffic to be tailored to the needs of the user application.
HEARTBEATs to Identify Endpoint Failure
SCTP identifies an endpoint failure in a way that is similar to path failure discussed in
“HEARTBEATs to Identify Path Failures” (page 38).
SCTP maintains a counter across all destination addresses, to store the number of
retransmits or Heartbeats sent to the remote endpoint without a successful ACK. When
the value of the counter exceeds a preconfigured maximum value, SCTP declares the
endpoint as unreachable and closes the association.
SCTP Security
SCTP uses the following methods to provide security:
• Cookie Mechanism
• Verification Tag
This section addresses the following topics:
• “Cookie Mechanism” (page 38)
• “Verification Tag” (page 39)
Cookie Mechanism
A cookie mechanism is employed during the initialization of an association, to provide
protection against security attacks. The cookie mechanism uses a four-way handshake,
and the last two messages of the handshake are allowed to carry user data for fast setup.
The cookie mechanism guards against a blind attacker that generates INIT chunks
to overload the resources of an SCTP server by causing the server to use memory
and resources to handle new INIT requests. Instead of allocating memory for a
Transmission Control Block (TCB), the server creates a cookie parameter with the TCB
information, together with a valid lifetime and a signature for authentication, and sends
these back in the INIT ACK chunk. The blind attacker cannot obtain the cookie, because
the INIT ACK always goes back to the source address of the INIT. A valid SCTP client
gets the cookie and returns it in the COOKIE ECHO chunk, where the SCTP server can
validate the cookie and use it to rebuild the TCB. The cookie is created by the server,
and the cookie format and secret key remain with the server. The server does not
exchange these details with the client.
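The following C sketch illustrates the idea of a stateless cookie: the server signs the cookie contents with a secret key, keeps no state, and validates the signature and lifetime when the cookie comes back. The cookie layout and all function names are hypothetical. The toy_mac() function is a simple keyed hash used only to keep the example self-contained; it is not cryptographically secure, and a real implementation would use a proper message authentication code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cookie layout; a real cookie carries the full TCB information. */
struct cookie {
    uint32_t peer_tag;   /* stand-in for the TCB information */
    uint32_t created_at; /* creation time, in seconds */
    uint32_t lifetime;   /* validity period, in seconds */
    uint64_t signature;  /* keyed hash over the three fields above */
};

/* Toy keyed hash (FNV-1a seeded with the key). NOT cryptographically secure. */
static uint64_t toy_mac(uint64_t key, const void *data, size_t len)
{
    const unsigned char *p = data;
    uint64_t h = 14695981039346656037ULL ^ key;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

static uint64_t sign_cookie(uint64_t key, const struct cookie *c)
{
    /* Copy the fields into an array so that no padding bytes are hashed. */
    uint32_t fields[3] = { c->peer_tag, c->created_at, c->lifetime };
    return toy_mac(key, fields, sizeof fields);
}

/* Server side, on INIT: build the cookie instead of allocating a TCB. */
struct cookie make_cookie(uint64_t key, uint32_t peer_tag,
                          uint32_t now, uint32_t lifetime)
{
    struct cookie c = { peer_tag, now, lifetime, 0 };
    c.signature = sign_cookie(key, &c);
    return c;
}

/* Server side, on COOKIE ECHO: check the signature, then the lifetime. */
bool cookie_is_valid(uint64_t key, const struct cookie *c, uint32_t now)
{
    assert(c != NULL);
    if (sign_cookie(key, c) != c->signature)
        return false;                           /* forged or altered */
    return now - c->created_at <= c->lifetime;  /* unsigned difference */
}
```

Because only the server holds the key, a client can return the cookie but cannot alter it without invalidating the signature.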
Verification Tag
A verification tag is a 32-bit unsigned integer that is randomly generated to verify
whether the SCTP packet belongs to the current association, or is a stale packet from
a previous association. SCTP discards packets received without the expected verification
tag value, to protect against blind masquerade attacks and also from receiving stale
SCTP packets from a previous association.
The verification tag rules apply when sending or receiving SCTP packets that do not
contain an INIT, SHUTDOWN COMPLETE, COOKIE ECHO, ABORT, or a SHUTDOWN ACK
chunk.
While sending an SCTP packet, the endpoint must fill in the verification tag field of
the outbound packet, with the tag value in the Initiate Tag parameter of INIT or
INIT ACK received from its peer.
After receiving an SCTP packet, the endpoint must ensure that the value in the
verification tag field of the received SCTP packet matches its own tag. If the received
verification tag value does not match the receiver's own tag value, the receiver silently
discards the packet and does not process it any further.
The verification tag value is chosen by each endpoint of the association during
association startup.
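The send and receive rules can be summarized in a few lines of C. This is a minimal sketch for packets to which the verification tag rules apply (that is, packets that do not contain the chunk types listed earlier); the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-association tag state. */
struct assoc_tags {
    uint32_t my_tag;   /* Initiate Tag this endpoint sent in its INIT or INIT ACK */
    uint32_t peer_tag; /* Initiate Tag received from the peer */
};

/* Sending rule: outbound packets carry the tag chosen by the peer. */
uint32_t outbound_tag(const struct assoc_tags *t)
{
    assert(t != NULL);
    return t->peer_tag;
}

/*
 * Receiving rule: the packet is processed only if its tag matches the
 * receiver's own tag. false means "silently discard".
 */
bool accept_inbound(const struct assoc_tags *t, uint32_t received_tag)
{
    assert(t != NULL);
    return received_tag == t->my_tag;
}
```

For example, an endpoint whose own tag is 0xAABBCCDD places the peer's tag in every packet it sends and discards any inbound packet whose tag is not 0xAABBCCDD.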
2 SCTP Socket APIs
This chapter discusses the different SCTP socket API types, their call flow sequence,
SCTP events and notifications, socket options, common socket calls, and the SCTP
ancillary data structures.
This chapter addresses the following topics:
• “Overview” (page 41)
• “Socket API Versus SCTP Socket API” (page 41)
• “Different Socket API Styles” (page 42)
• “API Options to Modify Socket Behavior” (page 52)
• “Common Socket Calls” (page 54)
• “SCTP Events and Notifications” (page 57)
• “SCTP Ancillary Data Structures” (page 58)
• “SCTP-Specific Socket APIs” (page 61)
Overview
The socket layer in an IP stack contains socket APIs that enable the transport layer to
interface with the application layer. The socket APIs make the various protocol-specific
features available to an application.
SCTP contains the existing socket APIs and the SCTP-specific APIs. Both these APIs
enable SCTP to interface with the application layer. These APIs are also compatible
with TCP applications that can be migrated to SCTP with minimum changes.
Following are the design objectives of the SCTP socket APIs:
• Maintain consistency and ensure compatibility with the existing sockets APIs
• Define socket mapping for SCTP that is consistent with other socket API protocols,
such as UDP, TCP, IPv4, and IPv6
• Support a one-to-many style interface
• Support a one-to-one style interface
The following sections discuss the differences between the socket API and the SCTP
socket APIs, the different SCTP socket API styles, data structures that enable applications
to control an association, and socket APIs to modify the socket options.
Socket API Versus SCTP Socket API
The SCTP APIs use the existing socket APIs to perform operations that are similar to
the operating behavior of the socket APIs. For example, in the existing socket APIs and
the SCTP socket APIs, an application can call the bind() API only once and an
application can specify only a single address in the bind() API.
However, because of the unique features of SCTP, such as multistreaming and
multihoming, some existing socket APIs either do not work on an SCTP socket, or their
semantics need modification. For example, because of the multihoming feature
supported in SCTP, the getsockname() and getpeername() socket APIs do not work on
an SCTP socket if a given association is bound to multiple local addresses and the
association has multiple peer addresses. Applications must use the sctp_getpaddrs()
SCTP socket API to obtain the peer addresses in an association.
Unlike the existing socket APIs, the SCTP socket APIs disclose many features of the
SCTP protocol and association status to the application, to enable applications to gain
better control over the SCTP protocol. For example, an application can specify some
of the association setup parameters, such as the number of desired outbound streams
and maximum number of inbound streams, to control an association.
Different Socket API Styles
This section discusses the different socket API styles and the basic call flow sequence
of each socket API style.
Following are the different socket API styles:
• One-to-one socket APIs
• One-to-many socket APIs
The one-to-one style API is similar to the existing socket APIs for a connection-oriented
protocol, such as TCP. The one-to-many style API facilitates simultaneous associations
with multiple peers using one endpoint (that is, it associates with multiple peers using
one socket file descriptor simultaneously).
These socket API styles share common data structures and operations. However, each
socket API style requires a different application programming style. You can use these
socket APIs to implement all the SCTP features. You can also select the API style
depending on the type of association you need in the application.
This section addresses the following topics:
• “One-to-One Socket APIs” (page 42)
• “Basic One-to-One Call Flow Sequence” (page 43)
• “One-to-Many Socket APIs” (page 48)
• “Basic One-to-Many Call Flow Sequence” (page 48)
One-to-One Socket APIs
The one-to-one style socket APIs are designed to enable the existing TCP applications
to migrate to SCTP with minimal changes. The sequence of socket calls made by the
client and server of a one-to-one style SCTP application is similar to the sequence of
socket calls made by a TCP application. A one-to-one style SCTP application can control
only one association using one file descriptor.
Basic One-to-One Call Flow Sequence
A one-to-one style SCTP application uses the following system call sequence to prepare
an SCTP endpoint for servicing requests:
1. socket()
2. bind() or sctp_bindx()
3. sctp_getladdrs()
4. sctp_freeladdrs()
5. listen()
6. accept()
   When a client sends a connection request to the server, the accept() call returns
   with a new socket descriptor. The server then uses the new socket descriptor to
   communicate with the client, using recv() and send() calls to receive requests
   and send responses.
7. sctp_getpaddrs()
8. sctp_freepaddrs()
9. recv() or recvmsg()
10. send() or sctp_sendx() or sctp_send()
11. close() terminates the association.
An SCTP client uses the following system call sequence to set up an association with
a server to request services:
1. socket()
2. connect() or sctp_connectx()
   After returning from connect(), the client uses send() and recv() calls to
   send out requests and receive responses from the server.
3. The client calls close() to terminate this association when done.
For more information about the one-to-one style socket calls, see “Common Socket
Calls” (page 54).
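The server and client call sequences can be exercised together in one process over the loopback address. The following C sketch is a simplified illustration, not a complete server or client: it uses only the standard socket calls, skips the optional SCTP-specific calls in the sequences, and uses the hypothetical function name loopback_call_flow(). It assumes a system where IPPROTO_SCTP is defined in <netinet/in.h>; passing IPPROTO_TCP runs the same sequence over TCP, which shows how closely the one-to-one style follows the TCP call flow.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Run the one-to-one server and client sequences against each other on
 * 127.0.0.1. Returns 0 if the 4-byte request arrives intact, -1 otherwise.
 */
int loopback_call_flow(int proto)
{
    int lsd = -1, csd = -1, asd = -1, rc = -1;
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    char buf[8] = {0};

    assert(proto == IPPROTO_SCTP || proto == IPPROTO_TCP);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                       /* let the system pick a port */

    /* Server: socket(), bind(), listen() */
    lsd = socket(AF_INET, SOCK_STREAM, proto);
    if (lsd < 0) goto out;
    if (bind(lsd, (struct sockaddr *)&addr, sizeof addr) < 0) goto out;
    if (listen(lsd, 1) < 0) goto out;
    /* Learn the port that was assigned (single bound address only). */
    if (getsockname(lsd, (struct sockaddr *)&addr, &len) < 0) goto out;

    /* Client: socket(), connect() */
    csd = socket(AF_INET, SOCK_STREAM, proto);
    if (csd < 0) goto out;
    if (connect(csd, (struct sockaddr *)&addr, sizeof addr) < 0) goto out;

    /* Server: accept() returns a new descriptor for the association. */
    asd = accept(lsd, NULL, NULL);
    if (asd < 0) goto out;

    /* Client sends a request; server receives it on the new descriptor. */
    if (send(csd, "ping", 4, 0) != 4) goto out;
    if (recv(asd, buf, sizeof buf - 1, 0) != 4) goto out;
    rc = (memcmp(buf, "ping", 4) == 0) ? 0 : -1;

out:
    /* close() releases each descriptor and terminates the association. */
    if (asd >= 0) close(asd);
    if (csd >= 0) close(csd);
    if (lsd >= 0) close(lsd);
    return rc;
}
```

On a system without SCTP support, loopback_call_flow(IPPROTO_SCTP) returns -1 because the first socket() call fails.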
The socket() Socket API
Applications call socket() to create a socket descriptor, to represent an SCTP endpoint.
Following is the syntax for the socket() socket API:
int socket(PF_INET, SOCK_STREAM, IPPROTO_SCTP);
or
int socket(PF_INET6, SOCK_STREAM, IPPROTO_SCTP);
where:
PF_INET        Specifies the IPv4 domain.
PF_INET6       Specifies the IPv6 domain.
SOCK_STREAM    Indicates the creation of a one-to-one style socket.
IPPROTO_SCTP   Specifies the type of the protocol.
The first syntax of the socket() socket API creates an endpoint that can use only IPv4
addresses, while the second syntax creates an endpoint that can use both IPv6 and
IPv4 addresses.
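A minimal sketch of creating a one-to-one style socket with error reporting follows. The function name open_one_to_one() is hypothetical, and the code assumes that IPPROTO_SCTP is defined in <netinet/in.h>. If the system has no SCTP support, socket() fails and the function returns -1 with errno set.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Create a one-to-one style SCTP socket.
 * family is PF_INET (IPv4 only) or PF_INET6 (IPv6 and IPv4).
 * Returns the socket descriptor, or -1 with errno set.
 */
int open_one_to_one(int family)
{
    int sd;

    assert(family == PF_INET || family == PF_INET6);
    sd = socket(family, SOCK_STREAM, IPPROTO_SCTP);
    if (sd < 0) {
        int saved = errno;               /* fprintf() may overwrite errno */
        fprintf(stderr, "SCTP socket: %s\n", strerror(saved));
        errno = saved;
    }
    return sd;
}
```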
The bind() Socket API
Applications use bind() to specify the local address with which an SCTP endpoint
must associate.
These addresses, associated with a socket, are eligible transport addresses for the
endpoint to send and receive data. The endpoint also presents these addresses to its
peers during the association initialization process. To accept new associations on the
socket, the endpoint must call listen(), after calling bind(). For information on
listen(), see “The listen() Socket API” (page 45).
Following is the syntax for the bind() API:
ret = bind(int sd, struct sockaddr *addr, socklen_t addrlen);
where:
sd        Represents the socket descriptor returned by the socket() call.
addr      Represents the address structure (struct sockaddr_in or struct
          sockaddr_in6).
addrlen   Represents the size of the address structure.
If sd is an IPv4 socket, the address passed must be an IPv4 address. If sd is an IPv6
socket, the address passed can either be an IPv4 or an IPv6 address.
Applications cannot call bind() multiple times to associate multiple addresses to an
endpoint. After the first call to bind(), all the subsequent calls will return an error.
If addr is specified as a wildcard (INADDR_ANY for an IPv4 address, or
IN6ADDR_ANY_INIT or in6addr_any for an IPv6 address), the operating system
associates the endpoint with an optimal address set of the available interfaces. If bind()
is not called before a sendmsg() call that initiates a new association, the endpoint
picks a transient port and chooses an address set that is equivalent to binding with a
wildcard address. One of the addresses in the address set serves as the primary address
for the association. Thus, when an application calls bind() with the INADDR_ANY or
the IN6ADDR_ANY_INIT wildcard address, the multihoming feature is enabled in
SCTP.
The completion of the bind() process alone does not prepare the SCTP endpoint to
accept inbound SCTP association requests. Until a listen() system call is performed
on the socket, the SCTP endpoint promptly rejects an inbound INIT request with an
ABORT chunk.
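The following sketch binds an IPv4 socket to the wildcard address. The function name bind_wildcard() is hypothetical, and passing port 0 to request a system-chosen port is an assumption made to keep the example easy to run.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Bind an IPv4 socket to INADDR_ANY so that the system chooses the
 * address set. port is in host byte order; 0 requests any free port.
 * Returns the result of bind(): 0 on success, -1 with errno set.
 */
int bind_wildcard(int sd, unsigned short port)
{
    struct sockaddr_in addr;

    assert(sd >= 0);
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    return bind(sd, (struct sockaddr *)&addr, sizeof addr);
}
```

Calling bind_wildcard() a second time on the same descriptor is expected to return -1, because subsequent bind() calls return an error.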
The listen() Socket API
Applications use listen() to prepare the SCTP endpoint for accepting inbound
associations.
Following is the syntax for the listen() socket API:
int listen(int sd, int backlog);
where:
sd        Represents the socket descriptor of the SCTP endpoint.
backlog   Represents the maximum number of outstanding associations allowed in
          the accept queue of the socket. These associations have completed the
          four-way initiation handshake and are in the ESTABLISHED state. A
          backlog of 0 (zero) indicates that the caller no longer wants to receive new
          associations.
The accept() Socket API
Applications use the accept() call to remove an established SCTP association from
the accept queue. The accept() API returns a new socket descriptor, to represent the
newly formed association.
Following is the syntax for the accept() socket API:
new_sd = accept(int sd, struct sockaddr *addr, socklen_t *addrlen);
where:
new_sd    Represents the socket descriptor for the newly formed association.
sd        Represents the listening socket descriptor.
addr      Contains the primary address of the peer endpoint.
addrlen   Specifies the size of addr.
The connect() Socket API
Applications use connect() to initiate an association with a peer.
Following is the syntax for the connect() socket API:
int connect(int sd, const struct sockaddr *addr, socklen_t addrlen);
where:
sd        Represents the socket descriptor of the endpoint.
addr      Represents the address of the peer.
addrlen   Represents the size of the address.
By default, the newly created association has only one outbound stream. To change the
number of outbound streams, applications must set the SCTP_INITMSG socket option
before connecting to the server. Applications set and retrieve the SCTP_INITMSG option
using the setsockopt() and getsockopt() APIs.
If the application does not call the bind() API before calling connect(), SCTP picks
a transient port and chooses an address set that is equivalent to binding with
INADDR_ANY and IN6ADDR_ANY for IPv4 and IPv6 sockets, respectively. One of these
addresses serves as the primary address for the association. When an application calls
bind() with the INADDR_ANY or the IN6ADDR_ANY_INIT wildcard address, the
multihoming feature is enabled in SCTP.
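The following sketch shows one way to request a number of streams with the SCTP_INITMSG option before calling connect(). It assumes that the system provides <netinet/sctp.h> with struct sctp_initmsg and its sinit_num_ostreams and sinit_max_instreams fields; where the header is not installed, the sketch compiles to a stub that fails with ENOSYS. The function name request_streams() is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Use the SCTP header only if the compiler can find it. */
#if defined(__has_include)
#  if __has_include(<netinet/sctp.h>)
#    include <netinet/sctp.h>   /* struct sctp_initmsg, SCTP_INITMSG */
#    define HAVE_SCTP_HEADER 1
#  endif
#endif

/*
 * Ask for out_streams outbound streams and accept up to max_in_streams
 * inbound streams on associations created later on this socket.
 * Call this before connect(). Returns 0 on success, -1 with errno set.
 */
int request_streams(int sd, unsigned short out_streams,
                    unsigned short max_in_streams)
{
    assert(sd >= 0);
#ifdef HAVE_SCTP_HEADER
    struct sctp_initmsg init;

    memset(&init, 0, sizeof init);  /* zero leaves other fields at defaults */
    init.sinit_num_ostreams = out_streams;
    init.sinit_max_instreams = max_in_streams;
    return setsockopt(sd, IPPROTO_SCTP, SCTP_INITMSG, &init, sizeof init);
#else
    (void)out_streams;
    (void)max_in_streams;
    errno = ENOSYS;                 /* SCTP header not available here */
    return -1;
#endif
}
```

The number of streams actually used is negotiated with the peer during association setup, so the values passed here are requests rather than guarantees.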
The close() Socket API
Applications use close() to gracefully close down an association.
Following is the syntax for the close() socket API:
int close(int sd);
where:
sd        Represents the socket descriptor of the association to be closed.
After an application calls close() on a socket descriptor, no further socket operations
succeed on that descriptor.
The shutdown() Socket API
Applications use the shutdown() socket API to disable send or receive operations at
an endpoint. The effect of the shutdown() call is different in SCTP and TCP. In TCP,
a connection enters a half-closed state after an application calls shutdown() to
disable send operations. In the half-closed state, the application that called
shutdown() can no longer send data, but it can continue to receive data that the peer
sends. SCTP does not support the half-closed state: after SCTP starts the shutdown
sequence, the applications at both endpoints can no longer send new data on the
association.
NOTE: Applications can use the SCTP streams feature to achieve the half-closed state
in SCTP.
Following is the syntax for the shutdown() socket call:
int shutdown(int sd, int how);
where:
sd        Specifies the socket descriptor of the association that needs to be closed.
how       Specifies the type of shutdown. The values are as follows:
          SHUT_RD     Disables further receive operations
          SHUT_WR     Disables further send operations and initiates the SCTP
                      shutdown sequence
          SHUT_RDWR   Disables further send and receive operations, and initiates
                      the SCTP shutdown sequence
In SCTP, SHUT_WR initiates an immediate and full protocol shutdown. In TCP, SHUT_WR
causes TCP to enter a half-closed state. The SHUT_RD value behaves in the same way
for SCTP and TCP. SHUT_WR closes the SCTP association while leaving the socket
descriptor open, so that the caller can still receive any data that SCTP was unable
to deliver.
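Because the descriptor stays open after SHUT_WR, an application can keep reading from it. The following sketch shows that pattern with the hypothetical function shutdown_and_drain(). It only reads whatever is still readable on the descriptor; retrieving data that SCTP could not deliver would additionally require SCTP notifications, which are not shown here.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Disable further sends on sd, then read until the peer has nothing more
 * to deliver. Returns the number of bytes drained, or -1 on error.
 * The descriptor remains open; the caller still has to close() it.
 */
long shutdown_and_drain(int sd)
{
    char buf[256];
    long total = 0;
    ssize_t n;

    assert(sd >= 0);
    if (shutdown(sd, SHUT_WR) < 0)
        return -1;
    while ((n = recv(sd, buf, sizeof buf, 0)) > 0)
        total += n;                 /* data is discarded in this sketch */
    return n < 0 ? -1 : total;
}
```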
The sendmsg() and recvmsg() Socket APIs
Applications use the sendmsg() and recvmsg() socket APIs to transmit data to and
receive data from their peers.
Following is the syntax for the sendmsg() and recvmsg() socket APIs:
ssize_t sendmsg(int sd, const struct msghdr *message, int flags);
ssize_t recvmsg(int sd, struct msghdr *message, int flags);
where:
sd        Represents the socket descriptor of the endpoint.
message   Specifies the pointer to the msghdr structure that contains a single user
          message and the ancillary data. Following is the definition of the msghdr
          structure:
          struct msghdr {
              void *msg_name;
              socklen_t msg_namelen;
              struct iovec *msg_iov;
              size_t msg_iovlen;
              void *msg_control;
              socklen_t msg_controllen;
              int msg_flags;
          };
          where:
          msg_name         Specifies the pointer to the socket address structure.
          msg_namelen      Specifies the size of the socket address structure.
          msg_iov          Includes an array of message buffers.
          msg_iovlen       Specifies the number of elements in the msg_iov array.
          msg_control      Specifies the ancillary data.
          msg_controllen   Specifies the length of the ancillary data buffer.
          msg_flags        Specifies the flags on the received message.
          For more information on the msghdr structure, see RFC 2292 (Advanced
          Sockets API for IPv6).
flags     Contains flags that affect the messages being sent or received.
NOTE: A sendmsg() call does not fail if it specifies an invalid SCTP stream identifier,
but an error is returned on all subsequent calls on the file descriptor.
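The following sketch shows how the msghdr and iovec structures fit together for one user message. The function names send_two_parts() and recv_one() are hypothetical, and the sketch leaves msg_name and msg_control empty, so it sends no address and no SCTP ancillary data. It works on any connected socket that preserves message boundaries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send hdr followed by body as a single user message, using two buffers. */
ssize_t send_two_parts(int sd, const char *hdr, const char *body)
{
    struct iovec iov[2];
    struct msghdr msg;

    assert(hdr != NULL && body != NULL);
    iov[0].iov_base = (void *)hdr;
    iov[0].iov_len = strlen(hdr);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len = strlen(body);

    memset(&msg, 0, sizeof msg);    /* no address, no ancillary data */
    msg.msg_iov = iov;
    msg.msg_iovlen = 2;
    return sendmsg(sd, &msg, 0);
}

/*
 * Receive one message into buf. If flags_out is not NULL, it receives the
 * msg_flags value that recvmsg() reports for the message.
 */
ssize_t recv_one(int sd, char *buf, size_t len, int *flags_out)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;
    ssize_t n;

    assert(buf != NULL);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    n = recvmsg(sd, &msg, 0);
    if (n >= 0 && flags_out != NULL)
        *flags_out = msg.msg_flags;
    return n;
}
```

The receiver sees the two buffers as one contiguous message, because the iovec array only describes where the bytes of a single message are stored.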
The getpeername() Socket API
Applications use the getpeername() socket API to retrieve the primary socket address
of the peer.
Following is the syntax for the getpeername() socket API:
int getpeername(int sd, struct sockaddr *address, socklen_t *len);
where:
sd        Specifies the socket descriptor to be queried.
address   Contains the primary peer address. If the socket is an IPv4 socket, the
          address will be an IPv4 address. If the socket is an IPv6 socket, the address
          will be either an IPv6 or an IPv4 address.
len       Specifies the length of the address.
If the actual length of the address is greater than the length of the supplied sockaddr
structure, SCTP truncates the stored address.
NOTE: The getpeername() socket API is available only for TCP compatibility. It
must not be used for the multihoming feature in SCTP, because this socket API does
not work with one-to-many style sockets.
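The following sketch retrieves the primary peer address and converts it to text. The function name primary_peer_to_text() is hypothetical. It uses struct sockaddr_storage so that the supplied buffer is large enough for either an IPv4 or an IPv6 address and the stored address is not truncated.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Write the primary peer address of sd into out as text.
 * Returns 0 on success, -1 on failure (for example, if sd is not connected).
 */
int primary_peer_to_text(int sd, char *out, socklen_t outlen)
{
    struct sockaddr_storage ss;     /* large enough for IPv4 or IPv6 */
    socklen_t len = sizeof ss;

    assert(out != NULL);
    if (getpeername(sd, (struct sockaddr *)&ss, &len) < 0)
        return -1;
    if (ss.ss_family == AF_INET) {
        const struct sockaddr_in *a = (const struct sockaddr_in *)&ss;
        return inet_ntop(AF_INET, &a->sin_addr, out, outlen) ? 0 : -1;
    }
    if (ss.ss_family == AF_INET6) {
        const struct sockaddr_in6 *a = (const struct sockaddr_in6 *)&ss;
        return inet_ntop(AF_INET6, &a->sin6_addr, out, outlen) ? 0 : -1;
    }
    return -1;                      /* unexpected address family */
}
```

A buffer of INET6_ADDRSTRLEN bytes is large enough for either address family.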
One-to-Many Socket APIs
The one-to-many style APIs are designed to enable applications to control many
associations from a single endpoint, using a single file descriptor. Similar to the APIs
in UDP, one-to-many style APIs in SCTP enable a single socket file descriptor to connect
to multiple remote endpoints. A one-to-many style socket can send and receive data
without connecting to an endpoint. Unlike UDP, however, SCTP always has a valid
association with the specified endpoints, because SCTP is a connection-oriented protocol.
Basic One-to-Many Call Flow Sequence
A server in the one-to-many style uses the following socket call sequence to prepare
an endpoint for servicing requests:
1. socket()
2. bind() or sctp_bindx()
3. sctp_getladdrs()
4. sctp_freeladdrs()
5. listen()
6. sctp_getpaddrs()