SmartSwitch IP Fragmentation White Paper
This document discusses the support for frame fragmentation
techniques within the Cabletron SmartSwitch ASIC-based product
lines.
1.0 ATM and IP Fragmentation
As many organizations evolve from today's traditional hub
and router environments towards the ATM-based switched virtual
networks of tomorrow, it is important to address some of the
frame fragmentation issues associated with the deployment of
ATM switching in conjunction with traditional packet based
technologies.
ATM switch vendors and developers have set a trend in the
communications industry towards reducing the complexity
associated with switching fabrics and devices. In an effort to
deliver low latency, high performance switching to support
multimedia, distance learning, videoconferencing, and
client-server applications, the switches themselves are
actually becoming simpler: they deal only with fixed size
payloads, which enables hardware based switching techniques to
be employed.
An industry standard critical to the successful
provisioning of ATM services in conjunction with traditional
packet based LANs is the ATM Forum LAN Emulation Specification
v1.0. LAN Emulation is the industry's only ratified standard
from a standards authority chartered with the focus of ATM
technology. The LAN Emulation standard defines the Maximum
Transmission Unit (MTU) size of the ATM Adaptation Layer 5
(AAL-5) SDU. While the ATM Forum defined AAL-5 SDU sizes of
1516, 4544, 9234, and 18,190 bytes, the Forum only ratified
support for Ethernet and Token Ring framing. Other framing
types such as those associated with FDDI were not accepted.
The ATM Forum clearly states "All LE Clients that belong to a
given emulated LAN Must use the same maximum AAL-5 SDU size for
their connections to/from the BUS and for their connections to
each other." Therefore, organizations that plan on
implementing ATM servers with Ethernet clients would configure
the ATM hosts for an MTU of 1500 bytes. Further, if FDDI hosts
remained in use as the ATM backbone began to be deployed, the
LAN Emulation specification would require these hosts to
change their MTU to 1500 bytes because of the Ethernet hosts
that would be members of the same emulated LAN.
It is important to note that ATM switches perform no
fragmentation function whatsoever, so as to reduce the
operational complexity of the switch and deliver high
performance throughput. Frame fragmentation introduces
additional complexity to the switch and thereby places
additional processing burden upon it.
Another standards organization, the IETF, continues work on
proposed standards such as RFC 1577, Classical IP over ATM.
These draft status RFCs define methods for MTU sizes
exceeding 8,000 bytes. However, one must recognize that these
RFCs are not ratified standards as referenced in RFC 1780 and
1796.
The reader of this document should take a mental step back
in time to reflect on modem technologies. When modems were
first deployed by vendors, they performed a checksum on each
transmission to ensure that after a file transfer was
completed, the data had been accurately received on the other
end of the link. Otherwise, a file may have reached the other
end, only to require complete retransmission due to a lost
datagram or line bit errors. If an ATM attached host were to
transmit 8,000 bytes before acknowledgment or checksumming,
over 160 ATM cells would be created, and if even one were to be
dropped by a switch or experience a line bit error, all of
those cells would require retransmission. Much like modems do
error checking on smaller payloads, a similar end result can
be accomplished by limiting the ATM host MTU to 1500 bytes,
which creates approximately 30 ATM cells. These smaller MTUs,
in conjunction with traffic management techniques such as Cell
Packet Discard, can combine to offer the network user the
greatest amount of network "goodput" (successful
end-to-end delivery).
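The cell counts quoted above can be checked with a short
calculation. The following Python sketch is illustrative only: it
assumes the standard 48-byte cell payload and 8-byte AAL-5 trailer,
ignores any LAN Emulation or LLC encapsulation bytes, and treats cell
losses as independent events. The function names and the example loss
rate are hypothetical and are not taken from any Cabletron product.

```python
import math

ATM_CELL_PAYLOAD = 48   # payload bytes carried by each 53-byte ATM cell
AAL5_TRAILER = 8        # AAL-5 trailer appended to every SDU before padding


def aal5_cells(sdu_bytes: int) -> int:
    """Return the number of ATM cells needed to carry one AAL-5 SDU."""
    # SDU plus trailer is padded up to a whole number of 48-byte payloads.
    return math.ceil((sdu_bytes + AAL5_TRAILER) / ATM_CELL_PAYLOAD)


def delivery_probability(sdu_bytes: int, cell_loss_rate: float) -> float:
    """Probability that every cell of one SDU arrives (independent losses assumed)."""
    return (1.0 - cell_loss_rate) ** aal5_cells(sdu_bytes)


# Hypothetical cell loss rate of 1 in 10,000, chosen only for illustration.
for size in (1500, 8000):
    print(size, aal5_cells(size), delivery_probability(size, 1e-4))
```

Under these assumptions a 1500 byte SDU occupies 32 cells and an
8,000 byte SDU occupies 167, so the larger SDU gives roughly five
times as many opportunities for a single lost cell to force a
retransmission.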
In summary, if a network designer envisions the deployment
of ATM within his or her existing Ethernet infrastructure, the
ATM Forum specifications defining a common AAL-5 SDU would
result in a maximum MTU size of approximately 1,500 bytes
being required.
2.0 FDDI and IP Fragmentation
Frame Fragmentation issues are therefore only introduced in
those environments containing FDDI hosts and Ethernet
end-points. ATM and Fast Ethernet attached hosts already limit
their MTU to 1500 bytes (ATM through LANE and 802.3u Fast
Ethernet through retention of the 802.3 framing structure).
The IETF has defined a technique through multiple RFCs to
fragment IP datagrams to address the FDDI frame fragmentation
issues. It is critical to note that the only widely
implemented protocol within the networking industry which can
be fragmented is TCP/IP. Many of the other widely implemented
networking protocols dynamically determine their MTU and
therefore do not require fragmentation. It is also worth
noting that most UNIX workstations are delivered out of the box
with the MTU set to a default of 1500 bytes. While FDDI
interfaces may be factory-installed by the workstation
manufacturer, adjusting the MTU to a larger size actually
requires the administrator or the manufacturer configuration
facility to regenerate the kernel on the workstation.
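The IP fragmentation technique referred to above (defined for IPv4 in
RFC 791) can be sketched as follows. This is a minimal illustration,
assuming a 20 byte IP header with no options and a hypothetical 4500
byte datagram forwarded onto a link with a 1500 byte MTU. The function
name and the tuple layout are illustrative only and do not describe
the implementation in any particular bridge or router.

```python
IP_HEADER = 20  # IPv4 header without options (assumption for this sketch)


def fragment(datagram_len: int, mtu: int, header_len: int = IP_HEADER):
    """Split one IPv4 datagram into (offset_units, payload_len, more_fragments)."""
    payload = datagram_len - header_len
    # Every fragment except the last must carry a multiple of 8 payload
    # bytes, because the Fragment Offset field counts 8-octet units.
    max_payload = (mtu - header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    offset = 0
    while offset < payload:
        size = min(max_payload, payload - offset)
        more = offset + size < payload
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments


for frag in fragment(4500, 1500):
    print(frag)
```

Because every fragment repeats the IP header, the 4480 bytes of
payload in this example do not fit into three 1480 byte pieces: the
sketch produces three full fragments plus a fourth carrying the
remaining 40 bytes.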
Many application developers have adjusted MTU sizes to 4500
bytes or higher on FDDI attached hosts to take advantage of a
perceived throughput benefit of using larger frame sizes. In
actuality, the MAC Layer framing overhead difference between
one 4500 byte frame and three 1500 byte frames is only a 3%
difference when inter-frame spacing, wire delay, and
additional MAC headers are accounted for. Performance testing
on a variety of vendors' internetworking devices has shown up
to 30% reductions in throughput when enabling IP
Fragmentation. Is a 3% gain in performance on server-to-server
links justified by a 30% reduction in throughput when clients
attempt to access those servers? Obviously, different
applications will behave differently and the provided
percentages are for representation and comparison purposes
only.
Cabletron derives the 3% figure for the impact of a 1500 byte
MTU on FDDI-to-FDDI traffic from a line efficiency analysis of
FDDI with various frame sizes. IP fragmentation is supported in
most routers that implement the TCP/IP suite of protocols, as
well as transparent and translational bridges. The motivation
for IP fragmentation stems from the desire to utilize the
larger frame size of FDDI (4500 bytes) to achieve maximum line
efficiency; in other words, to minimize the ratio of addressing
and other overhead to real data.
Considering the fact that TCP/IP is the only widely
deployed protocol that supports fragmentation, this document
will address line efficiency issues for two frame sizes
encompassing an IP datagram. 802.2 LLC will be used for the
Logical Link Control protocol and TCP for the remainder of the
protocol stack.
The overhead associated with each layer is:

    TCP header (no options)    20 bytes
    IP header (no options)     20 bytes
    802.2 LLC header            4 bytes
    FDDI frame                 20 bytes
                               ========
    TOTAL                      64 bytes
The maximum frame size on an FDDI ring is 4500 bytes. Of
that, 4436 bytes are actually available for user data (98.5%
efficiency). The deterministic access method of FDDI requires
the use of a token and variable timers based on ring length
and number of nodes in the ring. For the sake of simplicity,
any resultant latency will not figure into these calculations.
In a "best-case scenario" consisting of 4500 byte
FDDI frames transmitted sequentially and no other node using
the token, 98.5 percent is the maximum line efficiency that
can be achieved for that FDDI ring.
A 1500 byte MTU for the data portion of the frame on FDDI
includes the LLC, IP and TCP overhead. The additional FDDI
frame overhead is 20 bytes, resulting in a 1520 byte frame.
1520 bytes minus the 64 bytes of overhead results in 1456
bytes available for user data. These figures result in a 95.7
percent line efficiency for FDDI with the MTU limited to 1500
bytes.
Therefore, not doing IP fragmentation in a switching system
(thereby limiting all MTUs to 1500 bytes if Ethernet is
involved) results in only 2.8 percent maximum degradation in
performance for that FDDI ring. These calculations also assume
that transmissions would have been at either the 4500 or 1500
byte MTU. A thorough analysis would probably reveal a wide
range of frame sizes in a production network, reflecting
different applications on the ring.
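The line efficiency arithmetic above can be reproduced with a few
lines of Python. This sketch uses only the overhead figures listed in
this document (TCP 20, IP 20, LLC 4, and FDDI 20 bytes) and, like the
analysis itself, ignores token latency and inter-frame spacing. The
names are illustrative.

```python
# Per-frame overhead in bytes, as listed in this document.
OVERHEAD = {"TCP": 20, "IP": 20, "LLC": 4, "FDDI": 20}


def line_efficiency(frame_bytes: int) -> float:
    """Percentage of an FDDI frame that is available for user data."""
    total_overhead = sum(OVERHEAD.values())
    return 100.0 * (frame_bytes - total_overhead) / frame_bytes


large = line_efficiency(4500)                      # maximum FDDI frame
small = line_efficiency(1500 + OVERHEAD["FDDI"])   # 1500 byte MTU plus FDDI framing
print(f"{large:.2f} {small:.2f} {large - small:.2f}")
```

Printed to two decimal places the results are 98.58, 95.79, and 2.79
percent, which correspond to the 98.5, 95.7, and 2.8 percent figures
quoted in the text.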
Additionally, Cabletron has performed some NFS file
transfer testing using Sun SPARC 5 workstations running the
Solaris operating system. One SPARC system contained an
Ethernet interface and a second SPARC system contained an FDDI
interface. File transfers were done from the Ethernet SPARC to
the FDDI SPARC and vice-versa. In between the two SPARCs was
a Cabletron SmartSwitch performing Ethernet/FDDI translational
bridging. The results of that testing are summarized below:
FDDI to Ethernet (40Mb NFS File Transfer)
=========================================
            Ethernet SPARC              FDDI SPARC
MTU Size    Transfer Time   CPU Load    Transfer Time   CPU Load
1500        44 sec          32.72%      44 sec          80.00%
4500        44 sec          55.53%      44 sec          81.63%
This result shows that the 1500 byte MTU size completed the
NFS file transfer in the same time as the 4500 byte MTU size.
The 4500 byte MTU size had a negative CPU impact of 1.63
percentage points on the FDDI SPARC and 22.81 percentage points
on the Ethernet SPARC, as it formatted data in 4500 byte blocks
but then segmented them into 1500 byte frames for transmission
onto the Ethernet wire.
Ethernet to FDDI (40Mb NFS File Transfer)
=========================================
            Ethernet SPARC              FDDI SPARC
MTU Size    Transfer Time   CPU Load    Transfer Time   CPU Load
1500        47 sec          76.84%      47 sec          49.67%
4500        47 sec          76.81%      47 sec          50.60%
This result shows again that the 1500 byte MTU size
resulted in an equivalent completion of the NFS file transfer
with a CPU impact of 0.07% on the FDDI and 0.03% on the
Ethernet SPARC.
A third configuration of the test was run to measure the
file transfer time and CPU impact if both SPARCs were attached
to FDDI. A FDDI ring was created amongst DAS interfaces on the
two SPARCs and a Network General FDDI analyzer.
FDDI to FDDI (40Mb NFS File Transfer)
=========================================
            FDDI SPARC                  FDDI SPARC
MTU Size    Transfer Time   CPU Load    Transfer Time   CPU Load
1500        23 sec          80.00%      23 sec          32.72%
4500        25 sec          85.10%      25 sec          31.29%
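This result confirms that the 1500 byte MTU size resulted
in a faster completion of the NFS file transfer (23 seconds
versus 25 seconds). Relative to the 1500 byte MTU size, the
4500 byte MTU size raised CPU load by 5.10 percentage points on
the sending FDDI SPARC and lowered it by 1.43 percentage points
on the receiving FDDI SPARC.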
It is therefore arguable that CPU utilization favors the
1500 byte MTU and that reducing the MTU size on the FDDI
hosts allows NFS file transfers to complete in less time. An
interesting side note is that when a Cisco Catalyst 5000
switch was used to perform the IP Fragmentation function, the
SPARC CPU utilization was, on average, 1% higher while
delivering the same file transfer completion times as the
Cabletron SmartSwitch. This could be related to Cisco's
violation of IEEE 802.3 standards through a modification of
collision back-off algorithms known as "Forward
Pressure".
3.0 The IETF and IP Fragmentation
In an effort to allow for larger MTU sizes on FDDI hosts,
while alleviating the need for IP Fragmentation to Ethernet
end stations, RFC 1191 defines a method for dynamic MTU
discovery. It is important to note, however, that the function
was defined for implementation in IP gateways or routers.
Switching products are MAC layer devices and may be incapable
of supporting RFC 1191. Cabletron has completed engineering
investigations as to the feasibility of implementing RFC 1191
in the SmartSwitch ASIC family. This would require a more
detailed packet look-up function than the ASIC can support,
thereby requiring processor intervention on every packet, which
would significantly reduce packet forwarding throughput. In
light of this information, Cabletron
currently has no plans to implement RFC 1191 in the
SmartSwitch family of ASIC-based switching products.
RFC 1191, however, does contain the following text...
"When one IP host has a large amount of data to send
to another host, the data is transmitted as a series of IP
datagrams. It is usually preferable that these datagrams be of
the largest size that does not require fragmentation anywhere
along the path from the source to the destination. (For the
case against fragmentation, see [5].)"
This RFC further supports Cabletron's position that in
order to deliver the highest performance network to the end
users, no packet fragmentation should be performed along the
way. After all, should network managers focus more on
Server-to-Server performance or Client-to-Server performance?
Cabletron NFS performance testing has shown that even
Server-to-Server file transfers complete in less time, with
little CPU impact, when using the smaller 1500 byte MTU.
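The dynamic MTU discovery defined in RFC 1191 can be pictured with a
small simulation. In that scheme the sending host sets the Don't
Fragment bit, and a router that cannot forward the datagram returns an
ICMP "fragmentation needed" message carrying the MTU of the next hop.
The Python sketch below only models that exchange. The function name
and the list-of-link-MTUs representation are assumptions made for
illustration, and no real packets are sent.

```python
from typing import Sequence


def discover_path_mtu(first_hop_mtu: int, link_mtus: Sequence[int]) -> int:
    """Simulate RFC 1191 discovery over a path described by its link MTUs."""
    estimate = first_hop_mtu
    while True:
        # The host sends a datagram of the estimated size with Don't Fragment set.
        # Find the first link along the path that is too small to carry it.
        blocker = next((m for m in link_mtus if m < estimate), None)
        if blocker is None:
            return estimate  # the datagram crossed every link unfragmented
        # The router in front of that link drops the datagram and reports
        # the link's MTU back to the host, which lowers its estimate.
        estimate = blocker


# Hypothetical path: an FDDI ring (4500) followed by an Ethernet segment (1500).
print(discover_path_mtu(4500, [4500, 1500]))
```

In this example the host settles on 1500 bytes after one round trip,
the same MTU it would use if it had been configured for 1500 bytes
from the start.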
4.0 IP Next Generation (IPv6) and IP Fragmentation
Recently, the IETF has further defined in RFC 1752 the
requirements for the next-generation IP protocol, IPv6. All
industry leading vendors such as Cabletron, Cisco, 3Com, and
Bay Networks have participated in the development of this
draft specification. RFC 1752 removes support for IP
fragmentation by routers in IPv6. RFC 1752 states that
fragmentation is the responsibility of the end stations (hosts)
and not the network gateways, routers, or other networking
hardware. RFC 1752 also states that, in an effort to increase
IP efficiency, IPv4 features such as fragmentation have been
removed from the base IPv6 header to reduce header size; the
fragmentation fields move to a separate Fragment header that is
used only by source nodes.
Please remember that if the responsibility for IP
Fragmentation is transferred to the host end systems,
additional CPU resources may be consumed, which in the end
defeats the original purpose of larger MTU sizes: increased
network performance.
The following excerpt is from RFC 1752, "Recommendation for IPng"
(Bradner & Mankin, January 1995):
12.2.4 Fragment Header
The Fragment header is used by an IPv6 source to send payloads larger than
would fit in the path MTU to their destinations. (Note: unlike IPv4,
fragmentation in IPv6 is performed only by source nodes, not by routers along a
packet's delivery path) The Fragment header is identified by a Next Header
value of 44 in the immediately preceding header, and has the following format:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Next Header  |   Reserved    |      Fragment Offset    |Res|M|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Identification                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
* Next Header - Identifies the type of header immediately following the
Fragment header. Uses the same values as the IPv4 Protocol field.
(8 bit selector)
* Reserved, Res - Initialized to zero for transmission; ignored on reception.
* Fragment Offset - The offset, in 8-octet units, of the following payload,
relative to the start of the original, unfragmented payload.
(13-bit unsigned integer)
* M flag - 1 = more fragments; 0 = last fragment.
* Identification - A value assigned to the original payload that
is different than that of any other fragmented payload sent recently
with the same IPv6 Source Address, IPv6 Destination Address, and
Fragment Next Header value. (If a Routing header is present, the IPv6
Destination Address is that of the final destination.) The
Identification value is carried in the Fragment header of all of the
original payload's fragments, and is used by the destination to
identify all fragments belonging to the same original payload.
(32 bit field)
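The field layout in the excerpt above can be made concrete with a
short encoding and decoding sketch. This Python example follows the
bit widths given in the excerpt (8 bit Next Header, 8 bit Reserved,
13 bit Fragment Offset, 2 bit Res, 1 bit M flag, 32 bit
Identification). The function names and the sample values are
hypothetical and are not drawn from any Cabletron product.

```python
import struct

# Network byte order: Next Header, Reserved, offset/Res/M word, Identification.
FRAGMENT_HEADER = struct.Struct("!BBHI")


def pack_fragment_header(next_header: int, offset_units: int,
                         more: bool, identification: int) -> bytes:
    """Build the 8 byte Fragment header described in the excerpt."""
    if not 0 <= offset_units < 2 ** 13:
        raise ValueError("Fragment Offset must fit in 13 bits")
    # Fragment Offset fills the top 13 bits, Res stays zero, M is the low bit.
    offset_field = (offset_units << 3) | (1 if more else 0)
    return FRAGMENT_HEADER.pack(next_header, 0, offset_field, identification)


def unpack_fragment_header(data: bytes) -> dict:
    """Decode the first 8 bytes of data as a Fragment header."""
    next_header, _reserved, offset_field, identification = \
        FRAGMENT_HEADER.unpack(data[:8])
    return {
        "next_header": next_header,
        "offset_units": offset_field >> 3,   # offset in 8-octet units
        "more": bool(offset_field & 1),      # M flag
        "identification": identification,
    }


# Sample values: Next Header 6 (TCP), offset 185 units (1480 bytes), more to follow.
raw = pack_fragment_header(6, 185, True, 0x1234ABCD)
print(raw.hex(), unpack_fragment_header(raw))
```

A source node would prepend a header like this to each piece of the
original payload, and the destination would use the Identification
and Fragment Offset values to reassemble the pieces.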
5.0 Summary and Conclusion
In conclusion, Cabletron currently offers support for IP
Fragmentation in our i960 RISC software-based switching
products such as the EMM-E6, ESXMIM, ESXMIM-F2, ESX-1320,
and SmartSwitch 9000 Ethernet MicroLAN Switch modules
(9E106-06, 9E132-15, 9E133-36, 9E138-12, and 9E138-36).
Although there are no current plans to implement IP
Fragmentation in the SmartSwitch 9000 INB SmartSwitches, this
function is supported in the latest additions to the
SmartSwitch family of products: the SmartSwitch 6000 for the
wiring closet and the SmartSwitch 2200 for the
Desktop/Workgroup.
While Cabletron recognizes that the lack of support for IP
Fragmentation in the SmartSwitch product line is a design
consideration for many customer networks, it is our intent to
educate the marketplace that the resulting MTU changes, which
are today an option, will be required in the future as many
organizations evolve towards the use of ATM technology within
their networking infrastructures. It is also extremely
critical to note that as the next generation IP protocol is
being defined by the IETF, the support for IP fragmentation
has been removed from networking devices and the
responsibility transferred to the end systems.