Fibre Channel Credits vs. FCoE’s “Pause”

Anyone who comes from a Fibre Channel background likely has a pretty good understanding of Buffer to Buffer (B2B) Credits. The Fibre Channel network is designed so that all capacity within the system is carefully distributed and not oversubscribed. This is one of the many ways FC keeps from dropping frames, something that should never be allowed to happen in an FC world, because SCSI, the upper-layer protocol carried at the FC4 layer, will not tolerate it.

With Fibre Channel, each switch keeps track of how many credits it has available over each of its links. While it has credits, it sends frames and decrements the count by one for each frame sent; when it runs out of credits, it stops sending. As it receives Receiver Ready (R_RDY) primitives from the far end, it replenishes its credits. At no time does it allow itself to use more capacity than it has agreed upon with the distant end.
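
The credit bookkeeping described above can be sketched as a toy model. This is only an illustration, not how any real switch implements it; the class name `CreditedLink` and its methods are made up for this example, and a "frame" here is just a string.

```python
from collections import deque


class CreditedLink:
    """Toy model of one direction of an FC link under B2B credit flow control."""

    def __init__(self, bb_credits):
        self.credits = bb_credits   # credits agreed upon with the distant end
        self.tx_queue = deque()     # frames waiting to be sent
        self.sent = []              # frames actually placed on the wire

    def enqueue(self, frame):
        self.tx_queue.append(frame)

    def transmit(self):
        """Send queued frames until the queue or the credits run out."""
        while self.tx_queue and self.credits > 0:
            self.sent.append(self.tx_queue.popleft())
            self.credits -= 1       # one credit is consumed per frame sent

    def receive_r_rdy(self):
        """The far end freed a buffer, so one credit comes back."""
        self.credits += 1


link = CreditedLink(bb_credits=2)
for n in range(3):
    link.enqueue(f"frame-{n}")

link.transmit()          # only two frames go out, then the sender stalls
link.receive_r_rdy()     # one R_RDY arrives from the far end
link.transmit()          # the third frame can now be sent
print(link.sent, link.credits)
```

The point of the model is that the sender stalls on its own when credits reach zero; the receiver never has to tell it to stop.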

IEEE 802.3 defines a PAUSE frame in order to implement a lossless fabric over Ethernet. With PAUSE, a switch does not keep track of B2B credits; it simply keeps sending data until the far end tells it to stop. So long as there are enough buffers in the output queues, this should work fine. If the switch was about to send a frame when it received a PAUSE, it waits, the frame sits in the queue, and the switch puts it on the wire as soon as it receives the go-ahead. The “go-ahead” is in fact another PAUSE frame but with a time of 0, essentially an un-PAUSE.

One benefit of PAUSE over B2B credits is that with PAUSE and Priority Flow Control (PFC), you can pause individual lanes of the link based on priority. With B2B credits you either send data or you don’t, with no regard for prioritization. PFC gives increased control in the system: we can keep sending data that can tolerate a drop in order to maximize the efficiency of the link, and at the same time pause data that cannot handle a drop.

So on the surface it can appear that FCoE and its ability to create a lossless network does just about all we need it to do. But I would caution you to be careful about trying to run FCoE over a long distance link. First, it was not designed with long distance links in mind. This is what FCIP is for. Another alternative is to run FC and use B2B credits, for example over a DWDM link. Both of these (FCIP and FC) rely on B2B credits. The allocation of agreed upon resources at the beginning of the exchange is essential for reliable delivery of data over a long distance. With such long distances, and with an FC frame carrying at most 2112 bytes of payload, it can take a considerable number of frames in flight to fill a wire. You don’t want a PAUSE frame crossing paths with your traffic when you have 100+ frames in flight.

A quick review of what Fibre Channel looks like on a wire:

  • At 1Gbps a full-size FC frame is about 4km long, at 2Gbps a frame is 2km long, and at 4Gbps a frame is 1km long.
  • A 10km cable is 20km round trip.  Round trip must be accounted for since the R_RDY reply from the distant end needs to traverse this distance before the source receives it and can continue to send.  The goal is to saturate the link with as much traffic as possible.  In order to do this, you must allocate enough B2B credits, which is essentially the allocation of buffer resources.
  • On a 10km cable @ 2Gbps we need 10 BB credits, on a 100km cable @ 1Gbps we need 50 BB credits, etc.
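
The arithmetic in the list above can be written out as a small calculation. This is a sketch of the rule of thumb only (round-trip distance divided by the length of one frame on the wire); the function name `bb_credits_needed` is made up for this example, and a real design would normally add margin for smaller frames and switch latency.

```python
import math

# Rule of thumb from the list above: a full-size FC frame is ~4 km long at 1 Gbps,
# and its length on the wire shrinks in proportion to the link speed.
FRAME_KM_AT_1GBPS = 4.0


def bb_credits_needed(cable_km, speed_gbps):
    """Estimate the B2B credits needed to keep a link of this length full."""
    frame_km = FRAME_KM_AT_1GBPS / speed_gbps   # length of one frame on the wire
    round_trip_km = 2 * cable_km                # frame out, R_RDY back
    return math.ceil(round_trip_km / frame_km)


print(bb_credits_needed(10, 2))    # 10 km cable @ 2 Gbps
print(bb_credits_needed(100, 1))   # 100 km cable @ 1 Gbps
```

Both calls reproduce the figures quoted in the list: 10 credits and 50 credits.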

This is basic link engineering with Fibre Channel. Calculate the number of B2B credits you need, assign them to the switches, and you should have no worries. You still may need to worry about latency, and of course you will always have to consider the bandwidth-delay product of any link to understand its true capacity, but your lossless needs will be handled.

If you try to do this with FCoE you may end up in a bad race condition. There may be a substantial number of frames in flight by the time the distant end generates a PAUSE. Ethernet is not a guaranteed delivery medium, so the in-flight stream is not going to be buffered by the sender. Frames could get dropped, and lots of them. This issue doesn’t really exist on a localized data center converged network, because the speeds are so fast (10Gbps, typically at least 4Gbps for FC alone) and the distances so small (meters, not kilometers) that things can be handled in relative harmony. There is no hysteresis that would lead to the type of adverse condition that may exist on a WAN. I am not saying it will not work, and no doubt some people will try it and may be doing it today. FCoE is great in the data center, but it’s not the right tool for the job on the WAN; use FCIP or FC instead.
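
To put rough numbers on that race condition, here is a back-of-the-envelope sketch. It assumes a propagation delay of about 5 microseconds per km of fiber and ignores the receiver's and sender's reaction times, so treat the results as a lower bound. The function name and the 2200-byte frame size are assumptions for illustration, not values from any standard.

```python
FIBER_DELAY_US_PER_KM = 5.0   # assumption: ~5 microseconds per km of fiber
ASSUMED_FRAME_BYTES = 2200    # assumption: rough size of a full FCoE frame


def bytes_arriving_after_pause(cable_km, speed_gbps):
    """Data the receiver must still absorb after it decides to send a PAUSE.

    The PAUSE needs one trip to reach the sender, and everything the sender
    put on the wire in the meantime needs one trip to arrive, so roughly one
    round trip's worth of data is still on its way.
    """
    round_trip_us = 2 * cable_km * FIBER_DELAY_US_PER_KM
    bits = speed_gbps * round_trip_us * 1000   # Gbps x microseconds = 1000 bits
    return bits / 8


for km in (0.1, 10, 100):
    pending = bytes_arriving_after_pause(km, 10)
    print(km, "km:", pending, "bytes,", pending / ASSUMED_FRAME_BYTES, "frames")
```

Under these assumptions, a 100 m link at 10Gbps has less than one frame still on its way when the PAUSE is sent, while a 100km link has about 1.25 MB, which is hundreds of frames that the receiver must have buffers for or drop.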
