Oct 19 2015
Last week Renato Botelho do Couto, a pfSense developer and FreeBSD ports committer, presented a talk on pfSense at Latinoware. Renato reported that the room for his talk was full, and that many people wanted to talk afterwards. It’s great to see this type of response to pfSense in the world.
During the past month, I’ve attended vBSDcon 2015 in Virginia, USA, EuroBSDcon 2015 in Stockholm, Sweden, and BSDCon Brazil 2015 in Fortaleza, Brazil. All along the way (over 27,000 miles, or 43,400 km), I’ve enjoyed having Groff, the BSD Goat, as a traveling companion, and meeting many great BSD and pfSense people in each location.
At vBSDcon, EuroBSDcon and BSDCon Brazil, either George Neville-Neil or I spoke on “Measure Twice, Code Once”. This is part of our continuing series reporting on a longitudinal study of networking performance in FreeBSD and pfSense. The most recent developments here are the big improvement in IPsec performance with AES-NI support (1270 Mbps throughput, single stream, for AES-GCM with a 128-bit key on a pair of ~3GHz E5 Xeon CPUs), and the introduction of ‘tryforward’ to FreeBSD. The IPsec changes are already in -CURRENT, and the MFC to -STABLE has been accomplished in our FreeBSD tree on github. With some luck, these will also be present in FreeBSD 10.3-RELEASE, when it occurs.
FreeBSD has had a ‘turbo’ button of sorts since 2003. Enabling this feature via “sysctl -w net.inet.ip.fastforwarding=1” on FreeBSD, or via System > Advanced > System Tunables on pfSense, improves forwarding, but at the expense of reception of packets on the box (a 4% hit compared to fastforwarding=0), and, more importantly for pfSense, disabling IPsec. The tryforward code replaces the fastforward path with a tryforward() function. Since this isn’t controlled by a sysctl, it is “always on”. Importantly, tryforward() both improves the reception of packets on the box (around a 1% hit vs. the normal (non-fastforward) kernel path), and also results in functioning IPsec.
While this doesn’t improve the speed of IPsec, it does allow us to be rid of the fake fastforwarding path and have good forwarding in the normal case while also having IPsec in the kernel. The tryforward() code should make it into pfSense version 2.3.
Also at BSDCon Brazil, Luiz Otavio Souza, a pfSense developer and FreeBSD src committer, presented on his recent work, “netmap-forward: An IPv4 router over netmap for FreeBSD”. This is basically the FreeBSD fastforward code ported to run in userspace over netmap.
When evaluating or measuring the performance of an Ethernet device (a switch, router, or firewall), the main indicator that most will consider is the raw bandwidth that the device backplane can provide. However, it is also important to make sure that the device can switch or route as many packets as required to achieve wire rate performance. This metric is known as ‘Packets per Second’ or PPS.
Obviously, the smallest packet size will lead to the largest PPS rate, IF the system can handle it.
On Ethernet, the smallest frame size is 64 bytes, and if you look at router or switch literature for very long, you’ll see reports of “64 byte packets”. Importantly, this doesn’t count some additional framing overhead on Ethernet. A true “minimum-sized” frame on Ethernet consists of a 12 byte inter-frame gap, 8 bytes of MAC preamble + SFD, 14 bytes of MAC header (6 bytes destination address, 6 bytes source address, and 2 bytes of Ethernet ‘type’, e.g. 0x0800 is IPv4, 0x0806 is ARP), 46 bytes of minimum payload, and finally a 4 byte CRC. For IPv4, the payload includes the IP header, plus UDP or TCP headers, plus a very small amount of data: an IPv4 + TCP frame without options allows a mere 6 bytes of data, and an IPv4 + UDP frame allows 18. The IPv6 header plus a UDP header (40 + 8 bytes) is already larger than the 46 byte minimum. Combined, this results in a minimum packet of 84 bytes (20 + 64). Similarly, the maximum Ethernet frame size on the wire is 1538 bytes for a 1500 byte MTU ((12 + 8) + 14 + 1500 + 4). 802.1q VLAN tagging allows four more bytes, if enabled.
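To make the byte accounting concrete, here is a minimal Python sketch of the sums above. The function and constant names are illustrative only, the header sizes assume IPv4 and TCP without options and IPv6 without extension headers, and VLAN tags are left out.

```python
# Per-frame Ethernet overhead in bytes, as itemised in the text.
INTER_FRAME_GAP = 12
PREAMBLE_SFD = 8
MAC_HEADER = 14      # 6 destination + 6 source + 2 EtherType
CRC = 4
MIN_PAYLOAD = 46
MAX_PAYLOAD = 1500   # standard MTU

# Assumed header sizes: no IP/TCP options, no IPv6 extension headers.
IPV4_HEADER = 20
IPV6_HEADER = 40
UDP_HEADER = 8
TCP_HEADER = 20


def wire_bytes(payload: int) -> int:
    """Bytes of wire time used by one untagged frame carrying `payload` bytes."""
    if not 0 <= payload <= MAX_PAYLOAD:
        raise ValueError("payload must be between 0 and 1500 bytes")
    padded = max(payload, MIN_PAYLOAD)  # short payloads are padded to 46 bytes
    return INTER_FRAME_GAP + PREAMBLE_SFD + MAC_HEADER + padded + CRC


def data_room_in_min_frame(l3_header: int, l4_header: int) -> int:
    """Data bytes left in a 46 byte payload; negative means the headers alone exceed it."""
    return MIN_PAYLOAD - l3_header - l4_header


print(wire_bytes(MIN_PAYLOAD))                           # 84
print(wire_bytes(MAX_PAYLOAD))                           # 1538
print(data_room_in_min_frame(IPV4_HEADER, TCP_HEADER))   # 6
print(data_room_in_min_frame(IPV4_HEADER, UDP_HEADER))   # 18
print(data_room_in_min_frame(IPV6_HEADER, UDP_HEADER))   # -2
```

A negative result simply means the headers do not fit in a minimum-sized frame, so the frame on the wire is larger than 84 bytes.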
The maximum frame rate of 1Gbps Ethernet is 10^9 bits/sec / (84 bytes * 8 bits/byte). This equals 1,488,095 PPS. 10Gbps Ethernet is 10X the rate, or 14,880,952 PPS. Increasing the frame size to 1500 byte (max MTU) packets substantially reduces the PPS rate required to ‘fill’ the interface. Using the equation above, and substituting 1538 byte frames for 84 byte frames, we see that it only requires 81,274 PPS to fill a 1Gbps Ethernet with maximum-sized frames. 10 Gigabit Ethernet is, again, 10X this rate, requiring 812,743 PPS to fill a 10G interface with max-sized frames.
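The same calculation can be written as a short Python sketch. The helper name `line_rate_pps` is made up for this example, and the 38 bytes of overhead are the inter-frame gap, preamble + SFD, MAC header and CRC counted above.

```python
WIRE_OVERHEAD = 12 + 8 + 14 + 4   # gap, preamble + SFD, MAC header, CRC
GIGABIT = 10**9                   # bits per second


def line_rate_pps(link_bps: int, payload_bytes: int) -> int:
    """Whole frames per second needed to fill a link with frames of this payload size."""
    wire_bits = (WIRE_OVERHEAD + payload_bytes) * 8
    return link_bps // wire_bits


for name, bps in (("1G", GIGABIT), ("10G", 10 * GIGABIT)):
    for payload in (46, 1500):
        pps = line_rate_pps(bps, payload)
        print(f"{name:>3} link, {payload:>4} byte payload: {pps:>10,} PPS")
```

Integer division is used so that the results are truncated to whole packets, matching the figures quoted in the text.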
These are high, though achievable, rates for software routers. Using a Xeon E3-1275 (4 cores @ 3.5GHz), FreeBSD -CURRENT can forward at a rate of around 1.058 Mpps. Turning on fastforwarding (or building a kernel with tryforward support) increases this rate to about 1.33 Mpps. While this is enough to ‘fill’ a 10Gbps link with full-sized frames, not all frames are full-sized, and the true test of a router is its ability to forward a mix of traffic, throttled only by the speed of its network interfaces.
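One way to see why full-sized frames are the easy case is to turn the question around and ask how large the average frame must be before a given forwarding rate fills the link. The following Python sketch does that for the two rates just quoted; the function name is illustrative and the result counts all 38 bytes of per-frame overhead.

```python
WIRE_OVERHEAD = 12 + 8 + 14 + 4   # gap, preamble + SFD, MAC header, CRC


def saturating_wire_bytes(link_bps: float, pps: float) -> float:
    """Average on-wire frame size at which `pps` packets per second fill the link."""
    return link_bps / (8 * pps)


TEN_GIG = 10e9
for label, pps in (("normal path", 1.058e6), ("fastforward/tryforward", 1.33e6)):
    wire = saturating_wire_bytes(TEN_GIG, pps)
    payload = wire - WIRE_OVERHEAD
    print(f"{label}: {wire:.0f} bytes on the wire, about {payload:.0f} bytes of payload")
```

With these figures, a 10G link is only filled at 1.33 Mpps if frames average roughly 940 bytes on the wire, and at 1.058 Mpps if they average roughly 1,180 bytes. Any traffic mix with a smaller average frame leaves the link underused.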
Using a bit more pedestrian hardware, such as the C2758 that is for sale on the pfSense store, we find that we can forward at a rate of around 270 Kpps, and with fast forwarding or tryforward, we can obtain 426 Kpps. A simple SG-2220 will support 123 Kpps until we enable fastforward or tryforward, when we can obtain 217 Kpps.
netmap-fwd, available as BSD licensed open source on github, substantially changes these results.
The SG-2220 that previously forwarded 123 Kpps or 217 Kpps with fastforward will obtain 945 Kpps with netmap-fwd.
The C2758 that could forward 270 Kpps, or 426 Kpps with fastforward on, will obtain 1.683 Mpps with netmap-fwd over a Chelsio T-520 10G Ethernet interface.
And the Xeon E3-1275 that would previously strain to obtain 1.33 Mpps with fastforward on will obtain 5.05 Mpps with netmap-fwd using an Intel X520 10G interface. This is over 1/3 of the line-rate required to forward a full 10G of minimum-sized IP packets.
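For a side-by-side view, this small Python sketch tabulates the rates quoted above and derives the speedups. It only rearranges the published numbers; the 10G minimum-frame line rate is used as a common yardstick for every device purely for comparison, whatever interface each was actually tested on.

```python
# (normal path, fastforward/tryforward, netmap-fwd) in packets per second,
# taken from the figures quoted in this post.
RESULTS = {
    "SG-2220": (123_000, 217_000, 945_000),
    "C2758": (270_000, 426_000, 1_683_000),
    "Xeon E3-1275": (1_058_000, 1_330_000, 5_050_000),
}

LINE_RATE_10G_MIN = 14_880_952  # PPS for 84 byte wire frames on 10G Ethernet


def summarize(results):
    """Return (name, speedup vs normal, speedup vs fastforward, share of 10G line rate)."""
    rows = []
    for name, (normal, fast, netmap) in results.items():
        rows.append((name, netmap / normal, netmap / fast, netmap / LINE_RATE_10G_MIN))
    return rows


for name, vs_normal, vs_fast, share in summarize(RESULTS):
    print(f"{name:<13} {vs_normal:4.1f}x normal  {vs_fast:4.1f}x fastforward  "
          f"{share:6.1%} of 10G min-frame line rate")
```

On these numbers netmap-fwd is roughly 3.8 to 4.4 times faster than the fastforward path, and roughly 4.8 to 7.7 times faster than the normal path.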
With netmap-fwd, the host stack is still available, so packets destined for the router are correctly routed to and from the host stack. This means the applications you know and love still work. Want to use ssh to manage your router? It works. Ansible? It runs over ssh. Saltstack? It should work, though we haven’t tried it yet. VLANs are also supported. Configuration is simple: you configure the interfaces via the normal mechanism on FreeBSD (ifconfig, rc.conf, etc), and start netmap-fwd, giving it a list of interfaces.
Importantly, the numbers cited are all without substantial tuning, and come from an early, still in-development version of netmap-fwd that is limited to running on a single core. All of the devices above have multiple cores, and it is likely that we can substantially increase the performance obtained thus far using multi-threaded techniques. We will also add ACLs, IPv6, BGPD / FIB integration, and better runtime statistics. Additional protection will be gained by using Capsicum to sandbox the application.
If you want to read more, Luiz’s slides are available.
Back in February, I wrote a blog post that discussed our plans for pfSense software version 2.3, which is now in alpha, and our plans for pfSense 3.0. While I promoted DPDK then, we’ve since found that netmap provides a simpler API and substantially better safety, as the device drivers remain in the kernel rather than running in userspace as they do with DPDK. Still, DPDK provides a set of libraries, such as longest-prefix match (which uses a variation of the DIR-24-8 algorithm for routing lookups), that we should find useful in our pursuit of the ultimate open source software router.
“Das ist sehr, sehr viel Arbeit die da versprochen wird.” (“That is a very, very large amount of work being promised.”), indeed. But we are making good on that promise.
With the advent of netmap-fwd, the road ahead to pfSense 3.0 can be clearly seen. As Tom Wolfe wrote, “Put your good where it will do the most!”
We’re doing that. Join us.