
How to benchmark an ExaNIC

We use sockperf for testing because it is open-source, well understood and has similar properties to real applications.

For this example we use the ExaNIC X10. The ExaNIC X4 is our first-generation NIC, so it will be a bit slower than the X10 (about 170ns slower).


  1. Configure the machine and the Linux kernel for high-performance benchmarking
  2. Download and install the latest software and firmware here: https://exablaze.com/support
  3. Update the firmware on the NIC:
    e.g.: exanic-fwupdate -d exanic0 exanic_X10_20170821.fw
  4. Configure kernel and hardware for high performance. See details here: https://exablaze.com/docs/exanic/user-guide/benchmarking/benchmarking/
  5. Check that the hardware and software are working correctly
  6. Configure the ExaNIC for bypass-only mode
    e.g.: exanic-config exanic0:0 bypass-only on
  7. Configure the ExaNIC for local loopback mode
    e.g.: exanic-config exanic0:0 local-loopback on
  8. Run the ExaNIC loopback test:
    cd /perf-test/
    make exanic
    perf-test# taskset -c 2 ./exanic_loopback exanic0 0 0  64 1000000
    min=706ns median=769ns max=5344ns first=2043ns cpu_ghz=3.492
    RESULT:
    ExaNIC local loopback application-to-wire-to-application time has a median below 800ns (769ns).
    The ExaNIC hardware and host software are therefore performing correctly.
  9. Test IP configuration
  10. Set up a second machine by repeating steps 1-3.
    Connect port 0 of the two cards together.
  11. Turn off bypass-only and local loopback modes
    e.g.: exanic-config exanic0:0 bypass-only off
    exanic-config exanic0:0 local-loopback off
  12. Set up IP addresses on both hosts
    e.g.: client# ifconfig enp1s0 10.10.0.1
    server# ifconfig enp1s0 10.10.0.2
  13. Test that the two hosts are properly connected
    client# ping 10.10.0.2
    PING 10.10.0.2 (10.10.0.2) 56(84) bytes of data.
    64 bytes from 10.10.0.2: icmp_seq=1 ttl=64 time=0.030 ms
    64 bytes from 10.10.0.2: icmp_seq=2 ttl=64 time=0.010 ms
    server# ping 10.10.0.1
    PING 10.10.0.1 (10.10.0.1) 56(84) bytes of data.
    64 bytes from 10.10.0.1: icmp_seq=1 ttl=64 time=0.019 ms
    64 bytes from 10.10.0.1: icmp_seq=2 ttl=64 time=0.010 ms
    64 bytes from 10.10.0.1: icmp_seq=3 ttl=64 time=0.010 ms
    RESULT: ExaNICs are connected and exchanging IP messages
  14. Test unaccelerated (slow) sockperf: Run sockperf on client and server:
    server#  sockperf sr -i 10.10.0.2
    client#  sockperf pp -i 10.10.0.2 -t5 -m 14
    sockperf: == version #3.1-16.gitc6a0d0e3ab53 ==
    sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)
    [ 0] IP = 10.10.0.2       PORT = 11111 # UDP
    sockperf: Warmup stage (sending a few dummy messages)...
    sockperf: Test end (interrupted by timer)
    sockperf: Test ended
    sockperf: [Total Run] RunTime=5.450 sec; SentMessages=449097; ReceivedMessages=449096
    sockperf: ========= Printing statistics for Server No: 0
    sockperf: [Valid Duration] RunTime=5.000 sec; SentMessages=412007; ReceivedMessages=412007
    sockperf: ====> avg-lat=  6.038 (std-dev=0.294)
    sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
    sockperf: Summary: Latency is 6.038 usec
    sockperf: Total 412007 observations; each percentile contains 4120.07 observations
    sockperf: ---> <MAX> observation =   77.856
    sockperf: ---> percentile 99.999 =   69.863
    sockperf: ---> percentile 99.990 =   10.248
    sockperf: ---> percentile 99.900 =    7.035
    sockperf: ---> percentile 99.000 =    6.261
    sockperf: ---> percentile 90.000 =    6.062
    sockperf: ---> percentile 75.000 =    6.040
    sockperf: ---> percentile 50.000 =    6.025
    sockperf: ---> percentile 25.000 =    6.009
    sockperf: ---> <MIN> observation =    4.473
    RESULT: ExaNICs are exchanging UDP traffic through the (slow) kernel interface
  15. Run accelerated (fast) UDP sockperf on client and server:
    server# exasock taskset -c 2 sockperf sr -i 10.10.0.2
    client# exasock taskset -c 2  sockperf pp -i 10.10.0.2 -t5 -m 14
    sockperf: == version #3.1-16.gitc6a0d0e3ab53 ==
    sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)
    [ 0] IP = 10.10.0.2       PORT = 11111 # UDP
    sockperf: Warmup stage (sending a few dummy messages)...
    sockperf: Starting test...
    sockperf: Test end (interrupted by timer)
    sockperf: Test ended
    sockperf: [Total Run] RunTime=5.450 sec; SentMessages=2882949; ReceivedMessages=2882948
    sockperf: ========= Printing statistics for Server No: 0
    sockperf: [Valid Duration] RunTime=5.000 sec; SentMessages=2645088; ReceivedMessages=2645088
    sockperf: ====> avg-lat=  0.932 (std-dev=0.034)
    sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
    sockperf: Summary: Latency is 0.932 usec
    sockperf: Total 2645088 observations; each percentile contains 26450.88 observations
    sockperf: ---> <MAX> observation =    3.266
    sockperf: ---> percentile 99.999 =    1.309
    sockperf: ---> percentile 99.990 =    1.233
    sockperf: ---> percentile 99.900 =    1.150
    sockperf: ---> percentile 99.000 =    1.060
    sockperf: ---> percentile 90.000 =    0.969
    sockperf: ---> percentile 75.000 =    0.945
    sockperf: ---> percentile 50.000 =    0.925
    sockperf: ---> percentile 25.000 =    0.912
    sockperf: ---> <MIN> observation =    0.862
    RESULT: Exasock achieves a UDP 1/2 RTT of 925ns median (862ns minimum).
  16. Run accelerated (fast) TCP sockperf on client and server:
    server# exasock taskset -c 2 sockperf sr -i 10.10.0.2 --tcp
    client#  exasock taskset -c 2  sockperf pp -i 10.10.0.2 -t5 -m 14 --tcp
    sockperf: == version #3.1-16.gitc6a0d0e3ab53 ==
    sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)
    [ 0] IP = 10.10.0.2       PORT = 11111 # TCP
    sockperf: Warmup stage (sending a few dummy messages)...
    sockperf: Starting test...
    sockperf: Test end (interrupted by timer)
    sockperf: Test ended
    sockperf: [Total Run] RunTime=5.450 sec; SentMessages=2756552; ReceivedMessages=2756551
    sockperf: ========= Printing statistics for Server No: 0
    sockperf: [Valid Duration] RunTime=5.000 sec; SentMessages=2527996; ReceivedMessages=2527996
    sockperf: ====> avg-lat=  0.977 (std-dev=0.033)
    sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
    sockperf: Summary: Latency is 0.977 usec
    sockperf: Total 2527996 observations; each percentile contains 25279.96 observations
    sockperf: ---> <MAX> observation =    4.008
    sockperf: ---> percentile 99.999 =    1.340
    sockperf: ---> percentile 99.990 =    1.263
    sockperf: ---> percentile 99.900 =    1.205
    sockperf: ---> percentile 99.000 =    1.094
    sockperf: ---> percentile 90.000 =    1.014
    sockperf: ---> percentile 75.000 =    0.991
    sockperf: ---> percentile 50.000 =    0.973
    sockperf: ---> percentile 25.000 =    0.957
    sockperf: ---> <MIN> observation =    0.905
    RESULT: Exasock achieves a TCP 1/2 RTT of 973ns median (905ns minimum).
    You can find more details here: https://exablaze.com/media/blog-exasock-acceleration
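
To compare the kernel and exasock runs without reading the numbers off by eye, the percentile lines that sockperf prints can be pulled out with a small script. The following is a minimal sketch, not part of sockperf or the ExaNIC software: the helper names sockperf_percentile and latency_ratio are hypothetical, and the script assumes the percentile lines have the same layout as the output shown in steps 14 and 15. The sample lines are the 50th percentile lines copied from those two runs.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical and are not part of
# sockperf or exasock.

# Print the latency (usec) from one sockperf percentile line.
# Reads sockperf output on stdin; $1 is the percentile label, e.g. 50.000.
sockperf_percentile() {
    awk -v p="$1" '$2 == "--->" && $3 == "percentile" && $4 == p { print $6 }'
}

# Print how many times larger latency $1 is than latency $2 (two decimals).
latency_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Median lines copied from the kernel run (step 14) and the exasock UDP run (step 15).
kernel_median=$(echo 'sockperf: ---> percentile 50.000 =    6.025' | sockperf_percentile 50.000)
exasock_median=$(echo 'sockperf: ---> percentile 50.000 =    0.925' | sockperf_percentile 50.000)

echo "kernel median:  ${kernel_median} usec"
echo "exasock median: ${exasock_median} usec"
echo "ratio: $(latency_ratio "$kernel_median" "$exasock_median")x"
```

On real hardware the sample lines would be replaced by live output, for example by piping the client command from step 15 into sockperf_percentile 50.000 (adding 2>&1 if your sockperf build logs to stderr). The sketch compares medians rather than maximums because the maximum is dominated by rare outliers, as the 77.856 usec kernel maximum above shows.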
  1. Phil Manuel
