OpenBSD: PF queues break the 4 Gbps barrier

(undeadly.org)

98 points | by defrost 3 hours ago

11 comments

  • ralferoo 2 hours ago
    In the days when even cheap consumer hardware ships with 2.5G ports, this number seems weirdly low. Does this mean that basically nobody is currently using OpenBSD in the datacentre or anywhere that might be expecting to handle 10G or higher per port, or is it just filtering that's an issue?

    I'm not surprised that the issue exists, as even 10 years ago these speeds were uncommon outside of the datacentre; I'm just surprised that nobody has felt a pressing enough need to fix it at some point over the past few years.

    • Someone 1 hour ago
      The article is about allowing bandwidth restrictions in bytes/second that are larger than 2³²-1, not about how fast pf can filter packets.

      I guess few people with faster ports felt the need to limit bandwidth for a service to something that’s that large.

      FTA:

      “OpenBSD's PF packet filter has long supported HFSC traffic shaping with the queue rules in pf.conf(5). However, an internal 32-bit limitation in the HFSC service curve structure (struct hfsc_sc) meant that bandwidth values were silently capped at approximately 4.29 Gbps, the maximum value of a u_int.

      With 10G, 25G, and 100G network interfaces now commonplace, OpenBSD devs making huge progress unlocking the kernel for SMP, and adding drivers for cards supporting some of these speeds, this limitation started to get in the way. Configuring bandwidth 10G on a queue would silently wrap around, producing incorrect and unpredictable scheduling behaviour.

      A new patch widens the bandwidth fields in the kernel's HFSC scheduler from 32-bit to 64-bit integers, removing this bottleneck entirely.”
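      A minimal sketch of the silent truncation the article describes. This is illustrative only; the variable names and layout are made up, not the actual struct hfsc_sc definition:

      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <stdio.h>

      int main(void) {
          /* Illustrative: a 10G rate assigned into a 32-bit field,
           * as with the old 32-bit bandwidth members in the kernel. */
          uint64_t requested = 10000000000ULL;    /* 10 Gbps in bits/s */
          uint32_t stored = (uint32_t)requested;  /* silent wraparound */

          printf("requested: %llu bps\n", (unsigned long long)requested);
          printf("stored:    %u bps (~%.2f Gbps)\n", stored, stored / 1e9);

          /* 10e9 mod 2^32 = 1410065408, i.e. ~1.41 Gbps instead of 10 Gbps */
          assert(stored == 1410065408U);
          return 0;
      }
      ```

      So asking for 10G would effectively configure ~1.41 Gbps, which is the kind of incorrect scheduling behaviour the article warns about.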

      • nine_k 46 minutes ago
        > silently wrap around, producing incorrect and unpredictable

        Now I'm more scared to use OpenBSD than I was a minute before.

        I strongly prefer software that fails loudly and explicitly.

    • traceroute66 1 hour ago
      > Does this mean that basically nobody is currently using OpenBSD in the datacentre or anywhere

      Half the problem is lack of proper drivers. I love OpenBSD but all the fibre stuff is just a bit half-baked.

      For a long time OpenBSD didn't even have DOM (light-level monitoring etc.) exposed in its 1g fibre drivers. Stuff like that automatically kills off OpenBSD as a choice for datacentres where DOM stats are a non-negotiable hard requirement as they are so critical to troubleshooting.

      OpenBSD finally introduced DOM stats for SFP somewhere around 2020–2021, but it doesn't always work; it depends on whether you have the right magic combination of SFP and card manufacturer. Whilst on FreeBSD it Just Works (TM).

      And then overall, for higher-speed optics, FreeBSD simply remains lightyears ahead (forgive the pun!). For example, Deciso make nice little router boxes with 10G SFP+ on them; FreeBSD has the drivers out-of-the-box, OpenBSD doesn't. And that's only an SFP+ example, it's basically tumbleweed-rolling-through-a-desert territory if you start venturing up to QSFP etc. ...

      • CursedSilicon 27 minutes ago
        How much work is it to port drivers between Free and Open BSD?
    • ffk 28 minutes ago
      A lot of the time once you get into multi-gig+ territory the answer isn't "make the kernel faster," it's "stop doing it in the kernel."

      You end up pushing the hot path out to userland where you can actually scale across cores (DPDK/netmap/XDP style approaches), batch packets, and then DMA straight to and from the NIC. The kernel becomes more of a control plane than the data plane.

      PF/ALTQ is very much in the traditional in-kernel, per-packet model, so it hits those limits sooner.

    • IcePic 2 hours ago
      One thing could also be that by the time you have 10GE uplinks, shaping is not as important.

      When we had 512kbit links, prioritizing VOIP would be a thing, and for asymmetric links like 128/512kbit it was prudent to prioritize small packets (ssh) and tcp ACKs on the outgoing link or the downloads would suffer, but when you have 5-10-25GE, not being able to stick an ACK packet in the queue is perhaps not the main issue.
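      For context, the slow-link shaping described above looks roughly like this in modern pf.conf queue syntax (interface name and numbers are made up for illustration):

      ```
      # Hypothetical pf.conf sketch for a 512kbit uplink on em0
      queue outq on em0 bandwidth 512K max 512K
      queue bulk parent outq bandwidth 384K default
      queue pri  parent outq bandwidth 128K

      # When two queues are given, PF assigns empty TCP ACKs and
      # lowdelay-ToS packets to the second queue
      match out on em0 proto tcp set queue (bulk, pri)
      ```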

    • citrin_ru 2 hours ago
      AFAIK performance is not a priority for the OpenBSD project - security is (along with related qualities, like code that is easy to understand and maintain). FreeBSD (at least when I followed it several years ago) had better performance both for ipfw and for its own PF fork (not fully compatible with the OpenBSD one).
      • traceroute66 1 hour ago
        > AFAIK performance is not a priority for OpenBSD project - security is

        TBF that was the case historically, but they have absolutely been putting effort into performance in their more recent releases.

        Lots of stuff that used to be simply horrific on OpenBSD, such as multi-peer BGP full-table refreshes is SIGNIFICANTLY better in the last couple of years.

        Clearly still not as good as FreeBSD, but compared to what it was...

    • toast0 2 hours ago
      > Does this mean that basically nobody is currently using OpenBSD in the datacentre or anywhere that might be expecting to handle 10G or higher per port, or is it just filtering that's an issue?

      This looks like it only affects bandwidth limiting. I suspect it's pretty niche to use OpenBSD as a traffic shaper at 10G+, and if you did, I'd imagine most of the queue limits would tend toward significantly less than 4G.

  • haunter 12 minutes ago
    My local fiber finally offers 4 Gbps connection but I’m not even sure what to use it for lol.
  • gigatexal 40 minutes ago
    It’s still single-threaded. PF in FreeBSD is multithreaded. For home WANs I’d be using OpenBSD. For anything else, FreeBSD.
  • rayiner 2 hours ago
    Can pf actually shape at speeds above 4 gbps?
  • bell-cot 2 hours ago
    "Values up to 999G are supported, more than enough for interfaces today and the future." - Article

    "When we set the upper limit of PC-DOS at 640K, we thought nobody would ever need that much memory." - Bill Gates

    • throw0101d 2 hours ago
      > "Values up to 999G are supported, more than enough for interfaces today and the future." - Article

      Especially given that IEEE 802.3dj is working on 1.6T / 1600G, and is expected to publish the final spec in Summer/Autumn 2026:

      * https://en.wikipedia.org/wiki/Terabit_Ethernet

      Currently these interfaces are only on switches, but there are already NICs at 800G (P1800GO, Thor Ultra, ConnectX-8/9), so if you LACP/LAGG two together your bond is at 1600G.

    • bitfilped 2 hours ago
      Yes, we're already running 800G networks, so this phrasing seems really silly to me.
    • WhyNotHugo 2 hours ago
      Honestly, I'm really curious about this number. 2^10 is 1024, so why 999G specifically?
      • abound 2 hours ago
        Looking at the patch itself (linked in the article), the description has this:

        > We now support configuring bandwidth up to ~1 Tbps (overflow in m2sm at m > 2^40).

        So I think that's it, 2^40 is ~1.099 trillion
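        A quick sanity check on that figure, just evaluating 2^40 directly (the m2sm overflow point quoted from the patch):

        ```c
        #include <assert.h>
        #include <stdint.h>
        #include <stdio.h>

        int main(void) {
            /* The patch notes m2sm overflows for rates above 2^40 bits/s. */
            uint64_t overflow_point = 1ULL << 40;  /* 1099511627776 */
            uint64_t cap = 999ULL * 1000000000ULL; /* 999G in bits/s */

            printf("2^40 bps = %.4f Tbps\n", overflow_point / 1e12);
            assert(overflow_point == 1099511627776ULL); /* ~1.0995 Tbps */
            assert(cap < overflow_point);               /* 999G fits below */
            return 0;
        }
        ```

        999G is simply the largest round three-digit value that sits below that overflow point.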

      • elevation 2 hours ago
        Looks like an arbitrary validation cap. By the time we're maxing out the 64-bit underlying representation we probably won't be using Ethernet any more.
        • palmotea 2 hours ago
          > By the time we're maxing out the 64-bit underlying representation we probably won't be using Ethernet any more.

          We will be using Ethernet until the heat death of the universe, if we survive that long.

        • bell-cot 2 hours ago
          https://en.wikipedia.org/wiki/Ethernet#History (& following sections)

          Calling something "Ethernet" amounts to a promise that:

          - From far enough up the OSI sandwich*, you can pretend that it's a magically-faster version of old-fashioned Ethernet

          - It sticks to broadly accepted standards, so you won't get bitten by cutting-edge or proprietary surprises

          *https://en.wikipedia.org/wiki/OSI_model
