4 comments

  • SkiFire13 11 hours ago
    > The implementation saturates CPU before reaching I/O limits.

    Is this supposed to be a pro?

    • b_llc 11 hours ago
      Good question! Yes, CPU saturation is the desired behavior here.

      The multipart streaming workload is inherently expensive. The cost of generating boundaries and constructing headers scales with request count and payload size. The architecture demonstrates efficient resource utilization: bounded memory usage (<225MB) while maximizing CPU throughput.

      CPU saturation with bounded memory means performance scales predictably with processing power. On multicore systems, you can leverage multiple processes to effectively utilize all cores. Alternatively, you can distribute the workload horizontally using droplets as cost-efficient instances.
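
      As a rough sketch of the multi-process route (names here are hypothetical, not part of Axon; a real deployment would share the listening port via SO_REUSEPORT or a process manager):

        import multiprocessing
        import os

        def serve() -> None:
            # Hypothetical worker entry point: each process would bind
            # the shared port and run one single-threaded server loop.
            ...

        if __name__ == "__main__":
            # One worker per core; memory stays bounded per process.
            procs = [multiprocessing.Process(target=serve)
                     for _ in range(os.cpu_count() or 1)]
            for p in procs:
                p.start()
            for p in procs:
                p.join()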

      • syntonym2 8 hours ago
        > The multipart streaming workload is inherently expensive.

        A streaming workload is not inherently expensive. The main work is to get the bytes of the files to the network card as quickly as possible, and almost no computation needs to be performed.

        > The cost of generating boundaries and constructing headers scales with request count and payload size.

        The only computation necessary to generate boundaries is to ensure that the chosen boundary does not occur in the content, and it seems that the code does not actually check this, but generates a random UUID4. Boundaries and headers are per-file and could be cached, so they don't scale with the number of requests or payload size.
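
        For what it's worth, that check is straightforward over in-memory content (a streaming server can't easily do it ahead of time, which is presumably why a random UUID is used instead); a minimal sketch:

          import uuid

          def choose_boundary(parts: list[bytes]) -> str:
              # Redraw until the boundary occurs in none of the parts --
              # the check the multipart format actually asks for.
              while True:
                  candidate = uuid.uuid4().hex
                  if not any(candidate.encode() in p for p in parts):
                      return candidate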

        • b_llc 6 hours ago
          You're right about the boundary caching opportunities. The computational cost I'm referring to comes from constructing the per-request multipart headers and from file metadata operations, rather than from boundary generation alone.

          The "inherently expensive" claim was overstated - it's expensive relative to static file serving, and unavoidable, but you're correct that there are optimization opportunities in the current implementation. I've identified three opportunities to improve the design: boundary generation, content type assignment, and header construction.

          One clarification: the dynamic bundling via query parameters limits traditional caching strategies. Since each request can specify different file combinations, the multipart response structure must be generated per-request rather than cached.

          Axon is also meant to be a core framework. How you implement caching depends on your specific use case; it becomes business logic rather than request-processing logic. Its minimal tooling is intended as a feature, though, as you have pointed out, it can also be limiting.

      • nomel 9 hours ago
        By I/O limits, do you mean memory size limits? If so, this wording will lead to much confusion, since an addressing limit (which is what a memory limit is) is a somewhat unusual reading of "I/O limits", which, in a streaming context, most would perceive as a bandwidth limit (either memory or network).
        • b_llc 9 hours ago
          By I/O limits, I meant network bandwidth and disk throughput limits, not memory capacity.

          Thanks for pointing out the ambiguity.

          • nomel 8 hours ago
            Now I'm more confused. An infinitely efficient system would saturate the network. An infinitely inefficient system would saturate the CPU. "The implementation saturates CPU before reaching I/O limits" is true for an infinitely inefficient system but false for an infinitely efficient one. That makes it an undesirable property.

            The metric that actually matters is efficiency at the task, given a hardware constraint. In this context, that's entirely network throughput (streaming ability per unit of hardware; with hardware held constant, you can just compare streaming ability directly).

            For a litmus test of the concept: if you rewrote this in C or Rust, would the CPU become the bottleneck earlier or later? Would the network throughput be closer to or further from its bottleneck?

            • b_llc 8 hours ago
              You're right - this represents computational duress, not optimal efficiency. A single CPU struggles with the 50-concurrent-user scenario; it was chosen to demonstrate worst-case behavior rather than peak performance. I intended to stress-test the framework. I did not mean to suggest that CPU saturation is ideal, but rather to highlight that performance remained predictable even at the limits.

              Lower-level languages would certainly offer higher performance. I was hoping to showcase how Python can perform when the architecture is deliberately constrained. The goal was to show that careful design choices (bounded memory, generator-based streaming) can keep behavior predictable even when computational resources are exhausted.
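
              The pattern, roughly (chunk size and names are illustrative, not the exact Axon code):

                CHUNK = 64 * 1024  # fixed read size keeps memory bounded

                def stream_multipart(paths, boundary):
                    # Generator-based streaming: at most one chunk is
                    # resident per request, regardless of file sizes.
                    for path in paths:
                        yield f"--{boundary}\r\n".encode()
                        yield b"Content-Type: application/octet-stream\r\n\r\n"
                        with open(path, "rb") as f:
                            while chunk := f.read(CHUNK):
                                yield chunk
                        yield b"\r\n"
                    yield f"--{boundary}--\r\n".encode()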

  • nomel 4 hours ago
    This appears to be an AI codebase, with AI-written replies in the comments here.
  • sc68cal 8 hours ago
    > The implementation saturates CPU before reaching I/O limits

    So, I did look over the code and the thing that I walked away asking was "isn't this sort of the reason why sendfile(2) was developed?"

    • b_llc 8 hours ago
      Axon generates dynamic multipart responses with boundaries and headers to bundle files specified via query parameters. sendfile handles "serve this specific file" but does not handle "bundle these N files into a multipart response."

      For static file serving, sendfile would be the better choice.

      • sc68cal 8 hours ago
        Reading the manpage for sendfile leads me to believe that it can be used for that purpose.
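
        The usual pattern, as I read it, interleaves small send()s for the dynamic parts with sendfile(2) for the file bodies; a sketch assuming a plain blocking socket:

          import os
          import socket

          def send_part(sock: socket.socket, path: str, boundary: str) -> None:
              # Boundary and headers are small dynamic bytes: send().
              sock.sendall(f"--{boundary}\r\n\r\n".encode())
              with open(path, "rb") as f:
                  # The file body is copied kernel-side, no userspace buffer.
                  size = os.fstat(f.fileno()).st_size
                  sent = 0
                  while sent < size:
                      sent += os.sendfile(sock.fileno(), f.fileno(), sent, size - sent)
              sock.sendall(b"\r\n")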
        • b_llc 7 hours ago
          I might be wrong then! I'll make a note to take a look. Thanks for pointing this out.