SDS: Simple Dynamic Strings library for C

(github.com)

103 points | by klaussilveira 2 days ago

5 comments

aidenn0 6 minutes ago
Was:
```
  s = sdscat(s,"Some more data");
```
Chosen over
```
  sdscat(&s, "Some more data");
```
For performance reasons, or something else?
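One practical difference between the two shapes, sketched with a toy length-prefixed string (the `ds_*` names are hypothetical and this is not the SDS implementation): appending may call realloc() and move the buffer, so either the new pointer is returned and the caller reassigns it, or the function takes the address of the caller's pointer and updates it there.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy length-prefixed string for illustration only; NOT the real SDS code. */
typedef char *ds;
typedef struct { size_t len; } ds_hdr;

/* The header lives immediately before the character data. */
ds_hdr *ds_header(ds s) { return (ds_hdr *)(s - sizeof(ds_hdr)); }

size_t ds_len(ds s) { return ds_header(s)->len; }

void ds_free(ds s) { if (s) free(ds_header(s)); }

ds ds_new(const char *init) {
    size_t len = strlen(init);
    ds_hdr *h = malloc(sizeof(ds_hdr) + len + 1);
    if (!h) return NULL;
    h->len = len;
    char *s = (char *)(h + 1);
    memcpy(s, init, len + 1);          /* copy including the terminator */
    return s;
}

/* Style 1: return the (possibly moved) pointer; the caller must reassign. */
ds ds_cat(ds s, const char *t) {
    size_t old = ds_len(s), add = strlen(t);
    ds_hdr *h = realloc(ds_header(s), sizeof(ds_hdr) + old + add + 1);
    if (!h) return NULL;               /* on failure the old string is still valid */
    h->len = old + add;
    char *out = (char *)(h + 1);
    memcpy(out + old, t, add + 1);
    return out;
}

/* Style 2: update the caller's pointer in place; returns 0 on success. */
int ds_cat_inplace(ds *sp, const char *t) {
    ds r = ds_cat(*sp, t);
    if (!r) return -1;
    *sp = r;
    return 0;
}
```

With style 1, forgetting the reassignment leaves the caller holding a pointer that may have been freed by realloc(); with style 2, that mistake is not possible, but every call site needs an lvalue to take the address of.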
xenotux 5 hours ago
I'm surprised this is aliased to char*, not const char*. The benefit of the aliasing is convenience, but the main risk is absent-mindedly passing it to a libc function that modifies the string without updating the SDS metadata. Const would result in a compiler warning while letting the intended use cases (e.g., the printf example) work fine.
- mid-kid 2 hours ago
  The only thing the SDS metadata holds is the string's length. Just like how you'd have to realloc() a regular string before using strcat(), you have to sdsgrowzero() an sds string before using strcat(). Basically, standard libc functions that tamper with the string have the same constraints as malloc()ed strings in terms of safety, only you might want to call sdsupdatelen() after truncating a string.
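  A minimal sketch of that last point, using a toy length-prefixed string (the `ls_*` names are hypothetical stand-ins, not the SDS API): after a plain write truncates the string, the stored length is stale until it is recomputed from the terminator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy header holding only the length; illustration only, NOT the SDS code. */
typedef struct { size_t len; } ls_hdr;

/* The header sits immediately before the character data. */
ls_hdr *ls_header(char *s) { return (ls_hdr *)(s - sizeof(ls_hdr)); }

char *ls_new(const char *init) {
    size_t len = strlen(init);
    ls_hdr *h = malloc(sizeof(ls_hdr) + len + 1);
    if (!h) return NULL;
    h->len = len;
    char *s = (char *)(h + 1);
    memcpy(s, init, len + 1);          /* copy including the terminator */
    return s;
}

/* O(1) length lookup from the header. */
size_t ls_len(char *s) { return ls_header(s)->len; }

/* Resynchronise the stored length with the position of the terminator. */
void ls_updatelen(char *s) { ls_header(s)->len = strlen(s); }

void ls_free(char *s) { if (s) free(ls_header(s)); }
```

  Writing `s[3] = '\0'` into a six-character string leaves `strlen(s)` at 3 while `ls_len(s)` still reports 6; calling `ls_updatelen(s)` brings the two back in agreement.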
antirez 5 hours ago
Hi! The Redis tree contains more advanced versions of this library. Most of the development continued there, eventually.
- jacquesm 5 hours ago
  It might be worth extracting it back out, it seems pretty useful.
  - antirez 4 hours ago
    Indeed, but in some ways the Redis version is a bit too Redis-ish: memory-saving concerns are taken to the extreme instead of striking a more balanced trade-off with simplicity. In the C course on my YouTube channel, I'm showing something similar to SDS in the latest lessons, and I may use SDS again in a later lesson to show how to integrate back the useful features that diverged. Maybe an SDS3 could be a middle ground combining the Redis version, fixes for some API errors that should be corrected (but not in Redis: not worth it), and other improvements.
- underdeserver 5 hours ago
  Hi! Are there no performance penalties that come from alignment issues? Or are your prefix structs aligned to cache line sizes?
- klaussilveira 4 hours ago
  Thank you for creating sds, btw. Very useful to have it on the toolbelt.
  Oh, and redis. That too. :)
dang 5 hours ago
Related:
Simple Dynamic Strings library for C, compatible with null-terminated strings - https://news.ycombinator.com/item?id=21692400 - Dec 2019 (83 comments)
Simple Dynamic Strings library for C - https://news.ycombinator.com/item?id=7190664 - Feb 2014 (127 comments)
zoddie 6 hours ago
Why not just use C++ strings and string_views? So weird to see this masochistic obsession some people have with doing everything in plain C.
It's 2026, there are better, more memory safe, more efficient solutions out there.
- MontyCarloHall 6 hours ago
  To actually answer your question (beyond the snark/appeal to authority replies you’ve already gotten), there are a couple good reasons:
  — You're working in embedded development (but somehow need a full-fledged dynamic string library).
  — While it's true that C++ is (almost) a strict superset of C, and “you don’t pay for what you don’t use” is a good rule of thumb, it can be very hard to restrict a team of developers to eschew all that complexity you dearly pay for and treat C++ as “C with classes and the STL.” Without very strict coding standards (and a means of enforcing them), letting a team of developers use C++ is often opening a Pandora's Box of complex, obscure language features. Restricting a project to plain old C heads that off at the pass.
  - aidenn0 5 minutes ago
    As soon as you move to "C with classes and the STL" you've now also bought into exceptions, as the STL is not even remotely ergonomic with exceptions disabled.
  - LegionMammal978 6 hours ago
    > You're working in embedded development (but somehow need a full-fledged dynamic string library).
    The situation isn't all that implausible: e.g., many ESP32-based devices want to work with strings to interface with HTTP servers, and they do have C++ support, but the size limit is small enough that you can easily bump your head into it if you aren't careful.
    - mikepurvis 6 hours ago
      Or anything processing JSON— it's nice to be able to get string views directly into the original payload without having to copy them into fixed size buffers elsewhere.
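      A rough sketch of that idea in plain C, assuming a pointer-plus-length view type (`str_view` and `json_string_field` are hypothetical names, and the scan is deliberately naive: no escape handling, no nesting awareness, not a real JSON parser):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Non-owning view: points into someone else's buffer, no terminator needed. */
typedef struct { const char *ptr; size_t len; } str_view;

/* Return a view of the string value that follows "key", or {NULL, 0}.
   Naive scan for illustration; it can be fooled by values equal to the key. */
str_view json_string_field(const char *json, const char *key) {
    str_view none = { NULL, 0 };
    size_t klen = strlen(key);
    const char *p = json;
    while ((p = strchr(p, '"')) != NULL) {
        p++;                                   /* step past the quote */
        if (strncmp(p, key, klen) == 0 && p[klen] == '"') {
            const char *q = p + klen + 1;      /* just after the key's closing quote */
            while (*q == ' ' || *q == ':') q++;
            if (*q != '"') return none;        /* value is not a string */
            q++;
            const char *end = strchr(q, '"');
            if (!end) return none;
            str_view v = { q, (size_t)(end - q) };
            return v;                          /* no copy: points into json */
        }
    }
    return none;
}
```

      The view can be printed with `printf("%.*s", (int)v.len, v.ptr)`; it stays valid only as long as the original payload buffer does.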
  - duped 5 hours ago
    > and the STL
    Even this has a lot of "payment" for what you don't use. Even some C++ libraries forbid it just because of the size of debug symbols.
- tbrockman 6 hours ago
  > SDS was a C string I developed in the past for my everyday C programming needs, later it was moved into Redis where it is used extensively and where it was modified in order to be suitable for high performance operations. Now it was extracted from Redis and forked as a stand alone project.
- Zambyte 6 hours ago
  Suggesting C++ as a solution in the face of "masochist obsession" is... an interesting choice :-D
  - rossant 5 hours ago
    Totally. As a C developer, I really suffer whenever I need to touch anything C++.
- simonebrunozzi 6 hours ago
  This is the guy that created Redis. I would look at his repos in a different way.
- derefr 6 hours ago
  If your code is plain C, then anyone can extend it with, or embed it inside, code of literally any other language; and in so doing, they will have full access/exposure to everything in your codebase — all the same stuff that they would if they were writing their host/extension code in C.
  This is not true of C++ (or most other languages):
  • C++ has a runtime (however minimal); and so, by including any C++ code in a codebase, you're making it much more difficult to link/embed the resulting code — you now have to also dynamically link the C++ runtime, and ensure that your host code spins it up "early", before any of the linked C++ code gets to run. (This may even be impossible in some host languages!)
  • Also, even if there was no associated runtime to deal with, C++ isn't wholly C-FFI-clean. All the stuff that people like about C++ — all the reasons you'd want to use C++ — result in codebases that aren't cleanly C-FFI exposable, due to name mangling, functions taking parameters with non-C-exportable types, methods + closures not being C-FFI thunkable [and functions returning those], etc.
  • And even if you bite that bullet, and write your library in C++ but carefully wrap its API to give it C-FFI-clean linkage (usually via a hybrid C / C++ project), this still introduces a layer of FFI runtime overhead. When another non-C language consumes your code, it's then getting double FFI overhead — a call from its code to yours has to convert from its abstractions, to C's abstractions, to C++'s abstractions, and back. (This is why you don't tend to see e.g. non-C++ projects embedding LLVM, or LLVM being extended with non-C++ passes, despite LLVM being designed in this "C wrapper around a C++ core" style.)
  C is one of the only languages with a zero-impedance-mismatch, zero-overhead default or forced binding of external symbols to the C FFI (i.e. the C set of platform ABIs + C symbol naming standard.)
  The others that do this are: C3 (https://c3-lang.org/); Zig, unless you do weird things on purpose, and... that's really it. Everything else has the same two problems as C++ outlined above.
  Even Rust, even Odin, etc. only provide C-FFI linkage as an opt-in feature; and they do nothing to incentivize use of it; and so, of course, due to their useful non-C-FFI-clean features, developers are disincentivized from ever enabling it before they "need" it. So in practice, most libraries in those languages are not consumable from C [or other C-FFI-compatible languages] — and most software in those languages are not extendable in C [or another C-FFI-compatible language] — without extra effort on the upstream's part to add explicit support for doing that. And most upstreams don't bother.
  Writing software in C itself, is essentially a way for a project to "tie itself to the mast" and commit to its ABI always being C-FFI clean; such that it can be consumed not only from C, but also from any other language a project might use that supports importing C-FFI libraries. (Which is most languages.)
  - anitil 2 hours ago
    > C++ has a runtime (however minimal)
    I'm not familiar with this, are you able to explain it? Do you mean something analogous to _start?
  - CyberDildonics 4 hours ago
    > C++ has a runtime (however minimal);
    No it doesn't.
    > Also, even if there was no associated runtime to deal with, C++ isn't wholly C-FFI-clean
    Yes it is, you just extern "C" whatever you want.
    > All the stuff that people like about C++ — all the reasons you'd want to use C++ — result in codebases that aren't cleanly C-FFI exposable
    Not true at all, the biggest two things, destructors and move semantics you still have everywhere except for the boundaries with C.
    > And even if you bite that bullet, and write your library in C++ but carefully wrap its API to give it C-FFI-clean linkage (usually via a hybrid C / C++ project), this still introduces a layer of FFI runtime overhead
    There is no overhead here, it is not different from C.
    I don't know where all this comes from, but I doubt it comes from heavy experience with modern C++.
    - cyber1 1 hour ago
      No, it does. "The only two features in the language that do not follow the zero-overhead principle are runtime type identification and exceptions, and are why most compilers include a switch to turn them off." - https://en.cppreference.com/w/cpp/language/Zero-overhead_pri...
      - CyberDildonics 1 hour ago
        So saying 'it has a runtime' doesn't really make sense, it has a runtime if you want for two features that aren't necessary.
- bryanlarsen 5 hours ago
  Redis started in 2009, and this library was started there. string_view didn't appear until C++17.
- uecker 5 hours ago
  I switched to C because I could not stand the pain of using C++ anymore. I find C refreshingly simple.
  (Also, as a comment to other responses: C++ is not a superset of C, it is a fork from 95 with divergent language evolution since then).
  - jacquesm 5 hours ago
    It is not a fork from 95. Cfront, the original C++ compiler, was a new front-end for the C compiler from 1983 that output C, which was then compiled as usual.
    https://en.wikipedia.org/wiki/Cfront
    - uecker 4 hours ago
      Maybe, but 1995 is when it diverged: https://isocpp.org/wiki/faq/c Before that it was an extension; since then it is a fork.
- MintPaw 6 hours ago
  There are real downsides to even #including C++ headers. And there are certainly downsides to introducing a templated string type. It's not hard to imagine why people would want another solution.
- gkbrk 5 hours ago
  How am I supposed to use C++ strings and string_views in C?
- hiccuphippo 5 hours ago
  Someone had to create C++ strings so C++ developers could use them. What's wrong with someone doing the same for C so C developers can use it?
- spookie 6 hours ago
  I guess those stuck on MSVC might have this perception, but newer C standards have added plenty of niceties. Unsure if the claim that Cpp is safer is correct.
- ethin 6 hours ago
  But if you're in C...
- jimbob45 6 hours ago
  Seems like everyone wants to believe they’re as skilled and hardcore as the kernel devs. In reality, I agree - C++ is basically a superset of C and the whole point of “you don’t pay for what you don’t use” is to be able to avoid ridiculous situations like these.
  - anitil 2 hours ago
    > "you don’t pay for what you don’t use"
    In my experience (mostly embedded development) including C++ in a C project adds a lot of build complexity and build time, whereas C99 or C89 is trivial to install in pretty much all situations