I think you should be able to get rid of most of the undocumented API usage with the newer CreateFileMapping2/MapViewOfFile3 APIs. Though that does require a higher minimum OS version, and the crucial NtExtendSection function still has no documented equivalent as far as I can tell, so it's kind of a wash.
Also, you can link against ntdll.lib directly. Manually calling GetProcAddress for a few functions isn't a tragedy by any means, but in this case, why bother?
A memory-mapped file can be used to store complex object-oriented structures if you make use of 'relative' pointers, where you add the address of the pointer itself to the stored offset to get to the object being pointed to. I once used this method to persist a complex data structure without having to write serialization code.
A long time ago, I implemented some C++ classes that hide most of the additional work and also take care of allocating new objects inside the memory-mapped file. See: https://www.iwriteiam.nl/D0205.html#13MMF (Note that this implementation actually uses a slightly different approach, where pointers are relative to the first position of the file. This has the limitation that you can only open one such store as a memory-mapped file at a time. It was only later that I realized it was possible to do without the base offset. I never got around to rewriting all the code.)
Boost.Container [1] has reimplementations of the STL containers that use offsets to be compatible with memory-mapped files. Boost.Interprocess [2] has some other useful types, such as smart pointers that are compatible with memory-mapped files, along with platform-independent APIs for handling them.
Most STL containers will work just fine with smart pointers. For example, std::vector takes a custom allocator, and you can make the allocator's pointer type a custom pointer class: a wrapper over an offset that is the same size as a native word.
Then operator-> is just (this + offset).
I've used this approach successfully in many projects. If you're worried about differing STL implementations (probably a reasonable worry here), then use any of the Boost containers.
Back in 2022, libtorrent implemented memory-mapped I/O as part of v2 [1]. Unfortunately, it didn't go so well: memory usage went through the roof, leading to performance degradation and crashes [2]. The issue is still open in the project to this day, and many programs have stuck with the v1.2 library instead.
It looks like they are headed toward a multi-threaded pread implementation now [3], and in the meantime someone has created a patch that tweaks the current pread-based mmap fallback to perform better [4].
> A legitimate problem here is the case where you need to read from or write to many locations at once. With the memory-mapped scheme as described, you can only issue as many concurrent page faults as you have threads in the program.
> Unfortunately, I don’t have great answers here, and view this as a legitimate use case that warrants a dedicated code path using other mechanisms at your disposal.
Right. I think it's similar on Linux. mmap makes sense until the data size is too large and TLB misses start to add up.
Right. Not TLB misses so much as general memory-subsystem stress. Typically, changes to the VM layout can't be done concurrently.
Also, if the VM area is large, pages might thrash, requiring the kernel to do loads of work swapping things in and out and updating the page-table structures.
Great article nonetheless!
[1] https://www.boost.org/doc/libs/1_86_0/doc/html/container.htm...
[2] https://www.boost.org/doc/libs/1_86_0/doc/html/interprocess....
[1] - https://www.libtorrent.org/upgrade_to_2.0-ref.html
[2] - https://github.com/arvidn/libtorrent/issues/6667
[3] - https://github.com/arvidn/libtorrent/pull/7013
[4] - https://github.com/qbittorrent/qBittorrent/pull/21300
The article is describing a separate problem, where you can't issue concurrent reads, although that doesn't seem true if you make use of madvise.