I don't see why you would have to write your own - there are plenty of options in the crate ecosystem, but perhaps you found them insufficient?
As a video game developer, I've found the case for custom general-purpose allocators pretty weak in practice. It's exceedingly rare that you really want complicated nonlinear data structures, such as hash maps, to use a bump-allocator. One rehash and your fixed size arena blows up completely.
95% of use cases are covered by reusing flat data structures (`Vec`, `BinaryHeap`, etc.) between frames.
That's public information. It's up to you to make the choice whether to trust someone, but the least you can do is look at the code and see if it matches what you would have done.
The allocator we wrote for $dayjob is essentially a buffer pool with a configurable number of "tiers" of buffers. "Static tiers" have N pre-allocated buffers of S bytes each, where N and S are provided by configuration for each tier. The "dynamic" tier malloc's on demand and can provide up to S bytes; it tracks how many bytes it has currently allocated.
Requests are matched against the smallest tier that can satisfy them (static tiers before dynamic). If no tier can satisfy it (static tiers are too small or empty, dynamic tier's "remaining" count is too low), then that's an allocation failure and handled by the caller accordingly. Eg if the request was for the initial buffer for accepting a client connection, the client is disconnected.
When a buffer is returned to the allocator it's matched up to the tier it came from - if it came from a static tier it's placed back in that tier's list, if it came from the dynamic tier it's free()d and the tier's used counter is decremented.
Buffers have a simple API similar to the bytes crate - "owned buffers" allow &mut access, "shared buffers" provide only & access and cloning them just increments a refcount, owned buffers can be split into smaller owned buffers or frozen into shared buffers, etc.
The allocator also has an API to query its usage as an aggregate percentage, which can be used to do things like proactively perform backpressure on new connections (reject them and let them retry later or connect to a different server) when the pool is above a threshold while continuing to service existing connections without a threshold.
The allocator can also be configured to allocate using `mmap(tempfile)` instead of malloc, because some parts of the server store small, infrequently-used data, so they can take the hit of storing their data "on disk", ie paged out of RAM, to leave RAM available for everything else. (We can't rely on the presence of a swapfile so there's no guarantee that regular memory will be able to be paged out.)
As for crates.io, there is no option. We need local allocators because different parts of the server use different instances of the above allocator with different tier configs. Stable Rust only supports replacing GlobalAlloc; everything to do with local allocators is unstable, and we don't intend to switch to nightly just for this. Also FWIW our allocator has both a sync and async API for allocation (some of the allocator instances are expected to run at capacity most of the time, so async allocation with a timeout provides some slack and backpressure as opposed to rejecting requests synchronously and causing churn), so it won't completely line up with std::alloc::Allocator even if/when that does get stabilized. (But the async allocation is used in a localized part of the server so we might consider having both an Allocator impl and the async direct API.)
And so because we need local allocators, we had to write our own replacements of Vec, Queue, Box, Arc, etc because the API for using custom A with them is also unstable.
As a video game developer, I've found the case for custom general-purpose allocators pretty weak in practice. It's exceedingly rare that you really want complicated nonlinear data structures, such as hash maps, to use a bump-allocator. One rehash and your fixed size arena blows up completely.
95% of use cases are covered by reusing flat data structures (`Vec`, `BinaryHeap`, etc.) between frames.