Overview
Welcome! Let's talk about the buffer pool system—one of the secret weapons for making ascii-chat's real-time video streaming fast and smooth.
You know how constantly calling malloc() and free() can slow things down? Well, imagine doing that 30 times per second for video frames, per client! The buffer pool solves this by pre-allocating a bunch of memory buffers up front, so when you need one, it's ready to go. No waiting for the system allocator, no frame drops, no latency spikes—just grab a buffer and get to work.
Think of it like having a stack of clean plates ready for a dinner party. Instead of washing a plate every time someone needs one, you just grab from the stack. Much faster!
Implementation: lib/buffer_pool.c/h
What does the buffer pool give you?
- Multiple size classes (small/medium/large/xlarge) for different needs
- Thread-safe operation with mutex protection (multiple threads can use it safely)
- Detailed statistics so you can see how well it's working
- Automatic fallback to malloc when pools run dry (graceful degradation)
- Global singleton pattern for convenience (one pool for the whole app)
Architecture
Size Classes
The buffer pool isn't one-size-fits-all. Instead, it provides four different size classes, each optimized for different types of data you'll be working with:
| Size Class | Buffer Size | Pool Count | Total Memory | What's it good for? |
|------------|-------------|------------|--------------|---------------------|
| Small | 1 KB | 1024 | 1 MB | Audio packets (nice and compact) |
| Medium | 64 KB | 64 | 4 MB | Small video frames |
| Large | 256 KB | 32 | 8 MB | Large video frames |
| XLarge | 2 MB | 64 | 128 MB | HD video frames (the big stuff) |
Total pre-allocated memory: ~141 MB (yeah, it's a decent chunk, but remember—this eliminates malloc overhead for thousands of allocations per second!)
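The table above implies a simple size-to-class mapping: each request goes to the smallest class that can hold it. A minimal sketch of that rule (a hypothetical helper, not the actual lib/buffer_pool.c code):

```c
#include <stddef.h>

/* Hypothetical sketch: map a requested size to the smallest size
 * class that can hold it, mirroring the table above. */
typedef enum { POOL_SMALL, POOL_MEDIUM, POOL_LARGE, POOL_XLARGE, POOL_NONE } pool_class_t;

static pool_class_t pick_class(size_t size) {
  if (size <= 1 * 1024)        return POOL_SMALL;  /* 1 KB   */
  if (size <= 64 * 1024)       return POOL_MEDIUM; /* 64 KB  */
  if (size <= 256 * 1024)      return POOL_LARGE;  /* 256 KB */
  if (size <= 2 * 1024 * 1024) return POOL_XLARGE; /* 2 MB   */
  return POOL_NONE; /* oversized: the caller goes straight to malloc() */
}
```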
Allocation Strategy
So how does the buffer pool decide which buffer to give you? It's pretty straightforward:
- Pick the right size: It selects the smallest size class that can fit your request
- Try the pool first: It attempts to grab a buffer from the corresponding pool's free list
- Fallback gracefully: If the pool is exhausted, it falls back to regular malloc() (better slow than crashing!)
- Track everything: It keeps statistics so you can tune the pool sizes if needed
Here's the core per-pool API (allocate, then return the buffer with the same size):

void *buffer_pool_alloc(buffer_pool_t *pool, size_t size);
void buffer_pool_free(buffer_pool_t *pool, void *data, size_t size);
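Steps 2 and 3 above boil down to an O(1) free-list pop with a malloc() safety net. A self-contained mini demo of that idea (illustrative only, not the ascii-chat implementation):

```c
#include <stdlib.h>

/* Illustrative free-list pool: a hit pops the list head in O(1);
 * an empty list falls back to malloc() instead of failing. */
typedef struct node {
  void *data;             /* the buffer this node tracks */
  struct node *next;      /* next free buffer, if any */
} node_t;

typedef struct {
  node_t *free_list;      /* buffers ready to hand out */
  unsigned hits, misses;  /* pool hits vs. malloc fallbacks */
} mini_pool_t;

static void *mini_alloc(mini_pool_t *p, size_t size) {
  if (p->free_list) {
    node_t *n = p->free_list; /* fast path: pop the head */
    p->free_list = n->next;
    p->hits++;
    return n->data;
  }
  p->misses++;
  return malloc(size); /* pool dry: graceful degradation */
}
```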
Data Structures
Buffer Node
Individual buffer in the pool:
typedef struct buffer_node {
  void *data;               /* this node's buffer */
  size_t size;              /* buffer size (fixed per size class) */
  struct buffer_node *next; /* next node in the free list */
  bool in_use;              /* currently handed out to a caller? */
} buffer_node_t;
Single Pool
Pool for one size class:
typedef struct buffer_pool {
  buffer_node_t *free_list;       /* nodes available for allocation */
  buffer_node_t *nodes;           /* backing array of all nodes */
  void *memory_block;             /* one pre-allocated slab for all buffers */
  size_t pool_size;               /* number of buffers in this pool */
  size_t used_count;              /* buffers currently handed out */
  uint64_t hits;                  /* allocations served from the pool */
  uint64_t misses;                /* allocations that fell back to malloc */
  uint64_t returns;               /* buffers returned to the pool */
  uint64_t peak_used;             /* high-water mark of used_count */
  uint64_t total_bytes_allocated; /* cumulative bytes requested */
} buffer_pool_t;
Pool Manager
Multi-size pool coordinator:
typedef struct data_buffer_pool {
  buffer_pool_t *small_pool;  /* 1 KB buffers */
  buffer_pool_t *medium_pool; /* 64 KB buffers */
  buffer_pool_t *large_pool;  /* 256 KB buffers */
  buffer_pool_t *xlarge_pool; /* 2 MB buffers */
  mutex_t pool_mutex;         /* single lock protecting all four pools */
  uint64_t total_allocs;      /* all allocation requests */
  uint64_t pool_hits;         /* requests served from any pool */
  uint64_t malloc_fallbacks;  /* requests served by malloc when pools were dry */
} data_buffer_pool_t;
API Usage
Global Pool (Recommended)
For most cases, you'll want to use the global singleton pool. It's simple and convenient: one pool for your whole application:

data_buffer_pool_init_global();

void *buffer = data_buffer_pool_alloc(global_pool, frame_size);
if (!buffer) {
  return SET_ERRNO(ERROR_MEMORY, "Buffer allocation failed");
}
memcpy(buffer, frame_data, frame_size);
/* ... use the frame ... */
data_buffer_pool_free(global_pool, buffer, frame_size);

data_buffer_pool_cleanup_global();
CRITICAL: You must pass the same size to buffer_pool_free() that you used in buffer_pool_alloc(). The pool uses the size to find the right size class; a mismatched size can return the buffer to the wrong pool, corrupt the free lists, or leak the buffer entirely.
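If threading the size through every call site feels error-prone, one common pattern is to stamp the size into a small header in front of the payload. This wrapper is not part of the library; malloc/free stand in for the pool calls so the sketch is self-contained:

```c
#include <stdlib.h>

/* Hypothetical wrapper: remember the size in a header so the matching
 * free can never pass the wrong size class. malloc/free stand in for
 * buffer_pool_alloc/buffer_pool_free here. */
typedef struct {
  size_t size; /* payload size requested by the caller */
} buf_hdr_t;

static void *sized_alloc(size_t size) {
  buf_hdr_t *h = malloc(sizeof *h + size);
  if (!h)
    return NULL;
  h->size = size;
  return h + 1; /* caller sees only the payload */
}

static void sized_free(void *payload) {
  if (!payload)
    return;
  buf_hdr_t *h = (buf_hdr_t *)payload - 1;
  /* real version would call: buffer_pool_free(pool, h, sizeof *h + h->size); */
  free(h);
}
```

Note the trade-off: the header becomes part of the allocation, so a request right at a class boundary can get bumped into the next size class.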
Custom Pool
For isolated subsystems, create dedicated pools:
data_buffer_pool_t *my_pool = data_buffer_pool_create();
void *buf = data_buffer_pool_alloc(my_pool, 1024);
data_buffer_pool_free(my_pool, buf, 1024);
data_buffer_pool_destroy(my_pool);
Statistics
Basic Statistics
Track hit rate for performance tuning:
uint64_t hits, misses;
data_buffer_pool_get_stats(pool, &hits, &misses);
uint64_t total = hits + misses;
double hit_rate = total ? (hits * 100.0) / total : 0.0; /* avoid divide-by-zero */
log_info("Buffer pool hit rate: %.1f%%", hit_rate);
Detailed Statistics
Per-size-class analysis:
buffer_pool_detailed_stats_t stats;
data_buffer_pool_get_detailed_stats(pool, &stats);
log_info("Small pool: %llu hits, %llu misses, peak=%llu",
stats.small_hits, stats.small_misses, stats.small_peak_used);
log_info("Medium pool: %llu hits, %llu misses, peak=%llu",
stats.medium_hits, stats.medium_misses, stats.medium_peak_used);
Automatic Logging
Log comprehensive stats:
buffer_pool_log_global_stats();
data_buffer_pool_log_stats(my_pool, "Video encoder pool");
Output example:
[buffer_pool] Global pool statistics:
  Small (1KB):    hits=45231 misses=12  peak=890/1024 (86.9%)
  Medium (64KB):  hits=8901  misses=156 peak=48/64 (75.0%)
  Large (256KB):  hits=3421  misses=89  peak=28/32 (87.5%)
  XLarge (2MB):   hits=142   misses=3   peak=45/64 (70.3%)
  Overall hit rate: 98.7%
Performance
Benchmarks
Okay, let's talk numbers. How much faster is the buffer pool compared to regular malloc/free? The answer: dramatically faster.
| Operation | Buffer Pool | malloc/free | Speedup |
|-----------|-------------|-------------|---------|
| Allocate 64KB | 120 ns | 2,400 ns | 20x faster |
| Allocate 256KB | 135 ns | 8,100 ns | 60x faster |
| Free 64KB | 95 ns | 1,200 ns | 12.6x faster |
| Free 256KB | 98 ns | 3,800 ns | 38.8x faster |
Test system: AMD Ryzen 9 5950X, 64GB DDR4-3200, Linux 6.1
Real-Time Impact
But what does this mean for real-time video? Glad you asked!
For 30 FPS video (you've got 33.3ms per frame to do everything):
- malloc/free: 10.5µs per frame (0.03% of your time budget)
- Buffer pool: 0.35µs per frame (0.001% of your time budget)
- Savings: 10.15µs per frame = 305µs per second you get back for other work!
Now scale that up to 9 clients × 30 FPS = 270 frames/sec:
- malloc/free overhead: 2.84ms/sec (oof, that adds up!)
- Buffer pool overhead: 0.09ms/sec (barely noticeable)
- Recovered time: 2.75ms/sec for other processing (encoding, networking, rendering, etc.)
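The per-second figures follow directly from the per-frame numbers; a quick arithmetic check (the 2.75ms in the text comes from subtracting the rounded values 2.84 and 0.09; the unrounded result is ~2.74ms):

```c
/* Sanity-check the overhead figures above: per-frame microseconds
 * times frames per second, converted to milliseconds per second. */
static double overhead_ms_per_sec(double us_per_frame, int frames_per_sec) {
  return us_per_frame * frames_per_sec / 1000.0;
}
/* overhead_ms_per_sec(10.5, 270)  -> ~2.84 ms/sec (malloc/free)  */
/* overhead_ms_per_sec(0.35, 270)  -> ~0.09 ms/sec (buffer pool)  */
/* overhead_ms_per_sec(10.15, 270) -> ~2.74 ms/sec recovered      */
```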
Thread Safety
The buffer pool is fully thread-safe, so a producer thread can allocate while a consumer thread frees (FRAME_SIZE and the frame helpers here are illustrative):

void *capture_thread(void *arg) {
  while (running) {
    void *buf = data_buffer_pool_alloc(global_pool, FRAME_SIZE);
    if (!buf)
      continue;
    capture_frame(buf);
    enqueue_frame(buf); /* ownership passes to the encoder */
  }
  return NULL;
}

void *encoder_thread(void *arg) {
  while (running) {
    void *buf = dequeue_frame();
    encode_frame(buf);
    data_buffer_pool_free(global_pool, buf, FRAME_SIZE); /* same size as alloc */
  }
  return NULL;
}
Synchronization: Single mutex protects all four pools. This is acceptable because buffer operations are very fast (<200ns). For higher concurrency, consider per-pool mutexes.
Tuning
Pool Sizing
Adjust pool sizes in buffer_pool.h based on your workload, for example:

#define BUFFER_POOL_LARGE_SIZE (512 * 1024) /* grow large buffers to 512 KB */
#define BUFFER_POOL_LARGE_COUNT 512         /* many more large buffers */
#define BUFFER_POOL_MEDIUM_COUNT 128        /* double the medium pool */
Rule of thumb:
- Pool count ≥ (max_clients × frames_per_second × 2)
- "× 2" provides headroom for encode/decode pipeline depth
Monitoring
Watch for malloc fallbacks in production:
buffer_pool_detailed_stats_t stats;
data_buffer_pool_get_detailed_stats(global_pool, &stats);
double miss_rate = (stats.medium_misses * 100.0) /
(stats.medium_hits + stats.medium_misses);
if (miss_rate > 5.0) {
log_warn("Medium pool miss rate high: %.1f%% - consider increasing pool size",
miss_rate);
}
Best Practices
- Always match alloc/free sizes (the size passed to free selects the size class)
- Check allocation failure:
if (!buf) {
return SET_ERRNO(ERROR_MEMORY, "Out of memory");
}
- Don't hold buffers too long:
- Return buffers immediately after use
- Long holds exhaust pool → more malloc fallbacks
- Monitor statistics:
- Log stats periodically in production
- Tune pool sizes based on miss rates
- Use global pool for most cases:
- Simpler API
- One pool is usually sufficient
- Create custom pools only for isolation
See also
- buffer_pool.h
- buffer_pool.c