We're now working on improving the scalability of LISTEN/NOTIFY in PostgreSQL, and to guide that work, I'd like to better understand how it's used (or was used) in real-world systems. What works well? What doesn't?
The current implementation has some known scalability bottlenecks:
1. Thundering Herd Problem: A NOTIFY wakes up all listening backends in the current database, even those not listening on the notified channel. This is inefficient when many listeners are each listening to their own channels (e.g. in job queues).
2. Commit Lock Contention: NOTIFY operations are serialized behind a heavyweight lock at transaction commit. This can become a bottleneck when many transactions send notifications in parallel.
If you've used LISTEN/NOTIFY in production, I'd love to hear:
- What is/was your use case?
- Does each client listen on its own channel, or do they share channels?
- How many listening backend processes?
- How many NOTIFYs in parallel?
- Are you sending payloads? If so, how large?
- What worked well for you? What didn't?
- Did you hit any scalability limits?
Feedback much appreciated, thanks!
/Joel
https://github.com/graphile/worker
Every worker pool seems to listen on the same shared channels: jobs:insert - all workers get notified when new jobs are added worker:migrate - all workers get notified about database migrations