We used to have a pub rate of about 200k msgs/s, from about 400 producers all to a single exchange and had similar issues. However, we were able to mitigate this by using lazy queues.
This worked fine until things got behind and then we couldn't keep up. We were able to work around that by using a hashed exchange that spread messages across 4 queues. It hashed based on timestamp inserted by a timestamp plugin. Since all operations for a queue happen in the same event loop, any sort of backup led to pub and sub operations fighting for CPU time. By spreading this across 4 queues we wound up with 4x the CPU capacity for this particular exchange. With 2000 queues you probably didn't run into that issue very often.
This worked fine until things got behind and then we couldn't keep up. We were able to work around that by using a hashed exchange that spread messages across 4 queues. It hashed based on timestamp inserted by a timestamp plugin. Since all operations for a queue happen in the same event loop, any sort of backup led to pub and sub operations fighting for CPU time. By spreading this across 4 queues we wound up with 4x the CPU capacity for this particular exchange. With 2000 queues you probably didn't run into that issue very often.