I also found splitting interrupts between the two cores helps with latency, but even if one core has only a single interrupt, that interrupt latency is increased compared to a single core system with a single interrupt. I suspect this is at least partly because they only put a single fetch pipe between the instruction cache and the crossbar.