It is said that the NSA requested the addition of POPCNT to the Control Data Corporation CDC 6600 (1964) as a condition for procurement.
The condition remained in force afterwards, so all its successors, like the CDC 7600 (1969) and the Cray-1 (1976), included POPCNT.
POPCNT was added to the x86 ISA by AMD, in "Barcelona", in 2007, presumably because some AMD Opteron customer had requested it. This happened during the period when AMD server CPUs were much better than Intel Xeons, so any wise customer was buying Opterons, not Xeons. Intel followed AMD and added POPCNT to Nehalem in 2008/2009. (For server CPUs, Nehalem was the first Intel design that was better for any purpose than the AMD server CPUs. For consumer CPUs, Intel had already surpassed AMD in the middle of 2006, with Core 2.)
Back in my days as a CPU logic designer, I actually worked on a scientific mainframe where the MIB came by and said "We'll buy some if you add a vector pop count."
Anyway... if you have cipher text that has been scrambled by a linear-feedback shift register (LFSR), you can take two copies of the cipher text, shift one copy by N bits, XOR them together, and do a pop count on the result. Repeat for a bunch of different N's. For some N that corresponds to the length of the LFSR, the auto-correlation will be much stronger. So now you have at least that to go on... of course you don't know the feedback equation and you don't know the initialization constant, but you have the start of a handle.
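For anyone who wants to see the shift / XOR / pop count loop concretely, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the 4-bit toy LFSR, the artificially biased "plaintext", and the helper names (`lfsr_bits`, `pack`, `mismatch_fraction`) are not from any real system. It only shows the mechanics, not a real attack.

```python
def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def lfsr_bits(n, state=0b0001):
    """Toy 4-bit Fibonacci LFSR (hypothetical, period 15), illustration only."""
    out = []
    for _ in range(n):
        out.append(state & 1)                  # output the low bit
        new_bit = (state ^ (state >> 1)) & 1   # feedback = bit0 XOR bit1
        state = (state >> 1) | (new_bit << 3)  # shift right, feed back at the top
    return out


def pack(bits):
    """Pack a list of bits into one integer (list index i -> bit position i)."""
    value = 0
    for i, b in enumerate(bits):
        value |= b << i
    return value


def mismatch_fraction(stream, nbits, shift):
    """Shift, XOR, pop count: fraction of overlapping bits that differ."""
    overlap = nbits - shift
    diff = (stream ^ (stream >> shift)) & ((1 << overlap) - 1)
    return popcount(diff) / overlap


NBITS = 600
# Assumed, heavily biased "plaintext": mostly zeros, a one every 11th bit.
plain = [1 if i % 11 == 0 else 0 for i in range(NBITS)]
key = lfsr_bits(NBITS)
cipher = pack([p ^ k for p, k in zip(plain, key)])

# Try a bunch of shifts; a low mismatch fraction means strong auto-correlation.
scores = {n: mismatch_fraction(cipher, NBITS, n) for n in range(1, 21)}
best = min(scores, key=scores.get)
print("shift with strongest auto-correlation:", best)
```

In this toy setup the shift that stands out is 15, which is the period of the 4-bit keystream: at that shift the keystream cancels out and only the bias of the plaintext is left, while the other shifts hover near one half. Whether that is exactly the quantity meant by "the length of the LFSR" above is something the sketch does not settle.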
I've heard this a few times. Do you have anything that explains it?