The failure in every case has been when they close the loop. That is, Agile is used to score and measure velocity, and that velocity is then used to drive future scoring. This nullifies velocity's value as a measurement, since management is now insisting that x number of points be completed per developer per sprint.
The places I've had it succeed greatly have all been new development. We consistently scored stories. Product owners then had a budget, equal to the team's velocity, to "purchase" from the backlog into the next sprint. In practice, that meant they got to arrange the order in which the backlog would be picked up; how much would get done in a sprint was an estimate, but with consistent implementation it became eerily accurate over time. It has been my observation that it takes about six sprints to calibrate -- where the team scores consistently and the variances between sprints cancel out.
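As a minimal sketch of that budgeting mechanic (the sprint numbers, the six-sprint window, and the function names here are all hypothetical illustrations, not any prescribed formula):

```python
# Sketch: velocity as a rolling average of completed story points,
# used as the product owner's "budget" for the next sprint.
# All sprint data below is made up for illustration.

from statistics import mean

completed_points = [21, 34, 25, 30, 28, 29]  # points actually finished per sprint

def velocity(history, window=6):
    """Average completed points over the last `window` sprints."""
    recent = history[-window:]
    return mean(recent) if recent else 0

budget = velocity(completed_points)
print(f"Next sprint budget: {budget:.1f} points")
# After roughly six sprints, per-sprint noise averages out, and this
# number becomes a reliable budget for pulling stories from the backlog.
```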
Another point of emphasis is that velocity is a team score. Every failing effort I've been involved with used it to score individuals, which inevitably leads to scoring stories higher so that an individual gets more points. It also results in cherry-picking easy points and ignoring tasks and bug fixes that may be worth zero. And it results in everything (bug fixes included) being assigned points, so that points become a proxy for time. If you're developing a new system and a bug turns up, the team has already claimed the points for that work. You should not mint new points for the bug fix; your velocity should actually go down, giving a more accurate future estimate of how much new system is actually produced, not a measure of how much a programmer has worked.
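To make that concrete, here is a small sketch of the two accounting choices (the field names and numbers are hypothetical): count only new story points toward velocity, and time lost to fixes shows up as reduced throughput rather than as extra points.

```python
# Sketch of the bug-fix argument above. All numbers are made up.
sprints = [
    {"new_story_points": 30, "bugfix_effort_points": 0},
    {"new_story_points": 22, "bugfix_effort_points": 8},  # fixes ate capacity
    {"new_story_points": 28, "bugfix_effort_points": 2},
]

# Honest: velocity reflects net new system delivered.
honest = [s["new_story_points"] for s in sprints]

# Inflated: re-scoring bug fixes turns velocity back into a proxy for time.
inflated = [s["new_story_points"] + s["bugfix_effort_points"] for s in sprints]

print("honest:  ", honest)    # [30, 22, 28] -- forecasts real new output
print("inflated:", inflated)  # [30, 30, 30] -- just measures hours worked
```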
So, measurements should absolutely be available to management. But they shouldn't be used to impact future scoring.
That assumes really good (measured against the norm) management. If your management is "normal" (or worse), giving management access to that just destroys the whole value of the scoring system. (Always remember that fifty percent of managers are below the median, to say nothing of the still-within-a-standard-deviation "better" ones who aren't enough better to matter.)
It really sounds like a "no true Scotsman" argument, but I've had Agile fail when the team failed to actually use Agile. As I wrote above, points inevitably became a proxy for time in one manner or another, and production was pinned to a certain number of points (say, 15 per developer per sprint).
I can't throw it all on management either. People who should know better will accept two days' worth of points for conducting training or attending a workshop.
Largely, though, I would say it's a simple matter of not understanding how Agile works at all. It has just become, "you score points for doing stuff" (in proportion to time, but it's not time, but it is).