Because the dataset is incomplete if even a single city doesn't opt into sharing data, or only provides a portion of the real data.
From the FBI itself:
> Pitfalls of Ranking
> UCR data are sometimes used to compile rankings of individual jurisdictions and institutions of higher learning. These incomplete analyses have often created misleading perceptions which adversely affect geographic entities and their residents. For this reason, the FBI has a long-standing policy against ranking participating law enforcement agencies on the basis of crime data alone. Despite repeated warnings against these practices, some data users continue to challenge and misunderstand this position.
That's the point: there is no dataset that will allow you to reliably rank every "major city" by crime stats, because every city reports crime differently and no crime dataset also accounts for factors which may affect those statistics (e.g., law enforcement misreporting crimes for political reasons).
How do you even define a city in this case? You say those are normalized crime stats per 1k residents - how big are the metro areas? Are there any cities that also count a portion of their suburbs as part of the city? Does doing so reduce per-capita crime stats? etc...
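To make the boundary question concrete, here's a quick sketch of how the same reported crimes give very different per-1k rates depending on where you draw the city line. The numbers are completely made up for illustration (they are not from UCR or any real city), and `rate_per_1k` is just a helper name I picked:

```python
def rate_per_1k(crimes: int, population: int) -> float:
    """Reported crimes per 1,000 residents."""
    return 1000 * crimes / population


# Hypothetical figures, invented purely for illustration.
core = {"crimes": 6_000, "population": 100_000}      # dense urban core
suburbs = {"crimes": 1_500, "population": 150_000}   # lower-crime suburbs

# "City" defined as the urban core only.
core_rate = rate_per_1k(core["crimes"], core["population"])

# "City" defined as core plus annexed suburbs.
merged_rate = rate_per_1k(
    core["crimes"] + suburbs["crimes"],
    core["population"] + suburbs["population"],
)

print(f"core only:      {core_rate:.1f} per 1k")
print(f"core + suburbs: {merged_rate:.1f} per 1k")
```

With these made-up inputs the core-only rate is 60 per 1k and the merged rate is 30 per 1k, even though nothing about the actual crime changed. Whether real cities differ this much depends entirely on how each one draws its boundaries.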
My suggestion would be to thoroughly vet your sources before posting their conclusions as fact.
No, I didn't. I've been working with UCR data for 10+ years and I recognize it in the wild. Furthermore, my point still stands regardless of the source: "there is no dataset that will allow you to reliably rank every 'major city' by crime stats because every city reports crime differently and no crime dataset also accounts for factors which may affect those statistics."
If you know where their source data comes from and why it's so accurate, feel free to educate me. Their own site says this:
> Q. Why rank cities on safety even though the FBI cautions against it?
> A. This report and/or our data cannot and should not be used as a measure of the effectiveness of law enforcement due to the myriad factors that can contribute to crime in a community.
This is surprising to me as a non-American, because Detroit reads to me like a "gang-infested murder city" and I would have expected San Francisco to do significantly better in crime numbers...
Are the numbers misleading or IS criminality in San Francisco comparable to Detroit?
The numbers are misleading because they're based on voluntary reporting from law enforcement agencies to the FBI's UCR program. The FBI has this to say:
Data users should not rank locales because there are many factors that cause the nature and type of crime to vary from place to place. UCR statistics include only jurisdictional population figures along with reported crime, clearance, or arrest data. Rankings ignore the uniqueness of each locale. Some factors that are known to affect the volume and type of crime occurring from place to place are:
• Population density and degree of urbanization.
• Variations in composition of the population, particularly youth concentration.
• Stability of the population with respect to residents; mobility, commuting patterns, and transient factors.
• Economic conditions, including median income, poverty level, and job availability.
• Modes of transportation and highway systems.
• Cultural factors and educational, recreational, and religious characteristics.
• Family conditions with respect to divorce and family cohesiveness.
• Climate.
• Effective strength of law enforcement agencies.
• Administrative and investigative emphases on law enforcement.
• Policies of other components of the criminal justice system (i.e., prosecutorial, judicial, correctional, and probational).
On the link you provided (https://www.neighborhoodscout.com/ca/san-francisco/crime) there's a "Source & Methodology" link which says "Raw data sources: 18,000 local law enforcement agencies in the U.S." and "Date(s) & Update Frequency: Reflects 2021 calendar year; released from FBI in Oct. 2022"
The FBI puts out one main source of crime stats - UCR. On the UCR site that I linked to: "The UCR Program includes data from more than 18,000 city, university and college, county, state, tribal, and federal law enforcement agencies." Those numbers aren't a coincidence.
UCR is the main data product that people use for (usually incorrect) analysis. The FBI itself says that the data is incomplete to the point that it should not be used as a way to determine the safety of a location.
They even discuss it in detail:
"What makes NeighborhoodScout® Crime Data uniquely accurate?
Most city neighborhood crime data are incomplete and inaccurate because crimes are reported by individual law enforcement agencies, rather than by city or town, and many cities – even small ones – have more than one agency responsible for law enforcement (municipal, university, county, transit, etc.). Even FBI data are reported by agency not by city or town, providing an incomplete assessment of city-wide crime counts. It is an agency-centric rather than locality-centric reporting method. If you use FBI data, you only get city-wide general counts, and only from one agency in the city, so it is generally incomplete for the city overall, as well as not specific to a neighborhood or address."
And further:
Once we have these complete set of reported crime data, along with millions of geocoded reported crime incidents using a GIS, we begin our crime data development process.
The results are fine resolution, highly accurate crime data that are comparable nationally.
Our approach provides you the ability to look at small areas effectively.
In some cases a city agency is in charge of law enforcement, while in other areas it’s a county. In many cases it is more than one agency for a geographic area. Since the geography varies, it’s difficult to compare the scores among jurisdictions, or to get a true and complete picture of crime risk. This is why we use a relational database to assess the true count of reported crimes in a locality.
Although most agencies report, not all do. This creates holes in the data. Our method allows us to accurately fill in the holes based on the crime experience of many like locales, and provide accurate crime data for anywhere in the U.S.
The quote you posted offers no actual information about how they purport to plug gaps in the data:
> And further: Once we have these complete set of reported crime data, along with millions of geocoded reported crime incidents using a GIS, we begin our crime data development process.
> The results are fine resolution, highly accurate crime data that are comparable nationally.
> Our approach provides you the ability to look at small areas effectively."
Can you explain to me what their methodology is? Because this is just marketing-speak and doesn't provide any facts about how or why their analysis is accurate.
---
Edit: from their own site!
> Q. Why rank cities on safety even though the FBI cautions against it?
> A. This report and/or our data cannot and should not be used as a measure of the effectiveness of law enforcement due to the myriad factors that can contribute to crime in a community.
You can generally assume most statistics posted in any comment on any site are misleading, and use that cynicism to assess the "numbers".
That's just how it is on the internet: most people who post the "numbers" do so with a very specific intention, and it's not at all to highlight their objectivity. You'll want to get the source research yourself and see what it says, and most papers tend to be far more nuanced than "thing bad" or "thing good".
The numbers might not be 100% accurate (as another commenter mentioned, there is no normalized dataset available), but they show an approximate scale.
It seems, though, that some Bay Area-based HN readers disagree with both the statistical and the anecdotal evidence, which might actually be an example of sunk cost bias (not surprising considering the cost of property in SF / the Bay Area).