
If you're going to use our data to understand what's happening in San Francisco, you should know exactly where it comes from, how we handle it, and where the gaps are. This post is our transparency commitment.
If you're new here, start with "Welcome to Civic Informer, San Francisco."
What This Data Is
Our Source
Every data point on Civic Informer for San Francisco comes from DataSF, the City of San Francisco's open data portal at `data.sfgov.org`.
Specifically, we pull from the SFPD Incident Reports 2018 to present dataset, which SFPD uploads to DataSF and DataSF publishes through its public Socrata API. We do not receive data directly from SFPD and we do not scrape any police website.
These are formal incident reports: documented events that SFPD has recorded and DataSF has published. Not 911 calls, not tips, not rumors, not social media posts.
We collect this data daily through an automated process. Each morning, the prior day's published reports are processed and added to our system.
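As a rough illustration of what a daily pull from a Socrata endpoint can look like, the sketch below builds a query URL for one day of reports. It is not our production code. The dataset identifier `wg3w-h783`, the `report_datetime` field name, the `limit` default, and the helper name `build_daily_url` are all assumptions made for the example; confirm the identifier and schema on the dataset's DataSF page before relying on them.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Assumed values: verify both against the dataset page on data.sfgov.org.
BASE_URL = "https://data.sfgov.org/resource/wg3w-h783.json"
DATE_FIELD = "report_datetime"


def build_daily_url(day: date, limit: int = 50000) -> str:
    """Return a Socrata (SoQL) query URL for reports filed on `day`."""
    start = day.isoformat() + "T00:00:00"
    end = (day + timedelta(days=1)).isoformat() + "T00:00:00"
    params = {
        # Half-open range [start, end) so no report is counted on two days.
        "$where": f"{DATE_FIELD} >= '{start}' AND {DATE_FIELD} < '{end}'",
        "$order": DATE_FIELD,
        "$limit": limit,
    }
    return BASE_URL + "?" + urlencode(params)


# Example: the query for reports filed on 1 March 2024.
url = build_daily_url(date(2024, 3, 1))
print(url)
```

The function only constructs the URL, so it can be checked without touching the network; fetching the result is a single HTTP GET with any client.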
How We Process It
Each incident goes through several steps before it appears on the site:
- Raw data is pulled from DataSF's Socrata API for the SFPD Incident Reports dataset
- Only Initial, Vehicle Initial, and Coplogic Initial report types are kept; supplemental updates are skipped to avoid double-counting, following DataSF's own guidance
- Coordinates are used as published by DataSF, with no third-party geocoding. Each record is assigned to one of 41 SF analysis neighborhoods using the DataSF Analysis Neighborhoods polygon dataset. Records without coordinates go into a synthetic "County" group that appears in totals but not on the map
- SFPD incident_category and incident_subcategory values are mapped into our standardized categories; because a single SF incident report can cite multiple penal code offenses, we group rows by incident number and attach all offenses to one incident record
- Each incident is flagged as violent if it falls under NIBRS "Crimes Against Persons" or carries one of a specific set of NIBRS descriptions (Arson, Robbery, Weapon Law Violations, Kidnapping); everything else is flagged non-violent
- Daily aggregate statistics are computed: totals, category breakdowns, neighborhood rankings, hourly distribution, and arrest data
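The filtering, grouping, and flagging steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not our actual pipeline: the row keys (`report_type`, `offense`, `nibrs_category`, `nibrs_description`, `neighborhood`) are hypothetical names, the sample rows are invented, and the polygon lookup that assigns a neighborhood is assumed to have already run.

```python
from collections import Counter

KEPT_REPORT_TYPES = {"Initial", "Vehicle Initial", "Coplogic Initial"}
VIOLENT_DESCRIPTIONS = {"Arson", "Robbery", "Weapon Law Violations", "Kidnapping"}


def is_violent(row):
    """Violent if Crimes Against Persons or one of the listed descriptions."""
    return (
        row.get("nibrs_category") == "Crimes Against Persons"
        or row.get("nibrs_description") in VIOLENT_DESCRIPTIONS
    )


def build_incidents(rows):
    """Keep initial reports only and merge rows sharing an incident number."""
    incidents = {}
    for row in rows:
        if row.get("report_type") not in KEPT_REPORT_TYPES:
            continue  # supplemental updates are skipped to avoid double-counting
        inc = incidents.setdefault(
            row["incident_number"],
            {"offenses": [], "violent": False, "neighborhood": "County"},
        )
        inc["offenses"].append(row["offense"])
        inc["violent"] = inc["violent"] or is_violent(row)
        # Rows without coordinates stay in the synthetic "County" group.
        if row.get("latitude") is not None and row.get("longitude") is not None:
            inc["neighborhood"] = row.get("neighborhood") or "County"
    return incidents


def daily_summary(incidents):
    """Compute a few of the daily aggregates from merged incidents."""
    return {
        "total": len(incidents),
        "violent": sum(1 for i in incidents.values() if i["violent"]),
        "by_neighborhood": Counter(i["neighborhood"] for i in incidents.values()),
    }


# Invented sample rows, for illustration only.
sample_rows = [
    {"incident_number": "A1", "report_type": "Initial", "offense": "Robbery, Street",
     "nibrs_category": "Crimes Against Property", "nibrs_description": "Robbery",
     "latitude": 37.78, "longitude": -122.41, "neighborhood": "Tenderloin"},
    {"incident_number": "A1", "report_type": "Initial", "offense": "Battery",
     "nibrs_category": "Crimes Against Persons", "nibrs_description": "Simple Assault",
     "latitude": 37.78, "longitude": -122.41, "neighborhood": "Tenderloin"},
    {"incident_number": "A1", "report_type": "Initial Supplement", "offense": "Robbery, Street",
     "nibrs_category": "Crimes Against Property", "nibrs_description": "Robbery",
     "latitude": 37.78, "longitude": -122.41, "neighborhood": "Tenderloin"},
    {"incident_number": "B2", "report_type": "Coplogic Initial", "offense": "Theft From Vehicle",
     "nibrs_category": "Crimes Against Property", "nibrs_description": "Theft From Motor Vehicle",
     "latitude": None, "longitude": None},
]

incidents = build_incidents(sample_rows)
summary = daily_summary(incidents)
print(summary)
```

In this sample, the three rows for incident A1 collapse into one incident with two offenses (the supplement is dropped), and B2 lands in the "County" group because it has no coordinates.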
What We Track Per Incident
- Case number and CAD number (SF incidents carry a Computer-Aided Dispatch number distinct from the case number)
- Date and time reported
- Offenses (raw, from SFPD via DataSF); an incident can have multiple
- Category (our standardized grouping)
- Violent/non-violent classification
- Location (latitude/longitude as published, accurate to the block level)
- Analysis neighborhood
- Description and resolution (as published by SFPD)
We do not track any personally identifying information for San Francisco incidents. No names, no person identifiers: SFPD does not publish them in this dataset, so we cannot and do not store them.
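One way to picture the per-incident record is as a small typed structure. The sketch below is illustrative only: the class name `Incident`, every field name, the default values, and the example case number are assumptions for the example, not our actual schema.

```python
from dataclasses import dataclass, field, fields
from datetime import datetime
from typing import List, Optional


@dataclass
class Incident:
    case_number: str
    cad_number: Optional[str]          # assumed optional for this sketch
    reported_at: datetime              # date and time reported
    offenses: List[str] = field(default_factory=list)  # raw, one or more
    category: str = "Other"            # standardized grouping (placeholder default)
    violent: bool = False
    latitude: Optional[float] = None   # block-level, as published
    longitude: Optional[float] = None
    neighborhood: str = "County"       # fallback group when coordinates are missing
    description: str = ""
    resolution: str = ""


# Invented example record.
inc = Incident(
    case_number="240000001",
    cad_number=None,
    reported_at=datetime(2024, 3, 1, 8, 30),
    offenses=["Burglary", "Vandalism"],
)
field_names = [f.name for f in fields(Incident)]
print(field_names)
```

Note that the structure has no field for a name or any other person identifier, which mirrors what the dataset publishes.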
How We Present It
We turn that processed data into several tools you can use:
- Daily reports with charts, tables, and the full incident log
- An interactive map where you can filter incidents by category, time of day, arrest status, and neighborhood
- Weekly and monthly trend summaries
- Neighborhood rankings with 30-day trends
- A free daily email newsletter with a condensed version of the day's data
What This Data Isn't
Not Real-Time
Reports cover the prior day's activity and are published each morning. This is by design. We prioritize accuracy and completeness over speed.
DataSF's own publication cycle means some reports arrive a day or two after the underlying incident, and SFPD may file supplemental updates later that reclassify an event. We ingest daily, and records may be added or updated retroactively as DataSF refreshes the dataset.
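Retroactive additions and updates are easiest to handle when ingestion is idempotent, meaning that re-running it over the same data changes nothing. A minimal sketch of that idea, assuming records are plain dictionaries keyed by a hypothetical `incident_number` field and stored in an in-memory dictionary (the function name `upsert` and the sample values are also made up for the example):

```python
def upsert(store, fetched):
    """Insert new records and overwrite changed ones, keyed by incident number.

    Returns a tuple (added, updated). Unchanged records are left alone.
    """
    added, updated = 0, 0
    for rec in fetched:
        key = rec["incident_number"]
        if key not in store:
            added += 1
        elif store[key] != rec:
            updated += 1
        else:
            continue  # identical to what we already have
        store[key] = rec
    return added, updated


# Invented example: A1 is already stored and later changes; B2 arrives late.
store = {"A1": {"incident_number": "A1", "resolution": "Open or Active"}}
fetched = [
    {"incident_number": "A1", "resolution": "Cite or Arrest Adult"},
    {"incident_number": "B2", "resolution": "Open or Active"},
]

first = upsert(store, fetched)   # one record added, one updated
second = upsert(store, fetched)  # same data again: nothing changes
print(first, second)
```

Because the second run is a no-op, the same fetch can safely be repeated whenever the upstream dataset is refreshed.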
Not Call Data
We reflect SFPD incident reports as published by DataSF, not every call to 911 or dispatch. An incident report means SFPD formally documented an event. Many calls for service never become an incident report. This means our data represents a subset of all police activity, not the full picture.
Raw CAD and 911 dispatch data lives in separate DataSF datasets that we may integrate in the future.
Not a Decision-Maker
Civic Informer presents data as published by SFPD through DataSF. We are not saying anyone is guilty. We are not making legal claims. We are not recommending actions.
When an arrest appears in our data, it reflects a police action, not a conviction or determination of guilt. Cases may be dismissed, charges may change, and outcomes may differ from what the initial report suggests.
We organize information so you can draw your own conclusions. What you do with this data, whether that's making neighborhood decisions, attending a city meeting, or simply staying informed, is up to you.
Not Complete
Our data only includes incidents SFPD has recorded and DataSF has published. Unreported activity isn't captured, and some incident types are underreported by nature, particularly domestic incidents and certain property crimes where victims don't file reports.
Some SFPD districts have historically underreported in the DataSF feed, and supplemental updates can reclassify an incident after it first appears. Higher incident counts in a neighborhood often reflect commercial activity, nightlife, tourism, or population density, and do not necessarily mean the area is less safe.
Not Editorial
We don't interpret what the data "means" for policy. We don't take positions on policing, funding, or enforcement strategy. That's your job as a resident, voter, or community member.
Our analysis focuses on patterns and context: is this week busier than average? Which categories are trending up or down? What time of day sees the most activity? The conclusions are yours to draw.
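To make "busier than average" concrete, here is one simple way such a comparison could be computed: total the most recent seven days and compare that with the mean of the preceding four weekly totals. The window sizes, the function name `compare_to_baseline`, and the sample counts are assumptions for illustration, not a description of the exact method behind our charts.

```python
from statistics import mean


def compare_to_baseline(daily_counts, window=7, baseline_weeks=4):
    """Compare the latest `window` days with the mean of the preceding windows.

    Returns (current_total, baseline_total, percent_change). The percent
    change is None when the baseline is zero.
    """
    needed = window * (baseline_weeks + 1)
    if len(daily_counts) < needed:
        raise ValueError(f"need at least {needed} days of counts")
    recent = daily_counts[-needed:]
    current = sum(recent[-window:])
    previous = [
        sum(recent[i:i + window])
        for i in range(0, window * baseline_weeks, window)
    ]
    baseline = mean(previous)
    pct = 100 * (current - baseline) / baseline if baseline else None
    return current, baseline, pct


# Invented counts: four weeks at 10 incidents a day, then a week at 12 a day.
counts = [10] * 28 + [12] * 7
current, baseline, pct = compare_to_baseline(counts)
print(current, baseline, pct)
```

With these sample counts the latest week totals 84 against a baseline of 70, a 20 percent increase. A comparison like this says nothing about why the week was busier, which is the kind of context the rest of this post is about.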
What We're Working Toward
- Additional DataSF datasets: expanding coverage as more San Francisco public safety datasets (arrests, 311, calls for service) come online and stabilize
- Richer neighborhood context: deeper historical baselines as we accumulate more days of SF history on the platform
- Seasonal and multi-year comparisons: DataSF's incident history runs from 2018 to present, giving us a long runway for year-over-year and seasonal views
Why This Transparency Matters
Data without context is noise. A busy day during a holiday weekend or a major event reads differently than a busy day in February. A neighborhood with high incident counts near a commercial or nightlife district means something different than the same number in a residential area.
We include tools to help you find that context: rolling averages, seasonal comparisons, neighborhood-level detail, time-of-day breakdowns. A single number never tells the whole story. We'd rather you trust our data and draw your own conclusions than see us sensationalize it.
Our Commitment
When we get something wrong, whether a miscategorization, a data gap, or a broken chart, we'll say so publicly, here on this blog.
If you spot something that doesn't look right, or have questions about our methodology, reach out through the contact page or reply to any daily email.
For a complete walkthrough of the platform, read "How To Use Civic Informer, San Francisco."