e-government for uniform crime reporting data and how to make it more accessible
TRANSCRIPT
-
7/27/2019 E-Government for Uniform Crime Reporting Data and How to Make it More Accessible
E-Government for Uniform Crime Reporting Data: How to Make it More Accessible
The Value of Ensuring Open Government Data is Truly Accessible
Transparency and availability of government data are critical goals. Jaeger, Bertot, and
Grimes survey the benefits of government transparency, including democratic
participation, prevention of corruption, informed decision-making, and accuracy of government
information.1 However, they note that, "If the Obama administration's renewed commitments to
transparency and the focus on the capacities of e-government and the web as an avenue to
advance transparency are to succeed, their efforts must ultimately result in citizen-centered
approaches that are available to and usable by all members of the public ... future policy will need
to focus on the human dimensions of transparency, not just the technological dimensions ... to be
embraced by the public, it will need to not only be available to all, but be designed to be usable
by all." They go on to describe the technical skill gaps most often encountered by the general
public in attempting to fully interpret and use government data:2
1) Technology literacy: the ability to use and understand technologies
2) Usability: the design of technologies in such ways that are intuitive and allow users to
engage in the content embedded within the technology
3) Accessibility: the ability of persons with disabilities (or any person lacking a certain
technical background) to be able to access the content
By: David Blau, George Mason University Graduate School of Public Policy, 11/26/2013
Contact: [email protected]
1 Bertot, John Carlo, and Jaeger, Paul T. Transparency and Technological Change: Ensuring Equal and Sustained
Public Access to Government Information. Government Information Quarterly, Vol. 27, p. 371, 2010.
2 Bertot, John Carlo, Grimes, Justin M., and Jaeger, Paul T. Using ICTs to Create a Culture of Transparency: E-
Government and Social Media as Openness and Anti-Corruption Tools for Societies. Government Information
Quarterly, Vol. 27, p. 268, 2010.
4) Functionality: the design of the technologies to include features (e.g., search, e-government service tracking, accountability measures, etc.) that users desire
In order to bridge the gap and make data truly accessible to the public, government should
provide training for users and test that each aspect of the technical gap is minimized. But
accommodating all users, regardless of technical background, is often neglected. Bertot et al.
find that government transparency websites like www.data.gov are directed toward the more
technically inclined.3 In addition, e-government services generally are limited by difficulties
in organization, structure, search, metadata, and other factors.4
The National Incident-Based Reporting System: Intelligent Federal Data on Crime
The Uniform Crime Reporting (UCR) system dates to 1929. According to the Federal Bureau of
Investigation, it was "conceived in 1929 by the International
Association of Chiefs of Police to meet the need for reliable uniform crime statistics for the
nation." In 1930, the Federal Bureau of Investigation was tasked with collecting, publishing, and
archiving those statistics.5
3 See Bertot et al., Using ICTs to Create a Culture of Transparency: E-Government and Social Media as Openness
and Anti-Corruption Tools for Societies. "This approach is typified by the nascent and ambitious plan by the
Obama administration to make vast amounts of government data available through the www.data.gov site. These
types of transparency initiatives are directed toward the more technically inclined citizen: researchers,
technologists, and civic-minded geeks." (page 268)
4 See Bertot et al., Transparency and Technological Change: Ensuring Equal and Sustained Public Access to
Government Information. "Even for members of the public with internet access, e-government services generally
are limited by difficulties in organization, structure, search, metadata, and other factors. Consider the unrefined
scope and disorganization of the typical results of a search on www.usa.gov. E-government in the United States
simply has not been designed to account for the needs of the users of e-government, particularly the members of
the public seeking information or engagement." (page 373)
5 http://www.fbi.gov/about-us/cjis/ucr/ucr FBI Uniform Crime Reports website, retrieved June 30, 2013
In 1985, Abt Associates6 recommended critical changes to the UCR program in the
"Blueprint for the Future of the Uniform Crime Reporting Program."7 Instead of just collecting
summary data, they proposed that the FBI should collect detailed data on each individual
criminal incident, and make the data more meaningful by attaching demographic information, in
order to provide better insight into the causes and effects of crime. By collecting and providing
detailed information on each incident, the National Incident-Based Reporting System (NIBRS)
would be able to enhance the quantity, quality, and timeliness of data collected by the law
enforcement community, and to improve the methodology used for compiling, analyzing,
auditing, and publishing the collected crime data.
From the start of the program, NIBRS was intended for a variety of types of users. As
the FBI NIBRS program history states:8
"While the Program's primary objective is to generate a reliable set of criminal
statistics for use in law enforcement administration, operation, and management, its data
have over the years become one of the country's leading social indicators. The American
public looks to Uniform Crime Reports for information on fluctuations in the level of
crime, while criminologists, sociologists, legislators, municipal planners, the press, and
other students of criminal justice use the statistics for varied research and planning
purposes."
6 http://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/1995/95sec1.pdf FBI Summary of the Uniform Crime
Reporting Program, retrieved June 30, 2013
7 http://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/2011/crime-in-the-u.s.-2011/aboutucrmain FBI - About
The Uniform Crime Reporting Program, retrieved June 30, 2013
8 http://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/1995/95sec1.pdf Ibid., Summary
However, if the version of the data made available to the American public does not
include all of this demographic data, or at least requires significant additional technical expertise
and time to make the detailed information easily understandable, then the process of integrating
and presenting the data could be improved. In order for data to be effective for research, one
would want both historical data and a sufficient sample size.
This paper focuses on NIBRS/UCR processes for data gathering and analysis regarding
hate crimes. This is because this data is far less available than other UCR data, and does not
fulfill the open e-government accessibility criteria discussed above.9 The original NIBRS/UCR
data for hate crimes is available to the public in two formats. One format is summary data that
omits many of the data fields from the original NIBRS entry, separated into spreadsheets based
on types of victims and offenses, found online.10 This format has three limitations. First, most
of the NIBRS-gathered additional data intended to provide a greater degree of specificity and to
enable more productive research and analysis is not included, in order to present a simplified
view.11 The FBI summary data gives totals for each year, whereas in the original NIBRS, each
incident is its own data point with an incident number and detailed information. Second, the data is
presented for each year in fourteen different spreadsheets, requiring a manual combination of the
data across spreadsheets to analyze and compare hate crime rates between regions and across
9 The National Archive of Criminal Justice Data (NACJD) website does not contain any of the original full incident
UCR hate crime data. See www.icpsr.umich.edu/icpsrweb/NACJD/.
10 http://www.fbi.gov/stats-services/crimestats FBI UCR statistics, including hate crime statistics. Retrieved June 30, 2013.
11 Prior research into UCR Hate Crime data often relies on the FBI summary data, which is far removed from the
raw data, and omits the accompanying demographic data. For example, see Rubenstein, William B., The Real
Story of U.S. Hate Crimes Statistics: An Empirical Analysis. Tulane Law Review, Vol. 78, pp. 1213-1246, 2004.
Available at SSRN: http://ssrn.com/abstract=547883
years. Third, FBI hate crime Tables 1-11 show totals for incidents, offenses and victims by bias
type or offender nationwide; and Tables 12-14 show much more summarized data by individual
cities, counties, other agencies such as universities, and by state totals. This separation and
simplification of location data results in the user being able, for example, to see the total of
religious-bias-motivated crimes in Atlanta, Georgia, but not a more detailed breakdown of those
crimes by religion in Atlanta: not, in other words, how many crimes were
allegedly committed against Jews, Catholics, Protestants or Muslims specifically within a county
or city because of their religious beliefs. The data is summarized and separated to such an
extent that it limits the ability of civil society or law enforcement organizations to understand the
nature of hate crimes in their local communities, and to work with local resources to reduce or
eliminate the incidence of those crimes. This is clearly an example of where the Obama
administration's progress towards truly open e-government is admirable, but the goals of full
functionality and detail of the data versus accessibility and understandability for all public
citizens are not being jointly met. Even the FBI summary totals only go back to 2004, meaning
that the entirety of hate crime data gathered from the period 1991-2003 must be obtained and
interpreted in ASCII. Without the process described here of reformatting that data, it would be
unavailable to any researcher, law enforcement personnel, policy maker, or average citizen
lacking the requisite technical background.
In the second format, the public can obtain detailed, incident-by-incident data including
geographic location from the FBI, but it is in ASCII, a format requiring technical expertise and
translation to make it usable for research and analysis. In order to translate from ASCII into a
more easily understood format like that for Microsoft Excel (.xls or .xlsx), one needs statistical
software like STATA, SPSS, or SAS. These three programs, however, require substantial
technical training in order to teach the user how to translate and analyze data, creating a major
barrier to access for any citizen lacking this training. They are also prohibitively expensive if the
user does not have an educational discount. The cheapest version of STATA, STATA 13/IC,
costs $595 for a single-year license, or $1,195 for a perpetual license without technical support.12
The cheapest version of SPSS, SPSS Standard, costs $2,320 for a fixed term license with a year
of technical support. The cheapest version of SAS is $8,700 for an annual license with a year of
technical support. This again is a prohibitive barrier to the average citizen being able to access
this data.
The full original ASCII dataset is only available through an email request. This second
format provides more detailed data (e.g., the subcategory breakdowns of bias motivations, the
known offenders' races, and the victim types for each agency submitting hate crime data to the
FBI UCR Program) from the FBI UCR Program's Hate Crime Master Files at the Criminal
Justice Information Services Division.13 While this data is comprehensive, it is not provided in a
format that is easy to understand or use. A single page of documentation is provided, which
states in its entirety:
What you should know before you download the Uniform Crime Reporting
(UCR) Program's master files
12 SPSS on the IBM website: https://www-112.ibm.com/software/howtobuy/buyingtools/paexpress/Express?part_number=D0EKZLL%2CD0EEMLL%2CD0EK0LL%2CD0EEJLL&catalogLocale=en_US&Locale=en_US&country=USA&PT=jsp&CC=USA&VP=&TACTICS=%26S_TACT%3D%26S_CMP%3D%26brand%3Dnone&ibm-submit=View+US+prices+%26+buy. SAS:
https://www.sas.com/order/product.jsp?code=PERSANLBNDL. And STATA:
http://www.stata.com/order/new/bus/single-user-licenses/dl/.
13 http://www.fbi.gov/about-us/cjis/ucr/hate-crime/data-collection-manual/view Uniform Crime Reports Hate
Crime Data Collection Guidelines and Training Manual, p. 57, retrieved June 30, 2013.
The UCR Program provides its master files (i.e., raw, unpublished data), data sets
extracted from these master files (as electronic text files), and printouts (in the form of
paper or portable document format files [PDFs]), also extracted from the master files,
upon request. Anyone wishing to obtain any of these data may do so via e-mail at
Before you download master files from this site, you should be aware of the
following facts:
UCR master files are formatted in ASCII only. (Please be aware they are
not available in Microsoft Excel or Access.)
UCR master files must be imported into statistical software, such as SAS
or SPSS, in order to be read properly. The UCR Program does not provide statistical
software in conjunction with the master files; therefore, the data requester must have
access to the appropriate software.
The UCR Program does not provide technical assistance for the master
files. We will provide record descriptions, which will be disseminated with the files, but
staff is not available to assist you with technical glitches such as problems importing
data, etc.
Users should be aware that some of the files (most notably the National
Incident-Based Reporting System [NIBRS] master files) are quite large and typically are
used in a mainframe environment.14
Two university research centers have taken the whole of the original FBI UCR data and
reformatted it, but have normalized the data according to certain assumptions and present the
data in a format that still requires technical expertise and use of expensive statistical modeling
software. The National Archive of Criminal Justice Data (NACJD) of the Inter-University
Consortium for Political and Social Research (ICPSR) at the University of Michigan houses an
alternative version of general UCR data, albeit aggregated to the county level, isolated by year, and
still requiring knowledge of software like STATA or SPSS. Michael Maltz and Joseph
Targonski relay the comments of a reviewer of their research on the inconsistencies of UCR data
in ICPSR format: "One reviewer of an earlier version of this article noted that [t]he weakness
of the ICPSR county-level data file is [obvious] to anyone who carefully reads the ICPSR
codebook. Yet none of the critics of MGLC (referring to a 1998 paper using UCR data, called
More Guns, Less Crime) noted or mentioned it; apparently many users of the data do not read
the ICPSR codebook carefully."15 The NACJD/ICPSR data is not the raw original FBI data, but
comes from the "Crime-by-County" FBI summary file. This crime-by-county file distributes
crime totals from an agency that lies in multiple counties to each county based on the percentage
of the total agency population that lies in each county. In addition, county population figures in
14 Uniform Crime Reporting, "What you should know before you download the Uniform Crime Reporting (UCR)
Program's master files," text document provided with ASCII file database, received by email January 2013 from the
Criminal Justice Information Services Division.
15 Maltz, Michael D., and Targonski, Joseph. A Note on the Use of County-Level UCR Data. Journal of
Quantitative Criminology, Vol. 18, No. 3, pp. 298-300, September 2002.
this data do not include the populations from jurisdictions that failed to report crime data, so the
total county population figures can be incorrect.16 Maltz and Targonski note that the FBI is
aware of the shortcomings of UCR data: "The FBI publishes
warnings about comparing cities and counties; they can be found in every issue of their annual
report, Crime in the United States. The warning reads, 'These rankings [of cities and counties
based on their Crime Index figures] lead to simplistic and/or incomplete analyses which often
create misleading perceptions.'"17
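The population-weighted allocation described above for the crime-by-county file can be illustrated with a short sketch. It assumes the rule is a simple proportional split with no rounding; the function name and all figures are hypothetical and are not taken from the NACJD/ICPSR documentation.

```python
from typing import Dict


def allocate_by_population(agency_crimes: float,
                           population_by_county: Dict[str, int]) -> Dict[str, float]:
    """Split one agency's crime total across counties by population share.

    Assumption: each county receives crimes in proportion to the share of the
    agency's total population that lives in that county.
    """
    total_population = sum(population_by_county.values())
    if total_population == 0:
        raise ValueError("agency population is zero; shares are undefined")
    return {
        county: agency_crimes * population / total_population
        for county, population in population_by_county.items()
    }


# Hypothetical agency: 100 reported crimes, 40,000 residents across two counties.
shares = allocate_by_population(100, {"County A": 30_000, "County B": 10_000})
print(shares)
```

Under these made-up figures, three quarters of the agency's crimes are attributed to County A. If a neighboring jurisdiction in County B failed to report, neither its crimes nor its population would enter the county totals, which is the distortion Maltz and Targonski describe.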
The National Consortium on Violence Research (NCOVR) at Carnegie Mellon also
houses UCR data, but it is only available via requests to an Oracle database, requiring knowledge
of SQL. Maltz and Targonski go so far as to say, "Such factors can often be accounted for by
analyzing the data using sophisticated computer models. But these models cannot compensate
for missing data of the type and magnitude encountered in the UCR ... Due to the problems
caused by the imputed data, we conclude that county-level crime data, as they are currently
constituted, should not be used, especially in policy studies."18 But no source provides
comprehensive data in the form of the original raw incident reports in the NIBRS system in a
way that an average citizen can use.
On behalf of the public, journalists and academic researchers who may not have access to
the mainframe environment recommended by the Criminal Justice Information Services
Division, we reorganized and streamlined the data, and conducted an assessment of its
availability and quality using commonly available software: a combination of Microsoft Excel,
16 Ibid., p. 300.
17 Ibid., p. 300.
18 Ibid., p. 300.
to contact us so that we can correct them. The file is also very large, approximately 25
megabytes, and so can take a long time to load and save on many computers.
Step 1: Make ASCII Code Readable for Anyone
The Uniform Crime Reporting ASCII dataset for hate crimes obtained from the FBI
consists of 155,338 data records, separated by year, for 1991 through 2011. The data must be
translated using the FBI-provided codebook, the Hate Crime Yearly Master Record Description
(1995). This document explains which digits of the string of numbers and letters pertain to what
information. Otherwise, the data appears as shown below:
BH01ALAST0000 19910430 MONTGOMERY, AL AL8 D632N
BH01AL0010000 1991043019910101BIRMINGHAM, AL AL9 A632N
BH01AL0010100 1991043019910101BESSEMER, AL AL4 631N
Individual data within each cell appears meaningful, and indeed it is. For example, the first eight
digits of the second cell are the date, in a year/month/day format. However, the columns are not
separated in the document according to where each data field should end, as one can see above
from the date/date/city name combination in the second column.
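As a minimal sketch of the point above, the fused date/date/city cell can be split by position once one knows that each date occupies eight characters. The function names are illustrative, and the meaning of each date is defined by the codebook rather than by this example.

```python
from datetime import date, datetime
from typing import Optional, Tuple


def parse_yyyymmdd(text: str) -> date:
    """Parse an eight-digit year/month/day string such as '19910430'."""
    return datetime.strptime(text, "%Y%m%d").date()


def split_fused_cell(cell: str) -> Tuple[date, Optional[date], str]:
    """Split a fused cell into (first date, optional second date, remaining text)."""
    first = parse_yyyymmdd(cell[:8])
    candidate = cell[8:16]
    if len(candidate) == 8 and candidate.isdigit():
        return first, parse_yyyymmdd(candidate), cell[16:].strip()
    # No second eight-digit run: treat everything after the first date as text.
    return first, None, cell[8:].strip()


if __name__ == "__main__":
    # Cells copied from the sample rows shown above.
    print(split_fused_cell("1991043019910101BIRMINGHAM, AL"))
    print(split_fused_cell("19910430 MONTGOMERY, AL"))
```

The second call covers the first sample row, where only one date appears before the agency name.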
Formatting the large UCR hate crimes ASCII file into a readable document
requires one of three tools: a dictionary file within statistical software such as STATA, a schema in
Microsoft Access, or manual adjustment of column widths in the Text Import Wizard
in Microsoft Excel. We relied primarily on STATA and Excel, based on the data
definitions provided by the UCR Hate Crime Yearly Master Record Description (or codebook).
Other researchers may be able to provide even simpler approaches. While the use of a dictionary
file would assist in translation of such large datasets in STATA, some individual years stray from
the attached codebook dictionary file (given as an image PDF, not text, so it also must be manually
re-created), making that approach arguably more time-consuming than simply readjusting
column widths on a year-by-year basis. Either way, individual years can be off
from the codebook dictionary by a digit or two, meaning each year has to be visually inspected
digit-by-digit in the translation and integration process.
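For readers who prefer a scripted alternative to the dictionary file or the Text Import Wizard, the idea of a column layout with per-year corrections might be sketched as follows. This is an illustration under stated assumptions, not the procedure used for this paper: the first three field spans are read off the sample rows above, while the city_name span, the 1993 shift, and all function names are hypothetical placeholders rather than codebook values.

```python
from typing import Dict, Tuple

Layout = Dict[str, Tuple[int, int]]

# Offsets are 0-based with an exclusive end. record_type, state_code, and ori
# follow the sample rows (assumed: two characters, two characters, nine
# characters); the city_name span is a hypothetical placeholder.
BASE_LAYOUT: Layout = {
    "record_type": (0, 2),
    "state_code": (2, 4),
    "ori": (4, 13),
    "city_name": (13, 43),
}

# Hypothetical corrections found by visual inspection:
# year -> (offset at which the drift begins, number of characters to shift).
YEAR_SHIFTS: Dict[int, Tuple[int, int]] = {
    1993: (13, 1),  # made-up example, not an actual finding
}


def layout_for_year(year: int) -> Layout:
    """Return the base layout, shifted where a given year strays from it."""
    shift_from, shift_by = YEAR_SHIFTS.get(year, (0, 0))
    layout: Layout = {}
    for name, (start, end) in BASE_LAYOUT.items():
        if shift_by and start >= shift_from:
            start, end = start + shift_by, end + shift_by
        layout[name] = (start, end)
    return layout


def parse_line(line: str, layout: Layout) -> Dict[str, str]:
    """Slice one fixed-width line into named, whitespace-trimmed fields."""
    return {name: line[start:end].strip() for name, (start, end) in layout.items()}


line_1991 = "BH01AL0010000" + "BIRMINGHAM, AL".ljust(30)
line_1993 = "BH01AL0010000" + " " + "BIRMINGHAM, AL".ljust(30)  # one stray character
print(parse_line(line_1991, layout_for_year(1991)))
print(parse_line(line_1993, layout_for_year(1993)))
```

Keeping the corrections in one table means the visual inspection still has to happen for each year, but its results are recorded once instead of being repeated by hand.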
Here is a detailed description of the process used to integrate these disparate documents
into a single, streamlined database for the general user with a less technical background in ASCII
or STATA. Below is a graphic of the Text Import Wizard in Microsoft Excel, which we used
for the first step in the reformatting process.
The process must be repeated once for the Batch Header data (rows containing
information about a reporting agency), and once for the Incident Report data (rows containing
information about a particular incident), as the column definitions differ between the two.
Incidents are arranged in the rows below the corresponding Batch (agency that reported the
incident), rather than in the same row. The latter format is more useful for the researcher,
because it would link batch (geographic and demographic) information with the details of the
incident in the same record. An agency can be any of several types of local reporting
organization: city, county, university, etc. Given the current configuration of the file, using the
codebook alone to define the columns in order to make the data readable is insufficient, because
correcting the columns for the Batch Header data renders the Incident Report data
meaningless, as the columns are defined in different places for each type of entry. For example,
see below for hate crime data reported for the year 1991 for Hoover, AL:
BH 1 AL0011200 1991 430 19910101 HOOVER AL 3 6 3 1
IR 1 AL0011200 K0Y L6XY ED4K2007 1010N400101B13
IR 1 AL0011200 3A0G Q-FO 771G2007 1226N400100U29
Two hate crime incidents were reported by the Hoover, Alabama agency. However,
when formatting columns for the BH data type, the IR data underneath loses its meaning, as
the column breaks are different for each data type. Therefore, in order to compare trends in
incident reports (hate crimes) across the data contained within the batch header (such as
population, reporting agency name, or locality type, such as county or city), the Incident
Report data must be manually re-arranged left-to-right in the same row with its corresponding
locality information, rather than in a separate row beneath it.20
20 Again, a dictionary file would solve this issue, but some years are off by a digit or two.
This was accomplished with a lookup function in Microsoft Excel, keyed on the one
common denominator between the IR and BH data, the Originating Agency Identifier
(ORI) number. The ORI number is the unique identifying federal number pertaining to each
reporting local agency. The end result is a spreadsheet where an individual row entry contains
both information on the details of each individual hate crime (incidents) as well as data
describing the agency and associated locality that reported it.21
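The same lookup can be sketched outside Excel. The example below assumes the rows have already been split into fields and that each ORI has a single Batch Header row within the file being processed; the field names (agency_name, agency_state, incident_id) and the incident identifiers are hypothetical, while the ORI and agency come from the Hoover example above.

```python
from typing import Dict, List, Optional


def join_incidents_to_agencies(
    records: List[Dict[str, str]],
) -> List[Dict[str, Optional[str]]]:
    """Attach Batch Header (BH) details to each Incident Report (IR) by ORI."""
    # Index the Batch Header rows by ORI, the only field shared by both row types.
    agencies = {r["ori"]: r for r in records if r["record_type"] == "BH"}
    joined: List[Dict[str, Optional[str]]] = []
    for r in records:
        if r["record_type"] != "IR":
            continue
        agency = agencies.get(r["ori"], {})  # empty if no header was found
        joined.append({
            "ori": r["ori"],
            "incident_id": r["incident_id"],
            "agency_name": agency.get("agency_name"),
            "agency_state": agency.get("agency_state"),
        })
    return joined


records = [
    {"record_type": "BH", "ori": "AL0011200", "agency_name": "HOOVER", "agency_state": "AL"},
    {"record_type": "IR", "ori": "AL0011200", "incident_id": "incident-1"},
    {"record_type": "IR", "ori": "AL0011200", "incident_id": "incident-2"},
]
flat = join_incidents_to_agencies(records)
for row in flat:
    print(row)
```

Each output row now carries the incident together with the agency that reported it. When several years are combined, the lookup key would likely need to include the year as well, since an agency's header data can change from year to year.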
Step 3: Reformatting Codes into Text
Next, certain data fields, even after formatting, still require the codebook to interpret their
meaning. The location where the hate crime occurred, for example, is a two-digit code:
21 Using a dictionary file in STATA is likely the most efficient way to do this. This requires knowledge of STATA
programming using a dictionary file. Doing so would still require using the ORI (the only common denominator
between the IR and BH data).
(Reproduced from Hate Crime Yearly Master Record Description, p. 22)
We translated the values for this variable and for similar variables across the entirety of
the reported incidents from 1991-2011, using STATA and value labels based on the Master
Record Description. Researchers working with this data need some sort of program (like
STATA) to automate this renaming process, because there are more than 150,000 records in the
hate crime UCR data, each with over 50 data fields. So, while an incident originally appeared as
follows:
loccode01    populationgroup    ucroffensecode01    biasmotivation01
12           9A                 290                 25
It now appears in our version as this:
Location               Population Group                UCR Offense Code                            Bias Motivation
Grocery/Supermarket    MSA Counties 100,000 or over    Destruction/Damage/Vandalism of Property    Anti-Other Religion
These simple translation methods fulfill the program's intention of making this data more
accessible to the media, researchers, local officials and the general public.
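The code-to-text translation shown above amounts to a lookup table per variable. A minimal sketch follows; it contains only the four code/label pairs from the example incident, whereas a working table would need every value listed in the Master Record Description. The names VALUE_LABELS and apply_labels are illustrative.

```python
from typing import Dict

# Only the pairs shown in the example incident; the full tables come from
# the Hate Crime Yearly Master Record Description.
VALUE_LABELS: Dict[str, Dict[str, str]] = {
    "loccode01": {"12": "Grocery/Supermarket"},
    "populationgroup": {"9A": "MSA Counties 100,000 or over"},
    "ucroffensecode01": {"290": "Destruction/Damage/Vandalism of Property"},
    "biasmotivation01": {"25": "Anti-Other Religion"},
}


def apply_labels(record: Dict[str, str]) -> Dict[str, str]:
    """Replace each coded value with its text label, leaving unknown codes as-is."""
    return {
        field: VALUE_LABELS.get(field, {}).get(code, code)
        for field, code in record.items()
    }


incident = {
    "loccode01": "12",
    "populationgroup": "9A",
    "ucroffensecode01": "290",
    "biasmotivation01": "25",
}
labeled = apply_labels(incident)
print(labeled)
```

Leaving unrecognized codes unchanged, rather than dropping them, makes gaps in the label table visible in the output so they can be checked against the codebook.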
Data Label (column) Definitions for Final Recompiled MS Excel Database
From left to right (with page in the codebook where the value label is provided in
parentheses):
1) Incident Date: the date the incident occurred (page 18 of the codebook)
2) Incident Number: the unique identifying number given to each hate crime incident by the
local reporting agency (pg. 17)
3) Data Source: the medium by which the FBI received the incident data (pg. 17)
4) Agency Name (brought over from Batch Header data): the name of the local reporting
agency (note that this is not always identical to City Name) (pg. 15)
5) ORI: Originating Agency Identifier (brought over from Batch Header data): unique
number assigned to each local reporting agency (pg. 6)
6) City Name (brought over from Batch Header data): name of the city covered by the ORI (pg. 7)
7) Population (brought over from Batch Header data, Current Population) (pg. 14)
8) Agency Indicator (brought over from Batch Header data): distinguishes among
counties/colleges/cities for local reporting agency (pg. 12)
9) Core City (brought over from Batch Header data): indicates whether or not the ORI is the
core city of a Metropolitan Statistical Area (pg. 12)
10) Country Division (brought over from Batch Header data): geographic division in which
the state of the local reporting agency is located (pg. 9)
11) Country Region (brought over from Batch Header data): a higher-level agglomeration of
the Country Division field above, the geographic region in which the above
geographic division is located (pg. 11)
12) Population Group (brought over from Batch Header data): the population category of the
ORI (pg. 8)
13) Original ASCII Victims Total: the Total Number of Individual Victims field in the
incident data (pg. 18)
14) New Victims Total: Sum of Victims for Each Offense: As noted below, the Total
Number of Individual Victims for a given year is often smaller in this detailed data than
it is in the value for the similarly titled variable presented in FBI summary data at the
UCR website. Field 13 is also less for a given year than the total derived by simply
adding the individual victims for each individual offense within an incident. The latter
approach may overstate the true number of victims, but we have calculated it and
included it side-by-side with the original total number of victims for comparison and
audit purposes.
15) Vics Difference: the difference between the Original ASCII Victims Total (or Total
Number of Individual Victims, or TotalVics in the codebook and original ASCII
code) and the newly recalculated New Victims Total
16) Total Offenders: the total number of offenders in the incident (pg. 19)
17) Offenders Race (pg. 19)
The following 5 fields repeat for each individual offense reported within a single
incident:
18) Bias Motivation: for example: Anti-Black, the nature of the hate crime bias (pg. 23)
19) Types of Victims: for example: individual, business, or society/public (pg. 24)
20) UCR (Uniform Crime Reporting) Offense: the UCR criminal offense type (pg. 20-22)
21) Location Code: the location where the offense occurred (pg. 22)
22) Number of Victims: the number of victims pertaining to that individual offense within the
reported incident (pg. 22)
The last 3 fields appear once:
23) State Code (brought over from Batch Header data): the 2-digit code used by the FBI to
identify the state (pg. 5)
24) FIPS Code (brought over from Batch Header data): Federal Information Processing
Standards County Code (the codebook refers the user to FIPS Publication 55 for a legend
for these county codes) (pg. 16)
25) State Abbreviation
-
7/27/2019 E-Government for Uniform Crime Reporting Data and How to Make it More Accessible
19/27
19
Mini-Tutorial For the Excel Novice: Using Filters for the Beta-Version Hate Crime Incident
Report Spreadsheet
The combined dataset of all hate crimes from 1991 to 2011 is large. Academic
researchers and interested citizens analyzing crime may benefit from isolating the data by
attribute. One simple way to isolate the data is to use filters in Microsoft Excel. The downloadable
data spreadsheet we have prepared is pre-sorted first by year. The spreadsheet has the
AutoFilter function of Microsoft Excel already applied. This feature can be turned off by
clicking on the Data tab at the top of the screen (for Excel 2010) and then deselecting the
Filter option. With filters left on, click on a drop-down arrow and select or deselect the data to
be shown as required. Below is a graphic showing how to restrict the data to show only 2004:
Simply click on the drop-down arrow for the Incident Date (column A) and check only the
2004 box, and the data shown is restricted to only those incidents occurring in 2004. The
data is provided with incidents sorted by year, and within each year, by incident number. It is
important to remember that if the user has multiple years showing and then sorts by incident
number or some other factor, the data for each year will be intermingled, and the years will
no longer be in contiguous order if the spreadsheet is saved. To re-sort the data by year,
remove all restrictions on non-date filters, and then sort again by year (sort oldest to newest).
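For readers who prefer to work outside Excel, the same two operations (restricting to one year, and restoring the default order of year followed by incident number) can be sketched in a few lines of Python. The rows and field names below are made-up illustrations, not records from the dataset.

```python
from datetime import date

# Hypothetical sample rows; real rows would be read from the spreadsheet.
incidents = [
    {"incident_date": date(2005, 3, 14), "incident_number": "B-002"},
    {"incident_date": date(2004, 7, 1), "incident_number": "A-010"},
    {"incident_date": date(2004, 1, 9), "incident_number": "A-003"},
]


def filter_by_year(rows, year):
    """Keep only incidents that occurred in the given year (like checking one filter box)."""
    return [r for r in rows if r["incident_date"].year == year]


def restore_default_order(rows):
    """Sort by year, then by incident number within each year."""
    return sorted(rows, key=lambda r: (r["incident_date"].year, r["incident_number"]))


only_2004 = filter_by_year(incidents, 2004)
restored = restore_default_order(incidents)
print(len(only_2004), [r["incident_number"] for r in restored])
```

Both functions return new lists and leave the original rows untouched, so the intermingling problem described above cannot be saved by accident.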
Using the Tool for Verification of UCR ASCII Source Data Compared to FBI UCR Online
Summary Reports
Maltz and Targonski note that the UCR is a voluntary program. Local agencies are not
required to submit any type of criminal data. Thus, the FBI has no control over the reliability,
accuracy, consistency, timeliness, or completeness of the data they receive.22 As a
consequence, some agencies may submit data for only a portion of a year. In order to keep data
comparable from one year to another, the FBI often imputes full-year trends from a portion of a
year's reported data. That is, they extrapolate the rate for a portion of the year out to the full
year.
This combined data tool allows the researcher to compare the detailed incident reports to
the summary reports posted online at the FBI UCR website for hate crimes. The FBI summary
reports (available in spreadsheet format only from 2004-2011, and in pdf files for earlier years)
always state that the number of hate crime victims and hate crime offenses for each year are
identical.23 In the source data, the number of victims and offenses are not identical.
22Maltz elaborates why so much data never makes it to the federal database: natural disasters, budgetary
restrictions, personnel changes, inadequate training, and conversion to new computer or crime reporting systems
all have affected the ability of police departments to report consistently, on time, completely, or at all. And some
agencies may not fill out crime reports simply because they rarely have any crime to report. Maltz and Targonski,
A Note on the Use of County-Level UCR Data, p. 299.
The data appears to be aligned or normalized before being presented in the online
summaries, in a process that is not explained to the public user of the online summary data. A
possible cause of the discrepancy is the ambiguity in recording multiple offenses
within a single incident, and in how to count the victims in the total victims figure. The source
data total victims number is always smaller than the equivalent figure in the FBI summary
report. For example, in the FBI public summary report for 2004, there were 9,514 total victims.
However, if one adds the total victims field for each incident in 2004, first by selecting only
2004 for Incident Date and then selecting the Original ASCII Victims Total column
(originally named Total Victims in the codebook), the sum given at the bottom right of the
Microsoft Excel window indicates that there were only 5,390 victims.
Sometimes the total could be understated by the local reporting agency. In the
source data, some incidents have an Original ASCII Victims Total (total victims) number
much smaller than the sum resulting from adding the number of victims named in each separate
offense within that single incident. It is unclear whether the true number of total victims
for an incident should be computed as the sum of the number of victims for each offense within
that incident, or whether some of the number of victims columns are redundant. Pending further
clarification on the actual process, the data could be interpreted as showing the same victims
being counted multiple times for a single incident if there are multiple offenses. The process appears to be
equally unclear, or at least inconsistently applied, among local reporting agencies. Logically, one can
hypothesize several scenarios where the number of victims and offenses would not be identical
every year. One offense could have any number of victims, or there could be many offenses
with a single victim. For example, a man threatened John Doe, kicked a small dent in his car,
shook him when John Doe exited his car, and called him a racial epithet. There are several
offenses: intimidation, simple assault, destruction of property, all resulting from a single
incident. If one is supposed to sum the victim totals for each offense in the incident data, this
counts John Doe as a victim three times. This is not intended as a normative evaluation of such a
process, if it is indeed the process being followed. It is unclear from the data, however, whether this is what is
happening at the local reporting agency level, whether it is what the FBI intends as the correct
methodology, or whether some local agencies are recording the total number of victims this way while
others are following some other unclear approach.24
If we instead compute the total number of victims by adding each of the number of
victims subtotals for each offense, the total is actually the same as the FBI summary data, at
9,528. As a result of this tracing process for 2004, we decided to add a new column that
computes the total number of victims by adding the number for each offense.
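The recalculation behind the new column can be illustrated with the John Doe scenario described earlier. This is a minimal sketch: the field names are hypothetical, and the sign convention for the difference (new total minus original total) is an assumption, since only the existence of a difference column is specified above.

```python
def new_victims_total(offense_victims):
    """Sum the number-of-victims value recorded for each offense in one incident."""
    return sum(offense_victims)


# Hypothetical incident modeled on the John Doe example: one person is the
# victim of three offenses (intimidation, simple assault, destruction of property).
incident = {
    "original_victims_total": 1,    # total as reported in the incident record
    "offense_victims": [1, 1, 1],   # one victim listed under each offense
}

new_total = new_victims_total(incident["offense_victims"])
# Assumed convention: difference = recalculated total - original total.
vics_difference = new_total - incident["original_victims_total"]
print(new_total, vics_difference)
```

A nonzero difference flags incidents where the two counting approaches disagree, which is the audit purpose of keeping both columns side by side.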
The database enables many other types of research. For example, now hate crime totals
for each locality for 1991-2011 can be recomputed on a per capita basis from a single source,
as the data tool incorporates the population totals from the locality as reported by the last census
within each incident row. Basic population-indexed rates of change for offenses against each
vulnerable group can now be computed by any user, as well as a more robust set of summary
statistics.
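As an illustration of the per capita computation, the sketch below derives a rate per 100,000 residents and a percent change between two years. The incident counts and populations are invented for the example, and the per-100,000 scaling is a common convention rather than one prescribed by the data tool.

```python
def rate_per_100k(incident_count, population):
    """Incidents per 100,000 residents; None when no usable population is reported."""
    if population <= 0:
        return None
    return incident_count * 100_000 / population


def percent_change(old_rate, new_rate):
    """Percent change from old_rate to new_rate."""
    return (new_rate - old_rate) / old_rate * 100


# Hypothetical locality: counts and populations are made up for illustration.
rate_year_1 = rate_per_100k(12, 240_000)
rate_year_2 = rate_per_100k(18, 250_000)
change = percent_change(rate_year_1, rate_year_2)
print(rate_year_1, rate_year_2, round(change, 1))
```

Returning None for a zero population avoids a division error for agencies whose population field is blank or zero.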
Omissions in Provided Documentation
According to the 1995 edition of the Hate Crime Yearly Master Record Description, or
codebook, bias motivation fields for each hate crime are coded 11-15 for Anti-Racial biases,
24The current form for local agencies to fill out a hate crime incident report is located at:
http://www.fbi.gov/about-us/cjis/ucr/reporting-forms/hate-crime-incident-report-pdf.
21-27 for Anti-Religious, 32-33 for Anti-Ethnicity/National Origin, and 41-45 for Anti-Sexual Orientation.
However, codes for Anti-Disability (51 for Anti-Physical Disability and 52 for Anti-Mental
Disability) are used in the Hate Crime data with no explanation of these codes provided in the
codebook (we have recoded them into the appropriate text in the streamlined data tool). The
actual definitions are included in the more recent Hate Crime Data Collection Guidelines and
Training Manual (2012) from the Criminal Justice Information Services Division, FBI, at
http://www.fbi.gov/about-us/cjis/ucr/hate-crime/data-collection-manual. The FBI Hate Crime
Technical Specification, Version 2.1 (dated 2012), also provides an explanation for these codes.
The new technical specification is helpful in solving some of the unexplained discrepancies
found in the older codebook that the FBI emails in response to public requests. However, the
new specification is intended for submissions from 2012 and after, where the data is in a slightly
different format and the strings of digits are slightly longer (and so yet again, it is near-
impossible to create a dictionary file in STATA). Even the 2012 documentation cannot solve
some issues related to the original codebook mentioned here.
For bias motivation, code 31 (Anti-Arab) is still occasionally used incorrectly
by local reporting agencies, even though it was eliminated by the FBI (and
presumably replaced with some other bias category). These potentially confusing and
unexplained overlaps between ethnicity and different religions as bias categories are not
addressed in the codebook either.
A third coding inconsistency is the inclusion of the 23* field, presumably a larceny
offense, as all larceny offenses start with a UCR code of 23. However, there is already a
separate All Other Larceny category, and the 23* offense is not mentioned in the codebook
(nor the 2012 Hate Crime Data Collection Guidelines and Training Manual or the 2012
Technical Specification noted above). We have recoded these entries as Additional Larceny
Offense.
Another inconsistency occurs within the Offenders Race data category. According to
the 1995 codebook, both Black and Unknown are coded as B. This may overstate the
number of African-American hate crime offenders, as entries intended as Unknown (U) may
have been coded as B for Black and counted accordingly. While every year (1991-2010) has
a number of U entries for offender's race, this neither proves nor disproves that some
B entries may have been intended as U. There remains a possibility that the number of
African-American hate crime offenders is overstated, if local agencies coded entries based on
incorrect guidelines provided in the codebook. There is no such typo in the updated
documentation, but that does not rule out incorrectly coded entries in the old
data.
The Hate Crime Record Description (codebook) states on page 7 that for the Batch
Header (agency and other area demographic information) data, there is a placeholder from digit
14 to 25 for incident number. This blank space is intended to ensure that the batch and
incident data, while regrettably organized top-to-bottom, rather than right-to-left, will at least
line up by column width. Unfortunately, none of the data has this placeholder in the batch
data. Digit 14 is actually the beginning of the date the agency began submitting information to
the FBI, while the codebook tells the user that this information begins at digit 26. No
documentation is provided that addresses this inconsistency. The user has to deduce it without
assistance. So, if one intended to use a formal dictionary file in a program like STATA, the
codebook would again provide an incorrect guide. While the 2012 update to the codebook
(which users have to find themselves, as the FBI only provides the 1995 version) provides
definitions for several value labels not defined in the 1995 codebook, it also follows a different
column-width-digit definition. This is why we had to go through the painstaking process of
setting column widths manually, because the data often strays slightly from the codebook.
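The column-offset problem described above can be made concrete with a short sketch of fixed-width parsing. It assumes a layout table of 1-based start and end positions as a codebook might list them, then shifts every field located after the missing digit 14-25 placeholder. Only the placeholder span and the digit-26 start of the date field come from the discussion above; the field names, the other positions, and the sample record are hypothetical.

```python
# 1-based, inclusive (start, end) positions as a codebook might list them.
# Hypothetical: only the digit-26 start of the date field is taken from the text.
CODEBOOK_LAYOUT = {
    "leading_fields": (1, 13),
    "date_agency_began": (26, 33),
}

# Placeholder documented in the codebook (digits 14-25) but absent from the data.
PLACEHOLDER = (14, 25)


def shift_layout(layout, gap):
    """Move every field that starts after the missing gap left by the gap's width."""
    gap_start, gap_end = gap
    width = gap_end - gap_start + 1
    shifted = {}
    for name, (start, end) in layout.items():
        if start > gap_end:
            start, end = start - width, end - width
        shifted[name] = (start, end)
    return shifted


def parse_fixed_width(line, layout):
    """Slice one record into named fields using 1-based inclusive positions."""
    return {name: line[start - 1:end].strip() for name, (start, end) in layout.items()}


actual_layout = shift_layout(CODEBOOK_LAYOUT, PLACEHOLDER)
sample = "X" * 13 + "19910101"  # made-up record, not real FBI data
record = parse_fixed_width(sample, actual_layout)
print(actual_layout, record)
```

Keeping the codebook positions and the correction separate makes the adjustment auditable: the shift is stated once rather than being buried in hand-edited column widths.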
The agency indicator field, described on page 12 of the codebook, indicates whether
the reporting agency is a city (1), county (2), university or college (3), state police (4),
or is covered by another agency (0). However, several agencies have the number 5, 6, or 7 in
this field. There is also no explanation in the 2012 documentation.
The type of victim field is an 8-character field with the letter B standing for a
business, I for an individual, and so on (the list is on page 24 of the codebook). However,
often the entry for this field consists of multiple applicable initials, separated by different
numbers of spaces. Once again, no explanation is provided in the codebook. These are most
likely not instances where the multiple victim types were intended as different individual victims
associated with different offenses, because there are already separate data entries for separate
offenses, and these multiple victim types in question are listed within the same single offense.
Offenses with multiple victim types listed have been recoded as though the multiple initials are
not an error. So, B, G is now recoded as Business/Government, and I, G, S as
Individual/Government/Society/Public, and so on. This problem is specific to the 1991-2011
data, and so the Data Collection Manual and 2012 Technical Specification can provide no
assistance.
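The recoding rule for multi-initial entries can be sketched as follows. The four initials and their labels are the ones implied by the two recoded examples above; the function name is hypothetical, and any initial not in the table is passed through unchanged rather than guessed at.

```python
# Labels implied by the recoded examples above (B, G, I, S).
VICTIM_TYPE_LABELS = {
    "B": "Business",
    "G": "Government",
    "I": "Individual",
    "S": "Society/Public",
}


def recode_victim_types(raw_field):
    """Turn an 8-character field such as 'I G  S' into 'Individual/Government/Society/Public'."""
    # Initials may be separated by varying numbers of spaces (or commas).
    initials = raw_field.replace(",", " ").split()
    return "/".join(VICTIM_TYPE_LABELS.get(code, code) for code in initials)


print(recode_victim_types("B   G    "))
print(recode_victim_types("I G  S"))
```

Splitting on whitespace of any width is what absorbs the inconsistent spacing between initials in the source data.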
The Population Group codes (listed on page 8 of the codebook) pertain to the type of
population district (size and type) covered by the ORI. However, no definition for the code 2C
is listed in any document we found, and such a code appears in the data throughout. These
entries have simply been left as 2C.
The Location Codes in the codebook only cover 01-25, but there are entries from 37-50
in the data. In order to correctly interpret these entries, one needs to track down the current
form for recording hate crime incidents, which includes the correct codes for 37-50, and is
located at: http://www.fbi.gov/about-us/cjis/ucr/reporting-forms/hate-crime-incident-report-pdf.
Looking Ahead: The UCR Redevelopment Project
Many of the issues created by using a difficult-to-interpret format like ASCII for
submissions of UCR data from source agencies can be solved with a faster, cheaper, less
paper-dependent format, such as Extensible Markup Language, or XML. The FBI has begun to
implement an XML-based reporting solution to help check submissions efficiently for basic
logical tests (such as the total of the number of victims agreeing with the sum of the number of
victims for each individual offense). An XML-based system will also enable easier creation of a
more powerful data query tool similar to the one which we have created here for the legacy data.
However, there is no mention on the FBI website as to how the legacy data will be converted or
integrated with the new format. Failure to do so will result in the loss of the ability to research
crime data for those years (1991-2011), at least within a single tool, and with the format
standardized. Also, streamlining the data filing/collection, verification, and publication
processes does not address any ambiguity for local source agencies as to how certain data should
be counted (validation).
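To show what such a basic logical test might look like, the sketch below checks one incident record for agreement between the declared victim total and the sum of victims across offenses. The XML element names and the sample record are entirely hypothetical; they are not taken from the FBI's specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element names are invented for illustration only.
SAMPLE = """
<incident>
  <totalVictims>1</totalVictims>
  <offense><victims>1</victims></offense>
  <offense><victims>1</victims></offense>
</incident>
"""


def victims_total_is_consistent(incident_xml):
    """Return True when the declared total equals the sum of per-offense victims."""
    root = ET.fromstring(incident_xml.strip())
    declared = int(root.findtext("totalVictims"))
    summed = sum(int(offense.findtext("victims")) for offense in root.findall("offense"))
    return declared == summed


print(victims_total_is_consistent(SAMPLE))
```

A check of this kind only detects disagreement; whether the declared total or the per-offense sum is the intended figure remains the counting question discussed above.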
Regrettably, a March 4, 2013 update notes that work on the contract has stopped. The
System Test Readiness Review (S-TRR), planned for 10/29/2012, has not been completed, and
delays in the delivery schedule may conflict with the 11/18/2013 contract expiration date. These
delays have led the FBI to abandon for the moment the concurrent project of ensuring that the
XML-based filing solution currently under development would be usable for both state and
federal agencies. Researchers should look for future updates for the new UCR system at the
UCR Redevelopment Project website.25
Additional Resources
The Hate Crime Yearly Master Record Description (codebook)26 provided by the FBI
can be requested from the Criminal Justice Information Services Division. This document also
contains definitions of the types of data collected by the FBI. More detailed information on each
of these categories of data presented can be found in the U.S. Department of Justice, Office of
Justice Programs, Bureau of Justice Statistics, Special Report: Hate Crimes Reported in NIBRS,
1997-1999 at http://bjs.gov/content/pub/ascii/hcrn99.txt. Other useful sources include the Hate
Crime Data Collection Guidelines and Training Manual (2012) from the Criminal Justice
Information Services Division, FBI, at
http://www.fbi.gov/about-us/cjis/ucr/hate-crime/data-collection-manual. The data submission
specifications for local reporting agencies can be found
at www.fbi.gov/about-us/cjis/ucr. Finally, the original report establishing the purpose of the
modern NIBRS system is located at https://www.ncjrs.gov/pdffiles1/bjs/98348.pdf.
25http://www.fbi.gov/about-us/cjis/ucr/ucr-redevelopment-project. UCR Redevelopment Project. Retrieved July
15, 2013.
26http://www.asucrp.net/FBI%20Manuals%20and%20Addendums.html