Data Quality

The integrity and reliability of data-based analysis and reporting depend, in large part, on the quality of the underlying data. Quality control of crime reporting data should begin at the local level, at the points of data collection, data entry, and data processing. This section gives an overview of the FBI-required edits. Related screens explain the significance of local-level quality control in more detail and give examples, with SPSS code, of verifying the accuracy of local data sets.

For NIBRS, the FBI offers multiple levels of data quality controls and tests. Before participating in NIBRS, agencies are required to submit data on magnetic media for testing, and agency participation depends upon the submission of accurate data. The FBI provides detailed documentation of the coding and submission requirements: data element definitions; specification of the valid data values for each element; lists of the mandatory, conditional, and optional data elements; specification of which data segments are required under what circumstances; required sequencing of data segments; data submission schedules; and resubmission guidelines and controls.

When the FBI receives a data submission, it runs an extensive series of data quality checks and rejects any incident report in which errors are found. The checks cover: value type and position within each field; presence of values for mandatory and conditional data elements; use of valid values; duplicate values where multiple valid values are permitted for a data element; use of logical values and cross-checks between related data elements (e.g., age of child is less than age of parent); presence of required data segments; duplication of segments; duplicate reports; number and type of segments; segment sequencing; and appropriate links between segments (e.g., an offense segment requires a victim segment).
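Several of these checks can be run locally before data are submitted. The following is a minimal Python sketch, not the FBI's edit program: the record layout, the field names (incident_id, offenses, victims, relationship, offender_age), and the code lists are hypothetical simplifications chosen for illustration, and the actual NIBRS specification is far more detailed.

```python
# Illustrative only: field names and code lists are hypothetical and much
# simpler than the FBI's actual NIBRS specification.
MANDATORY_FIELDS = ("incident_id", "incident_date", "offenses", "victims")
VALID_SEX_CODES = {"M", "F", "U"}


def check_incident(incident):
    """Return a list of error messages; an empty list means no errors found."""
    errors = []

    # Presence of values for mandatory data elements.
    for field in MANDATORY_FIELDS:
        if incident.get(field) in (None, "", []):
            errors.append(f"missing mandatory element: {field}")

    offenses = incident.get("offenses") or []
    victims = incident.get("victims") or []

    # Link between segments: an offense segment requires a victim segment.
    if offenses and not victims:
        errors.append("offense segment without a victim segment")

    # Duplication where multiple values are permitted for one element.
    if len(offenses) != len(set(offenses)):
        errors.append("duplicate offense codes")

    offender_age = incident.get("offender_age")
    for number, victim in enumerate(victims, start=1):
        # Use of valid values.
        if victim.get("sex") not in VALID_SEX_CODES:
            errors.append(f"victim {number}: invalid sex code")

        # Value type: age must be a non-negative whole number.
        age = victim.get("age")
        if not isinstance(age, int) or age < 0:
            errors.append(f"victim {number}: invalid age")
            continue

        # Cross-check between related elements: a victim recorded as the
        # offender's child must be younger than the offender.
        if (victim.get("relationship") == "child"
                and isinstance(offender_age, int)
                and age >= offender_age):
            errors.append(f"victim {number}: child not younger than parent")

    return errors


def find_duplicate_reports(incidents):
    """Return the incident IDs that appear more than once in a batch."""
    seen, duplicates = set(), set()
    for incident in incidents:
        incident_id = incident.get("incident_id")
        if incident_id is None:
            continue  # a missing ID is reported by check_incident instead
        if incident_id in seen:
            duplicates.add(incident_id)
        seen.add(incident_id)
    return sorted(duplicates)
```

In this sketch, a report would be held back for correction whenever check_incident returns a non-empty list, mirroring the rule that an incident report is rejected if errors are found.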

Checks should also be made to verify which entries local data management systems allow. For example, default values may be overused for required elements. If your system inserts default values, review those fields carefully. Time of day, for instance, is a required field, and some systems insert a zero when no entry is provided. Because zero is read as midnight, the result is an artificial peak of incidents at midnight.
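One way to look for an overused default is to tabulate incidents by hour and compare the count at the default value with the counts at other hours. The Python sketch below assumes the incident hour is already available as an integer from 0 to 23; the function names and the threshold of three times the average are arbitrary choices for illustration, not an FBI rule.

```python
from collections import Counter


def hour_distribution(hours):
    """Count incidents for each hour of the day (0 through 23)."""
    counts = Counter(hours)
    return [counts.get(hour, 0) for hour in range(24)]


def flag_default_peak(hours, default_hour=0, ratio=3.0):
    """Return True if the default hour looks overused.

    The default hour is flagged when its count exceeds `ratio` times the
    average count of the other 23 hours. The ratio is an assumed threshold.
    """
    counts = hour_distribution(hours)
    others = [c for hour, c in enumerate(counts) if hour != default_hour]
    mean_other = sum(others) / len(others)
    if mean_other == 0:
        # No incidents at any other hour: any count at the default is suspect.
        return counts[default_hour] > 0
    return counts[default_hour] > ratio * mean_other
```

A flag is only a prompt for review. Some offenses do cluster around midnight, so a flagged field should be checked against source reports before concluding that the zeros are defaults.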
