1.2 Data Dictionary

A data dictionary provides a human-readable description of the data, giving context on its nature and structure. This helps someone unfamiliar with the data understand and use it. At a minimum, a data dictionary should contain the following pieces of information about the data:

  • variable names
  • variable labels
  • variable codes, and
  • special values for missing data.
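As a rough illustration, the four pieces of information listed above could be held in one small record per variable. This is only a sketch: the class name `VariableEntry`, its field names, and the `classify` helper are hypothetical, and the missing-data markers (`"NA"` and the empty string) are assumed for the example rather than taken from any real dataset. The `urbanicity` variable and its codes come from the example table in this section.

```python
from dataclasses import dataclass, field


@dataclass
class VariableEntry:
    """One data dictionary record (hypothetical structure, for illustration)."""
    name: str                                   # variable name as it appears in the data
    label: str                                  # human-readable label
    codes: list = field(default_factory=list)   # allowed coded values, if any
    missing_values: list = field(default_factory=list)  # special markers for missing data

    def classify(self, raw: str) -> str:
        """Return 'missing', 'valid', or 'invalid' for a raw value."""
        if raw in self.missing_values:
            return "missing"
        # A variable with no code list accepts any non-missing value.
        if not self.codes or raw in self.codes:
            return "valid"
        return "invalid"


# Codes taken from the example table; the missing markers are an assumption.
urbanicity = VariableEntry(
    name="urbanicity",
    label="County-type",
    codes=["urban", "suburban", "small/mid", "rural"],
    missing_values=["NA", ""],
)

for value in ["rural", "NA", "metro"]:
    print(value, "->", urbanicity.classify(value))
```

Keeping the missing-data markers next to the codes means a reader, or a script, can tell a legitimately absent value apart from an unexpected one.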

Data Dictionary

An example data dictionary table from the incarceration trends repository. It lists each variable, its class (type), and a longer description.

Variable         Class           Description
year             integer (date)  Year
urbanicity       character       County-type (urban, suburban, small/mid, rural)
pop_category     character       Category for population - either race, gender, or Total
rate_per_100000  double          Rate within a category for prison population per 100,000 people

Note: Every data dictionary should also be provided in its raw form (e.g., a CSV) in the repository.
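
One possible way to produce such a raw file is sketched below using Python's standard `csv` module. The rows are copied from the example table above. The column names (`variable`, `class`, `description`) and the helper names `dictionary_to_csv` and `csv_to_dictionary` are assumptions made for this sketch, not a format prescribed by the repository.

```python
import csv
import io

# Column names are an assumption, chosen to mirror the example table headers.
FIELDS = ["variable", "class", "description"]

ROWS = [
    {"variable": "year", "class": "integer (date)", "description": "Year"},
    {"variable": "urbanicity", "class": "character",
     "description": "County-type (urban, suburban, small/mid, rural)"},
    {"variable": "pop_category", "class": "character",
     "description": "Category for population - either race, gender, or Total"},
    {"variable": "rate_per_100000", "class": "double",
     "description": "Rate within a category for prison population per 100,000 people"},
]


def dictionary_to_csv(rows):
    """Serialize data dictionary rows to CSV text with a header line."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def csv_to_dictionary(text):
    """Parse CSV text back into a list of row dictionaries."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


csv_text = dictionary_to_csv(ROWS)
print(csv_text)
```

The `csv` module quotes descriptions that contain commas, so the file reads back into the same rows. The resulting text can then be saved in the repository under whatever file name the project uses.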

References