next up previous contents
Next: Statistics on Precincts and Up: Merging Electoral and Census Previous: The End of the

Outline of the Merging Process

 

In order to merge Census data with voting data at the MCD Group level, several steps were necessary:

  1. Construct MCD Groups from the existing Census geographical definitions of VTDs, MCDs, and counties.
  2. Match electoral precincts to Census VTDs, as defined in the Census PL94-171 files.
  3. Aggregate the precincts to MCD Groups using their new VTD codes.
  4. Aggregate Census MCDs to MCD Groups.
  5. Merge the two datasets at the MCD Group level.
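Steps 3 through 5 can be illustrated with a minimal Python sketch. It is not the project's actual code; the crosswalk dictionaries, field names (`vtd`, `mcd`, `votes`, `pop`), and codes are hypothetical, and it assumes Step 1 has produced lookups from VTD and MCD codes to MCD Group codes and that Step 2 has attached a VTD code to each matched precinct.

```python
from collections import defaultdict

def aggregate(rows, group_of, fields):
    """Sum the numeric `fields` of `rows` by the group that `group_of` returns.

    Rows for which `group_of` returns None are skipped (unassigned units).
    """
    totals = defaultdict(lambda: dict.fromkeys(fields, 0))
    for row in rows:
        group = group_of(row)
        if group is None:
            continue
        for f in fields:
            totals[group][f] += row[f]
    return dict(totals)

def merge_at_group_level(votes_by_group, census_by_group):
    """Keep only MCD Groups present in both tables and combine their fields."""
    shared = votes_by_group.keys() & census_by_group.keys()
    return {g: {**census_by_group[g], **votes_by_group[g]} for g in sorted(shared)}

# Hypothetical crosswalks from Step 1: VTD -> MCD Group and MCD -> MCD Group.
vtd_to_group = {"V1": "G1", "V2": "G1", "V3": "G2"}
mcd_to_group = {"M1": "G1", "M2": "G2"}

# Hypothetical precincts after Step 2; an unmatched precinct has vtd=None.
precincts = [
    {"vtd": "V1", "votes": 120},
    {"vtd": "V2", "votes": 80},
    {"vtd": "V3", "votes": 200},
    {"vtd": None, "votes": 15},
]
census_mcds = [{"mcd": "M1", "pop": 1000}, {"mcd": "M2", "pop": 900}]

votes = aggregate(precincts, lambda r: vtd_to_group.get(r["vtd"]), ["votes"])   # Step 3
census = aggregate(census_mcds, lambda r: mcd_to_group.get(r["mcd"]), ["pop"])  # Step 4
merged = merge_at_group_level(votes, census)                                    # Step 5
```

In this toy input the unmatched precinct contributes nothing to any MCD Group, which is the source of the coverage gaps discussed later in this section.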

Note that Step 2 above accomplishes most of the work necessary to merge Census and voting data at the VTD level. (As we note below, although the matching at this point is not perfect, the small errors are for the most part inconsequential once the data are aggregated to MCD Groups.)

Also, recall that MCD-level merges were possible for some states, namely those in which electoral precincts precisely matched VTDs and VTDs were fully nested within MCDs. In such states, MCD Groups were equivalent to MCDs. In states with few or no VTD definitions, the optimal MCD Groups were entire counties. Unless otherwise noted, when we discuss MCD Group-level merged data, we refer both to these trivial MCD Group states and to the states in which MCD Groups are not in all cases equal to MCDs or counties.

A difficulty in Step 2 was that the naming conventions for precincts usually were not identical across the voting data and the Census PL94-171 data. To assign the voting precincts to the correct Census VTDs, we wrote a SAS program that matched as many precincts as possible using cues such as county name, precinct name, and precinct number. After the program had run (it typically matched 50 to 80 percent of the precincts), a person manually matched the remaining precincts according to a precisely defined set of coding rules.
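The automated pass can be pictured with the following Python sketch. The original program was written in SAS and its actual matching rules are not reproduced here; the normalization rule, the record fields (`id`, `county`, `name`, `number`, `vtd`), and the sample records are all assumptions made for illustration.

```python
import re

def normalize(name):
    """Upper-case, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^A-Z0-9 ]", " ", name.upper())
    return " ".join(cleaned.split())

def auto_match(precincts, vtds):
    """Match precincts to VTD codes by county plus name, then county plus number.

    Returns (matches, leftovers): a dict of precinct id -> VTD code, and the
    ids that still need manual matching. Duplicate keys are not detected in
    this simplified sketch; a later VTD silently overwrites an earlier one.
    """
    by_name, by_number = {}, {}
    for v in vtds:
        county = normalize(v["county"])
        by_name[(county, normalize(v["name"]))] = v["vtd"]
        if v.get("number") is not None:
            by_number[(county, v["number"])] = v["vtd"]

    matches, leftovers = {}, []
    for p in precincts:
        county = normalize(p["county"])
        vtd = by_name.get((county, normalize(p["name"])))
        if vtd is None and p.get("number") is not None:
            vtd = by_number.get((county, p["number"]))
        if vtd is None:
            leftovers.append(p["id"])
        else:
            matches[p["id"]] = vtd
    return matches, leftovers

# Hypothetical records: Census VTDs and electoral precincts for one county.
vtds = [
    {"county": "Adams", "name": "Ward 1, Pct. 2", "number": 12, "vtd": "0012"},
    {"county": "Adams", "name": "Mill Creek", "number": 13, "vtd": "0013"},
]
precincts = [
    {"id": "p1", "county": "ADAMS", "name": "ward 1 pct 2", "number": None},
    {"id": "p2", "county": "Adams", "name": "Millcreek Twp", "number": 13},
    {"id": "p3", "county": "Adams", "name": "Absentee", "number": None},
]
matches, leftovers = auto_match(precincts, vtds)
```

Here `p1` matches on its normalized name, `p2` falls back to the precinct number, and `p3` is left for the manual pass.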

Often a precinct could not be conclusively matched to a VTD in the PL94-171 data. In such cases it was occasionally possible to determine which MCD (or MCD Group) contained the precinct. As long as a precinct could be correctly assigned to an MCD or MCD Group, it did not matter that it could not be matched directly to a Census VTD, since the two datasets would be aggregated up to the MCD Group level anyway.

Sometimes an electoral precinct could not be matched even to an MCD (or MCD Group). In these cases the precinct was left unassigned: it is still included in the precinct-level electoral dataset but does not appear in the MCD Group-level merge file. Statistics on the total number of precincts assigned and unassigned for each state, and for each county in each state, are recorded in log files produced during the merge process.
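As an illustration of the kind of tally such a log might contain, here is a short Python sketch. The log format, the field names (`state`, `county`, `mcd_group`), and the sample records are hypothetical; the actual ROAD log files may be laid out differently.

```python
from collections import defaultdict

def assignment_counts(precincts):
    """Count assigned and unassigned precincts per (state, county)."""
    counts = defaultdict(lambda: {"assigned": 0, "unassigned": 0})
    for p in precincts:
        status = "unassigned" if p["mcd_group"] is None else "assigned"
        counts[(p["state"], p["county"])][status] += 1
    return dict(counts)

def log_lines(counts):
    """Format one line per county, followed by one total line per state."""
    lines = []
    state_totals = defaultdict(lambda: {"assigned": 0, "unassigned": 0})
    for (state, county), c in sorted(counts.items()):
        lines.append(
            f"{state} {county}: assigned={c['assigned']} unassigned={c['unassigned']}"
        )
        for status, n in c.items():
            state_totals[state][status] += n
    for state, c in sorted(state_totals.items()):
        lines.append(
            f"{state} TOTAL: assigned={c['assigned']} unassigned={c['unassigned']}"
        )
    return lines

# Hypothetical precinct records; mcd_group is None when no assignment was possible.
precincts = [
    {"state": "OH", "county": "Adams", "mcd_group": "G1"},
    {"state": "OH", "county": "Adams", "mcd_group": None},
    {"state": "OH", "county": "Brown", "mcd_group": "G7"},
]
lines = log_lines(assignment_counts(precincts))
```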

Relatedly, because not all of the precincts from the vote data could be matched to an MCD Group, some MCD Groups are not exhaustively covered by the precincts from the vote data. That is, there are "gaps" in some MCD Groups caused by unassigned precincts. For these MCD Groups, voter turnout is artificially low, because voters in the unassigned precincts are missing. There is no way to calculate precisely the number of missing voters per MCD Group; we can only ascertain the number not included in a county's worth of MCD Groups. In general, however, the number of precincts and registered voters not matched into MCD Groups is quite low, typically less than one percent of the total for a state. We believe the ROAD Project MCD Group-level data are highly reliable. Moreover, the files used in the merge process are available for interested researchers to further refine these merged datasets.
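A researcher refining the merged files might quantify the county-level shortfall along the following lines. This Python sketch is only illustrative: the field names (`county`, `registered`, `mcd_group`) and the figures are invented, and it works at the county level because, as noted above, the shortfall cannot be attributed to individual MCD Groups.

```python
def unmatched_share(precincts):
    """Return {county: (missing_registered, share_of_county_registered)}.

    `missing_registered` sums registered voters in precincts with no MCD Group.
    Counties with zero registered voters are omitted to avoid dividing by zero.
    """
    total, missing = {}, {}
    for p in precincts:
        county = p["county"]
        total[county] = total.get(county, 0) + p["registered"]
        if p["mcd_group"] is None:
            missing[county] = missing.get(county, 0) + p["registered"]
    return {
        c: (missing.get(c, 0), missing.get(c, 0) / total[c])
        for c in total
        if total[c] > 0
    }

# Invented figures for two counties; one precinct in Adams is unassigned.
precincts = [
    {"county": "Adams", "registered": 900, "mcd_group": "G1"},
    {"county": "Adams", "registered": 100, "mcd_group": None},
    {"county": "Brown", "registered": 500, "mcd_group": "G7"},
]
shares = unmatched_share(precincts)
```

In this made-up case, 100 of the 1,000 registered voters in the first county fall outside every MCD Group, so turnout computed from the merged file would be understated there.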


Copyright © 1997-2004