
Assign MCD-group IDs to the Precinct-Level Electoral Data

 

This step takes the just-created MCD-group index (xx.mg) and uses it to assign MCD-group codes to the precinct-level electoral data. As a result, the data gain several new columns, most notably one identifying the MCD-group in which each precinct resides.

The assignment of MCD-group codes to the precinct-level electoral data involves two stages. The first is an MCD-group assignment that the computer performs on its own. The second relies on precinct matching, carried out first by a computer pattern-matching algorithm and then by a human ``hand-matcher.''

In the first stage we assign MCD-group codes without needing precinct code matches. This shortcut is possible only for counties that are not disaggregated into more than one MCD-group. In other words, for some counties the MCD-group is equivalent to the county. In these cases it is not necessary to assign census precinct codes in the precinct-level electoral data because we will not need to divide the county. So we use a computer program to fill in these numbers automatically: if MCD-group 23 is equivalent to precinct-level electoral data county 17, then all precincts in county 17 are assigned an MCD-group code of 23. Note that this stage presupposes that census county codes have already been assigned to the precincts, but that is a trivial step.
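
The first-stage shortcut can be illustrated with a minimal Python sketch. It is not the project's SAS code; the function names and the flattened `(county, MCD-group)` view of the index are assumptions made for illustration, and it treats a county as "equivalent" to an MCD-group when the index lists exactly one group for that county.

```python
from collections import defaultdict


def county_level_groups(index_rows):
    """Return {county_code: mcd_group} for counties with exactly one MCD-group.

    index_rows is a hypothetical flattened view of the MCD-group index:
    an iterable of (county_code, mcd_group) pairs.
    """
    groups_by_county = defaultdict(set)
    for county, group in index_rows:
        groups_by_county[county].add(group)
    return {
        county: next(iter(groups))
        for county, groups in groups_by_county.items()
        if len(groups) == 1
    }


def assign_stage_one(precincts, single_group):
    """Set 'mcd_group' on each precinct dict whose county has a single group.

    Precincts in disaggregated counties get None and wait for stage two.
    """
    for precinct in precincts:
        precinct["mcd_group"] = single_group.get(precinct["county"])
    return precincts


# County 17 is covered by MCD-group 23 alone; county 19 is split in two.
index_rows = [(17, 23), (17, 23), (19, 24), (19, 25)]
precincts = [
    {"name": "WARD 1", "county": 17},
    {"name": "ELM TWP", "county": 19},
]
assign_stage_one(precincts, county_level_groups(index_rows))
```

After the call, the county 17 precinct carries MCD-group 23, while the county 19 precinct is still unassigned and must go through precinct matching.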

The second stage requires that we give precincts a census code, which comes from the PL94-171 data now in the mcdpr-ok.sd2 file. Here, unlike in the first stage, more than one MCD-group exists in the county. We assign precinct codes by matching the precinct name field in the precinct-level electoral data files (the only identifier we are given in the original form of this dataset) with the precinct name field in PL94-171 (which also contains standard numerical census precinct codes). First a computer program generates as many suggested matches as possible, and then the data are turned over to a human matcher, who pairs the remaining unmatched precincts by hand. Since each precinct code was associated with an MCD-group code in the group-generating step, assigning a precinct code is equivalent to assigning an MCD-group code.
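
As a rough illustration of the computer half of the second stage, the Python sketch below pairs names that agree after simple normalization and leaves everything else for the hand-matcher. The normalization rule and all names here are assumptions for illustration; the actual pattern-matching algorithm in the SAS programs is not described in this document and may differ.

```python
import re


def normalize(name):
    """Uppercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^A-Z0-9 ]", " ", name.upper())
    return " ".join(cleaned.split())


def suggest_matches(electoral_names, census_precincts):
    """Suggest census precinct codes for electoral precinct names in one county.

    census_precincts maps census precinct name -> census precinct code.
    Returns (matched, unmatched): matched maps electoral name -> code;
    unmatched lists names with no candidate or with an ambiguous one.
    """
    codes_by_name = {}
    for census_name, code in census_precincts.items():
        codes_by_name.setdefault(normalize(census_name), []).append(code)

    matched, unmatched = {}, []
    for name in electoral_names:
        codes = codes_by_name.get(normalize(name), [])
        if len(codes) == 1:
            matched[name] = codes[0]
        else:
            unmatched.append(name)  # left for the hand-matcher
    return matched, unmatched


matched, unmatched = suggest_matches(
    ["Ward 1", "Elm Twp.", "North"],
    {"WARD 1": "0010", "ELM TWP": "0020", "NORTH A": "0030"},
)
```

In this example "Ward 1" and "Elm Twp." are matched automatically, and "North" is passed on for hand-matching because no census name normalizes to the same string.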

When this step is finished, all precinct-based rows of precinct-level electoral data will have an MCD-group code.

  1. Assign census county codes to the precinct-level electoral data.

    Assign standard census county codes from PL94-171 to precincts, based on a merge on the county-name field. Since this information comes from the precinct-level xx.mg, MCD-group codes for non-disaggregated counties are also assigned at this step. The program is named matctyxx.sas. It will have to be tweaked to get all of the county names to match correctly, since they may differ slightly: our precinct-level electoral data uses five-letter abbreviations of county names, while PL94-171 uses full names. Note that the single file matctyxx.sas performs county code assignment for every precinct-level electoral data file (every year). The original files (xxyyMP20.sd2) are modified in place; no new data files are created, but information is added to the existing ones.

    Note that this step is an exception to the general principle that programs should clean up after themselves by deleting their intermediate or working files. Three such files, key.sd2, key1.sd2, and key2.sd2, are created in this step and preserved for use by subsequent steps and for diagnostics.

    What should be left over:





    matctyok.sas.
    SAS program to assign county (and many MCD-group) codes to precinct-level electoral data.
    key.sd2
    (intermediate file) SAS version of ok.mg
    key1.sd2
    (intermediate file) SAS version of unique precincts by MCD-group.
    key2.sd2
    (intermediate file) SAS version of unique MCDs by MCD-group.

    Modifies:
    precinct-level electoral data files OK84MP20.sd2, OK86MP20.sd2, OK88MP20.sd2, OK90MP20.sd2.
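
The county-name merge in step 1, including the need for tweaks, might look like the following Python sketch. It assumes, purely for illustration, that an abbreviation is the first five letters of the full county name with spaces and punctuation removed; the real abbreviation scheme is not specified here, which is why the sketch also accepts a hand-maintained `overrides` table. The county codes shown are illustrative values only.

```python
def abbreviate(full_name, length=5):
    """Assumed rule: first `length` letters, uppercased, non-letters dropped."""
    letters = "".join(ch for ch in full_name.upper() if ch.isalpha())
    return letters[:length]


def match_counties(abbrevs, census_counties, overrides=None):
    """Map electoral-data county abbreviations to census county codes.

    census_counties maps full county name -> county code.
    overrides maps abbreviation -> full county name for hand-made tweaks.
    Returns (codes, unmatched).
    """
    overrides = overrides or {}
    names_by_abbrev = {}
    for name in census_counties:
        names_by_abbrev.setdefault(abbreviate(name), []).append(name)

    codes, unmatched = {}, []
    for abbrev in abbrevs:
        if abbrev in overrides:
            codes[abbrev] = census_counties[overrides[abbrev]]
            continue
        candidates = names_by_abbrev.get(abbrev.upper(), [])
        if len(candidates) == 1:
            codes[abbrev] = census_counties[candidates[0]]
        else:
            unmatched.append(abbrev)  # needs a tweak in `overrides`
    return codes, unmatched


census = {"Adair": 1, "Le Flore": 79, "McClain": 87, "McCurtain": 89}
codes, unmatched = match_counties(["ADAIR", "LEFLO", "MCLAI"], census)
# "MCLAI" does not follow the assumed rule, so it is reported as unmatched
# until an override such as {"MCLAI": "McClain"} is supplied.
fixed, still_unmatched = match_counties(
    ["ADAIR", "LEFLO", "MCLAI"], census, overrides={"MCLAI": "McClain"}
)
```

Reporting the unmatched abbreviations, rather than guessing, keeps the tweaking explicit: each exception ends up recorded in the override table.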

  2. Computer-match as many MCD-groups as possible.

    This step involves the heaviest SAS programming yet. For each precinct-level electoral data year file, a computer program named matcyyxx.sas is adapted to match as many MCD-groups as it can and to output the results to a Microsoft Excel (.xls) file for a human hand-matcher to check and complete. (DBMSCopy is called as an intermediate step to convert from SAS to Excel format.) The program also produces a log file of the computer matching, with information on how much work was saved.

    What should be left over:





    matc84ok.sas.
    matc86ok.sas.
    matc88ok.sas.
    matc90ok.sas.
    These SAS programs perform computer matching on each precinct-level electoral data year file and prepare the results for the human hand-matcher.
    matc__ok.log.
    Interesting information on the efficiency of the computer matches, generated by the SAS programs.
    matc84ok.xls.
    matc86ok.xls.
    matc88ok.xls.
    matc90ok.xls.
    Microsoft Excel files which are sent to the hand-matcher. These contain all of the matches that previous computer steps were able to find.

  3. Hand-match the remaining precincts. 

    Give the partially-matched .xls files to a human. The human should assign precinct and MCD-group codes to the unmatched precinct-level electoral data precincts and verify those already matched. All anomalies and decisions should be recorded in math__xx.log, which collects the comments for all of the precinct-level electoral data year files. The hand-matcher should write her or his changes to new .xls files named mathyyxx.xls and return these to the person in charge of the computer work for this step.

    What should be left over:





    math__ok.log.
    A record of all anomalies and decision calls which the hand-matcher had to make on this state in the process of finishing the precinct matches.
    math84ok.xls.
    math86ok.xls.
    math88ok.xls.
    math90ok.xls.
    Microsoft Excel format files which are returned by the hand-matcher. These contain the verified computer matches together with the matches completed by hand.

  4. Inspect the hand-matcher's work.

    The person performing the computer work on this step should inspect the files returned by the hand-matcher and scan them for any problems or visible errors. He or she should also read the hand-matcher's notes in math__xx.log.

  5. Assign updated MCD-group codes to the precinct-level electoral data files. 

    Since this is the ``final'' precinct-level electoral data match, we will call the program file for this step matfyyxx.sas. There will be as many of these as there are years, and each will simply modify the existing file (filling in the MCD-group field for values that were previously missing). In cases where no assignment could be made, the MCD-group field value will be zero.

    What should be left over:





    matf__ok.log.
    Statistics on the success of the final precinct-to-PLVD matching process.
    matf84ok.sas.
    matf86ok.sas.
    matf88ok.sas.
    matf90ok.sas.
    These four program files reintegrate the MCD-group information in the .xls format files from the hand-matcher into the corresponding precinct-level electoral data dataset.
    Modifies:
    precinct-level electoral data files OK84MP20.sd2, OK86MP20.sd2, OK88MP20.sd2, OK90MP20.sd2.
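
The final fill-in of step 5 can be sketched in Python as follows. This is an illustrative outline under stated assumptions, not the matfyyxx.sas code: the record layout, the `(county, name)` key, and the function name are hypothetical. It keeps MCD-group codes that are already present, fills missing ones from the hand-matcher's precinct codes by way of the index, and writes zero where no assignment could be made.

```python
def fill_mcd_groups(precincts, hand_codes, group_of_precinct):
    """Fill missing MCD-group codes in place and count the failures.

    precincts: list of dicts with 'county', 'name', and 'mcd_group'
        (None where the code is still missing).
    hand_codes: {(county, precinct name): census precinct code}, as
        returned by the hand-matcher.
    group_of_precinct: {census precinct code: MCD-group code}, from the index.
    Returns the number of precincts left with an MCD-group code of zero.
    """
    unassigned = 0
    for precinct in precincts:
        if precinct.get("mcd_group") is not None:
            continue  # already assigned at the county level
        code = hand_codes.get((precinct["county"], precinct["name"]))
        precinct["mcd_group"] = group_of_precinct.get(code, 0)
        if precinct["mcd_group"] == 0:
            unassigned += 1
    return unassigned


rows = [
    {"county": 17, "name": "WARD 1", "mcd_group": 23},
    {"county": 19, "name": "ELM TWP", "mcd_group": None},
    {"county": 19, "name": "UNKNOWN", "mcd_group": None},
]
left_over = fill_mcd_groups(
    rows,
    hand_codes={(19, "ELM TWP"): "0020"},
    group_of_precinct={"0020": 24},
)
```

A count such as `left_over` is the kind of figure that could be reported among the success statistics in the log for this step.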


Copyright © 1997-2004 [ROAD Home] Questions? Contact the ROAD webmaster.