Dummy Coding with IBM SPSS
To understand what is meant by dummy coding, you first need to understand two forms of data:
Qualitative or Quantitative?
“Qualitative data describes items in terms of some quality or categorization while Quantitative data are described in terms of quantity (and in which a range of numerical values are used without implying that a particular numerical value refers to a particular distinct category).” To keep the difference straight, remember that qualitative data are observed and described, while quantitative data are measured.
Your Morning Latte…
So!
If we consider a morning latte example, we might note the following:
Qualitative Examples
- robust aroma
- frothy appearance
- strong taste
- burgundy cup
Quantitative Examples
- 12 ounces of latte
- serving temperature of 150°F
- serving cup 7 inches in height
- cost $4.95
Statistical analysis often includes variables in which the numbers represent qualitative categories (such as gender, ethnicity or political affiliation).
Including these variables in an analytical model requires special steps to ensure the results can be interpreted properly. These steps involve coding a categorical variable into multiple dichotomous variables, each of which takes the value “1” or “0.”
For clarity, a dichotomous variable is defined as a variable that splits or groups data into two distinct categories. An example is employment status, where each person is either employed or unemployed.
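To make the idea concrete outside of SPSS, here is a minimal Python sketch of a dichotomous variable. The `status` responses are made-up illustration data, and the variable names are hypothetical; the only point is that each case ends up as a 1 or a 0.

```python
# Hypothetical survey responses (illustration data, not from a real data set)
status = ["employed", "unemployed", "employed", "employed", "unemployed"]

# Code "employed" as 1 and everything else as 0
employed = [1 if s == "employed" else 0 for s in status]

print(employed)  # [1, 0, 1, 1, 0]
```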
This process is known as “dummy coding.” IBM SPSS makes dummy coding a straightforward task. Let’s walk through the steps!
Dummy Coding Step by Step
- Select the categorical variable that you want to dummy code. (Note the number of categories, remembering that dummy coding transforms a variable with “n” categories into “n-1” dichotomous variables. For example, a categorical variable on political affiliation with three categories (Democrat, Republican and Independent) would be dummy coded into two dichotomous variables, such as Democrat and Republican. A person who identifies as one of these would be coded a “1” on that variable in the data set. A person with a zero on both variables would be counted as Independent.)
- Click the “Transform” menu at the top of the SPSS data sheet, then select “Recode into Different Variables,” because you will transform the categorical variable into one or more dichotomous or dummy variables. This opens a window that displays the variables in your data set. Select the variable you want to recode, and then click the arrow, which moves the variable name into the box labeled “Numeric Variable.”
- Under “Output Variable,” click the “Name” box and type a name for your new dichotomous variable. Click “Change.” Click “Old and New Values,” which opens a new display, showing old and new values for the variable you want to transform.
- Recode the values of the variable by coding one category as a “1” and the others as zero. Under “Old Value,” enter the category value to be recoded. Under “New Value,” type a “1,” then click “Add.” On the “Old Value” side, select the “All Other Values” button, type “0” as the new value, and click “Add” again. For example, the political affiliation example that codes Democrat as a “1,” Republican as a “2” and Independent as a “3” could be recoded into the dichotomous variable Republican, with all “2s” recoded as “1” and other values coded as zero.
- Click “Continue” after entering the old and new values for your dummy codes, then click “OK.” SPSS will then recode the categorical variable as you have specified. Repeat the steps for each additional dummy variable you need.
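The recode rule in the steps above can also be sketched in a few lines of Python, which may help when checking the result by hand. This is not SPSS syntax and not how SPSS works internally; `dummy_code`, `party`, and `labels` are hypothetical names, and the data are made up using the 1 = Democrat, 2 = Republican, 3 = Independent coding from the example.

```python
def dummy_code(values, labels, reference):
    """Build one 0/1 list per category, skipping the reference category.

    values    -- list of numeric category codes, one per case
    labels    -- dict mapping each code to a name for its dummy variable
    reference -- the code that gets no dummy (zeros everywhere identify it)
    """
    dummies = {}
    for code, name in labels.items():
        if code == reference:
            continue  # n categories produce n-1 dummy variables
        # Same rule as the dialog: this code becomes 1, all other values become 0
        dummies[name] = [1 if v == code else 0 for v in values]
    return dummies


# Hypothetical data: 1 = Democrat, 2 = Republican, 3 = Independent
party = [1, 2, 3, 2, 1, 3]
labels = {1: "democrat", 2: "republican", 3: "independent"}

dummies = dummy_code(party, labels, reference=3)
print(dummies["democrat"])    # [1, 0, 0, 0, 1, 0]
print(dummies["republican"])  # [0, 1, 0, 1, 0, 0]
```

Cases with a zero in both lists are the Independents, which is why three categories need only two dummy variables.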
Done Deal!