# Logistic Regression in SAS

Nikki Kamouneh

This post outlines the steps for performing a logistic regression in SAS. The data come from the 2016 American National Election Studies (ANES). Code for preparing the data can be found on our GitHub page, and the cleaned data can be downloaded here.

The steps that will be covered are the following:

1. Check variable codings and distributions
2. Graphically review bivariate associations
3. Fit the logit model
4. Interpret results in terms of odds ratios
5. Interpret results in terms of predicted probabilities

The variables we use will be:

• vote: Whether the respondent voted for Clinton or Trump
• gender: Male or female
• age: The age (in years) of the respondent
• educ: The highest level of education attained

For simplicity, this demonstration will ignore the complex survey variables (weight, PSU, and strata).

## Univariate Summaries

We can assign labels to our data to make interpretation easier using the following syntax:

/* Define value labels for the coded categorical variables */
proc format;
value votecode
1 = 'Clinton'
2 = 'Trump';
value gendercode
1 = 'Male'
2 = 'Female';
value educcode
1 = 'HS Not Completed'
2 = 'Completed HS'
3 = 'College <4 Years'
4 = 'College 4 Year Degree';
run;

data cleaned_anes;
set cleaned_anes;
format vote votecode. gender gendercode. educ educcode.;
run;
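
Before moving on, it can be worth confirming that the formats were actually attached to the variables. The following is a minimal sketch, assuming the `cleaned_anes` data set and the three formats defined above are available in the current session. The Format column of the variable listing should show `VOTECODE.`, `GENDERCODE.`, and `EDUCCODE.` next to the corresponding variables.

```sas
/* Sketch: list the variables in cleaned_anes along with their attached formats. */
/* Assumes cleaned_anes exists in the WORK library after the data step above.   */
proc contents data = cleaned_anes;
run;
```

If a format is missing from the listing, the `format` statement in the data step is the first place to check.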

The first step in any statistical analysis should be to inspect the data for coding errors, outliers, or unusual distributions. We can look at the frequencies of our categorical variables using:

proc freq data = cleaned_anes;
tables vote gender educ;
run;

This will give us the following output: