# Estimating Regression Models with Binary Independent Variables

In our previous tutorials, we discussed simple regression and multiple regression with continuous variables, but what happens when our independent variable is nominal rather than interval?

The data used in this tutorial again come from the study "More Tweets, More Votes: Social Media as a Quantitative Indicator of Political Behavior" by DiGrazia J, McKelvey K, Bollen J, Rojas F (2013), which investigated the relationship between social media mentions of candidates in the 2010 and 2012 US House elections and the actual vote results. The authors have helpfully provided replication materials. The results presented here are for pedagogical purposes only.

## Binary Independent Variables

First we will take a look at regression with a binary independent variable. The variables used are:

• vote_share (dependent variable): The percentage of the vote received by the Republican candidate
• rep_inc (independent variable): Whether the Republican candidate was an incumbent or not

We will code an incumbent (a candidate who currently holds the office) as 1 and a non-incumbent as 0. Take a look at the first six observations in the data:

| Vote Share | Rep Incumbent |
|-----------:|--------------:|
| 51.09 | 0 |
| 59.48 | 1 |
| 57.94 | 0 |
| 27.52 | 0 |
| 69.32 | 1 |
| 53.20 | 0 |
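To make the 0/1 coding concrete, here is a minimal Python sketch using only the six rows shown above. The `is_incumbent` flags are a hypothetical raw form of the variable, introduced for illustration only; they are not a column in the replication data. Six rows are far too few to draw conclusions from, so treat the numbers as a demonstration of the mechanics.

```python
from statistics import mean

# Vote shares from the six rows shown above.
vote_share = [51.09, 59.48, 57.94, 27.52, 69.32, 53.20]

# Hypothetical raw flags (True = Republican incumbent) matching those rows.
is_incumbent = [False, True, False, False, True, False]

# Dummy coding: incumbent -> 1, non-incumbent -> 0.
rep_inc = [int(flag) for flag in is_incumbent]

# Split the dependent variable by the value of the dummy.
share_non_incumbent = [y for y, d in zip(vote_share, rep_inc) if d == 0]
share_incumbent = [y for y, d in zip(vote_share, rep_inc) if d == 1]

# Average vote share within each group.
mean_non_incumbent = mean(share_non_incumbent)
mean_incumbent = mean(share_incumbent)

print(rep_inc)
print(f"non-incumbent mean: {mean_non_incumbent:.4f}")
print(f"incumbent mean:     {mean_incumbent:.4f}")
```

Because the variable takes only two values, the data split cleanly into two groups, and each group can be summarized by its own mean vote share.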

Plotting our observations, we see the points cluster together at the two possible values of the nominal variable.
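Since the points sit at only two x-values, a least-squares line through them has a simple interpretation. The sketch below fits that line with the standard closed-form formulas, again using only the six rows listed above rather than the full dataset. It is an illustration under that assumption, not a reproduction of the tutorial's results.

```python
from statistics import mean

# The six observations listed above.
vote_share = [51.09, 59.48, 57.94, 27.52, 69.32, 53.20]
rep_inc = [0, 1, 0, 0, 1, 0]

x_bar = mean(rep_inc)
y_bar = mean(vote_share)

# Closed-form simple regression: slope = S_xy / S_xx.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(rep_inc, vote_share))
s_xx = sum((x - x_bar) ** 2 for x in rep_inc)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Group means, for comparison with the fitted coefficients.
mean_0 = mean(y for x, y in zip(rep_inc, vote_share) if x == 0)
mean_1 = mean(y for x, y in zip(rep_inc, vote_share) if x == 1)

print(f"intercept: {intercept:.4f}  (mean when rep_inc = 0: {mean_0:.4f})")
print(f"slope:     {slope:.4f}  (difference in means: {mean_1 - mean_0:.4f})")
```

With a single 0/1 regressor, the intercept equals the mean of the group coded 0, and the slope equals the difference between the two group means, so the fitted line simply connects the two cluster averages.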