ECON 101: Statistics for Economics
Lab Assignment 4
Dates: November 1st, 4th or 5th
Due November 8th, 11th or 12th
The overall aim of our lab section this term is to increase your proficiency in Microsoft Excel and data
analysis. By the end of the term, you should feel comfortable taking on any workplace assignments that
involves data analysis through Excel. The third lab aims to teach you:
How to collect, organize and analyze a very large dataset
Create your own dataset
A method to combine two datasets into one
Similar to what we did in lab a few weeks ago, let’s go get data on the Standard & Poor’s 500 Index (S&P
500). Historical data on the S&P 500 can be gathered at Yahoo!:
https://finance.yahoo.com/quote/%5EGSPC. Type in the site into the address bar.
1. When you reach the webpage for the S&P 500, you will see the real-time value of the index.
Click on the “Historical Data” tab.
2. We want to get data for the entire time period available. Click on the date under “Time Period”
and a range of possible dates can be chosen. Click on “MAX”, then “Done” and finally “Apply”.
Once the screen reloads, you can click “Download Data” and an Excel spreadsheet should
automatically download. Open up the file and see all the data you have gathered!
3. THIS IS DIFFERENT FROM THE PREVIOUS LAB. The dataset shows the Date, Open, High, Low,
Close, Volume and Adjusted Closing Price. For this exercise we will be interested in the
difference between the opening and closing prices. Because of that, we will delete the “High”,
“Low”, “Volume” and “Adj Close” columns by holding down the “Ctrl” button and then clicking
on columns C, D, F and G, which highlights the columns. Click the “Delete” key on the keyboard.
The dataset should now look like this:
4. We will be linking the data with a new dataset based on the year of the observation. This means
we have to “split” the data into month, day and year. In order to do this, the first thing we need
to do is move the data in column B to column D. This is done by clicking on the “B” at the top of
the second column. A green box will appear around the data in the column. Click on the outside
of the box and “drag” the data to column D:
5. Now we can “Split” the date since there are two columns to put our “Day” and “Year”
information in. To do this, click on the “A” at the top of the column, highlighting all the dates.
Then click on the “Data” tab and in the middle, there is a “Text to Columns” button
on the button. Here, you can split the data in the columns based on a delimiter, such as a
period, comma or semi-colon. A new column will be created every time Excel sees the delimiter
in the column. (A fixed-width delimiter is appropriate in some situations, but not this one.) Click
on the “Delimited” button and click next.
You have options as to what you what you want the delimiter to be. Because the month, day
and year are separated by a slash “/”, we want to use that. That is not a common delimiter, but
we can use it if we click on “Other” and then type / into the empty. A preview of the new
dataset will come up. Hit next. On the last page, you can format the data in your columns. It is
important here to click on “Text”, otherwise the month remaining will be read as a full date,
which we don’t want. Then click finish and your data should have split:
Rename the first three columns: Month, Day, Year and similar to the previous lab, get rid of all
the years before 1962 by highlighting the appropriate rows and deleting them. Create the “Daily
Change” in the S&P 500 in column F as 100 x (Close Price – Open Price)/Open Price. To do this,
6. The question we want to answer now is whether the US President’s political party has an
influence on the average daily change of the S&P 500. We could go through the dataset and
manually create a column called Party and type in “Republican” or “Democrat”. This is time
consuming and becomes even more cumbersome when there are more than two outcomes.
Instead, lets create our own dataset. Click on the “+” button at the bottom left of the worksheet
to open a new worksheet.
Right-click on the name of the new sheet (it will be something
like “Sheet2”. Rename the sheet, Political Party.
7. In cell A1, type “Year”. In cell A2, type 2018. In A3, type 2016. Then highlight 2016 and 2017.
Click on the small box in the bottom right corner, then drag the box down to row 57 and the
years will automatically decrease by 1 every cell. The last year should be 1962.
8. In cell B1, type “Party”. In 2018, we have a Republican president, so type “Republican” in B2.
From 2009 to 2016, we had a Democratic president. Type “Democrat” in 2016, highlight the cell,
then drag it down to 2009.
Do this for all the Presidential parties back to 1962. Data on the political parties is available at:
http://jjmccullough.com/prezidents.htm. Round your years so that terms start in odd numbered
years and end in even numbered years.
9. We now want to link the party data to our original dataset. Return to your S&P data by clicking
the appropriate tab at the bottom of the worksheet. Create a “Party” column in G1. In cell G2,
you will use the “VLookup” command. You want to link the “Year” in the S&P data to the “Year”
in the party data. To do this, type “=vlookup(cell with year in it, ‘name of party
worksheet’$A$1:$B$57, ‘column in party worksheet you want reported’, false). There is a lot
here, but what you are doing is telling Excel that when the cell with the S&P year in it needs to
be matched to the leftmost column in your party worksheet. The dollar signs make sure the
entire worksheet you are drawing from is the same no matter what. The next part notes which
column in the party worksheet you want reported and false means the years must match
In my own worksheet, I typed =VLOOKUP(C2,'Presidential Parties'!$A$1:$B$57,2,FALSE) into cell
G2. Then I double-clicked the corner box on G2 and the presidential parties were automatically
Construct Sample Distributions based on Party in Power
1. Now you should feel comfortable putting together some basic statistics about the daily returns
when the President is a Republican vs. Democrat vs. All Years. Provide the mean, standard
deviation and number of observations for each of the three groups in an easy to read table.
(Hint: Use the “=averageif” command. You can also sort and create separate columns of returns
for Democrat and Republican years, just make sure you paste “Values”.)
2. Create a distribution of daily returns using the histogram command for each of the three groups.
Provide a brief explanation about the shape and statistics of each distribution.
3. In a single paragraph, discuss whether a Republican or Democrat President is better for the
stock market using the information you have gathered. Your write-up, including histograms and
tables, should not exceed two pages.
Purchase answer to see full