MAS 332 University of Miami Covid in United States Project

Homework 1 / R
Yingying Zhou
Due by Midnight on Tuesday, 10/13
Instructions
• Each question below should be solved with exactly one line of code.
• No saving intermediate data steps or data sets.
• Your R code is enough for an answer; you do not need to additionally type the answer to each question.
• No entering numbers manually.
  • Example: What percent of people like the color yellow?
    • Good: mean(favColor == 'Yellow') (always correct, even if the data changes)
    • Bad: 6/15, after looking at the data and seeing that 6 out of 15 had favColor of Yellow.
• No unnecessary or irrelevant output in your code. Keep it organized and relevant.
• Submit both your RMD and knitted document to Blackboard.
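As a small illustration of the "Good" example in the instructions, the sketch below uses a made-up favColor vector (hypothetical data, not part of the assignment) in which 6 of 15 responses are Yellow. Taking mean() of a logical vector gives the proportion of TRUE values, so the result follows the data instead of a hand-typed count.

```r
# Hypothetical survey responses: 6 of the 15 are 'Yellow' (made-up data)
favColor <- c(rep('Yellow', 6), rep('Blue', 5), rep('Green', 4))

# mean() of a logical vector is the proportion of TRUE values
prop_yellow <- mean(favColor == 'Yellow')

# Multiply by 100 to report a percentage instead of a proportion
pct_yellow <- 100 * prop_yellow
print(pct_yellow)
```

Note that mean() on its own returns a proportion between 0 and 1; whether to scale it by 100 depends on how the answer is expected to be reported.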
library(tidyr)
## Warning: package 'tidyr' was built under R version 3.6.3
Part 1
The data for this section contains information for each US state (and Washington
DC) related to COVID-19 cases: total cases, cases in the last 7 days, and infection rate
per 100,000. COVID-related data for this portion can be found online at
https://covid.cdc.gov/covid-data-tracker/#cases_casesinlast7days
states = read.csv('https://douglas2.s3.amazonaws.com/data/cases_by_state.csv', stringsAsFactors = FALSE)
1. What is the total US population?
sum(states$population)
## [1] 328300544
2. What is the total population for states in the Southeast region?
sum(states$population[states$region == 'Southeast'])
## [1] 85880188
3. Using tapply(), obtain the total population for each region.
tapply(states$population, states$region, sum)
##   Midwest Northeast Southeast Southwest      West
##  68329004  62791232  85880188  42328398  68971722
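To see what tapply() is doing here, the sketch below applies the same call pattern to a tiny made-up data frame (the toy object and its values are hypothetical, not rows from cases_by_state.csv). tapply() splits its first argument by the grouping vector and applies the function to each group, returning a result named by group.

```r
# Toy stand-in for the states data; names and numbers are made up
toy <- data.frame(
  state      = c('A', 'B', 'C', 'D'),
  region     = c('West', 'West', 'Midwest', 'Midwest'),
  population = c(100, 250, 300, 50),
  stringsAsFactors = FALSE
)

# Sum population within each region; the result is named by region
region_totals <- tapply(toy$population, toy$region, sum)
print(region_totals)

# The group totals add back up to the overall total
print(sum(region_totals) == sum(toy$population))
```

The same check holds for the homework output: the five regional totals add up to 328300544, the answer to question 1.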
4. How many states are in the West region?
sum(states$region == 'West')
## [1] 11
5. Using the table() function, how many states are in each region?
table(states$region)
##
##   Midwest Northeast Southeast Southwest      West
##        12        11        13         4        11
6. What is the total number of cases in the US?
sum(states$total_cases)
## [1] 7132093
7. What is the total number of cases for states in the West region?
sum(states$total_cases[states$region == 'West'])
## [1] 1274901
8. What percent of states have had at least 5,000 cases in the last 7 days?
mean(states$cases_last_seven_days >= 5000)
## [1] 0.4509804
9. How many states in the Southeast region have had at least 5,000 cases in the
last 7 days?
sum(states$region == 'Southeast' & states$cases_last_seven_days >= 5000)
## [1] 9
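The counting pattern used in questions 8 and 9 can be sketched on made-up data. The toy data frame below is hypothetical; only the column names mirror the homework data. Each comparison produces a logical vector, & combines them element by element, and sum() or mean() then counts or averages the TRUE values.

```r
# Toy stand-in for the states data; values are made up
toy <- data.frame(
  region = c('Southeast', 'Southeast', 'West', 'Southeast'),
  cases_last_seven_days = c(6000, 1200, 7500, 5000),
  stringsAsFactors = FALSE
)

# Each comparison yields a logical vector; & combines them element-wise
both <- toy$region == 'Southeast' & toy$cases_last_seven_days >= 5000

# sum() counts the TRUE values; mean() gives the share of rows that match
n_match <- sum(both)
share_match <- mean(both)
print(n_match)
print(share_match)
```

Because >= is inclusive, the row with exactly 5000 cases counts as a match, which is what "at least 5,000" asks for.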
10. What percent of the total cases in Florida have occurred in the last 7 days?
subset(states, states$state == 'Florida')$cases_last_seven_days / subset(states, states$state == 'Florida')$total_cases
## [1] 0.0228356
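One possible way to avoid calling subset() twice, sketched on made-up numbers (the toy data frame is hypothetical and its values are not taken from cases_by_state.csv): with() evaluates an expression using the columns of the subsetted row, so the ratio can be written once.

```r
# Toy stand-in for the states data; the case counts are made up
toy <- data.frame(
  state = c('Florida', 'Georgia'),
  total_cases = c(1000, 400),
  cases_last_seven_days = c(25, 10),
  stringsAsFactors = FALSE
)

# Pick the Florida row once, then compute the ratio from its columns
share_recent <- with(subset(toy, state == 'Florida'),
                     cases_last_seven_days / total_cases)

# The ratio is a proportion; scale by 100 to express it as a percent
print(100 * share_recent)
```

The call is split over two lines here only for readability; it is still a single expression.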
11. Which state has the highest number of cases in the last 7 days?
subset(states, states$cases_last_seven_days == max(states$cases_last_seven_days))
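An alternative for question 11, sketched on made-up data: which.max() returns the index of the first maximum, so indexing the state column with it prints only the state name instead of the whole row. Unlike the == max(...) filter above, it returns a single value even if several states tie for the maximum. The toy data frame below is hypothetical.

```r
# Toy stand-in for the states data; names and values are made up
toy <- data.frame(
  state = c('A', 'B', 'C'),
  cases_last_seven_days = c(1200, 7500, 5000),
  stringsAsFactors = FALSE
)

# Index of the largest value, used to pick out the matching state name
top_state <- toy$state[which.max(toy$cases_last_seven_days)]
print(top_state)
```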
