Description
- Your data file must have at least 300 observations.
- Data must have at least two numeric variables that you think might have an association with each other. One of these will be your response variable (y) and the other your explanatory variable (x).
- You may choose data from any source that interests you. Here are a couple of sites where you might find suitable data (not all data is suitable; see me if you're not sure):
- https://archive.ics.uci.edu/ml/datasets.php
- http://veekaybee.github.io/2018/07/23/small-datasets/
- Do a Google search on "free small dataset csv download"
Once you have identified your data, post the following:
- Where did you find this data (exactly - include the URL) and what does it represent?
- What two numerical values will you be looking at?
- Do you expect there will be a relationship between these variables?
- Create a scatterplot and compute the correlation value for these two numerical values.
- Write 2-3 paragraphs describing the association (or lack thereof) between your two numeric variables.
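The scatterplot and correlation steps could be sketched in Python roughly as follows. This is only one possible approach, not a required method: the function names (`load_columns`, `pearson_r`, `scatterplot`) are made up for illustration, the correlation is computed by hand as the Pearson coefficient, and the plotting function assumes the third-party matplotlib package is installed.

```python
import csv
import math


def load_columns(path, x_name, y_name):
    """Read two numeric columns from a CSV file that has a header row."""
    xs, ys = [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            try:
                x, y = float(row[x_name]), float(row[y_name])
            except (TypeError, ValueError):
                continue  # skip rows with missing or non-numeric values
            xs.append(x)
            ys.append(y)
    return xs, ys


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def scatterplot(xs, ys, x_label, y_label, out_path="scatter.png"):
    """Save a scatterplot of y against x (assumes matplotlib is installed)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display window needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(xs, ys, s=10, alpha=0.6)
    ax.set_xlabel(x_label)  # explanatory variable
    ax.set_ylabel(y_label)  # response variable
    ax.set_title(f"{y_label} vs. {x_label} (r = {pearson_r(xs, ys):.2f})")
    fig.savefig(out_path, dpi=150)
    plt.close(fig)
```

With a hypothetical file `data.csv` containing columns named `x_column` and `y_column` (placeholders, not names from any particular dataset), the calls would be `xs, ys = load_columns("data.csv", "x_column", "y_column")`, then `print(pearson_r(xs, ys))` and `scatterplot(xs, ys, "x_column", "y_column")`. It is also worth checking `len(xs)` against the 300-observation minimum, since rows with missing values are skipped.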
Explanation & Answer
PRESENTING DATA
Name
Institution
Instructor
Course
Date
Where did you find this data (exactly - include the URL) and what does it represent?
The Forest Fires dataset, found at https://archive.ics.uci.edu/ml/datasets/Forest+Fires, is a
nonlinear dataset used to examine the factors that led to forest fires in northeast Portugal. It
is used to investigate the relationship between forest fires and meteorological values such as
temperature, wind, relative humidity, and rainfall, as well as the area covered. The dataset
includes other categorical values suc...