## Description

Overview

Recall that samples are used to generate a statistic, which businesses use to estimate the population parameter. You have learned how to take samples from populations and use them to produce statistics. For two quantitative variables, businesses can use scatterplots and the correlation coefficient to explore a potential linear relationship. Furthermore, they can quantify the relationship in a regression equation.

Prompt

**This assignment picks up where the Module Two assignment left off and will use components of that assignment as a foundation.**

You have submitted your initial analysis to the sales team at D.M. Pan Real Estate Company. You will continue your analysis of the provided Real Estate Data spreadsheet using your selected region to complete your analysis. You may refer back to the initial report you developed in the Module Two Assignment Template to continue the work. This document and the National Statistics and Graphs spreadsheet will support your work on the assignment.

**Note:** In the report you prepare for the sales team, the dependent, or response, variable (y) should be the listing price and the independent, or predictor, variable (x) should be the square feet.

Using the Module Three Assignment Template, specifically address the following:

**Regression Equation:** Provide the regression equation for the line of best fit using the scatterplot from the Module Two assignment.

**Determine *r*:** Determine *r* and what it means. (What is the relationship between the variables?)

Determine the strength of the correlation (weak, moderate, or strong).

Discuss how you determine the direction of the association between the two variables.

Is there a positive or negative association?

What do you see as the direction of the correlation?

**Examine the Slope and Intercepts:** Examine the slope $b_1$ and intercept $b_0$.

- Draw conclusions from the slope and intercept in the context of this problem.

Does the intercept make sense based on your observation of the line of best fit?

- Determine the value of the land only.

**Note:** You can assume, when the square footage of the house is zero, that the price is the value of just the land. This happens when *x* = 0, which is the *y*-intercept. Does this value make sense in context?

**Determine the *R*-squared Coefficient:** Determine the *R*-squared value.

Discuss what *R*-squared means in the context of this analysis.

**Conclusions:** Reflect on the Relationship: Reflect on the relationship between square feet and sales price by answering the following questions:

Is the square footage for homes in your selected region different than for homes overall in the United States?

- For every 100 square feet, how much does the price go up (i.e., can you use slope to help identify price changes)?
- What square footage range would the graph be best used for?


## Explanation & Answer



Housing Price Prediction Model for D.M. Pan Real Estate Company

[Your Name]

Southern New Hampshire University

Median Housing Price Prediction Model for D.M. Pan National Real Estate Company


Module Two Notes

Scatter Plot

[Figure: scatter plot of Listing Price ($0 to $400,000) against Square Feet (0 to 3,000), with the fitted line y = 175.97x − 118809]

Regression Equation

The regression equation for the line of best fit that I obtained using the scatterplot from the Module Two assignment is:

**House listing price = −118,809 + 175.97 × square footage**
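As a minimal sketch, the fitted line can be wrapped in a small Python function to produce predicted listing prices. The coefficients are the ones reported in this report; the function name `predict_listing_price` and the example square footages are illustrative assumptions, not part of the assignment.

```python
# Coefficients copied from the fitted line reported in this report.
INTERCEPT = -118_809.0  # b0, in dollars
SLOPE = 175.97          # b1, in dollars per square foot


def predict_listing_price(square_feet: float) -> float:
    """Return the listing price predicted by the fitted line."""
    return INTERCEPT + SLOPE * square_feet


if __name__ == "__main__":
    # Example square footages chosen for illustration only.
    for sqft in (1_500, 2_000, 2_500):
        print(f"{sqft} sq. ft. -> ${predict_listing_price(sqft):,.0f}")
```

The prediction is only an average tendency; individual listings can sit well above or below the line.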

Determine r

The correlation coefficient r between these two variables is 0.6862, which indicates a moderate positive correlation: as one variable increases, the other tends to increase as well. The direction of the association is positive (direct), which we can determine both from the sign of the correlation coefficient and from the upward trend of the points in the scatterplot.
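To make the link between the sign of r and the direction of the association concrete, here is a hedged sketch that computes Pearson's r from its definition. The five data points are hypothetical and are not the assignment's sample, so the value of r they produce will differ from 0.6862; the helper names `pearson_r` and `direction` are also assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def direction(r):
    """Read the direction of the association from the sign of r."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


# Hypothetical values for illustration, NOT the assignment's data.
square_feet = [1400, 1600, 1900, 2200, 2600]
listing_price = [150_000, 140_000, 230_000, 250_000, 330_000]

r = pearson_r(square_feet, listing_price)
print(f"r = {r:.4f}, direction: {direction(r)}")
```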

Examine the Slope and Intercepts

The slope of the regression line is $175.97, which means that, on average, the listing price of a house increases by $175.97 for each additional square foot. The intercept is −$118,809, which represents the predicted price of a house with 0 square feet. It cannot be interpreted as the value of the land alone, since it is negative. Even if the value had made sense, we should be careful with it: the smallest house in the sample is 1,331 sq. ft., so the intercept is an extrapolation beyond the data and could be biased.
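One way to see how far the intercept sits from the observed data is to solve the fitted line for the square footage at which the predicted price reaches $0. The sketch below does that with the reported coefficients; the function name `zero_price_square_feet` is a hypothetical helper introduced here.

```python
INTERCEPT = -118_809.0      # b0 from the fitted line, in dollars
SLOPE = 175.97              # b1 from the fitted line, dollars per sq. ft.
SMALLEST_HOME_SQFT = 1_331  # smallest house in the sample, per the report


def zero_price_square_feet(intercept: float, slope: float) -> float:
    """Square footage at which the fitted line predicts a price of $0."""
    return -intercept / slope


break_even = zero_price_square_feet(INTERCEPT, SLOPE)
print(f"Predicted price is $0 at about {break_even:.0f} sq. ft.")
print(f"Below the smallest observed home: {break_even < SMALLEST_HOME_SQFT}")
```

The line predicts negative prices for any home smaller than roughly 675 sq. ft., which is well below the smallest home in the sample, so the negative intercept says nothing reliable about land value.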

R-squared Coefficient

The R-squared is 0.4709 for this relationship, which means that 47.09% of the variability in listing prices can be explained by the variability in square footage. This tells us that the size of the house is one of the main factors in predicting its listing price, but a large share (about 53%) is clearly not explained by size.
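For a simple linear regression with one predictor, R-squared equals the square of the correlation coefficient r, so the two reported values can be checked against each other. The short sketch below performs that check; the variable names are illustrative.

```python
R = 0.6862  # correlation coefficient reported in the "Determine r" section


def r_squared(r: float) -> float:
    """R-squared for a one-predictor linear regression is r squared."""
    return r ** 2


explained = r_squared(R)     # share of variability explained by square feet
unexplained = 1 - explained  # share left to other factors

print(f"Explained:   {explained:.2%}")
print(f"Unexplained: {unexplained:.2%}")
```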

Conclusions

The relationship between square feet and sales price was not as strong as I expected. The main reason might be that other variables, such as the number of bedrooms, the age of the house, and the quality of the building, could be important characteristics when determining the price of a home.

The worst part was the negative intercept; I would have liked a value that could represent the value of the land only. In any case, compared with homes overall in the United States, this region is clearly cheaper, mainly because houses are smaller in this region.

Although the equation I found is not a perfect estimate, it is very useful for making a first educated guess of the price of a house based only on its size. It is very easy to obtain and a very quick tool to start with for house valuations. Still, it would only work for houses between 1,300 sq. ft. and 2,700 sq. ft., since this was the range of square footage in our sample.

Finally, for every 100 square feet, the listing price of the house goes up on average by $17,597.
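The two practical points in these conclusions, the price change per 100 square feet and the range in which the line should be used, can be sketched as follows. The bounds are the approximate figures quoted in the text, and the names `price_change` and `predict_in_range` are hypothetical helpers, not part of the assignment template.

```python
INTERCEPT = -118_809.0   # b0, in dollars
SLOPE = 175.97           # b1, in dollars per square foot
SAMPLE_MIN_SQFT = 1_300  # approximate sample range quoted in the text
SAMPLE_MAX_SQFT = 2_700


def price_change(delta_square_feet: float) -> float:
    """Average change in listing price for a change in square footage."""
    return SLOPE * delta_square_feet


def predict_in_range(square_feet: float) -> float:
    """Predict a listing price, refusing to extrapolate beyond the sample."""
    if not SAMPLE_MIN_SQFT <= square_feet <= SAMPLE_MAX_SQFT:
        raise ValueError(
            f"{square_feet} sq. ft. is outside the sampled range "
            f"{SAMPLE_MIN_SQFT} to {SAMPLE_MAX_SQFT} sq. ft."
        )
    return INTERCEPT + SLOPE * square_feet


print(f"Per 100 sq. ft.: ${price_change(100):,.0f}")
```

Raising an error outside the sampled range is one possible design choice; returning the prediction with a warning would be a reasonable alternative.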

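| Region | State | County | Median listing price | Median $ per square foot |
| --- | --- | --- | --- | --- |
| East North Central | oh | wayne | $185,393 | $105 |
| East North Central | oh | clermont | $305,466 | $115 |
| East North Central | in | porter | $309,355 | $124 |
| East North Central | wi | kenosha | $277,265 | $141 |
| East South Central | ky | mccracken | $173,171 | $89 |
| East North Central | il | kendall | $266,967 | $119 |
| East South Central | tn | washington | $250,241 | $112 |
| East North Central | in | vigo | $94,994 | $63 |
| East South Central | tn | maury | $32... | |
| East North Central | | | | |
| East South Central | | | | |
| East North Central | | | | |
| East South Central | | | | |
| East North Central | | | | |
| East South Central | | | | |
| East North Central | | | | |
| East South Central | | | | |
| East North Central | | | | |
| East South Central | | | | |
| East North Central | | | | |
| East North Central | | | | |
| East North Central | | | | |
| East North Central | | | | |
| East North Central | | | | |
| East North Central | | | | |
| East North Central | | | | |
| East North Central | | | | |
| East North Central | | | | |
| East South Central | | | | |
| East North Central | | | | |
| East South Central | | | | |
| East North Central | | | | |
| East North Central | | | | |
| East North Central | | | | |
| East South Central | | | | |
| East North Central | | | | |
| East North Central | | | | |
| East North Central | | | | |
| East North Central | | | | |
| East North Central | | | | |
| East South Central | | | | |
| East North Central | | | | |
| East South Central | | | | |
| East South Central | | | | |
| East South Central | | | | |
| East North Central | | | | |
| East North Central | | | | |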