### Question Description

The student should choose one of the datasets and form a hypothesis (or hypotheses) based on the variables contained in the chosen dataset. The student should fully describe the pertinent data and perform statistical analysis. This statistical analysis should speak to the research question (hypothesis) the student has chosen.

I have attached everything about the research paper in the file (a specific description of the research paper, the topics, and the data). There is also a research paper Excel worksheet that should be completed.

TutorAR
School: New York University

Hi, find attached the completed work. Feel free to ask for any clarification or editing if need be. Looking forward to working with you in the future. Thank you.

Surname 1

Introduction to Statistics Research Paper
Student’s Name
Professor’s Name
Course Title
Date

Introduction
There has been a continuous debate on whether the crime rate in towns has an
impact on the proportion of non-retail businesses in those towns. The increased crime
rate in most US towns has made businesspeople fearful about the type of
businesses they can invest in. It would be a loss if a businessperson invested in a non-retail
business only to lose the entire investment to criminals. For this reason, most people would
prefer to invest in retail businesses, as they seem more secure than their counterparts. The
research question is ‘Does the crime rate by town affect the proportion of non-retail business
acres per town?’. The objective of this research is to establish whether the crime rate by town
has a significant impact on the proportion of non-retail business acres per town.
Data Description
The data used for this research is the Boston Housing Dataset, which was obtained from
https://archive.ics.uci.edu/ml/datasets/Housing. Two variables are selected for analysis. The
independent variable is CRIM, the per capita crime rate by town, while the dependent variable is
INDUS, the proportion of non-retail business acres per town. The mean per capita crime rate
is 1.3737, obtained from the Excel formula =AVERAGE(A2:A437), while the mean proportion
of non-retail business acres per town is 10.24, obtained from the Excel formula
=AVERAGE(B2:B437). The mean values indicate that, on average, the per capita crime rate in Boston is
1.37% while the proportion of non-retail business acres per town is 10.24%. This
research will seek to find out whether the per capita crime rate (1.37% on average) has a significant
effect on the proportion of non-retail business acres (10.24% on average). The mean may not be a fair
representative of the data because it is easily influenced by outliers. For this reason the median, which
is not influenced by outliers, is also reported as a more robust measure of the center of the data set.
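The point about outliers can be illustrated with a minimal sketch. The numbers below are hypothetical and are not taken from the Boston Housing Dataset; they only mimic a situation where most towns have a low crime rate and one town has a much higher one.

```python
from statistics import mean, median

# Hypothetical per capita crime rates (NOT values from the Boston Housing data):
# four towns with low rates and one town with a far higher rate.
crime_rates = [0.1, 0.2, 0.2, 0.3, 9.2]

average = mean(crime_rates)    # pulled upward by the single large value
middle = median(crime_rates)   # middle value of the sorted list

print(f"mean = {average:.2f}, median = {middle:.2f}")
```

In this made-up sample the mean sits well above every value except the largest one, while the median stays with the typical towns. In Excel the corresponding functions are AVERAGE and MEDIAN.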

The standard deviation of the per capita crime rate in Boston is 2.4578 and the median
is 0.180. For the proportion of non-retail business acres in Boston, the standard deviation is
6.7373 and the median is 8.14. Comparing the individual values with their means, it can be
noted that the variation from the mean is not great, thus the data is taken to be normally distributed.
Without the standard deviation, one cannot tell whether the data is close to the average or whether it is
spread out over a wider range. It can also be used to compare two datasets that have the same average
values. A scatter plot of the per capita crime rate, generated in Excel, is shown below.

[Figure: scatter plot of CRIM; vertical axis from 0 to 12, horizontal axis from 0 to 500]

Statistical Analysis
From the data analysis, the correlation coefficient of the two variables is 0.567475, which is a
moderate positive correlation. This shows that as the per capita crime rate increases in Boston,
the proportion of non-retail business acres also tends to increase, and as the crime rate decreases, the
proportion of non-retail business acres tends to decrease.
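The paper does not say how the coefficient was computed. As a rough sketch, the Pearson correlation coefficient can be calculated as below; the function name `pearson_r` and the sample pairs are illustrative assumptions, not the paper's data, so the result will not match 0.567475.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical (CRIM, INDUS) pairs, for illustration only
crim = [0.1, 0.3, 0.5, 2.0, 8.0]
indus = [2.5, 7.0, 6.0, 18.0, 19.5]

r = pearson_r(crim, indus)
print(f"r = {r:.4f}")
```

A value near +1 means the two variables rise together, a value near -1 means one falls as the other rises, and a value near 0 means little linear association. Excel's CORREL function computes the same quantity for two ranges.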

A confidence interval is a range of values constructed so that, at a chosen confidence
level, it is expected to contain the true value of a population parameter. The width of the
interval is determined by the confidence level that is selected by the user. A confidence
interval is used to express the margin of error of an estimate. We need confidence
intervals because they indicate how well the sample statistic, or point estimate,
reflects the underlying population value. They do this by providing a range of
values that is likely to contain the population parameter (Larson & Farber, 2012).
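As an illustration only, a confidence interval for a mean could be sketched as follows. This assumes a normal approximation (for small samples a t-distribution would usually be preferred), and the helper name `mean_confidence_interval` and the sample values are hypothetical rather than taken from the paper.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    n = len(sample)
    point_estimate = mean(sample)              # sample mean
    standard_error = stdev(sample) / sqrt(n)   # sample SD divided by sqrt(n)
    # Two-sided critical value, about 1.96 for a 95% confidence level
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error                # margin of error
    return point_estimate - margin, point_estimate + margin

# Hypothetical sample values, for illustration only
sample = [4.1, 5.0, 5.6, 6.2, 7.3, 8.1, 9.4, 10.2, 11.0, 12.5]

low, high = mean_confidence_interval(sample)
low99, high99 = mean_confidence_interval(sample, confidence=0.99)
print(f"95% CI: ({low:.2f}, {high:.2f})")
print(f"99% CI: ({low99:.2f}, {high99:.2f})")
```

Raising the confidence level from 95% to 99% widens the interval, which reflects the role of the user-selected confidence level described above.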
A point estimate, on the other hand, i...

