Introduction to statistics supplementary lecture notes

1 Introduction: What is Statistics?
Statistics is:
‘the science of learning from data, and of measuring, controlling, and communicating uncertainty;
and it thereby provides the navigation essential for controlling the course of scientific and societal
advances.’
Davidian, M. and Louis, T.A. (2012), Science.
There are two basic forms: descriptive statistics and inferential statistics. In this course we will discuss both,
with inferential statistics being the major emphasis.
Descriptive Statistics is primarily about summarizing a given data set through numerical summaries and
graphs. It can be used for exploratory analysis, to visualize the information contained in the data and to
suggest hypotheses.
Descriptive statistics is useful and important, and it has become more engaging now that interactive
computer graphics are regularly used to display numerical information (e.g. Hans Rosling’s visualisation of the
change in countries’ health and wealth over time; see YouTube).
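As a minimal sketch of what numerical and graphical summaries look like in R, the following uses the built-in data set faithful (eruptions of the Old Faithful geyser). This data set is chosen here purely for illustration and is an assumption of the example, not one of the course data sets.

```r
# Built-in data set shipped with R (an illustrative choice, not a course data set)
waiting <- faithful$waiting   # waiting time between eruptions, in minutes

# Numerical summaries
n_obs        <- length(waiting)   # number of observations
mean_waiting <- mean(waiting)     # sample mean
sd_waiting   <- sd(waiting)       # sample standard deviation
five_number  <- summary(waiting)  # minimum, quartiles, mean and maximum
print(five_number)

# Graphical summary: a histogram of the waiting times
hist(waiting,
     main = "Waiting time between eruptions",
     xlab = "Waiting time (minutes)")
```

A histogram like this one can reveal features, such as more than one peak, that a single number like the mean would hide.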
Inferential Statistics is concerned with methods for making conclusions about a population using
information from a sample, and assessing the reliability of, and uncertainty in, these conclusions.
This allows us to make judgements in the presence of uncertainty and variability, which is extremely
important in underpinning evidence-based decision making in science, government, business etc.
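The idea of learning about a population from a sample can be sketched with a small simulation. Everything here is hypothetical: the population is simulated from a normal distribution with an arbitrarily chosen mean and standard deviation, and the interval uses a simple normal approximation as one possible way of expressing uncertainty.

```r
set.seed(1)  # make the simulated example reproducible

# Hypothetical population: 100,000 simulated values (parameters are arbitrary)
population <- rnorm(100000, mean = 50, sd = 10)

# In practice we would only observe a sample from the population
sample_values <- sample(population, size = 100)

# Point estimate of the population mean, and its estimated standard error
estimate  <- mean(sample_values)
std_error <- sd(sample_values) / sqrt(length(sample_values))

# Approximate 95% interval for the population mean (normal approximation)
interval <- estimate + c(-1, 1) * qnorm(0.975) * std_error

cat("Population mean:", mean(population), "\n")
cat("Sample estimate:", estimate, "\n")
cat("Approximate 95% interval:", interval, "\n")
```

In a real study the population mean is unknown; it can be printed here only because the population was simulated.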
Many statistical analyses and calculations are easiest to perform using a computer. We will learn how to
use the statistical software R, which is freely available to download from http://r-project.org for use
on your own computer. A good introductory guide is ‘Introduction to R’ by Venables et al. (2006), which
can be downloaded as a PDF from the R project website, or accessed from the R software itself via the menu
(Help > Manuals).
To interact with R, we type commands into the console, or write script files which contain several commands
for longer analyses. These commands are written in the R computer programming language, whose syntax
is fairly easy to learn. In this way, we can perform mathematical and statistical calculations. R has many
existing built-in functions, and users are also able to create their own functions. The R software also has very
good graphical facilities, which can produce high quality statistical plots. Datasets for use in the R sessions
are available from the course website https://minerva.it.manchester.ac.uk/~saralees/intro.html. You
can download these and store them for use in the lab sessions.
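A short sketch of the kinds of commands described above, which could be typed at the console one at a time or saved together in a script file. The values in x and the function name value_range are made up for illustration.

```r
# Arithmetic typed at the console
2 + 3 * 4

# Store values in a vector and apply built-in functions
x <- c(4.1, 5.6, 3.8, 6.2, 5.0)   # made-up values for illustration
mean(x)
sqrt(16)

# A user-defined function: the range (maximum minus minimum) of a numeric vector
value_range <- function(v) {
  max(v) - min(v)
}
value_range(x)

# A simple plot of the values against their position in the vector
plot(x, type = "b", xlab = "Index", ylab = "Value", main = "Example plot")
```

At the console, R prints the result of each expression as it is entered; assignments made with <- store a value without printing it.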
2 Populations and samples
A population is the collection of all individuals or items under consideration in the study. For a given
population there will typically be one or more variables in which we are interested. For example, consider the
following populations together with corresponding variables of interest:
(i) All adults in the UK who are eligible to vote; the variable of interest is the political party supported.
(ii) Car batteries of a particular type manufactured by a particular company; the variable of interest is the
lifetime of the battery before failure.
