Running head: COURSE PROJECT
Overview of the scenario
A client is interested in the distribution of salaries for jobs in the state of Minnesota.
Therefore, this paper analyzes a dataset of 364 job listings, each recorded by job title and
yearly salary, with salaries ranging from approximately $40,000 to $120,000.
There are two variables in the dataset. The first variable is the job title, while the second
variable is the yearly salary of each job title in the state of Minnesota. The job title is the
qualitative variable, since it describes non-numerical data that fit into categories. The salary
is the quantitative variable, since it consists of numerical data.
The salary variable is a continuous variable since it can take almost any numerical value.
Additionally, it can be subdivided into finer increments, depending on the precision of
measurement. The different numerical values of the salary variable arise because they
represent different job titles.
Because the job title variable is qualitative and its categories have no numerical
significance, it is measured at the nominal level. The salary variable is measured at the
ratio level, since it has a true zero point (a salary of zero means no pay), which makes
ratios between salaries meaningful.
Measures of center
Measures of center provide a representative value that summarizes the dataset. They
include the mean, median, mode, and midrange. To begin with, the mean, which is also
referred to as the average, is the most common measure of center. However, it tends to be
affected by extreme values, which makes it unreliable in a skewed distribution (Witte,
Secondly, the median is the value at the center of an ordered dataset. Half of
the values in the dataset are less than the median, while the remaining half are greater than the
median. For this reason, the median is the most suitable measure of center for a skewed
distribution. Thirdly, the mode is the value that appears more often than any other v...
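The measures of center described above can be illustrated with a minimal Python sketch. The salary figures below are hypothetical values chosen for illustration only and are not taken from the client's dataset; the helper names midrange and measures_of_center are likewise assumptions introduced for this example.

```python
from statistics import mean, median, multimode

# Hypothetical yearly salaries in dollars (illustrative only, not the client's data).
salaries = [42000, 48000, 48000, 55000, 61000, 118000]


def midrange(values):
    """Return the midpoint between the smallest and largest value."""
    return (min(values) + max(values)) / 2


def measures_of_center(values):
    """Collect the four measures of center for a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        # multimode returns a list, since a dataset can have more than one mode.
        "mode": multimode(values),
        "midrange": midrange(values),
    }


summary = measures_of_center(salaries)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this hypothetical sample, the single high salary of $118,000 pulls the mean ($62,000) well above the median ($51,500), which is the kind of sensitivity to extreme values that makes the median preferable for a skewed distribution.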