is highly recommended that you make these entries on a daily basis. You will be
assessed on the completeness of your Learning Journal, and the quality of your self-
reflection.
You should date each entry, and use clear titles and sub-headings. These entries should
be brief, direct sentences indicating quick comments or notes such as:
* when you completed each step in the Learning Guide during the week,
* any problems or unexpected events that occurred during the week (including
problems understanding new or old material), and
* any other noteworthy events that might affect your performance in this class.
There is no need to include personal information or details of family events, but be sure
to mention the existence of any situations that will positively or negatively affect your
ability to focus on the classwork.
2. Vocabulary and R functions
Enter the following command in R to read a simple help page about the table() command
(this is for your information, you do not need to show the output):
?table
Now enter the following command and describe the output of the table() command. What
does the first row of numbers in table() output represent? What does the second row of
output represent?
x <- c(5, 8, 4, 1, 5, 6, 5, 9, 4, 2, 5, 7, 5, 3, 6, 4, 5, 3, 7, 6)
table(x)
[optional: You can test your theory by altering the numbers and rerunning the table
command.]
3. Task (References: Question 1.1 page 10-12 and self-Quiz Unit 1 Question 6 and 7)
a) Read section 1.5 in the Yakir textbook. If you were a teacher and had 30 students in
your class and wanted to know the class average on the first quiz, would you use a
parameter or a statistic? Why?
b) If you wanted to know how many people in your country recognize the name of your
new company, would you use a parameter or a statistic? Why?
I downloaded R, then found that RStudio is better to work in.
I started reading the first chapter of the book.
15/11/2020
After using the table() function, I did not understand what the frequency for each value
means, or what determines it.
I ran into an issue in the tutorial on downloading R and the data files. I do not know
whether it is a problem, but when I checked the data files as instructed in the Data Files
section, specifically at the step where the (ex1.csv) file is assigned to a variable, and
tried to view the summary of this file:
> summary(ex.1)
These are my results:
> summary(ex.1)
       id              sex                height
 Min.   :1538611   Length:100         Min.   :117.0
 1st Qu.:3339583   Class :character   1st Qu.:158.0
 Median :5105620   Mode  :character   Median :171.0
 Mean   :5412367                      Mean   :170.1
 3rd Qu.:7622236                      3rd Qu.:180.2
 Max.   :9878130                      Max.   :208.0
But in the tutorial, this is the result that I am supposed to get:
       id              sex         height
 Min.   :1538611   FEMALE:54   Min.   :117.0
 1st Qu.:3339583   MALE  :46   1st Qu.:158.0
 Median :5105620               Median :171.0
 Mean   :5412367               Mean   :170.1
 3rd Qu.:7622236               3rd Qu.:180.2
 Max.   :9878130               Max.   :208.0
I did not get the counts of females and males.
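A likely explanation, assuming R 4.0.0 or later is installed: since that release, read.csv() no longer converts text columns to factors by default (stringsAsFactors = FALSE), so summary() reports sex as a plain character column. A minimal sketch of one possible fix follows; the data frame ex.demo and its values are invented stand-ins, not the contents of ex1.csv.

```r
# Stand-in for the data frame read from the file (values invented for illustration).
ex.demo <- data.frame(
  id     = c(1001, 1002, 1003, 1004, 1005),
  sex    = c("FEMALE", "MALE", "FEMALE", "FEMALE", "MALE"),
  height = c(160, 175, 158, 171, 182),
  stringsAsFactors = FALSE   # mimics the read.csv() default in R >= 4.0.0
)

summary(ex.demo$sex)         # character column: only Length / Class / Mode

# Convert the text column to a factor so that summary() counts each level.
ex.demo$sex <- factor(ex.demo$sex)
sex.counts <- summary(ex.demo$sex)
print(sex.counts)
```

With the real data, the same conversion applied to ex.1$sex, or passing stringsAsFactors = TRUE to read.csv() when reading the file, should give a summary with one count per level, as in the tutorial.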
Finally, I found a Coursera course, Data Science: Foundations using R, and I think it
may help me achieve a better grade in this course and understand R better.
2.
> x <- c(5, 8, 4, 1, 5, 6, 5, 9, 4, 2, 5, 7, 5, 3, 6, 4, 5, 3, 7, 6)
> table(x)
x
1 2 3 4 5 6 7 8 9
1 1 2 3 6 3 2 1 1
After this example, I understand what the frequency is.
The first line of the table lists the distinct values in our vector x; the table function
sorts them from the smallest number to the largest. The second line gives the frequency,
that is, how many times each number appears in our vector.
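This reading of the output can be checked with a short sketch that pulls the two rows apart. The variable names freq, values and counts are chosen here for illustration only.

```r
x <- c(5, 8, 4, 1, 5, 6, 5, 9, 4, 2, 5, 7, 5, 3, 6, 4, 5, 3, 7, 6)
freq <- table(x)

values <- names(freq)        # first row: the distinct values of x, in increasing order
counts <- as.vector(freq)    # second row: how many times each value occurs

print(values)
print(counts)

# The frequencies add up to the number of observations in x.
print(sum(counts) == length(x))
```

Note that names() returns the values as text labels, which is how table() stores the categories it counted.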
3.
a) I would use a statistic because the statistic is the average number of points earned by
students in the class (Yakir, 2011).
b) If I want to know the number of people who recognize the name of my company, I will
choose a sample and compute from it the proportion of people who recognized my company
name; this is the statistic. But if I want to know how many people in the whole country
recognize the name, I should use the parameter, because it is a number that is a property
of the population, while a statistic is a number that is a property of the sample
(Yakir, 2011).
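The difference between the two terms can be sketched with a small simulation. Everything here is hypothetical: the population size of 100000, the 30% recognition rate, and the sample size of 500 are assumptions made up for illustration, not figures from the textbook.

```r
set.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1 = recognizes the company name, 0 = does not.
population <- rbinom(100000, size = 1, prob = 0.3)

# Parameter: a property of the whole population (rarely observable in practice).
parameter <- mean(population)

# Statistic: the same quantity computed from a random sample of 500 people.
survey <- sample(population, size = 500)
statistic <- mean(survey)

print(c(parameter = parameter, statistic = statistic))
```

The statistic varies from sample to sample, while the parameter is a single fixed number for the population; the sample value serves as an estimate of it.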
A good submission, Youssif
In Q3a) the entire population is at your disposal so you could find the parameter.
In Q3b) it is not feasible to obtain the population given its nature, so it's more appropriate
to draw a sample and compute the statistic.
Hope this helps.
Q1: 4/4, Q2: 2/2, Q3: 2/4, Total = 8/10