
Florida international university psy 4304 exam 2 review

Psychological Testing
___
Exam 2
MILLER CH 6: RELIABILITY/PRECISION
Define reliability/precision, and describe three methods for estimating the
reliability/precision of a psychological test and its scores.
Reliability: consistency of test scores. All tests contain some error.
The more consistent the test scores, the greater the reliability/precision.
Results of the statistical evaluation of reliability > reliability coefficient
Measurement Error: variations in repeated measurements of the same thing (book ex.: measuring a room)
Reliable Test: one we can trust to measure e/person in approx. the same way every
time it’s used.
Psychological tests don’t have the same reliability as physical measurements.
Reliability Coefficient: (3 categories) The method chosen depends on the test itself and the
conditions under which it is administered.
1. Test-retest (TRT) method: same group, two occasions. Scores are then compared using
correlation. The time interval might be a few hours to several years; the longer the interval, the
more TRT reliability will decline. Dif. in administration will introduce error.
Ex. Personality Assessment Inventory (PAI) Interval M = 24 days, 75 normal
adults
Practice Effect (cons): when test takers benefit from taking the test the first time,
which enables them to solve problems more quickly. TRT is appropriate when test
takers are NOT likely to learn something or when interval time is long enough to
prevent practice effect.
2. Alternate forms (AF) method: two forms of the same test, same people. Time interval is
as short as possible, usually the same day. Parallel forms might not be truly equivalent.
Much easier to dev. for well-defined characteristics (e.g., math ability) than for personality
traits. No practice effect.
Ex. Test of Nonverbal Intelligence, ed. 4 (TONI-4). Does not req. language;
answers are given by nod, blink, or point.
Order Effect: changes in test scores resulting from the order in which tests were
taken.
3. Internal consistency method: (1 test, 1 group) measure of how related the items (or
group of items) on the test are to one another, how they are measuring the same/similar
attribute. Requires a large sample.
Split-half method: divides the test into two halves, equivalent in length and content, &
compares the set of ind. scores on the first ½ to the second ½. Book: use random
assignment to place the Qs into the two halves. Professor: odd/even number method or
1-50/51-100 method. Shortening the test reduces reliability.
Note:
Homogeneous tests = measure only 1 characteristic or trait.
Heterogeneous tests = measure more than 1 characteristic or trait. Internal consistency might
be lower; a solution is to separate the test by trait and estimate each part on its own.
Ex. PAI scale has 5 options, they used Cronbach's α.
4. Scorer reliability: an ind. can make mistakes in scoring, so 2 or more people should score the
test. Conducting studies of scorer reliability for a test ensures that the instructions are clear
so that multiple scorers arrive at the same results faster.
Interscorer agreement: amount of consistency among scorers’ judgments.
Intrascorer reliability: each scorer's internal consistency.
Ex. Wisconsin Card Sorting Test (WCST) tests perseveration & abstract thinking.
Also Bayley Scales of Infant & Toddler Dev. far exceeded accepted guidelines.
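The estimation methods above can be sketched in a few lines of Python. This is a minimal illustration, not the book's procedure: the score lists are made-up data, and the helper names (`pearson_r`, `split_half_r`, `cronbach_alpha`) are hypothetical. Test-retest and alternate forms both reduce to correlating two sets of scores from the same people; split-half correlates odd-item totals with even-item totals; alpha uses the usual item-variance formula.

```python
from statistics import mean, pvariance


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def split_half_r(item_scores):
    """Odd/even split: correlate each person's odd-item total with the even-item total."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return pearson_r(odd, even)


def cronbach_alpha(item_scores):
    """Coefficient alpha: k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(item_scores[0])
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 test takers, total scores on two occasions (test-retest).
time1 = [10, 12, 15, 18, 20]
time2 = [11, 12, 14, 19, 21]

# Hypothetical data: the same 5 people answering 4 items on a 1-5 scale.
items = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 4, 5, 4],
    [5, 4, 5, 5],
]

print(f"test-retest r = {pearson_r(time1, time2):.3f}")
print(f"split-half r  = {split_half_r(items):.3f}")
print(f"alpha         = {cronbach_alpha(items):.3f}")
```

With only 5 people and 4 items the numbers are for illustration only; the notes point out that internal consistency estimates require a large sample.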
Describe how an observed test score is made up of the true score and random error,
and describe the difference between random error and systematic error.
Classical Test Theory: any test score that a person obtains can be considered to be made of
two parts.
X (obtained/observed score) = T (true score) + e (error)
1. True Score: a measure of the attribute that the test measures; it can never really be known. It is
the average score an ind. would get if he took the test an infinite # of times.
2. Random error (RE): dif. b/t a person’s actual score X and his true score T. In a single
examination it’s impossible to know its effects. B/c it’s random, over infinite measures its
M = 0. This is why making a test longer reduces the influence of error.
a. Systematic error (SE): a single source of error that increases or decreases the T by the same
amount in every measure. Difficult to identify. Unlike random error, SE does not
lower the reliability of a test. Practice and order effects can add systematic &
random error to test scores.
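A small simulation can make the two kinds of error concrete. This is a sketch under assumptions that are not in the notes: errors are drawn from a normal distribution, the true score of 50, the spreads, and the +3 bias are arbitrary, and `pearson_r` is a hypothetical helper. It shows random error averaging out toward 0 over many measurements, and a constant systematic error shifting every score while leaving the correlation between two administrations unchanged.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


random.seed(0)  # fixed seed so the sketch is repeatable

# Random error: one person with true score T, tested many times.
# Each observed score is X = T + e, with e drawn at random around 0.
T = 50
observed = [T + random.gauss(0, 5) for _ in range(20_000)]
avg_observed = mean(observed)  # close to T because the errors average toward 0

# Systematic error: the same constant (+3, arbitrary) added to everyone's
# score on the second administration.
true_scores = [random.gauss(100, 15) for _ in range(500)]
test1 = [t + random.gauss(0, 5) for t in true_scores]
test2 = [t + random.gauss(0, 5) for t in true_scores]
test2_biased = [x + 3 for x in test2]

r_plain = pearson_r(test1, test2)
r_biased = pearson_r(test1, test2_biased)

print(f"average of observed scores: {avg_observed:.2f} (true score {T})")
print(f"r without bias: {r_plain:.4f}")
print(f"r with +3 bias: {r_biased:.4f}")
```

The two correlations agree because adding a constant changes the mean of the scores but not their ordering or spread, which is what the reliability coefficient is sensitive to.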
Reliability Coefficient (RC) (Pearson product moment correlation)
We use correlation to provide an index of the strength of the relationship b/t two sets of scores
(symbol r_xx, always two subscripts of the same letter)
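For reference, the Pearson product-moment correlation between two sets of scores from the same n people can be written as below. The notation is an assumption for illustration: x_i is person i's score on the first set, x'_i the score on the second set, and the bars denote the means of each set.

```latex
r_{xx} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(x'_i - \bar{x}')}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (x'_i - \bar{x}')^2}}
```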
Differentiate between the KR-20 and coefficient alpha formulas, and understand
how they are used to estimate internal consistency
Adjusting Split-half Reliability
The # of questions relates to reliability, since the test is split in two, RC can be adjusted with the
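One standard adjustment for a split-half coefficient is the Spearman-Brown formula; whether it is the one these notes go on to name is an assumption. The sketch below uses a hypothetical helper `spearman_brown` and a made-up half-test correlation of 0.70 to estimate what the reliability would be at full length.

```python
def spearman_brown(r_half, n=2.0):
    """Predicted reliability when the test length is multiplied by n.

    With n = 2 this steps a half-test correlation up to the full-length test.
    """
    return n * r_half / (1 + (n - 1) * r_half)


r_half = 0.70                    # hypothetical correlation between the two halves
r_full = spearman_brown(r_half)  # 2 * 0.70 / (1 + 0.70)
print(f"adjusted full-length reliability = {r_full:.3f}")  # prints 0.824
```

The adjusted value is higher than the half-test correlation, which matches the point that a longer test reduces the influence of random error.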
