Course: Educational Assessment and Evaluation
Code : 8602
Semester: Autumn, 2021
Level : B.Ed
Assignment # 1
Q.1 What are the types of assessment? Differentiate between assessment for learning, assessment of learning and assessment as learning.
Types of Assessment
"As coach and facilitator, the teacher uses formative assessment to help support and enhance
student learning, As judge and jury, the teacher makes summative judgments about a
student's achievement..." Atkin, Black & Coffey (2001) Assessment is a purposeful activity
aiming to facilitate students’ learning and to improve the quality of instruction. Based upon
the functions that it performs, assessment is generally divided into three types: assessment for
learning, assessment of learning and assessment as learning.
a) Assessment for Learning (Formative Assessment)
Assessment for learning is a continuous and an ongoing assessment that allows teachers to
monitor students on a day-to-day basis and modify their teaching based on what the students
need to be successful. This assessment provides students with the timely, specific feedback
that they need to enhance their learning. The essence of formative assessment is that the information it yields is used, on the one hand, to make immediate instructional decisions and, on the other hand, to provide timely feedback to the students so that they can learn better. If the primary purpose of assessment is to support
high-quality learning then formative assessment ought to be understood as the most important
assessment practice.
Assessment for learning has many unique characteristics; for example, this type of assessment is treated as "practice." Learners should not be graded on skills and concepts that have just been introduced. They should be given opportunities to practice. Formative assessment helps teachers to determine next steps during the learning process as the instruction approaches the summative assessment of student learning. A good analogy for this is the road test that is required to receive a driver's license. Before the final driving test, or summative assessment, a learner practices by being assessed again and again so that deficiencies in the skill can be pointed out.
Another distinctive characteristic of formative assessment is student involvement. If students
are not involved in the assessment process, formative assessment is not practiced or
implemented to its full effectiveness.
b) Assessment of Learning (Summative Assessment)
Summative assessment or assessment of learning is used to evaluate students’ achievement at
some point in time, generally at the end of a course. The purpose of this assessment is to help
the teacher, students and parents know how well the student has completed the learning task. In other words, summative evaluation is used to assign a grade to a student, which indicates his/her level of achievement in the course or program. Assessment of learning is basically designed to provide useful information about the performance of the learners rather than immediate and direct feedback to teachers and learners; therefore it usually has little effect on learning. However, high-quality summative information can help and guide teachers to organize their courses and decide their teaching strategies, and educational programs can be modified on the basis of the information generated by summative assessment.
Many experts believe that all forms of assessment have some formative element. The
difference only lies in the nature and the purpose for which assessment is being conducted.
What Is The Difference?
Assessment OF learning involves looking at assessment information at the end of the
teaching and learning process to rank students’ achievement levels against a standard. It is
summative in nature and typically involves standardized tests. Assessment OF learning scores
are often used to rate teachers’ or schools’ ability to move student achievement based on the
results of single, point-in-time tests, e.g., those generated by the Northwest Evaluation
Association (NWEA) or state tests.
Assessment FOR learning embeds assessment processes throughout the teaching and
learning process to constantly adjust instructional strategy. While it can include test data, it also draws on other quantitative and qualitative data and encompasses a great deal of anecdotal and descriptive data. Using NWEA in conjunction with teacher-generated daily
data (checks for understanding, exit tickets, observations of student engagement) to alter
instructional strategy during lesson or unit delivery is an example of assessment FOR
learning in action.
Reference
• Hopkins, C. (2008). Classroom Measurement and Evaluation. Illinois: Peacock.
• Gipps, C. (1994). Beyond Testing: Towards a Theory of Educational Assessment. London: Routledge.
Q.2 What do you know about taxonomy of educational objectives? Write in detail?
Definition of Objectives
Education is, without any doubt, a purposeful activity. Every step of this activity has, and should definitely have, a particular purpose. Therefore learning objectives are a prime and integral part of the teaching-learning process.
A learning objective refers to the statement of what students will obtain through instruction of
certain content. In other words ‘an objective is a description of a performance you want
learners to be able to exhibit before you consider them competent. An objective describes an
intended result of instruction, rather than the process of instruction itself.’ (Mager, p. 5)
In the teaching-learning process, learning objectives have a unique importance. The role they play includes, but is not limited to, the following three functions: firstly, they guide and direct the selection of instructional content and procedures. Secondly, they facilitate the appropriate evaluation of the instruction. Thirdly, they help the students to organize their efforts to accomplish the intent of the instruction.
Taxonomy of Educational Objectives
Following the 1948 Convention of the American Psychological Association, a group of
college examiners considered the need for a system of classifying educational goals for the
evaluation of student performance. Years later and as a result of this effort, Benjamin Bloom
formulated a classification of "the goals of the educational process". Eventually, Bloom
established a hierarchy of educational objectives for categorizing the level of abstraction of questions that commonly occur in educational settings (Bloom, 1965). This classification is generally referred to as Bloom's Taxonomy. Taxonomy means 'a set of classification principles', or 'structure'. The following are the six levels in this taxonomy: Knowledge, Comprehension, Application, Analysis, Synthesis, and Evaluation. The detail is given below:
Cognitive domain:
The cognitive domain (Bloom, 1956) involves the development of intellectual skills. This
includes the recall or recognition of specific facts, procedural patterns, and concepts that
serve in the development of intellectual abilities and skills. There are six levels of this domain
starting from the simplest cognitive behaviour to the most complex. The levels can be
thought of as degrees of difficulty. That is, the first ones must normally be mastered before
the next ones can take place.
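To make the hierarchy concrete, a purely illustrative sketch follows; the ordering of the levels is as stated above, while the example action verbs are common choices and not taken from the text.

```python
# Illustrative only: the six cognitive levels in order of complexity, each
# paired with a few action verbs commonly used when writing objectives.
cognitive_levels = [
    ("Knowledge",     ["define", "list", "recall"]),
    ("Comprehension", ["explain", "summarize", "classify"]),
    ("Application",   ["apply", "demonstrate", "solve"]),
    ("Analysis",      ["compare", "differentiate", "examine"]),
    ("Synthesis",     ["design", "compose", "formulate"]),
    ("Evaluation",    ["judge", "critique", "justify"]),
]

# The order matters: earlier levels are normally mastered before later ones.
for rank, (level, verbs) in enumerate(cognitive_levels, start=1):
    print(f"{rank}. {level}: {', '.join(verbs)}")
```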
Affective domain:
The affective domain is related to the manner in which we deal with things emotionally, such
as feelings, values, appreciation, enthusiasms, motivations, and attitudes. The five levels of
this domain include: receiving, responding, valuing, organization, and characterization by value.
Psychomotor domain:
Focus is on physical and kinesthetic skills. The psychomotor domain includes physical
movement, coordination, and use of the motor-skill areas. Development of these skills
requires practice and is measured in terms of speed, precision, distance, procedures, or
techniques in execution. There are seven levels of this domain, from the simplest behaviour to the most complex: perception, set, guided response, mechanism, complex overt response, adaptation, and origination.
Reference
Gronlund, N. E. (2006). Assessment of Student Achievement (8th ed.). USA: Pearson Education.
Popham, W. J. (2005). Classroom Assessment: What Teachers Need to Know. USA: Pearson Education.
Q.3 How will you define attitude? Elaborate its components?
Attitude
Attitude is, in its literal sense, a posture, action or disposition of a figure or a statue. In psychology it has been defined as a mental and neural state of readiness, organized through experience, exerting a directive or dynamic influence upon the individual's response to all objects and situations with which it is related.
Attitude is the state of mind with which you approach a task, a challenge, a person, love, or life in general. The definition of attitude is "a complex mental state involving beliefs and feelings and values and dispositions to act in certain ways". These beliefs and feelings differ because different people interpret the same events differently, and these differences arise from inherited characteristics.
(i) Components of Attitude
1. Cognitive Component:
It refers to that part of attitude which is related to the general knowledge of a person; for example, he says smoking is injurious to health. Such an idea held by a person is called the cognitive component of attitude.
2. Affective Component:
This part of attitude is related to statements which affect another person. For example, in an organization a personal report is given to the general manager. In the report he points out that the sales staff are not performing their due responsibilities. The general manager forwards a written notice to the marketing manager to negotiate with the sales staff.
3. Behavioral Component:
The behavioral component refers to that part of attitude which reflects the intention of a person in the short run or long run. For example, before the production and launch of a product, a report is prepared by the production department which states the intentions for the near future and the long run, and this report is handed over to top management for decision making.
(ii) List of Attitudes:
In the broader sense of the word there are only three attitudes: a positive attitude, a negative attitude, and a neutral attitude. But in a general sense, an attitude is known by what it is expressed through. Given below is a list of attitudes that are expressed by people and are more than personality traits, which you may have heard of, know of, or may even carry yourself:
• Acceptance
• Confidence
• Seriousness
• Optimism
• Interest
• Cooperation
• Happiness
• Respect
• Authority
• Sincerity
• Honesty
Reference
Ward, A. W., & Murray-Ward, M. (1999). Assessment in the Classroom. Belmont, CA: Wadsworth Publishing Co.
Gronlund, N. (1993). How to Make Achievement Tests and Assessments (5th ed.). NY: Allyn and Bacon.
Haladyna, T. M., & Downing, S. M. (1989). Validity of a taxonomy of multiple-choice item-writing rules. Applied Measurement in Education, 2(1), 51-78.
Q.4 What are the types of objective test questions? Also write their advantages and disadvantages.
Multiple Choice Questions
Multiple-choice test items consist of a stem or a question and three or more alternative
answers (options) with the correct answer sometimes called the keyed response and the
incorrect answers called distracters. This form is generally better than the incomplete stem
because it is simpler and more natural. Gronlund (1995) writes that the multiple-choice question is probably the most popular as well as the most widely applicable and effective type of objective test. The student selects a single response from a list of options. It can be used
effectively for any level of course outcome. It consists of two parts: the stem, which states the
problem and a list of three to five alternatives, one of which is the correct (key) answer and
the others are distracters (incorrect options that draw the less knowledgeable pupil away from
the correct response).
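As an illustration of these two parts, a minimal sketch follows; the item, its options and its key are invented for the example and are not taken from the text.

```python
# A hypothetical multiple-choice item, used only to illustrate the parts
# described above: a stem, alternatives, one keyed response and distracters.
item = {
    "stem": "Which type of assessment is used mainly to assign final grades?",
    "options": {
        "A": "Formative assessment",
        "B": "Summative assessment",
        "C": "Diagnostic assessment",
        "D": "Peer assessment",
    },
    "key": "B",  # the keyed (correct) response
}

# Every alternative other than the key serves as a distracter.
distracters = [letter for letter in item["options"] if letter != item["key"]]

print("Stem:", item["stem"])
print("Distracters:", distracters)  # ['A', 'C', 'D']
```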
Advantages
Versatility
Multiple-choice test items are appropriate for use in many different subject-matter areas, and
can be used to measure a great variety of educational objectives. They are adaptable to
various levels of learning outcomes, from simple recall of knowledge to more complex
levels, such as the student’s ability to:
• Analyze phenomena
• Apply principles to new situations
• Comprehend concepts and principles
• Discriminate between fact and opinion
• Interpret cause-and-effect relationships
• Interpret charts and graphs
• Judge the relevance of information
• Make inferences from given data
• Solve problems
The difficulty of multiple-choice items can be controlled by changing the alternatives, since
the more homogeneous the alternatives, the finer the distinction the students must make in
order to identify the correct answer. Multiple-choice items are amenable to item analysis,
which enables the teacher to improve the item by replacing distracters that are not functioning
properly. In addition, the distracters chosen by the student may be used to diagnose
misconceptions of the student or weaknesses in the teacher’s instruction.
Validity
In general, it takes much longer to respond to an essay test question than it does to respond to
a multiple-choice test item, since the composing and recording of an essay answer is such a
slow process. A student is therefore able to answer many multiple-choice items in the time it
would take to answer a single essay question. This feature enables the teacher using multiple-
choice items to test a broader sample of course contents in a given amount of testing time.
Consequently, the test scores will likely be more representative of the students’ overall
achievement in the course.
Reliability
Well-written multiple-choice test items compare favourably with other test item types on the
issue of reliability. They are less susceptible to guessing than are true-false test items, and
therefore capable of producing more reliable scores. Their scoring is more clear-cut than
short answer test item scoring because there are no misspelled or partial answers to deal with.
Since multiple-choice items are objectively scored, they are not affected by scorer
inconsistencies as are essay questions, and they are essentially immune to the influence of
bluffing and writing ability factors, both of which can lower the reliability of essay test
scores.
Efficiency
Multiple-choice items are amenable to rapid scoring, which is often done by scoring
machines. This expedites the reporting of test results to the student so that any follow-up
clarification of instruction may be done before the course has proceeded much further. Essay
questions, on the other hand, must be graded manually, one at a time. Overall, multiple-choice tests are:
• Very effective
• Versatile at all levels
• Minimum of writing for student
• Guessing reduced
• Can cover broad range of content
Disadvantages
Versatility
Since the student selects a response from a list of alternatives rather than supplying or
constructing a response, multiple-choice test items are not adaptable to measuring certain
learning outcomes, such as the student’s ability to:
• Articulate explanations
• Display thought processes
• Furnish information
• Organize personal thoughts.
• Perform a specific task
• Produce original ideas
• Provide examples
Such learning outcomes are better measured by short-answer or essay questions, or by performance tests.
Reliability
Although they are less susceptible to guessing than are true-false test items, multiple-choice items are still affected to a certain extent. This guessing factor reduces the reliability of
multiple-choice item scores somewhat, but increasing the number of items on the test offsets
this reduction in reliability.
Difficulty of Construction
Good multiple-choice test items are generally more difficult and time-consuming to write
than other types of test items. Coming up with plausible distracters requires a certain amount
of skill. This skill, however, may be increased through study, practice, and experience.
Gronlund (1995) writes that multiple-choice items are difficult to construct. Suitable distracters are often hard to come by, and the teacher is tempted to fill the void with a "junk" response. This narrows the effective range of options, which benefits the test-wise student. They are also exceedingly time-consuming to fashion, one hour per question being by no means the exception. Finally, multiple-choice items generally take students longer to complete (especially items containing fine discriminations) than do other types of objective question.
True/False Questions
A True-False test item requires the student to determine whether a statement is true or false.
The chief disadvantage of this type is the opportunity for successful guessing. According to
Gronlund (1995), the alternative-response test item consists of a declarative statement that the pupil is asked to mark true or false, right or wrong, correct or incorrect, yes or no, fact or opinion, agree or disagree, and the like. In each case there are only two possible answers. Because the true-false option is the most common, this type is mostly referred to as the true-false type. Students make a judgment about the validity of the statement. It is also known as a "binary-choice" item because there are only two options to select from. These types of items are more effective for assessing knowledge, comprehension, and application outcomes as defined in the cognitive domain of Bloom's Taxonomy of educational objectives.
Advantages:
• Easily assess verbal knowledge
• Each item contains only two possible answers
• Easy to construct for the teacher
• Easy to score for the examiner
• Helpful for poor students
• Can test large amounts of content
• Students can answer 3-4 questions per minute
Disadvantages:
• It is difficult to discriminate between students who know the material and students who do not.
• Students have a 50-50 chance of getting the right answer by guessing.
• A large number of items is needed for high reliability.
• They assess mainly lower-order thinking skills.
• They are a poor representation of students' learning achievement.
Matching items
According to Cunningham (1998), the matching items consist of two parallel columns. The
column on the left contains the questions to be answered, termed premises; the column on the
right, the answers, termed responses. The student is asked to associate each premise with a
response to form a matching pair.
Advantages:
The chief advantage of matching exercises is that a good deal of factual information can be
tested in minimal time, making the tests compact and efficient. They are especially well
suited to who, what, when and where types of subject matter. Further, students frequently find these tests fun to take because they have puzzle-like qualities.
• Maximum coverage at the knowledge level in a minimum amount of space/prep time
• Valuable in content areas that have a lot of facts
Disadvantages:
The principal difficulty with matching exercises is that teachers often find that the subject matter is insufficient in quantity or not well suited for matching terms. An exercise should be confined to homogeneous items containing one type of subject matter (for instance, authors-novels, inventions-inventors, major events-dates, terms-definitions, rules-examples and the like). Where unlike clusters of questions are used, an alert but poorly informed student can often recognize the ill-fitting items by their irrelevant and extraneous nature (for instance, the inclusion of the names of capital cities in a list of authors).
Q.5 Construct a test, administer it and ensure its reliability?
Construct a test
There are four main steps of standardized test construction. These steps and procedures help us to produce a valid, reliable and objective standardized test. The four main steps are: 1. Planning the Test, 2. Preparing the Test, 3. Trying Out the Test, and 4. Evaluating the Test.
Step # 1. Planning the Test:
Planning of the test is the first important step in the test construction. The main goal of
evaluation process is to collect valid, reliable and useful data about the student.
Therefore before going to prepare any test we must keep in mind that:
(1) What is to be measured?
(2) What content areas should be included and
(3) What types of test items are to be included.
Therefore the first step includes three major considerations:
1. Determining the objectives of testing.
2. Preparing test specifications.
3. Selecting appropriate types of test items.
1. Determining the Objectives of Testing:
A test can be used for different purposes in a teaching learning process. It can be used to
measure the entry performance, the progress during the teaching learning process and to
decide the mastery level achieved by the students. Tests serve as a good instrument to
measure the entry performance of the students. They answer the questions of whether the students have the requisite skills to enter the course and what previous knowledge the pupils possess. Therefore it must be decided whether the test will be used to measure the entry performance or the previous knowledge acquired by the student on the subject.
Tests can also be used for formative evaluation. Formative evaluation helps to carry on the teaching-learning process, to find out immediate learning difficulties and to suggest remedies. When the difficulties remain unsolved we may use diagnostic tests. Diagnostic tests should be prepared with great technical care, and specific items to diagnose specific areas of difficulty should be included in the test.
Tests are used to assign grades or to determine the mastery level of the students. These summative tests should cover all the instructional objectives and content areas of the course. Therefore attention must be given to this aspect while preparing a test.
2. Preparing Test Specifications:
The second important step in the test construction is to prepare the test specifications. In
order to be sure that the test will measure a representative sample of the instructional objectives and content areas, we must prepare test specifications. For this, an elaborate design is necessary for test construction. One of the most commonly used devices for this purpose is the 'Table of Specification' or 'Blue Print.'
Preparation of Table of Specification/Blue Print:
Preparation of the table of specification is the most important task in the planning stage. It acts as a guide for the test construction. A table of specification or 'Blue Print' is a three-dimensional chart showing the list of instructional objectives, content areas and types of items in its dimensions.
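A minimal sketch of what such a blueprint might look like is given below; the content areas, objective levels and item counts are hypothetical and only illustrate how the chart distributes items.

```python
# Hypothetical table of specification: rows are content areas, columns are
# instructional objectives, and each cell holds the planned number of items.
blueprint = {
    "Fractions":   {"Knowledge": 4, "Comprehension": 3, "Application": 3},
    "Decimals":    {"Knowledge": 3, "Comprehension": 4, "Application": 3},
    "Percentages": {"Knowledge": 3, "Comprehension": 3, "Application": 4},
}

# Row totals show how many items each content area receives.
for area, cells in blueprint.items():
    print(f"{area:12s} total items: {sum(cells.values())}")

# The grand total is the planned length of the whole test.
total_items = sum(sum(cells.values()) for cells in blueprint.values())
print("Total items on the test:", total_items)  # 30
```

A third dimension, the type of item planned for each cell, can be recorded in the same way.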
Step # 2. Preparing the Test:
After planning, preparation is the next important step in the test construction. In this step the test items are constructed in accordance with the table of specification. Each type of test item needs special care in construction.
The preparation stage includes the following three functions:
(i) Preparing test items.
(ii) Preparing instruction for the test.
(iii) Preparing the scoring key.
(i) Preparing the Test Items:
Preparation of test items is the most important task in the preparation step. Therefore care
must be taken in preparing a test item. The following principles help in preparing relevant test
items.
1. Test items must be appropriate for the learning outcome to be measured:
The test items should be so designed that they measure the performance described in the specific learning outcomes; that is, the items must be in accordance with the performance described in the specific learning outcomes.
2. Test items should measure all types of instructional objectives and the whole content
area:
The items in the test should be so prepared that they cover all the instructional objectives (knowledge, understanding, thinking skills) and match the specific learning outcomes and subject-matter content being measured. When the items are constructed on the basis of the table of specification, they become relevant.
3. The test items should be free from ambiguity:
The item should be clear. Inappropriate vocabulary and awkward sentence structure should
be avoided. The items should be so worded that all pupils understand the task.
4. The test items should be of appropriate difficulty level:
The test items should be of a proper difficulty level so that they can discriminate properly. If an item is meant for a criterion-referenced test, its difficulty level should be as per the difficulty level indicated by the statement of the specific learning outcome. Therefore if the learning task is easy the test item must be easy, and if the learning task is difficult then the test item must be difficult.
In a norm-referenced test the main purpose is to discriminate pupils according to achievement. Therefore the test should be so designed that there is a wide spread of test scores. The items should not be so easy that everyone answers them correctly, nor so difficult that everyone fails to answer them. The items should be of average difficulty level.
5. The test item must be free from technical errors and irrelevant clues:
Sometimes there are unintentional clues in the statement of the item which help the pupil to answer correctly, for example grammatical inconsistencies, verbal associations, extreme words (ever, seldom, always), and mechanical features (the correct statement being longer than the incorrect one). Therefore, while constructing a test item, care must be taken to avoid such clues.
6. Test items should be free from racial, ethnic and gender bias:
The items should be universal in nature. Care must be taken to make each item culture-fair. While portraying a role, all sections of society should be given equal importance. The terms used in the test item should have a universal meaning to all members of the group.
(ii) Preparing Instruction for the Test:
This is the most neglected aspect of test construction. Generally everybody gives attention to the construction of test items, and test makers often do not attach directions to the test items.
But the validity and reliability of the test items depend to a great extent upon the instructions for the test. N.E. Gronlund has suggested that the test maker should provide clear-cut directions about:
a. The purpose of testing.
b. The time allowed for answering.
c. The basis for answering.
d. The procedure for recording answers.
e. The methods to deal with guessing.
Direction about the Purpose of Testing:
A written statement about the purpose of the testing maintains the uniformity of the test.
Therefore there must be a written instruction about the purpose of the test before the test
items.
Instruction about the time allowed for answering:
Clear-cut instructions must be supplied to the pupils about the time allowed for the whole test. It is also better to indicate the approximate time required for answering each item, especially in the case of essay-type questions. The test maker should carefully judge the amount of time, taking into account the types of items, the age and ability of the students and the nature of the learning outcomes expected. Experts are of the opinion that it is better to allow more time than to deprive a slower student of the chance to answer the questions.
Instructions about basis for answering:
The test maker should provide specific directions on the basis of which the students will answer the items. Directions must clearly state whether the students will select the answer or supply the answer. In matching items, the basis of matching the premises and responses (states with capitals or countries with products) should be given. Special directions are necessary for interpretive items. In essay-type items, clear directions must be given about the types of responses expected from the pupils.
Instruction about recording answer:
Students should be instructed where and how to record the answers. Answers may be
recorded on the separate answer sheets or on the test paper itself. If they have to answer in the
test paper itself then they must be directed, whether to write the correct answer or to indicate
the correct answer from among the alternatives. If separate answer sheets are used to answer the test, directions may be given either on the test paper or on the answer sheet.
Instruction about guessing:
Directions must be provided to the students about whether they should guess on uncertain items in the case of recognition-type test items. If nothing is stated about guessing, then bold students will guess on such items while others will answer only those items of which they are confident. Thus the bold pupils may by chance answer some items correctly and secure a higher score. Therefore a direction must be given 'to guess, but not to make wild guesses.'
(iii) Preparing the Scoring Key:
A scoring key increases the reliability of a test. Therefore the test maker should provide the procedure for scoring the answer scripts. Directions must be given as to whether the scoring will be made by a scoring key (when the answer is recorded on the test paper) or by a scoring stencil (when the answer is recorded on a separate answer sheet), and how marks will be awarded to the test items.
In case of essay-type items it should be indicated whether to score with the 'point method' or with the 'rating method.' In the point method each answer is compared with a set of ideal answers in the scoring key, and then a given number of points is assigned.
In the rating method the answers are rated on the basis of degrees of quality, which determines the credit assigned to each answer. Thus a scoring key helps to obtain consistent data about the pupils' performance. So the test maker should prepare a comprehensive scoring procedure along with the test items.
Step # 3. Try Out of the Test:
Once the test is prepared, it is time to confirm the validity, reliability and usability of the test. The try out helps us to identify defective and ambiguous items, to determine the difficulty level of the test and to determine the discriminating power of the items.
Try out involves two important functions:
(a) Administration of the test.
(b) Scoring the test.
(a) Administration of the test:
Administration means administering the prepared test on a sample of pupils. The effectiveness of the final form of the test depends upon fair administration. Gronlund and Linn have
stated that ‘the guiding principle in administering any class room test is that all pupils must
be given a fair chance to demonstrate their achievement of learning outcomes being
measured.’ It implies that the pupils must be provided congenial physical and psychological
environment during the time of testing. Any other factor that may affect the testing procedure
should be controlled.
Physical environment means proper seating arrangement, proper light and ventilation and adequate space for invigilation. Psychological environment refers to those aspects which influence the mental condition of the pupil. Therefore steps should be taken to reduce the
anxiety of the students. The test should not be administered just before or after a great occasion like the annual sports or the annual drama.
One should follow the following principles during the test administration:
1. The teacher should talk as little as possible.
2. The teacher should not interrupt the students at the time of testing.
3. The teacher should not give any hints to any student who has asked about any item.
4. The teacher should provide proper invigilation in order to prevent the students from
cheating.
(b) Scoring the test:
Once the test is administered and the answer scripts are obtained, the next step is to score the answer scripts. A scoring key may be provided for scoring when the answer is on the test paper itself. A scoring key is a sample answer script on which the correct answers are recorded. When the answer is on a separate answer sheet, a scoring stencil may be used for scoring the items. A scoring stencil is a sample answer sheet on which the correct alternatives have been punched. By placing the scoring stencil on the pupil's answer script, correct answers can be marked. For essay-type items, separate instructions for scoring each learning objective may be provided.
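As a rough sketch of how a scoring key is applied (the key and the pupil's answers below are hypothetical), each recorded response is simply compared with the correct answer and one mark is awarded per match:

```python
# Hypothetical scoring key for a ten-item recognition-type test.
scoring_key = {1: "B", 2: "D", 3: "A", 4: "C", 5: "B",
               6: "A", 7: "D", 8: "C", 9: "B", 10: "A"}

# One pupil's recorded answers; an empty string marks an omitted item.
pupil_answers = {1: "B", 2: "D", 3: "C", 4: "C", 5: "A",
                 6: "A", 7: "D", 8: "C", 9: "B", 10: ""}

# Award one mark for every response that matches the key.
score = sum(1 for item_no, key in scoring_key.items()
            if pupil_answers.get(item_no) == key)
print("Raw score:", score, "out of", len(scoring_key))  # Raw score: 7 out of 10
```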
Correction for guessing:
When the pupils do not have sufficient time to answer the test, or are not ready to take the test, they guess the correct answers in recognition-type items. In that case, to eliminate the effect of guessing, a correction formula is used.
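The original formula is not reproduced here; the standard correction-for-guessing formula, which this passage presumably refers to, is Corrected Score = R - W / (n - 1), where R is the number of right answers, W the number of wrong answers (omitted items are not counted), and n the number of options per item. A minimal sketch under that assumption:

```python
# Standard correction-for-guessing formula (assumed, since the original
# formula is not shown): corrected = R - W / (n - 1).
def corrected_score(right: int, wrong: int, options_per_item: int) -> float:
    """Number right minus a penalty of wrong answers over (options - 1)."""
    return right - wrong / (options_per_item - 1)

# Example: 40 right and 12 wrong on a four-option multiple-choice test.
print(corrected_score(40, 12, 4))  # 36.0
```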
But there is a lack of agreement among psychometricians about the value of the correction formula so far as validity and reliability are concerned. In the words of Ebel, "neither the instruction nor penalties will remedy the problem of guessing."
Guilford is of the opinion that “when the middle is excluded in item analysis the question of
whether to correct or not correct the total scores becomes rather academic." Little said that "correction may either under- or over-correct the pupils' score." Keeping in view the above opinions, the test maker should decide not to use the correction for guessing. To avoid this situation he should give enough time for answering the test items.
Step # 4. Evaluating the Test:
Evaluating the test is the most important step in the test construction process. Evaluation is necessary to determine the quality of the test and the quality of the responses. Quality of the test implies how good and dependable the test is (its validity and reliability). Quality of the responses means identifying which items are misfits in the test. It also enables us to evaluate the usability of the test in a general classroom situation.
Evaluating the test involves the following functions:
(a) Item analysis.
(b) Determining validity of the test.
(c) Determining reliability of the test.
(d) Determining usability of the test.
(a) Item analysis:
Item analysis is a procedure which helps us to find out the answers to the following
questions:
a. Whether the items function as intended?
b. Whether the test items have an appropriate difficulty level?
c. Whether the items are free from irrelevant clues and other defects?
d. Whether the distracters in multiple choice type items are effective?
The item analysis data also help us:
a. To provide a basis for efficient class discussion of the test results
b. To provide a basis for remedial work
c. To increase skill in test construction
d. To improve classroom instruction.
Item Analysis Procedure:
Item analysis procedure gives special emphasis on item difficulty level and item
discriminating power.
The item analysis procedure follows the following steps:
1. The test papers should be ranked from highest to lowest.
2. Select 27% of the test papers from the highest end and 27% from the lowest end.
For example, if the test is administered to 60 students, then select 16 test papers from the highest end and 16 test papers from the lowest end.
3. Keep aside the other test papers as they are not required in the item analysis.
4. Tabulate the number of pupils in the upper and lower group who selected each alternative
for each test item. This can be done on the back of the test paper or a separate test item card
may be used (Fig. 3.1)
5. Calculate the item difficulty for each item by using the formula:
Item Difficulty (P) = (R / T) × 100
where R = total number of students who got the item correct, and T = total number of students who tried the item.
In our example (Fig. 3.1), out of the 32 students in both groups, 20 students have answered the item correctly and 30 students have tried the item. The item difficulty is therefore P = (20 / 30) × 100 ≈ 66.7%.
It implies that the item has a proper difficulty level, because it is customary to follow the 25% to 75% rule when judging item difficulty. This means that if an item has a difficulty value of more than 75% it is too easy, and if the value is less than 25% the item is too difficult.
6. Calculate the item discriminating power by using the following formula:
Item Discriminating Power (D) = (RU - RL) / (T / 2)
where RU = number of students from the upper group who got the answer correct, RL = number of students from the lower group who got the answer correct, and T/2 = half of the total number of pupils included in the item analysis.
In our example (Fig. 3.1), 15 students from the upper group and 5 students from the lower group responded to the item correctly, so D = (15 - 5) / 16 = 0.625, or about .63. A high positive ratio indicates high discriminating power; here .63 indicates an average discriminating power. If all 16 students from the lower group and all 16 students from the upper group answer the item correctly, then the discriminating power will be 0.00.
This indicates that the item has no discriminating power. If all 16 students from the upper group answer the item correctly and all the students from the lower group answer it incorrectly, then the item discriminating power will be 1.00, which indicates an item with maximum positive discriminating power. A short computational sketch of these two indices follows.
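The sketch below simply re-runs the worked example with the two formulas above; the counts (16 pupils per group, 15 correct in the upper group, 5 in the lower, 30 attempts in all) are taken from the example, not new data.

```python
# Item analysis for one item, using the upper/lower (27%) group counts
# from the worked example in the text.
upper_correct = 15   # R_U: upper-group pupils who answered correctly
lower_correct = 5    # R_L: lower-group pupils who answered correctly
group_size = 16      # pupils in each group (27% of 60, rounded)
tried = 30           # pupils from both groups who attempted the item

# Item difficulty: percentage of those who tried the item and got it right.
difficulty = (upper_correct + lower_correct) / tried * 100
print(f"Difficulty: {difficulty:.1f}%")          # 66.7%, within the 25-75% range

# Discriminating power: D = (R_U - R_L) / (T / 2).
discrimination = (upper_correct - lower_correct) / group_size
print(f"Discriminating power: {discrimination}") # 0.625, reported as .63 in the text
```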
7. Find out the effectiveness of the distracters. A distracter is considered to be a good distracter when it attracts more pupils from the lower group than from the upper group. The