Task-based learning and teaching in China: Secondary school teachers' beliefs and practices
Language Teaching Research
2014, Vol. 18(2) 205–221
© The Author(s) 2013
Reprints and permissions: sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/1362168813505941
ltr.sagepub.com
Xinmin Zheng
Shanghai International Studies University, China
Simon Borg
University of Leeds, UK
Abstract
While much has been written about task-based language teaching (TBLT), research examining
teachers’ understandings of what TBLT means remains limited. This article explores the
understandings of TBLT of three Chinese secondary school teachers of English and the
implementation of TBLT in their lessons. Narrative accounts were constructed for each teacher
using observational data from two lessons and two semi-structured interviews. These accounts
illustrate how each teacher implemented the curriculum as well as the cognitive and contextual
factors that shaped their decisions with specific reference to the use of tasks. One key finding
is that TBLT was defined in a narrow manner and was strongly associated with communicative
activities, especially oral work involving pair and group work. The study also shows that the two
more experienced teachers introduced a stronger formal element of grammar into their lessons
than recommended by the curriculum; and while all three teachers highlighted the challenges
for them in using tasks (e.g. due to large classes), the youngest of the three displayed the most
commitment to the principles in the curriculum. The qualitative accounts we present here are
empirically instructive in the way they extend our understandings of how teachers respond to
innovative curricula and specifically to TBLT; these accounts also have concrete practical value:
they are a source of material that can be used in teacher education contexts to encourage
teachers to reflect on their own beliefs and practices in relation to TBLT.
Keywords
China, curriculum implementation, task-based learning and teaching, teaching English
Corresponding author:
Xinmin Zheng, School of Education, Shanghai International Studies University, Room 820, Building No. 1, 550
Dalian Xilu, Hongkou District, Shanghai 200083, China.
Email: sxmzheng@shisu.edu.cn
I Introduction
In 2003, a new English language curriculum for senior secondary schools was introduced in the People's Republic of China. Prior to this, English language teaching
(ELT) in China had been portrayed in the literature as predominantly teacher-centered,
textbook-directed and memorization-based (Cortazzi & Li, 1996; Zheng & Adamson,
2003). The pedagogy for teaching English was characterized by a focus on grammar
teaching and learning and on reading comprehension, with little attention to the development of students’ communicative competence, particularly in speaking and listening.
The new secondary curriculum in 2003 was a response to this situation, and part of a
broader policy of globalization that was being pursued in China. The Ministry of
Education sought, through this curriculum, to support the development of students who
can use English to communicate internationally (Ministry of Education, 2003). This curriculum also represented a marked shift in the pedagogy that teachers of English were
expected to adopt. Learning was presented as a process of enquiry rather than being
based on knowledge transmission and memorization. More specifically, the traditional
emphasis on grammar and vocabulary learning was replaced by a focus on the development of communication skills (Wang, 2007; Wang & Lam, 2010).
Another key element in the new curriculum, and the one we focus on here, is that
it recommends that task-based teaching methods be used to develop students’ communicative competence. Teachers are also provided with guidelines to consider in developing
appropriate tasks (Ministry of Education, 2003):
• Activities must have clear and achievable aims and objectives.
• Activities must be relevant to students' life experiences and interests; the content and style should be as true to life as possible.
• Activities must benefit the development of students' language knowledge, language skills and ability to use language for real communication.
• Activities should be of a cross-curricular nature, promoting the integrated development of students' thinking and imagination, aesthetic and artistic sense, cooperative and creative spirit.
• Activities should make students gather, process and use information, using English to communicate with others in order to develop their ability to use English to solve real problems.
• Activities should not purely be limited to the classroom but also extend to out of school learning.1
Despite such guidelines, however, the curriculum does not define what precisely it
understands a task to be, and how it might be distinct from other types of language learning activities. Clearly, then, the curriculum requires teachers to interpret what a task is
and to design their pedagogy accordingly, with the concomitant risk that if teachers’
interpretations are not aligned with those implied in the curriculum, the latter will not be
implemented as intended. The motivation for this article thus stems from our interest in
exploring what sense teachers make of the need to use tasks and the implications this has
for the implementation of the curriculum.
II Literature review
We now discuss three areas of literature that underpin this study: curriculum innovation,
task-based language teaching, and teachers’ beliefs.
1 Uptake of ELT curriculum innovation
An extensive literature examines curriculum innovation in education from multiple perspectives (see, for example, Fullan, 2001;
Markee, 1997). The traditional conceptualization of curriculum sees it as a blueprint
setting out all of the planned learning experiences for a specific group of learners (the
intended curriculum). More recently, it has been acknowledged that an adequate understanding of curriculum must include not just what is intended, but also the implemented
curriculum: what actually happens at the level of classroom action (Wedell, 2009). Here
we are particularly interested in the relationship between the intended 2003 English language curriculum and its implementation, with specific attention to teachers’ use of
tasks. One approach to research on curriculum implementation described by Snyder,
Bolin and Zumwalt (1992) is the fidelity perspective, which studies the extent to which
a curriculum is implemented as planned. Our study reflects such a perspective, although
our interests extend beyond describing whether what the teachers do reflects the planned
curriculum; we are also interested in factors that shape teachers’ actions, such as teachers’ beliefs and the classroom context.
Evidence of the challenges that English language teachers face when they are required
to move from traditional to communicative curricula emerges from a number of studies
(e.g. Orafi & Borg, 2009), though specific attention to the use of tasks has been limited
(one exception is Wyatt & Borg, 2011). In a recent study in China, Yan (2012) identified
an implementation gap despite the fact that teachers of English were positively disposed
towards the new curricular principles; however, teachers felt that their ability to implement those principles was hindered by several adverse conditions, including student
resistance, the lack of support from school administrators and the backwash effect of the
examinations (see also Yan & He, 2012; Zheng & Davison, 2008).
2 Task-based language teaching
There has been much discussion in the literature about what tasks are and are not, and
these debates have been reviewed in sources such as Van den Branden (2006). Ellis (2003, pp.
9–10), for example, highlights six elements of a task (e.g. it has a primary focus on meaning and involves real world language use), while Samuda and Bygate (2008) define a
task as ‘a holistic activity which engages language use in order to achieve some non-linguistic outcome while meeting a linguistic challenge, with the overall aim of promoting language learning, through process or product or both’ (p. 69). The use of ‘activity’
in this definition may be one potential source of confusion to teachers (tasks, one
assumes, are meant to be different from the ‘activities’ normally found in textbooks). To
take one further example, Scrivener (2011) defines task-based learning (TBL) as ‘a variant of CLT [communicative language teaching] … which bases work cycles around the
preparation for, doing of, and reflection and analysis of tasks that reflect real-life needs
and skills’ (p. 32). This particular definition does little to further clarify what a task is and
also seems to minimize the distinctiveness of TBLT compared to CLT more generally.
Clearly, multiple interpretations of ‘task’ exist in the literature and teachers too will have
their own understandings of what the term means. This is even more likely in contexts,
such as the one we examine here, where the curriculum does not seem to provide a precise
definition of what a task is.
3 Teachers’ beliefs
In the past two decades or so research on language teachers’ beliefs and practices has
witnessed rapid growth (Borg, 2006), but specific analyses of TBLT from teachers’
perspectives have been limited. Andon and Eckerth (2009) analysed the understandings
of TBLT held by four teachers of adult EFL and found that they had ‘a well-developed
awareness of their own teaching as well as an awareness of … core principles of TBLT’
(p. 304). In a more recent study conducted with foreign language teachers in New
Zealand, East (2012) concluded that there was encouraging evidence of attempts by
teachers to implement TBLT, though of concern was the finding that a quarter of the
participants in this study had minimal understandings of what TBLT was, and in some
cases task was interpreted as simply a synonym for ‘activity’.
Work conducted in Hong Kong is also very relevant here. Carless (2004, 2007a) examined the experiences of teachers of English in secondary schools in implementing task-based learning. Teachers highlighted several challenges: for example, TBLT was more
time-consuming, was hard to manage in large classes and did not reflect the manner in
which English was assessed in examinations. More recently, Carless (2009) reported that
teachers of English in Hong Kong secondary schools seemed to prefer a presentation–
practice–production (PPP) model of teaching rather than TBLT, which they perceived to be
more complex. Similar empirical interest is starting to emerge from mainland China too,
though work in this context remains limited. Deng and Carless (2009) studied the extent to
which the activities in a primary school classroom in Guangdong reflected principles of
task-based learning and found limited evidence that they did; contextual factors such as
traditional examinations were one explanation for this; a second factor was the teacher’s
own limited understandings of how to implement TBLT. The work of Zhang (2007; Zhang
& Hu, 2010), also conducted in primary classrooms, provides further insight into the extent
to which TBLT is being implemented in China and into the factors that shape this implementation. One conclusion from this work is that the original top-down conceptualization
of TBLT had been reconstructed and become progressively weaker at each subsequent
level of the educational system, with the result that there was only limited evidence of
principled use of TBLT in the work of the three teachers who took part in the study.
While the above studies demonstrate increasing interest in what TBLT means to
teachers, we believe that there is a need for further research to examine what happens in
the classroom when Chinese teachers of English implement the 2003 task-based curriculum, to understand the beliefs and contextual factors that shape these teachers’ instructional decisions, and to use the findings from this kind of research to inform the
development of in-service training, which can support more effective curriculum implementation.
III The study
The research questions addressed in this study were:
1. What beliefs about TBLT do three Chinese secondary school teachers of English hold?
2. To what extent do they implement TBLT as intended by the curriculum?
3. What factors, according to the teachers, shape their implementation of TBLT?
1 Participants
Three Chinese secondary school teachers of English took part in this qualitative study.
Two factors were taken into consideration when these teachers were selected. First, they
had collaborated with the first author on a previous study (Zheng, 2005) and had indicated then that they would be willing to participate in subsequent research projects.
Second, they were all following the new secondary curriculum described earlier, teaching the same level of students and using the same textbook. Thus these teachers were
suitably positioned to allow the investigation of the research questions highlighted
above. All three teachers taught English in Fuzhou, capital of Fujian Province, and
Table 1 provides background information for them. In terms of their schools, Mr Yang
(all names are pseudonyms) worked in an average school in terms of resources and student academic achievement. Miss Wu’s school, in comparison, was less well resourced
and had lower student admission standards. Ms Ma, on the other hand, worked in a key
school with excellent resources and students.
2 Data collection and analysis
Observational and interview data were collected in three phases for each teacher: pre-lesson interview, classroom observation and post-lesson interview. The pre-lesson interviews (each of which lasted approximately two hours) focused on how the participants
understood the 2003 English curriculum generally and task-based instruction specifically. The interviews were semi-structured in nature (see, for example, Richards, 2009),
allowing both the interviewer and the interviewees some flexibility in introducing and
pursuing themes of relevance that emerged during the conversation. These interviews
were audio recorded (with permission), conducted largely in Chinese by the first author,
translated into English by him, and verified for accuracy with both the participants of the
study and a colleague at Shanghai International Studies University.
Table 1. Teachers’ background.

Pseudonym   Gender   Age   Qualifications   Experience teaching English (years)   Experience of new curriculum (years)
Mr Yang     Male     60    BA               36                                    3
Miss Wu     Female   32    BA and MA        8                                     3
Ms Ma       Female   40    BA               16                                    3
Following the pre-lesson interview, observational data were collected from two of
each teacher’s normal classes (lessons lasted 45 minutes). Decisions about which classes
would be observed were made by the teachers, and in each lesson a researcher was present (as a non-participant observer) and recorded information about the lesson via written field notes, with specific attention to the kinds of instructional activities and materials
used. Copies of teaching resources and lesson plans were also collected. Additionally, the
first observed lesson for each teacher was also video recorded. These observational data
provided direct evidence of the manner in which the teachers were implementing the
curriculum and using tasks. These data also provided the stimulus for the third phase of
data collection with each teacher: a post-lesson interview. The purpose of these interviews (which lasted an hour and were also semi-structured and audio recorded) was to
discuss with the teachers the pedagogical options they used during the observed lessons,
to examine the factors shaping teachers’ instructional decisions, and to explore further
teachers’ understandings of tasks.
The analysis of the observational data was primarily formative; in other words, it took
place prior to (and informed the design of) the post-lesson interview rather than occurring at the end. The video recorded lessons were not transcribed in full; rather, they were
watched and narrative summaries made of episodes in the lessons that we felt provided
evidence of the teachers’ understandings and implementation of tasks. Informed by the
discussion of tasks in the literature discussed above, the lessons were analysed with particular attention to the focus of activities (form or meaning), their outcomes (linguistic
versus non-linguistic), organization (e.g. accuracy to fluency), interactional patterns (e.g.
teacher explanation versus pair work), and the skills involved (e.g. speaking, reading).
The interview data were transcribed and subjected to a process of qualitative content
analysis (see, for example, Newby, 2010) through which a range of beliefs held by the
teachers were identified (initially through coding) and then categorized; contextual factors that teachers cited in explaining their teaching were identified in a similar manner.
Overall, the process involved close and repeated readings of the interview transcripts and
identifying from the data (i.e. inductively) the themes that characterized the teachers’
commentaries as they articulated the rationale for their instructional decisions. The teachers were also given the chance to read and comment on our interpretations of their work.
IV Findings
1 Mr Yang
Mr Yang’s lessons were characterized by three stages: communication/skills work, language explanation, and language practice. Table 2 summarizes the stages in the two
observed lessons and indicates how long each lasted.
The topic of Lesson 1 was ‘Friendship’. Mr Yang started with a pre-reading activity
where students were asked to discuss in groups points such as ‘Make a list of reasons
why friends are important in your life’ and to report some of their ideas back to the class.
He then moved to the reading text by asking his students to guess what Anne’s (the text
was about Anne Frank) friend was, and then he set the following questions and asked the
students to read the text quickly in order to answer them:
• How long did Anne and her family hide away?
• According to Anne, what kind of person can be a true friend?
• Why did Anne say that she was crazy about nature?

Table 2. Mr Yang’s lessons.

Stage   Lesson 1              Time (mins)   Lesson 2                             Time (mins)
1       Reading               18            Revision/listening                   21
2       Grammar explanation   15            Grammar and vocabulary explanation   9
3       Grammar practice      12            Grammar and vocabulary practice      15
In the second stage of the lesson Mr Yang first explained the target grammar (direct/
indirect speech), and then the students practised this grammar orally and in writing.
A similar overall procedure was observed in Lesson 2. This time the skills work
focused on listening and speaking, using activities from the Teachers’ Book. This was
followed by the explanation of grammar and vocabulary contained in the listening material just used. In the third stage of the lesson, students completed exercises in the textbook. Mr Yang explained the grammar in Chinese briefly, and then had his students do
substitution exercises. As regards vocabulary, Mr Yang explained this in Chinese and
showed his students how nouns could be turned into verbs.
On the basis of the two lessons observed, there was not much evidence of TBLT in any
‘strong’ sense (e.g. there were no real world activities), and the interviews provided some
insight into Mr Yang’s understanding of TBLT. He had attended in-service training courses
about the new curriculum run by experts from Beijing, watched video recordings of model task-based lessons, and also observed his colleagues to see how they were using tasks. Informed
by these sources of evidence he explained his understandings of TBLT as follows:
I think task-based teaching is consistent with the principles of communicative language
teaching. As I understand it, tasks are communicative and goal-directed and they are the
extension of my communicative teaching activities … I think task-based teaching is a strong
communicative approach where students spend lots of time communicating. (Pre-lesson
interview, Mr Yang)
For Mr Yang, students’ communicating seemed to be the defining feature of TBLT, and
this is why (as in both lessons we observed) he gave students opportunities in each lesson
to talk in pairs or groups.
Although he was positively disposed towards the new curriculum, he had been initially unsure about whether he could use tasks in his teaching. However, on the basis of
the greater understanding of TBLT he now felt he had, he believed that he could use
tasks, but with caution:
The new curriculum introduced task-based teaching … It sounds good, but I was not sure if
task-based teaching worked in my context as I had little knowledge about it. Anyway, I must be
careful with it, you know, our national college entrance examination still remains unchanged
and I have a large size class to take care of. (Pre-lesson interview, Mr Yang)
The examinations were a factor that Mr Yang mentioned more than once:
As you know, I am responsible for my students’ college entrance examination, so I have to
teach each unit carefully and thoroughly. As far as I am concerned, I find it comfortable to teach
with my traditional method, but I am also happy to try task-based teaching for the purpose of
supporting my teaching. For instance, I apply task-based teaching to encourage my students to
work in pairs or groups so that they can do question and answer activities to improve their
speaking ability. (Pre-lesson interview, Mr Yang)
This extract shows once again how he equated TBLT with oral pair and group work.
As part of his drive to be more task-based, Mr Yang also said he had reduced the time
he spent on explaining grammar, but he still felt that explaining grammar was an important part of his lessons: ‘I still need to explain [grammar] to my students clearly as I was
concerned with some slow learners. To consolidate the grammar items, I had my students
do both oral and written exercises.’
In the interviews, Mr Yang said that he was unable to follow exactly what was suggested in the Teachers’ Book due to his beliefs and context. For example, the suggested
procedures in the Teachers’ Book for the listening work he did in Lesson 2 were:
1. Let students discuss the questions in Exercise 1 before listening.
2. Have students quickly go over the other exercises on page 41 of Workbook.
3. Let them listen to the whole text and ask them for main idea.
4. Let them listen for the second and the third time and complete the exercise.
5. Listen once again if necessary and check the answers.
However, Mr Yang went directly to steps 3–4, omitting the rest, then moved straight onto
the explanation and practice of grammar (stages 2 and 3 in his lesson; see Table 2 above).
Mr Yang explained these changes to the suggested procedures as follows:
It is very important for my students to master those words and phrases from the listening
materials, such as ‘make friends’ and ‘may have been trying to do something’. The Teachers’
Book does not tell us to teach those points, but I think it is necessary. That is why I cut out some
of their suggestions and added my own design. (Post-lesson interview, Mr Yang)
To sum up, although he was a very experienced teacher, Mr Yang showed a willingness
to learn about the pedagogical ideas being promoted by the new curriculum and to integrate these into his established ways of teaching. He felt that using TBLT meant increasing the opportunities he gave students to speak English in pairs and groups, but even with
this conservative interpretation he felt that tasks could only be used in moderation due to
the examination system and large class sizes. Mr Yang also believed it was important to
explain language to students and for them to practise this language through exercises.
His lessons, therefore, once the first phase of communication and skills work had been
completed, were quite language oriented.
Table 3. Miss Wu’s lessons.

Stage   Lesson 1              Time (mins)   Lesson 2                           Time (mins)
1       Lead-in               6             Explanation (skills for writing)   14
2       Reading               18            Feedback on sample of writing      17
3       Listening             7             Writing practice                   14
4       Grammar explanation   8
5       Grammar practice      6
2 Miss Wu
Table 3 summarizes the stages of two lessons that were observed with Miss Wu. The first
of these was a reading lesson and the second focused on writing.
In Lesson 1, Miss Wu followed five stages in working on the reading text ‘Earthquake’, which was about the Tangshan Earthquake of 1976. In the lead-in
she played a video about natural disasters, and students were encouraged to discuss this. The
second stage, reading, consisted of three steps: pre-reading, while-reading and post-reading.
In the pre-reading section, students were asked to predict what had happened to Tangshan in
1976, and for while-reading students had to identify in the text information about the damage caused by this earthquake. After the reading work, Miss Wu played a recording of the
text for her students to listen to. In stage 4 of the lesson Miss Wu explained the use of relative clauses and students (in stage 5) completed both oral and written practice.
Lesson 2 further developed the work started above on writing a news item and contained three stages. First Miss Wu explained several basic features of news stories, with
attention to matters such as headlines, content and language. In the second stage of the
lesson she presented one example of news writing completed by a student and evaluated
it with reference to the features previously identified. Finally, she asked her students to
work in groups to discuss ways of improving the sample story.
In discussing her teaching Miss Wu commented on the use of tasks in the new curriculum as follows:
If we can use task-based teaching in a proper way, I think it will give our students more
opportunities to communicate in English and learn how to work socially with other learners.
Besides, I believe task-based teaching can help our students develop skills to solve the real life
problems they meet. (Pre-lesson interview, Miss Wu)
Three purposes of TBLT were signalled here: increasing opportunities for communication, co-operative learning, and preparation for real-life problems. These ideas were all
in line with those promoted by the curriculum.
Our analysis of Miss Wu’s teaching suggested that she followed the recommendations
in the Teachers’ Book quite closely and she agreed that this resource provided a good
model to follow:
I think the new curriculum, together with its rich resources, does provide us with some very
useful and practical lesson patterns to follow. Why should I not just make use of it? (Post-lesson
interview, Miss Wu)
She was aware, though, that following the Teachers’ Book was not always effective. For
example, during the lead-in activity in Lesson 1, after she had played the video she asked
students to discuss these questions in groups:

• What do you know about earthquakes?
• How do you think people can avoid being hurt in the quake?
• Can you describe the Wenchuan Earthquake according to what you have just seen on the screen?
The students, though, did not engage with these questions and this led to off-task talk
within the groups and a challenging (i.e. noisy) classroom management situation for
Miss Wu to control. In reflecting on this episode, she complained that:
Our school is an ordinary one in a suburban area. We don’t have any privilege in taking in
students. Apparently, the students in the same class are at very different levels, which
made it very difficult for me to link my teaching with their real life. (Post-lesson interview,
Miss Wu)
Class size and mixed ability groups did in fact emerge here as the main obstacles that
Miss Wu experienced in seeking to implement the curriculum as intended:
I think I would like to use task-based teaching method whenever I can … I intend to follow the
procedures suggested by the Teachers’ Book in my teaching plan, but I meet difficulty in
teaching. You know, my students’ levels are very different and it is hard to control the large
class. Moreover, it is time-consuming to carry out activities … I have lots of content to cover.
(Pre-lesson interview, Miss Wu)
Time pressures were also noted here as an obstacle. Nonetheless, Miss Wu persisted in
trying to follow the curriculum guidelines by providing regular opportunities for students
to work in groups. In Lesson 2, after she had explained how to write a news report and
made comments on the sample paper, she once again organized students in groups and
asked them to discuss further how to improve this sample paper. In relation to her continuing attempts to use group work, she said:
Though I have met difficulty in organizing activities in pairs or in groups, yet I still try my best
to do so. This kind of teaching style and learning style takes time to shape. If I persist, I think
my students will, more or less, make progress. (Post-lesson interview, Miss Wu)
These comments reveal a commitment to the new syllabus and a belief that, in time, tasks
(which for her meant mainly pair and group work) would become easier to implement
and beneficial for the students.
Grammar explanation, though not as explicit and lengthy as Mr Yang’s, was still a
feature of Miss Wu’s work. In Lesson 1 she presented PowerPoint slides on which some
basic rules for relative clauses were summarized. Students then practised relative clauses
by doing oral and written translation. When asked for her views on grammar teaching,
Miss Wu explained that:
I think the most effective way for our students to learn English grammar is to practise it, not just
to learn the rules by heart. I tried to use the exercises designed by the Teachers’ Book and got my
students to practise and practise. I know that my students made some mistakes in doing so, but I
think they will improve themselves through constant practice. (Post-lesson interview, Miss Wu)
Once again, Miss Wu’s commitment to following the Teachers’ Book is clear. In concluding our conversations with Miss Wu, we asked her whether she felt her teaching was
task-based:
I myself was not quite sure if my performance was up to that standard, but, generally speaking,
I tried to follow the procedures from the Teachers’ Book. When necessary, I would definitely
skip some steps and use my own way. (Post-lesson interview, Miss Wu)
This final comment from the teacher emphasizes the major theme to emerge from this
analysis of her beliefs and practices in relation to the use of tasks: her view that implementing the curriculum, and hence TBLT, meant following as closely as possible the
procedures specified in the Teachers’ Book. She said that she would diverge from these
when required, though we saw no examples of this in the two observed lessons. She was
committed to making the officially recommended procedures work and persisted with
these (particularly the use of oral group work, which for Miss Wu seemed to be a key
feature of TBLT), even when it impacted negatively on her control of the class. Despite
challenges posed by class size, student ability, and time pressures, she remained positive
and optimistic that in time the benefits of the new curriculum would be felt by her and
the students.
3 Ms Ma
Ms Ma’s lessons are summarized in Table 4. Each lesson consisted of four stages, though
the only element in common was the final stage, which was language practice.
Lesson 1 was a reading lesson and Ms Ma started this by explaining grammar; specifically, she focused on the differences between a request and a command and told the
students about direct and indirect speech. After the explanation the students were asked
to practise the grammar in small groups. The third stage of the lesson was the reading
focus. Ms Ma first encouraged her students to get a general understanding of the passage
by skimming and scanning, then she asked them to read the text paragraph by paragraph
so that they could find more detailed information. At the end of the lesson Ms Ma gave
the students some written practice in using words and phrases selected from the text.
Table 4. Ms Ma’s Lessons.

Stage   Lesson 1                Time (mins)   Lesson 2              Time (mins)
1       Grammar explanation     10            Revision (reading)    6
2       Grammar practice        12            Listening             11
3       Reading                 15            Speaking              14
4       Vocabulary practice     8             Language practice     14
In Lesson 2, Ms Ma started by checking students’ homework and asking some of them
to recite paragraphs from the text they had revised at home. She also asked three pairs of
students to turn sentences from direct speech into indirect speech. After this revision, Ms
Ma did some listening work; students listened to a short passage and were required to
identify detailed information in response to her prompts. This listening work was followed by speaking practice about direct and indirect speech which the students did in
pairs and groups. Ms Ma concluded her lesson by reviewing and practising the key grammar and vocabulary covered in this lesson.
Our observations suggested that Ms Ma regularly dedicated class time to explicit
grammar work and her views about grammar were confirmed when we asked her about
her understandings of the new curriculum:
With the introduction of the new curriculum, it appeared to me that our educational officials
and experts overemphasized the importance of developing students’ communication abilities in
speaking and listening. To them, it seemed as if developing students’ speaking and listening
comprehension is all there is in English language teaching. I definitely agree with the idea that
speaking and listening should be enhanced, but it does not mean there is no place for grammar
teaching. On the contrary, grammar teaching must also be better enhanced. (Pre-lesson
interview, Ms Ma)
Ms Ma, then, felt that the new curriculum placed too much emphasis on communicative
speaking and listening. These concerns resurfaced when we asked her for her views on
using tasks:
Nowadays, when people mention task-based teaching, they have a bias or the wrong idea, that
is, it seems as if task-based teaching can only be used in developing students’ speaking and
listening abilities. I think it is completely wrong. As I understand it, we can also use task-based
teaching to engage our students in grammar learning. Task-based teaching obviously provides
students with contexts to use English, doesn’t it? (Pre-lesson interview, Ms Ma)
Ms Ma was of course correct here in her argument that TBLT is not just about speaking
and listening. Her suggestion that TBLT provides students with contexts in which to use
grammar was also correct, and she was the only one among the three teachers in this
study who saw a connection between using tasks and teaching grammar. Her approach to
grammar in Lesson 1, though, was not consistent with TBLT as it took the form of explanations prior to controlled practice (more in tune with a PPP approach to language teaching) followed by reading work. Ms Ma’s rationale here was:
I think grammatical points, new words and expressions are usually the main difficulty for my
students to understand the text. So I prefer to help them clear away the obstacle in advance. In
fact, I added just one more component to the pre-reading process that the Teachers’ Book
suggested, that is, I explained the grammar before my students set out to get the general idea of
the text. My teaching experience tells me it is very necessary. (Post-lesson interview, Ms Ma)
She acknowledged too that this was an example of the typical approach she adopted in
organizing her lessons:
Based on my personal belief that grammar is the first priority in learning a foreign language, I
usually teach grammar first, and then I have my students practise useful grammatical items to
consolidate what they have just learned. This does not necessarily mean I ignore developing their
speaking and listening comprehension. In fact, I feel they can speak more correctly and understand
better when they get familiar with the key grammatical points of the lesson. That is why I usually
teach grammar followed by reading, listening and speaking. (Pre-lesson interview, Ms Ma)
As she noted above, explaining grammar prior to reading the text was not a step proposed in the Teachers’ Book. However, Ms Ma added this step based on the belief that
students would find it hard to understand the text without first having the key grammar
explained. In this sense, she was not implementing the curriculum as intended and was introducing a focus on form to activities that were meant to be more meaning-oriented.
More generally, TBLT seemed to have had limited impact on Ms Ma’s work. Grammar
remained a priority for her, and although she did give the students some opportunities to
talk in groups, she did not feel that developing students’ oral communicative skills
deserved the prominence she felt it was given in the new curriculum. She also felt that
the large size of her class, and the different levels of students in it, made pair and group
work difficult to manage:
The thing that I am worried about is that I am not able to monitor all of the performance in pairs
or in groups, as you know, the class size is extremely large … I have fifty-one students in all.
The good students always take the advantage to speak more, but the poor students are afraid of
making mistakes … and, therefore, they often keep silent. I think the big challenge for me is
how to organize more suitable activities for my students at different levels. (Post-lesson
interview, Ms Ma)
In summary, then, it appeared that Ms Ma had not embraced in any deep manner the
new curriculum. She disagreed with the emphasis it placed on speaking and listening and
felt that large classes made regular interactive oral work problematic. She held very
strong beliefs about the need for students to know grammar well and felt that skills work
such as reading needed to be prefaced with explanations of the grammar to appear in the
text. She recognized tasks could be used for any skill, not just speaking, but her persistent
focus on grammar (introducing it even when not recommended in the Teachers’ Book)
meant that it was difficult to discern in her teaching any places where her work could be
described as task-based.
V Discussion
The analysis we have presented here highlights the value of interpretive studies grounded
in a descriptive understanding of what teachers do in the classrooms. By combining evidence of what the teachers did with their explanations, in their words, for their behaviours, insights have emerged here into teachers’ implementation of tasks in the 2003
secondary English curriculum in China.
Our first research question examined teachers’ understandings of TBLT. Overall, the
common understanding of tasks we can extract from these teachers is that it involved
communicative work, in pairs or groups, with a predominant focus on speaking. If we
compare this to the characteristics of tasks listed in the curriculum document (Ministry
of Education, 2003; see earlier literature review), teachers’ understandings of tasks
seemed narrow. There was also no evidence in the teachers’ commentaries of an awareness of the different ways that tasks and TBLT are defined in the literature (e.g. Ellis,
2003; Nunan, 2004; Willis & Willis, 2007). In particular, one point often stressed in such
definitions – that tasks focus on non-linguistic outcomes – did not emerge at all here. The
understandings of tasks held by these teachers, then, did not distinguish them from communicative activities more generally. In this respect they were similar to some of the
teachers in East (2012), who also held very broad views of what a task was. In reaching
this conclusion we are not being critical of the teachers; it is likely that their views were
powerfully shaped by the examples they saw in the curriculum materials they were
given. In fact one conclusion these findings suggest is that, as implemented in the 2003
curriculum for English, tasks do seem to be synonymous with communicative activities
more generally.2
Our second research question asked about the extent to which the teachers implemented TBLT as intended. Here too various perspectives were evident. Miss Wu’s
implementation of the curriculum was very close to what was proposed in the Teachers’
Book. Mr Yang and Ms Ma both adhered less closely to the guidelines and in both cases
they introduced a stronger element of grammar work than was recommended. They were
both experienced teachers and the persistent power of their beliefs about language learning and teaching was clear, especially through Ms Ma’s commitment to explicit preemptive grammar work. These two senior teachers thus provide evidence of how beliefs
grounded in experience can mediate curricular recommendations (for similar insights see
Orafi & Borg, 2009).
Our final research question examined the factors which shaped the teachers’ implementation of TBLT. As noted above, the curriculum materials the teachers worked from
were a very strong influence on their lessons; in particular, all three participants here
were guided by their Teachers’ Books, which provided detailed procedural advice. Also
as noted above, teachers’ decisions about implementing the curriculum were shaped by
their beliefs about aspects of language teaching and learning, such as the importance of
grammar or of speaking. While these beliefs did not lead to major adaptations in the
implemented curriculum, they were nonetheless powerful enough to cause shifts in the
orientation that the proposed curriculum was given (e.g. in Ms Ma’s case, the curriculum
assumed a stronger grammar orientation than intended). If Mr Yang and Ms Ma are in
any way typical of experienced secondary school teachers of English in China, then, as
indicated by the literature on curriculum innovation discussed earlier, the wider persistence of such deep-rooted beliefs about grammar would represent a challenge to the
implementation of the new curriculum. A third set of influences on teachers here were
contextual, and they commented on how large classes, low proficiency or mixed ability
students, time pressures, and examinations all hindered their implementation of the curriculum. These are factors that have emerged in several other studies of curriculum
implementation (e.g. Carless, 2007a) and their presence here was not particularly surprising. What was interesting, though, was the manner in which Miss Wu, the youngest
teacher in the study, persisted in her commitment to the curriculum even though doing so
created classroom management problems for her. In her case it seems that her beliefs in
the value of the curricular principles outweighed the concerns she had about the potential
problems that implementing these principles might have in her classroom. The two more
experienced teachers here did not exhibit such behaviour.
VI Conclusions
Before discussing the implications of this study we must acknowledge its limitations.
Clearly, we cannot make general claims about Chinese secondary school teachers of
English based on the three cases we have analysed, although we believe that many of the
issues highlighted here reflect those identified in previous research and will resonate more
widely in the Chinese context. We also acknowledge that only two lessons per teacher were
examined; while they generated interesting descriptive data of classroom events, observing
further lessons for each teacher would have provided a stronger basis for claims about these
teachers’ work more generally. Nonetheless, we feel this work makes a valuable contribution to our understandings of teachers’ practices and beliefs in relation to TBLT.
In terms of implications, this study suggests that teachers of English in secondary
schools in China may benefit from opportunities to deepen their understandings of what
TBLT means, both as implied in the curriculum and in the literature more generally. For
example, it is important that teachers extend their understandings of TBLT beyond a focus
on speaking in pairs or groups; teachers would also benefit from an understanding of the
non-linguistic outcomes of tasks and of the different roles that grammar can play in TBLT
(see, for example, Carless, 2007b; Loschky & Bley-Vroman, 1993). This latter point seems
particularly important given that attention to grammar continues to be a valued aspect of
English lessons in China. In-service teacher educators can address all of these issues, and
one strategy they can use in doing so is data-based teacher development (Borg, 1998).
This approach to teacher education emphasizes reflection and awareness-raising based on
the study of transcripts of lessons and of teachers’ commentaries on their work. Thus, during an in-service teacher education session participants could first be presented with a
transcript, for example, of the start of Lesson 1 from Ms Ma’s case, and asked to comment
on the extent to which they feel it is task-based; the recommended procedures from the
curriculum or Teachers’ Book could then be fed in, and participants could be asked to
compare these to the lesson and to identify any discrepancies. The next stage might be for
the participants to consider why those discrepancies exist, after which Ms Ma’s own commentary on her teaching – her explanation for diverging from the recommended procedures – could be introduced into the discussion. Analyses of this kind can lead to a greater
awareness among participants of, for example, how teachers’ beliefs and contextual factors influence instructional choices, of what TBLT is, and of the role of grammar in task-based teaching. The insights emerging from case-based in-service teacher education of
this kind can then be extended through reading and, most importantly, by inviting teachers
to undertake similar reflective analyses of their own teaching and of the factors that shape
it (so motivated by the analysis of others’ work, teachers can, for example, then study the
extent to which they use TBLT and what role grammar plays in their own work). The
kinds of qualitative insight we have provided here into the teaching of English in Chinese
secondary schools, in addition to being of empirical value, can thus also provide the basis
of participant-centred in-service teacher education.
Funding
This work was supported by a grant from Shanghai International Studies University Major
Scientific Research Project (project number: KX161027).
Notes
1. This extract in English of the MOE document was taken from a translation of the curriculum document produced by Shanxi Institute of Education and verified by the foreign language
department there.
2. It is interesting to note that recent years have seen the revision of English curricula in China
and that in these revisions less emphasis is being placed on the use of tasks. The 2003 senior
high curriculum we focused on here will be revised in the near future and a similar change of
emphasis is likely.
References
Andon, N., & Eckerth, J. (2009). Chacun à son gout? Task-based L2 pedagogy from the teacher’s
point of view. International Journal of Applied Linguistics, 19, 286–310.
Borg, S. (1998). Data-based teacher development. ELT Journal, 52, 273–281.
Borg, S. (2006). Teacher cognition and language education: Research and practice. London:
Continuum.
Carless, D. (2004). Issues in teachers’ reinterpretation of a task-based innovation in primary
schools. TESOL Quarterly, 38, 639–662.
Carless, D. (2007a). Grammatical options in a task-based approach. Modern English Teacher, 16,
29–32.
Carless, D. (2007b). The suitability of task-based approaches for secondary schools: Perspectives
from Hong Kong. System, 35, 595–608.
Carless, D. (2009). Revisiting the TBLT versus P-P-P debate: Voices from Hong Kong. Asian
Journal of English Language Teaching, 19, 49–66.
Cortazzi, M., & Li, J. (1996). Cultures of learning: Language classrooms in China. In H. Coleman
(Ed.), Society and the language classroom (pp. 169–206). Cambridge: Cambridge University
Press.
Deng, C.R., & Carless, D. (2009). The communicativeness of activities in a task-based innovation in Guangdong, China. Asian Journal of English Language Teaching, 19, 113–134.
East, M. (2012). Task-based language teaching from the teachers’ perspective: Insights from New
Zealand. Amsterdam: John Benjamins.
Ellis, R. (2003). Task-based language learning and teaching. Oxford: Oxford University Press.
Fullan, M. (2001). The new meaning of educational change. 3rd edition. New York: Teachers College Press.
Loschky, L., & Bley-Vroman, R. (1993). Grammar and task-based methodology. In G. Crookes &
S. Gass (Eds.), Tasks and language learning: Integrating theory and practice (pp. 123–167).
Clevedon: Multilingual Matters.
Markee, N. (1997). Second language acquisition research: A resource for changing teachers’ professional cultures? The Modern Language Journal, 81, 80–93.
Ministry of Education. (2003). Putong gaozhong yingyu kecheng biaozhun [English curriculum standards for senior secondary school (trial version)]. Beijing: Beijing Normal University Press.
Newby, P. (2010). Research methods for education. Harlow: Pearson Education Limited.
Nunan, D. (2004). Task-based language teaching. Cambridge: Cambridge University Press.
Orafi, S.M.S., & Borg, S. (2009). Intentions and realities in implementing communicative curriculum reform. System, 37, 243–253.
Richards, K. (2009). Interviews. In J. Heigham & R.A. Croker (Eds.), Qualitative research in
applied linguistics (pp. 182–199). Basingstoke: Palgrave Macmillan.
Samuda, V., & Bygate, M. (2008). Tasks in second language learning. New York: Palgrave
Macmillan.
Scrivener, J. (2011). Learning teaching. 3rd edition. Oxford: Macmillan.
Snyder, J., Bolin, F., & Zumwalt, K. (1992). Curriculum implementation. In P.W. Jackson (Ed.),
Handbook of research on curriculum (pp. 402–435). New York: Macmillan.
Van den Branden, K. (2006). Introduction: Task-based language teaching in a nutshell. In K. Van den Branden (Ed.), Task-based language education: From theory to practice (pp. 1–16). Cambridge: Cambridge University Press.
Wang, Q. (2007). The national curriculum changes and their effects on English language teaching in the People’s Republic of China. In J. Cummins & C. Davison (Eds.), International handbook of English language teaching: Volume 1 (pp. 87–106). Norwell, MA: Springer.
Wang, W., & Lam, A.S.L. (2010). The English language curriculum for senior secondary school in China: Its evolution from 1949. RELC Journal, 40, 65–82.
Wedell, M. (2009). Planning for educational change: Putting people and their contexts first.
London: Continuum.
Willis, D., & Willis, J.R. (2007). Doing task-based teaching. Oxford: Oxford University Press.
Wyatt, M., & Borg, S. (2011). Development in the practical knowledge of language teachers:
A comparative study of three teachers designing and using communicative tasks on an inservice BA TESOL programme in the Middle East. Innovation in Language Learning and
Teaching, 5, 233–252.
Yan, C. (2012). ‘We can only change in a small way’: A study of secondary English teachers’
implementation of curriculum reform in China. Journal of Educational Change, 13, 431–447.
Yan, C., & He, C. (2012). Bridging the implementation gap: An ethnographic study of English
teachers’ implementation of the curriculum reform in China. Ethnography and Education,
7, 1–19.
Zhang, Y. (2007). TBLT-innovation in primary school English language teaching in Mainland China. In K. Van den Branden, K. Van Gorp, & M. Verhelst (Eds.), Tasks in action: Task-based language education from a classroom-based perspective (pp. 68–91). Cambridge: Cambridge Scholars.
Zhang, Y., & Hu, G. (2010). Between intended and enacted curricula: Three teachers and a mandated curricular reform in mainland China. In K. Menken & O. García (Eds.), Negotiating language policies in schools: Educators as policymakers (pp. 123–142). New York: Routledge.
Zheng, X. (2005). Pedagogy and pragmatism: Secondary English language teaching in the People’s Republic of China. Unpublished PhD thesis, The University of Hong Kong, China.
Zheng, X., & Adamson, B. (2003). The pedagogy of a secondary school teacher of English in the
People’s Republic of China: Changing the stereotypes. RELC Journal, 34, 323–337.
Zheng, X., & Davison, C. (2008). Changing pedagogy: Analysing ELT teachers in China. London:
Continuum.
Jul. 2008, Volume 5, No.7 (Serial No.44)
US-China Education Review, ISSN1548-6613, USA
Practice on assessing grammar and vocabulary: The case of the TOEFL
ZHUANG Xin
(College of Foreign Languages, Zhejiang Gongshang University, Hangzhou Zhejiang 310018, China)
Abstract: The Test of English as a Foreign Language (TOEFL) exerts a tremendous influence on EFL (English as a Foreign Language) learners worldwide. The TOEFL 2000 project claims that the TOEFL, being more reflective of communicative models, can provide more information about the international students’ language ability that it is supposed to measure. However, a detailed analysis of an authentic paper-based test administered in China in May 2001, examined from four aspects (test reliability, construct validity, authenticity and interactiveness), finds that the test puts too much emphasis on vocabulary and grammar knowledge in almost every section of the paper, of which “Structure and Written Expression” may be the most disputed part. The content cannot fully demonstrate the test’s validity and communicative purposes, so it is doubtful whether test takers can meet the later demands of academic study abroad. Nevertheless, this offers a powerful explanation for the current fundamental change in the framework and content of the TOEFL to meet the principles of test design, and can provide information and guidance for later test designs.
Key words: TOEFL; validity; grammar; vocabulary; communicative purposes
Over the past few years, the TOEFL (Test of English as a Foreign Language) has undergone a fundamental change in its test content and framework. What prompted the change? What are the changes? What are the implications of the changes? Answering these three questions can provide us with a guideline for making language tests more reliable, valid, authentic and interactive, in accordance with communicative language teaching worldwide.
1. Background knowledge about the TOEFL
The eagerness to learn a foreign language promotes the development of foreign language learning. In order to prove one’s language proficiency, the TOEFL, as one form of international language test, has become the dominant type worldwide. It differs slightly from classroom tests: it has no fixed content that has been taught to test takers, which accounts for its wide range and general content for EFL learners worldwide. It is a proficiency test rather than an achievement test, since it measures someone’s language abilities at a certain point in time. The TOEFL is a norm-referenced rather than a criterion-referenced test, since test results are interpreted with reference to the performance of a certain group, which is used to relate one candidate’s performance to that of other candidates (Hughes, 1989, pp. 17–18), that is, to derive meaning from the referenced scores (Ebel & Frisbie, 1991, p. 34).
The TOEFL 2000 project claims that the TOEFL is “more reflective of communicative competence models” and “provides more information than current TOEFL scores do about international students’ ability to use English in an academic environment” (Jamieson, et al., 2000, p. 3). Before the birth of the TOEFL 2000 project, some researchers categorized the TOEFL as a non-communicative test. But does the project really represent a fundamental change? As Morrow (1986, p. 9) notes of communicative testing, “What we are concerned with is the performance of an individual performing a set of tasks in a foreign language”. Can the TOEFL really attain its ambitious goals?

ZHUANG Xin, lecturer, College of Foreign Languages, Zhejiang Gongshang University; research fields: English language teaching, teacher education.
According to the TOEFL 2000 project, the traditional TOEFL test examines language competence in listening, reading and writing, which for many years have been integrated with vocabulary and structure knowledge. Moreover, there are standard procedures for administering and scoring the test, which is held systematically worldwide; the total paper-based test score is reported on a scale that ranges from 310 to 677, while the TWE (Test of Written English) score is reported separately on a scale of 1 to 6. Finally, through a process of empirical research and development, the characteristics of the test are well known, and test takers can even obtain suggestions and tips for preparing for the TOEFL, provided by Educational Testing Service (ETS). In a survey by Brown and Ross (1996, p. 233) of 20,000 randomly selected test takers, approximately 85.2% used their TOEFL score for graduate or undergraduate study or another type of school, 13.8% used it for a licence or a company, and only 1% gave no reason for taking the test. Evidently, more and more people use the TOEFL score as proof of their individual language proficiency to meet the requirements of both academic degree programmes and ESL learning, even though there is no standard criterion to define which score is a “pass” and which is a “failure”.
As a large-scale proficiency test, the TOEFL is designed to measure people’s language abilities. However, it is not a test of whether someone has an adequate command of the language for a particular purpose, but rather one with a more general scope. It is common knowledge that the TOEFL thrived for a long period in meeting global requirements for EFL testing, thanks to both its rationality and its exclusiveness, but it is now meeting new challenges from other testing systems as time goes by. For instance, more and more countries, especially in Europe, have adopted the International English Language Testing System (IELTS) as their main assessment of English proficiency. This tendency is driven not by national preference but by basic considerations arising from the principles of test design.
2. Study on the TOEFL paper 2001
In order to better understand some of the fundamental changes to the TOEFL in recent years, it is sensible to review its tests in the light of the TOEFL 2000 project.
2.1 Test framework
Take, for example, one paper-based TOEFL test, administered across China in May 2001. The structure of the test paper consists of four parts:
(1) Section 1: Test of Written English (TWE) (30 minutes);
(2) Section 2: Listening Comprehension (30 minutes);
(3) Section 3: Structure and Written Expression (25 minutes);
(4) Section 4: Reading Comprehension (45 minutes).
Of these, sections 2, 3 and 4 are timed tests in a multiple-choice format with four options for each question. The TOEFL, as a popular norm-referenced test used throughout the world, is designed not on the basis of particular content or a language course but to meet the fundamental requirement of using language: to communicate. As this is the foremost criterion, we have to be careful about the design and reconsider the function of the test. To measure language proficiency in almost every kind of situation, we need to take account of when, where, how, why and what language is to be used. How representative the test can be made is therefore the key issue in designing language tests. Bachman and Palmer (1996, pp. 19–25) provide some basic criteria that need to be reflected in the test paper: test reliability, construct validity, authenticity and interactiveness.
2.2 Test reliability
The concept of reliability is particularly important in language testing. Although we can never have complete trust in any set of scores, we try to produce a consistent test score free from measurement error, which arises mainly from different testing times, test forms, raters and other characteristics of the measurement context; that is, reliability concerns the consistency of test judgements and results (Bachman, 1990; Hughes, 1989; Weir, 1990; Davies, 1990). A highly reliable score ought to be “accurate, reproducible and generalizable to other testing occasions and other similar test instruments” (Ebel & Frisbie, 1991, p. 76). In the TOEFL, there are two components of reliability to consider: the performance of test takers and the reliability of the scoring. Consider the data provided by ETS over time. In China, 31,462 students took the TOEFL CBT between July 1999 and June 2000; their average scores in the listening, structure and reading sections were 20, 21 and 21 respectively, and the mean total score was 206. Between July 2001 and June 2002, 58,772 students took the TOEFL CBT, scoring 20, 21 and 21 in the three sections, with a mean total score of 207 (TOEFL test score and data 2000-2001, 2002-2003). From these data we can see that the scores of Chinese students generally cluster around the 20 level and that the reliability estimates were well within the desirable range. Part of the reason is that the TWE mark is not added into the total score, so the other three sections require no scorer judgement given their testing format and could in practice be marked by computer; thus the main part of the TOEFL test can be said to be objective and highly reliable.
2.3 Test validity
It seems to be axiomatic that "validity cannot be established unless reliability is also established for specific contexts of language performance" (Cumming & Mellow, 1995, p. 77). "A test, part of a test, or a testing technique is said to have construct validity if it can be demonstrated that it measures just the ability which it is supposed to measure" (Hughes, 1989, p. 26). If test scores are affected by abilities other than the one we want to measure, they will not provide a satisfactory interpretation of that particular ability. In the TOEFL test paper of May 2001, if we look at each section rather than the overall structure, reading comprehension causes little concern, since it fairly clearly measures a distinct ability. There are five passages, on social science, biology, literature, ethology and geology, covering a wide variety of topics. Including its fifty questions, the whole reading section contains 3,673 words, which means that test takers must read at about 82 words per minute. This is a high demand for EFL learners, who need to prove their abilities in language knowledge as well as cultural background knowledge. Moreover, the reading part asks not only about explicitly stated information but also about implied meanings and even the specific meanings of particular words. These features require the skills of reading both extensively and intensively. If "the purpose, events, skills, functions, levels are carried out as what they are expected to" (Carroll, 1980, p. 67), then construct validity is fully displayed in the TOEFL reading part.
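The reading-speed figure just quoted can be reproduced with a quick calculation (a sketch only: the 45-minute time limit is an assumption inferred from the stated word count and rate, not given explicitly in the text):

```python
# Reading-speed demand of the TOEFL reading section (May 2001 paper).
total_words = 3673       # five passages plus the fifty questions
section_minutes = 45     # assumed time limit for the reading section

words_per_minute = total_words / section_minutes
print(round(words_per_minute))  # about 82 words per minute
```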
2.4 Test authenticity and interactiveness
The other two principles we need to consider are authenticity and interactiveness. "Authenticity provides a means for investigating the extent to which score interpretations generalize beyond performance on the test to
language use" (Bachman & Palmer, 1996, pp. 23-24), which means that the tasks a test sets should correspond to the target language use. In language tests, authenticity is sometimes only distantly related to real communicative tasks: for reasons of reliability and economy, tests exercise series of linguistic skills rather than genuine operational ones (Carroll, 1980, p. 37). The listening comprehension in the TOEFL test simulates the speaking environment of North American colleges and universities and includes idiomatic expressions common in spoken English to approximate features of target language use, so we can say this section provides authentic material to a certain extent. Nevertheless, if we test only listening or reading, the candidate's overall language proficiency is not fully activated and we can never form a generalized picture of the test takers' language standard; such a test could hardly be called successful.
Interactiveness refers to the extent and type of involvement of the test taker's individual characteristics in accomplishing a test task (Bachman & Palmer, 1996, p. 25). Because test takers differ in their areas of language knowledge, planning strategies and personalities, how to give each of them a fair chance is always a question. The TOEFL addresses this point by offering a general topic in the writing section, by using standard written English in the structure section, and by covering a variety of topics in the reading section; however, we can still find elements that are too "Americanized". For instance, the listening comprehension is spoken with an American accent, which can be hard work for learners worldwide whose first language is not English.
Compared with the TOEFL listening section, the Cambridge First Certificate in English (FCE) provides a variety of accents, including both standard native-speaker variants and non-native speaker accents (Cambridge FCE Handbook, 1997). These design choices in the FCE simulate the environment of English-speaking countries and make the whole test more communicative and practical. Although the TOEFL reading comprehension appears to cover abundant topics, many of its passages concern American topics while non-American ones are fairly rare. As Hilke and Wadden (1997, p. 36) note, "what certain TOEFL texts choose to include, moreover, is often as significant as what they fail to include". In this test paper, two fifths of the reading content is closely tied to American historical background. Thus, whether the TOEFL test provides each candidate a fair chance is not clearly demonstrated.
3. Analyzing the “language” knowledge in the TOEFL
In the framework of language ability put forward by Bachman and Palmer (1996, pp. 68-75), learners' language ability consists of two parts: language knowledge and strategic competence/metacognitive strategies. That is to say, learners need to know the vocabulary, grammar and sound system, and also to use coherent sentences in particular language settings to achieve their communicative goals. The TOEFL test, as a way to demonstrate candidates' achievement in English, should determine whether they can apply this knowledge and these skills in future real-life study; that is, it should assess their performance in the language. This is the main rationale for constructing such tests: "How well individuals perform on the test represents to some degree how they might be expected to respond outside the testing environment" (Sax, 1997, p. 304). However, we cannot expect a test to measure every aspect of language in each section; the sampling should therefore be as representative as possible. Here, more emphasis will be put on the grammatical knowledge component of the TOEFL test.
3.1 Testing grammatical knowledge in writing, listening and reading skills
Grammatical knowledge mainly includes three parts: vocabulary, syntax and phonology (Bachman & Palmer,
1996, p. 70). In this TOEFL test paper, knowledge of vocabulary seems to be tested in all sections, which supports the common-sense view that words are the basic building blocks of language. Vocabulary, which is embedded, comprehensive and context-dependent in nature, plays an explicit role in the assessment of learners' performance (Read & Chapelle, 2001). The best way to test vocabulary is to use various methods to probe a word's basic meaning, its derived forms, its collocations, or its meaning relationships in context. Nation (1990) gives a systematic list of competencies that have come to be known as types of word knowledge: (1) the spoken form of the word; (2) the written form of the word; (3) the grammatical behaviour of the word; (4) the collocational behaviour of the word; (5) the frequency of the word; (6) the stylistic register constraints of the word; (7) the conceptual meaning of the word; (8) the associations the word has with other related words (Schmitt, 1999, p. 194). These word knowledge types define what it means to know a word; thus, if we want to analyse the construct validity of vocabulary items in the TOEFL, the key question is whether the sense tested is a typical way the word will be used in a future academic context. Schmitt (1999, p. 192) also points out: "Although any individual vocabulary item is likely to have internal content validity, there are broader issues involving the representativeness of the target words chosen".
The TWE checks not only the written form of words but also their function and the collocations of their grammatical usage. Cumming and Mellow (1995, p. 77) define a general ESL composition profile: "vocabulary (range, choice, usage, word form mastery, register), language use (complex constructions, errors of agreement, tense, number, word order/function, articles, pronouns, prepositions) and mechanics (spelling, punctuation, capitalization, paragraphing)". Test takers must finish a composition in 30 minutes, and compositions of more than 300 words are preferred. A limitation of the TWE, however, is its narrow range of writing styles: like the topic in this test paper, most TOEFL writing tasks call for contrastive writing expressing a personal preference or choice. Although the writing section is not the part specifically designed to test grammatical knowledge, whether the sample chosen in the TOEFL test truly represents communicative competence remains a question.
In listening comprehension, vocabulary testing is no longer limited to single words: there are many compound words, phrases, and even idiomatic expressions and slang. For example, the May 2001 test paper includes idioms in the dialogues between the two speakers such as "have something checked out", "headed one's way", "big snow storm", "get a little carried away" and "that sure beats sticking around here". Since most of the dialogues are drawn from American daily life, many phrases and sentences cause great difficulty for EFL test takers, because their meanings cannot be worked out from the surface meanings of the words. Moreover, both the conversations and the answer choices place high demands on grammar, requiring test takers to give a definite response within fifteen seconds. For example, the four choices in No. 8 display four different tenses: the present simple, the past perfect, the subjunctive in a future sense, and the future tense. The dialogue in No. 8 is:
M: My back has been aching ever since I started playing tennis on the weekends.
W: Haven't you had that checked out yet?
Q: What does the woman imply?
From this short dialogue we notice that usually the first speaker presents the content or background of the conversation and the second speaker gives the hint to the answer. Summing up the questions from the first thirty short dialogues, we get the following results (Table 1):
Table 1
Questions of 30 dialogues in the TOEFL test, May 2001

Typical questions                                   Percentage
What does the man/woman imply?                      37%
What does the man/woman mean?                       33%
What does the man/woman suggest?                    13%
What can be inferred from the conversation?         10%
Others                                               7%
From these question types it is not difficult to see that answering them in "listening comprehension" requires both fluency and consolidated grammatical knowledge. The listening comprehension test is thus very much a combined test of vocabulary and syntax.
The communicative philosophy of reading tests is to test "in what situations do we read which texts for which purposes" (Wijgh, 1995, p. 155). Originally, TOEFL tests had vocabulary items that were selective, context-independent multiple-choice items presenting words in isolation. They were criticized because international students simply spent time unproductively memorizing long lists of words together with synonyms or definitions (Read & Chapelle, 2001, p. 14). A prominent vocabulary feature still exists in the TOEFL reading comprehension subtest: the testing of the meaning of words or short phrases. Banerjee and Clapham (2003, p. 116) point out that although the section previously called "reading and vocabulary" has been renamed "reading comprehension", it still consists of two distinct tests: reading and vocabulary. In this test paper there are 20 questions about the close meaning or reference of words or phrases, of which 16 concern single words. These questions take up two fifths of the overall reading questions, and the second passage has the largest number of them: five out of ten. They always appear in a few fixed formats: "The word 'lured' in line 19 is closest in meaning to…"; "The word 'them' in line 11 refers to…". Although the "closest in meaning to" questions engage word meaning in context, the rest of the word questions seem mainly to assess the range of candidates' vocabulary; sometimes test takers can get the answer without referring back to the passage if they simply know the meaning of the word. As Read (1997, 2000, cited in Read & Chapelle, 2001) observes, the vocabulary items in the TOEFL reading test can be categorized as relatively independent, despite the manner in which they are presented. Is this, then, another section that focuses on vocabulary?
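The proportions quoted above follow directly from the question counts; a trivial sketch makes the arithmetic explicit (counts taken from the May 2001 paper as described):

```python
# Word-focused questions as a share of the reading section (May 2001 paper).
reading_questions = 50        # total reading comprehension questions
meaning_or_reference = 20     # "closest in meaning" / "refers to" items
about_single_words = 16       # of those, items asking about single words

share = meaning_or_reference / reading_questions
print(share)  # 0.4, i.e. two fifths of the reading questions
```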
3.2 Testing grammar knowledge independently
Assessing language knowledge is usually reflected in the "four basic skills": speaking, listening, reading and writing. But while some well-known proficiency tests have dropped the grammar component (Hughes, 1989, p. 141), "structure and written expression" remains one part of the TOEFL, with contents similar to the "Use of English" section of the First Certificate in English (FCE) at Cambridge Level Three. Wall, et al. (1991, p. 214) suggest that to judge content validity several elements need to be determined: whether the tasks being tested are the ones the test intends to test; whether the sampling of tasks is adequate; and whether the level of difficulty of its components is appropriate. The principles of communicative language learning guiding test construction in "structure and written expression" suggest that test takers should know how to use different structures and useful expressions in language output so as to be effective and efficient speakers and writers, which would satisfy the original purpose of studying in North America. In the FCE, candidates are expected to
demonstrate their knowledge of vocabulary and grammar and control of the language system1 (First Certificate in English Handbook, 1997, p. 7).
Nevertheless, the TOEFL places higher demands on vocabulary and syntax, both of which belong to grammatical knowledge, by setting aside an independent, grammar-like section. There are two parts in this section: (1) incomplete sentences, with words or phrases as options (15 items); and (2) sentences in which some words or phrases are underlined. In the second part (25 items), the examinee must identify the word or phrase in each sentence that is not appropriate to standard, formal written English (Stevenson, 1987, p. 80). The forty items in this subtest must be finished in 25 minutes, placing a high demand on speed. In the FCE, five parts must be finished in one hour and fifteen minutes: (1) multiple-choice cloze; (2) open cloze; (3) "key" word transformations; (4) error correction; and (5) word formation. Parts (1) and (5) emphasize vocabulary, part (4) focuses on grammar, and parts (2) and (3) concern the integration of grammar and vocabulary2 (First Certificate in English Handbook, 1997, p. 28). There are 65 items in the five sections of the FCE. Moreover, compared with the TOEFL paper, the FCE evidently has more variation in item format, and except for part three, which uses sentence-based questions, all the parts are integrated with context. "Structure and written expression" in the TOEFL underwent no significant change in the recent revision of the test, and grammatical words remain a necessary part of these exercises. Words like articles, prepositions, pronouns, conjunctions and auxiliaries are often referred to as function words, which belong more to grammar (Read, 2000, p. 18), as in the following example:
_____ the hamster’s basic diet is vegetarian, some hamsters also eat insects.
(A) Despite
(B) Although
(C) Regardless of
(D) Consequently
Content words such as nouns, "full" verbs, adjectives and adverbs carry the main meaning, while function words provide links within sentences or modify the meaning of the content words (Read, 2000, p. 18). Furthermore, many phrases have also become a testing focus, as in the following item:
The giant ragweed, or buffalo weed, grows
(A) 18 feet up to high
(B) To high 18 feet up
(C) Up to 18 feet high
(D) 18 feet high up to
However, it is more appropriate to say that the "structure and written expression" section of the TOEFL is a grammar subtest rather than a simple vocabulary test; grammar has been considered "an important trait in the measurement of an individual's overall performance in a language" (Rea-Dickins, 1991, p. 115). The grammar subtest cannot be regarded as a skill subtest like the listening or reading subtests, so we cannot help but wonder: what does this grammar subtest measure, and is it a communicative test? When we talk about communicative competence, we are often concerned with "generalized abilities" (Skehan, 1991, p. 9), the abilities to express one's
1 Details of the TOEFL test introduction could be found at http://www.toefl.org/.
2 Cambridge Examinations. (1997). Certificates and Diplomas: FCE Handbooks. University of Cambridge Local Examinations Syndicate.
meanings by using appropriate language in various contexts. In order to carry out a persuasive and rigorous assessment in a communicative test like the TOEFL, we need to ensure that "the sample of communicative language ability in our tests is as representative as possible" (Weir, 1990, p. 11). Rea-Dickins (1991, p. 125) defines five factors contributing to the "communicative" nature of a grammar test, two of which I think are essential here: the contextualization of test items, and instructions that focus on meaning rather than simply on form.
Before we look at the content validity of the "structure and written expression" section, some elements need consideration: mainly the format of the test items, the content area to be sampled, the number of items in that area, and the level of item difficulty (Osterlind, 1998, p. 78). The purpose of requiring contextualized test items is to meet the heuristic function of knowledge. Providing authentic material to solve problems and to develop thinking is "highly relevant to communication in the discipline or occupation concerned", and "the aim is to assess communicative proficiency in the subject concerned, not to test specific knowledge of it" (Carroll, 1980, p. 38). In both parts of the TOEFL "structure and written expression" section, test takers mainly focus on selecting the appropriate form in a sentence-based format and hardly have to exchange any information during the whole process, even though most of the topics relate to academic areas. The limited number of items testing grammatical knowledge cannot reflect test takers' knowledge of grammar completely, and has little practical power to show that they will be competent enough to meet the later demands of academic learning. Moreover, the written expression part of this section seems to test writing ability in an indirect way; we doubt, however, whether it can really determine whether learners' competence in writing will meet the later demands of academic study.
One of the notable features of the TOEFL is its four-option multiple-choice answer format. From the scoring perspective it is highly reliable, since all scoring can be carried out by computer and can easily "discriminate between high- and low-achieving students" (Ebel & Frisbie, 1991, p. 124). It also offers flexibility for assessing a diversity of content and psychological processes (Osterlind, 1998, p. 163). In a four-option item, a candidate has a 25% chance of choosing correctly by chance, and making the correct choice is more complex and less ambiguous than in a true-false item. On the other hand, not all grammatical knowledge can be tested simply by choosing one correct answer from four options. Furthermore, the distractors largely determine the content validity of an item: the options cannot check the target knowledge if the other three bear little relation to it, and if the given context can be ignored, no higher-order thinking skills are needed. In addition, students get credit for recognizing the wrong options through a process of elimination, or simply by guessing, even when they cannot positively identify the correct option (Sax, 1997, p. 106; Ebel & Frisbie, 1991, p. 156). Moreover, a weakness of easily devised, objectively scored tests of strings of linguistic items is that they may miss the essence of measuring communicative performance (Carroll, 1980, p. 9), inhibit creativity and original thinking, and reduce all important knowledge to superficial facts (Osterlind, 1998, p. 164). Therefore, when grammatical knowledge is tested simply through four-option multiple choice, criteria must be applied both to the tasks in the test items and to the test takers' performance.
In order to better understand the contents of the "structure and written expression" section, I borrow the type analysis of Hilke and Wadden (1997, pp. 30-34) to give a brief summary of this test paper.
Part A: Structure (fill in the blank):
(1) WIAS (What Is A Sentence): about 47.7%;
(2) Word choice: about 27.7%;
(3) Word order: about 13.3%;
(4) Verb form: about 13.3%.
(WIAS is a category showing that each clause contains one subject and one verb. Word choice tests the use of appropriate words and phrases. Word order checks the proper ordering of words. Verb form concerns tense or aspect.)
Part B: Written expression (error analysis):
(1) Part of speech error: 24%;
(2) Prepositional error: 16%;
(3) Verb form error: 16%;
(4) Plural: 8%;
(5) Pronoun error: 8%;
(6) Redundancy error: 8%;
(7) Word order: 8%;
(8) Article error: 8%;
(9) Conjunction error: 4%.
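The aggregate figures discussed below follow directly from the category percentages above; a short tally (category labels as in Hilke and Wadden's scheme) makes this explicit:

```python
# Tally of item-type percentages in "structure and written expression"
# (May 2001 paper), using Hilke and Wadden's (1997) categories listed above.
part_a = {"WIAS": 47.7, "word choice": 27.7, "word order": 13.3, "verb form": 13.3}
part_b = {"part of speech": 24, "preposition": 16, "verb form": 16, "plural": 8,
          "pronoun": 8, "redundancy": 8, "word order": 8, "article": 8,
          "conjunction": 4}

# Roughly three quarters of Part A falls into just two categories,
# and over half of Part B into three.
print(round(part_a["WIAS"] + part_a["word choice"]))                           # 75
print(part_b["part of speech"] + part_b["preposition"] + part_b["verb form"])  # 56
```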
From the data above, seventy-five percent of the questions in part A fall into the WIAS (What Is A Sentence) and word choice groups. In part B, more than half of the errors (56%) belong to three major types: part of speech, preposition and verb form. The grammar points tested in this paper are thus relatively simple and not very varied, and some points are repeatedly tested across items. If students are drilled in these basic structural rules, they can quickly improve their accuracy on these questions, since the grammar points covered in the TOEFL are incomplete and limited. Furthermore, although the multiple-choice testing of grammatical knowledge in "structure and written expression" can produce consistent, reliable scores, it is not sufficient to use this section as a placement subtest for writing. As Bachman and Palmer (1996, p. 23) note, "grammatical knowledge is only one aspect of the ability to use language to perform academic writing tasks", and the area of language knowledge covered here is quite narrow. At the same time, a high proportion of the sentences are compound and complex, most of them selected from academic articles in natural science, biography, history and so on, with a large number of technical and abstract words. Nevertheless, these complicated academic words are only a superficial "threat" to test takers: provided one has a solid knowledge of grammar and notices the cohesion within a sentence, getting the answer is still an easy task. Similarly, Bachman, et al. (1995, p. 123) found that the TOEFL vocabulary items were judged to be relatively familiar even to unprepared test takers. In sum, a multiple-choice format gives a test high reliability since it produces relatively consistent results, but "reliability is not a sufficient condition for either construct validity or usefulness" (Bachman & Palmer, 1996, p. 23).
4. Implications
When considering the reliability of the "structure and written expression" section of the TOEFL, the test developers want to set the minimum acceptable level of reliability as high as possible. Bachman and Palmer (1996, p. 135) provide two criteria for evaluation: one is "the way the construct has been defined", the other is "the nature
of the test tasks". That is to say, only when a test focuses on a relatively narrow range of components of language ability, with relatively uniform test tasks, can it achieve higher levels of reliability. The "structure and written expression" section adopts only two task types: completing a sentence by selecting one of four choices, and choosing the one error among four underlined choices. Both are multiple-choice formats, and they tap only a limited subset of the language abilities, language knowledge and strategic competence, mentioned earlier. Therefore, with incomplete language knowledge and little variation in test task characteristics, the result is consistency, which is the essence of reliability.
What is the use of the "structure and written expression" section? Can it really reflect English language proficiency as part of a communicative test? As Vollmer (1981, p. 154) notes, "most multiple-choice items are discrete-point items", but the items in "structure and written expression" differ from those in the listening and reading sections of the TOEFL. In terms of the contextualization of test items, the listening and reading parts involve much more holistic understanding of whole texts, integrating not only the abilities to listen or read intensively and extensively but also the broader aspects of communicative competence. For language proficiency testing, Read (1993, p. 357) recommends that words be understood in connected written or spoken discourse rather than as isolated items, which is very important for EFL learners. In a grammar test, the content should be defined more broadly than "syntax and morphology" and should include textual competence as well (Rea-Dickins, 1997, p. 92). The forms of a grammar test should also be varied to meet the original purpose of communicative testing; suggested techniques such as paraphrase, completion or gap-filling, guided short answer, summary and modified cloze could help to diversify the test formats (Hughes, 1989, p. 143; Rea-Dickins, 1997, p. 91). Only when grammatical knowledge is integrated with language skills, or exists independently in a variety of sentence-based or text-based formats, can it be meaningful in the TOEFL tests. Furthermore, much more emphasis needs to be put on meaning and communicative function, which shows itself in written expression, rather than on structure or form in the "structure and written expression" section.
Another issue with "structure and written expression" in the TOEFL is how to design proper options for the multiple-choice items so that the section remains reliable while achieving its communicative goal. Keeping a good balance between content words with useful expressions and function words with structures should thus be the main focus of these items. It is unwise to concentrate so heavily on a limited set of words that the answer can be found while ignoring the whole sentence. Moreover, as Huhta and Randell (1995, p. 105) note, constructing the options superficially is relatively easy, and distractors can be produced in various ways; but if we expect test takers to analyse the whole sentence in more detail and to use more worthwhile reading and comprehension skills, they should not be able simply to eliminate or guess their way to an answer, but should be able to give proper reasons for each choice. The easiest and most mechanical remedy is to increase the number of distractors, e.g. to five options instead of four, which would decrease the probability of being correct simply by guessing or elimination.
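The effect of adding a distractor on blind guessing can be quantified (a sketch, assuming independent guessing across the 40 items of the section):

```python
# Probability of a correct blind guess per item, and the expected number of
# lucky hits across the 40-item "structure and written expression" section.
items = 40
for options in (4, 5):
    p_correct = 1 / options
    expected_hits = items * p_correct
    print(options, p_correct, expected_hits)  # 4 -> 0.25, 10.0; 5 -> 0.2, 8.0
```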
Is it necessary to test grammatical knowledge independently? The place of grammar testing in foreign languages is as controversial as the place of grammar teaching. However, learning a target language differs from acquiring one's mother tongue, which justifies a place in a test for grammar and vocabulary. How to build this into the TOEFL while maintaining its quality as a communicative test is then a very important issue. One way, as Rea-Dickins (1997, p. 93) suggests, is not to test grammar as a distinct form but to reflect it in skill-based tests such as reading and writing; alternatively, grammar should
be tested in an integrative way rather than confined to a limited number of items on decontextualised single sentences, as the TOEFL does.
5. Conclusion
In this paper, from the perspective of grammatical knowledge, the author has attempted to examine the overall relationship between grammar and vocabulary in the TOEFL tests and English language skills, as well as the design features of the specific section, "structure and written expression". From the analysis of one TOEFL test paper, it appears that the TOEFL has well-developed merits, having existed for a long time as a systematic worldwide test with high reliability, although some demerits prevent it from fully demonstrating its validity and communicative purposes. "Knowledge of the second language is a necessary but not sufficient condition for success on the test tasks", since success needs to be measured "in terms of performance on the task but not only in terms of knowledge of language" (Hamilton, et al., 1993, p. 350). The "structure and written expression" section attempts to examine test takers' competence in writing in an indirect way, but it neglects its general character as a communicative test in an academic environment. Because of its limited item types, its very limited coverage of grammatical knowledge and its sentence-based structure, its construct validity is in doubt, and the role of grammar in language use in the TOEFL needs to change. The "Use of English" section of the Cambridge First Certificate in English could be a good model for assessing grammar if the "structure and written expression" section is to be kept in the TOEFL test.
Nevertheless, the TOEFL has always been revising and improving itself over time. The TOEFL 2000 project is a broad effort under which language testing at Educational Testing Service (ETS) will evolve into the 21st century (Jamieson, et al., 2000, p. 1). It will revise the Test of Spoken English and introduce a computer-based version of the TOEFL test. The greatest change, however, is that the TOEFL 2000 test will not depend only on multiple-choice tasks but will include open-ended and constructed-response tasks as well (Jamieson, et al., 2000, p. 13). Comparing the grammar and vocabulary items of decades ago with "structure and written expression" now, necessary and evident efforts have been made, and they continue. In the TOEFL 2000 framework, one statement indicates a tendency for later test papers: the TOEFL may not continue to include a separate measure of structure (Jamieson, et al., 2000, p. 11), which implies that integrating grammatical knowledge into the four language skills could be an appropriate way to meet the requirements of communicative testing. All in all, the TOEFL test, as a means of gathering information about EFL learners, is used mainly in making educational decisions worldwide. No test design can be called perfect, but we can see that the TOEFL is trying to meet the principles of designing a large-scale test. As Stevenson (1987, p. 81) comments:
Given its purposes, examinee populations, and multiple uses and considering the attendant limitations on test
content, tasks, and predictive specificity, TOEFL remains the best of its breed. Beyond those practical limitations that are
necessary to its purposes and scope, TOEFL's weakness largely reflects the state of the language testing art.
References:
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford University Press.
Bachman, L. F. & Palmer, A. S. (1996). Language testing in practice. Oxford University Press.
Banerjee, J. & Clapham, C. (2003). Test review: The TOEFL CBT (Computer-based Test). Language Testing, 20(1), 111-123.
Brown, J. D. & Ross, J. A. (1996). Decision dependability of subtests, tests and the overall TOEFL test battery. In: Michael
Milanovic & Nick Saville. (Eds.). Performance testing, cognition and assessment: Selected papers from the 15th language
testing research colloquium, Cambridge and Arnhem. Cambridge University Press.
Carroll, B. J. (1980). Testing communicative performance: An interim study. Pergamon Press.
Cumming, A. & Mellow, D. (1995). An investigation into the validity of written indicators of second language proficiency. In: Alister
Cumming & Richard Berwick. (Eds.). Validation in language testing. Multilingual Matters Ltd.
Davies, A. (1990). Principles of language testing. UK: Basil Blackwell.
Ebel, R. L. & Frisbie, D. A. (1991). Essentials of educational measurement (5th ed.). New Jersey: Prentice Hall.
Hamilton, J. (1993). Rating scales and native speaker performance on a communicatively oriented EAP test. Language Testing, 10(3),
337-353.
Hilke, R. & Wadden, P. (1997). The TOEFL and its imitations: Analyzing the TOEFL and evaluating TOEFL-prep texts. RELC
Journal, 28(1), 28-53.
Hughes, A. (1989). Testing for language teachers. Cambridge: Cambridge University Press.
Huhta, A. & Randell, E. (1995). Multiple-choice summary: A measure of text comprehension. In: Alister Cumming & Richard
Berwick. (Eds.). Validation in language testing. Multilingual Matters Ltd.
Jamieson, J., et al. (2000). TOEFL 2000 framework: A working paper. Educational Testing Service.
Morrow, K. (1986). The evaluation of tests of communicative performance. In: Matthew Portal. (Ed.). Innovations in language
testing. NFER-NELSON.
Osterlind, S. J. (1998). Constructing test items: Multiple-choice, constructed-response, performance, and other formats (2nd ed.).
Kluwer Academic Publishers.
Rea-Dickins, P. M. (1991). What makes a grammar test communicative? In: Charles Alderson & Brian North. (Eds.). Language
testing in the 1990s. Macmillan Publishers Limited.
Rea-Dickins, P. (1997). The testing of grammar in a second language. In: Caroline Clapham & David Corson. (Eds.). Encyclopedia of
language and education: Language testing and assessment (vol. 7). Kluwer Academic Publishers.
Read, J. (1993). The development of a new measure of L2 vocabulary knowledge. Language Testing, 10(3), 355-371.
Read, J. (2000). Assessing vocabulary. Cambridge University Press.
Read, J. & Chapelle, C. A. (2001). A framework for second language vocabulary assessment. Language Testing, 18(1), 1-32.
Sax, G. (1997). Principles of educational and psychological measurement and evaluation (4th ed.). Wadsworth Publishing Company.
Schmitt, N. (1999). The relationship between TOEFL vocabulary items and meaning, association, collocation and word-class
knowledge. Language Testing, 16(2), 189-216.
Skehan, P. (1991). Progress in language testing: The 1990s. In: Charles Alderson & Brian North. (Eds.). Language testing in the
1990s. Macmillan Publishers Limited.
Stevenson, D. K. (1987). Test of English as a foreign language. In: Alderson, J. C., et al. (Eds). Reviews of English language
proficiency tests. Teachers of English to Speakers of Other Languages.
Vollmer, H. J. (1981). Why are we interested in general language proficiency? In: Charles Alderson & Arthur Hughes. (Eds.). ELT
documents 111- Issues in language testing. The British Council.
Wall, D., et al. (1991). Validating tests in difficult circumstances. In: Charles Alderson & Brian North. (Eds.). Language testing in the
1990s. Macmillan Publishers Limited.
Weir, C. J. (1990). Communicative language testing. UK: Prentice Hall.
Wijgh, I. F. (1995). A communicative test in analysis: Strategies in reading authentic tex...