Test-taking strategies are
defined as those test-taking processes which the respondents have
selected and which they are conscious of, at least to some degree.
In other words, the notion of strategy implies an element of selection.
Otherwise, the processes would not be considered as strategies.
At times, these strategies constitute opting out of the language
task at hand (for example, through a surface matching of identical
information in the assigned passage and in one
of the response choices).
At other times, the strategies may constitute short-cuts to arriving
at answers (for example, not reading the text as instructed but
simply looking immediately for the answers to the given reading
comprehension questions). In such cases, the respondents may be
using test-wiseness
to circumvent the need to tap their actual language knowledge or
lack of it, consistent with Fransson's
(1984) assertion that respondents may not proceed via the text
but rather around it.
In the majority of testing situations, however, test-taking strategies
do not lead to opting out or to the use of short cuts. In some cases,
quite the contrary holds true.
In a study of test-taking strategies in Israel, one Hebrew second-language
respondent determined that he had to produce a written translation
of a text before he could respond to questions dealing with that
text (Cohen & Aphek,
1979).
At times, the use of a limited number of strategies in a response
to an item may indicate genuine control over the item, assuming
that these strategies are well-chosen and are used effectively.
At other times, true control requires the use of a host of strategies.
It is also best not to assume that any test-taking strategy is
a good or a poor choice for a given task. It depends on how given
test takers – with their particular cognitive style profile
and degree of cognitive flexibility, their language knowledge, and
their repertoire of test-taking strategies – employ these
strategies at a given moment on a given task.
Some respondents may get by with the use of a limited number of
strategies that they use well for the most part. Others may be aware
of an extensive number of strategies but may use few, if any, of
them effectively. So, for example, while a particular skimming
strategy (such as paying attention to subheadings) may provide adequate
preparation for a given test taker on a recall task, the same strategy
may not work well for another respondent. It also may not work well
for the same respondent on another text which lacks reader-friendly
subheadings.
As long as the task is part of a test, students may find themselves
using strategies that they would not use under non-test conditions.
It is for this reason that, during the pilot phase, it is crucial
for test constructors to find out what their tests are actually
measuring.
Verbal Report as a Window onto Test-Taking Strategies
Test-taking involves cognitive processes that are not readily open
to objective observation and evaluation. Consequently, in order
to get the best picture possible of what it is that respondents
do as they, for example, read test prompts and respond to test questions,
researchers have tended to use verbal report protocols.
A comprehensive and in-depth overview of how verbal reports can
be and are used in language testing has been provided by Green
(1998). According to him, “Verbal protocols are increasingly
playing a vital role in the validation of assessment instruments
and methods” in that they “offer a means for more directly
gathering evidence that supports judgments regarding validity than
some of the other more quantitative methods” (p. 3).
Green, in fact, notes that verbal reports are frequently used to
address “one of the most fundamental questions” about
language tests: what is it that a test actually measures (p. 3).
Verbal reports include data that reflect:
- self-report: learners' descriptions of what
they do, characterized by generalized statements, in this case,
about their test-taking strategies – for example, "On
multiple-choice items, I tend to scan the reading passage for
possible surface matches between information in the text and that
same information appearing in one of the alternative choices."
Questionnaires and other kinds of prompts which ask learners
to describe the way they usually take a certain type of language
test are likely to elicit self-report data.
- self-observation: the inspection of specific,
not generalized language behavior, either introspectively, that
is, within 20 seconds of the mental event, or retrospectively
– for example, "What I just did was to skim through
the reading passage for possible surface matches between information
in the text and that same information appearing in one of the
alternative choices."
Self-observation data would entail reference to some actual instance(s)
of language testing behavior. For example, recollections of why
certain distracters were rejected in search of the correct multiple-choice
response on previously answered items would count as retrospective
self-observation.
- self-revelation: "think-aloud," stream-of-consciousness
disclosure of thought processes while the information is being
attended to – for example, "Hmm...I wonder if the information
in one of these alternative choices also appears in the text."
Self-revelation or think-aloud data are only available at the
time that the language event is taking place (that is, within
20 seconds of it), and the assumption would be that the respondent
is simply describing, say, the struggle to determine which five
out of seven or more statements constitute the best set of main
points for a text. Any thoughts that the respondent has which
are immediately analyzed would constitute introspective self-observation
– for example, “Now, does this utterance call for
the present or imperfect-subjunctive? Let me see...”
Verbal reports can and usually do comprise some combination of
these (Radford, 1974;
Cohen & Hosenfeld, 1981;
Cohen, 1987).
By asking test-takers to think aloud as they work through a series
of test items, it becomes possible to analyze the resulting protocol
to identify the cognitive processes involved in carrying out the
task.
Think-aloud protocols have the advantage of giving a more direct
view of how readers process a text as they indicate what they are
doing at the moment they are doing it (Cohen,
1987).
Retrospective interviews, in turn, provide an opportunity for investigators
to ask directed questions to gain clarification of what was reported
during the think-aloud.
Early work in verbal report with language testing found, for example,
that some common assumptions were ill-founded. Technical vocabulary
turned out not to cause as much difficulty as non-technical vocabulary and
non-technical vocabulary used technically within a given field.
Furthermore, seemingly obvious discourse markers may not be so obvious
to the L2 reader. In addition, the problems arising from syntactic
features may be quite limited in scope – stemming mostly from
structures such as heavy noun phrases (Cohen,
Glasman, Rosenbaum-Cohen, Ferrara, & Fine, 1979). Cohen
(1986) laid out a series of measures to be taken to ensure that
verbal report tasks could be used effectively to obtain data on
the reading process.
More recently, numerous studies have been done to determine the
strategies that students use to read texts (see Singhal,
2001, for a review). Upton
(1997, 1998),
for example, reported on 11 native speakers of Japanese, half still
taking ESL classes and half finished with courses. The students
were asked to provide think-aloud protocols while they read academic
passages.
In retrospective interviews, they then listened to their tape-recorded
protocols and were asked to clarify and explain their thoughts.
Upton’s study demonstrated how verbal report can be used
to describe the ways in which nonnatives can misconstrue the meaning
of words and phrases as they read an L2 text, and how this throws
off their understanding of the entire text. He found that many reading
errors could be explained in terms of what Laufer
(1991) has called synforms – that is, words that
look or sound similar to other words that the readers know. The
respondents would make vocabulary in the passage conform in their
minds to what they thought the meaning of these look-alike words
was. A more recent study by Upton
and Lee-Thompson (2001) used verbal report with 20 native speakers
of Chinese and Japanese to explore the question of when and how
they use L1 resources while reading L2 texts.
Verbal report measures have helped determine how respondents actually
take reading comprehension tests as opposed to what they may be
expected to be doing (Cohen,
1984, 1994a: 130-136).
Studies calling on respondents to provide immediate or delayed retrospection
as to their test-taking strategies regarding reading passages with
multiple-choice items have, for example, yielded the following results:
- When the instructions ask students to read the passage before
answering the questions, students have reported either reading
the questions first or reading just part of the article and then
looking for the corresponding questions.
- When advised to read all alternatives before choosing one, students
stop reading the alternatives as soon as they have found one that
they decide is correct.
- Students use a strategy of matching material from the passage
with material in the item stem and in the alternatives, and prefer
this surface-structure reading of the test items to one that calls
for more in-depth reading and inferencing.
- Students rely on their prior knowledge of the topic and on their
general vocabulary.
From these findings and from others, a description of what respondents
do to answer questions is emerging. Unless trained to do otherwise,
they may use the most expedient means of responding available to
them – such as relying more on their previous experience with
seemingly similar formats than on a close reading of the description
of the task at hand. Thus, when given a passage to read and multiple-choice
items to answer, students may attempt to answer the items just as
they have answered other multiple-choice reading items in the past,
rather than paying close attention to what is called for in the
current one. Often, this strategy works, but on occasion the particular
task may require subtle or major shifts in response behavior in
order to perform well.
Assessing the Interaction of Reading and Writing
Perhaps the cutting edge of research on reading and writing
is the assessment of language behavior at the intersection of reading
and writing. One vehicle for doing this is through a close inspection
of the process of summarizing.
Summarization tests are complex in nature. The reading portion
entails identifying topical information, distinguishing superordinate
from subordinate material, and identifying redundant and trivial
information.
The writing of the summary entails the selection of topical information
(or generating it if it is not provided), deleting trivial and redundant
information, substituting superordinate material, and restating
the text so that it is coherent and polished (Brown
& Day, 1983; Kintsch
& van Dijk, 1978).
Given the lack of clarity that often accompanies such tasks, research
has shown that it may be useful to give test takers specific instructions
about how to go about the summarization task (Cohen,
1993, 1994b). For
example:
Summarization task
How to Read:
- Read to extract the most important points – for
example, those constituting topic sentences signaled as
crucial by the paragraph structure: points that the reader
of the summary would want to read.
- Reduce information to superordinate points.
- Avoid redundant information – points off.
How to Write:
- Prepare in draft form and then rewrite.
- Link points smoothly.
- Pay attention to the required length for the summary (e.g.,
it may be 10 percent of the original text, so 75 words for a 750-word
text).
- Write the summary in your own words.
- Be brief.
- Write legibly.
It may also be beneficial to give raters specific instructions
as to how to assess the summaries. For example:
Assessing your Summary:
- Check to see whether each important point is included
(points that were agreed upon by a group of experts in advance).
- Check to make sure that these points are linked together
by the key linking/integrating elements appearing on the
master list.
- Point(s) off for each irrelevant point.
- Points off for illegibility.
- A rubric for what would constitute “writing in your
own words.”
This last item is a difficult one to regulate since some texts
or phrases in a text lend themselves to paraphrase better than others.
At some points in a text, the best summary of a point requires the
use of the words found there. In other cases, paraphrase is not
only possible but preferable.
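The length guideline and the rater checklist above can be made concrete with a small scoring sketch. This is only an illustration under stated assumptions: the names `target_length`, `SummaryRating`, and `score_summary`, and all of the point values, are hypothetical and do not come from the studies cited. The judgments themselves (whether a point counts as important, whether the summary is legible or in the writer's own words) still have to be made by a rater; the code only tallies them.

```python
from dataclasses import dataclass


def target_length(source_words: int, ratio: float = 0.10) -> int:
    """Word budget for a summary, e.g. 10 percent of the source text."""
    return round(source_words * ratio)


@dataclass
class SummaryRating:
    """One rater's tallies for one summary (all fields are rater judgments)."""
    important_points_included: int  # points matching the experts' master list
    linking_elements_used: int      # key linking elements from the master list
    irrelevant_points: int          # points in the summary not on the master list
    illegible: bool                 # rater judged the summary illegible


def score_summary(
    rating: SummaryRating,
    point_value: float = 1.0,           # hypothetical weight
    link_value: float = 0.5,            # hypothetical weight
    irrelevant_penalty: float = 1.0,    # hypothetical weight
    illegibility_penalty: float = 1.0,  # hypothetical weight
) -> float:
    """Credit included points and links, then subtract the penalties."""
    score = (
        rating.important_points_included * point_value
        + rating.linking_elements_used * link_value
    )
    score -= rating.irrelevant_points * irrelevant_penalty
    if rating.illegible:
        score -= illegibility_penalty
    return max(score, 0.0)  # never report a negative score


if __name__ == "__main__":
    print(target_length(750))
    print(score_summary(SummaryRating(5, 3, 1, False)))
```

Keeping the weights as parameters reflects the point made above: how much to credit or penalize each element, including paraphrase, is a decision for the test constructors rather than something fixed in advance.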
Assessing Written Expression
Perhaps the main thing to be said about a given test of written
expression is that it is a poor substitute for repeated samplings
of a learner’s writing ability while not under the pressure
of an exam situation.
The current process-oriented approach to writing raises the question
of whether it is sound testing practice to have learners write a
single draft of a composition as a measure of their writing ability.
Instead, might it not be more appropriate to have learners prepare
multiple drafts that are reviewed both by peers (in small groups)
and by the teacher at given moments?
Hence, if writing is to be assessed on a test, it would be important
to provide the learners with specific guidelines as to the nature
of the task. For example:
Your boss has asked you to rough out an argument for why
the factory employees should not get longer coffee breaks.
Try to present your arguments in the most logical and persuasive
way. Do not worry about grammar and punctuation at this point.
There is no time for that now. Just concern yourself with
the content of your ideas, their organization, and the choice
of appropriate vocabulary to state your case.
It is important for the person doing the assessment of the writing
to pay attention only to those aspects of the task that learners
were requested to consider.
Furthermore, the field of L2 writing has embraced the use of portfolios
whereby the students prepare a series of compositions (possibly
including the various drafts of each as well). Each entry may represent
a different type of writing – for instance, one a narrative
or descriptive or expressive piece, the second a formal essay, and
the third an analysis of a prose text. Hence, the portfolio represents
multiple measures of the students' writing ability. (For more on
portfolios, see Hamp-Lyons,
Condon, & Farr, 2000. The National
Language Resource Center has developed a manual, Portfolio Assessment
in the Second Language Classroom, that can be downloaded or ordered
from the web site.)
Final Thoughts
Advances in assessment have brought relatively untapped elements
of language into assessment measures.
For example, language assessment may now include more finely-tuned
assessment of languages for specific purposes (see Douglas,
2000) and of vocabulary (see Read,
2000), more sophisticated computer-based assessment (Dunkel,
1999), as well as the assessment of cross-cultural pragmatics
(see Hudson, Detmer, &
Brown, 1995; Brown,
2001; Cohen, in press).
With regard to pragmatics, expertise is accumulating in the assessment
of speech acts such as complaining, apologizing, requesting, and
so forth.
Likewise, the assessment field is refining its means of assessing
language sensitive to the use of the target language in specific,
often technical contexts; the field is taking assessment of second-language
vocabulary knowledge beyond simplistic measures to better assess
the depth and breadth of lexical control; and testers are pursuing
research and development projects to provide us with not only computer-assisted
assessment measures but computer-adaptive ones as well.