1. General
The Wechsler Adult Intelligence Scale - Third Edition (WAIS-III)
is part of a family of Wechsler tests. "The WAIS-III is the great-grandchild
of the original 1939 Wechsler-Bellevue Form I." (Kaufman and
Lichtenberger 1999, p. 3) It has been the subject of extensive research,
so this short review will merely present an overview, with some focus
on the issue of its validity.
The WAIS-III project directors were David Tulsky and Jianjun Zhu,
the publisher is Harcourt Brace & Company, and the test and its
normative data were published in 1997. Administration of the subtests
needed for the Verbal IQ, Performance IQ, and Full Scale IQ is intended to take
60-90 minutes. The test costs either US$ 914 or US$ 967, depending
on the packaging (box or case) required (according to the WAIS-III
WMS-III Technical Manual and the website http://harcourtassessment.com/).
2. Purpose and nature
The purpose of WAIS-III is to measure an adult's intellectual ability
using a multiple aptitude battery. The test is for adults between
the ages of 16 and 89 years. It is designed for use with individuals,
and the battery is composed of seven performance subtests and seven
verbal subtests. The overall result of a WAIS-III test is called a
Full Scale IQ, but the verbal and performance components also yield
their own scores -- respectively the Verbal IQ and the Performance
IQ. The verbal and performance components each also have subcomponents,
and these subcomponents yield scores called indices.
According to Kaufman and Lichtenberger (1999), these subtests were
largely based on other researchers' work, especially the Stanford-Binet
and the Army Performance Scale Examination.
3. Practical evaluation
Face validity is one of the test's strong points, because it appears
to cover a wide range of intelligence-related skills. The design of the
main materials (Stimulus Booklet, Block Design blocks, Picture Arrangement
pictures, Object Assembly objects, and Administration and Scoring
Manual) is excellent. The content is literate, clearly laid-out, and
easy to use. Everything is attractive and apparently durable, except
that the cover of the stimulus booklet I saw was showing its age,
being slightly turned-up at the bottom. The materials seem appropriate
to the age of the users.
The test is lengthy and somewhat complicated to administer. It cannot be
administered by computer -- administration has to be one-on-one, human-to-human.
The directions are not always completely clear, but administrators
would normally be taught how to administer the test, so this should
not be a real issue in most cases. A training video is available for
purchase (see http://harcourtassessment.com/hai/ProductLongDesc.aspx?Catalog=TPC-USCatalog&ISBN=015-8980-727&Category=Adolescents ).
Computer-assisted scoring is available. Manual scoring procedures,
while not simple, are not really difficult, but there is
always a risk of human error because of the number of raw scores
and scaled scores that the administrator has to enter by hand.
Scoring templates are used for some subtests. Index scores are generated
for Verbal Comprehension, Working Memory, Perceptual Organization,
and Processing Speed, as well as scores for Verbal IQ, Performance
IQ and Full-Scale IQ.
According to an email received on 25 April 2007 from Harcourt Assessment
Customer Service, WAIS-III requires a high level of expertise in test
interpretation, and can be purchased by individuals with:
- Licensure or certification to practice in a field related to the
purchase, or
- A doctorate degree in psychology, education, or closely related
field with formal training in the ethical administration, scoring,
and interpretation of clinical assessments related to the intended
use of the assessment.
4. Technical evaluation of psychometric properties
(a) Norms:
Shum, O'Gorman, and Myors (2006, p. 130) state that one of the strengths
of the WAIS-III is the size and representativeness of the standardisation
sample used in test development. According to the Technical Manual,
the WAIS-III and WMS-III (Wechsler Memory Scale -- Third Edition)
normative information was based on United States standardisation samples
of 2,450 individuals representative of the population of adults aged
16-89 years. A stratified, census-based sampling plan ensured that
the standardisation samples included representative proportions of
adults according to each selected demographic variable. The variables
used for stratification were age, sex, race/ethnicity, education level,
and geographic region.
According to the Technical Manual, one set of norms was produced
that was representative of US Census proportions as regards all variables
except age. It was based on the performance of a reference group that
consisted of the participants in the standardisation sample who were
between the ages of 20 and 34. The Manual recommends that this set
of norms be used when clinical questions dictate comparisons of an
individual's performance to that of a reference group. Another set
of norms was produced that was based on age-corrected subtest scores.
The Manual recommends that this set of norms be used when clinical
questions dictate comparisons of an individual's performance to that
of his or her age peers.
(b) Reliability:
The WAIS-III exists in only one version, so there is no issue with
alternate forms. According to the Technical Manual, interscorer agreement
is very high, averaging in the high .90s, and a study of the stability
of WAIS-III scores found it to be adequate across time for all age-groups.
According to the Technical Manual, the reliability of each WAIS-III
subtest (except Digit Symbol-Coding and Symbol Search) was estimated
using a split-half procedure from the item scores from a single administration,
with the correlation corrected using the Spearman-Brown formula. Since
Digit Symbol-Coding and Symbol Search subtests are speeded subtests,
the split-half coefficient was not considered to be a good estimate
of their reliability. For that reason, test-retest stability coefficients
were used as the reliability estimates for these two subtests, with
the correlation being corrected for the variability of the standardisation
sample.
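The split-half procedure with the Spearman-Brown correction described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Technical Manual's own procedure: the odd/even split of items is an assumption (the Manual's method of forming halves is not described here), and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate for the full-length test."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(item_scores):
    """Estimate reliability from a single administration.

    item_scores: one list of item scores per examinee.
    Assumption: the halves are the odd- and even-numbered items.
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(odd_totals, even_totals))
```

For example, a correlation of .80 between the two halves steps up to 2(.80)/(1 + .80), or about .89, for the full-length subtest. The correction is needed because each half is only half as long as the subtest whose reliability is being estimated.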
The test-retest sample included 394 participants, with roughly 30 participants
from each of the 13 age-groups. The reliability coefficients of the
WAIS-III IQ scales and indexes were calculated with the formula recommended
by Guilford (1954) and Nunnally (1978). The average reliability coefficients
across age-groups of the subtests (except Picture Arrangement, Symbol
Search and Object Assembly), which were calculated with Fisher's z
transformation, range from .82 to .93. The Symbol Search subtest had
a coefficient of .77, Picture Arrangement had .74, and Object Assembly
had .70. The Object Assembly subtest is not included in the computation
of IQ and Index scores, in part because of its low reliability for
older adults.
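Averaging reliability coefficients across age-groups with Fisher's z transformation can be sketched as below. This is an illustrative sketch only: it assumes a simple unweighted mean (the Manual may weight by group size), and the function name is hypothetical.

```python
import math


def average_reliability(coefficients):
    """Average correlation-type coefficients via Fisher's z transformation.

    Each coefficient r is converted to z = atanh(r), the z values are
    averaged (unweighted, by assumption), and the mean is converted
    back to the r metric with tanh.
    """
    z_values = [math.atanh(r) for r in coefficients]
    mean_z = sum(z_values) / len(z_values)
    return math.tanh(mean_z)
```

The transformation is used because correlations are not on an equal-interval scale near 1.0. For positive coefficients that differ from one another, the z-based average comes out slightly higher than the plain arithmetic mean.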
(c) Validity:
The Technical Manual (p. 75) asserts that, in order to ensure content
validity, comprehensive literature reviews were undertaken, consultants
were engaged, surveys were carried out, and focus groups and an
advisory panel were set up. The Manual also provides considerable
detail about the testing that was done of the WAIS-III's concurrent
criterion-related validity.
A later section will examine the issue of construct validity in more
detail. Here it suffices to state that the Technical Manual provides
extensive data on intercorrelations among the components of
the WAIS-III itself, on factor analysis and on the ability of the WAIS-III
to discriminate between the normal population and groups with various
neurological disorders, alcohol-related disorders, schizophrenia,
psychoeducational and developmental disorders, and deafness or hearing-impairment.
5. Research Relevant to Usefulness of Measure
There has been a vast amount of research done on WAIS-III and its
predecessors, so it is beyond the scope of this review to do more
than just to sample it -- giving a hopefully varied but unsystematic
taster of the available body of research. Watkins, Campbell,
Nieberding, and Hallmark (1995) conclude that the Wechsler
scales are amongst the assessment procedures most frequently recommended
by American clinicians for clinical students to learn about and that
most clinicians still use most often what they call the "most
tried and true" assessment standards, including the Wechsler
scales. The WAIS-R (the immediate predecessor of WAIS-III) was the
clear frontrunner in terms of frequency of use of intelligence tests.
Camara, Nathan and Puente (2000) made a similar finding.
In this connection, it is worth noting that the WAIS-III Technical
Manual (on page 75) states that "... because of the similarities
between the WAIS-III and the WAIS-R ..., the accumulated research
on the WAIS-R ... should be considered in any evaluation of the validity
of the (WAIS-III)."
In Australia, Sharpley and Pain (1988) report that the Wechsler tests
of intelligence were also the most valued and recommended, and in
New Zealand Knight and Godfrey (1984) report that the WAIS was the
test that the largest number of hospital psychologists believed clinical psychology
graduates should have had experience in administering and interpreting.
Many factor-analytic studies have examined the validity of various
aspects of the WAIS-III since its publication, as was anticipated
in the Technical Manual itself. It is interesting to note that such
studies sometimes appear to contradict each other -- for example,
Taub (2001) concluded that his evidence did not support the Verbal
IQ/Performance IQ dichotomy, whereas Jones, van Schaik, and Witts
(2006) conclude as follows:
...we suggest that index scores should be used with caution in
individuals with low IQ (74 or less). The use of two scores (for
verbal and performance domains) is justified based on the two-factor
solution obtained in the current study.
Bennett (1981) investigated the effect of encouragement of examinees
by administrators on measured IQ and found a significant positive
effect for Full-Scale IQ, with those who had received encouragement
scoring higher than those who had not. This effect was also found
for Performance IQ, but the effect for Verbal IQ was not significant.
Bennett cites previous research which had also found that reinforcement
of various kinds had a significant effect on academic performance
and test scores. He also investigated the interaction of encouragement
with examinee personality-type (Locus of Control), but he found no
significant effect in this case.
With regard to the issue of encouragement, Bennett states (p. 78)
that it is inevitable that some differences will arise among examiners.
"These differences do not matter if the encouragement has no
effect, but if that is the case, there is little point in using it."
He goes on to state (p.80):
Although the differences obtained in the present research were within
the standard error of measurement of the WAIS, it must be remembered
that as a result of factors mentioned above, and the fact that examiner
differences were kept to a minimum, the effect found in the present
study was probably a minimal one.
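The standard error of measurement that Bennett refers to is conventionally computed from a scale's standard deviation and its reliability. The Python sketch below is illustrative only: the function name is hypothetical, and the reliability of .96 is an assumed value chosen for easy arithmetic, not a figure taken from Bennett or from the Technical Manual. The standard deviation of 15 is the conventional value for Wechsler IQ scores.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), in the same units as the score."""
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical example: IQ scale with SD 15 and an assumed reliability of .96.
sem = standard_error_of_measurement(15.0, 0.96)  # approximately 3.0 IQ points
```

Under these assumed values, a difference between groups of fewer than about 3 IQ points would fall within one standard error of measurement, which is the kind of comparison Bennett is making.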
Heaton, Taylor and Manly (2003) investigate certain aspects of both
the WAIS-III and the WMS-III, which were standardised jointly. The
authors are concerned to optimise these two tests for clinical --
especially neurodiagnostic -- purposes. The use of the tests that
they have in mind is for comparing the scores achieved by particular
individuals with what they would have achieved if they did not have
any neuropsychiatric disorder, so that the scores can be used to establish
the presence or absence, nature and extent of any such disorder in
that individual.
This would involve comparing test results with norms (unless the
individuals concerned happened to have been recently tested prior
to the suspected onset of any relevant morbidity). Confounding variables
would, of course, need to be taken into account and it would be preferable
therefore to have separate norms for every relevant category that
an individual might fall into. The authors state that there is evidence
that sex, education-level and ethnicity are relevant in this regard.
However, WAIS-III only has separate norms for particular age-groups.
The authors address this problem and claim to have solved it. They
investigate the effect of these variables on WAIS-III and WMS-III
test scores and also the effect on score-interpretation of not taking
these factors into account. They then provide new standardised scores
that correct for these demographic influences, and demonstrate how
these result in more accurate score-interpretations.
6. Evaluation and Discussion
The Wechsler family of tests is long-established and well-known,
and has considerable face validity and professional credibility
as a result. The subtests of the WAIS-III are varied and attractive,
which reduces the tedium (for the examinee) which might be associated
with sitting a long test, although there is evidence (Axelrod and
Ryan 2000) that some examinee groups can average as long as 110 minutes
to complete the full test.
One of the main strengths of the WAIS-III is the size and representativeness
of the standardisation sample used in test development. However, as
Kaufman and Lichtenberger (1999, p. 3) state, "The development
of Wechsler's tests was not based on theory ... but instead on practical
and clinical perspectives." This theoretical vacuum reflects
on its construct validity.
As the technical manual states (p. 75):
The validity of a test is regarded as the most fundamental and
important aspect of test development... validity is the overall
evaluation of the degree to which empirical evidence and theoretical
rationales support the adequacy and appropriateness of interpretations
of test scores.
The main weakness of the WAIS-III relates to the theoretical rationales
which underpin its claims to validity. The Technical Manual states
that Wechsler maintained throughout his career the definition of intelligence
as the "capacity of the individual to act purposefully, to think
rationally, and to deal effectively with his environment." From
the point of view of construct validity, however, it is implausible
to claim that WAIS-III measures the constructs intended by its design,
if the constructs are based on the above definition, which is extremely
broad. How prominently does "purposefulness" figure in the
WAIS-III? Not at all, as far as I am aware. And the term "environment"
is so broad that it would be implausible to suggest that sitting any
test at a desk under supervision was at all relevant to assessing
how an individual dealt with his environment as a whole (however that
might be defined). I have seen no evidence that the subtests (derived,
as they mostly were, from tests developed by other researchers) were
developed to test a construct based on that definition of intelligence
-- or anything like it.
Moreover, it would also be implausible to claim that users (administrators,
user organisations and examinees) of the WAIS-III would generally
have as broad a definition as that in mind when they purchased and/or
used it in good faith to produce scores of "intelligence".
It is beyond the scope of this review to investigate whether there
have been or might in future be legal arguments raised in connection
with the above issues.
Coolican (2005, p. 288) warns:
Note that psychologists have not discovered that intelligence has
a normal distribution in the population. The tests were purposely
created to fit a normal distribution, basically for research purposes
and practical convenience in test comparisons.
This artificiality and pragmatism are not limited to the distribution
of intelligence scores. Psychologists often apply their theories to
important social purposes, and one of these purposes is to assess
the "amount" of the pre-existing popular concept of "intelligence"
that particular people possess. This popular concept itself is vague
and understood in different ways by different ordinary people, but
tests such as the WAIS-III are marketed back to ordinary people as
being tests of "intelligence" (as is shown by the appearance
of the word "intelligence" in the name of the test), with
the implication that this is the same concept that lay people have
in mind when they use that word. It might have been better to use
a term such as "rational cognitive ability".
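Coolican's point that IQ scores are constructed to fit a normal distribution can be made concrete with a short Python sketch. It assumes the conventional Wechsler deviation-IQ scale (mean 100, standard deviation 15), which is not stated in this review, and the function name is hypothetical.

```python
from statistics import NormalDist

# Assumed scale: Wechsler deviation IQ, mean 100 and SD 15.
IQ_SCALE = NormalDist(mu=100, sigma=15)


def iq_percentile(iq):
    """Percentage of the norm group expected to score at or below this IQ."""
    return 100.0 * IQ_SCALE.cdf(iq)
```

On this scale an IQ of 115 is one standard deviation above the mean and corresponds to roughly the 84th percentile. The percentile follows from the way the scale is built rather than from any empirical finding about how intelligence is distributed.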
There are no substantial ethical issues involved with the WAIS-III
that are not common to all psychometric tests. The one possible exception
is the ethical need to resist pressures from political groups to interpret
as ethical issues what are properly considered political issues related
to the education or employment of particular ethnic or other groups.
These need to be decided through the proper democratic political processes.
7. References
Axelrod, B. N., & Ryan, J. J. (2000). Prorating Wechsler Adult
Intelligence Scale-III Summary Scores. Journal of Clinical Psychology,
Vol. 56(6), 807-811.
Bennett, W. J. (1981). Effects of Encouragement and Locus of Control
on WAIS IQ Scores. Massey University: M.A. Thesis.
Camara, W. J., Nathan, J. S., & Puente, A. E. (2000). Psychological
Test Usage: Implications in Professional Psychology. Professional
Psychology: Research and Practice, Vol. 31(2), 141-54.
Coolican, H. (2005). Research Methods and Statistics in Psychology.
London: Hodder & Stoughton.
Guilford, J. P. (1954). Psychometric Methods (2nd ed.). New York: McGraw-Hill.
Heaton, R. K., Taylor, M. J., & Manly, J. (2003). Demographic
Effects and Use of Demographically Corrected Norms with the WAIS-III
and WMS-III. In Tulsky, D. S., Chelune, G. J., Ivnik, R. J., Prifitera,
A., Saklofske, D. H., Heaton, R. K., Bernstein, R., & Ledbetter,
M. F. (Eds.), Clinical Interpretation of the Wechsler Adult
Intelligence Scale. San Diego: Academic Press.
Jones, J. J. S., van Schaik, P., & Witts, P. (2006). A Factor
Analysis of the Wechsler Adult Intelligence Scale 3rd Edition (WAIS-III)
in a Low IQ Sample. British Journal of Clinical Psychology,
Vol. 45(2), 145-152.
Kaufman, A. S., & Lichtenberger, E. O. (1999). Essentials of
WAIS-III Assessment. New York: Wiley.
Knight, R. G., & Godfrey, H. P. D. (1984). Tests Recommended
by New Zealand Hospital Psychologists. New Zealand Journal of Psychology,
Vol. 13, 32-6.
Nunnally, J. (1978). Psychometric Theory (2nd ed.). New York: McGraw-Hill.
Sharpley, C. F., & Pain, M. D. (1988). Psychological Test Usage
in Australia. Australian Psychologist, Vol. 23(3), 361-9.
Shum, D., O'Gorman, J., & Myors, B. (2006). Psychological Testing
and Assessment. Melbourne: Oxford University Press.
Taub, G. E. (2001). A Confirmatory Analysis of the Wechsler Adult
Intelligence Scale-Third Edition: Is the Verbal/Performance Discrepancy
Justified? Practical Assessment, Research & Evaluation. Retrieved
30 April 2007 from http://PAREonline.net
Watkins, C. E. Jnr., Campbell, V. L., Nieberding, R., & Hallmark,
R. (1995). Contemporary Practice of Psychological Assessment by Clinical
Psychologists [Psychological Assessment and Clinical Practice]. Professional
Psychology: Research and Practice, Vol. 26(1), 54-60.