Selected Diagnostic Methods: article for class


Psychological Assessment, 2003, Vol. 15, No. 3, 333–339
Copyright 2003 by the American Psychological Association, Inc. 1040-3590/03/$12.00 DOI: 10.1037/1040-3590.15.3.333
Positive Impression Management and Its Influence on the Revised NEO
Personality Inventory: A Comparison of Analog and Differential
Prevalence Group Designs
R. Michael Bagby, University of Toronto and Centre for Addiction and Mental Health
Margarita B. Marshall, University of Toronto
Participants (n = 22) completed the Revised NEO Personality Inventory (NEO PI-R) as part of an authentic job application. Protocols produced by this group were compared with "analog" participants (n = 23) who completed the NEO PI-R under standard instructions and again under instructions designed to mimic the test-taking scenario of the job applicants (the "fake-good" condition). Participants completing the NEO PI-R under fake-good instructions and the job applicants scored lower on the Neuroticism and higher on the Extraversion scales than did the participants responding under standard instructions. Analog participants in the fake-good condition scored higher on the Extraversion and lower on the Agreeableness scales than did the job applicants. These results suggest that outcomes from analog designs are generalizable to real-world samples where response dissimulation is probable.
R. Michael Bagby, Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada, and Centre for Addiction and Mental Health, Toronto, Ontario, Canada; Margarita B. Marshall, Department of Psychology, University of Toronto.

This research was completed as part of an independent study for course credit on the part of Margarita B. Marshall and funded, in part, by a Senior Research Fellowship from the Ontario Mental Health Foundation awarded to R. Michael Bagby.

We thank Andrew G. Ryder and Paul T. Costa Jr. for their comments and suggestions on earlier versions of this article.

Correspondence concerning this article should be addressed to R. Michael Bagby, Centre for Addiction and Mental Health, Clarke Site, 250 College Street, Toronto, Ontario M5T 1R8, Canada. E-mail: michael_bagby@camh.net

A long-held belief among many personality researchers is that self-report instruments designed to measure personality and psychopathology are vulnerable to response bias (Edwards, 1953; Hogan & Nicholson, 1988). It is assumed that in many situations, test takers may be motivated to respond to items in a manner that maximizes a desired outcome. An individual applying for insurance compensation for psychiatric disability may intentionally exaggerate or even fabricate symptoms associated with mental illness in order to procure financial award. This type of responding is commonly referred to as "faking bad," or negative impression management. Others may be in situations in which it is in their best interests to underreport symptoms of mental illness that they experience or claim to possess desirable personality traits that they know to be untrue. One such scenario might be a job applicant seeking employment and therefore motivated to underreport the presence of psychopathology or endorse what he or she believes the potential employer would view as highly desirable traits. This type of responding has been labeled "faking good," or positive impression management.

Most measures of personality and psychopathology include special scales designed to detect the presence and influence of fake-bad and fake-good responding. Widely used instruments such as the Minnesota Multiphasic Personality Inventory-2 (MMPI-2; Butcher et al., 2001) and the Personality Assessment Inventory (Morey, 1991), for example, have validity scales, which are used to assess fake-bad and fake-good responding. One notable exception is the Revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992c), which is the most frequently used instrument to assess today's most prominent model of personality, the Five-Factor Model of Personality (FFM). The NEO PI-R was designed to measure the FFM, and it provides five personality domain scores that correspond to five broad dimensions of personality: Neuroticism, Extraversion, Openness-to-Experience, Agreeableness, and Conscientiousness. Although Costa and McCrae (1992c, 1997) have been criticized for not including validity scales on the NEO PI-R (Ben-Porath & Waller, 1992; Butcher & Rouse, 1996), they justify the continued exclusion of such scales on a number of different grounds, including the existence of a third-person (informant) version of the instrument that can be used, among other purposes, when there is suspicion of nonveridical responding (Costa & McCrae, 1992a, 1992b, 1992c) and the finding that correction adjustments based on validity scales rarely increase the predictive validity of individual content and clinical scales (McCrae et al., 1989). Perhaps the most compelling argument set forth by these personality researchers is that the "analog design," typically employed to develop and evaluate the effectiveness of validity scales, has not yet been shown to be generalizable to samples where faking is clearly known or is suspected to have occurred. These latter samples are called "known groups" and "differential prevalence groups," respectively (Rogers, 1997).

Despite Costa and McCrae's clear stance on the use of validity scales, Schinka, Kinder, and Kremer (1997) constructed research validity scales for the NEO PI-R to assess response dissimulation: the Positive Presentation Management (PPM) scale, designed to detect response styles that reflect "claimed uncommon virtues and/or denied common faults" (p. 129) and the Negative Presentation Management (NPM) scale, designed to detect response styles that reflect "claimed uncommon faults and/or denied common virtues" (p. 129). Overall, results from studies evaluating these validity scales suggest that the NPM and PPM are effective in detecting, respectively, fake-bad and fake-good responding (Berry et al., 2001; Caldwell-Andrews, Baer, & Berry, 2000; Reid-Seiser & Fritzsche, 2001; Schinka et al., 1997; Young & Schinka, 2001). All of these studies that provided support for these validity scales, however, used analog research designs.

Although no investigation has examined whether analog samples are similar to differential prevalence group or known-group samples, researchers have assumed, at least implicitly, that one can generalize from analog to differential prevalence group and known-group designs. Such empirically unsubstantiated assumptions have prompted some to suggest that an entire line of research examining fake-bad and fake-good response styles may have committed a Type II error (Piedmont, McCrae, Riemann, & Angleitner, 2000). That is, investigators using analog designs may be assuming falsely that research participants instructed to fake in the experimental context do not differ from persons in known-group and differential prevalence group samples, when in actuality they do. If, however, persons in differential prevalence groups and participants in analog designs who are instructed to fake good perform similarly or differ in similar ways from those in analog designs who respond honestly (i.e., standard instructions), the generalization from analog studies can be more confidently extended to the applied setting.

The goal of this study was to compare directly test results from the NEO PI-R completed by respondents under a condition in which positive impression management was highly likely (differential prevalence group sample) with test results produced by research participants who were provided with information and instruction in an experimental context designed to mimic the same test-taking scenario (analog sample). In particular, we sought to examine two issues. First, do individuals from a differential prevalence group sample produce NEO PI-R personality domain scale elevations similar to those produced by participants in analog research who are instructed to fake good, and do the scale elevations of these two groups differ from those of research participants who take the NEO PI-R under standard (honest) instructions? In addition, do the recently developed research scales designed by Schinka et al. (1997) to assess fake-good and fake-bad responding distinguish among the three groups? To this end, an analog sample composed of aspiring actors solicited from local acting schools first completed the NEO PI-R under standard (i.e., honest responding) instructions and then were readministered this test under instructions to respond in such a way as to maximize their chances of successfully acquiring a role on a local and highly publicized "reality" TV show (i.e., the fake-good condition). The results from these two test-taking conditions were then compared with one another and with the test results from a sample of bona fide reality TV applicants, the differential prevalence group, who had previously completed the NEO PI-R as part of the actual selection procedure.

We hypothesized that the analog research participants responding in the standard instruction condition would score higher on the Neuroticism and lower on the Extraversion scales than when responding under fake-good instruction; the same pattern of results should emerge in the comparison of those analog research participants responding in the standard instruction condition and those participants in the differential prevalence group sample. No differences in scores on Openness-to-Experience, Agreeableness, and Conscientiousness were expected, as results from previous investigations have not revealed reliable differences across standard and fake-good instructions for these personality traits (Ballenger, Caldwell-Andrews, & Baer, 2001; Caldwell-Andrews et al., 2000).

We also hypothesized that research participants responding in the condition in which they were instructed to fake-good would score significantly higher on the PPM and lower on the NPM validity scales than when responding under standard instructions; this hypothesis was in line with results from some previous studies that used analog research designs (e.g., Ballenger et al., 2001; see also Baer & Miller, 2002, for a review). Again, under the assumption that analog designs are comparable to differential prevalence group samples, we also predicted that the participants in the differential prevalence group sample would score higher on the PPM and lower on the NPM than those in the analog sample, who completed the NEO PI-R under standard instructions.

Method

Participants

The differential prevalence group sample consisted of 25 finalists from a competition to become hosts for a popular reality TV show based in a major North American city. From this sample of 25, 8 were to be selected. As part of the selection process, all applicants were required to complete a psychological evaluation. At the time of their assessment, informed consent was obtained from all participants, indicating their awareness of the purpose of testing. Because the results of these measures would have an impact on their success or failure to obtain one of the eight positions as a TV "host," there was a strong likelihood that the respondents were highly motivated to present themselves in a manner that they believed would enhance their selection.¹ Thirteen of the applicants were men, 11 were women, and 1 was in the process of changing from a woman to a man. Twenty-three of the 25 applicants were single and had never been married, and applicants were between 19 and 29 years of age (M = 23.0 years, SD = 3.16). One participant was a Canadian "First Nations" person (this term is the accepted designation for indigenous peoples of Canada), 2 participants were Canadians of African descent, and the remaining 23 (92%) were Canadians of European descent. Twenty-one of the 25 applicants (84%) were currently employed at the time of testing, and 14 of these 21 indicated that they were working as actors or performers. All but one (who had a high school degree) had at least 4-year university degrees.

¹ As noted, the applicants in the differential prevalence sample were informed that their responses on the personality measure were to be used as part of the selection process. There are two reasons to believe that applicants in the differential prevalence sample were highly motivated to alter their responses. First, participants on other reality TV programs (e.g., "Survivor") have received positive career benefits as a result of media exposure. Second, applicants in the present study were aware that as part of their job, they would be required to conduct interviews with well-known stars in the entertainment industry as well as to create and host their own programs. Such experiences would considerably enhance the resumes of successful applicants and have a great impact on their careers.

The analog sample was recruited through postings on Internet discussion boards for local actors on the basis of the advertisement used to recruit the applicants in the differential prevalence group. Inclusion and exclusion criteria for this group were based on the demographics of the differential prevalence group sample. The inclusion criteria were that participants be
between 19 and 29 years of age and currently employed or seeking employment as an actor or other performer at the time of testing. Applicants were excluded if their current or usual occupation (full or part time) was not acting or performing.² Twelve of the analog participants were men, and 16 were women. Of the 28 participants, 25 were single and had never been married, and 3 participants were married. Two participants in this sample were Canadians of African descent, and the remaining 26 (93%) applicants were Canadians of European descent. The mean age of the research participants was 23.8 years (SD = 3.10). All of these participants had university degrees.

² None of the analog participants was a finalist for the first season of the program, from which the differential prevalence group sample was drawn. However, 2 participants in the analog sample had applied for positions on the second season, and of these, 1 was a finalist.

Between-groups comparisons were conducted by means of t tests and the chi-square statistic to determine if the analog sample differed from the differential prevalence sample on the basis of demographic variables. None of the demographic variables differed significantly between the analog and differential prevalence samples for sex, marital status, age, race/ethnicity, or employment status.

Measurement

Personality domain scales. The NEO PI-R was used to assess the five personality domains of the FFM. The NEO PI-R comprises 240 self-report items answered on a 5-point Likert format scale, with separate scales for each of the five domains. Each scale consists of six correlated facets or subscales with 8 items, for a total of 48 items for each scale. In this study, only the five personality domain scales were used.

Validity scales. Two NEO PI-R research validity scales developed by Schinka et al. (1997), the PPM and NPM, were used to assess response style. The PPM and NPM scales comprise items selected from the pool of 240 NEO PI-R items using an empirical/rational scale strategy (see Schinka et al., 1997, Study 1).³ Both the PPM and NPM scales comprise 10 items; 6 of the 10 items on the PPM and all 10 items on the NPM are negatively keyed. The PPM scale includes 2 Neuroticism items, 3 Extraversion items, 3 Openness-to-Experience items, 1 Agreeableness item, and 1 Conscientiousness item. Two items from each of the five personality domains reside on the NPM scale.

³ A third research scale, the Inconsistency scale, was also created by Schinka et al. (1997) for the purpose of detecting random responding. However, because the main purpose of the present study was to examine impression management, this scale was not used in the analyses.

Procedure

Persons who responded to these advertisements had the study explained to them by telephone, and if they agreed to participate, a testing session was booked. Written informed consent was obtained from participants upon their arrival at the session. The number of participants in each session ranged from 2 to 8 people. Sessions lasted approximately 4 hr and were completed in two parts.⁴ In the first part respondents were administered the tests under the standard instructions and told that after completing the first testing session, they would be taking the tests again in a second session under different (but unspecified) instructions. After completing the first part of the session, they had a 30-min break (to minimize the effect of fatigue and boredom) and were given $5 for a beverage and a snack. When the participants returned from the break, they were given the tests a second time, with the instruction to respond to the questions as if they were trying to maximize their chances of gaining a role on a popular and local reality TV show. The instructions were as follows:

Imagine that you are auditioning for a job as a television host for [Name of Show] and that as part of the selection and audition process you must undergo psychiatric examination and psychological testing.⁵ We would now ask that you answer the questions on the psychological test in a manner that you believe would enhance your chances of being selected to be a television host. One thing to keep in mind is that you want to respond in a manner that is believable, but at the same time enhance the potential of your being selected. Remember that the research participant in this study who produces the most believable and desirable profile will receive an extra $100.

⁴ Participants also completed the MMPI-2 (Butcher et al., 2001), the Fundamental Interpersonal Relations Orientation-Behavior (Schnell & Hammer, 1993), and the BarOn Emotional Quotient Inventory (Bar-On, 1997); however, these measures were not included for analysis in the present article.

⁵ The name of the TV show in question is withheld in order to preserve anonymity of respondents.

We then explained that test results in this second session would be compared with those of a sample of individuals who had taken these tests as part of their application for the actual reality TV show; the respondent who produced a personality profile that most closely matched the "averaged" personality profile of the bona fide applicants would be awarded $100, in addition to that individual's remuneration for participating. Following the completion of the second testing session, all research participants were paid $75.

Results

Protocol and Data Screening

Protocols were first screened for acquiescence, nay-saying, random responding, and incomplete response sets according to the guidelines in the NEO PI-R manual (Costa & McCrae, 1992c). In the analog sample, under standard instructions, 1 respondent was found to have engaged in acquiescence, 1 engaged in random responding, and 1 did not complete the protocol. Protocols from these research participants (i.e., protocols completed by them under both instruction conditions) were removed from the sample. Similarly, two additional protocols were dropped from the analog sample, as participants in the fake-good instruction condition had extreme personality domain scale scores that deviated markedly from the rest of the analog sample in the fake-good instruction condition.

For the differential prevalence group sample, 3 protocols were removed: 1 for lack of clarity regarding which norms to use to score the NEO PI-R protocol (i.e., the individual involved in a sex-change operation), and 2 because of random responding. In sum, a total of 68 test protocols remained: 46 from the analog sample (23 under standard and 23 under fake-good instructions) and 22 in the differential prevalence group sample.

Experimental Manipulation Check

Rogers (1997) strongly recommended that research participants in analog dissimulation studies be questioned on their understanding of the instructions, as some studies have shown that many participants may not have comprehended them adequately. In order to determine if the analog research participants understood the instructions to fake-good provided in that instructional condition, we administered a postexperimental questionnaire, which included two open-ended discussion topics: (a) "In your own
words, please explain the instructions you received for the second half of the study" and (b) "Please briefly explain what strategies you employed to respond as if you were applying for the [Name of Show]." Two students (1 graduate student and 1 advanced undergraduate student) then reviewed and rated the responses from each of the participants and used the following scheme to code the answers: "Did not appear to have understood the instructions," "most likely understood the instructions," and "definitely understood the instructions." Both raters agreed that 21 of the 23 research participants either most likely or definitely understood the instructions on the basis of their responses to the first topic; one rater thought that 2 participants did not appear to understand the instructions, whereas the other rater thought these same 2 participants most likely understood the instructions. A review of the responses to the second topic, however, revealed that all subjects articulated a strategy reflecting that they most likely understood or definitely understood the instructions. Thus, no analog participant was eliminated on the basis of failure to understand instructions.

Mean Differences

Personality domain scales. Three sets of planned comparisons for the domain scales were performed with t tests. The first set of planned comparisons consisted of a within-group (repeated) analysis (i.e., analog standard instructions vs. analog fake-good instructions). The next two sets of planned comparisons were between groups (i.e., analog sample/standard instructions vs. differential prevalence group sample; and analog sample/fake-good instructions vs. differential prevalence group sample). Because most comparisons were based on a priori assumptions, Bonferroni correction was not applied, and the p value was set at .05. Effect size calculations (Cohen's d; Cohen, 1988) were used to supplement the mean difference analyses.

The means and standard deviations for the personality domain scores for each of the three groups are displayed in the upper portion of Table 1. In the repeated analysis, the analog sample research participants responding under fake-good instructions as opposed to those responding under standard instructions scored significantly lower on the Neuroticism scale, t(22) = 3.21, p < .01, d = 1.37, and significantly higher on the Extraversion scale, t(22) = 6.02, p < .01, d = 2.57, and the Conscientiousness scale, t(22) = 2.43, p < .05, d = 1.04, respectively. For the between-groups analysis in which the analog research participants responding under standard instructions were compared with the differential prevalence sample, the former group scored significantly higher on the Neuroticism scale, t(43) = 4.16, p < .01, d = 1.18, and significantly lower on the Extraversion scale, t(43) = 2.98, p < .01, d = 0.91, than the latter group. For the between-groups analysis in which the analog research participants responding under instructions to fake-good were compared with the differential prevalence group sample, the former group scored significantly higher on the Extraversion scale, t(43) = 4.16, p < .01, d = 1.27, and significantly lower on the Agreeableness scale, t(43) = 2.34, p < .05, d = 0.72, than did the latter group.

Validity scales. The means and standard deviations for the research validity scales (the PPM and NPM) are displayed in the lower portion of Table 1. For the PPM, there was a significant difference between the analog research participants responding under standard instructions compared with their responding under fake-good instructions, t(22) = 2.67, p < .01, d = 1.14; however, contrary to the hypothesis, those responding under standard instructions scored higher than those responding under fake-good
Table 1
Means, Standard Deviations, and Effect Sizes for Domain and Validity Scales of the Revised NEO Personality Inventory

                    Analog design                   Differential
                    Standard        Fake-good       prevalence
                    instructions    instructions    group           Cohen's d
NEO PI-R            M       SD      M       SD      M       SD      d1    d2    d3
Domain scales
  Neuroticism       58.93a  12.55   46.55b  15.88   46.66b  8.48    1.37  1.18  0.01
  Extraversion      58.67a  10.80   77.04b  8.73    66.96c  7.44    2.57  0.91  1.27
  Openness          67.06   10.40   67.42   10.18   65.26   11.08   0.06  0.17  0.21
  Agreeableness     42.08   13.19   36.18a  18.08   46.68b  11.07   0.48  0.39  0.72
  Conscientiousness 44.56   11.17   51.90   15.49   48.32   9.68    1.04  0.37  0.28
Validity scales
  PPM               23.34a  2.74    21.87b  2.47    21.36b  2.72    1.14  1.19  0.12
  NPM               7.57    3.10    7.39    5.04    7.18    3.02    0.40  0.13  0.21

Note. Values for the personality domains scores are standardized (T) scores (M = 50, SD = 10); values for the PPM and NPM are raw scores. Row means with different subscripts (shown here as trailing letters) are significantly different at minimum p < .05. NEO PI-R = Revised NEO Personality Inventory; d1 = analog/standard instructions sample vs. analog/fake-good instructions sample; d2 = analog/standard instructions sample vs. differential prevalence group; d3 = analog/fake-good instructions sample vs. differential prevalence group; PPM = Positive Presentation Management scale; NPM = Negative Presentation Management scale.
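
The effect sizes in Table 1 can be related to the t statistics reported in the text. As a sketch only: several of the reported pairs are reproduced by the common conversion d = 2t / sqrt(df), with df = n - 1 for the repeated (within-group) comparison and df = n1 + n2 - 2 for the between-groups comparisons. The article does not say how d was computed, so the use of this conversion is an assumption, and the function name `d_from_t` is hypothetical.

```python
import math


def d_from_t(t: float, df: int) -> float:
    """Approximate Cohen's d from a t statistic via d = 2t / sqrt(df).

    Assumption: this conversion is used for illustration only; the article
    does not state which formula produced its d values.
    """
    return 2.0 * t / math.sqrt(df)


# Degrees of freedom implied by the retained sample sizes (23 analog, 22 applicants).
n_analog, n_applicants = 23, 22
df_within = n_analog - 1                   # repeated (paired) comparison
df_between = n_analog + n_applicants - 2   # independent-groups comparison

# (label, t as reported, df, d as reported in the text)
reported = [
    ("Neuroticism, standard vs. fake-good", 3.21, df_within, 1.37),
    ("Extraversion, standard vs. fake-good", 6.02, df_within, 2.57),
    ("Conscientiousness, standard vs. fake-good", 2.43, df_within, 1.04),
    ("Extraversion, standard vs. applicants", 2.98, df_between, 0.91),
    ("Extraversion, fake-good vs. applicants", 4.16, df_between, 1.27),
]

for label, t, df, d_reported in reported:
    d = d_from_t(t, df)
    print(f"{label}: t({df}) = {t:.2f}, d = {d:.2f} (reported {d_reported:.2f})")
```

Other reported pairs need not match this conversion exactly: rounding of t, or a d computed directly from means and pooled standard deviations, can give somewhat different values.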
instructions. The PPM score was significantly higher for the analog research sample responding under standard instructions than for the differential prevalence group, t(43) = 2.44, p < .05, d = 1.19, a finding also contrary to what was hypothesized. The PPM scale scores did not differ significantly between the differential prevalence group sample and analog fake-good group. There were no significant differences between any of the groups for the NPM.

Discussion

The primary goals of this study were (a) to attempt to replicate results from previous (analog) studies that have used the traditional repeated measures analog design to investigate the influence of fake-good responding on the personality domain scores on the NEO PI-R and (b) to compare NEO PI-R personality scale scores of the research participants in the analog sample completed under standard and fake-good instructions with those completed by individuals in the differential prevalence group sample. Comparing the NEO PI-R domain scale scores between these pairs of groups begins, we believe, to address the issue of whether results from analog design samples are generalizable to differential prevalence group samples. This issue of generalizability is critical because most research directed at developing, refining, and validating "validity" scales relies almost exclusively on analog designs, and results from analog designs have significant implications for the effective and meaningful use of validity scales in applied contexts.

Consistent with results from previous analog design studies (Reid-Seiser & Fritzsche, 2001; Ross, Bailley, & Millis, 1997; Rosse, Stetcher, Miller, & Levin, 1998), and supporting one of our hypotheses, the results from the present study demonstrate that research participants decreased their Neuroticism scores and increased their Extraversion scores when instructed to fake-good when completing the NEO PI-R compared with when they completed this test under standard instructions (i.e., responding honestly). Also as hypothesized, the differential prevalence group sample scored significantly lower on Neuroticism and higher on Extraversion than did the analog sample responding under standard instructions. The analog research participants responding under fake-good instructions also scored higher on Conscientiousness than when responding under standard instructions, although this had not been hypothesized. There was no statistically significant difference between the analog research participants responding under standard instructions and the differential prevalence group in terms of their scores on the Conscientiousness scale, but the difference was in the predicted direction. Because studies examining fake-good responding that employ within-group designs typically produce larger effect size differences than between-groups designs (Baer & Miller, 2002), as was the finding in the current study, it is conceivable that a larger sample would have detected a between-groups effect.

Because the analog fake-good instruction group and the differential prevalence group differed similarly from the analog standard instruction group for three of the five personality domains (most with medium-to-large effect size differences), we believe there is some evidence to suggest that results from fake-good analog designs can be generalized to differential prevalence group designs. The pattern of these results points not only to the overall effectiveness of the experimental manipulation typically employed in analog studies but also to the overall similarity between the analog fake-good condition and the differential prevalence group sample.

At the same time, a number of unexpected differences emerged among the groups. Although both the analog research participants responding under the instructions to fake-good and the participants in the differential prevalence group scored significantly higher on the Extraversion scale than did the analog research participants responding under standard instruction, the Extraversion scores of the analog fake-good group also exceeded those of the differential prevalence group. The analog fake-good research participants also scored lower on Agreeableness compared with the differential prevalence sample, although there was no significant difference in Agreeableness scores between the analog research participants responding under standard and fake-good instructions.

These results might be attributed to the fact that the television show in question was already "on the air" at the time the analog group was tested. This temporal difference across the analog and differential prevalence group samples may have allowed the analog respondents to become more familiar with the kind of personalities most likely to be hired, compared with the bona fide applicants, who had no access to such information because the show had not been previously aired. In general, extraversion characterized most of the actors, and antagonism and competitiveness (i.e., low Agreeableness) among the participants were two of the most salient features of the show. Thus, the analog sample under instructions to respond in order to maximize their chances of getting on the show may have perceived excessive extraversion and low agreeableness to be highly desirable in this context.

No mean differences were detected with respect to the Openness-to-Experience dimension across any of the three groups. Previous studies have demonstrated inconsistent differences across standard and analog fake-good instructions with respect to Openness-to-Experience. For example, Ballenger et al. (2001) found that Openness-to-Experience did not distinguish analog fake-good research participants from honest respondents in a clinical sample. Conversely, Caldwell-Andrews et al. (2000) found that Openness-to-Experience scores did differ significantly across analog fake-good and honest conditions. One way to make meaning of these discrepant results, including those of the current investigation, is to consider that the scenarios provided in the experimental instructions or the perceived demands of the real-life assessment situations may elicit context-specific desirable traits. For example, in one situation, high scores on Openness-to-Experience might be seen as a particularly important characteristic, whereas in another context, low or high scores on Agreeableness might be seen as the most desirable. This interpretation echoes Rogers's (1997) caution that careful attention be paid to the specificity of instructions when designing dissimulation studies and generalizing results from them. Perhaps what can be said at this point is that in most contexts in which some form of fake-good responding can be expected, Neuroticism scores are likely to be decreased; Extraversion and, to a lesser extent, Conscientiousness scores are likely to be increased. Openness-to-Experience and Agreeableness domain scores are apt to be more variable and situation specific.

In this study the performance of the research validity scales developed by Schinka and colleagues (1997) to detect fake-good responding was much less than optimal. The failure to replicate the findings of prior work regarding the validity of these scales suggests the possibility that the experimental manipulation employed in the analog portion of the current study was unsuccessful. This interpretation, however, seems unlikely, as consistent group differences between the fake-good and standard instructions conditions and between the standard instructions and differential prevalence group samples emerged for the Neuroticism, Extraversion, and Conscientiousness domain scales. In a similarly designed, repeated measures analog study, Ballenger et al. (2001) also reported that the PPM was unable to distinguish NEO PI-R proto-

on Extraversion and Conscientiousness), were too similar. Moreover, this same pattern of personality scale alteration has been observed in previous analog studies.

We also recognize that the sample size was small and that this limitation not only compromised the generalizability of the results but also prohibited an analysis of the NEO PI-R facet scales. Because facet scales have proven to be excellent predictors of behavior and offer a more fine-grained personality profile than do domain scales (see, e.g., Reynolds & Clark, 2001), future studies
cols completed under fake-good versus standard instructions, al- with large samples examining facet scores would certainly prove
though the balance of the evidence suggests that the PPM can useful. Finally, as no study prior to the current investigation has
make such distinctions (see e.g., Caldwell-Andrews et al., 2000; directly compared differential prevalence group samples to analog
Schinka et al., 1997; Young & Schinka, 2001). samples instructed in a manner to mimic real-world scenarios,
As indicated earlier, one explanation for discrepant results future studies comparing analog samples with other types of as-
across different studies may be located in the instructions provided sessment scenarios are needed. Such efforts will begin to clarify
to participants in fake-good conditions, which may differentially the relation between results from analog samples and research
influence scale elevations on both the domain and validity scales designs and those from differential prevalence group samples, and
of the NEO PI R. There is evidence to suggest, for example, that perhaps even from known-group samples. Accumulation of results
the PPM is more highly correlated with measures of self-deceptive from such studies will, in time, address the important issue of the
enhancement than with impression management (e.g., Reid-Seiser external validity of the widely employed experimental approach to
& Fritzsche, 2001). Instructions (or assessment situations) that the study of test response dissimulation.
elicit overt attempts to engage in positive impression management, Notwithstanding the need for replication and extension, it is
as was the case in this study, may not produce strong effects for a worth emphasizing that the results from the current investigation
scale more sensitive to self-deception. More studies are needed to do suggest that outcomes from analog design studies are likely
explore these issues, and should future evidence emerge support- generalizable to real-world settings. Against this background, the
ing the need for NEO PI R validity scales, we believe that separate accumulated evidence from previous analog studies and from the
scales for these two types of fake-good responding should be current investigation, which indicate that test profiles are altered
developed. We also think that careful consideration should be under instructions to fake-good, has implications for the use of the
given to the development of any validity scale that is composed NEO PI R and other tests like it. In assessment contexts in which
exclusively of items that reside on personality domain scales, positive impression management is likely, such as personnel se-
because items designed to assess personality traits that have lection, the assessor needs to be cognizant that such instruments
proven construct validity are unlikely to provide unequivocal are susceptible to impression management response bias. Of
meaning with respect to response dissimulation, especially with course, this becomes especially problematic if the instruments used
scales composed of relatively few items. do not have scales that accurately assess the presence of response
Several limitations of the current study must be acknowledged. bias. One solution would be to use multiple tests or other evalu-
First, the fact that the analog sample was tested almost 10 months ation strategies in which one of the scales or methods assesses for
later than the differential prevalence group sample may have potential response bias. For example, the NEO PI R and the
contributed to some of the differences across these two groups. MMPI 2, the latter of which has scales that can detect positive
Another limitation is that the differential prevalence group sample impression response style, could be used in combination. The NEO
was only assessed on one occasion. Ideally, it would have been PI R would provide extensive information on a variety of person-
best to assess the individuals in this sample a second time, either ality trait attributes relevant to job performance, whereas the
before or after the selection process. Personality assessment results MMPI 2 would provide information about potential psychopathol-
outside of the context of the selection process for these applicants ogy and possible response bias. We believe that although the
would have permitted a more definitive conclusion regarding the general cost (in time and money) might be perceived as excessive,
generalizability of analog and differential prevalence group de- the potential benefits from such a comprehensive assessment rel-
signs and samples. Every effort was made, however, to match the ative to the costs tip the cost benefit ratio in favor of such
research participants in the analog sample demographically (i.e., extensive testing (see, e.g., Meyer et al., 2001).
age, sex, education) and vocationally (i.e., career choice and as-
pirations) to the individuals who composed the differential prev-
References
alence group sample. This was done with the expectation that
Baer, R. A., & Miller, J. (2002). Underreporting of psychopathology on the
closely matched groups would collectively produce similar per-
MMPI 2: A meta-analytic review. Psychological Assessment, 14, 16
sonality profiles. There is the possibility, nonetheless, that these
26.
two groups did have different  baseline personality profiles. We
Ballenger, J. F., Caldwell-Andrews, A., & Baer, R. A. (2001). Effects of
believe, however, that it is unlikely that this potential difference
positive impression management on the NEO Personality Inventory
could account for the outcomes of the current study, as the specific
Revised in a clinical population. Psychological Assessment, 13, 254
patterns of results obtained for the differential prevalence group
260.
sample and the analog group responding to fake-good instructions,
Bar-On, R. (1997). BarOn Emotional Quotient Inventory: A measure of
relative to the same analog research participants responding to
emotional intelligence. Toronto, Ontario, Canada: Multi-Health
standard (honest) instructions (i.e., lower on Neuroticism, higher Systems.
Ben-Porath, Y. S., & Waller, N. G. (1992). Normal personality inventories in clinical assessment: General requirements and the potential for using the NEO Personality Inventory. Psychological Assessment, 4, 14-19.
Berry, D. T. R., Bagby, R. M., Smerz, J., Rinaldo, J. C., Caldwell-Andrews, A., & Baer, R. A. (2001). Effectiveness of NEO PI-R research validity scales for discriminating analog malingering and genuine psychopathology. Journal of Personality Assessment, 76, 496-516.
Butcher, J. N., Graham, J. R., Ben-Porath, Y. S., Tellegen, A., Dahlstrom, W. G., & Kaemmer, B. (2001). Minnesota Multiphasic Personality Inventory-2: Manual for the administration, scoring, and interpretation. Minneapolis: University of Minnesota Press.
Butcher, J. N., & Rouse, S. V. (1996). Personality: Individual differences and clinical assessment. Annual Review of Psychology, 47, 87-111.
Caldwell-Andrews, A., Baer, R. A., & Berry, D. T. R. (2000). Effects of response sets on NEO PI-R scores and their relations to external criteria. Journal of Personality Assessment, 74, 472-488.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). New York: Academic Press.
Costa, P. T., & McCrae, R. R. (1992a). Normal personality assessment in clinical practice: The NEO Personality Inventory. Psychological Assessment, 4, 5-13.
Costa, P. T., & McCrae, R. R. (1992b). "Normal personality inventories in clinical assessment: General requirements and the potential for using the NEO Personality Inventory": Reply. Psychological Assessment, 4, 20-22.
Costa, P. T., & McCrae, R. R. (1992c). Professional manual for the NEO Personality Inventory (NEO PI-R) and NEO Five Factor Inventory (NEO-FFI). Odessa, FL: Psychological Assessment Resources.
Costa, P. T., & McCrae, R. R. (1997). Stability and change in personality assessment: The NEO Personality Inventory in the year 2000. Journal of Personality Assessment, 68, 86-94.
Edwards, A. L. (1953). The relationship between the judged desirability of a trait and the probability that the trait will be endorsed. Journal of Applied Psychology, 37, 90-93.
Hogan, R., & Nicholson, R. A. (1988). The meaning of personality test scores. American Psychologist, 43, 621-626.
McCrae, R. R., Costa, P. T., Dahlstrom, W. G., Barefoot, J. C., Siegler, I. C., & Williams, R. B. (1989). A caution on the use of the MMPI K-correction in research on psychosomatic medicine. Psychosomatic Medicine, 51, 58-65.
Meyer, G. J., Finn, S. E., Eyde, L. D., Kay, G. G., Moreland, K. L., Dies, R. R., et al. (2001). Psychological testing and psychological assessment: A review of evidence and issues. American Psychologist, 56, 128-165.
Morey, L. C. (1991). Personality Assessment Inventory: Professional manual. Odessa, FL: Psychological Assessment Resources.
Piedmont, R. L., McCrae, R. R., Riemann, R., & Angleitner, A. (2000). On the invalidity of validity scales: Evidence from self-reports and observer ratings in volunteer samples. Journal of Personality and Social Psychology, 78, 582-593.
Reid-Seiser, H. L., & Fritzsche, B. A. (2001). The usefulness of the NEO PI-R Positive Presentation Management Scale for detecting response distortion in employment contexts. Personality and Individual Differences, 31, 639-650.
Reynolds, S. K., & Clark, L. A. (2001). Predicting dimensions of personality disorder from domains and facets of the five-factor model. Journal of Personality, 69, 199-222.
Rogers, R. (1997). Clinical assessment of malingering and deception (2nd ed.). New York: Guilford Press.
Ross, S. R., Bailley, S. E., & Millis, S. R. (1997). Positive self-presentation effects and the detection of defensiveness on the NEO PI-R. Assessment, 4, 395-408.
Rosse, J. G., Stetcher, M. D., Miller, J. L., & Levin, R. A. (1998). The impact of response distortion on preemployment personality testing and hiring decisions. Journal of Applied Psychology, 83, 634-644.
Schinka, J. A., Kinder, B. N., & Kremer, T. (1997). Research validity scales for the NEO PI-R: Development and initial validation. Journal of Personality Assessment, 68, 127-138.
Schnell, E. R., & Hammer, A. (1993). Introduction to the FIRO-B in organizations. Palo Alto, CA: Consulting Psychologists Press.
Topping, G. D., & O'Gorman, J. G. (1997). Effects of faking set on validity of the NEO-FFI. Personality and Individual Differences, 23, 117-124.
Young, M. S., & Schinka, J. A. (2001). Research validity scales for the NEO PI-R: Additional evidence for reliability and validity. Journal of Personality Assessment, 76, 412-420.

Received November 4, 2002
Revision received March 25, 2003
Accepted April 2, 2003

