Journal of Experimental Psychology:
Human Perception and Performance
1980, Vol. 6, No. 2, 244-264

Emergent Two-Dimensional Patterns
in Images Rotated in Depth
Steven Pinker
Harvard University

Ronald A. Finke
Massachusetts Institute of Technology

Once a person has observed a three-dimensional scene, how accurately can
he or she then imagine the appearance of that scene from different viewing
angles? In a series of experiments addressed to this question, subjects
formed mental images of a set of objects hanging in a clear cylinder and
mentally rotated their images as they physically rotated the cylinder by
various amounts. They were asked to perform four tasks, each demanding
the ability to "see" the two-dimensional patterns that should emerge in their
images if the images depicted the new perspective view accurately—(a)
Subjects described the two-dimensional geometric shape that the imagined
objects formed in an image rotated 90°; (b) they "scanned" horizontally
from one imagined object to another in a rotated image; (c) they physically
rotated the empty cylinder together with their image until two of the
imagined objects were vertically aligned; and (d) they adjusted a marker to
line up with a single object in a rotated image. The experimental results
converged to suggest that subjects' images accurately displayed the
two-dimensional patterns emerging from a rotation in depth. However, the
amount by which they rotated their image differed systematically from the
amount specified by the experimenter. Results are discussed in the context
of a model of the mental representation of physical space that incorporates
two types of structures, one representing the three-dimensional layout of a
scene, and the other representing the two-dimensional perspective view of
the scene from a given vantage point.

The mental representation of three-dimensional space and its relation to
visual imagery is a relatively unexplored area of cognitive psychology.
There is reason to believe that information about the three-dimensional
layout of a scene is preserved in some way in visual images of the scene.
For example, Shepard and Metzler (1971) showed that when people are asked
to decide whether or not two line drawings depict three-dimensional objects
having the same shape, their decision time increases linearly with the
angle through which one object has to be rotated in depth to bring it into
correspondence with the other. Pinker and Kosslyn (1978) and Pinker (in
press) later found that the time subjects take to scan the space between
two imagined objects increases linearly with the distance between the
objects in three dimensions. In addition, Attneave and his colleagues
(Attneave, 1972; Attneave & Farrar, 1977; Attneave & Pierce, 1978) found
that subjects are extremely accurate in mentally extrapolating a visible
straight line segment behind their heads. Such findings have led some
theorists to propose that scenes are represented internally in a
three-dimensional "space," in which an object is represented by a
"filled-in" region of the space, in the same way that a scale model is
isomorphic in shape to the object it represents (Attneave, 1972, 1974;
Metzler & Shepard, 1974; Pinker & Kosslyn, 1978).

This research was supported by National Science Foundation Grant
BNS-77-21782 awarded to S. Kosslyn; the first author was supported by a
Natural Science and Engineering Research Council Canada Postgraduate
Scholarship.
We are grateful to Nancy Etcoff, Reid Hastie, Frank Restle, and an
anonymous reviewer for constructive comments on the manuscript; to Stephen
Kosslyn and Mary Potter for advice on the design and interpretation of the
experiments; and to David Birdsong and Fabio Idrobo for assistance.
Requests for reprints should be sent to either Steven Pinker, who is now at
the Center for Cognitive Science 20D-105, Massachusetts Institute of
Technology, Cambridge, Massachusetts 02139, or to Ronald Finke, who is now
at the Department of Psychology, Uris Hall, Cornell University, Ithaca,
New York 14850.

Copyright 1980 by the American Psychological Association, Inc.
0096-1523/80/0602-0244$00.75

Other recent studies, however, suggest that two-dimensional projective
properties, such as the alignment or concealment of objects at different
depths, the variations of apparent size with distance, and the variations
of apparent shape with orientation, may also be represented in images.
First, if people must mentally rotate two objects into correspondence to
verify that they have the same three-dimensional shape, the objects must
have been encoded in a form specific to their original viewing perspective;
otherwise the representations could have been matched against one another
directly, without the need to perform mental rotations (as Metzler &
Shepard, 1974, note). Second, Kosslyn (1978) has found that the linear
relation in vision between an object's distance and projective size also
holds for imaged objects: the larger an object, the farther away it must be
imagined to subtend a constant "visual angle." Third, several experiments
have shown that people are likely to remember the details of an imagined
scene that were "visible" from their imagined "vantage point," and to
forget those details that were "concealed" or "out of view" (Abelson, 1976;
Fiske, Taylor, Etcoff, & Laufer, in press; Keenan & Moore, 1979). Finally,
Pinker (in press) found that when subjects are asked to scan from one
imagined object to another in a three-dimensional scene by "sighting" the
objects through an imaginary rifle sight, the subjects' scan times mirror
the apparent two-dimensional separations of the objects in the frontal
plane. This suggests that images preserve interpoint distances in the
planar projection specific to a vantage point.

Pinker (in press) has pointed out that a "three-dimensional scale model"
theory cannot easily explain the existence of these perspective properties
of images, since perspective properties arise only when a three-dimensional
scene is projected onto a two-dimensional surface. If the representational
medium underlying imagery is simply a three-dimensional coordinate system,
there is no obvious reason why only the "visible surfaces" of imaged
objects should be accessible at one time, nor why the appearance of an
imagined scene should change systematically with the angle and distance of
a "vantage point."¹ Rather, in a three-dimensional system, we might expect
that the various processes that inspect or "read" images could be applied
to any three-dimensional region of the coordinate system, accessing
information from any or all directions at once.

¹ This is barring the unlikely possibility that the representational system
underlying imagery has the functional equivalent of reflected light rays,
which are projected onto the "retina" of the mind's eye!

In fact, a three-dimensional representation (together with the processes
that act upon it) would seem to describe the haptic sense more than the
visual sense. When a three-dimensional environment is explored by touch,
distant objects do not appear "smaller," nor are the backs of objects
"occluded," nor do the shapes of objects seem to change with relative
orientation. Likewise, we would not expect to find these perspective
properties when an internal three-dimensional "environment" is inspected;
but nevertheless, they seem to exist.

Alternatively, perhaps people store two different traces of a visual scene:
a three-dimensional structure representing its spatial layout and a
two-dimensional structure (perhaps a copy of the retinal image), in which
the original perspective properties are preserved. However, this model is
also inadequate. Pinker (in press) also had subjects study a scene and
imagine that the scene was rotated so that they were looking at it from
above or from
the side. As before, they scanned across the resulting image by "sighting"
objects through an imaginary rifle sight. The time required to scan from
one imagined object to another increased linearly with increasing distance
in two dimensions between the objects as they would appear in the new
perspective view. Since the subjects had never seen the display from these
vantage points, they could not merely have scanned a trace of their retinal
image, but must have constructed images depicting the appearance of the
objects from the new viewing angles, using their knowledge about the
three-dimensional structure of the scene.

This result would have pleased Hermann von Helmholtz, who in 1894
conjectured that "without it being necessary, or even possible to describe
[an object] in words . . . we can clearly imagine all the perspective
images which we may expect upon viewing from this or that side" (Warren &
Warren, 1968, pp. 252-254). In the present investigation, we put this
conjecture to further experimental test. In particular, we wished to
determine how accurately images can represent the two-dimensional
perspective properties that emerge when a scene is rotated with respect to
a vantage point.

Existing evidence on this issue is equivocal. Consider, for example, the
"three-mountain problem" (Huttenlocher & Presson, 1973; Piaget & Inhelder,
1956). In this task, subjects observe a three-dimensional display
consisting of three mountains and must select from a group of pictures the
one that represents the appearance of the display from the side. That
adults can solve this problem, however, merely shows that they know how to
substitute one linear ordering for another; in this case, a front-to-back
ordering becomes a left-to-right ordering. Although Pinker's (in press)
scanning experiments seem to indicate that metric, and not just ordinal,
information is present in images resulting from a shift in perspective,
they only examined the effects of 90° shifts. Hence, subjects in these
experiments could have transformed front-back distances to left-right
distances, to represent the effects of a shift to the side; or front-back
distances to top-bottom distances, to represent the effects of a shift to a
bird's eye view. However, if people can also accurately imagine changes
that result from intermediate rotations or perspective shifts, the process
of transforming an image would have to be more complex than the
dimension-relabeling procedure suggested earlier.

In the present investigation, the visual "scene" consisted of four small
objects suspended in different positions inside a clear upright cylinder,
which could be rotated about its vertical axis. After the subject studied
the display, the objects were removed, and the subject was asked to imagine
each object in its former position inside the cylinder. Then the subject
was asked to rotate his or her image of the suspended objects by various
amounts and to perform several tasks that require "seeing" the
two-dimensional spatial properties and relations that should emerge if the
image was rotated accurately. These tasks consisted of (a) naming the
two-dimensional geometric figure that the imagined objects formed when
"viewed" mentally from a new angle, (b) scanning mentally in a horizontal
direction from one object to another in the image of the rotated scene,
(c) rotating their image until two of the imagined objects were vertically
aligned, and (d) judging the horizontal displacement of a single object
that was imagined to have revolved around the axis of the cylinder by
various amounts.²

² To control, or in some cases, measure, the angle of rotation of the
image, we had subjects physically rotate the empty cylinder by the same
amount as they mentally rotated its contents. Thus, the mental rotation
process we studied is not identical in all respects to the one Shepard and
Metzler studied, since in our experiments the subjects could align their
image with a frame of reference at the same time that they shifted portions
of their image by predetermined amounts.

Experiment 1

It is often suggested that a mental image can serve as a surrogate percept,
allowing
people to detect some pattern or property
in a remembered scene that they did not
encode explicitly when they saw the scene
initially. However, attempts to demonstrate
this ability experimentally have been disappointing (see Reed, 1974; Reed &
Johnsen, 1975). In the first experiment, we
hoped to demonstrate that people can
encode a three-dimensional scene, mentally
rotate it in depth, and "see" in their new
image a two-dimensional shape that they
did not notice in the original display.
Subjects studied a configuration of objects
which, unknown to them, had been arranged so that the configuration defined
a particular two-dimensional shape when
viewed from the front and a different shape
when viewed from the side. After mentally
rotating the configuration, the subjects
were asked to identify the second, emergent
figure. To rule out alternative explanations
for performance in this task, we compared
the shapes subjects named (a) with those
they used to describe the two-dimensional
figure when imagining the configuration
from the front, (b) with those named by a
second group of subjects who actually saw
the figure from the side, and (c) with those
named by a third group of subjects who
never saw the display.
Method
Subjects
Eighteen students and employees of Harvard
University were paid to participate in an "imagery"
condition. Two could not carry out the task instructions and did not complete the experiment. Sixteen
additional members of the Harvard community
participated in a "perception" condition. Another
72 participated in a "control" condition by filling
out a brief questionnaire.


Apparatus
A clear Plexiglas cylinder, 27 cm high × 20 cm
in diameter, was mounted on a turntable on a
wooden platform and could be rotated 360° about
its vertical axis. The front of the cylinder was
marked by a thin black line running vertically down
the length of the cylinder; the rear was marked by
a line running down the top and bottom thirds of its
length. An angular scale was located on the bottom
of the rim of the turntable and was visible to the
experimenter in a small mirror mounted under the
rim. The platform rested on a table 35 cm away
from a chinrest. The chinrest was adjusted so that
when the subject was seated at the table, he or she
would be gazing into the center of the cylinder.
Four small plastic animal toys were suspended
by clear nylon thread from brass rods (30 cm × .4
cm × .4 cm) that lay across the top of the cylinder.
The animals, between 2 and 4 cm long, consisted of
a red bug, a black bear, a yellow fish, and a green
frog. An animal could be positioned anywhere in the
cylinder by sliding its rod to a given location and by
winding or unwinding the thread around the rod
to raise or lower the animal. At the start of the
experiment, the animals were positioned so that
when viewed from the front, the two-dimensional
shape they defined (i.e., the plane figure whose
vertices corresponded to the animals) could be
described either as a triangle or a quadrilateral, as
three of the animals were roughly collinear. However,
when seen from the left side, the shape they
defined corresponded to a tilted parallelogram (see
Figure 1). The experimenter positioned the animals
by placing over the cylinder a clear plate that was
marked with the correct positions of the rods in the
horizontal plane and by inserting into the cylinder
a "dipstick" on which the correct heights of the
animals were indicated.

Figure 1. The shape formed by the objects when the
display is rotated 90° counterclockwise.

Procedure
Imagery condition. The experimenter, reading
from a script, told the subjects that they were
participating in a study on visual imagery and
visual memory, and asked them to sit at the table
with chin in chinrest and to study the display,
trying to form an accurate mental image of it. After
several minutes the subject was asked to study one
particular object. Then that object was lowered to
the bottom of the cylinder by allowing the thread


STEVEN PINKER AND RONALD A. FINKE

u.

j:

6

1
s

J

~*

0-1

CM

vO O

So

~

Sc00 Ov

•o

1
a

,_Sj

oo


oo

O fo'
*~~*

'

10 o

3

.8
tj
.^

O

a

fti

rt

3s

•^
O 0

O 0


1
1

"So

33

$CJ

r\
q

1
*ã *

"ãKĐ
^

tU

a&

J3

VI

0!

e
fe

o

3

"s
Ê

3

a
a
Ui

q

f;

I
*ôãằ
^

i
5s
^
q

^.

^


oo"
(^5

o ^,
^.

V)

--^ir>

cs

*^ CN

o ^o

o

10 (S

.JO.
CN 1O

2^<.
t^ằ Ov

CS
O vo
'^-^


3"^

1t

-* 00

v_^
ui sằ^


"2
ãM

CN 00

~

X5 00

ãato
3

^-"^^
^ PO

^s
*~~f *ã'

S3


V_^ S_^

fS CO

(N

a

flj
J=
-M

Ê

Ê.

.1
Ê

So

00.0

5SIS

^t<

to


O rh

1

(N

.2
"O

1

Đ

*^>
,,
ãS

ão
.S
c

a

/*t

^Vote. Percentages ô

Đ
g


0 JO

o
10
o^

Imagery"
Before rotatio
After rotation
Perception"
Before rotatio
After rotation
Controlb
First choice
Second choice

(5
1
ã2

*o *^

G-SCN

Condition

to unwind from the rod, and the rod-thread-object
assembly was handed to the subject. The subject
was asked to try to reposition the object in its
exact former location in the cylinder. This procedure

was repeated until he or she could reposition it to
within 3 mm of its correct location in the horizontal
plane and to within 4 mm of its vertical position.
Similarly, the subject attempted to reposition each
of the other three objects. Then the experimenter
randomly repositioned the four objects within the
cylinder, and the subject was required to reposition
all four. As before, the procedure was repeated until
the subject met the accuracy criterion for all four
objects.
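
The accuracy criterion described above can be sketched as a simple check. This is only an illustration: the coordinate convention (x and z spanning the horizontal plane, y as height, all in millimetres), the function names, and the reading of the 3-mm tolerance as a straight-line distance in the horizontal plane are assumptions, not details given in the procedure.

```python
import math

# Tolerances stated in the procedure, in millimetres.
HORIZONTAL_TOL_MM = 3.0
VERTICAL_TOL_MM = 4.0


def meets_criterion(placed, target):
    """Return True if one placement is within tolerance of its target.

    Points are (x, y, z) tuples in mm. Treating x and z as the horizontal
    plane and y as height is an assumed convention.
    """
    dx = placed[0] - target[0]
    dy = placed[1] - target[1]
    dz = placed[2] - target[2]
    horizontal_error = math.hypot(dx, dz)  # distance in the horizontal plane
    vertical_error = abs(dy)
    return (horizontal_error <= HORIZONTAL_TOL_MM
            and vertical_error <= VERTICAL_TOL_MM)


def all_objects_meet_criterion(placements, targets):
    """Check every object; both arguments map object name -> (x, y, z)."""
    return all(meets_criterion(placements[name], targets[name])
               for name in targets)
```

Under this reading, repositioning would repeat until `all_objects_meet_criterion` returns True for all four animals.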
The subject was then given a final opportunity
to study the display, and the objects were removed
from the cylinder. The experimenter then asked,
"Did you mentally arrange the objects in any
particular shape or configuration in order to remember their positions?" After giving an answer,
the subject was told to imagine the objects suspended
in their correct locations and to slide in the chair
away from the table, keeping his or her chin at the
same height as it was in the chinrest, until all four
objects were "visible in a single glance." The
experimenter then told the subject, "I would like
you to tell me whether the objects form some
specific shape in two dimensions. That is, if you
were now looking at a photograph of the display,
what shape would you see the objects form in that
photograph?" If the subject claimed to have a
particular shape in mind but to have forgotten
its name, the experimenter recited brief definitions
of the following shapes: triangle, quadrilateral,
trapezoid, parallelogram, rectangle, rhombus, and
square (in that order). If the subject described the
figure instead of naming it (e.g., "a four-sided
figure with two sides parallel and two sides not
parallel"), the response was recorded as the most
appropriate geometric shape of those listed earlier.
The subject was asked once again to rest in the
chinrest and to imagine the objects in their appropriate locations. The experimenter then asked the
subject to rotate the cylinder slowly in a counterclockwise direction and to maintain an image of the
objects rotating exactly with the cylinder. When
the cylinder had been rotated 90°, the subject was
told to stop and was allowed to "consolidate" the
new, transformed image of the objects. As before,
the subject was asked to sit back from the table
until all four objects were "visible at a glance" and
to describe the two-dimensional shape formed by
the imagined objects. The subject was then asked,
"Did you notice the shape that you just described
at any time before you rotated the cylinder?"
After giving an answer, the subject was handed a
piece of paper on which a rectangle had been
printed and was asked to indicate the positions of
the animals in his or her new, rotated image by
drawing small circles inside the rectangle, labeling
each with the appropriate animal name. When
asking each of these questions, the experimenter
looked away from the subject and said nothing
until the subject had given a decisive answer, which
was then recorded. No feedback was given as to the
"correctness" of any of the responses.

Table 1. Frequency of Cho[...] [table body not recovered; row labels: Imagery (before rotation, after rotation), Perception (before rotation, after rotation), Control (first choice, second choice)]

Perceptual condition. Subjects were told that
they were participating in an experiment on visual
perception and were asked to seat themselves at the
table, placing their chins in the chinrest. Like the
subjects in the imagery condition, these subjects
were asked to slide back from the table until all
four objects were visible at a glance and to describe
the two-dimensional shape formed by the objects,
which remained in the cylinder. Similarly, they were
asked to rotate the cylinder 90° and to describe the
two-dimensional shape formed by the objects in
front of them.
Control condition. Subjects were asked to fill out
a questionnaire that contained the following single
item: "An experimenter suspends four small objects
in a room in such a way that they form a geometric
shape when viewed from a particular angle. Knowing
only this, what shape do you think the experimenter
has chosen?" On half of these questionnaires,
subjects were asked to write the names of the first
and second most likely shapes in two blank spaces;
on the other half, they were asked to write the
numerals 1 and 2 next to 2 of 13 shapes that were
printed on the page in alphabetical order.

Results and Discussion
The frequencies of shapes named in the
different conditions are listed in Table 1.
Among the imagery subjects, the most
frequent shape used to describe the display
from the front was "quadrilateral;" the
second most frequent choice was "triangle."
However, these responses are of little
interest in themselves as they were recorded
primarily to rule out the possibility that
whatever shapes the subjects "see" in
their rotated images may have been
"seen" anyway in the image before it was
rotated. Such was not the case; subjects'
descriptions of the shape formed by the
objects when imagined from the side did
differ from these baseline descriptions. As
predicted, the most frequent shape selected
to describe the pattern was "parallelogram"; the second most frequent choice
was "trapezoid." A McNemar test (Siegel,
1956) can be used to test the hypothesis
that the increase in parallelogram responses
following the mental rotation was not due
to random fluctuations of response frequencies. If the mental rotation had no
systematic effect on what subjects reported
"seeing," one would expect that, among
subjects who changed their minds after
rotating the image, as many would change
their responses away from parallelogram
as would change their responses toward
parallelogram. That did not happen. Following rotation, no subject changed his or her
response away from parallelogram (since
no subject made that guess originally), but
seven subjects changed their responses to
parallelogram, a statistically significant
difference, χ²(1) = 5.14, p < .05.
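
The reported statistic can be checked with a short calculation. The sketch below assumes the continuity-corrected form of the McNemar statistic, (|b - c| - 1)² / (b + c), which reproduces the reported value for 7 changes toward parallelogram and 0 changes away from it; the function names are illustrative, and the source does not state which form of the test was computed.

```python
import math


def mcnemar_chi_square(b, c, continuity_correction=True):
    """McNemar chi-square (1 df) from the two discordant counts b and c."""
    if b + c == 0:
        raise ValueError("no discordant pairs")
    diff = abs(b - c)
    if continuity_correction:
        diff = max(diff - 1, 0)
    return diff ** 2 / (b + c)


def chi_square_p_value_1df(statistic):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(statistic / 2.0))


# 7 subjects changed toward "parallelogram", 0 changed away from it.
stat = mcnemar_chi_square(7, 0)
p = chi_square_p_value_1df(stat)
print(f"chi2(1) = {stat:.2f}, p = {p:.3f}")
```

With these counts the statistic is 36/7, which rounds to 5.14, and the associated probability falls below .05, in agreement with the text.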
Before claiming that people can detect
a two-dimensional pattern in a transformed
image, however, we must consider the
following two questions: (a) Are there
other explanations for parallelogram being
the subjects' most frequent choice? and
(b) Why did so many subjects not see a
parallelogram in their image?
As to the first question, there seems to be
no reason to suspect that subjects somehow
knew that the objects would form a
parallelogram from the side before they
imagined the rotation. As mentioned,
subjects never described the shape as seen
from the front as a parallelogram; and when
asked, no subject claimed to have mentally
arranged the objects into any shape even
suggesting a parallelogram. Furthermore,
after identifying the shape in the rotated
image, all subjects denied having noticed
or seen the named shape before the rotation.
Is it possible, then, that subjects claimed
to have "seen" a parallelogram simply
because they deduced that the experimenter
was likely to have chosen it as a "hidden"
figure (perhaps because it was a symmetrical, common, or "good" figure)?
The responses of subjects in the "control"
condition, who did not participate in the
experiment but guessed which shape an
experimenter would have chosen under
such circumstances, indicate that this is
not the case. By far, the most frequent
choice was "square," followed by "diamond." Conceivably, though, the subjects
might have offered their guess as to the
second most likely shape to have been
chosen if they thought that the most
likely shape had already been reserved for
the objects' shape from the front or if they
wanted to avoid the most "obvious" response. Again, the control subjects' choices
belie this interpretation. The most frequent
second choice was "rectangle," again
followed by diamond.

Figure 2. The shapes depicted in subjects' drawings of the imagined display rotated 90° counterclockwise. (Numerals next to the vertices indicate the particular object that the subject drew at
that location: 1 = "frog," 2 = "bear," 3 = "bug," 4 = "fish.")

If two-dimensional patterns really do
emerge when an imaged display is rotated,
why did only 7 of the 16 subjects claim to
have seen a parallelogram? One possibility
is that the other 9 subjects were simply
unable to rotate their image properly and
either placed the objects in their images at
random positions and described the shape
that they formed, or simply guessed a shape
at random. Our results, however, do not
seem to support this interpretation. First,
the most frequent choice of the subjects,
after parallelogram, was trapezoid, which,
of course, differs from a parallelogram in
that one pair of sides is not parallel. This
selection occurred despite the fact that
trapezoid was a relatively infrequent choice
among the control subjects. Second, the
subjects' drawings of the display as
imagined from the side suggest that the
particular trapezoids or quadrilaterals that
these subjects claimed to have seen in
their images were in fact approximations
or simple distortions of the parallelogram
that they should have seen, had their images
been perfectly accurate (see Figures 1
and 2). Note that the subjects' verbal
descriptions and drawings complement each
other nicely. The verbal descriptions suggest that the subjects "saw" shapes within
certain categories, but they do not tell us
which shapes within a category (out of the
infinity of possibilities) were seen. The
drawings, on the other hand, might be
contaminated by various aspects of the
subjects' drawing abilities, but they do
suggest that the rough orientation and
base-height ratio of the target figure were
available to the subjects. Taken together,
then, the two sorts of data suggest that
subjects' images depicted figures closely
approximating the target parallelogram.
There are four reasons why some of the
subjects might have seen a shape closely
related to the parallelogram in their
images, rather than the parallelogram
itself. First, subjects' image-encoding processes may have placed objects at slightly
incorrect positions in the initial image.
This possibility is supported by the finding
that subjects in the perception condition
were more likely than subjects in the
imagery condition to call the first shape a
triangle (as opposed to a quadrilateral or
other shapes), χ²(2) = 10.22, p < .01, suggesting that subjects' mental images failed
to preserve the collinearity of three of the
animals in the initial display. Second, the
image transformation process governing
the rotation may have introduced additional noise into the image, explaining why
subjects who saw the rotated display were
slightly more likely to describe it as a
parallelogram than those who imagined
the display (although the difference in
overall choice frequencies was not significant, χ²(3) = 6.00, p > .10). Third, the
images may have been accurate, but
subjects in the imagery and perception
conditions may have differed in their ability
to detect the defining geometric features
for a parallelogram, or in their willingness
to call the figure a parallelogram if unsure
whether the defining features were present.
(In fact, there was considerable disagreement over the correct shape label even
among subjects in the perception condition.) Finally, subjects' images may have
been perfectly accurate, but they may
have rotated their images through a
different angle than the one defined by
the experimenter, a possibility we explore
in more depth later in the article. Whichever of these interpretations is correct, all
are consistent with our claim that people
can "see" the two-dimensional shapes (or
approximations to them) that emerge from
rotations of images in depth.

Using a different dependent variable
in this experiment, we have replicated
Pinker's demonstration that people can
imagine a three-dimensional scene as it
would appear from a new viewing angle.
It seems that the image encoding and
reconstruction processes do introduce distortions, but that these distortions are
sufficiently small that subjects see much
the same shapes in their new image as they
would if they were actually seeing the
display from that angle. Furthermore, the
subjects seem to have recognized these
shapes only after having imagined the
display in the rotated position. The findings
of this experiment therefore demonstrate
that people can detect, in a mental image
of a rotated scene, a pattern that was not
encoded explicitly during the original viewing of that scene.

Experiment 2

This experiment is designed to extend
Pinker's finding that people can scan
across a rotated three-dimensional image,
requiring proportionally more time to
scan longer two-dimensional interobject
distances in the transformed image. In the
present experiment, however, subjects did
not rotate the image through 90°, but
through two angles that were chosen
arbitrarily. If people can visualize the
two-dimensional patterns created by arbitrary amounts of rotation, we could rule
out the hypothesis that people employ
some special strategy that allows them
only to visualize the results of 90° rotations
(such as recalling the front-to-back component of the distance between two objects
in the original view, and using it to place
the objects the appropriate distance apart
from left-to-right in the rotated view).

Figure 3. Mean response times for scanning between imagined objects separated by different two-dimensional horizontal distances, following a 120° clockwise mental rotation of the display. (Horizontal axis: Distance Between Objects (cm); vertical axis ticks run from 1000 to 2500. Plotted object pairs: Bear-Fish, Fish-Frog, Bug-Bear, Bug-Frog, Bear-Frog, Bug-Fish. Fitted line: y = 124x + 1019, r = .94.)

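
The contrast between a 90° rotation and an arbitrary rotation can be made concrete with a small geometric sketch. It assumes an orthographic (parallel) projection onto the frontal plane and a particular sign convention for the rotation; the object coordinates are hypothetical and are not the positions used in the experiment.

```python
import math


def rotate_about_vertical(point, angle_deg):
    """Rotate an (x, y, z) point about the vertical (y) axis.

    x is left-right, y is height, z is front-back. The direction taken
    as positive is an assumed convention.
    """
    x, y, z = point
    a = math.radians(angle_deg)
    x_new = x * math.cos(a) + z * math.sin(a)
    z_new = -x * math.sin(a) + z * math.cos(a)
    return (x_new, y, z_new)


def horizontal_separation(p, q, angle_deg):
    """Left-right distance between two points in the rotated frontal view."""
    px, _, _ = rotate_about_vertical(p, angle_deg)
    qx, _, _ = rotate_about_vertical(q, angle_deg)
    return abs(px - qx)


# Hypothetical object positions in cm, relative to the cylinder's axis.
bug = (3.0, 10.0, -2.0)
frog = (-4.0, 15.0, 5.0)

for angle in (90, 120, 33):
    print(angle, round(horizontal_separation(bug, frog, angle), 2))
```

At 90° the new left-right separation is simply the old front-back separation, which is what a relabeling strategy would deliver. At 120° or 33° the separation is a weighted mixture of the two original components, so it cannot be obtained by swapping one dimension for another.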
Method
Subjects
Twenty members of the Harvard community
were paid to participate. Sixteen of these subjects
also participated in Experiment 1, performing the
present task immediately afterward in the same
experimental session.

Apparatus
The display was identical to that used in Experiment 1. In addition, telegraph keys labeled yes and
no were placed to the sides of the display within
reach of the subject, with the yes key on the side of
the subject's dominant hand.
A series of pairs of animal names was recorded
on tape, with approximately 4 sec separating each
name. These pairs were divided into nine blocks of
nine pairs each. Each block contained one instance
of each of the six possible pairings of animals in the
display (the yes pairs), plus three pairs (the no
pairs), each consisting of an animal in the display
followed by an animal not in the display ("pig,"
"cow," "dog," or "mouse"). Unknown to the
subject, the first block was intended for practice
only, and data from these trials were not analyzed.
The remaining blocks were coupled so that each of
the yes pairs occurred in one order in the first
block and in the other order in the second block.
As well, over the last eight blocks, each of the eight
animals occurred equally often as a member of a
no pair. The pairs were ordered so that no animal
was mentioned in more than two consecutive trials,
and so that there would never be more than three
consecutive yes trials or no trials.
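
The two ordering constraints just described can be checked mechanically. The following is a minimal sketch, not the authors' procedure: the trial representation (a tuple of first animal, second animal, and "yes" or "no") and the function names are assumptions introduced for illustration.

```python
from itertools import groupby


def longest_run(values):
    """Length of the longest run of identical consecutive values."""
    return max((len(list(group)) for _, group in groupby(values)), default=0)


def animal_repeats_too_long(trials, max_consecutive=2):
    """True if any animal is named in more than `max_consecutive` trials in a row."""
    window = max_consecutive + 1
    for start in range(len(trials) - window + 1):
        # Animals common to every trial in this window of consecutive trials.
        shared = set(trials[start][:2])
        for first, second, _ in trials[start + 1:start + window]:
            shared &= {first, second}
        if shared:
            return True
    return False


def order_is_acceptable(trials):
    """Check a list of (first_animal, second_animal, 'yes' | 'no') trials.

    Constraint 1: never more than three consecutive yes trials or no trials.
    Constraint 2: no animal mentioned in more than two consecutive trials.
    """
    kinds = [kind for _, _, kind in trials]
    return longest_run(kinds) <= 3 and not animal_repeats_too_long(trials)
```

A candidate block order would be passed to `order_is_acceptable` and reshuffled until it returns `True`; the text does not say how the recorded orders were actually produced.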
The tape was played back on a two-channel
relay-controlled tape recorder. The first member of
each pair was recorded on one track, which was
channeled through a speaker. The second member
of each pair was recorded on both tracks and
triggered a voice-activated relay as it was reproduced. The relay triggered a millisecond clock and,
after a delay of .5 sec, stopped the tape recorder.
When either of the keys was pressed, the clock
stopped and the tape recorder restarted, ensuring
a constant 4-sec intertrial interval regardless of the
subject's response time.

Procedure
Of the 20 subjects, 16 had already learned the
positions of the objects by participating in Experiment 1; the remaining 4 subjects underwent a study
phase that was identical in all respects to that of
Experiment 1. With the empty cylinder at its
original position, the subjects were asked to imagine
the objects suspended in their exact positions and

to rotate the cylinder slowly in a clockwise direction,
maintaining an image of the objects rotating with
the cylinder. The experimenter told the subject to
stop as soon as the cylinder had been rotated a
certain amount—120° for 10 subjects, 33° for the
other 10 subjects. The subject was then asked to
"consolidate" his or her image by imagining each
animal in its correct location as the experimenter
read the animal's name; this procedure was continued until each animal had been named four times.
The subjects were then told to rest their fingers
comfortably on the telegraph keys, to shut their
eyes, and to listen to the tape when it started. When
the first animal of a pair was mentioned, subjects
were to imagine the cylinder and its contents and
to imagine, in addition, a thin vertical line that
was the height of the cylinder, lined up with the
named object and localized along the front edge
of the platform. The subjects were to maintain that
image over the next few seconds, and when the
second animal was named, they were to "scan"
horizontally over to it by mentally sweeping the
line across the front of the cylinder, pressing the
yes key as soon as the line arrived at the second
animal. They were asked to "focus" on the center
of the line while scanning and to scan at the fastest
possible speed that would allow them to see the line
at all times. If the second animal was not in the
cylinder, the subjects were asked to press the no
key as quickly as possible after "searching" the
image and failing to find it.3 Again, speed and
accuracy were stressed.
The experimenter then read a short list of animal
pairs as practice trials. As soon as the subject felt
that he or she could perform the task properly,
the experimenter started the tape. A short break
was provided approximately halfway through the
experimental trials. When these trials were over,
the subject was asked to complete a questionnaire
containing the following four items: "On what
percentage of the trials did you follow the instructions?" "If you did not follow the instructions on some
of the trials, what did you do instead?" "Did you use
any special tricks or strategies?" and "What do you
think the purpose of the experiment is?"

Results and Discussion
Mean response times for correct yes
trials were calculated for each pair of
animals, separately for each of the two
groups of subjects. In Figures 3 and 4, these
means are plotted against the horizontal
distance between the objects. These distances were calculated by measuring the
separations between objects in a photograph of the rotated display taken from the
subject's vantage point and by scaling
these measurements to correspond to the
distances along the front edge of the platform, where the subjects were to have
imagined the sweeping vertical line. "Wild"
scores, those more than twice the mean of
their cell, were discarded prior to analysis.
Fewer than 1% of the scores were wild,
and the frequencies of occurrence of these
scores did not vary systematically with
horizontal interobject distance.

3
This task was included to make response time a
less salient dependent variable to the subjects,
thereby reducing the likelihood that subjects would
deduce the purpose of the experiment. It also
allowed us to compare error rates for the different
animal pairs to rule out possible speed-accuracy
trade-offs.

Figure 4. Mean response times for scanning between imagined objects separated by different two-dimensional horizontal distances, following a 33° clockwise mental rotation of the display. (Horizontal axis: distance between objects, in cm; vertical axis: response time, in msec. Labeled pairs: Bug-Frog, Fish-Frog, Bug-Fish, Bear-Fish, Bear-Frog. Best-fitting line: y = 67x + 1568, r = .80.)

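
The criterion for "wild" scores can be sketched as follows. This is an illustration only. It assumes that a cell is a list of response times in msec and that the cell mean is computed over all scores in the cell, including any that are later discarded; the text does not specify either point, and the function name is hypothetical.

```python
from statistics import mean


def drop_wild_scores(cell_times, factor=2.0):
    """Split one cell's response times into (kept, wild).

    A score counts as wild when it is more than `factor` times the cell mean.
    Assumption: the mean is taken over every score in the cell, wild or not.
    """
    cutoff = factor * mean(cell_times)
    kept = [t for t in cell_times if t <= cutoff]
    wild = [t for t in cell_times if t > cutoff]
    return kept, wild
```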
120° Condition
The response times in this condition
increase linearly with increasing horizontal
separations of the objects as seen from an
angle of 120° (r = .94). An analysis of
variance shows that these means differ
significantly from one another, F(5, 45)
= 9.82, p < .001. Further, a trend analysis
shows that the linear relation between
mean response time and distance generalizes over subjects, F(1, 45) = 43.71, p
< .001; and that the deviation from this
linear trend does not generalize, F(4, 45)
= 1.34, p > .10.
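
The degrees of freedom reported above are what a one-way repeated-measures analysis with 10 subjects and 6 object pairs would produce if the linear trend and the deviation from it were both tested against the subject-by-pair interaction. The sketch below shows that partition. It is an assumed reconstruction rather than the authors' computation, and the function and key names are hypothetical.

```python
def trend_analysis(data, distances):
    """Partition between-pair variance into a linear trend and a remainder.

    data[i][j] is the response time of subject i for object pair j;
    distances[j] is the horizontal distance for pair j.
    Error term (assumption): the subject-by-pair interaction.
    """
    n, k = len(data), len(distances)
    grand = sum(sum(row) for row in data) / (n * k)
    pair_means = [sum(row[j] for row in data) / n for j in range(k)]
    subject_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_pairs = n * sum((m - grand) ** 2 for m in pair_means)
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_error = ss_total - ss_pairs - ss_subjects
    df_error = (n - 1) * (k - 1)
    ms_error = ss_error / df_error

    # Linear contrast: weights are the distances centered on their mean.
    mean_distance = sum(distances) / k
    weights = [d - mean_distance for d in distances]
    contrast = sum(w * m for w, m in zip(weights, pair_means))
    ss_linear = n * contrast ** 2 / sum(w * w for w in weights)
    ss_deviation = ss_pairs - ss_linear

    return {
        "df_error": df_error,
        "ss_pairs": ss_pairs,
        "ss_linear": ss_linear,
        "ss_deviation": ss_deviation,
        "F_pairs": (ss_pairs / (k - 1)) / ms_error,          # df = (k - 1, df_error)
        "F_linear": ss_linear / ms_error,                    # df = (1, df_error)
        "F_deviation": (ss_deviation / (k - 2)) / ms_error,  # df = (k - 2, df_error)
    }
```

With 10 subjects and 6 pairs, `df_error` is 45, so the three ratios have 5 and 45, 1 and 45, and 4 and 45 degrees of freedom, matching the values reported in the text.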
All subjects claimed to have followed
instructions at least 60% of the time. Four
of the 10 subjects guessed that response
times should vary with distance, and 1 of
the 4 also confessed to deliberately manipulating his response times on occasion. When
data from these subjects are discarded,
the correlation between mean response
time and horizontal distance increases to
.97. In addition, both the overall differences
between means and the linear trend remain
significant, F(5, 25) = 6.35, p < .01, and
F(1, 25) = 29.89, p < .001, respectively,
and the deviation from linearity remains
nonsignificant (F < 1). Thus, it is unlikely
that the present results are caused by
subjects' manipulating their response times
to comply with experimental demand
characteristics.4
Errors occurred in approximately 1% of
trials; although, on the average, they
occurred more often for pairs separated by
shorter distances, both their infrequency
(six errors across all subjects and yes trials)
and their nonsignificant correlation with
distance (r = −.39) make it unlikely that
a speed-accuracy trade-off is responsible for
the linear increase of response time with
distance.
Unfortunately, a serious confounding
prevents us from concluding unequivocally
that two-dimensional interobject separations are accurately represented in rotated
images. From the particular vantage point
used in this condition, the imagined objects
were almost exactly evenly spaced from left
to right, so that objects separated by
greater distances were also separated by
more intervening objects. (The correlation
between the two measures is .99986.) Since
people take more time to scan over more
objects, as well as over greater distances
(Kosslyn, Ball, & Reiser, 1978), the
differences in response times found in the
present condition may reflect only the time
taken to process more intervening objects
between "source" and "destination." Although subjects were asked to "focus"
on the center of the line, thus avoiding
intervening objects, and although distance
has been shown to affect scanning time
when the number of intervening objects is
held constant (Kosslyn et al., 1978; Spoehr
& Williams, Note 1), we cannot claim with
certainty that accurate interobject distances in rotated images are responsible for
the present results. We can only claim at
this point that the objects were placed in
positions appropriate for any rotation between 90° and 135°, the region of rotations
in which the objects have the left-to-right
ordering reflected in the response times.
However, for the angle of rotation used in
the second condition, the objects' horizontal separations correlate only .59 with
the number of intervening objects. If
response times in that condition can be
shown to vary with distance independently
of their variation with the number of
intervening objects, it would argue that
response times here, too, are influenced
by interobject distances in rotated images.
4
See Finke (in press) for arguments that a large
set of data on mental imagery cannot be explained
by subjects' knowledge or expectations, and Kosslyn,
Pinker, Smith, & Shwartz (in press a, b) for arguments that image scanning experiments in particular
are not contaminated by demand characteristics.



33° Condition
Figure 4 shows the mean response times
for object pairs separated by different
distances for this condition. As before,
different amounts of time were required
to scan between different pairs of objects,
F(5, 45) = 10.49, p < .001. As well, the
linear trend of response time with distance
(r = .80) generalizes over subjects, F(1, 45)
= 33.56, p < .001. However, here the
deviation from linearity is also significant,
F(4, 45) = 4.73, p < .01, indicating that
interobject distances in an accurately
rotated image are not the only influence on
response times. Again, all subjects claimed
to have followed instructions at least 60%
of the time. When data are discarded from
the single subject who mentioned that
response time should vary with distance,
the results do not change: Mean response
times differ for different object pairs,
F(5, 40) = 9.95, p < .001, increase linearly
with increasing distances (r = .81), F(1, 40)
= 32.87, p < .01, and deviate significantly
from linearity as well, F(4, 40) = 4.22,
p < .01. Errors occurred in fewer than 1%
of the trials, too infrequently to compare
across distances.
The significant deviation from linearity,
together with the results from the previous
condition, suggests that the number of
objects intervening between source and
destination might influence response times,
perhaps to the exclusion of effects of
distance itself. A multiple regression analysis, using horizontal distance between
objects and number of intervening objects
as predictors, and response time as the
criterion, refutes this interpretation. Together, these two variables account for
64% of the variance of the mean response
times. However, more than two thirds of
this predicted variance (43.7% of the total
variance) is uniquely accounted for by the
distance variable, an effect that is statistically significant when tested against its
total residual variance, F(1, 49) = 15.06,
p < .001. The remaining predicted
variance, almost one third of the predicted total (20.3% of the
total variance), is jointly predicted by
distance and number of intervening objects,
whereas the percentage of variance uniquely
accounted for by the number of intervening
objects is close to 0. Thus, the effects
obtained in both experiments can be
attributed, at least in part, to the two-dimensional distance between imagined
objects. If the number of objects intervening between the source and destination of
scanning has any effect whatsoever on
scan times, this effect is not apparent in
our data.
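
The division of predicted variance into unique and shared parts can be sketched for the two-predictor case. This is a minimal illustration, assuming that "uniquely accounted for" means the drop in R² when a predictor is removed and that the shared part is whatever remains. The function names are hypothetical, and the closed-form R² applies only to two predictors that are not perfectly correlated.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partition_variance(y, x1, x2):
    """Proportions of variance in y predicted uniquely by x1, uniquely by x2, and jointly."""
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    # R squared for the regression of y on both predictors.
    r2_full = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    unique_1 = r2_full - r_y2 ** 2  # lost when x1 is dropped
    unique_2 = r2_full - r_y1 ** 2  # lost when x2 is dropped
    shared = r2_full - unique_1 - unique_2
    return {"total": r2_full, "unique_1": unique_1, "unique_2": unique_2, "shared": shared}
```

On this reading the reported figures add up as expected: 43.7% unique to distance, approximately 0% unique to the number of intervening objects, and 20.3% shared give the 64% total.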
Why, then, do the mean response times
in this condition deviate significantly from
a straight line, with a correlation coefficient
considerably lower than those usually
obtained in image scanning experiments?
One subject provided a clue to the answer:
Upon completion of the task, he asked to
see the objects replaced in their proper
positions in the rotated cylinder and expressed surprise at the discrepancy between
the objects' positions in his image and in
the rotated display. In his image, at least
one object was displaced farther from its
initial position than it should have been,
given the amount of rotation. To see if
the relatively low correlation might have
been due to a systematic misplacement of
objects in the images, we had four new
subjects learn the display and rotate their
images using the same procedure that was
used by the experimental subjects. Then
they were asked to look away from the
cylinder and to indicate the positions of
the objects in their images by drawing and
labeling small circles inside a rectangle,
as subjects in Experiment 1 had done.

When the interobject distances measured
in the four drawings are averaged, they
correlate .93 with the mean response times
of the experimental subjects (see Figure 5).
In addition, the slope of the best fitting
line (presumably an estimate of the rate
of scanning) is 98 msec/cm, considerably
closer to the slope for the 120° condition
than is the slope calculated for the present
condition using the appropriate physical
distances. Although the mean interobject
distances in the drawings correlate more
highly with the mean number of intervening
objects than they do in the correct configuration, a regression analysis again
suggests that it is distance per se that determines response time. Distance uniquely
accounts for 11.6% of the variance, a
marginally significant amount when tested
against the total residual variance, F(1, 49)
= 3.26, p < .10; the number of intervening
objects accounts for an insignificant .3%
(F < 1); and the shared or confounded
effects of distance and number of objects
account for 75.5%.

Figure 5. Mean response times for scanning between imagined objects separated by different two-dimensional horizontal distances, following a 33° clockwise mental rotation of the display, when
distances are calculated from drawings of the rotated display contributed by other subjects. (Horizontal axis: distance between objects, in cm; vertical axis: response time, in msec. Plotted pairs: Bug-Frog, Bug-Bear, Fish-Frog, Bug-Fish, Bear-Frog, Bear-Fish. Best-fitting line: y = 98x + 1322, r = .93.)

Thus, although it appears likely that
response times are determined by horizontal
interobject distances in subjects' images,
these images were consistently distorted.
A comparison of the subjects' drawings
and the correct display suggests two
explanations for this distortion. First, if
subjects had positioned objects in a
rotated image by using a triangular function (i.e., linear changes in the projected
position of an object with rotation) instead
of a sine function, they would have shifted
some of the objects into the incorrect
positions that we observed in the drawings.
Though there is no reason (other than the
data at hand) to have predicted that a
triangular function would be used, conceivably it could reflect the use of some
simple heuristic, such as shifting an object
by an amount proportional to the amount
of rotation, instead of an accurate rotation
conforming to the laws of projective
geometry. A second possible explanation
for the distortion is that subjects rotated
their images in an accurate manner, but
rotated them too far. The cylinder must
be rotated 50° for the objects to assume
the same approximate positions relative to
each other as they do in the drawings, a
value that is 17° larger than the correct
rotation. Thus the subject might have
rotated the objects farther than the amount
defined by the physical rotation of the
cylinder.
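
The two candidate distortions can be written out for a single object. The following is a sketch under assumptions not stated in the text: orthographic projection, an object at distance r from the axis of rotation with initial azimuth φ₀ measured from the line of sight, cylinder rotation θ in degrees, and a gain g describing how far the image turns per degree of cylinder rotation.

```latex
\begin{align*}
x_{\mathrm{accurate}}(\theta)    &= r \sin(\phi_0 + \theta) \\
x_{\mathrm{triangular}}(\theta)  &= \frac{r}{90^\circ}\,\arcsin\!\bigl(\sin(\phi_0 + \theta)\bigr)
                                    \quad \text{(arcsine in degrees; linear between the extremes } \pm r\text{)} \\
x_{\mathrm{overrotated}}(\theta) &= r \sin(\phi_0 + g\,\theta),
                                    \qquad g \approx 50^\circ / 33^\circ \approx 1.5
\end{align*}
```

Under the first alternative the horizontal position changes by a constant amount per degree; under the second the sine law is respected but is evaluated at too large an angle.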
In summary, the results of this experiment are consistent with those of Experiment 1: When people inspect two-dimensional patterns formed by objects in an
image rotated in depth, their responses are
closely related to, but not identical with,
the responses that one would expect if
objects actually were rotated and actually
could be seen. In Experiment 1, we could
only suggest that the rotated images were
distorted in some way. In the present
experiment, however, we suggest that the
many possible types of nonrandom distortion can be narrowed down to two. In the
following experiments, we attempt to
distinguish between these two possibilities
and to replicate and extend the results
obtained so far.
Experiment 3
In the previous experiments, subjects
were required to imagine the display
rotating by a certain amount and then to
perform certain tasks pertaining to the
two-dimensional properties of the display
in the resulting image. In the present experiment, we defined a certain two-dimensional property in advance and instructed
the subjects to rotate their images until
that property was apparent. Specifically,
we asked subjects to rotate the cylinder
until two given objects were aligned
vertically in the image. We then compared
the amount of rotation needed to align
each pair of objects in images with the
amounts needed by subjects who performed
the task while actually observing the same
objects, and with the correct amounts
dictated by the laws of geometry.
There are two advantages to this
technique. First, it is possible that subjects
in the previous experiments did not
transform their images continuously as
they rotated the cylinder, but instead
constructed the rotated image from scratch
when the cylinder had been rotated the
required amount. To rotate the cylinder
by just the amount needed to align two
imagined objects, subjects cannot wait
until the cylinder is rotated the requisite
amount before constructing their image—
they must either shift the positions of the
objects in their images continuously as
they rotate the cylinder or, at the very
least, construct the image repeatedly as
the cylinder is rotated by small increments,
stopping when the last image constructed
satisfies the requirement that the two
objects be aligned. The second advantage
of this procedure is that it permits one to
see if the discrepancy between the "target"
angle and the angle subjects select bears
some relation to the absolute amount of
rotation. If so, one can then decide between
the two remaining hypotheses concerning
the type of nonrandom distortion introduced into rotated images.
Method

Subjects
The 16 subjects who served in the imagery
conditions of the two previous experiments performed the present task immediately upon completion of the task in Experiment 2. Ten of the 16
subjects in the perception condition of Experiment
1 were paid to participate in the perception condition
in the present experiment, performing the present
task at the completion of Experiment 1.

Apparatus
The display was identical to that of Experiments
1 and 2. No additional materials were used.

Procedure
The subjects in the imagery condition had all
learned the display during their participation in
Experiment 1, and as in Experiment 2, they were
given no additional exposure to the display. After
the empty cylinder had been restored to its original
orientation, the subject placed his or her chin in the
chinrest, and was asked to listen to the pair of
animals recited by the experimenter and to rotate
the cylinder and the image until he or she could
"see" the two animals vertically aligned in the
image. The experimenter recorded the amount and
direction of rotation, returned the cylinder to its
original position, and read another pair of animals,
until each of the six pairs had been mentioned. A
different order of presentation was used for each
subject.
The subjects in the perception condition performed the same task, but with the objects visible
in the cylinder in their correct positions at all times.

Results and Discussion
Given the positions of the objects in the
cylinder and the distance of the cylinder
from the subject, one can calculate the
amount of rotation of the cylinder that is
necessary for the vertical plane joining any
two objects to intersect the midpoint
between the subject's eyes. This amount
should coincide with the amount of rotation
necessary for the two objects to be aligned
vertically from the subject's vantage point.
The calculation can be done in two different
ways for each pair, corresponding to one
or the other object being closer to the
subject. Since none of the vertical planes
defined by the object pairs in this display
include the cylinder's axis of rotation, the
two possible amounts of rotation that
align the members of each pair do not
simply differ by 180°. When the two
amounts differ substantially for a pair of
objects (i.e., when one object is substantially farther forward than another), one
might expect all subjects to align the
objects in the same way, choosing the
smaller amount of rotation. But when the
two amounts are close (i.e., when the
objects are at similar depths), one might
expect some subjects to align the objects
one way and some to align them the other
way.

Figure 6. Mean rotations of the cylinder to align
pairs of imagined or perceived objects. (Negative
values represent clockwise rotations; positive values
represent counterclockwise rotations. The solid line
represents perfectly accurate performance; the
dotted line is the best-fitting line to the means from
the imagery condition. Horizontal axis: target angle, in degrees; separate symbols mark the imagery and perception conditions.)

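
The calculation of the two alignment angles can be sketched in a top-down view. The coordinate conventions are assumptions made for illustration: the axis of rotation is at the origin, the midpoint between the subject's eyes is at (0, −viewer_distance), object positions are given in the unrotated cylinder, and counterclockwise rotations are positive. The function names are hypothetical.

```python
import math


def rotate(point, degrees):
    """Rotate a top-view point about the cylinder axis (origin), counterclockwise positive."""
    x, y = point
    a = math.radians(degrees)
    return (x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a))


def alignment_angles(p1, p2, viewer_distance):
    """Return the two cylinder rotations (degrees) that align p1 and p2 for the viewer.

    Rotating the cylinder by an angle is equivalent to moving the viewer around a
    circle of radius `viewer_distance` in the cylinder's own frame, so the solutions
    are the two points where the line through p1 and p2 crosses that circle.
    The first angle returned places p1 nearer the viewer, the second places p2 nearer.
    """
    (x1, y1), (x2, y2) = p1, p2
    dx, dy = x2 - x1, y2 - y1
    a = dx * dx + dy * dy
    b = 2.0 * (x1 * dx + y1 * dy)
    c = x1 * x1 + y1 * y1 - viewer_distance ** 2
    disc = b * b - 4.0 * a * c
    if a == 0 or disc < 0:
        raise ValueError("objects coincide or their line never reaches the viewer's circle")
    angles = []
    for sign in (-1.0, 1.0):
        t = (-b + sign * math.sqrt(disc)) / (2.0 * a)
        qx, qy = x1 + t * dx, y1 + t * dy  # viewer position in the cylinder's frame
        angles.append(math.degrees(math.atan2(-qx, -qy)))
    return tuple(angles)
```

For a pair whose joining line passes through the axis, the two returned angles differ by exactly 180°. For any other pair they do not, which is the situation described for this display.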
This is in fact what happened in both
the imagery and the perception conditions.
For four pairs of objects, one member was
in the forward half of the cylinder and the
other was in the rear half, and there was
virtual unanimity among subjects in both
conditions as to the way in which the
objects should be aligned, and thus as to
the direction in which the cylinder was
actually rotated. (Only one of the imagery
subjects rotated the cylinder the long way
around, doing so for two of the pairs.) For
the other two pairs, both objects were in
the same half of the cylinder, and subjects
were divided as to the direction of rotation:
9 of the 16 subjects in the imagery condition chose to align each pair in the direction
requiring the smaller amount of rotation,
and 7 (for one pair) and 8 (for the other)
of the 10 subjects in the perception condition did the same. Hence, there are eight
different angles of rotation to examine for
each group, one for each of the four
"unanimous" pairs, and two for each of
the two pairs in which subjects disagreed
as to the direction of rotation (excluding
the instances in which 1 subject in the
imagery condition went against the other
25 subjects in her choice).
In Figure 6, the mean amounts of
rotation for the eight different alignments
are plotted against the correct amounts,
separately for the imagery and perception
conditions. Not surprisingly, subjects in the
perception condition are extremely accurate: The points representing their mean
amounts of rotation fall very close to the
principal diagonal, deviating from the
correct angles by amounts varying from
.8° to 7.0°. The means are highly correlated
with the correct values (r = .9995) and
the line that best fits the points has a slope
close to unity (.99) and an intercept close
to zero (3.7°). The standard errors of the
mean for the different points are also small,
ranging from .6° to 3.3°.
The mean angles through which the
subjects in the imagery condition rotated
the cylinder for the different alignments
also vary linearly with the target angle
(r = .994) and fall along a line with an
intercept close to zero (2.3°), but with a
slope less than unity (.77). The means
deviate from the correct values by amounts
ranging from 8° to 28°; the standard errors
of the mean are also larger than those in the
perception condition, ranging from 1.3°
to 8.0°.
As before, we find that subjects are
accurate in detecting a two-dimensional
property of a configuration of imagined
objects rotated in depth, when their
performance is considered within the range
of possible responses. However, again we
find that this performance differs markedly
from optimal performance, exhibited in
this case by the subjects who actually saw
the objects rotating in the cylinder. Here,
the nature of the discrepancy can be
characterized simply. The subjects who
imagined the objects consistently rotated the
cylinder by a fraction less than the correct
angle. This suggests that their image of
one or more of the objects "moved" a
constant fraction farther than the amount
defined by the physical rotation of the
cylinder.

This account is consistent with the
distortions observed in the second condition of Experiment 2. In that condition,
both the drawings provided by the four
additional subjects and the overall pattern
of response times suggested that subjects
displaced at least one of the objects farther
than they should have. The regression
equation in the present experiment (see
Figure 6) implies that the images of the
subjects in the 33° condition of Experiment
2 actually correspond to a rotation of 46°,
a prediction that is close to the 50° rotation
suggested by the drawings of the four
additional subjects. And in fact, the
response times correlate .85 with the horizontal interobject distances in the display
when rotated 46°, not as high as their
correlation with the mean distances in the
drawings, but higher than their correlation
with the distances in the correct display.
Although these results support the second
of our possible explanations as to why
subjects in the 33° condition in Experiment
2 scanned distorted images, they do not
rule out the first explanation entirely.
Subjects could have overestimated the
amount by which the cylinder had been
rotated, and could have displaced the
imagined objects linearly with the estimated amount of rotation. In the following
experiment, we examine these and other
possibilities. In addition, we wish to
determine whether subjects would give
evidence of moving their images too
quickly in a task involving fixed amounts
of rotation, or when the cylinder is rotated
beyond the largest angle used so far.
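
The 46° figure cited above can be recovered from the imagery regression line of Experiment 3 (slope .77, intercept 2.3°) by inverting it for a cylinder rotation of 33° clockwise, that is, −33° under the sign convention of Figure 6. The sketch below shows one reading that reproduces the value. The function names are hypothetical, and treating the fitted line as an exact mapping is an assumption.

```python
SLOPE = 0.77         # imagery condition, Experiment 3
INTERCEPT_DEG = 2.3  # degrees; negative angles are clockwise


def cylinder_rotation(target_deg):
    """Cylinder rotation predicted by the fitted line for a given target (image) angle."""
    return SLOPE * target_deg + INTERCEPT_DEG


def implied_image_rotation(cylinder_deg):
    """Invert the fitted line: the image rotation implied by a given cylinder rotation."""
    return (cylinder_deg - INTERCEPT_DEG) / SLOPE
```

Here `implied_image_rotation(-33.0)` is about −45.8, or roughly 46° clockwise.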


Experiment 4
In this simple experiment, subjects
imagined a single object rotating with the
cylinder and indicated the frontal projection of the imagined object by aligning a
marker with it, doing so for many angles
of rotation. If subjects can rotate an image
of an object in depth and can see how far
it has traveled in two dimensions, we
would expect the marker's position to vary
sinusoidally with the angle of rotation of
the cylinder, with an amplitude equal to
the radius linking the object to the axis of
rotation. If subjects in fact rotate their
images by too great an angle in comparison
to the cylinder's rotation, we would expect
the marker's position to vary initially
according to a sine function with a shorter
period than the correct one (until the line
on the cylinder tells the subject that he or
she should be approaching 360°, at which
point the subject might slow down his
or her mental rotation to bring it back into
phase with the physical rotation). On the
other hand, if subjects estimate the amount
of rotation accurately, but overestimate
the extent of the lateral displacement of an
imagined object that results from a given
amount of rotation, we would expect to
find a sine function with a greater amplitude
than the correct one. Such a tendency
could reflect, for example, a strategy of
attending to a physical mark on the
cylinder instead of an image of an object
inside the cylinder. Finally, if the subjects
laterally displace an object in an image by a
distance that is proportional to the angle
of rotation (as was suggested earlier), we
would expect to find a triangular function
instead of a sinusoidal one.
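
The competing predictions can be made concrete as functions of the cylinder's rotation. The sketch below assumes orthographic projection and an object that starts directly in front of the axis of rotation; the gain parameters and function names are hypothetical, and the eventual slowing of the image near 360° mentioned above is not modeled.

```python
import math


def accurate(theta_deg, radius):
    """Correct lateral position: a sine with amplitude equal to the radius."""
    return radius * math.sin(math.radians(theta_deg))


def overrotated(theta_deg, radius, rotation_gain):
    """Image turns `rotation_gain` times as fast as the cylinder: shorter period, same amplitude."""
    return radius * math.sin(math.radians(rotation_gain * theta_deg))


def overextended(theta_deg, radius, amplitude_gain):
    """Rotation judged correctly but lateral displacement exaggerated: larger amplitude."""
    return amplitude_gain * radius * math.sin(math.radians(theta_deg))


def triangular(theta_deg, radius):
    """Displacement proportional to rotation: a triangle wave with the sine's period and extremes."""
    return radius * (2.0 / math.pi) * math.asin(math.sin(math.radians(theta_deg)))
```

The first alternative shifts the peak toward smaller angles, the second raises the peak without moving it, and the third replaces the curve with straight segments. These are the three signatures the marker settings are meant to distinguish.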
Method
Subjects
The 10 subjects who participated in the perception
conditions in Experiments 1 and 3 also served in
the present experiment, immediately upon completion of the task in Experiment 3. Data from two
of these subjects were discarded because of frequent
wild responses, which could not meaningfully be
combined with the other data in calculating means
and standard errors.


Figure 7. Mean setting of a marker aligned with a single object rotated with the cylinder by various
amounts. (The dashed line represents perfectly accurate performance. Horizontal axis: amount of rotation, in degrees.)

Apparatus
The platform supporting the cylinder was modified
by the addition of a slide rule mounted horizontally
on two posts attached to the front edge of the
platform. The slide rule stood 13 cm in front of the
cylinder at about half its height. The only markings
on the rule consisted of a linear millimeter scale,
visible only to the experimenter. A thin vertical
wire was attached to the back of the cursor, extending its hairline 1.25 cm beyond its upper edge.

Procedure
The cylinder was oriented so that its two vertical
black lines were at its sides, instead of at the front
and rear as in the previous experiments. A single
animal (the bug) hung at half the height of the
cylinder, 7 cm in front of the axis of rotation. The
subject was required to learn to imagine the object
in its correct position, following the procedure
outlined in Experiment 1 but with two modifications:
The object was kept at the same height at all times,
and the allowable error was reduced from 3 mm to
1 mm.
The experimenter then asked the subjects to
rotate the cylinder until instructed to stop, imagining
the object rotating with it. As soon as the subjects
stopped rotating the cylinder, they were to slide the
cursor until it was aligned directly in front of the
imagined object, shifting their heads sideways from
the chinrest, if desired, to "sight" the object. The
experimenter recorded the cursor setting, restored
the cylinder to its former position, and instructed
the subjects to rotate the cylinder and image again,
stopping them this time after a different amount of
rotation. This was repeated 20 times, using rotations
at 18° increments ranging from 0° to 342°. The
angles were read in a random order that varied from
subject to subject. Four of the eight subjects rotated
the cylinder clockwise, and four rotated it counterclockwise, so that the effects of rotation by a
given amount can be separated from the effects of
rotation through a given region of the interior of
the cylinder.

Results and Discussion
Figure 7 shows the mean displacements
of the marker for different amounts of
rotation, collapsed over the clockwise and
counterclockwise directions.5 The figure
also contains the correct sine function
corresponding to perfectly accurate performance. The subjects' settings display a
pattern close to the sine function (the two
variables correlate .99), but these settings
seem to be shifted or compressed to the
left for the first three quadrants. Ignoring
the last quadrant for the moment, this
pattern of responses is precisely what we
would expect if the subjects' images were
rotated faster than the cylinder itself.
5
Standard errors of the mean, computed separately for the two groups of subjects, average .69 cm
and range from .2 cm to 1.4 cm.


TWO-DIMENSIONAL PATTERNS IN ROTATED IMAGES

These data weaken the contention that
subjects in these experiments simply overestimate the extent of the horizontal displacement of the imagined object for a
given angle, (e.g., if they focused on a
marking on the surface of a cylinder instead
of an imagined object inside it), which
would have yielded a sine function with a
greater amplitude but with no leftward or
rightward shift as compared to the correct
function. Similarly, the results show that
when people rotate an image, the movement of an imagined object in two dimensions is not simply a linear function of the

amount of rotation, but is closer to the sine
function dictated by the laws of geometry.6
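The contrast between these two accounts can be made concrete with a small numerical sketch. It assumes, purely for illustration, an object 10 cm from the cylinder's axis, an amplitude gain of 1.25 for the overestimation account, and an image that turns 1.25 times as far as the cylinder for the overrotation account. None of these values is taken from the data, and the function names are hypothetical.

```python
import math

RADIUS_CM = 10.0  # hypothetical distance of the object from the cylinder's axis


def correct_displacement(angle_deg, radius=RADIUS_CM):
    """Horizontal displacement of the frontal projection after a rotation."""
    return radius * math.sin(math.radians(angle_deg))


def overestimated_displacement(angle_deg, gain=1.25, radius=RADIUS_CM):
    """Overestimation account: same phase as the correct curve, larger amplitude."""
    return gain * correct_displacement(angle_deg, radius)


def overrotated_displacement(angle_deg, rate=1.25, radius=RADIUS_CM):
    """Overrotation account: the image turns `rate` times as far as the cylinder."""
    return correct_displacement(rate * angle_deg, radius)


def peak_angle(model, angles):
    """Cylinder angle at which a model's displacement is largest."""
    return max(angles, key=model)


# First two quadrants, sampled in 18-degree steps as in the experiment.
angles = list(range(0, 181, 18))
for a in angles:
    print(f"{a:3d}  {correct_displacement(a):6.2f}  "
          f"{overestimated_displacement(a):6.2f}  "
          f"{overrotated_displacement(a):6.2f}")
```

Under these assumed values, the overestimated curve still peaks at the same cylinder angle as the correct curve, whereas the overrotated curve peaks earlier, which is the kind of leftward compression described for the marker settings.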
These results are consistent with those
of Experiments 2 and 3, in which the inaccuracies in the subjects' performance
were attributable to their rotating their
images faster than the cylinder. To assess
whether the overrotation of images in the
present experiment is of the same approximate magnitude as it was in the previous
experiment, we estimated the correct angle
of rotation that would correspond to each
of the mean marker settings. These estimations were made by dividing each mean
by the amount of the largest mean and
calculating the arcsine of that value. To
compare the two experiments directly, we
selected the first eight angles used in the
present experiment (corresponding to the
range of rotations employed in Experiments
2 and 3), calculated the arcsines separately
for the clockwise and counterclockwise
directions, and plotted the eight actual
cylinder angles used in this experiment
against the eight arcsines, which estimate
the angles through which the images were
rotated. Thus, cylinder rotations are plotted
against corresponding image rotations, as
they were in Figure 5. As before, the two
sets of angles correlate highly (.99) and
fall along a line with a slope less than unity
(.87; the intercept is −5°). The two

estimates, .77 and .87, seem to be in
reasonable agreement, given the differences
between the two methods used to estimate
them.
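The estimation procedure just described can be sketched as follows. The marker settings below are synthetic and noise-free (generated on the assumption that the image turns 1.25 times as far as the cylinder), not the values plotted in Figure 7. Settings past the peak are unfolded as 180° minus the arcsine; that step is an assumption of this sketch, since the text does not say how angles beyond 90° were recovered.

```python
import math


def image_angles_from_settings(settings):
    """Estimate image rotation (degrees) from marker settings via the arcsine.

    Each setting is divided by the largest setting before taking the arcsine.
    Settings after the peak are unfolded as 180 - arcsin (an assumption here,
    because the arcsine alone cannot tell apart angles on either side of 90).
    """
    peak = max(range(len(settings)), key=lambda i: settings[i])
    largest = settings[peak]
    angles = []
    for i, s in enumerate(settings):
        ratio = max(-1.0, min(1.0, s / largest))  # guard against rounding
        a = math.degrees(math.asin(ratio))
        angles.append(a if i <= peak else 180.0 - a)
    return angles


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# First eight cylinder angles (0 to 126 degrees in 18-degree steps).
cylinder = [18.0 * k for k in range(8)]
# Synthetic settings: hypothetical 10 cm radius, image rotated 1.25x as far.
settings = [10.0 * math.sin(math.radians(1.25 * c)) for c in cylinder]

image = image_angles_from_settings(settings)
slope, intercept = fit_line(image, cylinder)  # cylinder plotted against image
print(round(slope, 3), round(intercept, 3))
```

With cylinder angles regressed on estimated image angles, a slope below unity corresponds to overrotation of the image; in this synthetic case the slope is the reciprocal of the assumed rotation rate.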
Why should the marker settings for the
three largest angles of rotation in the fourth

quadrant fall below the correct values when
the trend of the preceding angles would
lead us to expect that they should fall
above them? Perhaps the presence of the
lines on the cylinder, which were at the
sides at the beginning of rotation, cued the
subjects after a certain amount of rotation
that their image of the object was quickly
approaching its original position in the
cylinder. As a result, they may have
deliberately slowed down their rotation as
the object passed into the fourth quadrant,
perhaps to the extent of causing it now to
lag behind the cylinder. Given the presence
of a clear frame of reference on the cylinder,
it is not unreasonable to expect that
subjects might eventually have noticed the
distortions introduced by the rotation
process and taken corrective measures of
some sort.
Overall, the results of the last three

experiments agree in suggesting that at
least for rotations up to 130°, subjects tend
to rotate their images approximately 15%
to 35% farther than they rotate the
cylinder. These results are compatible, in a
weaker sense, with those of Experiment 1
as well, inasmuch as consistent overestimation of the amount of rotation might cause
the imagined configuration to differ from
the target configuration.
General Discussion
The present series of experiments provides convergent evidence that people can
localize mental images of objects in three-dimensional visual space, that they can
mentally rotate the configuration of objects
in depth, and that they can detect two-dimensional perspective properties that
emerge from that rotation. These properties include the two-dimensional geometric
shape formed by a set of objects, the
horizontal separations between objects, the
vertical alignment of objects, and the
6 Examination of the marker settings of individual
subjects confirms that the sinusoidal shape of the
aggregate function is not an averaging artifact.
Only one of the eight subjects seemed to have
moved the marker linearly with increasing rotation
and did so only in the first two quadrants.



position of an object's projection onto the
frontal plane. The procedures that transform the positions of imagined objects
introduce a degree of error into the resulting
image, primarily by failing to coordinate
a rotation in imagined space with the
corresponding rotation in visuomotor space.
However, this source of error should not
obscure the fact that subjects' images were
indeed accurate when their performance
is considered within the range of possible
responses in each experiment. First, the
two-dimensional shape actually formed by
the rotated display was the shape subjects
most often reported "seeing" in their
images, and the other shapes subjects
named were similar to that shape. Second,
the time subjects required to scan horizontally between objects in a rotated image
reflected the horizontal separations between
the objects as seen from the new angle
(although, in one case, that angle was
overestimated). Third, subjects rotated the
cylinder by an amount that was linearly
related to the degree of rotation that
would align two objects, were they visible
in the display. Fourth, the judged position
of the frontal projection of an imagined
object varied sinusoidally with the angle
through which it had revolved about the

vertical axis.
What do these results say about the
mechanisms underlying the mental representation of visual space? At the very least,
we need a representation in which the two-dimensional perspective properties specific
to a vantage point are represented and also
a representation that preserves the three-dimensional spatial structure of a scene.
Furthermore, the present results (as well as
those of Shepard & Metzler, 1971, and of
Pinker & Kosslyn, 1978, and Pinker, in
press) suggest that the latter structure
might represent information in a format in
which rotation or translation can be performed smoothly and continuously. Recent
advances in the computational study of
shape recognition (Marr, 1978; Marr &
Nishihara, 1978) suggest what these two
sorts of representations might be like. Marr
and Nishihara propose that the process of
recognizing an object's shape makes use of a

format in which objects are represented by
volumetric "shape primitives" ("generalized cones" of various shapes and sizes)
whose relative positions are defined with
respect to a coordinate system centered on
the object. Something like this object-centered representation, or 3-D model, is
plausible as a long-term memory representation for objects and their absolute locations in a scene. Marr and Nishihara also
propose a second format, in which the
visible regions of a scene, as well as their
depth and tangent-plane orientations with
respect to the viewer, are represented in a
coordinate system centered on the viewer's

vantage point. Something like this viewer-centered representation, or "2½-D sketch,"
is plausible as a short-term memory representation of the appearance of a set of objects, since the two-dimensional properties
and relations visible from a given vantage
point are represented perspicuously.
Marr and Nishihara propose that during
perception, the information in the retinal
images is converted into a 2½-D sketch,
which is then converted into a 3-D sketch,
the level at which shape recognition takes
place. We suggest that during imagination
the inverse of this process might occur. In
this case, information stored in a "3-D"
format, together with a specification of
the position and viewing direction of the
viewer's vantage point relative to the scene,
would be fed into a process that computes
the appropriate "2½-D" or viewer-centered
representation. At this level, which is
probably common to both imagery and
perception, two-dimensional properties like
those defined in the present experiments
can be "read off" the display by ignoring
depth and surface orientation information.
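As a rough illustration of this proposal, the sketch below maps a scene stored in scene-centered coordinates onto a viewer-centered description for a given viewing angle, and then reads off two-dimensional properties by ignoring depth. The scene, the function names, and the use of parallel rather than true perspective projection are all assumptions made for the example; nothing here is meant as a model of the actual mental process.

```python
import math


def viewer_centered(scene, view_angle_deg):
    """Map scene-centered (x, y, z) points to viewer-centered (h, v, depth).

    x and z span the horizontal plane and y is height. The scene is rotated
    about the vertical axis by `view_angle_deg`. Parallel projection is an
    assumption of this sketch.
    """
    t = math.radians(view_angle_deg)
    view = {}
    for name, (x, y, z) in scene.items():
        h = x * math.cos(t) + z * math.sin(t)        # position in frontal plane
        depth = -x * math.sin(t) + z * math.cos(t)   # position along line of sight
        view[name] = (h, y, depth)
    return view


def horizontal_separation(view, a, b):
    """Read off a two-dimensional property by ignoring depth."""
    return abs(view[a][0] - view[b][0])


def vertically_aligned(view, a, b, tolerance=1e-9):
    """Two objects are vertically aligned when their horizontal positions match."""
    return horizontal_separation(view, a, b) <= tolerance


# Hypothetical scene: three objects hanging in a cylinder (arbitrary units).
scene = {"A": (3.0, 5.0, 0.0), "B": (0.0, 2.0, 3.0), "C": (-2.0, 4.0, -1.0)}
front = viewer_centered(scene, 0)
rotated = viewer_centered(scene, 45)

print(horizontal_separation(front, "A", "B"))
print(vertically_aligned(rotated, "A", "B"))
```

In this hypothetical scene, objects A and B are separated horizontally in the original view but become vertically aligned after a 45° rotation, a pattern that is not stored explicitly in the three-dimensional description and emerges only in the viewer-centered one.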
The distinction we have drawn between
these two types of representations receives
further support from Kosslyn and Shwartz's
(1977, 1978) computer simulation of two-dimensional visual imagery. For entirely
different reasons, they, too, posited two
levels of representation: At their "deep
level," an object is represented as a list

of two-dimensional polar coordinates, corresponding to the filled-in points in a depiction of the object, with the coordinate


axes centered on that object. At their
"surface level," the coordinates of the
various objects to be imagined are mapped
onto a single two-dimensional coordinate
system, and the corresponding points in
that coordinate system are "filled in" to
display the objects. Kosslyn and Shwartz's
"deep" representation resembles a 3-D or
"spatial" representation, with its coordinate systems each centered on the represented object. Their "surface display"
resembles a 2½-D or "perspective" representation, with its single coordinate system
centered on the viewer's "fixation point."
Finally, in both our account and theirs,
there must be a process that maps the
various object-centered representations
onto a single viewer-centered representation, a process that in our account also
computes the two-dimensional properties
of the surface image determined by the
laws of perspective.
If in fact there is a process that transforms information from a three-dimensional
to a perspective-specific format, the results
of the present experiments indicate that
the values it can accept for the new vantage
point are not restricted simply to 90° shifts
relative to the original vantage point. Nor
are they restricted to rotations within some

narrow range of angles. In fact, it seems
possible that the process instantiates a
very general algorithm for depicting shapes
from any angle, like those found in sophisticated computer graphics programs. In any
case, our results suggest that this process
would introduce noise into images, both
in the form of random perturbations of
remembered objects' positions and in the
form of systematic perturbations that
result when the amount of rotation has been
mistakenly estimated with respect to a
physical rotation in the external world.
A final note: The claim that the human
mind is equipped with a component that
computes the exact perspective appearance
of a scene from any viewing angle might
strike the reader as farfetched. After all,
children, non-Western peoples, and pre-Renaissance artists are notorious for their
inability to depict three-dimensional scenes
accurately in two-dimensional media such

as paintings and drawings (Arnheim, 1954).
However, this supports several interpretations. Perhaps these people can visualize
the perspective appearance of the scene,
but are simply unable to coordinate their
motor plans for drawing with the patterns
depicted in their images. A second possibility is that these people are capable of
letting a perspective mental image guide

their sketching, but it does not occur to
them to use this strategy. Instead, they
may rely on their knowledge of objects'
three-dimensional shapes, which is sound
practice when reasoning about objects in
the world, but may lead to systematic error
when the objects must be depicted on a
two-dimensional surface (cf. Phillips, Hobbs,
& Pratt, 1978). On this view, the development of artistic skill may in part be an
example of "metacognitive development"
(cf. Flavell, 1977), whereby people learn
how to exploit the special properties of
their own cognitive structures and processes.7 A third possibility is that certain
people may have difficulty simultaneously
mapping the different underlying components of an object or scene into a unique,
viewer-centered coordinate system. Instead,
they might mentally depict each part as
it would appear from its own optimal
vantage point and combine the separate
views into a single sketch. This would
explain why a common error people make
in depicting cubes is to draw each face as
a square and to display more faces than
can actually be seen in a single glimpse.
Clearly, we have few grounds for distinguishing among these possibilities at
present, but the issue may be a promising
subject for the developmental and cross-cultural study of visual cognition.
7 In fact, some artists are trained to represent
perspective by "projecting" a mental image of a
scene onto the canvas and sketching in the contours
that they "see."

Reference Note
1. Spoehr, K. T., & Williams, B. E. Retrieving
distance and location information from mental
maps. Paper presented at the 19th annual meeting


of the Psychonomic Society, San Antonio, Texas,
November 9-11, 1978.

References
Abelson, R. P. Script processing in attitude formation and decision making. In J. S. Carroll & J. W.
Payne (Eds.), Cognition and social behavior.
Hillsdale, N.J.: Erlbaum, 1976.
Arnheim, R. Art and visual perception. Berkeley:
University of California Press, 1954.
Attneave, F. Representation of physical space. In
A. W. Melton & E. J. Martin (Eds.), Coding
processes in human memory. Washington, D.C.:
V. H. Winston, 1972.
Attneave, F. How do you know? American Psychologist, 1974, 29, 493-499.
Attneave, F., & Farrar, P. The visual world behind
the head. American Journal of Psychology, 1977,
90, 549-563.

Attneave, F., & Pierce, C. R. Accuracy of extrapolating a pointer into perceived and imagined
space. American Journal of Psychology, 1978, 91,
371-387.
Finke, R. A. Levels of equivalence in imagery and
perception. Psychological Review, in press.
Fiske, S. T., Taylor, S. E., Etcoff, N. L., & Laufer,
J. K. Imaging, empathy, and causal attribution.
Journal of Experimental Social Psychology, in
press.
Flavell, J. H. Cognitive development. Englewood
Cliffs, N.J.: Prentice-Hall, 1977.
Huttenlocher, J., & Presson, C. Mental rotation and
the perspective problem. Cognitive Psychology,
1973, 4, 277-299.
Keenan, J. M., & Moore, R. E. Memory for images
of concealed objects: A re-examination of Neisser
and Kerr. Journal of Experimental Psychology:
Human Learning and Memory, 1979, 5, 374-385.
Kosslyn, S. M. Measuring the visual angle of the
mind's eye. Cognitive Psychology, 1978, 10,
356-389.
Kosslyn, S. M., Ball, T. M., & Reiser, B. J. Visual
images preserve metric spatial information.
Journal of Experimental Psychology: Human
Perception and Performance, 1978, 4, 47-60.
Kosslyn, S. M., Pinker, S., Smith, G., & Shwartz,
S. P. The how, what, and why of mental imagery.
Behavioral and Brain Sciences, in press. (a)
Kosslyn, S. M., Pinker, S., Smith, G., & Shwartz,
S. P. On the demystification of mental imagery.
Behavioral and Brain Sciences, in press. (b)
Kosslyn, S. M., & Shwartz, S. P. A simulation of
visual imagery. Cognitive Science, 1977, 1, 265-295.
Kosslyn, S. M., & Shwartz, S. P. Visual images as
spatial representations in active memory. In
E. M. Riseman & A. R. Hanson (Eds.), Computer
vision systems. New York: Academic Press, 1978.
Marr, D. Representing visual information. In E. M.
Riseman & A. R. Hanson (Eds.), Computer vision
systems. New York: Academic Press, 1978.
Marr, D., & Nishihara, H. K. Artificial intelligence
and the sensorium of sight. Technology Review,
1978, 81, 2-23.
Metzler, J., & Shepard, R. N. Transformational
studies of the internal representation of three-dimensional space. In R. Solso (Ed.), Theories in
cognitive psychology: The Loyola Symposium.
Potomac, Md.: Erlbaum, 1974.
Phillips, W. A., Hobbs, S. B., & Pratt, F. R. Intellectual realism in children's drawings of cubes.
Cognition, 1978, 6, 15-33.
Piaget, J., & Inhelder, B. The child's conception of
space. London: Routledge & Kegan Paul, 1956.
Pinker, S. Mental imagery and the third dimension.
Journal of Experimental Psychology: General, in
press.
Pinker, S., & Kosslyn, S. M. The representation and
manipulation of three-dimensional space in
mental images. Journal of Mental Imagery, 1978,
2, 69-84.
Reed, S. K. Structural descriptions and the limitations of visual images. Memory & Cognition, 1974,
2, 329-336.
Reed, S. K., & Johnsen, J. A. Detection of parts in
patterns and images. Memory & Cognition, 1975,
3, 569-575.
Siegel, S. Nonparametric statistics for the behavioral
sciences. New York: McGraw-Hill, 1956.
Shepard, R. N., & Metzler, J. Mental rotation of
three-dimensional objects. Science, 1971, 171,
701-703.
Warren, R. M., & Warren, R. P. Helmholtz on
perception: Its physiology and development. New
York: Wiley, 1968.
Received April 2, 1979 •


