Don Herbison-Evans
Faculty of Applied Science
Central Queensland University
Bundaberg
donherbisonevans@yahoo.com
(updated 11 October 2004)
The techniques available for the animation of 3D images by computer are summarised, and exemplified by still and animated images. Monocular techniques discussed are: hidden lines and surfaces, depth cueing, background blurring, perspective, surface shading, and temporal parallax. Binocular techniques covered are: parallel stereo pairs, crossed stereo pairs, Anaglyphs, image switching, polarised images, dual displays (Virtual Reality), Lenticular film, Pulfrich effect, Multiplex holograms, and Autostereograms. Polyocular techniques discussed are: volume scanned displays, vibrating mirrors, intersecting beams, layered displays, and full computed holograms. The last may require a processor capable of around 10^19 operations per second. Such processors are not yet readily available.
A number of techniques are at last providing computer professionals and users with 3D spatial images which can be animated. Such animated sequences can be thought of as four dimensional entities. Other methods are available for presenting 4D entities, e.g. using combinations of RGB colour space plus geometric space, but none have the impact of spatial 3D + time, as may be judged by the popularity of Virtual Reality and of Autostereograms with the general public for recreational purposes. The combination of 3D imaging and animation is the best way of presenting complex data [Ware, 1996].
This article is aimed at providing a background to the various techniques which over the years have contributed to our current position [Okoshi, 1976].
The illustrations and movies were all composed using the NUDES animation system (an acronym for: Numerical Utility Displaying Ellipsoid Solids) [Herbison-Evans, 1978], using figures composed by the author and his students [Herbison-Evans, 1987]. For the images which are designed for parallel stereo viewing, the dots at the top/bottom of these images should be 3 to 5 centimetres apart for greatest comfort. The reader may need to scale the images depending on the display being used. The animations are all 31 or 63 frame cyclic loops in MPEG format, and are best shown in continuous repeating mode. Where available, the animations can be viewed by clicking on the appropriate single images. A short commentary (in AIFF format) on each animation example can be obtained by clicking on the loudspeaker icon under each image.
We obtain much information about the 3D nature of the world that we see by simply using the information in an image at one eye:
The opacity of many objects in our world means that a nearer object can
obscure parts of a more distant object. This led to the 'Hidden Line' and
'Hidden Surface' problems in computer graphics
[Herbison-Evans, 1983].
The many algorithms for the
rapid solution of this problem have been a mainstay of computer graphics
academic examinations for two decades [Hearn and Baker, 1986].
Simple examples for comparison are shown in Figures 1 and 2, and
Animations 1 and 2:
Figure 1
Figure 2
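The occlusion rule described above can be illustrated with a depth buffer, which is one of the many hidden surface algorithms and is not necessarily the one used for the figures here. The following is a minimal sketch, assuming a scene already broken into pixel-sized fragments, with smaller depth values meaning nearer to the observer. The names zbuffer_render and rect are hypothetical helpers introduced only for this illustration.

```python
def zbuffer_render(fragments, width, height, background="."):
    """Keep, at each pixel, only the fragment nearest to the observer.

    fragments: iterable of (x, y, depth, label); smaller depth = nearer.
    Returns the picture as a list of strings, one per row.
    """
    depth = [[float("inf")] * width for _ in range(height)]
    image = [[background] * width for _ in range(height)]
    for x, y, z, label in fragments:
        inside = 0 <= x < width and 0 <= y < height
        if inside and z < depth[y][x]:
            depth[y][x] = z        # remember the nearest depth so far
            image[y][x] = label    # the nearer fragment hides the farther
    return ["".join(row) for row in image]


def rect(x0, y0, x1, y1, z, label):
    """Hypothetical helper: a flat rectangle at constant depth z."""
    return [(x, y, z, label) for y in range(y0, y1) for x in range(x0, x1)]


# Rectangle B (depth 2) is nearer than rectangle A (depth 5),
# so B obscures A wherever the two overlap.
scene = rect(0, 0, 4, 3, 5.0, "A") + rect(2, 1, 6, 4, 2.0, "B")
picture = zbuffer_render(scene, 6, 4)
```

Because each pixel keeps only the minimum depth seen so far, the result does not depend on the order in which the fragments are drawn.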
Most animals have an upper surface that is darker than the underside.
This has been interpreted as an attempt at camouflage, because
monochromatic objects are brighter on the side closest to a source of
illumination. Outdoors, averaging over all normal upright orientations of
an animal, the sun is overhead. Thus a simple shading of a computed image
of a 3D object from lighter on top to darker underneath is a good way of
presenting a solid look to the object. This has been used for many years by
artists, under the term
Chiaroscuro.
Figure 3
This effect has been used for many years in Chemistry and
in Computer animation. In this technique, items that are
close to the observer are made more intense or thicker than items further
away. This has been common practice in diagrams used in
stereochemistry.
It is equivalent to putting
an implicit light source at the observer's eye,
rather than, say, above the scene.
It does have the advantage of requiring no computation of shadows.
Except for the shadows, the difference between this and
the vertical illumination described above for Chiaroscuro
is computationally trivial: vertical illumination refers the shading to
the y component of the surface normal, whereas illumination from the
observer refers it to the z component [Herbison-Evans, 1980].
However, visually, the impacts are very different. Vertical light gives a
simple natural feel to the solidity of objects in a scene, even if no
shadows are computed. Light from the observer washes out any shading
due to curvature of the objects in a scene. It also gives the feel
of being down a mine, with a light on the hat illuminating the scene. For
many observers this is associated with a claustrophobic sensation which is
usually not what the programmer had in mind.
Examples may be seen by comparing Figures 3 and 4.
Figure 4
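The two shading rules compared above differ only in which component of the surface normal sets the brightness. The following is a minimal sketch, assuming a coordinate system with y pointing up and z pointing from the scene towards the observer. The function names and the particular brightness mappings are illustrative assumptions, not the formulas of the NUDES system.

```python
import math


def normalise(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def overhead_shade(normal):
    """Light from directly above: brightness follows the y component.

    Returns 1.0 for a surface facing straight up, 0.0 for one facing
    straight down (the assumed mapping is linear in between).
    """
    _, ny, _ = normalise(normal)
    return 0.5 * (1.0 + ny)


def observer_shade(normal):
    """Light at the observer's eye: brightness follows the z component.

    Surfaces facing the observer are brightest; surfaces seen edge-on
    receive no light in this assumed mapping.
    """
    _, _, nz = normalise(normal)
    return max(0.0, nz)
```

For a surface facing the observer, overhead_shade gives a mid grey while observer_shade gives full brightness, which is one way to see why light from the observer tends to flatten the shading across the visible faces of an object.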
An early observation was that backgrounds appear to be
more blurred
than the objects in front of them which have our visual attention.
This is probably due both to the limited depth of field
of the ocular lens
and to the reduced resolution available outside the 1/2 degree field
of the fovea centralis, as well as to the effects of mist and dust in the
intervening atmosphere ('aerial perspective').
The effect has been widely used by painters, photographers,
and cinematographers to give a 3D feel to their images [Rokita, 1996].
Examples are shown in Figures 5, 6, and 7:
Figure 5
Figure 6
Figure 7
Temporal or Motion parallax
can be obtained using only one eye by means of animation: near
objects move faster on the visual field than distant objects. And while
this is true in general, more specifically the brain infers 3D character
most effectively from the comparison of two views of a scene, taken with the
scene rotated a couple of degrees about a vertical axis between the views.
This is the basis of binocular 3D imaging, about which more later. But it
works even with one eye. The brain can apparently use the same
algorithms for depth perception whether the two images arise from different
eyes at the same time or from one eye at different times. This effect has
been used in computer graphics for many years by having the default state
of the display of a 3D entity to be a continuous slow rotation about a
vertical axis.
The reader may care to compare the apparent solidity of
the molecules rotating about the vertical and the horizontal
axes in Animation 8: if the theory is correct, the ones rotating
about the vertical axis should appear more solid than the ones
rotating about the horizontal axis.
Figure 8
Another early observation was that images of nearer objects appear
larger than images of more distant objects. This led to the science of
Perspective, epitomised by the projective transformation learned by all
computer graphics students. This transformation does have two minor
problems however. The first is that the perspective is for viewing from
only one point in space. If the observer's eye is placed anywhere else, the
image is incorrect, and the 3D effect is strained. The second problem is
that the perspective image is projected mathematically onto a flat plane,
whereas the retina of the eye is approximately hemispherical.
This again leads to a distortion around the periphery of images
generated by the flat perspective projection.
An example may be seen in the animation at Figure 9:
Figure 9
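The projective transformation mentioned above can be sketched in a few lines. This is a minimal illustration, assuming the eye is at the origin looking along the positive z axis and the flat picture plane lies at distance d in front of it. The name perspective and the parameter d are introduced here for illustration only.

```python
def perspective(point, d=1.0):
    """Project a 3D point onto the flat picture plane z = d.

    The eye is assumed to be at the origin looking along +z, so the
    projected size of an object falls off as 1/z.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the eye (z > 0)")
    return (d * x / z, d * y / z)


# The same offset from the axis appears half as large at twice the distance.
near = perspective((1.0, 1.0, 2.0))   # (0.5, 0.5)
far = perspective((1.0, 1.0, 4.0))    # (0.25, 0.25)
```

The single fixed eye position in this formula is the first of the two problems noted above: the image is only strictly correct when viewed from that one point.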
Another 3D effect is called
'texture gradient', where the pattern of some
texture such as grass varies with distance.
Statistically: the autocorrelation function
of the texture contracts as distance increases.
The illusion of solidity is particularly strong if the left and right eye
images can be separately computed, and each presented to the appropriate
eye. Doing this by computer has problems. The eyes can only appreciate the
solidity of a scene if the ratio of the nearest to furthest distance is
greater than about 2:3, and the left and right scenes differ by a rotation
of about 2 or 3 degrees about a vertical axis (clockwise as viewed from above
if deriving the right eye view from the left). Given a screen resolution of
about 4 pixels/millimetre, for most binocular techniques, this quantises
depths to about 16 possible values, which can give an unfortunate layering
effect. Printed stereo images, having an order of magnitude better
resolution, can give nearly continuous depth perception, as the eye/brain
has itself only a resolution of about 250 values.
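The rotation that turns a left eye view into a right eye view can be sketched as follows. This is a minimal illustration, assuming x points to the observer's right, y points up, and z points from the scene towards the observer, with the rotation taken about the vertical axis through the origin. The function name and the default of 2.5 degrees are assumptions chosen from the 2 to 3 degree range given above.

```python
import math


def right_eye_view(points, degrees=2.5):
    """Rotate the left eye scene clockwise (seen from above) about the y axis.

    With z pointing towards the observer, near points (z > 0) shift left
    and far points (z < 0) shift right, as seen by the right eye.
    """
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [(x * c - z * s, y, x * s + z * c) for x, y, z in points]


left_view = [(0.0, 1.0, 1.0), (0.0, 1.0, -1.0)]   # one near and one far point
right_view = right_eye_view(left_view)
```

Each view would then be projected and drawn separately. The horizontal shift between the two views is what becomes quantised to whole pixels on a screen.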
Also called 'Parallel Viewing', this
simple technique is to place the stereo images side by side, left on the
left, right on the right. The viewer must then keep the axes of the eyes
parallel as if looking at infinity, while focussing on the display.
Initially, this is not easy, and some practice may be necessary.
This display technique is called
Freeview.
A restriction is that the images must be placed no further apart than the
spacing of the eyes, and hence they can each be no wider than this, and each
is limited to using only half the display width. An example is shown at
Figure 10:
Figure 10
Alternatively, the images can be placed side by side, left on the right
and right on the left, commonly called the
'Crossview' or 'Transverse View' technique.
To view such images, the viewer must keep the eyes crossed, as if
looking at something closer than the screen, while focusing on the screen.
This is not initially easy either. The images however are now no longer
restricted in their spacing, and can be of arbitrary width. However, again,
only half the area of the display can be used for each image, effectively
halving the display resolution. An example is shown at Figure 11:
Figure 11
A 3D scene can be displayed in
Anaglyph
form [Morgan & Symmes, 1982].
The image for one eye is displayed in red, and the other in green. The observer
must wear glasses with one red lens and one green. The major disadvantage
of this technique is that only monochrome images can be displayed. Also
the observer needs to use special eyewear, which is troublesome for
people who normally wear spectacles already. An example may be seen at
Figure 12:
Figure 12
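Combining a stereo pair into a red and green anaglyph amounts to writing each monochrome image into its own colour channel. The following is a minimal sketch, assuming both images are given as equal-sized grids of grey levels from 0 to 255. The choice of red for the left eye is an assumption here, since it depends on which lens of the glasses is red.

```python
def anaglyph(left, right):
    """Merge two grey-level images into one grid of (r, g, b) pixels.

    The left image goes into the red channel and the right image into
    the green channel (an assumed convention); blue is left at zero.
    """
    if len(left) != len(right) or any(
        len(lrow) != len(rrow) for lrow, rrow in zip(left, right)
    ):
        raise ValueError("left and right images must be the same size")
    return [
        [(l, r, 0) for l, r in zip(lrow, rrow)]
        for lrow, rrow in zip(left, right)
    ]


combined = anaglyph([[255, 0]], [[0, 255]])
```

Since each eye's image occupies a single colour channel, only its brightness survives, which is why the technique is limited to monochrome scenes.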
The left and right images can be displayed alternately at a rate faster
than the critical flicker frequency of the eye (about 20 per second). The
observer again needs to wear special
Shuttered Glasses, this time having lenses that are switched to
alternately blacken out each eye
in synchronism with the display.
The images with this technique can now use the whole screen
and be in full colour, but the eyewear problem remains.
Two normal displays can be used, their images
Polarised orthogonally to each other
and then combined with a half silvered mirror.
The observer must wear glasses with cross polarised lenses. This
technique is the one used most often for '3D Movies'. The complexity of
the display and the glasses are problems.
Two tiny displays, one for each image, can be mounted on a headset,
and mirrors used to present the images to each eye. This is the technique
normally used in
Virtual Reality systems [Pimentel & Teixeira, 1993].
Again the headset is a problem.
A
Lenticular film composed of a set of thin vertical cylindrical lenses
(each typically with a width of 1/4 millimetre)
can be placed over the screen, and the two stereo
images dissected into vertical stripes and interlaced so that the lenses
present each stripe to the appropriate observer's eye. This technique has
been used on picture postcards for many years.
Figure 13
It has the advantage of not needing the observer to train the
eyes or to wear special gear.
Its main disadvantages are the reduced horizontal resolution,
and the difficulty of aligning the image stripes with the lenses.
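The dissection of the two stereo images into interlaced vertical stripes can be sketched as below. This is a minimal illustration, assuming stripes one pixel wide, with even columns taken from the left image and odd columns from the right. The correct stripe width and order depend on the particular lenticular film, so both are assumptions here.

```python
def interlace_columns(left, right):
    """Interlace two equal-sized images column by column.

    Even columns come from the left image and odd columns from the
    right image (an assumed ordering), so each image keeps only half
    of its horizontal resolution.
    """
    if len(left) != len(right) or any(
        len(lrow) != len(rrow) for lrow, rrow in zip(left, right)
    ):
        raise ValueError("left and right images must be the same size")
    return [
        [(lrow if x % 2 == 0 else rrow)[x] for x in range(len(lrow))]
        for lrow, rrow in zip(left, right)
    ]


stripes = interlace_columns(["LLLL"], ["RRRR"])
```

In the result, every second column of each source image has been discarded, which is the reduced horizontal resolution noted above.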
The
Pulfrich effect can be used to give some 3D illusion. This depends
on the fact that the eye requires longer to process dim images than bright
ones. Thus if an animation is created of a scene rotating about a vertical
axis, and the appropriate eye covered with a dark filter, the scene
will appear in 3D [Watkins & Mallette, 1996]. If the left eye is the one
darkened then the scene must be rotating clockwise at about 40 degrees per
second for the effect to occur. An interesting variant is the possibility
that if one eye is dominant, it may process an image faster than the other
eye, in which case no special glasses need be used.
Unfortunately, people having a dominant left eye would require the scene
to be rotating in the opposite direction to those with a dominant right eye.
Examples may be compared in the two molecules in Animation 14:
Figure 14
A Multiplex or
Rainbow
hologram can offer a 3D monochrome image, albeit
multicolored [American Banknote Company 1984]!
It is composed of a series of thin vertical holograms of
flat images of the scene, each generated with the scene rotated a degree
or so from the last one about a vertical axis. Each eye of an observer
sees a different hologram stripe, each reconstructing the scene at the
appropriate angle for that eye as if the scene were solid. The generation
of a Multiplex hologram takes some time, so no interaction with the image
is possible, but limited animation is possible in the precomputation of
the set of flat images. The horizontal resolution is restricted, as well
as the scene having to be monochromatic. However, observers need no special
eyewear, and multiple simultaneous observers are easily accommodated.
A number of commercial firms offer the service of custom making Multiplex
holograms of virtual objects.
The latest technique is to overlap the left and right images, and
indeed to overlap multiple copies of each. The result is called an
Autostereogram [Tyler, 1994].
The left and right images can use random
dots or indeed any arbitrary pattern, for displaying the depth, with no
coloring information (making a Random Dot Stereogram or Single Image
Stereogram), or be in full
shaded colour (resulting in a Wallpaper Stereogram). Possibly a better
technique is to combine the shaded colour at each pixel 50/50 with
a random value, giving a sort of speckled autostereogram. This combines
the proper colour shading of the wallpaper stereogram with the depth
precision of the random dot technique. The Random Dot stereograms pose a
problem for some systems because they are impossible to compress.
When using random numbers, the same set of random numbers can be used for
every frame, which gives peculiar effects for the observer, because some of
the lines change numbers and some do not, so patches of the image appear
not to animate. Alternatively, different random numbers
can be used for each frame, which gives a sparkling effect
to the images, as though the objects were covered in sequins.
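The overlapping of repeated left and right images in a Random Dot Stereogram can be sketched with a simple copy-from-the-left rule. This is a simplified, commonly described approach and not necessarily the one used for the figures here. It assumes an integer depth map in which larger values are nearer, and it assumes parallel viewing, where a shorter repeat distance is generally seen as nearer. The names random_dot_stereogram, period and seed are introduced for illustration.

```python
import random


def random_dot_stereogram(depth, period=8, seed=0):
    """Build one row of dots per row of the depth map.

    depth: 2D list of integers in 0..period-1 (larger = nearer, assumed).
    Each pixel repeats the dot lying (period - depth) pixels to its left;
    the first few pixels of each row are filled with random dots.
    """
    rng = random.Random(seed)
    image = []
    for depth_row in depth:
        row = []
        for x, d in enumerate(depth_row):
            if not 0 <= d < period:
                raise ValueError("depth values must lie in 0..period-1")
            separation = period - d
            if x < separation:
                row.append(rng.randint(0, 1))    # fresh random dot
            else:
                row.append(row[x - separation])  # repeat an earlier dot
        image.append(row)
    return image


# A flat background (depth 0) with a raised region (depth 2) on the right.
dots = random_dot_stereogram([[0] * 16 + [2] * 16], period=8, seed=1)
```

Passing the same seed for every frame of an animation reuses the same random numbers, while changing the seed from frame to frame gives the sparkling effect described above.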
An example of a Wallpaper stereogram may be seen at Figure 15:
Figure 15
An example of a Random Dot Stereogram may be seen at Figure 16:
Figure 16
These are very hard to see, so a hybrid one with the same figure
but a plain background is shown at Figure 17:
Figure 17
Another type of hybrid speckled stereogram may be seen at Figure 18:
Figure 18
These create an image of solid appearance
which is viewable by any number of eyes,
and so, for example, are viewable by several people at once.
Various methods have been tried for this.
A display screen can be driven mechanically to scan a volume,
preferably faster than the critical flicker frequency of the eye.
For example, a cathode ray tube can be constructed so that the screen
oscillates or rotates inside the vacuum.
Alternatively, a scannable laser beam can be arranged
to shine on a rotating screen, so as to scan out a volume.
A
Helically Shaped Screen has been suggested for this.
Also, a Rotating Array of LEDs has been tried.
With any of these techniques,
the points on the image are painted in appropriate synchronism with
the movement of the screen so that they appear at the required point in the
3D space scanned by the screen.
Another technique is to view a normal CRT via a
Vibrating Mirror.
This technique can give a large virtual scanned volume by making the mirror
curved, so magnifying the CRT screen image. If the mirror is driven by a
loudspeaker mechanism, its focal length can be oscillated, and so a virtual
image of considerable depth created [Traub, 1967].
An optically non-linear material can be used to create
Double Quantum transitions.
Two scannable invisible infra-red beams are shone into the
material at an angle to each other.
Where they intersect, visible light is
emitted, so that the beams can be scanned to draw 3D shapes in the medium.
A
set of transparent flat surface displays can be placed on top of each
other to make a solid volume [Sullivan, 2005].
These layers do not have to fill the volume as the eye interpolates a
pair of dots on adjacent screens as falling in between them.
A resolution of 20 depths per pair of layers can be easily achieved
by using various proportions of brightness divided between two
successive layers.
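The division of brightness between two successive layers can be sketched as a simple linear interpolation. This is a minimal illustration, assuming evenly spaced layers, depth measured in the same units as the layer spacing, and a linear split of brightness. The name layer_weights and its parameters are hypothetical.

```python
def layer_weights(depth, layer_spacing=1.0, steps=20):
    """Choose the pair of layers for a dot and split its brightness.

    Returns (index, (w_front, w_back)): the dot is drawn on layers
    index and index + 1 with those proportions of its brightness.
    The fraction is rounded to one of `steps` levels between layers.
    """
    if depth < 0:
        raise ValueError("depth must not be negative")
    index = int(depth // layer_spacing)
    fraction = round((depth / layer_spacing - index) * steps) / steps
    return index, (1.0 - fraction, fraction)


# A dot a quarter of the way from layer 2 to layer 3.
index, weights = layer_weights(2.25)
```

A dot lying exactly on a layer receives all of its brightness on that layer, and intermediate depths shift the brightness progressively to the next layer.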
The ultimate 3D display may be the computed projected full
Hologram
[Firth, 1972]. The computational problems however are immense. For example,
a 10 cm. square hologram would require approximately 10^12 pixels, the
value at each being a Fresnel transform of the light coming from the scene.
For interactive animation, these 10^12 transforms would need to be computed faster
than the eye's critical flicker frequency. Suitable materials exist which
change their optical properties when an appropriate voltage is placed
across them, and this can be done using transparent electrodes. Thus a
hologram could be made from such a material, and the electrodes can
be striped and painted at right angles across opposing faces, so that a
tiny volume of material can be addressed where the stripes cross.
Thus suitable displays seem quite possible. The problem is that the value at each
output pixel is an integral (approximated presumably as a finite sum) over
all the input pixels in a depth map of the scene. So the problem appears to
require a processor capable of approximately 10^25 multiplications and
additions per second. This does seem to pose a difficult problem. One
of the best hopes for reducing this might be to compose the scene from
flat triangles, and to approximate the Fresnel transform of each by its
Fourier transform multiplied by a quadratic phase factor representing
the depth of its centre [Vaughan-Taylor, 1994]. If the Fourier transforms
are precalculated and held to appropriate accuracy in tables, a scene of
say 10,000 triangles
(enough for a human figure given current screen resolutions) would only
need 10^19 operations per second. Further research may lead to improved
ways of cheating.
Thanks are due to many staff and computer graphics students at the
University of Technology, Sydney, and the
Central Queensland University, Rockhampton and Bundaberg,
for their assistance beyond the call of duty,
and especially to Andrew Marriott of Curtin University
for turning the first draft of this seminar into an HTML
document and generally being supportive as the author ventured gingerly
onto the web.
An animated image of solid appearance viewable by any number of
unencumbered observers is still a dream, popularised by such images as
that of Princess Leia in the movie Star Wars, and Selma in the TV series
Time Trax. A number of techniques are closing in on this dream, many
requiring rather specialised types of hardware. With hardware speeds
currently reaching 1 Teraflop (10^12 floating point operations per second)
and improving at approximately a factor of 2 every 18 months
[Bell, 1996], it may take only 37 years for this dream to become possible.
Hidden Surfaces
Chiaroscuro
Depth Cueing
Background Blurring
Motion Parallax
Perspective

Texture Gradient
BINOCULAR TECHNIQUES
Freeview

Crossview
Anaglyphs
Shuttered Glasses
Cross Polarised Images
Virtual Reality
Lenticular Film

A pair of stereo images with the left and right images interlaced.
A suitable lenticular film must be placed over the
image to see the 3D effect.
Pulfrich Effect
Multiplex Hologram
Autostereograms
POLYOCULAR TECHNIQUES
Volume Scanning
Vibrating Mirror
Double Beam Display
Layered Displays
Holograms
ACKNOWLEDGEMENTS
CONCLUSIONS
REFERENCES