Sensation and Perception

Anthony G Benoit

The Chemical Senses
Skin Senses
The Body Senses
Psychophysical Basics



Visible light is a form of electromagnetic (EM) radiation with wavelengths between about 400 and 750 nanometers. The sun's radiation is strongest in this visible range. Other forms of EM radiation include infrared (which some snakes can detect), microwaves, radio waves, ultraviolet, and x-rays.

The wavelength of light determines its color, or more specifically its hue.

The amount of energy in a light ray determines its intensity.

The degree to which a light ray contains only a narrow range of wavelengths determines its saturation.

The Eye (see Figure 12.3 page 179)

Light passes through the transparent cornea and the pupillary opening of the iris. It is focused by the lens then passes through the vitreous humor to strike the retina at the back of the eye. The incoming light forms an inverted image on the rear of the eye.

The retina contains light sensitive receptors (~6.5 M cones and ~100 M rods) and nerve cells.

Light passes through the neurons (ganglion cells and bipolar cells) before reaching the receptors (Fig 12.5, p 181).

Cones are sensitive to color.

Each cone has a connection to one bipolar cell.

Rods are not sensitive to color, but are more sensitive in low light.

Rods contain purple rhodopsin ("visual purple") a light sensitive pigment.

Several rods connect to one bipolar cell.

Rods can interact in one of two ways: convergence (activity in one rod excites its neighbors) or lateral inhibition (activity in one rod inhibits its neighbors via an interneuron).
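A minimal sketch of lateral inhibition, assuming a one-dimensional row of receptors in which each receptor's output is reduced by a fixed fraction of its two neighbors' inputs. The function name and the inhibition value of 0.2 are illustrative assumptions, not measurements.

```python
def lateral_inhibition(inputs, inhibition=0.2):
    """Return each receptor's input minus a fraction of its neighbors' inputs."""
    outputs = []
    for i, value in enumerate(inputs):
        # At the ends of the row, treat the missing neighbor as equal to the receptor itself.
        left = inputs[i - 1] if i > 0 else value
        right = inputs[i + 1] if i < len(inputs) - 1 else value
        outputs.append(value - inhibition * (left + right))
    return outputs


# A dim region next to a bright region (an edge between positions 2 and 3).
light = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
response = lateral_inhibition(light)
```

In this sketch the receptor on the dim side of the edge responds less than its dim neighbors, and the receptor on the bright side responds more than its bright neighbors, so the edge is exaggerated.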

Cones are most dense at the fovea, directly behind the lens. The optic nerve exits the eye at a blind spot (no receptors) slightly toward the centerline of the body from the fovea (see Figure 12.6, p 181).

The optic nerve is composed of the axons of the retinal ganglion cells. It carries signals from the eyes to the brain. The signals from the right and left visual fields cross over at the optic chiasm.

Visual Processing

Visual processing starts right in the retina as described above, and continues in the thalamus and cortex.

Each neuron in the visual system has a receptive field. This refers to the part of the visual scene that the neuron responds to and to the upstream cells that the neuron responds to.

For example, a ganglion cell changes its rate of firing in response to a certain pattern of light and dark within a specific part of the image because that pattern causes certain rods on the retina to respond.
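As a rough illustration of a receptive field, the sketch below models an "on-center" cell that is excited by light on the center of a small patch of retina and inhibited by light on the surround. The 3x3 patch and the equal weighting of the surround are simplifying assumptions.

```python
def on_center_response(patch):
    """patch is a 3x3 grid of light intensities; the center excites, the surround inhibits."""
    center = patch[1][1]
    surround = [patch[r][c] for r in range(3) for c in range(3) if (r, c) != (1, 1)]
    return center - sum(surround) / len(surround)


spot = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]      # light on the center only
uniform = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]   # light everywhere
ring = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]      # light on the surround only

print(on_center_response(spot))     # 1.0  (strong response)
print(on_center_response(uniform))  # 0.0  (no net response)
print(on_center_response(ring))     # -1.0 (inhibited)
```

The cell in this sketch responds to a pattern of light and dark, not to the overall amount of light.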

The Society for Neuroscience web site has a brief article on visual development with a diagram of the visual system by Lydia Kibiuk. See fig 12.7 on page 182 for the visual pathways. (Other drawings at

Hubel & Wiesel won the Nobel Prize for studying the neuronal organization of the visual cortex of the cat. Their work described how neurons are capable of feature detection.

The visual system carries out hierarchical processing. Three types of cells were identified:

Simple cells respond to bars, lines, or edges in certain orientations.

Complex cells respond to stimuli showing bars, lines, or edges undergoing particular movements (left to right, etc)

Hypercomplex cells respond to complex arrangements of length, edge, angle, movement, etc



Stages of visual processing (cell types and what they respond to):

Retina: bipolar cells and ganglion cells respond to a bullseye pattern with the center on or off.

Thalamus (lateral geniculate nucleus): cells respond similarly to ganglion cells
 · form/color
 · movement/distance

Visual Cortex (cells here are called "feature detectors"): layers of cells
 · simple cells: oriented bar of light
 · complex cells: moving oriented bar of light
 · hypercomplex cells: moving oriented bar of given length

Other Areas (visual association area, inferior temporal cortex): depth, movement, recognized objects

The visual cortex has a columnar arrangement (ie, blocks of cells are arranged in columns).

A visual pattern causes a pattern of blocks within the column to respond (this is not unlike the sensory and motor mapping previously discussed)

Dimensions of Vision

Acuity is the ability to recognize familiar shapes (such as letters) at a distance. This is highest near the center of the visual field (ie, near the fovea).

Resolution is the ability to distinguish between two lines that are close together.

Detection is the threshold at which an individual can tell that a stimulus is present at all.

Dark adaptation is the increase in sensitivity to stimuli in low light settings. Changes occur both in the cones and rods and in the higher processing centers.

The eyes are constantly moving (since they are sensitive to changes in stimuli). They also make:

saccadic movements (eg, in reading)

pursuit movements

Color Vision

tristimulus or trichromatic theory (Young-Helmholtz theory)

According to this theory, we have three types of cones, sensitive to yellow-red, green or blue-violet (such categories have been found, see diagram). Differential response from the different cones records color information from the image on the retina. Any color can be formed by mixing three pure colors.

opponent process theory

This theory neatly explains both negative afterimages (the sensation of complementary colors observed after adaptation) (Figure 12.15, page 188) and the perception of gray when complementary colors are mixed.

According to this theory, there are six types of ganglion cells (downstream from cones):

red plus, green minus

red minus, green plus

blue plus, yellow minus

blue minus, yellow plus

white plus (black minus)

white minus (black plus)

An afterimage occurs because a color causes adaptation in one category of cell. For example, as you stare at a green image, your green plus cells become adapted. When you look at a neutral background, the red plus cells (which have not adapted) respond normally and the green plus cells respond less than normal, so you perceive red.
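The afterimage explanation can be sketched as simple arithmetic on one opponent channel. This is only an illustration: the gain values are assumed numbers, and a real red-green channel is not a single subtraction.

```python
def red_green_signal(red_input, green_input, red_gain=1.0, green_gain=1.0):
    """Positive output reads as red, negative as green, zero as neutral."""
    return red_gain * red_input - green_gain * green_input


# A neutral background drives the red and green sides equally.
fresh = red_green_signal(1.0, 1.0)                    # 0.0, seen as neutral
# After staring at green, the green side responds less (assumed gain of 0.6).
adapted = red_green_signal(1.0, 1.0, green_gain=0.6)  # positive, seen as reddish
```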

In the visual cortex, color is recorded as activity in "blobs" of cells running up and down the columns.


The Ear (Figure 13.3 page 192)

Air vibrating in the auditory canal causes vibrations in the eardrum. These vibrations are transmitted through the middle ear by the hammer, anvil and stirrup (bones) to the oval window of the inner ear. Movements of the oval window vibrate the fluid in the cochlea, which contains a basilar membrane covered by hair cells.

The middle ear acts as a mechanical amplifier.

Signals leave the inner ear via the auditory nerve.


Sound is a longitudinal compression wave transmitted through the air.

Sound waves have a frequency, which we perceive as pitch.

Humans are sensitive from 20 to 20,000 Hertz (cycles per second). A doubling of frequency is heard as an octave of pitch.
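Because each doubling of frequency is one octave, the number of octaves between two frequencies is the base-2 logarithm of their ratio. A short sketch (the function name is illustrative):

```python
import math


def octaves_between(low_hz, high_hz):
    """Number of octaves (doublings of frequency) from low_hz up to high_hz."""
    return math.log2(high_hz / low_hz)


one_octave = octaves_between(220, 440)        # 1.0
hearing_range = octaves_between(20, 20000)    # just under 10 octaves
```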

The amplitude of the sound wave gives rise to loudness.

The mixture of frequencies in an auditory stimulus gives it its timbre or tone color.

Theories of Hearing

Most humans are very sensitive to pitch. Sensation of pitch is explained by two theories:

Place theory holds that different sound frequencies cause different parts of the basilar membrane to vibrate, causing different hair cells to fire. This has been observed, but no part of the BM is sensitive to low frequencies.

Frequency theory holds that different sonic frequencies cause different rates of neural firing. This works for low frequencies, but needs to be modified for higher frequencies (over 1000 Hz) by assuming that multiple neurons firing in turn provide a multiplication factor (eg, six neurons each firing 1000 times per second can signal 6000 Hz).
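The multiplication idea is just arithmetic, sketched here under the assumption that a single neuron can fire at most 1000 times per second (the figure used in the example above). The function names are illustrative.

```python
import math


def combined_rate(neurons, rate_per_neuron=1000):
    """Total firing rate signaled by a group of neurons firing in turn."""
    return neurons * rate_per_neuron


def neurons_needed(sound_hz, max_rate_per_neuron=1000):
    """Smallest group whose combined firing rate can match sound_hz."""
    return max(1, math.ceil(sound_hz / max_rate_per_neuron))


print(combined_rate(6))        # 6000
print(neurons_needed(6000))    # 6
print(neurons_needed(500))     # 1
```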


Differences in the timing and intensity of stimuli arriving at the two ears allow us to localize the source of a sound.
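To get a feel for the size of the timing difference, here is a rough sketch. It assumes sound travels in straight lines, the ears are about 0.2 m apart, and the speed of sound is about 343 m/s; all three are simplifying assumptions not given in these notes.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature
EAR_SEPARATION_M = 0.2      # assumed: rough distance between the ears


def interaural_time_difference(angle_deg):
    """Arrival-time difference (seconds) for a source angle_deg away from straight ahead."""
    extra_path = EAR_SEPARATION_M * math.sin(math.radians(angle_deg))
    return extra_path / SPEED_OF_SOUND_M_S


straight_ahead = interaural_time_difference(0)   # 0.0, no difference
to_one_side = interaural_time_difference(90)     # a bit under 0.0006 seconds
```

Even at its largest, the difference in this sketch is well under a thousandth of a second.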


The inner ear is also the sense organ of balance and motion (the vestibular sense).

The semicircular canals (three in each ear) respond to rotation (around three different axes).

The otoliths respond to acceleration or tipping of the head relative to gravity.

This information is vital for allowing you to stand up and for allowing the eyes to track.

The vestibulo-ocular reflex moves your eyes automatically when you turn your head. Good baseball hitters are better able to suppress this reflex so as to keep their eyes on the ball longer.

The Chemical Senses

These respond to chemical rather than physical stimuli.


Dissolved chemicals on the tongue stimulate the taste buds (located in clusters on bumps called papillae). Within each taste bud are a number of receptors, each sensitive to one of four different tastes (sweet, salt, sour, bitter).

humans have ~10,000 taste buds (photo on page 124)

the different tastes were long thought to be localized on the tongue, but all four can be detected wherever there are taste buds

the combination of four taste "hues" with other oral sensations leads to the wide variety of food experiences

cross adaptation

Sweet food B tastes less sweet after you have tasted sweet food A.

cross enhancement

Sweet food B tastes more sweet after you have tasted salty food A (hence the appeal of salted nuts on ice cream). Similarly, salt inhibits the perception of bitter tastes.


Airborne chemicals dissolve in the moisture lining the nose. These dissolved chemicals come into contact with the receptor cells of the olfactory epithelium (fig 4.24, p 122).

Humans have about 5 million receptors of about 1000 different types. Dogs have about 100 million receptors and a larger proportion of their brains dedicated to smell.

Odor seems to defy reason.

There are no useful schemes for classifying odors.

We are not good at identifying odors but we are quite good at recognizing them (ie, recalling that we have smelled them before).

Odors seem remarkably able to evoke memories and feelings.

The common chemical sense

The trigeminal nerve innervates a spot in the nose that is sensitive to chemical irritants like ammonia, vinegar, or menthol.

Skin Senses

The skin is the largest sense organ, with a variety of receptor types.

The average adult has about 20 square feet of skin, weighing about 6 pounds.


The skin has pressure sensors called Pacinian corpuscles (Pc). When deformed, the Pc responds. If the pressure is constant, the Pc adapts.

Hands and face have the most pressure receptors and so are most sensitive to touch. Back, arms and legs have the least.


The skin has two types of temperature sensitive areas: warm areas and cold areas.

Cold areas respond to cold (50°F) or hot (~125°F) stimuli.

Warm areas respond only to warm stimuli (greatest response around 105°F).

If only cold areas are stimulated, we perceive cold. If cold and warm are stimulated we perceive hot. The body can be fooled by a surface that has stripes of cold and warm (such a surface will feel very hot).
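The rule for combining the two kinds of temperature areas can be written as a small decision function. The "warm" and "neutral" outcomes are assumptions added to complete the cases; the notes above state only the "cold" and "hot" cases.

```python
def perceived_temperature(cold_active, warm_active):
    """Combine activity in cold and warm areas into a perceived temperature."""
    if cold_active and warm_active:
        return "hot"      # both active, as with a truly hot stimulus
    if cold_active:
        return "cold"
    if warm_active:
        return "warm"     # assumed: warm areas alone signal warmth
    return "neutral"      # assumed: no activity, no temperature sensation


# Alternating cold and warm stripes activate both kinds of area at once.
print(perceived_temperature(cold_active=True, warm_active=True))   # hot
print(perceived_temperature(cold_active=True, warm_active=False))  # cold
```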


Pain is elusive and variable, but unmistakable when encountered.

The perception of pain varies widely, even for the same stimulus.

It depends in part on psychological state.

Pain alerts us to potential tissue damage by external threats:

corrosive chemicals
electric shock
sharpness (cutting, etc)

Or from internal problems:

organ damage
organ dysfunction

The most widely accepted theory of pain is the gate-control theory proposed by Melzack and Wall (fig 4.26, p 128).

Melzack & Wall, 1965, "Pain Mechanisms: A New Theory," Science: 150, 971-979.

This theory is a response to the single-neuron theories of the 50s and before. It was influenced by the systems thinking prevalent in the 60s (see diagram) and explains many of the properties of pain described above.

Under this theory, pain is controlled by "gates," special nerve centers in the spinal cord. These open and close (transmit or stop signals) in response to stimuli. When the gate is open, the message reaches the brain and pain is perceived.

Touch signals are normally carried by large-diameter fibers (neurons). The pain signal starts when free nerve endings (a special type of small-diameter sensory nerve) respond to intense stimulation. Small-fiber activity opens a gate or neural switch in the spinal cord, allowing pain signals through to the brain. The gate can be closed by large (fast) fiber activity or by signals coming back from the brain.

Note that the large fibers are fast (myelinated) with low thresholds. The small are slower (including unmyelinated C fibers) with higher thresholds.
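A toy version of the gate can make the logic concrete. This is only a sketch: the activity scale of 0 to 1, the simple subtraction, and the threshold of 0.5 are all assumptions chosen for illustration, not part of the theory as published.

```python
def gate_is_open(small_fiber, large_fiber=0.0, brain_inhibition=0.0, threshold=0.5):
    """Return True if pain signals pass the spinal gate.

    small_fiber activity pushes the gate open; large_fiber activity and
    signals coming back from the brain push it closed.
    """
    gate_drive = small_fiber - large_fiber - brain_inhibition
    return gate_drive > threshold


print(gate_is_open(small_fiber=0.9))                        # True: pain is felt
print(gate_is_open(small_fiber=0.9, large_fiber=0.5))       # False: touch input closes the gate
print(gate_is_open(small_fiber=0.9, brain_inhibition=0.5))  # False: the brain closes the gate
```

The second call corresponds to large-fiber (touch) activity arriving along with the pain signal; the third to signals coming back from the brain.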

Free nerve endings are located in skin, muscles and internal organs.

Stimulation of free nerve endings triggers pain and inflammation.

The free nerve endings carry a signal to the spinal cord and trigger the release of a special neurotransmitter called substance P.

Substance P triggers messages that pass through the open gates to the thalamus, frontal lobes, and other parts of the limbic system.

The brain integrates the pain information with other signals.

The brain can actually send back signals which can close the gates and reduce the pain. Alternatively, the signals may open more gates, intensifying the pain.

Anxiety, fear, and helplessness can increase perceived pain, as can physical distress, muscle tension or sympathetic arousal.

Positive feelings, laughter, distraction, a sense of control, and vivid imagery can all reduce perceived pain.

In response to stress, the brain releases endorphins. These inhibit the transmission of pain signals in the brain and inhibit the release of substance P in the spinal cord.

The Body Senses

Movement is aided by the kinesthetic sense (a sense of the position of parts of the body).

The kinesthetic sense depends on proprioceptors, sensory neurons located in muscles and joints.

The control of movement requires an integration of kinesthetic, vestibular and visual information.

Psychophysical Basics

psychophysics = the study of sense reception



In the context of sensation, a stimulus is a physical feature of the world (typically some form of energy or force).

The distal stimulus is the thing "out in the world" that gives rise to our perception (eg, that tree over there). The proximal stimulus is the immediate effect of the stimulus on a sense organ (the image of the tree on the retina).

absolute thresholds (see Table 5.1, p 140)

Sense receptors translate the physical energy of stimuli into nerve impulses.

The absolute threshold is the minimum level of physical stimulus that can be detected by sense receptors.

eg, a single candle can be seen thirty miles away on a clear, dark night

These are expressed in terms of basic physical phenomena, since those were what were available to the psychologists who studied thresholds 100 years ago.

This varies from person to person and from time to time.

Psychologists usually define the threshold as the level detected by most people 50% of the time.
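The 50% definition suggests a simple way to estimate a threshold from detection data: find where the detection rate crosses 0.5. The sketch below uses straight-line interpolation between measured points; the data are made up for illustration.

```python
def absolute_threshold(intensities, detection_rates, criterion=0.5):
    """Estimate the intensity at which the detection rate first reaches the criterion."""
    points = list(zip(intensities, detection_rates))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 <= criterion <= p1 and p1 > p0:
            # Interpolate along the straight line between the two measurements.
            return x0 + (criterion - p0) * (x1 - x0) / (p1 - p0)
    return None  # the criterion is never crossed in this data


# Hypothetical data: stimulus intensity (arbitrary units) and fraction of trials detected.
intensities = [1, 2, 3, 4, 5]
detection_rates = [0.0, 0.1, 0.4, 0.8, 1.0]
threshold = absolute_threshold(intensities, detection_rates)  # about 3.25
```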

Signal detection theory holds that sensation of a faint stimulus is more complex than simple physical translation. Instead the decision mechanisms include a balance of the rewards and costs of noting or not noting the stimulus (see Fig 5.5, p 145).

Was that the front door knob being turned?
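One way to picture the balance of rewards and costs is as an expected-value comparison. This is a simplified sketch, not the full signal detection model; the payoff numbers and the function name are assumptions for illustration.

```python
def should_report(p_signal, hit_value, false_alarm_cost, miss_cost,
                  correct_rejection_value=0.0):
    """Report the stimulus if saying 'yes' has the higher expected payoff."""
    expected_yes = p_signal * hit_value - (1 - p_signal) * false_alarm_cost
    expected_no = (1 - p_signal) * correct_rejection_value - p_signal * miss_cost
    return expected_yes > expected_no


# The same faint sound (20% chance it was really the door knob) in two situations.
home_alone_at_night = should_report(0.2, hit_value=1, false_alarm_cost=1, miss_cost=10)
busy_afternoon = should_report(0.2, hit_value=1, false_alarm_cost=1, miss_cost=1)
print(home_alone_at_night)  # True: a miss is costly, so you notice
print(busy_afternoon)       # False: a miss costs little, so you ignore it
```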

difference thresholds

aka "just noticeable differences"

The amount that a stimulus must change for it to be rated as different is called the difference threshold.

governed by Weber's Law: the JND is proportional to the intensity of the signal (typically a few percent--see Table 5.2, p 146 for more precise values)
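Weber's Law can be sketched in two lines. The Weber fraction of 0.02 (2%) used here is a placeholder within the "few percent" range, not a value taken from the table.

```python
def just_noticeable_difference(intensity, weber_fraction):
    """JND is a constant fraction of the starting intensity."""
    return weber_fraction * intensity


def is_noticeable(old, new, weber_fraction):
    """True if the change from old to new is at least one JND."""
    return abs(new - old) >= just_noticeable_difference(old, weber_fraction)


# With a 2% Weber fraction, a 100-unit stimulus must change by about 2 units,
# while a 1000-unit stimulus must change by about 20 units.
print(is_noticeable(100, 101, 0.02))    # False
print(is_noticeable(100, 103, 0.02))    # True
print(is_noticeable(1000, 1010, 0.02))  # False
```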

subliminal stimuli

Stimuli that do not reach awareness may still be perceived.

Priming: Subjects can more quickly guess that a string of letters is a word (psychology) rather than gibberish (gocophylsy) when the string of letters is preceded by a related word, even if they are not aware of hearing the related word.

Blindsight (aka cortical blindness): People with damage to their primary visual cortex may respond to visual stimuli even if they are not aware of seeing them.

Even under anesthesia, people seem to perceive enough of a stimulus to be better able to guess its nature later.

Subliminal advertising doesn't work (Drink Coke). In fact, James Vicary, who claimed in 1957 to have found that subliminal movie images sell Coke and popcorn, admitted in 1962 that he made up the study. Research since then has not found it to work. The lack of evidence for its effectiveness hasn't stopped some from trying.


Sensitivity to an unchanging stimulus decreases over time. We appear to be most sensitive to changes in the environment.


Organizing Principles (Gestalt Principles)

The Gestalt psychologists worked in Germany in the early 20th century (led by Max Wertheimer). They believed that perception depended on innate responses to relationships between parts of the visual field (patterns). Gestalt is German for "whole," "form," or "shape."

Figure-Ground (Fig 5.7, p 147).

We divide the perceptual field into figure and ground.

Figure has definite shape and location.

Ground has no shape and no location but seems to exist behind and around the figure.

Gestalt Laws of Grouping (Fig 5.8, p 148--see also Gestalt Principle sketches)

These are not natural laws without exception, but rather good summaries of the way we often perceive things.

law of similarity - things alike are grouped together

law of closure - we fill in gaps

law of proximity - things close are grouped

law of good continuation - we tend to connect things obscured by objects on top

law of simplicity - simpler patterns are preferred


These do not explain perception as much as challenge our explanations.

Size constancy - the perceived size of an object does not change as the object moves.

size-distance invariance (we take distance into account)

relative size

Shape constancy - the perceived shape of an object does not change as it rotates (right angles perceived on a cube seen vertex-on).

Brightness invariance - the perceived brightness or darkness of an object does not change under different lighting conditions.

The visual system is insensitive to absolute brightness (except at the pain threshold). We can estimate brightness by changes in detail and are sensitive to brightness differences within the visual field.


prototype matching

According to this theory, stimuli are compared to abstract prototypes (stored in your memory).

feature analysis (Fig 5.17, p 158)

A stimulus is processed into its component features. The components can then be compared with prototypes. The neural processing of Hubel and Wiesel seems consistent with this (hierarchical organization).

bottom-up vs top-down processing

Feature analysis is a form of bottom-up processing. The whole is recognized as a combination of recognized components. This alone will not do: how do we fill in the blanks, or recognize half a letter or face (exc_ama_ion!)? Computers are good at bottom-up processing but are very bad at recognizing objects. The constancies argue against pure bottom-up processing (the components of a perceptual object are not constant).

Top-down processing allows our expectations to influence our perceptions (12 I3 14 vs A I3 C). We expect to find certain objects in certain situations, and look for just enough cues to confirm our suspicions.

Both appear to be necessary.
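The "12 I3 14 vs A I3 C" example can be mimicked with a few lines of code in which the surrounding items decide how an ambiguous shape is read. This is a toy sketch with hypothetical function names, not a model of real perception.

```python
def read_ambiguous(symbol, neighbors):
    """Read the ambiguous shape 'I3' as 13 or B depending on its neighbors."""
    if symbol != "I3":
        return symbol                      # unambiguous: bottom-up reading is enough
    if all(n.isdigit() for n in neighbors):
        return "13"                        # surrounded by numbers: expect a number
    if all(n.isalpha() for n in neighbors):
        return "B"                         # surrounded by letters: expect a letter
    return symbol                          # mixed context: leave it unresolved


def read_sequence(tokens):
    """Read each token using all the other tokens as its context."""
    return [read_ambiguous(t, tokens[:i] + tokens[i + 1:]) for i, t in enumerate(tokens)]


print(read_sequence(["12", "I3", "14"]))  # ['12', '13', '14']
print(read_sequence(["A", "I3", "C"]))    # ['A', 'B', 'C']
```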

Neural network models

Bottom up and top down can be reconciled by using a network model for perception.

These are known as neural network models or parallel distributed processing (PDP) models.

Perceptual elements are connected to one another.

The connections can be inhibitory or excitatory.

Activation of one node in the network spreads through excitatory connections.

The strength of connections increases with repeated use.

Initial activation is bottom up

Based on edges and shapes, some letters are recognized or certain parts (geons) are recognized. Certain words or objects contain these shapes.

The potential words or objects facilitate top down processing.

Seeing some of the letters in a word energizes letters that are likely to be in words that have the recognized letters (fig 5.24 p 162) (and probably activates the nodes that correspond with those words).

The word superiority effect: "G" is more likely to be recognized in CARGO than in ORCGA.

Recognizing shapes activates other shapes that are in possible objects.

The object superiority effect: patterns are more easily recognized within an object.
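The word superiority effect can be illustrated with a very small network sketch. Letter evidence flows up to word nodes, and the most active word sends support back down to its letters. The three-word vocabulary, the evidence value of 0.4 for the hard-to-see letter, and the feedback weight of 0.5 are all assumptions chosen for illustration.

```python
LEXICON = ["CARGO", "LARGO", "CARDS"]  # hypothetical tiny vocabulary


def letter_support(string, position, evidence=0.4, feedback=0.5):
    """Total support for the letter at `position` when it is seen only weakly."""
    target = string[position]

    def bottom_up(i, letter):
        # Evidence that `letter` is at position i: full for clearly seen letters,
        # weak for the target letter, none for letters that do not match.
        if letter != string[i]:
            return 0.0
        return evidence if i == position else 1.0

    # Each word node is activated by the average evidence for its letters.
    word_activation = {
        word: sum(bottom_up(i, ch) for i, ch in enumerate(word)) / len(word)
        for word in LEXICON
        if len(word) == len(string)
    }
    # Top-down: the most active word containing the target letter supports it.
    supporters = [a for w, a in word_activation.items() if w[position] == target]
    return evidence + feedback * max(supporters, default=0.0)


in_word = letter_support("CARGO", 3)     # G inside a real word
in_nonword = letter_support("ORCGA", 3)  # G inside a scrambled string
print(in_word > in_nonword)              # True
```

In this sketch the same weak evidence for "G" ends up with more total support in CARGO because the CARGO word node is strongly activated by the other letters.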

Computer programs have been written to simulate these networks.

These underlie the best machine recognition schemes, such as text recognition, speech recognition, object recognition and even depth perception.

More abstract models are capable of reasoning (problem solving, drawing conclusions, learning) which is consistent with the human metaphor of perception for reasoning: "I see your point." "Show me how you reached that conclusion."

The electronics can be a standard digital computer, but there are software analogs of neurons as basic units that make up the overall program.

The behavior of neurons is clearly consistent with this model.

The semantic network model of memory is very similar to the PDP model of perception.

Illusions, extensively studied by Coren in the 1980s

Müller-Lyer illusion (Fig 5.6d, p 146)

Ponzo illusion (lines on a fan) and Poggendorf illusion (Fig 5.6a, p 146)

due to misapplied constancy

the moon illusion

illusions due to eye movement; which set of dots are closer together?

X                     X
   X       X        

Depth Perception

Monocular Cues

  1. size cues - images covering more of the retina are judged to be larger or closer
  2. linear perspective - parallel lines converge with distance
  3. texture gradient - smoother surfaces are judged to be farther away
  4. atmospheric perspective - less distinct edges are judged to be farther away
  5. interposition (overlap) - if one image covers another, it is judged to represent a closer object
  6. height in the visual field (height cues) - above the horizon, higher objects are judged to be closer; below the horizon, lower objects are judged to be closer
  7. shading - objects cast consistent shadows; brighter parts of an object are judged to extend toward the viewer

Binocular Cues

  1. convergence - you need to turn your eyes inward (cross your eyes) to look at closer objects
  2. retinal disparity (binocular parallax or stereo vision) - the image projected by an object is slightly different on each retina

Binocular parallax or stereo vision is an example of top-down and bottom-up interacting:

To pick out an object, I need to locate it in space.

To pick it out in space, I need to be able to perceive differences between its image on my two retinae

To determine the differences on my retinae, I need to recognize the object.

To recognize the object, I need to pick it out in space....

Note: Triangulating on a distant point is simple, but teaching a robot to see in three dimensions is almost impossibly hard.
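The "simple" part, triangulating on a single point, can be sketched with basic trigonometry. The sketch assumes the point is straight ahead and that the eyes are about 0.065 m apart (an assumed typical value); the hard part, deciding which points in the two images belong together, is not addressed.

```python
import math


def distance_from_convergence(convergence_angle_deg, eye_separation_m=0.065):
    """Distance to a fixated point straight ahead, from the angle between the two lines of sight."""
    half_angle = math.radians(convergence_angle_deg) / 2
    return (eye_separation_m / 2) / math.tan(half_angle)


near = distance_from_convergence(10)  # larger angle, closer object
far = distance_from_convergence(1)    # smaller angle, farther object
print(near < far)                     # True
```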


Anthony G Benoit
(860) 885-2386