How the Senses Combine in the Brain

Marcel Kinsbourne

The design characteristics of the human and non-human primate brain are congruent with its role in the world. Life is a multimodal enterprise. I shall discuss how our brain reconciles its differentiated organization with the fact that we experience objects in multiple modalities (and often respond with coordinated body parts). Indeed, when single modalities are appreciated in isolation, it is by an afterthought, an act of abstraction from the multisensory whole. Correspondingly, the brain is organized so that the combination of modalities that characterizes an object is taken into account early in the microgenesis of a percept. A strictly modular organization of specialized processors is not ideal for this purpose, since it would call for yet additional "association areas" dedicated to bringing the unimodal sources of information into congruence. Such an architecture would incur expense in processing capacity and delay in the timing of the experience and the consequent response. It is not surprising that evidence for "association modules" is hard to come by. The number of meaningful ways in which modalities may combine in animals with sophisticated nervous systems is astronomical, and no merely structural concatenation of converging connections could possibly accommodate them all. Since the functional interactions are legion and changeable, the underlying neural interactions must be dynamic and plastic. Relatively simple nervous systems incorporate strictly modular processors (expert systems) that act automatically and autonomously, and whose logic is not accessible to the rest of the brain. At the other extreme, the human brain is notable for its ability to apply a configuration or rule ("paradigm") derived from one modality to other modalities, and even to abstract (amodal) reflection. The more the rules are practiced and the earlier they are acquired, the less accessible they are to generalization (cf. the syntactic rules of first languages).

Although things in the world are multimodal, not all attributes are equally relevant to the observer's adaptive needs. Inbuilt perceptual and response hierarchies prioritize how attributes are processed and how responses are selected. These rigid innate hierarchies are modified, fragmented and supplanted during development and the learning that attends it. To the extent that their attributes are approximately equally salient, objects will appear distinctive. To the extent that relatively few attributes capture attention, the observer will be apt to note equivalencies.

When stimuli are delivered simultaneously in two sensory modalities, the early electrical potentials that arise in each unimodal cortex are amplified by the potentials that arise in the other. This multisensory interaction can be observed as early as one twentieth of a second (50 ms) after stimulus onset. Correspondingly, young children find crossmodal matching easy, as long as the mapping is straightforward and not confounded by the need for mental transformation.

The senses may merge so as to lose their individualities, collapsing into neurons that receive polysensory innervation. This happens in the superior colliculus and in certain polymodal cortical areas. Or the brain may grant one stimulus priority over the rest (a phenomenon that finds maximal expression in the neuropsychological syndrome of unilateral neglect). Or stimuli may align with each other while maintaining their individuality, by entraining across the network. The rule that I have termed the Functional Cerebral Distance Principle formulates the conditions in dual-task performance under which two inputs (or two outputs) are readily combined or, conversely, held in mind simultaneously as distinct entities. I shall present experimental demonstrations of this principle. Skills may be most readily acquired when the individual is guided by multiple concurrent inputs and responds with several coordinated effectors.

Polymodal forebrain areas are evolutionarily ancient, primitively organized and highly interconnected. The more specialized unimodal areas are also not modular. Six-layered neocortex harbours no discontinuities and is uniformly organized throughout. Neurons interconnect profusely, reciprocally or as polysynaptic loops. Most connections are cortico-cortical rather than input- or output-related. Accordingly, the cortex acts as a differentiated whole. The stream of consciousness reflects the location of the momentary maxima of activation in the forebrain neural manifold. When multiple modalities cross-facilitate, the representation of the object being presented is strengthened as a candidate for inclusion in consciousness.

Multisensory percepts may not be cobbled together or bound from their attributes, but instead emerge from global "premodal" rudiments. Cross-modal integration is not accomplished by convergence, but is anchored in the shared topographical location and timing of its referents. Inbuilt coordinations are adaptive. Certain neurons in auditory (temporal) cortex, for example, respond to a sound only if its source lies in peripersonal space, within grasping reach. Individual "mirror" neurons code both for perceiving another individual in the act of grasping and for one's own intention to grasp; the cell that reacts to a perceived grasp also fires when the organism itself initiates grasping. Such neurons illustrate the reciprocal interaction between, and overlap of, the circuitry for perception and intention. They may form part of a circuit that previsions the goal or intended outcome of the act (mirror neurons fire only if something is actually being grasped, not when an empty grasping movement is on view). Activating one modality primes others and evokes readiness to respond. Thus, to assist communication, an intact sensory dimension can serve as "carrier" for an injured one. Multisensory communication capitalizes on how the brain is organized.