I thought I knew something about movement until I broke my foot.
In March 2012 I sustained an Mt-II stress fracture in my right foot. At the time, I had been practicing yoga daily for about eleven years and taught whenever I got the chance. I ran several times a week in barefoot-style trainers, doing interval training in the summers and sustained hill climbs in the winters. I lifted weights, I meditated for an hour every morning, I invested, I thought, a lot of energy, in the face of an absurdly cerebral professional life, in staying connected to my body. Then one Friday morning at the start of March, when I was preoccupied with a paper I was to present the next week and exhausted from dealing with the unhappy end of a winter fling, I went running and had to stop at 39:12 with an unusual discomfort in my right foot.
My foot was swollen and tender but not exceptionally painful. I assumed it was sprained and taped it, took arnica, gave my paper and went to Paris for a short break, where I practiced twice at a favorite yoga studio and imagined the pain was abating. It was seventeen days before I received a diagnosis. When the orthopedist in Berlin pronounced my foot “kaputt” and sent me, on crutches, across town for an MRI, it felt like a gaping hole was opening up before me. Vigorous movement was my main strategy for keeping sane.
Breaking my foot turned out to be a great gift. The ten weeks I spent on crutches made me exquisitely aware of how central kinesthesis is to how we experience ourselves as subjects, that is, as first-person presences—and as social presences.
Figure 1: Author’s Mt-II marrow reaction, 19 March 2012.
We are creatures of movement. The thereness of the world, including our own bodies, arises from our capacity to anticipate how the flow of information to our senses would change if we altered our own position. This attunement to sensorimotor coupling is the product of our early explorations of the world. Motor activity entails two sets of nervous signals: efferent impulses to the muscles and an efference copy that gets compared to afferent, or sensory, impulses to determine whether a given sensory event reflects our own activity (reafference) or activity in the environment (exafference). In this way movement makes salient the boundary between self and non-self, or, as philosopher Susan Hurley put it, “Our understanding of the world as distinct from, and independent of, the self is most deeply grounded in environmental recalcitrance in the face of our rational efforts at motor control” (Hurley 2005: 143).
Efference copy helps you know that it is you, not just the world, that is moving, but it does something more: it provides a basis for fine-grained Bayesian predictions as to the likely outcomes of possible motor acts via simulated, or offline, motor activity. As we grow into our motor selves, we extend and refine our offline predictive faculty, developing an internal loop linking potential motor acts with sensory consequences. It is our ongoing exploration of the world in cerebro that lends the experience of reality its palpable quality. That chair, that building, that person, are there to the extent you can imagine—automatically, unconsciously—how they would present themselves if you shifted your stance (Noë 2004).
The body itself is a source of reafferent stimuli, as when you stretch and your arm swings into view. This is an example of exteroceptive, or body-external, sensation. To this we must add body-internal sensory processes: interoception, or awareness of the physiological status of the body’s tissues; proprioception; and vestibular sensation. Proprioceptive stretch receptors embedded in the muscles, tendons, and ligaments, together with balance and motion receptors in the vestibular apparatus of the inner ear, afford the basis for kinesthesia, the phenomenal experience of movement. Visual, auditory, vestibular, and proprioceptive signals come together in the temporoparietal junction, just past the inner ear on either side of the head. When the four systems agree on the body’s position and orientation, somatic presence is coherent and action and perception originate from a single point in space. This spatial coherence is part of the basis for the reflexive experience of egocentric or first-person perspective. Disturbances in the temporoparietal junction are associated with autoscopic phenomena, including heautoscopy (bilocation, the feeling of being in two places at once) and out-of-body experiences (Blanke and Metzinger 2008; Gapenne 2010).
Living things, in particular sentient beings and above all our own kind, have a special quality of presence, a social presence. This too comes from movement. We are attuned to movement in the environment, we seek it out and imitate it covertly in our own bodies. When we observe patterns of movement that resemble our own, the observed movement triggers activity in the same neural circuits responsible for initiating the corresponding movement in our own bodies (so-called mirror system activity). In this way we simulate observed non-self movement and form predictions as to what the other will do next. As with our predictions of the sensory consequences of possible self-movements, we are continuously comparing predicted to actually observed movement to refine our model of the other’s behavior. We look for patterns of self–other motor contingency: does this (person, creature, thing) respond to my movements with a characteristic sequence of its own? We look for signs of intentionality: is this other’s behavior oriented toward effecting specific environmental outcomes, as mine is? When we observe jerky or unpredictable movement in another, we feel unsettled. We project ourselves, again, automatically and unreflexively, into others’ first-person perspectives and form hypotheses about what they know (e.g., what is visible from where they stand) (Frith and Frith 2010).
Covert imitation represents overt imitation where actual movement has been inhibited by higher-order nervous function. The inhibition is never perfect, and we are prone to traces of overt imitation, giving rise to a pattern of motor resonance between interlocutors. Up to a point, overt imitation is prosocial: when a counterpart imitates you, you become more receptive to that other’s behavior, so long as you are not too aware of the imitation. Even in simple neural network models that omit explicit simulation, motor resonance on its own induces a form of cooperation (Froese and Fuchs 2012).
The human faculty for imitative learning far outstrips that of any other animal. Imitative learning entails three things: intention inference (simulating others’ behavior to form hypotheses as to what they are trying to achieve), motor emulation (reproducing others’ motor behavior simply on the basis of distal—visual, auditory—sensation), and the inference of a causal linkage between behavior and goal. No other animal consistently combines goal inference and motor emulation in this way. Imitation, as much as language, is a characteristically human behavior and part of the basis of our ability to attribute mental states to others (Hurley 2008; Tomasello et al. 2005).
Often we can infer another’s intentions and social stance even in the absence of contextual cues, simply observing her movements. With reaching and grasping behavior, for instance, we make fine-grained discriminations of wrist velocity and grip aperture. We use these discriminations, generally without being able to explain what we’re doing, to make predictions as to whether a particular action forms part of a cooperative act (taking something to give it to someone else), a competitive act (taking something preemptively), or an individual act (taking something to eat it in the absence of competitors) (Becchio et al. 2012).
At the base of social life, then, stands a shared experience of kinesthetic empathy, an ongoing choreography of movement and intention that binds sentient creatures together via palpable traces of co-presence. We sense others moving and respond in characteristic ways, as do they to our movements. Indeed, it is not just that animal nervous systems have evolved to respond to socially significant self–other motor contingency, but that the community is the evolutionary niche to which the animal nervous system is adapted (Di Paolo and De Jaegher 2012; De Bruin et al. 2012).
I’d be lying if I said I induced all this simply from walking on crutches, but that was a big part of it. Suddenly, dimensions of postural control I had been thinking and talking about as a student and teacher of yoga for more than ten years—alignment of the iliac crests, extension of the lumbar spine—were critical to whether I’d be able to get from the S-Bahn station to home without having to pause to rest.
Walking with crutches tuned me in to something else too, something that does not get talked about in contemporary phenomenology of movement, even among those who espouse the view that cognition is a fundamentally somatic condition (Thompson 2007): we are creatures of rhythm, and our musicality, our capacity to entrain to a rhythmic pulse and to pick up on the rhythmicity of movement in the environment, is a big part of what allows us to enact shared moods, intentions, and dispositions (Phillips-Silver and Keller 2012).
Figure 2: Ai, a 37-year-old chimpanzee and veteran research participant at Kyoto University, spontaneously entrained to a distractor pulse after having been trained to tap keys on the keyboard as they were illuminated (Hattori et al. 2013; http://langint.pri.kyoto-u.ac.jp/ai/).
At the time I broke my foot I was recovering from a PhD in linguistic anthropology, and I was writing a book that I thought would be about how register boundaries figure in endangered language documentation in Australia. You, and you, and I belong to a network of constantly evolving institutions defined and differentiated on the basis of shared access to speech registers, linked repertoires of gestural, referential, and syntactic behavior. In enacting these linked repertoires—and this enacting includes passive comprehension as well as active production—we enact our membership in various speech communities. All of us, at every moment of our waking and dreaming lives, are participating in multiple gestural-referential-syntactic communities.
It is by virtue of their ongoing enaction in speech communities that languages assume characteristic topologies or patterns of extension in time and space. When we say language has topological qualities what we’re really saying is that language-enacting communities have characteristic patterns of extension in time and space, since language’s presence in the world, its ongoing manifest availability to sentient awareness, is contingent on the presence of a community of symbol-using beings who enact it, and enact it, and enact it. This is true of any thing whose existence is a product of intentional behavior, but for many people language poses special problems in this regard because its sensible traces are ephemeral. Consider, for example, the comments of Bill Hillier, a theorist of urban form, on the relationship between language and built space. There exists, Hillier writes,
a class of artefacts which are no less dramatic in their impact on human life [than physical artifacts], but which are also puzzling in themselves precisely because they are not objects, but, on the contrary, seem to take a primarily abstract form. Language is the paradigm case. Language seems to exist in an objective sense, since it lies outside individuals and belongs to a community. But we cannot find language in any region of space-time. Language seems real, but it lacks location.
Or rather, Hillier continues, it is not that language and other abstract social artifacts do not manifest in space-time but that “these space-time appearances are not the artefact itself, only its momentary and fragmentary realisations” (Hillier 2007: 65).
The built environment too reflects the ongoing transitory social enactment of artifacts that exist strictly on the basis of convention, that is, artifacts whose presence in the world is given by the systematic articulation of referential (representational, meaning-generating) events to tangible things. Systems of referential convention (speech registers are one kind) afford the principle by which an accumulation of physical traces is configured into a language or a city. Cities, like languages, “are space-time manifestations of configurational ideas which also have an abstract form” (Hillier 2007: 68). The difference is that in the case of cities the tangible precipitate of the artifact-enacting process is a lot more durable, giving us a chance to inspect it at our leisure and form hypotheses about the relationships between the topologies of the social processes that make cities possible and those of the built artifacts that make them sensible.
Can we even imagine something comparable for language? A remote-sensing apparatus that would afford us a synoptic feed of the ensemble of fleeting gestural artifacts and their haptic, sonic, visual, and graphic traces, ranging from touching your elbow to vocalizing to signing to the appearance of glyphs on the screen as I type, that make up the instantaneous activity of a speech community? If we had such an apparatus, how would we represent the data it generated in a way that made patterns manifest, how would we visualize it, how would we map it? Even in writing-saturated societies, the vast majority of language-enacting continues to play out on a temporal modulus dictated by the phenomenal experience of sensible copresence: you and I, walking down the street, talking on the phone, trading text messages. Until recently, there was no way to monitor symbol-using activity on this modulus, let alone map it, for more than a handful of interactions.
Figure 3: New York City in 2011, viewed from the Hedonometer. Each point represents one Twitter status update, colored according to the “happiness” of the words in tweets within a 500m radius (Mitchell et al. 2013).
The fact that today we can imagine such a thing, if not for all linguistic activity then at least for an increasingly significant part of it, reflects a shift in the relationship between language and its material traces that has no precedent. Conversation in media of pervasive transcription is the opposite of writing. What writing does is slow down the process of enacting language to the point where its material traces can be recorded in materials similar to (in some cases the same as) building materials. Writing stretches the horizon of interaction among actors involved in enacting language out to a scale more like that in play in the fashioning of built space. Pervasive transcription does something different. I want to be cautious about saying what, exactly, it does. For a start, maybe it would be better to think of what happens when we communicate in these media not as a form of transcription but of modal transduction. To see what I have in mind, let’s shift focus from the pervasive transcription of social media to the more intimate transcription of typographic design.
Just as the built form of a city embodies the palpable residue of an artifact, urbanicity, that exists by virtue of social enaction, so we could say that typographic design embodies the palpable residue of another kind of social artifact, language, transduced from an acoustic to a glyphic mode of realization. (I’m bracketing signed languages only because typographic design has, historically, ignored them.) The transduction is low-fidelity to say the least, and part of the challenge of typographic design is to subtly reintroduce information—conveyed through intonation, prosodic contour, laryngealization, and so on—lost through the shift from acoustic to glyphic form, or, equivalently, to suggest acoustic enactions to readers.
This kind of intermodal iconicity works because most of our sensory experience, in fact, plays out independent of the modality of sensation in which a particular experience is cast. Usually this phenomenon is referred to as amodality, but it’s not that sensory experience is without modal quality. Mark Johnson has proposed the term crossmodal, but this seems to miss the fact that there is some component of sensory experience that is independent of its modal realizations if not separable from them in experience. Let’s call this the metamodal component of sensory experience.
This metamodal component is defined by patterns of recurring variance over time and space that we heuristically associate with image schemata such as slippery—rough. To get a sense of the heuristic value of these image schemata, in particular for describing qualities of movement, consider Shigehisa Kuriyama’s characterization of slippery and rough mo (crudely and imperfectly analogous to pulse) in classical Chinese medicine, from The Expressiveness of the Body and the Divergence of Greek and Chinese Medicine:
The slippery mo “comes and goes in slippery flow, rolling rapidly, continuously forward” (liuli zhanzhuan titiran), says Wang Shuhe. The rough mo is the opposite: it is “thin and slow, its movement is difficult and dispersed, and sometimes it pauses, momentarily, before arriving”; one has the impression of flow made rough by resistance, struggling forward laboriously, instead of in a smooth, easy glide. “Like sawing bamboo,” says the Mojue. [Kuriyama 1999, 48–9]
Transcription, the relentless transformation of our behavior into data, has been getting all the attention, but there’s something else going on in the media that have so quickly come to dominate our communicative lives: movement. Control of movement has become the defining problem of typographic design, indeed, of design simpliciter (fig. 4). We are surrounded by movement in a way that was unimaginable—or, not unimaginable but improbable or culturally distant, something I’ll come back to—a generation ago.
Figure 4: Views of the installation Type/Dynamics, created by LUST for the Stedelijk Museum, Amsterdam, November 2013. Text scraped from a variety of news feeds cascades across the wall in imbricated columns. When a visitor approaches, the view opens up to focus on a single feed or item. (Note the Kinect enclosed in a wooden box at the foot of the wall, at 8 o’clock to “test”.) http://lust.nl/#projects-5525
After I broke my foot I began to think of bodily movement as something we could usefully apply register theory to. Language, after all, whether in the haptic, vocal, or glyphic mode, is nothing more than a highly developed and conventionalized form of movement. This is not a novel thought, but it is one we need to take more seriously, by which I mean we need to invest energy both in clarifying what registers are—what distinguishes a register from a style, for instance?—and in developing methods to pick out registers in modes of social presence other than language. What I have in mind is something different from the work conducted under the aegis of gesture studies. I think of one key methodological challenge for an anthropology of movement in terms of the U-Bahn (métro, underground, subway) problem. You don’t have to be an anthropologist to notice immediately that people hold and move their bodies differently on public transport in different places, though the difference is usually clearest in a direct comparison. In Berlin, for instance, U-Bahn passengers are markedly more receptive than in Munich to unbidden encounters. It comes across in how they hold their shoulders and hips, where they focus their gaze, the expressions they wear on their faces. But how would you characterize all this precisely from field data? Techniques for gauging body coordination dynamics remain laboratory techniques (Schmidt and Richardson 2008), though marker-free video motion capture could soon change this.
The problem is similar to that posed by speech registers. You don’t have to be a linguist to pick up on the fact, say, that the English spoken among teenagers in Los Angeles is different from that spoken by BBC presenters, or that when I conclude a lengthy technical argument with a sentence that starts, “All’s I’m sayin is—” I’m invoking African-American English to soften the technical rigor of what came before and reestablish empathy with listeners. We all pick up on these things. But to say exactly what we’re picking up on is more difficult.
What I am proposing is this: because body coordination plays out in a number of linked channels, each of which is characterized, for participants in a given motoric community, by distinctive shared repertoires of gesture and coordination, focusing on isolated coordination phenomena, say, phase-locking of forward–backward hip sway, is analogous to focusing, say, on final rhoticity. Final rhoticity tells you something about the differences between RP English and Angelino English, but hardly everything. Similarly, hip sway (Shockley 2012) would tell you something about the differences in the motor language of five-year-old kids and adults (or perhaps U-Bahn-Mitfahrer/innen in Berlin and Munich), but hardly everything.
There’s a second part to what I’m proposing, and this gets to why I think the project of elaborating a register theory for bodily movement is urgent in a way it did not use to be and why anthropologists should stop waiting for kinesiologists and cognitive scientists of music to perfect instrumentation for recording body coordination dynamics, just as phonologists proceeded by ear, training their own bodies to serve as research instruments long before acoustic spectroscopy was mature. Enregistered repertoires of bodily movement are the key to how we gauge intentionality in others over time. The over time part is critical. This is something that does not come up in the literature on simulation and theory of mind, both that in philosophy and that in cognitive science, because, again, the empirical basis for this literature is largely laboratory-based. Laboratory settings allow us to draw out the role of body coordination in the formation of shared moods and intentions—social entrainment over epochs of a few seconds to a few minutes. But social entrainment also plays out over epochs of days and years. It is implicated in the formation of shared dispositions as well as moods and intentions, above all the disposition to regard similarly moving others as moral conspecifics, possessed of a quality of aliveness and social presence like our own. Body coordination is central to how we classify the world according to a social ontology of animacy—how we determine whether some other presence is, by degrees, alive, intentional, sentient, agentive, or imbued with a moral presence like our own, that is, a person. Registers of movement may even shape the social ontology of animacy itself.
Here, I should say, I am entering the realm of hypothesis. I have been moved to do so by my own experience of movement frustrated and by my sense that we are crossing a threshold in our kinesthetic environment. We inhabit a new plenum, a world of ghostly moving presences. When I say that we are surrounded by movement in a way that was previously unimaginable, I have in mind the proliferating numbers and categories of maybe intentional moving presences, all those things, from screen interfaces to drones to immortalized cell lines to the ever more frequent catastrophic meteorological events, that, by design or effect, give an appearance of goal-directed behavior. Perhaps, as some design theorists have suggested, we should start thinking of built space itself as sentient (Shepard 2008), or, less tendentiously, as possessed of a quality of self-production that is larger than us. Our encounters with this new world of motion draw us, individually and collectively, toward decisions about how much animacy, and, by extension, how much moral presence, to ascribe to these new kinds of moving things.
Linguists, of course, use an animacy hierarchy to talk about the realization in grammar of variation in the social ontology of animacy, with different languages and registers assigning different cut-off points for what kinds of things—weather, living things, animals, human beings, first-person pronouns—can occupy which syntactic positions and take what thematic roles. But animacy is a multidimensional phenomenon, and what we need are animacy spaces analogous to the style spaces proposed by Lev Manovich (2013) for the analysis of large numbers of images.
When I say that this new plenum was not previously unimaginable so much as improbable or culturally distant, what I mean is that cognitive anthropology has long been attentive to the wide variation in the social ontology of animacy and to the fact that historically it has not been so uncommon to imagine the world a plenum, full of ghostly presences the discernment of whose movements demands a different kind of bodily attunement from that demanded by other human beings. But cognitive anthropology has, up till now, declined to engage with the urban world, preferring to focus on foragers and herders, communities whose moral entente with other animate beings is immediately palpable (Ingold 2000).
Animals, specifically other-than-human gregarious large vertebrates, have been, over the history of human behavioral modernity, the screen on which we projected our budding interiority, and they remain a key reference point for moral sense-making (Haraway 2007). But today we share our world with a proliferating array of partly autonomous animate copresences. These new things, and the new registers of movement they introduce into our lives, are not going to disappear when the hype over Big Data dies down. We need a cognitive anthropology that takes them seriously as sources of meaning in our lives.
Figure 5: Detail, Untitled (Hungry Ghosts Scroll), late twelfth century CE. From the second section of the Hungry Ghosts Scroll held in the Kyoto National Museum (via http://en.wikipedia.org/wiki/File:Hungry_Ghosts_Scroll_-_detail.jpg). In many times and places, people have been inclined to view the world as a plenum, populated by a wide range of but faintly discernible animate presences (Picone 1991).