The Inner Observer Trap

Chapter 3

How Self-View Shifts the Brain from "I Am Communicating" to "I Am Observing Myself," Triggering Vicious Cycles

Consider the story of a colleague, which she shared on social media (her name has been changed for privacy). Nelly is a clinical psychologist with fifteen years of experience. In her physical office, she works exactly as she was trained to do: she listens, observes, registers pauses, catches micro-expressions, and notices the subtle tension in a client's shoulders. But when the pandemic forced her practice onto Zoom, Nelly discovered something unexpected. A client would be talking about a painful divorce, their voice trembling, and Nelly would catch herself looking not at the client but at her own face in the corner of the screen.

"Why did I make that face? Do I look empathetic enough? Let's furrow the brows a bit... or is that too much?" She forces her gaze back to the client, but a minute later, her eyes slide back to her own reflection.

Nelly is an experienced professional who understands exactly how the psyche and attention work. She spent years working in a psychiatric hospital and a crisis center for victims of abuse. If even she cannot stop looking at herself, what is happening to everyone else?

The automatic hijacking of attention, which causes alpha rhythms to spike and stay elevated, is only the tip of the iceberg. The deeper issue is that the self-view does not just consume attentional resources; it fundamentally alters the direction of consciousness. A person ceases to be merely the subject of a conversation and simultaneously becomes its object. They are the one communicating and the one being looked at. And the primary person looking is themselves.

It is precisely this shift—from subject to object—that triggers vicious cycles that become self-sustaining. This is why it is impossible to "just get used to" the self-view: the longer a person remains in the position of their own observer, the deeper they sink into it.

The Theory of Objective Self-Awareness

As discussed in the previous chapter, in 1972—half a century before the first Zoom call—Shelley Duval and Robert Wicklund described a mechanism that operates today with frightening precision. Their Theory of Objective Self-Awareness states that when a person's attention is directed inward (whether by a mirror, a camera, or a voice recording), an automatic process of comparison is triggered. The "actual self" is compared to the "ideal self." If a discrepancy is found—and it almost always is—negative affect arises: discomfort, anxiety, or shame ^[1].

From there, a person has two options. The first is to try to reduce the discrepancy: fix their hair, straighten their posture, adjust their facial expression. The second is to flee the stimulus: turn away from the mirror, leave the room, stop the recording.

In a laboratory, both options are available. In real life, too: you can step away from a mirror in an elevator or look away from a shop window. But on a video call, both exits are effectively blocked. Reducing the discrepancy is impossible: a web camera with a short focal length distorts facial proportions—making the nose appear wider, the face rounder, the shadows harsher—and no amount of fixing your hair will correct this (leaving users to place all their hope on AI enhancement filters, where available). Fleeing is also not an option: you are in a meeting, you are visible, and you cannot leave. The self-view keeps running. The comparison continues. The negative affect accumulates.

Duval and Wicklund could never have imagined that half a century later, hundreds of millions of people would be forced into a state of objective self-awareness for several hours a day, for months and years on end. Their theory proved even more accurate than its authors could have anticipated once the pandemic—and the broader remote-work economy—created the conditions to test it on a global scale.

Through the Eyes of an Outside Observer

In 1995, David Clark and Adrian Wells proposed a cognitive model of social anxiety that explained a long-standing paradox: why doesn't social anxiety fade with repeated exposure to the feared situation? ^[2] A person with a phobia of spiders who encounters a spider and survives gradually stops being afraid. A person with social anxiety who speaks in front of an audience repeatedly without being mocked or judged does not stop being afraid. Clark and Wells demonstrated why.

Their model outlines a closed loop consisting of six stages:

1. A person enters a social situation, which activates core beliefs: "I am boring," "People will notice my anxiety," "I look ridiculous."
2. The situation is perceived as a threat.
3. Attention shifts inward, to one's own thoughts, physical sensations, and appearance. This is known in the literature as self-focused attention (SFA).
4. This is the most critical link for our topic: The person constructs an image of themselves from an external perspective. They don't just feel anxious; they see themselves as anxious from the outside. Not from within, as internal states are usually experienced, but externally, as if through the eyes of the audience. This is the observer perspective.
5. To cope with the threat, the person employs safety behaviors: avoiding eye contact, rehearsing every sentence mentally, gripping their hands under the table, frequently taking sips of water, etc. As a result of these safety behaviors, the person is momentarily "distracted," and anxiety drops briefly. But in the long run, the brain becomes convinced that it "survived" solely because of the safety behaviors ("If I hadn't controlled myself, it would have been unbearable"), which ultimately reinforces the anxiety.
6. After the situation, post-event rumination sets in: "Why did I say that? How did that look? They definitely noticed everything."

For us, the key link is the fourth: the observer perspective. In normal life, this is an internal construct; the person imagines how they look to others. The image is inaccurate and distorted by anxiety, but it is at least imaginary. It can be challenged in therapy. A therapist can show the client a video recording of their speech, and the client will see that they looked much calmer than they thought. In fact, video feedback is a core therapeutic technique in Clark's treatment model used to correct distorted self-imagery ^[3].

The self-view on a video call makes the observer perspective literal rather than metaphorical. A person no longer needs to imagine how they look: they are already seeing it, in real time and continuously. And what they see is typically a version distorted by a webcam lens. Clark's therapeutic technique worked because the video was viewed in a safe context, alongside a professional, with attention directed toward behavior rather than appearance. The self-view lacks this framing. It is presented without context or guidance, which is exactly why it works in reverse. It does not correct the distorted image; it creates or sustains it.

The Clark-Wells model explains why Self-View Fixation does not fade with experience. Every video call activates all six links simultaneously and continuously: Core Beliefs → Threat → Self-Focused Attention → Observer Perspective (literal) → Safety Behaviors (monitoring facial expressions, posture, or under-eye bags is a textbook safety behavior) → Post-Call Rumination. The cycle is sealed. The lived experience never refutes the negative beliefs because the safety behaviors prevent that from happening. The person thinks: "The meeting only went okay because I kept checking my face and making sure I looked engaged." The takeaway is not "my fears were unfounded," but "anxiety is unbearable, I must keep controlling my appearance."

Not Just for the Anxious

It is a mistake to think that this only applies to people with clinical social anxiety.

Self-focused attention (SFA) operates in everyone. It is not a pathology; it is a normal cognitive process. The only difference lies in the intensity of the process and its consequences. In 1975, Allan Fenigstein, Michael Scheier, and Arnold Buss described a stable trait they called public self-consciousness—the tendency to focus on how one appears to others ^[4]. This is also not a diagnosis, but a continuum: everyone falls somewhere on a scale from minimal concern about others' opinions to maximum concern.

Recent studies have shown that reactions to the self-view predictably depend on a person's position on this continuum. For people with high public self-consciousness, the self-view worsens their attitude toward video calls, increases anxiety, and decreases satisfaction. For the rare, lucky few with low public self-consciousness, it can even be helpful as purely technical feedback: making sure the camera is on, the lighting is fine, and there's no clutter in the frame ^[5]. There is no universal recipe. But because the self-view is enabled by default for everyone, it creates an excessive burden for a massive portion of users—the exact portion for whom observing themselves triggers the vicious cycle described above.

The Double Weakening of Empathy

There is another hidden cost to this shift in consciousness that is difficult to notice from the inside, but is felt by everyone in the conversation: the loss of empathy. Video inherently weakens the function of mirror neurons, and the self-view delivers a secondary blow by draining resources away from them.

Mirror neurons—a group of neurons that activate both when you perform an action and when you observe someone else performing that same action—form the neurophysiological basis of our ability to understand others' emotions and intentions. They provide what is sometimes called motor resonance: when you see your conversation partner wince in pain, your brain briefly simulates that pain. This mechanism is essential for the rich social interactions characteristic of primates.

The problem is that the video format naturally weakens this mechanism. Studies on primates have shown that out of 123 mirror neurons in the F5 area, only 43% reacted to a video with the same intensity as they did to a live action ^[6]. Video is, in a sense, an impoverished signal: flat, delayed, and devoid of spatial depth. The mirror neurons still work, but their firing is muted.

The self-view adds a second layer of suppression. Mirror neurons activate when observing others. When attention shifts to one's own image, a different system activates: the self-referential network (the medial prefrontal cortex, posterior cingulate cortex, and insula). These two systems—processing others' actions and processing one's own image—compete for resources ^[7]. The self-view redirects a portion of the incoming signal from the first system to the second.

The result is a double weakening: the video format degrades the signal reaching the mirror neurons, and the self-view distracts the brain from whatever signal remains. This explains a very specific sensation that many people describe: video calls feel emotionally "empty," "fake," or "exhausting"—even though no one can pinpoint exactly why. The content is the same, the people are the same, but something is missing. What’s missing is resonance. The very feeling of contact that makes people want to communicate face-to-face in the first place.

The Camera as an Unreliable Witness

In 1972 (coincidentally, the same year Duval and Wicklund published their theory), Daryl Bem introduced Self-Perception Theory ^[8]. Its premise is simple but counterintuitive: we learn about our own emotions not (only) from the inside out, but by observing our own behavior. If I am smiling, it means I am amused. If my posture is rigid, it means I am anxious. To a certain extent, we are forced to "read" our own states based on external cues, exactly as we do with the people around us. (I suspect the popular psychology advice to smile at yourself in the mirror every morning was born from Bem's concept).

The self-view turns this mechanism into a trap. The camera is an unreliable witness: as we've established, the wide-angle lens distorts proportions, poor lighting adds harsh shadows, and low resolution erases nuance. A person looks at the screen, sees a tired, tense face, and—following Bem's mechanism—concludes: "I am tired and tense." This conclusion, in turn, amplifies the actual physical sensation of fatigue and tension. You notice signs of exhaustion on the screen → you subjectively feel more exhausted → your facial expression becomes genuinely more exhausted → you see this on the screen. It is another closed loop.

The Betrayal of Interoception

Constantly watching yourself strikes another target: interoception—the ability to read the signals of your own body, such as heartbeat, breathing, muscle tension, hunger, and fatigue.

In normal life, as trivial as it sounds, we learn about our physical state primarily from within. We feel tired not because we look in a mirror and see a tired face, but because we register heaviness in our limbs, a slowing of our thoughts, and a desire to close our eyes. Interoception is an internal feedback channel, and it functions as long as our attention is at least partially directed inward. (Psychotherapists frequently work both with clients who cannot interpret their bodily signals and with clients who fixate on them excessively, such as constantly checking their pulse).

The self-view forcefully yanks attention outward—onto the screen, the image, the visual representation of the self. This is an exteroceptive fixation: a person learns about their state from the outside rather than the inside. Hours of fixation on an external image gradually displace interoceptive contact with the body. A person stops noticing that they are sitting in an uncomfortable position, that their shoulders are hiked up to their ears, or that their breathing has become shallow. They see how they look, but they no longer feel how they feel.

Reduced interoceptive awareness is a known neurophysiological predictor of anxiety disorders ^[9]. The self-view, therefore, does not merely provoke anxiety—it actively degrades the very system designed to help regulate it.

A Workday in Front of a Digital Mirror

In the classic mirror experiments described in Chapter 1, the exposure time was long enough to alter behavior, but not long enough to alter the brain. The self-view on a video call is a different story entirely. For many professionals, four, six, or even eight hours of video conferencing a day is the norm, not the exception.

The neuroscience of chronic stress shows that persistent cognitive overload is more than just fatigue. Chronic strain on the prefrontal cortex (the brain region responsible for executive functions, emotional regulation, and voluntary attention) can lead to measurable changes: a reduction in long-term potentiation (LTP), which is foundational for memory and learning, and a simplification of the dendritic architecture of neurons ^[10]. The prefrontal cortex is vulnerable to chronic stress even in adults—and this is exactly the region being taxed every time you suppress the urge to look at your self-view, every time you force your attention back to the speaker, and during every cycle of "evaluate self → suppress evaluation → evaluate self again."

To be clear: no studies have yet directly linked Self-View Fixation (SVF) to long-term neuroplastic changes. However, the pattern of exposure (daily, multi-hour sessions involving chronic activation of the self-referential network and the suppression of automatic reflexes) is consistent with what neuroscience categorizes as an ecological stressor capable of provoking microstructural changes in neural networks. For now, this remains the author's hypothesis.

Three Vicious Cycles

Everything described above can be systematized into three self-sustaining mechanisms—three vicious cycles, each triggered by the self-view and perpetuated by it.

1. The Anxiety Cycle (The Clark-Wells Loop). Adapted for video calls: Self-view activates SFA → the person views themselves from the observer perspective → cognitive distortions activate ("I look ridiculous," "Everyone sees it") → safety behaviors kick in (controlling facial expressions, fixing hair, monitoring the self-view) → the safety behaviors sustain the SFA → the cycle is closed. This is the primary loop for individuals with high public self-consciousness and social anxiety. It does not fade with exposure; it intensifies.
2. The Dysmorphic Cycle. Self-view fixates attention on specific facial features → selective attention creates a magnifying effect ("My nose is huge. How did I never notice this?") → emotional reasoning takes over ("I feel like I look terrible, therefore I really do look terrible") → increased scrutiny → discovery of new "flaws" → the cycle is closed. This parallels clinical descriptions of compulsive mirror-checking in Body Dysmorphic Disorder, but it is being triggered in people who had no issues with their appearance prior to the era of endless video calls. (We will discuss the early 2020s spike in plastic surgeries, likely associated with the self-view, later on).
3. The Neurocognitive Cycle. Self-view automatically hijacks attention (the mechanism from Chapter 2) → resources for processing the conversation partner drop → the person struggles to follow the conversation → a feeling of lost control arises → compensatory return of gaze to the self-view (the only "familiar" and predictable element on the screen) → the hijacking intensifies → the cycle is closed. This cycle operates even in people without anxiety or dysmorphic concerns; it is purely automatic. (As a reminder, we possess no evolutionary immunity to this third communication channel). This cycle explains why even those completely satisfied with their appearance still find their eyes glued to their own window.

These three cycles are not mutually exclusive. A single person might experience two or all three simultaneously. But generally, one cycle takes the lead. Identifying which one is driving your behavior is the goal of Part II of this book, where we will break down the seven motives for fixating on the self-view.

Asymmetry: The Listener and the Speaker

There is one more crucial detail to consider: not all roles on a video call are equal.

When you are presenting or speaking at a meeting, the self-view can function as a genuine feedback tool: you can see your gestures, ensure you haven't drifted out of frame, and calibrate your delivery. This is active monitoring integrated into an action. It is cognitively expensive, but justifiable by rational needs. (Though, as a psychologist and lecturer, I would still highly recommend looking at engaged participants instead—their reactions and support are far more informative and useful than your own reflection).

But when you are listening, the dynamic flips entirely. You are not performing any action that requires visual feedback. It is entirely in your best interest to focus on external reality: the words, intonations, and facial expressions of the speaker and other participants. Yet, the self-view remains active, and you have no functional reason to look at it. The cognitive load it generates during this time is a pure deficit. The latest research confirms this asymmetry: for a listener, the self-view exacts a maximum cognitive toll while providing minimum utility ^[11]. Hide it.

The Additional Burden on Women

In the previous chapter, we mentioned the gender paradox found in the data of Whelan and Xu: EEG scans show no difference between men and women in the neurophysiological load caused by the self-view, yet subjective reports consistently show that women experience more Zoom fatigue, are more frequently dissatisfied with their appearance, and fixate on their image more often.

With an understanding of the vicious cycles, this paradox now has an explanation. The neurophysiological load is identical—but the interpretation is different. Social norms, which place far stricter appearance demands on women, channel the experience of cognitive overload into a specific narrative: "I am exhausted because I look bad." Men experience the exact same overload but describe it differently: "I'm just tired. My head hurts. I probably didn't sleep enough."

The Anxiety Cycle (Clark-Wells) and the Dysmorphic Cycle activate more frequently and intensely in those for whom evaluating their appearance is a habitual way of interpreting any physical discomfort. The Neurocognitive Cycle operates equally in everyone—but it is noticed less often because its symptoms are less specific.

This is not a biological gender issue, but a social one—yet its consequences are highly concrete. Women on video calls are not "weaker" or "more sensitive"; their brains react identically to men's. But the interpretive mechanism instilled by culture translates the exact same neurophysiological load into different subjective experiences—and, consequently, feeds different vicious cycles.

Let us return to Nelly. She is an expert psychotherapist. She knows exactly what self-focused attention and the observer perspective are. She understands the Clark-Wells model and uses it in her practice. And yet, she still catches herself staring at her own face precisely when her client needs her undivided attention. Has she lost her professionalism? No. This is the predictable reaction of a nervous system to a stimulus against which it has no defense.

Therapeutic contact, the psychologist's primary tool, is sabotaged by a rectangle in the corner of the screen. The therapist tries her best, but the self-view sets three loops in motion at once. The Anxiety Cycle: "Do I look empathetic enough?" The Neurocognitive Cycle: her gaze automatically darts to herself, and she misses a client's micro-expression. Bem's self-perception loop: she sees her tense face on the screen and feels even more tense.

This concludes Part I. We have journeyed from classic mirror experiments (Chapter 1) through the neurobiology of attention hijacking (Chapter 2) to the mechanism of shifting consciousness and the three vicious cycles (this chapter). The overarching conclusion is this: the self-view does not merely distract us; it recalibrates the very optics through which we perceive ourselves during communication. Mirrors have always done this, but usually in brief intervals. The self-view does it chronically.

Up to this point, we have discussed the mechanisms—how it works. We have not yet asked why a specific person looks at themselves. As it turns out, the motives vary wildly. One person monitors their facial expressions out of a fear of negative judgment, while another hunts for flaws they never noticed in a bathroom mirror. For a third, the familiar little window becomes a refuge from the intimidating gaze of others, and someone else might simply be unable to look away because their own moving face is an insurmountable distractor. The underlying motive determines which vicious cycle is driving the bus—and, consequently, how to stop it.

Part II of this book outlines seven such motives. Before we break them down and help you find your profile, I invite you to conduct a brief self-assessment. This scale will help objectify your experience and show you exactly where you stand.


The SVF-7 Self-Test

Reflect on your experience with video calls over the past few months. Rate each statement from 1 to 5.

1 · Never   2 · Rarely   3 · Sometimes   4 · Often   5 · Almost always

During video calls, I frequently look at my own image for extended periods.
My gaze slides back to my own face automatically, even when I am trying to watch the speaker.
The presence of my face on the screen makes it difficult for me to fully concentrate on what others are saying.
After long video calls, I feel a specific kind of exhaustion or depletion that I do not experience after in-person meetings.
After a call, I sometimes struggle to remember details of the conversation because a portion of my attention was spent observing myself.
Seeing my own face on the screen regularly causes me background tension, anxiety, or dissatisfaction.
I feel that without the self-view window, things would be easier, but I hesitate (or don't want) to hide it.
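
The scoring of the self-test can be sketched in a few lines of Python. This is a minimal sketch, not the book's own implementation: it assumes the result is the plain sum of the seven ratings (a range of 7 to 35) and uses the cutoff of "over 18 points" that the text mentions when introducing Part II. The function and constant names are hypothetical, chosen for illustration.

```python
from typing import Sequence

SVF7_ITEMS = 7        # number of statements in the self-test
SVF7_THRESHOLD = 18   # the text refers to scoring "over 18 points"


def svf7_score(ratings: Sequence[int]) -> int:
    """Return the total score, ASSUMING the scale is a plain sum of ratings."""
    if len(ratings) != SVF7_ITEMS:
        raise ValueError(f"expected {SVF7_ITEMS} ratings, got {len(ratings)}")
    for rating in ratings:
        # Each statement is rated on the 1 (Never) to 5 (Almost always) scale.
        if not isinstance(rating, int) or not 1 <= rating <= 5:
            raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
    return sum(ratings)


def svf7_above_threshold(ratings: Sequence[int]) -> bool:
    """True when the total is strictly greater than the 18-point cutoff."""
    return svf7_score(ratings) > SVF7_THRESHOLD
```

Under these assumptions, answering "Sometimes" (3) to every statement gives 21 points, which is above the cutoff, while a total of exactly 18 is not.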


References

[1] Duval, S., & Wicklund, R. A. (1972). A Theory of Objective Self-Awareness. New York: Academic Press.

[2] Clark, D. M., & Wells, A. (1995). A cognitive model of social phobia. In R. G. Heimberg et al. (Eds.), Social Phobia: Diagnosis, Assessment, and Treatment (pp. 69–93). New York: Guilford Press.

[3] Clark, D. M., & Wells, A. (Ibid.) Video feedback as a therapeutic tool is detailed in Clark's work on social anxiety disorder therapy; George, S., & Stopa, L. (2008) demonstrated that live video reflection increases anxiety and public self-awareness.

[4] Fenigstein, A., Scheier, M. F., & Buss, A. H. (1975). Public and private self-consciousness: Assessment and theory. Journal of Consulting and Clinical Psychology, 43(4), 522–527.

[5] Ratan, R. et al. Studies on the link between public self-consciousness and reaction to self-view (Wayne State University); Kuhn (WSU): "One-size-fits-all does not work."

[6] Data on mirror neuron responses to video stimuli are derived from primate studies; the suppression of the mu rhythm (a marker of mirror neuron system activation) is significantly weaker during video observation than live observation.

[7] The conflict between the mirror neuron system (processing others' actions) and the self-referential network (mPFC, PCC) is described within the neurophysiology of social cognition; see reviews by V. S. Ramachandran, G. Rizzolatti.

[8] Bem, D. J. (1972). Self-perception theory. In L. Berkowitz (Ed.), Advances in Experimental Social Psychology (Vol. 6, pp. 1–62). New York: Academic Press.

[9] For a fundamental review on the link between reduced interoception and clinical conditions (including anxiety and eating disorders), see: Khalsa, S. S., et al. (2018). Interoception and Mental Health: A Roadmap. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging, 3(6), 501–513. Empirical data showing how exteroceptive manipulation (observing one's body via video) is negatively related to interoceptive awareness can be found in: Bekrater-Bodmann, R., Azevedo, R. T., Ainley, V., & Tsakiris, M. (2020). Interoceptive Awareness Is Negatively Related to the Exteroceptive Manipulation of Bodily Self-Location. Frontiers in Psychology, 11, 562016.

[10] The impact of cognitive overload and stress on synaptic plasticity (LTP) and the architecture of the medial prefrontal cortex is detailed in: Fagiani, F., et al. (2022). Long-term memory, synaptic plasticity and dopamine in rodent medial prefrontal cortex: Role in executive functions. Frontiers in Behavioral Neuroscience, 16. On the connection between the loss of dendritic spines, reduced LTP in the prefrontal cortex, and the development of depressive symptoms, see the classic review: Duman, R. S., Aghajanian, G. K., Sanacora, G., & Krystal, J. H. (2016). Synaptic plasticity and depression: new insights from stress and rapid-acting antidepressants. Nature Medicine, 22(3), 238–249.

[11] The integration of the sender-receiver framework with the consumption of self-referential information in virtual meetings is presented in: Abramova, O., Gladkaya, M., & Krasnova, H. (2024). The differential effects of self-view in virtual meetings when speaking vs. listening. European Journal of Information Systems. The authors compellingly demonstrate that the cognitive cost and consequences of SVF differ radically depending on whether the participant is currently speaking or listening.

In the first part, we dismantled the mechanism: what the presence of your own face on a screen does to the brain, how it hijacks attention, depletes the cortex, and shifts consciousness from an "I am communicating" mode to an "I am observing myself" mode. This mechanism is universal—it operates in everyone. But the motives driving people to look at the self-view differ radically.

Imagine five people sitting in a meeting. All five are looking at themselves, but their motives vary completely. One is anxiously checking to make sure she doesn't look terrified. Another is tortured by the thought that the wide-angle lens has enlarged his nose. A third uses his little window as a safe haven from the intense pressure of a dozen other people's gazes. The fourth is currently giving a presentation and evaluating her own persuasiveness, while the fifth physically cannot look away because, due to his attention deficit, a moving image is an insurmountable distractor.

A single behavioral act—looking at the self-view—triggers different vicious cycles for these people and requires entirely different solutions. This is exactly why the universal advice to "just don't look" or "just turn it off" is highly unlikely to work: it completely ignores why the person cannot pull themselves away. At this stage, the person understands neither the root of the problem nor the benefit of giving up their "favorite toy" to solve it.

We have identified seven consistent motives—seven reasons people look at themselves. Each gets its own chapter. The archetypes are arranged on a spectrum from the most anxiety-driven to the most neurocognitive: The Controller, The Hider, The Objectified, The Performer, The Face-Saver, The Fascinated, and The Overwhelmed.

One person may have a primary motive and one or two secondary ones. Identifying yours is the first step toward stopping the self-view from dictating your attention. If you scored over 18 points on the SVF-7 scale, the following chapters will serve as your navigator. We will break down exactly which of the seven mechanisms is devouring your cognitive resources and how to disable this automatic response.