The Protocol

Chapter 12

Concrete Actions Across Three Levels of Responsibility

Moving to practical recommendations requires a clear division of responsibilities. An executive in charge of occupational health should not have to explain the neurobiology of motivation to every employee; they need a functional protocol that accounts for varying levels of employee engagement.

An employee who hides their self-view but works in a "cameras-on-by-default" culture will likely turn it back on within a week. An organization that implements a "cameras optional" policy (a wonderful global trend!) cannot change the interface of a third-party video platform that turns the self-view back on by default with every update. A platform that (miraculously!) changes its default view settings won't help an employee whose manager demands, "Turn on your camera."

A sustainable solution lies at the intersection of three levels: the individual, the organization, and the platform. Each is necessary, and none is sufficient without the others.

In the previous chapter, we approached the boundary: dissociation is the limit where self-view fixation becomes a clinical problem. Everything described in Parts I and II—from attention hijacking to vicious cycles, from anxious control to sensory capture—is reversible. But to reverse it, you have to act.

Each of the seven archetypes described in the second part has its own primary motive, its own vicious cycle, and consequently, its own set of tools. Before diving into specific recommendations, it makes sense to look at the big picture.

The Diagnostic Map

Below is a summary guide. It links each archetype with what can be done right now (a concrete action), a long-term strategy (including at the organizational level), and markers indicating it is time to consult a specialist.

The Controller (evaluation anxiety, fear of a "facial slip").
Right now: Hide the self-view for one meeting and write down what happens.
Long-term: A series of behavioral experiments using Clark's protocol—predict, drop the safety behavior, test the reality ^[1].
See a specialist: If pre-call anxiety interferes with work or provokes avoidance (systematically turning off the camera, skipping meetings).
The Hider (nonverbal overload, exhaustion from other faces).
Right now: Switch to Speaker View.
Long-term: Introduce breaks between calls (at least ten minutes completely off-screen), limit Gallery View.
See a specialist: If video calls are regularly followed by emotional numbness or a need for multi-hour isolation to recover.
The Objectified (appearance fixation, camera dysmorphia).
Right now: Hide the self-view. Remind yourself: a short-focal-length camera physically distorts facial proportions—it widens the nose and rounds the face ^[2]. What you see on the screen is not what people see in real life.
Long-term: Limit "mirror time" (not just on video calls, but in daily life), restore interoceptive contact with the body.
See a specialist: If the "defect" found on camera begins to occupy your thoughts outside of calls, or if you feel the urge to consult a cosmetologist or surgeon about something you only noticed on the screen.
The Performer (impression management, splitting into actor, director, and audience).
Right now: Choose one task—to be or to appear. Try the former in your next meeting.
Long-term: Realize that the audience for your performance is largely absent; your colleagues are busy looking at their own self-views, not yours. Shift focus from impression management to content.
See a specialist: If preparing for video calls takes a disproportionate amount of time and is followed by mounting post-event rumination ("Why did I say that?", "Why doesn't anyone listen to me?", "Why do I always mess up my presentations?").
The Face-Saver (cultural pressure, fear of losing "face" in front of the group).
Right now: If the context allows, hide the self-view and turn off the camera when not speaking. If it doesn't allow it (due to cultural norms or group expectations), shrink your window to the absolute minimum size.
Long-term: Normalize "audio-only" meetings within the team; discuss how different participants may require different camera settings. Advocate for culturally sensitive video call policies.
See a specialist: If the fear of "losing face" on a video call generalizes to other situations or is accompanied by somatic symptoms (headaches, facial muscle tension).
The Fascinated (pleasure from one's own image, dopamine loop).
Right now: Conduct one video call without the self-view and track whether the quality of your contact with the speaker changes.
Long-term: Realize the cost—pleasure drains the exact same cognitive budget as anxiety. Find alternative sources of positive reinforcement not tied to your own image.
See a specialist: If "pleasant" has turned into "cannot go without"—a compulsive pattern you are unable to break on your own.
The Overwhelmed (ADHD, neurodivergence, sensory capture).
Right now: Hide the self-view. For this archetype, this is a strict recommendation and a necessary, often sufficient, step: your own moving face is an insurmountable distractor when executive control is already taxed.
Long-term: Speaker View instead of Gallery View, tactile stimulation (a fidget spinner, a stress ball, etc.), off-camera movement, shortening the duration of video calls.
See a specialist: If the attention issue on video calls is part of a broader picture (difficulties with concentration, organization, completing tasks) that has never been professionally evaluated.

This map is your navigator. Find your primary archetype (or two), pinpoint your starting point, and begin there. The rest of the protocol is detailed below across the three levels of responsibility.

Level One: What Everyone Can Do

Use the SVF-7 scale as a baseline tool: measure your score before implementing the protocol, and again after two weeks of conscious practice. Usually, the subjective feeling of "this got easier" is backed up by a drop in points on items 4 and 5 of the scale (exhaustion and loss of context).
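For readers who like to keep numbers, the before-and-after comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `svf7_change` is hypothetical, the scale is assumed to have seven items (as its name suggests), and the total is assumed to be a plain sum of item scores. Use the scoring rules given with the scale itself if they differ.

```python
def svf7_change(baseline, follow_up):
    """Compare two SVF-7 administrations (assumed: seven item scores each).

    Returns the change in the total and per item (follow-up minus baseline),
    so negative numbers mean the score went down.
    """
    if len(baseline) != 7 or len(follow_up) != 7:
        raise ValueError("expected seven item scores per administration")

    # Items are numbered 1..7 to match how the text refers to them.
    per_item = {
        number: after - before
        for number, (before, after) in enumerate(zip(baseline, follow_up), start=1)
    }
    return {
        "total_change": sum(follow_up) - sum(baseline),
        "per_item": per_item,
        # Items 4 and 5 (exhaustion and loss of context) are the ones
        # the text singles out as most likely to drop.
        "items_4_5_change": per_item[4] + per_item[5],
    }
```

Calling `svf7_change(scores_before, scores_after)` with the two lists of answers shows at a glance whether any improvement is concentrated in items 4 and 5.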

The following recommendations can be implemented at the individual level and require no organizational approval or permission.

First and foremost—hide the self-view. This option exists on all major video conferencing platforms; it takes two or three clicks. In Zoom, for example, right-click your own window and select "Hide Self View." Your image will continue to be broadcast to everyone else—it is only hidden from you. No one but you will know you did this.

Second—as an intermediate alternative, utilize the sender-receiver asymmetry. According to research by Olga Abramova, Margarita Gladkaya, and Hanna Krasnova ^[3], the self-view causes the most damage when you are in the listener role—when you are passively observing yourself without an active task to perform. The authors primarily measured subjective metrics: meeting satisfaction and perceived productivity. They found that for a listener, the self-view significantly worsens these metrics. For a speaker, the negative effect was much less pronounced, and in some cases, the self-view even slightly increased satisfaction. Therefore, the authors conclude that you should turn the self-view off while listening, and turn it on while speaking. Jeremy Bailenson ^[6] previously made a similar recommendation, noting that the self-view can serve as useful feedback for a speaker.

However, an important caveat is needed here. The studies cited above evaluated the subjective experience of the meeting (satisfaction and perceived productivity), not objective communicative efficacy. For efficacy, your main task when presenting on a video call is still to stay in live contact with your audience and rely on their feedback rather than on your own image: reading micro-expressions, nods, and the listeners' level of engagement. That feedback is what lets you gauge how well your message is landing and keep the dialogue natural. If you check your own "director's monitor" too often, it is easy to lose genuine contact with the audience and slide into a performance. The most sensible approach is therefore this: when listening, keep the self-view hidden almost always; when speaking, you can briefly turn it on to calibrate at the beginning or before a key moment, then hide it again and focus entirely on the participants' reactions.

Third—Speaker View instead of Gallery View. Gallery View places your face in a grid with dozens of others—ideal conditions for upward social comparison and sensory overload. Speaker View shows only the person talking, eliminating visual noise. For the Hider and the Overwhelmed, this is absolutely crucial.

Fourth—Shrink the self-view window or the entire video app. Not all tasks require full-screen mode. A smaller window physically shrinks the self-view to a few square centimeters, reducing the power of the automatic attention hijack.

Fifth—Audio pauses. If the meeting agenda allows, turn off your camera for five to ten minutes and switch to audio-only mode. This advice might seem counterintuitive: doesn't video aid communication? Research from Carnegie Mellon University provides a surprising answer. Groups working in an audio-only format demonstrate higher collective intelligence than groups with their cameras on! ^[4] The mechanism is prosodic synchrony: when participants cannot see each other, they begin to tune in much more precisely to the tone, rhythm, and pauses in the speaker's voice. The visual channel, which seemingly enriches communication, actually suppresses the more nuanced auditory channel—and distracts participants with appearances (their own and others') instead of content. Five minutes of audio in the middle of an hour-long meeting is an entirely justified mode switch that gives the brain a break from processing faces.

Sixth—Movement. Briefly turning off the camera for a physical stretch should become a normalized practice during video meetings. Any movement restores interoceptive contact and reduces the cognitive load of forced immobility ^[6]. During this time, the resources of the cortex are redistributed. Standing up and taking three steps across the room during a speaker transition or a presentation hiccup is an action that would be completely normal in a physical meeting room, yet somehow feels inappropriate on a video call. Allow yourself this "indecency" as often as possible—or at least do it while your camera is off for a minute (who knows—did you do it on purpose, or was there a connection glitch?).

Seventh—The behavioral experiment. We detailed this in the chapter on the Controller, but it applies to any anxiety-driven archetype. The format: formulate a concrete prediction about what will happen without the self-view, run one meeting without it, and check reality against your prediction.

The Objectified's prediction: "Without the self-view, I'll obsess over my nose even more because I won't be able to check it." The Performer's prediction: "Without the self-view, I'll definitely be unexpressive and fail to convey my idea." The Face-Saver's prediction: "Without the self-view, I won't know how my face looks to the group, and it will be unbearable." Reality testing almost always refutes all three. But to find out, you have to run the experiment. The help of a CBT therapist in designing and interpreting the experiment can be highly beneficial. In self-help, as opposed to formal therapy, the biggest bottleneck is usually actually doing the practice.

Finally, in many cases, the principle of delayed analysis applies. If you find it difficult to give up the self-view due to concerns about the quality of your self-presentation, replace live monitoring with a post-factum video audit (if recording is permissible). This removes the cognitive load in the moment but leaves you the opportunity to review your mistakes later. As practice shows, after just two or three of these iterations, the urge to watch yourself in real-time begins to drop.

Level Two: What the Organization Can Do

Individual actions are necessary but insufficient. A person who hides their self-view but works in a culture where six to eight hours of video calls a day are considered the norm is only solving a fraction of the problem. Organizational policies create the environment in which individual solutions become sustainable.

First—"Cameras Optional." A "cameras-on-by-default" policy has zero scientific backing. Not a single peer-reviewed study confirms that having cameras on increases meeting productivity or participant engagement. However, data on the cognitive cost of the video format certainly exists, and it is staggering: fifteen minutes to EEG-measurable exhaustion (the Graz experiment); an alpha rhythm that stays continuously elevated for twenty minutes when the self-view is on (Whelan et al., 2024) ^[5]. A camera is a tool, not a sacred duty. Declaring cameras optional does not mean banning them. It means delegating the decision to the level where it is made most intelligently—to the individual employee, who knows best whether they currently need the camera for effective communication.

Second—Audio by default for certain meeting types. Status updates, quick syncs, and operational check-ins do not require video. Transitioning these formats to audio-only is not a loss; it is a cognitive unburdening.

Third—Video time limits. No more than four hours of video conferences a day. This is the threshold beyond which cumulative cognitive load ceases to be offset by the recovery time between calls. The Graz experiment showed that exhaustion begins fifteen minutes into a single video call; Whelan's EEG data showed it does not decline over at least a twenty-minute span. Extrapolating this data to a six-hour workday on camera explains exactly why remote employees feel utterly drained by Friday, and why their productivity plummets. Between video calls, there must be a minimum of ten minutes off-screen. (To reiterate: not ten minutes of checking emails or social media, but ten minutes strictly away from the screen).

Fourth—Psychoeducation for managers. A manager who insists on cameras being on "for engagement" is typically operating on intuition, not data. Their intuition says: "If I see faces, people are working; if I don't, maybe they aren't." The first part of this book can supplement that intuition with hard science. A manager who learns that the self-view depletes the cerebral cortex in fifteen minutes, and that the alpha rhythm never habituates to this stimulus, should probably stop demanding non-stop camera usage during two-hour planning meetings. Especially because they will now understand that what looks like engagement (intense faces staring at the screen) might be its exact opposite (people consumed by monitoring their own faces, completely checked out of the conversation).

Fifth—Cultural sensitivity in international teams. As we demonstrated in the chapter on the Face-Saver, for employees from collectivist cultures, an active camera might signify not "engagement," but the relentless pressure of group observation. A "cameras optional" policy must be genuine—without hidden penalties for keeping the camera off. This is not just a matter of ergonomics; it is a matter of internal cross-cultural corporate policy.
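The time limits from the third recommendation above (a four-hour daily cap and at least ten minutes off-screen between calls) are simple enough to check automatically against a calendar export. The sketch below is one possible way to do it, not a description of any existing tool: the function name `audit_video_day` and the input format (a list of start and end times for one day's video calls) are assumptions made for illustration.

```python
from datetime import datetime, timedelta

DAILY_CAP = timedelta(hours=4)      # daily video cap proposed in the text
MIN_BREAK = timedelta(minutes=10)   # minimum off-screen gap proposed in the text


def audit_video_day(calls):
    """Audit one day's video calls, given as (start, end) datetime pairs.

    Returns the total time on video, whether it exceeds the daily cap,
    and every gap between consecutive calls that is shorter than MIN_BREAK.
    Overlapping calls produce a negative gap and are reported as well.
    """
    calls = sorted(calls)  # order by start time
    total = sum((end - start for start, end in calls), timedelta())
    short_breaks = [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(calls, calls[1:])
        if next_start - prev_end < MIN_BREAK
    ]
    return {
        "total": total,
        "over_cap": total > DAILY_CAP,
        "short_breaks": short_breaks,
    }
```

A team lead could run something like this over a week of calendar data to see which days exceed the cap and where back-to-back calls leave no real recovery time. The check only measures scheduled time; whether a break was actually spent away from the screen is something no calendar can tell you.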

Level Three: What Platforms Can Do

Level One and Two solutions require awareness of the problem and an exertion of willpower—either from the individual or the organization. Level Three solutions should work by default, requiring no awareness or effort from the users. And, realistically speaking, they can only be initiated by the platforms themselves. What could these solutions look like, and in what directions should society and regulators nudge video conferencing providers?

First and foremost—a "Self-View Off by Default" policy. At the time of writing, all major corporate video conferencing platforms—Zoom, Microsoft Teams, Google Meet, Cisco Webex—enable the self-view by default. The user sees their own face from the very first second of the call and must take deliberate action to hide it. Most users don't even know this option exists—and that is not their fault; it is a consequence of interface design. The self-view was originally added as a technical feature: to check the camera, lighting, and framing. It was left on because it was easier—users could see their camera was working and wouldn't bother tech support. The psychological consequences of this decision were never considered. They became obvious later and are systematized in Parts I and II of this book. Inverting the logic—showing the self-view only upon request, or at least explicitly offering the choice to show it or hide it upon joining—does not require massive engineering effort. It is largely a matter of marketing and the public's lack of awareness of the problem.

Second—A Self-View Timer. If a user turns on the self-view, the platform could automatically hide it after thirty to sixty seconds. This is plenty of time for a technical check, but not enough time to establish a vicious cycle. The user could always turn it back on—but every activation would be a conscious decision, rather than a passive drift.

Third—Alternative forms of feedback. The self-view in its current form is a blunt instrument: a full-resolution, real-time mirror image of your face. For the task of "making sure everything is okay," this level of detail is excessive and harmful. A schematic silhouette instead of a mirror reflection would be entirely sufficient. Any format that conveys technical framing information without feeding the brain a highest-priority self-relevant visual stimulus would work. The difference between a self-view as a technical tool and a self-view as a digital mirror is the difference between a functional dashboard gauge and an endless hall of mirrors. The job of a designer who cares about the digital hygiene of their product is to provide the former without creating the latter.

Fourth—Adaptive Gallery View. The current Gallery View throws all faces into a single grid, including your own. An adaptive version could: automatically exclude your face from the grid; shrink it relative to the others; or move it to a de-emphasized position. Any of these solutions reduces the power of the automatic attention hijack described in Chapter 2.

Fifth—Duration notifications. "You have been viewing your self-view for more than 1 minute." A soft, unobtrusive nudge, similar to screen-time notifications on mobile operating systems. The eye-tracking data from Ariss and Fairbairn (2024) proved that the gap between where people think they are looking and where they actually look is robust and reproducible ^[7]. A person who doesn't realize how much time they spend staring at their self-view cannot make an informed decision to hide it. The notification doesn't solve the problem—it makes the problem visible. And a visible problem, as the entire history of Cognitive Behavioral Therapy has taught us, is a problem you can begin to solve.
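To show how little logic the self-view timer and the duration notification would need, here is a minimal sketch. Everything in it is hypothetical: the names `SelfViewTimer` and `first_nudge_time`, the 45-second default (picked from the thirty-to-sixty-second range above), and above all the input to the nudge, which assumes the platform has some estimate of when the user is looking at the self-view. It also assumes the one-minute threshold is cumulative over the call rather than continuous. No existing platform API is implied.

```python
class SelfViewTimer:
    """Auto-hides the self-view a fixed time after the user shows it."""

    def __init__(self, hide_after_s=45.0):
        self.hide_after_s = hide_after_s  # assumed default, within 30-60 s
        self._shown_at = None             # None means the self-view is hidden

    def show(self, now_s):
        """User explicitly turns the self-view on (a conscious decision)."""
        self._shown_at = now_s

    def hide(self):
        """User hides the self-view manually."""
        self._shown_at = None

    def is_visible(self, now_s):
        """Return True while the self-view should still be displayed."""
        if self._shown_at is None:
            return False
        if now_s - self._shown_at >= self.hide_after_s:
            self._shown_at = None  # timer expired: hide until shown again
            return False
        return True


def first_nudge_time(samples, threshold_s=60.0):
    """Find when cumulative self-view gaze time first exceeds the threshold.

    samples: time-ordered (timestamp_s, on_self_view) pairs from a
    hypothetical gaze estimator. Each sample's state is assumed to hold
    until the next sample. Returns the timestamp of the sample at which
    the threshold is first exceeded, or None if it never is.
    """
    total = 0.0
    prev_t, prev_on = None, False
    for t, on in samples:
        if prev_t is not None and prev_on:
            total += t - prev_t
            if total > threshold_s:
                return t
        prev_t, prev_on = t, on
    return None
```

In this design the timer never blocks the user: calling `show` again restarts the countdown, so every activation remains a deliberate act. The nudge function only reports a moment in time; how the notification is worded and displayed is a separate design question.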

What Not To Do

Any protocol can be misapplied. Here are a few warnings about approaches that will do more harm than good, addressed first of all to those responsible for their organization's occupational health policy.

Do not ban cameras. A "cameras off for everyone" policy is just as flawed as a "cameras on for everyone" policy. For some participants, the camera is their only way to feel social presence. For others, it is a tool without which they cannot deliver their message (educators, therapists, managers). The goal is to provide a choice, not replace one mandate with another.

Do not stigmatize. Fixation on the self-view is an automatic neurobiological reaction to a high-priority stimulus; it is not a consciously executed "whim." Using shame (e.g., making remarks like, "You're staring at yourself again") only raises anxiety levels and reinforces safety behaviors, sealing the vicious cycle shut completely.

Do not expect instant results. Behavioral experiments work through the accumulation of corrective experience—we detailed this mechanism in the chapter on the Controller. One call without the self-view does not shatter a six-month habit. But it creates the first precedent: the world didn't end, colleagues made no remarks, and the content of the meeting was remembered better. Three calls solidify the precedent. Ten calls establish a new baseline. This is exactly how the correction of anxious beliefs works in CBT: gradually, through the exact experiences that safety behaviors previously blocked. For those expecting instant relief, it is worth remembering: the problem took months or years to form. The solution won't take months, but it won't take minutes either.

Do not force a one-size-fits-all protocol. Remember that different archetypes require different solutions. The Controller needs a behavioral experiment. The Overwhelmed needs radical hiding of the self-view and sensory offloading of the environment. And so on, through chapters 4 to 11. One protocol for all archetypes is impossible—and unnecessary. The diagnostic map at the beginning of the book, and the more detailed one at the start of this chapter, exist so that everyone can find their specific entry point.

Three Levels, One Logic

So, what should a manager do? First, on a personal level: hide your own self-view and observe how the quality of your attention changes during meetings. On a team level: declare cameras optional, introduce a daily cap on video hours, and distribute a memo to employees recommending they hide their self-view, backed by the scientific data. On an infrastructure level: write to the platform's tech support requesting a change in the default settings for your corporate account.

You can begin implementing the protocol from any level, but the individual step remains the most pragmatic—it requires no corporate approvals. Conducting a behavioral experiment (one call with the self-view hidden) allows you to verify the reduction in cognitive load through your own personal experience. Dismantling anxious expectations through practice is the most reliable first step toward reclaiming control over your own attention.

References

[1] Clark, D. M. (2001). A cognitive perspective on social phobia. In W. R. Crozier & L. E. Alden (Eds.), International Handbook of Social Anxiety (pp. 405–430). John Wiley & Sons.

[2] Ward, B., Ward, M., Fried, O., & Paskhover, B. (2018). Nasal distortion in short-distance photographs: The selfie effect. JAMA Facial Plastic Surgery, 20(4), 333–335.

[3] Abramova, O., Gladkaya, M., & Krasnova, H. (2024). The differential effects of self-view in virtual meetings when speaking vs. listening. European Journal of Information Systems.

[4] Woolley, A. W., Chabris, C. F., Pentland, A., Hashmi, N., & Malone, T. W. (2010). Evidence for a collective intelligence factor in the performance of human groups. Science, 330(6004), 686–688. Later Carnegie Mellon studies confirmed the benefits of the audio format for group work, linking it to the restoration of prosodic synchrony.

[5] Whelan, E., et al. (2024). Self-view in video-conferencing and its role in Zoom fatigue: An EEG study. The alpha rhythm remained stably elevated for 20 minutes when the self-view was enabled, showing no signs of habituation (PubMed: 38574294). Graz University of Technology: markers of cognitive fatigue were recorded after 15 minutes of a video call.

[6] Bailenson, J. N. (2021). Nonverbal overload: A theoretical argument for the causes of Zoom fatigue. Technology, Mind, and Behavior, 2(1).

[7] Ariss, S. & Fairbairn, C. (2024). Eye-tracking during videoconference interactions: Self-view fixation and gaze patterns. University of Illinois. Participants systematically returned to the self-view window significantly more often than they reported in subjective accounts.

[8] Rzesnitzek, L. (2013). "Early Psychosis" as a mirror of biological controversies in post-war German, Anglo-Saxon, and Soviet Psychiatry. Frontiers in Psychology, 4, 481.