
What Happens Inside the Brain When Someone Watches Your Ad

We built a tool that predicts real fMRI brain activation in response to video content, using Meta's TRIBE v2 neural encoding model. Here is what the neuroscience means for how you make creative decisions - and why measuring attention is no longer enough.

The problem with attention metrics

For years, video advertising has been measured by a single proxy: did someone watch it? View-through rate, average watch time, completion rate. These metrics tell you that eyes were on the screen. They do not tell you what the brain did with what it saw.

A viewer can watch 30 seconds of your video and retain nothing emotionally meaningful. Or they can glance at four seconds of a scene and have a strong associative memory encoded. The difference is not visible in the analytics dashboard. It is happening inside the brain - in the visual cortex, in the auditory processing regions, in the memory and emotion centers that determine whether your message actually lands.

This gap between viewership and cognitive impact is the fundamental unresolved problem in video creative. And it is why most A/B testing in video advertising tells you very little about why one creative outperforms another.

700+ human fMRI subjects behind the training data for the TRIBE v2 model (Meta AI Research, Algonauts 2025)
20,000 cortical vertices predicted per frame of video (TRIBE v2 neural encoding model)
6 brain regions tracked in real time: visual, auditory, language, emotion, memory, prefrontal (app.publicimpact.ai)

What TRIBE v2 actually does

TRIBE v2 is a neural encoding model developed at Meta AI Research. It was trained on fMRI data from over 700 human subjects watching video content. The model learned, at a very fine resolution, how different visual and auditory signals map to brain activation patterns across the cortex.

When you upload a video to app.publicimpact.ai, the model processes every frame, extracts audio and visual features, runs them through the same encoding pathways learned from real brain data, and outputs predicted BOLD (Blood-Oxygen-Level-Dependent) signals across 20,000 cortical vertices. These are the same signals a neuroscientist would measure in an fMRI scanner - predicted, within minutes, from your video file.
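
To make that concrete, here is a minimal sketch of what a neural encoding model does at prediction time: a multimodal feature vector per frame is mapped through learned weights to one predicted value per cortical vertex. The shapes, the random stand-in features, and the simple linear mapping are illustrative assumptions, not TRIBE v2's actual architecture.

```python
import numpy as np

# Illustrative dimensions only (not TRIBE v2's real internals):
# T video frames, D multimodal features per frame, V = 20,000 cortical vertices.
T, D, V = 300, 1024, 20_000
rng = np.random.default_rng(0)

# In the real model, the features come from audio/visual backbones and the
# weights are fit on fMRI recordings from 700+ subjects. Both are random
# stand-ins here.
features = rng.standard_normal((T, D))          # one feature vector per frame
encoder_weights = rng.standard_normal((D, V))   # learned mapping: features -> vertices
encoder_bias = np.zeros(V)

# Predicted BOLD signal: one value per cortical vertex, for every frame.
predicted_bold = features @ encoder_weights + encoder_bias
print(predicted_bold.shape)   # (300, 20000)
```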

Algonauts 2025 winner: TRIBE v2 achieved the highest predictive accuracy of any model in the Algonauts Challenge 2025 - the leading international benchmark for brain-computer alignment research. This is not a marketing model trained on click data. It is neuroscience research running inside a browser.
Source: Meta AI Research / Algonauts Challenge 2025
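
For readers who want to know what "predictive accuracy" means here: encoding benchmarks like Algonauts typically score a model by correlating its predicted BOLD time series against the measured one, vertex by vertex, on held-out subjects. The sketch below shows that metric in its simplest form; the exact Algonauts 2025 scoring protocol is not reproduced here.

```python
import numpy as np

def encoding_accuracy(predicted, measured):
    """Vertex-wise Pearson correlation between predicted and measured BOLD.

    Both arrays have shape (T, V): T time points, V cortical vertices.
    Returns one correlation per vertex; the mean across vertices is a
    common single-number summary of encoding accuracy.
    """
    pred = predicted - predicted.mean(axis=0)
    meas = measured - measured.mean(axis=0)
    numerator = (pred * meas).sum(axis=0)
    denominator = np.sqrt((pred ** 2).sum(axis=0) * (meas ** 2).sum(axis=0))
    return numerator / denominator

# Synthetic example: with random, unrelated signals the mean correlation is ~0.
rng = np.random.default_rng(0)
predicted = rng.standard_normal((300, 20_000))
measured = rng.standard_normal((300, 20_000))
print(encoding_accuracy(predicted, measured).mean())
```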

The six brain regions that matter for marketing

The tool aggregates the cortical predictions into six Regions of Interest (ROIs) that are directly interpretable for creative decisions (a minimal code sketch of this aggregation follows the list):

ROI 1 (Visual Cortex): Responds to motion, contrast, color, and compositional complexity. Peaks at cuts, movement, and high-contrast visual moments.
ROI 2 (Auditory Cortex): Tracks tonal variation, speech rhythm, music, and background sound. Reveals whether your audio is working with or against your visual.
ROI 3 (Language Areas): Processes spoken words, captions, and written text. Shows which moments of dialogue or narration actually get processed semantically.
ROI 4 (Emotion, Amygdala): The most commercially important signal. Emotional activation predicts memorability, social sharing, and purchase intent more reliably than any click metric.
ROI 5 (Memory, Hippocampus): Encodes episodic context. High memory activation at the moment your brand or product appears is the neurological correlate of brand recall.
ROI 6 (Prefrontal Cortex): Associated with decision-making, evaluation, and intent. Elevated activity here during your CTA is a direct signal of conscious consideration.
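
Mechanically, collapsing 20,000 vertex predictions into these six scores is an averaging step: each ROI is a set of vertices, and the region's time series is the mean prediction across those vertices at each moment. A minimal sketch, using made-up ROI masks in place of the tool's actual atlas:

```python
import numpy as np

N_VERTICES = 20_000
ROI_NAMES = ["visual", "auditory", "language", "emotion", "memory", "prefrontal"]
rng = np.random.default_rng(0)

# Made-up ROI masks: a boolean per vertex marking membership in each region.
# The real tool would use a cortical atlas, not random assignment.
roi_masks = {name: rng.random(N_VERTICES) < 0.05 for name in ROI_NAMES}

# Predicted BOLD for a 60-second video sampled once per second: (time, vertices).
predicted_bold = rng.standard_normal((60, N_VERTICES))

# One time series per ROI: the mean prediction across that region's vertices.
roi_series = {name: predicted_bold[:, mask].mean(axis=1)
              for name, mask in roi_masks.items()}
print(roi_series["memory"].shape)   # (60,) -> one memory value per second
```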

How the analysis works end-to-end

Upload a video to app.publicimpact.ai and the pipeline runs automatically on an A100 GPU (a hedged sketch of the full chain follows the six steps):

1. Transcription: WhisperX extracts the spoken word with timestamp-level precision. Language is detected automatically - German, English, or any other language works without configuration.
2. Feature extraction: TRIBE v2 processes visual frames and audio signals in parallel, extracting the multimodal features the neural encoder was trained on.
3. Neural encoding: The model predicts BOLD activation across 20,000 cortical vertices, frame by frame. In effect, this simulates an fMRI scan of a viewer watching your video.
4. ROI time series: Predictions are averaged over the six brain regions and plotted as time series. You can see exactly when Visual, Emotion, and Memory activation peaks and drops across the video timeline.
5. Brain heatmaps: Nilearn renders glass brain images at the moments of highest and lowest total activation - giving you a visual representation of which cortical areas lit up and when.
6. GPT-4o recommendations: Based on the ROI scores, transcript, and heatmaps, GPT-4o generates a qualitative analysis with specific, actionable recommendations for improving the creative.
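
Chained together, the pipeline is a straight pass through those six stages. The sketch below is an illustrative orchestration only - every helper is a stand-in, and it does not reproduce the production code behind app.publicimpact.ai (WhisperX, TRIBE v2, Nilearn, and GPT-4o are not actually called here):

```python
import numpy as np

N_VERTICES = 20_000
ROI_NAMES = ["visual", "auditory", "language", "emotion", "memory", "prefrontal"]

def transcribe(video_path):
    # Stage 1 stand-in for WhisperX: word-level timestamps.
    return [{"word": "brand", "start": 38.0, "end": 38.4}]

def extract_features(video_path, n_frames=60, n_features=512):
    # Stage 2 stand-in for the multimodal feature extractor.
    return np.random.randn(n_frames, n_features)

def predict_bold(features):
    # Stage 3 stand-in for the TRIBE v2 encoder: features -> vertex-wise BOLD.
    return features @ np.random.randn(features.shape[1], N_VERTICES)

def aggregate_rois(bold):
    # Stage 4: average vertices within each (here random) ROI mask -> (time, 6).
    masks = [np.random.rand(N_VERTICES) < 0.05 for _ in ROI_NAMES]
    return np.stack([bold[:, m].mean(axis=1) for m in masks], axis=1)

def analyze_video(video_path):
    transcript = transcribe(video_path)
    bold = predict_bold(extract_features(video_path))
    roi_series = aggregate_rois(bold)
    # Stages 5 and 6 (Nilearn glass brains, GPT-4o recommendations) are omitted.
    return {"transcript": transcript, "roi_series": roi_series}

result = analyze_video("my_ad.mp4")
print(result["roi_series"].shape)   # (60, 6): one row per second, one column per ROI
```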

What this means for how you make creative decisions

The practical implications are significant. Consider a 60-second brand film. Traditional analysis tells you average watch time and where drop-off occurs. Neural analysis tells you something much more useful: at second 38, when your presenter says the brand name for the first time, does the memory region spike? At second 52, when you show the product, does the emotional region activate? At your call-to-action, does prefrontal engagement rise - or has the video already exhausted the viewer's cognitive resources and produced a flat response?

These are the questions that determine whether a video converts. They cannot be answered by view counts. They can only be answered by understanding what is happening inside the brain of the viewer.
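
One simple way to operationalize those questions: take the ROI time series from the analysis, take the timestamps that matter (first brand mention, product reveal, CTA), and check whether the relevant region rises above its own baseline in a window around each moment. A hedged sketch with synthetic data - the window length and z-score threshold are illustrative choices, not a validated standard:

```python
import numpy as np

def spikes_at(roi_series, t, fps=1.0, window_s=3.0, z_threshold=1.0):
    """Does a single ROI rise above its own baseline around time t (seconds)?

    roi_series: 1-D array with one activation value per frame.
    The window length and z-score threshold are illustrative, not a standard.
    """
    z = (roi_series - roi_series.mean()) / roi_series.std()
    start = int(max(0.0, (t - window_s) * fps))
    end = int(min(len(z), (t + window_s) * fps))
    return bool(z[start:end].max() > z_threshold)

# Synthetic example: 60-second video, memory ROI, brand name spoken at second 38.
rng = np.random.default_rng(0)
memory_series = rng.standard_normal(60)
memory_series[38] += 3.0   # pretend the memory region spikes at the brand mention
print(spikes_at(memory_series, t=38))   # True
```

In practice, the timestamp for the brand mention would come straight from the WhisperX transcript alignment rather than being hard-coded.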

What the heatmaps reveal: In tests with existing ad creatives, the glass brain heatmaps consistently show two patterns. Videos with high emotional and memory activation at brand moments perform significantly better in recall studies. Videos with high visual activation but flat emotional response produce attention without purchase intent - the "I've seen this ad but can't remember what it was for" phenomenon.
Internal analysis, app.publicimpact.ai

Why this is possible now

Two things changed in the last 18 months. First, Meta's TRIBE v2 research matured to the point where cross-subject neural encoding predictions are accurate enough to be useful outside the lab. The Algonauts 2025 benchmark results confirmed that TRIBE v2's predictions correlate strongly with actual measured brain activity - across subjects who were not in the training data.

Second, the cost of GPU inference dropped to a point where running this kind of analysis per-video is commercially viable. The entire pipeline runs on a single A100 GPU in three to five minutes. A year ago, equivalent compute would have required a reserved cluster and a six-figure budget.

The result is a capability that previously existed only inside neuroscience research labs - now accessible to anyone with a video file and a browser.

How to use it in practice

app.publicimpact.ai is live. Upload any video file. The first analysis takes an extra three minutes or so while the GPU starts up; subsequent uploads within the same session are faster. The output includes the full ROI time series, brain heatmaps, transcript alignment, and the GPT-4o creative analysis.

Try the neural video analysis tool.

Upload any video file and get a full brain activation report in under five minutes. Built with Meta's TRIBE v2 model on an A100 GPU.