
Methods & Citations

How the GRW Engine measures audience experience.

Per-primitive math, inputs, outputs, and published references for the Proof of Impact Report. The GRW Engine is built so that every line item on the report traces back to a function in the codebase and a citation on this page.

If you are evaluating us for an enterprise procurement, this is the page your security and methodology reviewers should read first. Have a question we have not answered? Email hello@grwproject.app.

Behavioural capture stack

What is actually read from the video?

Inputs
Any modern video: venue AV, GoPro, iPhone, webcam. No wearables. No audience opt-in beyond the venue's standard consent.
Computation
The capture stack extracts 468 facial landmarks per detected face per frame via MediaPipe FaceMesh, body keypoints via MediaPipe Pose, and voice prosody via Whisper-derived features. Face embeddings are never persisted; only the derived per-frame scoring vector survives the analysis pass.
Output
Per-frame analysis records: face slot, smile/surprise/neutral probabilities, an attending boolean (true when head pitch is within ±25° and yaw within ±30° of the stage direction), and an optional bounding box.
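The attending check described above can be sketched in a few lines of Python. This is a minimal illustration, not the engine's code: the `HeadPose` type, the constant names, and the choice to treat the limits as inclusive are all assumptions; only the ±25° and ±30° thresholds come from this page.

```python
from dataclasses import dataclass

# Thresholds from the text above; the constant names are hypothetical.
MAX_PITCH_DEG = 25.0
MAX_YAW_DEG = 30.0


@dataclass
class HeadPose:
    """Hypothetical container for one face's head pose in one frame."""
    pitch_deg: float  # up/down rotation relative to the stage direction
    yaw_deg: float    # left/right rotation relative to the stage direction


def is_attending(pose: HeadPose) -> bool:
    """Return True when both angles fall inside the limits (inclusive, by assumption)."""
    return abs(pose.pitch_deg) <= MAX_PITCH_DEG and abs(pose.yaw_deg) <= MAX_YAW_DEG
```

A face looking 10° down and 20° to the left would count as attending; a face turned 45° sideways would not.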

Synchrony Score

Did the audience react together or one face at a time?

Inputs
Per-frame face arrays across the session. Faces are matched by a stable slot index within each frame; identity is not tracked across the room.
Computation
For each rolling window (default 30 seconds): compute the time-series of smile probability for every attending face that contributed at least 3 frames; compute Pearson r for every pair; report mean pairwise r per window and overall.
Output
Per-window r in [-1, 1], plus a normalized score (r + 1) / 2 in [0, 1] for display in the UI.
Caveats
Requires at least two attending faces per window. Windows that fall short are reported with zero pairs rather than imputed.
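The computation above can be sketched as follows. Several details are assumptions made for illustration rather than statements about the engine: windows are treated as non-overlapping 30-second blocks, two faces are compared only on timestamps they share, pairs where either series is constant are skipped because Pearson r is undefined there, and the function and key names are hypothetical. An overall figure could then be taken as the mean of the per-window values that have at least one pair.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson r for two equal-length series, or None if either has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def synchrony(frames, window_sec=30.0, min_frames=3):
    """frames: iterable of (t_sec, slot, smile_prob, attending) tuples."""
    # window index -> slot -> {timestamp: smile probability}, attending faces only
    windows = defaultdict(lambda: defaultdict(dict))
    for t, slot, smile, attending in frames:
        if attending:
            windows[int(t // window_sec)][slot][t] = smile

    results = []
    for w in sorted(windows):
        # Keep only faces that contributed at least min_frames frames.
        series = {s: ts for s, ts in windows[w].items() if len(ts) >= min_frames}
        rs = []
        for a, b in combinations(sorted(series), 2):
            shared = sorted(series[a].keys() & series[b].keys())
            if len(shared) < min_frames:
                continue
            r = pearson([series[a][t] for t in shared],
                        [series[b][t] for t in shared])
            if r is not None:
                rs.append(r)
        mean_r = sum(rs) / len(rs) if rs else None
        results.append({
            "window": w,
            "pairs": len(rs),  # zero pairs is reported as-is, never imputed
            "mean_r": mean_r,
            "score": None if mean_r is None else (mean_r + 1) / 2,
        })
    return results
```

In this sketch a window with a single qualifying face appears in the output with `pairs` equal to 0 and no score, which mirrors the caveat above.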

Held-Attention Index

What share of seconds did the room actually face the stage?

Inputs
Per-frame face arrays. The "attending" boolean is computed in the capture stack from head pitch and yaw relative to the camera (a stage-fixed proxy for "facing the speaker").
Computation
Aggregate faces by second-of-session. For each second, compute the attention fraction attending / total. Mark the second as "held" when the fraction meets the threshold (default 0.6). Report the held rate, the mean attention fraction, and the longest run of consecutive held seconds.
Output
Per-second attentionFraction time series, headline heldRate, longestSustainedSec.
References
  • Standard head-pose attention proxy (per-frame pitch/yaw thresholding).
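A minimal sketch of the Held-Attention computation follows. The output keys reuse the names listed under Output, plus `meanAttention` for the mean fraction; everything else is an assumption for illustration: the input shape, the function name, treating "meets the threshold" as greater than or equal, and leaving seconds with no detected faces out of the denominator (which also breaks a sustained run).

```python
from collections import defaultdict


def held_attention(frames, threshold=0.6):
    """frames: iterable of (t_sec, attending) pairs, one per detected face per frame."""
    attending = defaultdict(int)
    total = defaultdict(int)
    for t, is_attending in frames:
        sec = int(t)  # second-of-session bucket
        total[sec] += 1
        if is_attending:
            attending[sec] += 1

    # Attention fraction for every second that had at least one detected face.
    fractions = {s: attending[s] / total[s] for s in sorted(total)}
    held = sorted(s for s, f in fractions.items() if f >= threshold)

    # Longest run of consecutive held seconds.
    longest = run = 0
    prev = None
    for s in held:
        run = run + 1 if prev is not None and s == prev + 1 else 1
        longest = max(longest, run)
        prev = s

    n = len(fractions)
    return {
        "attentionFraction": fractions,
        "heldRate": len(held) / n if n else 0.0,
        "meanAttention": sum(fractions.values()) / n if n else 0.0,
        "longestSustainedSec": longest,
    }
```

For example, four seconds with fractions 1.0, 0.6, 0.25, and 1.0 give a held rate of 0.75 and a longest sustained run of 2 seconds.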

Applause Spectrogram

Where did the audible reactions land and how long did they hold?

Inputs
The AudioPeak stream produced by the capture stack: timestamp, RMS amplitude, duration above threshold, heuristic type (laugh / applause / ovation / reaction).
Computation
Count peaks by type. Sort by duration in descending order and keep the top N. Bin peaks into 5-second buckets, tracking weighted duration and peak amplitude per bin. Report the 95th-percentile amplitude across the session.
Output
countsByType, sustainedRanking, amplitudeP95, and a per-bin spectrogram payload for the heat-bar UI.
References
  • Affectiva audience response methodology (acoustic envelope segmentation).
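The aggregation steps for the Applause Spectrogram can be sketched as below. The output keys `countsByType`, `sustainedRanking`, and `amplitudeP95` follow the Output list; the `AudioPeak` field names, the `spectrogram` key, and the function name are hypothetical. Two choices are assumptions because this page does not specify them: duration is weighted by RMS amplitude, and the 95th percentile uses the nearest-rank method.

```python
from collections import Counter
from dataclasses import dataclass
from math import ceil


@dataclass
class AudioPeak:
    """Hypothetical shape of one entry in the AudioPeak stream."""
    t_sec: float         # timestamp of the peak
    rms: float           # RMS amplitude
    duration_sec: float  # time spent above threshold
    kind: str            # "laugh", "applause", "ovation", or "reaction"


def applause_spectrogram(peaks, top_n=3, bin_sec=5.0):
    counts = Counter(p.kind for p in peaks)

    # Longest-held reactions first, truncated to the top N.
    ranking = sorted(peaks, key=lambda p: p.duration_sec, reverse=True)[:top_n]

    # Bucket index -> accumulated stats for that 5-second bin.
    bins = {}
    for p in peaks:
        b = bins.setdefault(int(p.t_sec // bin_sec),
                            {"weightedDuration": 0.0, "peakAmplitude": 0.0})
        b["weightedDuration"] += p.rms * p.duration_sec  # assumed weighting
        b["peakAmplitude"] = max(b["peakAmplitude"], p.rms)

    # Nearest-rank 95th percentile of amplitude (assumed method).
    amps = sorted(p.rms for p in peaks)
    p95 = amps[ceil(0.95 * len(amps)) - 1] if amps else None

    return {
        "countsByType": dict(counts),
        "sustainedRanking": ranking,
        "amplitudeP95": p95,
        "spectrogram": bins,
    }
```

A different weighting or percentile interpolation would change the numbers but not the overall shape of the payload.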

Talk-over-Talk Delta

Compared to the prior session, what actually moved?

Inputs
Two complete sessions for the same speaker: SessionSummary plus the underlying engagement buckets.
Computation
Compute deltas on overallScore, meanAttention, and reactionTotals (per type). Compute Cohen's d using the pooled standard deviation across bucket engagementScores. Classify: |d| ≥ 0.5 is meaningful, 0.2 ≤ |d| < 0.5 is a trend, and anything smaller is noise.
Output
Per-metric deltas, Cohen's d, and an effect-size band (meaningful, trend, or noise) the manager can act on.
Caveats
Requires at least two buckets per session. Speakers with one short session cannot anchor a baseline; the panel renders an explicit "no prior" state in that case.
References
  • Cohen's d effect size convention (Cohen, 1988).
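The effect-size step can be sketched with the standard pooled-standard-deviation form of Cohen's d. The function names are hypothetical, and two choices are assumptions: the sign convention (current minus prior, so a positive d means the newer session scored higher) and returning `None` when both sessions have zero variance, where d is undefined.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(prior, current):
    """Cohen's d between two lists of bucket engagement scores."""
    n1, n2 = len(prior), len(current)
    if n1 < 2 or n2 < 2:
        raise ValueError("each session needs at least two buckets")
    # Pooled variance from the two sample variances (n - 1 denominators).
    pooled_var = ((n1 - 1) * variance(prior)
                  + (n2 - 1) * variance(current)) / (n1 + n2 - 2)
    if pooled_var == 0:
        return None  # undefined when neither session varies
    return (mean(current) - mean(prior)) / sqrt(pooled_var)


def classify(d):
    """Map d onto the bands described above."""
    size = abs(d)
    if size >= 0.5:
        return "meaningful"
    if size >= 0.2:
        return "trend"
    return "noise"
```

For instance, prior buckets of 50, 60, 70 against current buckets of 60, 70, 80 share a pooled standard deviation of 10 and differ by 10 points in the mean, so d is 1.0 and the band is "meaningful".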

Disclaimers

  • Proof of Impact is an audience experience intelligence product driven by the GRW Engine, not a clinical or diagnostic instrument. No primitive on this page is a medical, psychiatric, or hiring decision tool.
  • Faces in derived reports are not retained or rendered identifiable; only the aggregate behavioural signals survive past the analysis pass.
  • Cohort comparisons are reported only when the cohort cell has at least 30 entries; smaller cells are reported as below-threshold rather than imputed.
