Enterprise HR Tech

AI-Powered Interview Orchestration Platform

A multi-agent AI system that conducts structured interviews autonomously — real-time video, adaptive questioning, and objective behavioral assessment at any scale.

Platform

Web · Real-time Video · AI Agents

Duration

6 months

0

Interviewer hours per screening

100%

Structured script adherence

5

Behavioral dimensions per candidate

<3 min

Full personality report generation

Project overview

We designed and built a production-grade platform that conducts structured interviews through an AI avatar in a live video session — handling questioning, follow-up probing, real-time transcript capture, and post-interview behavioral scoring entirely without a human in the loop.


Type

Enterprise HR Tech

Stack

12 technologies

The challenge

High-volume hiring creates a structural tension: rigorous structured interviews require experienced interviewers, but scaling interviewer time to match hiring volume is expensive and inconsistent. Existing video interview products only record — they don't evaluate. The gap between volume and quality had to be closed without sacrificing either.

Structured interviews demanded significant senior interviewer time per candidate

Inconsistent questioning across interviewers produced unreliable comparisons

No system could evaluate behavioral traits objectively at high volume

Manual scheduling and interview coordination created operational drag

Evaluation quality varied with interviewer mood, fatigue, and bias

Time-to-feedback loops stretched to days, slowing offer decisions

What we set out to do

  • 01

    Build an AI interviewer that conducts role-specific structured interviews autonomously via live video

  • 02

    Ensure every candidate receives identical structured questioning regardless of volume or timing

  • 03

    Generate structured behavioral assessments automatically from interview transcripts

  • 04

    Integrate real-time video infrastructure with multi-agent AI orchestration in a single session

  • 05

    Design the system to be configurable per role — different frameworks, question banks, and evaluation rubrics

  • 06

    Produce an auditable, evidence-anchored report that hiring managers can act on immediately
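Goal 05 above implies a per-role configuration object carrying the framework, question bank, and rubric. A minimal sketch of what such a structure might look like (field names and values are illustrative, not the production schema):

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class RoleConfig:
    """Per-role interview configuration (illustrative shape)."""
    role: str
    framework: str                     # e.g. the behavioral framework in use
    question_bank: tuple[str, ...]     # structured questions, asked in order
    rubric: dict[str, str] = field(default_factory=dict)  # trait -> guidance

backend_role = RoleConfig(
    role="Backend Engineer",
    framework="structured-behavioral",
    question_bank=(
        "Describe a production incident you resolved.",
        "Walk through a design trade-off you made under deadline pressure.",
    ),
    rubric={"conscientiousness": "Evidence of follow-through and ownership."},
)
```

Because the config is data, swapping frameworks or question banks per role needs no application-code change.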

How we solved it

01

Multi-agent orchestration with LangGraph

We modeled the interview as a stateful graph using LangGraph. Each phase — opening, competency questioning, follow-up probing, closing — is a separate agent node with explicit transition conditions. The graph enforces structure: agents cannot skip phases, must satisfy depth criteria before advancing, and hand off cleanly via typed state objects. This gave us the determinism a hiring process requires while preserving LLM flexibility for natural language generation.
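Independent of LangGraph specifics, the phase discipline described above can be sketched as a small state machine; the node names and depth criterion below are illustrative stand-ins for the production graph:

```python
from dataclasses import dataclass

# Interview phases in their enforced order (illustrative)
PHASES = ("opening", "competency", "follow_up", "closing")

@dataclass
class InterviewState:
    phase: str = "opening"
    exchanges_in_phase: int = 0
    min_depth: int = 2          # depth criterion before a phase may advance

    def record_exchange(self) -> None:
        self.exchanges_in_phase += 1

    def advance(self) -> None:
        """Move to the next phase only when the depth criterion is met."""
        if self.exchanges_in_phase < self.min_depth:
            raise RuntimeError(f"cannot leave {self.phase!r}: depth criterion unmet")
        idx = PHASES.index(self.phase)
        if idx + 1 < len(PHASES):
            self.phase = PHASES[idx + 1]
            self.exchanges_in_phase = 0

state = InterviewState()
state.record_exchange()
state.record_exchange()
state.advance()     # allowed: depth criterion met; phase is now "competency"
```

The point of the state-machine framing is exactly this: skipping ahead without satisfying the depth criterion raises, rather than drifting the way a flat conversation chain can.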

Key decision

LangGraph over a single-prompt loop — state machine semantics enforce phase discipline that a flat conversation chain cannot.

Result

Interview flow is reproducible, auditable, and configurable per role without touching application code.

02

Real-time video infrastructure with LiveKit

Candidate sessions run over WebRTC via LiveKit — an open-source real-time media server. The AI agent participates as a LiveKit participant, publishing synthesized audio and consuming the candidate's audio stream for real-time transcription. Video is recorded server-side for compliance. Sub-150ms audio latency was essential for a natural conversation feel; LiveKit's SFU architecture delivered this at scale without requiring per-session infrastructure setup.
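The agent's media loop (consume candidate audio, transcribe, generate a reply, synthesize, publish back into the session) can be sketched with placeholder stages. The real implementation runs on LiveKit's SDK; the callables below are stand-ins, not that API:

```python
import asyncio
from typing import AsyncIterator

async def agent_media_loop(
    audio_frames: AsyncIterator[bytes],
    transcribe,      # bytes -> str: server-side STT (stand-in)
    respond,         # str -> str: one orchestrator turn (stand-in)
    synthesize,      # str -> bytes: TTS (stand-in)
    publish,         # bytes -> None: publish audio to the session (stand-in)
) -> None:
    """Consume candidate audio and produce agent speech, one turn at a time."""
    async for frame in audio_frames:
        text = transcribe(frame)
        if not text:
            continue                 # silence or partial frame: no turn taken
        reply = respond(text)
        publish(synthesize(reply))
```

Keeping transcription server-side, as the key decision below notes, means this loop sees a uniform audio stream regardless of the candidate's browser.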

Key decision

Server-side transcription over browser-side — more reliable under varied network conditions and eliminates browser compatibility risk.

Result

Audio/video latency under 150ms. Full session recordings stored automatically in Azure Blob.

03

AI avatar interface

Rather than a plain voice interface, candidates interact with an animated AI avatar that maintains eye contact and expressive gestures synchronized to speech. This was critical for candidate experience — early testing showed candidate response quality correlated with perceived interviewer presence. The avatar layer sits between the LangGraph orchestrator and LiveKit, consuming text output from the agent and publishing lip-synced video back into the session in real time.
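Lip-sync pipelines of this kind typically map timed phonemes from the TTS engine onto a small set of mouth shapes (visemes). A toy sketch of that mapping step; the table and groupings here are hypothetical, not the production animation data:

```python
# Illustrative phoneme -> viseme table; real pipelines derive timing from
# the TTS engine's phoneme timestamps. Groupings below are hypothetical.
VISEMES = {
    "AA": "open", "IY": "wide", "UW": "round",
    "M": "closed", "B": "closed", "P": "closed",
    "F": "teeth", "V": "teeth",
}

def to_viseme_track(phonemes: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Map (phoneme, start_time_sec) pairs to (viseme, start_time_sec) pairs."""
    return [(VISEMES.get(p, "neutral"), t) for p, t in phonemes]

track = to_viseme_track([("M", 0.00), ("AA", 0.08), ("M", 0.21)])
# → [("closed", 0.0), ("open", 0.08), ("closed", 0.21)]
```

Because the track is derived from the agent's live text output, there are no pre-recorded clips anywhere in the path, which is the key decision noted below.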

Key decision

Avatar animation driven by agent text output, not pre-recorded clips — fully dynamic, no scripted responses.

Result

Candidate comfort scores comparable to human interview panels in internal testing.

04

Evidence-anchored behavioral scoring

Post-interview, a dedicated evaluation pipeline processes the full transcript through GPT-4o with a structured rubric. Rather than producing bare scores, each trait assessment is anchored to specific transcript segments — the system cites the exact candidate statement that supports a rating. A confidence check node flags borderline scores for human review rather than silently producing a number that looks authoritative but isn't. This makes every assessment defensible.

Key decision

Evidence citation made mandatory — a score with no cited transcript segment is rejected by the pipeline.

Result

Every evaluation output traceable to verbatim candidate statements.

05

Automated personality report generation

A five-node LangGraph pipeline runs after scoring: trait analysis → job fit scoring → detailed behavioral summary → report compilation → QA check. The pipeline consumes from a Redis queue (decoupled from the interview session) and produces a structured report with Big Five trait scores, job fit percentage, key strengths, and development flags. Reports are generated in under three minutes and immediately available to the hiring team.
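The decoupling pattern here (sessions enqueue completed transcripts and return immediately; a separate worker consumes them) can be sketched with a stdlib queue standing in for Redis, and the five pipeline stages reduced to a single stand-in step:

```python
import queue
import threading

report_queue: "queue.Queue" = queue.Queue()   # stand-in for the Redis queue
reports = []

def report_worker() -> None:
    """Consumes completed interviews; session code never waits on this."""
    while True:
        job = report_queue.get()
        if job is None:               # shutdown sentinel
            break
        # Stand-in for: trait analysis -> job fit -> summary -> compile -> QA
        reports.append({"candidate": job["candidate"], "status": "qa_passed"})
        report_queue.task_done()

worker = threading.Thread(target=report_worker, daemon=True)
worker.start()

# Session side: fire-and-forget at interview completion.
report_queue.put({"candidate": "c-123", "transcript": "(full transcript)"})
report_queue.join()   # demo only; in production the session does NOT block here
```

A session crash cannot lose or stall report jobs already enqueued, and a slow report run cannot back-pressure a live interview, which is the reliability split the key decision below describes.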

Key decision

Redis queue between interview completion and report generation — decouples session reliability from report generation latency.

Result

Full personality report available to hiring managers within 3 minutes of session end.

Measurable impact

0 hrs

Interviewer time per screening round

100%

Question consistency across all candidates

<3 min

Time to full behavioral report

5 traits

Scored per interview with evidence citations

<150ms

Audio/video round-trip latency

6 months

Concept to production deployment

Tech stack

Python · LangGraph · LiveKit · WebRTC · NestJS · TypeScript · PostgreSQL · Redis · RabbitMQ · React · OpenAI GPT-4o · Azure Blob

What we learned

The platform proved that structured interviewing — a process historically bottlenecked by human bandwidth — can be automated without sacrificing quality. The key was treating the interview as a state machine, not a conversation, and enforcing evidence requirements on every evaluation output. Hiring teams get consistent, auditable data. Candidates get a structured, professional experience. The bottleneck disappears.

  • 01

    LangGraph's state machine model maps precisely to structured interview phases — phase discipline that a flat LLM loop cannot enforce

  • 02

    Avatar-driven interviews require real-time speech-to-animation pipelines; candidate comfort directly impacts response quality and signal reliability

  • 03

    Evaluation frameworks need mandatory evidence anchoring — a score with no transcript citation is an opinion, not an assessment

  • 04

    Decoupling report generation from session handling via Redis queue is essential; both need independent reliability guarantees

  • 05

    Configurable frameworks matter more than clever LLM prompts — the system's value comes from structure, not from the model

Ready to build something that matters?

We solve problems that don't have Stack Overflow answers. Let's talk.

Book a Discovery Call