AI-Powered Interview Orchestration Platform
A multi-agent AI system that conducts structured interviews autonomously — real-time video, adaptive questioning, and objective behavioral assessment at any scale.
Platform
Web · Real-time Video · AI Agents
Duration
6 months
0
Interviewer hours per screening
100%
Structured script adherence
5
Behavioral dimensions per candidate
<3 min
Full personality report generation
Project overview
We designed and built a production-grade platform that conducts structured interviews through an AI avatar in a live video session — handling questioning, follow-up probing, real-time transcript capture, and post-interview behavioral scoring entirely without a human in the loop.
Type
Enterprise HR Tech
Stack
12 technologies
The challenge
High-volume hiring creates a structural tension: rigorous structured interviews require experienced interviewers, but scaling interviewer time to match hiring volume is expensive and inconsistent. Existing video interview products only record — they don't evaluate. The tension between volume and quality had to be resolved without sacrificing either.
Structured interviews demanded significant senior interviewer time per candidate
Inconsistent questioning across interviewers produced unreliable comparisons
No system could evaluate behavioral traits objectively at high volume
Manual scheduling and interview coordination created operational drag
Evaluation quality varied with interviewer mood, fatigue, and bias
Time-to-feedback loops stretched to days, slowing offer decisions
What we set out to do
- 01
Build an AI interviewer that conducts role-specific structured interviews autonomously via live video
- 02
Ensure every candidate receives identical structured questioning regardless of volume or timing
- 03
Generate structured behavioral assessments automatically from interview transcripts
- 04
Integrate real-time video infrastructure with multi-agent AI orchestration in a single session
- 05
Design the system to be configurable per role — different frameworks, question banks, and evaluation rubrics
- 06
Produce an auditable, evidence-anchored report that hiring managers can act on immediately
How we solved it
Multi-agent orchestration with LangGraph
We modeled the interview as a stateful graph using LangGraph. Each phase — opening, competency questioning, follow-up probing, closing — is a separate agent node with explicit transition conditions. The graph enforces structure: agents cannot skip phases, must satisfy depth criteria before advancing, and hand off cleanly via typed state objects. This gave us the determinism a hiring process requires while preserving LLM flexibility for natural language generation.
Key decision
LangGraph over a single-prompt loop — state machine semantics enforce phase discipline that a flat conversation chain cannot.
Result
Interview flow is reproducible, auditable, and configurable per role without touching application code.
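The phase discipline described above can be sketched in plain Python. This is an illustration of the state-machine idea, not the production LangGraph code — the phase names match the case study, but the depth criteria (`MIN_TURNS`) and helper functions are hypothetical:

```python
from dataclasses import dataclass, field

# Ordered interview phases; the orchestrator may only advance, never skip.
PHASES = ["opening", "competency", "follow_up", "closing"]

# Hypothetical depth criteria: minimum turns before a phase may hand off.
MIN_TURNS = {"opening": 1, "competency": 3, "follow_up": 2, "closing": 1}

@dataclass
class InterviewState:
    phase: str = "opening"
    turns_in_phase: int = 0
    transcript: list = field(default_factory=list)

def record_turn(state: InterviewState, utterance: str) -> None:
    """Capture one candidate turn against the current phase."""
    state.transcript.append((state.phase, utterance))
    state.turns_in_phase += 1

def advance(state: InterviewState) -> InterviewState:
    """Transition to the next phase only once the depth criterion is met."""
    if state.turns_in_phase < MIN_TURNS[state.phase]:
        raise ValueError(f"cannot leave '{state.phase}' before depth criterion is met")
    idx = PHASES.index(state.phase)
    if idx == len(PHASES) - 1:
        return state  # closing is terminal
    return InterviewState(phase=PHASES[idx + 1], transcript=state.transcript)
```

In the real system each phase node is LLM-backed; the point here is only that transitions are gated by explicit, checkable criteria rather than left to the model's judgment.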
Real-time video infrastructure with LiveKit
Candidate sessions run over WebRTC via LiveKit — an open-source real-time media server. The AI agent participates as a LiveKit participant, publishing synthesized audio and consuming the candidate's audio stream for real-time transcription. Video is recorded server-side for compliance. Sub-150ms audio latency was essential for a natural conversation feel; LiveKit's SFU architecture delivered this at scale without requiring per-session infrastructure setup.
Key decision
Server-side transcription over browser-side — more reliable under varied network conditions and eliminates browser compatibility risk.
Result
Audio/video latency under 150ms. Full session recordings stored automatically in Azure Blob.
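The server-side transcription loop reduces to a simple shape: consume the candidate's audio stream chunk by chunk and accumulate transcript segments as they arrive. A minimal sketch, where `transcribe_chunk` is a stand-in for a real streaming STT call:

```python
from typing import Callable, Iterable

def transcribe_stream(chunks: Iterable[bytes],
                      transcribe_chunk: Callable[[bytes], str]) -> list[str]:
    """Consume an audio stream server-side and collect transcript segments.
    `transcribe_chunk` stands in for a streaming ASR call; silence or noise
    may yield empty segments, which are dropped."""
    segments = []
    for chunk in chunks:
        text = transcribe_chunk(chunk)
        if text:
            segments.append(text)
    return segments
```

Because this loop runs next to the media server rather than in the candidate's browser, a flaky client network degrades only capture quality — never the transcription pipeline itself.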
AI avatar interface
Rather than a plain voice interface, candidates interact with an animated AI avatar that maintains eye contact and expressive gestures synchronized to speech. This was critical for candidate experience — early testing showed candidate response quality correlated with perceived interviewer presence. The avatar layer sits between the LangGraph orchestrator and LiveKit, consuming text output from the agent and publishing lip-synced video back into the session in real time.
Key decision
Avatar animation driven by agent text output, not pre-recorded clips — fully dynamic, no scripted responses.
Result
Candidate comfort scores comparable to human interview panels in internal testing.
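The text-to-animation handoff can be illustrated with a toy keyframe generator. The viseme table and first-letter mapping below are purely illustrative — production lip-sync works from phoneme timings supplied by the TTS engine:

```python
# Toy grapheme-to-viseme table (illustration only; real systems map the
# TTS engine's phoneme timing metadata to mouth shapes).
VISEMES = {"a": "AA", "e": "EE", "o": "OH", "m": "MBP", "f": "FV"}

def viseme_timeline(word_timings):
    """word_timings: [(word, start_seconds), ...] from the speech synthesizer.
    Returns a (start, viseme) keyframe list an avatar renderer could consume."""
    timeline = []
    for word, start in word_timings:
        v = VISEMES.get(word[0].lower(), "REST")
        timeline.append((start, v))
    return timeline
```

Because keyframes are derived from whatever the agent just said, the avatar stays dynamic — there are no pre-recorded clips to fall back on.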
Evidence-anchored behavioral scoring
Post-interview, a dedicated evaluation pipeline processes the full transcript through GPT-4o with a structured rubric. Rather than producing bare scores, each trait assessment is anchored to specific transcript segments — the system cites the exact candidate statement that supports a rating. A confidence check node flags borderline scores for human review rather than silently producing a number that looks authoritative but isn't. This makes every assessment defensible.
Key decision
Evidence citation made mandatory — a score with no cited transcript segment is rejected by the pipeline.
Result
Every evaluation output traceable to verbatim candidate statements.
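The mandatory-citation and confidence-check rules can be expressed as a small validation gate. A sketch under assumed conventions — the rubric scale, the 0–1 confidence field, and the 0.7 review threshold are all hypothetical:

```python
from dataclasses import dataclass

@dataclass
class TraitScore:
    trait: str
    score: float          # rubric scale, e.g. 1.0-5.0 (assumed)
    evidence: list        # verbatim transcript segments supporting the score
    confidence: float     # model self-reported confidence, 0-1 (assumed)

def validate_assessment(scores, transcript: str, review_threshold: float = 0.7):
    """Reject any score without a verbatim transcript citation; return the
    traits whose low confidence should route them to human review."""
    flagged = []
    for s in scores:
        if not s.evidence:
            raise ValueError(f"score for '{s.trait}' has no cited evidence")
        for quote in s.evidence:
            if quote not in transcript:
                raise ValueError(f"citation for '{s.trait}' is not verbatim")
        if s.confidence < review_threshold:
            flagged.append(s.trait)
    return flagged
```

The key design choice survives even in this toy form: a score that cannot point at the transcript never leaves the pipeline, and a borderline score is escalated rather than reported.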
Automated personality report generation
A five-node LangGraph pipeline runs after scoring: trait analysis → job fit scoring → detailed behavioral summary → report compilation → QA check. The pipeline consumes from a Redis queue (decoupled from the interview session) and produces a structured report with Big Five trait scores, job fit percentage, key strengths, and development flags. Reports are generated in under three minutes and immediately available to the hiring team.
Key decision
Redis queue between interview completion and report generation — decouples session reliability from report generation latency.
Result
Full personality report available to hiring managers within 3 minutes of session end.
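The decoupled worker shape is simple to sketch. Here `queue.Queue` stands in for Redis (production would block on a Redis list, e.g. via BLPOP), and the five stage functions are placeholders with made-up values:

```python
import queue

# In production this is a Redis list; a local queue keeps the sketch self-contained.
report_queue = queue.Queue()

# The five pipeline stages, run in order; each takes and returns the report dict.
def trait_analysis(r):      r["traits"] = {"openness": 4.2}; return r  # placeholder score
def job_fit(r):             r["fit_pct"] = 78; return r                # placeholder fit
def behavioral_summary(r):  r["summary"] = "..."; return r
def compile_report(r):      r["compiled"] = True; return r
def qa_check(r):
    assert r.get("compiled") and "traits" in r and "fit_pct" in r
    return r

STAGES = [trait_analysis, job_fit, behavioral_summary, compile_report, qa_check]

def worker_step():
    """Pop one completed interview off the queue and run all five stages."""
    report = report_queue.get(timeout=1)
    for stage in STAGES:
        report = stage(report)
    return report
```

Because the session only has to enqueue a payload and move on, a slow or failed report run can never stall a live interview — the two sides carry independent reliability guarantees.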
Measurable impact
0 hrs
Interviewer time per screening round
100%
Question consistency across all candidates
<3 min
Time to full behavioral report
5 traits
Scored per interview with evidence citations
<150ms
Audio/video round-trip latency
6 months
Concept to production deployment
Tech stack
What we learned
The platform proved that structured interviewing — a process historically bottlenecked by human bandwidth — can be automated without sacrificing quality. The key was treating the interview as a state machine, not a conversation, and enforcing evidence requirements on every evaluation output. Hiring teams get consistent, auditable data. Candidates get a structured, professional experience. The bottleneck disappears.
- 01
LangGraph's state machine model maps precisely to structured interview phases — phase discipline that a flat LLM loop cannot enforce
- 02
Avatar-driven interviews require real-time speech-to-animation pipelines; candidate comfort directly impacts response quality and signal reliability
- 03
Evaluation frameworks need mandatory evidence anchoring — a score with no transcript citation is an opinion, not an assessment
- 04
Decoupling report generation from session handling via Redis queue is essential; both need independent reliability guarantees
- 05
Configurable frameworks matter more than clever LLM prompts — the system's value comes from structure, not from the model
Ready to build something that matters?
We solve problems that don't have Stack Overflow answers. Let's talk.
Book a Discovery Call