AISA

For the technically curious. (You know who you are.)

Architecture

How It Works

One conversation. Two AI tracks. Zero black box.

Candidates talk to one AI. A separate AI evaluates every response in real time. They never mix. That’s how we keep the dialogue natural and the scoring rigorous.

Anti-gaming

We detect when responses don’t come from the candidate. No proctoring—just instrumentation.

Typing speed

Human typing averages roughly 4–10 chars/sec. Text that appears faster (12+ chars/sec) or arrives in large chunks is flagged as likely pasted.
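As an illustration, here is a minimal sketch of the kind of check this implies. The 12 chars/sec and large-chunk thresholds come from the text above; the function name, the event format, and the chunk size of 30 are assumptions for the example, not AISA's actual implementation.

```python
def typing_speed_flags(events, fast_cps=12.0, chunk_size=30):
    """Flag text that appears faster than plausible human typing.

    `events` is a list of (timestamp_seconds, chars_added) tuples for one
    answer field (an illustrative format). Sustained 12+ chars/sec, or a
    single large chunk, suggests pasted content.
    """
    flags = []
    for i in range(1, len(events)):
        dt = events[i][0] - events[i - 1][0]
        added = events[i][1]
        if added >= chunk_size:
            # Many characters arriving in one event: almost certainly a paste.
            flags.append(("large_chunk", i))
        elif dt > 0 and added / dt > fast_cps:
            # Faster than human typing speed between two events.
            flags.append(("too_fast", i))
    return flags
```

Each flag carries the index of the offending event, so it can be tied back to the exact moment in the transcript.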

Paste events

We track paste events directly. No guesswork—if they pasted, we know.
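Client-side paste events can be recorded and summarized server-side. A sketch of that summary step, with an illustrative event schema (the `type` and `length` fields are assumptions, not AISA's actual format):

```python
def paste_summary(event_log):
    """Summarize recorded paste events for one session.

    `event_log` is a list of dicts like {"type": "paste", "length": 120}
    captured client-side. No inference involved: a paste event either
    happened or it did not.
    """
    pastes = [e for e in event_log if e.get("type") == "paste"]
    return {
        "paste_count": len(pastes),
        "pasted_chars": sum(e.get("length", 0) for e in pastes),
    }
```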

Style shifts

A sudden change in vocabulary or formality (e.g. ChatGPT-style phrasing) relative to the candidate's baseline.

Response timing

Long pauses before answering, or impossibly fast replies. Flagged in the report.

Typing metrics weigh 70%, other signals 30%. Flags appear in the report with an appeal option—we’re confident but never punitive.
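The 70/30 weighting can be sketched as a simple weighted combination. The split is from the text above; the 0–1 signal scale and the 0.5 flag threshold are assumptions made for the example:

```python
def integrity_score(typing_signal, other_signal, typing_weight=0.7):
    """Combine instrumentation signals into one suspicion score.

    Each input is a 0-1 suspicion score; the 70/30 split mirrors the
    weighting described above.
    """
    return typing_weight * typing_signal + (1 - typing_weight) * other_signal

def should_flag(typing_signal, other_signal, threshold=0.5):
    """Raise an (appealable) flag when the weighted score crosses a
    threshold. The 0.5 value is an assumption, not from the text."""
    return integrity_score(typing_signal, other_signal) >= threshold
```

Because the flag is a weighted composite rather than a single tripwire, one noisy signal (say, a brief style shift) is not enough to flag a candidate on its own.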

End-to-end flow

1. Invite: You send a link. They get a professional invite and open the assessment.

2. Conversation: 20–40 min chat. Workflows, tools, verification—feels like a colleague, not an exam.

3. Evaluation: Every message is scored by Track B before Track A replies. Evidence and steering, invisible to the candidate.

4. Report: Scores across 5 dimensions, each tied to quotes. Plus follow-up questions where evidence was thin.

The two tracks

Two distinct AIs so conversation quality and evaluation rigor don’t compromise each other.

Track A

Conversationalist

The only AI the candidate sees. Warmth, natural flow, adaptive depth. Gets steering notes from Track B but never sees scores. Prioritizes a natural dialogue over checklist coverage.

Track B

Evaluator

Runs silently. For every message: evidence items, scores, steering notes—structured data only, no candidate-facing text. Behavioral anchors (1-10 scale) keep scores consistent and explainable.

Track B runs before Track A on every turn, so each reply already reflects the latest steering.

Orchestrator

Stateless pipeline per message. Instrumentation is explicit—we measure it, we don’t ask the model to guess.

Message → Flags (latency, style shift) → Track B → Track A → Persist
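The per-message pipeline described above can be sketched as a single stateless function with its stages injected as callables. The function signatures and field names here are illustrative assumptions, not AISA's actual interfaces:

```python
def handle_message(message, session, flag_checks, track_b, track_a, store):
    """One stateless pass per candidate message.

    `flag_checks`, `track_b`, `track_a`, and `store` are injected
    callables (hypothetical shapes for illustration).
    """
    # 1. Explicit instrumentation: measured directly, not inferred by a model.
    flags = [check(message) for check in flag_checks]

    # 2. Track B (evaluator) runs first: evidence, scores, steering notes.
    evaluation = track_b(message, session)

    # 3. Track A (conversationalist) sees steering notes, never scores.
    reply = track_a(message, session, steering=evaluation["steering"])

    # 4. Persist everything for the report.
    store({"message": message, "flags": flags,
           "evaluation": evaluation, "reply": reply})
    return reply
```

Running Track B first is what lets the very next reply act on fresh steering, and keeping the pass stateless means each message can be processed independently.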

Behavioral rubric

5 skill dimensions, 11 underlying criteria. Job-relevant, learnable AI skills (not personality). Selected via multi-frame analysis; we assess skills, not communication style.

Score scale (1–10)

1 · Novice: Unaware this is a skill; no intentional practice.
3 · Developing: Aware but inconsistent; reactive, not deliberate.
5 · Competent: Functional approach with repeatable techniques; not yet internalized.
7 · Proficient: Consistent and intentional; understands why, not just how.
10 · Expert: Principle-level mastery; pushes the craft forward, influences others.
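Scores between the named anchors (2, 4, 6, 8, 9) are valid; for display they can be mapped to the nearest anchor label. A small sketch of that mapping, an assumption made for the example (ties round down):

```python
# Behavioral anchors on the 1-10 scale, as defined above.
ANCHORS = {1: "Novice", 3: "Developing", 5: "Competent",
           7: "Proficient", 10: "Expert"}

def anchor_label(score):
    """Map a 1-10 score to the nearest anchor label; ties go to the
    lower anchor."""
    return ANCHORS[min(ANCHORS, key=lambda a: (abs(a - score), a))]
```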

5 dimensions · 11 criteria

Prompting & Comms
Prompting & Comms
  • P1 Prompt Design
  • P2 Iterative Dialogue
  • P3 Context & Memory Management
Critical Thinking
  • T1 Output Evaluation
  • T2 Limitation Awareness
Technical Understanding
  • U1 AI Fundamentals
  • U2 Tool Landscape
Workflow & Application
  • W1 Workflow Integration
  • W2 Task Decomposition
  • W3 Domain Application
Safety & Responsibility
  • S1 AI Safety & Responsibility

Want the full breakdown? Read our deep dive into all 5 dimensions or learn how conversational evidence prevents cheating.

Evidence & scoring

11 criteria · quotes per score · scaffold detection

Every score is tied to specific quotes. When a candidate merely agrees with something we explained rather than demonstrating it unprompted, we flag it and cap the score. Chaptering keeps context manageable over long sessions.
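The scaffold cap can be sketched in a few lines. The cap value of 5 and the function shape are assumptions for illustration; only the cap-on-scaffolded-evidence rule comes from the text:

```python
def apply_scaffold_cap(score, scaffolded, cap=5):
    """Cap a criterion score when its supporting evidence was scaffolded,
    i.e. the candidate agreed with something the assessor explained
    rather than demonstrating it unprompted."""
    return min(score, cap) if scaffolded else score
```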

What you get

  • Overall score & recommendation
  • Per-criterion scores with confidence
  • Exact quotes justifying each score
  • Strengths & gaps
  • Follow-up interview questions
  • Instrumentation notes when relevant

The 10 AI Personas

Beyond the score, every candidate gets an AI persona — a profile of how they interact with AI, not just how well they understand it. Two people can score identically and receive different personas.

The Oracle: Understands AI at its core — not just how to use it.

Deep technical mastery of AI itself. Understands or builds AI models, works with ML and LLMs at a technical level. Elite critical analysis comes from understanding the technology at its foundation, not just from using it.

The Architect: Builds highly complex integrated systems using AI.

Designs and builds sophisticated multi-system AI integrations at scale. Goes beyond creating individual tools to engineering production-grade architectures where AI components interact with each other and non-AI systems.

The Builder: Has actually built something with AI.

Personally created complex, useful tools, workflows, or products using AI — whether for their own use, their company, or commercially. Developed deep practical understanding through hands-on building that goes beyond secondhand knowledge.

The Conductor: Orchestrates AI across the workflow, not just within it.

Uses AI heavily across complex workflows, automations, and multi-tool pipelines. Understands AI limitations well and knows which tool integrates with which. Orchestrates and configures sophisticated setups, but typically works with what's available rather than building novel tools from scratch.

The Tactician: Gets things done with AI — fast and reliably.

Productive with mainstream AI tools and uses them well within established workflows. Communicates clearly with AI and consistently gets quality output, but typically hasn't pushed into the cutting edge of AI tooling or complex integrations.

The Enthusiast: Curious, capable, and picking up speed.

Actively building AI skill across multiple dimensions. Tries new tools, refines prompts, and is beginning to develop repeatable patterns — the trajectory is strong.

The Sceptic: Questions everything — the output, the tool, the hype.

Approaches AI with critical caution. May under-use AI in practice, but the verification instinct and risk awareness form a strong foundation that many frequent users lack.

The Copy-Paster: Uses AI regularly — takes the output at face value.

Relies on AI for day-to-day output but with limited iteration or verification. Gets value, but leaves quality and safety gains on the table by accepting first-pass results.

The Dabbler: Tries things out — hasn't locked in a rhythm yet.

Experiments with AI intermittently: a prompt here, a quick question there. Nothing sustained, but a willingness to explore that many skip entirely.

The Bystander: AI is on the radar, but not in the routine.

Has heard of AI tools but hasn't meaningfully engaged — the assessment itself may be the most direct interaction to date. Awareness exists; habit does not.

Personas reflect AI interaction style — usage patterns, habits, and mindset. They correlate with the score but don't directly map to it.

Powered by Claude (Anthropic)