Technical Understanding: The Dimension That Separates AI Operators from AI Thinkers

A deep dive into AISA's Technical Understanding dimension: what each score band looks like, with concrete examples from Novice to Expert.

By AISA Team · 8 min read

Most people can prompt an AI model. Far fewer can explain why a prompt works, when a model is likely to fail, or what's actually happening when they hit "send."

That gap — between operating AI and understanding AI — is exactly what the Technical Understanding dimension measures. Across 755 completed AISA assessments, this dimension consistently surfaces the sharpest divides between score bands. Someone scoring a 4 and someone scoring a 7 might produce similar-looking outputs on a good day. But when the model hallucinates, when the task demands a specific architecture choice, or when the stakes require knowing what a model can't do, the gap becomes obvious.

Technical Understanding accounts for 20% of the overall AISA score. It's the third-heaviest dimension behind Workflow & Application (25%) and Prompting & Communication (23%), but it punches above its weight in terms of predictive value. People who score well here tend to score well everywhere else, because understanding how these systems work informs every other skill.

What Technical Understanding Actually Measures

This dimension evaluates two core criteria:

  1. Model Literacy — Can the person articulate how large language models work at a conceptual level? Do they understand tokens, context windows, temperature, training data cutoffs, and the difference between retrieval and generation?

  2. Tool & Ecosystem Awareness — Do they know what tools exist, when to use which ones, and how different models compare for different tasks? Can they reason about tool selection rather than defaulting to whatever's in front of them?

Notice what's not here: we're not testing whether someone can write PyTorch code or explain backpropagation. This isn't a machine learning engineering exam. It's a practical literacy test — do you understand enough about how these systems work to use them well and fail gracefully?

Score Band Breakdown: What Each Level Looks Like

Novice (1-2): "It's like a smart Google"

At this level, the mental model is essentially a search engine or a magic box. When asked how an LLM generates a response, a Novice might say something like "it searches the internet for the answer" or "it looks up information in its database."

Concrete signals we observe:

  • No awareness that models have training data cutoffs
  • Treats all AI tools as interchangeable ("I just use ChatGPT")
  • Cannot explain why a model might produce a confidently wrong answer
  • Surprised when a model doesn't "know" something recent

This maps closely to the Bystander and early Dabbler personas. The person may use AI daily but has built no mental model of what's happening underneath.

Developing (3-4): "It predicts the next word, right?"

Developing scorers have picked up fragments of how LLMs work — often from blog posts, podcasts, or casual conversation. They can parrot that models are "predicting the next token" but struggle to connect that mechanism to practical implications.

Concrete signals:

  • Knows the phrase "context window" but can't explain why it matters for their workflow
  • Aware that different models exist (GPT-5.5, Claude Opus 4.8, Gemini 3.5 Flash) but can't articulate meaningful differences beyond brand preference
  • Understands hallucination as a concept but attributes it to the model being "wrong" rather than understanding the generative mechanism that produces it
  • May know about temperature but not how adjusting it changes output characteristics

This is the most common band we observe. People here have the vocabulary but not the working understanding. They know that things are true about AI without knowing why they matter.
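
The temperature signal above can be made concrete with a toy calculation. This is a minimal sketch, not how any particular vendor implements sampling: the candidate tokens and their scores are invented for illustration, and real inference stacks usually combine temperature with other controls such as top-p or top-k. It shows only the core idea that temperature rescales the model's scores before they become probabilities.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores (logits) into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens (illustrative only).
candidates = ["Paris", "Lyon", "Berlin"]
logits = [4.0, 2.0, 1.0]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    summary = ", ".join(f"{tok}={p:.2f}" for tok, p in zip(candidates, probs))
    print(f"temperature={t}: {summary}")
```

At a low temperature nearly all of the probability lands on the top-scoring token, so output is more repeatable. At a high temperature the distribution flattens and less likely tokens get sampled more often. In both cases the model is choosing among plausible continuations, not checking facts, which is the working understanding this band tends to lack.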

Competent (5-6): "I pick the model based on the task"

This is where Technical Understanding starts translating into measurably better AI use. A Competent scorer has an accurate-enough mental model to make informed decisions.

Concrete signals:

  • Can explain why a 1M-token context window matters differently for summarization versus code generation
  • Chooses between models with reasoning: "I'd use Flash for this because it's a high-volume, lower-complexity task where latency matters more than depth"
  • Understands that training data has a cutoff and knows to verify time-sensitive claims
  • Can explain hallucination mechanistically: the model generates plausible-sounding text because it's optimizing for coherence, not truth
  • Aware of the difference between a base model, a fine-tuned model, and a model accessed through an agentic tool like Claude Code or Cursor

This is the inflection point. People at this level start making tool choices that a Tactician or Conductor would recognize. They're not just using AI — they're reasoning about which AI, how, and why.
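
One way to see why the same context window matters differently across tasks is a simple token budget. The sketch below assumes that input and output tokens share a single window, which is common but not universal (some APIs cap output separately). The function names and workload numbers are hypothetical.

```python
def fits_context(input_tokens, max_output_tokens, context_window):
    """Return True if the prompt plus the reserved output budget fits in the window."""
    return input_tokens + max_output_tokens <= context_window

def remaining_input_budget(max_output_tokens, context_window):
    """Tokens left for instructions and source material after reserving output space."""
    return max(context_window - max_output_tokens, 0)

WINDOW = 1_000_000  # the 1M-token window from the Competent signals above

# Hypothetical workloads: (input tokens, output tokens reserved)
workloads = {
    "summarize a long report": (900_000, 2_000),
    "generate code against a repo": (900_000, 120_000),
}

for name, (input_tokens, output_tokens) in workloads.items():
    verdict = "fits" if fits_context(input_tokens, output_tokens, WINDOW) else "does not fit"
    print(f"{name}: {verdict}")
```

Summarization spends almost the whole window on input and very little on output. Code generation may need a large share reserved for the answer itself, so the same source material can fit in one case and overflow in the other.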

Proficient (7-8): "I know where the failure modes are"

Proficient scorers don't just understand how models work — they understand how models break. This is the difference between knowing a car has an engine and knowing that the engine overheats under specific conditions.

Concrete signals:

  • Can articulate why certain prompt structures produce better results (e.g., why structured output schemas reduce hallucination, why few-shot examples anchor model behavior)
  • Understands token economics and can reason about cost-performance tradeoffs — for instance, why GPT-5.5 Pro at $30/$180 per MTok might be worth it for some tasks but wildly inefficient for others
  • Knows the practical differences between retrieval-augmented generation and pure generation, and when each is appropriate
  • Can explain why model performance on benchmarks (SWE-Bench Pro, GPQA) may not predict performance on their specific task
  • Understands multi-model workflows: using one model for planning, another for implementation, another for review — and can explain why that decomposition helps

The current convergence of coding tools is a good test case. Teams using Claude Code for architecture, Codex for implementation, and Antigravity for browser testing are making Proficient-level technical decisions. They're not just following a tutorial; they're reasoning about which model's strengths match which sub-task.
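
The token economics signal in this band comes down to arithmetic that is easy to sketch. The example below reads "$30/$180 per MTok" as input and output prices per million tokens, which is an assumption about the notation. The cheaper tier and the workload size are invented purely for comparison and do not describe any real model.

```python
def request_cost_usd(input_tokens, output_tokens, input_price, output_price):
    """Cost of one request in USD; prices are quoted per million tokens (MTok)."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Assumption: "$30/$180 per MTok" is read as input/output pricing.
PREMIUM = {"input_price": 30.0, "output_price": 180.0}
# Hypothetical cheaper tier, invented for comparison only.
BUDGET = {"input_price": 0.50, "output_price": 3.00}

# Hypothetical workload: 100,000 requests, each with 2,000 input and 500 output tokens.
REQUESTS = 100_000

for label, prices in (("premium", PREMIUM), ("budget", BUDGET)):
    per_request = request_cost_usd(2_000, 500, **prices)
    print(f"{label}: ${per_request:.4f} per request, ${per_request * REQUESTS:,.2f} total")
```

Under these assumptions the premium tier costs roughly $15,000 for the batch and the hypothetical budget tier roughly $250. Whether that difference is worth paying depends on how much the task benefits from the stronger model, which is the tradeoff a Proficient scorer reasons about.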

Expert (9-10): "I can predict model behavior before I prompt"

Expert-level Technical Understanding is rare. These are people who have internalized enough about model architectures, training processes, and inference mechanics that they can predict failure modes before encountering them.

Concrete signals:

  • Can reason about why a model might perform differently on a task framed as classification versus generation, even when the underlying question is the same
  • Understands attention mechanisms well enough to predict when long-context performance will degrade and structures prompts accordingly
  • Can evaluate new models and tools critically — not just "is it better?" but "better at what, measured how, and what are the tradeoffs?"
  • When a model like GLM-5.2 drops with 744B MoE parameters (40B active), an Expert can immediately reason about what that architecture implies for latency, cost, and task suitability — without waiting for someone else's benchmark review
  • Understands the implications of regulatory developments (Colorado AI Act enforcement, EU AI Act) on what models can be deployed where and how

We observe very few people at this level. It requires both breadth across the ecosystem and depth in understanding the underlying mechanics.
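
The mixture-of-experts signal above is a good example of reasoning that can be done on the back of an envelope. This sketch uses the 744B total and 40B active figures from the text. The 2 bytes per parameter (16-bit weights) and the rule of thumb of about 2 FLOPs per active parameter per token are simplifying assumptions, and the function name is hypothetical.

```python
def moe_profile(total_params, active_params, bytes_per_param=2):
    """Rough serving profile for a mixture-of-experts (MoE) model."""
    return {
        # Share of the weights used for any single token.
        "active_fraction": active_params / total_params,
        # Every expert typically has to be loaded, so memory follows the total count.
        "weight_memory_gb": total_params * bytes_per_param / 1e9,
        # Rule of thumb: about 2 FLOPs per active parameter per generated token.
        "flops_per_token": 2 * active_params,
    }

profile = moe_profile(total_params=744e9, active_params=40e9)
print(f"active fraction: {profile['active_fraction']:.1%}")
print(f"weight memory at 16-bit: {profile['weight_memory_gb']:,.0f} GB")
print(f"compute per token: {profile['flops_per_token'] / 1e9:,.0f} GFLOPs")
```

The takeaway is that memory needs scale with total parameters while per-token compute scales with active parameters. Under these assumptions, such a model would be expensive to host but comparatively cheap per token once hosted, which shapes who can run it and for which workloads.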

Why This Dimension Matters More Than People Think

There's a tempting argument that Technical Understanding is academic: that what matters is whether someone can get good outputs, not whether they can explain why the outputs are good. Our data pushes back on that assumption.

People with low Technical Understanding scores and high Prompting scores tend to be brittle. They've memorized patterns that work today. But when models change, when a new tool ships, or when their usual approach produces garbage on an unfamiliar task, they don't have the mental model to adapt. They're optimized for one environment.

People with strong Technical Understanding adapt faster. They can look at a new model release, reason about its likely strengths and weaknesses, and adjust their approach without waiting for a tutorial. That's the difference between someone who follows a recipe and someone who understands cooking.

What to Do With This

If you're evaluating AI capability on your team — whether for hiring, upskilling, or role design — Technical Understanding is the dimension most likely to be invisible in day-to-day work but critical when things go wrong or when the tooling shifts.

A few concrete steps:

  1. Baseline your team. Run a free AI skills assessment to see where people actually land on Technical Understanding versus where you'd assume they are. The gap is usually larger than expected.
  2. Look at the rubric. The full AISA rubric breaks down exactly what each criterion measures, so you can design targeted learning paths rather than generic "AI training."
  3. Use score bands for role expectations. A product manager probably needs Competent-level (5-6) Technical Understanding to make good tool and vendor decisions. A developer building AI-integrated systems likely needs Proficient (7-8). Set explicit expectations rather than hoping people figure it out.
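
Step 3 can be written down as an explicit check. The band boundaries below come from the score bands described earlier, and the two role minimums come from the examples in step 3. The role keys, the function names, and the assumption that scores are whole numbers from 1 to 10 are illustrative, not part of the AISA rubric.

```python
# Score bands as described in this article: (low, high, name).
BANDS = [
    (1, 2, "Novice"),
    (3, 4, "Developing"),
    (5, 6, "Competent"),
    (7, 8, "Proficient"),
    (9, 10, "Expert"),
]

# Minimum Technical Understanding score per role (role keys are hypothetical).
ROLE_MINIMUMS = {
    "product_manager": 5,       # Competent
    "ai_systems_developer": 7,  # Proficient
}

def band_for(score):
    """Map a whole-number score from 1 to 10 to its band name."""
    for low, high, name in BANDS:
        if low <= score <= high:
            return name
    raise ValueError(f"score must be a whole number from 1 to 10, got {score!r}")

def meets_expectation(role, score):
    """Return True if the score reaches the minimum set for the role."""
    return score >= ROLE_MINIMUMS[role]

for role, score in (("product_manager", 6), ("ai_systems_developer", 6)):
    status = "meets" if meets_expectation(role, score) else "is below"
    print(f"{role}: score {score} ({band_for(score)}) {status} the expectation")
```

The same score of 6 clears the bar for one role and falls short for the other, which is the point of setting expectations per role rather than using one threshold for everyone.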

Technical Understanding isn't about turning everyone into an ML engineer. It's about ensuring the people using AI every day have an accurate enough mental model to know when to trust it, when to verify it, and when to use something else entirely. That's not academic — that's operational.

Ozan Dagdeviren

Founder of AISA, the AI skills assessment platform used by professionals worldwide to measure, certify, and develop their AI fluency.

The Science Behind AISA

In 2026, Anthropic published the AI Fluency Index, the largest empirical study of AI fluency to date, analysing nearly 10,000 conversations. AISA covers 93% of the behaviours Anthropic identified as markers of AI fluency and goes even deeper with 4 additional dimensions. The U.S. Department of Labor's AI Literacy Framework (TEN 07-25) defines what every worker needs to know about AI, and AISA covers 100% of its 25 sub-competencies.

Read our analysis: Anthropic's AI Fluency Study & AISA · DOL AI Literacy Framework & AISA

AISA's framework is developed by a team with deep roots in tech, behavioural science, and AI product leadership — the rubric is informed by backgrounds spanning the Metropolitan Police, Harvard, Crowdbotics (Silicon Valley), and the European School of Economics.