Prompting & Communication: What Each Score Band Actually Looks Like in Practice

A detailed breakdown of the Prompting & Communication dimension, with concrete examples of what separates Novice from Expert across all score bands.

By AISA Team · 8 min read
Tags: scoring, methodology, dimensions, prompting, ai-assessment, hiring, scoring-rubric, ai-skills

GitHub Copilot just moved to token-based billing. Anthropic disclosed that 80%+ of its production code is now written by Claude. ChatGPT reportedly passed a billion monthly active users. The throughput of human-to-AI communication has never been higher — and the cost of doing it poorly has never been more visible.

Prompting & Communication carries 23% of the AISA rubric weight. It's the dimension most people assume they're good at. Across 454 completed assessments, it's also the dimension where the gap between self-perception and demonstrated skill is widest. People who use AI daily often score lower here than they expect, because fluency with a chat interface is not the same thing as effective communication with a model.

Let's break down what each score band actually looks like.

What This Dimension Measures

Prompting & Communication evaluates two core criteria: prompt construction quality and iterative refinement ability. The first is about how well you structure an initial request — specificity, constraint-setting, format guidance, context provision. The second is about what you do when the output isn't right. Do you rephrase vaguely? Do you diagnose what went wrong? Do you adjust your approach based on the model's behavior?

These two criteria interact. A strong initial prompt reduces the need for iteration, but the ability to iterate well is what separates someone who gets stuck from someone who converges on a useful result.

Score Bands: What Separates Each Level

Novice (1-2): Treating the Model Like a Search Engine

At this level, prompts are typically one-line queries with no structure. Think: "Write me a marketing email" or "How do I fix this bug?" There's no context about audience, tone, constraints, or desired format. When the output misses the mark, the response is usually to either accept it as-is or rephrase the same question with slightly different words.

The pattern we observe here is delegation without specification. The person knows AI can do something but hasn't internalized that the quality of the output is a function of the quality of the input. When asked during assessment to refine a prompt that produced poor results, Novice-level candidates often can't articulate why the output was poor — they just know it "wasn't what I wanted."

Developing (3-4): Adding Context, But Inconsistently

Developing-level candidates show awareness that prompts need structure. They'll include some context — maybe a role ("You are a senior developer") or a format request ("Give me a bullet list"). But the application is inconsistent. They might nail the structure on a straightforward task and then fall back to vague prompting when the task gets complex.

Iteration at this level is reactive rather than diagnostic. If the output is too long, they'll say "Make it shorter." If it's off-topic, they'll repeat the original request with emphasis. What's missing is the ability to identify which part of the prompt led to the unwanted behavior and adjust that specific element.

This is the most common band we see for people who describe themselves as "regular AI users." Daily usage doesn't automatically build prompting skill — it often just reinforces habits.

Competent (5-6): Structured Prompts, Purposeful Iteration

This is where a qualitative shift happens. Competent candidates construct prompts with multiple explicit constraints: audience, format, length, tone, exclusions, examples. They understand that a prompt is a specification, not a wish.
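
To make "a prompt is a specification" concrete, here is a minimal Python sketch that keeps each constraint in its own named field and renders them into a prompt. The `PromptSpec` class, its field names, and the sample values are hypothetical illustrations, not part of the AISA assessment or any library.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the constraints a structured prompt makes explicit."""
    task: str
    audience: str
    output_format: str
    max_words: int
    tone: str
    exclusions: list[str] = field(default_factory=list)
    examples: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Turn the spec into plain text, one labeled constraint per line."""
        lines = [
            f"Task: {self.task}",
            f"Audience: {self.audience}",
            f"Format: {self.output_format}",
            f"Length: at most {self.max_words} words",
            f"Tone: {self.tone}",
        ]
        if self.exclusions:
            lines.append("Do not include: " + "; ".join(self.exclusions))
        for i, example in enumerate(self.examples, start=1):
            lines.append(f"Example {i} of the desired style: {example}")
        return "\n".join(lines)


# Illustrative values only.
spec = PromptSpec(
    task="Write a launch email for a new invoicing feature",
    audience="finance managers at small companies",
    output_format="subject line, then three short paragraphs",
    max_words=150,
    tone="direct and friendly",
    exclusions=["pricing details", "technical jargon"],
    examples=["Close your books in half the time."],
)
prompt = spec.render()
print(prompt)
```

Compared with a one-line request, every field here is something the model would otherwise have to guess. An empty field is also easy to spot before the prompt is sent.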

More importantly, their iteration is purposeful. When output quality drops, they can diagnose the likely cause. "The model is being too generic because I didn't provide enough domain context" or "It's hallucinating details because I asked for specifics I didn't supply." They adjust one variable at a time and observe the effect.

A concrete example from assessment conversations: when asked to prompt a model for a technical document, Competent candidates will specify the target reader's expertise level, define what sections to include, and provide examples of the desired style. When the output needs work, they'll say something like, "The introduction is too high-level for this audience — I need to add a constraint about assuming the reader already understands X."
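
The "one variable at a time" habit can be sketched in a few lines of Python. The helper below is a hypothetical illustration, and the field names and values are made up for the technical-document example. It generates prompt variants that each differ from the baseline in exactly one field, so any change in output can be traced to a single adjustment.

```python
def single_change_variants(
    base: dict[str, str],
    candidates: dict[str, list[str]],
) -> list[tuple[str, dict[str, str]]]:
    """Return (label, variant) pairs; each variant changes exactly one field of base."""
    variants = []
    for field_name, values in candidates.items():
        for value in values:
            if value == base.get(field_name):
                continue  # Identical to the baseline, so nothing to test.
            variant = dict(base)  # Copy so the baseline stays untouched.
            variant[field_name] = value
            variants.append((f"{field_name} -> {value}", variant))
    return variants


# Illustrative baseline for a technical-document prompt.
base = {
    "audience": "software engineers",
    "sections": "overview, setup, troubleshooting",
    "style_example": "none provided",
}
candidates = {
    "audience": ["engineers who already understand container orchestration"],
    "style_example": ["short imperative steps, one action per line"],
}

variants = single_change_variants(base, candidates)
for label, variant in variants:
    changed = [key for key in base if base[key] != variant[key]]
    print(label, "| changed fields:", changed)
```

Each variant would then be sent to the model separately and compared against the baseline output. That step is left out here because it depends on the model and tooling in use.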

Proficient (7-8): Prompt Architecture and Model-Aware Communication

Proficient candidates don't just write good prompts — they design prompt strategies. They think about multi-turn conversation structure before they start. They anticipate where the model is likely to struggle and preemptively address those failure modes in their prompt design.

At this level, candidates demonstrate awareness of how different models handle different types of instructions. With reasoning-capable models like Claude Opus 4.8 or GPT-5.5, they understand that effort levels and thinking modes change how the model processes their request. They know when to use system-level instructions versus in-prompt constraints. They understand that context window management matters — that a 1M-token context window doesn't mean you should dump everything in without structure.

Iteration at this level looks like systematic experimentation. A Proficient candidate might say: "I'm going to try decomposing this into three sequential prompts instead of one compound prompt, because the model is losing coherence when I combine the analytical and creative tasks." They treat prompting as an engineering problem with testable hypotheses.
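
The decomposition that candidate describes might look something like the sketch below. It is a rough illustration in which `call_model` stands in for whatever model client you use. The stub model and the three step templates are hypothetical and exist only so the example runs on its own.

```python
from typing import Callable

ModelFn = Callable[[str], str]


def run_chain(steps: list[str], source_text: str, call_model: ModelFn) -> list[str]:
    """Run prompts in order; each step receives the previous step's output."""
    outputs = []
    previous = source_text
    for template in steps:
        prompt = template.format(previous=previous)
        previous = call_model(prompt)
        outputs.append(previous)
    return outputs


# Three sequential prompts instead of one compound prompt (illustrative wording).
steps = [
    "Extract the key findings from this report:\n{previous}",
    "Analyze the trade-offs between these findings:\n{previous}",
    "Write a short, engaging summary based on this analysis:\n{previous}",
]

# Stub standing in for a real model call; it records each prompt it receives.
calls: list[str] = []


def stub_model(prompt: str) -> str:
    calls.append(prompt)
    return f"output of step {len(calls)}"


outputs = run_chain(steps, "Q3 incident report text goes here", stub_model)
print(outputs)
```

Keeping the analytical and creative steps separate also makes the hypothesis testable. If the summary is weak, you can inspect the intermediate outputs to see which step lost the thread.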

This is also where we see candidates naturally reference techniques like few-shot examples, chain-of-thought elicitation, and output format enforcement — not as buzzwords they've memorized, but as tools they reach for when the situation calls for them.
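
As a rough illustration of two of those tools working together, the sketch below builds a few-shot prompt and then checks that a reply matches the requested format. The function names, the ticket-labeling task, and the JSON shape are assumptions made for the example. Only the standard-library `json` module is used.

```python
import json


def build_few_shot_prompt(instruction: str, examples: list[tuple[str, dict]], query: str) -> str:
    """Combine an instruction, worked examples, and the new input into one prompt."""
    parts = [instruction, "Reply with a single JSON object and nothing else."]
    for text, expected in examples:
        parts.append(f"Input: {text}\nOutput: {json.dumps(expected)}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


def parse_reply(reply: str, required_keys: set[str]) -> dict:
    """Enforce the output format: valid JSON object with all required keys."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    missing = required_keys - set(data)
    if missing:
        raise ValueError(f"reply is missing keys: {sorted(missing)}")
    return data


# Illustrative task: label support tickets.
examples = [
    ("App crashes when I upload a photo", {"label": "bug", "reason": "crash on upload"}),
    ("Please add dark mode", {"label": "feature_request", "reason": "asks for new option"}),
]
prompt = build_few_shot_prompt(
    "Classify the support ticket.", examples, "Login page shows a blank screen"
)
print(prompt)

# A reply in the requested shape parses cleanly; anything else raises ValueError.
parsed = parse_reply('{"label": "bug", "reason": "blank login page"}', {"label", "reason"})
print(parsed["label"])
```

The validation step is what turns a format request into format enforcement. A reply that fails the check can be retried or flagged rather than silently passed downstream.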

Expert (9-10): Communication as System Design

Expert-level prompting is rare. These candidates treat the entire human-AI communication flow as a system to be designed, not a conversation to be had.

They think about prompt reusability and composability. They design prompts that can be parameterized and reused across contexts. They understand the trade-offs between specificity and flexibility — when a highly constrained prompt produces better results versus when leaving room for the model to explore yields more useful output.
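
One simple way to picture a parameterized prompt is with Python's standard-library `string.Template`. The template text, parameter names, and the `fill` helper below are hypothetical. The point is only that the fixed instructions live in one place while the context-specific parts are passed in.

```python
from string import Template

# Reusable prompt with named parameters (illustrative wording).
REVIEW_PROMPT = Template(
    "You are reviewing $artifact for $audience.\n"
    "Focus on: $focus.\n"
    "Return at most $max_items findings as a numbered list."
)


def fill(template: Template, **params: object) -> str:
    """Fill every parameter; substitute() raises KeyError if one is missing."""
    return template.substitute(**params)


code_review = fill(
    REVIEW_PROMPT,
    artifact="a pull request",
    audience="a junior developer",
    focus="readability and missing tests",
    max_items=5,
)
contract_review = fill(
    REVIEW_PROMPT,
    artifact="a vendor contract",
    audience="a procurement lead",
    focus="termination clauses and hidden fees",
    max_items=3,
)
print(code_review)
print(contract_review)
```

Using `substitute` rather than `safe_substitute` is a deliberate choice in this sketch. A forgotten parameter fails loudly instead of sending a prompt with a literal `$focus` in it. The trade-off the paragraph describes still applies, since each parameter you add makes the template more reusable but also more rigid.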

Expert candidates demonstrate something we call failure mode fluency. They can predict, before sending a prompt, the likely ways it will fail. They build guardrails into their prompt structure proactively. When discussing their approach during assessment, they'll reference specific model behaviors: "Models tend to over-index on the last instruction in a long prompt, so I restate my most important constraints at the end."

Perhaps most distinctively, Expert candidates think about prompting in the context of broader workflows and teams. They consider how their prompts would need to change if handed to a colleague. They think about documentation and reproducibility. In an era where agent orchestration tools like Claude Code Workflows and Vercel AI SDK 6 are making multi-agent patterns accessible, Expert-level communicators are the ones who can design the instructions that make those systems actually work.
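
For reference, the five bands described in this section map onto score ranges as in the short Python sketch below. It assumes whole-number scores from 1 to 10, which is how the bands are labeled here. How AISA handles fractional scores is not stated, so the function rejects anything outside those ranges rather than guessing.

```python
# Band boundaries as given in the section headings above.
BANDS = [
    (1, 2, "Novice"),
    (3, 4, "Developing"),
    (5, 6, "Competent"),
    (7, 8, "Proficient"),
    (9, 10, "Expert"),
]


def band_for(score: int) -> str:
    """Return the band name for a whole-number score between 1 and 10."""
    for low, high, name in BANDS:
        if low <= score <= high:
            return name
    raise ValueError(f"score must be a whole number from 1 to 10, got {score!r}")


for score in (2, 4, 5, 8, 10):
    print(score, band_for(score))
```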

The Patterns That Matter

Three observations from reviewing assessment data across this dimension:

Iteration skill is more predictive than initial prompt quality. Someone who writes a mediocre first prompt but iterates precisely will consistently outperform someone who writes a strong first prompt but can't adapt when it doesn't work. This matters for hiring — you want people who can converge on good results, not people who occasionally get lucky.

Technical roles don't automatically score higher. Developers often score well on prompt structure but surprisingly poorly on iterative refinement. They're accustomed to deterministic systems where the same input always produces the same output. The probabilistic nature of LLMs requires a different mental model for debugging, and many technical candidates haven't made that shift.

Copy-paste behavior is a strong negative signal. Our anti-gaming detection flags style shifts and suspicious response patterns. But beyond gaming, copy-paste behavior during assessment reveals something substantive: the candidate can't construct prompts in real time. They're dependent on templates they've collected. That's a ceiling on their ability to handle novel tasks.

What This Means for Hiring

If you're evaluating candidates for roles where AI interaction is part of the job — which, at this point, is most roles — Prompting & Communication is the dimension that tells you whether someone can actually use the tools or just access them.

A Competent (5-6) score is a reasonable baseline for most knowledge work roles. Below that, you're looking at someone who will need structured training before they can reliably get value from AI tools. Above that, you're looking at someone who can design AI-augmented workflows for themselves and their team.

Take the free AI skills assessment to see where you land. Or read the full rubric breakdown to understand how this dimension interacts with the other four — particularly Workflow & Application, where strong prompting compounds into dramatically different productivity outcomes.

Ozan Dagdeviren

Founder of AISA, the AI skills assessment platform used by professionals worldwide to measure, certify, and develop their AI fluency.

The Science Behind AISA

In 2026, Anthropic published the AI Fluency Index, the largest empirical study of AI fluency to date, analysing 9,830 conversations. AISA covers 93% of the behaviours Anthropic identified as markers of AI fluency and goes even deeper with 4 additional dimensions. Read our white paper: Anthropic's AI Fluency Study & AISA

AISA's framework is developed by a team with deep roots in tech, behavioural science, and AI product leadership — the rubric is informed by backgrounds spanning the Metropolitan Police, Harvard, Crowdbotics (Silicon Valley), and the European School of Economics.