Why PMs Miss AI Safety Guardrails in First Prompts — And What It Reveals

AISA assessment conversations reveal a consistent blind spot: product managers rarely consider safety guardrails until prompted. We break down the pattern.

By AISA Team · 5 min read
Tags: safety, product-manager, data, assessment

The First Response Tells Us Everything

When AISA asks product managers to describe how they would scope an AI-powered feature, roughly 40% make no mention of safety, bias, or guardrails in their initial response. Not a single reference. They go straight to user value, feature scope, and delivery timeline.

This isn't a guess — it is a consistent pattern across more than 300 product manager assessments. And it maps directly to S1 (AI Safety), one of the six dimensions in the AISA scoring framework.

The number is not surprising when you consider how PMs are trained to think. But it is a problem when the feature they are scoping involves an AI system that interacts with real users.

How the Pattern Shows Up in Assessments

During an AISA assessment, Track A presents the PM with a scenario. A typical one: "Your company wants to add an AI-powered recommendation engine to its consumer product. Walk me through how you'd scope this feature."

Here is how responses typically cluster:

The 60% who mention safety unprompted usually do so in one of two ways:

  • Early integrators (scoring 7+ on S1) weave safety considerations into their scoping process from the start. They mention things like content filtering, bias testing, user opt-out mechanisms, and failure modes as part of the initial feature definition.
  • Late integrators (scoring 5-6 on S1) mention safety, but only as a final checklist item after covering user stories, success metrics, and engineering requirements.

The 40% who don't mention safety at all require Track A to specifically probe: "What could go wrong with this feature? What safeguards would you put in place?" Only then do safety considerations emerge.

This probe-dependent response is the clearest signal of an S1 score below 5. It indicates that safety thinking is available when triggered but is not part of the candidate's default product thinking.

Why PMs Default to Value-First Thinking

The 40% gap is not a knowledge gap. When probed, almost all of these candidates can articulate safety concerns. They know about bias in recommendation systems. They understand the risks of AI-generated content. The issue is that this knowledge sits in a separate mental category from "product scoping."

Three factors drive this:

1. PM Training Emphasizes User Value Above All

Product management frameworks — jobs-to-be-done, opportunity scoring, impact mapping — are overwhelmingly oriented toward identifying and delivering user value. Safety and risk assessment are typically treated as separate disciplines handled by security or legal teams. PMs learn to think "what should this feature do?" not "what should this feature never do?"

2. Safety Feels Like a Constraint, Not a Feature

In most PM mental models, safety guardrails are something you add after defining the core experience. They are constraints on the product, not part of the product. This framing is dangerous with AI features because the guardrails often are the product — a recommendation engine without content safety is not a feature, it is a liability.

3. AI Safety Is Newer Than AI Capability

PMs who have been working with AI features for a few years developed their instincts during a period when model capabilities were the primary bottleneck. Safety concerns were less visible because the models were less capable. As model capability has accelerated, the safety surface area has expanded faster than most PMs' mental models have updated.

What a Strong Safety-Aware Response Looks Like

Here is a composite of responses from candidates who scored 8+ on S1 (AI Safety):

"Before defining the recommendation logic, I'd want to identify the harm surface. What categories of content exist in our catalog? Are any of them age-sensitive, politically sensitive, or potentially harmful? I'd define a content safety taxonomy and build filtering rules before the recommendation algorithm, not after. The recommendation engine needs guardrails as a first-class architectural component — not a post-launch patch."

Notice three things about this response:

  1. Safety is framed as prerequisite, not afterthought. The candidate defines the harm surface before the feature logic.
  2. Specificity over generality. Instead of saying "we'd add safety measures," they name specific categories of risk and specific mitigation approaches.
  3. Architectural thinking. Guardrails are described as part of the system design, not as a layer applied on top.

This is exactly what S1 measures at the high end: the ability to integrate safety thinking into product decisions rather than treating it as a separate compliance exercise.
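
To make "guardrails as a first-class architectural component" concrete, here is a minimal Python sketch of the shape that design can take. Everything in it is illustrative: the category names, the Item and ContentFilter types, and the recommend function are assumptions for this post, not part of the AISA rubric or any candidate's actual answer.

```python
from dataclasses import dataclass, field

# Illustrative harm-surface taxonomy. These category names are
# assumptions for this sketch, not a prescribed standard.
UNSAFE_CATEGORIES = {"age_restricted", "political", "self_harm"}

@dataclass
class Item:
    item_id: str
    categories: set = field(default_factory=set)

class ContentFilter:
    """Guardrail defined before the ranking logic: an item that
    fails the safety check never reaches the recommender."""

    def __init__(self, blocked: set, user_opted_out: bool = False):
        self.blocked = blocked
        self.user_opted_out = user_opted_out

    def allows(self, item: Item) -> bool:
        if self.user_opted_out:  # user opt-out is a hard stop
            return False
        return not (item.categories & self.blocked)

def recommend(candidates: list, guardrail: ContentFilter, k: int = 3) -> list:
    # Safety filtering happens first; the ranking step (a placeholder
    # that keeps catalog order) only ever sees pre-filtered items.
    safe = [item for item in candidates if guardrail.allows(item)]
    return safe[:k]

catalog = [
    Item("a1"),
    Item("a2", {"political"}),
    Item("a3", {"age_restricted"}),
    Item("a4"),
]
print([i.item_id for i in recommend(catalog, ContentFilter(UNSAFE_CATEGORIES))])
# -> ['a1', 'a4']
```

The design choice worth noticing mirrors the composite response above: recommend takes the guardrail as a required argument, so there is no code path that ranks unfiltered content.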

The Correlation With Other Dimensions

In our data, S1's correlations with two other dimensions are the strongest pairings anywhere in the PM assessment:

S1 and T2 (Limitation Awareness): PMs who score high on safety awareness almost always score high on limitation awareness. This makes intuitive sense: understanding what an AI system can get wrong is the foundation for knowing what guardrails it needs.

S1 and P2 (Output Evaluation): PMs who think about safety also tend to be better at evaluating AI outputs critically. They ask "is this output safe to show a user?" alongside "is this output accurate?" — which is a more complete evaluation framework than accuracy alone.

Interestingly, S1 does not correlate strongly with P1 (Prompt Design) in the PM population. A PM can be excellent at crafting prompts and still have a blind spot around safety. Prompt skill and safety awareness appear to develop independently.
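
For readers curious what a dimension-pairing check looks like mechanically, here is a small sketch using Pearson correlation. The scores below are invented for illustration only — they are not AISA data — and the standard-library call assumes Python 3.10 or later.

```python
from statistics import correlation  # standard library, Python 3.10+

# Toy scores for six hypothetical candidates. Invented for
# illustration only -- these are NOT AISA assessment data.
s1 = [8, 3, 6, 9, 4, 7]  # AI Safety
t2 = [9, 4, 6, 8, 3, 7]  # Limitation Awareness
p1 = [7, 6, 5, 6, 7, 5]  # Prompt Design

print(f"S1 vs T2: r = {correlation(s1, t2):+.2f}")  # strongly positive
print(f"S1 vs P1: r = {correlation(s1, p1):+.2f}")  # near zero
```

With real assessment data you would also want confidence intervals before reading anything into a pairing, but the mechanics are the same.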

What This Means for Hiring

If you are hiring a PM to lead an AI-powered product, the AISA S1 score is arguably the most important single dimension to look at. Here is why:

Prompt design can be taught in a workshop. Model knowledge updates naturally over time. But safety thinking is a mindset — it requires a PM to instinctively consider failure modes and harms as part of their core product process. PMs who don't think this way will consistently de-prioritize safety work in favor of feature delivery, and you won't notice until something goes wrong in production.

This pattern should concern any organization building AI products. It suggests that a significant share of experienced PMs, when given full freedom to scope an AI feature, will not consider safety until someone else raises it.

For PMs Preparing for AISA

If you are a product manager preparing for an AISA assessment, here is a practical exercise: take any AI feature you have worked on and write a one-page scoping document. Then review it and count how many sentences address what the feature should not do, versus what it should do.

If the ratio is less than 1:3, your safety instinct needs strengthening. Practice rewriting feature specs with explicit harm surfaces, failure modes, and guardrail requirements as first-class sections — not appendices.
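
If you want a rough automated version of that count, the sketch below applies a crude keyword heuristic to a spec. The keyword list is an assumption for this sketch; the 1:3 threshold comes from the exercise above. Nothing this simple replaces actually reading your document.

```python
import re

# Crude markers for "what the feature should NOT do". This keyword
# list is an assumption for the sketch, not an AISA rule.
NEGATIVE_SCOPE = re.compile(
    r"\b(should not|must not|never|block|filter|guardrail|"
    r"fail|harm|opt[- ]out|abuse)\b",
    re.IGNORECASE,
)

def scope_ratio(spec: str) -> tuple:
    """Return (sentences addressing negative scope, total sentences)."""
    sentences = [s for s in re.split(r"[.!?]+\s*", spec) if s.strip()]
    negative = sum(1 for s in sentences if NEGATIVE_SCOPE.search(s))
    return negative, len(sentences)

spec = (
    "The engine recommends items based on watch history. "
    "It should surface fresh catalog additions. "
    "It must never recommend age-restricted items to minors. "
    "Users can opt out of personalization entirely."
)
neg, total = scope_ratio(spec)
print(f"{neg} of {total} sentences address negative scope")
if neg * 3 < total - neg:  # i.e. the ratio falls below 1:3
    print("Below the 1:3 rule of thumb -- strengthen the safety sections.")
```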

The AISA rubric is transparent about what S1 measures. A score of 8+ requires demonstrating that safety is integrated into your product thinking, not bolted on when someone asks about it. The PMs who miss it on the first pass are the ones who haven't made that shift yet.

The good news: awareness of the gap is the first step to closing it. The PMs who score highest on S1 are not the ones with the most AI safety knowledge — they are the ones who have trained themselves to ask "what could go wrong?" before "what could go right?"

Learn more about how AISA assesses product managers.

Ready to try the AI skills assessment yourself?