Apple's Siri AI Lets Users Pick Their Own Model — Your Team's AI Skills Just Got Harder to Assess
iOS 27 lets users choose their default AI model. This fragments the tools your team uses daily and makes Technical Understanding the most critical hiring dimension.
At WWDC last week, Apple announced something that most coverage buried under the Gemini-powered Siri AI headline: iOS 27 Extensions will let users select any third-party AI model as their default assistant.
This isn't just a design decision. It's a fragmentation event, and it has direct consequences for how you assess AI skills on your team.
One OS, Many Models — the End of a Shared Baseline
Until now, AI tool selection was mostly an organizational decision. Your company standardized on Copilot, or Claude, or ChatGPT. Developers worked within a shared context. Product managers referenced the same model's behavior when discussing capabilities. When someone said "the AI can't do X," everyone knew which AI they meant.
That's over. With iOS 27, one of the most widely used computing platforms in the enterprise will let individual users swap their underlying AI model the way they swap keyboards. One designer on your team will be running Gemini 3.5 Flash. Another will default to Claude Fable 5. A third might stick with whatever OpenAI offers through the extension API.
The practical result: your team members will develop AI intuitions calibrated to different models, with different capability boundaries, different failure modes, and different safety behaviors — and they may not even realize it.
Why This Hits Technical Understanding Hardest
AISA's assessment rubric measures five dimensions. The one most directly affected by per-user model selection is Technical Understanding, which accounts for 20% of the overall score. This dimension evaluates whether someone understands how AI models actually work — their architectures, limitations, and behavioral differences.
Here's the problem: when everyone uses the same model, you can get away with shallow technical understanding. You learn the quirks of one system. You memorize its failure patterns. You build habits around its specific context window and output style. That looks like competence, but it's actually pattern-matching on a single tool.
In a multi-model environment, that breaks down fast. Consider the concrete differences between the models Apple users will be choosing from:
- Claude Fable 5 routes certain queries (cybersecurity, biology, chemistry) through safety classifiers to a different model entirely. A user who doesn't understand this will be confused when their assistant's behavior suddenly shifts mid-conversation.
- GPT-5.5 supports five levels of reasoning effort (none through xhigh), with pricing that doubles beyond 272K tokens. Someone who doesn't understand reasoning effort will either overpay or get worse outputs.
- Gemini 3.5 Flash optimizes for speed (284 tok/s) at the cost of depth. A user who treats it like a frontier reasoning model will get fast, confidently wrong answers.
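To make the pricing point in the GPT-5.5 bullet concrete, here is a rough Python sketch of how a long-context price step changes the bill. The 272K threshold and the doubling come from the description above. The base rate is a made-up placeholder, and the assumption that the doubled rate applies to the whole request once it crosses the threshold is only one possible reading of the rule.

```python
# Sketch only: the base rate is a hypothetical placeholder, not a published price.
LONG_CONTEXT_THRESHOLD = 272_000      # tokens; the threshold cited in the article
HYPOTHETICAL_RATE_PER_MILLION = 1.00  # USD per million input tokens (made up)


def estimate_input_cost(input_tokens: int,
                        rate_per_million: float = HYPOTHETICAL_RATE_PER_MILLION) -> float:
    """Estimate input cost in USD.

    Assumes the doubled rate applies to the entire request once it exceeds
    the threshold. That is an assumption for illustration, not a documented rule.
    """
    multiplier = 2.0 if input_tokens > LONG_CONTEXT_THRESHOLD else 1.0
    return input_tokens / 1_000_000 * rate_per_million * multiplier


if __name__ == "__main__":
    # Compare requests just below, at, and above the threshold.
    for tokens in (200_000, 272_000, 300_000):
        print(f"{tokens:>7} tokens -> ${estimate_input_cost(tokens):.2f}")
```

Under these assumptions, a 300,000-token request costs three times as much as a 200,000-token one despite being only 1.5 times longer. That is the kind of step a user cannot anticipate without knowing the threshold exists.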
The person who understands why these models behave differently — not just that they behave differently — is the one who'll remain effective regardless of which model their phone, their laptop, or their company's API happens to be running.
What This Means for Hiring Managers
Three specific things to change:
1. Stop Assessing AI Skills Against a Single Tool
If your interview process involves watching a candidate use ChatGPT (or Claude, or Copilot) to complete a task, you're testing tool familiarity, not AI competence. That distinction always mattered. Now it matters urgently, because you can no longer assume your new hire will use the same model in their daily work that they demonstrated in the interview.
A conversational AI assessment that evaluates underlying reasoning gives you signal that transfers across models. Can this person decompose a problem for an AI? Can they tell when poor output reflects a model limitation rather than a prompting failure?
2. Probe for Model-Switching Judgment
The highest-scoring candidates in our assessments — those landing in the Architect and Oracle personas — demonstrate something specific: they can articulate when and why to choose a different model for a different task. They don't just have a favorite tool. They have a framework for matching model capabilities to task requirements.
In our early data across 671 completed assessments, we observe that most candidates struggle to explain the practical differences between models beyond surface-level brand preferences. This is a gap that's about to become very visible in day-to-day work.
3. Weight Technical Understanding More Heavily in Your Evaluation
AISA weights Technical Understanding at 20%, third among the five dimensions. For roles where team members will regularly choose and switch between models (which, post-iOS 27, covers most knowledge-work roles), pay outsized attention to this dimension when interpreting assessment results.
A candidate who scores 8 on Prompting & Communication but 4 on Technical Understanding has most likely memorized effective patterns for one model. That's the Enthusiast persona: high energy, real output, but brittle when the environment changes. A candidate who scores 6 on both dimensions is likely to adapt faster when their daily AI tool changes underneath them.
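One way to apply this comparison when reading results is to look at the gap between the two dimension scores, not only their levels. The Python sketch below is a hypothetical heuristic, not AISA's scoring logic. The function name and the gap threshold of 3 are assumptions chosen for illustration.

```python
def looks_brittle(prompting: int, technical: int, gap_threshold: int = 3) -> bool:
    """Flag a profile whose Prompting & Communication score outruns its
    Technical Understanding score by at least `gap_threshold` points.

    Hypothetical heuristic for illustration; the threshold is an assumption.
    """
    return (prompting - technical) >= gap_threshold


if __name__ == "__main__":
    # The two candidate profiles described in the text.
    print(looks_brittle(prompting=8, technical=4))
    print(looks_brittle(prompting=6, technical=6))
```

With the assumed threshold, the 8/4 profile is flagged and the 6/6 profile is not, which matches the reading given above. Treat a flag as a prompt for follow-up questions about model differences, not as a verdict.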
The Deeper Pattern: AI Skill Is Becoming Model-Agnostic Skill
Apple's decision to make AI models swappable at the OS level is the clearest signal yet that AI proficiency can't be defined by proficiency with a specific AI product. This is the same trajectory we saw with programming languages — hiring managers stopped asking "do you know Java?" and started asking "can you reason about systems?" The transition took a decade. With AI models, it's happening in months.
The AI-native hiring guide we've put together addresses this directly: how to evaluate AI skills that are durable across the current model generation and the next one.
What To Do This Week
If you manage a team of developers, product managers, or designers, run a simple exercise: ask each person which AI model they primarily use and why. Then ask them to name one task where a different model would be a better choice, and explain the technical reason.
The answers will tell you more about your team's AI readiness than any certification. And if most of your team can't answer the second question, you've found the exact gap that iOS 27's model-selection feature is about to expose.
You can get a structured baseline with a free AI skills assessment — 489 people have taken one in the last 30 days alone, and the Technical Understanding dimension is consistently where the sharpest differentiation shows up between people who use AI and people who understand it.
Learn more about how AISA assesses developers.

Ozan Dagdeviren
Founder of AISA, the AI skills assessment platform used by professionals worldwide to measure, certify, and develop their AI fluency.