
Not All Models Are Made Equal

Neil Edwards, 5 December 2025
llm, models, openai, anthropic, ai-strategy, education

Introduction

When people talk about "AI" in the workplace, they often treat it as a single, monolithic thing. But the reality is far more nuanced. The large language models (LLMs) that power tools like M365 Copilot, GitHub Copilot, and ChatGPT are not all the same. They differ in capability, speed, cost, and suitability for different tasks.

Understanding these differences isn't just academic — it directly affects how well AI works for you, how much it costs the organisation, and which tool is the right choice for a given task.

What is a Model?

At its core, a large language model is a mathematical system trained on vast amounts of text data. It learns patterns in language — grammar, facts, reasoning patterns, coding conventions — and uses those patterns to generate responses to your prompts.

Different models are trained differently, on different data, with different architectures and objectives. The result is a landscape of models with different strengths and weaknesses.

The Model Landscape

OpenAI GPT Family

OpenAI's GPT (Generative Pre-trained Transformer) models are the most widely known:

  • GPT-4o — The flagship model powering M365 Copilot. Balances capability with speed. Excellent for general business tasks, writing, analysis, and conversation.
  • GPT-4 — The previous generation. Still highly capable but slower and more expensive than GPT-4o.
  • GPT-3.5 — Faster and cheaper, but noticeably less capable for complex reasoning tasks. Still useful for simple, high-volume tasks.

At Waystone, GPT-4o is the primary model behind M365 Copilot, delivered through the Azure OpenAI Service — Microsoft's enterprise-grade, secure deployment of OpenAI models within Azure data centres.

Anthropic Claude

Anthropic's Claude models represent a different approach to AI development, with a strong focus on safety and helpfulness:

  • Claude Sonnet — A balanced model that offers strong reasoning and coding capabilities at moderate cost. Good for most business tasks.
  • Claude Opus — The most capable Claude model. Excels at complex reasoning, nuanced analysis, and tasks requiring deep understanding. Higher cost, but delivers premium quality.
  • Claude Haiku — Fast and cost-effective. Ideal for high-volume, simpler tasks where speed matters more than depth.

At Waystone, Claude models are used for advanced reasoning, coding assistance, and agentic workflows through approved enterprise channels.

Why Does It Matter?

Cost

Models vary dramatically in cost. Running every query through the most powerful model available would be like using a Formula 1 car for the school run — technically capable, but wildly impractical.

Smart organisations match the model to the task:

  • Simple email drafting? A faster, cheaper model works perfectly.
  • Complex regulatory analysis? The premium model delivers better results and is worth the cost.
  • High-volume data processing? Cost-effective models keep expenses manageable.
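
As a rough illustration, the matching above can be thought of as a simple lookup from task type to model tier. This is a minimal sketch only, not a description of how any Waystone tool actually routes requests: the tier names, the task labels, and the `pick_tier` function are all hypothetical.

```python
from enum import Enum


class Tier(Enum):
    """Hypothetical model tiers, from cheapest/fastest to most capable."""
    FAST = "fast"
    BALANCED = "balanced"
    PREMIUM = "premium"


# Illustrative mapping based on the three examples in the list above.
# The task labels are assumptions made for this sketch.
TASK_TIERS = {
    "email_draft": Tier.FAST,            # simple drafting
    "bulk_data_processing": Tier.FAST,   # high volume, cost-sensitive
    "regulatory_analysis": Tier.PREMIUM, # complex, quality-critical
}


def pick_tier(task_type: str) -> Tier:
    """Return the tier for a known task, or a middle tier if unknown."""
    return TASK_TIERS.get(task_type, Tier.BALANCED)


if __name__ == "__main__":
    for task in ("email_draft", "regulatory_analysis", "meeting_summary"):
        print(task, "->", pick_tier(task).value)
```

Defaulting unknown tasks to a middle tier is one possible design choice: it avoids paying premium rates by accident while still giving reasonable quality.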

Quality

More capable models generally produce:

  • Better reasoning and analysis
  • More nuanced and accurate responses
  • Fewer errors and hallucinations
  • Better handling of complex, multi-step tasks

But "better" isn't always necessary. For straightforward tasks, simpler models are perfectly adequate.

Speed

There's typically a trade-off between capability and speed:

  • Smaller, faster models respond almost instantly
  • Larger, more capable models take longer to generate responses
  • For real-time interactions (chat, quick lookups), speed may matter more than maximum quality

Security and Compliance

At Waystone, we only use models deployed in enterprise-grade environments:

  • Azure OpenAI Service — Models run within Azure with enterprise data protection, no data used for training, and full compliance with our security requirements
  • Enterprise Anthropic — Claude models accessed through approved, secure channels
  • No public APIs — We never use public consumer APIs for company data

What This Means for You

When Using M365 Copilot

M365 Copilot handles model selection for you. Microsoft optimises which model version handles your request based on the task type. You don't need to worry about model selection — just focus on writing good prompts.

When Using GitHub Copilot

GitHub Copilot uses specialised coding models optimised for software development. These models understand code structure, programming languages, and development patterns in ways that general-purpose models don't.

When Evaluating New AI Tools

If you're considering a new AI-powered tool or service for Waystone:

  1. Ask which model it uses — This tells you about its capabilities and limitations
  2. Check the deployment model — Is it enterprise-grade? Where is the data processed?
  3. Understand the cost structure — Per-user, per-query, or per-token pricing all have different implications
  4. Assess the fit — Is the model appropriate for the task, or is it overkill (or underkill)?
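
To see why the cost structure in point 3 matters, here is a small sketch comparing per-token and per-user pricing for the same workload. Every number in it is a made-up placeholder rather than a real vendor rate, and the function names are hypothetical. Substitute the figures from the supplier's own price list.

```python
def per_token_monthly_cost(
    queries_per_month: int,
    input_tokens_per_query: int,
    output_tokens_per_query: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimate monthly spend when the vendor charges per token.

    Prices are expressed per one million tokens, with input and
    output tokens priced separately.
    """
    total_input = queries_per_month * input_tokens_per_query
    total_output = queries_per_month * output_tokens_per_query
    return (
        total_input * input_price_per_million
        + total_output * output_price_per_million
    ) / 1_000_000


def per_user_monthly_cost(users: int, price_per_user: float) -> float:
    """Estimate monthly spend when the vendor charges a flat fee per seat."""
    return users * price_per_user


if __name__ == "__main__":
    # Placeholder workload and prices, for illustration only.
    token_cost = per_token_monthly_cost(
        queries_per_month=10_000,
        input_tokens_per_query=1_000,
        output_tokens_per_query=500,
        input_price_per_million=1.0,
        output_price_per_million=4.0,
    )
    seat_cost = per_user_monthly_cost(users=20, price_per_user=25.0)
    print(f"Per-token estimate: {token_cost:.2f}")
    print(f"Per-user estimate:  {seat_cost:.2f}")
```

Per-token pricing scales with usage, so it tends to favour light or uneven workloads, while per-user pricing is predictable but is paid whether or not each seat is used.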

The Future

The model landscape is evolving rapidly:

  • Models are getting cheaper — What was premium-tier capability a year ago is now available at a fraction of the cost
  • Specialisation is increasing — Models designed for specific tasks (coding, analysis, creative work) are becoming more common
  • Multimodal capabilities — Models that understand images, audio, and video alongside text
  • Smaller, efficient models — Running capable AI on smaller infrastructure, enabling new deployment options

Waystone's AI Working Group continuously evaluates new models and capabilities to ensure we're using the best tools available while maintaining our security and compliance standards.

Key Takeaways

  1. Not all AI is the same — Different models have different strengths, speeds, and costs
  2. Match the tool to the task — The most powerful model isn't always the right choice
  3. Enterprise deployment matters — How a model is hosted is as important as which model it is
  4. The landscape changes fast — What's cutting-edge today will be standard tomorrow
  5. You don't need to be an expert — But understanding the basics helps you make better use of AI tools and ask better questions when evaluating new ones

Want to Learn More?

  • Explore the Prompting Guide to make the most of whichever model you're using
  • Browse the FAQ for answers to common questions about AI at Waystone
  • Join the discussion in the AI Hub Team on Microsoft Teams