Microsoft Copilot: Why Stale Answers Aren’t Always Hallucinations

Most Copilot readiness conversations focus on two things: licensing and permissions. Make sure users have the right licence. Make sure SharePoint isn’t overshared. That covers the baseline. What it doesn’t cover is something I started noticing during a Copilot deployment: users in the same tenant, with the same permissions, getting noticeably inconsistent Copilot responses. The difference wasn’t what they could access. It was how their OneDrive was configured.

The concern itself is not unusual. In readiness assessments I have run recently, organisations frequently raise stale content as a deployment risk before any consultant flags it. People who know their own document estate understand that years of accumulated material, much of it superseded, is sitting in places Copilot will reach. What they don’t always see is the mechanism by which Copilot ends up preferring that content.

What Microsoft Documents

Microsoft’s Copilot grounding model is reasonably well documented at a high level. Copilot uses the Microsoft Search index and, more specifically, the Semantic Index for Microsoft 365 to retrieve organizationally relevant content before generating a response.

The Semantic Index operates at two layers. The tenant-level index covers content across the organization. The user-level index is scoped to the individual, and it incorporates Microsoft Graph activity signals: files recently opened, edited, or shared, and content the Graph considers trending around that user.

This personalization is by design. The system is trying to surface what is most relevant to the individual, not just what exists in the tenant. The consequence is that two users with identical permissions can receive meaningfully different Copilot responses, depending on their interaction history.
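A toy sketch can make this concrete. The scoring below is not Microsoft's ranking algorithm, which is not public. The `base_relevance` values, the `activity_boost` weight, and the function name are all assumptions chosen for illustration. It only shows how a per-user activity term can reorder the same permitted content for two different users.

```python
from collections import Counter

def rank_for_user(base_relevance, recent_opens, activity_boost=0.5):
    """Rank documents by a shared base score plus a per-user activity term.

    base_relevance: dict of document name -> score shared by all users (assumed).
    recent_opens: list of document names this user opened recently (assumed signal).
    activity_boost: hypothetical weight added per recent open.
    """
    opens = Counter(recent_opens)
    scored = {
        doc: score + activity_boost * opens[doc]
        for doc, score in base_relevance.items()
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scored, key=lambda doc: (-scored[doc], doc))

# Both users have access to the same two documents.
base = {"pricing-2021.docx": 1.0, "pricing-2024.docx": 1.2}

user_a = rank_for_user(base, ["pricing-2024.docx"])
user_b = rank_for_user(base, ["pricing-2021.docx"] * 3)
```

With identical access, user A's ranking leads with the 2024 document, while user B's three opens of the 2021 document push it to the top.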

Microsoft’s official readiness guidance focuses almost entirely on the permissions layer. Tighten access controls. Run access reviews. Clean up legacy sites. That framing isn’t wrong. It’s just incomplete, because it addresses what Copilot can surface, not what Copilot is more likely to surface for a specific user.

Two Search Systems That Don’t Talk to Each Other

Before explaining what I observed, it is worth establishing one technical distinction that isn’t always clear in the documentation.

Windows Search and Microsoft Search are separate systems. When a user syncs a SharePoint library to their device via the OneDrive sync client, the synced content becomes part of the local Windows Search index. Files become discoverable through File Explorer and the Windows Start menu search bar.

Microsoft Search, which is the cloud-side search engine that feeds Copilot grounding, indexes SharePoint and OneDrive content at the cloud level. That indexing happens regardless of whether content has been synced locally. Sync does not expand what Microsoft Search can find. That content was already there.

These two systems do not share an index. Local Windows Search results do not feed directly into Copilot grounding.

What Actually Happens

Microsoft has documented Graph activity-based relevance signals, personalized Semantic Index behavior, and user-context weighting as components of Copilot grounding. What Microsoft has not publicly detailed is the precise mechanics of how OneDrive sync activity interacts with those signals. The behavior described here appears to emerge from those documented systems operating together.

The impact of sync on Copilot is indirect, but it is real.

Syncing a SharePoint library doesn’t change what Copilot can access. It changes what the user is likely to interact with. Synced content appears in File Explorer. It is discoverable through Windows Search. It becomes part of the user’s normal daily navigation in a way that content sitting in a SharePoint site, accessible only through a browser, does not.

When a user finds a file through a local search and opens it in Word or Excel, that interaction generates a Graph activity signal cloud-side. The signal isn’t created by the Windows Search operation itself. It is created by the file open that follows. The local search surface acts as a discovery amplifier: it increases the realistic probability that a user will interact with content they might otherwise never have navigated to deliberately.
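One place this kind of activity becomes visible is the Microsoft Graph insights API, where `GET /me/insights/used` returns files a user has recently viewed or modified. Whether Copilot grounding consumes exactly this signal is not established here, so treat the sketch below as a way to inspect a user's recent file activity, not as a view into the Semantic Index. It works on an already-fetched response body. The sample payload, its URLs, and the function name are invented for illustration.

```python
from collections import Counter

def opens_by_container(used_response):
    """Count recently used files per containing site or library.

    used_response: parsed JSON body from GET /me/insights/used.
    """
    counts = Counter()
    for item in used_response.get("value", []):
        visual = item.get("resourceVisualization", {})
        # containerWebUrl identifies where the file lives; fall back if absent.
        counts[visual.get("containerWebUrl", "(unknown)")] += 1
    return counts

# Illustrative payload only; titles, URLs, and timestamps are made up.
sample = {
    "value": [
        {
            "lastUsed": {"lastAccessedDateTime": "2024-05-02T09:15:00Z"},
            "resourceVisualization": {
                "title": "Bid 2019.docx",
                "containerWebUrl": "https://contoso.sharepoint.com/sites/archive",
            },
        },
        {
            "lastUsed": {"lastAccessedDateTime": "2024-05-02T09:40:00Z"},
            "resourceVisualization": {
                "title": "Bid 2020.docx",
                "containerWebUrl": "https://contoso.sharepoint.com/sites/archive",
            },
        },
        {
            "lastUsed": {"lastAccessedDateTime": "2024-05-03T11:05:00Z"},
            "resourceVisualization": {
                "title": "Plan.docx",
                "containerWebUrl": "https://contoso.sharepoint.com/sites/projects",
            },
        },
    ]
}

counts = opens_by_container(sample)
```

A user whose counts are dominated by an archive site is a reasonable candidate for a closer look at how that content is being reached.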

During one tenant assessment, the OneDrive sync surface covered several million files that were already eligible for Copilot grounding, the vast majority with no recent interaction whatsoever. The pattern was consistent across the user base: large archive libraries had been synced years earlier so that users could search them quickly through File Explorer. Those libraries had not received new documents in over eighteen months. Users were still browsing them regularly, opening files to check details, and navigating the folder structures locally. Each of those interactions was generating Graph activity signals for archive content, gradually weighting it upward in their personalized grounding layer.

From the Semantic Index perspective, that content looked highly active for those users. The mechanism wasn’t sync. It was the interaction behavior that sync had made habitual and convenient.
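A rough way to quantify this pattern for a single user is to measure how much of their recent file activity lands on content that has not changed in a long time. The sketch below assumes you already have an export of open events with each file's last-modified date. That input shape, the function name, and the 548-day threshold (roughly eighteen months) are assumptions, not a documented Microsoft metric.

```python
from datetime import date, timedelta

def stale_interaction_share(open_events, today, stale_after_days=548):
    """Fraction of a user's file opens that hit content not modified recently.

    open_events: list of (path, last_modified_date) tuples, one per open
                 (hypothetical export format).
    stale_after_days: hypothetical threshold, roughly eighteen months.
    """
    if not open_events:
        return 0.0
    cutoff = today - timedelta(days=stale_after_days)
    stale = sum(1 for _path, modified in open_events if modified < cutoff)
    return stale / len(open_events)

today = date(2025, 1, 15)
events = [
    ("Archive/Bids/2019-tender.docx", date(2019, 6, 1)),
    ("Archive/Bids/2020-tender.docx", date(2020, 3, 12)),
    ("Archive/Contracts/msa-v1.docx", date(2021, 11, 30)),
    ("Projects/Current/plan.docx", date(2025, 1, 10)),
]

share = stale_interaction_share(events, today)  # 3 of 4 opens are stale: 0.75
```

A high share does not prove that Copilot is favouring archive content for that user, but it identifies the interaction pattern described above.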

There Are Important Details to Note

The distinction between permissions and relevance weighting matters practically. A well-governed SharePoint environment with correct access controls can still produce degraded Copilot grounding if interaction patterns are skewed by sync behaviour.
Stale content in SharePoint is already a Copilot risk without sync. If old content exists in a site the user has access to, Copilot can surface it. Sync doesn’t create that risk. What sync does is increase the realistic probability that the user will interact with that content, which elevates its weight in the personalized layer specifically for that individual.
The issue is not the presence of synced files. A synced archive that is never browsed and never opened generates no additional Graph signals. The risk comes from the interaction pattern that local discoverability encourages over time.
Volume matters because it expands the discovery surface. A synced folder with 200 files has limited impact. A synced archive with tens of thousands of files, navigated regularly through File Explorer, creates a much wider surface for incidental interaction with historical content. In environments where library sync has been used liberally for years, the cumulative synced surface can run into the millions of files.
This is not a hallucination problem. Copilot is surfacing real content that the user genuinely interacted with. It is surfacing the wrong real content. That distinction matters when diagnosing the issue, because the remediation path is different from a permissions or index coverage problem.
The user-level layer of the Semantic Index is per-user. Two users in the same team, with the same SharePoint access, can receive meaningfully different Copilot responses based solely on differences in their sync configuration and resulting interaction patterns.

What to Do About It

The remediation follows directly from understanding the mechanism.

Start by auditing sync configurations for users reporting inconsistent or historically skewed Copilot responses. In OneDrive admin settings, you can restrict sync to domain-joined devices and limit which site collections users are permitted to sync. Restricting sync of archive or legacy libraries removes the local discovery surface that drives incidental interaction with stale content. This is a recommendation I now include as standard in Copilot deployment designs, alongside the grounding scope review and label strategy.
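The audit step can be sketched as a simple filter over an inventory of synced libraries. The inventory format here is hypothetical (the keys `url`, `file_count`, `last_new_document`, and `synced_users` are assumptions about what your own export contains), and the thresholds are illustrative defaults to tune for your tenant.

```python
from datetime import date

def archive_sync_candidates(libraries, today, idle_days=548, min_files=10_000):
    """Return URLs of synced libraries that look like archives.

    libraries: list of dicts with hypothetical keys 'url', 'file_count',
               'last_new_document' (date), and 'synced_users' (int).
    idle_days, min_files: illustrative thresholds, not Microsoft guidance.
    """
    flagged = []
    for lib in libraries:
        idle = (today - lib["last_new_document"]).days
        is_synced = lib["synced_users"] > 0
        if is_synced and idle >= idle_days and lib["file_count"] >= min_files:
            flagged.append(lib["url"])
    return flagged

# Invented inventory for illustration.
inventory = [
    {"url": "https://contoso.sharepoint.com/sites/archive/Bids",
     "file_count": 40_000, "last_new_document": date(2022, 3, 1), "synced_users": 25},
    {"url": "https://contoso.sharepoint.com/sites/projects/Current",
     "file_count": 12_000, "last_new_document": date(2025, 1, 10), "synced_users": 30},
    {"url": "https://contoso.sharepoint.com/sites/archive/Templates",
     "file_count": 200, "last_new_document": date(2020, 5, 5), "synced_users": 4},
]

flagged = archive_sync_candidates(inventory, date(2025, 1, 15))
```

Only the large, idle, still-synced library is flagged. The small template folder is left alone, in line with the point that volume is what widens the discovery surface.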

For users who need access to historical libraries, the answer is not to remove access. It is to change how that access is used. SharePoint in the browser, accessed deliberately through search or navigation, generates interaction signals only when the user has a specific reason to open something. It does not create the passive, habitual browsing pattern that File Explorer sync encourages. Bookmark the library. Search it in context when needed. Don’t sync it.

From an information architecture standpoint, archive libraries should be separated from active project libraries at the site collection level. This makes it straightforward to apply targeted sync restrictions to historical content without affecting current working libraries.

There is also a short conversation to have with users. Syncing a folder to your device makes that folder part of your daily navigation. Over time, content you browse regularly, even without reading carefully, accumulates relevance signals that influence what Copilot believes matters to you. Old project archives synced for search convenience gradually become part of your Copilot’s working context.

Conclusion

The permissions layer of Copilot readiness is necessary but not sufficient. Getting permissions right prevents Copilot from surfacing content users shouldn’t see. Managing interaction patterns determines whether Copilot surfaces content users actually need.

Sync has always been a performance and convenience trade-off. In a Copilot-enabled tenant, it is also a relevance trade-off, not because sync changes what Copilot can access, but because it changes what users are likely to interact with. That distinction isn’t in the official readiness guidance yet. It probably should be.

I hope this helps frame what to look for during your own Copilot readiness reviews.
