5 April 2026 | 5 min read | Security

Secure AI: How We Keep Your Documents Private

The rise of AI has created a tension for organisations: they want the productivity gains of AI, but they can't risk exposing sensitive documents to systems they don't trust. It's a valid concern — and it's the reason we built DocInsightHub AI with security as a foundation, not an afterthought.

This article explains how we keep your documents private, your data isolated, and your AI responses trustworthy.

The Core Problem with Public AI Tools

When employees paste company documents into public AI tools like ChatGPT, they're sending that data to external servers with no guarantee about how it will be stored, used, or retained. For organisations handling sensitive policies, contracts, or compliance documents, this creates unacceptable risk.

Common concerns include:

  • Data being used to train public models
  • No control over who processes the data
  • No audit trail of what was shared
  • No role-based access — anyone with the tool can query anything
  • No way to delete data once submitted

DocInsightHub AI was designed specifically to address every one of these concerns.

How We Protect Your Data

1. Tenant Isolation

Every organisation on DocInsightHub AI operates as a completely separate tenant. This means:

  • Your documents are never visible to other organisations
  • Your search results only include your own documents
  • Your AI agents are grounded exclusively in your data — never trained on it
  • Cross-tenant access is technically and logically prevented at the database and application level

This isn't just a permissions layer — it's a fundamental architectural decision enforced at every level of the system.
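To make the idea concrete, here is a minimal sketch of tenant scoping enforced inside the data layer itself, rather than left to callers. The class and field names are illustrative, not DocInsightHub AI's actual schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Document:
    doc_id: str
    tenant_id: str
    title: str

class DocumentStore:
    """In-memory stand-in for a tenant-scoped document repository."""

    def __init__(self):
        self._docs: list[Document] = []

    def add(self, doc: Document) -> None:
        self._docs.append(doc)

    def search(self, tenant_id: str, term: str) -> list[Document]:
        # The tenant filter is applied inside the data layer itself,
        # so no caller can ever retrieve another tenant's documents.
        return [
            d for d in self._docs
            if d.tenant_id == tenant_id and term.lower() in d.title.lower()
        ]

store = DocumentStore()
store.add(Document("d1", "acme", "Expenses Policy"))
store.add(Document("d2", "globex", "Expenses Policy"))

results = store.search("acme", "policy")
```

Because every query path goes through `search`, there is no code path that returns cross-tenant results, even when two organisations hold identically named documents.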

2. Role-Based Access Control (RBAC)

Not everyone in your organisation should have access to every document. DocInsightHub AI enforces role-based access:

  • Organisation Admins manage users, documents, and agents
  • Organisation Users can query documents they're authorised to access
  • Platform Admins manage the infrastructure but don't access tenant data

Access is enforced at the backend — it can't be bypassed by manipulating the frontend or API calls.
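The pattern behind backend enforcement can be sketched as a server-side role check that runs before any handler does. The roles and handler below are hypothetical, for illustration only:

```python
from functools import wraps

class Forbidden(Exception):
    """Raised when a caller's role does not permit the requested action."""

def require_role(*allowed: str):
    """Decorator enforcing the caller's role server-side, before the handler runs."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(user, *args, **kwargs):
            if user.get("role") not in allowed:
                raise Forbidden(f"role {user.get('role')!r} may not call {handler.__name__}")
            return handler(user, *args, **kwargs)
        return wrapper
    return decorator

@require_role("org_admin")
def delete_document(user, doc_id):
    return f"deleted {doc_id}"

admin = {"role": "org_admin", "org": "acme"}
member = {"role": "org_user", "org": "acme"}

result = delete_document(admin, "doc-1")
try:
    delete_document(member, "doc-1")
    denied = False
except Forbidden:
    denied = True
```

Because the check lives in the backend handler, a manipulated frontend or a hand-crafted API call still hits the same `Forbidden` wall.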

3. No Data Used for AI Training

We use Azure OpenAI for embedding generation and answer generation. Microsoft's enterprise AI services come with clear commitments:

  • Your data is not used to train, retrain, or improve any Microsoft or OpenAI models
  • Processing happens within Azure's controlled environment
  • Requests are scoped per tenant and per session

This is fundamentally different from using consumer AI tools, where data handling is opaque.

4. Full Audit Trail

Every significant action in DocInsightHub AI is logged:

  • Document uploads, updates, and deletions
  • User queries and AI responses
  • Role changes and membership updates
  • Agent configuration changes

This audit trail is essential for compliance, investigations, and demonstrating governance. You can see exactly who asked what, when, and what answer they received.
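An append-only audit trail can be sketched along these lines; the actor names and record shape here are hypothetical, not DocInsightHub AI's actual log format:

```python
import json
import time

class AuditLog:
    """Append-only audit trail: records who did what, and when."""

    def __init__(self):
        self._entries: list[str] = []

    def record(self, actor: str, action: str, detail: dict) -> None:
        entry = {"ts": time.time(), "actor": actor, "action": action, "detail": detail}
        # Entries are serialised at write time and never mutated afterwards.
        self._entries.append(json.dumps(entry))

    def entries_for(self, actor: str) -> list[dict]:
        return [e for e in map(json.loads, self._entries) if e["actor"] == actor]

log = AuditLog()
log.record("alice@acme.example", "query", {"question": "What is the leave policy?"})
log.record("bob@acme.example", "upload", {"doc": "handbook.pdf"})

alice_entries = log.entries_for("alice@acme.example")
```

Each record carries a timestamp, an actor, and the action detail, which is exactly what a compliance reviewer needs to reconstruct "who asked what, when".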

5. Secure Document Access

Documents stored in DocInsightHub AI are never publicly accessible. Access is controlled through:

  • Short-lived, signed URLs — documents are accessed via time-limited secure links, not permanent public URLs
  • Backend authorisation — every request is verified against the user's role and organisation
  • No direct blob storage exposure — internal storage paths are never visible to users
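The short-lived signed-URL mechanism can be illustrated with a generic HMAC scheme. This is a minimal sketch, not the production implementation, and the signing key would in practice live in a vault rather than in code:

```python
import hashlib
import hmac
import time

SECRET = b"server-side-signing-key"  # hypothetical; in production held in a vault

def sign_url(doc_id: str, ttl_seconds: int = 300, now=None) -> str:
    """Issue a time-limited link: the signature covers the doc id and expiry."""
    expires = int((now if now is not None else time.time()) + ttl_seconds)
    payload = f"{doc_id}:{expires}".encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return f"/documents/{doc_id}?expires={expires}&sig={sig}"

def verify(doc_id: str, expires: int, sig: str, now=None) -> bool:
    """Reject expired links and any link whose signature doesn't match."""
    if (now if now is not None else time.time()) > expires:
        return False  # link has expired
    payload = f"{doc_id}:{expires}".encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)

url = sign_url("policy.pdf", ttl_seconds=300, now=1_000.0)
_, query = url.split("?")
params = dict(p.split("=") for p in query.split("&"))
ok_now = verify("policy.pdf", int(params["expires"]), params["sig"], now=1_100.0)
ok_late = verify("policy.pdf", int(params["expires"]), params["sig"], now=2_000.0)
```

Tampering with either the document id or the expiry invalidates the signature, and even an intact link stops working once its time window closes.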

6. Enterprise Authentication

Authentication is handled through Microsoft Entra ID (formerly Azure Active Directory). This means:

  • Users sign in with their existing organisational credentials
  • Multi-factor authentication (MFA) is supported
  • No separate passwords to manage
  • IT teams retain full control over user provisioning and de-provisioning

Infrastructure Security

DocInsightHub AI runs entirely on Microsoft Azure, leveraging:

  • Encryption at rest — all data is encrypted in storage
  • Encryption in transit — all communications use TLS
  • Azure Key Vault — secrets and credentials are managed securely
  • Managed identities — services authenticate to each other without stored credentials
  • Application Insights — monitoring and alerting for operational security

Grounded AI — Not Hallucinated AI

One of the biggest risks with AI is hallucination — generating plausible-sounding answers that aren't based on real data. DocInsightHub AI addresses this with grounded responses:

  • Every answer is generated from content retrieved from your uploaded documents
  • If the system can't find relevant content, it says so — rather than guessing
  • Citations are included with every response so users can verify the source
  • Confidence scoring helps identify when answers may need human review

This approach prioritises accuracy and trust over the appearance of being "smart".
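The retrieve-then-answer-or-refuse behaviour can be sketched with a toy relevance score. Real systems use embeddings rather than word overlap; the threshold and document names here are purely illustrative:

```python
def answer(question: str, documents: dict[str, str], min_overlap: int = 2) -> dict:
    """Toy grounded-answer sketch: respond only when retrieved content supports it."""
    q_terms = set(question.lower().split())
    best_doc, best_score = None, 0
    for name, text in documents.items():
        score = len(q_terms & set(text.lower().split()))  # crude relevance proxy
        if score > best_score:
            best_doc, best_score = name, score
    if best_score < min_overlap:
        # No sufficiently relevant content: say so instead of guessing.
        return {"answer": None, "citation": None, "confidence": 0.0}
    return {
        "answer": documents[best_doc],
        "citation": best_doc,  # the source is always returned with the answer
        "confidence": best_score / max(len(q_terms), 1),
    }

docs = {"leave-policy.md": "annual leave allowance is 25 days per year"}
grounded = answer("what is the annual leave allowance", docs)
refused = answer("pension scheme contribution rates", docs)
```

The key property is the refusal branch: when nothing relevant is retrieved, the system returns no answer and zero confidence rather than a plausible fabrication, and every successful answer carries its citation.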

Our Security Principles

Everything we build follows these principles:

  1. Your data belongs to you — we don't claim ownership, share it, or use it for training
  2. Isolation by default — every tenant is separate, every access is controlled
  3. Transparency through audit — every action is logged and traceable
  4. Least privilege — users and systems only have the access they need
  5. No shortcuts — security controls are enforced at the backend, not just the UI

Enterprise AI should make your organisation smarter without making it less secure. That's the standard we hold ourselves to.

Ready to see it in action?

Book a personalised demo and see how DocInsightHub AI can transform your document knowledge.