Harvey AI, CoCounsel, and Clio Duo: Where Does Your Data Actually Go?

2025-12-20

Quick Answer: Harvey AI, CoCounsel, and Clio Duo all send client data off your own systems for AI processing. Harvey is built on Azure OpenAI, CoCounsel uses OpenAI's GPT-4, and Clio says it may process queries on servers "outside your home jurisdiction." Based on public documentation, none offers a local or on-premise deployment option, which creates privilege waiver risk under the third-party disclosure doctrine.

Introduction

Legal AI tools promise efficiency. But when you upload a client document or connect your email, where does that data actually go?

We reviewed publicly available documentation for three major legal AI platforms (Harvey AI, CoCounsel from Casetext, and Clio Duo) to trace the data flows that matter for privilege protection. We also looked at the public claims made by a fourth vendor, Billables.ai.

Disclaimer: This analysis is based on publicly available documentation as of December 2024. Vendor practices may change. We encourage readers to verify current policies directly.

Harvey AI

What It Is

Harvey AI is a generative AI platform for legal professionals, offering document analysis, legal research, and drafting assistance. It's positioned as enterprise-grade legal AI.

Infrastructure

From Harvey's public statements and documentation:

Harvey is built on Azure OpenAI infrastructure.

This means client data processed by Harvey flows through Microsoft's Azure cloud, where the OpenAI models behind the Azure OpenAI Service are hosted.

Data Flow Analysis

When you upload a document to Harvey:

  1. Document leaves your device
  2. Transmitted to Harvey's servers
  3. Sent to Azure OpenAI for AI processing
  4. Response returns through same chain
  5. Data subject to Microsoft and OpenAI policies

Third parties with potential data access:
- Harvey (the company)
- Microsoft Azure
- OpenAI (via Azure OpenAI Service)
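
One way to make a chain like this explicit when reviewing a tool is to write the hops down as data and derive the list of outside parties from it. The sketch below is illustrative only: the `Hop` type and the `third_parties` helper are hypothetical names, and the hops simply restate the flow and the third-party list given in this section rather than anything taken from vendor documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One step in a data flow and the organizations that handle data there."""
    description: str
    operators: tuple[str, ...]   # organizations operating this hop
    is_external: bool            # True if the operators are outside the law firm


def third_parties(flow: list[Hop]) -> list[str]:
    """Return external operators in first-seen order, without duplicates."""
    seen: list[str] = []
    for hop in flow:
        if not hop.is_external:
            continue
        for operator in hop.operators:
            if operator not in seen:
                seen.append(operator)
    return seen


# Assumed, simplified restatement of the upload flow described above.
harvey_flow = [
    Hop("Document leaves your device", ("Your firm",), False),
    Hop("Transmitted to Harvey's servers", ("Harvey",), True),
    Hop(
        "Sent to Azure OpenAI for AI processing",
        ("Microsoft Azure", "OpenAI (via Azure OpenAI Service)"),
        True,
    ),
    Hop("Response returns through same chain", ("Harvey",), True),
]

for party in third_parties(harvey_flow):
    print(party)
```

The same structure can be filled in for any vendor once its answers are known. The value is less in the code than in being forced to name an operator for every hop.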

Key Policy Points

Harvey has stated they don't use client data for model training and have negotiated enterprise terms with Microsoft. However:

  • Data still transits third-party infrastructure
  • Subject to Azure's geographic processing policies
  • Potentially accessible via subpoena to any entity in the chain

Privilege Implications

Every entity in the data chain represents a potential third-party disclosure. The "enterprise terms" may limit commercial use of your data, but they don't eliminate the fact that privileged communications are being transmitted to and processed by external parties.

CoCounsel (Casetext)

What It Is

CoCounsel, developed by Casetext (now owned by Thomson Reuters), offers AI-powered legal research, document review, and drafting assistance.

Infrastructure

CoCounsel uses a combination of:
- GPT-4 (OpenAI) for general language capabilities
- Proprietary legal models
- Thomson Reuters infrastructure (post-acquisition)

Data Flow Analysis

When you run a CoCounsel query:

  1. Query leaves your device
  2. Transmitted to Casetext/Thomson Reuters servers
  3. Processed through OpenAI's GPT-4 API
  4. Legal-specific processing on Casetext infrastructure
  5. Results returned

Third parties with potential data access:
- Casetext/Thomson Reuters
- OpenAI
- Cloud infrastructure providers (AWS, GCP, or Azure—not publicly specified)

Key Policy Points

Casetext has stated:
- They have enterprise agreements with OpenAI
- Client data is not used for model training
- Data is encrypted in transit and at rest

These are security measures. They don't address the privilege question of whether transmission itself constitutes disclosure.

Privilege Implications

CoCounsel's use of OpenAI means privileged communications cross multiple organizational boundaries. Thomson Reuters' acquisition may add further data handling considerations, depending on how the two companies' systems are integrated.

Clio Duo

What It Is

Clio Duo is Clio's AI assistant for legal practice management, offering drafting, summarization, and workflow automation within the Clio ecosystem.

Infrastructure

From Clio's documentation:

"Clio may process queries on servers outside your home jurisdiction."

This statement appears in Clio's AI-related disclosures and indicates geographic distribution of processing.

Data Flow Analysis

When you use Clio Duo:

  1. Data from your Clio account accessed
  2. Transmitted to Clio's AI processing infrastructure
  3. Likely processed through third-party AI providers (Azure, OpenAI—specific providers not fully disclosed)
  4. Results integrated into Clio interface

Third parties with potential data access:
- Clio
- AI infrastructure providers (specifics unclear)
- Cloud hosting providers

Key Policy Points

Clio emphasizes:
- SOC 2 certification
- Encryption
- Compliance with legal industry standards

The "outside your home jurisdiction" disclosure is notable. If you're a Chicago attorney, your client's communications might be processed in Canada (Clio's headquarters), the US, or elsewhere.

Privilege Implications

The geographic uncertainty adds complexity. Different jurisdictions have different privacy laws, subpoena procedures, and legal frameworks around compelled disclosure.

Billables.ai

What They Say

From Billables.ai's website:

"We don't store privileged data and we isolate and anonymize customer data."

What This Means (and Doesn't Mean)

"We don't store privileged data" addresses retention—how long data is kept after processing.

It doesn't address:
- Whether data is transmitted externally for processing
- What third-party AI providers are used
- Where processing actually occurs

"We isolate and anonymize" describes security measures against unauthorized access.

It doesn't address whether the transmission itself creates privilege concerns.

What's Not Disclosed

Based on publicly available documentation, Billables.ai doesn't clearly disclose:
- Which AI providers power their processing
- Server locations
- Subpoena response procedures
- Whether client emails pass through OpenAI or similar APIs

Privilege Implications

Without clear documentation of the AI architecture, it's difficult to assess privilege exposure. The marketing language focuses on security rather than addressing the third-party disclosure question directly.

The Pattern

Across the cloud-based tools reviewed above, we see a consistent pattern:

What Vendors Emphasize

  • Encryption
  • SOC 2 certification
  • "No training on your data"
  • Privacy policies

What Vendors De-Emphasize

  • Third-party AI infrastructure
  • Geographic processing locations
  • Subpoena exposure
  • The distinction between security and privilege

Comparison Table

Feature                 | Harvey AI          | CoCounsel        | Clio Duo                    | Billables.ai     | IntelliBill
------------------------|--------------------|------------------|-----------------------------|------------------|--------------
Third-party AI?         | Yes (Azure OpenAI) | Yes (OpenAI)     | Likely                      | Unclear          | No
Data leaves your infra? | Yes                | Yes              | Yes                         | Yes              | No (local)
Processing jurisdiction | Azure global       | Not specified    | "Outside home jurisdiction" | Not specified    | Your location
Subpoena exposure       | Multiple parties   | Multiple parties | Multiple parties            | Vendor + unknown | None (local)
Local/on-prem option    | No                 | No               | No                          | No               | Yes

Questions to Ask Any Vendor

Before adopting any legal AI tool, get clear answers to:

1. What AI providers power your processing?
"We use AI" isn't an answer. Do they use OpenAI? Azure? Anthropic? Google?

2. Where are servers located?
Which countries? Can you require US-only processing?

3. What's your subpoena response procedure?
Will they notify you? Fight it? Simply comply?

4. Is there a local/on-premise option?
Can data stay on your infrastructure entirely?

5. Can I see your data processing agreement?
Not the marketing page—the actual legal terms.
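
For firms evaluating several tools at once, it may help to track these five questions in a structured form so that gaps are obvious. The following is a minimal sketch, not a compliance tool: the `VendorReview` class, the short question keys, and the vendor name are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field

# The five questions from the list above, keyed by short hypothetical identifiers.
QUESTIONS = {
    "ai_providers": "What AI providers power your processing?",
    "server_locations": "Where are servers located?",
    "subpoena_procedure": "What's your subpoena response procedure?",
    "local_option": "Is there a local/on-premise option?",
    "dpa_available": "Can I see your data processing agreement?",
}


@dataclass
class VendorReview:
    """Collects one vendor's answers to the questions above."""
    vendor: str
    answers: dict[str, str] = field(default_factory=dict)

    def record(self, key: str, answer: str) -> None:
        """Store an answer, rejecting keys that are not in QUESTIONS."""
        if key not in QUESTIONS:
            raise KeyError(f"Unknown question key: {key}")
        self.answers[key] = answer.strip()

    def unanswered(self) -> list[str]:
        """Questions with no answer, or only a blank one, in list order."""
        return [q for k, q in QUESTIONS.items() if not self.answers.get(k)]


review = VendorReview("Example Vendor")  # placeholder, not a real vendor
review.record("ai_providers", "Not disclosed in public documentation")
review.record("local_option", "   ")  # a blank reply still counts as unanswered
for question in review.unanswered():
    print("Still open:", question)
```

Treating a blank or whitespace-only reply as unanswered is a deliberate choice here, since "We use AI" style non-answers are exactly what the questions are meant to surface. Judging whether a non-blank answer is actually adequate still requires a human reader.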

The IntelliBill Difference

We built IntelliBill specifically to eliminate these concerns:

No third-party AI providers. We use Ollama running Llama models locally. No OpenAI. No Azure. No API calls to external AI.

Local processing options. Run AI on your laptop (Local mode) or your firm's server (On-Premise mode).

Clear answers to hard questions:
- AI provider: Ollama/Llama (runs locally)
- Server location: Your device/server
- Subpoena exposure: None for local deployments
- On-premise option: Yes
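
To give a sense of what "runs locally" can look like in practice, here is a hedged sketch of a call to an Ollama server on the same machine, with a guard that refuses to send data to any non-loopback address. This is not IntelliBill's actual implementation. It assumes Ollama's default local port (11434) and its `/api/generate` endpoint; the model name `llama3`, the prompt, and the function names are placeholders chosen for illustration.

```python
import ipaddress
import json
import urllib.request
from urllib.parse import urlparse

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local address


def is_loopback(url: str) -> bool:
    """True only if the URL's host is localhost or a loopback IP address."""
    host = urlparse(url).hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # any other hostname is treated as remote


def build_request(prompt: str, model: str = "llama3",
                  url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming generate request, refusing non-local endpoints."""
    if not is_loopback(url):
        raise ValueError(f"Refusing to send data to non-local host: {url}")
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_locally(text: str) -> str:
    """Send text to the local Ollama server and return the generated reply."""
    request = build_request(f"Summarize this billing entry:\n{text}")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `summarize_locally` requires a running Ollama server with the chosen model already pulled. The loopback check is only a hostname-level safeguard, so it complements network-level controls rather than replacing them.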

If we are subpoenaed over your billing records in a contentious divorce, we can truthfully say: "We don't have that data. It never left the client's infrastructure."

Conclusion

The major legal AI platforms have optimized for capability, not privacy architecture. They've built on third-party AI infrastructure because it's faster to market and cheaper to operate.

That's a legitimate business decision. But it means attorneys using these tools need to understand the tradeoff they're making—and whether it's appropriate for their clients.

For a comprehensive analysis of privilege risk across AI billing tools, including detailed vendor comparisons and compliance checklists:

[Download: The Hidden Privilege Risk in AI Billing Software →]

Vendor information based on publicly available documentation as of December 2024. Policies change; verify current terms directly with vendors.

This article is for informational purposes only and does not constitute legal advice.

ATTORNEY ADVERTISING
