Harvey AI, CoCounsel, and Clio Duo: Where Does Your Data Actually Go?
2025-12-20
Quick Answer: Harvey AI, CoCounsel, and Clio Duo all transmit client data to third-party AI infrastructure—Harvey uses Azure OpenAI, CoCounsel uses OpenAI's GPT-4, and Clio processes data "outside your home jurisdiction." None offer local/on-premise deployment options, creating privilege waiver risk under third-party disclosure doctrine.
Introduction
Legal AI tools promise efficiency. But when you upload a client document or connect your email, where does that data actually go?
We reviewed publicly available documentation for three major legal AI platforms (Harvey AI, CoCounsel from Casetext, and Clio Duo) and for the billing tool Billables.ai, tracing the data flows that matter for privilege protection.
Disclaimer: This analysis is based on publicly available documentation as of December 2024. Vendor practices may change. We encourage readers to verify current policies directly.
Harvey AI
What It Is
Harvey AI is a generative AI platform for legal professionals, offering document analysis, legal research, and drafting assistance. It's positioned as enterprise-grade legal AI.
Infrastructure
From Harvey's public statements and documentation:
Harvey is built on Azure OpenAI infrastructure.
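This means client data processed by Harvey flows through Microsoft's Azure cloud, which hosts the OpenAI models behind Azure OpenAI Service. Microsoft's documentation states that prompts and completions sent to that service are not made available to OpenAI, but the data still leaves the firm's control and sits on infrastructure operated by Harvey and Microsoft.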
Data Flow Analysis
When you upload a document to Harvey:
- Document leaves your device
- Transmitted to Harvey's servers
- Sent to Azure OpenAI for AI processing
- Response returns through same chain
- Data subject to Microsoft and OpenAI policies
Third parties with potential data access:
- Harvey (the company)
- Microsoft Azure
- OpenAI (model provider via Azure OpenAI Service; Microsoft states that customer prompts and completions are not shared with OpenAI)
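To make the chain above concrete, here is a minimal Python sketch that models the upload path as a list of hops and counts the outside organizations that handle the document. The `Hop` type, the `HARVEY_UPLOAD_PATH` list, and the operator labels are our own illustration of the steps listed above, not a vendor-confirmed architecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One step in a document's journey, and who operates that step."""
    description: str
    operator: str
    external: bool  # True if the operator is outside the law firm


# Hypothetical model of the upload path described above (our labeling,
# based on the listed steps; not a vendor-confirmed architecture).
HARVEY_UPLOAD_PATH = [
    Hop("Document leaves your device", "Law firm", external=False),
    Hop("Transmitted to Harvey's servers", "Harvey", external=True),
    Hop("Sent to Azure OpenAI for AI processing", "Microsoft Azure", external=True),
    Hop("Response returns through same chain", "Harvey", external=True),
]


def external_operators(path):
    """Return each external operator once, in order of first appearance."""
    seen = []
    for hop in path:
        if hop.external and hop.operator not in seen:
            seen.append(hop.operator)
    return seen


if __name__ == "__main__":
    print(external_operators(HARVEY_UPLOAD_PATH))
```

The same structure can be filled in for any vendor once you know its actual hops. Every entry the function returns is a party you would want covered by contract terms and by your privilege analysis.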
Key Policy Points
Harvey has stated they don't use client data for model training and have negotiated enterprise terms with Microsoft. However:
- Data still transits third-party infrastructure
- Subject to Azure's geographic processing policies
- Potentially accessible via subpoena to any entity in the chain
Privilege Implications
Every entity in the data chain represents a potential third-party disclosure. The "enterprise terms" may limit commercial use of your data, but they don't eliminate the fact that privileged communications are being transmitted to and processed by external parties.
CoCounsel (Casetext)
What It Is
CoCounsel, developed by Casetext (now owned by Thomson Reuters), offers AI-powered legal research, document review, and drafting assistance.
Infrastructure
CoCounsel uses a combination of:
- GPT-4 (OpenAI) for general language capabilities
- Proprietary legal models
- Thomson Reuters infrastructure (post-acquisition)
Data Flow Analysis
When you run a CoCounsel query:
- Query leaves your device
- Transmitted to Casetext/Thomson Reuters servers
- Processed through OpenAI's GPT-4 API
- Legal-specific processing on Casetext infrastructure
- Results returned
Third parties with potential data access:
- Casetext/Thomson Reuters
- OpenAI
- Cloud infrastructure providers (AWS, GCP, or Azure—not publicly specified)
Key Policy Points
Casetext has stated:
- They have enterprise agreements with OpenAI
- Client data is not used for model training
- Data is encrypted in transit and at rest
These are security measures. They don't address the privilege question of whether transmission itself constitutes disclosure.
Privilege Implications
CoCounsel's use of OpenAI means privileged communications cross multiple organizational boundaries. Thomson Reuters' acquisition may add further data handling considerations, depending on how the products are integrated.
Clio Duo
What It Is
Clio Duo is Clio's AI assistant for legal practice management, offering drafting, summarization, and workflow automation within the Clio ecosystem.
Infrastructure
From Clio's documentation:
"Clio may process queries on servers outside your home jurisdiction."
This statement appears in Clio's AI-related disclosures and indicates geographic distribution of processing.
Data Flow Analysis
When you use Clio Duo:
- Data from your Clio account accessed
- Transmitted to Clio's AI processing infrastructure
- Likely processed through third-party AI providers (Azure, OpenAI—specific providers not fully disclosed)
- Results integrated into Clio interface
Third parties with potential data access:
- Clio
- AI infrastructure providers (specifics unclear)
- Cloud hosting providers
Key Policy Points
Clio emphasizes:
- SOC 2 certification
- Encryption
- Compliance with legal industry standards
The "outside your home jurisdiction" disclosure is notable. If you're a Chicago attorney, your client's communications might be processed in Canada (Clio's headquarters), the US, or elsewhere.
Privilege Implications
The geographic uncertainty adds complexity. Different jurisdictions have different privacy laws, subpoena procedures, and legal frameworks around compelled disclosure.
Billables.ai
What They Say
From Billables.ai's website:
"We don't store privileged data and we isolate and anonymize customer data."
What This Means (and Doesn't Mean)
"We don't store privileged data" addresses retention—how long data is kept after processing.
It doesn't address:
- Whether data is transmitted externally for processing
- What third-party AI providers are used
- Where processing actually occurs
"We isolate and anonymize" describes security measures against unauthorized access.
It doesn't address whether the transmission itself creates privilege concerns.
What's Not Disclosed
Based on publicly available documentation, Billables.ai doesn't clearly disclose:
- Which AI providers power their processing
- Server locations
- Subpoena response procedures
- Whether client emails pass through OpenAI or similar APIs
Privilege Implications
Without clear documentation of the AI architecture, it's difficult to assess privilege exposure. The marketing language focuses on security rather than addressing the third-party disclosure question directly.
The Pattern
Across the legal AI tools reviewed above, we see a consistent pattern:
What Vendors Emphasize
- Encryption
- SOC 2 certification
- "No training on your data"
- Privacy policies
What Vendors De-Emphasize
- Third-party AI infrastructure
- Geographic processing locations
- Subpoena exposure
- The distinction between security and privilege
Comparison Table
| Feature | Harvey AI | CoCounsel | Clio Duo | Billables.ai | IntelliBill |
|---|---|---|---|---|---|
| Third-party AI? | Yes (Azure OpenAI) | Yes (OpenAI) | Likely | Unclear | No |
| Data leaves your infra? | Yes | Yes | Yes | Yes | No (local) |
| Processing jurisdiction | Azure global | Not specified | "Outside home jurisdiction" | Not specified | Your location |
| Subpoena exposure | Multiple parties | Multiple parties | Multiple parties | Vendor + unknown | None (local) |
| Local/on-prem option | No | No | No | No | Yes |
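For firms that track vendor assessments in a script or spreadsheet export, the table can be encoded as plain data and filtered. The sketch below copies the table's cells as stated in this article. It does not verify them independently, and the `VENDORS` dictionary and field names are our own.

```python
# The comparison table above, encoded as data. Values mirror the table's
# cells as written in this article; they are not independently verified.
VENDORS = {
    "Harvey AI": {"third_party_ai": "Yes (Azure OpenAI)", "data_leaves_infra": True, "local_option": False},
    "CoCounsel": {"third_party_ai": "Yes (OpenAI)", "data_leaves_infra": True, "local_option": False},
    "Clio Duo": {"third_party_ai": "Likely", "data_leaves_infra": True, "local_option": False},
    "Billables.ai": {"third_party_ai": "Unclear", "data_leaves_infra": True, "local_option": False},
    "IntelliBill": {"third_party_ai": "No", "data_leaves_infra": False, "local_option": True},
}


def vendors_where(field, value):
    """Return vendor names whose `field` equals `value`, in table order."""
    return [name for name, row in VENDORS.items() if row[field] == value]


if __name__ == "__main__":
    print("Local/on-prem option:", vendors_where("local_option", True))
    print("Data leaves your infra:", vendors_where("data_leaves_infra", True))
```

Keeping "Likely" and "Unclear" as literal strings, rather than forcing them to yes or no, preserves the uncertainty in the public documentation.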
Questions to Ask Any Vendor
Before adopting any legal AI tool, get clear answers to:
1. What AI providers power your processing?
"We use AI" isn't an answer. Do they use OpenAI? Azure? Anthropic? Google?
2. Where are servers located?
Which countries? Can you require US-only processing?
3. What's your subpoena response procedure?
Will they notify you? Fight it? Simply comply?
4. Is there a local/on-premise option?
Can data stay on your infrastructure entirely?
5. Can I see your data processing agreement?
Not the marketing page—the actual legal terms.
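One way to keep vendor conversations honest is to track which of the five questions still lack a real answer. The following is a minimal sketch: the `NON_ANSWERS` set is a hypothetical example of responses to reject, seeded with the "We use AI" case mentioned above, and you would extend it for your own review.

```python
QUESTIONS = [
    "What AI providers power your processing?",
    "Where are servers located?",
    "What's your subpoena response procedure?",
    "Is there a local/on-premise option?",
    "Can I see your data processing agreement?",
]

# Hypothetical list of responses that should not count as answers.
NON_ANSWERS = {"we use ai"}


def is_answered(text):
    """A response counts only if it is non-blank and not a known non-answer."""
    cleaned = text.strip().lower().rstrip(".")
    return bool(cleaned) and cleaned not in NON_ANSWERS


def open_questions(answers):
    """Return the questions that still lack a usable answer, in order."""
    return [q for q in QUESTIONS if not is_answered(answers.get(q, ""))]


if __name__ == "__main__":
    vendor_answers = {
        QUESTIONS[0]: "We use AI.",
        QUESTIONS[1]: "US-only regions, confirmed in the contract",
    }
    for question in open_questions(vendor_answers):
        print("Still open:", question)
```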
The IntelliBill Difference
We built IntelliBill specifically to eliminate these concerns:
No third-party AI providers. We use Ollama running Llama models locally. No OpenAI. No Azure. No API calls to external AI.
Local processing options. Run AI on your laptop (Local mode) or your firm's server (On-Premise mode).
Clear answers to hard questions:
- AI provider: Ollama/Llama (runs locally)
- Server location: Your device/server
- Subpoena exposure: None on the vendor side for local deployments (the data stays with the firm)
- On-premise option: Yes
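As an illustration of what "runs locally" can mean in practice, the sketch below builds a request to Ollama's default local HTTP endpoint (`/api/generate` on port 11434) and refuses to send anything to a non-loopback address. It is not IntelliBill's actual code. The model name `llama3` and the helper names are assumptions, and an on-premise deployment would need an allow-list for the firm's own server rather than a loopback check.

```python
import ipaddress
import json
import urllib.request
from urllib.parse import urlparse

# Ollama's default local endpoint; adjust if your install differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def is_loopback(url):
    """True only if the URL's host is localhost or a loopback IP address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # any other hostname is treated as external


def build_request(prompt, model="llama3", url=OLLAMA_URL):
    """Build a POST request for a local Ollama server, rejecting remote hosts.

    The model name is an assumption; use whichever model you have pulled.
    """
    if not is_loopback(url):
        raise ValueError(f"Refusing to send data to non-local endpoint: {url}")
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def generate(prompt):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Summarize this time entry")` requires a running Ollama server with the chosen model already pulled. The guard in `build_request` works without a server, so it can be unit-tested on its own.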
When your billing software vendor is subpoenaed in a contentious divorce, we can truthfully say: "We don't have that data. It never left the client's infrastructure."
Conclusion
The major legal AI platforms have optimized for capability, not privacy architecture. They've built on third-party AI infrastructure because it's faster to market and cheaper to operate.
That's a legitimate business decision. But it means attorneys using these tools need to understand the tradeoff they're making—and whether it's appropriate for their clients.
For a comprehensive analysis of privilege risk across AI billing tools, including detailed vendor comparisons and compliance checklists:
[Download: The Hidden Privilege Risk in AI Billing Software →]
Vendor information based on publicly available documentation as of December 2024. Policies change; verify current terms directly with vendors.
This article is for informational purposes only and does not constitute legal advice.
ATTORNEY ADVERTISING