Privacy in Practice: Diagnosing the Gaps and Building the Foundation
A fictitious B2B SaaS company receives a DPIA request it cannot answer. This walkthrough applies the privacy framework from Part 3 to build Data Classification, retention schedules, consent architecture, and sub-processor transparency from scratch.
The DPIA Request That Started Everything
On a Tuesday in January, Meridian Analytics’ Head of Privacy, Sarah Chen, received a 47-question Data Protection Impact Assessment questionnaire from Allianz’s procurement team. Allianz was one of Meridian’s largest EU clients, representing roughly 8% of annual recurring revenue. They were also one of the first clients to start using Meridian Copilot, the AI assistant Meridian had shipped four months earlier.
Meridian is a B2B SaaS analytics platform. About 500 employees, Series C funded, 2,000 enterprise clients across the US and EU. Forty percent of its revenue comes from EU customers. The core product is business intelligence dashboards for enterprise clients. Meridian Copilot lets users type natural language questions (“Show me Q4 revenue by region for the insurance vertical”) and receive AI-generated answers drawn from their own data.
The Allianz DPIA was not unusual. Under GDPR Article 35, controllers must assess risks when processing is “likely to result in a high risk to the rights and freedoms of natural persons.” Large enterprises routinely require their vendors to complete DPIA questionnaires as part of procurement and renewal. Meridian had answered them before, for the core dashboard product. But this questionnaire was different. It was scoped specifically to Meridian Copilot.
Sarah started filling in answers. By question twelve, she stopped.
Here are five of the questions she could not answer with specifics:
Question 9: “Describe the data flows for the Meridian Copilot feature. Where does user query data travel from the point of input to the generation of a response? List all systems, services, and third parties involved.”
Sarah knew the query went to an LLM provider. She knew there was a vector database involved. She did not know whether Allianz’s query data was logged by the LLM provider, whether it was stored separately from other clients’ data, or whether the vector embeddings were retained after the response was generated. Engineering had built Copilot in a sprint. The architecture documentation covered functionality, not data flows.
Question 14: “What is the legal basis for processing personal data through the AI assistant feature? If legitimate interest, provide the balancing test documentation.”
Meridian’s Terms of Service included a clause granting Meridian the right to “use customer data to provide and improve the Service.” Sarah knew this was too vague to serve as a legal basis under GDPR. There was no separate consent mechanism for Copilot. There was no documented balancing test. The legal team had not updated the ToS when Copilot launched.
Question 23: “List all sub-processors that receive or process personal data in connection with the AI assistant. For each, state the data received, processing purpose, location of processing, and applicable transfer mechanism.”
Meridian’s privacy policy mentioned “third-party service providers” generically. Sarah could name AWS as the infrastructure provider. Beyond that, she needed to call engineering to find out which LLM provider they used, whether the vector database was self-hosted or a managed service, and whether any data flowed to the monitoring tools used for Copilot’s performance dashboards.
Question 31: “What is the retention period for user queries submitted to the AI assistant? For model training data, if applicable? For inference logs? Provide the specific policy and rationale for each.”
Meridian’s retention policy said: “We retain your data as long as necessary to fulfill the purposes for which it was collected.” This is the kind of vague retention language the Dutch DPA cited when it fined Netflix EUR 4.75 million for broader transparency failures. Sarah could not provide a specific retention period for any AI-related data category because none existed.
Question 38: “Has a Data Protection Impact Assessment been completed for the AI assistant feature? If so, provide the assessment. If not, explain why one was not deemed necessary.”
No DPIA had been completed. The product team had not flagged Copilot for privacy review before launch. The feature used customer data in a fundamentally new way, passing it through third-party AI infrastructure, but the existing product launch checklist did not include a privacy gate for AI features.
Sarah forwarded the questionnaire to Meridian’s CTO with a two-line message: “We cannot answer this DPIA. If Allianz escalates, we could lose the account, and every other EU client with a similar requirement will ask the same questions.”
The CTO’s response was immediate: “What do we need to fix this?”
The honest answer was: almost everything. Meridian had shipped an AI feature on top of a privacy infrastructure designed for a pre-AI product. The infrastructure had not caught up. Allianz’s DPIA exposed that gap in 47 specific, answerable-or-not questions.
Meridian is fictitious. The scenario is not. If your organization has shipped AI features without updating Data Classification, retention policies, consent architecture, and sub-processor documentation, Meridian’s situation is likely yours.
The 8-component privacy framework from Part 3 provides the blueprint for what to build. This article walks through how to build the first four components, using Meridian as the worked example.
The Diagnostic: Mapping Meridian Against the Framework
Before building anything, Sarah’s team needed to know exactly where they stood. They spent a week mapping Meridian’s current state against each of the eight framework components from Part 3. The diagnostic was uncomfortable.
| Component | Framework Requirement | Meridian’s Current State | Gap Severity |
|---|---|---|---|
| Data Classification | AI-specific categories: training data, inference data, model artifacts, synthetic data | Standard 4-tier classification only (Public, Internal, Confidential, Restricted). Copilot training data classified as “Internal.” No distinction between a customer database and an ML training set. | Critical |
| Retention | ML-specific schedules per data category, with specific periods and rationale | “Retained as long as necessary for business purposes.” No AI-specific retention periods. No model version lifecycle policy. | Critical |
| Consent | Three-tier layered architecture separating service, improvement, and AI training consent | Single Terms of Service checkbox at signup. No separate consent for Copilot data processing. No opt-out mechanism for AI features. | Critical |
| Sub-processor Transparency | Named registry with purpose, data received, legal basis, and transfer mechanism per sub-processor | Legal team mentions “third-party service providers” generically. No public registry. DataPulse (acquired startup) vendor list undocumented. | Critical |
| Cross-border Transfers | Per-system transfer mapping with specific legal mechanisms | Privacy policy states “Data may be transferred internationally.” No mapping of which data goes where, under which mechanism. | High |
| AI Regulatory Compliance | EU AI Act risk classification per AI system | Not started. No one at Meridian has assessed whether Copilot falls under limited or high-risk classification. | High |
| PETs | Privacy-enhancing technology assessment per AI use case | No assessment conducted. No evaluation of whether queries could be processed with differential privacy or on-premise inference. | Medium |
| Governance | Hub-and-spoke model with embedded privacy champions in product and engineering | Central legal team handles all privacy questions. No privacy champion in the Copilot engineering team. No AI governance function. | High |
Four critical gaps. Three high. One medium. Every critical gap mapped directly to a question Sarah could not answer in the Allianz DPIA.
Sarah presented the diagnostic to Meridian’s executive team with a simple framing: “We have eight components to build. We cannot do all eight simultaneously. The Allianz DPIA, and every enterprise DPIA that follows, requires four of them as a minimum: Data Classification, retention, consent, and sub-processor transparency. These are the Foundation and Control layers from the framework. If we build these four first, we can answer the DPIA. The remaining four (cross-border transfers, AI Act compliance, PETs, and governance) are the Operations and Compliance layers. They come next.”
The executive team approved a 90-day program to build the Foundation and Control layers. This article covers what they built.
Building the Foundation Layer
Component 1: Data Classification
Meridian’s existing Data Classification was a standard four-tier model: Public, Internal, Confidential, Restricted. It was adequate for the dashboard product. Customer financial data was Confidential. Aggregated analytics were Internal. Nothing controversial.
The problem was that Copilot introduced data types that did not fit any of these tiers. When a customer types “Show me Q4 revenue by region” into Copilot, the query itself is Confidential (it contains business context about the customer’s operations). But the query also becomes something else: an input to an ML inference pipeline. If Meridian logs that query-response pair for quality monitoring, it becomes operational data. If Meridian later uses aggregated query patterns to fine-tune the model, it becomes training data. The same piece of data changes classification as it moves through the pipeline.
The framework from Part 3 defines four AI-specific classification categories: Training Data, Model Artifacts, Inference Data, and Synthetic Data. Sarah’s team extended Meridian’s taxonomy to include all four.
Here is how Meridian’s Copilot data mapped to the extended taxonomy:
| Data Element | Traditional Classification | AI Classification | Examples at Meridian |
|---|---|---|---|
| Customer dashboard queries | Confidential | Inference Data | “Show me Q4 revenue by region” prompts sent to Copilot |
| Copilot model weights | Internal | Model Artifacts | Fine-tuned LLM weights trained on aggregated query patterns |
| Query-response pairs used for fine-tuning | Confidential | Training Data | Anonymized customer queries used to improve Copilot accuracy |
| Copilot-generated summaries | Internal | Inference Data | AI-generated text responses to customer queries |
| DataPulse customer interaction logs | Unknown (undocumented) | Training Data | Legacy interaction data from acquired company, never classified |
| Query vector embeddings | Internal | Inference Data | Numerical representations of customer queries stored in Pinecone |
| Synthetic test queries | Internal | Synthetic Data | Generated queries used for Copilot regression testing |
The DataPulse row was the hardest to resolve. Meridian had acquired DataPulse six months earlier. DataPulse had its own customer base, its own data collection practices, and its own (minimal) privacy documentation. Some DataPulse interaction logs had been fed into Copilot’s training pipeline before anyone at Meridian reviewed the original consent basis. Sarah flagged this as the single highest-risk item in the entire diagnostic: data of unknown provenance, with an undocumented legal basis, already embedded in model training.
The classification exercise took Sarah’s team three weeks. The output was not just a taxonomy document. It was a Data Classification label on every dataset, table, and pipeline in Meridian’s infrastructure. The engineering team integrated classification labels into their metadata catalog, so every dataset in the Copilot pipeline carried its AI classification alongside its traditional tier.
This is what California’s AB 2013 requires. Effective January 1, 2026, developers of generative AI systems must disclose whether training datasets include personal information, copyrighted material, or synthetic data. If your classification framework does not distinguish these categories, you cannot produce the required disclosure.
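As a sketch of how those labels can ride along in a metadata catalog, consider a record type that carries both classifications side by side. The dataset names, enum values, and catalog shape below are illustrative, not Meridian’s actual schema:

```python
from dataclasses import dataclass
from enum import Enum

class Tier(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"
    RESTRICTED = "restricted"

class AICategory(Enum):
    TRAINING_DATA = "training_data"
    MODEL_ARTIFACTS = "model_artifacts"
    INFERENCE_DATA = "inference_data"
    SYNTHETIC_DATA = "synthetic_data"

@dataclass(frozen=True)
class DatasetLabel:
    dataset: str
    tier: Tier                    # traditional 4-tier classification
    ai_category: AICategory       # AI-specific category from the extended taxonomy
    contains_personal_data: bool

# Illustrative labels for three Copilot datasets
catalog = [
    DatasetLabel("copilot.query_logs", Tier.CONFIDENTIAL, AICategory.INFERENCE_DATA, True),
    DatasetLabel("copilot.finetune_batches", Tier.CONFIDENTIAL, AICategory.TRAINING_DATA, True),
    DatasetLabel("copilot.synthetic_test_queries", Tier.INTERNAL, AICategory.SYNTHETIC_DATA, False),
]

# An AB 2013-style disclosure question: which training datasets contain personal data?
training_with_pd = [
    d.dataset for d in catalog
    if d.ai_category is AICategory.TRAINING_DATA and d.contains_personal_data
]
print(training_with_pd)  # ['copilot.finetune_batches']
```

The same labels that drive internal handling requirements then answer the disclosure question as a simple query over the catalog, rather than a manual audit.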
The diagram above shows how data flows through Meridian’s Copilot pipeline. Classification labels attach at each stage: when the customer submits a query (Inference Data, Confidential), when the query is embedded and stored (Inference Data, Internal), when the LLM generates a response (Inference Data, Confidential), and when anonymized query patterns are batched for model fine-tuning (Training Data, Confidential). The key insight: a single customer query carries different classifications at different points in the pipeline, and each classification triggers different handling requirements.
What this answered in the DPIA. After the classification exercise, Sarah could respond to Question 9 (“Describe the data flows for the Meridian Copilot feature”) with a specific, auditable answer. She could map every data element to a classification category, identify which categories contained personal data, and trace the flow from customer query to model response to (optional) training pipeline. The answer was five pages long. It was also accurate.
Component 2: Retention Schedules
Meridian’s retention policy was a single sentence: “We retain your data as long as necessary to fulfill the purposes for which it was collected.”
This language appears in more privacy policies than it should. The Dutch DPA cited Netflix’s use of similar phrasing as a violation of GDPR’s transparency requirements. The problem is not that the phrase is wrong. The problem is that it communicates nothing. “As long as necessary” means whatever the company decides it means, which means users and regulators cannot hold the company to a specific standard.
Meridian needed to replace that single sentence with specific periods, tied to specific data categories, with documented rationale for each.
The framework from Part 3 provides the structure: retention schedules must cover raw training data, aggregated training data, model weights (current and previous versions), inference logs, and synthetic data. Sarah’s team populated the schedule with Meridian-specific periods:
| Data Category | Current Retention | New Retention | Rationale |
|---|---|---|---|
| Customer dashboard data | “As long as account is active” | Active account + 12 months post-termination | Contractual obligation under the MSA, plus a reasonable wind-down period for data export. Aligns with GDPR Article 5(1)(e) storage limitation principle. |
| Copilot inference logs (financial services clients) | No policy | 90 days rolling | Financial services clients using Copilot for data analysis may trigger obligations around automated decision-making under GDPR Article 22. 90 days provides a window to respond to inquiries about AI-assisted decisions. |
| Copilot inference logs (all other clients) | No policy | 30 days rolling | Sufficient for quality monitoring and debugging. No regulatory obligation to retain longer. Shorter retention reduces the privacy surface area. |
| Copilot training data (containing personal elements) | No policy | Delete after model training + 30-day validation window | Minimize exposure per EDPB guidance. The 30-day window allows validation that the trained model meets quality thresholds before source data is purged. |
| Copilot training data (aggregated, anonymized) | No policy | 24 months | Lower risk. Anonymization validated per EDPB case-by-case standard. Retained for model reproducibility and audit purposes. |
| Previous model versions | No policy | 90 days post-replacement | Provides a rollback window if the new model version underperforms. After 90 days, delete to reduce the surface area for erasure obligations. |
| Current model weights | No policy | Retain while model is in production | Operational necessity. Document which training datasets contributed to the model for erasure traceability. |
| Query vector embeddings | No policy | Same as inference logs (30 or 90 days by client tier) | Embeddings are derived from customer queries and may be reversible to personal data. Treat with the same retention as the source query. |
| DataPulse legacy data | Unknown | Audit within 60 days, apply new schedule or delete | Cannot retain data with unknown classification indefinitely. The 60-day window allows the team to assess provenance and consent basis. Data that cannot be documented must be deleted. |
The DataPulse legacy data row required a hard decision. Sarah’s team discovered that approximately 340,000 interaction records from DataPulse’s original customer base had been ingested into Copilot’s training pipeline. The original consent basis was DataPulse’s Terms of Service, which granted broad data usage rights but did not mention AI training, model improvement, or transfer to an acquiring company. Under GDPR Article 6, the legal basis for processing must be established before processing begins. Using DataPulse data for Copilot training without re-establishing consent or conducting a legitimate interest assessment was a compliance risk.
Meridian’s privacy team made two decisions. First, they quarantined the DataPulse data, removing it from the active training pipeline. Second, they documented which model versions had been trained with DataPulse data and queued a machine unlearning assessment: a technical evaluation of whether those model versions could be retrained without the DataPulse data, or whether the influence of that data on the model weights was negligible. The EDPB’s December 2024 opinion is clear that AI models trained on personal data cannot automatically be considered anonymous. If the DataPulse data contained personal information, the model weights may carry that personal data forward even after the source data is deleted.
This was not a comfortable finding. It meant Meridian had to assess not just what data it held, but what data its models had already absorbed.
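The lineage documentation behind that assessment can be modeled as a mapping from model version to the training datasets it absorbed. The version and dataset names here are hypothetical:

```python
# Hypothetical model-version lineage: which datasets trained which version.
MODEL_LINEAGE = {
    "copilot-v1.0": ["meridian.query_pairs_2024q3"],
    "copilot-v1.1": ["meridian.query_pairs_2024q3", "datapulse.interaction_logs"],
    "copilot-v1.2": ["meridian.query_pairs_2024q4", "datapulse.interaction_logs"],
}

def versions_trained_on(dataset: str) -> list[str]:
    """Which model versions absorbed a given dataset (erasure traceability)."""
    return [v for v, datasets in MODEL_LINEAGE.items() if dataset in datasets]

# Scope the machine unlearning assessment to the affected versions only.
affected = versions_trained_on("datapulse.interaction_logs")
print(affected)  # ['copilot-v1.1', 'copilot-v1.2']
```

Without this record, "which models did the DataPulse data touch?" is unanswerable; with it, the unlearning assessment has a defined scope.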
What this answered in the DPIA. Question 31 (“What is the retention period for user queries submitted to the AI assistant?”) now had a specific answer: 90 days for financial services clients, 30 days for others, with documented rationale for each period. The answer also included the model version lifecycle policy and the DataPulse remediation plan. It was no longer “as long as necessary.” It was auditable.
Building the Control Layer
Component 3: Consent Architecture
Meridian’s consent model was a single checkbox at signup: “I agree to Meridian’s Terms of Service and Privacy Policy.” One click covered everything, from rendering dashboards to training AI models to sharing data with third-party infrastructure providers. This is the blanket consent model that the Italian Garante found insufficient when fining OpenAI EUR 15 million in December 2024. The Garante found that OpenAI had no adequate legal basis for using personal data to train ChatGPT, reinforcing the principle that AI model training requires its own documented legal justification, separate from the core service.
Sarah’s team designed a three-tier consent architecture following the model from the Part 3 framework. The key design principle: withdrawing consent from a higher tier never breaks a lower tier.
Tier 1: Service Processing. This covers all data processing necessary to deliver Meridian’s core product: rendering dashboards, running queries, storing customer data, generating Copilot responses. The legal basis is contractual necessity under GDPR Article 6(1)(b). No additional consent is needed. If you are a Meridian customer, processing your data to show you your dashboards is why the contract exists. Copilot inference, taking a customer’s query, generating a response, and returning it, falls here. The data is processed to deliver the service the customer purchased.
Tier 2: Product Improvement. This covers using aggregated, anonymized usage patterns to improve Meridian’s products. Examples: analyzing which dashboard visualizations customers interact with most, identifying common Copilot query patterns to improve the query parser, A/B testing UI layouts. The legal basis is legitimate interest under GDPR Article 6(1)(f), with a documented balancing test showing that the processing serves both Meridian’s business interest and the customer’s interest in a better product. Customers can opt out in their account settings. Opting out of Tier 2 does not affect Tier 1: the customer still gets the full product.
Sarah’s team documented the legitimate interest balancing test for Tier 2. The test weighed Meridian’s interest (improving product quality based on usage patterns) against the data subject’s rights (potential concern about behavioral analysis). The mitigating factors: data is aggregated before analysis, individual-level patterns are not extracted, and a genuine opt-out is available. The CNIL’s guidance on AI system development was the reference standard: legitimate interest requires a documented balancing test, not just a stated interest.
Tier 3: AI Model Training. This covers using customer query-response pairs to fine-tune Copilot’s underlying model. This is where data collected for one purpose (answering a customer’s question) gets repurposed for a different purpose (training a model that answers other customers’ questions). The legal basis is explicit consent under GDPR Article 6(1)(a). It must be freely given, specific, informed, and unambiguous. It cannot be pre-checked. It must be separate from Tier 1 and Tier 2 consent. A customer who declines Tier 3 consent still gets the full Copilot service; the model just will not learn from their interactions.
The consent settings page Sarah’s team designed had three clear sections, each with a toggle switch and a plain-language explanation:
Your Data, Your Choice
Dashboard & Copilot Service (always on): We process your data to run your dashboards and answer your Copilot questions. This is part of the service you purchased.
Product Improvement (on by default, opt-out available): We use aggregated, anonymized usage patterns to make Meridian’s products better for everyone. No individual data is extracted. You can opt out at any time. [Learn more]
AI Model Training (off by default, opt-in required): With your permission, we use anonymized versions of your Copilot interactions to improve Copilot’s accuracy for all customers. Your data is anonymized before training. You can withdraw permission at any time, and we will exclude your data from the next training cycle. [Learn more]
The default states mattered. Tier 2 defaulted to on because Meridian assessed, through the documented balancing test, that aggregated product improvement serves both parties and does not require explicit consent. Tier 3 defaulted to off because using query data for model training is a materially different purpose from delivering the service. The CNIL’s recommendations and the Italian Garante’s OpenAI decision both support this distinction.
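The tier defaults, and the invariant that withdrawing a higher tier never breaks a lower one, can be captured in a small consent record. This is a sketch under the three-tier design described above, not production code:

```python
from dataclasses import dataclass

@dataclass
class ConsentState:
    # Tier 1 is contractual necessity: always on, not user-toggleable.
    service: bool = True
    # Tier 2 defaults on (legitimate interest, with a genuine opt-out).
    product_improvement: bool = True
    # Tier 3 defaults off (explicit opt-in required for model training).
    ai_training: bool = False

    def withdraw(self, tier: str) -> None:
        if tier == "service":
            raise ValueError("Tier 1 is contractual necessity; end the contract to stop it.")
        setattr(self, tier, False)

state = ConsentState()                  # defaults: on / on / off
state.withdraw("ai_training")           # already off; withdrawal is always safe
state.withdraw("product_improvement")   # opt out of Tier 2

# Withdrawing higher tiers never touches Tier 1.
print(state.service, state.product_improvement, state.ai_training)  # True False False
```

Encoding Tier 1 as non-toggleable keeps the UI honest: the settings page can only offer choices that are legally meaningful.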
What happens when a customer opts out of Tier 3? Sarah’s team defined a four-step process:
1. The customer’s data is flagged for exclusion from the next model training run. In-flight training cycles complete with existing data; the exclusion takes effect on the next scheduled run.
2. Existing inference logs for that customer begin the standard retention countdown (30 or 90 days, depending on client tier). No new logs are created for training purposes.
3. Model versions that were trained with the customer’s data are documented. Meridian records which training datasets contributed to which model versions, so the lineage is auditable.
4. A machine unlearning assessment is queued. If the customer’s data contribution was significant (determined by volume thresholds defined by the ML team), Meridian evaluates whether retraining without that data is feasible and warranted.
Tier 1 and Tier 2 services continue completely unaffected. The customer sees no change in their dashboard experience or Copilot response quality. The model may still improve from other customers’ contributions; it just will not learn from this customer’s interactions.
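The first step of that process, excluding flagged customers when the next training batch is assembled, can be sketched as a filter over the candidate pairs. The record shapes and identifiers are hypothetical:

```python
# Query-response pairs queued for the next fine-tuning run (hypothetical shape).
candidate_pairs = [
    {"customer_id": "cust-001", "query": "...", "response": "..."},
    {"customer_id": "cust-002", "query": "...", "response": "..."},
    {"customer_id": "cust-003", "query": "...", "response": "..."},
]

# Customers with active Tier 3 (AI training) consent; an opt-out
# simply drops the customer from this set before the next run.
tier3_opted_in = {"cust-001", "cust-003"}

def training_batch(pairs: list[dict], opted_in: set[str]) -> list[dict]:
    """Keep only pairs from customers with active Tier 3 consent."""
    return [p for p in pairs if p["customer_id"] in opted_in]

batch = training_batch(candidate_pairs, tier3_opted_in)
print([p["customer_id"] for p in batch])  # ['cust-001', 'cust-003']
```

Filtering at batch-assembly time, rather than at collection time, is what makes the "in-flight cycles complete; exclusion takes effect on the next run" semantics enforceable.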
What this answered in the DPIA. Question 14 (“What is the legal basis for processing personal data through the AI assistant?”) now had a layered answer: contractual necessity for inference (Tier 1), legitimate interest with documented balancing test for product improvement (Tier 2), and explicit opt-in consent for model training (Tier 3). Each tier had a specific GDPR article reference and supporting documentation. Allianz’s data was processed under Tier 1 by default, with Tier 2 and Tier 3 subject to Allianz’s own consent preferences.
Component 4: Sub-Processor Registry
Meridian’s privacy policy said: “We may share your information with third-party service providers who perform services on our behalf.” This is the exact genre of language the Dutch DPA found insufficient in the Netflix case. Under GDPR Article 28, controllers must identify their processors and sub-processors with enough specificity that data subjects can understand who handles their data and for what purpose.
Sarah’s team conducted a sub-processor audit over two weeks. The process was more revealing than anyone anticipated.
Step 1: Audit every vendor contract. Sarah’s team pulled every active vendor contract from Meridian’s procurement system. For the core dashboard product, this was straightforward: AWS for infrastructure, Snowflake for data warehousing, a handful of monitoring and observability tools.
Step 2: Audit the Copilot architecture. This is where the gaps appeared. The engineering team had built Copilot using several third-party services that had never been reviewed by the privacy team. An LLM provider for natural language processing. A vector database service for semantic search over customer data. A logging platform for monitoring Copilot response quality. Each of these services received some form of customer data, and none had been added to any privacy documentation.
Step 3: Audit DataPulse’s vendor list. DataPulse, the acquired startup, had its own set of vendors. Its customer data sat on MongoDB Atlas. It used a separate analytics platform. Its vendor contracts were stored in a shared Google Drive folder that Sarah’s team had to request access to. Two of DataPulse’s vendors had been decommissioned since the acquisition, but the data had not been migrated or deleted.
The completed registry:
| Sub-Processor | Data Received | Purpose | Legal Basis | Transfer Mechanism | Location |
|---|---|---|---|---|---|
| AWS (us-east-1, eu-west-1) | All customer data (dashboard + Copilot) | Cloud infrastructure, compute, storage | Contractual necessity | EU data: processed in eu-west-1. US data: processed in us-east-1. EU-US transfers via DPF certification. | US, Ireland |
| Anthropic | Customer queries (anonymized at point of transmission) | LLM inference for Copilot responses | Contractual necessity (Tier 1 service delivery) | DPF certification | US |
| Pinecone | Query vector embeddings | Semantic search for Copilot context retrieval | Contractual necessity (Tier 1 service delivery) | DPF certification | US |
| Snowflake | Aggregated customer analytics, usage metrics | Data warehousing and business intelligence | Contractual necessity | SCCs + supplementary technical measures (encryption at rest and in transit) | US, Netherlands |
| Datadog | System telemetry, anonymized performance metrics | Application monitoring and alerting | Legitimate interest | DPF certification | US |
| MongoDB Atlas (DataPulse legacy) | Historical customer interaction logs from DataPulse | Pending migration to Meridian infrastructure | Under review (original legal basis was DataPulse ToS) | SCCs (pending DPF assessment) | US (Virginia) |
The MongoDB Atlas row was the one that kept Sarah awake. DataPulse’s customer data was sitting on infrastructure that Meridian had inherited but not yet governed. The original Terms of Service under which DataPulse collected the data did not contemplate transfer to an acquiring company for AI training purposes. Under GDPR Article 6, Meridian could not simply assume the legal basis transferred with the acquisition. The data needed to be either re-consented, justified under a new legal basis with a documented legitimate interest assessment, or deleted.
Sarah’s team set a 60-day deadline for resolving the DataPulse data. The options were:
- Re-consent. Contact DataPulse’s original customers, explain the acquisition and the intended data use, and obtain fresh consent under Tier 3. Feasible for active customers. Not feasible for churned customers whose contact information may be stale.
- Legitimate interest assessment. Document a legitimate interest basis for retaining the data for product improvement (Tier 2), with a genuine opt-out. Requires a balancing test and notification to data subjects.
- Delete. Purge the data entirely. Simplest from a compliance perspective. Loses potential training value, but eliminates the risk.
For churned DataPulse customers (roughly 40% of the dataset), Sarah’s team recommended deletion. For active customers who had been migrated to Meridian accounts, they recommended notification plus a Tier 2 legitimate interest assessment, with Tier 3 consent collected separately. The engineering team was given 60 days to execute the migration and deletion.
Making the registry public. Under GDPR Article 28, processors must make sub-processor information available to controllers. Best practice goes further: publish the registry so any customer can see it without requesting it. Meridian created a dedicated page at meridiananalytics.com/sub-processors (a common pattern; companies like Notion, Slack, and Snowflake maintain similar pages) that listed every sub-processor with the data they receive and their purpose. The page included a change log with dates, so customers could see when sub-processors were added or removed.
The registry also included an email notification mechanism: enterprise customers could subscribe to receive advance notice when Meridian intended to add a new sub-processor. This gave customers like Allianz a 30-day window to review the new sub-processor before data processing began, a contractual right increasingly common in enterprise SaaS agreements.
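A registry like this can be backed by a simple structure, with the 30-day review window enforced as a check before processing begins. The entries abbreviate the table above; the field names and dates are illustrative:

```python
from datetime import date, timedelta

# Registry entries mirroring the table above (abbreviated).
REGISTRY = [
    {"name": "Anthropic", "data": "Customer queries (anonymized)",
     "purpose": "LLM inference", "location": "US"},
    {"name": "Pinecone", "data": "Query vector embeddings",
     "purpose": "Semantic search", "location": "US"},
]

# Public change log: customers can see when sub-processors were added or removed.
CHANGE_LOG = [
    {"date": date(2025, 2, 1), "action": "added", "name": "Pinecone"},
]

NOTICE_PERIOD = timedelta(days=30)

def may_begin_processing(announced_on: date, today: date) -> bool:
    """A new sub-processor may process data only after the 30-day review window."""
    return today >= announced_on + NOTICE_PERIOD

print(may_begin_processing(date(2025, 3, 1), date(2025, 3, 15)))  # False: still in review
print(may_begin_processing(date(2025, 3, 1), date(2025, 4, 1)))   # True
```

Generating the public page and the change log from the same structure keeps the published registry from drifting out of sync with what procurement and engineering actually use.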
What this answered in the DPIA. Question 23 (“List all sub-processors that receive or process personal data in connection with the AI assistant”) now had a complete, auditable answer. Every sub-processor was named. Every data flow was documented. Every legal basis was stated. Every transfer mechanism was specified. Sarah could point Allianz to a public URL that would stay current as sub-processors changed.
What Meridian Could Now Answer
Six weeks after the Allianz DPIA arrived, Sarah submitted the completed questionnaire. Here is what had changed:
| DPIA Question | Before | After |
|---|---|---|
| Data flows for the AI assistant | “We could not provide specifics” | 5-page data flow map with classification labels at every stage |
| Legal basis for AI data processing | “Covered by Terms of Service” | Three-tier consent architecture with specific GDPR article references for each tier |
| Sub-processor list | “Third-party service providers” | Named registry with 6 sub-processors, each documented with data received, purpose, legal basis, and transfer mechanism |
| Retention periods | “As long as necessary” | Specific periods per data category: 30-90 day inference logs, training data retention tied to model lifecycle, 90-day model version rollback window |
| DPIA for the AI feature | “Not completed” | In progress, expected completion within 60 days (requires cross-border transfer mapping from the Operations layer) |
Sarah could not yet answer every question. The cross-border transfer documentation (Component 5 from the framework) was still being built. The EU AI Act risk classification for Copilot (Component 6) had not been completed. The privacy-enhancing technology assessment (Component 7) and governance operating model (Component 8) were planned for the next phase.
But the four components Meridian built (Data Classification, retention schedules, consent architecture, and sub-processor registry) answered the questions that had been unanswerable six weeks earlier. That was enough to satisfy Allianz’s procurement team and keep the renewal on track.
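The difference between “as long as necessary” and the After column above is that specific periods can be enforced in code. A minimal sketch of an enforceable retention schedule, assuming illustrative category names and using the upper bounds of the ranges given in the table:

```python
from datetime import datetime, timedelta

# Illustrative retention schedule keyed by data category. The periods echo
# the ranges in the table above; the category names and exact values are
# assumptions for demonstration, not a normative schedule.
RETENTION = {
    "inference_logs": timedelta(days=90),  # upper bound of the 30-90 day range
    "model_rollback": timedelta(days=90),  # prior model versions kept 90 days
    # Training data is deliberately absent: its retention is tied to the
    # model lifecycle, not a fixed period, so it needs a different control.
}

def is_expired(category: str, created: datetime, now: datetime) -> bool:
    """Return True when a record has outlived its retention period."""
    period = RETENTION.get(category)
    if period is None:
        raise ValueError(f"{category!r} uses lifecycle-based retention, not a fixed period")
    return now - created > period

# A deletion job can then sweep each store against the schedule:
print(is_expired("inference_logs", datetime(2025, 1, 1), datetime(2025, 6, 1)))  # True
```

The point of the explicit `ValueError` is that lifecycle-tied data (training sets) should fail loudly rather than silently fall through a time-based sweep.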
What to Do Next
| Priority | Action | Why It Matters |
|---|---|---|
| This week | Run the 8-component diagnostic against your own organization. Score each component using the table in Section 2. | You cannot fix what you have not measured. Meridian’s “we are mostly fine” assumption collapsed under one DPIA request from a single client. |
| This week | Search your privacy policy for “as long as necessary” and “third-party service providers.” Count the instances. | These are the exact phrases that cost Netflix EUR 4.75 million. If they appear in your policy, your retention and transparency controls are likely underspecified. |
| This month | Extend your Data Classification taxonomy with the four AI categories from Part 3: Training Data, Model Artifacts, Inference Data, Synthetic Data. Classify every dataset in your ML pipeline. | If your classification does not distinguish between a customer database and an ML training dataset, you cannot comply with EU AI Act Article 10 or California AB 2013. |
| This month | Build a sub-processor registry. Start with AI pipeline vendors: model providers, vector databases, annotation services, monitoring tools. | Generic “third-party service providers” language is an enforcement target. Name your sub-processors, document the data they receive, and publish the registry. |
| This quarter | Design and deploy a three-tier consent architecture. Separate AI training consent from service consent. Default AI training consent to off. | The Italian Garante’s OpenAI fine established that blanket ToS consent is insufficient for model training. Layered consent with explicit opt-in for AI training is becoming the enforcement standard. |
| This quarter | Audit any data inherited through acquisitions. Assess the original consent basis and determine whether it covers your current use. | Meridian’s DataPulse data was the highest-risk finding in the entire diagnostic: data of unknown provenance already embedded in model training. If you have acquired companies with their own data practices, this risk is likely sitting in your pipeline today. |
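The three-tier consent design in the table above can be made concrete as a consent record whose defaults encode the policy. This is a minimal sketch of the layered pattern; the tier names and GDPR article mappings are illustrative assumptions, not Meridian's exact design.

```python
from dataclasses import dataclass

@dataclass
class ConsentRecord:
    # Tier 1: processing needed to deliver the service
    # (GDPR Art. 6(1)(b), performance of a contract)
    service: bool = True
    # Tier 2: product analytics and improvement
    # (illustratively Art. 6(1)(f), legitimate interest, with opt-out)
    analytics: bool = True
    # Tier 3: use of customer data for AI model training
    # (Art. 6(1)(a), explicit consent) -- MUST default to off
    ai_training: bool = False

def may_train_on(record: ConsentRecord) -> bool:
    """Training pipelines check the explicit opt-in; they never infer it
    from service consent or blanket ToS acceptance."""
    return record.ai_training

print(may_train_on(ConsentRecord()))  # False: new users are opted out of training
```

Encoding the default in the type itself means every code path that constructs a consent record without an explicit opt-in gets the enforcement-safe behavior for free.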
What Comes Next
Meridian now has the Foundation and Control layers in place: classified data, specific retention schedules, layered consent, and a public sub-processor registry. But the Allianz DPIA is not fully answered. Five questions about cross-border transfers remain open. The EU AI Act risk classification for Copilot has not been completed. No privacy-enhancing technology assessment has been conducted. And the governance model, the structure that ensures these artifacts stay current as the product evolves, does not exist yet.
Part 6 walks through building the remaining four framework components: cross-border transfer documentation for AI workloads, the EU AI Act risk classification for Copilot, a PET assessment, and the hub-and-spoke governance model that ties everything together. Meridian’s privacy infrastructure is half built. The second half is what turns it into an operational program.
Sources & References
- EU AI Act - Article 10: Data and Data Governance (2024)
- GDPR Article 22 - Automated Decision-Making (2016)
- EDPB Opinion 28/2024 on AI Models and Personal Data (2024)
- Italian Garante Fines OpenAI EUR 15M for GDPR Violations (2024)
- noyb WIN: Dutch Authority Fines Netflix EUR 4.75M (2024)
- CNIL Recommendations for AI System Development (2024)
- California AB 2013 - AI Training Data Transparency (2024)
- Crowell & Moring - California AB 2013 Disclosure Requirements (2025)
- EDPB 2025 Coordinated Enforcement - Right to Erasure (2025)
- Clearview AI - Dutch DPA EUR 30.5M Fine (2024)
- GDPR Article 35 - Data Protection Impact Assessment (2016)
- ICO Guide to DPIAs (2024)
- Consent in AI Applications - GDPR Local (2025)
- IAPP-EY Annual Governance Report 2023 (2023)
- Allianz DPIA Requirements for Vendors (2025)
- GDPR Article 28 - Processor Requirements (2016)
- GDPR Article 30 - Records of Processing Activities (2016)