Design the Target Level | Part 3/4

Unstructured data is growing rapidly, and without a clear governance strategy, organisations risk being overwhelmed by outdated, unmanaged, and potentially sensitive information. Designing a target level for data governance is essential—not just for compliance, but for operational efficiency, cost control, and AI readiness.

Start with a Plan

The lifecycle of unstructured data should follow a structured model:
Plan → Do → Check → Act

  • Plan: Define what data is collected, from whom, and for what purpose. Consider mergers, reorganisations, and legacy systems that may introduce risks.
  • Do: Implement storage practices in appropriate systems—HR platforms, cloud libraries, or network drives.
  • Check: Regularly audit data age, access rights, and compliance with retention policies.
  • Act: Remove or manage data that has reached the end of its lifecycle.
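
As one illustration of the Check step, the sketch below lists files that have not been modified within a given number of days. It is a minimal example under stated assumptions: the function name `find_stale_files` and the use of last-modified time as a proxy for data age are choices made here, not part of any particular governance product.

```python
import time
from pathlib import Path


def find_stale_files(root, max_age_days, now=None):
    """Return files under `root` last modified more than `max_age_days` ago.

    Assumption: modification time is used as a rough proxy for data age.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            stale.append(path)
    return sorted(stale)
```

The resulting list is only an input to the Act step: whether a stale file is deleted, archived, or kept still depends on its owner and retention rule.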

Define Ownership, Retention, and Labelling

Every data type should have a designated owner, retention period, and storage location. For example:

  • Job applications → Owner: HR → Retention: 6 months → Location: Sympa HR system
  • Customer complaints → Owner: Legal → Retention: 2 years → Location: a secure document repository
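
The two examples above can be written down as a small policy table. The following is a hedged sketch: the names `RetentionRule`, `POLICY`, and `is_expired` are hypothetical, and the day counts (183 for six months, 730 for two years) are approximations chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    owner: str
    retention_days: int
    location: str


# Owners, periods, and locations come from the examples in the text;
# converting months and years to days is an assumption of this sketch.
POLICY = {
    "job_application": RetentionRule("HR", 183, "Sympa HR system"),
    "customer_complaint": RetentionRule("Legal", 730, "secure document repository"),
}


def is_expired(data_type, created, today):
    """True if an item of `data_type` created on `created` is past retention."""
    rule = POLICY[data_type]
    return today > created + timedelta(days=rule.retention_days)
```

For example, `is_expired("job_application", date(2024, 1, 1), date(2024, 8, 1))` evaluates to `True`, because more than 183 days have passed.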

In addition to ownership and retention, high-level labelling (e.g. “HR”, “Internal”, “Confidential”) helps contextualise data for access control and policy enforcement. These labels can be applied manually or automatically based on metadata, folder structure, or file content.
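
Automatic labelling from folder structure can be as simple as a lookup on directory names. The sketch below assumes a hypothetical mapping (`FOLDER_LABELS`) and a default label of "Internal"; a real deployment would define its own mapping and would likely combine it with metadata and content signals.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from folder names to high-level labels.
FOLDER_LABELS = {"hr": "HR", "legal": "Confidential"}
DEFAULT_LABEL = "Internal"


def label_for(path):
    """Return a high-level label based on the folders in `path`."""
    parts = [p.lower() for p in PurePosixPath(path).parts]
    folders = parts[:-1]  # ignore the file name itself
    for folder, label in FOLDER_LABELS.items():
        if folder in folders:
            return label
    return DEFAULT_LABEL
```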

Content Classification Is Essential

Governance begins with knowing what you’re managing. Content classification tools scan files to detect sensitive information, such as personal identifiers, financial records, or regulated terms, and tag each file accordingly. The resulting labels (e.g. “Contains SSN”, “PII”, “Contractual”) enable automation, retention enforcement, and risk mitigation. Classification is also a prerequisite for safe and effective AI use.
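
To make the idea concrete, here is a deliberately simplified pattern-based classifier. It is a sketch, not how any specific classification product works: the two regular expressions (a US-style SSN shape and an e-mail address treated as a PII signal) are assumptions, and production tools typically add validation, context checks, and many more detectors.

```python
import re

# Illustrative patterns only; both are assumptions of this sketch.
PATTERNS = {
    "Contains SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PII": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),  # e-mail address
}


def classify(text):
    """Return the set of labels whose pattern occurs in `text`."""
    return {label for label, pattern in PATTERNS.items() if pattern.search(text)}
```

A file with no matches gets an empty set, which is itself useful information: it separates "scanned and clean" from "never scanned".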

AI Without Data Awareness Is a Risk

AI tools like Microsoft Copilot or ChatGPT offer powerful capabilities, but using them without a clear understanding of your data landscape is risky. If your unstructured data contains outdated, sensitive, or misclassified content, AI may surface or process it inappropriately. Before deploying AI, ensure your data is classified, governed, and access-controlled.
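
One way to express this precondition is a gate that only lets classified, non-sensitive documents reach an AI tool. The sketch below is hypothetical throughout: the document shape (a dict with a `labels` set), the `BLOCKED_LABELS` list, and the function name are assumptions for illustration and do not describe how Microsoft Copilot or ChatGPT select content.

```python
# Labels that should keep a document away from AI processing (assumed list).
BLOCKED_LABELS = {"Confidential", "Contains SSN", "PII"}


def ai_eligible(doc):
    """True only if `doc` has been classified and carries no blocked label."""
    labels = doc.get("labels")
    if not labels:
        # Unclassified content is excluded rather than assumed safe.
        return False
    return BLOCKED_LABELS.isdisjoint(labels)
```

The design choice worth noting is the default: a document without labels is rejected, so gaps in classification reduce AI coverage instead of increasing exposure.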

Cloud Migration Doesn’t Equal Modernisation

Migrating legacy data to the cloud “as-is” is common but risky. Moving unmanaged files to Microsoft 365 or Google Workspace does not reduce risk or modernise content. Old formats (e.g. Lotus 1-2-3, Word 2) remain unsupported after the move, and sensitive data remains exposed unless actively governed. Migration must be paired with classification, clean-up, and policy enforcement.
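
As part of pre-migration clean-up, legacy formats can be flagged before anything is moved. The sketch below checks file extensions commonly associated with Lotus 1-2-3; the extension list is an assumption and is not exhaustive. Note that extension checks cannot identify Word 2 files, which share the `.doc` extension with later versions and would need content inspection.

```python
from pathlib import PurePosixPath

# Extensions commonly associated with Lotus 1-2-3 (assumed, not exhaustive).
LEGACY_EXTENSIONS = {".wk1", ".wk3", ".wk4", ".wks", ".123"}


def legacy_files(paths):
    """Return the paths whose extension marks them as a legacy format."""
    return [
        p for p in paths
        if PurePosixPath(p).suffix.lower() in LEGACY_EXTENSIONS
    ]
```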

Environment Limitations

Governance capabilities vary across platforms. Microsoft 365 and Google Workspace offer different features depending on licensing level. Legacy file servers, by default, lack classification, access auditing, and retention enforcement. These can be added using third-party tools, but require investment and planning.

Cost Implications

Storage isn’t free. Industry estimates put storage costs at around €3,500 per terabyte per year. At that rate, a 41.5 TB environment costs roughly €145,000 annually, and with 15% annual growth the cost doubles in about five years. Versioning in cloud platforms can further inflate these numbers. A single document with 26 versions may consume 165 MB instead of 6 MB.
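
The arithmetic behind these figures can be reproduced with a few lines. This is a minimal sketch that assumes the €3,500 per terabyte estimate quoted above and compound growth in data volume; the function names are illustrative.

```python
COST_PER_TB_EUR = 3500  # industry estimate quoted in the text


def annual_cost(terabytes):
    """Yearly storage cost in euros for a given volume."""
    return terabytes * COST_PER_TB_EUR


def projected_cost(terabytes, growth_rate, years):
    """Yearly cost after `years` of compound growth in data volume."""
    return annual_cost(terabytes) * (1 + growth_rate) ** years


today = annual_cost(41.5)                     # 145250.0
in_five_years = projected_cost(41.5, 0.15, 5)
growth_factor = in_five_years / today         # just over 2
```

Because 1.15 to the fifth power is slightly above 2, five years of 15% growth roughly doubles the bill. The versioning example works the same way: under the simplifying assumption that each version is stored as a full 6 MB copy, 26 versions add up to 156 MB, the same order of magnitude as the 165 MB cited above.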

Conclusion

Designing a target level for data governance isn’t just about rules—it’s about enabling sustainable, secure, and cost-effective data management. Start with clear ownership, defined retention, and consistent labelling. Avoid pseudo-archiving, classify your content, and remember: migrating data to the cloud without governance doesn’t solve the problem—it simply relocates it. AI can be transformative, but only when built on a foundation of clean, classified, and controlled data.

Want to explore this topic further? Read our article “Unstructured Data Threats and How Top Experts Say You Can Handle It”.

This is Part 3 of our 4-part series on unstructured data. In the final article, we’ll move from planning to execution, exploring practical methods for implementing and maintaining effective data governance.
