Data is the lifeblood of today’s digitally transformed business environment, and it is growing rapidly: by some estimates, 90 percent of the world’s existing data was created in the last two years alone. With such rapid growth, understanding which data is important to keep, classifying that data and organizing it into a useful form cannot happen without the support of technology. Organizations are realizing that traditional manual data governance approaches around classifications, definitions and quality cannot scale to meet current and future demands. Ever-expanding data brings opportunities for innovation and new data uses (e.g., generative AI), but it also increases risk as global cybersecurity and data privacy mandates constantly evolve. These trends are driving our clients to focus on innovative uses of emerging technologies, including Microsoft Purview, to support sustainable approaches to data governance.
Trends such as migrating data from on-premises environments into the cloud are a result of this rapid data growth, but they introduce additional compliance and other risks that require a layer of data governance that historically has not existed. For a cloud migration to succeed, organizations must understand the types of data assets (e.g., sources, classifications and uses) that are moving into the cloud so they can ensure adequate protections for compliance, security and privacy.
In response, businesses are taking a closer look at how to implement a comprehensive data governance strategy combining elements of data management, data quality, privacy and data security. The overall goal is to ensure that there is enough metadata (data about data) knowledge associated with what data is in use, where data is stored, how it is used and who should have access to it. In 2021, Microsoft released Purview, a solution that is designed to answer inventory, metadata and other questions for an organization’s structured and unstructured data.
Enabling data governance using automated tools
The first challenge in any type of governance program is getting a handle on what is being governed, and data is no different. To accomplish this, we need to know what data exists and prioritize the data that should be further controlled. Applying data discovery technology is a foundational step in building out a data catalog because it finds these assets in an automated fashion. For structured data sources, data discovery tools can read the schemas (i.e., data definition language, or DDL) to capture the technical details of what types of data are stored within a given repository. This information gives us an overall understanding of how much data may be out there, but it then needs to be enriched further to understand what kind of data each asset actually holds.
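As a rough illustration of the schema-reading step described above, the following minimal sketch harvests table and column metadata from a database's own catalog. It uses Python's built-in sqlite3 module and an in-memory database purely as a stand-in for a real source system; the table names and the discover_schema helper are hypothetical, and this is not how Purview itself is implemented.

```python
import sqlite3


def discover_schema(conn):
    """Return {table_name: [(column_name, declared_type), ...]} for every user table."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [(col[1], col[2]) for col in columns]
    return catalog


# Hypothetical source system, created in memory so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, full_name TEXT, ssn TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    """
)
catalog = discover_schema(conn)
for table, columns in catalog.items():
    print(table, columns)
```

The output is purely technical metadata (names and declared types). It says nothing yet about whether a TEXT column holds a name or a Social Security number, which is why the enrichment step matters.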
Data catalogs store the results of data discovery exercises and support the additional enrichment of knowledge about each data asset to help us govern it. For example, through the data discovery process, we can apply pattern analysis (through machine learning and other techniques) to better understand the type of data we are looking at and classify it. This type of analysis has been used for years to identify critical data such as Social Security or credit card numbers and mark it as sensitive. In addition to classifications, business definitions, ownership, source systems and other critical attributes are also stored in the data catalog to help users, both technical and functional, better understand the data and use it properly. Using a data catalog discovery tool can:
- Assist in making better business decisions about data: Understanding what data is available, where it came from, how it is currently used within the business and who else is using it can be critical when choosing from the myriad data sources available.
- Help improve data insight: Provides greater visibility into an organization’s data through dashboards and metrics that support downstream and upstream impact analysis, showing how and where data asset fields and files are used for particular use cases.
- Assist in compliance with data privacy regulations: Allows organizations to discover data assets that are subject to data privacy regulations through sensitivity and classification rules and allows organizations to respond to data subject requests under those regulations.
- Build a list of business terms applicable to the company: Aids companies in coupling “business laymen’s terms” to captured data assets by providing business definitions to the developed technical asset nomenclature.
The ultimate goal in data governance is to populate a data catalog through technology, with limited manual intervention. Technologies such as Purview are rapidly closing the gaps toward this goal through automation, but they still require manual inputs and reviews for assurance.
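The pattern analysis mentioned above, flagging Social Security and credit card numbers as sensitive, can be sketched with regular expressions plus a Luhn checksum. This is a simplified illustration under stated assumptions: the labels, the 80 percent threshold and the function names are hypothetical, and production classifiers (including Purview's) use considerably richer rules.

```python
import re

# Assumed formats: dashed US SSNs and 13-19 digit card numbers with optional separators.
SSN_RE = re.compile(r"^\d{3}-\d{2}-\d{4}$")
CARD_RE = re.compile(r"^\d(?:[ -]?\d){12,18}$")


def luhn_valid(number):
    """Check the Luhn checksum used by payment card numbers."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify_value(value):
    """Return a sensitivity label for one value, or None if nothing matches."""
    value = value.strip()
    if SSN_RE.match(value):
        return "US_SSN"
    if CARD_RE.match(value) and luhn_valid(value):
        return "CREDIT_CARD"
    return None


def classify_column(values, threshold=0.8):
    """Label a column when at least `threshold` of its non-empty values share one label."""
    non_empty = [v for v in values if v and v.strip()]
    if not non_empty:
        return None
    counts = {}
    for v in non_empty:
        label = classify_value(v)
        if label:
            counts[label] = counts.get(label, 0) + 1
    if not counts:
        return None
    label, hits = max(counts.items(), key=lambda kv: kv[1])
    return label if hits / len(non_empty) >= threshold else None


sample = ["123-45-6789", "987-65-4321", "111-22-3333", "222-33-4444", "333-44-5555", "n/a"]
print(classify_column(sample))
```

Classifying at the column level with a threshold, rather than value by value, tolerates a few placeholder entries such as "n/a" while still avoiding a sensitive label on a column that only occasionally contains a matching string.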
Governance for the data-enabled world using Microsoft Purview
Microsoft Purview provides organizations the ability to discover, classify and assist in the protection of their data assets, both structured and unstructured, along with other functions. Purview can be used to:
- Perform data discovery and lineage identification: Purview can scan entire data landscapes, including Azure Synapse Analytics, Microsoft Fabric, SQL Server, Microsoft Power BI, Amazon S3, Google BigQuery and more (both on-premises and cloud-based data sources), to identify structured and unstructured data assets and their relationships (including data lineage) and bring them into a single location for viewing.
- Classify data based on its sensitivity: Purview scans and classifies data using either predefined or custom-coded classifiers to identify its level of sensitivity. This creates a vantage point into sensitive data, with information labeled in line with established data protection policies.
- Administer visibility and management of data across the organization: A key feature of Microsoft Purview is the ability to assign assets to responsible data stewards and experts. This value-driven functionality allows organizations to build trust in their data environment and in their data intelligence community through ownership and knowledge of data.
- Enable a trusted data catalog through change control: Microsoft Purview can apply change control to established data catalog configurations, with an appropriate level of audit control, through a capability referred to as “workflow.” Purview can monitor activities such as data scans, scanning rules, classification rules, business definitions and policy adherence, so that captured metadata, including how frequently ownership is certified, can be consistently trusted in an ever-changing data environment.
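To make the lineage and impact-analysis idea above more concrete, here is a minimal sketch that treats lineage as a directed graph and walks it to find everything downstream of a given asset. The asset names and the downstream_assets helper are hypothetical, and this does not use or represent Purview's own API. It only illustrates the kind of question a lineage view answers.

```python
from collections import deque


def downstream_assets(lineage, start):
    """Return every asset reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in lineage.get(current, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


# Hypothetical lineage: each key feeds the assets listed in its value.
lineage = {
    "crm.customers": ["staging.customers"],
    "staging.customers": ["warehouse.dim_customer"],
    "warehouse.dim_customer": ["report.churn_dashboard", "report.revenue_by_region"],
    "erp.orders": ["warehouse.fact_orders"],
    "warehouse.fact_orders": ["report.revenue_by_region"],
}

impacted = downstream_assets(lineage, "crm.customers")
print(impacted)
```

If a sensitive column in the source table is reclassified, a traversal like this identifies the staging tables, warehouse dimensions and reports whose protections would need to be reviewed.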
Microsoft Purview today and in the future
Microsoft Purview is rapidly gaining adoption in the data governance market. In March 2023, it became the first data governance solution to achieve the native cloud data management capabilities certification from the EDM Council, and it provides comprehensive end-to-end integration within the Microsoft portfolio of data and security products. Additionally, Microsoft is rapidly adding new features and capabilities to compete in the data governance space, including positioning Purview to coexist with leading-edge offerings such as Microsoft Fabric and Microsoft Priva.
Backed by Microsoft’s scale of resources and strategic investments, we expect Purview to be a leading governance solution for years to come, keeping in lockstep with ongoing data growth and providing organizations with a sustainable ability to discover, classify and assist in protecting their data ecosystems.
Where Protiviti can help
Enabled by Microsoft Purview, Protiviti can help build sustainable data governance capabilities that allow organizations to achieve data asset alignment needed to compete in the digital era. Whether modernizing a data governance strategy, enriching data protocols and processes, transforming data management capabilities or implementing periodic data scans to refresh technical metadata, Protiviti’s subject matter experts in data governance provide the leading-edge skillsets needed to optimize data investments.