Data Terminology - Part III

Welcome to the third of my three posts on data terminology. The first post focused on terminology that comes up before analysis even starts. The second covered some data analysis and communication approaches. In this post I will discuss some of the terminology around data management.

Data Management

What is data management? Definitions differ in wording but tend to be fairly similar. Tableau defines data management as “the practice of collecting, organizing, protecting, and storing an organization’s data so it can be analyzed for business decisions.” Databricks defines it as “the day-to-day, technical and operational processes that ensure data is properly collected, stored, maintained and made accessible to users.”

As the volume of data available for analysis increases and people become more concerned about how their personal information is used, data management concepts are being discussed more often, and it is becoming more important for organizations to have robust data management processes. That brings us to more terminology worth reviewing, this time specifically related to data management.

Data Governance vs Data Management

Data governance and data management are two sides of the same coin and organizations should ensure they have a robust system for both. Data governance sets the standards, strategy and processes for data use while data management executes and operationalizes those processes.

A good analogy found on the BMC website compares the two to building a house: data governance represents the architectural blueprint while data management represents construction of the house. Although it’s possible to build a house without a blueprint or develop a blueprint without building a house, it’s best when the two are used together, just like data governance and data management.
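To make the split concrete, here is a minimal sketch in Python. Everything in it is a made-up illustration rather than any vendor's API: the `FieldPolicy` class, the `GOVERNANCE_POLICY` table, and the `apply_policy` function are hypothetical names. The policy table plays the role of governance (the blueprint), and the function that applies it to a record plays the role of management (the construction).

```python
from dataclasses import dataclass


# Governance: a declared standard. This policy is hypothetical and exists
# only to illustrate the idea of "rules written down in one place".
@dataclass(frozen=True)
class FieldPolicy:
    required: bool  # must every record carry this field?
    mask: bool      # must the value be hidden from general users?


GOVERNANCE_POLICY = {
    "customer_id": FieldPolicy(required=True, mask=False),
    "email": FieldPolicy(required=True, mask=True),
    "notes": FieldPolicy(required=False, mask=False),
}


# Management: the day-to-day process that applies the standard to real records.
def apply_policy(record: dict, policy: dict) -> dict:
    cleaned = {}
    for field, rule in policy.items():
        value = record.get(field)
        if value is None:
            if rule.required:
                raise ValueError(f"missing required field: {field}")
            continue  # optional and absent: nothing to store
        cleaned[field] = "***" if rule.mask else value
    return cleaned


record = {"customer_id": 42, "email": "pat@example.com", "notes": None}
result = apply_policy(record, GOVERNANCE_POLICY)
print(result)  # {'customer_id': 42, 'email': '***'}
```

Changing a rule means editing the policy table only, while the code that enforces it stays the same. That is roughly the relationship between the two disciplines, although real governance also covers people, roles, and processes that no snippet can capture.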

Data Democratization

Data democratization is about data accessibility: allowing everyone within an organization, regardless of how technical they are or what their job title is, to access and use data. The goal is to empower more people to make data-informed decisions by removing bottlenecks (such as reliance on data engineers or data analysts) and by providing tools and training that allow users to work with data directly.

Data Mesh vs Data Fabric

Data mesh and data fabric are two newer concepts in data management. Both are approaches to handling large-scale data across an organization.

  • Data mesh: A decentralized approach to managing data. Instead of centralizing all data into one repository, data mesh advocates for a distributed architecture where data is owned by individual teams (or domains) within an organization. Each team is responsible for treating its data as a product, ensuring quality and accessibility.

  • Data fabric: An architectural approach that provides a unified layer of data management and governance across various sources. It focuses on integrating data from multiple systems so that it can be accessed and used consistently across an organization.

While data mesh focuses on decentralization and ownership, data fabric emphasizes a unified, interconnected data environment that facilitates integration and governance.
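The contrast can be sketched in a few lines of Python. This is a toy model under loud assumptions: `DataProduct`, `DataFabric`, the domain names, and the sample rows are all hypothetical, and real implementations involve platforms and tooling far beyond this. The point is only to show where ownership sits in each approach.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Reader = Callable[[], List[Row]]


# Data mesh: each domain team owns and publishes its own data product.
class DataProduct:
    def __init__(self, domain: str, name: str, reader: Reader) -> None:
        self.domain = domain    # the owning team
        self.name = name
        self._reader = reader   # access function supplied by the owner

    def read(self) -> List[Row]:
        rows = self._reader()
        # The owning domain is responsible for the quality of its product.
        if not all("id" in row for row in rows):
            raise ValueError(f"{self.name}: every row needs an id")
        return rows


sales_orders = DataProduct("sales", "orders", lambda: [{"id": 1, "total": 30.0}])
support_tickets = DataProduct("support", "tickets", lambda: [{"id": 7, "status": "open"}])


# Data fabric: one unified layer sits in front of many sources.
class DataFabric:
    def __init__(self) -> None:
        self._sources: Dict[str, Reader] = {}

    def register(self, name: str, reader: Reader) -> None:
        self._sources[name] = reader

    def query(self, name: str) -> List[Row]:
        # A single place to apply the same handling to every source;
        # here it simply tags each row with the source it came from.
        return [dict(row, source=name) for row in self._sources[name]()]


fabric = DataFabric()
fabric.register("orders", sales_orders.read)
fabric.register("tickets", support_tickets.read)
print(fabric.query("orders"))  # [{'id': 1, 'total': 30.0, 'source': 'orders'}]
```

In this sketch the fabric reads from the mesh-style products, which hints that the two ideas are not mutually exclusive: ownership can stay with the domains while access goes through a common layer.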

Conclusion

Understanding the nuances between these common data terms is critical for navigating today’s data-informed world. Whether you’re working with structured or unstructured data, combining datasets, or transforming them for analysis, each of these terms describes a specific part of the data lifecycle. With a solid grasp of this terminology, you can better communicate and collaborate in the data space, whether you’re determining your data governance strategy, building a data lake, wrangling data for analysis, or empowering your team through data democratization.

I hope these posts have helped you understand the nuances in data terminology. As the industry grows, so will the number of terms in use, so good luck, and keep learning!