2023-01-25
The data mesh approach allows teams to own and manage their own data, while still providing the benefits of a federated governance model.
The switch from centralized to decentralized data in organizations refers to the move from traditional data management approaches, such as data lakes and data warehouses, to more modern, decentralized approaches, such as data mesh. Organizations have been going through this shift for a few years now. In traditional data management, data is typically stored in a central repository, such as a data lake or data warehouse, which forms a monolith accessed by various teams and systems. This can be useful for providing a single source of truth and enabling organizations to gain insights from their data. However, it can also be inflexible and difficult to scale, as teams may need to go through complex data integration processes to access the data they need. Business entities have evolved and are calling for a new model: they now have their own data scientists or data experts, and they want a faster time-to-market for their analytics and insights requirements.
Different types of organization for leveraging data
Data mesh is based on the idea of decentralized, self-serve data access, with data treated as a first-class citizen and fully distributed across the organization. This approach allows teams to own and manage their own data, while still providing the benefits of a federated governance model.
One of the key principles of data mesh is the concept of "domains", which are self-contained units of related data that are owned and managed by a specific team or business unit. These domains are connected through a shared data model, allowing teams to access and use data from other domains while still maintaining control over their own data. This approach allows for greater flexibility and agility, as teams can quickly access and use the data they need without going through complex data integration processes.
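To make the idea of domains more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a reference implementation: the names DataProduct, Mesh, publish and consume are hypothetical, the "shared data model" is reduced to a simple column-to-type contract, and the sales domain with its orders data is invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class DataProduct:
    """A dataset that one domain owns and publishes for others (hypothetical)."""
    name: str
    owner_domain: str
    schema: Dict[str, type]        # shared contract: column name -> expected type
    read: Callable[[], List[Row]]  # the owning team decides how rows are produced

    def fetch(self) -> List[Row]:
        # Check every row against the contract before handing it to a consumer.
        rows = self.read()
        for row in rows:
            for column, expected in self.schema.items():
                if not isinstance(row.get(column), expected):
                    raise TypeError(
                        f"{self.name}: column {column!r} violates the contract"
                    )
        return rows


@dataclass
class Mesh:
    """A tiny in-memory catalog connecting domains (assumption for this sketch)."""
    products: Dict[str, DataProduct] = field(default_factory=dict)

    def publish(self, product: DataProduct) -> None:
        # Products are addressed as "<domain>.<product name>".
        self.products[f"{product.owner_domain}.{product.name}"] = product

    def consume(self, key: str) -> List[Row]:
        return self.products[key].fetch()


# The (invented) sales domain publishes its orders as a data product.
mesh = Mesh()
mesh.publish(DataProduct(
    name="orders",
    owner_domain="sales",
    schema={"order_id": int, "amount": float},
    read=lambda: [
        {"order_id": 1, "amount": 19.9},
        {"order_id": 2, "amount": 5.0},
    ],
))

# Another domain reads the product through the catalog, without a custom
# integration project and without taking ownership of the data.
total = sum(row["amount"] for row in mesh.consume("sales.orders"))
```

In this sketch the sales team keeps control of how the rows are produced (the read function), while consumers only depend on the published contract.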
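Data mesh is based on four core principles: domain-oriented data ownership, data as a product, a self-serve data platform, and federated computational governance.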
In addition, data mesh encourages the use of modern data technologies, such as cloud-native architecture, event-driven data pipelines, and AI-powered data analytics. For a while now, the term “Modern Data Stack” has been everywhere. To put it simply, it is a set of tools hosted in the cloud that allows an organization to integrate data in a very efficient way. It is also the foundation of DataOps and MLOps. The Modern Data Stack enables the creation of clean, reliable and always available data that users can self-serve, thus fostering a truly data-driven culture. It is typically composed of several layers stacked on top of each other (like a cake), and each layer has its own function (ingestion, storage, transformation, analytics and governance). The Modern Data Stack configuration is modular and designed to be compatible with other components and tools (plug-and-play). This concept is compatible with the data mesh approach, as the Modern Data Stack built by each domain will generate and expose data as a product, which can then be used by any other domain or pulled into the enterprise data warehouse.
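The layered, plug-and-play idea can be sketched in a few lines of Python. This is a toy illustration under assumptions, not a real stack: each layer is a plain function over in-memory rows, the function names, the sample records and the order of the layers are all hypothetical, and in practice each layer would be a separate cloud tool.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Layer = Callable[[List[Row]], List[Row]]


def ingest(_rows: List[Row]) -> List[Row]:
    # Ingestion: pull raw records from a source (hard-coded sample data here).
    return [{"customer": " Ada ", "spend": "120"},
            {"customer": "Bob", "spend": "n/a"}]


def store(rows: List[Row]) -> List[Row]:
    # Storage: stand-in for writing to cloud storage; returns a copy of the rows.
    return list(rows)


def transform(rows: List[Row]) -> List[Row]:
    # Transformation: trim names, cast spend to int, drop rows with invalid spend.
    cleaned: List[Row] = []
    for row in rows:
        spend = str(row["spend"])
        if spend.isdigit():
            cleaned.append({"customer": str(row["customer"]).strip(),
                            "spend": int(spend)})
    return cleaned


def govern(rows: List[Row]) -> List[Row]:
    # Governance: mask personal data before it leaves the domain.
    return [{**row, "customer": "***"} for row in rows]


def analyze(rows: List[Row]) -> List[Row]:
    # Analytics: aggregate the cleaned rows into a small report.
    return [{"customers": len(rows),
             "total_spend": sum(int(str(r["spend"])) for r in rows)}]


def run_stack(layers: Dict[str, Layer]) -> List[Row]:
    # Run the layers in insertion order, feeding each output to the next layer.
    rows: List[Row] = []
    for layer in layers.values():
        rows = layer(rows)
    return rows


stack: Dict[str, Layer] = {
    "ingestion": ingest,
    "storage": store,
    "transformation": transform,
    "governance": govern,
    "analytics": analyze,
}
report = run_stack(stack)


def transform_lenient(rows: List[Row]) -> List[Row]:
    # Alternative transformation: keep invalid rows and count their spend as 0.
    out: List[Row] = []
    for row in rows:
        spend = str(row["spend"])
        out.append({"customer": str(row["customer"]).strip(),
                    "spend": int(spend) if spend.isdigit() else 0})
    return out


# Plug-and-play: replace one layer without touching the others.
lenient_report = run_stack({**stack, "transformation": transform_lenient})
```

Because every layer shares the same input and output shape, a domain can swap a single component (here the transformation step) while the rest of its stack, and the data product it exposes, stay unchanged.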
Overall, data mesh represents a new way of thinking about data management that can help organizations to overcome the challenges of traditional approaches and to unlock the full potential of their data.
But you may face some obstacles, most of them at the organizational level. Where to start? What are the different domains in my organization? Is there a data mesh blueprint somewhere? How do you design a data mesh architecture, including the microservices and APIs needed to access the data, as well as the data governance and security strategy? Which data product should be deployed first?
Stay tuned, as we are preparing more articles on how to start the transition!