The volume of data generated and used today is both impressive and daunting to manage. What is the best way to manage this mountain of dispersed and disparate data? A possible answer lies in the concept of ‘data fabric’ as a means to unify data. This is an integrated layer of data and connecting processes that “utilizes continuous analytics over existing, discoverable and inferenced metadata assets to support the design, deployment and utilization of integrated and reusable data across all environments.”

So what does that mean in everyday speech?

Think of the data fabric as a bedsheet spanning your data sources. Every source is connected, or woven, into the bedsheet through some shared element (metadata, API services, security information, etc.) and can therefore be linked to other storage and computing functions. Data silos come down because every source feeds into one large connecting layer.

Seems straightforward. In a hybrid environment, the concept sounds appealing, too. The details, of course, are where it gets tricky to implement.

How Does It Work?

To begin, we need to understand that data fabric architectures do not happen overnight. Putting one in place is a journey, requiring knowing your existing data (or ‘data mess’). Next, you need a plan to automate, harmonize and manage all sources with common connectors and components, removing the need for custom coding.

Remember when mobile phones had proprietary connectors? Nowadays, almost all of them use USB-C. Loosely, removing custom coding is a similar concept. Proprietary connectors, components and code have their uses, but if the goal is to connect your data sources, commonality is your friend. You need a data framework that enables single, consistent data management and allows seamless access and processing. The framework is built on the following principles:

  • Knowledge, insights and semantics. Knowing what data is out there, with high visibility, and how to access it.
  • Unified governance and compliance. A common playbook and set of rules so all users can play on the same field.
  • Intelligent integration. Defined and managed on-ramps, lanes and off-ramps. Could you safely drive on the interstate if jumping on and off was not managed? Data fabric can lead to improved workload management.
  • Orchestration and life cycle. Take advantage of new tools, such as machine learning, to limit the number of accidents on that interstate. The unified data source view means the system can limit pile-ups.
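To make the ‘common connector’ idea above concrete, here is a minimal sketch in Python. All class and field names here are illustrative assumptions, not any product’s actual API: every source implements one shared interface, so the fabric layer can discover and read from each of them without per-source custom code.

```python
from abc import ABC, abstractmethod

class DataConnector(ABC):
    """One shared interface every source implements,
    replacing per-source custom integration code."""

    @abstractmethod
    def metadata(self) -> dict:
        """Describe the source: name, row count, owner, sensitivity, etc."""

    @abstractmethod
    def read(self) -> list:
        """Fetch records through the common access layer."""

class InMemoryConnector(DataConnector):
    """Toy connector standing in for a real database or API source."""

    def __init__(self, name: str, rows: list):
        self.name, self.rows = name, rows

    def metadata(self) -> dict:
        return {"source": self.name, "rows": len(self.rows)}

    def read(self) -> list:
        return self.rows

# The 'fabric' view: every source is discoverable through its metadata.
fabric = {
    c.metadata()["source"]: c
    for c in [
        InMemoryConnector("crm", [{"customer": "Acme"}]),
        InMemoryConnector("billing", [{"invoice": 1001}]),
    ]
}
```

Because every connector answers the same `metadata()` and `read()` calls, the fabric can catalog and query new sources the moment they are woven in.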

Data Fabric as a Means of Protection

If a skeptical CISO is reading and wondering, “won’t this just expose my data through a single source?” they would be right to worry. Supply chain attacks have shown the problems single sources can cause. Would the same concern not apply here?

Quite possibly, but the answer lies in build, configuration and maintenance. Correctly used, data fabric can make your business more efficient and add data protection. The key is ensuring the right defensive and privacy guardrails are built-in, including but not limited to data masking and encryption.
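As one hedged sketch of what a built-in guardrail can look like, here is field-level masking applied before the fabric hands a record to a consumer. The function names are illustrative; a real deployment would use a dedicated masking and encryption product rather than hand-rolled code.

```python
import hashlib

def mask_email(value: str) -> str:
    """Replace the local part of an email with a stable pseudonymous
    token, so analysts can still join on the field without seeing it."""
    _, _, domain = value.partition("@")
    token = hashlib.sha256(value.encode()).hexdigest()[:10]
    return f"user-{token}@{domain}"

def apply_guardrails(record: dict, sensitive: set) -> dict:
    """Mask sensitive fields before a record leaves the fabric layer."""
    return {k: mask_email(v) if k in sensitive else v
            for k, v in record.items()}
```

Because the hash is deterministic, the same address always maps to the same token, preserving joins and analytics while keeping the raw value out of consumers’ hands.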

But like all centralized systems, there are some drawbacks.


Where Data Fabric Can Backfire

Centralization will always come with its own problems. If you mismanage a data fabric architecture, you could face cascading failure. Paradoxically, inefficient architectures and security measures built on obfuscation, lack of coordination and disparity (intentional or not) offer a level of resilience. Think of them as unintended segmentation and backup measures.

For example, data fabric could limit, or remove, historical records of data transactions. Depending on the business type, using data fabric architecture could be a very risky decision. If your business relies on processing transactions, not having historical record backups could put you in a bad position if destructive malware or ransomware hits, severely limiting your disaster recovery.
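One mitigation for the recovery risk above is keeping an append-only transaction log outside the fabric itself. Here is a minimal sketch; the class and method names are hypothetical, and a production system would write to immutable, offline storage rather than an in-memory list.

```python
import json
import time

class TransactionLog:
    """Append-only record of data transactions, kept outside the
    fabric so malware hitting the central layer cannot erase history."""

    def __init__(self):
        self.entries = []  # in production: write-once, offsite storage

    def record(self, source: str, action: str, detail: dict) -> None:
        """Append one transaction; entries are never updated or deleted."""
        self.entries.append(json.dumps(
            {"ts": time.time(), "source": source,
             "action": action, "detail": detail}))

    def replay(self) -> list:
        """Rebuild the transaction history during disaster recovery."""
        return [json.loads(e) for e in self.entries]
```

The point is separation: if ransomware takes out the fabric, the log still lets you reconstruct what happened and when.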

Is Data Mesh Right for You?

As mentioned before, there is a huge benefit in having common connectors. However, they come at a price. Building and managing the complex data pipelines that permit common connections makes the system more complex, and with complexity comes fragility. It also increases the likelihood of latency.

To contrast, let us look at a related but different concept: data mesh. Whereas data fabric relies heavily on artificial intelligence and automation, driven by rich metadata, data mesh relies more on the structure and culture of the organization to bring data products and their users together.

Let’s say you’re a CISO or a CIO, or perhaps even a risk or technology officer, who wants to implement data mesh. You would push for a change program that defines data needs upfront, where your data product owners shift to align data with those needs. Data fabric is centralized and requires control to operate, whereas data mesh is federated and requires alignment to operate.

Building Data Fabric Into Your Environment

So, what do you do once you’ve chosen the data fabric approach? Begin with small steps, starting with your DevOps team. Rolling out data fabric requires a good deal of planning, meaning software and IT teams working together is crucial. It is also smart to include your security and business teams. Keep in mind, if the entire enterprise will rely on this ‘bedsheet’ to connect their data, you’ll need input from all of the stakeholders.

Also, migrating to a data fabric implementation is a great time to adopt security-by-design thinking, which can do wonders for your business and technical resilience, and to think longer-term about data destruction. How well you catalog and tag your data is a key marker of how successful your project will be, so do not shy away from investing serious effort in your metadata requirements. In the end, your AI/ML work will rely on them.
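As a sketch of what cataloging and tagging can look like in practice (all names here are illustrative, not a specific product’s API), a catalog is essentially a searchable registry of assets keyed by their metadata tags:

```python
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    """Metadata describing one data asset woven into the fabric."""
    name: str
    location: str
    tags: set = field(default_factory=set)

class DataCatalog:
    """Searchable registry of data assets, discoverable by tag."""

    def __init__(self):
        self._entries = {}

    def register(self, name: str, location: str, tags=()):
        self._entries[name] = CatalogEntry(name, location, set(tags))

    def find_by_tag(self, tag: str) -> list:
        """Return the names of all assets carrying a given tag."""
        return sorted(e.name for e in self._entries.values()
                      if tag in e.tags)
```

Tags like `pii` or `finance` are what let both your guardrails and your AI/ML pipelines find the right data automatically, which is why the tagging effort pays off later.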

Gartner calls out data fabric and data mesh as strategic tech trends to keep an eye out for in 2022. Before you decide which may be right for you and which can improve your defensive posture, remember that your risk tolerance and business operation needs will drive which architecture solution is best for you.
