
What is Data Collaboration?

Data Collaboration is an approach to digital innovation where stakeholders (customers, partners, internal teams) participate in the creation of new applications, systems, analytics, algorithms, and automations while retaining meaningful control of their information.

This stands in stark contrast to traditional approaches to data management and application design where control and collaboration are made impossible by the creation of unrestricted copies between fragmented data silos.

Collaboration everywhere, but not on data?

Over the past 10 years or so, the rise of business collaboration tools has been unstoppable, and for good reason – we all remember when getting multiple colleagues to work on a document required it to be emailed back and forth among contributors. The versions very quickly got out of sync, and it was basically a nightmare for everyone involved.

Then, a few years ago, along came Google Docs, and we all quickly learned how powerful it is to collaborate in real time with people and systems (e.g. spell checkers) to improve the quality and efficiency with which we deliver outcomes. The trend quickly spread to other business-critical areas, including project management (e.g. Asana), development (e.g. GitHub), and storage (e.g. Dropbox).

The funny thing is that collaboration now seems to be happening everywhere except where it would probably have the greatest impact: on the operational data behind the apps, reporting, systems, and automations that run organizations and even entire supply chains. This includes the data maintained by operational and customer-facing teams, as well as by the IT groups who use data to manage and build the systems that help everyone else do their jobs.
 
As modern organizations, we need to enable all stakeholders, including business teams, partners, customers, and suppliers, to collaborate successfully on operational data in order to accelerate organizational problem-solving without compromising control or protection.

 

Unfortunately, current data management technologies offer few answers, as they’re either insecure and not fit for operational use (e.g. spreadsheets), lack a collaborative user experience (e.g. databases), or fail to address the root causes of the data fragmentation that makes collaboration on data impossible (e.g. data integration hubs, data lakes, data virtualization, and data warehouses).

 

But what are these "root causes", and how can they be addressed?
 

Data is fragmented by silos

You've probably heard the expression "There's an app for everything" - at the Data Collaboration Alliance, we have taken that sentiment and extended it as follows:

"There's an app for everything, and a database for every app"

 

Even small organizations now maintain hundreds of apps and systems, and in enterprise organizations such as banks, healthcare groups, and manufacturers, they number in the thousands.

 

The challenge with this approach, which has been standard practice since 1979 (thanks, Oracle!), is that each app maintains a separate database (also known as a data silo). So when anyone wants to use data to build new apps, 360 views, AI/ML algorithms, or digital twins, they need to make copies.

Lots of copies.

Control is eroded by copies

The exchange of copies of data between apps and systems is known by IT teams as "data integration", and it has become a routine task carried out (or indirectly supported via SaaS apps) by virtually every organization in the world, including those that collect sensitive healthcare, location, and financial information.

 

Most technology leaders now consider data integration a necessary evil - it's an "innovation tax" that adds no value to employees or customers and only gets more complex with every new app that is introduced to their technology ecosystem.

For the citizens, customers, clients, internal teams, and partners who contribute data to IT ecosystems, the real problem is that copy-based integration erodes anyone's ability to meaningfully control, audit, or delete their data.

 

This means that data is routinely exposed to people, companies, and systems that were never intended to have access. This includes "3rd parties" who operate SaaS-based apps in jurisdictions beyond the reach of compliance officers.

 

This poses a huge challenge to the data governance policies and data protection regulations meant to prevent this from happening.
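To make the erosion of control concrete, here is a minimal Python sketch. Everything in it is hypothetical: the system names (`crm`, `billing`, `analytics`) and the customer record are invented for illustration, and real integrations are far more involved. It only shows the basic mechanic: once copies exist, deleting the original does nothing to them.

```python
import copy

# Hypothetical source system holding one customer's record.
crm = {"cust-1": {"name": "Ada", "postcode": "M5V 2T6"}}

# "Data integration": each downstream system takes its own unrestricted copy.
billing = copy.deepcopy(crm)
analytics = copy.deepcopy(crm)

# The customer asks the source system to delete their data.
del crm["cust-1"]

# The deletion never reaches the copies, so the data lives on elsewhere.
remaining = [
    name
    for name, system in (("billing", billing), ("analytics", analytics))
    if "cust-1" in system
]
print(remaining)  # ['billing', 'analytics']
```

With two copies the clean-up is manageable; with hundreds of apps, each holding its own copy, it quickly stops being something any one person can track.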

So how can people collaborate on data that is trapped in application silos and routinely copied at scale?

Data Collaboration is a Zero-Copy approach

Today's application designs and data integration technologies are basically large-scale photocopiers of sensitive information.

Towards Collaborative Intelligence

It might sound crazy, but one way to make the control of data possible in order to support true collaboration between stakeholders is to stop making copies.

 

When you think about it, this approach is already used by most societies in order to protect the value of their currency, intellectual property, and identities - and it works for data, too.

In fact, there's already a new generation of innovators who have found inspiration from nature (where else!) to solve this seemingly impossible challenge.

For example, the design of the brain provides us with an architectural blueprint for how data can be used to power unlimited solutions while preserving ownership and control. This design, perfected over millions of years of evolution, not only enables each of us to manage more data than the largest organization on Earth, but to do so without making physical copies of information.

 

This miracle is made possible by organizing information as an interconnected network. In the digital world, innovators are now using hyperlinks and other technologies to mimic how the brain uses one set of physical data to power unlimited applications.

 

In other words, we now know how to eliminate data silos and data copies from the design of new apps, systems, and bots. Thanks, brain! 🧠
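As a rough illustration of the idea, the Python sketch below keeps one physical set of records and lets several apps read it through access grants managed by the data owner. It is a toy under stated assumptions, not a description of any particular product: the `Dataset` class, its `grant`, `revoke`, and `read` methods, and the app names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """One physical set of records plus owner-managed access grants."""
    records: dict
    grants: dict = field(default_factory=dict)  # app name -> allowed field names

    def grant(self, app, fields):
        self.grants[app] = set(fields)

    def revoke(self, app):
        self.grants.pop(app, None)

    def read(self, app, record_id):
        # Access is checked at read time; the app never stores its own copy.
        allowed = self.grants.get(app)
        if allowed is None:
            raise PermissionError(f"{app} has no access")
        record = self.records[record_id]
        return {key: value for key, value in record.items() if key in allowed}


customers = Dataset({"cust-1": {"name": "Ada", "postcode": "M5V 2T6", "balance": 120}})
customers.grant("billing", ["name", "balance"])
customers.grant("analytics", ["postcode"])

billing_view = customers.read("billing", "cust-1")
analytics_view = customers.read("analytics", "cust-1")

# The owner changes their mind: one revocation takes effect immediately.
customers.revoke("analytics")
try:
    customers.read("analytics", "cust-1")
    analytics_blocked = False
except PermissionError:
    analytics_blocked = True
```

Because both apps read from the same records, there is a single place to grant, audit, or withdraw access, which is the property that copy-based designs lose.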

But whereas the human brain represents a singular design that is well-suited to serve individuals, there's no such "silver bullet" that will transform the global IT ecosystem.

Just like the internet leverages many technologies and protocols in order to function as a singular network of connected computers, a global network of datasets (and data-centric solutions) will require multiple data management architectures, technologies, standards, and protocols to support humanity's transition from a data landscape defined by copies and chaos to one defined by control, ownership, and collaboration:

  • Blockchain / Hashgraph

  • Dataware

  • Data Wallets / Data Unions

  • Data Encryption

  • Data Mesh / Data Products

  • Linked Data

  • Low Code / No Code

  • Self-Sovereign Identity

  • Zero-Copy Integration

  • Zero-Knowledge Proof

 

Combining these threads into a coherent fabric is the motivation behind our Collaborative Intelligence Network (CIN) blueprint initiative.

Embracing Data-Centricity

Another key development in making Data Collaboration the new normal lies in finding a common language that can be shared by business, data, and IT teams in order to benefit from a diversity of ideas and perspectives when it comes to digital problem-solving.

This is where data-centricity offers a unique advantage.  

 

"Data centricity refers to the idea that data is a primary and permanent asset, whereas applications come and go. In the data-centric architecture, the data model precedes the implementation of any given application and will be around (and valid) long after the application has gone."

 

When you think about it, data is as close to a universal business language as we’re likely to find, and learning it is a highly achievable goal for almost everyone.

For example, at the Data Collaboration Alliance, we’re introducing free courses to teach business users who are familiar with spreadsheets to become savvy in data-centric skills like data modeling, active metadata, and data ownership.

 

To us, this makes a lot more sense than teaching them Angular, Java, and Python (or, conversely, sending scarce IT resources to more business meetings).

All that’s been missing for this to become a reality are digital environments where business people and IT teams can log in and collaborate on data - this is exactly the sort of technology and outcome we advocate at the Data Collaboration Alliance (and support via our free Node Zero community).
 

Data Ownership is a work in progress

While the potential for Data Collaboration and data-centricity to advance control, efficiency, and collaboration-based innovation is incredibly exciting, it would be a mistake to assume that the shift from data silos to controlled environments will happen overnight. 

 

Similarly, it would be naive to assume that the citizens, nonprofits, and businesses who contribute data to digital ecosystems have the time or inclination to manage access requests from all those who want to collaborate on their data (hey there, Privacy Paradox).

 

Imagine if every app required end users to maintain a unique set of access controls - it wouldn't be long before we'd all be required to set hundreds or even thousands of such controls or, to be more accurate, give up and set none of them. 🤣

 

Maybe the answer to this challenge will be found in the emergence of new professions (e.g. Data Access Consultants), or perhaps digital agents powered by machine intelligence will adopt the role of "robotic custodians". But whatever the future holds, a lot of work remains to figure out exactly how data ownership will work.

 

As the futurist William Gibson once observed, "The future is already here, it's just not very evenly distributed."  At the Data Collaboration Alliance, we're up for the challenge of making data ownership and collaboration-based innovation the new normal. 

 

Are you ready to join us?