The Golden Age of Data Owners

8 August 2024

The Promise of AI

Over the past months, generative AI has taken the minds of decision makers and practitioners by storm. It is finding its way into corporate strategy documents. The AI cornucopia is churning out hundreds upon hundreds of start-ups that offer solutions promising to revolutionise businesses with AI technology. This evolution creates a unique business opportunity to monetise proprietary data, promising a golden age of data owners.

There is no shortage of impressive yet carefully curated demos of AI-powered solutions. However, examples of successful company-wide deployments of AI initiatives are few and far between. Few doubt that AI promises radical improvements in productivity. Even fewer realise that delivering on this promise requires ever larger volumes of high-quality data to train AI models. Such data is in short supply, which gives businesses a unique opportunity to monetise their proprietary data.

So far, organisations developing foundation models have made extensive use of data publicly available on the Internet. Beyond that, they have used social media, Internet forums and curated content to train models. However, the marginal cost of acquiring additional proprietary data sets rises over time. To mitigate this, researchers theorised about using synthetic data to train and improve models. Indeed, generative AI models create ever more of the content published on the Internet today, and AI start-ups use it to train new model versions. However, recent work published in Nature indicates that this approach leads to a dead end.

AI Model Collapse

In their paper “AI models collapse when trained on recursively generated data”, the authors find that indiscriminate use of model-generated content in training causes irreversible defects in the resulting models. They call this effect AI model collapse. One of the core conclusions the authors highlight is that “the value of data collected about genuine human interactions with systems will be increasingly valuable in the presence of LLM-generated content in data crawled from the Internet.”
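
To build some intuition for this effect, here is a minimal toy sketch. It is our own illustration under simplifying assumptions, not the experiments from the paper. Each "model" is just a normal distribution fitted to a small sample drawn from the previous generation's model, so every generation is trained only on model-generated data. The function name, the sample size and the number of generations are all illustrative choices.

```python
import random
import statistics


def simulate_recursive_training(generations=200, sample_size=20, seed=0):
    """Fit a normal distribution to samples drawn from the previous fit.

    Returns the fitted standard deviation of every generation, starting
    with the original ("human") data distribution at index 0.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0: the original data distribution
    history = [sigma]
    for _ in range(generations):
        # "Train" the next model only on output sampled from the current one.
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)  # maximum-likelihood estimate
        history.append(sigma)
    return history


if __name__ == "__main__":
    history = simulate_recursive_training()
    for generation in (0, 50, 100, 150, 200):
        print(f"generation {generation:3d}: sigma = {history[generation]:.6f}")
```

In this toy setting the fitted spread tends to shrink from one generation to the next, because each small sample under-represents the tails of the distribution it came from and nothing reintroduces the lost variety. Mixing fresh, original data back into each generation is what would counteract the drift, which is the point about the value of genuine data.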

Golden Age of Data Owners

This has several important implications for organisations holding large datasets that record human interactions, or any other type of original data. First, don’t give data away for free in exchange for hazy promises of leadership in the AI race. Chances are that access to your data benefits the AI service providers (much) more than it benefits you. Second, require that data be protected throughout its lifecycle: at rest, in transit and, most importantly, in use. CanaryBit’s Confidential Cloud ensures data protection throughout its lifecycle. Likewise, Apple recently announced a similar approach marketed as Private Cloud Compute. Finally, make sure to technologically restrict the scope of data processing to guarantee that you maintain control over the data at all times. Confidential Computing is a foundation technology in CanaryBit’s Confidential Cloud that allows fine-grained control of data processing.

Conclusion

In the race towards an AI-powered society, data owners should be mindful of the value of the data they hold. Research shows that data collected about genuine human interactions with systems will become increasingly valuable as LLM-generated content spreads through data crawled from the Internet. Businesses and public administrations should avoid giving away their data in exchange for the dubious benefit of early access to subpar AI services. Instead, they must develop a strong data monetisation strategy and protect their data throughout its lifecycle with solutions such as CanaryBit’s Confidential Cloud.

Get Started!

Explore how Confidential Cloud helps to secure your cloud infrastructure, protect your data from any AI workload and, in turn, enable new business.