December 1, 2024

Deep Dive: Everything You Need to Know About Data DAOs on Vana

Data DAOs are central to the Vana protocol’s vision of user-owned AI. These entities put groups of users in direct control of their data, aggregating data around specific use cases that can then be monetized by making it available to organizations training AI models.

This solves an increasingly pressing problem: we’re running out of fresh data to train AI. This “data wall” is already slowing progress when it comes to building better AI. And, as models become more sophisticated, they require exponentially larger datasets for training. 

So instead of scraping user data from the open web, a clunky approach that reaches less than 0.1% of the data on the internet and often infringes on copyrights, models can be trained on user-owned data. Users contribute their data voluntarily – and are compensated for it.

Enter Vana. We believe that users should determine how and where their data is used – and by whom and for how much. By empowering users in this way, the financial benefits of AI are more evenly distributed to the people who provided the raw materials – the data.

How Vana works

Vana is a network designed to transform personal data into a valuable, user-controlled asset class. 

The protocol supports coordination across an incredibly complex web of connections, ensuring that data remains encrypted in transit to protect privacy while simultaneously enabling transparent usage tracking for precise monetization. 

There are three interconnected layers:

  1. the Data Liquidity Layer for pooling and validating data in Data Liquidity Pools, which are the repository for specific datasets contributed by users
  2. the Data Portability Layer for building applications and monetizing data
  3. the Vana chain, an EVM-compatible ledger that manages transactions and cross-DAO functionality. 

Through this structure, users can monetize and use their data in multiple ways, such as earning revenue from AI models trained on their data or feeding their data into applications for personalization, all while maintaining granular control over how their data is used.

Data contributors receive governance rights in their respective Data DAOs (more on those next!), allowing them to participate in decisions about data usage and value distribution.

In short: we’re the trustless layer between data owners, data communities and data users. We make it possible for individual users to pool their data collectively in a private-yet-traceable way so that data users can put that data to work on commercial tasks, such as training AI models, and contributors can be paid for it.

What is a Data DAO?


A Data DAO is a decentralized organization that pools user-contributed data on a blockchain, allowing individuals to collectively own, manage, and monetize their data for AI and other applications.

Data DAOs govern their respective Data Liquidity Pools (DLPs), which are smart contracts registered on Vana’s mainnet that allocate rewards to those upholding high standards of security and privacy while creating liquidity for diverse data assets. These pools use proof-of-contribution to ensure data quality, with the top 16 DLPs receiving additional rewards through token holder staking. 
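To make the ranking idea concrete, here is a minimal sketch of picking the top 16 pools by staked amount. It is an illustration only: the `Pool` type, its field names, and the tie-breaking rule are assumptions made for this example, not the logic of Vana's contracts.

```python
from dataclasses import dataclass

# From the article: the top 16 DLPs receive additional rewards.
TOP_DLP_COUNT = 16


@dataclass(frozen=True)
class Pool:
    name: str      # hypothetical pool label
    staked: float  # hypothetical total stake from token holders


def top_pools(pools: list[Pool], count: int = TOP_DLP_COUNT) -> list[Pool]:
    """Return the `count` pools with the most stake, highest first."""
    # Ties are broken alphabetically by name so the result is deterministic
    # (an assumption for this sketch).
    return sorted(pools, key=lambda p: (-p.staked, p.name))[:count]
```

Calling `top_pools` on a list of more than 16 pools returns only the 16 with the most stake, which mirrors the cutoff described above.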

Each Data DAO can set its own parameters for data validation and reward distribution, while maintaining user privacy and control through features embedded in the Vana protocol, such as encryption and secure key management.
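As a rough sketch of what "parameters for data validation and reward distribution" could look like, the example below splits a reward pool in proportion to each contributor's validated score. The function name, the `min_score` threshold, and the proportional rule are all hypothetical; each Data DAO defines its own rules.

```python
def distribute_rewards(
    scores: dict[str, float],
    reward_pool: float,
    min_score: float = 0.0,
) -> dict[str, float]:
    """Split `reward_pool` among contributors in proportion to their score.

    `scores` maps a contributor id to a validation score (hypothetical).
    Contributors at or below `min_score` receive nothing.
    """
    eligible = {who: s for who, s in scores.items() if s > min_score}
    total = sum(eligible.values())
    if total == 0:
        return {}  # nobody passed validation, so nothing is paid out
    return {who: reward_pool * s / total for who, s in eligible.items()}
```

Raising `min_score` is one simple way a DAO could tighten its quality bar without changing the payout formula.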

Data DAOs have a few core benefits:
  • Control. Data DAOs empower individuals by returning control over their data and creating avenues for monetization. It’s just not practical for individual users to offer their data for commercial opportunities; the DAO handles those aspects to facilitate monetization for contributors. 
  • Governance. Data DAOs govern the DLPs, which is where participants contribute their data. Participants earn rewards based on their data’s value and can influence the direction of data usage. This approach promotes efficient data use and fosters a more user-owned data economy, breaking the monopoly of large corporations over valuable user-generated information.
  • Community. The DAOs also function as beacons for like-minded people. By attracting users with shared interests, they propagate themselves and ensure that new data enters the system regularly. 

The Vana Data DAO ecosystem

We have a whole cohort of 16 Data DAOs operating within our ecosystem – and that’s before mainnet has even launched! To get an idea of the breadth and scope of our data collectives, here’s a rundown across several sectors. There are many, many ways that you can get involved by sharing your data and eventually earning from your contributions.

Platforms

We all spend a lot of our personal capital building out value on Other People’s Platforms – the saying goes: “If the product is free to use, then you’re the product.”

Reddit. Our largest is r/datadao, where 140k Reddit users have committed their data to train the first user-owned AI model. Alongside ORA, the DAO plans to launch the world’s first Initial Model Offering, a new way for users to capture some of the value being generated in the AI gold rush. We wrote about IMOs here; they’re a really exciting emerging tool in the AI training space. And this isn’t just talk: the DAO has already closed a deal to sell its data to an AI company.

X. Built by early Bittensor miners, the Volara Data DAO puts control back in the hands of users, who can upload their X data for potential monetization. This is even more relevant now that X has announced that third parties will be able to train AI with user data.

LinkedIn. DLP Labs enables professionals to carry their LinkedIn data into off-platform networks, which can create new opportunities for innovation around personal networking.

Amazon. PrimeInsights DAO lets you earn from your Amazon data by sharing it in a secure, member-controlled network. Instead of Amazon profiting alone, you gain rewards and can vote on data usage for market research or AI training. 

Data + Devices


IoT. With a focus on IoT data sharing, the IoT Data DAO is a sensor data collective that enables devices to autonomously contribute their data. Powered by OZ Protocol, a decentralized network that facilitates peer-to-peer data sharing, the collective makes it possible for developers to tap into the vast trove of data generated by millions of internet-enabled devices.

Synthetic Data Generation. SixGPT is a decentralized platform that enables users to run local GPT models, allowing full ownership of their data without the need for payments or API keys. By contributing conversations to a decentralized data liquidity pool, users can earn rewards based on the quality and quantity of their contributions.

Web Analytics. YKYR empowers users to monetize their data by simply browsing and sharing information through a simple extension. Users control what they share, and their data drives advancements in AI, rewarding them for their contributions.

Browsing History. Kleo Network is a Data DAO enabling users to own and earn from their consumption data. Using Kleo cards, it securely builds a comprehensive dataset of browsing history stored on decentralized storage. With encrypted data and smart contracts, users retain full control while their data generates value for them.

Human Insight


Social truth. dFusion empowers communities to contribute and validate unbiased knowledge, addressing AI bias and misinformation. By securing high-quality data through economic incentives, it builds a trusted knowledge pool for a better-informed future.

AI Training. VanaTensor is an advanced data liquidity pool built on the Vana blockchain, specializing in Reinforcement Learning from Human Feedback (RLHF) for Bittensor validators. By integrating human feedback with AI precision, VanaTensor refines large language models, resulting in more accurate and reliable AI outputs.

Financial  


Trading patterns. DataPig is a Data DAO that aggregates trading data from DeFi platforms and leverages AI to analyze it for patterns and personalized insights. By collecting user trading preferences and real-time data through questionnaires and API integration, DataPig employs machine learning and natural language processing to perform deep analysis and pattern recognition on raw data.

Financial predictions. Finquarium is a decentralized marketplace for financial forecasts, enabling analysts to share encrypted predictions on various assets. Utilizing blockchain technology, Finquarium ensures transparency and accountability by tracking analyst performance and rewarding accurate forecasts. Users can access a diverse pool of financial insights, fostering informed decision-making in the financial sector.

Health


Genetic data. Just as 23andMe struggles with layoffs and potential insolvency, some users are looking for ways to continue contributing to the initial mission for longevity and health research. DNADataDAO (powered by NakaMoto) gives 23andMe users the chance to aggregate their data for potential research and other commercial opportunities. There’s also DNA Helix DAO, which is focused on privacy-preserving health data sharing starting with 23andMe data.

Female health. Women have a unique set of health needs that aren’t always addressed by the medical establishment. AsteriskDAO is dedicated to advancing research and funding for non-reproductive women’s health issues through data sharing. The goal is to empower women to influence healthcare decisions and support the development of female-specific platforms to enhance understanding of conditions that predominantly affect women.

Mental health. Web3 ain’t easy – always on, never sleeps. MindDAO tracks how Web3 affects mental health by collecting mood data via weekly reports. Participants earn $MIND tokens, and the data reveals trends linking wallet activity and trading behavior to emotional well-being – and helps us all navigate the space more healthfully. 

How to get involved with a Data DAO

The Vana ecosystem gets better with every dataset contributed by users. These embedded network effects make for a powerful flywheel that will continuously grow the value for all participants.

To get involved, dive into any of the Data DAOs here.