January 8, 2025

Breaking Down Data Walls: How Vana is Building the Future of User-Owned AI

In a recent conversation with The Defiant's Cami Russo, Vana creator Anna Kazlauskas shared her vision for democratizing AI through user-owned data. Her journey from mining Ethereum in an MIT dorm room to building the foundation for decentralized AI offers insights into one of the most pressing challenges facing artificial intelligence: the data wall.


The Growing Data Crisis for AI Models

"I got interested in crypto through mining," Kazlauskas recalls. "I had been really interested in central banks. I had worked at the Fed, I had a picture of Janet Yellen in my high school bedroom because I saw currency as central to everything." This early fascination with decentralized systems, combined with AI research at MIT's Computer Science and AI Lab, led her to a crucial insight: "Really the only thing that matters for better AI models is better data."

"From an AI research perspective, one of the number one problems we face today is something that AI researchers call the data wall," explains Kazlauskas. "We've actually run out of data to train AI models on."

Today's leading AI models are trained on roughly 15 trillion words, which is approximately all of the publicly available text on the internet. This scarcity has tech companies rushing to acquire private user data, and platforms like Reddit are earning $200 million annually from selling user-generated content.

But there's a crucial detail many overlook: "Users have full legal ownership over their data," Kazlauskas points out. "In the same way that when you park your car in a parking lot, the parking lot doesn't own your car... when you put your data on a platform, you retain full legal ownership."

This reality creates an opportunity to reimagine how data flows through the AI economy. Rather than allowing platforms to exclusively monetize user data, Vana enables individuals to participate directly in the value their data creates.

Enter DataDAOs: A New Model for Data Ownership

Vana's solution comes in the form of DataDAOs - decentralized organizations that allow users to pool and govern their data while maintaining control. "It's a bit like a labor union for data," notes Kazlauskas.

These DAOs are already showing promising results. The Reddit DataDAO has attracted 140,000 users who've contributed their data and collectively trained their first user-owned AI model. Other projects include a genetics DataDAO that aims to transform healthcare data ownership and potentially "buy 23andMe outright."

"When you've contributed to a DataDAO, you get back a data set specific token," Kazlauskas explains. "This kind of data token represents your ownership in that data set overall, and gives you the ability to vote on what you want to happen to your data."
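To make the idea concrete, here is a minimal Python sketch of how contribution-based tokens and token-weighted voting could fit together. It is a toy model, not Vana's actual contracts or API: the `DataDAO` class, its method names, and the rule of minting one token per contributed record are all assumptions made for illustration.

```python
from collections import defaultdict


class DataDAO:
    """Toy model of a data pool with token-weighted voting (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.balances = defaultdict(int)  # contributor -> dataset tokens held

    def contribute(self, user, records):
        # Assumption: one dataset token is minted per contributed record.
        if records <= 0:
            raise ValueError("records must be positive")
        self.balances[user] += records
        return self.balances[user]

    def ownership_share(self, user):
        # A contributor's share of the pooled data set, from 0.0 to 1.0.
        total = sum(self.balances.values())
        return self.balances.get(user, 0) / total if total else 0.0

    def tally(self, votes):
        # votes maps user -> chosen option; each vote is weighted by tokens held.
        totals = defaultdict(int)
        for user, option in votes.items():
            totals[option] += self.balances.get(user, 0)
        return dict(totals)


dao = DataDAO("example-dao")
dao.contribute("alice", 300)
dao.contribute("bob", 100)

print(dao.ownership_share("alice"))  # 0.75
print(dao.tally({"alice": "license", "bob": "withhold"}))
# {'license': 300, 'withhold': 100}
```

In this sketch, voting power simply tracks how much data each person contributed. A real DataDAO could weight contributions by quality or use a different governance rule entirely.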

What makes DataDAOs particularly powerful is their ability to aggregate data across platforms - something that is not possible in today's siloed ecosystem. "Say you wanted access to somebody's Facebook messages, iMessage and everything they've written," Kazlauskas notes, "you would need Facebook, Apple and Google Docs to all collaborate. But they're not going to do that... because they have no incentive to combine that user's data and they actually likely cannot even do that for regulatory reasons."

Beyond Big Tech: Creating New Possibilities

When asked about competing with tech giants like OpenAI and Google, Kazlauskas draws an interesting parallel to early crypto: "In the same way that one could have asked that question about Ethereum or Bitcoin very early on... like is Ethereum a competitor to a big bank? Like sort of. But actually now if you look at Ethereum today, the sorts of applications that are getting built were just not really possible within centralized finance."

The goal isn't just to compete with existing players but to enable entirely new possibilities. "What's interesting about Vana is it really is a whole new design space for a builder," says Kazlauskas. "Right now you're not just building financial products, there are also these data transactions that can live in the block space."

This new design space is already attracting attention from major platforms. According to Kazlauskas, several DataDAOs have received notices from big tech companies - a sign that the model is having real impact. "The reason why it's a really good sign is it means that it impacts their economics," she explains. "They're thinking, 'oh, someone else can sell this data set now.'"

Looking Ahead: A User-Owned AI Future

With mainnet launch approaching and 12 DataDAOs preparing to go live, Kazlauskas envisions rapid growth: "Even in a year we'll have, I'd say like on the scale of tens of millions of people who've contributed their data and really like some of the best data sets in the world."

One of the most exciting developments on the horizon is the training of AI models across multiple DataDAOs. "In Q1, I'm really looking forward to actually training an AI model across a bunch of different DataDAOs," shares Kazlauskas. "There's been this big question in the AI industry of can you train an AI model outside of a big tech company that is still very competitive?"

The implications go beyond just data ownership. "If users were the ones who owned the AI that was trained on their data," Kazlauskas explains, "then I think it's a lot more comfortable for a user and we can have all powerful, almost like super intelligence, but ensure that it really benefits like a wide group of people."

Drawing a historical parallel, Kazlauskas notes how property rights transformed agriculture: "Once we had really good property rights, then you can have like an economy flourish... In the same way, I think that having those really strong programmable data ownership rights will just allow for a lot more economic kind of creation using that data."

A New Economic Model for AI

The vision extends beyond just technical innovation. Vana is pioneering a new economic model where users don't need capital to participate - they already have the most valuable resource: their data. "You don't need money to come in," Kazlauskas points out, "you just need your data. Like that is the capital that you're coming in with."

This model could fundamentally reshape who benefits from AI advancement. Instead of value accruing solely to large tech companies, Vana enables broad participation in the AI economy through data ownership and governance. Early signs suggest this approach is resonating - beyond the Reddit DataDAO's 140,000 users, there are now over 300 DataDAOs in development on testnet.

Join the Revolution

Vana represents more than just a new platform - it's a fundamental shift in how we think about data ownership and AI development. By enabling users to maintain sovereignty over their data while participating in collective value creation, Vana is building the foundation for a more equitable and innovative AI future.

The launch of mainnet marks a crucial milestone in this journey, but it's just the beginning. As Kazlauskas envisions it, we're moving toward a world where users not only own their data but have a real stake in the AI systems that data helps create. With DataDAOs leading the way, the future of AI looks increasingly decentralized, collaborative, and user-owned.

To get started building on Vana, visit our docs: http://docs.vana.org.