AI + Blockchain: New Use Cases and the $300 Billion Data Goldmine


X Space #3 — AI + Blockchain: New Use Cases and the $300 Billion Data Goldmine. Watch the full recording on YouTube · Listen on X

X Space #3 focuses on a question that most blockchain AI discussions avoid entirely: what does real AI look like in blockchain, what are its six concrete use cases, and what is the actual monetary value of the data that sits freely available in every public blockchain? Co-founders Martin and Tarmo open by drawing a sharp distinction between using existing AI models and creating proprietary ones — a distinction that eliminates approximately 80-90% of the projects listed in CoinGecko’s AI category from any serious consideration. They then introduce an argument that reframes the entire blockchain AI opportunity: blockchain’s 500 million users have generated financial transaction data worth approximately $300 billion by conservative calculation — all of it free and public, none of it requiring the billions that Facebook paid for inferior behavioral data. Tarmo draws on his experience building 12-year predictive models from bank transaction data at Finnova to demonstrate why this calculation is not theoretical. The session then maps six specific AI use cases that exploit this data advantage, explains why generative AI has no coherent role in blockchain, and makes the case that Web3’s future growth depends on adaptive applications built on this free goldmine.

In This Article

  1. What Real AI Actually Means: Creating vs Using Models
  2. The CoinGecko Reality: 10-20% of AI Projects Are Real
  3. Three Years from Idea to Production: Why Most Projects Never Get There
  4. The Proof-of-Work Data Hierarchy: Social Media, Phone Calls, Financial Transactions
  5. The $300 Billion Goldmine: Calculating the True Value of Blockchain Data
  6. Tarmo’s 12-Year Prediction: What Financial Data Can Reveal
  7. Six Concrete AI Use Cases for Blockchain
  8. Use Case 1 — Fraud Detection: 98% Real-Time Accuracy
  9. Use Case 2 — Rug Pull Detection: 90%+ of PancakeSwap Pools
  10. Use Case 3 — AdTech and Intention Calculation
  11. Adaptive Applications: Why Static Web3 UIs Are Killing Conversion
  12. Use Cases 4-6: Trading Signals, Credit Scoring, Smart Contract Review
  13. The One Question to Ask Any Blockchain AI Project
  14. Mining the Goldmine: Why ChainAware Lets Others Do Generative AI
  15. Comparison Tables
  16. FAQ

What Real AI Actually Means: Creating vs Using Models

Tarmo opens X Space #3 with a definition that immediately distinguishes the session from most blockchain AI discussions. Real AI, he argues, is not about using existing AI models — it is about creating them. The difference is not semantic; it is the difference between having a competitive advantage and having none at all.

Using an existing AI model means calling the OpenAI API to generate an NFT image, using a diffusion model to produce a promotional video, or wrapping a public LLM in a chatbot interface. These are legitimate technical activities, but they confer no proprietary advantage because anyone can access the same underlying model and reproduce the same outputs. Creating an AI model means assembling domain-specific training data, selecting appropriate algorithms, training iteratively, validating against held-out data, backtesting against ground truth outcomes, and deploying to production with performance guarantees. As Tarmo states: “Real AI is when you have trained a model, when you create a model, when you first get your data, you train a model with this data, you validate the model, and create something that is unique. What we see in CoinGecko AI categories are companies who manage to generate maybe a picture with AI or a video clip with AI. It is AI, it is just usage of AI algorithms, but they don’t have a model, they don’t have training data, and it means they have no competitive advantage.”

Why the Distinction Determines Competitive Advantage

The competitive advantage distinction matters because it determines whether a project can build a durable business. A company that generates NFT images using OpenAI’s image generation API has no moat — any competitor can do exactly the same with the same API call. A company that builds a proprietary fraud detection model trained on blockchain behavioral data over two years of iterative development has a moat that competitors cannot replicate quickly or cheaply. Martin draws the direct connection to DeFi’s broader problem: “Companies who just use AI, they are come and go. It’s like we know in DeFi, it’s just scam schemes where we put hype and just leave.” Real AI, by this definition, is infrastructure — it requires the same sustained investment that creates durable products in any other industry. For more on what real AI looks like in blockchain, see our generative vs predictive AI analysis.

The CoinGecko Reality: 10-20% of AI Projects Are Real

Martin applies the real-AI definition to the CoinGecko AI category and arrives at a striking assessment of the current landscape. His estimate — optimistic at 20%, realistic closer to 10% — is that a small minority of listed AI projects are building their own models. The remainder fall into two categories: projects using existing AI models from OpenAI and other providers, and projects creating “AI marketplaces” that position themselves as intermediaries between AI providers and users without building any AI capability themselves.

Even within the subset of projects genuinely training their own models, a further filter applies: how many of those models are actually running in production rather than existing as research notebooks? Martin’s estimate is stark: “If you look at the coingecko list on the AI projects, maybe 20% are working on their own models. The rest are using this generative AI. And the other thing is how many models are live in production — it’s maybe overall two, three percent.” Deploying a research model to production requires entirely different engineering work — real-time performance optimisation, horizontal scaling, reliability engineering, and continuous monitoring. Most projects that have built models have not completed this production deployment journey. For more on what production AI deployment requires, see our AI blockchain winning use cases guide.

One of the 2-3%: Deployed, Verified, Real-Time

ChainAware Fraud Detector — Production AI, Not a Research Model

Proprietary model trained on blockchain behavioral data. Not an OpenAI wrapper. Not a CoinGecko narrative project. 98% accuracy. Sub-1-second response. Scaled to handle parallel queries across ETH, BNB, POLYGON, TON, BASE. The difference between using AI and building AI — in production since 2023.

Three Years from Idea to Production: Why Most Projects Never Get There

Tarmo quantifies the time investment required to build a real AI model and get it to production — a figure that immediately explains why most blockchain AI projects take the shortcut of using existing models rather than building their own.

From initial model concept to production deployment, a high-quality AI model in a domain like fraud detection or rug pull prediction requires approximately three years for experienced data scientists. The journey involves multiple distinct stages: problem definition and data assembly, initial model selection and training, iterative refinement through backtesting, accuracy optimisation, production engineering (converting the research model to a deployable system), performance optimisation (achieving acceptable response times at scale), reliability engineering (ensuring the model does not crash under load), and ongoing monitoring and retraining. As Tarmo states: “If you have a very high quality data scientist, from idea till your model runs in production, you need almost three years. And what it means for DeFi is it is too long. Three years for DeFi is too long. And nobody in DeFi who just does boom and past cycles will do this work.”

The 99% to 98% Decision: Speed vs Accuracy

Martin illustrates the production engineering tradeoffs with ChainAware’s own experience. The initial fraud detection model achieved 99% accuracy — a higher figure than the current 98%. However, the 99% model required 23-24 seconds to process large addresses like Vitalik Buterin’s. Real-time fraud detection is only useful if the prediction arrives before the user executes a transaction — making a 24-second response time practically useless in a trading or DeFi context. ChainAware made a deliberate decision to downscale to 98% accuracy in exchange for sub-1-second response times. As Martin explains: “98% and real time are much more important parameters than 99% and near real time. We decided from 99% we have to downscale to 98%, but we have to be real time.” This kind of production optimisation decision is invisible to anyone evaluating the project from the outside — it reflects years of engineering experience rather than any publicly observable metric. For how this applies across ChainAware’s products, see our predictive AI guide.

The Proof-of-Work Data Hierarchy: Social Media, Phone Calls, Financial Transactions

Before explaining why blockchain data is so extraordinarily valuable, Martin and Tarmo establish a framework for understanding data quality in terms of predictive power. The framework uses the concept of proof-of-work — borrowed from blockchain’s consensus mechanism — to explain why different types of behavioral data produce different prediction quality.

Social media data sits at the bottom of the hierarchy. Creating a Facebook post, liking a tweet, or following an account costs nothing — there is zero proof of work involved. The resulting data is easily faked, reflecting how people want to be perceived rather than their genuine behavioral patterns. Tarmo makes this vivid: “On social media there is no proof of work. You can create a new Facebook site and call yourself a king of England. There’s no cost of doing this, no proof of work, transaction cost they are missing.” The predictive power from this data is correspondingly low.

Phone Calls, Financial Transactions, and Why Zuckerberg Paid $40 Per User

Phone calls occupy the middle tier. Making a call requires knowing the other person, having a relationship that justifies the interaction, and investing real time. This social investment creates a meaningful proof-of-work signal about genuine behavioral connections. Facebook’s $19 billion acquisition of WhatsApp — paying approximately $40 per user — was precisely this insight: phone call data is qualitatively superior to social media data for behavioral prediction. Financial transactions sit at the top of the hierarchy. Every Ethereum transaction requires paying gas fees — real money committed to a deliberate action. This financial proof of work filters out all casual, accidental, and performative behavior, leaving only genuine committed decisions. As Tarmo explains: “Financial data has very high predictive power. The higher the cost of proof of work or the higher proof of work on this transaction category, the higher is the predictive power.” For how this data quality advantage translates to prediction accuracy, see our behavioral analytics guide.

The $300 Billion Goldmine: Calculating the True Value of Blockchain Data

Martin and Tarmo introduce a specific calculation that reframes the entire blockchain AI opportunity. The calculation combines the data quality argument with a market-validated data pricing figure to produce a dollar value for the freely available, publicly accessible behavioral data sitting in every major blockchain.

The starting point is the market price for financial transaction data. If a company wanted to purchase the transaction history of a single user from a retail bank — capturing the behavioral patterns that enable 12-year behavioral predictions — the cost would be approximately $600 per user. This figure reflects the genuine market value of financial behavioral data as determined by data licensing transactions in the financial industry. Applying this $600 figure to blockchain’s approximately 500 million active users: 500 million × $600 = $300 billion. This $300 billion in data value is entirely free. No licensing agreement is required. No privacy regulations restrict access. No platform negotiation determines price. Anyone with a blockchain reader and the technical capability to process the data can access it immediately at zero cost.
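The arithmetic behind the figure can be checked in a few lines. The per-user price and user count below are the session's own quoted inputs, not independently verified market data:

```python
# Back-of-envelope valuation of public blockchain behavioral data,
# using the figures quoted in the session (assumptions, not audited data).
USERS = 500_000_000          # approximate active blockchain users
PRICE_PER_USER_USD = 600     # quoted market price of one user's bank transaction history

total_value = USERS * PRICE_PER_USER_USD
print(f"${total_value / 1e9:.0f} billion")  # $300 billion

# For contrast: WhatsApp's acquisition priced qualitatively inferior phone-call data.
WHATSAPP_PRICE_USD = 19_000_000_000
WHATSAPP_USERS = 475_000_000
print(f"${WHATSAPP_PRICE_USD / WHATSAPP_USERS:.0f} per user")  # $40 per user
```

The same two lines of arithmetic underpin both the $300 billion headline and the "15 times more per user" comparison drawn later in the session.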

Comparing to Facebook’s $19 Billion WhatsApp Acquisition

The comparison to WhatsApp makes the $300 billion figure concrete. Zuckerberg paid $19 billion for 475 million WhatsApp users — approximately $40 per user — to access phone call behavioral data that is qualitatively inferior to financial transaction data. The blockchain data available freely is worth roughly 15 times more per user and requires zero acquisition cost. As Tarmo states: “Zuckerberg paid $19 billion for WhatsApp — $40 per user — to get their data. And the blockchain data, how much does it cost? Zero. It is free. And it is $300 billion of value. If somebody would want to buy transactions of 500 million users from Citibank, for example, he would have to pay $300 billion. In blockchain, it is free. We just take this data, we build an AI algorithm and we monetize it.” For the broader implications of this data advantage, see our Web3 targeting guide.

Tarmo’s 12-Year Prediction: What Financial Data Can Reveal

Tarmo provides the most compelling evidence for the $300 billion valuation argument from his direct experience building predictive models at Finnova — the Swiss banking platform used by more than 250 banks. The experience is not theoretical; it reflects production models that were deployed and used in real banking operations.

From just 10-15 transactions in a customer’s financial history, Tarmo’s models at Finnova could predict the customer’s behavior 12 years into the future with high accuracy. The predictions were specific: when the customer would purchase a car (and what kind of car), when they would apply for a mortgage (and on what type of property), which investment products they would buy (and at what price points), and when they would take out a personal loan. These predictions were not generalised probabilistic estimates — they had sufficient precision to be actionable for personalised product targeting by bank relationship managers. As Tarmo explains: “When I was chief architect of the large Swiss banking platform, what I learned was the financial data. I just can predict the intentions. I can say exactly when it is going to take place. When is the customer going to buy a Lamborghini? When is the customer going to finance a yacht? When is the customer buying financial products? We had predictive power twelve years into the future. And you didn’t need many transactions — you needed 10 to 15 transactions and you could say everything.”

Blockchain Has 100x More Data Than Finnova Did

Critically, the Finnova models operated on a few hundred million transactions across hundreds of banks — a large but bounded dataset. Blockchain’s major chains together hold over 10 billion transactions, and this number grows every day. More data produces more accurate predictions across more behavioral dimensions. Everything that was achievable with Finnova’s banking data is achievable with blockchain data — and more, because the scale, diversity, and cross-protocol visibility of on-chain behavior exceeds what any single bank or banking platform can observe. For how ChainAware applies this insight to DeFi marketing, see our AI marketing guide.

Six Concrete AI Use Cases for Blockchain

Having established the data quality and value foundation, Martin and Tarmo enumerate the specific use cases that this data enables. They identify six distinct applications, each of which requires a proprietary trained model — none of which can be accomplished with generative AI or borrowed models.

The six use cases are: (1) fraud detection and trust prediction — identifying fraudulent addresses in real time before transactions occur; (2) rug pull detection — predicting which pools will execute a rug pull before investors enter; (3) AdTech and intention calculation — calculating each wallet’s behavioral intentions to enable one-to-one targeted marketing; (4) trading signals — generating AI-powered buy and sell signals for both short-term and portfolio management strategies; (5) credit scoring — predicting the creditworthiness of blockchain addresses for under-collateralised lending; (6) smart contract review — automating security audit processes currently performed manually at enormous cost. ChainAware has deployed models in four of these six categories, with the remaining two in progress. As Martin frames it: “We have use case fraud detection, rug pull detection, credit scoring, and advertisement technology and intention calculation. This is if you look in the past — but what is ChainAware? ChainAware takes this data and applies AI on the blockchain data. We take the data, we create the models, we monetize it.” For the complete use case framework, see our AI blockchain use cases guide.

Use Case 1 — Fraud Detection: 98% Real-Time Accuracy

Fraud detection is ChainAware’s most mature deployed use case and the one that most directly demonstrates the difference between real AI and narrative AI in blockchain. The use case addresses a specific, measurable problem: how to identify fraudulent blockchain addresses before a user transacts with them, rather than after the irreversible transaction has already occurred.

The technical approach uses machine learning models trained on the transaction interaction patterns of known fraudulent and legitimate addresses. Fraud is treated as a specific type of behavioral pattern — one that manifests in distinctive ways in the sequence, counterparty relationships, and transaction structures of an address’s history. The model achieves 98% accuracy in identifying addresses that will commit fraud, operating in real time with sub-1-second response for typical addresses. Two distinct compliance mechanisms exist in traditional finance for fraud detection: AML scoring (the wine-and-water dilution algorithm that tracks contaminated funds) and AI transaction monitoring. ChainAware’s approach corresponds to the second, more powerful mechanism. As Tarmo explains: “All incoming and outgoing transactions are monitored. We can monitor the addresses with whom you are interacting. Do you want to interact with another address or do you want to say, no, I better don’t interact with this address? This looks too high a probability. I don’t want to be interconnected with this address.” For the complete fraud detection methodology, see our fraud detection guide.
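For readers unfamiliar with the wine-and-water dilution idea, here is a minimal sketch of value-weighted taint propagation: a wallet's contamination score is the funds-weighted average of the taint on what it receives. This is an illustrative toy, and the function shape is assumed for the example; it is not ChainAware's model, which the session describes as the second, more powerful monitoring mechanism.

```python
# Minimal sketch of the "wine-and-water" AML dilution idea: taint spreads
# through a wallet in proportion to the share of contaminated funds received.
# Illustrative only -- not a production compliance algorithm.

def taint_score(incoming: list[tuple[float, float]]) -> float:
    """incoming: (amount, sender_taint) pairs; returns value-weighted taint in [0, 1]."""
    total = sum(amount for amount, _ in incoming)
    if total == 0:
        return 0.0
    return sum(amount * taint for amount, taint in incoming) / total

# A wallet receiving 1 ETH of fully tainted funds and 9 ETH of clean funds:
print(taint_score([(1.0, 1.0), (9.0, 0.0)]))  # 0.1 -- 10% "wine" in the water
```

The limitation is visible in the sketch itself: dilution only tracks where contaminated funds went, whereas behavioral transaction monitoring asks whether an address acts like a fraudster before any tainted funds arrive.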

Use Case 2 — Rug Pull Detection: 90%+ of PancakeSwap Pools

Rug pull detection addresses a problem specific to the new-token and new-pool ecosystem — particularly acute on PancakeSwap, where ChainAware’s analysis revealed that 90% or more of new pools follow rug pull patterns. The scale of the problem is reflected in the numbers: PancakeSwap creates 15-20 new pools every 30 minutes, and most of them are designed to fail.

A rug pull follows a specific operational pattern: a pool is created, professional shillers are deployed to promote the token across Telegram groups and Twitter, FOMO drives new investors into the pool, and then the liquidity is withdrawn (or the token supply is inflated to zero value). The entire cycle typically lasts 1-2 hours. Participants who enter during the promotion phase lose not 20% or 50% but 100% of their investment. As Martin describes the shiller ecosystem: “These professional shillers, they’re shilling one day one project, one day ten projects, next day the other ten projects, and so on. That’s how they make their living — they just shill carelessly and they know the project behind are just rug pulling. And they do it again, again and again.”

Predicting Rug Pulls Before They Happen

ChainAware’s rug pull detector analyses the creator addresses and liquidity provider addresses associated with a new pool, identifying behavioral patterns that precede rug pulls. The model identifies whether the pool creator has previously been involved in rug pulls, whether the liquidity provider addresses show characteristic pre-rug patterns, and whether the overall pool structure matches known rug pull signatures. Crucially, the prediction happens before investors enter — not as a forensic determination after the loss. As Martin explains: “We have an AI technology which says which pools are the rug pools. In which pools will there be a rug pull. So we predict: this pool will be a rug pull, this pool here will be a rug pull — or this pool is not a rug pull. So you can make an informed decision.” For the full rug pull detection methodology, see our rug pull detection guide.
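To make the pre-entry screening idea concrete, here is a toy rule-based sketch over the kinds of creator and liquidity signals the section describes. The feature names and thresholds are invented for illustration; ChainAware's detector is a trained model, not hand-written rules like these.

```python
# Toy illustration of pre-entry rug pull screening on pool-creator features.
# Features and thresholds are hypothetical, chosen to mirror the qualitative
# signals described in the article (creator history, liquidity lock, supply).
from dataclasses import dataclass

@dataclass
class PoolFeatures:
    creator_prior_rug_pulls: int     # rug pulls linked to the creator address
    lp_locked_days: int              # liquidity lock duration; 0 = unlocked
    creator_holds_supply_pct: float  # share of token supply held by the creator

def looks_like_rug_pull(f: PoolFeatures) -> bool:
    # Any one of these patterns is treated as a red flag in this toy version.
    return (
        f.creator_prior_rug_pulls > 0
        or f.lp_locked_days == 0
        or f.creator_holds_supply_pct > 0.5
    )

print(looks_like_rug_pull(PoolFeatures(2, 0, 0.8)))    # True  -- repeat offender, unlocked LP
print(looks_like_rug_pull(PoolFeatures(0, 365, 0.1)))  # False -- locked LP, distributed supply
```

The point of the contrast is timing: these checks run on a pool's structure and its creator's history at launch, before any investor commits funds, rather than forensically after the liquidity is gone.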

90%+ of PancakeSwap Pools Are Rug Pulls

ChainAware Rug Pull Detector — Predict Before You Invest

15-20 new pools every 30 minutes on PancakeSwap. 90%+ end in rug pulls — tens of thousands per year. ChainAware analyses pool creator addresses and liquidity patterns using proprietary AI trained on blockchain behavioral data. Predicts rug pulls before you enter. Not after you lose everything. ETH, BNB, BASE. Free for individual pool checks.

Use Case 3 — AdTech and Intention Calculation

The AdTech use case applies the same blockchain behavioral data to a growth problem rather than a security problem: how to identify which users have which intentions and serve them matched messages rather than generic broadcasts. Martin introduces a nuance that distinguishes ChainAware’s approach from simpler “behavioral analysis” products: the goal is not to analyse what users have done in the past but to predict what they intend to do next.

Intention calculation serves two functions. First, it enables targeted audience building — Web3 platforms can identify wallets whose transaction histories indicate they are likely borrowers, traders, NFT buyers, or gamers, and deliver matched acquisition messages rather than broadcasting to the entire crypto population. Second, it enables on-site personalisation: once a user connects their wallet, the platform immediately knows their behavioral profile and can serve them content and interface elements matched to their specific intentions. As Martin explains: “We calculate the intentions of the addresses — meaning what the user will do as next. Not only is fraud behavior, we consider fraud as one of the user behaviors. And we have other intentions. Not only is fraud, we calculate all of the intentions of the user.” The audience building capability has a dimension that the session makes particularly vivid: ChainAware can identify not only wallets that have already been rug pulled (a targetable audience for protection product advertising) but also wallets that will be rug pulled in the future — before the rug pull occurs. For how this translates to marketing results, see our intention-based marketing guide.

Adaptive Applications: Why Static Web3 UIs Are Killing Conversion

The AdTech use case connects to the broader adaptive applications argument that runs throughout X Space #3. Adaptive applications are interfaces that dynamically adjust their content, layout, and messaging to match the specific behavioral intentions of each individual user. Gartner projects that 70% of Web2 applications will implement adaptive interfaces by the end of 2025. Web3 is at approximately 0%.

The conversion rate consequence of this gap is precisely quantifiable. Static Web3 applications — which show the same interface to every visitor regardless of their profile — convert below 1% of visitors to transacting users. Adaptive applications that serve each visitor content matched to their intentions achieve 20-30% conversion rates. At the $5 cost-per-click rates typical of Web3 advertising, the difference between sub-1% and 20-30% conversion rates determines whether a Web3 project can achieve cash-flow-positive economics or whether it will permanently burn acquisition budget. As Tarmo notes: “To get one transacting user with a static application, you need almost a thousand people who have a look on you — and you can calculate what it costs. He will never generate so much revenue that you can pay for the advertising cost. If you go over to adaptive applications and one-to-one marketing, your conversion ratio is 20-30%. You become cash flow positive.” For the full business model analysis, see our Web3 growth restoration guide.
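The acquisition economics can be sanity-checked directly from the figures quoted in the session: a $5 cost per click, Tarmo's "almost a thousand people" per transacting user for static interfaces, and the 20-30% conversion of adaptive ones.

```python
# Cost per transacting user at the $5 CPC the session cites for Web3 ads.
CPC_USD = 5.0  # dollars per click (figure quoted in the session)

def cost_per_transacting_user(conversion_rate: float) -> float:
    return CPC_USD / conversion_rate

print(cost_per_transacting_user(0.001))  # static UI, ~1 in 1000: $5000 per user
print(cost_per_transacting_user(0.01))   # static UI, 1%:         $500 per user
print(cost_per_transacting_user(0.25))   # adaptive UI, 25%:      $20 per user
```

At $5000, or even $500, per acquired user, no realistic per-user revenue covers the advertising spend; at $20, cash-flow-positive acquisition becomes plausible, which is the whole argument for adaptive interfaces.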

Use Cases 4-6: Trading Signals, Credit Scoring, Smart Contract Review

Beyond the three core use cases (fraud detection, rug pull prediction, AdTech), Martin and Tarmo identify three additional blockchain AI applications that round out the full landscape of where proprietary predictive models create value.

Trading signal generation uses AI models trained on price history, on-chain flow data, and market microstructure patterns to generate actionable buy and sell signals. Both short-term (high-frequency) and medium-term (portfolio rebalancing) strategies benefit from AI signal generation. Martin observes that some projects in the CoinGecko AI list appear to be genuinely building their own models for this use case — one of the few categories where real AI is relatively well-represented. Credit scoring applies predictive models to the creditworthiness assessment problem in DeFi lending. Currently, DeFi lending is overwhelmingly over-collateralised — borrowers post 150% or more of their loan value as collateral. Under-collateralised lending, which would dramatically expand DeFi’s addressable market, requires accurate creditworthiness prediction. ChainAware has deployed a credit scoring model for Ethereum, though Martin notes that the current over-collateralisation norm reduces its immediate demand.

Smart Contract Review: Automating a $300M Industry

Smart contract security auditing represents the most commercially provocative use case. Manual smart contract audits — performed by specialist security firms — command enormous fees from protocols that need to demonstrate security before deployment. One unnamed company in this space has raised over $300 million in VC investment on the strength of its manual audit business model. Martin and Tarmo argue that training a language model specifically on Solidity and smart contract security patterns could automate a substantial portion of this review process — rendering the $300 million valuation of manual audit firms questionable. As Martin states: “When we take technologies like Llama and we train it for smart contract languages, we can automate smart contract review with just an AI model. And all these VC funds who have invested into smart contract review businesses — it’s very funny. They can just write down their investments.” For more on AI security applications in DeFi, see our Web3 security guide.
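As a small taste of what automatable review looks like, here is a toy static check for one classic Solidity pitfall: using `tx.origin` for authorisation, which phishing contracts can exploit. It is deliberately simplistic, a single pattern check rather than the fine-tuned language model the session describes.

```python
# Toy static-analysis check for one well-known Solidity anti-pattern.
# A fine-tuned LLM (the session's proposal) would cover far more than this
# single rule; this only illustrates that parts of review are mechanisable.
import re

def flag_tx_origin(solidity_source: str) -> list[int]:
    """Return 1-based line numbers that reference tx.origin."""
    return [
        i
        for i, line in enumerate(solidity_source.splitlines(), start=1)
        if re.search(r"\btx\.origin\b", line)
    ]

contract = """\
contract Wallet {
    address owner;
    function withdraw() public {
        require(tx.origin == owner);  // vulnerable to phishing contracts
    }
}"""
print(flag_tx_origin(contract))  # [4]
```

Manual auditors check hundreds of such patterns plus contract-specific logic; the argument in the session is that a model trained on Solidity could absorb a large share of that checklist work.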

The One Question to Ask Any Blockchain AI Project

Martin distils the entire generative vs predictive AI analysis into a single diagnostic question that any investor, founder, or user should apply to any blockchain project claiming to use AI: is it predictive AI or generative AI?

The question cuts through narrative immediately. Generative AI — the technology underlying ChatGPT, image generators, and video generators — is trained on linguistic and visual data from the public internet. It predicts plausible next tokens in text sequences or pixel values in images. It has no inherent relationship to blockchain data. If a blockchain project is using generative AI, the immediate follow-up question is: why? What blockchain-specific data input does the generative AI require, and what blockchain-specific prediction does it produce? If the answer is “we generate NFT images” or “we power a chatbot for our community,” the blockchain connection is cosmetic — the same output could be produced without any blockchain involvement. Predictive AI, by contrast, is specifically designed to analyse pattern-rich data and predict specific outcomes. Blockchain transaction data is exactly the kind of pattern-rich data that enables high-accuracy prediction. As Martin argues: “If someone is telling you they want to do generative AI on the blockchain, my first question is why. Why do you need a blockchain to do generative AI? What do you want to generate out of blockchain? You probably want to do predictive AI.” For the complete framework, see our generative vs predictive AI guide.

Mining the Goldmine: Why ChainAware Lets Others Do Generative AI

Martin and Tarmo close X Space #3 with a strategic posture statement that explains ChainAware’s deliberate choice to ignore the generative AI narrative entirely and focus exclusively on mining the $300 billion data goldmine through predictive AI.

The competitive logic is straightforward. Generative AI on blockchain is a crowded, undifferentiated space where most participants create no proprietary advantage and generate cash flow only for the LLM providers they depend on. Predictive AI on blockchain data is a largely empty space — approximately 2-3% of projects listed as AI on CoinGecko are actually in production — where the data asset is free, the predictive power is extraordinary, and the competitive moat from a well-built model is substantial. As Martin frames it: “It’s good for us that nobody has interest for this data. All these trillion transactions are for me just to mine and to monetize. I’m very happy with it. We let everyone else do the generative AI and we are concentrating on the monetization of real data. Let them all generate JPEGs or PNGs or videos. We mine the data and we monetize the mined data.” For the complete ChainAware product suite built on this strategy, see our behavioral analytics guide.

Comparison Tables

Six AI Use Cases in Blockchain: Status and Approach

| Use Case | Data Input | AI Type Required | ChainAware Status | Competitive Moat |
|---|---|---|---|---|
| Fraud Detection | Wallet transaction interaction patterns | Predictive — behavioral classification | ✅ Live — 98% accuracy, real-time | High — years of iterative training |
| Rug Pull Detection | Pool creator + LP address patterns | Predictive — rug pull pattern classification | ✅ Live — ETH, BNB, BASE | High — specific behavioral fingerprints |
| AdTech / Intentions | Full wallet behavioral history | Predictive — intention classification | ✅ Live — banner solution available | High — 12-year prediction horizon equivalent |
| Credit Scoring | Transaction history, repayment patterns | Predictive — creditworthiness scoring | ✅ Live — Ethereum (demand limited by over-collateralisation) | Medium — growing with under-collateralised lending |
| Trading Signals | Price history, on-chain flows | Predictive — price movement forecasting | ⏳ In progress | High — proprietary signal models |
| Smart Contract Review | Solidity source code patterns | Specialised LLM fine-tuned on smart contracts | ⏳ In progress — could replace $300M audit firms | Medium — requires specialised training |

Behavioral Data Quality Hierarchy: Proof-of-Work and Predictive Power

| Data Type | Proof-of-Work Cost | Fakeable? | Predictive Power | Market Price / User | Blockchain Equivalent |
|---|---|---|---|---|---|
| Social media posts | Zero — create account instantly | Yes — anyone can pretend to be anyone | Very low | Near zero | No blockchain equivalent |
| Phone call data (WhatsApp) | Low — requires real relationship | Partially — genuine connections needed | Medium | ~$40/user (Zuckerberg paid $19B) | No blockchain equivalent |
| Financial transactions (bank) | High — real money committed | No — committed financial decisions | Very high — 12-year horizon (Finnova) | ~$600/user (market estimate) | Blockchain transactions with gas fees |
| Blockchain transactions (ETH) | High — gas fees per transaction | No — gas cost filters casual behavior | Very high — equivalent to bank data | ~$600/user equivalent (free to access) | ✅ Free and public — the $300B goldmine |

Frequently Asked Questions

How is the $300 billion blockchain data value calculated?

The calculation uses the market price for financial transaction data per user as the basis. In the financial industry, if a company wanted to purchase the transaction history of a single retail banking customer — the behavioral data that enables accurate long-term intention prediction — the cost is approximately $600 per user. This reflects genuine market transactions for financial data licensing. Blockchain has approximately 500 million active users who have generated financial transaction data of equivalent or superior quality (due to the gas fee proof-of-work signal). 500 million × $600 = $300 billion. Crucially, this data is publicly available on-chain at zero access cost — compared to Zuckerberg paying $19 billion for the inferior phone call data of 475 million WhatsApp users ($40 per user).
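The arithmetic is simple enough to reproduce directly. This back-of-envelope sketch uses only the figures quoted in the Space (500M users, ~$600/user for bank-grade data, the $19B WhatsApp acquisition); the per-user prices are market estimates, not audited numbers.

```python
# Back-of-envelope reproduction of the $300B estimate quoted in the Space.
USERS_BLOCKCHAIN = 500_000_000        # active blockchain users (quoted figure)
PRICE_PER_USER_FINANCIAL = 600        # $/user for bank-grade transaction data

WHATSAPP_PRICE = 19_000_000_000       # what Facebook paid for WhatsApp
WHATSAPP_USERS = 475_000_000          # WhatsApp users at acquisition (quoted)

blockchain_data_value = USERS_BLOCKCHAIN * PRICE_PER_USER_FINANCIAL
whatsapp_price_per_user = WHATSAPP_PRICE / WHATSAPP_USERS

print(f"Blockchain data value: ${blockchain_data_value / 1e9:.0f}B")  # $300B
print(f"WhatsApp price/user:   ${whatsapp_price_per_user:.0f}")       # $40
```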

How can just 10-15 transactions predict behavior 12 years ahead?

Financial transaction patterns are highly informative because each transaction represents a deliberate, committed financial decision filtered by the cost of execution (gas fees on blockchain, or simply the effort and cost of a real financial commitment in banking). From just a small number of these high-quality behavioral signals, predictive models can identify the user’s risk tolerance, spending priorities, investment preferences, and life-stage financial needs with sufficient accuracy to predict their next major financial decision years ahead. Tarmo observed this directly while building models at Finnova: 10-15 transactions were sufficient to predict mortgage timing, car purchases, and investment product choices up to 12 years in the future with high accuracy. The same principle applies to blockchain data — the proof-of-work filtering from gas fees makes each transaction a more reliable behavioral signal than any amount of social media or browsing data.
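To make the idea concrete, here is a hypothetical sketch of how a short wallet history might be reduced to a behavioral feature vector of the kind a predictive model consumes. The field names, keys, and feature choices are illustrative assumptions, not ChainAware's actual schema or model.

```python
from statistics import mean, pstdev

def behavioral_features(txs):
    """Reduce a short transaction history to illustrative behavioral signals.

    txs: list of dicts with 'value_usd' and 'gas_usd' keys (hypothetical schema).
    """
    values = [t["value_usd"] for t in txs]
    gas = [t["gas_usd"] for t in txs]
    return {
        "n_txs": len(txs),                    # how much signal we have
        "avg_value": mean(values),            # typical commitment size
        "value_volatility": pstdev(values),   # crude risk-tolerance proxy
        "avg_gas_paid": mean(gas),            # proof-of-work cost actually paid
        "max_commitment": max(values),        # largest single decision
    }

# A handful of committed, gas-filtered decisions already separates profiles:
history = [
    {"value_usd": 250, "gas_usd": 4.2},
    {"value_usd": 1200, "gas_usd": 6.1},
    {"value_usd": 90, "gas_usd": 3.8},
]
print(behavioral_features(history))
```

Each feature exists only because the underlying transaction cost real money to execute, which is exactly the proof-of-work filtering the hierarchy table above describes.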

What is the difference between using AI and creating AI?

Using AI means calling an existing model’s API — OpenAI for text generation, Stable Diffusion for images, or any other publicly available model — to produce outputs. The model was built and trained by someone else; you are simply consuming it. Creating AI means assembling domain-specific training data, selecting appropriate model architectures, training and validating iteratively over months or years, backtesting against real-world outcomes, and deploying to production with performance guarantees. Creating AI produces a proprietary asset — the model — that competitors cannot replicate quickly or cheaply. Using AI produces outputs that any competitor can reproduce with the same API call. Only projects that create their own AI models have competitive advantages. Projects that only use AI have none.
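The contrast can be shown in a few lines. This is an illustrative sketch, not a real pipeline: "using" is a single call to someone else's model, while "creating" is an iterative train-and-validate loop over proprietary data; the toy one-parameter model stands in for months of real training.

```python
def use_ai(prompt, api):
    # One call to an external model: any competitor holding the same API
    # access reproduces the output. No proprietary asset is created.
    return api(prompt)

def create_ai(training_data, epochs=100, lr=0.1):
    # Minimal gradient-descent fit of a one-parameter linear model.
    # The learned weight is an asset competitors cannot copy without
    # the same training data.
    w = 0.0
    for _ in range(epochs):
        grad = sum((w * x - y) * x for x, y in training_data) / len(training_data)
        w -= lr * grad
    return w

# Proprietary data encoding a y = 2x relationship a competitor lacks:
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
model_weight = create_ai(data)
print(round(model_weight, 2))  # converges to 2.0
```

The point of the sketch: `use_ai` returns whatever the external `api` returns, while `create_ai` produces a trained artifact (`model_weight`) that only exists because you held the data and ran the loop.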

Why does generative AI not work in blockchain?

Generative AI (LLMs and image generators) is trained on linguistic and visual data from the public internet. It predicts statistically probable sequences of text tokens or pixel values. Blockchain data is numerical, structural, and behavioral — it consists of addresses, transaction amounts, function calls, timestamps, and contract interactions. Generative AI has no inherent mechanism for processing this data type in ways that produce blockchain-specific predictions. Projects that claim to combine blockchain with generative AI are either (a) using generative AI for something entirely unrelated to blockchain (NFT image generation, community chatbots) and simply adding a token, or (b) using the term as narrative without any technical substance. The diagnostic question is: what blockchain data input does the generative AI use, and what blockchain-specific prediction does it produce?

How do adaptive applications differ from basic website personalisation?

Basic personalisation uses simple rules — showing a returning user their previously viewed products, or greeting a logged-in user by name. Adaptive applications use AI-computed behavioral intention profiles to dynamically adjust the entire interface and content strategy based on what the specific user is most likely to want to do next. In blockchain, a genuine adaptive application would show a high-frequency trader trading-focused content and interface elements, show a conservative yield farmer stable yield options, show a DeFi newcomer beginner-friendly educational content, and show a borrower borrowing-specific product information — all automatically, based on the wallet’s behavioral history, the moment the wallet connects. ChainAware’s intention calculation and banner solution enables this adaptive approach for any Web3 platform. Gartner projects 70% of Web2 applications will be adaptive by 2025; Web3 is at approximately 0%.
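The routing step of an adaptive application can be sketched in a few lines. The intention labels, scores, and content map below are illustrative assumptions, not ChainAware's API; a real intention profile would come from a predictive model scoring the wallet's behavioral history.

```python
# Illustrative content map: predicted intention -> interface strategy.
CONTENT_BY_INTENTION = {
    "high_frequency_trader": "trading-focused dashboard",
    "yield_farmer":          "stable yield options",
    "defi_newcomer":         "beginner-friendly education",
    "borrower":              "borrowing product information",
}

def adaptive_content(intention_profile, default="generic landing page"):
    """Pick content for the highest-scoring predicted intention."""
    if not intention_profile:
        return default  # no behavioral history yet: fall back to static UI
    top_intention = max(intention_profile, key=intention_profile.get)
    return CONTENT_BY_INTENTION.get(top_intention, default)

# A profile as it might come back from an intention-calculation service:
profile = {"yield_farmer": 0.71, "borrower": 0.18, "defi_newcomer": 0.11}
print(adaptive_content(profile))  # stable yield options
```

The difference from rule-based personalisation is where `profile` comes from: a model's prediction of what the wallet will do next, computed the moment it connects, rather than a lookup of what the user did last time.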

Mining the $300B Goldmine

ChainAware Prediction MCP — All Six Use Cases. One API.

Fraud detection + rug pull + intention calculation + credit scoring + adaptive messaging — all from the $300B blockchain data goldmine that is free, public, and largely unmined. Proprietary models. Real-time. Deployed in production. 14M+ wallets. 8 blockchains. 31 MIT-licensed agents. The 2-3% that is real AI — not narrative.

This article is based on X Space #3 hosted by ChainAware.ai co-founders Martin and Tarmo. Watch the full recording on YouTube ↗ · Listen on X ↗. For questions or integration support, visit chainaware.ai.