On-chain Cryptoasset Data: A Closer Look
Understanding on-chain cryptoasset data and why it is integral to successful institutional adoption of cryptoassets
To get a comprehensive understanding of the cryptoasset market, financial institutions (FIs) must get their hands on a new type of data not typically seen in traditional financial markets. That data is blockchain, or ‘on-chain’, data, and it arguably carries more meaning than mainstream market data.
As highlighted in our 15 May, 2024 post, on-chain data refers to transactions that occur exclusively within a core blockchain network, and are permanently recorded on the public ledger in a decentralised manner by network operators called nodes.
On-chain data can provide valuable information about a particular cryptoasset network. It gives insight into who is holding or trading the asset, how sophisticated actors are positioning their portfolios, and how token holders are reacting to market events. It also covers network-level metrics such as fees, token supply, miner activity (if applicable), hash rates, mempool data, and address and counterparty monitoring. On-chain data also provides an FI with visibility into the decentralised finance space, which is becoming an increasingly important liquidity source for FIs given the fragmented nature of cryptoasset liquidity.
Relying on this new type of data to make informed cryptoasset trading decisions brings new infrastructure requirements and challenges for FIs, and it calls for fresh expertise in interpreting and managing the data.
In order to access on-chain data in-house, an FI must build and integrate highly specialised nodes for each blockchain that it is seeking to retrieve data from. Each blockchain has its own specific requirements, including the technology used, procedures, protocols and data storage requirements, with each blockchain carrying its own security risks.
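To illustrate how requirements differ from chain to chain, here is a minimal sketch of asking two kinds of node for the same thing: the latest block height. It assumes a Bitcoin Core node (JSON-RPC method `getblockcount`, which returns a plain number) and an Ethereum node (JSON-RPC method `eth_blockNumber`, which returns a hex string). The function names `height_request` and `parse_height` are hypothetical, and no network call is made; the sketch only builds the request bodies and decodes sample responses.

```python
import json


def height_request(chain: str) -> str:
    """Build the JSON-RPC request body asking a node for its latest block height."""
    if chain == "bitcoin":
        # Bitcoin Core method name; the id value is an arbitrary label.
        payload = {"jsonrpc": "1.0", "id": "fi-demo", "method": "getblockcount", "params": []}
    elif chain == "ethereum":
        # Ethereum JSON-RPC method name.
        payload = {"jsonrpc": "2.0", "id": 1, "method": "eth_blockNumber", "params": []}
    else:
        raise ValueError(f"no adapter for chain: {chain}")
    return json.dumps(payload)


def parse_height(chain: str, response_body: str) -> int:
    """Decode the block height from a node's JSON-RPC response body."""
    result = json.loads(response_body)["result"]
    if chain == "bitcoin":
        return int(result)        # returned as a JSON number
    if chain == "ethereum":
        return int(result, 16)    # returned as a hex string such as "0x10"
    raise ValueError(f"no adapter for chain: {chain}")
```

Even for a single, trivial question, each chain needs its own request shape and its own decoding step, and that is before storage, security and uptime are considered.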
Raw blockchain data is typically not indexed, searchable, or normalised to a particular time series. A block captures the state of the network ledger at a brief moment in time, and blocks are added at irregular intervals. Each block can hold anywhere from a few hundred to a million transactions, but a block on its own offers no standardised context for what happened in earlier blocks. Even a simple task such as tracking a Bitcoin wallet’s balance over the past year would require processing a significant amount of unstandardised data to build an accurate picture, incurring time and monetary costs.
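The wallet-balance example can be made concrete with a small sketch. It assumes the hard part has already been done: raw blocks have been decoded into simple transfer records with a date, a sender, a receiver and an amount. That is a simplification (Bitcoin itself tracks unspent outputs rather than account balances), and the `Transfer` type, the addresses and the amounts are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Transfer:
    day: date
    sender: str
    receiver: str
    amount_sats: int  # integer units avoid floating-point rounding


def daily_balances(transfers, wallet):
    """Return {day: end-of-day balance} for each day the wallet was involved in a transfer."""
    net_by_day = defaultdict(int)
    for t in transfers:
        if t.receiver == wallet:
            net_by_day[t.day] += t.amount_sats
        if t.sender == wallet:
            net_by_day[t.day] -= t.amount_sats

    balances = {}
    running = 0
    for day in sorted(net_by_day):  # replay the days in chronological order
        running += net_by_day[day]
        balances[day] = running
    return balances


# Hypothetical decoded transfers, deliberately out of order.
transfers = [
    Transfer(date(2024, 1, 5), "addr_a", "addr_w", 50_000),
    Transfer(date(2024, 3, 2), "addr_w", "addr_b", 20_000),
    Transfer(date(2024, 3, 2), "addr_c", "addr_w", 5_000),
    Transfer(date(2024, 2, 1), "addr_a", "addr_b", 99_000),  # does not touch addr_w
]
history = daily_balances(transfers, "addr_w")
```

The replay itself is short. The cost described above lies in producing those clean transfer records from every block in the period in the first place.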
As a result, before on-chain data can be used by FIs, it needs to be cleansed, processed and optimised to create datasets in a format that an FI’s different business departments are familiar with. For example, front-office traders and investors will need the data to conduct investment analysis, whereas the back office will need data in a format that is optimised for transaction recording and reporting. This would typically require specialist expertise. Once obtained, the data needs to be stored securely and efficiently, a task complicated by the inconsistent data storage methods currently prevalent across different blockchains. Given the round-the-clock nature of cryptoasset markets, data curation and management would have to be carried out continuously.
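As a rough sketch of what "a format each department is familiar with" might mean, the snippet below shapes the same cleaned records in two ways: an hourly volume series of the kind a front-office analyst might chart, and chronological ledger rows of the kind a back-office system might record. The record fields (`tx_id`, `block_time`, `amount`) and both function names are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical cleaned records; block_time is a Unix timestamp in seconds.
records = [
    {"tx_id": "tx3", "block_time": 1704071100, "amount": 50},
    {"tx_id": "tx1", "block_time": 1704067500, "amount": 120},
    {"tx_id": "tx2", "block_time": 1704069000, "amount": 80},
]


def _utc(ts: int) -> datetime:
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def hourly_volume(records):
    """Front-office view: total amount per UTC hour, in chronological order."""
    buckets = {}
    for r in records:
        hour = _utc(r["block_time"]).replace(minute=0, second=0, microsecond=0)
        buckets[hour] = buckets.get(hour, 0) + r["amount"]
    return dict(sorted(buckets.items()))


def ledger_rows(records):
    """Back-office view: one (tx_id, ISO timestamp, amount) row per transaction."""
    ordered = sorted(records, key=lambda r: r["block_time"])
    return [(r["tx_id"], _utc(r["block_time"]).isoformat(), r["amount"]) for r in ordered]
```

Blocks arrive at irregular intervals, so the hourly bucketing is what turns them into a regular time series. The ledger view keeps every transaction individually for recording and reporting.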
Ultimately, building the institutional-grade infrastructure needed to acquire and process all of the blockchain data required for making cryptoasset investment decisions will demand a considerable and time-consuming investment in talent and technology. Understandably, FIs may not want to do this unless they absolutely have to. As such, they may seek expertise and out-of-the-box solutions from vendors that provide services and APIs similar to those offered by companies like Bloomberg, FactSet, and Refinitiv in traditional asset classes.
As Jack Chong highlighted in his Institutional Onboarding Guide to Digital Assets report, obtaining on-chain data can be broken into three stages:
Data Sourcing - The value chain of on-chain data starts with the source of the data itself, such as a blockchain node on a specific network.
Data Collection, Refinement and Delivery - Data aggregators, a type of third-party service provider in the cryptoasset space, collect raw data from the different blockchain nodes. After the raw data has been refined, cleaned and standardised, the provider ‘pushes’ it out to its clients via its APIs.
Data Consumption - The data is consumed in different formats by different business departments, as referenced above.
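The three stages can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any particular vendor: the chain names, the raw field names, the `normalise` function and the `Aggregator` class are all hypothetical. The raw payloads stand in for node output (sourcing), the aggregator standardises them and pushes them to subscribers (collection, refinement and delivery), and a subscriber callback stands in for a consuming department (consumption).

```python
# Stage 1: sourcing. Each hypothetical chain reports blocks in its own shape.
RAW_FEED = [
    ("chain_a", {"height": 100, "txs": [{"val": 5}, {"val": 7}]}),
    ("chain_b", {"number": "0x2a", "transactions": [{"value": "0x3"}]}),
]


def normalise(chain, raw):
    """Stage 2a: map a chain-specific payload onto one common schema."""
    if chain == "chain_a":
        txs = raw["txs"]
        return {"chain": chain, "height": raw["height"],
                "tx_count": len(txs), "total_value": sum(t["val"] for t in txs)}
    if chain == "chain_b":
        txs = raw["transactions"]  # this chain encodes numbers as hex strings
        return {"chain": chain, "height": int(raw["number"], 16),
                "tx_count": len(txs), "total_value": sum(int(t["value"], 16) for t in txs)}
    raise ValueError(f"unsupported chain: {chain}")


class Aggregator:
    """Stage 2b: deliver each standardised record to every subscribed client."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def ingest(self, chain, raw):
        record = normalise(chain, raw)
        for callback in self._subscribers:
            callback(record)  # the 'push' to the client


# Stage 3: consumption. Here the client simply collects what it receives.
received = []
aggregator = Aggregator()
aggregator.subscribe(received.append)
for chain, raw in RAW_FEED:
    aggregator.ingest(chain, raw)
```

The client never sees the chain-specific payloads, only records in the common schema, which is the service an FI would be paying an aggregator for.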
Access to reliable and standardised on-chain data in the tumultuous and mysterious cryptoasset world will be vital for FIs seeking to increase their exposure to cryptoassets.
Stay tuned for more Substack posts regarding the cryptoasset data world over the coming days.