The Graph Now Supports Solana with Substreams

StreamingFast
Nov 3, 2022

The Graph Foundation is excited to announce support for Solana with substreams. The Solana developer community can now begin using The Graph to build lightning-fast dapps. By using the new substreams technology, developers can efficiently extract and interpret on-chain data from Solana’s mainnet-beta to feed their applications. Providing support with substreams is the first step in bringing subgraphs to Solana.

Substreams, which are fully open-source, empower Solana developers to build with on-chain data in brand new ways, thanks to their speed and efficiency. Developers can use substreams modules, coded in Rust, to build protocol-specific data streams or market-wide analytical datasets. Modules can also power real-time notifications and display long time-series information. Breaking out of walled gardens, substreams devs can leverage streams created by others to save time, and can empower the whole web3 ecosystem by making their work openly available and composable. As a result, substreams give rise to new and innovative use cases throughout the Solana developer community.

Developed by StreamingFast, a core developer in The Graph ecosystem, substreams allow for extremely fast historical processing, both in batch and in streaming mode. Substreams open the door to many benefits, including: feeding any data system through technology-specific sinks, reusing your Solana program’s Rust code to read on-chain data, a laser-focused debugging experience, communal and composable refinement of data streams, and reliable reorg-aware streams.

A true industry-shifting technology, substreams are poised to unlock subgraph performance with parallel data processing to greatly increase syncing speeds. Through a horizontally scalable parallel engine, substreams are capable of multiplying historical indexing performance by more than 100x.

Developers can utilize substreams to generate new and exciting use cases, such as cross-chain bridges, large-scale analytics, refined intelligence for block explorers, trading engines, and any application in need of a rich, consistent data stream.

A free-to-use hosted service for this technology will be available until it is deployed on The Graph Network.

“Solana support on The Graph has been long-awaited and we’re thrilled to be providing an efficient way to get historical Solana data using substreams, a new cutting-edge architecture for data streaming.”

- Eva Beylin, Director of The Graph Foundation

How Do Substreams Work?

Substreams are new data sources on The Graph that more efficiently extract enriched data through modules built in Rust. While substreams can be used independently, there are ongoing efforts to integrate substreams to power subgraphs and be supported on The Graph Network.

The culmination of years of research and development, substreams were created by StreamingFast, a core developer that has worked across many chains and learned what a data-indexing architecture needs. Following the launch of Firehose, which revealed the potential efficiencies of extracting data to optimize indexing, substreams were created to unlock greater opportunities for dapp developers.

In an ETL (extract, transform, load) analogy, substreams are the transformation layer, whereas Firehose is the extraction layer. By contrast, subgraphs provide the full ETLQ experience, including the load and query layers.

RPC-based indexing technologies usually poll the APIs exposed by native chain clients. Firehose technology replaces those polling API calls with a stream of data that uses a push model, so data reaches the indexing node faster. This increases the speed of syncing and indexing, and does away with most needs for archive nodes.

Substreams, which are blockchain agnostic, take things even further by enabling massively parallelized streaming data. Substreams can be combined in powerful new ways to feed data into subgraphs or end-user applications in a fraction of the time. Early testing on some subgraphs saw sync speed increases of over 100x with substreams parallelization.

Because substreams support stateful modules, analytic use cases can aggregate computations across the history of the chain, even in parallel, enabling new powerful ad-hoc analysis to be performed.
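To make the idea of aggregating across history in parallel more concrete, here is a minimal sketch in plain Rust. It does not use the real substreams crates: the Trade type, the function names, and the segment size are all assumptions made for illustration. It only shows why an additive store can be built from independent segments of history and merged afterwards.

```rust
use std::collections::HashMap;
use std::thread;

/// Hypothetical, simplified trade event; not a real substreams type.
#[derive(Clone, Debug)]
pub struct Trade {
    pub market: String,
    pub amount: u64,
}

/// Aggregate one segment of history into a partial store (market -> volume).
pub fn aggregate_segment(trades: &[Trade]) -> HashMap<String, u64> {
    let mut store = HashMap::new();
    for t in trades {
        *store.entry(t.market.clone()).or_insert(0) += t.amount;
    }
    store
}

/// Merge partial stores. Addition is associative and commutative,
/// so the order in which segments finish does not change the result.
pub fn merge(partials: Vec<HashMap<String, u64>>) -> HashMap<String, u64> {
    let mut total = HashMap::new();
    for partial in partials {
        for (market, volume) in partial {
            *total.entry(market).or_insert(0) += volume;
        }
    }
    total
}

/// Split history into fixed-size segments, aggregate each on its own
/// thread, then merge the partial stores into the final one.
pub fn aggregate_parallel(history: &[Trade], segment_len: usize) -> HashMap<String, u64> {
    let partials = thread::scope(|s| {
        // Spawn every worker first, then join them all.
        let handles: Vec<_> = history
            .chunks(segment_len.max(1))
            .map(|segment| s.spawn(move || aggregate_segment(segment)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("segment worker panicked"))
            .collect::<Vec<_>>()
    });
    merge(partials)
}
```

Calling aggregate_parallel over a slice of trades yields the same totals as a single sequential pass with aggregate_segment, which is the property a parallel engine relies on.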

Substreams can feed into many sinks, with Postgres and MongoDB already available, and graph-node integration on its way. Substreams can also easily be consumed by simple programs written in any language that supports gRPC (Python, Go, Rust, C, C#, C++, Java, Kotlin and more), feeding into any system you may already have.

For Solana programs written in Rust, instructions can be decoded in substreams using the same code you use to validate transactions on-chain, targeting WebAssembly instead of BPF.
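As a rough illustration of that reuse, the sketch below decodes a made-up instruction format using only the Rust standard library. The Instruction enum, the one-byte tag, and the little-endian u64 layout are assumptions for this example and do not describe any real Solana program. Because the decoder has no platform-specific dependencies, the same function could in principle be compiled for the on-chain target and for WebAssembly.

```rust
/// Hypothetical instruction set of an illustrative program.
#[derive(Debug, PartialEq, Eq)]
pub enum Instruction {
    Deposit { amount: u64 },
    Withdraw { amount: u64 },
}

/// Reasons the raw instruction data could not be decoded.
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    Empty,
    BadLength,
    UnknownTag(u8),
}

/// Decode instruction data laid out as: 1 tag byte, then a
/// little-endian u64 amount (an assumed layout for this sketch).
pub fn decode(data: &[u8]) -> Result<Instruction, DecodeError> {
    let (&tag, rest) = data.split_first().ok_or(DecodeError::Empty)?;
    if rest.len() != 8 {
        return Err(DecodeError::BadLength);
    }
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(rest);
    let amount = u64::from_le_bytes(bytes);
    match tag {
        0 => Ok(Instruction::Deposit { amount }),
        1 => Ok(Instruction::Withdraw { amount }),
        other => Err(DecodeError::UnknownTag(other)),
    }
}
```

A module would call decode on the data bytes of each instruction addressed to the program and emit the decoded values downstream.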

Being fully deterministic, substreams have excellent caching capabilities. They allow you to leverage the cached state of previously executed modules to jump into the middle of history and zero in on a bug without starting over from the beginning. Once the dependencies of your module have been processed, anyone can start building on them at any point in on-chain history. This greatly improves agility and speed of iteration.
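The caching idea can be sketched with a toy model. Everything below is hypothetical (the apply_block function, the SnapshotCache type, the snapshot interval) and is not how the substreams engine is implemented. It only shows why determinism makes snapshots safe to reuse: replaying from a saved state always produces the same result as replaying from the start.

```rust
use std::collections::BTreeMap;

/// Deterministic toy "module": folds a block number into a running state.
/// A stand-in for real module logic.
fn apply_block(state: u64, block: u64) -> u64 {
    state.wrapping_mul(31).wrapping_add(block)
}

/// Keeps a snapshot of the state every `interval` blocks.
pub struct SnapshotCache {
    snapshots: BTreeMap<u64, u64>, // block height -> state after that block
    interval: u64,
    /// Number of blocks actually executed, to make cache hits visible.
    pub blocks_executed: u64,
}

impl SnapshotCache {
    pub fn new(interval: u64) -> Self {
        Self {
            snapshots: BTreeMap::new(),
            interval: interval.max(1),
            blocks_executed: 0,
        }
    }

    /// State after processing blocks 1..=target, resuming from the
    /// nearest snapshot at or before `target` instead of from block 1.
    pub fn state_at(&mut self, target: u64) -> u64 {
        let (mut height, mut state) = self
            .snapshots
            .range(..=target)
            .next_back()
            .map(|(h, s)| (*h, *s))
            .unwrap_or((0, 0));
        while height < target {
            height += 1;
            state = apply_block(state, height);
            self.blocks_executed += 1;
            if height % self.interval == 0 {
                self.snapshots.insert(height, state);
            }
        }
        state
    }
}
```

With an interval of 100, a first call to state_at(1000) executes every block, while a later call to state_at(950) only replays the 50 blocks after the snapshot at height 900.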

Substreams also create new forms of in-flight composition. This means that modules taken from different authors can be combined together at the time of transformation, not at a later query time.

As an example, substreams make it possible to use a Serum price module developed by team A, combine it with a Metaplex sales module developed by team B, and then create a third module of your own that outputs an enriched and refined USD volume of trades. Each stream would stay independently composable. So, if you need to access data on prices, you could just hook into the prices module; or if you need sales volumes, you could just hook into the sales volume module.
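The example above can be sketched as three plain Rust functions, one per module. The RawEvent and Sale types, the function names, and the use of integer USD cents are assumptions for illustration; they are not the real Serum or Metaplex data models, and real modules are wired together through a manifest rather than direct function calls.

```rust
use std::collections::HashMap;

/// Hypothetical raw events found in a block.
pub enum RawEvent {
    PriceUpdate { token: String, usd_cents: u64 },
    Sale { token: String, amount: u64 },
}

/// A sale as emitted by the (hypothetical) sales module.
#[derive(Debug, PartialEq)]
pub struct Sale {
    pub token: String,
    pub amount: u64,
}

/// Team A's module: block -> latest USD price (in cents) per token.
pub fn prices(block: &[RawEvent]) -> HashMap<String, u64> {
    let mut out = HashMap::new();
    for ev in block {
        if let RawEvent::PriceUpdate { token, usd_cents } = ev {
            out.insert(token.clone(), *usd_cents);
        }
    }
    out
}

/// Team B's module: block -> list of sales.
pub fn sales(block: &[RawEvent]) -> Vec<Sale> {
    block
        .iter()
        .filter_map(|ev| match ev {
            RawEvent::Sale { token, amount } => Some(Sale {
                token: token.clone(),
                amount: *amount,
            }),
            _ => None,
        })
        .collect()
}

/// Your module: consumes the outputs of the other two and emits the
/// USD volume in cents. Sales without a known price are skipped.
pub fn usd_volume(prices: &HashMap<String, u64>, sales: &[Sale]) -> u64 {
    sales
        .iter()
        .filter_map(|s| prices.get(&s.token).map(|p| p * s.amount))
        .sum()
}
```

Note that usd_volume never touches the raw block. It depends only on the outputs of prices and sales, so either upstream module can still be consumed on its own.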

Lastly, reliability is baked into substreams technology in the form of a cursor that accompanies every streamed payload. Much like a web cookie, this cursor can be sent back in the next request after a disconnection, and it guarantees that you will never miss a reorg signal, even if the reorg happened while you were disconnected.
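A minimal simulation of that cursor behaviour is shown below. It is a sketch under loud assumptions: real cursors are opaque values issued by the server, whereas here the cursor is simply an index into an in-memory log, and the Step, Message, and Consumer types are invented for the example.

```rust
/// What the stream tells the consumer to do with a block.
#[derive(Clone, Debug, PartialEq)]
pub enum Step {
    New(u64),
    Undo(u64),
}

/// A streamed payload together with its cursor.
pub struct Message {
    pub cursor: usize,
    pub step: Step,
}

/// Server stand-in: replay every message after the given cursor,
/// or the whole log when the consumer has no cursor yet.
pub fn stream_from(log: &[Step], cursor: Option<usize>) -> Vec<Message> {
    let start = cursor.map_or(0, |c| c + 1);
    log.iter()
        .cloned()
        .enumerate()
        .skip(start)
        .map(|(i, step)| Message { cursor: i, step })
        .collect()
}

/// Consumer that applies new blocks, reverts undone ones, and
/// remembers the cursor of the last message it handled.
#[derive(Default)]
pub struct Consumer {
    pub cursor: Option<usize>,
    pub blocks: Vec<u64>,
}

impl Consumer {
    pub fn apply(&mut self, msg: Message) {
        match msg.step {
            Step::New(block) => self.blocks.push(block),
            Step::Undo(block) => self.blocks.retain(|&b| b != block),
        }
        self.cursor = Some(msg.cursor);
    }
}
```

If a consumer handles three messages, disconnects, and later calls stream_from with its saved cursor, it receives any Undo message that was issued in the meantime before it sees the replacement block.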

A full-fledged integration of substreams across chains is coming soon, as is a subgraph-substreams integration that brings performance improvements to subgraphs! When combining the speed and data composability of subgraphs and substreams with unpacked blockchain data from the Firehose, The Graph is unarguably the fastest and most efficient way to get data from blockchains.

How to Get Started Indexing Solana Data with The Graph

You can access Solana today from the hosted service, while we work to bring substreams to The Graph’s decentralized network as a native product. Here are some resources to help you get started:

— — —

About The Graph

The Graph is the indexing and query layer of web3. Developers build and publish open APIs, called subgraphs, that applications can query using GraphQL. The Graph currently supports indexing data from over 39 different networks including Ethereum, NEAR, Arbitrum, Optimism, Polygon, Avalanche, Celo, Fantom, Moonbeam, IPFS, Cosmos Hub and PoA with more networks coming soon. To date, 63,000+ subgraphs have been deployed on the hosted service. Tens of thousands of developers use The Graph for applications such as Uniswap, Synthetix, KnownOrigin, Art Blocks, Gnosis, Balancer, Livepeer, DAOstack, Audius, Decentraland, and many others.

The Graph Network’s self service experience for developers launched in July 2021; since then over 500 subgraphs have migrated to the Network, with 180+ Indexers serving subgraph queries, 9,300+ Delegators, and 2,400+ Curators to date. More than 4 million GRT has been signaled to date with an average of 15K GRT per subgraph.

If you are a developer building a web3 application, you can use subgraphs for indexing and querying data from blockchains. The Graph allows applications to present data in a UI efficiently and allows other developers to use your subgraph too! You can deploy a subgraph to the network using the newly launched Subgraph Studio or query existing subgraphs that are in the Graph Explorer. The Graph would love to welcome you as an Indexer, Curator and/or Delegator on The Graph’s mainnet. Join The Graph community by introducing yourself in The Graph Discord for technical discussions, join The Graph’s Telegram chat, and follow The Graph on Twitter, LinkedIn, Instagram, Facebook, Reddit, and Medium! The Graph’s developers and members of the community are always eager to chat with you, and The Graph ecosystem has a growing community of developers who support each other.

The Graph Foundation oversees The Graph Network. The Graph Foundation is overseen by the Technical Council. Edge & Node, StreamingFast, Figment, Semiotic, The Guild, Messari and GraphOps are seven of the many organizations within The Graph ecosystem.

About StreamingFast

StreamingFast is a web3 builder and investor. As a core developer on The Graph, it excels at building massively scalable open-source software for processing and indexing blockchain data. Founded by a team of serial tech entrepreneurs, the company has deep expertise in large-scale data science. Its core innovations, the Firehose and Substreams, take a files-based and streaming-first approach that enables high-performance indexing on high-throughput chains.

You can follow StreamingFast on Twitter and on Discord.

Originally published at https://thegraph.com on November 3, 2022.
