New Substreams Sink: Google Cloud PubSub

StreamingFast
Mar 5, 2024

Substreams sinks allow you to consume blockchain data in a variety of ways. Developed within The Graph, they are part of StreamingFast’s ongoing effort to make blockchain data more accessible. Today we are announcing a new sink, Google Cloud PubSub, which adds to the many options for easily building blockchain-based applications.

About Google Cloud PubSub
The Google Cloud PubSub service is an asynchronous messaging system that decouples publishers and consumers of data streams. Publishers send their data to the PubSub platform and consumers can reliably pull from it, optionally filtering on the data they receive.

In PubSub, a stream of data is called a topic. For example, consider that you are building a music application. You could create a topic for every music genre (pop, country, rock…). Publishers send every song to the corresponding topic (depending on the song’s genre), and consumers subscribe only to those topics they’re interested in.
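
To make the topic model concrete, here is a minimal in-memory sketch of the music example. It is a toy stand-in, not the Google Cloud client library: the `InMemoryPubSub` class and the song titles are hypothetical and exist only to show how publishers and consumers are decoupled through topics.

```python
from collections import defaultdict


class InMemoryPubSub:
    """Toy stand-in for a pub/sub broker; not the Google Cloud client library."""

    def __init__(self):
        # topic name -> list of subscriber inboxes (each inbox is a plain list)
        self._subscribers = defaultdict(list)

    def subscribe(self, topic):
        """Register a new consumer on a topic and return its inbox."""
        inbox = []
        self._subscribers[topic].append(inbox)
        return inbox

    def publish(self, topic, payload):
        """Append the payload to every inbox subscribed to the topic."""
        for inbox in self._subscribers[topic]:
            inbox.append(payload)


broker = InMemoryPubSub()
rock_fan = broker.subscribe("rock")
pop_fan = broker.subscribe("pop")

# The publisher routes each song to the topic matching its genre.
songs = [("rock", "Song A"), ("pop", "Song B"), ("country", "Song C"), ("rock", "Song D")]
for genre, title in songs:
    broker.publish(genre, title)

print(rock_fan)  # ['Song A', 'Song D']
print(pop_fan)   # ['Song B']
```

The publisher never references a consumer directly. It only knows topic names, which is the decoupling described above. The country song is simply not delivered because nobody subscribed to that topic.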

The StreamingFast Google Cloud PubSub Sink
This new release from StreamingFast lets you consume Substreams data, such as smart contract events, balance updates or general metadata about blocks and transactions, through PubSub topics.

For example, you can create two topics on Google PubSub: “NFT events” and “Balance Changes events”. Then, you write a Substreams that extracts the data you need (or use one of the many ready-to-use Substreams found in the registry).

The Substreams sink acts as a publisher, emitting data to the topics. Consumers subscribed to those topics then receive the NFT and balance change data. Consumers can be built in many ways, most commonly with the Go, Java or Node client libraries developed by Google.

Getting Started with the Sink
The Google Cloud PubSub sink is available on GitHub and includes a CLI that lets you easily integrate Substreams with the PubSub service. The steps are:

  1. Create a new Substreams that extracts the blockchain data you need (or choose one of the ready-to-use Substreams found in the registry).
  2. Install the Google Cloud PubSub sink CLI by building the PubSub sink GitHub project.
  3. Run the Google Cloud PubSub sink CLI, providing information such as the Substreams package, the module name, the Google Cloud project ID and the PubSub topic name:

substreams-sink-pubsub sink -e <endpoint> --project <project_id> <substreams_manifest> <substreams_module_name> <topic_name>
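
If you launch the sink from a script, the command above can be assembled programmatically. The following is a sketch only: `build_sink_command` is a hypothetical helper, and every value passed to it (endpoint, project ID, manifest, module and topic names) is a made-up placeholder. It uses only the `-e` and `--project` flags and the positional arguments that appear in the command above.

```python
import shlex


def build_sink_command(endpoint, project_id, manifest, module_name, topic_name):
    """Return the argv list for the sink invocation shown above."""
    return [
        "substreams-sink-pubsub", "sink",
        "-e", endpoint,
        "--project", project_id,
        manifest, module_name, topic_name,
    ]


# All values below are hypothetical placeholders.
argv = build_sink_command(
    endpoint="example-endpoint:443",
    project_id="my-gcp-project",
    manifest="substreams.yaml",
    module_name="map_events",
    topic_name="nft-events",
)

# Print the command as a single shell-safe string for inspection.
print(shlex.join(argv))
# To actually launch the sink: subprocess.run(argv, check=True)
```

Building the command as a list rather than a string avoids shell quoting problems if a path or topic name ever contains spaces.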

Get started by running one of the simple examples we’ve provided on GitHub.

How the Sink Works
The PubSub sink consumes messages produced by a Substreams module whose output follows a specific data model: the sf.substreams.sink.pubsub.v1.Publish protobuf message type. A Publish message contains an array of messages, and every message has two fields, data and attributes (the same fields found in a Google PubSub message object).

Just like in a normal Google PubSub message, you use the data field to send the actual blockchain data and the attributes field to perform filtering on the messages.

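syntax = "proto3";

package sf.substreams.sink.pubsub.v1;

// The output format of the Substreams module
message Publish {
  repeated Message messages = 1;
}

// A single PubSub message: the payload plus its filterable attributes
message Message {
  bytes data = 1;
  repeated Attribute attributes = 2;
}

// A key/value pair attached to a message
message Attribute {
  string key = 1;
  string value = 2;
}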
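
To illustrate how the data and attributes fields work together, here is a small sketch that mirrors the three message types as plain Python dataclasses and filters a batch by attribute. These are not classes generated from the protobuf definition; the payloads, the attribute names (`type`, `chain`) and the `filter_by_attribute` helper are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    key: str
    value: str


@dataclass
class Message:
    data: bytes
    attributes: list = field(default_factory=list)


@dataclass
class Publish:
    messages: list = field(default_factory=list)


def filter_by_attribute(publish, key, value):
    """Keep only the messages carrying an attribute equal to (key, value)."""
    return [
        m for m in publish.messages
        if any(a.key == key and a.value == value for a in m.attributes)
    ]


# Hypothetical payloads and attribute names, for illustration only.
batch = Publish(messages=[
    Message(b"transfer #1", [Attribute("type", "nft")]),
    Message(b"balance #1", [Attribute("type", "balance")]),
    Message(b"transfer #2", [Attribute("type", "nft"), Attribute("chain", "mainnet")]),
])

nft_only = filter_by_attribute(batch, "type", "nft")
print([m.data for m in nft_only])  # [b'transfer #1', b'transfer #2']
```

The filter never looks inside data, which stays an opaque byte payload. Attributes are what make a message selectable without decoding it.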

Learn more about the PubSub sink and the Substreams technology in the Substreams documentation.

StreamingFast

StreamingFast is a protocol infrastructure company that provides a massively scalable architecture for streaming blockchain data.