Skip to main content

GA4 | tips & tricks

How to sync your marketing data with BigQuery using Fivetran's free plan

A data warehouse like BigQuery provides a centralised place for all your marketing data. By automating the entire data pipeline process with a service like Fivetran, manual effort is eliminated and the risk of errors is reduced.

As a marketer or analyst, you understand the importance of having a deep understanding of your target audience and how your marketing strategies are performing. But, in a world where data is constantly being generated from multiple sources, it can be overwhelming to make sense of it all.

Marketing data warehouse

This is where a data warehouse like BigQuery comes in, providing a centralised place for all your marketing data, enabling you to get a comprehensive view of your marketing efforts.

By collecting data from various sources, such as campaign tools (e.g. Google Ads, Microsoft Ads, Google Campaign Manager, Display & Video), social media platforms (e.g. Facebook, Instagram, Twitter, LinkedIn, Reddit), website analytics (e.g. GA4), ecommerce data (e.g. Shopify), your CRM data (e.g. Salesforce) or customer feedback, you can gain insights into customer behaviour, preferences, and interactions with your brand.

These insights can help inform your marketing strategies, improve your customer experience, and ultimately drive more conversions and revenue.

Technical know-how

While the benefits of using a data warehouse like BigQuery are clear, the way to get there can be a challenging and time-consuming. You need to have the technical know-how to extract data from various systems, format and clean the data, and load it into the warehouse. Additionally, you need to continuously monitor and maintain the data pipeline to ensure that it's up-to-date and accurate.

💡
This is not a sponsored article and it doesn't contain any affiliate links. I just think free plans are interesting for my audience, as they allow you to play around with your data without any obligations (if you stay within the limits described below), and Fivetran is (to my knowledge) the first to offer a user-friendly service like this.

Free plan - monthly limit

This is where a service like Fivetran can help. Fivetran automates the entire data pipeline process, eliminating manual effort and reducing the risk of errors. Now that Fivetran introduced a free plan, starting February 1st, 2023, everyone can play around with database connectors, without any costs or obligations (if you stay within the monthly active rows limit stated below):

  • Unlimited users
  • Usage of up to 500,000 monthly active rows (MAR) at no cost
  • No credit card required
  • Access to 300+ fully-managed application and database connectors
  • 15-minute syncs
  • One-month grace period to upgrade to a paid plan after exceeding 500,000 MAR limit.

To sign up for Fivetran's Free Plan, first sign up for a free trial. After 7 days, you'll get a usage estimate and if it's below 500,000 MAR, you can select the Free Plan without entering a credit card. After exceeding the 500,000 MAR limit, you can upgrade to a paid plan.

Now let's add some data to our existing BigQuery project (assuming you already set up a BigQuery project for your GA4 export data).

Add data destination

First we will configure BigQuery as our data destination. Choose Destinations in the side bar, select BigQuery and proceed.

Follow all the necessary steps (skip the optional step) in the setup guide on the right. Make sure you select the same data (processing) locations as the (multi)region where your GA4 export data is stored.

After completing the steps, Fivetran will test the connection and permissions. If all tests passed, we can proceed.

Add data source

We are ready to add our first data source. As you can see Fivetran provides a lot of connectors, and they are clustered per domain.

As I don't run any ads at the moment and my biggest acquisition channel is Organic Search, I'm very interested in Google Search Console data. Select the desired data source and proceed.

💡
When choosing the historical sync timeframe you have to consider if you need all historical data right away and whether want to stay within the monthly active rows limit of the free plan.

As you read below, our first attempt using the default configuration for Google Search Console resulted in exceeding the monthly limit of the free plan.

Make sure you have a good understanding of the impact of the configuration you are selecting. The default options in Fivetran are not necessarily configured in your best interest.

Think a minute about the destination schema name, because your can't change it later. Then click Authorize.

It will trigger a permission screen. Choose the account you use to access Google Search Console and allow Fivetran to access the data.

If the authentication is successful, you can proceed and select to sync either all sites or specific sites connected to your account.

Again, some tests will run to check if all requirements are in place.

The next step is actually a nice feature: if your data contains columns that need to be blocked or hashed, this is the time to configure it.

For our use case (Google Search Console) this is not necessary. Proceed and select Start initial sync on the next screen. Now we sit back and wait.

After 53 minutes (we're talking about a few years of historical data here), the sync is completed.

By default the next sync will run in 6 hours. You can change the frequency (5 minutes to 24 hours) in the tab Setup.

When you click on the tab Schema you can see the reports that are synced with your data warehouse, and all the schemas. Here you can adjust the reports and columns that you want to include or exclude from loading and configure if the values should be hashed.

There is also a schema setting menu where you can configure how to deal with new schemas and columns. By default every new schema, table or column will be allowed and added to your data destination.

Check results in BigQuery

Navigate back to BigQuery and reload the page if don't see your new data set in the side bar.

Looking at the fivetran_audit table, we exceeded the 500k rows for multiple reports, so you might be careful with syncing all available reports for a data source by default.

Adjust sync configuration

In our case we only want to keep syncing the keyword_page_report, as it contains all fields I'm interested in on a highly granular level:

  • country
  • date
  • device
  • page
  • query
  • search_type
  • clicks
  • ctr
  • impressions
  • position

If you don't need all reports, you can disable the sync for some reports (in the Schema tab for that data source in the Fivetran user interface). From that moment on it will not update that table (and thus not impact your monthly active rows).

Optionally you can also safely delete any table in BigQuery, if you don't need it any longer and want to save on storage costs.

The actual costs

We didn't provide a credit card, so we didn't lose any money (yet). Also the initial sync appears to be free from charge, according to the Usage tab. Once I have clarity about the actual costs of this connection I will update accordingly.

Update: after one week I received a usage estimation. Turns out we are still eligible for the free plan and don't have to provide a credit card to keep the pipeline running.

Now it's your turn!

I hope you've enjoyed this tutorial and feel a bit more confident to utilise your own marketing data in BigQuery. Drop a line in the comments if you have any questions, feedback or suggestions related to this article.