#22 - Data Transfer updates, new AI functions & BigQuery community news in April
Hi there,
Warm welcome to all the new newsletter subscribers since the last one. It's June now, but since there were so many updates in April, let's cover those first — I hope you'll still find a good few relevant pieces of news in here.
New on GA4BigQuery
The Facebook Ads Data Transfer got a major facelift in BigQuery. The free article explaining how to set it up has been updated accordingly:
Google wasn't too forthcoming with the news: I have to admit, I only became aware of this update because a subscriber commented on the original article saying that there are more objects available now. Many thanks to those who add comments to ask questions or provide updates / explanations: I'd like to encourage everyone to use this functionality as it might help others, even if I don't update the article in question with all the latest news.
New AI functions
AI.AGG: an aggregate function that takes data from multiple rows for the column in question and returns a single "summary" string based on the instruction given to Gemini. The GROUP BY clause works as usual like with any other aggregate column, so you can create summaries from chunks of data broken down by your preferred dimensions. See more detail in the documentation and in this LinkedIn post.
AI.KEY_DRIVERS: identifies segments of data that cause statistically significant changes to a summable metric. In other words, this is a "shortcut" to perform contribution analysis, which allows you to analyze metrics of interest from your dataset across defined test and control subsets.
You can enable "Optimized Mode" for certain managed AI functions such as AI.IF and AI.CLASSIFY to reduce query latency and token consumption. See newsletter #17 for a bit more detail.
Infrastructural improvements
Incremental updates in Data Transfers for on-premise databases: perhaps not one we frequently use in digital analytics, but useful for data warehousing folks migrating or replicating data from the following databases:
Microsoft SQL Server
MySQL
Oracle
PostgreSQL
ServiceNow
The above is an enabler in situations where the original database contains large tables with many millions of rows — incremental is the only feasible loading mechanism in these scenarios.
Changes to Google Ads and Facebook data transfers:
Multi-factor authentication is now required for individual user accounts if they are used in the Google Ads data transfer (or if they use the API in any way).
You can now use service accounts in the Google Ads transfer — the option had been there for ages, but for the longest time, it didn't use to work properly. Now it does: more context and detail here.
Some data types have changed in the output of the Facebook Data Transfer: namely, the ActionValue field in AdInsightActions has been changed from INT to FLOAT to better accommodate values (e.g. purchase revenue).
This is a very recent update, not from April, but now you can only backfill Google Ads data for the last 37 months, instead of "forever" (which is still pretty decent). Note: some tables don't get backfilled and only capture a "point-in-time" snapshot (e.g. campaign name, configuration details), so it's better to start the transfer asap.
Continuous queries now support "stateful operations" i.e. you can use joins and window functions. This is extremely useful for streaming applications e.g. enriching data in an event (e.g. click event) before loading it in the downstream application (e.g. for real-time messaging & personalization use cases).
SPONSORED
measure-apps.com: a new training course suite focusing on mobile app tracking and measurement challenges. The first training shows how to measure Webviews and maintain the integrity of the data across the entire visitor journey across native mobile app and webview screens. Launch promo with a remarkable 60% off of the original €150 price tag! (I've already signed up.)
BigQuery Graph capabilities in Preview: you can now use Graph Query Language (GQL) on tables right inside BigQuery without having to copy them over into a graph database instance. You can use a visual graph modeler to define nodes and edges in your BigQuery tables. And not only can you process the data, but you can also visualize results within BigQuery Studio itself, and Conversational Analytics can handle them too. This is very useful in fraud analytics, supply chain and "viral marketing" analysis, to name a few. Screenshot borrowed from this LinkedIn post.
Note: the community already started building accelerators to help people adapt this feature: namely, a GQL to SQL converter (with a blog post to explain).
Strict "act-as mode" in Dataform: starting from end of April, all Dataform repositories have to use a custom service account or an individual user account; they cannot use the default "dataform" service account anymore.
Community posts and updates around BigQuery
An easy-to-install method that sets up all the foundations to trigger a Dataform pipeline when the latest GA4 raw data arrives in BigQuery
Change in Google Signals: it stopped being a "fallback safety measure" (if you kept that switched off) — now sending data is solely governed by the associated consent signals.
ConsentOS: a completely free, open-source CMP (cookie management platform) — some detail here
a blog & repo that reverse engineers client- and server-side GTM scripts
Enjoy the summer and happy querying — stay tuned for the May updates and some technical articles in the works. If you know of any other updates from early spring, please leave a comment below!
Balazs
P.S.: I probably saw this before April, but it made me laugh so sharing it here (source: LinkedIn, credit unknown, apologies):
#22 - Data Transfer updates, new AI functions & BigQuery community news in April
— Balazs Vajna
#22 - Data Transfer updates, new AI functions & BigQuery community news in April
Hi there,
Warm welcome to all the new newsletter subscribers since the last one. It's June now, but since there were so many updates in April, let's cover those first — I hope you'll still find a good few relevant pieces of news in here.
New on GA4BigQuery
The Facebook Ads Data Transfer got a major facelift in BigQuery. The free article explaining how to set it up has been updated accordingly:
New AI functions
AI.AGG: an aggregate function that takes data from multiple rows for the column in question and returns a single "summary" string based on the instruction given to Gemini. TheGROUP BYclause works as usual like with any other aggregate column, so you can create summaries from chunks of data broken down by your preferred dimensions. See more detail in the documentation and in this LinkedIn post.AI.KEY_DRIVERS: identifies segments of data that cause statistically significant changes to a summable metric. In other words, this is a "shortcut" to perform contribution analysis, which allows you to analyze metrics of interest from your dataset across defined test and control subsets.AI.IFandAI.CLASSIFYto reduce query latency and token consumption. See newsletter #17 for a bit more detail.Infrastructural improvements
The above is an enabler in situations where the original database contains large tables with many millions of rows — incremental is the only feasible loading mechanism in these scenarios.
ActionValuefield inAdInsightActionshas been changed fromINTtoFLOATto better accommodate values (e.g. purchase revenue).measure-apps.com: a new training course suite focusing on mobile app tracking and measurement challenges. The first training shows how to measure Webviews and maintain the integrity of the data across the entire visitor journey across native mobile app and webview screens. Launch promo with a remarkable 60% off of the original €150 price tag! (I've already signed up.)
Note: the community already started building accelerators to help people adapt this feature: namely, a GQL to SQL converter (with a blog post to explain).
Community posts and updates around BigQuery
AI.DETECT_ANOMALIESexample use case (code here)INFORMATION_SCHEMA.Other relevant news from the community
Enjoy the summer and happy querying — stay tuned for the May updates and some technical articles in the works. If you know of any other updates from early spring, please leave a comment below!
Balazs
P.S.: I probably saw this before April, but it made me laugh so sharing it here (source: LinkedIn, credit unknown, apologies):