How to replicate the 'Life cycle | Acquisition | Traffic acquisition' report (GA4)
2 min read

How to replicate the 'Life cycle | Acquisition | Traffic acquisition' report (GA4)

A query to generate the Google Analytics 4 traffic acquisition report in BigQuery. In this report you'll find data about acquisition at session level: users, (engaged) sessions, engagement time and rate, segmented by default channel grouping, medium, source and campaign.
How to replicate the 'Life cycle | Acquisition | Traffic acquisition' report (GA4)

As a Google Analytics user you are probably quite attached to the default reports in the user interface of Google Analytics 4. It can be hard to make sense of the data in the BigQuery export tables.

Let me enable you to replicate the report you're familiar with. I'll try to keep it as basic as possible here.

Traffic acquisition report

In the Life cycle | Acquisition | Traffic acquisition report you'll find a table with data about acquisition at session level (source, medium, campaign, default channel grouping) and some engagement metrics. We will try to replicate this table with the query below.

Good to know

The main query contains all primary dimensions that are available in the BigQuery export. By default only the first one is selected, the other dimensions are 'hidden' by a single line comment. Remove the -- part to select these dimensions. Don't forget to adjust the group by clause by removing the corresponding -- as well.

It is expected that the results will not match (exactly) with your reports in the Google Analytics 4 user interface. Don't worry about it, it can occur for a variety of reasons.

Let's query!

-- subquery to prepare the data
with prep_traffic as (
select
    user_pseudo_id,
    (select value.int_value from unnest(event_params) where key = 'ga_session_id') as session_id,
    max((select value.string_value from unnest(event_params) where key = 'medium')) as medium,
    max((select value.string_value from unnest(event_params) where key = 'source')) as source,
    max((select value.string_value from unnest(event_params) where key = 'campaign')) as campaign,
    max((select value.string_value from unnest(event_params) where key = 'session_engaged')) as session_engaged,
    max((select value.int_value from unnest(event_params) where key = 'engagement_time_msec')) as engagement_time_msec,
    -- change event_name to the event(s) you want to count
    countif(event_name = 'click') as event_count,
    -- change event_name to the conversion event(s) you want to count
    countif(event_name = 'purchase') as conversions,
    sum(ecommerce.purchase_revenue) as total_revenue   
from
    -- change this to your google analytics 4 bigquery export location
    `ga4bigquery.analytics_250794857.events_*`
where
    -- change the date range by using static and/or dynamic dates
    _table_suffix between '20210101' and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
group by 
    user_pseudo_id,
    session_id)

-- main query
select
    concat(ifnull(source,'(direct)'),' / ',ifnull(medium,'(none)')) as session_source_medium,
    -- ifnull(medium,'(none)') as session_medium,
    -- ifnull(source,'(direct)') as session_source,
    -- ifnull(campaign,'(direct)') as session_campaign,
    /* -- definitions of the channel grouping based on the source / medium of every session
    case
        when source is null and (medium = '(not set)' or medium is null) then 'Direct'
        when medium = 'organic' then 'Organic Search'
        when regexp_contains(medium, r'^(social|social-network|social-media|sm|social network|social media)$') then 'Social'
        when medium = 'email' then 'Email'
        when medium = 'affiliate' then 'Affiliates'
        when medium = 'referral' then 'Referral'
        when regexp_contains(medium, r'^(cpc|ppc|paidsearch)$') then 'Paid Search'
        when regexp_contains(medium, r' ^(cpv|cpa|cpp|content-text)$') then 'Other Advertising'
        when regexp_contains(medium, r'^(display|cpm|banner)$') then 'Display'
        else '(Other)' end as session_default_channel_grouping,
    */
    count(distinct user_pseudo_id) as users,
    count(distinct concat(user_pseudo_id,session_id)) as sessions,
    count(distinct case when session_engaged = '1' then concat(user_pseudo_id,session_id) end) as engaged_sessions,
    safe_divide(sum(engagement_time_msec/1000),count(distinct case when session_engaged = '1' then concat(user_pseudo_id,session_id) end)) as average_engagement_time_per_session_seconds,
    safe_divide(count(distinct case when session_engaged = '1' then concat(user_pseudo_id,session_id) end),count(distinct user_pseudo_id)) as engaged_sessions_per_user,
    safe_divide(sum(event_count),count(distinct concat(user_pseudo_id,session_id))) as events_per_session,
    safe_divide(count(distinct case when session_engaged = '1' then concat(user_pseudo_id,session_id) end),count(distinct concat(user_pseudo_id,session_id))) as engagement_rate,
    sum(event_count) as event_count,
    sum(conversions) as conversions,
    ifnull(sum(total_revenue),0) as total_revenue
from
    prep_traffic
group by
    session_source_medium
    -- ,session_medium
    -- ,session_source
    -- ,session_campaign
    -- ,session_default_channel_grouping
order by
    users desc

Enjoying this post? Subscribe to the GA4BigQuery newsletter!

You will receive regular updates (max 1x/week) straight in your inbox. You can unsubscribe anytime.