A lot of Google Analytics 4 dimensions and metrics can be derived straight from the tables without performing calculations on the data. With the help of unnest and parse for dates or timestamps we can do the trick.

Other dimensions and metrics that you might need are a bit harder to access. You will have to (re)calculate them, which can require some serious SQL-skills.

While designing the course Query GA4 Data In Google BigQuery I've learned a lot about best practices to calculate dimensions and metrics. To share this knowledge with you I will provide a combined query for default dimensions and metrics, and a single example query for every non-default dimension or metric.

As always, please let me know if you have any feedback or suggestions to improve the quality of this content.

💡
User acquisition dimensions can be found here.

Default user dimensions

  • user_id
  • user_pseudo_id
  • user_first_touch_timestamp
  • user_properties.key
  • user_properties.value.string_value
  • user_properties.value.set_timestamp_micros
  • user_ltv.currency

Default user metrics

  • user_properties.value.int_value
  • user_properties.value.float_value
  • user_properties.value.double_value
  • user_ltv.revenue
💡
If you only need one default dimension or metric, look at the -- comments in the example query below for names and definitions and copy the part you need from the select clause. Make sure that you also add any additional conditions (i.e. with, from, where, group by, having and order by) that are necessary to calculate the results correctly. E.g, the query below is ungrouped, so every row corresponds to an event and may contain duplicate rows.
select
    -- user_id (dimension | the user id set via the setUserId api)
    user_id,
    -- user_pseudo_id (dimension | the pseudonymous id (e.g., app instance id) for the user)
    user_pseudo_id,
    -- user_first_touch_timestamp (dimension | the time (in microseconds) at which the user first opened the app/website)
    timestamp_micros(user_first_touch_timestamp) as user_first_touch_timestamp,
    -- user_properties.key (dimension | the name of the user property | replace <insert key> with a parameter key or delete where clause to select all)
    (select key from unnest(user_properties) where key = '<insert key>') as user_properties_key,
    -- user_properties.value.string_value (dimension | the string value of the user property | replace <insert key> with a parameter key or delete where clause to select all)
    (select value.string_value from unnest(user_properties) where key = '<insert key>') as user_string_value,
    -- user_properties.value.int_value (metric | the integer value of the user property | replace <insert key> with a parameter key or delete where clause to select all)
    (select value.int_value from unnest(user_properties) where key = '<insert key>') as user_int_value,
    -- user_properties.value.float_value (metric | the float value of the user property | replace <insert key> with a parameter key or delete where clause to select all)
    (select value.float_value from unnest(user_properties) where key = '<insert key>') as user_float_value,
    -- user_properties.value.double_value (metric | the double value of the user property | replace <insert key> with a parameter key or delete where clause to select all)
    (select value.double_value from unnest(user_properties) where key = '<insert key>') as user_double_value,
    -- user_properties.value.set_timestamp_micros (dimension | the time (in microseconds) at which the user property was last set | replace <insert key> with a parameter key or delete where clause to select all)
    timestamp_micros((select value.set_timestamp_micros from unnest(user_properties) where key = '<insert key>')) as user_set_timestamp_micros,
    -- user_ltv.revenue (metric | the lifetime value (revenue) of the user)
    user_ltv.revenue as user_ltv_revenue,
    -- user_ltv.currency (dimension | the lifetime value (currency) of the user)
    user_ltv.currency as user_ltv_currency
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20201101'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))

Calculated user dimensions

  • user type
select
	-- user type (dimension | either new visitor or returning visitor, indicating if the users are new or returning)
    case
        when (select value.int_value from unnest(event_params) where event_name = 'session_start' and key = 'ga_session_number') = 1 then 'new visitor'
        when (select value.int_value from unnest(event_params) where event_name = 'session_start' and key = 'ga_session_number') > 1 then 'returning visitor'
        else null end as user_type,
    count(distinct user_pseudo_id) as users
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20220901'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
group by
    user_type
having
    user_type is not null
  • count of sessions
with prep as (
select
    user_pseudo_id,
    max((select value.int_value from unnest(event_params) where key = 'ga_session_number')) over (partition by user_pseudo_id) as max_session_number
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20220901'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day)))

select
	-- count of sessions (dimension | the session index for a user, each session from a unique user will get its own incremental index starting from 1 for the first session)
    max_session_number as count_of_sessions,
    count(distinct user_pseudo_id) as users
from
    prep
group by
    max_session_number
order by
    max_session_number

Calculated user metrics

💡
Bear in mind that GA4 uses multiple concepts of users:
- total users, based on user_pseudo_id
- active users, based on user_pseudo_id and an engagement threshold
- users based on user_id

For every query that involves users you have to ask yourself which user concept you want to apply. To not overcomplicate things I have a slight preference for total users, unless insight in engagement is key.
  • total users
select
    -- users (metric | the total number of users)
    count(distinct user_pseudo_id) as users
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20220901'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
  • active users
select
    count(distinct case when (select value.int_value from unnest(event_params) where key = 'engagement_time_msec') > 0  or (select value.string_value from unnest(event_params) where key = 'session_engaged') = '1' then user_pseudo_id else null end) as active_users
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20220901'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
  • new users (based on total users)
select
    -- new users (metric | the number of users who interacted with your site or launched your app for the first time)
    count(distinct case when (select value.int_value from unnest(event_params) where key = 'ga_session_number') = 1 then user_pseudo_id else null end) as new_users
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20220901'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
  • % new users  (based on total users)
select
    count(distinct case when (select value.int_value from unnest(event_params) where key = 'ga_session_number') = 1 then user_pseudo_id else null end) / count(distinct user_pseudo_id) as percentage_new_users
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20220901'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
  • new sessions
select
    count(distinct case when (select value.int_value from unnest(event_params) where key = 'ga_session_number') = 1 then concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) else null end) as new_sessions
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20220901'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
  • % new sessions
select
    -- % new sessions (metric | the percentage of sessions by users who had never visited before)
    count(distinct case when (select value.int_value from unnest(event_params) where key = 'ga_session_number') = 1 then concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) else null end) / count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id'))) as percentage_new_sessions
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20220901'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
  • number of sessions per user (based on total users)
select
    -- number of sessions per user (metric | the total number of sessions divided by the total number of users)
    count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id'))) / count(distinct user_pseudo_id) as number_of_sessions_per_user
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20220901'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
  • event count per user (based on total users)
select
    -- event count per user (metric | the number of times an individual event (change event_name 'page_view' to event that needs to be counted) was triggered divided by amount of users)
    countif(event_name = 'page_view') / count(distinct user_pseudo_id) as event_count_per_user
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20220901'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
  • engaged sessions per user (based on total users)
select
    count(distinct case when (select value.string_value from unnest(event_params) where key = 'session_engaged') = '1' then concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) end) / count(distinct user_pseudo_id) as engaged_sessions_per_user
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20220901'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))