End-user privacy is at the core of everything that Ads Data Hub does; it's the foundation that our platform is built upon. In order to help maintain that privacy and help our customers with regulatory compliance, we impose certain checks and restrictions, designed to help prevent the transmission of data about individual users[1] in the data that you get out of the platform.
Here is an overview of Ads Data Hub's privacy features, with more detail in the sections that follow:
- Static checks examine the statements in your queries to look for obvious and immediate privacy concerns.
- Data access budgets limit the total number of times that you can access a given piece of data.
- Aggregation checks ensure that every row contains a large enough number of users to protect end-user privacy.
- Difference checks (or "diff checks") compare result sets to help prevent you from gathering information about individual users by comparing data from multiple sets of users.
- Noise injection is an alternative to difference checks. Adding random noise to an aggregating SELECT clause of a query protects user privacy while providing reasonably accurate results, eliminating the need for difference checks, and reducing the required aggregation threshold for output.
When a result doesn't pass privacy checks, Ads Data Hub will display or return a privacy message informing you that a row was filtered. Filtering can affect anything from a single row to an entire result set. To ensure that your reporting totals remain accurate, use a filtered row summary to count data from dropped rows[2].
Static checks
Static checks examine the statements in your queries to look for obvious and immediate privacy concerns, such as exporting user identifiers, any function of user identifiers, or using disallowed functions over fields that contain user-level data. To avoid query errors from static checks, review the best practices and understand which functions are allowed.
Data access budget
Your data access budget limits the total number of times that you can access a given piece of data. Users approaching the end of their budget will be notified with a privacy message of type DATA_ACCESS_BUDGET_IS_NEARLY_EXHAUSTED. You may monitor the budget using the data access budget entry point or by observing budget notifications in the UI.
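As a mental model only, the budget can be pictured as a counter that starts emitting a warning once most of it is used. The sketch below is a toy, not how Ads Data Hub tracks budgets internally: the `AccessBudget` class, the `limit` value, the `warn_fraction` cutoff, and the choice to raise an error on exhaustion are all assumptions made for illustration. Only the message type name comes from the text above.

```python
from dataclasses import dataclass

# Message type named in the documentation above.
NEARLY_EXHAUSTED = "DATA_ACCESS_BUDGET_IS_NEARLY_EXHAUSTED"


@dataclass
class AccessBudget:
    """Toy model of a data access budget (all numbers are hypothetical)."""

    limit: int                  # hypothetical: total accesses allowed
    warn_fraction: float = 0.8  # hypothetical: share of budget that triggers a warning
    used: int = 0

    def record_access(self):
        """Count one access and return the privacy messages it triggers."""
        if self.used >= self.limit:
            # Assumption: the toy model simply refuses further access.
            raise RuntimeError("data access budget exhausted")
        self.used += 1
        if self.used >= self.limit * self.warn_fraction:
            return [NEARLY_EXHAUSTED]
        return []


budget = AccessBudget(limit=5)
messages = [budget.record_access() for _ in range(5)]
# The first three accesses are silent; the fourth and fifth carry the warning.
```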
Aggregation requirements
At the core of Ads Data Hub's privacy checks is the user aggregation threshold. For most queries, you can only receive reporting data on 50 or more users. However, queries that only access clicks and conversions can be used to report on 10 or more users.
- Events with zeroed user IDs count as a single user toward the aggregation threshold, no matter how many actual users created the events.
- Users with null IDs don't count towards the aggregation threshold.
- Learn how noise mode impacts aggregation requirements.
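The counting rules for zeroed and null IDs can be illustrated with a short sketch. It assumes that a zeroed ID appears as the string "0" and a null ID as `None`; both representations, and the function name, are assumptions for illustration rather than a description of the platform's internals.

```python
ZEROED_ID = "0"  # assumption: how a zeroed user ID appears in the data


def count_users_for_threshold(user_ids):
    """Count users toward the aggregation threshold.

    - Null (None) IDs are ignored.
    - All zeroed IDs share one value, so together they count as one user.
    - Every other distinct ID counts once.
    """
    distinct = {uid for uid in user_ids if uid is not None}
    return len(distinct)


# Two real users, three events with a zeroed ID, two events with a null ID.
events = ["a", "b", ZEROED_ID, ZEROED_ID, ZEROED_ID, None, None, "a"]
print(count_users_for_threshold(events))  # 3
```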
Best practice: Configure a filtered row summary to report on data that was omitted. This helps to maintain a consistent baseline in your reports.
In the following example, the row containing campaign 125 would be filtered from the final results, because it aggregates results from 48 users, which is below the 50-user minimum.
Campaign ID | Users | Impressions |
---|---|---|
123 | 314 | 928 |
124 | 2718 | 5772 |
125 | 48 | 353 |
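The filtering in this table can be reproduced with a minimal sketch. It only models the 50-user threshold described above; the function name and row layout are illustrative, and the real checks run inside Ads Data Hub rather than in client code.

```python
MIN_USERS = 50  # threshold for most queries, per the text above

rows = [
    {"campaign_id": 123, "users": 314, "impressions": 928},
    {"campaign_id": 124, "users": 2718, "impressions": 5772},
    {"campaign_id": 125, "users": 48, "impressions": 353},
]


def split_by_threshold(rows, min_users=MIN_USERS):
    """Separate rows that meet the user threshold from rows that don't."""
    kept = [r for r in rows if r["users"] >= min_users]
    filtered = [r for r in rows if r["users"] < min_users]
    return kept, filtered


kept, filtered = split_by_threshold(rows)
print([r["campaign_id"] for r in kept])      # [123, 124]
print([r["campaign_id"] for r in filtered])  # [125]
```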
Privacy modes
Ads Data Hub offers two privacy modes—difference checks and noise injection. The following sections describe and compare these modes.
Use difference checks
Difference checks help ensure that users can't be identified through the comparison of multiple sufficiently aggregated results in the following ways:
- They compare results from the job that you're running to your previous results.
- They compare rows within the same result set.
Difference check violations can be triggered by changes to your underlying data between two jobs. When comparing a job's results to previous results, Ads Data Hub looks for vulnerabilities on the level of individual users. Because of this, even results from different campaigns, or results that report the same number of users, can be filtered if they have a large number of overlapping users.
On the other hand, two aggregated result sets may have the same number of users—appearing identical—but not share individual users, and therefore be privacy-safe, in which case they wouldn't be filtered.
Ads Data Hub uses data from your historical results when considering the vulnerability of a new result. This means that running the same query over and over again creates more data for difference checks to use when considering a new result's vulnerability. Additionally, the underlying data can change, leading to privacy check violations on queries thought to be stable.
When your job-level results differ adequately, but an individual row is similar to a row in a previous job, Ads Data Hub will filter the similar row. For example, if the row containing campaign 123 in the second job's results differs from the previous result by a single user, that row will be filtered.
If the sum of the users in all rows in a result set is similar to that from a previous job, Ads Data Hub will filter the entire result set. In that case, all results from the second job will be filtered.
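The row-level and job-level cases above can be sketched as follows. This is a simplified illustration, not the platform's algorithm: the `MIN_DIFF` value, the use of set differences to measure similarity, the choice to treat identical results as safe, and the data are all assumptions. It also only compares rows that share a key, whereas the real checks can flag overlapping users across different campaigns.

```python
MIN_DIFF = 2  # hypothetical: smallest user-level difference treated as safe


def too_similar(current, previous, min_diff=MIN_DIFF):
    """True if two user sets differ, but by fewer than min_diff users."""
    changed = len(current ^ previous)  # users present in exactly one set
    return 0 < changed < min_diff


def apply_difference_checks(current_job, previous_job, min_diff=MIN_DIFF):
    """Return the rows of current_job that survive both checks.

    Each job maps a row key (for example a campaign ID) to the set of
    user IDs aggregated into that row.
    """
    # Job-level check: compare all users across all rows.
    all_current = set().union(*current_job.values())
    all_previous = set().union(*previous_job.values())
    if too_similar(all_current, all_previous, min_diff):
        return {}  # the entire result set is filtered

    # Row-level check: drop rows that are too close to the same row before.
    return {
        key: users
        for key, users in current_job.items()
        if not (key in previous_job
                and too_similar(users, previous_job[key], min_diff))
    }


first_job = {123: set(range(0, 60)), 124: set(range(100, 200))}
# Campaign 123 gains one user; campaign 124 changes substantially.
second_job = {123: set(range(0, 61)), 124: set(range(150, 260))}
print(sorted(apply_difference_checks(second_job, first_job)))  # [124]
```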
Use noise injection
Noise injection is a technique used to protect user privacy when querying a database. It works by adding random noise to an aggregating SELECT clause of a query. This noise protects user privacy while providing reasonably accurate results, eliminating the need for difference checks and reducing the required aggregation threshold for output. Most existing queries can be executed in noise mode, with some limitations. To learn more about noise mode and how noise injection impacts privacy requirements, see Noise injection.
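To make the idea of noisy aggregates concrete, here is a minimal sketch that adds random noise to a count. The text above doesn't say which noise distribution or scale Ads Data Hub uses, so the Laplace distribution (a common choice in differential privacy) and the `scale` value are assumptions chosen for illustration, as are the function names.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_count(true_count, scale=5.0, rng=None):
    """Return true_count plus random noise, rounded to a whole number.

    The scale is hypothetical; larger values mean more noise.
    """
    if rng is None:
        rng = random.Random()
    return round(true_count + laplace_noise(scale, rng))


# Individual results vary, but their average stays close to the true count.
rng = random.Random(0)
samples = [noisy_count(1000, rng=rng) for _ in range(10_000)]
average = sum(samples) / len(samples)
```

Any single noisy value hides small changes in the underlying users, while the average over many samples stays near the true count of 1000. This is the sense in which results remain "reasonably accurate".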
Compare difference checks to noise injection
Filtered row summary
Filtered row summaries tally data that was filtered due to privacy checks. Data from filtered rows is summed and added to a catch-all row. While the filtered data can't be further analyzed, it provides a summary of how much data was filtered from the results.
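A filtered row summary can be pictured as a single catch-all row built from the dropped rows. The sketch below assumes that filtered rows contain disjoint users (so user counts can be summed), that the summary itself must meet the 50-user threshold, and that the catch-all row is labelled with a placeholder string. The second data row and all names are hypothetical.

```python
MIN_USERS = 50  # threshold for most queries, per the aggregation requirements


def summarize_filtered(filtered_rows, min_users=MIN_USERS):
    """Sum filtered rows into one catch-all row.

    Returns None when the combined rows still don't contain enough users,
    in which case no summary can be shown.
    """
    total_users = sum(r["users"] for r in filtered_rows)
    if total_users < min_users:
        return None
    return {
        "campaign_id": "filtered row summary",  # placeholder label
        "users": total_users,
        "impressions": sum(r["impressions"] for r in filtered_rows),
    }


filtered_rows = [
    {"campaign_id": 125, "users": 48, "impressions": 353},
    {"campaign_id": 126, "users": 30, "impressions": 120},  # hypothetical row
]
summary = summarize_filtered(filtered_rows)
print(summary["users"], summary["impressions"])  # 78 473
print(summarize_filtered(filtered_rows[:1]))     # None
```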
Explicit privacy filtering
In cases where you need to break your query up but want to combine the aggregated results, you can explicitly apply privacy checks to several smaller queries and then aggregate those results together in a privacy-safe way.
Example use cases:
- You are an advertiser looking for all conversions by attribution event type in your linked Google Ads account, which includes EEA data.
- You are a measurement partner looking for all conversions by attribution event type in your linked Google Ads account.
To get the sum of conversions for your Google Ads account, you can rewrite the query using an OPTIONS(privacy_checked_export=TRUE) clause to apply privacy checks to each Google service individually.
The example rewrite in this section does the following:
- It queries each Google service individually, explicitly applying privacy checks to each intermediate results set.
- It creates a separate temp table for the privacy-checked results of each Google service: YouTube, Gmail, and Network.
- It aggregates and sums the privacy-checked conversion counts from the temp tables.
-- Privacy-checked conversion counts for YouTube.
CREATE TEMP TABLE youtube_agg OPTIONS(privacy_checked_export=TRUE) AS
SELECT
  impression_data.campaign_id,
  attribution_event_type,
  COUNT(1) AS num_convs
FROM adh.google_ads_conversions_policy_isolated_youtube
WHERE impression_data.campaign_id IN UNNEST(@campaign_ids)
  AND conversion_type IN UNNEST(@conversion_type_list)
GROUP BY campaign_id, attribution_event_type;

-- Privacy-checked conversion counts for Network.
CREATE TEMP TABLE network_agg OPTIONS(privacy_checked_export=TRUE) AS
SELECT
  impression_data.campaign_id,
  attribution_event_type,
  COUNT(1) AS num_convs
FROM adh.google_ads_conversions_policy_isolated_network
WHERE impression_data.campaign_id IN UNNEST(@campaign_ids)
  AND conversion_type IN UNNEST(@conversion_type_list)
GROUP BY campaign_id, attribution_event_type;

-- Privacy-checked conversion counts for Gmail.
CREATE TEMP TABLE gmail_agg OPTIONS(privacy_checked_export=TRUE) AS
SELECT
  impression_data.campaign_id,
  attribution_event_type,
  COUNT(1) AS num_convs
FROM adh.google_ads_conversions_policy_isolated_gmail
WHERE impression_data.campaign_id IN UNNEST(@campaign_ids)
  AND conversion_type IN UNNEST(@conversion_type_list)
GROUP BY campaign_id, attribution_event_type;

-- Combine the privacy-checked results and sum conversions across services.
SELECT
  campaign_id,
  attribution_event_type,
  SUM(num_convs) AS num_convs
FROM (
  SELECT * FROM youtube_agg
  UNION ALL
  SELECT * FROM network_agg
  UNION ALL
  SELECT * FROM gmail_agg
)
GROUP BY campaign_id, attribution_event_type
Note that this query does not use a JOIN to directly combine data between the tables. Instead, it runs the query for each table first, applies privacy checks to each intermediate table, then uses UNION ALL to combine the privacy-checked rows and SUM to total their values.
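The order of operations (check each service first, then combine) can be mirrored in a small simulation. This sketch stands in for the SQL flow only loosely: `privacy_checked` applies just the 10-user threshold that the aggregation requirements give for conversion-only queries, not the full set of privacy checks. The `users` field, the event type labels, and all data values are hypothetical.

```python
from collections import defaultdict

MIN_USERS = 10  # conversions-only threshold from the aggregation requirements


def privacy_checked(rows, min_users=MIN_USERS):
    """Stand-in for privacy_checked_export: keep rows with enough users."""
    return [r for r in rows if r["users"] >= min_users]


def combine(*tables):
    """Mimic UNION ALL followed by SUM(num_convs) ... GROUP BY."""
    totals = defaultdict(int)
    for table in tables:
        for r in table:
            key = (r["campaign_id"], r["attribution_event_type"])
            totals[key] += r["num_convs"]
    return dict(totals)


# Hypothetical per-service aggregates before privacy checks.
youtube = [
    {"campaign_id": 1, "attribution_event_type": "CLICK", "num_convs": 40, "users": 25},
    {"campaign_id": 1, "attribution_event_type": "VIEW", "num_convs": 5, "users": 4},
]
network = [
    {"campaign_id": 1, "attribution_event_type": "CLICK", "num_convs": 30, "users": 20},
]
gmail = [
    {"campaign_id": 1, "attribution_event_type": "CLICK", "num_convs": 3, "users": 3},
]

result = combine(*(privacy_checked(t) for t in (youtube, network, gmail)))
print(result)  # {(1, 'CLICK'): 70}
```

Rows that fail the check in any one service are dropped before the totals are computed, so the combined figure only ever contains privacy-checked values.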
Query advisor
If your SQL is valid but might trigger excessive filtering, the query advisor surfaces actionable advice during the query development process, to help you avoid undesirable results.
Triggers include the following patterns:
- Joining aggregated subqueries
- Joining unaggregated data with potentially different users
- Recursively defined temp tables
To use the query advisor:
- UI. Recommendations will be surfaced in the query editor, above the query text.
- API. Use the customers.analysisQueries.validate method.
[1] Other than data they have consented to share, such as in the case of panelists.
[2] Unless prevented by privacy restrictions, such as when the users in a filtered row summary don't meet aggregation requirements.