Receive and store aggregatable reports

This guide describes how encrypted measurement reports are delivered to ad tech providers. Chrome clients send these reports to designated reporting endpoints hosted by the ad tech platform, where the platform receives and stores them as aggregatable reports. The endpoints are located at .well-known URLs within the provider's reporting origin and are available to ad tech providers using the Attribution Reporting API or the Private Aggregation API.

Figure 1. Aggregation Service: aggregatable report processing. The diagram shows the process of receiving and storing aggregatable reports within the Privacy Sandbox Aggregation Service.

The following steps describe how aggregatable reports are received, stored, and processed:

  1. When triggered, the browser sends aggregatable reports containing encrypted cross-site and conversion data.
  2. The browser delivers the encrypted reports to a .well-known URL within the ad tech's reporting origin.
  3. The ad tech collects the reports into batches and forwards them to the Aggregation Service for processing.
  4. The Aggregation Service aggregates the reports into summary data.
  5. The Aggregation Service adds noise to the summarized data to protect user privacy.
  6. The summary reports are made available to the ad tech for analysis and measurement.

The following describes the debug and live endpoints for the Private Aggregation API and the Attribution Reporting API:

Private Aggregation API

  • Debug endpoint (for Private Aggregation API development and testing):
    [reporting-origin]/.well-known/private-aggregation/debug/report-shared-storage
  • Live endpoints (receive and process measurement reports in live environments):
    [reporting-origin]/.well-known/private-aggregation/report-shared-storage
    [reporting-origin]/.well-known/private-aggregation/report-protected-audience

Attribution Reporting API

  • Debug endpoint (for Attribution Reporting API development and testing):
    [reporting-origin]/.well-known/attribution-reporting/debug/report-aggregate-attribution
  • Live endpoint (production endpoint that receives and processes aggregate attribution reports for measurement):
    [reporting-origin]/.well-known/attribution-reporting/report-aggregate-attribution

Reporting origins receive JSON reports through POST requests. The ad tech then transforms these reports into Avro format and writes them to cloud storage. Once a batch of Avro reports is ready, it is made available to the Aggregation Service for summarization.
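
The following is a minimal sketch of such a collection endpoint, assuming a Python server built with Flask. The route matches the Private Aggregation API live endpoint shown above; the framework, port, and local staging directory are illustrative choices, and a production collector would typically write directly to cloud storage.

  # Minimal sketch of a collection endpoint (assumes Flask is installed).
  # The staging directory and port are illustrative; production collectors
  # typically write reports directly to cloud storage.
  import json
  import uuid
  from pathlib import Path

  from flask import Flask, request

  app = Flask(__name__)
  REPORT_DIR = Path("collected-reports")  # hypothetical local staging directory
  REPORT_DIR.mkdir(exist_ok=True)

  @app.route("/.well-known/private-aggregation/report-shared-storage", methods=["POST"])
  def collect_report():
      # The browser POSTs one JSON aggregatable report per request.
      report = request.get_json(force=True)
      (REPORT_DIR / f"{uuid.uuid4()}.json").write_text(json.dumps(report))
      return "", 200

  if __name__ == "__main__":
      app.run(port=8080)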

Ad tech platforms trigger an aggregation job request to the Aggregation Service when a batch of Avro reports is ready for processing. The Aggregation Service, hosted within the ad tech platform's cloud environment, retrieves the required Avro reports from that same storage location. For security purposes, the Aggregation Service must be configured to use an approved container image. Refer to the privacysandbox/aggregation-service GitHub repository for available container images.
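
For illustration, the following sketch shows how an aggregation job request might be triggered over HTTP. The field names follow the createJob parameters referenced later in this guide (for example, output_data_bucket_name, output_data_blob_prefix, and debug_run), but the endpoint URL, bucket names, and exact field set are assumptions; confirm them against the aggregation-service API documentation.

  # Illustrative sketch of triggering an aggregation job via createJob.
  # The endpoint URL and field names are assumptions based on the createJob
  # parameters mentioned in this guide; confirm against the
  # privacysandbox/aggregation-service API documentation.
  import requests

  create_job_request = {
      "job_request_id": "2024-06-01-batch-001",            # unique ID for this batch
      "input_data_bucket_name": "my-reports-bucket",       # bucket holding the report Avro files
      "input_data_blob_prefix": "reports/2024-06-01/",     # prefix of the batched reports
      "output_data_bucket_name": "my-summary-bucket",      # where the summary report is written
      "output_data_blob_prefix": "summary/summary.avro",   # see the Summary reports section
      "job_parameters": {
          "output_domain_bucket_name": "my-domain-bucket",  # bucket holding the domain Avro file
          "output_domain_blob_prefix": "domain/domain.avro",
          "attribution_report_to": "https://privacy-sandbox-demos-dsp.dev",
          "debug_run": "true",                              # also produce a debug summary report
      },
  }

  response = requests.post(
      "https://<aggregation-service-api>/v1alpha/createJob",  # placeholder URL
      json=create_job_request,
  )
  response.raise_for_status()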

The following are representative examples of the reports returned by each API:

  • Private Aggregation API report example:
  {
    "aggregation_coordinator_origin": "https://publickeyservice.msmt.aws.privacysandboxservices.com",
    "aggregation_service_payloads": [ {
        "key_id": "1a2baa3f-5d48-46cf-91f0-772633c12640",
        "payload": "8Cjr1s3FVkCYkjzBvyzJn14yardVjd5N4vLCA69LQAPbIkJ0B58hAqUGBCNXpvTjW9ZpIoZbCSiUOsUDuoA/S+tqVolLMkame6sWC07cfUmZcVsbU+La3pzTMtCgdtNc8MIWgD3C63CMw7rWroRlechewVUajvAYVK/0HJq0YyGrTiFZZm36zi0jjyHLAXKV8p1Lvy1d0o/wnBxC5oVo5BV6LPkxqQEcoYS2GyixUuht6wD0RzuH+BxxuH6vY/ynp2xDrnwftjvqwDUAxUWLFTunthM6BXZVxlrvOBim1h2dvPqWSyKZ5gafo+MgW9EM4SraavNM3XzZSCjdtAfSMJMrynSu2j0opyAq+9e1jq1xeYN00yZrJ0Y/GTI45IGjgCnVmvmuoI9ucW2SnXP31CQBwHqk4gtUgMsYGFSUYfhtnAQ/8TSbaXyS2LX+cQW87LqkvIraWw6o37O24VFBreFoFFXpu3IUeCZfji+Sr4/ykfZuHeMzQbBavyNnHKzPZlbLSXMiucx4/vWzYyOzHeIlbtupXVvbi40V2PieDShaSbjI266kGgFkeCk6z51AaAGebDPtRT1lhBpcoQ6JdF0Yp5VWSnyFARKFtCZ1aEBrlUlrEHLUQY/pFtmDxJQiicRz1YPjR8jRr3C7hlRhWwov0dMocqnMz5209hHGVZWSsaGc9kWjtxREW2ULXfoIwOGbX+WZsyFW2RhXksQPJ5fhyNc4ROkAzUthLb68gC5e0yZHvmLIAU4hcWe0UanJv+jRljn8PAPaJHKFUxQNJyBA7mTbn5mkpycxGrX6T3ZYdPHqvckqt9llJZWjr8NneizzZFRuJk423BDs38fXkvcTAsAckd2Zu0u2KC45WR93sN2/CWrqB7/QU9BsgNdonl/ehAWhU1LbcRRvBTcR9+0wL7vRL7cv5LG3+gRYRKsWI6U2nDSWp0cNpo9+HU0JNiifa5X0cguihqU2bSk6ABozgRtCZ7m+7eqWXMLSzBdmc1CPUoQppo6Wmf6ujdNqI6v2S6pDH781lph8Z2v7ZpxGdhVVPEL51cVn"
    } ],
    "debug_key": "1234",
    "shared_info": "{\"api\":\"shared-storage\",\"report_id\":\"05e3b948-cb8d-4404-be29-bfeac7ad9710\",\"reporting_origin\":\"https://privacy-sandbox-demos-dsp.dev\",\"scheduled_report_time\":\"1707784729\",\"version\":\"0.1\"}"
  }
  • Attribution Reporting API report example:
  {
    "aggregation_coordinator_origin": "https://publickeyservice.msmt.aws.privacysandboxservices.com",
    "aggregation_service_payloads": [ {
        "key_id": "2dee0f3f-2aee-4a4a-8238-9154ed3d6f72",
        "payload": "pHvTHhcxvNKaCmnLpvYQsXlJpiNRuFO5Zj1QqUlqgWPOfuoHLfiXiFjmpvY8a53/OYnS4bKwHwJReFcofldsu8E9BzTTJ3CEk+B7vbEjnDPaljhpIBMTuQXy3QHGK4slWR/yNZVm2uXRWR/DVVzXziBoTDjN7qaPstRoLKUUMdfY2u8oq4tnLY00Y+NDZttZ4wJvC7hPmvY3lqHjdl14JPD2ytZZ4NViYzno3WKdH/oZc0jhGK4zI38lAM0qpahF/B9yb4zOu7IRIjQpNx73P8naDyddxLldoVlW/qHpO04FguWymscvI/8i6NwUR6Kj8seRlWS0iIUhETt/ai3lilKUHUb+uz0YG2kxjoXq7Ldk+MP56nNl67ZRNi2YZ7bOGI/okYWoT/wt2uWPe/5xAEMmadxl0hQQrG7YXHRSD8rDnaVPXo+AKIxdg727yJeB1ZENZvovl/kIevdRAmdBe2h1U3J6Uz6psly/46fvjgkj5QD+kO2uaYirzvmwS19luJsN/Qvh/R3ZO4qlJIQI0nDJPWwUJ4ODpyVmj4a0xQp3t2ESEnf4EmY7+khn3xpF5+MwEWKES2ZeDf7SHalR99pvZA8G3Fr8M0PWFmT00cmKCBwpQgZyd3Eay70UlqdkbFEedxiCVWKNNOUz41m5KG/7K3aR+dYx57l57Wct4gOFQg3jiUEBJWrFIVCXf12BT5iz5rBQh1N1CUt2oCOhYL/sPuBl6OV5GWHSIj8FUdpoDolqKXWINXfE88MUijE2ghNRpJN25BXIErUQtO9wFQv7zotC6d2BIaF0x8AkKg/7yzBQRySX/FZP3H3lMkpOz9rQMV8DjZ2lz7nV4k6CFo8qhT6cpYJD7GpYl81xJbglNqcJt5Pe5YUHrdBMyAFsTh3yoJvYnhQib/0xVN/a93lbYccxsd0yi375n4Xz0i1HUoe2ps+WlU8XysAUA1agG936eshaY1anTtbJbrcoaH+BNSacKiq4saprgUGl4eDjaR/uBhvUnO52WkmAGon8De3EFMZ/kwpPBNSXi7/MIAMjotsSKBc19bfg"
    } ],
    "shared_info": "{\"api\":\"attribution-reporting\",\"attribution_destination\":\"https://privacy-sandbox-demos-shop.dev\",\"report_id\":\"5b052748-f5fb-4f14-b291-de03484ed59e\",\"reporting_origin\":\"https://privacy-sandbox-demos-dsp.dev\",\"scheduled_report_time\":\"1707786751\",\"source_registration_time\":\"0\",\"version\":\"0.1\"}",
    "source_debug_key": "123456789",
    "trigger_debug_key": "123456789"
  }

Convert JSON to Avro reports

Aggregatable reports must be in the Apache Avro data serialization format for batching. To create an Avro report, you must use an AVSC schema, which defines the Avro record structure and data types. For an AVSC schema example, see the example.avsc file in the avrodoc/schemata GitHub repository.

You can find example JavaScript code in the Collect, transform and batch reports section of the Collecting and Batching Aggregatable Reports page located in the privacysandbox/aggregation-service GitHub repository.

You can store all reports in a single Avro file or distribute them across multiple files. While Avro files have no size limit, optimal performance is typically achieved when the number of files ranges from the number of CPUs in your cloud instance up to 1000.

The following code example shows an Avro schema for aggregatable reports. The report fields include payload, key_id, and shared_info.

  {
    "type": "record",
    "name": "AggregatableReport",
    "fields": [
      {
        "name": "payload",
        "type": "bytes"
      },
      {
        "name": "key_id",
        "type": "string"
      },
      {
        "name": "shared_info",
        "type": "string"
      }
    ]
  }

The report fields are used as follows:

  • payload (bytes): The payload from the report, base64 decoded and converted to a byte array. Used for live (production) reports.
  • debug_cleartext_payload (bytes): For debug reports, the debug_cleartext_payload from the report, base64 decoded and converted to a byte array.
  • key_id (string): The key_id string found in the report. The key_id is a 128-bit universally unique identifier.
  • shared_info (string): The unaltered, untampered string found in the report's shared_info field.

The following is an example JSON report:

{
   "aggregation_coordinator_identifier": "aws-cloud",
   "aggregation_service_payloads": [{
      "debug_cleartext_payload": "omRkYXhgaJldmFsdWVEAAAAgGZidWNrZXRQAAAAAAAAAAAAAAAAAAAFWW1vcGVyYX",
      "key_id": "3c6e2850-edf6-4886-eb70-eb3f2a7a7596",
      "payload": "oapYz92Mb1yam9YQ2AnK8dduTt2RwFUSApGcKqXnG1q+aGXfJ5DGpSxMj0NxdZgp7Cq"
   }],
   "debug_key": "1234",
   "shared_info":
"{\"api\":\"shared-storage\",\"debug_mode\":\"enabled\",\"report_id\":\"b029b922-93e9-4d66-a8c6-8cdeec762aed\",\"reporting_origin\":\"https://privacy-sandbox-demos-dsp.dev\",\"scheduled_report_time\":\"1719251997\",\"version\":\"0.1\"}"
}
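
As a rough sketch of the conversion, the following Python snippet uses the fastavro library to decode the base64 payload (or debug_cleartext_payload, for debug reports) from each collected JSON report and write a record matching the schema above. The schema filename, staging directory, and the assumption of a single payload per report are illustrative.

  # Sketch: convert collected JSON reports to Avro using the schema above.
  # Assumes the fastavro library; file paths are illustrative.
  import base64
  import json
  from pathlib import Path

  from fastavro import writer
  from fastavro.schema import load_schema

  schema = load_schema("reports.avsc")  # the AggregatableReport schema shown above

  def to_avro_record(report: dict, debug: bool = False) -> dict:
      # Assumes a single entry in aggregation_service_payloads.
      payload_obj = report["aggregation_service_payloads"][0]
      encoded = payload_obj["debug_cleartext_payload"] if debug else payload_obj["payload"]
      return {
          "payload": base64.b64decode(encoded),  # base64 decode to a byte array
          "key_id": payload_obj["key_id"],
          "shared_info": report["shared_info"],  # keep the string exactly as received
      }

  reports = [json.loads(p.read_text()) for p in Path("collected-reports").glob("*.json")]
  with open("reports.avro", "wb") as out:
      writer(out, schema, [to_avro_record(r) for r in reports])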

Domain file specification

Generating summary reports with the Aggregation Service requires your aggregatable reports (the JSON reports converted to Avro) and the associated output domain file. The output domain contains the pre-declared aggregation keys from your aggregatable reports that you want included in the summary report. You'll find details about these aggregation keys in Understanding aggregation keys for Attribution Reporting and the Aggregation key section of Private Aggregation API fundamentals. The output domain file contains a bucket field, which holds your bucket key value.

The domain file must be in Avro format using the following schema:

  {
    "type": "record",
    "name": "AggregationBucket",
    "fields": [
      {
        "name": "bucket",
        "type": "bytes",
        "doc": "A single bucket that appears in the aggregation service output. It is an 128-bit integer value encoded as a 16-byte big-endian bytestring."
      }
    ]
  }

Bucket key

The bucket key within the output domain must be represented as a hex byte string.

For example:

If the bucket key is the decimal value 1369:

  1. Convert 1369 to its hexadecimal equivalent: 559

  2. Convert the hexadecimal string "559" into a byte string.

This byte string representation of the bucket key should then be included in the output domain Avro schema.

Important considerations:

  • Data type: The bucket key within the Avro schema must be defined as the bytes type to hold the hex byte string representation.

  • Conversion: The conversion from decimal to hex and then to a byte string can be implemented in Python or Java; see the sketch below.

This approach ensures that the bucket key is properly formatted and compatible with the expected data type within the Avro schema for the output domain.
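
A minimal Python sketch of this conversion, assuming the fastavro library, is shown below. It encodes the decimal bucket key 1369 (hexadecimal 559) as a big-endian bytestring padded to 16 bytes, per the schema's doc string, and writes it into an output domain Avro file; the output filename is illustrative.

  # Sketch: encode bucket keys and write the output domain file.
  # Assumes the fastavro library; the output filename is illustrative.
  from fastavro import parse_schema, writer

  domain_schema = parse_schema({
      "type": "record",
      "name": "AggregationBucket",
      "fields": [{"name": "bucket", "type": "bytes"}],
  })

  bucket_keys = [1369]  # decimal 1369 == hexadecimal 0x559

  # Encode each key as a 16-byte big-endian bytestring
  # (b'\x00' * 14 + b'\x05\x59' for 1369).
  records = [{"bucket": key.to_bytes(16, byteorder="big")} for key in bucket_keys]

  with open("output_domain.avro", "wb") as out:
      writer(out, domain_schema, records)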

Figure 2. The diagram illustrates the transformation of a bucket key (for example, the decimal value 1369 converted to hexadecimal 559) into a byte string representation, ultimately used to populate the output domain Avro file.

Batch reports

For details on privacy budgets and batching strategies, see the batching strategies documentation. Note that aggregatable reports have a MAX_REPORT_AGE limit (currently 90 days) between their scheduled_report_time and the batch run date.
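
As a simple illustration of that age limit, the following sketch checks a report's scheduled_report_time (carried inside the shared_info string) against a 90-day cutoff before the report is added to a batch; the 90-day value reflects the current MAX_REPORT_AGE noted above.

  # Sketch: exclude reports older than MAX_REPORT_AGE (currently 90 days)
  # based on the scheduled_report_time inside shared_info.
  import json
  import time

  MAX_REPORT_AGE_SECONDS = 90 * 24 * 60 * 60

  def is_batchable(report: dict) -> bool:
      shared_info = json.loads(report["shared_info"])
      scheduled = int(shared_info["scheduled_report_time"])  # Unix timestamp in seconds
      return (time.time() - scheduled) <= MAX_REPORT_AGE_SECONDS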

Summary reports

After batching, the Aggregation Service creates the summary report in Avro format using the results.avsc schema.

Upon job completion, the summary report is stored at the output_data_blob_prefix path within the output_data_bucket_name bucket, as specified in the createJob request.

When debug_run is enabled for an Aggregation Service batch, the service creates two reports: the summary report and the debug summary report. The debug summary report is located in the output_data_blob_prefix/debug folder and uses the debug_results.avsc schema.

Both the summary and debug summary reports are named [output_data_blob_prefix]-1-of-1.avro. For example, if your output_data_blob_prefix is summary/summary.avro, the summary report is found in the summary folder with the name summary-1-of-1.avro.

results.avsc example

The following is an example Avro schema for results.avsc:

{
  "type": "record",
  "name": "AggregatedFact",
  "fields": [
    {
      "name": "bucket",
      "type": "bytes",
      "doc": "Histogram bucket used in aggregation. It is an 128-bit integer value encoded as a 16-byte big-endian bytestring. Leading 0-bits are left out."
    },
    {
      "name": "metric",
      "type": "long",
      "doc": "The metric associated with the bucket"
    }
  ]
}

The example Avro schema defines a record named AggregatedFact.

debug_results.avsc example

The following is an example Avro schema for debug_results.avsc:

  {
    "type": "record",
    "name": "DebugAggregatedFact",
    "fields": [
      {
        "name": "bucket",
        "type": "bytes",
        "doc": "This represents the histogram bucket used in aggregation. It's a 128-bit integer, encoded as a 16-byte big-endian bytestring, with leading zero bytes omitted."
      },
      {
        "name": "unnoised_metric",
        "type": "long",
        "doc": "The raw metric for the bucket."
      },
      {
        "name": "noise",
        "type": "long",
        "doc": "The noise applied to the metric in the regular result."
      },
      {
        "name": "annotations",
        "type": {
          "type": "array",
          "items": {
            "type": "enum",
            "name": "bucket_tags",
            "symbols": ["in_domain", "in_reports"]
          }
        }
      }
    ]
  }

Once converted, your summary report resembles the results.json example below. When debug_run is enabled, the debug summary report is similar to the debug_results.json example.

results.json example

Avro reports received from the Aggregation Service typically adhere to a consistent format. The Avro report format includes the following fields:

  • bucket: The bucket key for the aggregated data (for example, "\u0005Y").

  • metric: The aggregated value for the corresponding bucket. This value often includes added noise to enhance privacy.

    For example:

  {
    "bucket": "\u0005Y",
    "metric": 26308
  }
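
To illustrate how such a record can be read back, the following sketch (assuming the fastavro library) iterates over a summary report Avro file and decodes each big-endian bucket bytestring into an integer; the file name follows the naming convention described above.

  # Sketch: read a summary report and decode the bucket keys.
  # Assumes the fastavro library; the file name follows the convention above.
  from fastavro import reader

  with open("summary-1-of-1.avro", "rb") as f:
      for record in reader(f):
          bucket_key = int.from_bytes(record["bucket"], byteorder="big")
          print(bucket_key, record["metric"])  # for example: 1369 26308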

debug_results.json example

Debug Avro reports from the Aggregation Service resemble the following debug_results.json example. These reports include the bucket keys, the unnoised_metric (the bucket key's summary value before noise is applied), and the noise added to that metric.

  {
    "bucket": "\u0005Y",
    "unnoised_metric": 128,
    "noise": -17948,
    "annotations": [
      "in_reports",
      "in_domain"
    ]
  }

The annotations field contains the following values:

  • in_reports: the bucket key is present in the aggregatable reports

  • in_domain: the bucket key is present in the output_domain Avro file