The ingestion process for DTv2 is similar to DTv1, but you will need to change some key things
during your migration.
Processing Files
DTv2 processes hourly files (impression, click, and rich media) and daily files (activity and match tables).
Hourly files (file name contains YYYYMMDDHH) are processed in UTC. Daily files (file name contains YYYYMMDD) are processed in your local reporting time
zone (see the file name format under Bucket Names). The offset between your local time and UTC will differ depending on your location.
Files are also processed independently, and their processing time can vary. It's common for a
later hour file to finish processing before an earlier file (like hour 6 before hour 5).
Don't rely on file order for your ingestion process, otherwise your process may stall.
Events can appear in earlier or later processed files relative to their event time.
Events are not always processed in the hour in which they occurred.
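One way to avoid depending on file order is to track which file names have already been ingested and pick up anything new, whatever hour it covers. The following is a minimal sketch of that idea; the function name `select_unprocessed` and the second sample file name are hypothetical, and how you list the bucket and persist the processed set is up to your own pipeline.

```python
def select_unprocessed(available, processed):
    """Return file names that have not been ingested yet, ignoring hour order."""
    return sorted(name for name in available if name not in processed)


# Hypothetical state: the hour 6 file arrived and was ingested before hour 5.
processed = {
    "dcm_account1234_click_2016072706_20160727_081500_100000002.csv.gz",
}
available = [
    "dcm_account1234_click_2016072706_20160727_081500_100000002.csv.gz",
    "dcm_account1234_click_2016072705_20160727_083000_100000003.csv.gz",
]

# The late-arriving hour 5 file is still picked up.
pending = select_unprocessed(available, processed)
print(pending)
```

Because the check is by file name rather than by "last hour seen", a late-arriving earlier hour does not stall the process.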
Column Order
Please do not rely on column order as a mechanism to consume your files.
We want your processing to be resilient to change; if you request extra columns
or if we make changes in the future then the order and number of columns in your reports can change.
Read the header row of each file and map this to fields in your data warehouse before attempting to
write data.
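As an illustration, the header row can be read with Python's standard `csv.DictReader`, which keys each row by column name instead of position. This is a sketch only: the warehouse column names in `FIELD_MAP` and the sample data are hypothetical, and the three header names are taken from the field table in this guide.

```python
import csv
import io

# Hypothetical mapping from DTv2 header names to warehouse column names.
FIELD_MAP = {
    "User ID": "user_id",
    "Advertiser ID": "advertiser_id",
    "Campaign ID": "campaign_id",
}


def rows_by_header(text_stream, field_map=FIELD_MAP):
    """Yield one dict per data row, keyed by warehouse column name."""
    reader = csv.DictReader(text_stream)
    header = reader.fieldnames or []
    missing = [name for name in field_map if name not in header]
    if missing:
        raise ValueError(f"file is missing expected columns: {missing}")
    for row in reader:
        # Columns not listed in field_map (e.g. newly added ones) are ignored.
        yield {dest: row[src] for src, dest in field_map.items()}


# Sample data with a different column order and an extra column.
sample = io.StringIO("Campaign ID,Extra Column,User ID,Advertiser ID\n42,x,abc,7\n")
records = list(rows_by_header(sample))
print(records)
```

For a real file, the text stream could come from `gzip.open(path, "rt", newline="")`.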
Duplicate Files
Sometimes duplicate files are written for the same date and hour. If more than one file has the
same date/hour stamp, use the one with the latest minutes / seconds, based on the filename
timestamp. Duplicate files are created because back end processes determined there was an issue
with the original file.
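The rule above can be sketched as follows. The regular expression is an assumption inferred from the example file names in this guide (data period, then generation timestamp, then file ID); the function name and the two extra sample names are hypothetical.

```python
import re

# Assumed layout: <prefix>_<YYYYMMDD or YYYYMMDDHH>_<YYYYMMDD_HHMMSS>_<file id>.csv.gz
FILE_RE = re.compile(
    r"^(?P<stem>.+)_(?P<period>\d{8}|\d{10})"
    r"_(?P<generated>\d{8}_\d{6})_(?P<file_id>\d+)\.csv\.gz$"
)


def latest_per_period(filenames):
    """Keep, for each file type and date/hour, the most recently generated file."""
    latest = {}
    for name in filenames:
        m = FILE_RE.match(name)
        if m is None:
            continue  # not a recognised Data Transfer file name
        key = (m["stem"], m["period"])
        # YYYYMMDD_HHMMSS strings sort chronologically, so string comparison works.
        if key not in latest or m["generated"] > latest[key][0]:
            latest[key] = (m["generated"], name)
    return sorted(name for _, name in latest.values())


files = [
    "dcm_account1234_click_2016072717_20160728_012331_268381796.csv.gz",
    # Hypothetical reissue of the same hour, generated later.
    "dcm_account1234_click_2016072717_20160728_031502_268400001.csv.gz",
    # Hypothetical file for the next hour.
    "dcm_account1234_click_2016072718_20160728_021000_268390000.csv.gz",
]
chosen = latest_per_period(files)
print(chosen)
```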
Fields
The field names and field name format have all changed, but there is a mapping from old to new
(where available) at DCM Field migration including Match Tables.
You may also see DBM fields in your file. If you are a DBM user, these fields populate only
when the relevant permission is granted at the DBM advertiser level where advertisers are linked.
If you are not a DBM user, these fields will be empty and you can ignore them.
There is a mapping from old to new (where available) at DBM Field migration. DBM entity read files
will still be used for mapping purposes.
Bucket Names
The naming standards for DTv1 and DTv2 are different. Specifically, you can't change the prefix on
your existing bucket name to work out the DTv2 bucket name. Your support representative will give
you the bucket name when your account is set up.
Generally, DTv2 bucket names look like gs://dcdt_-dcm_account1234
Each file name contains a string of numbers, for example: dcm_account1234_impression_2016022601_20160225_234912_218211994.csv.gz
2016022601 is in YYYYMMDDHH format. This is the UTC hour for events in that file (hours are numbered 0 to 23).
20160225_234912 is in YYYYMMDD_HHMMSS format. This is the time at which the report was generated.
218211994 is the file ID.
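As a sketch of how these parts might be pulled out of an hourly file name, the example below uses a regular expression inferred from the sample above; the pattern and the function name `parse_hourly_name` are assumptions, not an official specification. The generation timestamp is left without a time zone because this guide does not state which zone it uses.

```python
import re
from datetime import datetime, timezone

# Assumed layout: <prefix>_<YYYYMMDDHH>_<YYYYMMDD_HHMMSS>_<file id>.csv.gz
HOURLY_RE = re.compile(
    r"^(?P<prefix>.+)_(?P<hour>\d{10})"
    r"_(?P<generated>\d{8}_\d{6})_(?P<file_id>\d+)\.csv\.gz$"
)


def parse_hourly_name(name):
    """Split an hourly file name into prefix, event hour, generation time, and file ID."""
    m = HOURLY_RE.match(name)
    if m is None:
        raise ValueError(f"not an hourly file name: {name}")
    return {
        "prefix": m["prefix"],
        # The event hour is documented as UTC.
        "event_hour_utc": datetime.strptime(m["hour"], "%Y%m%d%H").replace(
            tzinfo=timezone.utc
        ),
        # Time zone not stated in the guide, so kept as a naive datetime.
        "generated": datetime.strptime(m["generated"], "%Y%m%d_%H%M%S"),
        "file_id": m["file_id"],
    }


info = parse_hourly_name(
    "dcm_account1234_impression_2016022601_20160225_234912_218211994.csv.gz"
)
print(info)
```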
Activity Files
You get one Activity file per day and the filename looks like this: dcm_account1234_activity_20160727_20160728_035750_268669761.csv.gz
This file contains data for 27 July 2016 and was generated at 3:57:50 on 28 July 2016.
Click Files
You get twenty-four Click files per day and the filename looks like this: dcm_account1234_click_2016072717_20160728_012331_268381796.csv.gz
Take note of the UTC hour after the date string in the filename; hours are numbered from 0 to 23, so
a 17 here indicates that the events in this file are for 17:00 to 17:59 UTC (24-hour clock), that is, 5:00pm to 6:00pm.
This file contains data for 5:00pm to 6:00pm UTC, 27 July 2016 and was generated at 1:23:31 on
28 July 2016.
Impression Files
You get twenty-four Impression files per day and the filename looks like this: dcm_account7312_impression_2016072717_20160728_012355_268381795.csv.gz
Other than the filename, Impression files behave the same as Click files.
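Because hourly files can arrive out of order, it may be useful to check which of the twenty-four hours of a UTC day have arrived so far. The sketch below assumes the file name layout shown in the examples above; the pattern, the function name `missing_hours`, and the generated sample names are hypothetical.

```python
import re

# Matches the hourly part of a name: _<YYYYMMDD><HH>_<YYYYMMDD_HHMMSS>_<file id>.csv.gz
HOUR_RE = re.compile(r"_(\d{8})(\d{2})_\d{8}_\d{6}_\d+\.csv\.gz$")


def missing_hours(filenames, day):
    """Return the UTC hours (0 to 23) of `day` (YYYYMMDD) with no file yet."""
    seen = set()
    for name in filenames:
        m = HOUR_RE.search(name)
        if m and m.group(1) == day:
            seen.add(int(m.group(2)))
    return [hour for hour in range(24) if hour not in seen]


# Hypothetical listing in which the hour 5 file has not arrived yet.
names = [
    f"dcm_account1234_click_20160727{h:02d}_20160728_012331_{268381796 + h}.csv.gz"
    for h in range(24)
    if h != 5
]
gaps = missing_hours(names, "20160727")
print(gaps)
```

Run the check separately for each file type (click, impression, and so on), since each type produces its own set of hourly files.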
Match Table Files
Match Table files are generated once daily, and filenames look like this: dcm_account1234_match_table_activity_cats_20160727_20160728_032226_268648829.csv.gz
This file contains data for 27 July 2016 and was generated at 3:22:26 on 28 July 2016.
Note: For some new match tables, the data is static and no daily downloadable file is produced. You
can get the data for these tables from the reference pages, for example
Rich Media standard event types.
Data Transfer Fields
| DT v1.0 field name | DT v2.0 field name |
|---|---|
| Time | Deprecated |
| User-ID | User ID |
| Advertiser-ID | Advertiser ID |
| Buy-ID | Deprecated |
| Order-ID | Campaign ID |
| Ad-ID | Ad ID |
| Creative-ID | Rendering ID |
| Creative-Version | Creative Version |
| Creative-Size-ID | Deprecated (Retrieved from Match Table as Creative Pixel Size) |
| Site-ID | Site ID (DCM) |
| Page-ID | Placement ID |
| Keyword | Deprecated |
| Country-ID | Country Code |
| State/Province | State/Region |
| Areacode | Deprecated |
| Browser-ID | Browser/Platform ID |
| Browser-Version | Browser/Platform Version |
| OS-ID | Operating System ID |
| DMA-ID | Designated Market Area (DMA) ID |
| City-ID | City ID |
| Zip-Code | ZIP/Postal Code |
| Time-UTC-Sec | Deprecated |
| Local-User-ID | Deprecated |
| Activity-Type | Deprecated (Retrieved from 'activity_cats' Match Table) |
| Activity-Sub-Type | Deprecated (Retrieved from 'activity_cats' Match Table) |