This class reads input data from a CSV file. The coord_to_columns attribute
stores a mapping from target InputData coordinates and array names to the
CSV column names, if they are different. The fields are:
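geo, time, kpi, revenue_per_kpi, population (single column)
controls (multiple columns)
(1) media, media_spend (multiple columns)
(2) reach, frequency, rf_spend (multiple columns)
non_media_treatments (multiple columns)
organic_media (multiple columns)
organic_reach, organic_frequency (multiple columns)
The CSV data must include either (1) or (2), but doesn't need to include
both.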
Internally, this class reads the CSV file into a Pandas DataFrame and then
loads the data using DataFrameDataLoader.
Args
csv_path
The path to the CSV file to read from. One of the following
conditions is required:
There are no gaps in the data.
For up to max_lag initial periods, only media data is present and the
cells are empty in every data column other than media, reach,
frequency, organic_media, organic_reach and organic_frequency (that is,
in kpi, revenue_per_kpi, media_spend, rf_spend, controls, population
and non_media_treatments).
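The second condition can be checked before loading. The following is a minimal sketch for a single geo, using only the standard library; the CSV contents, the column names, the max_lag value and the helper count_leading_media_only_rows are all hypothetical and not part of the library.

```python
import csv
import io

# Hypothetical CSV: the first two periods carry only media data.
CSV_TEXT = """geo,time,kpi,media_tv,media_spend_tv
geo_0,2023-01-02,,120,
geo_0,2023-01-09,,95,
geo_0,2023-01-16,310,110,40.0
geo_0,2023-01-23,295,105,38.5
"""

KEY_COLUMNS = {"geo", "time"}   # identifiers, always populated
MEDIA_COLUMNS = {"media_tv"}    # columns allowed to hold data in leading periods


def count_leading_media_only_rows(rows):
    """Counts initial rows whose non-media data cells are all empty."""
    count = 0
    for row in rows:
        other_cells = [
            value
            for column, value in row.items()
            if column not in KEY_COLUMNS | MEDIA_COLUMNS
        ]
        if any(value != "" for value in other_cells):
            break  # first fully populated period reached
        count += 1
    return count


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
max_lag = 8  # assumed model setting, for illustration only
leading = count_leading_media_only_rows(rows)
print(leading, leading <= max_lag)
```

With several geos, the same count would have to be taken per geo.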
coord_to_columns
A CoordToColumns object whose fields are the desired
coordinates of the InputData and whose values are the current names of
the columns (or lists of columns) in the CSV file. Example:
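A minimal sketch of such a mapping, written as plain keyword arguments so that it runs without the library installed. All column names on the right-hand side are hypothetical, and passing the arguments as CoordToColumns(**coord_kwargs) is an assumption based on the field names this class describes.

```python
# Keys are InputData coordinates; values are hypothetical CSV column names.
coord_kwargs = dict(
    geo="dmas",
    time="dates",
    kpi="conversions",
    revenue_per_kpi="revenue_per_conversion",
    population="populations",
    controls=["control_income"],
    media=["impressions_tv", "impressions_fb"],
    media_spend=["spend_tv", "spend_fb"],
)

# Single-column fields take one name; multi-column fields take a list.
single = [key for key, value in coord_kwargs.items() if isinstance(value, str)]
multi = [key for key, value in coord_kwargs.items() if isinstance(value, list)]
print(single)
print(multi)

# Assumed usage with the library installed (not executed here):
# coord_to_columns = CoordToColumns(**coord_kwargs)
```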
kpi_type
A string denoting whether the KPI is of a 'revenue' or
'non-revenue' type. When the kpi_type is 'non-revenue' and a
revenue_per_kpi exists, ROI calibration is used and the analysis is
run on revenue. When the kpi_type is 'non-revenue' and no
revenue_per_kpi exists, custom ROI calibration is used and the
analysis is run on KPI.
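The rule for a 'non-revenue' KPI can be restated as a small lookup. This is only a sketch of the description above; the function non_revenue_analysis and its return format are hypothetical and do not exist in the library.

```python
def non_revenue_analysis(has_revenue_per_kpi):
    """Restates the documented behavior for kpi_type == 'non-revenue'."""
    if has_revenue_per_kpi:
        # revenue_per_kpi is available: calibrate on ROI, analyze revenue.
        return {"calibration": "ROI", "outcome": "revenue"}
    # No revenue_per_kpi: custom ROI calibration, analysis on the KPI itself.
    return {"calibration": "custom ROI", "outcome": "KPI"}


print(non_revenue_analysis(True))
print(non_revenue_analysis(False))
```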
media_to_channel
A dictionary whose keys are the actual column names for
media data in the CSV file and values are the desired channel names, the
same as for the media_spend data. Example:
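A minimal sketch of such a dictionary; the column names and channel names are hypothetical.

```python
# Keys: hypothetical media columns in the CSV file. Values: channel names.
media_to_channel = {"media_tv": "tv", "media_yt": "yt", "media_fb": "fb"}

# Channel names in the order of the CSV media columns.
media_columns = ["media_tv", "media_yt", "media_fb"]
media_channels = [media_to_channel[column] for column in media_columns]
print(media_channels)
```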
media_spend_to_channel
A dictionary whose keys are the actual column
names for media_spend data in the CSV file and values are the desired
channel names, the same as for the media data. Example:
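A minimal sketch with hypothetical column names. Since the channel names are expected to match those used for the media data, the sketch also compares the two sets.

```python
# Hypothetical mappings; both should resolve to the same channel names.
media_to_channel = {"media_tv": "tv", "media_yt": "yt", "media_fb": "fb"}
media_spend_to_channel = {
    "media_spend_tv": "tv",
    "media_spend_yt": "yt",
    "media_spend_fb": "fb",
}

same_channels = set(media_to_channel.values()) == set(
    media_spend_to_channel.values()
)
print(same_channels)
```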
reach_to_channel
A dictionary whose keys are the actual column names for
reach data in the CSV file and values are the desired channel names,
the same as for the rf_spend data. Example:
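A minimal sketch of such a dictionary, with hypothetical column and channel names.

```python
# Keys: hypothetical reach columns in the CSV file. Values: channel names.
reach_to_channel = {"reach_tv": "tv", "reach_yt": "yt", "reach_fb": "fb"}

reach_channels = sorted(reach_to_channel.values())
print(reach_channels)
```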
frequency_to_channel
A dictionary whose keys are the actual column names
for frequency data in the CSV file and values are the desired channel
names, the same as for the rf_spend data. Example:
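A minimal sketch with hypothetical column names. It also pairs each frequency column with the reach column of the same channel, which is one way to confirm that the two mappings line up.

```python
# Hypothetical mappings for reach and frequency columns.
reach_to_channel = {"reach_tv": "tv", "reach_yt": "yt"}
frequency_to_channel = {"frequency_tv": "tv", "frequency_yt": "yt"}

# Invert the reach mapping so columns can be paired by channel name.
channel_to_reach = {channel: col for col, channel in reach_to_channel.items()}
pairs = {
    freq_col: channel_to_reach[channel]
    for freq_col, channel in frequency_to_channel.items()
}
print(pairs)
```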
rf_spend_to_channel
A dictionary whose keys are the actual column names
for rf_spend data in the CSV file and values are the desired channel
names, the same as for the reach and frequency data. Example:
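A minimal sketch with hypothetical column names, including a check that the reach, frequency and rf_spend mappings all name the same channels.

```python
# Hypothetical mappings for the three reach-and-frequency inputs.
reach_to_channel = {"reach_tv": "tv", "reach_yt": "yt"}
frequency_to_channel = {"frequency_tv": "tv", "frequency_yt": "yt"}
rf_spend_to_channel = {"rf_spend_tv": "tv", "rf_spend_yt": "yt"}

channel_sets = [
    set(mapping.values())
    for mapping in (reach_to_channel, frequency_to_channel, rf_spend_to_channel)
]
rf_consistent = all(channels == channel_sets[0] for channels in channel_sets)
print(rf_consistent)
```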
organic_reach_to_channel
A dictionary whose keys are the actual column
names for organic_reach data in the CSV file and values are the
desired channel names, the same as for the organic_frequency data. Example:
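A minimal sketch of such a dictionary, with hypothetical column and channel names.

```python
# Keys: hypothetical organic reach columns. Values: channel names.
organic_reach_to_channel = {
    "organic_reach_newsletter": "newsletter",
    "organic_reach_blog": "blog",
}

organic_channels = sorted(organic_reach_to_channel.values())
print(organic_channels)
```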
organic_frequency_to_channel
A dictionary whose keys are the actual
column names for organic_frequency data in the CSV file and values
are the desired channel names, the same as for the organic_reach
data. Example:
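A minimal sketch with hypothetical column names, checked against a matching organic_reach mapping.

```python
# Hypothetical mappings; channel names should agree between the two.
organic_reach_to_channel = {
    "organic_reach_newsletter": "newsletter",
    "organic_reach_blog": "blog",
}
organic_frequency_to_channel = {
    "organic_frequency_newsletter": "newsletter",
    "organic_frequency_blog": "blog",
}

# Channels present in one mapping but not the other (empty when consistent).
mismatched = set(organic_reach_to_channel.values()) ^ set(
    organic_frequency_to_channel.values()
)
print(mismatched)
```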