Connected Sheets

Connected Sheets allows you to analyze petabytes of data directly within Sheets. You can connect your spreadsheets with a BigQuery data warehouse and do the analysis by using familiar Sheets tools like pivot tables, charts and formulas.

This guide uses the public dataset shakespeare to show how to use Connected Sheets. The dataset contains the following information:

Field Type Description
word STRING A single unique word (where whitespace is the delimiter) extracted from a corpus
word_count INTEGER The number of times this word appears in this corpus
corpus STRING The work from which this word was extracted
corpus_date INTEGER The year in which this corpus was published

Working with a DataSource

Add a BigQuery data source

To add a data source, supply an AddDataSourceRequest in batchUpdate method. The request body should specify a DataSource field.

Replace <YOUR_PROJECT_ID> below with a valid Google Cloud project ID.

"addDataSource":{
   "dataSource":{
      "spec":{
         "bigQuery":{
            "projectId":"<YOUR_PROJECT_ID>",
            "tableSpec":{
               "tableProjectId":"bigquery-public-data",
               "datasetId":"samples",
               "tableId":"shakespeare"
            }
         }
      }
   }
}

After a data source is created successfully, an associated DATA_SOURCE sheet is created to provide a preview of up to 500 rows. The preview isn’t available immediately. An execution is triggered asynchronously to import BigQuery data.

The AddDataSourceResponse contains the following fields:

  • dataSource: The newly created DataSource. The dataSourceId is also populated and will be referenced to create each DataSource object from the data source.

  • dataExecutionStatus: As mentioned above, this is the status of an execution that imports BigQuery data into the preview sheet. See the DataExecutionStatus for more information.

Update or delete a data source

Use the batchUpdate method and supply an UpdateDataSourceRequest or DeleteDataSourceRequest request accordingly.

Working with DataSource Objects

Once a data source is added to the spreadsheet, a data source object can be created from the data source. A data source object is a regular Sheet tool like pivot tables, charts and formulas, but integrated with Connected Sheets, to power your analysis.

There are four types of objects:

  • DataSource Table
  • DataSource PivotTable
  • DataSource Chart
  • DataSource Formula

Add a DataSource Table

Also known as “Extract” in the Sheets editor, the object imports a static dump of the data from the data source into Sheets. Similar to a pivot table, the table is specified and anchored at the top left cell.

In this example, we use the batchUpdate method and supply an updateCells request to create a data source table of two columns word and word_count, up to 1000 rows

"updateCells":{
   "rows":{
      "values":[
         {
            "dataSourceTable":{
               "dataSourceId":"<YOUR_DATA_SOURCE_ID>",
               "columns":[
                  {
                     "name":"word"
                  },
                  {
                     "name":"word_count"
                  }
               ],
               "rowLimit":{
                  "value":1000
               },
               "columnSelectionType":"SELECTED"
            }
         }
      ]
   },
   "fields":"dataSourceTable"
}

After a data source table is created successfully, the data isn’t available immediately. In the Sheets editor, it is shown as a preview. You need to refresh the data source table to fetch BigQuery data. You can specify a RefreshDataSourceRequest within the same batchUpdate. See Refresh a DataSource object below for details. Note that all data source objects work in a similar way.

After a refresh completes (BigQuery data is fetched), the data source table is populated as shown:

screenshot of a data source table showing data from shakespeare dataset

Add a DataSource PivotTable

Unlike a conventional pivot table, a data source pivot table is backed by a data source and references the data by column name. This example creates a pivot table showing total word count by corpus.

"updateCells":{
   "rows":{
      "values":[
         {
            "pivotTable":{
               "dataSourceId":"<YOUR_DATA_SOURCE_ID>",
               "rows":{
                  "dataSourceColumnReference":{
                     "name":"corpus"
                  },
                  "sortOrder":"ASCENDING"
               },
               "values":{
                  "summarizeFunction":"SUM",
                  "dataSourceColumnReference":{
                     "name":"word_count"
                  }
               }
            }
         }
      ]
   },
   "fields":"pivotTable"
}

After BigQuery data is fetched, the data source pivot table is populated as shown:

screenshot of a data source pivot showing data from shakespeare dataset

Add a DataSource Chart

This example creates a data source chart, with COLUMN type, showing total word count by corpus.

"addChart":{
   "chart":{
      "spec":{
         "title":"Corpus by word count",
         "basicChart":{
            "chartType":"COLUMN",
            "domains":[
               {
                  "domain":{
                     "columnReference":{
                        "name":"corpus"
                     }
                  }
               }
            ],
            "series":[
               {
                  "series":{
                     "columnReference":{
                        "name":"word_count"
                     },
                     "aggregateType":"SUM"
                  }
               }
            ]
         }
      },
      "dataSourceChartProperties":{
         "dataSourceId":"<YOUR_DATA_SOURCE_ID>"
      }
   }
}

After BigQuery data is fetched, the data source chart is rendered as shown:

screenshot of a data source chart showing data from shakespeare dataset

Add a DataSource Formula

This example creates a data source formula to compute the average word count.

"updateCells":{
   "rows":[
      {
         "values":[
            {
               "userEnteredValue":{
                  "formulaValue":"=AVERAGE(shakespeare!word_count)"
               }
            }
         ]
      }
   ],
   "fields":"userEnteredValue"
}

After BigQuery data is fetched, the data source formula is populated as shown:

screenshot of a data source formula showing data from shakespeare dataset

Refresh a DataSource object

You can refresh a data source object to fetch the latest data from BigQuery based on the current data source specs and object configurations. You can use batchUpdate method, supply a RefreshDataSourceRequest and specify one or multiple objects to refresh via DataSourceObjectReferences.

Note that you can both create data source objects and refresh them within one batchUpdate request.

DataExecutionStatus

When you create new data sources or refresh data source objects, an execution is created in the background to fetch the data from BigQuery and return a response containing the execution status. If the execution starts successfully, the execution status is usually in the RUNNING state.

Because the process is asynchronous, your application should implement a polling model to retrieve the status for data source objects periodically, using spreadsheets.get), until the status stores either SUCCEEDED or FAILED state. The execution completes quickly in most cases, but this depends on the complexity of your data source spec. In any case, the execution should take no longer than 10 minutes.