The ML.QUANTILE_BUCKETIZE function

This document describes the ML.QUANTILE_BUCKETIZE function, which lets you break a continuous numerical feature into buckets based on quantiles.

When you use this function in the TRANSFORM clause of a CREATE MODEL statement, the quantiles computed during training are automatically applied during prediction.
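As a minimal sketch of the TRANSFORM usage described above, the following statement bucketizes a feature while training a model. The dataset, model, table, and column names (mydataset, mymodel, mytable, f, label) and the choice of a linear regression model are assumptions made for illustration only.

```sql
-- Hypothetical names: mydataset.mymodel, mydataset.mytable, f, label.
CREATE OR REPLACE MODEL `mydataset.mymodel`
  TRANSFORM(
    -- Quantile boundaries are computed over the training data.
    ML.QUANTILE_BUCKETIZE(f, 3) OVER() AS f_bucket,
    label)
  OPTIONS(
    model_type = 'linear_reg',
    input_label_cols = ['label'])
AS
SELECT
  f,
  label
FROM
  `mydataset.mytable`;
```

With a model created this way, a prediction query would pass the raw f column and leave the bucketization to the model's stored transform.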

Syntax

ML.QUANTILE_BUCKETIZE(numerical_expression, num_buckets) OVER()

Arguments

ML.QUANTILE_BUCKETIZE takes the following arguments:

  • numerical_expression: the numerical expression to bucketize.
  • num_buckets: an INT64 value that specifies the number of buckets to split numerical_expression into.

Output

ML.QUANTILE_BUCKETIZE returns a STRING value that contains the name of the bucket. Bucket names use the format bin_<bucket_index>, where bucket_index starts at 1.

Example

The following example splits a numerical expression that has five values into three buckets:

SELECT
  f, ML.QUANTILE_BUCKETIZE(f, 3) OVER() AS bucket
FROM
  UNNEST([1,2,3,4,5]) AS f;

The output looks similar to the following. Because the query has no ORDER BY clause, the row order can vary:

+---+--------+
| f | bucket |
+---+--------+
| 3 | bin_2  |
| 5 | bin_3  |
| 2 | bin_2  |
| 1 | bin_1  |
| 4 | bin_3  |
+---+--------+
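To see how values are distributed across buckets, one option is to wrap the example query and aggregate by bucket. This is a sketch rather than part of the function's documented usage; the names bucketed, num_rows, min_f, and max_f are illustrative.

```sql
WITH bucketed AS (
  SELECT
    f,
    ML.QUANTILE_BUCKETIZE(f, 3) OVER() AS bucket
  FROM
    UNNEST([1, 2, 3, 4, 5]) AS f
)
SELECT
  bucket,
  COUNT(*) AS num_rows,  -- number of values assigned to the bucket
  MIN(f) AS min_f,       -- smallest value in the bucket
  MAX(f) AS max_f        -- largest value in the bucket
FROM
  bucketed
GROUP BY
  bucket
ORDER BY
  bucket;
```

Bucket names are strings, so ORDER BY bucket sorts them lexicographically. That matches numeric order for three buckets, but with ten or more buckets bin_10 sorts before bin_2.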
