NTILE
Returns the “bucket number” associated with each row, between 1 and the value of an expression. For example, creating 100 buckets puts the lowest 1% of values in the first bucket, while creating 10 buckets puts the lowest 10% of values in the first bucket. Each partition can have a different number of buckets.
Syntax:
NTILE (expr [, offset ...]
OVER ([partition_by_clause] order_by_clause)
The ORDER BY
clause is required. The
PARTITION BY
clause is optional. The window
clause is not allowed.
Usage notes:
The “ntile” name is derived from the practice of dividing
result sets into fourths (quartile), tenths (decile), and so on.
The NTILE()
function divides the result set based
on an arbitrary percentile value.
The number of buckets must be a positive integer.
The number of items in each bucket is identical or almost so, varying by at most 1. If the number of items does not divide evenly between the buckets, the remaining N items are divided evenly among the first N buckets.
If the number of buckets N is greater than the number of input rows in the partition, then the first N buckets each contain one item, and the remaining buckets are empty.
Examples:
The following example shows divides groups of animals into 4
buckets based on their weight. The ORDER BY ...
DESC
clause in the OVER()
clause means
that the heaviest 25% are in the first group, and the lightest 25%
are in the fourth group. (The ORDER BY
in the
outermost part of the query shows how you can order the final
result set independently from the order in which the rows are
evaluated by the OVER()
clause.) Because there
are 9 rows in the group, divided into 4 buckets, the first bucket
receives the extra item.
create table animals (name string, kind string, kilos decimal(9,3));
insert into animals values
('Elephant', 'Mammal', 4000), ('Giraffe', 'Mammal', 1200), ('Mouse', 'Mammal', 0.020),
('Condor', 'Bird', 15), ('Horse', 'Mammal', 500), ('Owl', 'Bird', 2.5),
('Ostrich', 'Bird', 145), ('Polar bear', 'Mammal', 700), ('Housecat', 'Mammal', 5);
select name, ntile(4) over (order by kilos desc) as quarter
from animals
order by quarter desc;
+------------+---------+
| name | quarter |
+------------+---------+
| Owl | 4 |
| Mouse | 4 |
| Condor | 3 |
| Housecat | 3 |
| Horse | 2 |
| Ostrich | 2 |
| Elephant | 1 |
| Giraffe | 1 |
| Polar bear | 1 |
+------------+---------+
The following examples show how the PARTITION
clause works for the NTILE()
function. Here, we
divide each kind of animal (mammal or bird) into 2 buckets, the
heavier half and the lighter half.
select name, kind, ntile(2) over (partition by kind order by kilos desc) as half
from animals
order by kind;
+------------+--------+------+
| name | kind | half |
+------------+--------+------+
| Ostrich | Bird | 1 |
| Condor | Bird | 1 |
| Owl | Bird | 2 |
| Elephant | Mammal | 1 |
| Giraffe | Mammal | 1 |
| Polar bear | Mammal | 1 |
| Horse | Mammal | 2 |
| Housecat | Mammal | 2 |
| Mouse | Mammal | 2 |
+------------+--------+------+
Again, the result set can be ordered independently from the analytic evaluation. This next example lists all the animals heaviest to lightest, showing that elephant and giraffe are in the “top half” of mammals by weight, while housecat and mouse are in the “bottom half”.
select name, kind, ntile(2) over (partition by kind order by kilos desc) as half
from animals
order by kilos desc;
+------------+--------+------+
| name | kind | half |
+------------+--------+------+
| Elephant | Mammal | 1 |
| Giraffe | Mammal | 1 |
| Polar bear | Mammal | 1 |
| Horse | Mammal | 2 |
| Ostrich | Bird | 1 |
| Condor | Bird | 1 |
| Housecat | Mammal | 2 |
| Owl | Bird | 2 |
| Mouse | Mammal | 2 |
+------------+--------+------+