COUNT Function
An aggregate function that returns the number of rows, or the number of non-NULL rows, that meet certain conditions:
- The notation COUNT(*) includes NULL values in the total.
- The notation COUNT(column_name) only considers rows where the column contains a non-NULL value.
- You can also combine COUNT with the DISTINCT operator to eliminate duplicates before counting, and to count the combinations of values across multiple columns.
When the query contains a GROUP BY clause, returns one value for each combination of grouping values.
Return type: BIGINT
Examples:
-- How many rows total are in the table, regardless of NULL values? select count(*) from t1; -- How many rows are in the table with non-NULL values for a column? select count(c1) from t1; -- Count the rows that meet certain conditions. -- Again, * includes NULLs, so COUNT(*) might be greater than COUNT(col). select count(*) from t1 where x > 10; select count(c1) from t1 where x > 10; -- Can also be used in combination with DISTINCT and/or GROUP BY. -- Combine COUNT and DISTINCT to find the number of unique values. -- Must use column names rather than * with COUNT(DISTINCT ...) syntax. -- Rows with NULL values are not counted. select count(distinct c1) from t1; -- Rows with a NULL value in _either_ column are not counted. select count(distinct c1, c2) from t1; -- Return more than one result. select month, year, count(distinct visitor_id) from web_stats group by month, year;
Impala only allows a single COUNT(DISTINCT columns) expression in each query.
If you do not need precise accuracy, you can produce an estimate of the distinct values for a column by specifying NDV(column); a query can contain multiple instances of NDV(column).
To produce the same result as multiple COUNT(DISTINCT) expressions, you can use the following technique for queries involving a single table:
select v1.c1 result1, v2.c1 result2 from (select count(distinct col1) as c1 from t1) v1 cross join (select count(distinct col2) as c1 from t1) v2;
Because CROSS JOIN is an expensive operation, prefer to use the NDV() technique wherever practical.
<< AVG Function | GROUP_CONCAT Function >> | |