Impala Built-In Functions
Impala supports several categories of built-in functions. These functions let you perform mathematical calculations, string manipulation, date calculations, and other kinds of data transformations directly in SELECT statements. The built-in functions let a SQL query return results with all formatting, calculating, and type conversions applied, rather than performing time-consuming postprocessing in another application. By applying function calls where practical, you can make a SQL query that is as convenient as an expression in a procedural programming language or a formula in a spreadsheet.
The categories of functions supported by Impala are:
- Impala Mathematical Functions
- Impala Type Conversion Functions
- Impala Date and Time Functions
- Impala Conditional Functions
- Impala String Functions
- Aggregation functions, explained in Impala Aggregate Functions.
You call any of these functions through the SELECT statement. For most functions, you can omit the FROM clause and supply literal values for any required arguments:
select abs(-1); +---------+ | abs(-1) | +---------+ | 1 | +---------+ select concat('The rain ', 'in Spain'); +---------------------------------+ | concat('the rain ', 'in spain') | +---------------------------------+ | The rain in Spain | +---------------------------------+ select power(2,5); +-------------+ | power(2, 5) | +-------------+ | 32 | +-------------+
When you use a FROM clause and specify a column name as a function argument, the function is applied for each item in the result set:
select concat('Country = ',country_code) from all_countries where population > 100000000; select round(price) as dollar_value from product_catalog where price between 0.0 and 100.0;
Typically, if any argument to a built-in function is NULL, the result value is also NULL:
select cos(null); +-----------+ | cos(null) | +-----------+ | NULL | +-----------+ select power(2,null); +----------------+ | power(2, null) | +----------------+ | NULL | +----------------+ select concat('a',null,'b'); +------------------------+ | concat('a', null, 'b') | +------------------------+ | NULL | +------------------------+
Aggregate functions are a special category with different rules. These functions calculate a return value across all the items in a result set, so they require a FROM clause in the query:
select count(product_id) from product_catalog; select max(height), avg(height) from census_data where age > 20;
Aggregate functions also ignore NULL values rather than returning a NULL result. For example, if some rows have NULL for a particular column, those rows are ignored when computing the AVG() for that column. Likewise, specifying COUNT(col_name) in a query counts only those rows where col_name contains a non-NULL value.
Aggregate functions are a special category with different rules. These functions calculate a return value across all the items in a result set, so they do require a FROM clause in the query:
select count(product_id) from product_catalog; select max(height), avg(height) from census_data where age > 20;
Aggregate functions also ignore NULL values rather than returning a NULL result. For example, if some rows have NULL for a particular column, those rows are ignored when computing the AVG() for that column. Likewise, specifying COUNT(col_name) in a query counts only those rows where col_name contains a non-NULL value.
Analytic functions are a variation on aggregate functions. Instead of returning a single value, or an identical value for each group of rows, they can compute values that vary based on a "window" consisting of other rows around them in the result set.