Impala type conversion functions
Conversion functions are usually used in combination with other functions, to
explicitly pass the expected data types. Impala has strict rules regarding data types for
function parameters. For example, Impala does not automatically convert a
DOUBLE
value to FLOAT
, a BIGINT
value to
INT
, or other conversion where precision could be lost or overflow could
occur. Also, for reporting or dealing with loosely defined schemas in big data contexts, you
might frequently need to convert values to or from the STRING
type.
Function reference:
Impala supports the following type conversion functions:
- CAST(expression AS type)
-
Purpose: Returns expression converted to the
type data type. If the expression value is of a type that cannot be converted to the target type:
- Of
DECIMAL
,DATE
, andBOOLEAN
, the function returns an error. - Of all other types, the function returns
NULL
.
Usage notes:
Use
CAST
when passing a column value or literal to a function that expects a parameter with a different type. Frequently used in SQL operations such asCREATE TABLE AS SELECT
andINSERT ... VALUES
to ensure that values from various sources are of the appropriate type for the destination columns.Where practical, do a one-time
CAST()
operation during the ingestion process to make each column into the appropriate type, rather than using manyCAST()
operations in each query; doing type conversions for each row during each query can be expensive for tables with millions or billions of rows.The way this function deals with time zones when converting to or from
TIMESTAMP
values is affected by the‑‑use_local_tz_for_unix_timestamp_conversions
startup flag for the impalad daemon. See TIMESTAMP data type for details about how Impala handles time zone considerations for theTIMESTAMP
data type. - Of
- CAST(expression AS type FORMAT pattern)
-
Purpose: Returns expression converted to
the type data type based on the
pattern format string. This signature of
CAST()
with theFORMAT
clause is supported only for casts betweenSTRING
/CHAR
/VARCHAR
types andTIMESTAMP
/DATE
types.The following rules apply to pattern. Any exceptions to these rules are noted in the Details column of the table below.-
pattern is a case-insensitive
STRING
. - If pattern is
NULL
, an empty string, or a number, an error returns. - A fewer digits in expression than
specified in the pattern is accepted if a
separator is correctly specified in the
pattern. For example,
CAST('5-01-2017' AS DATE FORMAT 'MM-dd-yyyy')
returnsDATE'2017-05-01'
. - If fewer number of digits are in
expression than specified in the
pattern, the current date is used to
complete the year pattern. For example,
CAST('19/05' AS DATE FORMAT 'YYYY/MM')
will returnDATE'2019-05-01'
when executed on August 8, 2019.
The following format patterns are supported in theFORMAT
clause.Pattern Description Details YYYY
4-digit year. YYY
Last 3 digits of a year. YY
Last 2 digits of a year. Y
Last digit of a year RRRR
4-digit round year If 1, 3 or 4-digit year values are provided in expression, treated as
YYYY
.If 2-digit years are provided in expression, treated as
RR
.For datetime to string conversions, treated as
YYYY
.If
YYYY
,YYY
,YY
,Y
, orRR
is given in the same pattern for a string to datetime conversion, an error returns.RR
2-digit round year. For datetime to string conversion, same as
YY
.For string to datetime conversions, the first 2 digits of the year in the return value depends on the specified two-digit year and the last two digits of the current year as follows:-
If the specified 2-digit year is 00 to 49:
-
If the last 2 digits of the current year are 00 to 49, the returned year has the same first 2 digits as the current year.
-
If the last 2 digits of the current year are 50 to 99, the first 2 digits of the returned year are 1 greater than the first 2 digits of the current year.
-
-
If the specified 2-digit year is 50 to 99:
-
If the last 2 digits of the current year are 00 to 49, the first 2 digits of the returned year are 1 less than the first 2 digits of the current year.
-
If the last 2 digits of the current year are 50 to 99, the returned year has the same first 2 digits as the current year.
-
If
YYYY
,YYY
,YY
,Y
, orRR
is given in the same pattern for a string to datetime conversion, an error returns.If 1-digit year values are provided in expression, it is treated as
YYYY
.MM
Month In datetime to string conversions, 1-digit month is prefixed with a zero.
DD
Day of month (1-31) In datetime to string conversions, one digit day is prefixed with a zero.
DDD
Day of year (1-366) In string to datetime conversions, providing
MM
andDD
along withDDD
results an error, e.g.YYYY-MM-DDD
.HH
HH12
Hour of day (1-12) In datetime to string conversions, 1-digit hours are prefixed with a zero.
If provided hour in expression is not between 1 and 12, returns an error.
If no AM/PM is provided in expression, the default is
AM
.HH24
Hour of day (0-23) In string to datetime conversions, if
HH12
,AM
,PM
are given in the same pattern, an error returns.MI
Minute of hour (0-59) In datetime to string conversions, 1-digit minutes are prefixed with a zero.
SS
Second of minute (0-59) In datetime to string conversions, 1-digit seconds are prefixed with a zero.
SSSSS
Second of Day (0-86399) In string to timestamp conversions, if SS
,HH
,HH12
,HH24
,MI
,AM
/PM
are given in the same pattern, an error returns.FF
FF1
, ...,FF9
Fractional second A number, 1 to 9, can be used to indicate the number of digits.
FF
specifies a 9 digits fractional second.AM
PM
A.M.
P.M.
Meridiem indicator For datetime to string conversions,
AM
andPM
are treated as synonyms. For example, casting'2019-01-01 11:00 am'
toTIMESTAMP
with the'YYYY-MM-DD HH12:MI PM'
pattern returns01-JAN-19 11.00.00.000000 AM
.For string to datetime conversion,
HH24
in the same pattern returns an error.TZH
Timezone offset hour An optional sign, + or -, and 2 digits for the value of signed numbers are allowed for the source expression, e.g.
“+10”
,“-05”
,"04"
.TZM
Timezone offset minute Unsigned numbers are allowed for the source expression.
-
.
/
,
'
;
:
<space>
Separator For string to datetime conversions, any separator character in the pattern string would match any separator character in the input expression.
For example,
CAST(“20191010” AS DATE FORMAT “YYYY-MM-DD”)
returns an error, butCAST("2019-.;10 10" AS DATE FORMAT "YYYY-MM-DD")
succeeds.T
Separates the date from the time. This pattern is used for accepting ISO 8601 datetime formats. Example:
YYYY-MM-DDTHH24:MI:SS.FF9Z
Z
Indicates the zero hour offset from UTC. This pattern is used for accepting ISO 8601 datetime formats. Examples:
Input Output CAST("02-11-2014" AS DATE FORMAT 'dd-mm-yyyy')
2014-11-02 CAST("365 2014" AS DATE FORMAT 'DDD-YYYY')
2014-12-31 CAST("5-01-26" AS DATE FORMAT 'YY-MM-DD')
Executed at 2019-01-01 11:11:11: 2015-01-26
CAST('2018-11-10T15:11:04Z' AS TIMESTAMP
FORMAT 'YYYY-MM-DDTHH24:MI:SSZ')
2018-11-10 15:11:04 CAST("95-01-28" AS DATE FORMAT 'YYY-MM-DD')
Executed at 2019-01-01 11:11:11: 2095-01-28
CAST("49-01-15" AS DATE FORMAT 'RR-MM-DD')
Round year when last 2 digits of current year is greater than 49. Executed at 2099-01-01 11:11:11:
2149-01-15
CAST('2019.10.10 13:30:40.123456 +01:30'
AS TIMESTAMP
FORMAT 'YYYY-MM-DD HH24:MI:SS.FF9 TZH:TZM')
2019-10-10 13:30:40.123456000 -
pattern is a case-insensitive
- TYPEOF(expression)
-
Purpose: Returns the name of the data type corresponding to
expression. For types with extra attributes,
such as length for
CHAR
andVARCHAR
, or precision and scale forDECIMAL
, includes the full specification of the type.Return type:
STRING
Usage notes: Typically used in interactive exploration of a schema, or in application code that programmatically generates schema definitions such as
CREATE TABLE
statements, for example, to get the type of an expression such ascol1 / col2
orCONCAT(col1, col2, col3)
. This function is especially useful for arithmetic expressions involvingDECIMAL
types because the precision and scale of the result is can be different than that of the operands.Examples:
TYPEOF(2)
returnsTINYINT
.TYPEOF(NOW())
returnsTIMESTAMP
.