Limitations and restrictions for Impala UDFs

The following limitations and restrictions apply to Impala UDFs in the current release.

Limited support for Hive Generic UDFs

Hive has 2 types of UDFs. This release contains limited support for the second generation UDFs called GenericUDFs. The main limitations are as follows:

  • Decimal types are not supported
  • Complex types are not supported
  • Functions are not extracted from the jar file

GenericUDFs cannot be made permanent. They will need to be recreated every time the server is restarted.

Other limitations

  • Impala does not support Hive UDFs that accept or return composite or nested types, or other types not available in Impala tables.
  • The Hive current_user() function cannot be called from a Java UDF through Impala.

  • All Impala UDFs must be deterministic, that is, produce the same output each time when passed the same argument values. For example, an Impala UDF must not call functions such as rand() to produce different values for each invocation. It must not retrieve data from external sources, such as from disk or over the network.
  • An Impala UDF must not spawn other threads or processes.
  • Prior to Impala 2.5 when the catalogd process is restarted, all UDFs become undefined and must be reloaded. In Impala 2.5 and higher, this limitation only applies to older Java UDFs. Re-create those UDFs using the new CREATE FUNCTION syntax for Java UDFs, which excludes the function signature, to remove the limitation entirely.
  • Impala currently does not support user-defined table functions (UDTFs).
  • The CHAR and VARCHAR types cannot be used as input arguments or return values for UDFs.