Key Derivation Functions

Key Derivation Functions (KDFs) are mechanisms by which human-readable information, usually a password or other secret, is translated into a cryptographic key suitable for data protection. For further information, read the Wikipedia entry on Key Derivation Functions. Currently, KDFs are ingested by CipherProvider implementations, which return a fully-initialized Cipher object to be used for encryption or decryption. Because a CipherProviderFactory is used, the KDFs are not customizable at this time. Future enhancements will include the ability to provide custom cost parameters to the KDF at initialization time. As a workaround, CipherProvider instances can be initialized with custom cost parameters in the constructor, but this is not currently supported by the CipherProviderFactory.

Here are the KDFs currently supported by NiFi (primarily in the EncryptContent processor for password-based encryption (PBE)), with relevant notes:

  • NiFi Legacy KDF

    • This is the original KDF used by NiFi for internal key derivation for PBE. It performs 1000 iterations of the MD5 digest over the concatenation of the password and 8 or 16 bytes of random salt (the salt length depends on the selected cipher block size).

    • This KDF is deprecated as of NiFi 0.5.0 and should only be used for backwards compatibility to decrypt data that was previously encrypted by a legacy version of NiFi.

  • OpenSSL PKCS#5 v1.5 EVP_BytesToKey

    • This KDF was added in v0.4.0.

    • This KDF is provided for compatibility with data encrypted using OpenSSL's default PBE, known as EVP_BytesToKey. This is a single iteration of MD5 over the concatenation of the password and 8 bytes of random ASCII salt. OpenSSL recommends using PBKDF2 for key derivation but does not expose the library method necessary to the command-line tool, so this KDF is still the de facto default for command-line encryption.
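
The derivation described above can be sketched in a few lines of Python. This is a minimal illustration of the publicly documented EVP_BytesToKey construction with MD5, not NiFi's implementation; the function name and parameters are hypothetical. With count=1 it matches the single-iteration description given here. Raising count to 1000 is only one plausible reading of the NiFi Legacy KDF description and should be treated as an assumption.

```python
import hashlib


def evp_bytes_to_key(password: bytes, salt: bytes, key_len: int,
                     iv_len: int, count: int = 1):
    """Derive (key, iv) in the style of OpenSSL's EVP_BytesToKey with MD5.

    Each 16-byte block is MD5(previous_block || password || salt), re-hashed
    (count - 1) more times. Blocks are concatenated until enough bytes exist.
    """
    derived = b""
    block = b""
    while len(derived) < key_len + iv_len:
        block = hashlib.md5(block + password + salt).digest()
        for _ in range(count - 1):
            block = hashlib.md5(block).digest()
        derived += block
    return derived[:key_len], derived[key_len:key_len + iv_len]


if __name__ == "__main__":
    # Example values only; a real salt would be 8 random bytes.
    key, iv = evp_bytes_to_key(b"password", b"saltsalt", key_len=32, iv_len=16)
    print(key.hex(), iv.hex())
```

A single MD5 pass offers essentially no resistance to brute force, which is why this KDF is kept for compatibility rather than recommended.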

  • Bcrypt

    • This KDF was added in v0.5.0.

    • Bcrypt (https://en.wikipedia.org/wiki/Bcrypt) is an adaptive function based on the Blowfish (https://en.wikipedia.org/wiki/Blowfish_(cipher)) cipher. This KDF is strongly recommended as it automatically incorporates a random 16 byte salt and a configurable cost parameter (or "work factor"), and is hardened against brute-force attacks using GPGPU (https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units), which share memory between cores, by requiring access to "large" blocks of memory during the key derivation. It is less resistant to FPGA (https://en.wikipedia.org/wiki/Field-programmable_gate_array) brute-force attacks where the gate arrays have access to individual embedded RAM blocks.

    • Because the length of a Bcrypt-derived key is always 184 bits, the complete output is then fed to a SHA-512 digest and truncated to the desired key length. This provides the benefit of the avalanche effect on the formatted input.

    • The recommended minimum work factor is 12 (2^12 key derivation rounds) as of 2/1/2016 on commodity hardware. It should be increased to the threshold at which legitimate systems will encounter detrimental delays (see schedule below or use BcryptCipherProviderGroovyTest#testDefaultConstructorShouldProvideStrongWorkFactor() to calculate safe minimums).

    • The salt format is $2a$10$ABCDEFGHIJKLMNOPQRSTUV. The salt is delimited by $ and the three sections are as follows:

      • 2a - the version of the format. An extensive explanation can be found at http://blog.ircmaxell.com/2012/12/seven-ways-to-screw-up-bcrypt.html. NiFi currently uses 2a for all salts generated internally.

      • 10 - the work factor. This is actually the log2 value, so the total iteration count would be 2^10 in this case.

      • ABCDEFGHIJKLMNOPQRSTUV - the 22 character, Base64-encoded, unpadded, raw salt value. This decodes to a 16 byte salt used in the key derivation.
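
The salt layout and the post-processing step described for Bcrypt can be illustrated with a short Python sketch. It only splits the formatted salt, converts the work factor to a round count, and shows the "digest with SHA-512, then truncate" step. It does not run Bcrypt itself, and all function names are hypothetical. Exactly which bytes NiFi feeds into SHA-512 is not specified here, so the input to stretch_to_key is an assumption.

```python
import hashlib


def parse_bcrypt_salt(formatted: str):
    """Split '$2a$10$<22 chars>' into (version, work_factor, raw_salt_text)."""
    parts = formatted.split("$")
    # A leading '$' produces an empty first element, so four parts are expected.
    if len(parts) != 4 or parts[0] != "":
        raise ValueError("expected the form $version$cost$salt")
    _, version, cost_text, raw_salt = parts
    if len(raw_salt) != 22:
        raise ValueError("raw salt must be 22 characters")
    return version, int(cost_text), raw_salt


def rounds_for_work_factor(work_factor: int) -> int:
    """The work factor is a log2 value, so the round count is 2^work_factor."""
    return 1 << work_factor


def stretch_to_key(bcrypt_output: bytes, key_len_bytes: int) -> bytes:
    """Digest the Bcrypt output with SHA-512 and truncate to the key length."""
    if key_len_bytes > hashlib.sha512().digest_size:
        raise ValueError("SHA-512 yields at most 64 bytes")
    return hashlib.sha512(bcrypt_output).digest()[:key_len_bytes]


if __name__ == "__main__":
    version, cost, salt = parse_bcrypt_salt("$2a$10$ABCDEFGHIJKLMNOPQRSTUV")
    print(version, cost, rounds_for_work_factor(cost), salt)
    # 23 placeholder bytes stand in for a 184-bit Bcrypt result.
    print(stretch_to_key(bytes(23), 32).hex())
```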

  • Scrypt

    • This KDF was added in v0.5.0.

    • Scrypt (https://en.wikipedia.org/wiki/Scrypt) is an adaptive function designed in response to bcrypt. This KDF is recommended as it requires relatively large amounts of memory for each derivation, making it resistant to hardware brute-force attacks.

    • The recommended minimum cost is N=2^14, r=8, p=1 (as of 2/1/2016 on commodity hardware). p must be a positive integer and less than (2^32 − 1) * (Hlen/MFlen) where Hlen is the length in octets of the digest function output (32 for SHA-256) and MFlen is the length in octets of the mixing function output, defined as r * 128. These parameters should be increased to the threshold at which legitimate systems will encounter detrimental delays (see schedule below or use ScryptCipherProviderGroovyTest#testDefaultConstructorShouldProvideStrongParameters() to calculate safe minimums).

    • The salt format is $s0$e0101$ABCDEFGHIJKLMNOPQRSTUV. The salt is delimited by $ and the three sections are as follows:

      • s0 - the version of the format. NiFi currently uses s0 for all salts generated internally.

      • e0101 - the cost parameters. This is actually a hexadecimal encoding of N, r, p using shifts. This can be formed/parsed using Scrypt#encodeParams() and Scrypt#parseParameters().

        • Some external libraries encode N, r, and p separately in the form $400$1$1$. A utility method is available at ScryptCipherProvider#translateSalt() which will convert the external form to the internal form.

      • ABCDEFGHIJKLMNOPQRSTUV - the 12-44 character, Base64-encoded, unpadded, raw salt value. This decodes to an 8-32 byte salt used in the key derivation.
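
The cost parameters and salt format for Scrypt can be explored with the following Python sketch. The bit layout used for the hexadecimal field (log2(N) shifted left 16 bits, r shifted left 8 bits, then p) is an assumption modeled on a common scrypt crypt-string convention; it is consistent with e0101 but is not taken from the Scrypt#encodeParams() source. The function names are hypothetical, and the derivation uses Python's hashlib.scrypt rather than NiFi's provider.

```python
import base64
import hashlib


def encode_params(n: int, r: int, p: int) -> str:
    """Pack N, r, p as hex of (log2(N) << 16 | r << 8 | p). Assumed layout."""
    log2_n = n.bit_length() - 1
    if n < 2 or (1 << log2_n) != n:
        raise ValueError("N must be a power of two greater than 1")
    if not (0 < r <= 0xFF and 0 < p <= 0xFF):
        raise ValueError("r and p must each fit in one byte")
    return format(log2_n << 16 | r << 8 | p, "x")


def parse_params(text: str):
    """Inverse of encode_params: return (N, r, p)."""
    value = int(text, 16)
    return 1 << (value >> 16 & 0xFFFF), value >> 8 & 0xFF, value & 0xFF


def p_is_valid(r: int, p: int, hlen: int = 32) -> bool:
    """Check 0 < p < (2^32 - 1) * (Hlen / MFlen), with MFlen = r * 128."""
    return p >= 1 and p * r * 128 < (2**32 - 1) * hlen


def decode_raw_salt(text: str) -> bytes:
    """Decode the unpadded Base64 salt by restoring the '=' padding."""
    return base64.b64decode(text + "=" * (-len(text) % 4))


def derive_key(password: bytes, salt: bytes, n: int, r: int, p: int,
               key_len: int = 32) -> bytes:
    # maxmem is raised so N=2^14, r=8 (about 16 MiB) fits comfortably.
    return hashlib.scrypt(password, salt=salt, n=n, r=r, p=p,
                          maxmem=64 * 1024 * 1024, dklen=key_len)


if __name__ == "__main__":
    _, version, params, raw = "$s0$e0101$ABCDEFGHIJKLMNOPQRSTUV".split("$")
    print(version, parse_params(params), len(decode_raw_salt(raw)))
    print(encode_params(2**14, 8, 1))
    print(derive_key(b"password", decode_raw_salt(raw), 2**14, 8, 1).hex())
```

Memory use grows with N * r, so raising either value increases the cost for an attacker and for legitimate systems alike.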

  • PBKDF2

    • This KDF was added in v0.5.0.

    • PBKDF2 (https://en.wikipedia.org/wiki/PBKDF2) is an adaptive derivation function which uses an internal pseudorandom function (PRF) and iterates it many times over a password and salt (at least 16 bytes).

    • The PRF is recommended to be HMAC/SHA-256 or HMAC/SHA-512. The use of an HMAC cryptographic hash function mitigates a length extension attack.

    • The recommended minimum number of iterations is 160,000 (as of 2/1/2016 on commodity hardware). This number should be doubled every two years (see schedule below or use PBKDF2CipherProviderGroovyTest#testDefaultConstructorShouldProvideStrongIterationCount() to calculate safe minimums).

    • This KDF is not memory-hard (can be parallelized massively with commodity hardware) but is still recommended as sufficient by NIST SP 800-132 (https://csrc.nist.gov/publications/detail/sp/800-132/final) and many cryptographers (when used with a proper iteration count and HMAC cryptographic hash function).
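
The PBKDF2 recommendations above (HMAC/SHA-256 or HMAC/SHA-512, a salt of at least 16 bytes, and an iteration count that doubles every two years from 160,000 in 2016) can be sketched with Python's standard library. The function names and the stepwise doubling rule are illustrative assumptions, not NiFi's code or its published schedule.

```python
import hashlib
import os

BASELINE_ITERATIONS = 160_000  # recommended minimum as of 2016
BASELINE_YEAR = 2016


def minimum_iterations(year: int) -> int:
    """Double the baseline once for every full two-year period since 2016."""
    periods = max(0, (year - BASELINE_YEAR) // 2)
    return BASELINE_ITERATIONS << periods


def derive_key(password: bytes, salt: bytes, iterations: int,
               key_len: int = 32, hash_name: str = "sha256") -> bytes:
    """PBKDF2 with an HMAC PRF; hash_name may be 'sha256' or 'sha512'."""
    if len(salt) < 16:
        raise ValueError("salt must be at least 16 bytes")
    return hashlib.pbkdf2_hmac(hash_name, password, salt, iterations,
                               dklen=key_len)


if __name__ == "__main__":
    salt = os.urandom(16)  # fresh random salt per derivation
    key = derive_key(b"password", salt, minimum_iterations(2016))
    print(len(key), key.hex())
```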

  • None

    • This KDF was added in v0.5.0.

    • This KDF performs no operation on the input and is a marker to indicate the raw key is provided to the cipher. The key must be provided in hexadecimal encoding and be of a valid length for the associated cipher/algorithm.
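
As a small illustration of the raw-key case, the sketch below decodes a hexadecimal key and checks its length. The function name is hypothetical, and the accepted lengths (128, 192, or 256 bits) assume an AES cipher; other algorithms would need their own list.

```python
def parse_raw_key(hex_key: str, valid_lengths_bits=(128, 192, 256)) -> bytes:
    """Decode a hex-encoded key and verify its length for the chosen cipher."""
    key = bytes.fromhex(hex_key)  # raises ValueError on non-hex input
    if len(key) * 8 not in valid_lengths_bits:
        raise ValueError(f"key is {len(key) * 8} bits; expected one of "
                         f"{valid_lengths_bits}")
    return key


if __name__ == "__main__":
    # Example 128-bit key; never use a fixed key like this for real data.
    print(parse_raw_key("0123456789abcdeffedcba9876543210").hex())
```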