Stellar Language Quick Reference

Chapter 3. Stellar Language Functions

This section lists all supported Stellar core language functions.

Table 3.1. Stellar Core Functions

Function | Description | Input | Returns
ABS
Returns the absolute value of a number
  • number - The number to take the absolute value of

The absolute value of the number passed in.
APPEND_IF_MISSING
Appends the suffix to the end of the string if the string does not already end with any of the suffixes.
  • string - The string to be appended.

  • suffix - The string suffix to append to the end of the string.

  • additionalsuffix - Optional - Additional string suffix that is a valid terminator.

A new string if suffix was appended, the same string otherwise.
BIN
Computes the bin that the value is in given a set of bounds
  • value - The value to bin

  • bounds - A list of value bounds (excluding min and max) in sorted order

Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin.
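The binning rule described for BIN can be sketched outside of Stellar. The following Python is only an illustration of the stated rule, not the Stellar implementation; the function name `bin_index` is hypothetical.

```python
from bisect import bisect_left


def bin_index(value, bounds):
    """Hypothetical helper: return N such that bounds[N-1] < value <= bounds[N].

    `bounds` must be sorted and must exclude the min and max, as the BIN
    entry requires. Values at or below the first bound fall in bin 0, and
    values above the last bound fall in bin M, where M = len(bounds).
    """
    # bisect_left returns the index of the first bound >= value,
    # which is exactly the bin described in the table.
    return bisect_left(bounds, value)


if __name__ == "__main__":
    bounds = [10, 20, 30]
    for v in (5, 10, 15, 30, 31):
        print(v, bin_index(v, bounds))
```

With three bounds there are four bins (0 through 3), because values beyond the last bound get their own bin.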
BLOOM_ADD
Adds an element to the bloom filter passed in
  • bloom - The bloom filter

  • value* - The values to add

Bloom Filter
BLOOM_EXISTS
Checks whether the bloom filter contains the value
  • bloom - The bloom filter

  • value - The value to check

True if the filter might contain the value and false otherwise
BLOOM_INIT
Returns an empty bloom filter
  • expectedInsertions - The expected insertions

  • falsePositiveRate - The false positive rate you are willing to tolerate

Bloom Filter
BLOOM_MERGE
Returns a merged bloom filter
  • bloomfilters - A list of bloom filters to merge

Bloom Filter or null if the list is empty
CEILING
Returns the ceiling of a number.
  • number - The number to take the ceiling of

The ceiling of the number passed in.
CHOP
Removes the last character from a string.
  • string - The string to chop the last character from, may be null.

String without last character, null if null string input.
CHOMP
Removes one newline from the end of a string if it's there, otherwise leaves it alone. A newline is "\n", "\r", or "\r\n".
  • string - The string to chomp a newline from, may be null.

String without newline, null if null string input.
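As a rough illustration of the CHOMP behavior described above, the following Python sketch removes at most one trailing newline sequence. It is an approximation for explanation only; the name `chomp` and the use of `None` for null are assumptions of this sketch.

```python
def chomp(s):
    """Hypothetical sketch of CHOMP: strip one trailing newline, if present."""
    if s is None:
        return None  # null input yields null, as the table states
    # Check "\r\n" first so that a Windows line ending is removed as one unit.
    for newline in ("\r\n", "\n", "\r"):
        if s.endswith(newline):
            return s[: -len(newline)]
    return s


if __name__ == "__main__":
    print(repr(chomp("abc\r\n")))
    print(repr(chomp("abc\n\n")))
    print(repr(chomp("abc")))
```

Only one newline is removed per call, so a string ending in two newlines keeps one of them.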
COS
Returns the cosine of a number.
  • number - The number to take the cosine of.

The cosine of the number passed in.
COUNT_MATCHES
Counts how many times the substring appears in the larger string.
  • string - The CharSequence to check, may be null.

  • substring/character - The substring or character to count, may be null.

The number of non-overlapping occurrences, 0 if either CharSequence is null.
DAY_OF_MONTH
The numbered day within the month. The first day within the month has a value of 1.
  • dateTime - The datetime as a long representing the milliseconds since UNIX epoch

The numbered day within the month
DAY_OF_WEEK
The numbered day within the week. The first day of the week, Sunday, has a value of 1.
  • dateTime - The datetime as a long representing the milliseconds since UNIX epoch

The numbered day within the week.
DAY_OF_THE_YEAR
The day number within the year. The first day of the year has a value of 1.
  • dateTime - The datetime as a long representing the milliseconds since UNIX epoch

The day number within the year
DECODE
Decodes the passed string with the provided encoding, which must be one of the encodings returned from GET_SUPPORTED_ENCODINGS
  • string - The string to decode

  • encoding - the encoding to use, must be one of the encodings returned from GET_SUPPORTED_ENCODINGS

  • verify - (optional), true or false to determine if string should be verified as being encoded with the passed encoding

  • The decoded string on success

  • The original string if the string cannot be decoded

  • null on usage error

DOMAIN_REMOVE_SUBDOMAINS
Removes subdomains from a domain
  • domain - Fully qualified domain name

The domain without the subdomains. (For example, DOMAIN_REMOVE_SUBDOMAINS('mail.yahoo.com') yields 'yahoo.com')
DOMAIN_REMOVE_TLD
Removes the top level domain (TLD) suffix from a domain
  • domain - Fully qualified domain name

The domain without the TLD. (For example, DOMAIN_REMOVE_TLD('mail.yahoo.co.uk') yields 'mail.yahoo')
DOMAIN_TO_TLD
Extracts the top level domain from a domain
  • domain - Fully qualified domain name

The TLD of the domain. (For example, DOMAIN_TO_TLD('mail.yahoo.co.uk') yields 'co.uk')
ENCODE
Encodes the passed string with the provided encoding, which must be one of the encodings returned from GET_SUPPORTED_ENCODINGS
  • string - the string to encode

  • encoding - the encoding to use, must be one of the encodings returned from GET_SUPPORTED_ENCODINGS

  • The encoded string on success

  • null on error

ENDS_WITH
Determines whether a string ends with a suffix
  • string - The string to test

  • suffix - The proposed suffix

True if the string ends with the specified suffix and false if otherwise
ENRICHMENT_EXISTS
Interrogates the HBase table holding the simple HBase enrichment data and returns whether the enrichment type and indicator are in the table
  • enrichment_type - The enrichment type

  • indicator - The string indicator to look up

  • nosql_table - The NoSQL table to use

  • column_family - The column family to use

True if the enrichment indicator exists and false otherwise
ENRICHMENT_GET
Interrogates the HBase table holding the simple HBase enrichment data and retrieves the tabular value associated with the enrichment type and indicator
  • enrichment_type - The enrichment type

  • indicator - The string indicator to look up

  • nosql_table - The NoSQL table to use

  • column_family - The column family to use

A map associated with the indicator and enrichment type. Empty otherwise.
EXP
Returns Euler's number raised to the power of the argument.
  • number - the power to which e is raised

Euler's number raised to the power of the argument.
FILL_LEFT
Fills or pads a given string with a given character, to a given length on the left.
  • input - string

  • fill - the fill character

  • len - the required length

The filled string
FILL_RIGHT
Fills or pads a given string with a given character, to a given length on the right.
  • input - string

  • fill - the fill character

  • len - the required length

The filled string
FILTER
Applies a filter in the form of a lambda expression to a list. For example, `FILTER( [ 'foo', 'bar' ] , (x) -> x == 'foo')` would yield `[ 'foo' ]`.
  • list - List of arguments.

  • predicate - The lambda expression to apply. This expression is assumed to take one argument and return a boolean.

The input list filtered by the predicate.
FLOOR
Returns the floor of a number.
  • number - The number to take the floor of

The floor of the number passed in.

FORMAT
Returns a formatted string using the specified format string and arguments. Uses Java's string formatting conventions
  • format - string

  • arguments - object(s)

A formatted string
FUZZY_LANGS
Returns a list of IETF BCP 47 language tags available to the system, such as en, fr, de.
A list of IETF BCP 47 language tag strings
FUZZY_SCORE
Returns the Fuzzy Score which indicates the similarity score between two strings. One point is given for every matched character. Subsequent matches yield two bonus points. A higher score indicates a higher similarity.
  • string - The full term that should be matched against.

  • string - The query that will be matched against a term.

  • string - The IETF BCP 47 language code to use.

An Integer representing the score.
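The scoring rule stated for FUZZY_SCORE (one point per matched character, two bonus points when a match directly follows the previous match) can be sketched as follows. This Python is a simplified illustration under assumptions: the name `fuzzy_score` is hypothetical, and the language code is ignored in favor of plain lowercasing.

```python
def fuzzy_score(term, query):
    """Hypothetical sketch of the scoring rule described for FUZZY_SCORE.

    Assumption: locale handling is reduced to str.lower(); a real
    implementation would lowercase according to the given language code.
    """
    term_l, query_l = term.lower(), query.lower()
    score = 0
    term_index = 0          # matching never moves backwards in the term
    previous_match = -2     # sentinel: no previous match yet
    for qc in query_l:
        found = False
        while term_index < len(term_l) and not found:
            if qc == term_l[term_index]:
                score += 1  # one point for every matched character
                if previous_match + 1 == term_index:
                    score += 2  # bonus for a match adjacent to the last one
                previous_match = term_index
                found = True
            term_index += 1
    return score


if __name__ == "__main__":
    print(fuzzy_score("Workshop", "wo"))
    print(fuzzy_score("Workshop", "ws"))
```

In this sketch, consecutive matches ("wo" in "Workshop") score higher than scattered ones ("ws"), which is what makes the score useful for ranking candidate terms.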
GEO_GET
Looks up an IPv4 address and returns geographic information about it.
  • ip - The IPv4 address to look up

  • fields - Optional list of GeoIP fields to grab. Options are locID, country, city, postalCode, dmaCode, latitude, longitude, location_point

If a single field is requested, the field value as a string. If multiple fields are requested, a map of the fields as strings. Otherwise null.
GEOHASH_CENTROID
Compute the centroid (geographic midpoint or center of gravity) of a set of geohashes
  • hashes - A collection of geohashes or a map associating geohashes to numeric weights

  • character_precision? - The number of characters to use in the hash. Default is 12.

The geohash of the centroid.
GEOHASH_DIST
Compute the distance between geohashes.
  • hash1 - The first point as a geohash

  • hash2 - The second point as a geohash

  • strategy? - The great circle distance strategy to use. One of HAVERSINE, LAW_OF_COSINES, or VICENTY. Haversine is default.

The distance in kilometers between the hashes.
GEOHASH_FROM_LATLONG
Compute geohash given a lat/long.
  • latitude - The latitude

  • longitude - The longitude

  • character_precision? - The number of characters to use in the hash. Default is 12.

A geohash of the lat/long.
GEOHASH_FROM_LOC
Compute geohash given a geo enrichment location.
  • map - The latitude and longitude in a map (the output of GEO_GET)

  • character_precision? - The number of characters to use in the hash. Default is 12.

A geohash of the location.
GEOHASH_MAX_DIST
Compute the maximum distance among a list of geohashes
The maximum distance in kilometers between any two locations
GEOHASH_TO_LATLONG
Compute the lat/long of a given geohash.
A map containing the latitude and longitude of the hash (keys "latitude" and "longitude")
GET
Returns the i'th element of the list
  • input - List

  • i - The index (0-based)

The i'th element of the list
GET_FIRST
Returns the first element of the list
  • input - List

First element of this list
GET_LAST
Returns the last element of the list
  • input - List

Last element of the list
GET_SUPPORTED_ENCODINGS
Returns a list of the encodings that are currently supported.
A List of String
HASH
Hashes a given value using the given hashing algorithm and returns a hex encoded string.
  • toHash - value to hash.

  • hashType - A valid string representation of a hashing algorithm. See 'GET_HASHES_AVAILABLE'.

A hex encoded string of a hashed value using the given algorithm. If 'hashType' is null then '00', padded to the necessary length, will be returned. If 'toHash' is not able to be hashed or 'hashType' is null then null is returned.
HLLP_ADD
Add value to the HyperLogLogPlus estimator set.
  • hyperLogLogPlus - the hllp estimator to add a value to

  • value+ - value to add to the set. Takes a single item or a list.

The HyperLogLogPlus set with a new value added
HLLP_CARDINALITY
Returns HyperLogLogPlus-estimated cardinality for this set.
  • hyperLogLogPlus - the hllp set

Long value representing the cardinality for this set
HLLP_INIT
Initializes the set
  • p (required) - The precision value for the normal set.

  • sp - The precision value for the sparse set. If sp is not specified, the sparse set will be disabled.

A new HyperLogLogPlus set
HLLP_MERGE
Merge hllp sets together
  • hllp1 - First hllp set

  • hllp2 - Second hllp set

  • hllpn - Additional sets to merge

A new merged HyperLogLogPlus estimator set
IN_SUBNET
Returns true if an IP is within a subnet range
  • ip - The IP address in string form

  • cidr+ - One or more IP ranges specified in CIDR notation (for example, 192.168.0.0/24)

True if the IP address is within at least one of the network ranges and false if otherwise
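The IN_SUBNET check (an address matched against one or more CIDR ranges) can be illustrated with Python's standard `ipaddress` module. This is a sketch of the described semantics, not the Stellar implementation; the function name `in_subnet` is hypothetical.

```python
import ipaddress


def in_subnet(ip, *cidrs):
    """Hypothetical sketch: True if `ip` falls inside at least one CIDR range."""
    addr = ipaddress.ip_address(ip)
    # strict=False tolerates ranges written with host bits set,
    # such as 192.168.0.5/24 (an assumption of this sketch).
    return any(addr in ipaddress.ip_network(cidr, strict=False) for cidr in cidrs)


if __name__ == "__main__":
    print(in_subnet("192.168.0.17", "192.168.0.0/24"))
    print(in_subnet("10.0.0.1", "192.168.0.0/24", "10.0.0.0/8"))
    print(in_subnet("8.8.8.8", "192.168.0.0/24"))
```

Passing several ranges mirrors the `cidr+` argument: the result is true as soon as any one range contains the address.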
IS_DATE
Determines if the date contained in the string conforms to the specified format
  • date - The date in string form

  • format - The format of the date

True if the date is in the specified format and false if otherwise
IS_DOMAIN
Tests if a string is a valid domain. Domain names are evaluated according to the standards RFC1034 Section 3, and RFC1123 section 2.1.
  • address - The string to test

True if the string is a valid domain and false if otherwise
IS_EMAIL
Tests if a string is a valid email address
  • address - The string to test

True if the string is a valid email address and false if otherwise
IS_EMPTY
Returns true if string or collection is empty or null and false if otherwise
  • input - Object of string or collection type (for example, list)

True if the string or collection is empty or null and false if otherwise
IS_ENCODING
Returns true if the passed string is encoded in one of the supported encodings and false if otherwise.
True if the passed string is encoded in one of the supported encodings and false if otherwise.
IS_INTEGER
Determines whether or not an object is an integer
  • x - The object to test

True if the object can be converted to an integer and false if otherwise
IS_IP
Determines if a string is an IP or not
  • ip - The object to test

  • type (optional) - A string or collection (for example, a list) naming the IP type: one of IPv4 or IPv6. The default is IPv4.

True if the string is an IP and false if otherwise
IS_NAN
Evaluates if the passed number is NaN. The number is evaluated as a double.
  • number - The number to evaluate

True if the number is NaN, false if it is not
IS_URL
Tests if a string is a valid URL
  • url - The string to test

True if the string is a valid URL and false otherwise
JOIN
Joins the components in the list of strings with the specified delimiter
  • list - List of strings

  • delim - String delimiter

String
KAFKA_GET
Retrieves messages from a Kafka topic. Subsequent calls will continue retrieving messages sequentially from the original offset.
  • topic - the name of the Kafka topic.

  • count - The number of Kafka messages to retrieve.

  • config - Optional map of key/values that override any global properties.

List of String
KAFKA_PROPS
Retrieves the Kafka properties that are used by other KAFKA_* functions like KAFKA_GET and KAFKA_PUT. The Kafka properties are compiled from a set of default properties, the global properties, and any overrides.
  • config - An optional map of key/values that override any global properties

Map of key/value pairs
KAFKA_PUT
Sends messages to a Kafka topic.
  • topic - The name of the Kafka topic.

  • messages - A list of messages to write.

  • config - Optional map of key/values that override any global properties.

N/A
KAFKA_TAIL
Retrieves messages from a Kafka topic always starting with the most recent message first.
  • topic - The name of the Kafka topic.

  • count - The number of Kafka messages to retrieve.

  • config - Optional map of key/values that override any global properties.

List of String
LENGTH
Returns the length of a string or size of a collection. Returns 0 for empty or null strings.
  • input - Object of string or collection type (for example, list).

Integer
LIST_ADD
Adds an element to a list.
  • list - List to add element to.

  • element - Element to add to list.

Resulting list with the item added at the end.
LN
Returns the natural log of a number.
  • number - The number to take the natural log of

The natural log of the number passed in.
LOG2
Returns the log (base 2) of a number.
  • number - The number to take the log (base 2) of

The log (base 2) of the number passed in.
LOG10
Returns the log (base 10) of a number.
  • number - The number to take the log (base 10) of

The log (base 10) of the number passed in.
MAAS_GET_ENDPOINT
Inspects ZooKeeper and returns a map containing the name, version, and url for the model referred to by the input parameters
  • model_name - The name of the model

  • model_version - The optional version of the model. If the model version is not specified, the most current version is used.

A map containing the name, version, url for the REST endpoint (fields named name, version, and url). Note that the output of this function is suitable for input into the first argument of MAAS_MODEL_APPLY.
MAAS_MODEL_APPLY
Returns the output of a model deployed via Model as a Service. Note: Results are cached locally for 10 minutes.
  • endpoint - A map containing name, version, and url for the REST endpoint

  • function - The optional endpoint path; default is 'apply'

  • model_args - A dictionary of arguments for the model (these become request params)

The output of the model deployed as a REST endpoint in map form. Assumes REST endpoint returns a JSON map.
MAP
Applies a lambda expression to a list of arguments. For example, `MAP( [ 'foo', 'bar' ] , (x) -> TO_UPPER(x) )` would yield `[ 'FOO', 'BAR' ]`.
  • list - List of arguments.

  • transform_expression - The lambda expression to apply. This expression is assumed to take one argument.

The input list transformed item-wise by the lambda expression.
MAP_EXISTS
Checks for existence of a key in a map
  • key - The key to check for existence

  • map - The map to check for existence of the key

True if the key is found in the map and false if otherwise
MAP_GET
Gets the value associated with a key from a map
  • key - The key

  • map - The map

  • default - Optionally the default value to return if the key is not in the map.

The object associated with the key in the map. If no value is associated with the key and default is specified, then default is returned. If no value is associated with the key or default, then null is returned.
MAX
Returns the maximum value of a list of input values.
  • list - List of arguments. The list may only contain objects that are mutually comparable / ordinal (implement java.lang.Comparable interface). Multi type numeric comparisons are supported: MAX([10,15L,15.3]) would return 15.3, but MAX(['23',25]) will fail and return null as strings and numbers can't be compared.

The maximum value of the list, or null if the list is empty or the input values were not comparable.
MIN
Returns the minimum value of a list of input values.
  • list - List of arguments. The list may only contain objects that are mutually comparable / ordinal (implement java.lang.Comparable interface). Multi type numeric comparisons are supported: MIN([10,15L,15.3]) would return 10, but MIN(['23',25]) will fail and return null as strings and numbers can't be compared.

The minimum value of the list, or null if the list is empty or the input values were not comparable.
MONTH
The number representing the month. The first month, January, has a value of 0.
  • dateTime - The datetime as a long representing the milliseconds since UNIX epoch

The current month (0-based).
MULTISET_ADD
Adds to a multiset, which is a map associating objects to their instance counts.
  • set - The multiset to add to

  • o - object to add to multiset

A multiset
MULTISET_INIT
Creates an empty multiset, which is a map associating objects to their instance counts.
  • input? - An initialization of the multiset

A multiset
MULTISET_MERGE
Merges a list of multisets, which is a map associating objects to their instance counts.
  • sets - A collection of multisets to merge

A multiset
MULTISET_REMOVE
Removes from a multiset, which is a map associating objects to their instance counts.
  • set - The multiset to remove from

  • o - object to remove from multiset

A multiset
MULTISET_TO_SET
Create a set out of a multiset, which is a map associating objects to their instance counts.
  • multiset - The multiset to convert

The set of objects in the multiset ignoring multiplicity
OBJECT_GET
Retrieve and deserialize a serialized object from HDFS. The cache can be specified via two properties in the global config: "object.cache.size" (default 1000), "object.cache.expiration.minutes" (default 1440). Note, if these are changed in global config, topology restart is required.
  • path - The path in HDFS to the serialized object

The deserialized object.
PREPEND_IF_MISSING
Prepends the prefix to the start of the string if the string does not already start with any of the prefixes.
  • string - The string to be prepended.

  • prefix - The string prefix to prepend to the start of the string.

  • additionalprefix - Optional - Additional string prefix that is valid.

A new String if prefix was prepended, the same string otherwise.
PROFILE_FIXED
The profile periods associated with a fixed lookback starting from now
  • durationAgo - How long ago should values be retrieved from?

  • units - The units of 'durationAgo'

  • config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter of the same name. Default is the empty Map, meaning no overrides.

The selected profile measurement timestamps. These are ProfilePeriod objects.
PROFILE_GET
Retrieves a series of values from a stored profile
  • profile - The name of the profile

  • entity - The name of the entity

  • periods - The list of profile periods to grab. These are ProfilePeriod objects.

  • groups_list - Optional - Must correspond to the 'groupBy' list used in profile creation - List (in square brackets) of groupBy values used to filter the profile. Default is the empty list, meaning groupBy was not used when creating the profile.

  • config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter of the same name. Default is the empty Map, meaning no overrides.

The profile measurements
PROFILE_WINDOW
The profiler periods associated with a window selector statement from an optional reference timestamp.
  • WindowSelector - The statement specifying the window to select.

  • now - Optional - The timestamp to use for now.

  • config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter of the same name. Default is the empty Map, meaning no overrides.

The selected profile measurement periods. These are ProfilePeriod objects.
PROTOCOL_TO_NAME
Converts the IANA protocol number to the protocol name
  • IANA number

The protocol name associated with the IANA number
REDUCE
Reduces a list by a binary lambda expression. That is, the expression takes two arguments. Usage example: `REDUCE( [ 1, 2, 3 ] , (x, y) -> x + y, 0)` would sum the input list, yielding `6`.
  • list - List of arguments.

  • binary operation - The lambda expression function to apply to reduce the list. It is assumed that this takes two arguments, the first being the running total and the second being an item from the list.

  • initial_value - The initial value to use.

The reduction of the list.
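For readers more familiar with general-purpose languages, the REDUCE usage example above corresponds to a left fold. The following Python sketch mirrors it with `functools.reduce`; it illustrates the semantics only and is not Stellar code. The variable names are arbitrary.

```python
from functools import reduce

# Equivalent of REDUCE( [ 1, 2, 3 ] , (x, y) -> x + y, 0):
# x is the running total, y is the next item, and 0 is the initial value.
values = [1, 2, 3]
total = reduce(lambda x, y: x + y, values, 0)

# With an empty list, the fold returns the initial value unchanged.
empty_total = reduce(lambda x, y: x + y, [], 0)

if __name__ == "__main__":
    print(total)
    print(empty_total)
```

The initial value matters for empty input: without it, a fold has nothing to return.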

REGEXP_MATCH
Determines whether a regex matches a string
  • string - The string to test

  • pattern - The proposed regex pattern

True if the regex pattern matches the string and false if otherwise
REGEX_GROUP_VAL
Returns the value of a group in a regex against a string
  • string - The string to test

  • pattern - The proposed regex pattern

  • group - The integer that selects what group to select, starting at 1

The value of the group, or null if not matched or no group at index.
REGEX_REPLACE
Replaces all occurrences of the regex pattern within the string by value
  • string - The input string

  • pattern - The proposed regex pattern

  • value - The value to replace the regex pattern

The modified input string with replaced values
ROUND
Rounds a number to the nearest integer. This is half-up rounding.
  • number - The number to round

The nearest integer (based on half-up rounding).
SAMPLE_ADD
Add a value or collection of values to a sampler.
  • sampler - Sampler to use. If null, then a default Uniform sampler is created.

  • o - The value to add. If o is an Iterable, then each item is added.

The sampler.
SAMPLE_GET
Return the sample.
  • sampler - Sampler to use.

The resulting sample.
SAMPLE_INIT
Create a reservoir sampler of a specific size or, if unspecified, size 1024. Elements sampled by the reservoir sampler will be included in the final sample with equal probability.
  • size? - The size of the reservoir sampler. If unspecified, the size is 1024.

The sampler object.
SAMPLE_MERGE
Merge and resample a collection of samples.
  • samplers - A list of samplers to merge.

A sampler which represents the resampled merger of the samplers.
SET_ADD
Adds to a set
  • set - The set to add to

  • o - object to add to set

A Set
SET_INIT
Creates a new set
  • Input? - An initialization of the set

A Set
SET_MERGE
Merges a list of sets
  • sets - A collection of sets to merge

A Set
SET_REMOVE
Removes from a set
  • set - The set to remove from

  • o - object to remove from set

A Set
SIN
Returns the sine of a number.
  • number - The number to take the sine of

The sine of the number passed in.
SHELL_EDIT
Open an editor (optionally initialized with text) and return whatever is saved from the editor. The editor to use is pulled from the EDITOR or VISUAL environment variable.
  • string - (Optional) A string whose content is used to initialize the editor.

The content that the editor saved after editor exit.
SHELL_GET_EXPRESSION
Get a stellar expression from a variable
  • variable - variable name

The stellar expression associated with the variable.
SHELL_LIST_VARS
Return the variables in a tabular form
  • wrap - Length of string at which to wrap the columns

A tabular representation of the variables.
SHELL_MAP2TABLE
Take a map and return a table
  • map - Map

The map in table form
SHELL_VARS2MAP
Take a set of variables and return a map
  • variables* - variable names to use to create map

A map associating the variable name with the stellar expression.
SPLIT
Splits the string by the delimiter
  • input - String to split

  • delim - String delimiter

List of strings
SQRT
Returns the square root of a number.
  • number - The number to take the square root of

The square root of the number passed in.
STARTS_WITH
Determines whether a string starts with a prefix
  • string - The string to test

  • prefix - The proposed prefix

True if the string starts with the specified prefix and false if otherwise
STATS_ADD
Add one or more input values to those that are used to calculate the summary statistics
  • stats - The Stellar statistics object. If null, then a new one is initialized

  • value+ - One or more numbers to add

A Stellar statistics object
STATS_BIN
Computes the bin that the value is in based on the statistical distribution.
  • stats - The Stellar statistics object

  • value - The value to bin

  • bound? - A list of percentile bin bounds (excluding min and max) or a string representing a known and common set of bins. For convenience, we have provided QUARTILE, QUINTILE, and DECILE which you can pass in as a string arg. If this argument is omitted, then we assume a Quartile bin split.

Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin.
STATS_COUNT
Calculates the count of the values accumulated (or in the window if a window is used)
  • stats - The Stellar statistics object

The count of the values in the window or NaN if the statistics object is null
STATS_GEOMETRIC_MEAN
Calculates the geometric mean of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The geometric mean of the values in the window or NaN if the statistics object is null
STATS_INIT
Initializes a statistics object
  • window_size - The number of input data values to maintain in a rolling window in memory. If window_size is equal to 0, then no rolling window is maintained. Using no rolling window is less memory intensive, but cannot calculate certain statistics like percentiles and kurtosis.

A Stellar statistics object
STATS_KURTOSIS
Calculates the kurtosis of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The kurtosis of the values in the window or NaN if the statistics object is null
STATS_MAX
Calculates the maximum of the accumulated values (or in the window if a window is used)
  • stats - The Stellar statistics object

The maximum of the accumulated values in the window or NaN if the statistics object is null
STATS_MEAN
Calculates the mean of the accumulated values (or in the window if a window is used)
  • stats - The Stellar statistics object

The mean of the values in the window or NaN if the statistics object is null
STATS_MERGE
Merges statistics objects
  • statistics - A list of statistics providers

A Stellar statistics object
STATS_MIN
Calculates the minimum of the accumulated values (or in the window if a window is used)
  • stats - The Stellar statistics object

The minimum of the accumulated values in the window or NaN if the statistics object is null
STATS_PERCENTILE
Computes the p'th percentile of the accumulated values (or in the window if a window is used)
  • stats - The Stellar statistics object

  • p - A double where 0 <= p <= 1 representing the percentile

The p'th percentile of the data or NaN if the statistics object is null
STATS_POPULATION_VARIANCE
Calculates the population variance of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The population variance of the values in the window or NaN if the statistics object is null
STATS_QUADRATIC_MEAN
Calculates the quadratic mean of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The quadratic mean of the values in the window or NaN if the statistics object is null
STATS_SD
Calculates the standard deviation of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The standard deviation of the values in the window or NaN if the statistics object is null
STATS_SKEWNESS
Calculates the skewness of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The skewness of the values in the window or NaN if the statistics object is null
STATS_SUM
Calculates the sum of the accumulated values (or in the window if a window is used)
  • stats - The Stellar statistics object

The sum of the values in the window or NaN if the statistics object is null
STATS_SUM_LOGS
Calculates the sum of the (natural) log of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The sum of the (natural) log of the values in the window or NaN if the statistics object is null
STATS_SUM_SQUARES
Calculates the sum of the squares of the accumulated values (or in the window if a window is used)
  • stats - The Stellar statistics object

The sum of the squares of the values in the window or NaN if the statistics object is null
STATS_VARIANCE
Calculates the variance of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The variance of the values in the window or NaN if the statistics object is null
STRING_ENTROPY
Computes the base-2 Shannon entropy of a string.
  • input - String

The base-2 Shannon entropy of the string (https://en.wikipedia.org/wiki/Entropy_(information_theory)#Definition). The unit of this is bits.
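The base-2 Shannon entropy of a string is H = -Σ p(c) · log2 p(c), where p(c) is the relative frequency of character c. A minimal Python sketch of that formula follows; the function name `string_entropy` is hypothetical, and returning 0.0 for an empty string is an assumption of this sketch rather than documented Stellar behavior.

```python
import math
from collections import Counter


def string_entropy(s):
    """Hypothetical sketch: base-2 Shannon entropy of the characters in `s`, in bits."""
    if not s:
        return 0.0  # assumption: empty input carries no information
    n = len(s)
    counts = Counter(s)  # character -> number of occurrences
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    print(string_entropy("aaaa"))
    print(string_entropy("ab"))
    print(string_entropy("abcd"))
```

A string of one repeated character has zero entropy, while a string whose characters are all distinct and equally frequent reaches the maximum of log2(k) bits for k distinct characters.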
SYSTEM_ENV_GET
Returns the value associated with an environment variable
  • env_var - Environment variable name to get the value for

String
SYSTEM_PROPERTY_GET
Returns the value associated with a Java system property
  • key - Property to get the value for

String
TAN
Returns the tangent of a number.
  • number - The number to take the tangent of

The tangent of the number passed in.
TLSH_DIST
Returns the Hamming distance between two TLSH hashes (note: must be computed with the same params). For more information, see https://github.com/trendmicro/tlsh and Jonathan Oliver, Chun Cheng, and Yanggui Chen, TLSH - A Locality Sensitive Hash. 4th Cybercrime and Trustworthy Computing Workshop, Sydney, November 2013. For a discussion of tradeoffs, see Table II on page 5 of TLSH_CTC_final.
  • hash1 - The first TLSH hash

  • hash2 - The second TLSH hash

  • includeLength? - Include the length in the distance calculation or not?

An integer representing the distance between hash1 and hash2. The distance is roughly Hamming distance, so 0 is very similar.
TO_DOUBLETransforms the first argument to a double precision number
  • Input - Object of string or numeric type

Double version of the first argument
TO_EPOCH_TIMESTAMPReturns the epoch timestamp of the dateTime in the specified format. If the format does not have a timezone and you wish to assume a given timezone, you may optionally specify the timezone.
  • dateTime - DateTime in string format

  • format - DateTime format as string

  • timezone - Optional timezone in a string format

Epoch timestamp
TO_FLOATTransforms the first argument to a float
  • Input - Object of string or numeric type

Float version of the first argument
TO_INTEGERTransforms the first argument to an integer
  • Input - Object of string or numeric type

Integer version of the first argument
TO_JSON_LISTAccepts a JSON string as input and returns a List object parsed by Jackson. You need to be aware of the content of the JSON string that is to be parsed. For example, GET_FIRST( TO_JSON_LIST( '[ "foo", 2]' ) ) would yield foo
  • string - The JSON string to be parsed

A parsed List object
TO_JSON_MAPAccepts a JSON string as input and returns a Map object parsed by Jackson. You need to be aware of the content of the JSON string that is to be parsed. For example, MAP_GET( 'bar', TO_JSON_MAP( '{ "foo" : 1, "bar" : 2}' ) ) would yield 2
  • string - The JSON string to be parsed

A parsed Map object
TO_JSON_OBJECTAccepts a JSON string as input and returns a JSON Object parsed by Jackson. You need to be aware of the content of the JSON string that is to be parsed. For example, MAP_GET( 'bar', TO_JSON_OBJECT( '{ "foo" : 1, "bar" : 2}' ) ) would yield 2
  • string - The JSON string to be parsed

A parsed JSON object
TO_LONGTransforms the first argument to a long integer
  • input - Object of string or numeric type

Long version of the first argument
TO_LOWERTransforms the first argument to a lowercase string
  • Input - String

Lowercase string
TO_STRINGTransforms the first argument to a string
  • Input - Object

String
TO_UPPERTransforms the first argument to an uppercase string
  • Input - String

Uppercase string
TRIMTrims white space from both sides of a string
  • Input - String

String
URL_TO_HOSTExtract the hostname from a URL
  • url - URL in string form

The hostname from the URL as a string (for example URL_TO_HOST('http://www.yahoo.com/foo') would yield 'www.yahoo.com')
URL_TO_PATHExtract the path from a URL
  • url - URL in string form

The path from the URL as a string (for example URL_TO_PATH('http://www.yahoo.com/foo') would yield 'foo')
URL_TO_PORTExtract the port from a URL. If the port is not explicitly stated in the URL, then an implicit port is inferred based on the protocol.
  • url - URL in string form

The port used in the URL as an integer (for example URL_TO_PORT('http://www.yahoo.com/foo') would yield 80)
URL_TO_PROTOCOLExtract the protocol from a URL
  • url - URL in string form

The protocol from the URL as a string (for example URL_TO_PROTOCOL('http://www.yahoo.com/foo') would yield 'http')
WEEK_OF_MONTHThe numbered week within the month. The first week within the month has a value of 1.
  • dateTime - The datetime as a long representing the milliseconds since UNIX epoch

The numbered week within the month
WEEK_OF_YEARThe numbered week within the year. The first week in the year has a value of 1.
  • dateTime - The datetime as a long representing the milliseconds since UNIX epoch

The numbered week within the year
YEARThe number representing the year
  • dateTime - The datetime as a long representing the milliseconds since UNIX epoch

The year of the given dateTime
ZIPZips lists into a single list where the ith element is a list containing the ith items from the constituent lists. See Python and Wikipedia for more context.
  • lists* - Lists to zip.

The zip of the lists. The returned list has the length of the shortest input list. For example, ZIP( [ 1, 2 ], [ 3, 4, 5] ) == [ [1, 3], [2, 4] ]

ZIP_LONGESTZips lists into a single list where the ith element is a list containing the ith items from the constituent lists. See Python and Wikipedia for more context.
  • lists* - Lists to zip.

The zip of the lists. The returned list has the length of the longest input list. Missing elements are null. For example, ZIP_LONGEST( [ 1, 2 ], [ 3, 4, 5] ) == [ [1, 3], [2, 4], [null, 5] ]
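Because the ZIP and ZIP_LONGEST entries point to Python for context, the following is a minimal Python sketch of the same semantics. The helper names stellar_zip and stellar_zip_longest are hypothetical and are not part of Stellar; Python's None stands in for Stellar's null.

```python
from itertools import zip_longest


def stellar_zip(*lists):
    """Mimic ZIP: the result is truncated to the shortest input list."""
    return [list(items) for items in zip(*lists)]


def stellar_zip_longest(*lists):
    """Mimic ZIP_LONGEST: the result is padded with None up to the longest input list."""
    return [list(items) for items in zip_longest(*lists, fillvalue=None)]


print(stellar_zip([1, 2], [3, 4, 5]))          # [[1, 3], [2, 4]]
print(stellar_zip_longest([1, 2], [3, 4, 5]))  # [[1, 3], [2, 4], [None, 5]]
```

The printed results match the examples given in the table entries for both functions.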


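The STRING_ENTROPY entry in the table above describes a base-2 Shannon entropy measured in bits. The following Python sketch shows one way that quantity can be computed over the characters of a string. The function name string_entropy is hypothetical, and returning 0.0 for an empty string is an assumption, not documented Stellar behavior.

```python
import math
from collections import Counter


def string_entropy(s):
    """Base-2 Shannon entropy of the character distribution of s, in bits."""
    if not s:
        # Assumption: an empty string carries no information.
        return 0.0
    n = len(s)
    counts = Counter(s)
    # Sum of p * log2(1 / p) over each distinct character, with p = count / n.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


print(string_entropy("aabb"))  # 1.0
print(string_entropy("abcd"))  # 2.0
```

A string made of one repeated character has entropy 0, while a string whose characters are all distinct and equally frequent has the highest entropy for its alphabet size.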
The following is an example query (in other words, a function that returns a boolean) that might be seen in threat triage:

IN_SUBNET( ip, '192.168.0.0/24') or ip in [ '10.0.0.1', '10.0.0.2' ] or exists(is_local)

This evaluates to true precisely when one of the following is true:

  • The value of the ip field is in the 192.168.0.0/24 subnet

  • The value of the ip field is 10.0.0.1 or 10.0.0.2

  • The field is_local exists
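To make the three conditions concrete, here is a rough Python sketch of the same logic applied to a message represented as a dict. This is not how Stellar evaluates the query; the function name is_triage_match is hypothetical, and treating exists(is_local) as "the key is present in the message" is an assumption.

```python
import ipaddress


def is_triage_match(message):
    """Return True when any of the three conditions in the query holds."""
    ip = message.get("ip")
    # IN_SUBNET( ip, '192.168.0.0/24' )
    in_subnet = (
        ip is not None
        and ipaddress.ip_address(ip) in ipaddress.ip_network("192.168.0.0/24")
    )
    # ip in [ '10.0.0.1', '10.0.0.2' ]
    in_list = ip in ["10.0.0.1", "10.0.0.2"]
    # exists(is_local), assumed here to mean the field is present
    has_is_local = "is_local" in message
    return in_subnet or in_list or has_is_local


print(is_triage_match({"ip": "192.168.0.42"}))  # True
print(is_triage_match({"ip": "8.8.8.8"}))       # False
```

Note that this sketch raises ValueError if the ip field is not a valid IP address; how Stellar handles malformed input is not covered here.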

The following is an example transformation which might be seen in a field transformation:

TO_EPOCH_TIMESTAMP(timestamp, 'yyyy-MM-dd HH:mm:ss', MAP_GET(dc, dc2tz, 'UTC'))

For a message with a timestamp and dc field, we want to transform the timestamp to an epoch timestamp given a timezone which we will look up in a separate map, called dc2tz.

This will convert the timestamp field to an epoch timestamp based on the following:

  • Format yyyy-MM-dd HH:mm:ss

  • The value in dc2tz associated with the value associated with field dc, defaulting to UTC
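The lookup-with-default pattern in this transformation can be sketched in Python as follows. Several things here are assumptions made for illustration: the DC2TZ contents and the data center names are invented, fixed UTC offsets stand in for named timezones, the result is assumed to be in milliseconds, and the Python format string is a hand translation of yyyy-MM-dd HH:mm:ss.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical data center to timezone map; fixed offsets stand in for named zones.
DC2TZ = {
    "nyc": timezone(timedelta(hours=-5)),
    "london": timezone.utc,
}


def to_epoch_timestamp(timestamp, fmt, tz):
    """Parse a zone-less timestamp string, interpret it in tz, return epoch millis."""
    naive = datetime.strptime(timestamp, fmt)
    aware = naive.replace(tzinfo=tz)
    return int(aware.timestamp() * 1000)


def transform(message):
    """Mimic TO_EPOCH_TIMESTAMP(timestamp, fmt, MAP_GET(dc, dc2tz, 'UTC'))."""
    tz = DC2TZ.get(message.get("dc"), timezone.utc)  # default to UTC when dc is unknown
    return to_epoch_timestamp(message["timestamp"], "%Y-%m-%d %H:%M:%S", tz)


print(transform({"timestamp": "1970-01-01 00:00:00", "dc": "nyc"}))      # 18000000
print(transform({"timestamp": "1970-01-01 00:00:00", "dc": "unknown"}))  # 0
```

The second call shows the default taking effect: the dc value has no entry in the map, so the timestamp is interpreted as UTC.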