Chapter 3. Stellar Language Functions
This section lists all supported Stellar core language functions.
Table 3.1. Stellar Core Functions
Function | Description | Input | Returns |
---|---|---|---|
ABS | Returns the absolute value of a number | number - The number to take the absolute value of | The absolute value of the number passed in. |
APPEND_IF_MISSING | Appends the suffix to the end of the string if the string does not already end with any of the suffixes. |
| A new string if the suffix was appended, the same string otherwise. |
BIN | Computes the bin that the value is in given a set of bounds |
| Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin. |
BLOOM_ADD | Adds an element to the bloom filter passed in |
| Bloom Filter |
BLOOM_EXISTS | Determines whether the bloom filter contains the value |
| True if the filter might contain the value and false otherwise |
BLOOM_INIT | Returns an empty bloom filter |
| Bloom Filter |
BLOOM_MERGE | Returns a merged bloom filter |
| Bloom Filter or null if the list is empty |
CEILING | Returns the ceiling of a number. |
| The ceiling of the number passed in. |
CHOP | Remove the last character from a string. |
| String without last character, null if null string input. |
CHOMP | Removes one newline from the end of a string if it is there, otherwise leaves it alone. A newline is "\n", "\r", or "\r\n". |
| String without newline, null if null string input. |
COS | Returns the cosine of a number. |
| The cosine of the number passed in. |
COUNT_MATCHES | Counts how many times the substring appears in the larger string. |
| |
DAY_OF_MONTH | The numbered day within the month. The first day within the month has a value of 1. |
| The numbered day within the month |
DAY_OF_WEEK | The numbered day within the week. The first day of the week, Sunday, has a value of 1. |
| The numbered day within the week. |
DAY_OF_YEAR | The day number within the year. The first day of the year has a value of 1. |
| The day number within the year |
DECODE | Decodes the passed string with the provided encoding, which must be one of the encodings returned from GET_SUPPORTED_ENCODINGS |
| |
DOMAIN_REMOVE_SUBDOMAINS | Remove subdomains from a domain |
| The domain without the subdomains. (For example, DOMAIN_REMOVE_SUBDOMAINS('mail.yahoo.com') yields 'yahoo.com') |
DOMAIN_REMOVE_TLD | Removes the top level domain (TLD) suffix from a domain |
| The domain without the TLD. (For example, DOMAIN_REMOVE_TLD('mail.yahoo.co.uk') yields 'mail.yahoo') |
DOMAIN_TO_TLD | Extracts the top level domain from a domain |
| The TLD of the domain. (For example, DOMAIN_TO_TLD('mail.yahoo.co.uk') yields 'co.uk') |
ENCODE | Encodes the passed string with the provided encoding, which must be one of the encodings returned from GET_SUPPORTED_ENCODINGS |
| |
ENDS_WITH | Determines whether a string ends with a suffix |
| True if the string ends with the specified suffix and false if otherwise |
ENRICHMENT_EXISTS | Interrogates the HBase table holding the simple HBase enrichment data and returns whether the enrichment type and indicator are in the table |
| True if the enrichment indicator exists and false otherwise |
ENRICHMENT_GET | Interrogates the HBase table holding the simple HBase enrichment data and retrieves the tabular value associated with the enrichment type and indicator |
| A map associated with the indicator and enrichment type. Empty otherwise. |
EXP | Returns Euler's number raised to the power of the argument. |
| Euler's number raised to the power of the argument. |
FILL_LEFT | Fills or pads a given string with a given character, to a given length on the left. |
| The filled string |
FILL_RIGHT | Fills or pads a given string with a given character, to a given length on the right. |
| The filled string |
FILTER | Applies a filter in the form of a lambda expression to a list. For example, `FILTER( [ 'foo', 'bar' ] , (x) -> x == 'foo')` would yield `[ 'foo' ]`. |
| The input list filtered by the predicate. |
FLOOR | Returns the floor of a number. |
| The floor of the number passed in. |
FORMAT | Returns a formatted string using the specified format string and arguments. Uses Java's string formatting conventions |
| A formatted string |
FUZZY_LANGS | Returns a list of IETF BCP 47 language tags available to the system, such as en, fr, de. | A list of IETF BCP 47 language tag strings | |
FUZZY_SCORE | Returns the Fuzzy Score which indicates the similarity score between two strings. One point is given for every matched character. Subsequent matches yield two bonus points. A higher score indicates a higher similarity. |
| An Integer representing the score. |
GEO_GET | Looks up an IPv4 address and returns geographic information about it. |
| If a single field is requested, a string containing that field. If multiple fields are requested, a map of the fields as strings. Otherwise null. |
GEOHASH_CENTROID | Compute the centroid (geographic midpoint or center of gravity) of a set of geohashes |
| The geohash of the centroid. |
GEOHASH_DIST | Compute the distance between geohashes. |
| The distance in kilometers between the hashes. |
GEOHASH_FROM_LATLONG | Compute geohash given a lat/long. |
| A geohash of the lat/long. |
GEOHASH_FROM_LOC | Compute geohash given a geo enrichment location. |
| A geohash of the location. |
GEOHASH_MAX_DIST | Compute the maximum distance among a list of geohashes |
| The maximum distance in kilometers between any two locations |
GEOHASH_TO_LATLONG | Compute the latitude and longitude of a given geohash. |
| A map containing the latitude and longitude of the hash (keys "latitude" and "longitude") |
GET | Returns the i'th element of the list |
| The i'th element of the list |
GET_FIRST | Returns the first element of the list |
| First element of this list |
GET_LAST | Returns the last element of the list |
| Last element of the list |
GET_SUPPORTED_ENCODINGS | Returns a list of the encodings that are currently supported. | A List of String | |
HASH | Hashes a given value using the given hashing algorithm and returns a hex encoded string. |
| A hex encoded string of a hashed value using the given algorithm. If 'toHash' is null then '00', padded to the necessary length, will be returned. If 'toHash' is not able to be hashed or 'hashType' is null then null is returned. |
HLLP_ADD | Add value to the HyperLogLogPlus estimator set. |
| The HyperLogLogPlus set with a new value added |
HLLP_CARDINALITY | Returns HyperLogLogPlus-estimated cardinality for this set. |
| Long value representing the cardinality for this set |
HLLP_INIT | Initializes the set |
| A new HyperLogLogPlus set |
HLLP_MERGE | Merge hllp sets together |
| A new merged HyperLogLogPlus estimator set |
IN_SUBNET | Returns true if an IP is within a subnet range |
| True if the IP address is within at least one of the network ranges and false if otherwise |
IS_DATE | Determines if the date contained in the string conforms to the specified format |
| True if the date is in the specified format and false if otherwise |
IS_DOMAIN | Tests if a string is a valid domain. Domain names are evaluated according to the standards RFC1034 Section 3, and RFC1123 section 2.1. |
| True if the string is a valid domain and false if otherwise |
IS_EMAIL | Tests if a string is a valid email address |
| True if the string is a valid email address and false if otherwise |
IS_EMPTY | Returns true if string or collection is empty or null and false if otherwise |
| True if the string or collection is empty or null and false if otherwise |
IS_ENCODING | Returns true if the passed string is encoded in one of the supported encodings and false if otherwise. |
| True if the passed string is encoded in one of the supported encodings and false if otherwise. |
IS_INTEGER | Determines whether or not an object is an integer |
| True if the object can be converted to an integer and false if otherwise |
IS_IP | Determine if a string is an IP or not |
| True if the string is an IP and false if otherwise |
IS_NAN | Evaluates if the passed number is NaN. The number is evaluated as a double. |
| True if the number is NaN, false if it is not |
IS_URL | Tests if a string is a valid URL |
| True if the string is a valid URL and false otherwise |
JOIN | Joins the components in the list of strings with the specified delimiter |
| String |
KAFKA_GET | Retrieves messages from a Kafka topic. Subsequent calls will continue retrieving messages sequentially from the original offset. |
| List of String |
KAFKA_PROPS | Retrieves the Kafka properties that are used by other KAFKA_* functions like KAFKA_GET and KAFKA_PUT. The Kafka properties are compiled from a set of default properties, the global properties, and any overrides. |
| Map of key/value pairs |
KAFKA_PUT | Sends messages to a Kafka topic. |
| N/A |
KAFKA_TAIL | Retrieves messages from a Kafka topic always starting with the most recent message first. |
| List of String |
LENGTH | Returns the length of a string or size of a collection. Returns 0 for empty or null strings. |
| Integer |
LIST_ADD | Adds an element to a list. |
| Resulting list with the item added at the end. |
LN | Returns the natural log of a number. |
| The natural log of the number passed in. |
LOG2 | Returns the log (base 2 ) of a number. |
| The log (base 2 ) of the number passed in. |
LOG10 | Returns the log (base 10 ) of a number. |
| The log (base 10 ) of the number passed in. |
MAAS_GET_ENDPOINT | Inspects ZooKeeper and returns a map containing the name, version, and url for the model referred to by the input parameters |
| A map containing the name, version, url for the REST endpoint (fields named name, version, and url). Note that the output of this function is suitable for input into the first argument of MAAS_MODEL_APPLY. |
MAAS_MODEL_APPLY | Returns the output of a model deployed via Model as a Service. Note: Results are cached locally for 10 minutes. |
| The output of the model deployed as a REST endpoint in map form. Assumes REST endpoint returns a JSON map. |
MAP | Applies a lambda expression to a list of arguments. For example, `MAP( [ 'foo', 'bar' ] , (x) -> TO_UPPER(x) )` would yield `[ 'FOO', 'BAR' ]`. |
| The input list transformed item-wise by the lambda expression. |
MAP_EXISTS | Checks for existence of a key in a map |
| True if the key is found in the map and false if otherwise |
MAP_GET | Gets the value associated with a key from a map |
| The object associated with the key in the map. If no value is associated with the key and default is specified, then default is returned. If no value is associated with the key or default, then null is returned. |
MAX | Returns the maximum value of a list of input values. |
| The maximum value of the list, or null if the list is empty or the input values were not comparable. |
MIN | Returns the minimum value of a list of input values. |
| The minimum value of the list, or null if the list is empty or the input values were not comparable. |
MONTH | The number representing the month. The first month, January, has a value of 0. |
| The current month (0-based). |
MULTISET_ADD | Adds to a multiset, which is a map associating objects to their instance counts. |
| A multiset |
MULTISET_INIT | Creates an empty multiset, which is a map associating objects to their instance counts. |
| A multiset |
MULTISET_MERGE | Merges a list of multisets, which is a map associating objects to their instance counts. |
| A multiset |
MULTISET_REMOVE | Removes from a multiset, which is a map associating objects to their instance counts. |
| A multiset |
MULTISET_TO_SET | Create a set out of a multiset, which is a map associating objects to their instance counts. |
| The set of objects in the multiset ignoring multiplicity |
PREPEND_IF_MISSING | Prepends the prefix to the start of the string if the string does not already start with any of the prefixes. |
| A new String if prefix was prepended, the same string otherwise. |
PROFILE_FIXED | The profile periods associated with a fixed lookback starting from now |
| The selected profile measurement timestamps. These are ProfilePeriod objects. |
PROFILE_GET | Retrieves a series of values from a stored profile |
| The profile measurements |
PROFILE_WINDOW | The profiler periods associated with a window selector statement from an optional reference timestamp. |
| The selected profile measurement periods. These are ProfilePeriod objects. |
PROTOCOL_TO_NAME | Converts the IANA protocol number to the protocol name |
| The protocol name associated with the IANA number |
REDUCE | Reduces a list by a binary lambda expression. That is, the expression takes two arguments. Usage example: `REDUCE( [ 1, 2, 3 ] , (x, y) -> x + y, 0)` would sum the input list, yielding `6`. |
| The reduction of the list. |
REGEXP_MATCH | Determines whether a regex matches a string |
| True if the regex matches the string and false if otherwise |
REGEX_GROUP_VAL | Returns the value of a group in a regex against a string |
| The value of the group, or null if not matched or no group at index. |
ROUND | Rounds a number to the nearest integer. This is half-up rounding. |
| The nearest integer (based on half-up rounding). |
SET_ADD | Adds to a set |
| A Set |
SET_INIT | Creates a new set |
| A Set |
SET_MERGE | Merges a list of sets |
| A Set |
SET_REMOVE | Removes from a set |
| A Set |
SIN | Returns the sine of a number. |
| The sine of the number passed in. |
SPLIT | Splits the string by the delimiter |
| List of strings |
SQRT | Returns the square root of a number. |
| The square root of the number passed in. |
STARTS_WITH | Determines whether a string starts with a prefix |
| True if the string starts with the specified prefix and false if otherwise |
STATS_ADD | Add one or more input values to those that are used to calculate the summary statistics |
| A Stellar statistics object |
STATS_BIN | Computes the bin that the value is in based on the statistical distribution. |
| Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin. |
STATS_COUNT | Calculates the count of the values accumulated (or in the window if a window is used) |
| The count of the values in the window or NaN if the statistics object is null |
STATS_GEOMETRIC_MEAN | Calculates the geometric mean of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics |
| The geometric mean of the values in the window or NaN if the statistics object is null |
STATS_INIT | Initializes a statistics object |
| A Stellar statistics object |
STATS_KURTOSIS | Calculates the kurtosis of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics |
| The kurtosis of the values in the window or NaN if the statistics object is null |
STATS_MAX | Calculates the maximum of the accumulated values (or in the window if a window is used) |
| The maximum of the accumulated values in the window or NaN if the statistics object is null |
STATS_MEAN | Calculates the mean of the accumulated values (or in the window if a window is used) |
| The mean of the values in the window or NaN if the statistics object is null |
STATS_MERGE | Merges statistics objects |
| A Stellar statistics object |
STATS_MIN | Calculates the minimum of the accumulated values (or in the window if a window is used) |
| The minimum of the accumulated values in the window or NaN if the statistics object is null |
STATS_PERCENTILE | Computes the p'th percentile of the accumulated values (or in the window if a window is used) |
| The p'th percentile of the data or NaN if the statistics object is null |
STATS_POPULATION_VARIANCE | Calculates the population variance of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics |
| The population variance of the values in the window or NaN if the statistics object is null |
STATS_QUADRATIC_MEAN | Calculates the quadratic mean of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics |
| The quadratic mean of the values in the window or NaN if the statistics object is null |
STATS_SD | Calculates the standard deviation of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics |
| The standard deviation of the values in the window or NaN if the statistics object is null |
STATS_SKEWNESS | Calculates the skewness of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics |
| The skewness of the values in the window or NaN if the statistics object is null |
STATS_SUM | Calculates the sum of the accumulated values (or in the window if a window is used) |
| The sum of the values in the window or NaN if the statistics object is null |
STATS_SUM_LOGS | Calculates the sum of the (natural) log of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics |
| The sum of the (natural) log of the values in the window or NaN if the statistics object is null |
STATS_SUM_SQUARES | Calculates the sum of the squares of the accumulated values (or in the window if a window is used) |
| The sum of the squares of the values in the window or NaN if the statistics object is null |
STATS_VARIANCE | Calculates the variance of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics |
| The variance of the values in the window or NaN if the statistics object is null |
STRING_ENTROPY | Computes the base-2 Shannon entropy of a string. | input - string | The base-2 Shannon entropy of the string (https://en.wikipedia.org/wiki/Entropy_(information_theory)#Definition). The unit of this is bits. |
SYSTEM_ENV_GET | Returns the value associated with an environment variable |
| String |
SYSTEM_PROPERTY_GET | Returns the value associated with a Java system property |
| String |
TAN | Returns the tangent of a number. |
| The tangent of the number passed in. |
TLSH_DIST | Will return the hamming distance between two TLSH hashes (note: must be computed with the same params). For more information, see https://github.com/trendmicro/tlsh and Jonathan Oliver, Chun Cheng, and Yanggui Chen, TLSH - A Locality Sensitive Hash. 4th Cybercrime and Trustworthy Computing Workshop, Sydney, November 2013. For a discussion of tradeoffs, see Table II on page 5 of https://github.com/trendmicro/tlsh/blob/master/TLSH_CTC_final.pdf |
| |
TO_DOUBLE | Transforms the first argument to a double precision number |
| Double version of the first argument |
TO_EPOCH_TIMESTAMP | Returns the epoch timestamp of the dateTime in the specified format. If the format does not include a timezone and you wish to assume a given timezone, you may optionally specify the timezone. |
| Epoch timestamp |
TO_FLOAT | Transforms the first argument to a float |
| Float version of the first argument |
TO_INTEGER | Transforms the first argument to an integer |
| Integer version of the first argument |
TO_JSON_LIST | Accepts a JSON string as input and returns a List object parsed by Jackson. You need to be aware of the content of the JSON string that is to be parsed. For example, GET_FIRST( TO_JSON_LIST( '[ "foo", 2]' ) ) would yield foo |
| A parsed List object |
TO_JSON_MAP | Accepts a JSON string as input and returns a Map object parsed by Jackson. You need to be aware of the content of the JSON string that is to be parsed. For example, MAP_GET( 'bar', TO_JSON_MAP( '{ "foo" : 1, "bar" : 2}' ) ) would yield 2 |
| A parsed Map object |
TO_JSON_OBJECT | Accepts a JSON string as input and returns a JSON Object parsed by Jackson. You need to be aware of the content of the JSON string that is to be parsed. For example, MAP_GET( 'bar', TO_JSON_OBJECT( '{ "foo" : 1, "bar" : 2}' ) ) would yield 2 |
| A parsed JSON object |
TO_LONG | Transforms the first argument to a long integer |
| Long version of the first argument |
TO_LOWER | Transforms the first argument to a lowercase string |
| String |
TO_STRING | Transforms the first argument to a string |
| String |
TO_UPPER | Transforms the first argument to an uppercase string |
| Uppercase string |
TRIM | Trims white space from both sides of a string |
| String |
URL_TO_HOST | Extract the hostname from a URL |
| The hostname from the URL as a string (for example, URL_TO_HOST('http://www.yahoo.com/foo') would yield 'www.yahoo.com') |
URL_TO_PATH | Extract the path from a URL |
| The path from the URL as a string (for example, URL_TO_PATH('http://www.yahoo.com/foo') would yield 'foo') |
URL_TO_PORT | Extract the port from a URL. If the port is not explicitly stated in the URL, then an implicit port is inferred based on the protocol. |
| The port used in the URL as an integer (for example URL_TO_PORT('http://www.yahoo.com/foo') would yield 80) |
URL_TO_PROTOCOL | Extract the protocol from a URL |
| The protocol from the URL as a string (for example, URL_TO_PROTOCOL('http://www.yahoo.com/foo') would yield 'http') |
WEEK_OF_MONTH | The numbered week within the month. The first week within the month has a value of 1. |
| The numbered week within the month |
WEEK_OF_YEAR | The numbered week within the year. The first week in the year has a value of 1. |
| The numbered week within the year |
YEAR | The number representing the year |
| The current year |
ZIP | Zips lists into a single list where the ith element is a list containing the ith items from the constituent lists. See python and wikipedia for more context. |
|
|
ZIP_LONGEST | Zips lists into a single list where the ith element is a list containing the ith items from the constituent lists. See python and wikipedia for more context. |
|
|
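The bin rule given for BIN and STATS_BIN in the table, bound(N-1) < value <= bound(N), can be illustrated outside Stellar. The following Python sketch is an analogy only, not the Stellar implementation; the helper name `bin_index` and the sample bounds are hypothetical, and the sketch assumes the bounds are sorted in ascending order.

```python
from bisect import bisect_left


def bin_index(value, bounds):
    """Return N such that bounds[N-1] < value <= bounds[N].

    Values at or below the first bound land in bin 0, and values above the
    last bound land in bin M, where M == len(bounds).
    Assumes `bounds` is sorted ascending (hypothetical helper, not Stellar).
    """
    return bisect_left(bounds, value)


bounds = [10, 20, 30]
for v in (5, 10, 15, 30, 31):
    print(v, "->", bin_index(v, bounds))
```

With three bounds there are four possible bins, 0 through 3, because nothing caps the range on either side.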
The following is an example query (in other words, a function that returns a boolean) that might be seen in threat triage:
IN_SUBNET( ip, '192.168.0.0/24') or ip in [ '10.0.0.1', '10.0.0.2' ] or exists(is_local)
This evaluates to true precisely when one of the following is true:
- The value of the `ip` field is in the `192.168.0.0/24` subnet
- The value of the `ip` field is `10.0.0.1` or `10.0.0.2`
- The field `is_local` exists
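To make the three conditions concrete, here is a rough Python equivalent of the query. It is a sketch for illustration, not how Stellar evaluates the expression. The function name `matches` and the dictionary-shaped message are hypothetical, `ip` is assumed to be a valid IPv4 string when present, and treating "exists" as "present with a non-null value" is also an assumption.

```python
import ipaddress

SUBNET = ipaddress.ip_network("192.168.0.0/24")
KNOWN_IPS = ("10.0.0.1", "10.0.0.2")


def matches(message):
    """Mimic: IN_SUBNET(ip, '192.168.0.0/24') or ip in [...] or exists(is_local)."""
    ip = message.get("ip")
    # Condition 1: the ip field falls inside the subnet.
    in_subnet = ip is not None and ipaddress.ip_address(ip) in SUBNET
    # Condition 2: the ip field is one of the listed addresses.
    in_list = ip in KNOWN_IPS
    # Condition 3: the is_local field exists (assumed: present and not None).
    has_is_local = message.get("is_local") is not None
    return in_subnet or in_list or has_is_local


print(matches({"ip": "192.168.0.42"}))
print(matches({"ip": "8.8.8.8"}))
```

Because the conditions are joined with `or`, a message needs to satisfy only one of them to be selected.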
The following is an example transformation which might be seen in a field transformation:
TO_EPOCH_TIMESTAMP(timestamp, 'yyyy-MM-dd HH:mm:ss', MAP_GET(dc, dc2tz, 'UTC'))
For a message with a `timestamp` and `dc` field, we want to transform the timestamp to an epoch timestamp, given a timezone that we look up in a separate map called `dc2tz`.
This will convert the `timestamp` field to an epoch timestamp based on the following:
- The format `yyyy-MM-dd HH:mm:ss`
- The value in `dc2tz` associated with the value of the field `dc`, defaulting to `UTC`
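The same lookup-with-default followed by a timezone-aware parse can be sketched in Python. This is an illustrative analogy rather than Stellar's implementation. The map contents, the function name `to_epoch_millis`, and the use of fixed UTC offsets in place of named timezones are all assumptions made to keep the example dependency-free, and returning milliseconds is likewise an assumption.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical data-center-to-timezone map; fixed offsets stand in for named zones.
DC2TZ = {
    "nyc": timezone(timedelta(hours=-5)),
    "london": timezone.utc,
}


def to_epoch_millis(timestamp, dc, dc2tz=DC2TZ):
    """Parse 'yyyy-MM-dd HH:mm:ss' in the timezone mapped to dc, defaulting to UTC."""
    tz = dc2tz.get(dc, timezone.utc)  # plays the role of MAP_GET(dc, dc2tz, 'UTC')
    local = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=tz)
    return int(local.timestamp() * 1000)


print(to_epoch_millis("2017-01-01 00:00:00", "london"))
print(to_epoch_millis("2017-01-01 00:00:00", "unknown-dc"))
```

An unknown data center falls back to UTC, so both calls above produce the same value.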