Appendix A. Stellar Language Functions
This section lists the Stellar language functions supported by Hortonworks Cybersecurity Package (HCP) powered by Apache Metron.
The Stellar query language supports the following:
Referencing fields in the enriched JSON
Simple boolean operations: `and`, `not`, `or`
Simple arithmetic operations: `*`, `/`, `+`, `-` on real numbers or integers
Simple comparison operations: `<`, `>`, `<=`, `>=`
if/then/else comparisons (for example, `if var1 < 10 then 'less than 10' else '10 or more'`)
Determining whether a field exists (via `exists`)
The ability to use parentheses to make the order of operations explicit
User defined functions
The following keywords need to be single quote escaped in order to be used in Stellar expressions:
Stellar Language Inclusion Checks (`in` and `not in`)
`in` supports string contains. For example, `'foo' in 'foobar' == true`
`in` supports collection contains. For example, `'foo' in [ 'foo', 'bar' ] == true`
`in` supports map key contains. For example, `'foo' in { 'foo' : 5} == true`
`not in` is the negation of the `in` expression. For example, `'grok' not in 'foobar' == true`
Stellar Language Comparisons (`<`, `<=`, `>`, `>=`)
If either side of the comparison is null then return false.
If both values being compared implement Java's Number, then the following applies:
If either side is a double then get double value from both sides and compare using given operator.
Else if either side is a float then get float value from both sides and compare using given operator.
Else if either side is a long then get long value from both sides and compare using given operator.
Otherwise get the int value from both sides and compare using given operator.
If both sides are of the same type and are comparable then use the compareTo method to compare values.
If none of the above are met then an exception is thrown.
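The comparison rules above can be sketched in Java as follows. This is an illustrative model of the documented behavior, not the actual Stellar implementation: the class name `ComparisonSketch`, the `Op` enum, and the choice of `IllegalArgumentException` are assumptions made for the example (the text only says that "an exception is thrown").

```java
// Hypothetical sketch of the documented comparison rules; not Stellar source code.
final class ComparisonSketch {
    enum Op { LT, LE, GT, GE }

    static boolean compare(Object left, Object right, Op op) {
        // A null on either side makes the comparison false.
        if (left == null || right == null) {
            return false;
        }
        // Two numbers are compared using the widest type present on either side.
        if (left instanceof Number && right instanceof Number) {
            Number l = (Number) left;
            Number r = (Number) right;
            if (l instanceof Double || r instanceof Double) {
                return applyDouble(op, l.doubleValue(), r.doubleValue());
            } else if (l instanceof Float || r instanceof Float) {
                // Widening a float to a double preserves the float ordering.
                return applyDouble(op, l.floatValue(), r.floatValue());
            } else if (l instanceof Long || r instanceof Long) {
                return applyLong(op, l.longValue(), r.longValue());
            }
            return applyLong(op, l.intValue(), r.intValue());
        }
        // Same type and Comparable: defer to compareTo and test its sign.
        if (left.getClass() == right.getClass() && left instanceof Comparable) {
            @SuppressWarnings("unchecked")
            Comparable<Object> comparable = (Comparable<Object>) left;
            return applyLong(op, comparable.compareTo(right), 0);
        }
        // None of the rules matched (exception type is an assumption).
        throw new IllegalArgumentException(
            "Cannot compare " + left.getClass() + " with " + right.getClass());
    }

    private static boolean applyDouble(Op op, double l, double r) {
        switch (op) {
            case LT: return l < r;
            case LE: return l <= r;
            case GT: return l > r;
            default: return l >= r;
        }
    }

    private static boolean applyLong(Op op, long l, long r) {
        switch (op) {
            case LT: return l < r;
            case LE: return l <= r;
            case GT: return l > r;
            default: return l >= r;
        }
    }
}
```

Under this model, comparing an integer with a double promotes both sides to double, while comparing a string with a number falls through to the exception case.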
Stellar Language Equality Check (`==`, `!=`)
Below is how the `==` operator is expected to work:
If either side of the expression is null then check equality using Java's `==` expression.
Else if both sides of the expression are of Java's type Number then:
If either side of the expression is a double then use the double value of both sides to test equality.
Else if either side of the expression is a float then use the float value of both sides to test equality.
Else if either side of the expression is a long then use long value of both sides to test equality.
Otherwise use the int value of both sides to test equality.
Otherwise use the equals method to compare the left side with the right side.
The `!=` operator is the negation of the above.
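A minimal Java sketch of the equality rules above may help make the order of the checks concrete. It models the documented behavior only; the class and method names (`EqualitySketch`, `eq`, `ne`) are hypothetical and do not come from the Stellar code base.

```java
// Hypothetical sketch of the documented == and != rules; not Stellar source code.
final class EqualitySketch {
    static boolean eq(Object left, Object right) {
        // With a null on either side, fall back to Java reference equality,
        // so only null == null is true.
        if (left == null || right == null) {
            return left == right;
        }
        // Two numbers are compared using the widest type present on either side.
        if (left instanceof Number && right instanceof Number) {
            Number l = (Number) left;
            Number r = (Number) right;
            if (l instanceof Double || r instanceof Double) {
                return l.doubleValue() == r.doubleValue();
            } else if (l instanceof Float || r instanceof Float) {
                return l.floatValue() == r.floatValue();
            } else if (l instanceof Long || r instanceof Long) {
                return l.longValue() == r.longValue();
            }
            return l.intValue() == r.intValue();
        }
        // Everything else is delegated to equals.
        return left.equals(right);
    }

    // != is the negation of ==.
    static boolean ne(Object left, Object right) {
        return !eq(left, right);
    }
}
```

One consequence of the numeric promotion step is that an integer and a double holding the same value compare as equal, even though `Integer.equals(Double)` would return false in plain Java.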
Table A.2. Stellar Language Functions
Function | Description | Input | Returns |
---|---|---|---|
ABS | Returns the absolute value of a number | number - The number to take the absolute value of | The absolute value of the number passed in |
APPEND_IF_MISSING | Appends the suffix to the end of the string if the string does not already end with any of the suffixes. | | A new string if the suffix was appended, the same string otherwise. |
BIN | Computes the bin that the value is in, given a set of bounds | | Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin. |
BLOOM_ADD | Adds an element to the bloom filter passed in | | Bloom Filter |
BLOOM_EXISTS | Determines whether the bloom filter contains the value | | True if the filter might contain the value and false otherwise |
BLOOM_INIT | Returns an empty bloom filter | | Bloom Filter |
BLOOM_MERGE | Returns a merged bloom filter | | Bloom Filter or null if the list is empty |
CHOP | Removes the last character from a string. | | String without last character, null if null string input. |
CHOMP | Removes one newline from the end of a string if it's there, otherwise leaves it alone. A newline is "\n", "\r", or "\r\n". | | String without newline, null if null string input. |
COUNT_MATCHES | Counts how many times the substring appears in the larger string. | | |
DAY_OF_MONTH | The numbered day within the month. The first day within the month has a value of 1. | | The numbered day within the month |
DAY_OF_WEEK | The numbered day within the week. The first day of the week, Sunday, has a value of 1. | | The numbered day within the week. |
DAY_OF_THE_YEAR | The day number within the year. The first day of the year has a value of 1. | | The day number within the year |
DOMAIN_REMOVE_SUBDOMAINS | Removes subdomains from a domain | | The domain without the subdomains. (For example, DOMAIN_REMOVE_SUBDOMAINS('mail.yahoo.com') yields 'yahoo.com') |
DOMAIN_REMOVE_TLD | Removes the top level domain (TLD) suffix from a domain | | The domain without the TLD. (For example, DOMAIN_REMOVE_TLD('mail.yahoo.co.uk') yields 'mail.yahoo') |
DOMAIN_TO_TLD | Extracts the top level domain from a domain | | The TLD of the domain. (For example, DOMAIN_TO_TLD('mail.yahoo.co.uk') yields 'co.uk') |
ENDS_WITH | Determines whether a string ends with a suffix | | True if the string ends with the specified suffix and false if otherwise |
ENRICHMENT_EXISTS | Interrogates the HBase table holding the simple HBase enrichment data and returns whether the enrichment type and indicator are in the table | | True if the enrichment indicator exists and false otherwise |
ENRICHMENT_GET | Interrogates the HBase table holding the simple HBase enrichment data and retrieves the tabular value associated with the enrichment type and indicator | | A map associated with the indicator and enrichment type. Empty otherwise. |
FILL_LEFT | Fills or pads a given string with a given character, to a given length on the left | | The filled string |
FILL_RIGHT | Fills or pads a given string with a given character, to a given length on the right | | The filled string |
FILTER | Applies a filter in the form of a lambda expression to a list. For example, `FILTER( [ 'foo', 'bar' ] , (x) -> x == 'foo')` would yield `[ 'foo' ]`. | | The input list filtered by the predicate. |
FORMAT | Returns a formatted string using the specified format string and arguments. Uses Java's string formatting conventions | | A formatted string |
GEO_GET | Looks up an IPv4 address and returns geographic information about it. | | If a single field is requested, a string of the field. If multiple fields are requested, a map of strings of the fields. Otherwise null. |
GET | Returns the i'th element of the list | | The i'th element of the list |
GET_FIRST | Returns the first element of the list | | First element of the list |
GET_LAST | Returns the last element of the list | | Last element of the list |
HLLP_CARDINALITY | Returns HyperLogLogPlus-estimated cardinality for this set. | | Long value representing the cardinality for this set |
HLLP_INIT | Initializes the set | | A new HyperLogLogPlus set |
HLLP_MERGE | Merge hllp sets together | | A new merged HyperLogLogPlus estimator set |
HLLP_OFFER | Add value to set | | The HyperLogLogPlus set with a new object added |
IN_SUBNET | Returns true if an IP is within a subnet range | | True if the IP address is within at least one of the network ranges and false if otherwise |
IS_DATE | Determines if the date contained in the string conforms to the specified format | | True if the date is in the specified format and false if otherwise |
IS_DOMAIN | Tests if a string is a valid domain. Domain names are evaluated according to the standards RFC1034 Section 3, and RFC1123 section 2.1. | | True if the string is a valid domain and false if otherwise |
IS_EMAIL | Tests if a string is a valid email address | | True if the string is a valid email address and false if otherwise |
IS_EMPTY | Returns true if string or collection is empty or null and false if otherwise | | True if the string or collection is empty or null and false if otherwise |
IS_INTEGER | Determines whether or not an object is an integer | | True if the object can be converted to an integer and false if otherwise |
IS_IP | Determine if a string is an IP or not | | True if the string is an IP and false if otherwise |
IS_URL | Tests if a string is a valid URL | | True if the string is a valid URL and false otherwise |
JOIN | Joins the components in the list of strings with the specified delimiter | | String |
LENGTH | Returns the length of a string or size of a collection. Returns 0 for empty or null strings. | | Integer |
LIST_ADD | Adds an element to a list. | | Resulting list with the item added at the end. |
MAAS_GET_ENDPOINT | Inspects ZooKeeper and returns a map containing the name, version, and url for the model referred to by the input parameters | | A map containing the name, version, url for the REST endpoint (fields named name, version, and url). Note that the output of this function is suitable for input into the first argument of MAAS_MODEL_APPLY. |
MAAS_MODEL_APPLY | Returns the output of a model deployed via Model as a Service. Note: Results are cached locally for 10 minutes. | | The output of the model deployed as a REST endpoint in map form. Assumes REST endpoint returns a JSON map. |
MAP | Applies a lambda expression to a list of arguments. For example, `MAP( [ 'foo', 'bar' ] , (x) -> TO_UPPER(x) )` would yield `[ 'FOO', 'BAR' ]`. | | The input list transformed item-wise by the lambda expression. |
MAP_EXISTS | Checks for existence of a key in a map | | True if the key is found in the map and false if otherwise |
MAP_GET | Gets the value associated with a key from a map | | The object associated with the key in the map. If no value is associated with the key and default is specified, then default is returned. If no value is associated with the key or default, then null is returned. |
MONTH | The number representing the month. The first month, January, has a value of 0. | | The current month (0-based). |
PREPEND_IF_MISSING | Prepends the prefix to the start of the string if the string does not already start with any of the prefixes. | | A new String if prefix was prepended, the same string otherwise. |
PROFILE_FIXED | The profile periods associated with a fixed lookback starting from now | | The selected profile measurement timestamps. These are ProfilePeriod objects. |
PROFILE_GET | Retrieves a series of values from a stored profile | | The profile measurements |
PROFILE_WINDOW | The profiler periods associated with a window selector statement from an optional reference timestamp. | | The selected profile measurement periods. These are ProfilePeriod objects. |
PROTOCOL_TO_NAME | Converts the IANA protocol number to the protocol name | | The protocol name associated with the IANA number |
REDUCE | Reduces a list by a binary lambda expression. That is, the expression takes two arguments. Usage example: `REDUCE( [ 1, 2, 3 ] , (x, y) -> x + y, 0)` would sum the input list, yielding `6`. | | The reduction of the list. |
REGEXP_MATCH | Determines whether a regex matches a string | | True if the regex pattern matches the string and false if otherwise |
SPLIT | Splits the string by the delimiter | | List of strings |
STARTS_WITH | Determines whether a string starts with a prefix | | True if the string starts with the specified prefix and false if otherwise |
STATS_ADD | Adds one or more input values to those that are used to calculate the summary statistics | | A Stellar statistics object |
STATS_BIN | Computes the bin that the value is in based on the statistical distribution. | | Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin. |
STATS_COUNT | Calculates the count of the values accumulated (or in the window if a window is used) | | The count of the values in the window or NaN if the statistics object is null |
STATS_GEOMETRIC_MEAN | Calculates the geometric mean of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics | | The geometric mean of the values in the window or NaN if the statistics object is null |
STATS_INIT | Initializes a statistics object | | A Stellar statistics object |
STATS_KURTOSIS | Calculates the kurtosis of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics | | The kurtosis of the values in the window or NaN if the statistics object is null |
STATS_MAX | Calculates the maximum of the accumulated values (or in the window if a window is used) | | The maximum of the accumulated values in the window or NaN if the statistics object is null |
STATS_MEAN | Calculates the mean of the accumulated values (or in the window if a window is used) | | The mean of the values in the window or NaN if the statistics object is null |
STATS_MERGE | Merges statistics objects | | A Stellar statistics object |
STATS_MIN | Calculates the minimum of the accumulated values (or in the window if a window is used) | | The minimum of the accumulated values in the window or NaN if the statistics object is null |
STATS_PERCENTILE | Computes the p'th percentile of the accumulated values (or in the window if a window is used) | | The p'th percentile of the data or NaN if the statistics object is null |
STATS_POPULATION_VARIANCE | Calculates the population variance of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics | | The population variance of the values in the window or NaN if the statistics object is null |
STATS_QUADRATIC_MEAN | Calculates the quadratic mean of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics | | The quadratic mean of the values in the window or NaN if the statistics object is null |
STATS_SD | Calculates the standard deviation of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics | | The standard deviation of the values in the window or NaN if the statistics object is null |
STATS_SKEWNESS | Calculates the skewness of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics | | The skewness of the values in the window or NaN if the statistics object is null |
STATS_SUM | Calculates the sum of the accumulated values (or in the window if a window is used) | | The sum of the values in the window or NaN if the statistics object is null |
STATS_SUM_LOGS | Calculates the sum of the (natural) log of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics | | The sum of the (natural) log of the values in the window or NaN if the statistics object is null |
STATS_SUM_SQUARES | Calculates the sum of the squares of the accumulated values (or in the window if a window is used) | | The sum of the squares of the values in the window or NaN if the statistics object is null |
STATS_VARIANCE | Calculates the variance of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics | | The variance of the values in the window or NaN if the statistics object is null |
STRING_ENTROPY | Computes the base-2 Shannon entropy of a string. | input - string | The base-2 Shannon entropy of the string (https://en.wikipedia.org/wiki/Entropy_(information_theory)#Definition). The unit of this is bits. |
SYSTEM_ENV_GET | Returns the value associated with an environment variable | | String |
SYSTEM_PROPERTY_GET | Returns the value associated with a Java system property | | String |
TO_DOUBLE | Transforms the first argument to a double precision number | | Double version of the first argument |
TO_EPOCH_TIMESTAMP | Returns the epoch timestamp of the dateTime in the specified format. If the format does not have a timezone and you wish to assume a given timezone, you may specify the timezone optionally. | | Epoch timestamp |
TO_INTEGER | Transforms the first argument to an integer | | Integer version of the first argument |
TO_LOWER | Transforms the first argument to a lowercase string | | String |
TO_STRING | Transforms the first argument to a string | | String |
TO_UPPER | Transforms the first argument to an uppercase string | | Uppercase string |
TRIM | Trims white space from both sides of a string | | String |
URL_TO_HOST | Extracts the hostname from a URL | | The hostname from the URL as a string (for example, URL_TO_HOST('http://www.yahoo.com/foo') would yield 'www.yahoo.com') |
URL_TO_PATH | Extracts the path from a URL | | The path from the URL as a string (for example, URL_TO_PATH('http://www.yahoo.com/foo') would yield 'foo') |
URL_TO_PORT | Extracts the port from a URL. If the port is not explicitly stated in the URL, then an implicit port is inferred based on the protocol. | | The port used in the URL as an integer (for example, URL_TO_PORT('http://www.yahoo.com/foo') would yield 80) |
URL_TO_PROTOCOL | Extracts the protocol from a URL | | The protocol from the URL as a string (for example, URL_TO_PROTOCOL('http://www.yahoo.com/foo') would yield 'http') |
WEEK_OF_MONTH | The numbered week within the month. The first week within the month has a value of 1. | | The numbered week within the month |
WEEK_OF_YEAR | The numbered week within the year. The first week in the year has a value of 1. | | The numbered week within the year |
YEAR | The number representing the year | | The current year |
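
The bin rule given for BIN and STATS_BIN in the table above, bound(N-1) < value <= bound(N), can be illustrated with a short Java sketch. This is only a model of the rule as described, not the Stellar implementation: the class name `BinSketch` and the method signature are hypothetical, and the sketch assumes the bounds are supplied in ascending order.

```java
// Hypothetical sketch of the documented binning rule; not Stellar source code.
final class BinSketch {
    /**
     * Returns the index N of the first bound with value <= bound(N).
     * Values at or below the first bound land in bin 0, and values above
     * the last of the M bounds land in bin M.
     * Assumes the bounds are sorted in ascending order.
     */
    static int bin(double value, double... bounds) {
        for (int n = 0; n < bounds.length; n++) {
            if (value <= bounds[n]) {
                return n;
            }
        }
        return bounds.length;
    }
}
```

With the bounds 10, 20, and 30, for example, the values 5 and 10 fall in bin 0, 15 and 20 fall in bin 1, and 35 falls in bin 3, so M bounds always produce M + 1 possible bins.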