Examples of AI functions and UDFs

Learn to use the ai_generate_text_default function for tasks like analyzing Amazon book reviews. Also, see how to create custom UDFs to make AI text analysis easier and more efficient, enhancing your data workflows and insights.

Examples of using the built-in AI function

The following example lists the steps needed to turn a prompt into a custom SQL function using just the built-in function ai_generate_text_default.
> select ai_generate_text_default(‘hello’);
  
Response:
Hello! How can I assist you today?
In the below example, a query is sent to the Amazon book reviews database for the book titled Artificial Superintelligence. The large language model (LLM) is prompted to classify the sentiment as positive, neutral, or negative.
> select customer_id, star_rating, ai_generate_text_default(CONCAT(‘Classify the following review as positive, neutral, or negative’, and only include the uncapitalized category in the response: ‘, review_body)) AS review_analysis, review_body from amazon_book_reviews where product_title=’Artificial Superintelligence’ order by customer_id LIMIT 1;

Response:
+--+------------+------------+----------------+------------------+
|  |customer_id |star_rating |review_analysis |review_body       |
+--+------------+------------+----------------+------------------+
|1 |4343565     | 5          |positive        |What is this book |
|  |            |            |                |all about …………    |
+--+------------+------------+----------------+------------------+

Examples of creating and using custom UDFs along with the built-in AI function

Instead of writing the prompts in a SQL query, you can build an UDF with your intended prompt. Once you build your custom UDF with the prompt, you can pass that prompt into the ai_generate_text_default built-in Impala function.

Example: Classify input customer reviews

The following UDF uses the Amazon book reviews database as the input and requests the LLM to classify the sentiment.

Classify input customer reviews:

IMPALA_UDF_EXPORT

StringVal ClassifyReviewsDefault(FunctionContext* context, const StringVal& input) {

 std::string request =

     std::string("Classify the following review as positive, neutral, or negative")

     + std::string(" and only include the uncapitalized category in the response: ")

     + std::string(reinterpret_cast<const char*>(input.ptr), input.len);

 StringVal prompt(request.c_str());

 return context->Functions()->ai_generate_text_default(context, prompt);

}

Now you can define these prompt building UDF and build them in Impala. Once you have them running, you can query your data sets using them.

Creating analyze_reviews function:

> CREATE FUNCTION analyze_reviews(STRING)
RETURNS STRING
LOCATION ‘s3a://dw-...............’
SYMBOL=’ClassifyReviews’

Using SELECT query for Sentiment analysis to classify Amazon book reviews

> SELECT customer_id, star_rating, analyze_reviews(review_body) AS review_analysis, review_body from amazon_book_reviews where product_title=’Artificial Superintelligence’ order by customer_id;
Impala Sentiment Analysis