Date strings are parsed using local timezone
Certain UDFs, such as datediff(), that use the VectorUDFDateDiffColScalar class return an incorrect result when parsing scalar dates. Learn about the changes introduced to fix this issue.
Before Upgrade to CDP 7.1.7 SP1
The VectorUDFDateDiffColScalar class uses java.text.SimpleDateFormat to parse the date strings and interprets the scalar date to be in the local timezone. As a result, the computed difference can be off by one day, depending on the local timezone.
Example:
create external table test_dt(id string, dt date);
insert into test_dt values('11', '2021-07-06'), ('22', '2021-07-07');
select datediff(dt1.dt, '2021-07-01') from test_dt dt1 left join test_dt dt on dt1.id = dt.id;
Output:
+------+
| _c0  |
+------+
| 6    |
| 7    |
+------+
Expected output:
+------+
| _c0  |
+------+
| 5    |
| 6    |
+------+
After Upgrade to CDP 7.1.7 SP1
The parsing mechanism of the VectorUDFDateDiffColScalar class is updated to interpret date strings in Coordinated Universal Time (UTC), and the UDFs now return correct values.
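The difference between the two behaviors can be illustrated with a minimal Java sketch. This is not the Hive implementation: the class name DateDiffSketch and the helper epochDay are hypothetical, Asia/Kolkata is only an assumed local timezone, and the conversion from milliseconds to whole days is an assumption about how a parsed scalar ends up as a day count. It only shows how the timezone used by java.text.SimpleDateFormat can shift a date string by one day.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.util.TimeZone;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not the code of VectorUDFDateDiffColScalar.
class DateDiffSketch {

    // Parses a yyyy-MM-dd string as midnight in the given zone and
    // returns the number of whole days since 1970-01-01 UTC.
    static long epochDay(String date, TimeZone zone) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        fmt.setTimeZone(zone);
        try {
            long millis = fmt.parse(date).getTime();
            return Math.floorDiv(millis, TimeUnit.DAYS.toMillis(1));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyy-MM-dd date: " + date, e);
        }
    }

    public static void main(String[] args) {
        // The column value, already stored as days since the epoch.
        long columnDay = LocalDate.parse("2021-07-06").toEpochDay();

        // Assumed local timezone ahead of UTC: midnight on 2021-07-01 local
        // time is still 2021-06-30 in UTC, so the scalar lands one day early.
        long localScalar = epochDay("2021-07-01", TimeZone.getTimeZone("Asia/Kolkata"));

        // Parsing in UTC keeps the scalar on 2021-07-01.
        long utcScalar = epochDay("2021-07-01", TimeZone.getTimeZone("UTC"));

        System.out.println(columnDay - localScalar); // 6
        System.out.println(columnDay - utcScalar);   // 5
    }
}
```

Under these assumptions, the local-timezone parse reproduces the value 6 shown in the earlier output for the 2021-07-06 row, and the UTC parse gives the expected 5.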
For more information, see HIVE-25449.