Dates In Hive

Dates In Hive

After previously spending a lot of time looking at dates in SQL Server and oracle, I wanted to do the same justice to Hive.

Sample Date Functions

Hive is closer to oracle than to SQL Server when it comes to formatting datas, here are some samples:

WITH sample_date_data AS
(
SELECT TIMESTAMP(from_unixtime(unix_timestamp())) AS the_timestamp
)
SELECT the_timestamp                      AS the_timestamp,
       CAST(the_timestamp AS DATE)        AS the_date,
     --TRUNC
       TRUNC(the_timestamp, 'MM')         AS date_month,
       TRUNC(the_timestamp, 'YY')         AS date_year,
     --Common date parts
       MONTH(the_timestamp)               AS month,
       YEAR(the_timestamp)                AS year,
     --DATE_FORMAT
       DATE_FORMAT(the_timestamp, 'u')    AS daynumber_of_week,
       DATE_FORMAT(the_timestamp, 'EEEE') AS day_name
  FROM sample_date_data
;

With the above example, one can also see the differentiation between a timestamp and a date (if you aren’t interested in the time, it’s easier not to store/manipulate it).

TRUNC

Like oracle, hive has the TRUNC command which rounds down a date to the granularity specified. Unlike Oracle, this doesn’t include a default of day. For getting a whole day. I’d recommend casting to DATE.

The TRUNC function is described as:

Returns date truncated to the unit specified by the format (as of Hive 1.2.0). Supported formats: MONTH/MON/MM, YEAR/YYYY/YY. Example: trunc(‘2015-03-17’, ‘MM’) = 2015-03-01.

Language Manual UDF

DATE_FORMAT

Similar to Oracle and SQL Server equivalents, this is based on a Java library for converting/representing values stored as dates in readable/different formats.

The DATE_FORMAT function is described as:

Converts a date/timestamp/string to a value of string in the format specified by the date format fmt (as of Hive 1.2.0). Supported formats are Java SimpleDateFormat formats – https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html. The second argument fmt should be constant. Example: date_format(‘2015-04-08’, ‘y’) = ‘2015’.

Language Manual UDF

The format strings for this come straight from Java, Some common format strings are:

  • EEEE: Day Name
  • u: Day number of the week
  • M: Month in year

There is a full list of date format strings over on the oracle Java website: SimpleDateFormat

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.