adzap edited this page Sep 27, 2010 · 7 revisions

ValidatesTimeliness Parser

The parser ships with a set of default formats. You can add to them easily using the plugin's format rules, and you can remove formats without hacking the plugin at all.

Below are the default formats. If you find them easy to read, you will be happy to know that you define your own formats in exactly the same notation. No complex regular expressions are needed.

Time formats

hh:nn:ss
hh-nn-ss
h:nn
h.nn
h nn
h-nn
h:nn_ampm
h.nn_ampm
h nn_ampm
h-nn_ampm
h_ampm

NOTE: Any time format without a meridian token (the ‘ampm’ token) is interpreted as 24-hour time.
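As a rough illustration of the note above, here is a minimal sketch of how an hour might be normalised with and without a meridian. The helper `to_24_hour` is hypothetical and is not part of the plugin; it only shows the rule that a missing meridian leaves the hour untouched.

```ruby
# Hypothetical helper, not part of the plugin: converts a parsed hour and an
# optional meridian string into a 24-hour value.
def to_24_hour(hour, meridian = nil)
  # No meridian token in the format: the hour is already in 24-hour time.
  return hour if meridian.nil?

  # Normalise 'a.m.', 'A.M' and 'am' to the same form.
  normalised = meridian.delete('.').downcase
  hour %= 12 # 12 becomes 0 here; 'pm' adds 12 back below
  normalised == 'pm' ? hour + 12 : hour
end

puts to_24_hour(9)          # 9
puts to_24_hour(9, 'pm')    # 21
puts to_24_hour(12, 'a.m.') # 0
```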

Date formats

yyyy/mm/dd
yyyy-mm-dd
yyyy.mm.dd
m/d/yy  OR  d/m/yy
m\d\yy  OR  d\m\yy
d-m-yy
d.m.yy
d mmm yy

NOTE: To use non-US date formats, see the US/Euro Formats section.

Datetime formats

m/d/yy h:nn:ss   OR  d/m/yy hh:nn:ss
m/d/yy h:nn      OR  d/m/yy h:nn
m/d/yy h:nn_ampm OR  d/m/yy h:nn_ampm
yyyy-mm-dd hh:nn:ss
yyyy-mm-dd h:nn
ddd mmm d hh:nn:ss zo yyyy # Ruby time string
yyyy-mm-ddThh:nn:ssZ  # ISO 8601 without zone offset
yyyy-mm-ddThh:nn:sszo # ISO 8601 with zone offset

NOTE: To use non-US date formats, see the US/Euro Formats section.

Here is what each format token means:

Format tokens:
     y = year
     m = month
     d = day
     h = hour
     n = minute
     s = second
     u = micro-seconds
  ampm = meridian (am or pm) with or without dots (e.g. am, a.m, or a.m.)
     _ = optional space
    tz = Timezone abbreviation (e.g. UTC, GMT, PST, EST)
    zo = Timezone offset (e.g. +10:00, -08:00, +1000)

Repeating tokens:
     x = 1 or 2 digits for unit (e.g. 'h' means an hour can be '9' or '09')
    xx = 2 digits exactly for unit (e.g. 'hh' means an hour can only be '09')

Special Cases:
    yy = 2 or 4 digit year
  yyyy = exactly 4 digit year
   mmm = month name, abbreviated or full (e.g. 'Jul' or 'July')
   ddd = Day name of 3 to 9 letters (e.g. Wed or Wednesday)
     u = microseconds matches 1 to 3 digits

All other characters are considered literal. For the technically minded, these formats are compiled into regular expressions at runtime, so they add little overhead compared with using regular expressions directly.
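To make the compilation step concrete, here is a minimal sketch of turning a format string into a regular expression. It is not the plugin's implementation: the table `TOKEN_PATTERNS`, the scanner `TOKEN_SCANNER` and the helper `compile_format` are hypothetical names, and only a few of the tokens above are covered.

```ruby
# Hypothetical token table covering only a few of the tokens listed above.
TOKEN_PATTERNS = {
  'ampm' => '([ap]\.?m\.?)', # am, a.m or a.m. (case handled by the /i flag)
  'yyyy' => '(\d{4})',
  'hh'   => '(\d{2})',
  'nn'   => '(\d{2})',
  'ss'   => '(\d{2})',
  'h'    => '(\d{1,2})',
  '_'    => '\s?'            # optional space
}.freeze

# Group 1 is a known token, group 2 is any other single character.
# Alternation is tried left to right, so 'hh' wins over 'h'.
TOKEN_SCANNER = /(#{Regexp.union(TOKEN_PATTERNS.keys).source})|(.)/m

def compile_format(format)
  body = format.gsub(TOKEN_SCANNER) do
    token = Regexp.last_match(1)
    literal = Regexp.last_match(2)
    # Tokens become capture groups; everything else is escaped as a literal.
    token ? TOKEN_PATTERNS.fetch(token) : Regexp.escape(literal)
  end
  Regexp.new("\\A#{body}\\z", Regexp::IGNORECASE)
end

p compile_format('h:nn_ampm').match('9:30 p.m.').captures # ["9", "30", "p.m."]
p compile_format('hh:nn:ss').match?('9:30:00')            # false
```

Because literal characters are escaped, the dot in 'h.nn' only matches a real dot, which is why no regular expression knowledge is needed to write a format.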

To see all defined formats look at the parser source code.

US/Euro Formats

The perennial problem for non-US developers, or for applications not primarily aimed at the US, is the US date format of m/d/yy. It is ambiguous with the European format of d/m/yy. By default the plugin uses the US formats, as this is the Ruby default when it does date interpretation.

To switch to the European (or Rest of The World) formats, put this in the initializer:

config.parser.remove_us_formats

Now ‘01/02/2000’ will be parsed as 1st February 2000, instead of 2nd January 2000.
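The ambiguity can be seen without the plugin at all. The sketch below uses only Ruby's standard `Date.strptime` to read the same string under both conventions; it illustrates the problem and does not show how the plugin parses values.

```ruby
require 'date'

value = '01/02/2000'

us_date   = Date.strptime(value, '%m/%d/%Y') # month first
euro_date = Date.strptime(value, '%d/%m/%Y') # day first

puts us_date.iso8601   # 2000-01-02
puts euro_date.iso8601 # 2000-02-01
```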

Customising Formats

Sometimes you may not want certain formats to be valid. You can remove formats for each type, and the parser will then no longer accept values in those formats. To remove a format, put this in the initializer:

config.parser.remove_formats(:date, 'm\d\yy')

Adding new formats is simple. Again, put this in the initializer file:

config.parser.add_formats(:time, "h o'clock")

Now “10 o’clock” will be a valid value.

You can embed regular expressions in a format, but there are no guarantees that they will remain intact. If the regexp avoids token characters and does not use dots or backslashes as special characters, it may well work as expected. For special characters, use POSIX character classes for safety. See the ISO 8601 datetime format for an example of an embedded regular expression.

Because formats are evaluated in order, a new format that is ambiguous with an existing format will be ignored, since the existing format matches first. If you need your new format to take precedence over an existing one, include the before option like so:

config.parser.add_formats(:time, 'ss:nn:hh', :before => 'hh:nn:ss')

Now a time of ‘59:30:23’ will be interpreted as 11:30:59 pm. This option saves you from having to add the new format and remove the old one just to get yours recognised.
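The effect of ordering can be sketched with a plain first-match-wins list. Everything here is hypothetical and far simpler than the plugin: `FORMATS`, `parse_time` and `add_format` are illustrative names, and the `before:` keyword only mimics the idea of the before option.

```ruby
# Hypothetical ordered format list: [name, regexp, meaning of each capture].
FORMATS = [
  ['hh:nn:ss', /\A(\d{2}):(\d{2}):(\d{2})\z/, %i[hour min sec]]
].freeze

# Returns the fields from the first format whose regexp matches.
def parse_time(value, formats)
  formats.each do |_name, regexp, fields|
    match = regexp.match(value)
    return fields.zip(match.captures.map(&:to_i)).to_h if match
  end
  nil
end

# Returns a new list with the entry appended, or placed before a named format.
def add_format(formats, entry, before: nil)
  index = before ? formats.index { |name, _| name == before } : nil
  formats.dup.insert(index || formats.length, entry)
end

reversed = ['ss:nn:hh', /\A(\d{2}):(\d{2}):(\d{2})\z/, %i[sec min hour]]

appended  = add_format(FORMATS, reversed)
prepended = add_format(FORMATS, reversed, before: 'hh:nn:ss')

puts parse_time('59:30:23', appended)[:hour]  # 59, the new format is never reached
puts parse_time('59:30:23', prepended)[:hour] # 23, the new format is tried first
```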

Ambiguous Year

When dealing with 2-digit year values, a year at or above 30 is by default interpreted as being in the last century (19xx), and a year below 30 as being in the current one (20xx). You can customize this threshold:

config.parser.ambiguous_year_threshold = 20

Now you get:

year of 19 is considered 2019
year of 20 is considered 1920
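The threshold rule amounts to a single comparison. Here is a minimal sketch, assuming the two centuries involved are the 1900s and the 2000s as in the examples above; `expand_two_digit_year` is a hypothetical helper, not a plugin method.

```ruby
# Hypothetical helper: expands a 2-digit year using a threshold.
# Years at or above the threshold fall in the 1900s, the rest in the 2000s.
def expand_two_digit_year(year, threshold = 30)
  year < threshold ? 2000 + year : 1900 + year
end

puts expand_two_digit_year(19, 20) # 2019
puts expand_two_digit_year(20, 20) # 1920
puts expand_two_digit_year(29)     # 2029 with the default threshold of 30
puts expand_two_digit_year(30)     # 1930
```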