Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


java - What is the best way to validate that a date string is a valid date according to a format?

I recently started working on a web UI and ran into the problem of parsing and validating date strings in a format like "dd-mm-yyyy". Some approaches I found are:

  1. Matching - not complete validation, not flexible.

    (19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])

  2. There was a post where someone suggested pre-initializing a Set with all possible date strings - fast and valid, but also not flexible, and memory-consuming.

Is there something easier, maybe available in public libraries?

Please don't suggest SimpleDateFormat :)

UPDATE: for Java 8, the correct answer is https://stackoverflow.com/a/43076001/1479668


1 Answer


Preamble:

If you don't care about the details, then the accepted answer suggesting DateTimeFormatter.ofPattern("yyyy MM dd") is fine. Otherwise, if you are interested in the tricky details of parsing, read on:


Regular expressions

As you have already recognized, complete validation is not possible with a regular expression like (19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01]). For example, this expression would accept "2017-02-31" (February with 31 days???).
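To make the gap concrete, here is a minimal sketch that runs the expression from the question against an impossible date. The class and method names (RegexDateCheck, looksLikeDate) are made up for this illustration.

```java
import java.util.regex.Pattern;

final class RegexDateCheck {
    // The expression from the question, with the digit classes written as \d.
    static final Pattern DATE_SHAPE = Pattern.compile(
        "(19|20)\\d\\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])");

    // True if the whole text has the shape year-month-day; says nothing
    // about whether that day exists in that month.
    static boolean looksLikeDate(String text) {
        return DATE_SHAPE.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeDate("2017-02-31")); // true, although Feb 31 does not exist
        System.out.println(looksLikeDate("2017-02.31")); // true, mixed separators pass as well
        System.out.println(looksLikeDate("2017-13-01")); // false, month 13 is caught
    }
}
```

The regex only checks each field against a fixed range, so month lengths and leap years are out of its reach.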

Java-8-parsing mechanism

The Java 8 class DateTimeFormatter, however, can reject such non-existent dates just by parsing. To go into the details, we have to distinguish between syntactic validation and calendrical validation. The first kind, syntactic validation, is performed by the method parseUnresolved(). Its javadoc says:

"Parsing is implemented as a two-phase operation. First, the text is parsed using the layout defined by the formatter, producing a Map of field to value, a ZoneId and a Chronology. Second, the parsed data is resolved, by validating, combining and simplifying the various fields into more useful ones. This method performs the parsing stage but not the resolving stage."
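The two phases can be seen separately in a small sketch like the following. The class TwoPhaseDemo and its method names are invented for this example, and the pattern "uuuu-MM-dd" is just a simple stand-in layout.

```java
import java.text.ParsePosition;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.time.temporal.TemporalAccessor;

final class TwoPhaseDemo {
    static final DateTimeFormatter ISO_LIKE =
        DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(ResolverStyle.STRICT);

    // Phase 1 only: does the text fit the layout of the pattern?
    // Errors are reported through the ParsePosition, not through an exception.
    static boolean fitsLayout(String text) {
        ParsePosition pos = new ParsePosition(0);
        TemporalAccessor fields = ISO_LIKE.parseUnresolved(text, pos);
        return fields != null
            && pos.getErrorIndex() == -1
            && pos.getIndex() == text.length();
    }

    // Phase 1 and 2: does the text also resolve to a real calendar date?
    static boolean isRealDate(String text) {
        try {
            LocalDate.parse(text, ISO_LIKE);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fitsLayout("2017-02-31")); // true: syntactically fine
        System.out.println(isRealDate("2017-02-31")); // false: rejected when resolving
        System.out.println(fitsLayout("2017-02-3x")); // false: layout mismatch, no exception
    }
}
```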

The main advantage of this method is that it does not use exception flow, which makes this kind of parsing fast. However, the second step of parsing does use exception flow; see also the javadoc of the method parse(CharSequence, ParsePosition):

"By contrast, this method will throw a DateTimeParseException if an error occurs, with the exception containing the error index. This change in behavior is necessary due to the increased complexity of parsing and resolving dates/times in this API."

In my opinion, this is a limitation in terms of performance. Another drawback is that the currently available API does not let you specify a dot OR a hyphen, as you have done in your regular expression. The API only offers a construct like "[.][-]" (using optional sections), but the problem is that an input sequence of ".-" would also be accepted by Java 8.
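The optional-section behaviour described above can be checked with a short sketch. OptionalSeparatorDemo and fitsLayout are hypothetical names, and only the syntactic phase (parseUnresolved) is exercised here.

```java
import java.text.ParsePosition;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;

final class OptionalSeparatorDemo {
    // Two optional sections in a row: each one may or may not be present.
    static final DateTimeFormatter DOT_OR_HYPHEN =
        DateTimeFormatter.ofPattern("uuuu[.][-]MM[.][-]dd");

    // Syntactic check only, including a check for trailing characters.
    static boolean fitsLayout(String text) {
        ParsePosition pos = new ParsePosition(0);
        TemporalAccessor fields = DOT_OR_HYPHEN.parseUnresolved(text, pos);
        return fields != null
            && pos.getErrorIndex() == -1
            && pos.getIndex() == text.length();
    }

    public static void main(String[] args) {
        System.out.println(fitsLayout("2017-02-28"));   // true
        System.out.println(fitsLayout("2017.02.28"));   // true
        System.out.println(fitsLayout("2017.-02.-28")); // true: both separators at once
        System.out.println(fitsLayout("2017/02/28"));   // false: slash is not covered
    }
}
```

So "[.][-]" behaves like "an optional dot followed by an optional hyphen", which is weaker than a real either-or.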

Well, these minor disadvantages are mentioned here for completeness. A final, almost perfect solution in Java 8 would be:

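    import java.text.ParsePosition;
    import java.time.DateTimeException;
    import java.time.LocalDate;
    import java.time.format.DateTimeFormatter;
    import java.time.format.ResolverStyle;
    import java.time.temporal.TemporalAccessor;

    String input = "2017-02.-31";
    // "uuuu" (proleptic year) instead of "yyyy": in strict mode, year-of-era needs an era
    DateTimeFormatter dtf =
        DateTimeFormatter.ofPattern("uuuu[.][-]MM[.][-]dd").withResolverStyle(
            ResolverStyle.STRICT // smart mode truncates to Feb 28!
        );
    ParsePosition pos = new ParsePosition(0);
    TemporalAccessor ta = dtf.parseUnresolved(input, pos); // step 1: syntax only, no exception
    LocalDate date = null;
    if (pos.getErrorIndex() == -1 && pos.getIndex() == input.length()) {
        try {
            date = LocalDate.parse(input, dtf); // step 2: full parse including resolving
        } catch (DateTimeException dte) {
            dte.printStackTrace(); // in strict mode (see resolver style above)
        }
    }
    System.out.println(date); // null in strict mode, 2017-02-28 in smart mode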

Important:

  • The best possible validation is only achieved with the strict resolver style.
  • The proposed validation also checks whether there are trailing unparsed characters.
  • The result ta of parseUnresolved() in step 1 cannot be reused as an intermediate result because of internal limitations of resolving, so the input is parsed twice. This 2-step approach is therefore not ideal for performance. I have not benchmarked it against a normal 1-step approach, but I hope that the main author of the new API (S. Colebourne) has done so; see also, for comparison, his solution in his own ThreeTen-Extra library. It is more or less a hackish workaround to avoid exception flow as much as possible.
  • For Java 6 and 7, there is a backport available.
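The first point in the list above, strict versus smart resolving, can be illustrated with a minimal sketch. ResolverStyleDemo and its helper methods are invented for this example and use a plain "uuuu-MM-dd" pattern.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

final class ResolverStyleDemo {
    private static LocalDate parseOrNull(String text, ResolverStyle style) {
        DateTimeFormatter formatter =
            DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(style);
        try {
            return LocalDate.parse(text, formatter);
        } catch (DateTimeParseException e) {
            return null; // not a valid date under this resolver style
        }
    }

    // SMART is the default style of DateTimeFormatter.ofPattern(...).
    static LocalDate parseSmart(String text) {
        return parseOrNull(text, ResolverStyle.SMART);
    }

    static LocalDate parseStrict(String text) {
        return parseOrNull(text, ResolverStyle.STRICT);
    }

    public static void main(String[] args) {
        System.out.println(parseSmart("2017-02-31"));  // 2017-02-28 (silently adjusted)
        System.out.println(parseStrict("2017-02-31")); // null (rejected)
        System.out.println(parseStrict("2017-02-28")); // 2017-02-28
    }
}
```

For validation purposes, the silent adjustment in smart mode is exactly what you want to avoid, hence the strict style.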

Alternative

If you are looking for an alternative other than SimpleDateFormat, you might also find my library Time4J interesting. It supports real OR-logic and avoids exception flow as much as possible (highly tuned parsing in only one step). Example:

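    import java.util.Locale;
    import net.time4j.PlainDate;
    import net.time4j.format.expert.ChronoFormatter;
    import net.time4j.format.expert.ParseLog;
    import net.time4j.format.expert.PatternType;

    String input = "2017-02-31";
    ParseLog plog = new ParseLog(); // collects parse errors instead of throwing
    PlainDate date =
        ChronoFormatter.ofDatePattern(
            "uuuu-MM-dd|uuuu.MM.dd", PatternType.CLDR, Locale.ROOT)
        .parse(input, plog); // uses smart mode by default and rejects Feb 31 in this mode
    if (plog.isError()) {
        System.out.println(plog.getErrorMessage());
    } else {
        System.out.println(date);
    }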

Notes:

  • A check of trailing characters can be included in the same way as in Java-8
  • The parsed result is easily convertible to LocalDate via date.toTemporalAccessor()
  • Using the format attribute Attributes.LENIENCY would weaken the validation
  • Time4J is also available for Java 6+7 (when using version line v3.x)
