Improve performance when parsing many strings in the same format #28
A bit more explanation of the code would help.
* Some additional comments
* Some small tweaks to the code in an attempt to improve readability
* Extended the test case a bit
I tried to make the code clearer by adding comments and doing some more cleanup. Let me know if there are specific parts that are still unclear.
pom.xml
Outdated
@@ -5,7 +5,7 @@
     <modelVersion>4.0.0</modelVersion>

     <groupId>com.github.sisyphsu</groupId>
-    <artifactId>dateparser</artifactId>
+    <artifactId>dateparser-xyzt-ai</artifactId>
Ow, this was a mistake.
This was done for our local fork, and should not have been included in this PR. Will revert this for the PR.
Looks like this PR isn't ready to be merged. The following test fails:
@Test
void foo() {
    DateParser parser = DateParser.newBuilder().optimizeForReuseSimilarFormatted(true).build();
    String inputString = "2022-08-09 19:04:31.600000+00:00";
    assertEquals(parser.parseDate(inputString), parser.parseDate(inputString));
}
I'm afraid it will require a fix for #29 first.
Proposal for #17.
By keeping track of which rules were used to parse the first string, parsing of subsequent strings can first try a matcher that only uses a subset of those rules.
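To illustrate the general idea only, here is a minimal, self-contained sketch. It is not the code in this PR and does not use the library's rule or matcher classes: `CachingDateParser`, `ALL_FORMATS`, and `attempts()` are hypothetical names, and plain `java.time` formatters stand in for rules. It is also simplified to remember a single format rather than a subset of rules. The parser remembers what matched last time, tries that first, and falls back to the full scan when it does not match.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.List;

/** Hypothetical illustration of "remember what matched last time"; not the library's implementation. */
final class CachingDateParser {
    // Candidate formats, tried in order when nothing is cached (stand-ins for parsing rules).
    private static final List<DateTimeFormatter> ALL_FORMATS = List.of(
            DateTimeFormatter.ofPattern("uuuu-MM-dd"),
            DateTimeFormatter.ofPattern("dd/MM/uuuu"),
            DateTimeFormatter.ofPattern("uuuu.MM.dd"));

    // The format that matched the previous input, or null before the first success.
    private DateTimeFormatter lastMatched;

    // Number of format attempts so far; exposed only to make the saving visible.
    private int attempts;

    LocalDate parse(String input) {
        // Fast path: retry the format that worked for the previous string.
        if (lastMatched != null) {
            LocalDate date = tryParse(lastMatched, input);
            if (date != null) {
                return date;
            }
        }
        // Slow path: scan the remaining candidates and remember the one that matches.
        for (DateTimeFormatter format : ALL_FORMATS) {
            if (format == lastMatched) {
                continue; // already tried on the fast path
            }
            LocalDate date = tryParse(format, input);
            if (date != null) {
                lastMatched = format;
                return date;
            }
        }
        throw new IllegalArgumentException("Unrecognized date: " + input);
    }

    // Returns the parsed date, or null when the input does not fit the format.
    private LocalDate tryParse(DateTimeFormatter format, String input) {
        attempts++;
        try {
            return LocalDate.parse(input, format);
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    int attempts() {
        return attempts;
    }
}
```

In this sketch, parsing `"09/08/2022"` and then `"10/08/2022"` costs two attempts for the first string and only one for the second. The fallback to the full scan keeps results correct when a later string uses a different format.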
The case in the benchmark is between 2 and 3 times faster on my machine: