-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add continuous language option #10
Conversation
- avoid parsing sugger 100g -> sugge, 100, g when continuous language
@@ -87,7 +92,7 @@ class RootParser < Parslet::Parser | |||
|
|||
rule(:reverse_format) do | |||
# e.g. flour 200g | |||
((whitespace >> quantity).absent? >> any).repeat.as(:ingredient) >> | |||
((whitespace >> quantity >> any.absent?).absent? >> any).repeat.as(:ingredient) >> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Take for example sugger 300g
: since sugger
contains g in the quantity, it transitions to the next rule after reading to sug
in continuous language.
Basically, nothing is considered to come after quantity, so add any.absent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite follow this change, but i'm happy that if the tests are passing then it's probably fine 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In continuation languages, where whitespace is treated as optional (denoted as "maybe"), parsing issues arise when encountering words that include units as substrings (for example, the letter "g" in "sugger"). This complication becomes apparent when tested locally.
The rule was designed to interpret strings structured as name+quantity. By ensuring no additional characters follow the quantity, we can eliminate the misinterpretation of substrings such as "g" in "suger" being wrongly recognized as a unit of quantity.
I hope I've explained it well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh great, I see! Thanks for explaining
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a nice improvement! I left a few suggestions 🙇
@@ -0,0 +1,9 @@ | |||
module Ingreedy | |||
module ContinuousLanguageLocale | |||
CONTINUOUS_LANGUAGES_LOCALES = [:ja].freeze |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can apply this improvement to Chinese and Thai languages as well? I'm not sure if there are others or not, but these both stand out to me as countries that we support at Cookpad today with a similar rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can.
For Japan, the JP development team had already reached an agreement, but if it seems appropriate, we can also add China and Thailand.
hmm 🤔 While loading ./spec/ingreedy_spec.rb a `raise SyntaxError` occurred, RSpec will now quit.
Failure/Error: require File.join(path, "dictionary_collection")
SyntaxError:
/home/runner/work/ingreedy/ingreedy/lib/ingreedy/dictionary_collection.rb:11: syntax error, unexpected ','
...cale] = Dictionary.new(locale:, **attributes)
... ^
/home/runner/work/ingreedy/ingreedy/lib/ingreedy/dictionary_collection.rb:41: syntax error, unexpected ','
...le] ||= Dictionary.new(locale:, **load_yaml(locale))
... ^
/home/runner/work/ingreedy/ingreedy/lib/ingreedy/dictionary_collection.rb:41: syntax error, unexpected ')', expecting `end'
...w(locale:, **load_yaml(locale))
... ^
/home/runner/work/ingreedy/ingreedy/lib/ingreedy/dictionary_collection.rb:52: syntax error, unexpected end-of-input, expecting `end' |
Omitting the hash value seems to be a Ruby 3.1 feature, sorry https://blog.saeloun.com/2021/09/28/ruby-allow-value-omission-in-hash-literals/ It should be I think that we also need to consider the order though, because
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
Please also consider bumping version.rb to 0.1.1 🙇 Actually maybe there is no need to do this
@@ -87,7 +92,7 @@ class RootParser < Parslet::Parser | |||
|
|||
rule(:reverse_format) do | |||
# e.g. flour 200g | |||
((whitespace >> quantity).absent? >> any).repeat.as(:ingredient) >> | |||
((whitespace >> quantity >> any.absent?).absent? >> any).repeat.as(:ingredient) >> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite follow this change, but i'm happy that if the tests are passing then it's probably fine 👍
hmm. I think so too. |
What
Add continuous language option
Why
In particular, languages like Japanese, where words are not separated by spaces, cannot be parsed, so this has been fixed.
砂糖300g
<- 砂糖=suggerHow
Add ContinuousLanguageLocale module to judge that current locale use continuous language.