Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove (disable) Perfomace/Casecmp (at least for Ruby 2.4) #4277

Closed
AlexWayfer opened this issue Apr 14, 2017 · 13 comments
Closed

Remove (disable) Perfomace/Casecmp (at least for Ruby 2.4) #4277

AlexWayfer opened this issue Apr 14, 2017 · 13 comments

Comments

@AlexWayfer
Copy link
Contributor

Expected behavior

No warnings for:

'АБВГ'.downcase == 'абвг'

Actual behavior

Warning: Use casecmp instead of downcase == .

Explanation

The String#casecmp method does not work with Unicode (even in Ruby 2.4.1), this is written in the documentation.

There is a method String#casecmp?, which works with Unicode, but it is slower:

Warming up --------------------------------------
String#downcase + ==   233.440k i/100ms
      String#casecmp   274.247k i/100ms
     String#casecmp?   219.906k i/100ms
Calculating -------------------------------------
String#downcase + ==      5.746M (± 1.7%) i/s -     28.947M in   5.039252s
      String#casecmp      6.942M (± 1.8%) i/s -     34.829M in   5.019073s
     String#casecmp?      4.517M (± 2.6%) i/s -     22.650M in   5.017864s

Comparison:
      String#casecmp:  6941676.9 i/s
String#downcase + ==:  5745893.8 i/s - 1.21x  slower
     String#casecmp?:  4517314.3 i/s - 1.54x  slower

RuboCop version

$ rubocop -V
0.48.1 (using Parser 2.4.0.0, running on ruby 2.4.1 x86_64-linux)
@AlexWayfer
Copy link
Contributor Author

Here is the issue for fast-ruby, referenced by cop.

@backus
Copy link
Contributor

backus commented Apr 14, 2017

If you are using Ruby 2.4 then the new casecmp? predicate method works: 'АБВГ'.casecmp?('абвг') # => true

@AlexWayfer
Copy link
Contributor Author

@backus Yes, but it's only in Ruby 2.4, and it's slower than String#downcase + ==.

So, String#downcase + == is good compromise between Unicode support (but only in Ruby 2.4, again) and perfomance.

In Ruby < 2.4… We need to use unicode gem in any way :( And casecmp + zero? will be faster, yes.

@AlexWayfer AlexWayfer changed the title Remove (disable) Perfomace/Casecmp Remove (disable) Perfomace/Casecmp (at least for Ruby 2.4) Apr 14, 2017
@bbatsov
Copy link
Collaborator

bbatsov commented Apr 15, 2017

I'd add a style check suggesting this use of casecmp? over downcase and == on 2.4. As for this particular issue - I see your point, but I think some people are finding it useful. Not sure what's the best course of action here.

@AlexWayfer
Copy link
Contributor Author

I'd add a style check suggesting this use of casecmp? over downcase and == on 2.4.

casecmp? is slower than downcase + ==!

Not sure what's the best course of action here.

Ruby >= 2.4: casecmp doesn't work with Unicode, casecmp? is slower than downcase + ==, so downcase + == is good compromise.

Ruby < 2.4: No casecamp? method, and we're needing to use unicode gem (or any other external solution), which make code like this:

# frozen_string_literal: true

require 'benchmark/ips'

require 'unicode'

SLUG = 'АБВГ'

def downcase
  Unicode.downcase(SLUG) == 'абвг' # => true
end

def casecmp
  Unicode.downcase(SLUG).casecmp('абвг').zero? # => true
end

Benchmark.ips do |x|
  x.report('String#downcase + ==') { downcase }
  x.report('String#casecmp')       { casecmp }
  x.compare!
end

And that's results:

Warming up --------------------------------------
String#downcase + ==   121.382k i/100ms
      String#casecmp   110.585k i/100ms
Calculating -------------------------------------
String#downcase + ==      1.569M (± 1.6%) i/s -      7.890M in   5.031095s
      String#casecmp      1.396M (± 1.7%) i/s -      7.077M in   5.070289s

Comparison:
String#downcase + ==:  1568611.7 i/s
      String#casecmp:  1396273.2 i/s - 1.12x  slower

@bbatsov
Copy link
Collaborator

bbatsov commented Apr 15, 2017

Style checks are not about speed, but about writing elegant code. From time to time I think the whole performance cops initiative was misguided, as the performance of certain things changes drastically between MRI revisions. Not to mention it might be completely different on JRuby and Rbx. One we finalize some extensions API, probably they would be moved out of RuboCop's core. Together with the Rails cops.

@bbatsov
Copy link
Collaborator

bbatsov commented Apr 15, 2017

Ruby >= 2.4: casecmp doesn't work with Unicode, casecmp? is slower than downcase + ==, so downcase + == is good compromise.

Many people don't really deal with non-ASCII strings, so this is not as big of an issue as it might seem. If everything you deal with is in English (quite common situation) than you're golden. If not - you should probably just disable this cop. So far you're the first person to complain about this, which leads to believe it's not a big problem in practice.

@AlexWayfer
Copy link
Contributor Author

Style checks are not about speed, but about writing elegant code.

Yes, but I'm not talking about "Style checks", but about Performance/ cop :) And this cop doesn't reference to Ruby Style Guide, but to Fast Ruby.

Maybe this cop must be modified and moved to another scope (Style/?), and some description must be added to Style Guide.

From time to time I think the whole performance cops initiative was misguided, as the performance of certain things changes drastically between MRI revisions. Not to mention it might be completely different on JRuby and Rbx.

Good thoughts.

Many people don't really deal with non-ASCII strings, so this is not as big of an issue as it might seem. If everything you deal with is in English (quite common situation) than you're golden. If not - you should probably just disable this cop.

Okay, you're probably right.

@Drenmi
Copy link
Collaborator

Drenmi commented May 5, 2017

From time to time I think the whole performance cops initiative was misguided, as the performance of certain things changes drastically between MRI revisions.

Yes. At least performance cops can not be "add and forget." The problem with language level optimisations in dynamic languages is that there are so many factors.

One we finalize some extensions API, probably they would be moved out of RuboCop's core.

Agreed. It should probably be handled by someone who has the interest to set up some performance test suite and run it for each new Ruby version.

@2karuitach
Copy link

$ rubocop -V
0.48.1 (using Parser 2.4.0.0, running on ruby 2.4.1 x86_64-linux)
@backus Yes, but it's only in Ruby 2.4, and it's slower than String#downcase + ==.

So, String#downcase + == is good compromise between Unicode support (but only in Ruby 2.4, again) and perfomance.

In Ruby < 2.4… We need to use unicode gem in any way :( And casecmp + zero? will be faster, yes.
I'd add a style check suggesting this use of casecmp? over downcase and == on 2.4.
casecmp? is slower than downcase + ==!

Not sure what's the best course of action here.
Ruby >= 2.4: casecmp doesn't work with Unicode, casecmp? is slower than downcase + ==, so downcase + == is good compromise.

Ruby < 2.4: No casecamp? method, and we're needing to use unicode gem (or any other external solution), which make code like this:

frozen_string_literal: true

require 'benchmark/ips'

require 'unicode'

SLUG = 'АБВГ'

def downcase
Unicode.downcase(SLUG) == 'абвг' # => true
end

def casecmp
Unicode.downcase(SLUG).casecmp('абвг').zero? # => true
end
Ruby >= 2.4: casecmp doesn't work with Unicode, casecmp? is slower than downcase + ==, so downcase + == is good compromise..ips do |x|
x.report('String#downcase + ==') { downcase }
x.report('String#casecmp') { casecmp }
x.compare!
end

@AlexWayfer
Copy link
Contributor Author

Please…

@bbatsov
Copy link
Collaborator

bbatsov commented Sep 18, 2018

There hasn't been much activity on this ticket and our Core Team is spread too thin on so many tasks, so I'll just close it for the sake of having a cleaner lists of tasks to focus on. We'd gladly review a PR, but it's unlikely that'd we're going to tackle this ourselves in the foreseeable future.

In general Performance cops are inherently linked to some MRI version, which is part of the reason we'll remove all of them from the core of RuboCop and move them to https://github.com/rubocop-hq/rubocop-performance.

@bbatsov bbatsov closed this as completed Sep 18, 2018
@AlexWayfer
Copy link
Contributor Author

There hasn't been much activity on this ticket and our Core Team is spread too thin on so many tasks, so I'll just close it for the sake of having a cleaner lists of tasks to focus on.

Issue still not resolved, and I think it should be opened. For cleaner lists of tasks you can use labels or GitHub Projects.

Closed undone issues will be forgotten or duplicated.

We'd gladly review a PR, but it's unlikely that'd we're going to tackle this ourselves in the foreseeable future.

I'd gladly make a PR, but it's a bit hard for me make serious multiple checks inside cops, inspecting source nodes, etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants