Skip to content

Commit

Permalink
Various improvements
Browse files Browse the repository at this point in the history
Removed
-------

* `rack` as dependency
* Unused `config.ru`

Changed
-------

* Default path for cache is now in `./tmp` directory instead of `./config`

Added
-----

* Support of Rails 6

Development
-----------

* [EditorConfig](https://editorconfig.org/) added
* [RuboCop](https://github.com/rubocop-hq/rubocop) added and offenses fixed
* RuboCop task added for Travis CI
* Drop support for Ruby <= 2.2 (they even will not get security patches)
* Ignore built gems for git
  • Loading branch information
AlexWayfer committed Mar 17, 2021
1 parent 93fe240 commit 48236e0
Show file tree
Hide file tree
Showing 25 changed files with 199 additions and 111 deletions.
9 changes: 9 additions & 0 deletions .editorconfig
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
root = true

[*]
indent_style = space
indent_size = 2
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -5,3 +5,4 @@ Gemfile.lock
*.rbc
coverage
doc
*.gem
14 changes: 14 additions & 0 deletions .rubocop.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
AllCops:
TargetRubyVersion: 2.4
NewCops: enable

Layout/MultilineMethodCallIndentation:
EnforcedStyle: indented

Metrics/BlockLength:
Exclude:
- spec/**/*

Metrics/LineLength:
Exclude:
- spec/support/*
1 change: 0 additions & 1 deletion .ruby-version

This file was deleted.

16 changes: 12 additions & 4 deletions .travis.yml
Original file line number Diff line number Diff line change
@@ -1,9 +1,17 @@
language: ruby

rvm:
- 2.4.9
- 2.5.7
- 2.6.5
script: bundle exec rspec
- 2.4
- 2.5
- 2.6

before_install:
- gem update --system
# fixes Travis CI error: NoMethodError: undefined method `spec' for nil:NilClass
- gem install bundler

cache: bundler

script:
- bundle exec rspec
- bundle exec rubocop
2 changes: 2 additions & 0 deletions Gemfile
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
# frozen_string_literal: true

source 'https://rubygems.org'

gemspec
23 changes: 13 additions & 10 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,29 +15,32 @@ Configuration

A JSON file is used to match [user agent strings](http://simplyfast.info/browser) to a list of known bots.

If you'd like to use an [updated list](https://github.com/monperrus/crawler-user-agents) or make your own customizations, run `rake voight_kampff:import_user_agents`. This will download a `crawler-user-agents.json` file into the `./config` directory.
If you'd like to use an [updated list](https://github.com/monperrus/crawler-user-agents) or make your own customizations, run `rake voight_kampff:import_user_agents`. This will download a `crawler-user-agents.json` file into the `./tmp` directory.

__Note:__ The pattern entries in the JSON file are evaluated as [regular expressions](http://en.wikipedia.org/wiki/Regular_expression).

Usage
-----
There are three ways to use Voight-Kampff

1. Through Rack::Request such as in your [Ruby on Rails](http://rubyonrails.org) controllers:
`request.bot?`
1. Through `Rack::Request` in your app such as [Ruby on Rails](http://rubyonrails.org):
```ruby
request.bot?
```

2. Through the `VoightKampff` module:
2. Through the `VoightKampff` module:
`VoightKampff.bot? 'your user agent string'`

3. Through a `VoightKampff::Test` instance:
3. Through a `VoightKampff::Test` instance:
`VoightKampff::Test.new('your user agent string').bot?`

All of the above examples accept `human?` and `bot?` methods. All of these methods will return `true` or `false`.
All of the above examples accept `human?` and `bot?` methods.
All of these methods will return `true` or `false`.

Upgrading to version 1.0
------------------------

Version 1.0 uses a new source for a list of bot user agent strings since the old source was no longer maintained. This new source, unfortuately, does not include as much detail. Therefore the following methods have been deprecated:
Version 1.0 uses a new source for a list of bot user agent strings since the old source was no longer maintained. This new source, unfortunately, does not include as much detail. Therefore the following methods have been deprecated:
- `#browser?`
- `#checker?`
- `#downloader?`
Expand All @@ -51,10 +54,10 @@ Also, the gem no longer extends `ActionDispatch::Request` instead it extends `Ra

FAQ
---
__Q:__ __What's with the name?__
__Q:__ __What's with the name?__
__A:__ It's the [machine in Blade Runner](https://en.wikipedia.org/wiki/Blade_Runner#Voight-Kampff_machine) that is used to test whether someone is a human or a replicant.

__Q:__ __I've found a bot that isn't being matched__
__Q:__ __I've found a bot that isn't being matched__
__A:__ The list is being pulled from [github.com/monperrus/crawler-user-agents](https://github.com/monperrus/crawler-user-agents).
If you'd like to have entries added to the list, please create a pull request with that project. Once that pull request is merged, feel free to create an issue here and I'll release a new gem version with the updated list. In the meantime you can always run `rake voight_kampff:import_user_agents` on your project to get that updated list.

Expand All @@ -67,7 +70,7 @@ Thanks to [github.com/monperrus/crawler-user-agents](https://github.com/monperru
Contributing
------------
PR without tests will not get merged, Make sure you write tests for api and rails app.
PR without tests will not get merged, Make sure you write tests for API and Rails app.
Feel free to ask for help, if you do not know how to write a determined test.
Running Tests?
Expand Down
7 changes: 0 additions & 7 deletions config.ru

This file was deleted.

10 changes: 7 additions & 3 deletions lib/tasks/voight_kampff.rake
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
# frozen_string_literal: true

namespace :voight_kampff do
desc 'Import a new crawler-user-agents.json file'
task :import_user_agents, :url do |t, args|
task :import_user_agents, :url do |_t, args|
args.with_defaults url: 'https://raw.githubusercontent.com/monperrus/crawler-user-agents/master/crawler-user-agents.json'

require 'net/http'
Expand All @@ -9,8 +11,10 @@ namespace :voight_kampff do
contents = Net::HTTP.get(uri)

if contents.present?
file = File.open('./config/crawler-user-agents.json', 'w')
file.write(contents.force_encoding(Encoding::UTF_8))
File.write(
'./tmp/crawler-user-agents.json',
contents.force_encoding(Encoding::UTF_8)
)
else
puts "voight_kampff:import_user_agents - empty file received from #{uri}"
end
Expand Down
12 changes: 5 additions & 7 deletions lib/voight_kampff.rb
Original file line number Diff line number Diff line change
@@ -1,25 +1,23 @@
require 'json'
# frozen_string_literal: true

require 'voight_kampff/test'
require 'voight_kampff/methods'
require 'voight_kampff/rack_request' if defined?(Rack::Request)
require 'voight_kampff/engine' if defined?(Rails::Engine)

# Class helper methods
module VoightKampff
class << self
def root
require 'pathname'
Pathname.new File.expand_path '..', File.dirname(__FILE__)
end
ROOT = File.expand_path '..', __dir__

class << self
def human?(user_agent_string)
test(user_agent_string).human?
end

def bot?(user_agent_string)
test(user_agent_string).bot?
end
alias :replicant? :bot?
alias replicant? bot?

private

Expand Down
5 changes: 4 additions & 1 deletion lib/voight_kampff/engine.rb
Original file line number Diff line number Diff line change
@@ -1,10 +1,13 @@
# frozen_string_literal: true

module VoightKampff
# Integration with Rails
class Engine < Rails::Engine
rake_tasks do
load 'tasks/voight_kampff.rake'
end

initializer :add_voight_kampff_methods do |app|
initializer :add_voight_kampff_methods do |_app|
ActionDispatch::Request.class_eval do
include VoightKampff::Methods
end
Expand Down
19 changes: 12 additions & 7 deletions lib/voight_kampff/methods.rb
Original file line number Diff line number Diff line change
@@ -1,10 +1,15 @@
module VoightKampff::Methods
def human?
VoightKampff::Test.new(user_agent).human?
end
# frozen_string_literal: true

module VoightKampff
# Helper for Rack::Request
module Methods
extend Forwardable
def_delegators :voight_kampff, :human?, :bot?, :replicant?

private

def bot?
VoightKampff::Test.new(user_agent).bot?
def voight_kampff
VoightKampff::Test.new(user_agent)
end
end
alias :replicant? :bot?
end
6 changes: 3 additions & 3 deletions lib/voight_kampff/rack_request.rb
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# frozen_string_literal: true

# Reopen the Rack::Request class to add bot detection methods
Rack::Request.class_eval do
include VoightKampff::Methods
end
Rack::Request.include VoightKampff::Methods if defined?(Rack::Request)
79 changes: 48 additions & 31 deletions lib/voight_kampff/test.rb
Original file line number Diff line number Diff line change
@@ -1,7 +1,50 @@
# frozen_string_literal: true

module VoightKampff
# Test User-Agent against Voight-Kampff
class Test
CRAWLERS_FILENAME = 'crawler-user-agents.json'

class << self
def crawlers
@crawlers ||= JSON.parse File.read preferred_path
end

def crawler_regexp
@crawler_regexp ||= begin
# NOTE: This is admittedly a bit convoluted
# but the performance gains make it worthwhile
crawler_patterns =
crawlers.map.with_index do |crawler, index|
"(?<match#{index}>#{crawler['pattern']})"
end.join('|')
crawler_patterns = "(#{crawler_patterns})"
Regexp.new(crawler_patterns, Regexp::IGNORECASE)
end
end

private

def lookup_paths
# These paths should be orderd by priority
[
(Rails.root if defined? Rails),
Dir.pwd,
VoightKampff::ROOT
]
end

def preferred_path
lookup_paths.find do |base_path|
next unless base_path

path = File.join base_path, 'tmp', CRAWLERS_FILENAME

break path if File.exist? path
end
end
end

attr_accessor :user_agent_string

def initialize(user_agent_string)
Expand All @@ -19,42 +62,16 @@ def human?
def bot?
!human?
end
alias :replicant? :bot?
alias replicant? bot?

private

def lookup_paths
# These paths should be orderd by priority
base_paths = []
base_paths << Rails.root if defined? Rails
base_paths << VoightKampff.root

base_paths.map { |p| p.join('config', CRAWLERS_FILENAME) }
end

def preferred_path
lookup_paths.find { |path| File.exists? path }
end

def matching_crawler
if match = crawler_regexp.match(@user_agent_string)
index = match.names.first.sub(/match/, '').to_i
crawlers[index]
end
end

def crawler_regexp
@@crawler_regexp ||= begin
# NOTE: This is admittedly a bit convoluted but the performance gains make it worthwhile
index = -1
crawler_patterns = crawlers.map{|c| index += 1; "(?<match#{index}>#{c["pattern"]})" }.join("|")
crawler_patterns = "(#{crawler_patterns})"
Regexp.new(crawler_patterns, Regexp::IGNORECASE)
end
end
match = self.class.crawler_regexp.match(@user_agent_string)
return unless match

def crawlers
@@crawlers ||= JSON.load(File.open(preferred_path, 'r'))
index = match.names.first.sub(/match/, '').to_i
self.class.crawlers[index]
end
end
end
5 changes: 4 additions & 1 deletion spec/controllers/replicants_controller_spec.rb
Original file line number Diff line number Diff line change
@@ -1,9 +1,12 @@
# frozen_string_literal: true

require 'spec_helper'

describe ReplicantsController, type: :controller do
let(:user_agent_string) { '' }
before do
expect_any_instance_of(ActionController::TestRequest).to receive(:user_agent).and_return user_agent_string
expect_any_instance_of(ActionController::TestRequest)
.to receive(:user_agent).and_return user_agent_string
get :index
end

Expand Down
13 changes: 8 additions & 5 deletions spec/internal/app/controllers/replicants_controller.rb
Original file line number Diff line number Diff line change
@@ -1,12 +1,15 @@
# frozen_string_literal: true

class ReplicantsController < ActionController::Base
def index
header = "Replicants:\n===========\n"

status, content = if request.bot?
[200, '- Rick Deckard']
else
[403, 'No replicants here']
end
status, content =
if request.bot?
[200, '- Rick Deckard']
else
[403, 'No replicants here']
end

render plain: header + content, status: status
end
Expand Down
2 changes: 2 additions & 0 deletions spec/internal/config/routes.rb
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
# frozen_string_literal: true

Rails.application.routes.draw do
resources :replicants, only: :index
root to: 'replicants#index'
Expand Down
8 changes: 6 additions & 2 deletions spec/lib/voight_kampff/rack_request_spec.rb
Original file line number Diff line number Diff line change
@@ -1,10 +1,14 @@
# frozen_string_literal: true

require 'spec_helper'

describe Rack::Request do
let(:user_agent_string) { }
let(:env) { {'HTTP_USER_AGENT' => user_agent_string} }
let(:user_agent_string) {}
let(:env) { { 'HTTP_USER_AGENT' => user_agent_string } }
subject { Rack::Request.new(env) }

require_relative '../../../lib/voight_kampff/rack_request'

it { expect(subject).to respond_to :human? }
it { expect(subject).to respond_to :bot? }
it { expect(subject).to respond_to :replicant? }
Expand Down
Loading

0 comments on commit 48236e0

Please sign in to comment.