Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove comments of config.string. #16625

Closed
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 27 additions & 1 deletion logstash-core/lib/logstash/config/source/multi_local.rb
Original file line number Diff line number Diff line change
Expand Up @@ -23,14 +23,17 @@ class MultiLocal < Local
include LogStash::Util::SubstitutionVariables
include LogStash::Util::Loggable

REMOVE_COMMENTS_CONFIG_KEYS = %w(config.string)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review note: currently only config.string may contain the comments. YAML::safe_load removes the comments from other keys such as pipeline.id.


def initialize(settings)
@original_settings = settings
super(settings)
@match_warning_done = false
end

def pipeline_configs
pipelines = deep_replace(retrieve_yaml_pipelines)
yaml_config_without_comments = wipeout_comments(retrieve_yaml_pipelines)
pipelines = deep_replace(yaml_config_without_comments)
pipelines_settings = pipelines.map do |pipeline_settings|
clone = @original_settings.clone
clone.merge_pipeline_settings(pipeline_settings)
Expand Down Expand Up @@ -134,5 +137,28 @@ def do_warning?
end
!done
end

# @param [Object] `yaml_config` the input object
# @return [Object] Updated object where its key etries which match the `CONFIG_KEYS_TO_WIPE_OUT_COMMENTS` do not contain comments
def wipeout_comments(yaml_config)
case yaml_config
when Hash
yaml_config.each do |key, val|
yaml_config[key.to_s] = wipeout_comments(val) if REMOVE_COMMENTS_CONFIG_KEYS.include?(key)
end
when Array
yaml_config.map { |val| wipeout_comments(val) }
when String
yaml_config.lines.map do |line|
if line.strip.start_with?("#")
""
else
line.match?(/(?<!['"]) #/) ? line.gsub(/ (?<!['"])#.*/, '') : line
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still learning what is acceptable under these pipeline definitions.... Would it be possible to have string literals with a # char inside? Perhaps something like

filter {
  mutate { add_field => { "test_field" => "This is real data # not a comment" } }
}

In that case, would we mangle this using this regex?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would:

jruby-9.4.9.0 :010 > line = 'mutate { add_field => { "test_field" => "This is real data # not a comment" } }'
 => "mutate { add_field => { \"test_field\" => \"This is real data # not a comment\" } }" 
jruby-9.4.9.0 :011 > line.match?(/(?<!['"]) #/) ? line.gsub(/ (?<!['"])#.*/, '') : line
 => "mutate { add_field => { \"test_field\" => \"This is real data" 

same with ruby code that may have a #:

jruby-9.4.9.0 :008 > line = 'filter { ruby { code => "puts # 1" } }  # test'
 => "filter { ruby { code => \"puts # 1\" } }  # test" 
jruby-9.4.9.0 :009 > line.match?(/(?<!['"]) #/) ? line.gsub(/ (?<!['"])#.*/, '') : line
 => "filter { ruby { code => \"puts"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! This is more than I initially thought. I have revised a logic but honestly I would avoid adding such complex art which touches user's config. Maybe we document that commenting env ${VAR} is not allowed for config.string?

Example pipelines.yml:

- pipeline.id: "test #id1" # test
  config.string: |
    input { 
      generator { count => 10 } #id${PIPELINE_WORKERS}"
      beats {
        port => 5555 # intentional comment contains "${BEATS_DEV_PORT}" variable, shouldn't break functionalities
        host => "127.0.0.1"
      }
      # another intentional comment contains "${BEATS_PROD_HOST}" variable, shouldn't break functionalities
    }
    filter {
      mutate { add_field => { "oopsy" => "This is real data # not a ' comment, oopsy" } }
      mutate {
        add_field => {
          'doopsy' => "This is real data # not a ' comment, doopsy" # pipeline.workers: "${PIPELINE_WORKERS}"
        }
        add_tag => [ "doopsy_tag", "${ENV_TAG}" ] # "${ENV_TAG_ANOTHER}
      }
      mutate {
        add_field => {
          "hoopsy" => 
            'This is real data # not a " comment, hoopsy'
        }
      }
    mutate { add_field => { "loopsy" => "This is real data # not a ' escape comment, \\" loopsy" } }
      mutate { add_field => { "woopsy" => 'This is real data # not a " comment, ${ENV_TAG_ANOTHER} whoopsy' } } # "${ENV_TAG_ANOTHER} } }
    }
    output {
      stdout { } # output
    } # pipeline.workers: "${PIPELINE_WORKERS}"
    # pipeline.workers: "${PIPELINE_WORKERS}"

  #  pipeline.workers: "${PIPELINE_WORKERS}"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, it's a very high hanging fruit for very little return in terms of user experience. +1 on documenting this limitation and closing the PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, i was looking back at the discussion around this issue trying to get the context. It seemed to me at first glance that the behavior of doing interpolation here is correct. Specifically the flow seemed to be that variables would be interpolated before generating the config. At that point there would be no concept of what is a comment vs config in the template input.

I do see how this may be a frustrating edge case, but it is one that is probably not worth introducing magic around.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback! I have added doc changes with the PR-16689 and that closes both current PR and upstream issue.

end
end.join("\n")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review note: we keep new lines even though they are empty because when something goes wrong in config, error displays the line with specific error position.

else
yaml_config
end
end
end
end end end
29 changes: 18 additions & 11 deletions logstash-core/spec/logstash/config/source/multi_local_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -146,17 +146,16 @@

describe "#pipeline_configs" do

# let(:config_string) {
# "input {
# udp {
# port => 5555 # intentional comment contains \"${UDP_DEV_PORT}\" variable, shouldn't break functionalities
# host => \"127.0.0.1\"
# }
# # another intentional comment contains \"${UDP_PROD_HOST}\" variable, shouldn't break functionalities
# }
# output {}"
# }
let(:config_string) { "input {} output {}" }
let(:config_string) {
"input {
udp {
port => 5555 # intentional comment contains \"${UDP_DEV_PORT}\" variable, shouldn't break functionalities
host => \"127.0.0.1\"
}
# another intentional comment contains \"${UDP_PROD_HOST}\" variable, shouldn't break functionalities
}
output {}"
}
let(:retrieved_pipelines) do
[
{ "pipeline.id" => "main", "config.string" => config_string },
Expand All @@ -175,6 +174,14 @@
expect(configs.last).to be_a(::LogStash::Config::PipelineConfig)
end

it "wipes out the comments of `config.string`" do
expectation = "input {\n\n udp {\n\n port => 5555\n\n host => \"127.0.0.1\"\n\n }\n\n\n }\n\n output {}"
config_without_comments = subject.send(:wipeout_comments, retrieved_pipelines)
config_without_comments.each do |config_line|
expect(config_line['config.string']).to match expectation
end
end

context "using non pipeline related settings" do
let(:retrieved_pipelines) do [
{ "pipeline.id" => "main", "config.string" => "", "api.http.port" => 22222 },
Expand Down