Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove comments of config.string. #16625

Closed

Conversation

mashhurs
Copy link
Contributor

@mashhurs mashhurs commented Nov 3, 2024

Release notes

Environment ${VAR} in config.string comments will no longer be evaluated.

What does this PR do?

if config.string of pipeline config has ENV ${VAR} in its comments and ${VAR} is not set, pipeline may not start which is meaningless. This change introduces a logic to remove comments of config.string.
Note that other keys, such as pipeline.id are YAML definitions where their comments will be wiped out by YAML:safe_load.

Why is it important/What is the impact to the user?

Users who have ${VAR} comments in config.string currently need to set ${VAR} env variable. They are no longer required after this change.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files (and/or docker env variables)
  • I have added tests that prove my fix is effective or that my feature works

Author's Checklist

  • [ ]

How to test this PR locally

  • pull current PR
  • add following config to the pipeline.yml
     - pipeline.id: "test #id1" # test
       config.string: |
         input { generator { count => 10 } } #id${PIPELINE_WORKERS}"
         output {
           # pipeline.workers: "${PIPELINE_WORKERS}"
           stdout { codec => dots } # output
         } # pipeline.workers: "${PIPELINE_WORKERS}"
         # pipeline.workers: "${PIPELINE_WORKERS}"
    
  • without this change LS would complain to set the PIPELINE_WORKERS
  • after this change LS doesn't ask to set the PIPELINE_WORKERS

Related issues

Use cases

Screenshots

Logs

… if VAR is not set which is meaningless. This commit introduces a logic to remove comments of config.string. Note that other keys, such as pipeline.id are YAML definitions where their comments will be wiped out by .
@mashhurs mashhurs added the bug label Nov 3, 2024
else
line.match?(/(?<!['"]) #/) ? line.gsub(/ (?<!['"])#.*/, '') : line
end
end.join("\n")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review note: we keep new lines even though they are empty because when something goes wrong in config, error displays the line with specific error position.

@@ -23,14 +23,17 @@ class MultiLocal < Local
include LogStash::Util::SubstitutionVariables
include LogStash::Util::Loggable

REMOVE_COMMENTS_CONFIG_KEYS = %w(config.string)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review note: currently only config.string may contain the comments. YAML::safe_load removes the comments from other keys such as pipeline.id.

@mashhurs mashhurs marked this pull request as ready for review November 3, 2024 17:24
if line.strip.start_with?("#")
""
else
line.match?(/(?<!['"]) #/) ? line.gsub(/ (?<!['"])#.*/, '') : line
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still learning what is acceptable under these pipeline definitions.... Would it be possible to have string literals with a # char inside? Perhaps something like

filter {
  mutate { add_field => { "test_field" => "This is real data # not a comment" } }
}

In that case, would we mangle this using this regex?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would:

jruby-9.4.9.0 :010 > line = 'mutate { add_field => { "test_field" => "This is real data # not a comment" } }'
 => "mutate { add_field => { \"test_field\" => \"This is real data # not a comment\" } }" 
jruby-9.4.9.0 :011 > line.match?(/(?<!['"]) #/) ? line.gsub(/ (?<!['"])#.*/, '') : line
 => "mutate { add_field => { \"test_field\" => \"This is real data" 

same with ruby code that may have a #:

jruby-9.4.9.0 :008 > line = 'filter { ruby { code => "puts # 1" } }  # test'
 => "filter { ruby { code => \"puts # 1\" } }  # test" 
jruby-9.4.9.0 :009 > line.match?(/(?<!['"]) #/) ? line.gsub(/ (?<!['"])#.*/, '') : line
 => "filter { ruby { code => \"puts"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! This is more than I initially thought. I have revised a logic but honestly I would avoid adding such complex art which touches user's config. Maybe we document that commenting env ${VAR} is not allowed for config.string?

Example pipelines.yml:

- pipeline.id: "test #id1" # test
  config.string: |
    input { 
      generator { count => 10 } #id${PIPELINE_WORKERS}"
      beats {
        port => 5555 # intentional comment contains "${BEATS_DEV_PORT}" variable, shouldn't break functionalities
        host => "127.0.0.1"
      }
      # another intentional comment contains "${BEATS_PROD_HOST}" variable, shouldn't break functionalities
    }
    filter {
      mutate { add_field => { "oopsy" => "This is real data # not a ' comment, oopsy" } }
      mutate {
        add_field => {
          'doopsy' => "This is real data # not a ' comment, doopsy" # pipeline.workers: "${PIPELINE_WORKERS}"
        }
        add_tag => [ "doopsy_tag", "${ENV_TAG}" ] # "${ENV_TAG_ANOTHER}
      }
      mutate {
        add_field => {
          "hoopsy" => 
            'This is real data # not a " comment, hoopsy'
        }
      }
    mutate { add_field => { "loopsy" => "This is real data # not a ' escape comment, \\" loopsy" } }
      mutate { add_field => { "woopsy" => 'This is real data # not a " comment, ${ENV_TAG_ANOTHER} whoopsy' } } # "${ENV_TAG_ANOTHER} } }
    }
    output {
      stdout { } # output
    } # pipeline.workers: "${PIPELINE_WORKERS}"
    # pipeline.workers: "${PIPELINE_WORKERS}"

  #  pipeline.workers: "${PIPELINE_WORKERS}"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, it's a very high hanging fruit for very little return in terms of user experience. +1 on documenting this limitation and closing the PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, i was looking back at the discussion around this issue trying to get the context. It seemed to me at first glance that the behavior of doing interpolation here is correct. Specifically the flow seemed to be that variables would be interpolated before generating the config. At that point there would be no concept of what is a comment vs config in the template input.

I do see how this may be a frustrating edge case, but it is one that is probably not worth introducing magic around.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback! I have added doc changes with the PR-16689 and that closes both current PR and upstream issue.

@jsvd jsvd self-requested a review November 6, 2024 17:38
@mashhurs mashhurs requested a review from donoghuc November 16, 2024 16:55
Copy link

Quality Gate passed Quality Gate passed

Issues
0 New issues
0 Fixed issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarQube

@elasticmachine
Copy link
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

History

@mashhurs
Copy link
Contributor Author

Closing as this PR covers doc changes.

@mashhurs mashhurs closed this Dec 16, 2024
@mashhurs mashhurs deleted the wipe-out-config.string-comments branch December 16, 2024 19:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Commented out variable substitutions may still be evaluated in pipelines.yml
4 participants