Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add URScript language #7144

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open

Conversation

EbbeFuglsang
Copy link

@EbbeFuglsang EbbeFuglsang commented Dec 2, 2024

URScript is the custom programming language, developed by Universal Robots, that controls the robot arm. Any interactions with PolyScope get converted into URScript commands, and are subsequently sent to the robot to be executed. This high level language is easy to learn, and allows you to program the robot without the PolyScope graphical interface. The URScript Manual contains an overview of the language structure, information on data types, and a complete reference of the standard functions.

Description

Checklist:

@EbbeFuglsang EbbeFuglsang requested a review from a team as a code owner December 2, 2024 12:42
EbbeFuglsang

This comment was marked as off-topic.

Copy link
Member

@lildude lildude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to the inline comments, you're also going to need to add this to generic.yml and come up with a 100% precise matching regex and add it to the heuristics. This is because .script is incredibly generic so the only way we'll be able to correctly identify these files is via a heuristc.

Popularity is no where near ready for merging so you might want to consider holding off on the suggested changes, close this PR and wait for popularity to meet our requirements.

lib/linguist/languages.yml Outdated Show resolved Hide resolved
vendor/README.md Outdated Show resolved Hide resolved
@lildude lildude changed the title URScript language added Add URScript language Dec 2, 2024
@EbbeFuglsang
Copy link
Author

Regarding popularity this search:
urscript OR UR5 AND ROS OR URCap gives 1.6k repos. They might not all have the URScript in seperate file. But with this search:
https://github.com/search?type=code&q=%28urscript+AND+path%3A*.script%29+OR+%28UR5+AND+ROS+AND+path%3A*.script%29+OR+%28URCap+AND+path%3A*.script%29+ gives 542 different .script files that holds URScript.
What is the threshold?

Regarding the heuristc. That makes a lot of sense. I will look into that. Thank you for you feedback so far🙏

@lildude
Copy link
Member

lildude commented Dec 6, 2024

What is the threshold?

linguist/CONTRIBUTING.md

Lines 76 to 80 in 5fad8d5

## Adding an extension to a language
We try only to add new extensions once they have some usage on GitHub.
In most cases we prefer that each new file extension be in use in at least 200 unique `:user/:repo` repositories before supporting them in Linguist
(but see [#5756][] for a temporary change in the criteria).

@EbbeFuglsang
Copy link
Author

I just added some heuristics I think that should reduce the risk of false positives.

lib/linguist/heuristics.yml Outdated Show resolved Hide resolved
@lildude
Copy link
Member

lildude commented Dec 9, 2024

Please merge main into your branch. Your branch is still showing signs of #7145 which has been resolved now.

lib/linguist/heuristics.yml Outdated Show resolved Hide resolved
Copy link
Member

@lildude lildude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inline comments.

lib/linguist/heuristics.yml Outdated Show resolved Hide resolved
- extensions: ['.script']
rules:
- language: URScript
pattern: '^\s*def\s*[a-zA-Z_][a-zA-Z0-9_]*\s*\([^)]*\)\s*:[\s\S]*\send($|\s)'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks much better, thanks. I think there's still room for improvement:

Suggested change
pattern: '^\s*def\s*[a-zA-Z_][a-zA-Z0-9_]*\s*\([^)]*\)\s*:[\s\S]*\send($|\s)'
pattern: '^\s*def\s+[a-zA-Z_]\w*\s*\([^)]*\)\s*:[\s\S]*\send\b'
  • ^\s*def\s* replaced with ^\s*def\s+ as it appears at least one space is required
  • [a-zA-Z0-9_]* is the same as \w*
  • \s*end($|\s) is simplified to \s*end\b as \b matches a word boundary, which is equivalent to the end of the word end.

@Alhadis may have some further suggestions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. \send\b will match a (presumably illegal) construction like end., end-, end(, and many, many others.

  2. For performance's sake, [\s\S]* should match conservatively ([\s\S]*?) so that anything past the first \send\b isn't searched and included as part of the match:

    #!/usr/bin/env ruby
    greedy  = /^\s*def\s*[a-zA-Z_][a-zA-Z0-9_]*\s*\([^)]*\)\s*:[\s\S]*\send($|\s)/
    thrifty = /^\s*def\s*[a-zA-Z_][a-zA-Z0-9_]*\s*\([^)]*\)\s*:[\s\S]*?\send($|\s)/
    input   = File.read("samples/URScript/admittance_control.script")
    
    greedy  =~ input; $&.length # => 10019
    thrifty =~ input; $&.length # => 126

@EbbeFuglsang EbbeFuglsang requested a review from lildude December 13, 2024 07:10
@@ -712,7 +712,7 @@ disambiguations:
- extensions: ['.script']
rules:
- language: URScript
pattern: '^\s*def\s+[a-zA-Z_]\w*\s*\([^)]*\)\s*:[\s\S]*?\send($|\s)'
pattern: '(^\s*|\s+)def\s+[a-zA-Z_]\w*\s*\([^)]*\)\s*:[\s\S]*?\send($|\s)'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessary (\s* is zero or more which includes \s+ which is one or more) and introduces a ReDoS.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I reverted the commit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants