Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

detect/analyzer: add more details for the tcp.mss keyword - v2 #9681

Closed
wants to merge 1 commit into from

Conversation

0xEniola
Copy link
Contributor

@0xEniola 0xEniola commented Oct 23, 2023

Link to redmine ticket: https://redmine.openinfosecfoundation.org/issues/6355

Previous PR: #9673

Describe changes:

  • Included the options to be logged that were not included in the last PR; mode and arg2.
  • Added TcpmssModeToString function into detect-tcpmss.c file to convert the mode bit to a human readable value.
  • Included TcpmssModeToString to detect-tcpmss.h file.

Suggestion:

  • In the outputs, whenever the rule does not provide for value2, I personally feel it should then not be logged, just like the mode value is not logged when not provided in the rule. I traced it to rust/src/detect/uint.rs; line 67 where arg2 can be equated to NULL.

Output for id-1:

{
    "raw":"alert tcp any any -> any any (msg:\"Testing mss\"; tcp.mss:50; sid:1;)",
    "id":1,
    "gid":1,
    "rev":0,
    "msg":"Testing mss",
    "app_proto":"unknown",
    "requirements": [],
    "type":"pkt",
    "flags": [
        "src_any",
        "dst_any",
        "sp_any",
        "dp_any",
        "need_packet",
        "toserver",
        "toclient"
    ],
    "pkt_engines": [
        {
            "name":"packet",
            "is_mpm":false
        }
    ],
    "frame_engines": [],
    "lists": {
        "packet": {
            "matches": [
                {
                    "name":"tcp.mss",
                    "tcp_mss": {
                        "value1":50,
                        "value2":0
                    }
                }
            ]
        }
    }
}

Output for id-4:

{
    "raw":"alert tcp any any -> any any (msg:\"Testing mss\"; tcp.mss:123-456; sid:4;)",
    "id":4,
    "gid":1,
    "rev":0,
    "msg":"Testing mss",
    "app_proto":"unknown",
    "requirements":[],
    "type":"pkt",
    "flags":[
        "src_any",
        "dst_any",
        "sp_any",
        "dp_any",
        "need_packet",
        "toserver",
        "toclient"
    ],
    "pkt_engines":[
        {
            "name":"packet",
            "is_mpm":false
        }
    ],
    "frame_engines":[],
    "lists":{
        "packet":{
            "matches":[
                {
                    "name":"tcp.mss",
                    "tcp_mss":{
                        "mode":"range",
                        "value1":123,
                        "value2":456
                    }
                }
            ]
        }
    }
}

SV_BRANCH=OISF/suricata-verify#1436

Add more details to the tcp.mss keyword engine analysis output
Issue: OISF#6355
@jufajardini jufajardini added the outreachy Contributions made by Outreachy applicants label Oct 23, 2023
@github-actions
Copy link

NOTE: This PR may contain new authors:

Daniel Olatunji <[email protected]>

Copy link
Contributor

@jufajardini jufajardini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a few inline comments to be addressed :)

Comment on lines +872 to +873
jb_set_uint(js, "value1", cd->arg1);
jb_set_uint(js, "value2", cd->arg2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gathering that if the rule has a range, that's when we'll see something in cd->arg2. Could you make it so that if we have a range, instead of value1 and value2, we have min and max, and if we only have the other cases, where we only have arg1, we use value, instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gathering that if the rule has a range, that's when we'll see something in cd->arg2. Could you make it so that if we have a range, instead of value1 and value2, we have min and max, and if we only have the other cases, where we only have arg1, we use value, instead?

That means, I'll have to write a switch-case part, right?

Also, I noticed when the rule has range, arg2 is still logged.
I mentioned it in the PR details.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the rule has a range, it makes sense that arg2 is logged, as the range is between a min and a max. Here, I'm just suggesting we change the names. When it's for equal, from what I saw in your test, value2 will be 0, so I think that in case we can not log that, and just focus on what is indeed present in the rule.

You can use a switch, or if else statements, whatever makes more sense for the data you'll have to handle... I'm thinking that here, since we likely only have arg1 and arg2 with the range mode, and if could suffice. But, as you've already been doing, tinker and double-check, as I can make mistakes and misinterpret things, too ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the rule has a range, it makes sense that arg2 is logged, as the range is between a min and a max. Here, I'm just suggesting we change the names. When it's for equal, from what I saw in your test, value2 will be 0, so I think that in case we can not log that, and just focus on what is indeed present in the rule.

You can use a switch, or if else statements, whatever makes more sense for the data you'll have to handle... I'm thinking that here, since we likely only have arg1 and arg2 with the range mode, and if could suffice. But, as you've already been doing, tinker and double-check, as I can make mistakes and misinterpret things, too ;)

Sorry! I meant, when rule isn't range, arg2 is still logged.

I traced the logic down to rust/src/detect/uint.rs, line 67.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry! I meant, when rule isn't range, arg2 is still logged.

It isn't logged, it's to complete the struct otherwise it's unacceptable in rust. It does not seem to serve any purpose in this case afaict.

I traced the logic down to rust/src/detect/uint.rs, line 67.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry! I meant, when rule isn't range, arg2 is still logged.

It isn't logged, it's to complete the struct otherwise it's unacceptable in rust. It does not seem to serve any purpose in this case afaict.

I traced the logic down to rust/src/detect/uint.rs, line 67.

Whenever I test for a rule that isn't in range, arg2 which don't have any value is logged in matches.
So I thought arg2 could be given the NULL value so it's not logged.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arg2 is template type in its definition which evaluates to u16 in this case. u16 cannot be NULL 😉

arg2 which don't have any value is logged in matches

only because you add it. You can check for the operand type and not log arg2 where it does not make sense (which seems everywhere but the range).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Means I'd have to write a switch-case for it.

OK.
Thank you 😊.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not an if statement as its an exclusive case? Do more operands use arg2?

Comment on lines +71 to +80
{
switch (mode) {
case 1:
return "less than";
case 3:
return "greater than";
case 5:
return "range";
default:
return NULL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we've discussed in the SV PR, there are more possible values for mode. Check
https://github.com/OISF/suricata/blob/master/src/detect-engine-uint.h#L30-L38 for all possible values. It seems that there is no check that would indicate some of those are not possible, so I'm assuming Suri should be able to parse and process any for tcp.mss

Aside from that, do notice that DetectU16Data.mode is of type DetectUintMode, so we should use a parameter of that type, this is more secure, as it restricts the possible values for mode.

This also makes the code itself more human readable, and avoids magic numbers :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking that maybe default would better be "" or equal instead of NULL...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we've discussed in the SV PR, there are more possible values for mode. Check
https://github.com/OISF/suricata/blob/master/src/detect-engine-uint.h#L30-L38 for all possible values. It seems that there is no check that would indicate some of those are not possible, so I'm assuming Suri should be able to parse and process any for tcp.mss

Aside from that, do notice that DetectU16Data.mode is of type DetectUintMode, so we should use a parameter of that type, this is more secure, as it restricts the possible values for mode.

This also makes the code itself more human readable, and avoids magic numbers :)

I explained previously, in the userguide instructions for tcp.mss, the formats for rules only includes Less than, Greater than, range and equal.

Please can you elaborate on your second point? I don't quite understand; help with an instance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking that maybe default would better be "" or equal instead of NULL...

I thought of this too.

I'll do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we've discussed in the SV PR, there are more possible values for mode. Check
https://github.com/OISF/suricata/blob/master/src/detect-engine-uint.h#L30-L38 for all possible values. It seems that there is no check that would indicate some of those are not possible, so I'm assuming Suri should be able to parse and process any for tcp.mss
Aside from that, do notice that DetectU16Data.mode is of type DetectUintMode, so we should use a parameter of that type, this is more secure, as it restricts the possible values for mode.
This also makes the code itself more human readable, and avoids magic numbers :)

I explained previously, in the userguide instructions for tcp.mss, the formats for rules only includes Less than, Greater than, range and equal.

I see. Sorry, I missed that point among the threads.
Even if the userguide doesn't specify it, I think it's best if we handle those values somehow, because the data type that stores them could receive those, which means Suricata can parse them if a rule is written with them, and from the code I couldn't see a place where we reject such values, so I think it's best if we treat/handle them.

Please can you elaborate on your second point? I don't quite understand; help with an instance.

The second point is about the datatype for the struct field mode that we are matching on here. While in DetectIpOpts.code the field is a uint16_t, for DetectUintData_u16.mode, the type is DetectUintMode so it is best if the function parameter is that, instead of a uint8_t, as the DetectUintMode will have a very specific set of possible values, making the whole function easier to treat - and also allowing you to use something like

        case DetectUintModeLt:
            return "less than";

instead of case 1 for instance.

So, the parameter here should be of type DetectUintMode instead of uint8_t. And then you can check the possible types on that file that I shared before. Does this help?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we've discussed in the SV PR, there are more possible values for mode. Check
https://github.com/OISF/suricata/blob/master/src/detect-engine-uint.h#L30-L38 for all possible values. It seems that there is no check that would indicate some of those are not possible, so I'm assuming Suri should be able to parse and process any for tcp.mss
Aside from that, do notice that DetectU16Data.mode is of type DetectUintMode, so we should use a parameter of that type, this is more secure, as it restricts the possible values for mode.
This also makes the code itself more human readable, and avoids magic numbers :)

I explained previously, in the userguide instructions for tcp.mss, the formats for rules only includes Less than, Greater than, range and equal.

I see. Sorry, I missed that point among the threads.
Even if the userguide doesn't specify it, I think it's best if we handle those values somehow, because the data type that stores them could receive those, which means Suricata can parse them if a rule is written with them, and from the code I couldn't see a place where we reject such values, so I think it's best if we treat/handle them.

Please can you elaborate on your second point? I don't quite understand; help with an instance.

The second point is about the datatype for the struct field mode that we are matching on here. While in DetectIpOpts.code the field is a uint16_t, for DetectUintData_u16.mode, the type is DetectUintMode so it is best if the function parameter is that, instead of a uint8_t, as the DetectUintMode will have a very specific set of possible values, making the whole function easier to treat - and also allowing you to use something like

        case DetectUintModeLt:
            return "less than";

instead of case 1 for instance.

So, the parameter here should be of type DetectUintMode instead of uint8_t. And then you can check the possible types on that file that I shared before. Does this help?

OK! I understand you.

Thank you for the feedback.

*
* \param mode uint8_t DetectU16Data tcp.mss mode value
*/
const char *TcpmssModeToString(uint8_t mode)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: name standards -> DetectTcpmssModeToString

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: name standards -> DetectTcpmssModeToString

Oh! I didn't know I had to add Detect. I used the format as seen in detect-ipopts, that's why.

I'll rectify that.

Copy link
Contributor

@jufajardini jufajardini Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, reviewer must have let that slip XD
But if you look around the file, most, if not all functions, start with DetectTcpmss, so it's best if we follow that, as this allows for one to have a good hint on what module this function is from, just from looking at its name :)

Copy link
Member

@inashivb inashivb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very torn b/w adding operand (mode) to the output or not.

  • It seems to be complicating the output.
  • It does not seem to be the practice w other keywords.

But, I don't have a strong opinion..

Comment on lines +872 to +873
jb_set_uint(js, "value1", cd->arg1);
jb_set_uint(js, "value2", cd->arg2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry! I meant, when rule isn't range, arg2 is still logged.

It isn't logged, it's to complete the struct otherwise it's unacceptable in rust. It does not seem to serve any purpose in this case afaict.

I traced the logic down to rust/src/detect/uint.rs, line 67.

@0xEniola
Copy link
Contributor Author

I'm very torn b/w adding operand (mode) to the output or not.

  • It seems to be complicating the output.
  • It does not seem to be the practice w other keywords.

But, I don't have a strong opinion..

I tried looking for other keywords that made use of said format, and how it was handled, but couldn't find any.

So I really don't know.

@jufajardini
Copy link
Contributor

I'm very torn b/w adding operand (mode) to the output or not.

  • It seems to be complicating the output.
  • It does not seem to be the practice w other keywords.

But, I don't have a strong opinion..

I tried looking for other keywords that made use of said format, and how it was handled, but couldn't find any.

So I really don't know.

Yeah, I don't think we have reached these rules in this task of adding details yet...

@inashivb
Copy link
Member

I'm very torn b/w adding operand (mode) to the output or not.

  • It seems to be complicating the output.
  • It does not seem to be the practice w other keywords.

But, I don't have a strong opinion..

I tried looking for other keywords that made use of said format, and how it was handled, but couldn't find any.
So I really don't know.

Yeah, I don't think we have reached these rules in this task of adding details yet...

Actually, byte_test also supports operands but no logging is done afaics

@jufajardini
Copy link
Contributor

jufajardini commented Oct 24, 2023

I'm very torn b/w adding operand (mode) to the output or not.

  • It seems to be complicating the output.
  • It does not seem to be the practice w other keywords.

But, I don't have a strong opinion..

I tried looking for other keywords that made use of said format, and how it was handled, but couldn't find any.
So I really don't know.

Yeah, I don't think we have reached these rules in this task of adding details yet...

Actually, byte_test also supports operands but no logging is done afaics

I see. To be honest, it makes sense to me that we do log it.

@inashivb
Copy link
Member

I see. To be honest, it makes sense to me that we do log it.

I looked at the issue again and it seems that we're collecting and are interested in postmatch info here, if yes, I think we can skip complicating the output with operands, instead just log the value that actually matched in a rule.

@0xEniola
Copy link
Contributor Author

I see. To be honest, it makes sense to me that we do log it.

I looked at the issue again and it seems that we're collecting and are interested in postmatch info here, if yes, I think we can skip complicating the output with operands, instead just log the value that actually matched in a rule.

I looked at the issue again, and yes, it's interested in postmatch info. But it's a bit confusing; I read the engine-analysis details again and what engine analysis does is to provide a report on the set signature, and not to specifically get a match and I found that these reports could be in rules_analysis.txt, patterns.json.

I'm still trying to understand it better before moving on though.

And I think it'll be great if Victor can help shed more light on this issue.

@jufajardini
Copy link
Contributor

I see. To be honest, it makes sense to me that we do log it.

I looked at the issue again and it seems that we're collecting and are interested in postmatch info here, if yes, I think we can skip complicating the output with operands, instead just log the value that actually matched in a rule.

I looked at the issue again, and yes, it's interested in postmatch info. But it's a bit confusing; I read the engine-analysis details again and what engine analysis does is to provide a report on the set signature, and not to specifically get a match and I found that these reports could be in rules_analysis.txt, patterns.json.

I'm still trying to understand it better before moving on though.

And I think it'll be great if Victor can help shed more light on this issue.

I agree that it would, but he's on a short vacation, so this would have to wait a bit more.

@0xEniola
Copy link
Contributor Author

I see. To be honest, it makes sense to me that we do log it.

I looked at the issue again and it seems that we're collecting and are interested in postmatch info here, if yes, I think we can skip complicating the output with operands, instead just log the value that actually matched in a rule.

I looked at the issue again, and yes, it's interested in postmatch info. But it's a bit confusing; I read the engine-analysis details again and what engine analysis does is to provide a report on the set signature, and not to specifically get a match and I found that these reports could be in rules_analysis.txt, patterns.json.

I'm still trying to understand it better before moving on though.

And I think it'll be great if Victor can help shed more light on this issue.

I agree that it would, but he's on a short vacation, so this would have to wait a bit more.

OK then.

I'll move to another issue then.

@jufajardini jufajardini added the decision-required Waiting on deliberation from the team label Nov 9, 2023
Copy link
Contributor

@jufajardini jufajardini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussing more on this with Victor and Shivani, we've settled on the output style that makes sense for us to see here.

Please ensure that we cover all possible operands from DetectUintMode, and that we handle the range case properly, too.

I'll ping @victorjulien here just in case there's something that I'm missing, so we get that feed back sooner.

Sharing a few examples:

  • tcp.mss:1234:
"ack": {
    "operand": "equal",
    "value": 1234 
}
  • tcp.mss:>123:
"ack": {
    "operand": "greater than",
    "value": 123
}
  • tcp.mss:<12345:
"ack": {
    "operand": "less than",
    "value": 12345
}
  • tcp.mss:123-1234:
"ack": {
    "operand": "range",
    "min": 123,
    "max": 1234
}
  • tcp.mss:<=1234:
"ack": {
    "operand": "less than or equal to",
    "value": 1234
}

@0xEniola
Copy link
Contributor Author

After discussing more on this with Victor and Shivani, we've settled on the output style that makes sense for us to see here.

Please ensure that we cover all possible operands from DetectUintMode, and that we handle the range case properly, too.

I'll ping @victorjulien here just in case there's something that I'm missing, so we get that feed back sooner.

Sharing a few examples:

  • tcp.mss:1234:
"ack": {
    "operand": "equal",
    "value": 1234 
}
  • tcp.mss:>123:
"ack": {
    "operand": "greater than",
    "value": 123
}
  • tcp.mss:<12345:
"ack": {
    "operand": "less than",
    "value": 12345
}
  • tcp.mss:123-1234:
"ack": {
    "operand": "range",
    "min": 123,
    "max": 1234
}
  • tcp.mss:<=1234:
"ack": {
    "operand": "less than or equal to",
    "value": 1234
}

Great Great!

I have been waiting while working on others.
I'll start working on it right away while waiting on any addition from @victorjulien .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
outreachy Contributions made by Outreachy applicants
Development

Successfully merging this pull request may close these issues.

3 participants