Is there any way in YARA-L to check whether one UDM field contains a substring of another UDM field? The following example shows a use case for this and the question I am trying to ask of the data:
As I read this, it sounds like you are trying to extract a value from one field in one event, extract a value from another field in another event (maybe the same field, maybe a different one), and make both part of your join to determine whether the rule fires. Below is an adaptation of what I think you are attempting, using some data I had handy.
In my example, we have Azure and O365 events that I want to correlate because they are very similar. In one event the action I am isolating is in the metadata.description field, and in the other it is in product_event_type (I will be looking into that later!). Notice that I left the actual values in the commented-out line for each event; they are admittedly similar, but I wanted to show you what they look like.
The re.capture line for each event has the capture. I am just looking for "service principal" with a leading space in both to illustrate the point. I suspect the regex doing the capture might be where things are getting dicey.
In the match section, I am using that substring and the vendor name. I don't have a great user ID or IP in both events in this example, so vendor worked in a pinch for a join, but hopefully you get the idea.
Jumping down to the condition section, I added the not arrays.contains check because I was getting some null values in my match for some reason, so that was mainly to strike any extra noise.
rule variable_testing {
  meta:
  events:
    $event1.metadata.event_type = "STATUS_UPDATE"
    $event1.metadata.product_name = "Azure AD Directory Audit"
    //$event1.metadata.description = "Update service principal"
    $event1.metadata.description != ""
    re.capture($event1.metadata.description, `\s(service.principal)`) = $substring
    $event1.metadata.vendor_name = $vendor
    $event2.metadata.event_type = "USER_UNCATEGORIZED"
    $event2.metadata.product_name = "Office 365"
    //$event2.metadata.product_event_type = "Update service principal."
    $event2.metadata.product_event_type != ""
    re.capture($event2.metadata.product_event_type, `\s(service.principal)`) = $substring
    $event2.metadata.vendor_name = $vendor
  match:
    $substring, $vendor over 1m
  outcome:
    $sub = array_distinct($substring)
  condition:
    $event1 and $event2 and not arrays.contains($sub, "")
}
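To make the join logic above easier to reason about, here is a rough Python sketch of the same idea. It is not YARA-L and it ignores the 1m match window; the event dictionaries, the "Microsoft" vendor value, and the helper names are all made up for illustration. It assumes a failed capture comes back as an empty string, which is why the empty value has to be filtered out: two events that both fail the capture would otherwise "join" on "".

```python
import re

# Same pattern as in the rule: whitespace, then "service<any char>principal".
CAPTURE = re.compile(r"\s(service.principal)")


def capture(text):
    """Return the first capture group, or "" when the pattern does not match."""
    m = CAPTURE.search(text)
    return m.group(1) if m else ""


def correlate(events1, events2):
    """Join two event lists on (captured substring, vendor), dropping empty captures."""
    keys1 = {(capture(e["description"]), e["vendor"]) for e in events1}
    keys2 = {(capture(e["product_event_type"]), e["vendor"]) for e in events2}
    # Without the k[0] != "" filter, non-matching events would pair up on "".
    return sorted(k for k in keys1 & keys2 if k[0] != "")


# Hypothetical sample events, loosely modelled on the values in the rule comments.
azure = [
    {"description": "Update service principal", "vendor": "Microsoft"},
    {"description": "Add user", "vendor": "Microsoft"},
]
o365 = [
    {"product_event_type": "Update service principal.", "vendor": "Microsoft"},
    {"product_event_type": "UserLoggedIn", "vendor": "Microsoft"},
]

print(correlate(azure, o365))
```

Only the ("service principal", "Microsoft") pair survives; the "Add user" and "UserLoggedIn" events both capture "" and are dropped by the filter, which is the same job the not arrays.contains check does in the condition.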
re.capture($selection1.target.process.file.full_path, /.*\\(.*)/)
Thank you for the response!
This is a great example for when the value we are looking for inside the UDM fields is known and we will definitely use it in the future!
In my case, the values matched by the regex may not be the same every time. What I am looking for is something like a contains function that works against a UDM field. In pseudocode, I could do something like this:
Looks like I was beaten to the punch, but since I already prepped this I'll throw it in here with the hope that this makes things even more clear.
The logic in the original post appears like it should work. I adapted it to some sample data we have in our demo instance. I kept this to a single event, but the logic remains identical.
The rule:
rule variable_testing {
  meta:
    author = "amalone and now eugene"
    description = "Test using variable in various positions of the regex function"
    severity = "Low"
  events:
    $event.metadata.event_type = "PROCESS_LAUNCH"
    $event.metadata.product_name = "Microsoft-Windows-Sysmon"
    $hostname = strings.to_lower($event.principal.hostname) // going to be "danieljones-pc"
    $name_substring = strings.to_lower(re.capture($event.principal.hostname, "^([^-]*)")) // should pick up "danieljones"
    $fullpath = $event.src.file.full_path // should be "C:\Users\danieljones\Desktop\"
    re.regex($fullpath, $name_substring) // checks to see if "danieljones" exists in the fullpath
  outcome:
    $Unaltered_Hostname = $hostname
    $Extracted_Username = $name_substring
    $Full_Path = $fullpath
  condition:
    $event
}
And here are some of the resulting detections along with the pertinent fields displayed (seems I can't upload images so here is a table):
timestamp | Detection ID | event | Full_Path (Outcome) | Extracted_Username (Outcome) | Unaltered_Hostname (Outcome) |
2022-12-22T00:10:25Z | de_0c0f2e1a-e9c4-f20a-847c-3876f95239a5 | executable.exe launched by sandbox-control.exe | C:\Users\danieljones\Desktop\ | danieljones | danieljones-pc |
2022-12-22T00:10:35Z | de_c5e1bb87-476a-68e5-1c0f-5254d02372b5 | WindPlugin.exe launched by explorer.exe | C:\Users\danieljones\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\ | danieljones | danieljones-pc |
2022-12-22T00:10:55Z | de_37638139-4917-93e9-d3a2-9c1e6991e340 | program.exe launched by explorer.exe | C:\Users\danieljones\Desktop\ | danieljones | danieljones-pc |
2022-12-22T00:24:06Z | de_83b814ab-491f-6f38-00ed-33fe65490fae | executable.exe launched by sandbox-control.exe | C:\Users\danieljones\Desktop\ | danieljones | danieljones-pc |
2022-12-22T00:24:16Z | de_d8c1187c-ede6-1b49-5da1-7759ccce5387 | WindPlugin.exe launched by explorer.exe | C:\Users\danieljones\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\ | danieljones | danieljones-pc |
2022-12-22T00:24:36Z | de_fd46a82b-888c-881c-1e35-5273aa873336 | program.exe launched by explorer.exe | C:\Users\danieljones\Desktop\ | danieljones | danieljones-pc |
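For anyone who wants to poke at the capture-then-match logic outside the SIEM, here is a small Python sketch of what the rule above does. It is not YARA-L; the function names and the "web.01-pc" hostname are made up for illustration. It also shows two edge cases worth keeping in mind when a captured value is reused as a regex: an empty capture matches every path, and characters such as "." in the captured value are treated as regex syntax unless escaped. I am only demonstrating the behaviour in Python here, not claiming YARA-L has an equivalent escape helper.

```python
import re


def name_from_hostname(hostname):
    """Take everything before the first "-" and lowercase it, like the rule does."""
    m = re.search(r"^([^-]*)", hostname)
    return m.group(1).lower() if m else ""


def path_mentions_name(full_path, hostname, literal=True):
    """Check whether the name extracted from the hostname appears in the path.

    literal=True escapes the name so it is matched as plain text;
    literal=False uses it as a regex as-is, which is closer to the rule.
    """
    name = name_from_hostname(hostname)
    if not name:
        # An empty pattern would match any path, so treat it as "no match".
        return False
    pattern = re.escape(name) if literal else name
    return re.search(pattern, full_path) is not None


# Values from the detections table above.
print(path_mentions_name("C:\\Users\\danieljones\\Desktop\\", "danieljones-pc"))

# Hypothetical hostname containing a regex metacharacter.
print(path_mentions_name("C:\\Users\\webX01\\", "web.01-pc", literal=False))
print(path_mentions_name("C:\\Users\\webX01\\", "web.01-pc", literal=True))
```

With the sample data the first call finds "danieljones" in the path. The last two calls differ only in escaping: used as a raw regex, "web.01" also matches "webX01", while the escaped version does not.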
Thanks for the response!
This solves the problem, and it works in our environment as well. In my environment, it seems it will accept 1 but not 2, due to not using a placeholder event variable:
1.
2.