See if a UDM field contains a substring of another UDM field

Is there any way in YARA-L to check whether one UDM field contains a substring taken from another UDM field? The following example shows a use case for this and the question I am trying to ask of the data:

 
rule variable_testing {
meta:
  author = "amalone"
  description = "Test to see if we can find a substring of one UDM field inside another udm field"  
  severity = "Low"

events:
  $file.metadata.event_type = "FILE_MODIFICATION"
  $file.principal.hostname = $hst
  // Get the name of the file involved in the file modification
  $fileName = re.capture($file.target.file.full_path, `(?:\\|\/)([^\/\\]+)$`)
  
  
  $launch.metadata.event_type = "PROCESS_LAUNCH"
  $launch.principal.hostname = $hst
   /*
   Is it possible to see if a UDM field contains a substring of another UDM field? For example, I have a file modification where I grab the name
   of the file using the re.capture function. I want to match this event with a process launch event on the same host where the target.process.command_line
   contains the name of the file from the file modification. The following two lines are syntactically incorrect but demonstrate the idea of what I'm trying to accomplish.

   //re.regex($launch.target.process.command_line, $fileName)

   //re.regex($launch.target.process.command_line, re.capture($event.target.file.full_path, `(?:\\|\/)([^\/\\]+)$`) ) nocase

    */
  

match:
  $hst over 1m

outcome:
 $name = array_distinct($fileName)

condition:
  $file and $launch

}
1 ACCEPTED SOLUTION

Looks like I was beaten to the punch, but since I already prepped this I'll throw it in here with the hope that this makes things even more clear. 

The logic in the original post looks like it should work. I adapted it to some sample data we have in our demo instance. I kept this to a single event, but the logic remains identical.

The rule:

 

rule variable_testing {
meta:
  author = "amalone and now eugene"
  description = "Test using variable in various positions of the regex function"  
  severity = "Low"

events:
  $event.metadata.event_type = "PROCESS_LAUNCH"
  $event.metadata.product_name = "Microsoft-Windows-Sysmon"
  $hostname = strings.to_lower($event.principal.hostname) // going to be "danieljones-pc"
  $name_substring = strings.to_lower(re.capture($event.principal.hostname, "^([^-]*)")) // should pick up "danieljones"
  $fullpath = $event.src.file.full_path // should be "C:\Users\danieljones\Desktop\"
  re.regex($fullpath, $name_substring) // checks to see if "danieljones" exists in the fullpath
  
outcome:
 $Unaltered_Hostname = $hostname
 $Extracted_Username = $name_substring
 $Full_Path = $fullpath

condition: 
    $event
}

 

And here are some of the resulting detections along with the pertinent fields displayed (seems I can't upload images so here is a table):

| timestamp | Detection ID | event | Full_Path (Outcome) | Extracted_Username (Outcome) | Unaltered_Hostname (Outcome) |
| --- | --- | --- | --- | --- | --- |
| 2022-12-22T00:10:25Z | de_0c0f2e1a-e9c4-f20a-847c-3876f95239a5 | executable.exe launched by sandbox-control.exe | C:\Users\danieljones\Desktop\ | danieljones | danieljones-pc |
| 2022-12-22T00:10:35Z | de_c5e1bb87-476a-68e5-1c0f-5254d02372b5 | WindPlugin.exe launched by explorer.exe | C:\Users\danieljones\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\ | danieljones | danieljones-pc |
| 2022-12-22T00:10:55Z | de_37638139-4917-93e9-d3a2-9c1e6991e340 | program.exe launched by explorer.exe | C:\Users\danieljones\Desktop\ | danieljones | danieljones-pc |
| 2022-12-22T00:24:06Z | de_83b814ab-491f-6f38-00ed-33fe65490fae | executable.exe launched by sandbox-control.exe | C:\Users\danieljones\Desktop\ | danieljones | danieljones-pc |
| 2022-12-22T00:24:16Z | de_d8c1187c-ede6-1b49-5da1-7759ccce5387 | WindPlugin.exe launched by explorer.exe | C:\Users\danieljones\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\ | danieljones | danieljones-pc |
| 2022-12-22T00:24:36Z | de_fd46a82b-888c-881c-1e35-5273aa873336 | program.exe launched by explorer.exe | C:\Users\danieljones\Desktop\ | danieljones | danieljones-pc |

 

4 REPLIES 4

As I read this, it sounds like you are trying to extract a value from one field in one event, extract a value from another field in another event (maybe same field, maybe different) and make them part of your join to determine if the rule will fire.  Below is an adaptation of what I think you are attempting but using some data I had handy that I was working with.

In my example, we have Azure and O365 events that I want to correlate since they are very similar. The action I am isolating is in the metadata.description field in one event and in the product_event_type field in the other (I will be looking into that later!). Notice that lines 7 and 15 of the rule, both commented out, show the actual values, which are admittedly similar, so you can see what they look like.

Lines 9 and 17 have the captures. I am just looking for "service principal" with a leading space in both to illustrate the point. I suspect the regex doing the capture might be where things are getting dicey.

In the match, I am using that substring and the vendor name. I don't have a great user ID or IP in both events in this example, so vendor worked in a pinch for a join, but hopefully you get the idea.

Jumping down to the condition section, I added the not arrays.contains because I was getting some null values in my match for some reason, so that was mainly to strike any extra noise.

 

rule variable_testing {
meta:

events:
  $event1.metadata.event_type = "STATUS_UPDATE"
  $event1.metadata.product_name = "Azure AD Directory Audit"
  //$event1.metadata.description = "Update service principal"
  $event1.metadata.description != ""
  re.capture($event1.metadata.description, `\s(service.principal)`) = $substring
  $event1.metadata.vendor_name = $vendor

  
  $event2.metadata.event_type = "USER_UNCATEGORIZED"
  $event2.metadata.product_name = "Office 365"
  //$event2.metadata.product_event_type = "Update service principal."
  $event2.metadata.product_event_type != ""
  re.capture($event2.metadata.product_event_type, `\s(service.principal)`) = $substring
  $event2.metadata.vendor_name = $vendor

match:
  $substring, $vendor over 1m

outcome:
  $sub = array_distinct($substring)

condition:
  $event1 and $event2 and not arrays.contains($sub, "")

}
 
One final thing: I use this example in our rules workshop, and I have found it useful for capturing the process name, e.g. powershell.exe or calc.exe. Note that the forward slashes at the start and end of the regex could be swapped for backticks (`) instead.
re.capture($selection1.target.process.file.full_path, /.*\\(.*)/)
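As a small illustration of that note, the same capture written with backticks might look like the line below. The placeholder name $process_name is hypothetical and only there to show where the captured value would go; the field and pattern are the ones from the line above.

```
  // Same pattern as above, delimited with backticks instead of forward slashes.
  // $process_name is a hypothetical placeholder for the captured file name.
  $process_name = re.capture($selection1.target.process.file.full_path, `.*\\(.*)`)
```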

Thank you for the response! 

This is a great example for when the value we are looking for inside the UDM fields is known, and we will definitely use it in the future!

In my case the values matched by the regex may not be the same every time. What I am looking for is something like a contains function for a UDM field. So in pseudocode I could do something like this:

 

  $file.metadata.event_type = "FILE_MODIFICATION"
  $file.principal.hostname = $hst
  // Get the name of the file involved in the file modification
  $fileName = re.capture($file.target.file.full_path, `(?:\\|\/)([^\/\\]+)$`)
  
  
  $launch.metadata.event_type = "PROCESS_LAUNCH"
  $launch.principal.hostname = $hst
  
  str.contains($launch.target.process.command_line, $fileName)
// Would return true if $fileName is a substring of $launch.target.process.command_line
 
 
 
 


Thanks for the response!

This solves the problem, and it works in our environment as well. In my environment it accepts 1 but not 2, which seems to be due to 2 not using a placeholder event variable:

1. 

re.regex($launch.target.process.command_line , $fileName)

2. 

re.regex($launch.target.process.command_line, re.capture($event.target.file.full_path, `(?:\\|\/)([^\/\\]+)$`) ) nocase
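Putting the pieces of this thread together, a minimal sketch of the original rule using form 1 might look like the following. It assumes, as reported above, that re.regex accepts the $fileName placeholder as its pattern in your environment. The rule name is made up for illustration; everything else comes from the original post.

```
rule file_name_in_command_line {
meta:
  author = "amalone"
  description = "Match a process launch whose command line contains the name of a recently modified file"
  severity = "Low"

events:
  $file.metadata.event_type = "FILE_MODIFICATION"
  $file.principal.hostname = $hst
  // Capture the file name: everything after the last slash or backslash
  $fileName = re.capture($file.target.file.full_path, `(?:\\|\/)([^\/\\]+)$`)

  $launch.metadata.event_type = "PROCESS_LAUNCH"
  $launch.principal.hostname = $hst
  // Form 1 from above: the captured placeholder is used as the regex pattern
  re.regex($launch.target.process.command_line, $fileName)

match:
  $hst over 1m

outcome:
  $name = array_distinct($fileName)

condition:
  $file and $launch
}
```

One caveat with this approach: $fileName is treated as a regular expression, not a literal string. The dot in a name like report.exe therefore matches any character, so the check is slightly looser than a true substring test.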