

This is part eight of the "Hunting with Splunk: The Basics" series.

Regular expression (or "gibberish" to the uninitiated) is a compact language that allows analysts to define a pattern in text. When working with ASCII data and trying to find something buried in a log, it's invaluable.

"But stop," you say, "Splunk uses fields!" With Splunk, all logs are indexed and stored in their complete form (compared to some *ahem* lesser platforms that only store certain fields). Additionally, Splunk can pull out the most interesting fields for any given data source at search time. However, on occasion, some valuable nuggets of information are not assigned to a field by default, and as an analyst, you'll want to hunt for these treasures.

Splunk offers two commands (rex and regex) in SPL that allow Splunk analysts to use regular expressions to assign values to new fields or narrow results on the fly as part of their search. The rex command allows you to substitute characters in a field (which is good for anonymization) as well as extract values and assign them to a new field. As a hunter, you'll want to focus on the extraction capability.

As an example, you may hypothesize that there are unencrypted passwords being sent across the wire and that you want to identify and extract that information for analysis. As you start your analysis, you may start by hunting in wire data for http traffic and come across a field in your web log data called form_data. In this one event you can see an unencrypted password, something you never want to see in your web logs! To find out how widespread this unencrypted password leakage is, you'll need to create a search using the rex command. This will create a "pass" field whose values you can then search for unencrypted passwords.

Notice that we use the rex command against the form_data field and then create a NEW field called pass? The "gibberish" in the middle is our regular expression (or "regex") that pulls that data from the form_data field. Now when I look at the results... lo and behold, I have a new field called "pass"! Cool, huh? Now we can perform operations on this new field, such as stats, discussed in John Stoner's excellent blog post "I Need To Do Some Hunting."

So how did that happen? How did this new field appear, you ask? Let's break this down. In the code below, I show the value of the form_data field. I have highlighted a couple of items of interest to work with. The passwd= string is a literal string, and I want to find exactly that pattern every time. The value immediately after that is the password value that I want to extract for my analysis.

Here is my regular expression to extract the password. The parentheses () signify a capture group, and the value captured inside is assigned to the field name. ?<pass> specifies the name of the field that the captured value will be assigned to. [^&]+ is the snippet in the regular expression that matches anything that is not an ampersand. The square brackets signify a class, meaning anything within them will be matched; the caret symbol (in the context of a class) means negation. So, we're matching any single character that is not an ampersand. The plus sign extends that single character to one or more matches; this ensures that the expression stops when it gets to an ampersand, which would denote another value in the form_data.

The regex command uses regular expressions to filter events. When used, it shows results that match the pattern specified. Conversely, it can also show the results that do NOT match the pattern if the regular expression is negated. In contrast to the rex command, the regex command does not create new fields.

I could use the eval function called cidrmatch, but I can use regex to do the same thing, and by mastering regex, I can use it in many other scenarios. Without the regex command, the search results on the same dataset include values that we don't want, such as 8.8.8.8 and 192.168.229.225. With regex, results are focused within the IP range of interest. Among the sample values in the src_ip field, 192.168.224.3 is a match and will be displayed. Our regular expression starts with 192\. (the backslash escapes the period so that it is matched literally), and the quantifier that follows a \d represents between 1 and 3 digits.

Are regular expressions gibberish? No, but you'll never be able to convince some people.
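To make the difference between the two commands concrete outside of Splunk, here is a minimal Python sketch, not SPL and not the article's actual search. It assumes the password pattern described in this post (the literal passwd= followed by a named capture group of non-ampersand characters). The function names rex_pass and regex_src_ip, the sample events, and the src_ip pattern are all hypothetical and invented for illustration; in particular, the src_ip pattern is only a stand-in, since the article's full expression is not reproduced here.

```python
import re

# Splunk's rex takes the PCRE named-group form (?<pass>...); Python's re
# module spells the same idea (?P<pass>...).
PASSWORD_RE = re.compile(r"passwd=(?P<pass>[^&]+)")

# Hypothetical stand-in for the article's src_ip expression: it only
# accepts addresses of the form 192.168.224.x.
SRC_IP_RE = re.compile(r"^192\.168\.224\.\d{1,3}$")


def rex_pass(event):
    """Mimic rex: return a copy of the event with a new 'pass' field if found."""
    match = PASSWORD_RE.search(event.get("form_data", ""))
    if match is None:
        return dict(event)
    return {**event, "pass": match.group("pass")}


def regex_src_ip(events, negate=False):
    """Mimic regex: keep matching events (or non-matching ones if negate=True)."""
    return [e for e in events
            if (SRC_IP_RE.search(e.get("src_ip", "")) is not None) != negate]


# Invented sample events, for illustration only.
events = [
    {"src_ip": "192.168.224.3", "form_data": "username=admin&passwd=hunter2&task=login"},
    {"src_ip": "8.8.8.8", "form_data": "q=splunk"},
    {"src_ip": "192.168.229.225", "form_data": "passwd=letmein"},
]

extracted = [rex_pass(e) for e in events]   # extraction adds a field
kept = regex_src_ip(events)                 # filtering only drops events
print([e.get("pass") for e in extracted])
print([e["src_ip"] for e in kept])
```

The extraction step returns events that carry an extra pass field, while the filtering step returns a subset of the original events with no new fields, which mirrors the rex versus regex distinction. Passing negate=True flips the filter so that only non-matching events are kept.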
