Friday, April 29, 2016

Filter Log Records between Two Dates

If you have a log file with a timestamp on each line and you need to keep only the lines that fall between two dates, you can use the following Unix command pipeline.

For example, if your log file looks like this:


Apr 15 02:29:58 GCAdaptiveSizePolicy::compute_survivor_space_size_and_thresh: 


 cat GC.log | sed 's/\(.*\) \([[:digit:]]*\) \([[:digit:]][[:digit:]]\):\([[:digit:]][[:digit:]]\):\([[:digit:]][[:digit:]]\) \(.*$\)/\1 \2\3\4\5 \6/' | awk -v from="$from" -v to="$to" '$2 > from && $2 < to {print $0}' | sed 's/\(.*\) \([[:digit:]]*\)\([[:digit:]][[:digit:]]\)\([[:digit:]][[:digit:]]\)\([[:digit:]][[:digit:]]\) \(.*$\)/\1 \2 \3:\4:\5 \6/'   


cat GC.log -- reads the file and writes it to standard output.

The first sed command matches the date and time:

sed  's/\(.*\) \([[:digit:]]*\) \([[:digit:]][[:digit:]]\):\([[:digit:]][[:digit:]]\):\([[:digit:]][[:digit:]]\) \(.*$\)/\1 \2\3\4\5 \6/'

Each capture group is enclosed in escaped parentheses. In our example:

\(.*\)  -- matches the month, "Apr"
\([[:digit:]]*\)  -- matches the day of the month, "15"
\([[:digit:]][[:digit:]]\): -- matches the hour, "02", followed by a colon
\([[:digit:]][[:digit:]]\): -- matches the minute, "29", followed by a colon
\([[:digit:]][[:digit:]]\)  -- matches the seconds, "58"
\(.*$\) -- matches the rest of the line

The second part of the substitution (between the second and third slashes) is the output:

\1 \2\3\4\5 \6

This prints the month, then the day of the month joined with the time to form one long number, and then the rest of the line. The output of the sed command is:

Apr 15022958 GCAdaptiveSizePolicy::compute_survivor_space_size_and_thresh:
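
To try this step on its own, here is a minimal sketch that feeds the sample line through the same sed expression. The variable names line and compact are only illustrative; they are not part of the original command.

```shell
# Sample line taken from the example log above.
line='Apr 15 02:29:58 GCAdaptiveSizePolicy::compute_survivor_space_size_and_thresh:'

# Join the day of the month and the time into one numeric field.
compact=$(printf '%s\n' "$line" | sed 's/\(.*\) \([[:digit:]]*\) \([[:digit:]][[:digit:]]\):\([[:digit:]][[:digit:]]\):\([[:digit:]][[:digit:]]\) \(.*$\)/\1 \2\3\4\5 \6/')

printf '%s\n' "$compact"
# prints: Apr 15022958 GCAdaptiveSizePolicy::compute_survivor_space_size_and_thresh:
```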

The next part of the command is the awk script, shown here with example values for from and to:

awk -v from="15010000" -v to="15035959" '$2 > from && $2 < to {print $0}'
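
The comparison can be checked in isolation. The sketch below passes a few made-up lines, already in the compact form, through the same awk program; the sample messages and the variable name filtered are assumptions for illustration only.

```shell
from=15010000
to=15035959

# Hypothetical lines in the compact "day + time" form produced by the first sed.
filtered=$(printf '%s\n' \
  'Apr 15005959 too early' \
  'Apr 15010000 equal to from, so the strict comparison drops it' \
  'Apr 15022958 inside the window' \
  'Apr 15040000 too late' |
  awk -v from="$from" -v to="$to" '$2 > from && $2 < to {print $0}')

printf '%s\n' "$filtered"
# prints: Apr 15022958 inside the window
```

Since the program uses > and <, a line whose timestamp equals one of the bounds is not printed; >= and <= would include it.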

The awk script does the filtering: it prints only the lines whose second field is strictly between the from and to values. Because only the day of the month and the time are compared, the range must fall within a single month. The last part returns the line to its original format: it takes the awk output, where the day of the month is joined with the time, and splits it back into day, hours, minutes and seconds.

sed  's/\(.*\) \([[:digit:]]*\)\([[:digit:]][[:digit:]]\)\([[:digit:]][[:digit:]]\)\([[:digit:]][[:digit:]]\) \(.*$\)/\1 \2 \3:\4:\5 \6/'
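
Putting the three steps together, here is a sketch of the whole pipeline wrapped in a shell function. The function name filter_log, its argument order and the sample file contents are hypothetical, and the file name is passed to sed directly instead of going through cat.

```shell
# Hypothetical helper: filter_log FROM TO FILE
# FROM and TO use the compact DDHHMMSS form, for example 15010000.
filter_log() {
  sed 's/\(.*\) \([[:digit:]]*\) \([[:digit:]][[:digit:]]\):\([[:digit:]][[:digit:]]\):\([[:digit:]][[:digit:]]\) \(.*$\)/\1 \2\3\4\5 \6/' "$3" |
    awk -v from="$1" -v to="$2" '$2 > from && $2 < to {print $0}' |
    sed 's/\(.*\) \([[:digit:]]*\)\([[:digit:]][[:digit:]]\)\([[:digit:]][[:digit:]]\)\([[:digit:]][[:digit:]]\) \(.*$\)/\1 \2 \3:\4:\5 \6/'
}

# Build a small made-up log file to run the function against.
sample=$(mktemp)
printf '%s\n' \
  'Apr 15 00:59:59 before the window' \
  'Apr 15 02:29:58 GCAdaptiveSizePolicy::compute_survivor_space_size_and_thresh:' \
  'Apr 15 04:00:00 after the window' > "$sample"

result=$(filter_log 15010000 15035959 "$sample")
rm -f "$sample"

printf '%s\n' "$result"
# prints: Apr 15 02:29:58 GCAdaptiveSizePolicy::compute_survivor_space_size_and_thresh:
```

One caveat: the leading \(.*\) is greedy, so a message that itself contains a space-delimited run of six or more digits may be rewritten by the last sed. Anchoring the pattern, for instance with ^\([^ ]*\) in place of \(.*\), is one possible way to avoid that.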


Hope that this is useful for you.