Plotting timedata in gnuplot

Problem :

I have a log file (auth.log) from which non-relevant lines have been removed.
I wish to aggregate lines per hour/day into the plot, meaning that each line that is within the same hour or day is aggregated into one tic in the plot.

I have been looking into functions, but I keep getting stuck.

This is what I have so far, but it will only work if I have a “variable” for each line in the log file.

#!/usr/bin/env gnuplot                                                          

set terminal png size 1200,800                                                  
set output "graph.png"                                                          
set title "Breakin Attempts"                                                    

set key top right box                                                           
set style data lines                                                            
set border 3                                                                    
set grid                                                                        
set pointsize 3                                                                 

set xlabel "Time"
set xtics nomirror
set xdata time
set timefmt "%b %d %H:%M:%S"
set format x "%m/%d"

set ylabel "Number of breakin attempts"
set ytics nomirror                                                              

plot "pc1.log" using 1:4 title "PC1" linecolor rgb "red", \
     "pc2.log" using 1:4 title "PC2" linecolor rgb "blue", \
     "pc3.log" using 1:4 title "PC3" linecolor rgb "green"

Here is an example of the data

Sep 18 11:26:30 root 60.191.36.196                                              
Sep 18 11:26:34 root 60.191.36.196                                              
Sep 18 11:26:37 root 60.191.36.196
Sep 18 19:21:31 root 198.56.193.74                                              
Sep 18 19:21:33 root 198.56.193.74

In this case the two entries at 19:21:xx will be one tic of 2 and the three at 11:26:xx will be a tic of 3.

Solution :

I assume you want the count of entries per time unit (minutes in your example). I do not know whether gnuplot can count lines in this manner. I would use awk (or any language convenient for you) to aggregate the data instead. Something like this would do:

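# awk drops the seconds, then prints one line per minute, e.g. "Sep 18 11:26 3"
script = '{time = $3; sub(/:[0-9][0-9]$/, "", time); date = sprintf("%s %s %s", $1, $2, time)} date != last {if (NR > 1) print last, count; last = date; count = 0} {count++} END {if (NR > 0) print last, count}'

pipe(file) = sprintf("< awk '%s' %s", script, file)
# Assumes "set xdata time" from the question's script; the timestamps no longer carry seconds
set timefmt "%b %d %H:%M"
plot pipe("pc1.log") using 1:4 title "PC1"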
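Since the question asks for counts per hour or per day rather than per minute, the same idea can be sketched as a small shell function around awk. This is only an illustration under assumptions: the function name count_per_bucket and its "hour"/"day" argument are made up for this example, and the input is assumed to be sorted and in the "Sep 18 11:26:30 ..." layout shown in the question.

```shell
#!/bin/sh
# count_per_bucket UNIT [FILE...]: print one line per hour or per day with
# the number of log entries in it. UNIT is "hour" or "day".
# The function name and its arguments are hypothetical, not a standard tool.
count_per_bucket() {
    unit=$1
    shift
    awk -v unit="$unit" '
        {
            # $1 = month, $2 = day of month, $3 = HH:MM:SS
            key = $1 " " $2
            if (unit == "hour") key = key " " substr($3, 1, 2)
        }
        key != last {            # first entry of a new bucket
            if (NR > 1) print last, count
            last = key
            count = 0
        }
        { count++ }
        END { if (NR > 0) print last, count }
    ' "$@"
}
```

For the sample data, `count_per_bucket hour pc1.log` would print "Sep 18 11 3" and "Sep 18 19 2", and `count_per_bucket day pc1.log` would print "Sep 18 5". On the gnuplot side the hourly output should be readable with set timefmt "%b %d %H" and using 1:4, and the daily output with set timefmt "%b %d" and using 1:3.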

Your question is not very explicit. Like Hannes, I assume you want to plot the number of lines corresponding to a certain date.

Gnuplot is not well suited for this; pre-processing the file is recommended.

However, with gnuplot 4.4 or later you can program counters (as global variables), so you could have something like this:

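# NaN matches no x; set currentx to the first x value to avoid an initial zero point
currentx=NaN
currentn=0
# Same x as the previous row: count it and return 1/0 so that no point is drawn
increaseandreturn(returnvalue)=(currentn=currentn+1,returnvalue)
# New x: return the finished count and start the new group at 1 (this row)
startnewxandreturn(x,returnvalue)=(currentx=x,currentn=1,returnvalue)
count(x)=((x==currentx)?increaseandreturn(1/0):startnewxandreturn(x,currentn))
# The returned count belongs to the previous x, hence $1-1 (x values are consecutive integers)
plot "file.gdat" using ($1-1):(count($1)) with points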

It works only for sorted files (it counts consecutive entries, not non-consecutive ones), and currentx has to contain the first value (or you need to insert more tests). For dates you will need to adapt the script a little.

You can test it, e.g., with a file generated by gnuplot like this:

set table "file.gdat"
set parametric
plot [0:20] floor(exp(t/10)),t
unset table
