Wednesday, November 18, 2015

R - How to read web logs

R is one of the most powerful statistical computing tools around, and it's free. I had been wanting to try it out for some time.

I recently had to read some Apache access logs. There are a few Apache log viewers available, but they are either paid or a pain to configure.

I looked at R, and it was pretty easy to get this going.

I used RStudio for Windows and found it really useful.

1. Use read.table() to read the Apache log file. Note that on Windows you have to write the path with double backslashes (\\) or with forward slashes (/), for example:

df = read.table('C:\\Users\\brij\\Desktop\\ccsites.access_log')

At this point, head(df) shows the first few rows of the data frame, with the default column names V1, V2, and so on.
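To see how read.table() splits a log line without needing a real file, here is a small sketch. The two log lines are made up (common log format is an assumption, your own LogFormat may differ), and df_sample is just an illustrative name.

```r
# Two made-up lines in Apache common log format (hypothetical data).
sample_log <- c(
  '192.0.2.1 - - [18/Nov/2015:10:15:32 +0000] "GET /index.html HTTP/1.1" 200 5120',
  '192.0.2.7 - - [19/Nov/2015:11:02:05 +0000] "GET /missing HTTP/1.1" 404 312'
)

# read.table() splits on whitespace and keeps the double-quoted request
# together as a single field.
df_sample <- read.table(text = sample_log, stringsAsFactors = FALSE)

ncol(df_sample)   # number of fields found per line
head(df_sample)   # columns are named V1, V2, ... by default
```

If your log also records user agents, the lines may contain single quotes or # characters. In that case passing quote = "\"" and comment.char = "" to read.table() is probably a safer starting point than the defaults.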





2. You can give the columns your own names. The number of names has to match the number of columns in df:

colnames(df)=c("host","ident1","ident2","date","time","request","status","duration")

3. Next, convert the date column from text to a proper Date:

df$date=as.Date(df$date,"[%d/%b/%Y")
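The format string works because as.Date() stops reading once the pattern is matched, so the time of day that follows the date is simply ignored. A quick sketch on one made-up value (raw is a hypothetical example, and the UTC time zone is an assumption) shows this, plus how you might keep the full timestamp with as.POSIXct() if you need it:

```r
# One made-up value, as read.table() would leave it in the date column.
raw <- "[18/Nov/2015:10:15:32"

# Date only: everything after the matched year is ignored.
day <- as.Date(raw, format = "[%d/%b/%Y")

# Full timestamp including time of day (time zone assumed to be UTC here).
stamp <- as.POSIXct(raw, format = "[%d/%b/%Y:%H:%M:%S", tz = "UTC")

day
stamp
```

One thing to watch: %b matches month abbreviations in the current locale, so on a non-English system "Nov" and friends may come back as NA.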

4. Plot and see some charts :-) This uses the ggplot2 package, so install it if needed and load it first:

library(ggplot2)

reqs=as.data.frame(table(df$date))

ggplot(data=reqs, aes(x=as.Date(Var1), y=Freq)) + geom_line() + xlab("Date") + ylab("Requests") + theme(legend.position = "none")
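In case Var1 and Freq look like magic: they are the column names that as.data.frame() gives to a table() result. A tiny sketch with made-up dates (d and reqs_sample are illustrative names) shows what the plot is working with:

```r
# Three made-up requests: two on one day, one on the next.
d <- data.frame(date = as.Date(c("2015-11-18", "2015-11-18", "2015-11-19")))

# table() counts rows per date; as.data.frame() turns the counts into columns.
reqs_sample <- as.data.frame(table(d$date))

names(reqs_sample)        # "Var1" "Freq"
class(reqs_sample$Var1)   # "factor", which is why the plot wraps it in as.Date()
reqs_sample
```

Var1 comes back as a factor rather than a Date, so converting it with as.Date() gives ggplot a real date axis instead of a set of categories.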





Nice and simple ...