## Wednesday, 21 September 2011

### Simple time series plot using R : Part 1

As a task for my Financial Economics assignment, I had to plot a simple time series of the overnight MIBOR (Mumbai Interbank Offer Rate) for the past year. The job could easily have been done in MS-Excel, but I chose to plot it in R instead, and the quality of the graph, both in resolution and in neatness, was far better than what I could have obtained with MS-Excel. All this at the cost of a minimal 3 lines of code:

# The overnight MIBOR rates were stored in a file named "Call_Rates_2011.csv". This is just a normal Excel file saved in CSV (comma-separated values) format, which R can read.
# The way R conceptualizes the data is similar to Excel: to draw a simple analogy, you can assume that the variable "a" now stores the entire Excel spreadsheet.
# Make sure that the working directory is the one that contains the file "Call_Rates_2011.csv".
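
The line that actually loads the file does not appear in the listing above. A minimal sketch of that step, assuming the CSV has the two headers "date" and "mibor"; the sample rows are invented and written to a temporary file only so that the snippet runs on its own:

```r
# Illustrative stand-in for "Call_Rates_2011.csv"; the values are invented.
csv_path <- file.path(tempdir(), "Call_Rates_2011.csv")
writeLines(c("date,mibor",
             "01-Sep-10,5.12",
             "02-Sep-10,5.25"), csv_path)

# read.csv() returns a data frame: one column per CSV column.
a <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)

str(a)     # inspect the column names and types
getwd()    # shows the current working directory; setwd() changes it
```

With the real file sitting in the working directory, the call would presumably reduce to `a <- read.csv("Call_Rates_2011.csv")`.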

# The 2 column headers in my CSV file were "date" and "mibor", so the code below plots "date" on the x-axis and "mibor" on the y-axis. The as.Date() call tells R that the column "date" contains dates in the format "day-abbreviated month-two-digit year" ('%d-%b-%y'), for example 21-Sep-11.
# a$(column header) is the standard way of referring to a column in the "spreadsheet" contained in "a"
# xlab : x-axis label
# ylab : y-axis label
# type : line ('l')
# col : color of the line

plot(as.Date(a$date, '%d-%b-%y'), a$mibor, xlab = "Months", ylab = "MIBOR overnight rates (percentage)", type = 'l', col = 'red')

# This is to get the title in place
# main : main title
# col.main : color of the main title
# font.main : font style of the title (4 = bold italic); use cex.main to change its size

title(main="Overnight MIBOR rates for last one year", col.main="black", font.main=4)

The resulting plot looks like this:

In case you can't make out the difference in the quality of the plot, just leave a comment with your email address and I will mail you the PDF and the JPG image of the plot. You can stretch it to see that the plot doesn't get distorted, and it looks way neater on a presentation slide.
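
To produce such files yourself, one option is to open a graphics device, draw the plot, and close the device. This is only a sketch and not necessarily how the files mentioned above were made; the data and the file names are invented:

```r
# Invented sample data standing in for the MIBOR series.
dates <- seq(as.Date("2010-09-01"), by = "month", length.out = 12)
rates <- c(5.1, 5.6, 6.2, 6.5, 6.4, 6.8, 7.1, 6.9, 7.3, 7.6, 7.9, 8.0)

draw_plot <- function() {
  plot(dates, rates, type = "l", col = "red",
       xlab = "Months", ylab = "MIBOR overnight rates (percentage)")
  title(main = "Overnight MIBOR rates for last one year")
}

# PDF is a vector format, so it stays sharp when stretched.
pdf_file <- file.path(tempdir(), "mibor.pdf")
pdf(pdf_file, width = 8, height = 5)     # size in inches
draw_plot()
invisible(dev.off())                     # closing the device writes the file

# JPEG is a raster format; a higher 'res' gives more pixels.
# Some R builds lack JPEG support, hence the capabilities() check.
jpg_file <- file.path(tempdir(), "mibor.jpg")
if (capabilities("jpeg")) {
  jpeg(jpg_file, width = 8, height = 5, units = "in", res = 300)
  draw_plot()
  invisible(dev.off())
}
```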

1. I don't know why I have not found this blogspot earlier.

Very detailed and clear explanations.

Way to go, Shreyes.

2. Shreyes, I tried the exact same R code you provided in this tutorial and the console displayed the following, without showing any graph:

Error in plot.window(...) : need finite 'xlim' values
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf

P.S: I used the MIBOR .csv file you provided in one of your other tutorials.

Thank you.

3. GUEYE, if you used the MIBOR.csv file that I provided in this post: http://programming-r-pro-bro.blogspot.in/search/label/Principal%20component%20analysis, then you would get this error because that file contains "#NA" entries. In the same post I used the approx() function to replace these "#NA" entries with linearly interpolated values.

R cannot treat these entries as numbers, and the "Inf" and "-Inf" in the warnings are what min() and max() return when they are given no non-missing values. Since the error mentions 'xlim', it is also worth checking that as.Date() is not returning NA for every date. Try doing this before you go ahead with the plot:

a[, 2] <- approx(as.Date(a$date, '%d-%b-%y'), a[, 2], as.Date(a$date, '%d-%b-%y'))$y

It should work now. Let me know if you still face any problem.

~
Shreyes
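
A minimal sketch of the same idea with made-up numbers. It assumes the gaps are spelled exactly "#NA" and uses `na.strings` so that `read.csv()` reads them as missing values. `a[, 2]` means the second column of `a`, and `approx()` returns a list whose `$y` element holds the interpolated values:

```r
# Invented rows in the same shape as the MIBOR file, with one "#NA" gap.
csv_path <- file.path(tempdir(), "mibor_with_gaps.csv")
writeLines(c("date,mibor",
             "01-Sep-10,5.0",
             "02-Sep-10,#NA",
             "03-Sep-10,6.0"), csv_path)

# Treat "#NA" as a missing value so the column is read as numeric.
a <- read.csv(csv_path, na.strings = "#NA", stringsAsFactors = FALSE)

# %b matches abbreviated month names in the current locale.
d <- as.Date(a$date, "%d-%b-%y")
stopifnot(!anyNA(d))     # if this fails, the format string does not match the data

# approx() ignores the NA pairs and interpolates linearly at the requested dates.
filled <- approx(as.numeric(d), a[, 2], xout = as.numeric(d))
a[, 2] <- filled$y       # filled$x holds the points that were evaluated

print(a)
```

If every date fails to parse, or fewer than two rates are non-missing, `approx()` has nothing to interpolate between and stops with an error.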

1. Hi Shreyes, I got the same problem with my data. I tried your fix, but it throws the error "need at least two non-NA values to interpolate". Any idea?

2. What is a[,2], and what is "$y" in the above example? Can/should the '%d-%b-%y' string be changed to the "%m/%d/%Y %H:%M" string if I'm using the as.POSIXlt example below?

4. Hi Shreyas,
I am still beginning with R.
I have a CSV file like this
time,data
01/29/2013 19:26:04,110.087103
01/29/2013 19:28:04,56.978100
01/29/2013 19:30:04,91.755860
01/29/2013 19:32:04,66.255792
01/29/2013 19:34:04,86.740205
01/29/2013 19:36:04,99.137451
01/29/2013 19:38:04,68.836168
01/29/2013 19:40:04,106.748553
01/29/2013 19:42:04,39.968326
01/29/2013 19:44:04,61.309700

How do I plot a graph using this data?

Also, can R handle huge CSV files? I have a CSV file which has 10000 lines.

Thanks

A

1. Try this:

x$new_date <- as.POSIXlt(x$Date, format = "%m/%d/%Y %H:%M")

Since your dates are in date-time format, you need to tell R that first. See if you are able to plot the values after this (using x$new_date instead of x$Date).
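
A sketch using the first rows of the sample posted in the question. Two details differ from the line above and are assumptions rather than part of that suggestion: the columns are named "time" and "data" as in the posted CSV header, and the format string adds `%S` because the timestamps include seconds. `as.POSIXct()` is used here because it stores cleanly as a data frame column:

```r
# First rows of the sample from the question, written to a temporary file.
csv_path <- file.path(tempdir(), "latency_sample.csv")
writeLines(c("time,data",
             "01/29/2013 19:26:04,110.087103",
             "01/29/2013 19:28:04,56.978100",
             "01/29/2013 19:30:04,91.755860",
             "01/29/2013 19:32:04,66.255792"), csv_path)

x <- read.csv(csv_path, stringsAsFactors = FALSE)

# Parse the date-time text; a fixed tz avoids daylight-saving surprises.
x$new_date <- as.POSIXct(x$time, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

plot(x$new_date, x$data, type = "l", col = "red",
     xlab = "Time", ylab = "data")
```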

5. Thanks Shreyes, I will try this.

Also, can R take large sets of CSV data?

I am a SAN engineer by profession and the performance data we get is huge.

I am trying to automate the graph generation, as I have already arranged the data as CSV using the Ruby programming language.

Regards,
A

1. R does have the shortcoming of holding all its objects in physical memory (RAM). Having said that, I don't think 10,000 rows of data should pose any problem. However, as the size of the data increases, say beyond 40-50 MB, it might start acting funny, depending on your machine.

Revolution R seems to have a package, RevoScaleR, that breaks the data into chunks and is efficient when it comes to dealing with large datasets. That is as far as my knowledge of Revolution R goes; I am still trying to get familiar with it and am not a pro.

Hope this helps.
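
To check this on your own machine, a rough sketch: it builds an illustrative 10,000-row file from random numbers (not real performance data), reads it back, and reports how much memory the resulting data frame takes. The column names are assumptions.

```r
# Build an illustrative 10,000-row file (random values, not real data).
set.seed(1)
n <- 10000
start <- as.POSIXct("2013-01-29 19:26:04", tz = "UTC")
big <- data.frame(
  time = format(start + 120 * seq_len(n), "%m/%d/%Y %H:%M:%S"),
  data = round(runif(n, 40, 140), 6)
)
csv_path <- file.path(tempdir(), "big_sample.csv")
write.csv(big, csv_path, row.names = FALSE)

# Declaring the column types up front spares R from guessing them.
d <- read.csv(csv_path, colClasses = c("character", "numeric"))

nrow(d)                                 # number of rows read
format(object.size(d), units = "Mb")    # approximate memory used by 'd'
```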

2. Hi Shreyes,

Thanks. I tried the script. Though the graph got created, it is not very clear.

The X axis wasn't in the proper format,

and the lines didn't come out properly. Still, it is good learning and I want to build on this.

best Regards,

Athreya

6. Hi Shreyes,

I got the graph, however the X axis is all messed up. I think all the labels overlap.

png(file="latency.png",width=1000,height=350,res=72)
d$new_date <- as.POSIXlt(d$date, format= "%m/%d/%Y %H:%M")
plot(d$new_date,d$iops,xaxt='n',xlab= "Dates",ylab="iops", type="l", col='red')
axis.POSIXct(1,at=d$new_date,labels=format(d$date, format= "%m/%d/%Y %H:%M"),las=2)
q()

This is what the file looks like:

01/31/2013 22:26:05,36.953642
01/31/2013 22:28:17,82.334787
01/31/2013 22:30:28,89.057602
01/31/2013 22:32:38,105.279861
01/31/2013 22:34:38,69.626364
01/31/2013 22:36:57,68.110564
01/31/2013 22:39:24,122.304370
01/31/2013 22:41:32,67.490331
01/31/2013 22:43:56,107.949942
01/31/2013 22:46:26,93.248857
01/31/2013 22:48:32,70.976643
01/31/2013 22:52:42,142.202259
01/31/2013 22:54:56,84.722920
01/31/2013 22:57:10,41.355000

regards,
A
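
One likely cause of the overlap is that `at = d$new_date` asks for a label at every observation. A sketch of one way around it, using the first rows of the file above and assuming the columns are called "date" and "iops" as in the posted code. It places a tick every 5 minutes and shows only the time of day; `dev.off()` closes the device without quitting R:

```r
# First rows of the posted file; the header names are assumed.
csv_path <- file.path(tempdir(), "latency.csv")
writeLines(c("date,iops",
             "01/31/2013 22:26:05,36.953642",
             "01/31/2013 22:28:17,82.334787",
             "01/31/2013 22:30:28,89.057602",
             "01/31/2013 22:32:38,105.279861",
             "01/31/2013 22:34:38,69.626364",
             "01/31/2013 22:36:57,68.110564",
             "01/31/2013 22:39:24,122.304370",
             "01/31/2013 22:41:32,67.490331"), csv_path)
d <- read.csv(csv_path, stringsAsFactors = FALSE)
d$new_date <- as.POSIXct(d$date, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# A handful of evenly spaced ticks instead of one per observation.
ticks <- seq(min(d$new_date), max(d$new_date), by = "5 min")

out_file <- file.path(tempdir(), "latency.png")
if (capabilities("png")) {               # some R builds lack PNG support
  png(out_file, width = 1000, height = 350, res = 72)
  par(mar = c(6, 4, 2, 1))               # extra bottom margin for rotated labels
  plot(d$new_date, d$iops, xaxt = "n", xlab = "", ylab = "iops",
       type = "l", col = "red")
  axis.POSIXct(1, at = ticks, format = "%H:%M", las = 2)
  invisible(dev.off())
}
```

For longer recordings, a wider `by` step (for example "1 hour") keeps the number of labels manageable.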

7. How do you show the date labels instead of the month labels?

8. Hi Shreyes,
This was really very helpful. It worked for me as shown in your instructions above. I have a question:

I have groups of records in my CSV file. I want to draw multiple graphs in the same window, with a different color for each group. For example, I have N1 records of group A at different time points and N2 records of group B at different time points. I want to have the two graphs in the same window in red and green. Is it possible? Is it possible to give a "group by" option in the example below?

plot(as.Date(Lines$TIMEPOINT), Lines$RESULT_VALUE, xlab= "Time In Week.", ylab= "Result Value AT Specific Week.", type='l', col='red')

I saw there are different kinds of graph but this method was very simple. Thanks in advance for your help.

Regards,
Piyush
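
The base `plot()` function has no "group by" argument, but one common approach is to draw an empty frame and then add one line per group with `lines()`. A sketch with invented data; the `GROUP` column name is hypothetical, while `TIMEPOINT` and `RESULT_VALUE` follow the call above:

```r
# Invented records: group A has 4 time points, group B has 3.
Lines <- data.frame(
  GROUP        = c("A", "A", "A", "A", "B", "B", "B"),
  TIMEPOINT    = as.Date(c("2015-01-05", "2015-01-12", "2015-01-19", "2015-01-26",
                           "2015-01-05", "2015-01-19", "2015-01-26")),
  RESULT_VALUE = c(1.2, 1.8, 1.5, 2.1, 0.9, 1.1, 1.6),
  stringsAsFactors = FALSE
)

groups  <- split(Lines, Lines$GROUP)     # one data frame per group
colours <- c(A = "red", B = "green")

# Empty frame sized to hold every group, then one line per group.
plot(Lines$TIMEPOINT, Lines$RESULT_VALUE, type = "n",
     xlab = "Time in weeks", ylab = "Result value")
for (g in names(groups)) {
  lines(groups[[g]]$TIMEPOINT, groups[[g]]$RESULT_VALUE, col = colours[[g]])
}
legend("topleft", legend = names(groups), col = colours[names(groups)], lty = 1)
```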

00-00-0C-12-43-02 2/21/2015 0:00 GOOD
00-00-0C-12-43-02 2/21/2015 1:00 GOOD
00-00-0C-12-43-02 2/21/2015 2:00 CRASH
00-00-0C-12-43-02 2/21/2015 3:00 CRASH
00-00-0C-12-43-02 2/21/2015 4:00 ERROR
00-00-0C-12-43-02 2/21/2015 5:00 WARN
00-00-0C-12-43-02 2/21/2015 6:00 GOOD
00-00-0C-12-43-02 2/21/2015 7:00 CRASH
00-00-0C-12-43-02 2/21/2015 8:00 GOOD
00-00-0C-12-43-02 2/21/2015 9:00 GOOD
00-00-0C-12-43-02 2/21/2015 10:00 GOOD
00-00-0C-12-43-02 2/21/2015 11:00 GOOD
00-00-0C-12-43-02 2/21/2015 12:00 GOOD
00-00-0C-12-43-02 2/21/2015 13:00 ERROR
00-00-0C-12-43-02 2/21/2015 14:00 ERROR
00-00-0C-12-43-02 2/21/2015 15:00 GOOD
00-00-0C-12-43-02 2/21/2015 16:00 GOOD
00-03-E0-43-11-19 2/21/2015 0:00 GOOD
00-03-E0-43-11-19 2/21/2015 1:00 GOOD
00-03-E0-43-11-19 2/21/2015 2:00 CRASH
00-03-E0-43-11-19 2/21/2015 3:00 ERROR
00-03-E0-43-11-19 2/21/2015 4:00 GOOD
00-03-E0-43-11-19 2/21/2015 5:00 ERROR
00-03-E0-43-11-19 2/21/2015 6:00 ERROR
00-03-E0-43-11-19 2/21/2015 7:00 GOOD
00-03-E0-43-11-19 2/21/2015 8:00 GOOD
00-03-E0-43-11-19 2/21/2015 9:00 GOOD
00-03-E0-43-11-19 2/21/2015 10:00 GOOD
00-03-E0-43-11-19 2/21/2015 11:00 GOOD
00-03-E0-43-11-19 2/21/2015 12:00 GOOD
00-03-E0-43-11-19 2/21/2015 13:00 ERROR
00-03-E0-43-11-19 2/21/2015 14:00 GOOD
00-03-E0-43-11-19 2/21/2015 15:00 GOOD
00-03-E0-43-11-19 2/21/2015 16:00 ERROR

10. I want to count how many times the Nifty crosses the 8200 level. I have a CSV file with me, but I don't know how to write the formula for that in R. Can anybody please help me?

11. I want to plot a continuous time series for the data in this format
Jan Feb...........Dec
1990
1991
1992
.
.
.
.
2010

I want to have a script that will plot one year followed by the other i.e. starts from Jan-Dec and continued from the next year.
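
One way to do this, sketched with random numbers in place of real data: if the table has one row per year and one column per month, transposing it and flattening gives the values in calendar order, and `ts()` attaches the dates. The object names are hypothetical.

```r
# Invented table: rows are the years 1990-2010, columns are Jan..Dec.
set.seed(42)
years <- 1990:2010
wide <- matrix(rnorm(length(years) * 12), nrow = length(years),
               dimnames = list(years, month.abb))

# as.vector() reads a matrix column by column, so transpose first
# to get Jan 1990, Feb 1990, ..., Dec 1990, Jan 1991, ...
series <- ts(as.vector(t(wide)), start = c(1990, 1), frequency = 12)

plot(series, xlab = "Year", ylab = "Value")
```

If the table comes from `read.csv()` with the year in the first column, something like `as.matrix(df[, -1])` would give the matrix used here.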

12. Hey,
How do I plot the log of one column versus another?