R: Difference between revisions
No edit summary |
No edit summary |
||
Line 66: | Line 66: | ||
<pre>acf(d1.004.filter_1min,lag.max=length(d1.004.filter_1min))</pre> | <pre>acf(d1.004.filter_1min,lag.max=length(d1.004.filter_1min))</pre> | ||
We can also do a crosscorelation, where two timeseries are compared to time lagged versions of each other rather than themselves. When we see a periodic pattern in this case it means that there is some temporal commonality between them. Usually this means one series contributes components to another. | |||
<pre>ccf(d1.001.filter_1min,d1.004.filter_1min,lag.max=80000)</pre> |
Latest revision as of 01:21, 8 December 2006
R is a statistical analysis and visualization package similar to the commercial "S".
I used R to load the data files created by the python program and create the plots.
Command Summary
You can get help on R functions:
help.search("anova")
help(plot)
Load the time series Library into R:
library(tseries)
Load the python data file: ("d1-001" is the code I used for episodes, it refers to disk-1, title 1 of the DVD.
d1.001 = read.ts(file="data/d1-001.log")
Get some descriptives on the data:
summary(d1.001)
Do a line plot of the data in blue, with custom labels for the plot (main) and the Y Axix (ylab)
plot(d1.001,type="l",col="blue",ylab="Pixel mean of interframe difference",main="mean(csi) : Season 1, Epsiode 1")
Plot a nice histogram of the data with 3000 bins, for x values between 0 and 30:
hist(d1.001,col="blue",breaks=3000,xlim=c(0,30))
Find the location of one click of the mouse:
locator(1)
Save a 11 point plots vertically to a single PDF with 0.3 inch margins on all sides:
pdf("Season-1-raw.pdf",width=8.5,height=11) # create a PDF file rather than plotting to screen. par(mfrow=c(11,1),mai=c(0.3,0.3,0.3,0.3)) # set the page to 11 rows, 1 column, and 0.3 margins. plot(d1.001,type="p",pch=".") plot(d1.004,type="p",pch=".") plot(d1.007,type="p",pch=".") plot(d1.010,type="p",pch=".") plot(d2.001,type="p",pch=".") plot(d2.004,type="p",pch=".") plot(d2.007,type="p",pch=".") plot(d2.010,type="p",pch=".") plot(d3.001,type="p",pch=".") plot(d3.004,type="p",pch=".") plot(d3.007,type="p",pch=".") plot(d3.010,type="p",pch=".") dev.off() # close the file
Do a linear filter of the data, using a moving average function with equal weights on all coefficients, using blocks of 1min (3600 frames). This gives us the following coefficient: 1/(2*3600+1)=7201. "filter()" applies the function where the coeffecients are 1/7201 copied 7201 times. The repetion is done by the rep() function. The results of the filter are stored in the variable d1.004.filter_1min where any NA's (missing values) returned by the filter are ignored using na.exclude().
d1.004.filter_1min = na.exclude(filter(d1.004,filter=rep(1/7201,7201)))
We can write this filtered timeseries to disk:
write(d1.004.filter_1min,file="data/d1-004.filter_1min.data",ncolumns=1)
To autocorrelate (for all possible lag times) the filtered data from one episode to another you can use the "acf()" function. This makes a copy of the time series and time shifts it. It compares the correlation between the timeseries and the lagged version of itself. If you see a repeating pattern in the result then there us a periodic component for that particular lag.
acf(d1.004.filter_1min,lag.max=length(d1.004.filter_1min))
We can also do a crosscorelation, where two timeseries are compared to time lagged versions of each other rather than themselves. When we see a periodic pattern in this case it means that there is some temporal commonality between them. Usually this means one series contributes components to another.
ccf(d1.001.filter_1min,d1.004.filter_1min,lag.max=80000)