R: Difference between revisions

From Research
Jump to navigation Jump to search
Bbogart (talk | contribs)
No edit summary
Bbogart (talk | contribs)
No edit summary
Line 3: Line 3:
I used R to load the data files created by the python program and create the plots.  
I used R to load the data files created by the python program and create the plots.  


Here is a summary of the commands I used:  
== Command Summary ==
 
You can search the descriptions of R functions:
 
<pre>help.search("anova")</pre>
 
<pre>help(plot)</pre>


Load the time series Library into R:
Load the time series Library into R:

Revision as of 01:05, 8 December 2006

R is a statistical analysis and visualization package similar to the commercial "S".

I used R to load the data files created by the python program and create the plots.

Command Summary

You can search the descriptions of R functions:

help.search("anova")
help(plot)

Load the time series Library into R:

library(tseries)

Load the python data file: ("d1-001" is the code I used for episodes, it refers to disk-1, title 1 of the DVD.

d1.001 = read.ts(file="data/d1-001.log")

Get some descriptives on the data:

summary(d1.001)

Do a line plot of the data in blue, with custom labels for the plot (main) and the Y Axix (ylab)

plot(d1.001,type="l",col="blue",ylab="Pixel mean of interframe difference",main="mean(csi) : Season 1, Epsiode 1")

Plot a nice histogram of the data with 3000 bins, for x values between 0 and 30:

hist(d1.001,col="blue",breaks=3000,xlim=c(0,30))

Find the location of one click of the mouse:

locator(1)

Save a 11 point plots vertically to a single PDF with 0.3 inch margins on all sides:

pdf("Season-1-raw.pdf",width=8.5,height=11) # create a PDF file rather than plotting to screen.
par(mfrow=c(11,1),mai=c(0.3,0.3,0.3,0.3))   # set the page to 11 rows, 1 column, and 0.3 margins.
plot(d1.001,type="p",pch=".")
plot(d1.004,type="p",pch=".")
plot(d1.007,type="p",pch=".")
plot(d1.010,type="p",pch=".")
plot(d2.001,type="p",pch=".")
plot(d2.004,type="p",pch=".")
plot(d2.007,type="p",pch=".")
plot(d2.010,type="p",pch=".")
plot(d3.001,type="p",pch=".")
plot(d3.004,type="p",pch=".")
plot(d3.007,type="p",pch=".")
plot(d3.010,type="p",pch=".")
dev.off() # close the file

Do a linear filter of the data, using a moving average function with equal weights on all coefficients, using blocks of 1min (3600 frames). This gives us the following coefficient: 1/(2*3600+1)=7201. "filter()" applies the function where the coeffecients are 1/7201 copied 7201 times. The repetion is done by the rep() function. The results of the filter are stored in the variable d1.004.filter_1min where any NA's (missing values) returned by the filter are ignored using na.exclude().

d1.004.filter_1min = na.exclude(filter(d1.004,filter=rep(1/7201,7201)))

We can write this filtered timeseries to disk:

write(d1.004.filter_1min,file="data/d1-004.filter_1min.data",ncolumns=1)