Chapter 15 Appendix A A sample session
The following session is intended to introduce to you some features of the R environment by using them. Many features of the system will be unfamiliar and puzzling at first, but this puzzlement will soon disappear.
-
Start R appropriately for your platform (see Invoking R).
-
The R program begins, with a banner.
(Within R code, the prompt on the left hand side will not be shown to avoid confusion.)
-
help.start()
-
Start the HTML interface to on-line help (using a web browser available at your machine). You should briefly explore the features of this facility with the mouse.
Iconify the help window and move on to the next part.
-
x <- rnorm(50)
y <- rnorm(x)
-
Generate two pseudo-random normal vectors of x- and y-coordinates.
-
plot(x, y)
-
Plot the points in the plane. A graphics window will appear automatically.
-
ls()
-
See which R objects are now in the R workspace.
-
rm(x, y)
-
Remove objects no longer needed. (Clean up).
-
x <- 1:20
-
Make x = (1, 2, …, 20).
-
w <- 1 + sqrt(x)/2
-
A ‘weight’ vector of standard deviations.
-
dummy <- data.frame(x=x, y= x + rnorm(x)*w)
dummy
-
Make a data frame of two columns, x and y, and look at it.
-
fm <- lm(y ~ x, data=dummy)
summary(fm)
-
Fit a simple linear regression and look at the analysis. With
y
to the left of the tilde, we are modelling y dependent on x. -
fm1 <- lm(y ~ x, data=dummy, weight=1/w^2)
summary(fm1)
-
Since we know the standard deviations, we can do a weighted regression.
-
attach(dummy)
-
Make the columns in the data frame visible as variables.
-
lrf <- lowess(x, y)
-
Make a nonparametric local regression function.
-
plot(x, y)
-
Standard point plot.
-
lines(x, lrf\(y)</code></dt> <dd><p>Add in the local regression.</p> </dd> <dt><code class="calibre2">abline(0, 1, lty=3)</code></dt> <dd><p>The true regression line: (intercept 0, slope 1).</p> </dd> <dt><code class="calibre2">abline(coef(fm))</code></dt> <dd><p>Unweighted regression line.</p> </dd> <dt><code class="calibre2">abline(coef(fm1), col = "red")</code></dt> <dd><p>Weighted regression line.</p> </dd> <dt><code class="calibre2">detach()</code></dt> <dd><p>Remove data frame from the search path.</p> </dd> <dt><code class="calibre2">plot(fitted(fm), resid(fm),</code><br /> <code class="calibre2"> xlab="Fitted values",</code><br /> <code class="calibre2"> ylab="Residuals",</code><br /> <code class="calibre2"> main="Residuals vs Fitted")</code></dt> <dd><p>A standard regression diagnostic plot to check for heteroscedasticity. Can you see it?</p> </dd> <dt><code class="calibre2">qqnorm(resid(fm), main="Residuals Rankit Plot")</code></dt> <dd><p>A normal scores plot to check for skewness, kurtosis and outliers. (Not very useful here.)</p> </dd> <dt><code class="calibre2">rm(fm, fm1, lrf, x, dummy)</code></dt> <dd><p>Clean up again.</p> </dd> </dl> <p>The next section will look at data from the classical experiment of Michelson to measure the speed of light. This dataset is available in the <code class="calibre2">morley</code> object, but we will read it to illustrate the <code class="calibre2">read.table</code> function.</p> <dl> <dt><code class="calibre2">filepath <- system.file("data", "morley.tab" , package="datasets")</code><br /> <code class="calibre2">filepath</code></dt> <dd><p>Get the path to the data file.</p> </dd> <dt><code class="calibre2">file.show(filepath)</code></dt> <dd><p>Optional. Look at the file.</p> </dd> <dt><code class="calibre2">mm <- read.table(filepath)</code><br /> <code class="calibre2">mm</code></dt> <dd><p>Read in the Michelson data as a data frame, and look at it. There are five experiments (column <code class="calibre2">Expt</code>) and each has 20 runs (column <code class="calibre2">Run</code>) and <code class="calibre2">sl</code> is the recorded speed of light, suitably coded.</p> </dd> <dt><code class="calibre2">mm\)Expt <- factor(mm\(Expt)</code><br /> <code class="calibre2">mm\)Run <- factor(mm$Run)
-
Change
Expt
andRun
into factors. -
attach(mm)
-
Make the data frame visible at position 3 (the default).
-
plot(Expt, Speed, main=“Speed of Light Data”, xlab=“Experiment No.”)
-
Compare the five experiments with simple boxplots.
-
fm <- aov(Speed ~ Run + Expt, data=mm)
summary(fm)
-
Analyze as a randomized block, with ‘runs’ and ‘experiments’ as factors.
-
fm0 <- update(fm, . ~ . - Run)
anova(fm0, fm)
-
Fit the sub-model omitting ‘runs’, and compare using a formal analysis of variance.
-
detach()
rm(fm, fm0)
-
Clean up before moving on.
We now look at some more graphical features: contour and image plots.
-
x <- seq(-pi, pi, len=50)
y <- x
-
x is a vector of 50 equally spaced values in the interval [-pi, pi]. y is the same.
-
f <- outer(x, y, function(x, y) cos(y)/(1 + x^2))
-
f is a square matrix, with rows and columns indexed by x and y respectively, of values of the function cos(y)/(1 + x^2).
-
oldpar <- par(no.readonly = TRUE)
par(pty=“s”)
-
Save the plotting parameters and set the plotting region to “square”.
-
contour(x, y, f)
contour(x, y, f, nlevels=15, add=TRUE)
-
Make a contour map of f; add in more lines for more detail.
-
fa <- (f-t(f))/2
-
fa
is the “asymmetric part” of f. (t()
is transpose). -
contour(x, y, fa, nlevels=15)
-
Make a contour plot, …
-
par(oldpar)
-
… and restore the old graphics parameters.
-
image(x, y, f)
image(x, y, fa)
-
Make some high density image plots, (of which you can get hardcopies if you wish), …
-
objects(); rm(x, y, f, fa)
-
… and clean up before moving on.
R can do complex arithmetic, also.
-
th <- seq(-pi, pi, len=100)
z <- exp(1i*th)
-
1i
is used for the complex number i. -
par(pty=“s”)
plot(z, type=“l”)
-
Plotting complex arguments means plot imaginary versus real parts. This should be a circle.
-
w <- rnorm(100) + rnorm(100)*1i
-
Suppose we want to sample points within the unit circle. One method would be to take complex numbers with standard normal real and imaginary parts …
-
w <- ifelse(Mod(w) > 1, 1/w, w)
-
… and to map any outside the circle onto their reciprocal.
-
plot(w, xlim=c(-1,1), ylim=c(-1,1), pch=“+”,xlab=“x”, ylab=“y”)
lines(z)
-
All points are inside the unit circle, but the distribution is not uniform.
-
w <- sqrt(runif(100))exp(2pirunif(100)1i)
plot(w, xlim=c(-1,1), ylim=c(-1,1), pch=“+”, xlab=“x”, ylab=“y”)
lines(z)
-
The second method uses the uniform distribution. The points should now look more evenly spaced over the disc.
-
rm(th, w, z)
-
Clean up again.
-
q()
-
Quit the R program. You will be asked if you want to save the R workspace, and for an exploratory session like this, you probably do not want to save it.