---
title: "Examples"
output: pdf_document
date: "Wednesday, Feb 17, 2015"
---
# An Example for R Markdown
This is an example of using **R Markdown** to produce a *PDF* document from a _Markdown_ source file.
**R Markdown** embeds R code in a _Markdown_ document.
## A simple data analysis with R
```{r, echo=FALSE, message=FALSE, warning=FALSE}
## load all needed packages at the beginning of the document
# install.packages("rmarkdown")
# library(rmarkdown)
library(ggplot2)
```
```{r}
yctl <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14)
ytrt <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
trt <- c(rep(0, 10), rep(1, 10))
weight <- c(yctl, ytrt)
lm.1 <- lm(weight ~ trt)
lm.0 <- lm(weight ~ 1)
summary(lm.1)
```
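The hypotheses in the next step refer to the slope of this fitted model. As a sketch, assuming the usual simple linear regression setup with independent, mean-zero errors (the error term $\epsilon_i$ and the index $i$ are notation introduced here, not taken from the code), `lm(weight ~ trt)` corresponds to:

$$
\begin{aligned}
\text{weight}_i &= \beta_0 + \beta_1\,\text{trt}_i + \epsilon_i, \qquad i = 1, \ldots, 20,\\
\text{trt}_i = 0 &: \quad E(\text{weight}_i) = \beta_0,\\
\text{trt}_i = 1 &: \quad E(\text{weight}_i) = \beta_0 + \beta_1
\end{aligned}
$$

Under these assumptions, $\beta_1$ is the difference in mean weight between the treated and control groups.
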
#### Formulate a hypothesis:
$$H_0 : \beta_1 = 0$$
$$H_1 : \beta_1 \neq 0$$
$$
\begin{aligned}
H_0 : \beta_1 &= 0\\
H_1 : \beta_1 &\neq 0
\end{aligned}
$$
#### Result:
We cannot reject $H_0$ at the 5% level.
Comparing those treated to controls, the estimated mean weight difference is `r round(coef(lm.1)[2], 3)`, but this difference is not statistically significantly different from 0 (p-value > 0.05).
Variable |Estimate |Std. Error| t value |p-value
------------|---------|----------|---------|--------
(Intercept) | 5.0320 | 0.2202| 22.850 |9.55e-15
trt | -0.3710 | 0.3114| -1.191 | 0.249
```{r, echo=FALSE}
anova(lm.0, lm.1)
```
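With a single added predictor, the F test reported by `anova()` and the t test for the slope in the table above are the same test. The following is a minimal sketch of that check; it rebuilds the data under new, hypothetical names (`w_demo`, `g_demo`, `fit0_demo`, `fit1_demo`) so that it runs on its own and leaves `lm.0` and `lm.1` untouched.

```{r}
# control and treatment weights, copied from the analysis above
w_demo <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14,
            4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
g_demo <- rep(c(0, 1), each = 10)   # 0 = control, 1 = treated

fit0_demo <- lm(w_demo ~ 1)         # intercept-only model
fit1_demo <- lm(w_demo ~ g_demo)    # model with the treatment indicator

# F statistic from the nested-model comparison
f_stat <- anova(fit0_demo, fit1_demo)$F[2]
# t statistic for the slope
t_stat <- coef(summary(fit1_demo))["g_demo", "t value"]

# with one added predictor the F statistic equals the squared t statistic
all.equal(f_stat, t_stat^2)
```

Because the two statistics carry the same information here, the p-value in the ANOVA table matches the one reported for `trt`.
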
## R code chunks
Now we write some code chunks in this markdown file:
```{r }
x <- 1+1 # a simple calculator
```
```{r displaying}
set.seed(123)
rnorm(5) # boring random numbers
```
## Inline R Code and Mathematical Expressions
Inline R code is also supported, e.g. the value of `x` is `r x`.
The mean of the numbers 2,3,4 is `r mean(c(2,3,4))`.
$5 \times \pi$ = `r 5*pi`.
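
Inline values are shown with R's default number of significant digits, so it often helps to round them before they reach the text. A small sketch, using hypothetical names (`five_pi`, `rounded`, `formatted`) chosen only for illustration:

```{r}
five_pi <- 5 * pi                          # value to report
rounded <- round(five_pi, 2)               # keep two decimal places
formatted <- format(rounded, nsmall = 2)   # character string, always two decimals
formatted
```

In running text the same idea reads: $5 \times \pi \approx$ `r round(5 * pi, 2)`.
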
## Plots
We can also produce plots:
```{r, eval=FALSE, echo=FALSE}
dim(cars)
head(cars)
```
```{r, fig.width = 4, fig.height = 3}
# first attempt
plot(cars)
```
```{r, out.width = '\\maxwidth', eval=FALSE, echo=FALSE}
# second attempt
plot(cars)
```
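```{r, fig.width=4, fig.height=3, message=FALSE}
# third attempt: scatterplot with a smoothed trend line
# (qplot() is deprecated in recent ggplot2 releases, so ggplot() is used)
ggplot(cars, aes(x = speed, y = dist)) + geom_point() + geom_smooth()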
```
### Problem 6. [10 points]
Conduct a simulation to empirically demonstrate the properties of the least squares estimator (LSE) $\hat{\beta}_1$; specifically, illustrate the findings that $\hat{\beta}_1$ is unbiased and has variance $\frac{\sigma^2}{S_{xx}}$. A suggested structure for your simulation is:
1. Set parameter values of your choice for $\beta_0, \beta_1, \sigma^2_{\epsilon}$ and choose a sample size $n$.
2. Select values for $x_i$ (recall these are assumed to be fixed in Problem 3b).
3. Generate and analyze one simulated dataset:
    + Simulate errors $\epsilon_i$ from a mean-zero distribution with variance $\sigma^2_{\epsilon}$.
    + Construct observed responses $y_i$ according to the simple linear regression model.
    + Fit the simple linear regression and save the estimated slope $\hat{\beta}_1$.
4. Repeat step (3) many times to find the empirical distribution of $\hat{\beta}_1$. Report the mean and variance of the estimates across simulated datasets and provide a histogram or other visual display.
#### Solution:
For code implementing this simulation exercise, please refer to the .Rmd file that generates this report; here, we note the relevant quantities and discuss the results of the simulation. Because simulations can be computationally demanding, this code chunk uses the `cache=TRUE` option, so that the code is executed once and the saved results are reused until the chunk is changed.
```{r Problem6Simulation, echo=FALSE, cache=TRUE}
## set seed to ensure reproducibility
set.seed(12345)
## define betas, sigma^2 and the x values
beta0 = 3
beta1 = 3
sigma2 = 5
x.p6 = 1:20
nrep = 1000
## create a vector to store estimated slopes
beta1hat.p6 = rep(NA, nrep)
## use a for loop to simulate
for (i in 1:nrep) {
  ## create y's using the SLR form, adding simulated errors that are
  ## unique to this iteration.
  y = beta0 + beta1 * x.p6 + rnorm(length(x.p6), 0, sd = sqrt(sigma2))
  ## fit a linear regression model and save the estimated slope in
  ## the vector defined above.
  beta1hat.p6[i] = coef(lm(y ~ x.p6))[2]
}
```
For this simulation we used $n=20$, $\beta_0 = 3$, $\beta_1 = 3$, $x = 1, 2, \ldots, 20$, $\sigma^2 = 5$ and generated errors using a normal distribution.
From Problem 3, we expect that $E(\hat{\beta}_1) = 3$ and $Var(\hat{\beta}_1) = \frac{\sigma^2}{ \sum_{i=1}^n (x_i-\bar x)^2} = \frac{5}{665} \approx 0.00752$. From 1000 simulated datasets, we had an empirical mean of `r round(mean(beta1hat.p6), 4)` and an empirical variance of `r round(var(beta1hat.p6), 5)`. A density plot of estimated coefficients across all simulations is shown below.
```{r, echo=FALSE, fig.width=4, fig.height=4}
## plot of estimated slopes
beta1hat = as.data.frame(beta1hat.p6)
ggplot(beta1hat, aes(x = beta1hat.p6)) + geom_density(fill = "blue", alpha = .2)
```
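
The theoretical variance quoted above can be checked directly from the design points. A minimal sketch, assuming the same settings as the simulation ($x = 1, \ldots, 20$ and $\sigma^2 = 5$) and using new, hypothetical names so that the simulation objects are left untouched:

```{r}
x_chk <- 1:20                              # design points used in the simulation
sigma2_chk <- 5                            # error variance used in the simulation
sxx_chk <- sum((x_chk - mean(x_chk))^2)    # S_xx: sum of squared deviations of x
var_beta1_theory <- sigma2_chk / sxx_chk   # sigma^2 / S_xx
c(Sxx = sxx_chk, variance = round(var_beta1_theory, 5))
```

Comparing `var_beta1_theory` with `var(beta1hat.p6)` shows how close the empirical variance is to its theoretical value.
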
# Key Formatting Constructs
The key formatting constructs are discussed in the R Markdown documentation.
To force a line break within a paragraph, end the line with two or more spaces.
## Emphasis
This is *italic*. This is **bold**.
## Superscripts
This is y^2^.
## Lists
### Unordered
* Item 1
* Item 2
    + Item 2a
    + Item 2b
### Ordered
1. Item 1
2. Item 2
3. Item 3
    + Item 3a
    + Item 3b
## Block Quotes
A friend once said:
> It's always better to give than to receive.
> $H_0 : \beta_1 = 0$
$H_1 : \beta_1 \neq 0$
## Displaying Blocks of Code Without Evaluating
In some situations, you want to display R code but not evaluate it. To do so, wrap the text in a plain fenced block (three backticks with no `{r}` header). Here is an example of the formatting.
```
This text is displayed verbatim.
```
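
An alternative, sketched here under the assumption that R syntax highlighting is wanted, is to keep the `{r}` chunk header and set the knitr chunk option `eval=FALSE`; the code is then shown but not run. The variable name `not_run` is hypothetical.

```{r, eval=FALSE}
# displayed in the report, but never executed because eval=FALSE
not_run <- mean(c(2, 3, 4))
not_run
```

Because the chunk is not evaluated, `not_run` is not available to later chunks or inline code.
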
## Math
We can embed LaTeX math expressions in R Markdown:
$$f(\alpha, \beta) \propto x^{\alpha-1}(1-x)^{\beta-1}$$
## Conclusion
Markdown is a simple, easy-to-write formatting syntax for authoring HTML, PDF, and MS Word documents. R Markdown combines regular text, HTML, LaTeX, R code, and code in other languages in a single document, which makes it a useful tool.
For more details on using R Markdown, see the R Markdown documentation.
## Some LaTeX Basics
In this section, we show you some rudiments of the LaTeX typesetting language.
### Subscripts and Superscripts
To indicate a subscript, use the underscore `_` character. To indicate a superscript, use a single caret character `^`. Note that this can be confusing: in ordinary R Markdown text a superscript is delimited by a caret on each side (as in `y^2^`), whereas in LaTeX equations a single caret introduces the superscript.
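
As a short illustration in the style of the other examples in this section (the symbols are arbitrary): when a subscript or superscript has more than one character, group it with braces.

```
$$x_{ij}^{2} + y_1^{n+1}$$
```

$$x_{ij}^{2} + y_1^{n+1}$$

Without the braces, `x_ij` would subscript only the `i`.
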
### Square Roots
We indicate a square root using the `\sqrt` operator.
```
$$\sqrt{b^2 - 4ac}$$
```
$$\sqrt{b^2 - 4ac}$$
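### Aligned Equations
Use the `aligned` environment to line up several equations; `&` marks the alignment point and `\\` ends each line.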
$$
\begin{aligned}
\dot{x} & = \sigma(y-x)\\
\dot{y} & = \rho x-y -xz\\
\dot{z} & = -\beta z +xy
\end{aligned}
$$
### Fractions
Displayed fractions are typeset using the `\frac` operator.
```
$$\frac{4z^3}{16}$$
```
$$\frac{4z^3}{16}$$
### Summation Expressions
Sums are typeset with the `\sum` operator; the limits are given as a subscript and a superscript. Here is an example.
```
$$\sum_{i=1}^{n} X^3_i$$
```
$$\sum_{i=1}^{n} X^3_i$$
### Parentheses
In LaTeX, you can create parentheses, brackets, and braces that size themselves automatically to contain large expressions. You do this using the `\left` and `\right` operators. Here is an example.
```
$$\sum_{i=1}^{n} \left( \frac{X_i}{Y_i} \right)$$
```
$$\sum_{i=1}^{n} \left( \frac{X_i}{Y_i} \right)$$
### Greek Letters
Many statistical expressions use Greek letters. Much of the Greek alphabet is implemented in LaTeX.
```
$$\alpha, \beta, \gamma, \Gamma$$
```
$$\alpha, \beta, \gamma, \Gamma$$
### Special Symbols
All common mathematical symbols are implemented, and you can find a listing on the LaTeX cheat sheet.
```
$$a \pm b$$
$$x \ge 15$$
```
$$a \pm b$$
$$x \ge 15$$
### Special Functions
LaTeX typesets special functions in a different font from mathematical variables. These functions, such as $\sin$, $\cos$, etc. are indicated in LaTeX with a backslash. Here is an example that also illustrates how to typeset an integral.
```
$$\int_0^{2\pi} \sin x~dx$$
```
$$\int_0^{2\pi} \sin x~dx$$
### Matrices
Matrices are presented in the `array` environment. One begins with the statement `\begin{array}` and ends with the statement `\end{array}`. Following the opening statement, a format code is used to indicate the formatting of each column. In the example below, we use the code `{rrr}` to indicate that each column is right justified. Each row is then entered, with cells separated by the `&` symbol, and each line (except the last) terminated by `\\`.
```
$$\begin{array}
{rrr}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{array}
$$
```
$$\begin{array}
{rrr}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{array}
$$
In math textbooks, matrices are often surrounded by brackets and assigned to a boldface letter. Here is an example.
```
$$\mathbf{X} = \left[\begin{array}
{rrr}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{array}\right]
$$
```
$$\mathbf{X} = \left[\begin{array}
{rrr}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{array}\right]
$$
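
When a matrix already exists in R, the `array` markup can be generated rather than typed. The helper below is a hypothetical sketch (the name `matrix_to_latex` is not part of any package) that assumes a numeric matrix and right-justified columns:

```{r}
# hypothetical helper: build LaTeX array markup from an R matrix
matrix_to_latex <- function(m) {
  # join the cells of each row with " & "
  rows <- apply(m, 1, paste, collapse = " & ")
  # separate rows with a LaTeX line break; the last row gets none
  body <- paste(rows, collapse = " \\\\\n")
  # one right-justified column specifier per column
  spec <- paste(rep("r", ncol(m)), collapse = "")
  paste0("\\begin{array}{", spec, "}\n", body, "\n\\end{array}")
}

m_demo <- matrix(1:9, nrow = 3, byrow = TRUE)
latex_demo <- matrix_to_latex(m_demo)
cat(latex_demo)
```

In a chunk with the option `results='asis'`, wrapping the `cat()` output in `$$` would typeset the matrix instead of printing the markup.
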
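knitr can also run code written in other languages through the `engine` chunk option. Here is a Python chunk:

```{r test-python, engine='python'}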
x = 'hello, python world!'
print(x)
print(x.split(' '))
```