Module #5 - Random Variables & Probability Distributions

This week's topics covered discrete and continuous random variables and their probability distributions, including discrete, binomial, continuous, and normal distributions.

### Question # 1 : Consider a population consisting of the following values, which represents the number of ice cream purchases during the academic year for each of the five housemates: 8, 14, 16, 10, 11

1. #### Compute the mean of this population.

```
> x <- c(8, 14, 16, 10, 11)
> mean(x)
[1] 11.8
```
1. #### Select a random sample of size 2 out of the five members. See the example used in my PowerPoint presentation, slide #13:

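Possible Samples

| Sample | Calculation | Mean |
|--------|-------------|------|
| 8, 14  | 22 / 2      | 11   |
| 8, 16  | 24 / 2      | 12   |
| 8, 10  | 18 / 2      | 9    |
| 8, 11  | 19 / 2      | 9.5  |
| 14, 16 | 30 / 2      | 15   |
| 14, 10 | 24 / 2      | 12   |
| 14, 11 | 25 / 2      | 12.5 |
| 16, 10 | 26 / 2      | 13   |
| 16, 11 | 27 / 2      | 13.5 |
| 10, 11 | 21 / 2      | 10.5 |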
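
The ten possible samples listed above can also be enumerated in R rather than by hand. This is a minimal sketch using base R's `combn()`; the variable names `samples` and `sample_means` are my own and are not part of the assignment.

```r
x <- c(8, 14, 16, 10, 11)

# Each column of the matrix is one possible sample of size 2.
samples <- combn(x, 2)
ncol(samples)              # 10 possible samples

# Mean of each sample, in the same order as the listing above.
sample_means <- colMeans(samples)
sample_means

# The average of all ten sample means equals the population mean.
mean(sample_means)         # 11.8
mean(x)                    # 11.8
```

The last two lines illustrate why the sample mean is an unbiased estimate of the population mean: averaged over every possible sample, it lands exactly on μ.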

```
> y <- sample(x,2)
> y
[1] 16  8
```

1. #### Compute the mean and standard deviation of your sample.

```
> y <- sample(x,2)
> y
[1] 16  8
> mean(y)
[1] 12
> sd(y)
[1] 5.656854
```

1. #### Compare the mean and standard deviation of your sample to the entire population of this set (8, 14, 16, 10, 11).

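```
x̄ = 12
s = 5.656854

μ = 11.8
σ = 2.856571
```

The sample mean (12) is close to the population mean (11.8), but the sample standard deviation (5.656854) is roughly twice the population standard deviation (2.856571), because this sample happened to contain the two most extreme values, 8 and 16. Note that `sd(x)` returns 3.193744; R's `sd()` divides by n − 1, so that figure is the sample standard deviation of the five values, not the population standard deviation σ.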
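
Since the five housemates are the entire population, σ should use a denominator of n, whereas R's `sd()` always uses n − 1. The sketch below shows one way to get the population value in base R; `pop_sd` is a name I made up for illustration.

```r
x <- c(8, 14, 16, 10, 11)

# sd() divides by n - 1, which is the sample standard deviation.
sd(x)                                # 3.193744

# Population standard deviation: divide the squared deviations by n.
pop_sd <- sqrt(mean((x - mean(x))^2))
pop_sd                               # 2.856571

# Equivalent conversion from the n - 1 version.
n <- length(x)
sd(x) * sqrt((n - 1) / n)
```

Both approaches give the same number; the second is handy when `sd()` has already been computed.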

### Question # 2 : Suppose that the sample size n = 100 and the population proportion p = 0.95.

1. #### Does the sample proportion p̂ have approximately a normal distribution? Explain.

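This can be checked by simulation (draw many samples of size 100, plot a histogram of the resulting sample proportions, and compare it with a bell curve), but the usual rule of thumb answers it directly: the sampling distribution of p̂ is approximately normal when both np ≥ 10 and n(1 − p) ≥ 10. Some textbooks use 5 or 15 as the cutoff instead of 10.
• np = 100 × 0.95 = 95, which meets the condition
• n(1 − p) = 100 × 0.05 = 5, which does not
Because n(1 − p) is below 10, the sampling distribution of p̂ is not approximately normal for n = 100. With p this close to 1, the distribution is skewed to the left, since p̂ cannot exceed 1. The fact that p̂ takes only discrete values is not the obstacle: a discrete count is well approximated by a normal curve once n is large enough.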
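
The sampling distribution of the sample proportion for n = 100 and p = 0.95 can be explored in R with a short simulation. This is only a sketch under stated assumptions: the cutoff of 10 is a common textbook convention rather than something given in the question, and the seed, the number of simulated samples (10,000), and the variable names are arbitrary choices of mine.

```r
# Assumed setup: n and p come from the question; the cutoff of 10 is a
# textbook convention, not a value given in the assignment.
n <- 100
p <- 0.95

# Rule-of-thumb check: both expected counts should reach the cutoff.
expected_successes <- n * p        # 95
expected_failures  <- n * (1 - p)  # 5
c(successes = expected_successes, failures = expected_failures)

# Simulate 10,000 sample proportions, each from a sample of size n.
set.seed(42)                       # arbitrary seed, for reproducibility
p_hat <- rbinom(10000, size = n, prob = p) / n

# Compare the simulated distribution with the matching normal curve.
hist(p_hat, breaks = 20, freq = FALSE,
     main = "Simulated sampling distribution of p-hat (n = 100, p = 0.95)")
curve(dnorm(x, mean = p, sd = sqrt(p * (1 - p) / n)), add = TRUE)
```

The histogram should pile up near 1 with a longer tail on the left, which is the skew that the rule of thumb is meant to flag.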

1. #### What is the smallest value of n for which the sampling distribution of p̂ is approximately normal?

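Using the common rule of thumb that requires both np ≥ 10 and n(1 − p) ≥ 10 (the cutoff varies by textbook), the binding condition is n(1 − p) ≥ 10, because 1 − p = 0.05 is the smaller of the two proportions. Solving n × 0.05 ≥ 10 gives n ≥ 200, so the smallest sample size is n = 200. Discreteness does not rule out a normal approximation; for any p strictly between 0 and 1, a large enough n satisfies both conditions.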
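
The smallest n can also be found in R by a direct search. The sketch below assumes the rule of thumb that both np and n(1 − p) must reach a cutoff, with 10 as the default; `smallest_n` is a hypothetical helper I wrote for illustration, not a built-in function, and the small tolerance only guards against floating-point rounding.

```r
# Smallest n for which both n*p and n*(1 - p) reach the cutoff.
# The cutoff is an assumption: textbooks variously use 5, 10, or 15.
smallest_n <- function(p, cutoff = 10) {
  n <- 1
  # Tolerance avoids overshooting by one because of rounding in 1 - p.
  while (min(n * p, n * (1 - p)) < cutoff - 1e-9) {
    n <- n + 1
  }
  n
}

smallest_n(0.95)              # 200 with the cutoff of 10
smallest_n(0.95, cutoff = 5)  # 100 with the looser cutoff of 5
```

The second call shows how much the answer depends on the cutoff: under the looser convention, n = 100 would already just qualify.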