# Biostatistics Study Notes - Sampling Distribution

## Central Limit Theorem

• When sampling from a non-normally distributed population with mean $\mu_X$ and variance $\sigma^2_X$, the distribution of the sample mean $\bar{X}$ (the sampling distribution) has the following attributes:
• The distribution of $\bar{X}$ is approximately normal with mean $\mu_X$ and variance $\sigma^2_X / n$, and the approximation improves as the sample size $n$ increases.
• The sample size required for the sampling distribution of $\bar{X}$ to approach normality depends on the shape of the original distribution.
• As a rule of thumb, samples of 30 or more give a very good normal approximation for the sampling distribution of $\bar{X}$ in nearly all situations; strongly skewed or heavy-tailed populations may need more.
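The behaviour described above can be checked with a small simulation. This is only a sketch: the Exponential(1) population (mean 1, standard deviation 1), the number of trials, and the helper names `sample_means` and `skewness` are assumptions chosen for illustration, not part of the notes.

```python
import random
import statistics

def sample_means(n, trials, rng):
    """Return the means of `trials` samples of size n drawn from Exponential(1)."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(trials)]

def skewness(values):
    """Moment-based sample skewness; near 0 for a symmetric, normal-like shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (2, 30, 200):
    means = sample_means(n, 5000, rng)
    # Theory for this population: mean of X-bar is 1, its sd is 1 / sqrt(n)
    print(f"n={n:3d}  mean={statistics.fmean(means):.3f}  "
          f"sd={statistics.stdev(means):.3f}  sigma/sqrt(n)={1 / n ** 0.5:.3f}  "
          f"skewness={skewness(means):.2f}")
```

The skewness of the sample means should shrink toward 0 as $n$ grows, which is one rough way to see the approach to normality for a skewed population.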
## Confidence Intervals for the Population Mean

• When $\sigma$ is known and $\bar{X}$ is (approximately) normally distributed, $\frac{\bar{X}-\mu}{\frac{\sigma}{\sqrt{n}}}$ follows the standard normal ($Z$) distribution
• Ex: $P\left(-1.960 \leq \frac{\bar{X}-\mu}{\frac{\sigma}{\sqrt{n}}} \leq 1.960\right)=0.95$ $\Rightarrow$ $C\left(\bar{X}-1.960 \frac{\sigma}{\sqrt{n}} \leq \mu \leq \bar{X}+1.960 \frac{\sigma}{\sqrt{n}}\right)=0.95$
• A confidence interval requires:
• a point estimate, like the sample mean $\bar{X}$
• a measure of variability, like standard error of the mean $\frac{\sigma}{\sqrt{n}}$
• a desired level of confidence $1-\alpha$ , in this case $1-\alpha$ = 0.95, so $\alpha$ = 0.05
• and the sampling distribution of the point estimate
• The endpoints of the confidence interval are: $\text{point estimate} \pm (\text{confidence factor}) \times (\text{standard error})$
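The recipe "point estimate ± (confidence factor)(standard error)" can be sketched in a few lines for the known-$\sigma$ case. The function name `z_confidence_interval` and the numbers in the example ($\bar{X} = 50$, $\sigma = 10$, $n = 25$) are hypothetical, chosen only to illustrate the arithmetic.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for mu when the population sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # confidence factor, about 1.960 for 95%
    se = sigma / sqrt(n)                     # standard error of the mean
    return xbar - z * se, xbar + z * se

# Hypothetical example values, not taken from the notes
low, high = z_confidence_interval(xbar=50.0, sigma=10.0, n=25)
print(f"95% CI for mu: ({low:.2f}, {high:.2f})")  # (46.08, 53.92)
```

Here the standard error is $10/\sqrt{25} = 2$, so the interval is $50 \pm 1.960 \times 2$.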
## Confidence Interval for Population Mean $\mu$

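• If the population variance is known: use the $Z$ distribution to derive the confidence interval, with standard error $\frac{\sigma}{\sqrt{n}}$
• If the population variance is unknown: replace the population standard deviation $\sigma$ with the sample standard deviation $s$, so the standard error becomes $\frac{s}{\sqrt{n}}$, and use the $t$ distribution with $n-1$ degrees of freedom to derive the confidence interval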
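A minimal sketch of the unknown-variance case follows. The Python standard library has no $t$ quantile function, so the critical value 2.262 (95% confidence, 9 degrees of freedom) is copied from a standard $t$ table; the data values and the function name `t_confidence_interval` are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# t critical value for 95% confidence and 9 degrees of freedom,
# taken from a standard t table (assumption: n = 10 observations)
T_975_DF9 = 2.262

def t_confidence_interval(data, t_crit):
    """Confidence interval for mu using the sample standard deviation s."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)     # sample standard deviation (n - 1 in the denominator)
    se = s / sqrt(n)    # estimated standard error of the mean
    return xbar - t_crit * se, xbar + t_crit * se

# Hypothetical measurements, not from the notes
data = [4.9, 5.1, 5.3, 4.8, 5.0, 5.2, 5.4, 4.7, 5.0, 5.1]
low, high = t_confidence_interval(data, T_975_DF9)
print(f"95% CI for mu: ({low:.3f}, {high:.3f})")
```

Because 2.262 is larger than 1.960, the $t$ interval is wider than the $Z$ interval would be for the same standard error, reflecting the extra uncertainty from estimating $\sigma$ with $s$.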
## Confidence Interval for Population Variance $\sigma^2$

• For a sample from a normal population, $\frac{(n-1) s^2}{\sigma^2}$ follows the $\chi^2$ distribution with $n-1$ degrees of freedom
• $$P\left(\chi_{\frac{\alpha}{2}}^2 \leq \chi^2 \leq \chi_{1-\frac{\alpha}{2}}^2\right)=P\left[\chi_{\frac{\alpha}{2}}^2 \leq \frac{(n-1) s^2}{\sigma^2} \leq \chi_{1-\frac{\alpha}{2}}^2\right]=1-\alpha$$
• $$P\left[\frac{1}{\chi_{\frac{\alpha}{2}}^2} \geq \frac{\sigma^2}{(n-1) s^2} \geq \frac{1}{\chi_{1-\frac{\alpha}{2}}^2}\right]=1-\alpha$$
• $$C\left(\frac{(n-1) s^2}{\chi_{1-\frac{\alpha}{2}}^2} \leq \sigma^2 \leq \frac{(n-1) s^2}{\chi_{\frac{\alpha}{2}}^2}\right)=1-\alpha$$
• $$C\left(\sqrt{\frac{(n-1) s^2}{\chi_{1-\frac{\alpha}{2}}^2}} \leq \sigma \leq \sqrt{\frac{(n-1) s^2}{\chi_{\frac{\alpha}{2}}^2}}\right)=1-\alpha$$
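The last two intervals can be evaluated numerically as below. This is a sketch under stated assumptions: the $\chi^2$ critical values 2.700 and 19.023 (lower and upper 2.5% points for 9 degrees of freedom) are copied from a standard table because the Python standard library has no $\chi^2$ quantile function, and $n = 10$, $s^2 = 4.0$ are hypothetical inputs.

```python
from math import sqrt

# Chi-square critical values for 9 degrees of freedom, from a standard table
CHI2_025_DF9 = 2.700    # chi^2_{alpha/2} with alpha = 0.05
CHI2_975_DF9 = 19.023   # chi^2_{1 - alpha/2} with alpha = 0.05

def variance_confidence_interval(n, s2, chi2_lower, chi2_upper):
    """Confidence interval for sigma^2; assumes a normal population."""
    # The larger critical value gives the lower limit, and vice versa
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower

# Hypothetical sample summary: n = 10 observations with s^2 = 4.0
var_low, var_high = variance_confidence_interval(10, 4.0, CHI2_025_DF9, CHI2_975_DF9)
sd_low, sd_high = sqrt(var_low), sqrt(var_high)  # interval for sigma
print(f"95% CI for sigma^2: ({var_low:.3f}, {var_high:.3f})")
print(f"95% CI for sigma:   ({sd_low:.3f}, {sd_high:.3f})")
```

Note that the interval is not symmetric around $s^2$, because the $\chi^2$ distribution is skewed.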
## Confidence Interval for Population Proportion

• Consider the population proportion in a binomial distribution
• $$\hat{p}=\frac{X}{n}$$
• $$\sigma_{\hat{p}}^2=\frac{\sigma^2}{n}=\frac{p(1-p)}{n}$$
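• Because $p$ is unknown, the standard error is estimated with $\hat{p}$ in practice, giving the limits
• $$L_1=\hat{p}-z_{1-\frac{\alpha}{2}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
• $$L_2=\hat{p}+z_{1-\frac{\alpha}{2}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$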
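A short sketch of the limits $L_1$ and $L_2$ follows. It plugs $\hat{p}$ in for the unknown $p$ in the standard error (the usual normal-approximation interval, an assumption of this sketch), and the counts $X = 40$, $n = 100$ and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(x, n, confidence=0.95):
    """Normal-approximation confidence interval for a binomial proportion."""
    p_hat = x / n                                       # point estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{1 - alpha/2}
    se = sqrt(p_hat * (1 - p_hat) / n)                  # estimated standard error
    return p_hat - z * se, p_hat + z * se

# Hypothetical data: 40 successes in 100 trials
l1, l2 = proportion_confidence_interval(40, 100)
print(f"95% CI for p: ({l1:.3f}, {l2:.3f})")  # (0.304, 0.496)
```

The normal approximation is generally considered reasonable only when both $n\hat{p}$ and $n(1-\hat{p})$ are not too small.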
### Derive the necessary sample size

• Define the margin of error for a $(1-\alpha)100\%$ confidence interval for a population proportion to be
• $$m=z_{1-\frac{\alpha}{2}} \mathrm{SE}_{\hat{p}}=z_{1-\frac{\alpha}{2}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
• Using $\hat{p} = 0.5$ is the conservative approach: it maximizes $\hat{p}(1-\hat{p})$, so it can only overestimate the required $n$.
• $$m \leq z_{1-\frac{\alpha}{2}} \sqrt{\frac{0.5(1-0.5)}{n}}$$
• $$m^2 \leq z_{1-\frac{\alpha}{2}}^2\left(\frac{0.25}{n}\right)$$
• $$n=\left(\frac{z_{1-\frac{\alpha}{2}}}{2 m}\right)^2$$
• Ex: For a 95% confidence interval with a 2% margin of error, $\alpha = 0.05$ and $m = 0.02$:
• $$n=\left(\frac{z_{1-\frac{\alpha}{2}}}{2 m}\right)^2=\left(\frac{1.960}{2(0.02)}\right)^2=2401$$
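The sample-size formula is easy to wrap in a helper. The sketch below generalizes it slightly by letting the planning value $p$ be something other than 0.5 and by rounding up to a whole number of subjects; the function name `required_sample_size` and the `p` parameter are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(margin, confidence=0.95, p=0.5):
    """Smallest n whose margin of error is at most `margin`.

    With the default p = 0.5 this is n = (z / (2 * margin))**2, rounded up.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{1 - alpha/2}
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)

print(required_sample_size(0.02))  # 2401, matching the worked example
print(required_sample_size(0.05))  # 385
```

With the unrounded $z \approx 1.95996$ the formula gives about 2400.9, which rounds up to the same 2401 obtained from the table value 1.960.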