Biostatistics Study Notes - Sample Distribution
Key points on sample distributions, excerpted from my ongoing study of biostatistics.
Central Limit Theorem
- When sampling from a non-normally distributed population with mean $\mu_X$ and variance $\sigma^2_X$,
- the distribution of the sample mean $\bar{X}$ (the sampling distribution) has the following attributes:
- The distribution of $\bar{X}$ is approximately normal,
- and the approximation improves as the sample size increases.
- The sample size required for the sampling distribution of $\bar{X}$ to approach normality depends on the shape of the original distribution.
- Samples of 30 or more give a very good normal approximation for the sampling distribution of $\bar{X}$ in nearly all situations.
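The behaviour described above can be checked with a small simulation. This is only a sketch under stated assumptions: the population is taken to be Exponential(1) (mean 1, variance 1), which is clearly non-normal, and the function name `simulate_sample_means`, the sample size 30, and the 5000 replications are illustrative choices, not part of the notes.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size `n` from Exponential(1); return their means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

# Assumed settings: n = 30 observations per sample, 5000 repeated samples.
means = simulate_sample_means(n=30, reps=5000)

print("mean of sample means:", statistics.fmean(means))       # close to mu = 1
print("sd of sample means:  ", statistics.stdev(means))       # close to sigma / sqrt(n)
print("sigma / sqrt(n):     ", 1 / 30 ** 0.5)
```

Plotting a histogram of `means` for several values of `n` would show the skew of the exponential population fading as `n` grows.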
Confidence Intervals for the Population Mean
- $\frac{\bar{X}-\mu}{\frac{\sigma}{\sqrt{n}}}$ follows the standard normal ($Z$) distribution
- Ex: $P\left(-1.960 \leq \frac{\bar{X}-\mu}{\frac{\sigma}{\sqrt{n}}} \leq 1.960\right)=0.95$ $\Rightarrow$ $C\left(\bar{X}-1.960 \frac{\sigma}{\sqrt{n}} \leq \mu \leq \bar{X}+1.960 \frac{\sigma}{\sqrt{n}}\right)=0.95$
- A confidence interval requires:
- a point estimate, such as the sample mean $\bar{X}$
- a measure of variability, such as the standard error of the mean $\frac{\sigma}{\sqrt{n}}$
- a desired level of confidence $1-\alpha$; in this case $1-\alpha = 0.95$, so $\alpha = 0.05$
- and the sampling distribution of the point estimate
- The endpoints of the confidence interval are: $\text{point estimate} \pm (\text{confidence factor})(\text{standard error})$
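The recipe "point estimate ± (confidence factor)(standard error)" can be sketched in a few lines for the known-$\sigma$ case. The function name `z_confidence_interval` and the example numbers ($\bar{X} = 100$, $\sigma = 15$, $n = 36$) are made up for illustration; only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Return (lower, upper) for mu when the population sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # confidence factor, about 1.960 for 95%
    margin = z * sigma / sqrt(n)             # confidence factor times standard error
    return xbar - margin, xbar + margin

# Hypothetical data: sample mean 100, known sigma 15, sample size 36.
lower, upper = z_confidence_interval(xbar=100.0, sigma=15.0, n=36)
print(f"95% CI for mu: ({lower:.2f}, {upper:.2f})")
```

With these inputs the standard error is $15/\sqrt{36} = 2.5$, so the interval is roughly $100 \pm 4.90$.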
Confidence Interval for Population Mean $\mu$
- If the population variance is known: use the $Z$ distribution to derive the confidence interval.
- If the population variance is unknown: replace the population standard deviation $\sigma$ with the sample standard deviation $s$, and use the $t$ distribution with $n-1$ degrees of freedom to derive the confidence interval.
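As a worked illustration of the unknown-variance case, with made-up numbers that are not from the notes: suppose $n = 10$, $\bar{X} = 50$, and the sample standard deviation is $s = 4$. With $n - 1 = 9$ degrees of freedom, a standard $t$ table gives $t_{9,\,0.975} \approx 2.262$, so

$$\bar{X} \pm t_{9,\,0.975} \frac{s}{\sqrt{n}} = 50 \pm 2.262 \cdot \frac{4}{\sqrt{10}} \approx 50 \pm 2.86$$

which gives an interval of about $(47.14,\ 52.86)$. Using $z = 1.960$ instead would give $50 \pm 2.48$; the $t$ interval is wider because $s$ is itself an estimate.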
Confidence Interval for Population Variance $\sigma^2$
- For a sample from a normal population, $\frac{(n-1) s^2}{\sigma^2}$ follows the $\chi^2$ distribution with $n-1$ degrees of freedom
- $$P\left(\chi_{\frac{\alpha}{2}}^2 \leq \chi^2 \leq \chi_{1-\frac{\alpha}{2}}^2\right)=P\left[\chi_{\frac{\alpha}{2}}^2 \leq \frac{(n-1) s^2}{\sigma^2} \leq \chi_{1-\frac{\alpha}{2}}^2\right]=1-\alpha$$
- $$P\left[\frac{1}{\chi_{\frac{\alpha}{2}}^2} \geq \frac{\sigma^2}{(n-1) s^2} \geq \frac{1}{\chi_{1-\frac{\alpha}{2}}^2}\right]=1-\alpha$$
- $$C\left(\frac{(n-1) s^2}{\chi_{1-\frac{\alpha}{2}}^2} \leq \sigma^2 \leq \frac{(n-1) s^2}{\chi_{\frac{\alpha}{2}}^2}\right)=1-\alpha$$
- $$C\left(\sqrt{\frac{(n-1) s^2}{\chi_{1-\frac{\alpha}{2}}^2}} \leq \sigma \leq \sqrt{\frac{(n-1) s^2}{\chi_{\frac{\alpha}{2}}^2}}\right)=1-\alpha$$
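A worked illustration of the derivation above, using made-up numbers that are not from the notes: suppose $n = 10$ and $s^2 = 16$, so $(n-1)s^2 = 144$. For a 95% interval with 9 degrees of freedom, a standard $\chi^2$ table gives $\chi_{0.025}^2 \approx 2.700$ and $\chi_{0.975}^2 \approx 19.023$ (subscripts denote left-tail area, matching the notation above). Then

$$\frac{144}{19.023} \leq \sigma^2 \leq \frac{144}{2.700} \quad\Longrightarrow\quad 7.57 \leq \sigma^2 \leq 53.33$$

$$\sqrt{7.57} \leq \sigma \leq \sqrt{53.33} \quad\Longrightarrow\quad 2.75 \leq \sigma \leq 7.30$$

Note that the interval is not symmetric around $s^2 = 16$, because the $\chi^2$ distribution is skewed.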
Confidence Interval for Population Proportion
- Consider the population proportion in a binomial distribution
- $$\hat{p}=\frac{X}{n}$$
- $$\sigma_{\hat{p}}^2=\frac{\sigma^2}{n}=\frac{p(1-p)}{n}$$
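- Because $p$ is unknown in practice, the standard error is estimated by substituting $\hat{p}$ for $p$:
- $$L_1=\hat{p}-z_{1-\frac{\alpha}{2}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
- $$L_2=\hat{p}+z_{1-\frac{\alpha}{2}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$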
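The limits $L_1$ and $L_2$ can be computed as in the sketch below. It assumes the usual normal-approximation interval in which $\hat{p}$ is plugged in for the unknown $p$ inside the standard error; the function name `proportion_confidence_interval` and the example counts (40 successes in 100 trials) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(x, n, confidence=0.95):
    """Return (L1, L2) for a binomial proportion using the normal approximation."""
    p_hat = x / n                                        # point estimate X / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{1 - alpha/2}
    se = sqrt(p_hat * (1 - p_hat) / n)                   # estimated standard error of p_hat
    return p_hat - z * se, p_hat + z * se

# Hypothetical data: 40 successes out of 100 trials.
l1, l2 = proportion_confidence_interval(x=40, n=100)
print(f"95% CI for p: ({l1:.3f}, {l2:.3f})")
```

The approximation is usually considered reasonable only when both $n\hat{p}$ and $n(1-\hat{p})$ are not too small.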
Derive the necessary sample size
- Define the margin of error for a $(1-\alpha) 100\%$ confidence interval for a population proportion to be
- $$ m=z_{1-\frac{\alpha}{2}} \mathrm{SE}_{\hat{p}}=z_{1-\frac{\alpha}{2}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} $$
- Using $\hat{p} = 0.5$ is the conservative approach, as it will produce an overestimate of $n$.
- $$ m \leq z_{1-\frac{\alpha}{2}} \sqrt{\frac{0.5(1-0.5)}{n}} $$
- $$ m^2 \leq z_{1-\frac{\alpha}{2}}^2\left(\frac{0.25}{n}\right) $$
- $$ n=\left(\frac{z_{1-\frac{\alpha}{2}}}{2 m}\right)^2 $$
- Ex: For a 95% confidence interval with a 2% margin of error, $\alpha = 0.05$ and $m = 0.02$.
- $$ n=\left(\frac{z_{1-\frac{\alpha}{2}}}{2 m}\right)^2=\left(\frac{1.960}{2(0.02)}\right)^2=2401 $$
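The sample-size formula can be wrapped in a small helper. This is a sketch, not a definitive implementation: the function name `required_sample_size` is hypothetical, the default `p=0.5` is the conservative choice described above, and the result is rounded up because $n$ must be a whole number.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(margin, confidence=0.95, p=0.5):
    """Smallest n whose margin of error is at most `margin` for proportion p."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{1 - alpha/2}
    # With p = 0.5 this equals (z / (2 * margin)) ** 2, as in the formula above.
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)

print(required_sample_size(0.02))  # 95% confidence, 2% margin of error
print(required_sample_size(0.05))  # 95% confidence, 5% margin of error
```

For the 2% example this reproduces $n = 2401$; passing a planning value such as `p=0.2` (if one is available from earlier data) yields a smaller required sample.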