Biostatistics Study Notes - Sampling Distributions

Key points on sampling distributions, excerpted from the biostatistics material I am currently studying.

  • Central Limit Theorem

    • When sampling from a non-normally distributed population with mean $\mu_X$ and variance $\sigma^2_X$,
    • the distribution of the sample mean $\bar{X}$ (the sampling distribution) has the following attributes:
      • Its mean is $\mu_{\bar{X}}=\mu_X$ and its variance is $\sigma^2_{\bar{X}}=\frac{\sigma^2_X}{n}$.
      • The distribution of $\bar{X}$ is approximately normal,
        • with the approximation improving as the sample size $n$ increases.
      • The sample size required for the sampling distribution of $\bar{X}$ to approach normality depends on the shape of the original distribution.
      • As a rule of thumb, samples of 30 or more give a good normal approximation for the sampling distribution of $\bar{X}$ in most situations.
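The behaviour described by the Central Limit Theorem can be checked with a small simulation. The sketch below is only an illustration under assumed settings: the population is taken to be Exponential(1) (so $\mu_X=1$ and $\sigma^2_X=1$), and the function name `simulate_sample_means`, the seed, and the replicate count are all hypothetical choices, not part of the original notes.

```python
import random
import statistics

def simulate_sample_means(n, replicates, seed=0):
    """Draw `replicates` samples of size n from Exponential(1); return their means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = []
    for _ in range(replicates):
        sample = [rng.expovariate(1.0) for _ in range(n)]  # skewed, non-normal population
        means.append(statistics.fmean(sample))
    return means

means = simulate_sample_means(n=30, replicates=5000)
print(statistics.fmean(means))     # expected to be close to mu_X = 1
print(statistics.variance(means))  # expected to be close to sigma_X^2 / n = 1/30
```

Plotting a histogram of `means` for several values of `n` would show the shape moving toward a normal curve as `n` grows.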
  • Confidence Intervals for the Population Mean

    • $\frac{\bar{X}-\mu}{\frac{\sigma}{\sqrt{n}}}$ follows the standard normal ($Z$) distribution
    • Ex: $P\left(-1.960 \leq \frac{\bar{X}-\mu}{\frac{\sigma}{\sqrt{n}}} \leq 1.960\right)=0.95$ $\Rightarrow$ $C\left(\bar{X}-1.960 \frac{\sigma}{\sqrt{n}} \leq \mu \leq \bar{X}+1.960 \frac{\sigma}{\sqrt{n}}\right)=0.95$
    • A confidence interval requires:
      • a point estimate, such as the sample mean $\bar{X}$
      • a measure of variability, such as the standard error of the mean $\frac{\sigma}{\sqrt{n}}$
      • a desired level of confidence $1-\alpha$; in this case $1-\alpha=0.95$, so $\alpha=0.05$
      • the sampling distribution of the point estimate
    • The endpoints of the confidence interval: $\text{point estimate} \pm (\text{confidence factor})(\text{standard error})$
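The "point estimate ± (confidence factor)(standard error)" recipe can be sketched in a few lines of Python for the known-variance case. The function name `z_confidence_interval` and the example numbers (sample mean 100, $\sigma=15$, $n=36$) are hypothetical, chosen only to illustrate the arithmetic.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Return (lower, upper) for mu when the population sigma is known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # confidence factor, about 1.960 for 95%
    standard_error = sigma / sqrt(n)             # sigma / sqrt(n)
    margin = z * standard_error
    return xbar - margin, xbar + margin

# Hypothetical inputs: sample mean 100, known sigma 15, sample size 36.
lower, upper = z_confidence_interval(xbar=100.0, sigma=15.0, n=36)
print(round(lower, 2), round(upper, 2))
```

With these inputs the standard error is 2.5, so the interval runs from roughly 95.1 to 104.9.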
  • Confidence Interval for Population Mean $\mu$

    • If the population variance is known: use the $Z$ distribution to derive the confidence interval
    • If the population variance is unknown: replace the population standard deviation $\sigma$ with the sample standard deviation $s$, and use the $t$ distribution with $n-1$ degrees of freedom to derive the confidence interval
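For the unknown-variance case, a minimal sketch of the $t$-based interval follows. Python's standard library has no $t$ quantile function, so this sketch assumes the critical value is read from a $t$ table and passed in; the function name `t_confidence_interval` and the ten measurements are hypothetical.

```python
from math import sqrt
from statistics import fmean, stdev

def t_confidence_interval(sample, t_critical):
    """Return (lower, upper) for mu using the sample standard deviation s.

    t_critical is t_{1-alpha/2} with n-1 degrees of freedom, taken from a t table.
    """
    n = len(sample)
    xbar = fmean(sample)
    s = stdev(sample)                  # sample standard deviation (divides by n-1)
    margin = t_critical * s / sqrt(n)  # confidence factor times estimated standard error
    return xbar - margin, xbar + margin

# Hypothetical measurements (n = 10); t_{0.975} with 9 degrees of freedom is about 2.262.
data = [4.9, 5.1, 5.0, 5.3, 4.8, 5.2, 5.0, 4.7, 5.1, 4.9]
lower, upper = t_confidence_interval(data, t_critical=2.262)
print(round(lower, 4), round(upper, 4))
```

Because 2.262 is larger than 1.960, the $t$ interval is wider than the $Z$ interval would be for the same data, reflecting the extra uncertainty from estimating $\sigma$.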
  • Confidence Interval for Population Variance $\sigma^2$

    • $\frac{(n-1) s^2}{\sigma^2}$ follows the $\chi^2$ distribution with $n-1$ degrees of freedom (assuming a normally distributed population)
    • $$P\left(\chi_{\frac{\alpha}{2}}^2 \leq \chi^2 \leq \chi_{1-\frac{\alpha}{2}}^2\right)=P\left[\chi_{\frac{\alpha}{2}}^2 \leq \frac{(n-1) s^2}{\sigma^2} \leq \chi_{1-\frac{\alpha}{2}}^2\right]=1-\alpha$$
    • $$P\left[\frac{1}{\chi_{\frac{\alpha}{2}}^2} \geq \frac{\sigma^2}{(n-1) s^2} \geq \frac{1}{\chi_{1-\frac{\alpha}{2}}^2}\right]=1-\alpha$$
    • $$C\left(\frac{(n-1) s^2}{\chi_{1-\frac{\alpha}{2}}^2} \leq \sigma^2 \leq \frac{(n-1) s^2}{\chi_{\frac{\alpha}{2}}^2}\right)=1-\alpha$$
    • $$C\left(\sqrt{\frac{(n-1) s^2}{\chi_{1-\frac{\alpha}{2}}^2}} \leq \sigma \leq \sqrt{\frac{(n-1) s^2}{\chi_{\frac{\alpha}{2}}^2}}\right)=1-\alpha$$
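The last two intervals translate directly into code. In this sketch the two $\chi^2$ critical values are assumed to come from a $\chi^2$ table (the standard library has no $\chi^2$ quantile function); the function name `variance_confidence_interval` and the inputs $n=10$, $s^2=4.0$ are hypothetical.

```python
from math import sqrt

def variance_confidence_interval(s2, n, chi2_lower, chi2_upper):
    """Return (lower, upper) for sigma^2.

    chi2_lower is the alpha/2 quantile and chi2_upper the 1-alpha/2 quantile
    of the chi-square distribution with n-1 degrees of freedom (from a table).
    """
    numerator = (n - 1) * s2
    # Dividing by the larger quantile gives the lower endpoint, and vice versa.
    return numerator / chi2_upper, numerator / chi2_lower

# Hypothetical inputs: n = 10, s^2 = 4.0, 95% confidence.
# Table values for 9 degrees of freedom: about 2.700 (0.025) and 19.023 (0.975).
var_lower, var_upper = variance_confidence_interval(4.0, 10, 2.700, 19.023)
sd_lower, sd_upper = sqrt(var_lower), sqrt(var_upper)  # interval for sigma
print(round(var_lower, 4), round(var_upper, 4))
print(round(sd_lower, 4), round(sd_upper, 4))
```

Note that the interval is not symmetric around $s^2$, because the $\chi^2$ distribution is skewed.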
  • Confidence Interval for Population Proportion

    • Consider the population proportion in a binomial distribution
    • $$\hat{p}=\frac{X}{n}$$
    • $$\sigma_{\hat{p}}^2=\frac{\sigma^2}{n}=\frac{p(1-p)}{n}$$
    • Since $p$ is unknown, the standard error is estimated with $\hat{p}$:
    • $$L_1=\hat{p}-z_{1-\frac{\alpha}{2}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
    • $$L_2=\hat{p}+z_{1-\frac{\alpha}{2}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
    • Derive the necessary sample size

      • Define the margin of error for a $(1-\alpha)100\%$ confidence interval for a population proportion to be
      • $$ m=z_{1-\frac{\alpha}{2}} \mathrm{SE}_{\hat{p}}=z_{1-\frac{\alpha}{2}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} $$
      • Using $\hat{p}=0.5$ is the conservative approach: $\hat{p}(1-\hat{p})$ is largest there, so it produces an overestimate of $n$.
      • $$ m \leq z_{1-\frac{\alpha}{2}} \sqrt{\frac{0.5(1-0.5)}{n}} $$
      • $$ m^2 \leq z_{1-\frac{\alpha}{2}}^2\left(\frac{0.25}{n}\right) $$
      • $$ n=\left(\frac{z_{1-\frac{\alpha}{2}}}{2 m}\right)^2 $$
      • Ex: For a 95% confidence interval with a 2% margin of error, $\alpha=0.05$ and $m=0.02$.
      • $$ n=\left(\frac{z_{1-\frac{\alpha}{2}}}{2 m}\right)^2=\left(\frac{1.960}{2(0.02)}\right)^2=2401 $$
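Both the proportion interval and the sample-size formula can be sketched with the standard library. The function names `proportion_confidence_interval` and `required_sample_size` are hypothetical, as is the example of 40 successes in 100 trials; the sample-size function assumes the conservative choice $\hat{p}=0.5$ and rounds up to the next whole observation.

```python
from math import ceil, sqrt
from statistics import NormalDist

def proportion_confidence_interval(x, n, confidence=0.95):
    """Return (L1, L2) for p, using p_hat = x / n in the standard error."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def required_sample_size(margin, confidence=0.95):
    """Return n = (z / (2 m))^2, rounded up, under the conservative p_hat = 0.5."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return ceil((z / (2.0 * margin)) ** 2)

# The worked example above: 95% confidence, 2% margin of error.
print(required_sample_size(0.02))

# Hypothetical data: 40 successes out of 100 trials.
print(proportion_confidence_interval(40, 100))
```

The first call reproduces the value 2401 from the worked example. The code uses the unrounded quantile (about 1.95996) rather than 1.960, which gives about 2400.9 before rounding up.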