Chapter 03.
One-parameter Models
๋ณธ ํฌ์คํ ์ First Course in Bayesian Statistical Methods๋ฅผ ์ฐธ๊ณ ํ์๋ค.
Binomial Model
Prior: $\theta \sim Beta(a,b)$
Likelihood: $Y|\theta \sim Binomial(n, \theta)$
Posterior: $\theta|y \sim Beta(a+y, b+n-y)$
a: prior ์ฑ๊ณตํ์, b: prior ์คํจํ์, $\omega$=a+b: concentration
$E[\theta|y] = \frac{a+y}{a+b+n} = \frac{n}{a+b+n}\times\frac{y}{n} + \frac{a+b}{a+b+n}\times\frac{a}{a+b}$ where $\frac{y}{n}$ = sample mean, $\frac{a}{a+b}$ = prior expectation
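As a quick numerical sanity check of the weighted-average form of the posterior mean (the values $a=b=2$, $n=20$, $y=15$ below are made up for illustration):

```python
# Beta-Binomial update with illustrative values:
# prior Beta(a, b), data y successes out of n trials.
a, b = 2.0, 2.0     # prior successes / failures
n, y = 20, 15       # observed trials and successes

# Posterior is Beta(a + y, b + n - y); its mean is (a + y) / (a + b + n).
post_mean = (a + y) / (a + b + n)

# Same mean, written as a weighted average of sample mean and prior mean.
w_data = n / (a + b + n)
w_prior = (a + b) / (a + b + n)
weighted = w_data * (y / n) + w_prior * (a / (a + b))

print(post_mean, weighted)  # the two quantities agree
```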
Posterior Predictive
$n^* = 1$์ผ ๋ : $\tilde{Y}|y \text{ ~ } Ber(\frac{a+y}{a+b+n})$
$n^* \geq 2$์ผ ๋ : $p(\tilde{Y}=y^*|y) = \binom{n^*}{y^*}\frac{B(a+y+y^*, b+n+n^*-y-y^*)}{B(a+y, b+n-y)}$ where $B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)} $
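The predictive pmf above can be evaluated directly. A minimal sketch, with made-up values for $a$, $b$, $n$, $y$, and $n^*$, confirming that the probabilities sum to 1:

```python
from math import comb, exp, lgamma

def B(alpha, beta):
    """Beta function B(α, β) = Γ(α)Γ(β) / Γ(α+β), via log-gamma for stability."""
    return exp(lgamma(alpha) + lgamma(beta) - lgamma(alpha + beta))

# Hypothetical numbers: prior a = b = 2, data y = 15 out of n = 20,
# predicting n_star = 5 future trials.
a, b, n, y = 2, 2, 20, 15
n_star = 5

def pred_pmf(y_star):
    """Posterior predictive P(Y~ = y_star | y) for the Binomial model."""
    return (comb(n_star, y_star)
            * B(a + y + y_star, b + n + n_star - y - y_star)
            / B(a + y, b + n - y))

pmf = [pred_pmf(k) for k in range(n_star + 1)]
print(sum(pmf))  # sums to 1, so this is a valid pmf
```

The predictive mean also matches $n^* \cdot \frac{a+y}{a+b+n}$, i.e. $n^*$ times the posterior mean of $\theta$.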
Poisson Model
Prior: $\theta \sim Gamma(a,b)$
Likelihood: $Y_1, ..., Y_n \stackrel{iid}{\sim} Poisson(\theta)$
Posterior: $\theta|y_1, ..., y_n \sim Gamma(a+\sum_{i=1}^{n}{y_i}, b+n)$
a: sum of counts from b prior observations, b: number of prior observations
$E[\theta|y_1, ..., y_n] = \frac{a+\sum y_i}{b+n} = \frac{b}{b+n}\frac{a}{b} + \frac{n}{b+n}\frac{\sum y_i}{n}$
Posterior Predictive: $\tilde{Y}|y_1, ..., y_n \sim NB\big(a+\sum y_i, \frac{b+n}{b+n+1}\big)$
๋จ, ์ฌ๊ธฐ์ $Negative Binomial$์ ์ฑ๊ณต์ด ์๋ ์คํจํ์๋ฅผ ์ธ๋ ๋ถํฌ ํํ์ด๋ค. ์์ธํ ๋ด์ฉ์ ํ๋ฅ ๋ถํฌ ํฌ์คํ
์์ ํ์ธํ์.
Exponential Family
exponential family(์ง์์กฑ)์ pdf ๋๋ pmf๋ ๋ค์๊ณผ ๊ฐ์ ํ์์ผ๋ก ํํ๋ ์ ์์ด์ผ ํ๋ค.
$p(y|\phi) = h(y)c(\phi)\exp\big[\phi\, K(y)\big]$
exponential family ์์ฒด์ ๋ํด์ ๋ณด๋ค ์์ธํ ๊ฒ์ ํด๋น ํฌ์คํ
์ ์ฐธ๊ณ ํ์.
Prior
$$\begin{align} p(\phi) &= k(n_0, t_0)c(\phi)^{n_0}e^{n_0t_0\phi} \\ &\propto c(\phi)^{n_0}e^{n_0t_0\phi} \end{align}$$
Likelihood
$$L(\phi|y_1,...,y_n) \propto c(\phi)^n \exp\big(\phi \sum_{i=1}^{n}K(y_i)\big)$$
Posterior
$$\begin{align} p(\phi|y) &\propto p(\phi)f(y|\phi) \\ &\propto c(\phi)^{n_0}e^{n_0t_0\phi} \cdot c(\phi)^n \exp\big(\phi \sum_{i=1}^{n}K(y_i)\big) \\ &\propto c(\phi)^{n_0+n}\exp\big[n_0t_0\phi + \phi \sum_{i=1}^{n}K(y_i) \big] \\ &\propto c(\phi)^{n_0+n}\exp\big[ \phi \big( n_0t_0 + n\frac{\sum_{i=1}^{n}K(y_i)}{n} \big)\big] \end{align}$$
์ฌ๊ธฐ์ $n_0$์ $t_0$์ ๊ฐ๊ฐ prior sample size์ prior guess of $K(Y)$๋ฅผ ๋ปํ๋ค.
Conjugate Prior
prior์ posterior์ ํ๋ฅ ๋ถํฌํํ๊ฐ ๊ฐ์ ์ ์๋๋ก prior์ ์ค์ ํ๋ฉด ์ด๋ฅผ conjugate prior๋ผ๊ณ ํ๋ค.
์์ ์์ ์ธ์๋ Normal model ๋ฑ์ด ์๋๋ฐ, ์ด๋ค์ ๋ํด์๋ ๋ค์์ ์ด์ด์ ์ดํด๋ณด๋๋ก ํ๊ฒ ๋ค.
๋ค์ํ ์์๋ค์ ์ํค๋ฐฑ๊ณผ์ ์์ธํ ๋์์์ผ๋ ๊ถ๊ธํ ์ฌ๋๋ค์ ์ถ๊ฐ์ ์ผ๋ก ์ดํด๋ณด์๋ ์ข๊ฒ ๋ค.
์ฃผ์์ฌํญ
์ฌํํ๋ฅ ๋ถํฌ๊ฐ ์ฐจ์ด๊ฐ ๋ง์ด ๋๋ ๊ฒ๊ณผ ์ฌํ์์ธก์น๊ฐ ์ฐจ์ด๊ฐ ๋ง์ด ๋๋ ๊ฒ์ ์ฐจ์ด๋ฅผ ์์๋์ด์ผ ํ๋ค. ์ฆ, {${\theta_1 > \theta_2}$}์ {$\tilde{Y_1} > \tilde{Y_2}$}๋ ๋ค๋ฅด๋ค.
Strong evidence of a difference between two populations does not mean that the difference itself is large.
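A Monte Carlo sketch of this point for the Gamma-Poisson model (all posterior parameters below are made up): even when $Pr(\theta_1 > \theta_2 \mid data)$ is near 1, $Pr(\tilde{Y}_1 > \tilde{Y}_2 \mid data)$ can stay near 1/2, because predictive draws carry sampling variability on top of posterior uncertainty.

```python
import math
import random

rng = random.Random(1)

def sample_poisson(lam):
    """Knuth's product-of-uniforms Poisson sampler (fine for small λ)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

# Made-up Gamma posteriors for two groups' Poisson rates:
a1, b1 = 220, 110   # posterior mean 2.00
a2, b2 = 145, 100   # posterior mean 1.45

S = 50_000
gt_theta = gt_pred = 0
for _ in range(S):
    t1 = rng.gammavariate(a1, 1 / b1)   # draw θ1 | data (scale = 1/rate)
    t2 = rng.gammavariate(a2, 1 / b2)   # draw θ2 | data
    gt_theta += t1 > t2
    gt_pred += sample_poisson(t1) > sample_poisson(t2)

p_theta = gt_theta / S   # Pr(θ1 > θ2 | data): close to 1
p_pred = gt_pred / S     # Pr(Y~1 > Y~2 | data): much closer to 1/2
print(p_theta, p_pred)
```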
Conclusion
Conjugacy๋ฅผ ์ ์์๋์.
ํน์ ๊ถ๊ธํ ์ ์ด๋ ์๋ชป๋ ๋ด์ฉ์ด ์๋ค๋ฉด, ๋๊ธ๋ก ์๋ ค์ฃผ์๋ฉด ์ ๊ทน ๋ฐ์ํ๋๋ก ํ๊ฒ ์ต๋๋ค.