todo

  • model comparison
  • ch7, 9, 13(power analysis and sample size) + beta distributions
  • sec 23.1

Probability: the long-run frequency of an event occurring

  • joint probability P(X=value1, Y=valueA) = P(X=value1 | Y=valueA) × P(Y=valueA)
  • marginal probability P(X=value1) = Σ_y P(X=value1, Y=y), or ∫dy p(x,y) in the continuous case.
  • conditional probability

    P(X=value1 | Y=valueA) = P(X=value1, Y=valueA) / Σ_x P(X=x, Y=valueA) = p(x,y) / ∫dx p(x,y) = P(X=value1, Y=valueA) / P(Y=valueA)
    • tells the relation between two different, but related, conditionals
      1. P(X|Y) = P(X,Y) / P(Y): the joint probability of the two events over the marginal probability of the conditioning event
      2. take P(X|Y) and multiply it by P(Y) => P(X|Y) P(Y) = P(X,Y)
      3. respectively P(Y|X) P(X) = P(X,Y)
      4. equate the two alternative representations of the joint P(X,Y) => P(X|Y) P(Y) = P(Y|X) P(X)
      5. divide by P(Y) => P(X|Y) = P(Y|X) P(X) / P(Y)
    • conditional probability with many events
      • P(X|Y,Z) = P(Y,Z|X) P(X) / P(Y,Z)
      • where, if Y and Z are conditionally independent given X: P(Y,Z|X) = P(Y|X) P(Z|X)
      • and, by the law of total probability: P(Y,Z) = [P(Y,Z|X) P(X)] + [P(Y,Z|¬X) P(¬X)]
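The joint/marginal/conditional identities above can be checked numerically. A minimal Python sketch; the 2×2 joint distribution is made up purely for illustration:

```python
# Numerical check of the joint/marginal/conditional identities above,
# on a made-up 2x2 joint distribution P(X, Y).
joint = {
    ("x1", "yA"): 0.10, ("x1", "yB"): 0.30,
    ("x2", "yA"): 0.25, ("x2", "yB"): 0.35,
}

def marginal_X(x):
    # P(X=x) = sum over y of P(X=x, Y=y)
    return sum(p for (xi, _), p in joint.items() if xi == x)

def marginal_Y(y):
    # P(Y=y) = sum over x of P(X=x, Y=y)
    return sum(p for (_, yi), p in joint.items() if yi == y)

def cond_X_given_Y(x, y):
    # P(X=x | Y=y) = P(X=x, Y=y) / P(Y=y)
    return joint[(x, y)] / marginal_Y(y)

def cond_Y_given_X(y, x):
    # P(Y=y | X=x) = P(X=x, Y=y) / P(X=x)
    return joint[(x, y)] / marginal_X(x)

# Bayes' rule: P(X|Y) = P(Y|X) P(X) / P(Y)
lhs = cond_X_given_Y("x1", "yA")
rhs = cond_Y_given_X("yA", "x1") * marginal_X("x1") / marginal_Y("yA")
print(lhs, rhs)  # both equal 0.10 / 0.35
```

Both sides agree because they are the same joint probability divided by the same marginal, just reached along two different routes.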

From prior to updated posterior belief

  • Let’s say there is a hypothesis θ and some data Y to support/oppose it.
    • The prior (prevalence) P(θ) is the strength of our belief in θ without the data Y.
    • The posterior (precision) P(θ|Y) is the strength of our belief in θ once the data Y is taken into account.
    • The likelihood (sensitivity, power) P(Y|θ) is the probability that the data could be generated by the model with parameter values θ.
    • The evidence (marginal likelihood) P(Y) is the probability of the data according to the model, obtained by summing across all possible parameter values weighted by the strength of belief in those parameter values.
    • This is a relation between the prior/subjective belief (the expert opinion on the hypothesis) P(θ) and the posterior/updated belief P(θ|Y) once there is some data to consider.
    P(θ|Y) = P(Y|θ) P(θ) / P(Y), where P(Y) = ∫dθ p(Y|θ) p(θ)
  • Posterior with two independent events:

    P(θ|Y,Z) = P(Y|θ,Z) P(θ|Z) / P(Y|Z)

    • since Y and Z are independent given θ => P(Y|θ,Z) = P(Y|θ)
    • the denominator requires attention too: P(Y|Z) = P(Y,Z) / P(Z)
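With independence given θ, the two-event posterior can also be reached by two sequential single-event updates, using the first posterior as the new prior. A Python sketch; all likelihood numbers are made up for illustration:

```python
# Sequential Bayesian updating, assuming Y and Z are independent given theta.
# All likelihood numbers are made up for illustration.
prior = 0.5                            # P(theta)
p_Y = {True: 0.8, False: 0.4}          # P(Y | theta), P(Y | not theta)
p_Z = {True: 0.7, False: 0.3}          # P(Z | theta), P(Z | not theta)

def update(prior, lik_true, lik_false):
    # One Bayes step: posterior = likelihood * prior / evidence.
    evidence = lik_true * prior + lik_false * (1 - prior)
    return lik_true * prior / evidence

# Update on Z first, then on Y using the Z-posterior as the new prior.
post_Z = update(prior, p_Z[True], p_Z[False])
post_YZ = update(post_Z, p_Y[True], p_Y[False])

# Same answer in one step, since P(Y,Z|theta) = P(Y|theta) P(Z|theta).
post_direct = update(prior, p_Y[True] * p_Z[True], p_Y[False] * p_Z[False])
print(post_YZ, post_direct)  # both equal 0.28 / 0.34
```

The order of the updates does not matter; any permutation of independent observations yields the same posterior.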

Tables

  • note: True/False Positive/Negative here refers to the test result, e.g. a False Negative means the test was incorrectly negative (it should have been positive: a sick person received a negative test result)
  • LR+ (positive likelihood ratio): how much the odds of having the disease increase when the test is positive; LR+ = TPR / FPR = sensitivity / (1 - specificity)
  • LR- (negative likelihood ratio): LR- = FNR / TNR = (1 - sensitivity) / specificity; a value below 1, so after a negative test the odds of having the disease shrink by this factor (equivalently, the odds of not having the disease increase)
  • Power of a binary hypothesis test is the probability that the test correctly rejects H0 when a specific H1 is true = P(reject H0 | H1 is true). "bayesian table"
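The likelihood-ratio definitions give a quick odds form of Bayes' rule. A Python sketch using the numbers of the disease example below (sensitivity 0.99, FPR 0.09, prevalence 0.02):

```python
# Likelihood ratios and the odds form of Bayes' rule, using the
# disease example's numbers (sensitivity 0.99, FPR 0.09, prevalence 0.02).
sensitivity = 0.99                 # P(test+ | sick), true positive rate
specificity = 0.91                 # P(test- | well) = 1 - FPR
prevalence = 0.02                  # prior P(sick)

lr_pos = sensitivity / (1 - specificity)      # = 0.99 / 0.09 = 11
lr_neg = (1 - sensitivity) / specificity      # = 0.01 / 0.91, well below 1

# A positive test multiplies the prior odds of disease by LR+:
prior_odds = prevalence / (1 - prevalence)
post_odds = prior_odds * lr_pos
post_prob = post_odds / (1 + post_odds)       # back to a probability
print(lr_pos, post_prob)
```

The resulting post_prob (about 0.1833) matches the posterior obtained from the full table in the example below; the odds form is just Bayes' rule rearranged.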

Examples

  • Example: a disease with θ(disease) ∈ {sick, well} and Y(test) ∈ {positive, negative}. What is the probability of having the disease when the test is positive (i.e. the precision), given that the TP rate (sensitivity) is 0.99, the FP rate (Type I error rate) is 0.09, and the prevalence is 0.02?

    test \ disease    θ = sick                   θ = well                   marginal P(Y)
    Y = positive      0.99* × 0.02 = 0.0198      0.09** × 0.98 = 0.0882     0.1080
    Y = negative      0.01 × 0.02 = 0.0002       0.91 × 0.98 = 0.8918      0.8920
    marginal P(θ)     0.02***                    0.98                       1.00
  • * likelihood P(Y=positive | θ=sick) = 0.99 // true positive rate
  • ** likelihood P(Y=positive | θ=well) = 0.09 // false positive rate
  • *** prior belief P(θ=sick) = 0.02
  • evidence P(Y=positive) = P(Y=positive, θ=sick) + P(Y=positive, θ=well) = (0.99 × 0.02) + (0.09 × 0.98) = 0.1080
  • posterior P(θ=sick | Y=positive) = P(Y=positive | θ=sick) P(θ=sick) / P(Y=positive) = 0.99 × 0.02 / 0.108 = 0.1833
  • conditional P(θ=sick | Y=positive) = P(θ=sick, Y=positive) / P(Y=positive) = 0.0198 / 0.108 = 0.1833
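The table arithmetic above can be reproduced directly. A minimal Python sketch of the same computation:

```python
# The disease example in code: posterior P(sick | test positive).
prior_sick = 0.02            # prevalence, prior belief P(theta = sick)
p_pos_given_sick = 0.99      # likelihood, true positive rate
p_pos_given_well = 0.09      # likelihood, false positive rate

# evidence: P(pos) = P(pos|sick) P(sick) + P(pos|well) P(well)
evidence = p_pos_given_sick * prior_sick + p_pos_given_well * (1 - prior_sick)

# posterior via Bayes' rule: P(sick|pos) = P(pos|sick) P(sick) / P(pos)
posterior_sick = p_pos_given_sick * prior_sick / evidence
print(round(evidence, 4), round(posterior_sick, 4))  # → 0.108 0.1833
```

Despite the high sensitivity, the posterior is only about 18% because the disease is rare: most positives come from the large healthy population.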

  • Example with coin flipping: the prior belief is that the coin is fair (heads probability 0.5) half the time, and skewed the other half: 25% of the time the heads probability is 0.25, and 25% of the time it is 0.75. The data show 3 heads out of 12 flips. What is the probability of getting heads on a given flip?
    • prior: θ = P(heads) is 0.5 with probability 0.5, and 0.25 or 0.75 with probability 0.25 each.
    • "prior"
      • Note: for those beliefs, the predicted probability of heads is (using the evidence formula, with P(Y=heads|θ) = θ):
        • = P(Y=heads|θ=0.25) P(θ=0.25)
        • + P(Y=heads|θ=0.5) P(θ=0.5)
        • + P(Y=heads|θ=0.75) P(θ=0.75)
        • = (0.25 × 0.25) + (0.5 × 0.5) + (0.75 × 0.25) = 0.5
    • likelihood is P(Y|θ) = θ^3 (1-θ)^9 (the binomial coefficient is omitted; it cancels out in the posterior)
    • "likelihood"
      • for θ = 0.25 the likelihood is 0.25^3 × (1-0.25)^9 = 0.0156 × 0.0751 = 0.0012
      • for θ = 0.5 the likelihood is 0.5^3 × (1-0.5)^9 = 0.125 × 0.00195 = 0.0002
      • for θ = 0.75 the likelihood is 0.75^3 × (1-0.75)^9 = 0.4219 × 3.8147E-06 = 1.609E-06
    • evidence is P(Y) = Σ_θ P(Y|θ) P(θ) // sum of all joint probabilities
      • = P(Y|θ=0.25) P(θ=0.25)
      • + P(Y|θ=0.5) P(θ=0.5)
      • + P(Y|θ=0.75) P(θ=0.75)
      • = (0.0012 × 0.25) + (0.0002 × 0.5) + (1.609E-06 × 0.25) ≈ 0.000416 (using the unrounded likelihoods)
    • posterior is: P(θ|Y) = P(Y|θ) P(θ) / P(Y)
    • "posterior"
      • for θ = 0.25 the posterior is 0.0012 × 0.25 / 0.000416 ≈ 0.705
      • for θ = 0.5 the posterior is 0.0002 × 0.5 / 0.000416 ≈ 0.294
      • for θ = 0.75 the posterior is 1.609E-06 × 0.25 / 0.000416 ≈ 0.001
    • the answer to the question: the probability of heads on a given flip is the posterior-weighted average Σ_θ θ P(θ|Y) = (0.25 × 0.705) + (0.5 × 0.294) + (0.75 × 0.001) ≈ 0.324
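The whole coin example, including the posterior-predictive answer to the original question, fits in a few lines of Python (exact values, no intermediate rounding):

```python
# The coin example in code: posterior over theta after 3 heads in 12 flips,
# plus the posterior-predictive probability of heads on the next flip.
thetas = [0.25, 0.5, 0.75]
prior  = [0.25, 0.5, 0.25]   # fair half the time, skewed 25% each way

heads, tails = 3, 9
likelihood = [t**heads * (1 - t)**tails for t in thetas]

# evidence: sum of likelihood * prior over all parameter values
evidence = sum(l * p for l, p in zip(likelihood, prior))

# posterior for each theta via Bayes' rule
posterior = [l * p / evidence for l, p in zip(likelihood, prior)]

# posterior predictive: P(heads) = sum over theta of theta * P(theta | Y)
p_heads_next = sum(t * p for t, p in zip(thetas, posterior))
print([round(p, 3) for p in posterior], round(p_heads_next, 3))
# → [0.705, 0.294, 0.001] 0.324
```

Seeing only 3 heads in 12 flips shifts almost all belief onto the heads-unfriendly coins, so the predicted probability of heads drops from the prior 0.5 to about 0.32.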

Bibliography

  • Kruschke - Doing Bayesian data analysis - 2010
  • based on Kruschke:
    • http://tinyheero.github.io/2016/03/20/basic-prob.html
    • http://tinyheero.github.io/2016/04/21/bayes-rule.html
    • http://tinyheero.github.io/2017/03/08/how-to-bayesian-infer-101.html
  • https://commons.wikimedia.org/wiki/File:Preventive_Medicine_Statistics_Sensitivity_TPR,_Specificity_TNR,_PPV,_NPV,_FDR,_FOR,_ACCuracy,_Likelihood_Ratio,_Diagnostic_Odds_Ratio_2_Final.png
  • https://eli.thegreenplace.net/2018/conditional-probability-and-bayes-theorem/