chessboard.api.priors.EstimateMissingPriors

chessboard.api.priors.EstimateMissingPriors(data, background, downsample=2000)

Estimate missing value priors.

Empirically estimate missing value model priors for the signal and background groups. In both cases, the priors are estimated using Beta-Binomial regression.

\[\upsilon_j \sim \mathrm{BetaBinomial}(n, \mu\Phi, (1 - \mu)\Phi)\]

\[\mathrm{logit}(\mu) = \beta_0 + \beta_1\chi_j\]

Here, \(\upsilon_j\) is the missingness rate and \(\chi_j\) is the median read depth for a given LSV \(j\).
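As a rough illustration of the model, the sketch below evaluates the Beta-Binomial log-likelihood of one LSV under the logit link. It is not the library's implementation. It assumes that the observed quantity is a count `k` of missing values out of `n` (so the missingness rate is `k / n`), which the text above does not spell out, and the function names are hypothetical.

```python
import math


def log_beta(a, b):
    """Log of the Beta function, computed with log-gamma for stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_logpmf(k, n, a, b):
    """Log-probability of k successes in n trials under BetaBinomial(n, a, b)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


def mean_from_depth(beta0, beta1, depth):
    """Invert logit(mu) = beta0 + beta1 * depth to obtain mu in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * depth)))


def lsv_loglik(k, n, depth, beta0, beta1, phi):
    """Log-likelihood of one LSV: k missing out of n (assumed), median depth `depth`."""
    mu = mean_from_depth(beta0, beta1, depth)
    return betabinom_logpmf(k, n, mu * phi, (1.0 - mu) * phi)
```

In this parameterisation \(\mu\) is the expected missingness rate and \(\Phi\) controls how tightly the rates concentrate around it: larger values of `phi` bring the distribution closer to a plain Binomial.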

For the signal, \(\upsilon_j\) and \(\chi_j\) are derived from the data. For the background, these quantities are obtained from control data, for example GTEx data for the equivalent tissue. We provide sample data for GTEx whole blood in “gtex_missingness.tsv”. The background file should have a column “miss” representing \(\upsilon_j\) and a column “mdepth” representing \(\chi_j\).
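To make the expected file layout concrete, the sketch below writes and reads back a small tab-separated file with the two columns named above. The file name `background_example.tsv`, the helper `read_background`, and the numeric values are made up for illustration. Whether the real file may carry additional columns is not stated here, so the check only requires that “miss” and “mdepth” are present.

```python
import csv
import os
import tempfile

# Placeholder values only: one row per LSV, not real GTEx measurements.
rows = [
    {"miss": 0.10, "mdepth": 35.0},
    {"miss": 0.25, "mdepth": 12.0},
    {"miss": 0.02, "mdepth": 80.0},
]

# Hypothetical file name, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "background_example.tsv")
with open(path, "w", newline="") as fh:
    writer = csv.DictWriter(fh, fieldnames=["miss", "mdepth"], delimiter="\t")
    writer.writeheader()
    writer.writerows(rows)


def read_background(tsv_path):
    """Return (miss, mdepth) pairs, failing early if a required column is absent."""
    with open(tsv_path, newline="") as fh:
        reader = csv.DictReader(fh, delimiter="\t")
        missing = {"miss", "mdepth"} - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"background TSV lacks columns: {sorted(missing)}")
        return [(float(r["miss"]), float(r["mdepth"])) for r in reader]


table = read_background(path)
```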

Parameters

data (Data) – A CHESSBOARD object containing the data for which priors are to be estimated.
background (str) – Path to the TSV file for estimating background priors.
downsample (int) – Number of samples from the TSV to use. Downsampling helps reduce run time.
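The effect of a cap like downsample can be pictured as drawing a fixed-size random subset of the background rows before fitting. The sketch below is an assumption about the general idea, not the library's actual procedure. The helper `downsample_rows` and its `seed` argument are hypothetical.

```python
import random


def downsample_rows(rows, downsample=2000, seed=0):
    """Return at most `downsample` rows, drawn without replacement."""
    rows = list(rows)
    if len(rows) <= downsample:
        # Nothing to do when the table is already small enough.
        return rows
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    return rng.sample(rows, downsample)
```

Fitting on a subset trades some precision in the estimated priors for a shorter run time.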