In the Bayesian multiple regression model, the posterior density of the model parameters depends on the likelihood of the data given the parameters and a prior probability for the model parameters. The choice of the prior for marker effects can influence the type and extent of shrinkage induced in the model.
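The relationship described above can be written out as a short sketch. The notation is chosen here for illustration and is not taken from the package source: it assumes the model $y = Wb + e$ with $e \sim N(0, I\sigma_e^2)$ (covariates omitted for brevity), and it uses the common textbook form of the BayesC prior, with $\pi$ read as the proportion of markers with a nonzero effect.

```latex
\begin{aligned}
p(b, \sigma_b^2, \sigma_e^2, \pi \mid y)
  &\propto p(y \mid b, \sigma_e^2)\,
           p(b \mid \sigma_b^2, \pi)\,
           p(\sigma_b^2)\, p(\sigma_e^2)\, p(\pi), \\
b_j \mid \sigma_b^2, \pi
  &\sim \pi\, N(0, \sigma_b^2) + (1 - \pi)\, \delta_0 .
\end{aligned}
```

Under this assumed prior, a small $\pi$ places most prior mass on the point mass at zero $\delta_0$, so most marker effects are shrunk strongly toward zero while a few are allowed to remain large.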
Usage
gmap(
y = NULL,
X = NULL,
W = NULL,
stat = NULL,
trait = NULL,
sets = NULL,
fit = NULL,
Glist = NULL,
chr = NULL,
rsids = NULL,
ids = NULL,
b = NULL,
bm = NULL,
seb = NULL,
mask = NULL,
LD = NULL,
n = NULL,
vg = NULL,
vb = NULL,
ve = NULL,
ssg_prior = NULL,
ssb_prior = NULL,
sse_prior = NULL,
lambda = NULL,
scaleY = TRUE,
shrinkLD = FALSE,
shrinkCor = FALSE,
formatLD = "dense",
pruneLD = TRUE,
r2 = 0.05,
checkLD = TRUE,
h2 = NULL,
pi = 0.001,
updateB = TRUE,
updateG = TRUE,
updateE = TRUE,
updatePi = TRUE,
adjustE = TRUE,
models = NULL,
checkConvergence = FALSE,
critVe = 3,
critVg = 5,
critVb = 5,
critPi = 3,
ntrial = 1,
nug = 4,
nub = 4,
nue = 4,
verbose = FALSE,
msize = 100,
threshold = NULL,
ve_prior = NULL,
vg_prior = NULL,
tol = 0.001,
nit = 100,
nburn = 50,
nit_local = NULL,
nit_global = NULL,
method = "bayesC",
algorithm = "mcmc"
)
Arguments
- y
A vector or matrix of phenotypes.
- X
A matrix of covariates.
- W
A matrix of centered and scaled genotypes.
- stat
Dataframe with marker summary statistics.
- trait
Integer used to select traits in the covs object.
- sets
A list of character vectors where each vector represents a set of items. If the names of the sets are not provided, they are named as "Set1", "Set2", etc.
- fit
List of results from gbayes.
- Glist
List of information about genotype matrix stored on disk.
- chr
Chromosome for which to fit BLR models.
- rsids
Character vector of rsids.
- ids
Vector of individuals used in the study.
- b
Vector or matrix of marginal marker effects.
- bm
Vector or matrix of adjusted marker effects for the BLR model.
- seb
Vector or matrix of standard error of marginal effects.
- mask
Vector or matrix specifying if marker should be ignored.
- LD
List with sparse LD matrices.
- n
Scalar or vector of number of observations for each trait.
- vg
Scalar or matrix of genetic (co)variances.
- vb
Scalar or matrix of marker (co)variances.
- ve
Scalar or matrix of residual (co)variances.
- ssg_prior
Scalar or matrix of prior genetic (co)variances.
- ssb_prior
Scalar or matrix of prior marker (co)variances.
- sse_prior
Scalar or matrix of prior residual (co)variances.
- lambda
Vector or matrix of lambda values.
- scaleY
Logical indicating if y should be scaled.
- shrinkLD
Logical indicating if LD should be shrunk.
- shrinkCor
Logical indicating if cor should be shrunk.
- formatLD
Character specifying LD format (default is "dense").
- pruneLD
Logical indicating if LD pruning should be applied.
- r2
Scalar providing the r2 threshold used in pruning.
- checkLD
Logical indicating if LD should be checked against the summary statistics.
- h2
Trait heritability.
- pi
Proportion of markers in each marker variance class.
- updateB
Logical indicating if marker (co)variances should be updated.
- updateG
Logical indicating if genetic (co)variances should be updated.
- updateE
Logical indicating if residual (co)variances should be updated.
- updatePi
Logical indicating if pi should be updated.
- adjustE
Logical indicating if residual variance should be adjusted.
- models
List structure with models evaluated in bayesC.
- checkConvergence
Logical indicating if convergence should be checked.
- critVe
Scalar providing the z-score threshold used in checking convergence for Ve.
- critVg
Scalar providing the z-score threshold used in checking convergence for Vg.
- critVb
Scalar providing the z-score threshold used in checking convergence for Vb.
- critPi
Scalar providing the z-score threshold used in checking convergence for Pi.
- ntrial
Integer providing number of trials used if convergence is not obtained.
- nug
Scalar or vector of prior degrees of freedom for genetic (co)variances.
- nub
Scalar or vector of prior degrees of freedom for marker (co)variances.
- nue
Scalar or vector of prior degrees of freedom for residual (co)variances.
- verbose
Logical; if TRUE, it prints more details during iteration.
- msize
Integer providing number of markers used in computation of sparseld.
- threshold
Scalar providing the threshold used in adjustment of B.
- ve_prior
Scalar or matrix of prior residual (co)variances.
- vg_prior
Scalar or matrix of prior genetic (co)variances.
- tol
Convergence criteria used in gbayes.
- nit
Number of iterations.
- nburn
Number of burnin iterations.
- nit_local
Number of local iterations.
- nit_global
Number of global iterations.
- method
Method used (e.g. "bayesN", "bayesA", "bayesL", "bayesC", "bayesR"; default is "bayesC").
- algorithm
Character specifying the algorithm (default is "mcmc").
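As a minimal sketch of the structure described for the sets argument, the following builds a list of character vectors in base R and names the sets explicitly. The rsids are hypothetical placeholders, not markers from any real dataset, and the naming step simply mirrors the "Set1", "Set2" default described above rather than reproducing the package's internal code.

```r
# Hypothetical marker identifiers, for illustration only
rsids <- paste0("rs", 1:6)

# Two marker sets, each a character vector of rsids
sets <- list(rsids[1:3], rsids[4:6])

# Name the sets explicitly instead of relying on default naming
names(sets) <- paste0("Set", seq_along(sets))

# Inspect the resulting structure
str(sets)
```

Explicit names make it easier to match per-region results back to the sets that produced them.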
Value
A list containing:
- bm
Vector or matrix of posterior means for marker effects.
- dm
Vector or matrix of posterior means for marker inclusion probabilities.
- vb
Scalar or vector of posterior means for marker variances.
- vg
Scalar or vector of posterior means for genomic variances.
- ve
Scalar or vector of posterior means for residual variances.
- rb
Matrix of posterior means for marker correlations.
- rg
Matrix of posterior means for genomic correlations.
- re
Matrix of posterior means for residual correlations.
- pi
Vector of posterior probabilities for models.
- h2
Vector of posterior means for model probability.
- param
List of current parameters used for restarting the analysis.
- stat
Matrix of marker information and effects used for genomic risk scoring.
Details
This function implements Bayesian linear regression models to provide unified mapping of genetic variants, estimate genetic parameters (e.g. heritability), and predict disease risk. It is designed to handle various genetic architectures and scale efficiently with large datasets.
Examples
# Plink bed/bim/fam files
bedfiles <- system.file("extdata", paste0("sample_chr",1:2,".bed"), package = "qgg")
bimfiles <- system.file("extdata", paste0("sample_chr",1:2,".bim"), package = "qgg")
famfiles <- system.file("extdata", paste0("sample_chr",1:2,".fam"), package = "qgg")
# Prepare Glist
Glist <- gprep(study="Example", bedfiles=bedfiles, bimfiles=bimfiles, famfiles=famfiles)
# Simulate phenotype
sim <- gsim(Glist=Glist, chr=1, nt=1)
# Compute single marker summary statistics
stat <- glma(y=sim$y, Glist=Glist, scale=FALSE)
str(stat)
# Define fine-mapping regions
sets <- Glist$rsids
Glist$chr[[1]] <- gsub("21","1",Glist$chr[[1]])
Glist$chr[[2]] <- gsub("22","2",Glist$chr[[2]])
# Fine map
fit <- gmap(Glist=Glist, stat=stat, sets=sets, verbose=FALSE,
method="bayesC", nit=1500, nburn=500, pi=0.001)
fit$post # Posterior inference for every fine-mapped region
fit$conv # Convergence statistics for every fine-mapped region
# Posterior inference for marker effect
head(fit$stat)