Introduction

LD score regression is a statistical method used to estimate the heritability of complex traits and to understand the genetic architecture underlying genome-wide association study (GWAS) results. By analyzing the relationship between linkage disequilibrium (LD) scores (which measure the extent to which SNPs are correlated with nearby genetic variants) and GWAS summary statistics, it separates true polygenic signal from confounding biases like population stratification. This approach helps to quantify the contribution of common genetic variants to trait heritability.

This tutorial demonstrates how to estimate narrow-sense heritability and genetic correlation using the LD Score Regression (LDSC) method within the context of a gact database. It also includes examples of partitioning of heritability using different annotation sources. The script makes use of the qgg and gact libraries to process and analyze GWAS summary statistics.


# Load libraries
library(qgg)
library(gact)

# Load GAlist
GAlist <- readRDS(file="C:/Users/gact/hsa.0.0.1/GAlist_hsa.0.0.1.rds")

# Check studies in gact database
GAlist$studies

# Select GWAS study IDs
studyIDs <- c("GWAS1","GWAS2","GWAS6")

# Get GWAS summary statistics for studyIDs (e.g. z and n) from gact database
stat <- getMarkerStat(GAlist=GAlist, studyID=studyIDs)

# Get ldscores matched to the ancestry of GWAS data
ldscores <- getLDscores(GAlist=GAlist, ancestry="EUR", version="1000G")

# Estimate narrow sense heritability using ldsc
fit <- ldsc(z=stat$z, n=stat$n, ldscores=ldscores, what="h2")
fit

# Estimate genetic correlation using using ldsc
fit <- ldsc(z=stat$z, n=stat$n, ldscores=ldscores, what="rg")
fit$h2
fit$rg

# Partioning of heritability using chromosome information using standard regression method
sets <- getMarkerSets(GAlist=GAlist, feature="Chromosomes")
fit <- ldsc(z=stat$z, n=stat$n, ldscores=ldscores, sets=sets, what="h2", residual=TRUE)
fit

# Partioning of heritability using regulatory categories information and a BLR method
sets <- getMarkerSets(GAlist=GAlist, feature="Regulatory Categories")
fit <- ldsc(z=stat$z, n=stat$n, ldscores=ldscores, sets=sets, what="h2", method="bayesC", residual=TRUE)
fit