Calculates a normalization factor for each sample using one of several methods. See details.
Usage
norm.fact(
df,
method = c("TMM", "TMMex", "MedR", "QN"),
logratioTrim = 0.3,
sumTrim = 0.05,
Weighting = TRUE,
Acutoff = -1e+10
)
Arguments
- df
a data frame or matrix of allele depth values (total depth per SNP per sample)
- method
character. method to be used (see details). Default TMM
- logratioTrim
numeric. proportion (0 - 1) of log-ratio (“M”) values to be trimmed for method TMM
- sumTrim
numeric. amount of trim to use on the combined absolute levels (“A” values) for method TMM
- Weighting
logical, whether to compute (asymptotic binomial precision) weights
- Acutoff
numeric, cutoff on “A” values to use before trimming
Details
Originally described for normalization of RNA-seq data
(Robinson & Oshlack 2010), this function computes normalization (scaling)
factors to convert observed library sizes into effective library sizes.
The default method, TMM, is the trimmed mean of M-values proposed by
Robinson & Oshlack (2010). See the original publication and the edgeR
package for more information.
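As a rough illustration of how the TMM arguments interact, here is a minimal base R sketch of the trimmed mean of M-values for one sample against one reference sample. It follows the published procedure as commonly implemented, not the internal code of this package. The function name tmm_factor is hypothetical, and the defaults are copied from the usage above.

```r
# Hypothetical helper: TMM scaling factor of one sample (obs) relative to a
# reference sample (ref). Both are numeric vectors of depths over the same SNPs.
tmm_factor <- function(obs, ref, logratioTrim = 0.3, sumTrim = 0.05,
                       Weighting = TRUE, Acutoff = -1e10) {
  obs <- as.numeric(obs)
  ref <- as.numeric(ref)
  nO <- sum(obs)  # observed library size of the sample
  nR <- sum(ref)  # observed library size of the reference

  logR <- log2((obs / nO) / (ref / nR))               # "M" values (log-ratios)
  absE <- (log2(obs / nO) + log2(ref / nR)) / 2       # "A" values (absolute levels)
  v <- (nO - obs) / nO / obs + (nR - ref) / nR / ref  # asymptotic variance of M

  # drop SNPs with zero depth in either sample and those below the A cutoff
  keep <- is.finite(logR) & is.finite(absE) & absE > Acutoff
  logR <- logR[keep]
  absE <- absE[keep]
  v <- v[keep]
  if (length(logR) == 0 || max(abs(logR)) < 1e-6) return(1)

  # rank bounds: trim logratioTrim of the M values and sumTrim of the A values
  n <- length(logR)
  loL <- floor(n * logratioTrim) + 1
  hiL <- n + 1 - loL
  loS <- floor(n * sumTrim) + 1
  hiS <- n + 1 - loS
  keep <- rank(logR) >= loL & rank(logR) <= hiL &
    rank(absE) >= loS & rank(absE) <= hiS

  # (precision-weighted) mean of the remaining M values, back on the raw scale
  f <- if (Weighting) {
    sum(logR[keep] / v[keep]) / sum(1 / v[keep])
  } else {
    mean(logR[keep])
  }
  if (is.na(f)) f <- 0
  2^f
}

ref <- c(10, 20, 30, 40, 50)
obs <- c(10, 20, 30, 40, 500)  # same sample with one strongly inflated SNP
tmm_factor(obs, ref)           # close to 0.25
```

In this toy case the inflated SNP is trimmed away. The factor of about 0.25 rescales the observed library size of 600 to an effective library size of about 150, which matches the reference.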
The method MedR is median ratio normalization, and QN is quantile
normalization (see Maza et al. 2013 for a comparison of methods).
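Median ratio normalization can be sketched in a few lines of base R. This assumes the pseudo-reference for each SNP is the geometric mean of its positive depths across samples. The function name med_ratio and the toy matrix are hypothetical and only for illustration.

```r
# Hypothetical helper: median ratio normalization factors.
# Rows of `depth` are SNPs, columns are samples.
med_ratio <- function(depth) {
  m <- as.matrix(depth)
  # per-SNP pseudo-reference: geometric mean of the positive depths in the row
  pseudo <- apply(m, 1, function(x) exp(mean(log(x[x > 0]))))
  # per-sample factor: median ratio of observed depth to the pseudo-reference
  apply(m, 2, function(x) median(x / pseudo, na.rm = TRUE))
}

depth <- matrix(c(10, 20, 30,
                  20, 40, 60),
                ncol = 2, dimnames = list(NULL, c("s1", "s2")))
factors <- med_ratio(depth)
factors  # s2 was sequenced twice as deep as s1, so its factor is twice as large
```

Dividing each column by its factor puts the samples on a common depth scale.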
References
Robinson MD and Oshlack A (2010). A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biology 11, R25
Robinson MD, McCarthy DJ and Smyth GK (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26
Examples
vcf.file.path <- paste0(path.package("rCNV"), "/example.raw.vcf.gz")
vcf <- readVCF(vcf.file.path)
df <- hetTgen(vcf, "AD-tot", verbose = FALSE)
# name a single method: with the default vector of four choices,
# if (method == "TMM") fails with "the condition has length > 1"
norm.fact(df, method = "TMM")
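When method is left at its default inside a function, it holds all four choices. A test such as if (method == "TMM") then sees a condition of length four, which is an error in R 4.2.0 and later. A minimal sketch of the usual base R idiom for this, match.arg(), is shown with a hypothetical function pick_method. It only illustrates the idiom and says nothing about how norm.fact itself is written.

```r
# Hypothetical function showing how match.arg() resolves a vector default
pick_method <- function(method = c("TMM", "TMMex", "MedR", "QN")) {
  # reduces the default vector to its first element and validates user input
  method <- match.arg(method)
  method
}

pick_method()        # the default resolves to "TMM"
pick_method("MedR")  # an explicit choice is passed through
```

Until a function does this internally, callers can avoid the error by always naming one method, for example method = "TMM".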