simulateZINLDA generates sparse count data according to a zero-inflated latent Dirichlet allocation model.

simulateZINLDA(D, V, N, K, Alpha, Pi, a, b)

Arguments

D

number of samples.

V

number of unique taxa.

N

vector of length D containing the total number of sequencing reads per sample.

K

number of latent subcommunities.

Alpha

scalar symmetric hyperparameter of the Dirichlet prior on theta.

Pi

scalar symmetric zero-inflation hyperparameter of the zero-inflated generalized Dirichlet (ZIGD) prior on beta.

a

scalar symmetric hyperparameter of the ZIGD prior on beta.

b

scalar symmetric hyperparameter of the ZIGD prior on beta.
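
The arguments above map onto a generative process that can be sketched in base R. This is only an illustration, not the package's implementation: it assumes a stick-breaking construction of the ZIGD prior in which delta = 1 marks a structural zero drawn with probability Pi and the stick-breaking fractions are Beta(a, b). The helpers `rzigd_sketch` and `rdirichlet_sketch` are hypothetical names introduced here, and the small values of D, V, K and N are chosen only to keep the example quick.

```r
# Hypothetical helper (not part of zinLDA): one draw of taxa probabilities
# from a zero-inflated generalized Dirichlet, ASSUMING a stick-breaking
# construction with Bernoulli(Pi) structural zeros and Beta(a, b) fractions.
rzigd_sketch <- function(V, Pi, a, b) {
  delta <- rbinom(V - 1, size = 1, prob = Pi)    # 1 = structural zero
  Q     <- rbeta(V - 1, shape1 = a, shape2 = b)  # stick-breaking fractions
  frac  <- Q * (1 - delta)                       # zeroed where delta == 1
  remaining <- cumprod(c(1, 1 - frac))           # stick left before each taxon
  beta  <- c(frac, 1) * remaining                # last taxon takes the remainder
  list(beta = beta, delta = c(delta, 0))
}

# Hypothetical helper: symmetric Dirichlet draw via normalized gamma variates.
rdirichlet_sketch <- function(K, Alpha) {
  g <- rgamma(K, shape = Alpha, rate = 1)
  g / sum(g)
}

set.seed(1)
D <- 4; V <- 20; K <- 3
N <- c(100, 120, 80, 150)                        # reads per sample, length D
Alpha <- 0.1; Pi <- 0.4; a <- 0.05; b <- 10

draws <- lapply(seq_len(K), function(k) rzigd_sketch(V, Pi, a, b))
beta  <- do.call(rbind, lapply(draws, `[[`, "beta"))    # K x V
delta <- do.call(rbind, lapply(draws, `[[`, "delta"))   # K x V
theta <- t(replicate(D, rdirichlet_sketch(K, Alpha)))   # D x K

counts <- matrix(0L, nrow = D, ncol = V)
for (d in seq_len(D)) {
  # Subcommunity assignment for every read, then a taxon for each read.
  z <- sample.int(K, size = N[d], replace = TRUE, prob = theta[d, ])
  w <- vapply(z, function(k) sample.int(V, size = 1, prob = beta[k, ]),
              integer(1))
  counts[d, ] <- tabulate(w, nbins = V)
}
```

Under these assumptions each row of `beta` and `theta` sums to one, every entry of `beta` flagged by `delta` is exactly zero, and the row sums of `counts` reproduce `N`.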

Value

A list containing the following elements:

cohort

D-length list of character vectors containing the taxa assigned to each sequencing read in each microbial sample.

z

D-length list of vectors containing the subcommunity assignments for each sequencing read in each microbial sample.

sampleTaxaMatrix

matrix of counts, analogous to an OTU matrix.

theta

matrix of subcommunity probabilities per sample.

beta

matrix of taxa probabilities per subcommunity.

delta

matrix of structural zero indicators for each taxon and subcommunity.
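
The relationship between the per-read `cohort` element and the `sampleTaxaMatrix` counts can be illustrated with a toy example. This sketch assumes that rows index samples and columns index taxa, and the taxa labels are made up for illustration; the labels and orientation used by the package may differ.

```r
# Toy stand-in for the `cohort` element: one character vector per sample,
# holding the taxon assigned to each read (labels are hypothetical).
cohort <- list(
  c("taxon1", "taxon3", "taxon1", "taxon2"),
  c("taxon2", "taxon2", "taxon3")
)

# Fix a common set of taxa so every sample gets the same columns.
taxa <- sort(unique(unlist(cohort)))

# Tabulate reads per taxon within each sample; vapply returns taxa x samples,
# so transpose to get samples x taxa (assumed orientation).
counts <- t(vapply(cohort, function(reads) {
  as.integer(table(factor(reads, levels = taxa)))
}, integer(length(taxa))))
colnames(counts) <- taxa
```

Each row of `counts` sums to the number of reads in the corresponding sample, which is the role played by `N` in the arguments.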

Examples

N.d = rdu(n = 50, min = 10000, max = 15000)
simData = simulateZINLDA(D = 50, V = 100, N = N.d, K = 5, Alpha = .1, Pi = 0.4, a = .05, b = 10)