Based on my own experience, I don’t have super strong feelings about this, but I know it is a contentious issue in the microbial NGS world, largely because of the McMurdie and Holmes (2014) PLoS Computational Biology paper, which questioned the validity and necessity of rarefaction. In theory, species richness should go up with more sampling, so if individual samples vary a lot in their total number of sequences, rarefying them all to a common depth (typically the sequence count of the shallowest sample) is one way to account for this. A lot of sequence information gets thrown away, however, so before doing that, I suggest you plot the relationship between the number of sequences per sample and OTU richness per sample. If there is no positive relationship, sequencing depth is not likely an important driver of OTU richness.

The safest current way to proceed is to analyze the data twice, once with and once without rarefaction. In most cases, I have found the ecological answer is pretty similar. You can also check out the approach of Tedersoo et al. (2014) Science, who used the residuals of the relationship between OTU richness and sequence number per sample as the input to their ecological analyses. By using the residuals instead of the raw OTU richness values, they effectively account for any effect of sequencing depth ahead of additional ecological analyses.
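
To make the options above concrete, here is a minimal Python sketch of the three steps: checking depth against richness, rarefying to the shallowest sample, and taking residuals from a depth-richness fit. Everything in it is an assumption for illustration: the toy `otu_table`, the helper names (`richness`, `rarefy`, `fit_line`, `depth_residuals`), and the plain straight-line fit on untransformed depth. It is not the exact procedure of either paper, and in practice you may prefer an established package and a transformed depth axis.

```python
import random


def richness(counts):
    """Number of OTUs with at least one read in a sample."""
    return sum(1 for c in counts if c > 0)


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from one sample."""
    # Expand the count vector into one entry per read, labelled by OTU index.
    reads = [otu for otu, c in enumerate(counts) for _ in range(c)]
    out = [0] * len(counts)
    for otu in rng.sample(reads, depth):
        out[otu] += 1
    return out


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def depth_residuals(depths, rich):
    """Richness minus the value predicted from sequencing depth."""
    slope, intercept = fit_line(depths, rich)
    return [r - (intercept + slope * d) for d, r in zip(depths, rich)]


# Hypothetical toy OTU table: sample name -> read counts per OTU.
otu_table = {
    "s1": [120, 40, 10, 0, 3, 0, 1, 0],
    "s2": [60, 25, 0, 5, 0, 0, 0, 0],
    "s3": [200, 80, 30, 12, 6, 2, 1, 1],
    "s4": [30, 20, 5, 0, 0, 1, 0, 0],
}

names = sorted(otu_table)
depths = [sum(otu_table[s]) for s in names]
raw_richness = [richness(otu_table[s]) for s in names]

# Step 1: is richness related to depth? A clearly positive slope is a warning sign.
slope, _ = fit_line(depths, raw_richness)
print("slope of richness on depth:", slope)

# Step 2: rarefy every sample to the depth of the shallowest sample.
rng = random.Random(1)  # fixed seed so the subsampling is repeatable
min_depth = min(depths)
rarefied = {s: rarefy(otu_table[s], min_depth, rng) for s in names}
rarefied_richness = [richness(rarefied[s]) for s in names]

# Step 3: alternative to rarefying, use residuals of richness on depth.
resid = depth_residuals(depths, raw_richness)

for s, raw, rar, res in zip(names, raw_richness, rarefied_richness, resid):
    print(s, raw, rar, round(res, 3))
```

You would then run the same downstream ecological analysis on the raw richness, the rarefied richness, and the residuals, and see whether the answer changes.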
