Addressing big data challenges in exposure science
Determining potential health risks associated with chemical exposures requires analytical techniques capable of determining the identity and concentration of contaminants in environmental samples. Mass spectrometry (MS), a technique that measures the mass-to-charge ratio of ionized molecules (usually after separation by chromatography), is the workhorse of environmental exposure analysis because it offers the high sensitivity and selectivity required to detect contaminants in complex environmental samples. When the components of a mixture are known, MS can provide critically important data on the concentration of individual contaminants through so-called "targeted" analysis. MS also provides a means of identifying the components of a mixture from the acquired data, without prior knowledge of their identity, through a technique known as "non-targeted" analysis. The combination of targeted and non-targeted analysis provides a framework for identifying and subsequently quantifying exposure-relevant contaminants in environmental media. However, full realization of this framework is currently stymied by the large number of possible chemical contaminants (e.g., more than 82,000 commercially relevant substances registered under the Toxic Substances Control Act), limited tools for the high-throughput identification of contaminants by MS, and a paucity of approaches for integrating large-scale MS measurements with exposure-relevant metadata (e.g., time of sampling, sample location, type of media sampled).
Research in the Analytical Cheminformatics group of the Falk Foundation Environmental Exposomics Laboratory seeks to address these challenges through the development of cheminformatics tools designed specifically for assessing environmental exposures. We are actively developing software tools for the non-targeted identification of organic contaminants using MS data in concert with meta-analysis of known environmental contaminants. These approaches provide rational and efficient prioritization of compounds for in-depth follow-up by targeted analysis. Furthermore, through consultations and formal collaborations, the Analytical Cheminformatics group serves as a resource to researchers, from the Duke community and beyond, who face challenges in designing and implementing exposomics studies.
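As a rough illustration of what a non-targeted matching step can look like, the sketch below compares a measured m/z value against a small list of known contaminants and ranks the candidates that fall within a mass-accuracy window. It is not the group's software: the candidate list, the function names, the assumption of singly protonated [M+H]+ ions, and the 5 ppm tolerance are all hypothetical choices made for illustration, and the monoisotopic masses are approximate values rounded to four decimal places.

```python
# Minimal sketch of accurate-mass matching for non-targeted screening.
# All names, the candidate list, and the tolerance are illustrative assumptions.

PROTON_MASS = 1.007276  # Da; mass added when forming an [M+H]+ ion

# Approximate neutral monoisotopic masses (Da) of a few known contaminants.
CANDIDATES = {
    "caffeine": 194.0804,
    "atrazine": 215.0938,
    "bisphenol A": 228.1150,
}


def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def match_feature(measured_mz, candidates=CANDIDATES, tol_ppm=5.0):
    """Return (name, ppm error) pairs within tolerance, best match first.

    Assumes every candidate is observed as a singly charged [M+H]+ ion.
    """
    hits = []
    for name, mono_mass in candidates.items():
        theoretical_mz = mono_mass + PROTON_MASS
        err = ppm_error(measured_mz, theoretical_mz)
        if abs(err) <= tol_ppm:
            hits.append((name, err))
    # Smallest absolute error first, so the closest candidate is prioritized.
    return sorted(hits, key=lambda hit: abs(hit[1]))


if __name__ == "__main__":
    for name, err in match_feature(195.0877):
        print(f"{name}: {err:+.2f} ppm")
```

A mass match alone only narrows the list of candidates. In practice, additional evidence such as retention time, isotope pattern, and fragmentation spectra would be needed before a compound is prioritized for targeted analysis.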