ASBMB recommends clearer guidance on metabolomics data sharing

In letter to the NCI, society calls for policies that won’t burden scientists, standardized but flexible data collection and sharing practices, and continued community engagement
Mallory Smith
Jan. 9, 2023

The American Society for Biochemistry and Molecular Biology sent a letter to the National Cancer Institute on Dec. 30 regarding how to support privacy, reproducibility and harmonization of metabolomics data in alignment with the new National Institutes of Health Data Management and Sharing Policy.

When the NCI published a request for information titled “Soliciting Input on the Use and Reuse of Cancer Metabolomics Data” in October, the ASBMB was eager to ask its members about their experiences and to share their concerns about the NIH’s data-management and -sharing policy.

Briefly, in its letter, the ASBMB told the NCI that (1) -omics research produces large, complex data sets that threaten to burden many scientists under the NIH policy; (2) because metabolomics research is so diverse, data-sharing practices need both standardization and flexibility to be maximally effective; and (3) high variability in sample preparation, data collection, software, metabolite nomenclature and more makes the reuse and integration of metabolomics data extremely difficult.

Compliance must not burden investigators

The NIH data-management and -sharing policy, effective Jan. 25, aims to enable validation, promote data reuse and provide public access to NIH-funded research.

Rick Page of Miami University, chair of the ASBMB Public Affairs Advisory Committee, said that this effort is “noble and laudable but has the potential to require onerous data annotation efforts in order to yield useful publicly shared data.”

His concern arises from the NIH policy’s broad definition of scientific data, which must be of “sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications.”

For scientists who perform metabolomics and -omics research, this could be a tall order.

The society recommended that the NIH issue more guidance on what level of data and information is required for compliance. The organization noted that it is imperative that the clarifications be “sufficiently flexible” to accommodate the diverse methods used in metabolomics research and their technical limitations.

Standardize, but stay flexible too

For data management and sharing to be maximally effective, there must be some standardization of data formats, nomenclature and metadata. For metabolomics data sets, standardization will be a challenge.

Metabolomics research most commonly is conducted using either mass spectrometry or nuclear magnetic resonance. These techniques are highly sensitive to experimental parameters, so variations in media, sample preparation, instrumentation and more can affect the results of an experiment significantly. Without sufficient metadata to tell other scientists how the data were collected and analyzed, the data sets are of little use.

To ensure data have proper metadata without undue burden, the society recommended that repositories require “a reasonable degree of metadata that is standardized in format and interoperable with international standards.”

Metabolomics involves collecting a snapshot of millions of molecules varying in chemical structures, chimeric states and chemical modifications. Distinguishing one molecule from the others is no small task, let alone naming each one.

Many identifier formats and databases exist for naming molecules, including InChIKey, SMILES, PubChem, ChemSpider, ChEBI and several others, but a molecule still can have multiple names, which complicates the deposition and retrieval of metabolomics data.

The ASBMB said that standardization of nomenclature across scientific fields would be beneficial, but interoperability of naming formats should be prioritized.

Additionally, metabolomics is a rapidly evolving field, and standardizations run the risk of being outdated quickly. The society called for the NCI and the NIH to structure data repositories to “accommodate new technologies and incorporate new functionalities with ease.”

Data diversity and complexity hinder metabolomics reuse

Andrew Lane, a professor at the University of Kentucky, agreed with the goal of the NIH’s data-sharing policy, stating that data in metabolomics should be “easily retrievable and understandable to nonexperts.” But he had some concerns about how to translate metabolomics data successfully into biological significance.

Pathway analysis and enrichment software can be “unnecessarily reductive in its assumptions,” Lane said. This type of software is designed to analyze complex data sets and output the metabolic pathways that may be upregulated or downregulated. However, the results may be misleading. To help ensure metabolomics data are shared and reused responsibly, the society recommended that these software packages clearly communicate to users that their outputs require additional validation.

The NCI requested feedback on researchers’ experiences integrating metabolomics data into multiomics studies, to which Lane said: “It is critical that the data are carefully managed and highly interoperable between multiple -omics data streams to ensure the output isn’t misleading or overly reductive.”

He clarified that to do this for each tissue, cancer type and specialized metabolism within an organism, the NIH must be prepared for “horrendous complexity.”

To increase reuse of metabolomics data by nonexperts, the society recommended that repositories be required to provide thorough instructions on how to properly retrieve, process and analyze metabolomics data sets to ensure they are utilized correctly.

Lane explained that the complexity of metabolomics makes standardization and centralized deposition a challenge.

“Developing a system that is effective for everyone is actually very difficult,” Lane said. He applauded the efforts of researchers at the University of California, San Diego, in developing a workable databank system for metabolomics, the Metabolomics Workbench, but noted that some issues related to depositing tracer data remain.

Let’s stay in touch

The society credited the NCI for soliciting input from the scientific community on cancer metabolomics data management and sharing but encouraged continued engagement.

“Decisions on these policies must consider both the utility of deposited data and the financial and time costs associated with meeting the final requirements” and should not be rushed, the society wrote.

The society asked the NCI to convene a summit for direct and candid discussions with investigators, journals and industry on setting standards and implementing them in research workflows.

This will ensure policymakers strike a balance between delivering on the goals of the new NIH data-management and -sharing policy and implementing it in a way that is compatible with current scientific infrastructure.

The ASBMB and its members also hope to gain clarity on how federal science agencies and research institutes plan to support the infrastructure necessary for effective data management and sharing, such as funding for repositories. This type of support is critical for public access to scientific data but has yet to be defined clearly by policymakers.

Mallory Smith

Mallory Smith earned her Ph.D. in biochemistry and molecular biology from the University of Kansas Medical Center and held a postdoc at the National Institutes of Health before joining ASBMB as a science policy manager. She is passionate about improving the STEM workforce pipeline, supporting early-career researchers, and advocating for basic science at the institutional, local and national level. Smith is chair of the National Postdoctoral Association Advocacy Committee.
