Bio-IT World Toward a Predictive Model for a Cell

Toward a Predictive Model for a Cell

Last month, a really nice piece of systems biology work was published in the journal Cell, in which researchers developed a predictive model for a free-living cell, in this case the archaeon Halobacterium salinarum NRC-1. What’s more, the authors suggest that even though their model is for a relatively simple organism (~2400 genes), the approach used to build it can probably be used to tackle more complex organisms.

The authors of the paper, “Environmental and Gene Regulatory Influence Network (EGRIN): A Predictive Model for Transcriptional Control of Physiology in a Free Living Cell,” say they used a data-driven discovery approach to determine regulatory and functional interrelationships among roughly 80 percent of NRC-1’s genes.

“Using relative changes in 72 transcription factors and 9 environmental factors (EFs) this model accurately predicts dynamic transcriptional responses of all these genes in 147 newly collected experiments representing completely novel genetic backgrounds and environments – suggesting a remarkable degree of network completeness. Using this model we have constructed and tested hypotheses critical to this organism’s interaction with its changing hypersaline environment. This study supports the claim that the high degree of connectivity within biological and EF networks will enable the construction of similar models for any organism from relatively modest numbers of experiments.”

It is perhaps unsurprising that much of the work was done at the Institute for Systems Biology (ISB). It was led by current ISB researcher Nitin Baliga and by Richard Bonneau, a former ISB researcher now at the Center for Comparative Functional Genomics, New York University.

Indeed, this was a classic systems biology exercise, as espoused by ISB president and founder Lee Hood (see What Is Systems Biology?), who was among the report’s authors. The work involved global measurements (genome-wide); quantitative and dynamic measurements; careful system perturbation (genetic and environmental); integrating different data types; and of course adherence to the systems biology cycle of perturbation-measurement-model-hypothesis-perturbation.

There were several hurdles. For example, roughly 38 percent of NRC-1’s genes had little or no functional assignment. The group incorporated functional relationships from comparative genomics as well as predicted structural and domain similarities until achieving “nearly 90 percent…meaningful association with either a characterized protein, a protein family or a structural fold.” Similar techniques were used to boost the number of putative transcription factors.

Two hundred sixty-six microarray experiments were used to construct the networks, and a separate set of 147 microarray experiments was used to validate model predictions. Network construction was based on the “Inferelator algorithm” (catchy name), developed in large measure by Bonneau. The authors note that the number of experiments required was relatively modest given the EGRIN model’s high accuracy. They suggest this points to the interdependence of many networks and, at least for metabolism, to the possibility that cells usually function in one or a few dominant states.
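To make the build-then-validate design concrete, here is a minimal sketch in Python. It is not the Inferelator and not the authors' code. It assumes, purely for illustration, that one gene's expression can be modeled as a sparse linear function of 72 transcription-factor levels and 9 environmental-factor levels, fits that model by L1-penalized regression on 266 simulated "training" experiments, and scores it on 147 simulated "validation" experiments. All data are synthetic, and every name in it (fit_sparse_linear, true_w, the choice of three driving predictors, the penalty value) is hypothetical.

```python
import random

def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the L1 proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def fit_sparse_linear(X, y, penalty, sweeps=30):
    """Coordinate-descent lasso: minimise (1/2n)*||y - Xw||^2 + penalty*||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlate predictor j with the residual, adding back its own contribution.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_w = soft_threshold(rho, penalty) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_w
    return w

def predict(X, w):
    return [sum(x * c for x, c in zip(row, w)) for row in X]

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

rng = random.Random(0)
N_TF, N_EF = 72, 9            # predictor counts quoted in the article
N_TRAIN, N_TEST = 266, 147    # experiment counts quoted in the article
p = N_TF + N_EF

# Hypothetical ground truth: the target gene responds to two TFs and one EF.
true_w = [0.0] * p
true_w[3], true_w[40], true_w[75] = 1.5, -1.0, 0.8

def simulate(n):
    """Synthetic experiments: random predictor levels plus measurement noise."""
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [sum(x * c for x, c in zip(row, true_w)) + rng.gauss(0, 0.3) for row in X]
    return X, y

X_train, y_train = simulate(N_TRAIN)
X_test, y_test = simulate(N_TEST)   # never seen during fitting

w = fit_sparse_linear(X_train, y_train, penalty=0.1)
selected = [j for j, c in enumerate(w) if c != 0.0]
r = pearson(predict(X_test, w), y_test)

print("predictors kept:", selected)
print("held-out correlation:", round(r, 3))
```

The point of the sketch is the separation of roles: the penalty keeps only a handful of the 81 candidate regulators, and the quality of the model is judged only on experiments that played no part in fitting it.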

“What is powerful about this approach is that it took under six years to move from genome sequence to this level of understanding for a relatively poorly-studied organism. Indeed, it would be significantly quicker to implement the same approach with a newly sequenced organism given that much of the scientific methods including experimental procedures, algorithms, and software have been delineated through our study,” write the authors.

Of course, there’s still work to be done on EGRIN. Many other regulatory mechanisms – small RNAs, epigenetic modifications, post-translational modifications, metabolite-based feedback – are not included and may account, at least in part, for its failure to predict what 20 percent of the genes are doing.
———————–
This article first appeared in Bio-IT World’s Systems Biology newsletter.