Research Overview

In the Baliga Lab at ISB we are reverse engineering biological circuits to understand how cells adapt to new environments.

All organisms have a remarkable capacity to process complex changes and adjust their behavior to best suit their environment. The cascade of events from sensing an environmental change to triggering a response comprises many layers of intricate information-processing pathways, such as signal transduction, pre- and post-transcriptional regulation, post-translational regulation, and allosteric modulation of enzyme function. Information is propagated in a controlled manner through all of these layers to effect a systemic change in cell behavior. We are developing and applying systems approaches to understand how these complex processes operate. These studies are conducted using simple prokaryotic model organisms that are easy to culture, manipulate and interrogate in the laboratory.

Our long-term vision is to use predictive mathematical gene regulatory network models to drive strategies for engineering designer circuits for a variety of biotechnological applications such as bioenergy, bioremediation and medicine.

Model Organisms

Our choice of model organisms is dictated by the biological question being investigated. Accordingly, research in our laboratory is conducted on a wide range of model organisms including halophilic archaea, hypersaline algae, oceanic diatoms, methanogens, hyperthermophiles, sulfur metabolizing organisms, yeast, mouse and human. While we are experts in the biology of many of these organisms, for those outside our forte we have established collaborations with domain experts who have a deep understanding of the biology of each organism and the characteristics of the environment in which it lives.

Halophiles

Research in our lab is focused on two halophilic archaea: Halobacterium salinarum NRC-1 and Haloarcula marismortui ATCC 43049.

Briefly, halophilic archaea dominate hypersaline environments, such as the Great Salt Lake, the Dead Sea and South San Francisco Bay, using robust physiologies that are appropriately tuned to the environment through signal transduction and gene regulatory networks. With streamlined genomes, up to 87% of which is protein-coding, these organisms offer an incredible opportunity to understand, at a systems level, the mechanisms underlying environmental response systems.

To facilitate these studies we have determined the complete genome sequences for both of these organisms and have developed an array of genome scale strategies tailored to analyzing their biology. 

Using these powerful tools we are applying systems approaches to dissect the complete sets of metabolic and gene regulatory networks that together modulate cell behavior in changing and often stressful environmental conditions.

Several reasons influenced our choice of halophilic archaea as our model system(s). First, little is known about these organisms, which forces us to take a data-driven approach to studying them. Consequently, our approaches do not rely on decades of carefully conducted gene-by-gene research, and as such they should also be applicable to characterizing most organisms on this planet.

Second, a lot of fascinating biology exists in these environments, and we see this as a wealth of untapped biotechnology potential.

Third, these prokaryotic organisms have small genomes and are therefore relatively simple and tractable: engineered genes can readily be spliced in and out of the chromosome or placed on extra-chromosomal elements.

Finally, from an evolutionary standpoint these organisms are similar to bacteria in their genome organization and gene regulatory mechanisms but closer to eukaryotes with respect to their core genetic information processing mechanisms. This provides yet another opportunity to investigate the fascinating amalgamation of gene regulation strategies in Archaea, components of which have thus far been studied separately in Bacteria and in Eukaryotes.

Halobacterium species NRC-1 Genome

 

  Show all domain search results in SBEAMS for Halobacterium sp. NRC-1 (new window)

  Explore data with the Halobacterium Gaggle (new window)

  Data Download


 

Bonneau R, Baliga NS, Deutsch EW, Shannon P, Hood L.
Comprehensive de novo structure prediction in a systems-biology context for the archaea Halobacterium sp. NRC-1.
Genome Biology 2004, 5(8):R52-68.
PMID: 15287974 [PubMed - indexed for MEDLINE]

Ng WV, Kennedy SP, Mahairas GG, Berquist B, Pan M, Shukla HD, Lasky SR, Baliga NS, Thorsson V, Sbrogna J, Swartzell S, Weir D, Hall J, Dahl TA, Welti R, Goo YA, Leithauser B, Keller K, Cruz R, Danson MJ, Hough DW, Maddocks DG, Jablonski PE, Krebs MP, Angevine CM, Dale H, Isenbarger TA, Peck RF, Pohlschroder M, Spudich JL, Jung KW, Alam M, Freitas T, Hou S, Daniels CJ, Dennis PP, Omer AD, Ebhardt H, Lowe TM, Liang P, Riley M, Hood L, DasSarma S.
Genome sequence of Halobacterium species NRC-1.
Proc Natl Acad Sci U S A 97(22):12176-81

We would like to hear from you. If you have additions, corrections, or other comments, please click here to submit them.

Data Download

SEQUENCES


Derived Proteins (FASTA format)

Genome Sequence (FASTA format)
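
The FASTA files listed above can be read with a few lines of code. The following is a minimal sketch, assuming the standard FASTA layout (a header line starting with ">" followed by one or more sequence lines). The record names and sequences in the example are made up for illustration and are not taken from the download files.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return a {record id: sequence} mapping from FASTA-formatted lines."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the id.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines are concatenated until the next header.
            records[current_id] += line
    return records


# Hypothetical example records (not real data from the download files).
example = """>geneA hypothetical protein
MKTAYIAK
QRQISFVK
>geneB another hypothetical protein
MSLLTEVE
""".splitlines()

proteins = parse_fasta(example)
for name, seq in proteins.items():
    print(name, len(seq))
```

For whole-genome files, a streaming parser or an established library may be preferable to holding every sequence in a dictionary.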

RAW ANNOTATION INPUT DATA


The following files are data products from protein domain matching software. The files were loaded into the database and used in combination to annotate the proteins.

PFAM Search Summary (tab-delimited text)

TMHMM / SignalP (tab-delimited text)

Rosetta results (Mammoth Summary)

GENE COORDINATES


The following files contain gene names, start/stop coordinates, and directional orientation.

Main Chromosome Coordinates

PNRC100 Coordinates (Nomenclature)

PNRC200 Coordinates
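
As a rough illustration of how such coordinate files might be used, the sketch below parses a tab-delimited table and computes gene lengths. The column names (name, start, stop, orientation), the orientation labels and the example rows are assumptions made for this example; the actual files may use a different header and column order.

```python
import csv
import io

# Hypothetical layout: the real files' header and column order may differ.
example_tsv = (
    "name\tstart\tstop\torientation\n"
    "geneA\t101\t400\tFor\n"
    "geneB\t900\t451\tRev\n"
)


def read_coordinates(handle):
    """Parse tab-delimited gene coordinates into a list of dicts."""
    genes = []
    for row in csv.DictReader(handle, delimiter="\t"):
        start, stop = int(row["start"]), int(row["stop"])
        genes.append({
            "name": row["name"],
            # Normalise so that low <= high regardless of strand.
            "low": min(start, stop),
            "high": max(start, stop),
            "orientation": row["orientation"],
        })
    return genes


def gene_length(gene):
    """Length in base pairs, counting both end coordinates."""
    return gene["high"] - gene["low"] + 1


genes = read_coordinates(io.StringIO(example_tsv))
lengths = {g["name"]: gene_length(g) for g in genes}
print(lengths)
```

Normalising start and stop up front means downstream code does not need to care whether reverse-strand genes are stored with start greater than stop.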

IMAGES


Halobacterium NRC-1 Genome Map (PDF format)

Microarray Design


Spotted Expression Microarray Platform

PUBLISHED DATASETS


A predictive model for transcriptional control of physiology in a free living cell.

RNA expression data

Lambdas (tab-delimited text)

Log10 Ratios (tab-delimited text)

The anatomy of microbial cell state transitions in response to oxygen.

Protein data

p-Values (tab-delimited text)

Log10 Ratios (tab-delimited text)

RNA expression data

Lambdas (tab-delimited text)

Log10 Ratios (tab-delimited text)

General transcription factor specified global gene regulation in archaea.

Protein-DNA interactions (ChIP-chip)

Lambdas (tab-delimited text)

Log10 Ratios (tab-delimited text)

RNA expression data

Lambdas (tab-delimited text)

Log10 Ratios (tab-delimited text)

An integrated systems approach for understanding cellular responses to gamma radiation.

Lambdas (tab-delimited text)

Log10 Ratios (tab-delimited text)

A systems view of haloarchaeal strategies to withstand stress from transition metals.

Lambdas (tab-delimited text)

Log10 Ratios (tab-delimited text)

Systems Level Insights Into the Stress Response to UV Radiation in the Halophilic Archaeon Halobacterium NRC-1.

Lambdas (tab-delimited text)

Log10 Ratios (tab-delimited text)
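
When working with the "Log10 Ratios" files listed above, it is often convenient to convert values back to fold changes. The sketch below assumes each value is the base-10 logarithm of an experimental-to-reference ratio; that interpretation, the function name and the example values are assumptions for illustration and should be checked against the individual datasets.

```python
def fold_change(log10_ratio: float) -> float:
    """Convert a log10 ratio back to a linear fold change."""
    return 10.0 ** log10_ratio


# Hypothetical values, not taken from any published dataset.
example = {"geneA": 0.30103, "geneB": -1.0, "geneC": 0.0}

changes = {gene: fold_change(value) for gene, value in example.items()}
for gene, change in changes.items():
    print(f"{gene}: {change:.2f}-fold")
```

Under this assumption a log10 ratio of about 0.3 corresponds to a two-fold increase, and a value of -1 to a ten-fold decrease.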

 

Haloarcula marismortui Genome

  Compare Haloarcula and Halobacterium Genomes (new window)

  Browse Domain Search Results (new window)

  Explore Data with the Haloarcula Gaggle (new window)

  Download Data

Genome sequence of Haloarcula marismortui: A halophilic archaeon from the Dead Sea

Baliga, N.S., R. Bonneau, M.T. Facciotti, M. Pan, G. Glusman, E.W. Deutsch, P. Shannon, Y. Chiu, R.S. Weng, R.R. Gan, P. Hung, S.V. Date, E. Marcotte, L. Hood, and W.V. Ng.
Genome Res. 2004 14: 2221-2234.

Data Download

SEQUENCES

The following files contain the finished genome and derived proteins.

Genome Sequence  (FASTA format)

Derived Proteins  (FASTA format)

 

ANNOTATIONS

The following file contains a snapshot of current annotations.
All Annotations (tab-delimited text)

 

RAW ANNOTATION INPUT DATA

The following files are data products from protein domain matching software. The files were loaded into the database and used in combination to annotate the proteins.

PFAM Search Summary  (tab-delimited text)

IPRScan Summary (tab-delimited text)

GENE COORDINATES

Lists individual genes with start and stop coordinates, and directional orientation.

Coordinate File  (tab-delimited text)

 

IMAGES

The following file contains an image of the entire genome.

Haloarcula marismortui Genome Map

Hyperthermophiles

A hyperthermophile is an organism that thrives in extremely hot environments, from 60°C (140°F) upwards. The optimal temperature for hyperthermophiles is above 80°C (176°F). Hyperthermophiles are a subset of extremophiles and are mainly micro-organisms within the domain Archaea, although some bacteria are also able to tolerate temperatures of around 100°C (212°F). Many hyperthermophiles are also able to withstand other environmental extremes such as high acidity or radiation levels.

Hyperthermophiles were first discovered by Thomas D. Brock in 1969, in hot springs in Yellowstone National Park, Wyoming. Since then, more than fifty species have been discovered. The hardiest hyperthermophiles yet discovered live on the superheated walls of deep-sea hydrothermal vents and require temperatures of at least 90°C for survival. An extraordinarily heat-tolerant hyperthermophile is the recently discovered Strain 121, which was able to double its population within 24 hours in an autoclave at 121°C (hence its name); the current record growth temperature is 122°C, for Methanopyrus kandleri.

Although no hyperthermophile has yet been discovered living at temperatures above 122°C, their existence is very possible (Strain 121 survived being heated to 130°C for two hours, but was not able to reproduce until it had been transferred into a fresh growth medium, at a relatively cooler 103°C). However, it is thought unlikely that microbes could survive at temperatures above 150°C, as the cohesion of DNA and other vital molecules begins to break down at this point.

[Content from Wikipedia]


Pyrococcus furiosus DSM 3638

Pyrococcus furiosus is an extremophilic species of Archaea. It is notable for having an optimum growth temperature of 100°C (a temperature which would destroy most living organisms), and for being one of the few organisms identified as possessing enzymes containing tungsten, an element rarely found in biological molecules.

 

Methanogens

Methanogens are microorganisms that produce methane as a metabolic byproduct in anoxic conditions. They are classified as archaea, a group quite distinct from bacteria. They are common in wetlands, where they are responsible for marsh gas, and in the guts of animals such as ruminants and humans, where they are responsible for the methane content of belching in ruminants and flatulence in some humans. In marine sediments biomethanation is generally confined to where sulfates are depleted, below the top layers. Others are extremophiles, found in environments such as hot springs and submarine hydrothermal vents as well as in the "solid" rock of the Earth's crust, kilometers below the surface.

Methanogens are usually coccoid (spherical) or bacilli (rod shaped). There are over 50 described species of methanogens, which do not form a monophyletic group, although all methanogens belong to Archaea. Methanogens are also anaerobic. Although methanogens cannot function under aerobic conditions they can sustain oxygen stresses for a prolonged time. Methanosarcina barkeri is exceptional in possessing a superoxide dismutase (SOD) enzyme, and may survive longer than the others. Some methanogens, called hydrogenotrophic, use carbon dioxide (CO2) as a source of carbon, and hydrogen as a reducing agent. Some of the CO2 is reacted with the hydrogen to produce methane, which produces an electrochemical gradient across a membrane, used to generate ATP through chemiosmosis. In contrast, plants and algae use water as their reducing agent. Methanogens lack peptidoglycan, a polymer that is found in the cell walls of the Bacteria but not Archaea. Some methanogens have a cell wall that is composed of pseudopeptidoglycan. Other methanogens do not, but have at least one paracrystalline array (S-layer) made up of proteins that fit together like a jigsaw puzzle.

[Content from Wikipedia]


Methanococcus maripaludis S2

Methanococcus maripaludis (Latin "mare" meaning sea, "palus" meaning marsh) is a model species among the methanogenic Archaea. Originally characterized by W. J. Jones, the species was the predominant methanogen isolated from a salt-marsh sediment in South Carolina, United States. Numerous additional isolates were obtained by W. Whitman, including strain S2, also known as strain LL. M. maripaludis is strictly anaerobic, hydrogenotrophic (growing on hydrogen and carbon dioxide) and nitrogen-fixing, and is a mesophilic relative of the hyperthermophilic Methanococcus jannaschii. Cells are irregular cocci with weak motility. M. maripaludis is an excellent laboratory model because of its rapid, reliable growth, a complete genome sequence, a robust set of genetic tools, and ongoing studies with expression arrays and proteomics.

[Content from the home page of the John Leigh Lab at UW]

 

Sulfur Metabolizing

Sulfolobus solfataricus P2

Sulfolobus is a genus of microorganisms in the family Sulfolobaceae. It belongs to the domain Archaea. Sulfolobus species grow in volcanic springs, with optimal growth occurring at pH 2-3 and temperatures of 75-80°C, making them acidophiles and thermophiles respectively. Sulfolobus cells are irregularly shaped and flagellated. Species of Sulfolobus are generally named after the location from which they were first isolated, e.g. Sulfolobus solfataricus was first isolated in the Solfatara (volcano). Other species can be found throughout the world in areas of volcanic or geothermal activity, such as geological formations called mud pots, which are also known as solfatare (plural of solfatara).

Sulfolobus proteins are of interest for biotechnology and industrial use due to their thermostable nature. One application is the creation of artificial derivatives from S. acidocaldarius proteins, named affitins. Intracellular proteins are not necessarily stable at low pH however, as Sulfolobus species maintain a significant pH gradient across the outer membrane. Sulfolobales are metabolically dependent on sulfur: heterotrophic or autotrophic, their energy comes from the oxidation of sulfur and/or cellular respiration in which sulfur acts as the final electron acceptor. For example, S. tokodaii is known to oxidize hydrogen sulfide to sulfate intracellularly.

[Content from Wikipedia]

Sulfolobus solfataricus has been found in many areas, including Yellowstone National Park, Mount St. Helens, Iceland, Italy, and Russia, to name a few. Sulfolobus occurs almost wherever there is volcanic activity. These organisms thrive in environments where the temperature is about 80°C, the pH is about 3, and sulfur is present.

[Content from MicrobeWiki]


Desulfovibrio vulgaris Hildenborough

Desulfovibrio is a rod-shaped anaerobic bacterium with a 3 Mbp genome. Desulfovibrio is known for its flexibility in the variety of electron acceptors it utilizes including sulfate, sulfur, nitrate, and nitrite among others. Species of Desulfovibrio have long been of interest as bioremediators, with the ability to reduce several toxic metals such as uranium (VI), chromium (VI) and iron (III).

 

 

ENIGMA


Project Title:  Systems approach in a multi-organism strategy to understand biomolecular interactions in DOE-relevant organisms

Funding: U.S. Department of Energy, ENIGMA (Environment and Networks Integrated with Genomes and Molecular Assemblies)

Rational reengineering of biology for the purpose of bioremediation, bioenergy or C-sequestration requires deep understanding of all functional interactions of relevant components within native cell(s). Many of these functional interactions are conserved across diverse species to different degrees depending on their evolutionary distance. We are conducting integrative analysis of genomic architecture and composition, transcriptome and proteome structure/function, protein-protein and protein-DNA interactions and metabolic networks to find keystone complexes and specialized circuit architectures for important application-relevant genes. These studies are readily generalizable to any organism as they are being developed using diverse organisms with important biological and evolutionary relevance to several DOE mission goals. Specifically, these organisms have enormous potential from the standpoint of H2 production, N2 fixation, and C-sequestration; they include a heterotrophic halophile (Halobacterium salinarum NRC-1), an anaerobic thermophile (Pyrococcus furiosus DSM 3638), an acidophilic and aerobic thermophile (Sulfolobus solfataricus P2), a hydrogenotrophic methanogen (Methanococcus maripaludis S2), and a photoheterotrophic halophile (Halobacterium salinarum NRC-1). In particular, we have developed an integrated systems approach to rapidly reverse engineer a gene regulatory network model (Environment & Gene Regulatory Influence Network, EGRIN) for any organism. An EGRIN model can predict responses of an organism to endogenous and environmental stressors and can be used as a powerful springboard to assign gene function, identify complexes, and rationally re-engineer networks for specific applications such as bioenergy and bioremediation.

You can reach computational and experimental results from this study and access all of the software tools developed in this project by visiting the ENIGMA portal (http://baliga.systemsbiology.net/enigma).

Gaggle software tools can be accessed at the Gaggle website and at the DOE/Archaea page.

 

H2 Regulation


Project Title:

Hydrogen regulation and global responses to electron, carbon, and nitrogen sources in Methanococcus maripaludis

PIs:

 John A. Leigh, University of Washington, leighj@u.washington.edu

William B. Whitman, University of Georgia

Murray Hackett, University of Washington

Nitin Baliga, Institute for Systems Biology

Funding: US Department of Energy Office of Basic Energy Sciences, Basic Research for the Hydrogen Fuel Initiative, Award No. DE-FG02-08ER64685

Project Goals:

1.  Use transcriptomics, proteomics, and metabolomics to study the systems biology of H2 metabolism, formate metabolism, nitrogen fixation, and carbon assimilation in Methanococcus maripaludis.

2.  Determine the mechanism of H2 sensing and transcriptional regulation by H2.

Background:

Methanogenic Archaea (methanogens) catalyze the critical, methane-producing step in the anaerobic decomposition of organic matter and have applications in carbon-neutral fuel production. Most species of methanogens are hydrogenotrophic and use hydrogen gas (H2) as the electron donor for the reduction of carbon dioxide to methane. In addition, many species can use formate in place of H2, and a few can use certain alcohols. These microorganisms contain very high levels of different types of hydrogenases and consume H2 at very high rates. In addition, under certain conditions the H2 uptake system can be induced to produce H2 at high rates; this occurs with formate as electron donor. As another way to produce H2, certain species of methanogens fix nitrogen, and therefore have the potential to produce H2 using the nitrogenase system. Finally, most hydrogenotrophic methanogens are autotrophs, and assimilate CO2 by the acetyl-CoA pathway. Hence, the biology of hydrogenotrophic methanogens is relevant to potential bio-energy applications from the points of view of H2 production, nitrogen fixation, and carbon assimilation.

We are engaged in a long-term effort to understand regulatory networks in hydrogenotrophic methanogens, members of the Archaea whose energy metabolism specializes in the use of H2 to reduce CO2 to methane. Our studies focus on Methanococcus maripaludis, a model species with good laboratory growth characteristics, facile genetic tools, and a tractable genome of 1722 annotated ORFs. A key aspect of our approach is the use of continuous culture for maintaining defined nutrient conditions (Haydock et al., 2004).

For more information, analysis and results see H2 Regulation pages.

Software & Algorithms

We are utilizing systems approaches to investigate fundamental biological questions such as cellular responses to environmental and genetic perturbations. To support this effort we develop and utilize software tools and algorithms along the entire path of systems analysis, from data acquisition to synthesis of biological understanding. We have divided these efforts into six (overlapping) categories, which are listed below along with the specific software tools and algorithms that fall under each category.

1. Interactive Integration and Exploration

 The Gaggle is a framework for exchanging data between independently developed software tools and databases to enable interactive exploration of systems biology data.
The Firegoose toolbar connects the Gaggle to the web. By downloading and installing this extension into your Firefox browser you can broadcast data between the Gaggle and web resources.

2. Data Analysis and Statistical Modeling

 cMonkey learns context-specific (condition-dependent) modules of co-regulated genes by integrating (a) gene expression data, (b) de novo detection of cis-regulatory DNA motifs, and (c) connectivity in functional association or physical interaction networks.

3. Automated Inference and Data Integration

MeDiChI is a method for the automated, model-based deconvolution of protein-DNA binding data (chromatin immunoprecipitation followed by hybridization to a genomic tiling microarray, or ChIP-chip) that discovers DNA binding sites at high resolution.

Inferelator identifies the most probable regulatory influencers (environmental factors and/or transcriptional regulators) of each bicluster and can be used to make dynamical predictions of bicluster responses under novel experiments.

4. Data Visualization

 The Genome Browser is a software tool for visualizing high-density data plotted against coordinates on the genome. Tiling arrays, ChIP-chip, and high-throughput sequencing are a few potential uses.
 

5. Data Acquisition and Management

GWAP is an experimental data archive which is searchable by a rich set of metadata. 
 

6. Cloud Computing

CSpotRun

 CSpotRun allows us to run hundreds of instances of bioinformatics algorithms (or any other computational task) in the cloud, inexpensively and without loss of data.

 

 

 

 

 

Systems Approach

In a systems approach, the various cellular networks are perturbed by changing environmental conditions or directly perturbing the network by removing genes or modifying their function. One can then measure consequences of these changes as they reverberate throughout the cellular networks. Some changes that can be measured using current technologies include mRNA levels, protein abundance, protein-protein interactions, protein-DNA interactions, protein modifications and metabolite concentrations.

 

The ultimate goal is to integrate and process all these measurements to formulate mathematical models that recapitulate all previous observations and predict new behavior in face of novel environmental perturbations.

 

In a systems approach, biological questions drive the development of new technologies that, in turn, generate large amounts of new kinds of system-wide measurements of biological network properties. The analysis of these data requires the development of new software tools and algorithms that extract meaningful biological insights, yielding a systems-level understanding of cellular responses.

Marine Systems: from single cells to global biogeochemical cycles.

We are specifically interested in identifying and understanding at a systems level how phytoplankton communities and processes affect the chemical composition of organic compounds in the marine environment and how these compounds in-turn influence microbial and phytoplankton ecology.


Projects
A systems biology approach of diatom response to ocean acidification and climate change

The oceans contribute 40-50% of the total photosynthesis on Earth, driving the "biological pump" in the surface oceans, which exports carbon to the deep sea where it is sequestered. If the pump stopped, the concentration of CO2 in the atmosphere would double. The pH of the world oceans is predicted to decrease by 0.5 units by 2050 as a result of increasing atmospheric CO2 due to anthropogenic activity. The goal of this research is to understand the impact of these environmental perturbations on the contribution of diatoms to carbon cycling using a model system. Diatoms are the most productive phytoplankton group in the world oceans, accounting for about 40 percent of marine primary production; they form the basis of food webs in coastal and upwelling systems, support important fisheries and have a major role in carbon as well as silica cycling. In this work we focus on the influence of ocean acidification and high temperature stress on carbon cycling. Specifically, we are characterizing, at molecular and cellular levels using a systems approach, the influence of ocean acidification and temperature stress on carbon fixation in the model diatom Thalassiosira pseudonana.
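
To put the projected 0.5-unit pH decrease in perspective: pH is the negative base-10 logarithm of the hydrogen ion activity, so a drop of 0.5 units corresponds to roughly a three-fold rise in hydrogen ion concentration. The short sketch below works through that arithmetic; the function name is ours, chosen for illustration, and activity is treated as concentration for simplicity.

```python
def hydrogen_ion_fold_increase(ph_drop: float) -> float:
    """Factor by which [H+] rises when pH falls by ph_drop units.

    Follows from pH = -log10([H+]), treating activity as concentration.
    """
    return 10.0 ** ph_drop


factor = hydrogen_ion_fold_increase(0.5)
print(f"A 0.5-unit pH drop raises [H+] about {factor:.2f}-fold")
```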

 

Arctic Systems

Marine microgels: A microlayer source of summer CCN in high Arctic open leads

There is no region on Earth where climate change is manifesting faster than in the Arctic. Models projecting future climate are the most uncertain in this region. Global climate is intimately connected to variability in sea ice, open ocean biogeochemical cycling and circulation, atmospheric radiation, and clouds over the Arctic Ocean. The goal of this research is to understand the influence of marine biological sources on aerosol particle production or growth. Specifically, we focus on understanding the sources of microgels to gain a mechanistic understanding of CCN (cloud condensation nuclei) formation and bio-radiative coupling.

 

Phytoplankton, Archaea Interactions

Phytoplankton produce 50% of total global organic carbon (C), and Archaea account for 40% of the microbial biomass in the world oceans and 20% of the total biomass; however, their interaction is not well understood. We are interested in mechanistically understanding their coupled physiologies and cycling of nutrients in the context of the microbial loop paradigm, and the implications for understanding the structure of complex aquatic ecosystems.