Pyrococcus furiosus DSM 3638

RefSeq: NC_003413, 1908256 bp

NCBI taxonomy ID: 186497

Transcriptome structure

Analysis of growth samples of P. furiosus DSM 3638 identified 1305 transcriptional units, of which 398 were polycistronic and 907 were monocistronic (Table 1). Only 13 of 70 unannotated transcripts showed sequence similarity to genes in hyperthermophilic archaea (Pyrococcus and Thermococcus) and bacteria (Thermotoga).

P. furiosus has a 1.9 Mb genome containing at least 2065 ORFs (Robb et al., 2001). The size and number of predicted ORFs can vary considerably depending on the automatic annotation pipeline, and the current GenBank annotation (RefSeq) has 60 additional questionable ORFs, such as PF0736.1n. Of the 127 ORFs that were not present in the original GenBank annotation of P. furiosus (Poole et al., 2005), tiling array analysis identified at least 58, including 12 that do not overlap any current GenBank ORF and two transcripts antisense to annotated ORFs. This demonstrates the capability of high-resolution strand-specific tiling microarrays to detect unannotated and antisense transcripts (Fig 1A).

In P. furiosus, CRISPR RNA (crRNA) and RAMP module Cas proteins (Cmr) form a ribonucleoprotein complex that binds and cleaves complementary RNA rather than DNA (Hale et al., 2009). Another six cas genes are located directly adjacent to the cmr genes; however, these two gene sets showed conditional regulation throughout the growth stages (Fig 1B). Interestingly, cas genes at the other locus were constantly up-regulated. This study is the first to demonstrate genome-wide regulation of the CRISPR/Cas system.

Conditional use of alternative TSSs was observed in a large putative operon (mbh1-14) encoding the membrane-bound hydrogenase (MBH), which is primarily responsible for producing H2 during growth (Jenney and Adams, 2008; Sapra et al., 2000) (Fig 1C). The divergent ORFs mbh1 and PF1422, the latter encoding thioredoxin reductase, share in their upstream regions a palindromic DNA-binding motif (GTTn3AAC, marked by asterisks) recognized by SurR, a transcriptional regulator of hydrogen and elemental sulphur metabolism (Lipscomb et al., 2009). Strikingly, mbh1 was coexpressed with surR (r = 0.95), while the expression of PF1422 was anti-correlated with that of surR (r = -0.94). In a previous in vitro transcription assay, addition of SurR resulted in high expression of mbh1 only (Lipscomb et al., 2009). Taken together, these results suggest that SurR can act as a transcriptional activator of mbh1 and a transcriptional repressor of PF1422 through the same SurR DNA-binding site.
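As an illustration of how the SurR motif GTTn3AAC could be located in an upstream region, the following minimal Python sketch scans a sequence with a regular expression. The function name and the example sequence are invented for illustration and are not taken from the P. furiosus genome or from the analysis described here. Because the motif is its own reverse complement, scanning one strand is sufficient.

```python
import re

# SurR palindrome as written in the text: GTT, any three bases, AAC.
# The lookahead lets overlapping occurrences be reported as well.
SURR_MOTIF = re.compile(r"(?=(GTT[ACGT]{3}AAC))")

def find_surr_sites(seq):
    """Return (0-based start, matched 9-mer) for each GTTn3AAC occurrence."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in SURR_MOTIF.finditer(seq)]

# Invented sequence for illustration only (not real genomic data).
example = "ATATGTTTGAAACCTTAGTTCCCAACTA"
for start, site in find_surr_sites(example):
    print(start, site)
```

Positions are reported relative to the start of the sequence passed in, so they would need to be offset to obtain genomic coordinates.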

Table 1. Overview of the Transcriptome Structure of P. furiosus DSM 3638 

Figure 1. Examples of discoveries made through tiling array analysis of dynamic changes in transcriptome structure of P. furiosus DSM 3638.

(A) Identification of misannotation. PF0736.1n, encoding a hypothetical protein, was not expressed (P = 0.16), while its opposite strand was expressed and could encode 74 consecutive amino acids. This sequence may be the 5’ UTR of PF0736 or a novel ORF. Note that PF0736.1n was reported to be expressed in a PCR-based microarray study (Poole et al., 2005), which illustrates the value of strand-specific expression measurement with a high-resolution tiling microarray.

(B) CRISPR/Cas system. In the Pyrococcus CRISPR-Cas system, the guide crRNAs were suggested to be processed and translocated to the Cmr complex by the ribonuclease Cas6. (a) In tiling array experiments, the cas6 and cmr gene clusters formed separate transcript units but were highly coexpressed. (b) Their expression was also highly correlated with all seven CRISPRs and with many computationally predicted small nucleolar RNAs (snoRNAs) (r > 0.9). Unexpectedly, the adjacent core cas genes (cas1, cas4, cas5t, cas6) and other cas genes (cst1 and cst2) were differently expressed, which implies conditional breaks in operon organization during cellular responses to differing environments. (c) While the core cas gene cluster at locus #1 was down-regulated throughout the culture conditions, cas genes at locus #2 were up-regulated (r ~ 0.4).

(C) Conditional regulation of the MBH operon. Segmentation analysis identified three alternative TSSs located upstream of mbh1-9, encoding a putative Na+/H+ antiporter; of mbh10-12 (homologs of subunits (hycG and hycE) of the E. coli hydrogenase 3 complex); and of mbh13-14 (homologs of E. coli hycD and hycF). Interestingly, based on predicted transmembrane domains, Mbh1-9 and Mbh10-12 were suggested to be located in the membrane and cytoplasm, respectively (Holden et al., 2001). On the opposite strand of the MBH operon, there appeared to be at least two ncRNAs, and one of them, located at the 3’ end of the MBH operon, was correlated with surR (r = 0.95). The TSS of mbh1 was previously determined using primer extension (Lipscomb et al., 2009), and the TSSs mapped by the two independent methodologies lay within 19 nucleotides of each other. The TSS for PF1422 was located 40 bp from the annotated start site, and another putative start site was located near the TSS, which implies a misannotated boundary for this gene.
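The r values quoted for coexpression (for example, r = 0.95 between mbh1 and surR) can be read as correlation coefficients between expression profiles measured across the growth samples. The source does not state which correlation measure was used; the sketch below assumes Pearson correlation, and the gene labels and expression values in it are invented purely for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations, and the two root sums of squares
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sx * sy)

# Invented log-ratio profiles over five hypothetical time points.
regulator = [0.1, 0.4, 0.9, 1.3, 1.6]
target_up = [0.0, 0.5, 1.0, 1.2, 1.8]    # rises with the regulator
target_down = [1.5, 1.1, 0.6, 0.3, 0.1]  # falls as the regulator rises

print(round(pearson_r(regulator, target_up), 2))
print(round(pearson_r(regulator, target_down), 2))
```

A value near +1 indicates coexpression and a value near -1 indicates anti-correlation. Neither by itself establishes direct regulation, which is why the text combines the correlations with the shared binding motif and the in vitro assay.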

 

P. furiosus DSM 3638 Resources

Data for this project can be accessed through the following software and databases that were generated through DOE funding.

Gaggle Genome Browser (ID/PW: maggie/archaea; Note: this link downloads large data files (~60-190MB)). The Gaggle Genome Browser (GGB) is software developed by the Baliga Lab for visualizing systems biology data organized by genomic coordinates. You can learn more about GGB by going here. You will find extensive information on data formats and features, along with demos and screencasts on how to use this software. Once you have launched the GGB software prepackaged with P. furiosus DSM 3638 data, you can browse the annotated and curated information on transcriptome structure as follows: (1) Right-click on the bookmarks file and save it to your desktop; (2) In GGB, click on Bookmarks>Load Bookmarks to load this file; (3) Finally, click on Bookmarks>Show Bookmarks; the bookmarks should appear as a new pane on the right-hand side. GGB can communicate with other software in the Gaggle framework; for more information about Gaggle go here. Additional software tools for analyzing P. furiosus DSM 3638 data in Gaggle can be found here. [Using GGB: go here for information on how to interpret the information contained in the various tracks. IMPORTANT: Please go here to make sure your computer is set up correctly for using this software.]

References