{"id":145,"date":"2019-08-01T23:24:41","date_gmt":"2019-08-01T23:24:41","guid":{"rendered":"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/?page_id=145"},"modified":"2019-12-02T03:37:22","modified_gmt":"2019-12-02T03:37:22","slug":"subtyping-melanoma","status":"publish","type":"page","link":"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/subtyping-melanoma\/","title":{"rendered":"Melanoma Gene Regulatory Network"},"content":{"rendered":"\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">Brooke Ury and Myles Vinh Farr<\/h4>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">Goal<\/h4>\n\n\n\n<p>In this project, our goal was to subtype melanoma using genome-wide expression analysis.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">Background<\/h4>\n\n\n\n<p style=\"text-align:left\">Melanoma is a type of skin cancer generally caused by a combination of factors, including UV exposure and genetics. The number of cases per year continues to grow, with more than 200,000 cases expected next year in the US alone. It is considered the most serious type of skin cancer because it metastasizes readily (spreads to other organs and parts of the body). However, cancers arise from an accumulation of mutations, so not all melanoma patients have the same molecular profile. In this analysis, we attempted to subtype melanoma by creating gene regulatory networks using MINER, a Python program developed at ISB.<\/p>\n\n\n\n<p>We used data from the GDC (Genomic Data Commons), a part of the National Institutes of Health (NIH). Specifically, it came from the project TCGA-SKCM (The Cancer Genome Atlas, Skin Cutaneous Melanoma). The data came in the form of FPKM values; FPKM stands for fragments per kilobase of transcript per million mapped reads. 
It measures how many mRNA fragments were sequenced for each gene, normalized for gene length and sequencing depth.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">Processing Data<\/h4>\n\n\n<p><span style=\"font-weight: 400\">The raw data covered 470 patients, with approximately 60,000 FPKM values per patient, one for each of roughly 60,000 genes. However, the majority of the FPKM values were zero. Humans have thousands of genes, but no cell expresses all of them. After all, genes for smelling would not be expressed in a red blood cell. Our goal was to find the genes whose FPKM values varied widely and were generally greater than 1. Ideally, we only wanted to analyze genes that were being expressed in patients, and that were being expressed at different levels (if all melanoma patients had an FPKM value of 10 for a certain gene, it would not help us distinguish types of melanoma). We tried multiple methods to process the data, including coefficient of variation, entropy, and a threshold filter. 
Ultimately, we decided to eliminate a gene from our analysis if fewer than 30% of the FPKM values for that gene were above 3.<\/span><\/p>\n\n\n<div class=\"wp-block-columns has-2-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"795\" height=\"282\" src=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/Threshold-2.png\" alt=\"\" class=\"wp-image-731\" srcset=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/Threshold-2.png 795w, https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/Threshold-2-300x106.png 300w, https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/Threshold-2-768x272.png 768w\" sizes=\"auto, (max-width: 795px) 100vw, 795px\" \/><\/figure>\n\n\n\n<p><\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"799\" height=\"285\" src=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/COV_Entropy-2.png\" alt=\"\" class=\"wp-image-732\" srcset=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/COV_Entropy-2.png 799w, https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/COV_Entropy-2-300x107.png 300w, https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/COV_Entropy-2-768x274.png 768w\" sizes=\"auto, (max-width: 799px) 100vw, 799px\" \/><\/figure>\n\n\n\n<p><\/p>\n<\/div>\n<\/div>\n\n\n\n<p>Figure 1. Filtering and refining. 
(A) The raw data (with many zeros). (B) Data after filtering with a minimum FPKM threshold. (C, D) Alternative filtering attempts (coefficient of variation and information entropy). CV failed because more highly expressed genes had lower variation. Entropy failed because noise made it hard to choose a threshold.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">Clustering<\/h4>\n\n\n\n<p style=\"text-align:left\">After we processed the data, we used MINER to cluster the genes. Genes are clustered together if they are coexpressed, meaning that their patterns of overexpression and underexpression across patients are similar.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"296\" src=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/clustering-1024x296.png\" alt=\"\" class=\"wp-image-728\" srcset=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/clustering-1024x296.png 1024w, https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/clustering-300x87.png 300w, https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/clustering-768x222.png 768w, https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/clustering.png 1268w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Figure 2. Clustering using PCA and K-means. (A) A random selection of genes before clustering. 
(B) The first three clusters, showing how the genes within a cluster are coexpressed.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">Network mapping\/mechanistic inference<\/h4>\n\n\n\n<p>After the initial clusters were created, they were compared to known transcription factor-gene interactions from a database. The database records each transcription factor and the genes it regulates. If the genes in a cluster matched the genes controlled by a transcription factor, those genes were kept and grouped into a regulon. Additionally, each regulon was assigned one of three values for each patient: 1 if the regulon was upregulated, -1 if it was downregulated, or 0 if it was approximately average.<\/p>\n\n\n\n<figure class=\"wp-block-image is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/network-activity.png\" alt=\"\" class=\"wp-image-733\" width=\"233\" height=\"220\" srcset=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/network-activity.png 453w, https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/network-activity-300x283.png 300w\" sizes=\"auto, (max-width: 233px) 100vw, 233px\" \/><\/figure>\n\n\n\n<p>Figure 3. Network activity. The coexpression clusters were compared to known transcription factor-gene interactions to create coregulation clusters (regulons). Next, regulons for each patient were assigned one of three values: 1 (red\/upregulated), -1 (blue\/downregulated), or 0 (white\/null).<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">Subtype discovery<\/h4>\n\n\n\n<p>After regulons were discovered, the next step using MINER was to create subtypes based on the data. First, patients were compared to each other to create initial clusters of similar patients. 
Then MINER\u00a0compared those initial clusters to known transcriptional programs. It determined which transcriptional programs\u2019 regulation changed significantly between the initial clusters, producing more refined subtypes.<\/p>\n\n\n\n<div class=\"wp-block-columns has-2-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<figure class=\"wp-block-image is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/similarity-matrix.png\" alt=\"\" class=\"wp-image-734\" width=\"207\" height=\"187\" srcset=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/similarity-matrix.png 504w, https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/similarity-matrix-300x271.png 300w\" sizes=\"auto, (max-width: 207px) 100vw, 207px\" \/><\/figure>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<figure class=\"wp-block-image is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/subtypes.png\" alt=\"\" class=\"wp-image-735\" width=\"209\" height=\"192\" srcset=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/subtypes.png 495w, https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/subtypes-300x276.png 300w\" sizes=\"auto, (max-width: 209px) 100vw, 209px\" \/><\/figure>\n<\/div>\n<\/div>\n\n\n\n<p>Figure 4. Disease subtypes and survival through patient similarity. (A) Initial patient groups formed by comparing patients\u2019 molecular profiles. 
(B) Final subtypes, showing clusters of regulons (programs) and how patients group together through these transcriptional programs.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">Survival by Subtype<\/h4>\n\n\n\n<p>After the subtypes of melanoma were determined, we moved away from MINER and wrote our own code to estimate survival for each of the eight melanoma subtypes. We used the Kaplan-Meier estimator, which estimates the fraction of patients still alive at each point in time while accounting for patients whose follow-up ended before a death was observed (censoring).<\/p>\n\n\n\n<figure class=\"wp-block-image is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/survival.png\" alt=\"\" class=\"wp-image-736\" width=\"296\" height=\"285\" srcset=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/survival.png 574w, https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/survival-300x289.png 300w\" sizes=\"auto, (max-width: 296px) 100vw, 296px\" \/><\/figure>\n\n\n\n<p>Figure 5. Survival of melanoma patients by subtype. Risk was estimated with Kaplan-Meier curves for each subgroup. Notably, certain groups are at much higher or lower risk than the population as a whole (blue line).<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">Results<\/h4>\n\n\n\n<p>Although some of the subgroups had what appeared to be very similar molecular profiles, the survival of these groups was actually quite different. 
For example, subgroups 1 and 2 seem very similar at first glance, but one subgroup\u2019s survival curve was much higher than the average for all patients, while the other\u2019s was much lower.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">Future Directions<\/h4>\n\n\n\n<p>In the future, we hope to study the gene regulatory network more extensively from a biological standpoint. We want to understand why specific mutations in certain genes caused the differences in survival.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<p>Check out our GitHub repository here: <a href=\"https:\/\/github.com\/isbinternship\/MelanomaExpression\/tree\/master\">https:\/\/github.com\/isbinternship\/MelanomaExpression\/tree\/master<\/a><\/p>\n\n\n\n<div class=\"wp-block-columns has-2-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"768\" src=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/poster-1-1024x768.jpg\" alt=\"\" class=\"wp-image-764\" srcset=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/poster-1-1024x768.jpg 1024w, https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/poster-1-300x225.jpg 300w, https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/08\/poster-1-768x576.jpg 768w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"802\" height=\"602\" 
src=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/10\/ISBPoster.png\" alt=\"\" class=\"wp-image-928\" srcset=\"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/10\/ISBPoster.png 802w, https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/10\/ISBPoster-300x225.png 300w, https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-content\/uploads\/sites\/7\/2019\/10\/ISBPoster-768x576.png 768w\" sizes=\"auto, (max-width: 802px) 100vw, 802px\" \/><\/figure>\n<\/div>\n<\/div>\n\n\n\n<p>Figure 6. Brooke and Myles with their completed poster.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">Acknowledgments<\/h4>\n\n\n\n<p>We would like to thank the SEE program, Claudia Ludwig, and Rachel Calder for facilitating this amazing learning opportunity and for their continued support; Nitin S. Baliga for the resources; Adrian Lopez Garcia de Lomana and Jacob J. Valenzuela for their mentorship; Maryann Ruiz and Alex Carr for helping us in the lab; Anne Gillies, Ellie Rider, and Michael Walker for leading us through the program; Matt Wall for the development of MINER; Andrew Liu and James Park for helping us understand MINER; and all the other high school interns for keeping things fun and giving us endless support.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Brooke Ury and Myles Vinh Farr Goal In this project, our goal was to subtype melanoma using genome-wide expression analysis. Background Melanoma is a type of skin cancer generally caused by a combination of factors, including UV exposure and genetics. 
The number of cases per year continues to grow,&nbsp; with more than 200,000 cases expected [&hellip;]<\/p>\n","protected":false},"author":53,"featured_media":318,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-145","page","type-page","status-publish","has-post-thumbnail","hentry","post","post-with-thumbnail","post-with-thumbnail-large"],"_links":{"self":[{"href":"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-json\/wp\/v2\/pages\/145","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-json\/wp\/v2\/users\/53"}],"replies":[{"embeddable":true,"href":"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-json\/wp\/v2\/comments?post=145"}],"version-history":[{"count":19,"href":"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-json\/wp\/v2\/pages\/145\/revisions"}],"predecessor-version":[{"id":996,"href":"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-json\/wp\/v2\/pages\/145\/revisions\/996"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-json\/wp\/v2\/media\/318"}],"wp:attachment":[{"href":"https:\/\/baliga.systemsbiology.net\/see-interns\/hs2019\/wp-json\/wp\/v2\/media?parent=145"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}