---
title: "Transcriptome Systemic Analysis"
author: "Md. Tabassum Hossain Emon"
output:
  html_notebook:
    toc: yes
    toc_float: yes
    includes:
      after_body: footer.html
    theme: flatly
  html_document:
    df_print: paged
---

<br>
<br>

<h1 align='center'><strong>Gene Expression Exploration in Model & Other Plant Species</strong></h1>

<br>

<br>


**Overall Workflow:**	

- Identifying the phenotypic window in which to look for a change in a **Gene of Interest**;
- Identifying the experimental studies that have the strongest expression level for a **Gene of Interest**;
- Differentiating between condition-independent and condition-dependent co-expression networks, and sorting genes expressed in certain conditions;
- From a list of query genes and their corresponding expression values, determining which reactions differentially occur and which pathways they are part of;
- Identifying interactions with a protein (or list of proteins) of interest;
- Formulating a gene regulatory network with many promoter hits among the genes of interest, and making predictions, e.g. of possible interolog relationships.


<br>








<br><br><br><br>

******

<h2 align="center">Expression Visualization From the Developmental & Seed Map</h2>

<br><br>

**Organism** 	*Arabidopsis thaliana*

<br>

If we don’t see a phenotype under “normal” growth conditions for a mutant line of a
particular gene (e.g. a T-DNA insertion line), the expression data for that gene can suggest where, or under which conditions, to look for a phenotype. I therefore queried my **Gene of Interest, ABI3**, in the virtual ePlant. Its expression peaks in **Seeds stage 9 w/o siliques** (1249.81).
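
The same lookup can be done on an exported table of expression values. The snippet below is a minimal sketch: the dictionary layout and the helper name `peak_sample` are assumptions, and only the seed value (1249.81) comes from the query above; the other entries are placeholders.

```python
# Hypothetical expression table: sample name -> expression level.
# Only the "Seeds stage 9 w/o siliques" value comes from the text;
# the other two entries are made-up placeholders.
expression = {
    "Seeds stage 9 w/o siliques": 1249.81,
    "Rosette leaf (placeholder)": 12.0,
    "Flower (placeholder)": 35.5,
}


def peak_sample(levels):
    """Return (sample, level) for the sample with the highest expression."""
    sample = max(levels, key=levels.get)
    return sample, levels[sample]


top_sample, top_level = peak_sample(expression)
print(f"{top_sample}: {top_level}")
```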

<br>
<p align=center style="color:teal">The map is colored **Yellow-Red** according to the expression level.</p>






![**Fig-1:** Expression on Developmental Map](expression/dev_map.JPG) ![**Fig-2:** Expression Level on Chart](expression/chart_dev.JPG)



<br>

ABI3 is known to be associated with seed dormancy, so strong expression in seeds that have been induced to have secondary dormancy makes perfect biological sense!


<br>
The strip near the top of the screen that gradually changes from white to black is how the Browser depicts which data compendia have the weakest or strongest expression. Here the strongest expression is in the **Seed(max)** compendium.

Expression is very strong in SD1 and SD2, which are secondarily dormant samples, that is, seeds to which a treatment has been applied to make them dormant again.

<br>

![**Fig:** Expression on Seed Map](expression/seed_exp.JPG){width=150%}



<br>
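Looking at expression in the root, there is strong expression in the procambium. The maximum of the expression scale is only **473.65** here, well below the 1249.81 seen in seeds. The colour scale in this view is therefore more sensitive than in the other compendia, so the strong colouring indicates expression that is high relative to the rest of the root rather than higher than in seeds.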
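
To illustrate why the scale maximum matters, here is a minimal sketch that assumes a linear yellow-to-red colour scale. The function `scale_color` and the linear interpolation are assumptions for illustration, not the ePlant's documented implementation; the two maxima are the values quoted in this section.

```python
def scale_color(value, scale_max):
    """Map an expression value to an (R, G, B) colour between yellow and red.

    Assumes a linear scale: 0 -> yellow (255, 255, 0), scale_max -> red (255, 0, 0).
    Values outside the range are clamped.
    """
    if scale_max <= 0:
        raise ValueError("scale_max must be positive")
    fraction = min(max(value / scale_max, 0.0), 1.0)
    green = round(255 * (1.0 - fraction))
    return (255, green, 0)


ROOT_MAX = 473.65   # scale maximum in the root view (from the text)
SEED_MAX = 1249.81  # peak expression in the developmental map (from the text)

# The same expression value is coloured differently depending on the scale maximum.
print("root scale:", scale_color(ROOT_MAX, ROOT_MAX))
print("seed scale:", scale_color(ROOT_MAX, SEED_MAX))
```

On the root scale the value is drawn in full red, while on the seed scale the same value lands well short of red, which is why colours should only be compared within one view.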

ABI3 does have a known role in roots: it is involved in lateral root formation, a role established by genetic analysis. It therefore makes sense for ABI3 to be expressed in this area.

<br>

![**Fig:** Expression on the Root](expression/root_map.JPG){width=150%}









<br><br><br><br><br><br><br><br>


******

<br><br>






<h3 align="center">High-throughput Genevestigator Analysis Across Environmental & Genotypic Perturbations</h3>


<br><br>
In the anatomical collection, **root protoplasts** show very strong ABI3 expression, consistent with the virtual ePlant, followed by strong expression in **seeds**. Samples are grouped by anatomical part as part of the curation process.

<br>
![**Fig:** Expression Anatomy](expression/genevestigator_1.JPG)

<br>



<br><br>


**Genetic Perturbation:** <br>
In the genetic perturbations view, *ABI3* is both most strongly and most weakly expressed in **LEC1** samples. At the top of the list is a **LEC1 overexpression line**, in which *ABI3* expression is dramatically increased. At the bottom of the list is **lec1**, a loss-of-function mutant, in which *ABI3* expression is dramatically decreased. The strong and weak expression in the **LEC1 samples** therefore depends on the kind of genetic manipulation in those particular samples: overexpression in one case, a knockout in the other.
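
A ranking of this kind can be reproduced from paired signal values by sorting on the log2 ratio of perturbed to control expression. The sketch below uses hypothetical numbers (none are taken from Genevestigator), and the helper `log2_ratio` with its pseudocount is an assumption for illustration.

```python
import math


def log2_ratio(treated, control, pseudocount=1.0):
    """log2 of treated/control; the pseudocount avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Hypothetical ABI3 signal values: comparison -> (perturbed, control).
perturbations = {
    "LEC1 overexpression vs wild type": (800.0, 100.0),
    "lec1 loss of function vs wild type": (10.0, 100.0),
    "unrelated mutant vs wild type": (105.0, 100.0),
}

# Sort from most increased to most decreased, like the perturbations list.
ranked = sorted(
    ((name, log2_ratio(t, c)) for name, (t, c) in perturbations.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, ratio in ranked:
    print(f"{ratio:+.2f}  {name}")
```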



<br>
![**Fig:** Expression in Genotypic Perturbation](expression/gen_up.JPG) ![**Fig:** Expression in Genotypic Perturbation](expression/gen_down.JPG)







<br><br><br><br><br><br><br><br>


******

<br><br>


<h1 align="center"><strong>Co-Expression Visualization</strong></h1>

<br><br>


There are a couple of different ways of using co-expression information. If we know that a certain gene of interest is involved in a particular process, we can look at genes that are similarly expressed and identify candidates with annotations like “hypothetical protein” that might also be involved in the process, as a kind of primary screen to identify “new” genes for that process. Similarly, we can look at genes with annotations derived from other contexts and infer that these are also involved in our process of interest.

<br>

Another way to use coexpression analysis is to ask for all genes matching a user-defined expression pattern to be returned, say genes increased in expression in the roots of heat-stressed plants, but not in roots of plants exposed to other environmental stresses. We could then use the promoters of these genes to drive reporter lines or to discover novel cis-elements.


<br><br><br>

<h2 align="center">Co-Expression Under Different Specific Conditions</h2>


<br>

The idea with coexpression analyses in specific conditions is that certain biological processes (like seed development or drought response) have coopted different sets of genes to help a plant develop or respond, and thus different sets of genes would be coexpressed under those specific conditions.
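
As a minimal sketch of this idea, the snippet below ranks genes by Pearson correlation with a query gene, using only the samples from one condition set at a time. Pearson correlation is an assumed similarity measure here (the tool's own metric is not stated in this notebook), and all gene names other than ABI3 and all values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical profiles: gene -> condition set -> expression across samples.
profiles = {
    "ABI3": {"development": [1, 2, 4, 8], "abiotic_stress": [5, 5, 6, 5]},
    "geneA": {"development": [2, 4, 8, 16], "abiotic_stress": [9, 3, 4, 8]},
    "geneB": {"development": [7, 7, 6, 7], "abiotic_stress": [10, 10, 12, 10]},
}


def rank_coexpressed(data, query, condition):
    """Rank all other genes by correlation with the query within one condition set."""
    query_profile = data[query][condition]
    scores = [
        (gene, pearson(query_profile, values[condition]))
        for gene, values in data.items()
        if gene != query
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


for condition in ("development", "abiotic_stress"):
    print(condition, rank_coexpressed(profiles, "ABI3", condition))
```

With these placeholder values a different gene ranks first in each condition set, which is the behaviour the paragraph above describes.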

<br>

<p align=center>Here are the top-ranked genes co-expressed with <strong>ABI3</strong> under developmental stages (1) & abiotic stress conditions (2) </p> 




<br>

![**Fig:** Coexpressed Genes under Developmental Stages](expression/tissue_coex.JPG) ![**Fig:** Coexpressed genes Under Abiotic Stress](expression/coex_net.JPG) ![**Fig:** Coexpressed Network](expression/stress_coex.JPG)












<br><br><br><br><br><br><br><br>


******

<br><br>


<h1 align="center"><strong>Functional Classification & Pathway Enrichment Analysis</strong></h1>

<br><br>


I have used the top 50 Co-expressed genes for *ABI3* across a “Developmental Map” as identified with the Co-Expression analytics.




<br><br><br>

<h2 align="center">Gene Ontology Study</h2>


<br>

The output is a table of enriched GO categories for the list of 50 genes. It includes four GO Biological Process terms (lipid localization, response to abscisic acid stimulus, macromolecule localization, post-embryonic development) and two GO Molecular Function terms (nutrient reservoir activity, lipid binding).

<br>

These terms seem to “make sense” in the context of the later stages of seed development, when *ABI3* and these genes are expressed, insofar as this is the time when lipid reserves are being accumulated and the seed begins to desiccate.




<br>

![**Fig:** Enriched GO Terms](expression/Go_amigo.png) ![**Fig:** GO Molecular Function](expression/Go_mf.png) ![**Fig:** GO Cellular Component](expression/Go_cc.png)


<br>


In the summary of classification terms, categories that are overrepresented relative to the
total number of instances of the term in the overall GO or MapMan database are **BOLDED**.
The relative enrichment is shown on the left, while the absolute number of counts in a given
category is on the right. The colour scheme for the categories is also used in the graphical
section and for the bar code in the Overview Table.
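
Overrepresentation of this kind is commonly assessed with a hypergeometric test. The sketch below is an illustration under stated assumptions, not the classification tool's own calculation: the list size of 50 comes from this analysis, while the hit count, term size and genome size are hypothetical placeholders.

```python
from math import comb


def hypergeom_pvalue(hits, list_size, term_size, genome_size):
    """P(X >= hits) when list_size genes are drawn without replacement from
    genome_size genes, of which term_size are annotated with the term."""
    upper = min(list_size, term_size)
    total = comb(genome_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(genome_size - term_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def fold_enrichment(hits, list_size, term_size, genome_size):
    """Observed fraction in the list divided by the background fraction."""
    return (hits / list_size) / (term_size / genome_size)


LIST_SIZE = 50        # top 50 co-expressed genes (from the text)
HITS = 6              # hypothetical: genes in the list carrying the term
TERM_SIZE = 300       # hypothetical: genes in the genome carrying the term
GENOME_SIZE = 27000   # hypothetical background size

print("fold enrichment:", fold_enrichment(HITS, LIST_SIZE, TERM_SIZE, GENOME_SIZE))
print("p-value:", hypergeom_pvalue(HITS, LIST_SIZE, TERM_SIZE, GENOME_SIZE))
```

When many terms are tested at once, the raw p-values would also need a multiple-testing correction, which this sketch leaves out.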

<br>


![**Fig:** Coexpressed Genes under Developmental Stages](expression/summ_go_1.JPG){width=150%} ![**Fig:** Coexpressed genes Under Abiotic Stress](expression/summ_go_2.JPG){width=150%} ![**Fig:** Coexpressed Network](expression/g_pro.JPG)







<br><br><br><br><br><br><br><br>


******

<br><br>


<h1 align="center"><strong>Biological Network System Analytics</strong></h1>

<br><br>


I have used the top 50 Co-expressed genes for *ABI3* across a “Developmental Map” as identified with the Co-Expression analytics.




<br><br><br>

<h2 align="center">Protein-Protein Interactions in Cytoscape Among Developmental Co-expressed Genes</h2>


<br>

Cytoscape is a useful tool for viewing or inferring gene regulatory networks. Here I retrieved the protein-protein interactions to ask which gene has the most edges within the set of **ABI3 co-expressed genes**. In this case, ABI3 itself has the most documented protein-protein interactions. The **square nodes** depict transcription factor family proteins; the circular nodes are the input genes. **ABI3** is co-expressed with **94%** of the input gene list.
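
The question of which gene has the most edges is a node-degree count. Outside Cytoscape it can be sketched from an exported edge list as below; the edges and the partner names `geneA` to `geneC` are hypothetical placeholders, not the interactions in the figure.

```python
from collections import Counter

# Hypothetical undirected protein-protein interaction edges.
edges = [
    ("ABI3", "geneA"),
    ("ABI3", "geneB"),
    ("ABI3", "geneC"),
    ("geneA", "geneB"),
]


def node_degrees(edge_list):
    """Count how many edges touch each node in an undirected edge list."""
    degrees = Counter()
    for source, target in edge_list:
        degrees[source] += 1
        degrees[target] += 1
    return degrees


degrees = node_degrees(edges)
hub, hub_degree = degrees.most_common(1)[0]
print(f"most connected node: {hub} ({hub_degree} edges)")
```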


<br>


![**Fig:** Coexpressed PPI Network](expression/ppi_abi3.JPG)

<br><br>

<h2 align="center">Gene Regulatory Network</h2>

<br>

I then identified potential gene regulatory networks for the set of co-expressed genes. **ABF4** is a fairly good predicted regulator: it has 21 inferred mappings to input-set promoters and documented protein-DNA interactions with 25% of the input genes.



<br>


![**Fig:** Coexpressed Gene Regulatory Network](expression/ABF4.JPG) ![**Fig:** Coexpressed Gene Regulatory Network](expression/abf4_table.JPG)



<br><br>


**ABI5 (At2g36270)** has experimentally determined protein-DNA interactions with 41% of the input-set promoters. It is also coexpressed with 90% of the input gene set and has PWM (position weight matrix) mappings to 16 input-set promoters, making it a very good candidate regulator.

<br>

![**Fig:** Coexpressed Gene Regulatory Network](expression/ABI5.JPG) ![**Fig:** Coexpressed Gene Regulatory Network](expression/ABI5_t.JPG)


<br><br>





As found in the genetic perturbation analysis, **ABI3** expression is dramatically increased when **LEC1** is overexpressed. I therefore extracted the genes upregulated in LEC1OX plants.

<br>

**Analysis on LEC1-ox Genes with Increased Expression**
<br>
In this case, the majority of genes increased in the **LEC1OX line are metabolic in nature**, at least based on this cursory visual analysis. The network also incorporates known microRNAs and their targets, another aspect of regulation. A snapshot of the Cytoscape graph is shown here, with the Spring Embedded layout and VirtualPlant_Style applied to the network. Metabolites from KEGG or AraCyc are shown as orange nodes; there are many connections to the genes with increased expression levels in the **LEC1OX line**.
The colouring scheme is shown in the legends below the network.

<br>

![**Fig:** Network](expression/network.png) 

![**Fig:** Node Colour Legend](expression/node_color.JPG) ![**Fig:** Node Shape Legend](expression/node_shape.JPG)


<br>

<br>









 

A work by Md. Tabassum Hossain Emon

emon.biotech.10th@gmail.com