Comparative analysis of genomic features in industrial and fast-growing trees: A study of poplar and eucalypt

Document Type: Scientific article

Author

Assistant Professor, Ahar Faculty of Agriculture and Natural Resources, University of Tabriz, Ahar, I. R. Iran

Abstract

Background and objectives: Identifying genetic similarities and gene orthology between species helps in understanding genome evolution and supports conservation and breeding. Comparative genomics can reveal much about genome function in forest trees. Many economically important crop species have been well studied in this respect, but forest trees have received less attention. Comprehensive genome comparisons between the industrial, fast-growing trees poplar (Populus trichocarpa) and eucalypt (Eucalyptus grandis), which share a common ancestor in the Rosids clade, remain relatively limited, even though both species serve as model plants and have up-to-date biological data. This study compares the complete genome sequences of eucalypt and poplar in terms of genome size, chromosome number, gene content, microsatellite markers, and the size of the terpene synthase gene family, and identifies genes related to two traits of interest to forest tree breeders: wood formation and cell wall quality.
Methodology: The whole-genome sequences of eucalypt (E. grandis; NCBI accession GCF_016545825.1) and poplar (P. trichocarpa; NCBI accession GCF_000002775.5) were used. Both species are model plants whose genomes have been assembled at the chromosome level. We compared genome size, chromosome number, total GC content, total gene count, protein-coding genes, small non-coding RNAs (sncRNAs), pseudogenes, and microsatellite sequences between the two species, and constructed a Venn diagram of shared and species-specific genes. Microsatellite sequences were extracted with the Perl-based MISA software, and tandemly duplicated sequences were also extracted from both genomes. The number of terpene synthase gene family members was compared between the two species. Finally, genes related to two traits of interest to breeders, wood formation and cell wall quality, were examined.
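As an illustration of the microsatellite extraction step, the following is a minimal Python sketch of a regex-based scan for perfect tandem repeats. It is not MISA itself and does not reproduce the study's settings: the function name `find_ssrs` and the thresholds in `MIN_REPEATS` (motif length mapped to minimum repeat count) are assumptions chosen only for demonstration.

```python
import re
from collections import Counter

# Hypothetical thresholds: motif length -> minimum number of repeats.
# These are illustrative values, not the parameters used in the study.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect repeats."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One motif copy captured, followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA", "AGAG").
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence containing a mono-, a di-, and a trinucleotide repeat.
toy = "GG" + "A" * 12 + "CC" + "AG" * 7 + "TT" + "ACG" * 5 + "T"
ssrs = find_ssrs(toy)

# Tally repeats by motif length, the quantity compared across species.
by_motif_length = Counter(len(motif) for _, motif, _ in ssrs)
```

On a real genome the same tally by motif length is what would show whether short motifs dominate; imperfect and compound repeats, which dedicated tools handle, are ignored here.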
Results: The eucalypt genome is larger than that of poplar and contains 42,619 genes, of which 33,352 are protein-coding. The poplar genome contains 34,621 genes, of which 29,617 are protein-coding. The number of pseudogenes in the eucalypt genome is 2.9 times higher than in poplar. Eucalypt has 11 chromosomes and poplar has 19. The eucalypt and poplar genomes contain 1,507 and 1,347 small RNAs, respectively. According to the genome annotations available at NCBI, some genes were found only in eucalypt and others only in poplar: the Venn diagram identified 14,484 genes unique to eucalypt and 12,114 unique to poplar, with 9,133 genes shared between the two species. A total of 136,147 microsatellite markers were identified in the eucalypt genome and 77,024 in the poplar genome. Microsatellite sequences make up 3.8 Mb and 10.2 Mb of the eucalypt and poplar genomes, respectively. The eucalypt genome has 1.8 times more microsatellite markers and a 1.2 times greater marker density (total microsatellite length in kilobases divided by genome size in megabases, kb/Mb) than the poplar genome. In total, 4,067 motif types were identified in the eucalypt genome and 2,898 in the poplar genome. Microsatellite frequency was inversely related to motif length in both species: frequency decreased markedly as the number of nucleotides in the motif increased. Accordingly, mono- and dinucleotide microsatellites were the most frequent, whereas eight- and nine-nucleotide microsatellites were the least frequent.
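The marker-count ratio and the density definition given above (kb of microsatellite sequence per Mb of genome) can be written out as a short Python sketch. The two marker counts are the values reported in the abstract; the function name `ssr_density_kb_per_mb` and the inputs passed to it are placeholders for illustration, not measurements from the study.

```python
def ssr_density_kb_per_mb(total_ssr_bp, genome_bp):
    """Total microsatellite length (kb) divided by genome size (Mb)."""
    return (total_ssr_bp / 1_000) / (genome_bp / 1_000_000)


# Microsatellite marker counts reported in the abstract.
eucalypt_ssrs = 136_147
poplar_ssrs = 77_024

# Fold difference in marker number, rounded as in the text.
marker_ratio = round(eucalypt_ssrs / poplar_ssrs, 1)

# Placeholder inputs (not study data): 5 Mb of repeats in a 500 Mb genome.
example_density = ssr_density_kb_per_mb(5_000_000, 500_000_000)
```

Because density normalizes by genome size, a genome can carry more markers in absolute terms while showing a smaller relative difference in density.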
In the comparison of the terpene synthase gene family, 112 genes were identified in eucalypt and 7 in poplar. The number of tandem gene clusters was 3,185 in eucalypt and 2,575 in poplar, and the total number of retained tandem genes in the eucalypt genome was 16% higher than in the poplar genome. The numbers of both functional and non-functional genes in eucalypt also exceed those in poplar. Alternative splicing occurred in a large number of genes related to wood formation in both trees, but with different patterns. A total of 59 candidate genes for cell wall quality were identified for poplar and eucalypt. Insights from such comparative genomics studies have the potential to facilitate plant breeding and conservation genetics.
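The shared and species-specific gene counts reported from the Venn diagram correspond to simple set operations on gene identifiers. The sketch below assumes genes are matched by identical annotation names, which may differ from the matching actually used in the study; the function name `venn_partition` and the gene names are hypothetical toy values.

```python
def venn_partition(genes_a, genes_b):
    """Split two gene-name collections into A-only, shared, and B-only sets."""
    a, b = set(genes_a), set(genes_b)
    return {"only_a": a - b, "shared": a & b, "only_b": b - a}


# Toy gene names for illustration only (not taken from either annotation).
eucalypt_genes = ["geneA", "geneB", "geneC", "geneD"]
poplar_genes = ["geneC", "geneD", "geneE"]

parts = venn_partition(eucalypt_genes, poplar_genes)
counts = {region: len(genes) for region, genes in parts.items()}
```

Name-based matching is sensitive to annotation conventions, so orthology-based matching would generally give different counts.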
Conclusion: Comparative genomics can accelerate tree breeding programs by providing diverse alleles related to important economic and ecological traits, and can also help preserve endangered and genetically distinct species.
