Identification of SNPs in rice GPAT genes and in silico analysis of their functional impact on GPAT proteins

Authors

  • Imran SAFDER State Key Laboratory of Rice Biology and China National Center for Rice Improvement, China National Rice Research Institute, Hangzhou 310006 (CN)
  • Gaoneng SHAO State Key Laboratory of Rice Biology and China National Center for Rice Improvement, China National Rice Research Institute, Hangzhou 310006 (CN)
  • Zhonghua SHENG State Key Laboratory of Rice Biology and China National Center for Rice Improvement, China National Rice Research Institute, Hangzhou 310006 (CN)
  • Peisong HU State Key Laboratory of Rice Biology and China National Center for Rice Improvement, China National Rice Research Institute, Hangzhou 310006 (CN)
  • Shaoqing TANG State Key Laboratory of Rice Biology and China National Center for Rice Improvement, China National Rice Research Institute, Hangzhou 310006 (CN)

DOI:

https://doi.org/10.15835/nbha49312346

Keywords:

3000 Rice Genome project, functional SNPs, in silico analysis, nucleotide variation

Abstract

Single nucleotide polymorphisms (SNPs) are the most common nucleotide variations in the genome. Functional SNPs in the coding region, known as nonsynonymous SNPs (nsSNPs), change amino acid residues and can affect protein function. Identifying functional SNPs is an uphill task because it is difficult to correlate variation with phenotypes in association studies. Computational in silico analysis provides an opportunity to understand the functional impact of SNPs on proteins and facilitates experimental approaches to understanding the relationship between genotype and phenotype. Advances in sequencing technologies have made it possible to sequence thousands of genomes, and many public databases now incorporate these sequence data for exploring nucleotide variation. In this study, we explored functional SNPs in the rice GPAT family (as a model plant gene family) using data from the 3000 Rice Genome Sequencing Project. We identified 1056 SNPs in 26 GPAT genes across a hundred rice varieties and filtered 98 nsSNPs. We further investigated the structural and functional impact of these nsSNPs using various computational tools and shortlisted 13 SNPs with highly damaging effects on protein structure. We found that rice GPAT genes can be influenced by nsSNPs, which might have a major effect on the regulation and function of GPAT genes. This information will be useful for understanding the possible relationships between genetic mutation and phenotypic variation, and their functional implications for rice GPAT proteins. The study also provides a computational pathway for identifying SNPs in other rice gene families.
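The filtering step from all coding SNPs to nsSNPs rests on a simple test: does the base change alter the encoded amino acid? The following is a minimal sketch of that test, not the authors' pipeline. The function name `classify_snp`, the 0-based position convention, and the toy coding sequence are assumptions made for illustration; the abstract does not name the tools used for this step.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(cds, pos, alt):
    """Classify a single-base substitution in a coding sequence.

    cds: in-frame coding sequence (DNA, upper case); hypothetical input.
    pos: 0-based index of the substituted base within cds (assumed convention).
    alt: alternative base.
    Returns (kind, change), e.g. ("nonsynonymous", "A2V").
    """
    offset = pos % 3
    start = pos - offset
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    # Residue numbering is 1-based, as is conventional for protein variants.
    return kind, f"{ref_aa}{start // 3 + 1}{alt_aa}"


# Toy sequence encoding Met-Ala-Glu; not taken from any GPAT gene.
toy_cds = "ATGGCTGAA"
print(classify_snp(toy_cds, 5, "C"))  # GCT -> GCC, third-position change
print(classify_snp(toy_cds, 4, "T"))  # GCT -> GTT, second-position change
```

Only substitutions classified as nonsynonymous would move on to the structural and functional prediction stage described in the abstract; predicting whether a given change is damaging requires dedicated tools and is outside the scope of this sketch.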

Published

2021-09-27

How to Cite

SAFDER, I., SHAO, G., SHENG, Z., HU, P., & TANG, S. (2021). Identification of SNPs in rice GPAT genes and in silico analysis of their functional impact on GPAT proteins. Notulae Botanicae Horti Agrobotanici Cluj-Napoca, 49(3), 12346. https://doi.org/10.15835/nbha49312346

Section

Research Articles