Validating methods for testing natural molecules on molecular pathways of interest in silico and in vitro


Gene expression • Bioinformatics tools • Biochemical pathways • In vitro • Natural molecules


Differentially expressed genes can serve as drug targets and can be used to predict drug response and disease progression. In silico drug analysis based on the expression of these genetic biomarkers allows the detection of putative therapeutic agents that could reverse a pathological gene expression signature. Bioinformatics tools can also improve the accuracy of drug discovery by supporting biomarker identification. Once a drug target is identified, in vitro cell line models of disease are used to evaluate and validate the therapeutic potential of putative drugs and novel natural molecules. This study describes the development of effective PCR primers for measuring the expression of genes in specific pathways, which can support the identification of natural molecules as therapeutic agents acting on those pathways. Genes involved in a range of health conditions and biological processes were considered. In particular, the expression of genes involved in obesity, xenobiotic metabolism, the endocannabinoid pathway, leukotriene B4 metabolism and signaling, inflammation, endocytosis, hypoxia, lifespan, and neurotrophins was evaluated. Measuring the expression of specific genes in different cell lines can be useful for evaluating, in vitro, the therapeutic effects of small natural molecules.
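The idea of reversing a pathological gene expression signature can be illustrated with a minimal sketch. It is not the method used in this study. It assumes that a disease signature and each compound signature are available as log2 fold changes per gene, and it scores a compound by the negative cosine similarity over the shared genes, so that a compound whose profile opposes the disease profile scores close to 1. The gene names, compound names, and values are hypothetical placeholders.

```python
import math


def reversal_score(disease, compound):
    """Negative cosine similarity over shared genes.

    Values near +1 suggest the compound profile opposes the disease
    profile; values near -1 suggest it mimics the disease profile.
    """
    shared = sorted(set(disease) & set(compound))
    if not shared:
        return 0.0
    dot = sum(disease[g] * compound[g] for g in shared)
    norm_d = math.sqrt(sum(disease[g] ** 2 for g in shared))
    norm_c = math.sqrt(sum(compound[g] ** 2 for g in shared))
    if norm_d == 0 or norm_c == 0:
        return 0.0
    return -dot / (norm_d * norm_c)


def rank_compounds(disease, compounds):
    """Return (name, score) pairs sorted from strongest to weakest reversal."""
    scores = {name: reversal_score(disease, sig) for name, sig in compounds.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical log2 fold changes (disease vs. healthy, treated vs. untreated).
disease_signature = {"GENE_A": 2.0, "GENE_B": -1.0, "GENE_C": 1.0}
compound_signatures = {
    "compound_X": {"GENE_A": -2.0, "GENE_B": 1.0, "GENE_C": -1.0},  # opposes disease
    "compound_Y": {"GENE_A": 2.0, "GENE_B": -1.0, "GENE_C": 1.0},   # mimics disease
}

ranking = rank_compounds(disease_signature, compound_signatures)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In such a scheme, top-ranked compounds would only be candidates for the in vitro validation described above; the score itself says nothing about efficacy.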

