 
Ayesha A1, Abdul M1*, Khushi M2, Sajjad2, Daniel P3, Hamid A2 and Irfan U2
1Department of Botany, Hazara University Mansehra, Pakistan
2Department of Biotechnology and Genetic Engineering, Hazara University Mansehra, Pakistan
3Department of Plant Sciences, University of California, Davis, USA
*Corresponding author: Sajjad, Department of Biotechnology and Genetic Engineering, Hazara University Mansehra, Mansehra 21300, Pakistan
Submission: April 16, 2025; Published: July 08, 2025
 
	
ISSN 2637-7082 Volume 5 Issue 3
Quercus L. (Fagaceae) is a taxonomically complex genus with approximately 600 extant species of significant ecological and economic importance. Accurate species identification remains challenging because of frequent hybridization and morphological similarity. This study leverages DNA barcoding as a molecular approach to distinguish Q. floribunda L., a species native to the Himalayan regions of Pakistan. Three chloroplast DNA barcoding loci, rpoB, rbcL and tRNA, were investigated to assess their effectiveness for the identification and phylogenetics of Q. floribunda L. Molecular techniques were employed to evaluate the genetic diversity and evolutionary relationships of Q. floribunda L. The rpoB gene showed the highest discriminatory power and haplotype diversity, with 14 haplotypes detected for rpoB, 5 for rbcL, and 8 for tRNA, and provided insights into relationships between Q. floribunda L. and other Quercus species. Phylogenetic analysis of rpoB placed Q. floribunda L. in a shared ancestral node with Q. variabilis, while rbcL showed high similarity with Q. aucheri. The tRNA spacer revealed evolutionary divergence across Quercus species, classifying sequences into four major clades. In conclusion, all three barcode regions showed considerable diversity among Quercus species, supporting their utility as barcodes for identification. This barcoding study represents the first comprehensive analysis of Quercus from the Himalayan region of Pakistan. The findings contribute to accurate species identification, phylogenetic understanding, and conservation efforts for this ecologically important oak species. This research also highlights the potential of DNA barcoding for resolving taxonomic uncertainties and its application in the biodiversity conservation and management of Quercus and other valuable plant resources.
Keywords: Genetic distance estimation; Quercus; Universal markers; rpoB; rbcL; tRNA; Evolutionary relationship
Quercus L. (Fagaceae) is an evolutionarily diverse genus comprising about 600 species worldwide [1]. It is considered the most widely distributed woody genus in the northern hemisphere [2,3]. Q. floribunda L. is commonly known as holly oak and locally known as bunj or barungi [4,5]. The tree grows up to 20-22 meters tall. The leaves are typically elliptic-ovate to broadly lanceolate and leathery (coriaceous), with entire or spiny-toothed edges [6]. Both sides of the leaves are smooth (glabrous) and green, with an oblique base. The petiole is 0.3 to 1 cm long [7]. The species is differentiated from other members of the genus by its bicolored green leaves [8]. Q. floribunda L. flowers from April to May. Putative hybrids with leaf features resembling both Q. floribunda L. and Q. baloot are sometimes observed [9]. These hybrids may result from cross-pollination between Q. floribunda L. and Q. baloot [10]. Quercus species are primarily distributed in the temperate and sub-temperate areas of the western and northern parts of Pakistan [11]. These areas include the Himalayan foothills and the Karakoram and Hindu Kush ranges, covering Azad Jammu and Kashmir, Khyber Pakhtunkhwa (particularly Abbottabad, Mansehra, Chitral, Dir and Swat), Gilgit-Baltistan, and the high-altitude zones of Baluchistan, including the Suleiman Range [12]. Q. floribunda L. contains a variety of phytochemical constituents such as flavonoids (kaempferol and quercetin), phenolic acids (ellagic and gallic acids), lignans, sterols (β-sitosterol), tannins (both condensed and hydrolysable), triterpenes (oleanolic and ursolic acids), and volatile oils [13]. These compounds exhibit anti-inflammatory, antidiabetic, antioxidant, anticancer and antimicrobial activities [14,15].
Due to hybridization and molecular evolution, oak trees exhibit sympatric parallel diversification accompanied by leaf morphological variations and harbor a high potential for introgression and reticulate evolution [16]. Such evolutionary divergence and similar morphology also make members of the Fagaceae difficult to identify [17]. In contrast, molecular and DNA diagnostic tools such as plastid DNA barcodes and multilocus DNA markers can discriminate among such taxa [18]. DNA barcoding is an advanced technological approach used to identify unknown biological taxa at the genus, species, or even subspecies level with high accuracy [19,20]. It aims to determine plant species accurately by analyzing a specific DNA fragment within a short timeframe. The method is recognized for its efficiency, standardization, speed, and cost-effectiveness in identifying diverse plant species [21]. It permits the investigation of both intraspecific and interspecific variation, facilitating the identification of unknown species or those with complex morphometric traits [22]. DNA barcoding markers, including rpoB, rbcL, ITS, and tRNA, have been used to identify plant species [23]. These markers play a critical role in examining the similarities and differences between species, as well as in addressing taxonomic and evolutionary challenges [24]. Numerous plastid genomic loci, including rpoB, rpoC1, rbcL, and tRNA, have been utilized as DNA barcodes for plant identification [25-27]. The present study aims to assess the Quercus floribunda population from northern Pakistan using molecular markers to estimate genetic distances and establish phylogenetic relationships with reference sequences from databases.
Sample collection and identification
Q. floribunda L. samples were collected and studied from natural habitats in northern Pakistan. Detailed information about the species is presented in Table 1. The collected samples were identified by expert taxonomists and preserved at the Herbarium of Hazara University Mansehra, Pakistan.
Table 1: Detailed description of the selected species along with primary data in the form of GPS coordinates and collection site.

DNA extraction and PCR amplification
Whole genomic DNA was extracted using the CTAB method described by [28], with some optimizations. The extracted DNA samples were quantified using 1% agarose gel electrophoresis. Three genetic markers were amplified: the rpoB gene, the rbcL gene, and the tRNA barcode, which were recommended by the Consortium for the Barcode of Life (CBOL) [29]. The primer pairs used for PCR amplification are listed in Table 2. The extracted DNA was stored at -20 °C and used as a template for amplifying the barcode genes and spacers. The PCR reaction mixture (25 μL) comprised 2.5 μL of 10× PCR buffer (without MgCl2), 2 μL of each dNTP, 1.25 μL of each primer, and 0.125 μL of Taq DNA polymerase. PCR reactions were carried out at 95 °C for 5 minutes, followed by 35 cycles of 94 °C for 30 seconds, 52 °C for 30 seconds, and 72 °C for 45 seconds. The final extension step was at 72 °C for 10 minutes. Sequencing was performed by Macrogen Inc. (Korea), and all resultant sequences were submitted to the DNA Data Bank of Japan (DDBJ) for accession numbers.
Table 2: Detailed description of the selected DNA barcode regions, with their forward and reverse primers and the recommending authorities.

Nucleotide sequences alignments and analysis
After successful sequencing of rpoB, rbcL, and tRNA, the good-quality sequences were searched against NCBI GenBank with online tools to retrieve the most similar sequences, including reference sequences of the closest relatives of Q. floribunda L. and sequences of more diverse species. Geneious software was used to check sequence quality, and missing regions at the start or end of the sequences were carefully reviewed and deleted to ensure the quality and reliability of the data. The query sequences, along with the reference sequences, were used for multiple sequence alignment in Clustal W [30]. Phylogenetic analysis was performed to identify the relationships between individuals, species, and genera. Neighbour-joining trees were constructed with the Jukes-Cantor model and 10,000 bootstraps. The other analyses, namely nucleotide diversity (π), segregating sites (S), haplotype diversity (Hd), haplotype distribution, and haplotype network, were conducted in Python version 3.10.10.
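The Jukes-Cantor correction used for the neighbour-joining trees converts the observed proportion of differing sites p between two aligned sequences into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3). The following is a minimal sketch of that calculation, not the script used in this study; the function names and the toy sequences are hypothetical, and it assumes that gaps and ambiguous bases are simply skipped.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Assumption of this sketch: columns with a gap or an ambiguous
    base in either sequence are skipped.
    """
    valid = "ACGT"
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor(seq_a, seq_b):
    """Jukes-Cantor distance: d = -(3/4) * ln(1 - 4p/3)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The correction is undefined once p reaches saturation.
        return float("inf")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy aligned sequences (hypothetical, one difference in ten sites).
toy_query = "ACGTACGTAC"
toy_reference = "ACGTACGTAT"
toy_distance = jukes_cantor(toy_query, toy_reference)
```

For the toy pair, p is 0.1 and the corrected distance is slightly larger than p, which reflects the model's allowance for multiple substitutions at the same site.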
Molecular analysis of Quercus floribunda
Table 3: Detailed description of the DNA barcode sequences submitted to the NCBI (DDBJ) database, along with the accession IDs obtained.

In this study, three barcode regions (rbcL, rpoB, and tRNA) were selected and successfully amplified using forward and reverse primers. The initial identification of Quercus spp. was performed through morphological techniques, and the samples were tagged as Q. floribunda L. The sequences of these barcodes were submitted to NCBI GenBank and received the accessions LC58840, LC583720, and LC518899 for rbcL, rpoB, and tRNA, respectively (Table 3). Nucleotide diversity, evolutionary relationship, phylogeny, and haplotype diversity analyses of Q. floribunda L. were then performed.
Phylogeny and evolutionary analysis of Quercus floribunda through rpoB marker
The current study assessed genomic similarity, alignments, and phylogenetic trees based on rpoB, rbcL, and tRNA. In addition to the reference rpoB sequence (LC583720), 24 sequences were retrieved at a high similarity threshold via MegaBLAST. Apart from Lithocarpus craibianus and Lithocarpus litseifolius, the dataset contains 22 Quercus spp. sequences. The genetic similarity level of Q. floribunda L. was 1-2, which was higher than that of the other Quercus spp. (Figure 1a). Oak species are known to be prone to molecular evolution despite their high morphological similarity. In the phylogenetic tree, Q. floribunda shared an ancestral node with Q. variabilis, whereas Q. chenii, Q. acutissima, Q. serrata, and Q. baronii shared the same clade (Figure 1b).
Figure 1: Analysis of the rpoB gene-aligned sequences.
a) Indicates a 2D matrix of the pairwise-sequence distances. Pairwise distances are represented with color schemes, from dark blue (lower) to yellow (higher).
b) Phylogenetic tree classification of rpoB genes generated with 1000 bootstraps using Neighbor-Joining (NJ). Light blue circles represent bootstrap confidence; the larger circle size corresponds to higher bootstrap confidence.

Phylogeny and evolutionary analysis of Quercus floribunda through rbcL
For the rbcL analysis, the query sequence LC58840 was aligned with similar sequences. Pairwise sequence similarity indicated higher similarity to Q. baloot and Q. aucheri and divergence from Q. acutissima and Q. chenii. Phylogenetic analysis placed Q. floribunda closest to Q. baloot, Q. aucheri, and Q. coccifera among the Quercus spp. (Figure 2a & 2b).
Phylogeny and evolutionary analysis of Quercus floribunda through tRNA
The query tRNA sequence, together with 14 highly similar Quercus spp. sequences retrieved from GenBank, was used for analysis. In the phylogenetic tree, Q. floribunda shared the same ancestral node with Q. senescens, Q. semecarpifolia, Q. tungmaiensis, Q. aquifolioides, Q. rehderiana, and Q. brandisiana (Figure 3a & 3b).
Figure 2:

Figure 3: Analysis of the tRNA spacer-aligned sequences.
a) Indicates a 2D matrix of the pairwise-sequence distances. Pairwise distances are represented with color schemes, from dark blue (lower) to yellow (higher).
b) Phylogenetic tree classification of tRNA genes generated with 10000 bootstraps using Neighbor-Joining (NJ). Light blue circles represent bootstrap confidence; the larger circle size corresponds to higher bootstrap confidence.

Assessment of diversity analysis
Nucleotide diversity (π), segregating sites (S), haplotype diversity (Hd), haplotype distribution, and haplotype networks were assessed for the aligned sequences of rpoB, rbcL, and the tRNA spacer. Guo and co-workers have suggested similar analyses for Theaceae plants. A dedicated Python script was used to calculate π from the pairwise sequence distances. The resulting π values were 0.092, 0.025, and 0.011 for rpoB, rbcL, and tRNA, respectively. Furthermore, 338, 5, and 21 segregating sites (S) were detected in the rpoB, rbcL, and tRNA datasets, respectively (Figure 4). Haplotype diversity (Hd) values of 0.913 (rpoB), 0.722 (rbcL), and 0.836 (tRNA) were recorded (Figure 5). All three barcode regions exhibited high haplotype diversity and are recommended as good barcodes for species identification.
Figure 4: Calculation of the segregating sites. Segregating sites were calculated in Python version 3.10.10. Panels A, B, and C represent rpoB, rbcL, and tRNA, respectively.

Figure 5: Haplotype Distribution Analysis: The calculated haplotypes are displayed in a pie chart. Below, the memberships of each dataset are mentioned, with their respective representations in the analyses.
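The three summary statistics reported above can each be computed directly from a multiple sequence alignment. The sketch below is illustrative only and is not the script used in this study: the function names and the toy alignment are hypothetical, sequences are assumed to be pre-aligned to equal length, and any gap character is treated as an ordinary state. Here π is taken as the mean pairwise proportion of differing sites, S as the number of variable alignment columns, and Hd follows Nei's formula Hd = n/(n-1) × (1 - Σ f_i²), where f_i is the frequency of haplotype i among n sequences.

```python
from collections import Counter
from itertools import combinations


def segregating_sites(alignment):
    """Number of alignment columns holding more than one state (S)."""
    return sum(1 for column in zip(*alignment) if len(set(column)) > 1)


def nucleotide_diversity(alignment):
    """Mean pairwise proportion of differing sites (pi)."""
    length = len(alignment[0])
    pairs = list(combinations(alignment, 2))
    total_differences = sum(
        sum(a != b for a, b in zip(seq_a, seq_b)) for seq_a, seq_b in pairs
    )
    return total_differences / (len(pairs) * length)


def haplotype_diversity(alignment):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(alignment)
    counts = Counter(alignment)
    return n / (n - 1) * (1.0 - sum((c / n) ** 2 for c in counts.values()))


# Toy alignment (hypothetical): four sequences, three distinct haplotypes.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]

toy_s = segregating_sites(toy_alignment)
toy_pi = nucleotide_diversity(toy_alignment)
toy_hd = haplotype_diversity(toy_alignment)
```

In the toy alignment two columns vary, so S is 2; the six pairwise comparisons differ at six sites in total over eight columns, giving π = 0.125; and the haplotype frequencies 0.5, 0.25 and 0.25 give Hd = 5/6.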

Analysis of haplotype diversity
Haplotypes and their diversity (Hd) were calculated from these barcode sequences. The results indicated 14, 5, and 8 haplotypes in rpoB, rbcL, and tRNA, respectively. As expected, the genes with higher Hd scores generated more haplotypes and showed higher diversity. All the barcode regions showed considerable diversity among the Quercus spp. Furthermore, the relationships between these haplotypes were calculated. Networks were built in Python, using Hamming distances between the nodes (haplotypes) as edge weights; a higher weight is drawn as a thicker edge to describe the distance between haplotypes. Based on rpoB, the reference node (node 12) was connected to the other haplotypes at different distances. Among them, haplotype 8 had a close relationship with Q. brandisiana. Similarly, the reference rbcL shares the same clade with Q. baloot and Q. aucheri. In the tRNA haplotype network, the reference node shares a similar distance with haplotype 7 (Q. oxyodon and Q. kiukiangensis), with a lowest distance of 13 among the haplotypes (Figure 6).
Figure 6: Haplotype Distribution Analysis: The calculated haplotypes are displayed in a pie chart. Below, the memberships of each dataset are mentioned, with their respective representations in the analyses.
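The haplotype network described above can be sketched in two steps: collapse identical aligned sequences into haplotypes, then weight every pair of haplotypes by their Hamming distance (the number of differing alignment positions). The code below is a minimal illustration under stated assumptions rather than the script used in this study; the function names, sample names, and toy sequences are hypothetical, and it returns a plain edge list instead of drawing the network.

```python
from itertools import combinations


def collapse_haplotypes(named_sequences):
    """Group sample names by identical aligned sequence."""
    haplotypes = {}
    for name, sequence in named_sequences.items():
        haplotypes.setdefault(sequence, []).append(name)
    return haplotypes


def hamming(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def network_edges(haplotypes):
    """Weighted edges (node_i, node_j, distance) for all haplotype pairs."""
    sequences = list(haplotypes)
    return [
        (i, j, hamming(sequences[i], sequences[j]))
        for i, j in combinations(range(len(sequences)), 2)
    ]


# Toy input (hypothetical sample names and sequences).
toy_samples = {
    "sample_A": "ACGTAC",
    "sample_B": "ACGTAC",
    "sample_C": "ACGTTC",
    "sample_D": "TCGTTC",
}

toy_haplotypes = collapse_haplotypes(toy_samples)
toy_edges = network_edges(toy_haplotypes)
```

Each tuple in `toy_edges` can be passed to a plotting or graph library as an edge whose weight controls the line thickness, matching the convention described in the text.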

The emergence of advanced sequencing technologies has significantly enhanced the utility of DNA barcoding in taxonomy, particularly for species identification and for assessing biological diversity across populations and communities [20]. The current study highlighted the effectiveness of chloroplast regions such as rbcL, rpoB and tRNA, recommended by the Consortium for the Barcode of Life (CBOL) [29]. These markers have shown reliable species identification and resolved genetic diversity within various plant taxa, including oaks (Quercus spp.) [31,32]. The results of the current study are consistent with previous research on Quercus species, which exhibit a high level of genetic variation due to their extensive distribution and ecological adaptability [33]. This genetic variability is essential for the adaptation of Quercus species to diverse environmental conditions [34]. The high nucleotide diversity observed in the rpoB gene suggests significant evolutionary potential, which aligns with previous studies that highlight the role of genetic diversity in facilitating adaptive responses to environmental change [35]. In the current study, phylogenetic analysis placed Quercus floribunda L. in close relation to other members of the genus, such as Q. variabilis and Q. acutissima, which reflects the complex evolutionary history of oaks characterized by hybridization and gene flow [36,37]. These findings are consistent with reports of interspecific hybridization events within the genus Quercus, which contribute to its phylogenetic complexity [38]. Similarly, a previous study demonstrated that hybridization and introgression are common in the genus Quercus, blurring species boundaries and complicating phylogenetic reconstructions [39]. Similar findings were reported for Q. variabilis and Q. acutissima in East Asia, where overlapping habitats facilitated genetic exchange [40]. These patterns resonate with the placement of Q. floribunda alongside these species, suggesting that shared gene flow may strengthen their close genetic relationship. Asian oaks have likewise been found to exhibit higher genetic diversity and reticulate evolutionary patterns [41]. Wang [11] also used ITS and chloroplast markers to resolve phylogenetic relationships among Asian oaks and found that Q. variabilis and Q. acutissima form a closely knit cluster. Studies on Quercus robur and Quercus petraea in Europe also highlight similar patterns of gene flow and hybridization, showing that these processes are not confined to Asian species but represent a global phenomenon in the genus Quercus [42]. Understanding the genetic diversity and population structure of Q. floribunda L. is essential for conservation strategies because conservation efforts concentrate on preserving genetic connectivity across landscapes [43]. This is a major step in ensuring the long-term variability and adaptability of oak species [44]. Molecular markers such as the rbcL, rpoB and tRNA genes provide a robust framework for assessing genetic diversity and phylogenetic relationships [45]. These markers have been widely used in plant systematics to resolve taxonomic uncertainties and understand evolutionary processes [46]. Integrating chloroplast and nuclear DNA data enhances the resolution of phylogenetic analysis, offering a comprehensive view of species relationships within complex genera like Quercus [47]. The applications of these DNA barcodes have been further validated through comprehensive analyses, such as pairwise sequence similarity tests, phylogenetic reconstruction and haplotype networking, which collectively establish their potential to resolve taxonomic uncertainties and clarify the evolutionary relationships among species [48].
The current study focused on Quercus floribunda from the northern region of Pakistan and found that the rpoB gene exhibited the highest haplotype diversity among the three tested barcode regions, indicating its superior capacity to distinguish closely related species within the genus Quercus. Earlier work on barcode markers for terrestrial plants underscored the importance of selecting appropriate markers for effective species-level identification [49]. The rapid evolution rate of the rpoB gene may contribute to its effectiveness in capturing recent speciation events, which is particularly relevant given the challenges posed by hybridization and introgression in oak taxonomy [50]. However, it is essential to recognize that relying on a single barcode region may not always yield sufficient resolution for all species within a diverse genus like Quercus [51]. The findings of the current study support multi-barcode approaches to enhance species identification and phylogenetic analysis, in agreement with the studies of Heise [52] and Gonzalez [43]. Furthermore, the haplotype network analysis provided insight into the relationships among different Quercus species, suggesting recent divergence or ongoing gene flow, particularly between Q. floribunda and closely related species such as Q. baloot and Q. aucheri. This information is crucial for understanding the evolutionary history and biogeography of oaks in the region [53,54]. The observed genetic variability within Q. floribunda and its relatives is vital for the adaptive potential of these species, especially in the context of climate change and environmental pressures [55]. Conservation strategies should, therefore, consider this genetic diversity to preserve the evolutionary potential of these important forest trees [56-61].
In the current study, three barcode regions were successfully amplified, sequenced, and analyzed using both online and offline computational tools. The obtained sequences were evaluated and identified using reference sequences of the genus Quercus retrieved from GenBank. The primers used in the current study showed over 90% effectiveness. The chloroplast markers rpoB, rbcL, and tRNA performed well in species identification and are recommended for evaluating other Quercus species across northern Pakistan and globally. Phylogenetic analysis showed that Quercus floribunda from the study area is most closely related to Q. baloot and Q. aucheri and, to a lesser extent, to other species such as Q. variabilis, Q. tungmaiensis, Q. senescens, Q. semecarpifolia, Q. aquifolioides, Q. rehderiana, and Q. brandisiana. These findings lay the foundation for further research on genetic diversity, population structure, and species identification within the genus Quercus.
All authors extend their sincere appreciation to the Researchers Supporting Project No: 5711 and the Department of Botany and Department of Biotechnology and Genetic Engineering, Hazara University Mansehra, KP, Pakistan, for assistance and technical support in the experiment.
AA and AM conceived and designed the study. AA, S and IU performed the experiments. AA, S, HA and IU contributed to data collection and analysis. AA, AM, S, DP and HA assisted in manuscript drafting and reviewing. AM, KM and DP provided supervision and critical revisions. All authors have read and agreed to the published version of the manuscript.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
© 2025 Abdul M. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and build upon your work non-commercially.
 a Creative Commons Attribution 4.0 International License. Based on a work at www.crimsonpublishers.com.
							
							