
DENTAL MORPHOLOGY AND HAPLOTYPIC DIVERSITY
IN THE MAJOR ETHNIC GROUPS OF SWAT AND DIR
DISTRICTS
INAMULLAH
DEPARTMENT OF GENETICS
HAZARA UNIVERSITY MANSEHRA
2018
By
Inamullah
This research study has been conducted and reported in partial fulfillment of
the requirements of the PhD degree in Genetics awarded by Hazara University
Mansehra, Pakistan
Friday, 03 March 2017
Submitted by INAMULLAH
PhD Scholar
Research supervisor PROF. DR. HABIB AHMAD
Vice Chancellor
Islamia College Peshawar
Co-supervisor DR. BRIAN E. HEMPHILL
Associate Professor
Department of Anthropology
University of Alaska, Fairbanks
Fairbanks, AK 99775
United States
AL QURAN
"O Mankind, we created you from a single pair of a male and a female, and made
you into tribes and nations so that you may know each other (not that you
despise each other). Verily, the most honored of you in the sight of Allah is he
who is most righteous of you. Surely Allah is All-Knowing, All-Aware."
(Al-Hujurat, 49: 13)

AUTHOR’S DECLARATION
I, Inamullah, hereby state that my PhD thesis titled “Dental morphology and
haplotypic diversity in the major ethnic groups of Swat and Dir Districts” is
my own work and has not been submitted previously by me for taking any
degree from this University (Hazara University Mansehra, Pakistan) or
anywhere else in the country/world.
If at any time my statement is found to be incorrect, even after my graduation, the
university has the right to withdraw my PhD degree.
Inamullah
Date: 16-02-2018
Plagiarism Undertaking
I solemnly declare that the research work presented in the thesis titled “Dental
morphology and haplotypic diversity in the major ethnic groups of Swat and Dir
Districts” is solely my research work with no significant contribution from any
other person.
Small contribution/help, wherever taken, has been duly acknowledged and the
complete thesis has been written by me.
I understand the zero tolerance policy of the HEC and University (Hazara
University Mansehra) towards plagiarism. Therefore I, as an author of the above
titled thesis, declare that no portion of my thesis has been plagiarized and any
material used as reference is properly referred/cited.
I undertake that if I am found guilty of any formal plagiarism in the above titled
thesis even after award of the PhD degree, the university reserves the right to
withdraw/revoke my PhD degree and that the HEC and the University have the right
to publish my name on the HEC/University website on which names of students
are placed who submitted a plagiarized thesis.
Student/Author Signature: ____________
Name: Inamullah
ACKNOWLEDGMENTS
Although feelings are deep, unfortunately the words are too shallow. The names
may be mentioned but the extent and the level of my gratitude is impossible to
capture. All praises and thanks to the Almighty Allah, the Omnipresent, Lord of
the lords, who blessed me to complete this task within the specified time, the One and
only who made my dreams come true. "I can do everything through Him who gives me
strength." I also offer humble words of respect and profound gratitude to the
Holy Prophet Muhammad (Peace Be upon Him), the most perfect and glorious
among all the creatures born on the surface of the earth, who was sent to
enlighten our conscience and who is forever the city of knowledge for the whole
of humanity.
Completion of this doctoral dissertation was possible with the support, help and
inspiration of many people. It is a pleasure to convey my gratitude to them all in my
humble acknowledgment. In the first place, I would like to express my sincere
appreciation and gratitude to my research supervisor Prof. Dr. Habib Ahmad, Vice
Chancellor, Islamia College Peshawar, for the continuous support in my PhD
dissertation, for his patience, motivation, enthusiasm, and immense knowledge. His
guidance helped me throughout the research and writing of this thesis. I could not
have imagined having a better advisor and mentor for my PhD study. A person with
an amicable and positive disposition, he has always made himself available to clarify
my doubts despite his busy schedule and I consider it a great opportunity to do my
doctoral programme under his guidance and to learn from his research expertise.
I also extend sincere thanks to my research co-supervisor Dr. Brian E. Hemphill,
Associate Professor, Department of Anthropology, University of Alaska, Fairbanks,
United States, for his splendid guidance and assistance in the completion of this work.
I am indebted to him for his advice, supervision, and crucial contribution, which
made him the backbone of this research and of this dissertation.
I am grateful to Prof. Eske Willerslev, Director of the Centre for GeoGenetics, University
of Copenhagen, Denmark, for giving me opportunities in his group and leading me
to work on diverse exciting projects.
I am also thankful to my lab colleagues and group members, i.e. Morten E. Allentoft,
Martin Sikora, Ashot Margaryan and Constanza de la Fuente Castro at the Centre for
GeoGenetics, Denmark, for their excellent guidance.
I am also thankful to Dr. Jill K. Olofsson, Department of Animal and Plant Sciences,
University of Sheffield, for helping me in statistical data analysis.
I express my heart-felt gratitude especially to Dr. Muhammad Shahid Nadeem,
Assistant Professor in the Department of Biochemistry, Faculty of Science, King
Abdulaziz University, Saudi Arabia, who guided me with his intelligent ideas,
thought-provoking discussions and comprehensive understanding to help me sail
through the initial fumbling. His attitude of living every moment as it comes,
making unexpected observations and converting them to new possibilities,
correlating ideas and understanding the obvious has helped me come a long way
and will always guide me in future.
I would also like to express my appreciation to Dr. M. Ilyas, Director of the Centre
for Human Genetics Hazara University, Mansehra for his help in statistical data
analysis, sharing of scientific ideas, and his help in approaching scientific
communities.
I am grateful to the funding source that allowed me to pursue my study: the
Higher Education Commission (HEC) of Pakistan. The technical and generous
financial support from the HEC-sponsored project (NRPU-20-1409), entitled
“Ethnogenetic elaboration of KP through dental morphology and DNA analysis”, of
the Department of Genetics, Hazara University Mansehra, is gratefully acknowledged.
I acknowledge the Secretary Education KP, the Directorate of schools and colleges of Swat
and Dir districts and all the volunteers for their help and support in providing
samples for this research. I take great pride in expressing my deepest
gratitude to the Department of Genetics; everybody related to it was important
to my comprehension of the work, which boosted my self-confidence in
achieving my goal. Several faculty members of the department were
kind enough to extend their help at various phases of this research whenever I
approached them, and I hereby acknowledge all of them.
Members of the Human Genetics Lab deserve my sincerest thanks; their friendship and
assistance have meant more to me than I could ever express. I could not have completed my
work without the ongoing support of the participants of the Ethnogenetic
project in the lab. I should also thank the Ethnogenetic project for allowing me to be
part of a great professional community. I am indebted to my many student
colleagues for providing an encouraging and pleasurable environment. My thanks
go in particular to Dr. Muhammad Tariq, Assistant Professor at Islamia College
University, Peshawar, Mr. Numan Fazal, Mr. Murad Ali, Mr. Muhammad Ismail
Khan (Research Associates), Mr. Faridullah, Mr. Sheraz Khan and Mr. Zakaria Khan
for their help and support during lab work. I would like to express my deep thanks
to Miss Mehreen Amin and Miss Shakeela Umar, who made possible the collection
of female samples for my PhD research. Special thanks to Dr. Nazia Akbar for her
guidance, encouragement and support. I am also thankful to Dr. Khushi
Muhammad for his valuable guidance and moral support during my stay at the
human genetics lab. I would like to record words of honour for all my fellows,
colleagues and teachers who shaped me with their vast knowledge.
I am very thankful to Functional Genomics lab in the department of Genetics and all
its members especially Dr. Inamullah, Associate Professor, Dr. Ikram Muhammad,
Dr. Israr Ahmad and Miss Nazish Gul (MPhil Scholar) for their instrumental support
and help throughout my research work.
I would like to express my sincere thanks to the members of FANTA (Mr. Ikram
Muhammad, Muhammad Ali, Mr. Israr Ahmad and the special one among them all,
Mr. M. Jawad) for their thoughtful guidance, insightful decisions and sincere
encouragement during the critical situations of my PhD.
Finally, I would also like to thank my family for the support they provided me
through my entire life and in particular, I must acknowledge my mother, father
(late) and brothers, without whose love, encouragement and prayers, I would not
have finished this thesis.
Inamullah
Dedication
To My
Beloved Parents
CONTENTS
ACKNOWLEDGMENTS I
CONTENTS V
LIST OF TABLES X
LIST OF FIGURES XII
ABBREVIATIONS XV
ABSTRACT XVII
INTRODUCTION 1
1.1. Modern human history 1
1.1.1. Dispersal of anatomically modern humans 2
1.1.2. Time line and routes of Modern Human dispersal throughout the world 9
1.2. Pakistan 11
1.3. Study Area 15
1.3.1. District Swat 16
1.3.2. District Dir 18
1.4. The Pashtuns/Pakhtuns 20
1.4.1. The Yousafzai 22
1.4.2. The Utmankheils 23
1.4.3. The Tarklanis 23
1.5. The Kohistani 24
1.6. The Gujars 25
1.7. The genetic characterization of human 27
1.8. Dental Morphology/ Dental Anthropology 30
1.8.1. The Birth of Dental Anthropology 32
1.8.2. Dental anthropology investigations in South Asia 34
1.8.3. Non-metric dental morphological traits 36
1.8.4. Basic terminology used in dental morphology 36
1.8.5. Analysis of dental morphology traits 38
1.9. Mitochondrial DNA (mtDNA) 46
1.9.1. MtDNA in human lineages 48
1.9.2. mtDNA Variation 49
1.10. The Y-chromosome 53
1.10.1. Phylogenetic Tree based on human Y-chromosome 55
1.10.2. Y-chromosomal haplogroup distribution across the globe 58
CHAPTER 2 MATERIALS AND METHODS 61
2.1. Samples collection for dental morphology study 62
2.1.1. Collection of dental Casts 62
2.1.2. Selection of volunteers 63
2.1.3. Biosafety Measures 63
2.1.4. Dental casting and labeling 64
2.1.5. Grading and scoring of dental morphology traits 65
2.2. Analyzing the DNA 66
2.2.1. Collection of saliva samples 66
2.2.2. Genomic DNA extraction 66
2.2.3. Screening of the purified gDNA 67
2.2.4. Agarose gel electrophoresis 68
2.3. Mitochondrial DNA characterization 68
2.3.1. PCR Amplification of target DNA 68
2.3.2. Thermocycling conditions for PCR 69
2.3.3. Visualization of the PCR Products 71
2.3.4. Elution of PCR Product 71
3.3. Y-chromosome analysis 72
3.3.1. Y-STR and Y-SNP datasets 72
3.1.2. Multiplex PCR profile 73
3.4. Statistical Analysis 74
3.4.1. Dental morphology Analysis 74
3.4.2. MtDNA Analysis 74
3.4.3. Y-STRs and Y-SNPs analysis 75
CHAPTER 3 RESULTS 83
3.1. Dental Morphology 83
3.1.1.1. Shovelling 85
3.1.1.2. Median Lingual ridge 88
3.1.1.3. Y-Groove Pattern 89
3.1.1.4. Hypocone 90
3.1.1.5. Metaconule 92
3.1.1.6. Major Cusp number 94
3.1.1.7. Entoconulid 96
3.1.1.8. Metaconulid 97
3.1.2. Mean Measure of Divergence 100
3.1.3. Living Northern Pakistanis Only 100
3.1.3.1. Neighbor-joining Cluster Analysis 102
3.1.3.2. Multidimensional Scaling—Kruskal’s Method 103
3.1.3.3. Multidimensional Scaling—Guttman’s Method 106
3.1.3.4. Principal Coordinate Analysis 107
3.1.4. Living Pakistanis Considered in Light of Living Peninsular Indians and
Prehistoric Inhabitants of the Indus Valley and South-Central Asia 110
3.1.4.1. Neighbor-joining Cluster Analysis 110
3.1.4.2. Multidimensional Scaling—Kruskal’s Method 113
3.1.4.3. Multidimensional Scaling—Guttman’s Method 115
3.1.4.4. Principal Coordinate Analysis 117
3.2. Mitochondrial DNA analysis 119
3.2.1. Genomic DNA isolation 119
3.2.2. PCR amplification 119
3.2.3. MtDNA Haplogroups determination 121
3.2.3.1 MtDNA Haplogroups determination in the individuals of Gujars 121
3.2.3.2. MtDNA Haplogroups of the sampled Tarklani population of District Dir 130
3.2.3.3. MtDNA Haplogroup variation among the Utmankheil of District Dir 140
3.2.3.4. Haplogroups of the sampled Yousafzai of District Swat 149
3.2.3.5. Haplogroup distribution among the sampled Kohistanis of District Swat 158
3.2.4. Overall mtDNA haplogroup distribution among the five sampled ethnic
groups of Swat and Dir districts 165
3.2.4.1. Diversity comparison among the five sampled ethnic groups of Swat and
Dir Districts 173
3.2.4.2. Mitochondrial Genetic Differentiation 174
3.2.4.3. Multi Dimensional Scaling 175
3.2.4.4. Network Analysis based on mtDNA sequences 176
3.3. Y-chromosome STRs and Y-SNPs analysis 177
3.3.1. Multiplex performance 177
3.3.2. Genetic diversity 178
3.3.3. Genetic differentiation 181
3.3.4. Genetics, ethnicity and geography 184
3.3.5. Detailed analysis of two Y-chromosomal haplogroups 190
DISCUSSION 193
CONCLUSIONS 217
RECOMMENDATIONS 220
REFERENCES 221
APPENDIX I 257
APPENDIX II 258
APPENDIX III 259
APPENDIX IV 261
LIST OF TABLES
Table 1 Details of samples collected from Swat and Dir districts 62
Table 2 Details of the primer sequences used in the present study for the
amplification of the target fragment of the mtDNA control region. 69
Table 3 Components and concentration of PCR reaction mixture/sample 69
Table 4 Components and concentrations of the multiplex PCR reaction. 72
Table 5 Cycling profile for multiplex PCR reaction 73
Table 6 Population samples included in the larger comparative analyses.
Sample sizes and references to the original studies are shown. 78
Table 7 Details of the living/modern and prehistoric samples used in this
study for comparative analysis 84
Table 8 Frequencies of dental traits among the five ethnic groups (%). 85
Table 9 Mean measure of divergence (MMD) distance matrix obtained from the
pairwise group comparisons of the five populations and the other
populations used in this study. 101
Table 10 Statistical analysis of the Gujar sample from Swat 121
Table 11 Haplogroup frequencies and their respective variants found in the
Gujar sample from Swat 124
Table 12 Diversity comparison of the sampled Gujar population from Swat with
the other reported ethnic groups of Pakistan. 129
Table 13 Statistical analysis of Tarklanis of District Dir 130
Table 14 Haplogroup frequencies and their respective variants found among the
sampled Tarklanis of Dir District 131
Table 15 Diversity comparison among the sampled Tarklanis of District Dir
with the other reported ethnic groups of Pakistan. 139
Table 16 Statistical analysis of sampled Utmankheil population of District Dir 140
Table 17 Haplogroup frequencies and their respective variants in the
Utmankheil sample from District Dir 141
Table 18 Diversity comparison among the sampled Utmankheil individuals of
Dir District with the other reported ethnic groups of Pakistan. 148
Table 19 Statistical analysis of Yousafzai of District Swat 149
Table 20 Haplogroup frequencies and their respective variants among the
sampled Yousafzai individuals of Swat District 150
Table 21 Genetic diversity of the Yousafzai sample from District Swat in
comparison to the other reported ethnic groups of Pakistan. 157
Table 22 Statistical analysis of the Kohistani sample from District Swat 158
Table 23 Haplogroup frequencies and the respective variants of the Kohistanis
sampled from Swat District 159
Table 24 The genetic diversity of the sampled Kohistani population from
District Swat in comparison with the other reported ethnic groups of
Pakistan. 164
Table 25 MtDNA haplogroup frequencies distribution in the five sampled
populations of Dir and Swat Districts. 165
Table 26 Haplogroups distribution among the individuals of Swat and Dir
district by associated geographic region of origin. 170
Table 27 Genetic diversity in the mtDNA data within the five ethnic groups 173
Table 28 Pairwise Fst genetic distances (below the diagonal) and corresponding
p-values (above the diagonal) between five ethnic groups from Swat
and Dir districts based on mtDNA sequence data. 175
Table 29 Genetic diversity in the Y-STR (27 loci) and frequencies of Y-SNP
haplogroups within five ethnic groups from Dir and Swat Districts.
The values for the Y-SNP haplogroups in brackets represent 90%
confidence interval. 180
Table 30 The genetic distances among the five ethnic groups, calculated as
pairwise FST values based on 23 of the 27 STR loci. FST values below
the diagonal and the corresponding P-values above the diagonal. 181
Table 31A AMOVA results when population samples are grouped based on
country of origin 185
Table 31B AMOVA results when population samples are grouped based on
ethnicity. 187
LIST OF FIGURES
Figure 1. The Multi-Regional hypothesis of modern human migration 4
Figure 2. The representation of the Replacement hypothesis about human migration. 5
Figure 3. Diagram representing the Assimilation model. 8
Figure 5. Geographic location of Khyber Pakhtunkhwa Province of Pakistan 13
Figure 6. Geographic distribution of primary languages spoken in KP, Pakistan 15
Figure 7. Graphical representation of population samples from Swat and Dir districts. 19
Figure 8. The generally accepted genealogy of the Pakhtuns origin of KP, Pakistan. 21
Figure 9. Diagram represents the positional terms for the teeth and jaws 37
Figure 10. Morphological traits of canines and incisors with respect to ASUDAS. 39
Figure 11. Morphological traits of premolars. 40
Figure 12. Morphological traits with respect to ASUDAS reference plaques (a) reference
plaque representing the metacone in upper molars (b) reference plaque for
scoring hypocone (c) reference plaque for metaconule (d) reference plaque for
Carabelli’s trait (e) reference plaque for scoring parastyle. 42
Figure 13.1. Morphological traits (a) reference plaque for the anterior fovea in lower molars
with an example to its right (b) reference plaque for the deflecting wrinkle with
an example to its right side (c) reference plaque for the protostylid with an
example to its right side. 44
Figure 13.2. Dental morphological traits and reference plaques (a) the Y- and X-pattern (b)
reference plaque for the hypoconulid (c) reference plaque for the entoconulid
(d) reference plaque for the metaconulid. 45
Figure 14. Diagrammatic view of human mtDNA. 47
Figure 15. Human migration and haplogroup distribution across the world 49
Figure 16. mtDNA PhyloTree and partitioning scheme representing subtrees. 50
Figure 17. Modified structure of human Y chromosome. 53
Figure 18. Structure of the most recent and updated human Y-chromosome tree. 56
Figure 19. Geographic location of the study area. The colored circles represent location of
villages where samples were collected. 61
Figure 20. Filling, signing of consent form, and cleaning of teeth by volunteer individuals. 63
Figure 21. Placement and removal of the alginate-filled tray from the subject’s mouth. 64
Figure 22. Pouring of diestone mixture into the alginate impression mold and labeling. 65
Figure 23. Scoring of dental morphology traits using the ASUDAS reference plaques 65
Figure 24. Representation of thermocycling profile for PCR. Figure (A) represents PCR
conditions for HVSI, while figure (B) represents PCR conditions for HVSII. 70
Figure 25. Frequencies of shovelling among living Pakistani ethnic groups, living ethnic
groups from peninsular India and samples of the prehistoric inhabitants of the
Indus Valley, South Central Asia (A) SHOVUI1 (B) SHOVUI2. 87
Figure 26. Frequencies of Median Lingual ridge among living Pakistani ethnic groups, living
ethnic groups from peninsular India and samples of the prehistoric inhabitants of
the Indus Valley, South Central Asia. 88
Figure 27. Frequencies of Y-Groove Pattern among living Pakistani ethnic groups, living
ethnic groups from peninsular India and samples of the prehistoric inhabitants of
the Indus Valley, South Central Asia. 89
Figure 28. Frequency distribution of hypocone (A) HYPOCONM1 (B) HYPOCONM2. 91
Figure 29. Frequencies of metaconule at upper molars (A) HYPOCONM1 (B)
HYPOCONM2. 93
Figure 30. Frequency Distribution of major cusps numbers at lower molars (CSPNLM)
among all samples (A) Frequency of CSPNLM1 (B) Frequency of CSPNLM2 95
Figure 31. Frequencies distributions of entoconulid at lower molars (C6LM) among all
samples included in this study (A) C6LM1 (B) C6LM2 97
Figure 32. Frequencies distributions of Metaconulid at lower molars (C7LM) among all
samples included in this study (A) C7LM1 (B) C7LM2 98
Figure 33. Neighbor-joining cluster analysis of modern populations of northern Pakistan,
peninsular Indian populations and their comparison to the major ethnic groups
of Swat and Dir districts, Pakistan. 102
Figure 34. Multidimensional scaling (Kruskal's method) of the major ethnic groups residing
in Swat and Dir districts in comparison with other living Pakistani and
peninsular Indian ethnic groups. 104
Figure 35. Multidimensional scaling (Guttman’s method) of the major ethnic groups
residing in Swat and Dir districts in comparison with other living Pakistani and
peninsular Indian ethnic group samples. 106
Figure 36. Principal coordinate analysis of the major ethnic groups residing in Swat and Dir
districts in comparison with other living Pakistani and peninsular Indians. 109
Figure 37. Neighbor-joining cluster analysis of the living Pakistani, other living and
prehistoric inhabitants of the Indus Valley, South-Central Asia with the major
ethnic groups from Swat and Dir districts. 110
Figure 38. Multidimensional scaling with Kruskal's method of Smith’s MMD pairwise
distances among living Pakistani ethnic groups, living ethnic groups of
peninsular India, prehistoric inhabitants of the Indus Valley, South Central Asia
and the major living ethnic groups of Swat and Dir districts. 114
Figure 40. Principal coordinate analysis of living Pakistani, peninsular Indians, samples of
prehistoric inhabitants of the Indus Valley and South Central Asia. 118
Figure 41. Photographs representing quality and concentration of gDNA (A) Agarose gel
electrophoresis (B) electropherogram 119
Figure 42. Agarose gel electrophoresis photograph of mtDNA control region (a) amplified
PCR fragment of HVSI (b) amplified PCR fragment of HVSII 120
Figure 43. Agarose gel electrophoresis photographs (a) eluted PCR products of mtDNA
HVSI (b) eluted PCR products of mtDNA HVSII. 120
Figure 44. Graphical representation of mtDNA haplogroups frequencies present in Gujar
sample from District Swat. 123
Figure 45. Mega-haplogroup frequencies observed in the sample of Gujars from District
Swat through mtDNA control region. 128
Figure 46. Distribution of Tarklanis haplogroup by Origins. 135
Figure 47. Graphical representation of haplogroup frequencies among the sampled Tarklani
individuals from District Dir. 136
Figure 48. Haplogroup frequencies observed among the sampled Tarklani individuals from
District Dir through mtDNA control region. 137
Figure 49. Graph representing haplogroup frequencies among the sampled Utmankheil
individuals from District Dir. 145
Figure 50. The frequency of Mega-haplogroups observed among Utmankheils. 147
Figure 51. The distribution of Yousafzai haplogroups among the individuals sampled from
District Swat by associated geographic origin. 153
Figure 52. The frequencies of mtDNA haplotypes of Yousafzai individuals sampled from
District Swat with respect to their associated geographic origins. 154
Figure 53. The frequency of Mega-haplogroups observed in Yousafzai individuals from
District Swat through mtDNA control regions. 155
Figure 54. Haplogroup distribution among the sampled Kohistanis from District Swat by
associated geographic region of origin. 161
Figure 55. The frequencies of mtDNA haplotypes of Kohistanis sampled from District Swat
with respect to their associated geographic regions of origin. 162
Figure 56. Mega-haplogroup distribution among the sampled Kohistani individuals from
District Swat. 163
Figure 57. Mega-haplogroup distribution among members of the five sampled ethnic
groups of Swat and Dir districts. 169
Figure 58. Haplogroup distribution among the individuals of the five sampled populations
of Swat and Dir district by associated geographic region of origin. 171
Figure 59. Distribution of mtDNA lineages (A) West Eurasian (B) South Asian (C) East
Eurasian 172
Figure 60. MDS plot of the five major ethnic groups of Swat and Dir districts derived from
Fst genetic distances. 175
Figure 61. Network analysis of five population samples from Swat and Dir districts based
on mtDNA sequence data. 176
Figure 62. An example of typical electropherogram for Y-STRs multiplex reaction. 177
Figure 63. Multi Dimensional Scaling derived for the five major ethnic groups of Swat and
Dir districts. 182
Figure 64. Median joining network based on the Y-STR haplotypes (23 loci) of the five
population samples. 183
Figure 65. Multi-dimensional scaling analysis for 38 selected populations from the Indo-
Pakistani sub-continent and neighboring countries. 188
Figure 66. Worldwide multi-dimensional scaling analysis of pairwise genetic distances,
estimated as FST 190
Figure 67. Y-chromosome haplogroup-specific networks based on Y-STR haplotypes (10
loci) with individuals assigned to (A) Y-SNP haplogroups G-Page94 and H1-M69
(B) Y-SNP haplogroup L1-M22(xM274). 192
ABBREVIATIONS
AMHs Anatomically Modern Humans
AWAm1 Awans collection from Mansehra District by BE Hemphill
AWAm2 Awans collection from Mansehra District by Nazia Sadiq
ChlMRG Early Chalcolithic Period collection from the archeological site of
Mehrgarh (c. 4500 BC)
CHU Living tribal Chenchus from central Andhra Pradesh, India
DJR Djarkutan Period collection from the archeological site of Djarkutan
(2000-1800 BC), Uzbekistan
GPD Low-status Dravidian-speaking Gompadhompti Madigas from
southern Andhra Pradesh, India
GUJm2 Gujars collection from Mansehra District by Nazia Sadiq
GUJsw Gujars from Swat District (present study)
HAR Mature Period collection from the archeological site of Harappa (c.
2300-1800 BC), Punjab Province, Pakistan
INM Late Jorwe Period collection from the archeological site of Inamgaon (c.
1400 BC), Maharashtra, India
KARa Karlaars collection from Abbottabad District by Nazia Sadiq
KHO Khowars from Chitral City, Chitral District
KOHsw Kohistanis from Swat District (present study)
KUZ Kuzali Period collection from the site of Djarkutan (1800-1650 BC),
Uzbekistan
MDK Living inhabitants of the village of Madak Lasht, Chitral District
MDA Living Madia Gond tribals from Eastern Maharashtra, India
MDS Multidimensional Scaling
MHR Living Indo-Aryan-speaking low-status Mahars from Western
Maharashtra, India
MRT Living Indo-Aryan-speaking high-status Marathas from Western
Maharashtra, India
MOL Molali Period collection from the site of Djarkutan (1650-1500 BC),
Uzbekistan
NeoMRG Aceramic Neolithic Period collection from the site of Mehrgarh (c. 6000
BC), Baluchistan Province, Pakistan
PNT High-status Dravidian-speaking Pakanati Reddis from southern
Andhra Pradesh, India
PCO Principal Coordinates Analysis
SAP Sapalli Period collection from the site of Sapalli tepe (c. 2200-2000 BC),
Uzbekistan
SKH Iron Age collection from the site of Sarai Khola (c. 200 BC), Punjab
Province, Pakistan
SWT Living Swatis collection from Dhodial and Baffa, Mansehra District by
BE Hemphill
SYDm2 Syeds collection from Mansehra District by Nazia Sadiq
TANm2 Tanolis collection from Mansehra District by Nazia Sadiq
TMG Late Bronze/Early Iron Age collection from the site of Timargarha
(1400-800 BC), Dir District, Pakistan
TRKd Tarklani from Dir District (present study)
WAKg Living Wakhis from Gulmit, Gilgit-Baltistan
WAKs Living Wakhis from Sost, Gilgit-Baltistan
MtDNA Mitochondrial DNA
CRS Cambridge Reference Sequence
HV Hypervariable
HVS Hypervariable Sequence
HVSI Hypervariable Segment I
HVSII Hypervariable Segment II
UTHd Utmankheil from Dir District (present study)
YSFsw Yousafzai from Swat District (present study)
GOP Government of Pakistan
ABSTRACT
The ethnic groups inhabiting Dir and Swat Districts of Khyber Pakhtunkhwa
Province, Pakistan are known to exhibit cultural and physical diversity. Genetic
diversity among the people of this region, however, remains largely unknown. A
research endeavor based on dental anthropology and molecular phylogenetics was
conducted to elaborate phenetic and molecular affinities among the major
ethnic groups of the area. Morphological variants of the permanent tooth crown
were recorded for phenotypic analyses, whereas mitochondrial DNA (mtDNA) and
Y-chromosomal STRs/SNPs were considered for maternal and paternal variation,
respectively, within and between the Gujar, Kohistani, Tarklani,
Utmankheil and Yousafzai tribes. Dental casts and oral swabs were collected from
volunteers of all the tribes/ethnic groups. Morphological variants of the permanent
tooth crown were scored from maxillary and mandibular dental casts in accordance
with the Arizona State University Dental Anthropology System (ASUDAS). Two
mitochondrial DNA control region segments, hypervariable segment I (HVSI) and
hypervariable segment II (HVSII), together with 27 Y-STRs and 331 Y-SNPs, were used to explore
molecular phylogenetic relationships. Dental casts were obtained from 823 healthy
unrelated individuals of the five ethnic groups of the two districts. The casts were
analyzed for 14 tooth-trait combinations. The data were then compared with 27
samples encompassing 3,185 prehistoric and living individuals representing ethnic
groups of the Hindu Kush-Karakoram highlands and Indus Valley of Pakistan,
peninsular India, and Central Asia. Inter-sample affinities were computed with
C.A.B. Smith’s pairwise Mean Measure of Divergence (MMD) statistic. Patterning of
phenetic affinities was assessed with neighbor-joining cluster analysis (NJ),
multidimensional scaling (MDS), and principal coordinate analysis (PCA). The
results obtained vary with respect to data reduction technique. Neighbor-joining
cluster analysis identifies Gujars, Kohistanis and Utmankheils as possessing affinities
to the ancient Harappan people of the Indus Valley, whereas Yousafzais show
affinities with ethnic groups of the Hindu Kush-Karakoram highlands. The
Tarklanis exhibit no close affinities to Gujars, Kohistanis, Utmankheils or Yousafzais.
The mtDNA analysis yielded 126 haplotypes, of which 75 were unique and
51 were shared. The results further revealed that 45% of the individuals possess
matrilineages of West Eurasian derivation, 36% of South Asian derivation and 6% of
East Eurasian derivation, while lineages of other derivations occur at extremely low
frequencies. The West Eurasian haplogroup R, found in 62% of individuals, was the
most frequent haplogroup, followed by South Asian haplogroup M (32%) and East
Eurasian haplogroup N (5%), while one individual was found to possess the African
haplogroup L. The Y-STR analysis revealed 82 haplotypes, of which 75% were unique
and 25% were shared, yielding a haplotypic diversity of 0.99. High and statistically
significant levels of genetic differentiation were obtained in nine of the 10 pairwise
comparisons (FST = 0.148-0.596), the exception being the contrast between Tarklanis
and Yousafzais (FST = 0.008). Members of the Utmankheil, also considered a Pashtun
tribe, were found to be not closely related to any of the other population samples
(FST = 0.445-0.596). The high genetic differentiation was also visible in
Y-chromosomal SNPs, which showed very little overlap between the five population
samples, except for Tarklanis and Yousafzais. When analyzed at a larger continental
scale, it is clear that the paternal lineages of these five ethnic groups fall mostly
outside the previously characterized Y-chromosomal gene pools of the Indo-Pakistani
subcontinent. The findings presented here contribute towards the understanding of
the genetic complexity exhibited by the apparently related ethnic groups residing in
the northern parts of Pakistan. They provide a sound baseline for elaborating the
historical profile and anthropological standing of Pakistani people for the
fast-approaching era of personal genomics and personalized medicine.
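For readers unfamiliar with the haplotypic diversity statistic quoted above, the following is a minimal sketch of how such a value is commonly computed. It assumes Nei's unbiased estimator, H = n/(n-1) * (1 - sum of squared haplotype frequencies); the text above does not state which estimator was used, so this choice is an assumption. The function name and the toy sample are hypothetical and are not data from this study.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels.

    Assumed estimator: H = n / (n - 1) * (1 - sum(p_i ** 2)),
    where p_i is the sample frequency of haplotype i and n the sample size.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled haplotypes are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical toy sample (labels are illustrative only).
sample = ["H1", "H1", "H2", "H3", "H4", "H4"]
print(round(haplotype_diversity(sample), 3))
```

A value close to 1 means that two haplotypes drawn at random from the sample are almost always different, which is what a large share of unique haplotypes would be expected to produce.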
Chapter 1
INTRODUCTION
1.1 History of modern humans
Ever since the development of human civilization, questions such as where
the human race came from, where we are going, who our closest relatives were
and what circumstances led to the evolution of Homo sapiens (H. sapiens)
have been among those for which scientists seek answers using the principles
of evolution and molecular genetics (Whale, 2012; Stoneking, 2008). Humans are
unique due to: 1) an evolved intelligence, 2) hyperprosociality, and 3) a psychology
for social learning (Marean et al., 2015). Ultimate explanations for these evolutionary
questions are best reached through synthetic studies of biology, genetics,
anthropology and archaeology. The fossil record shows that the
lineage that leads to extant modern humans appeared between approximately 300 and
100 thousand years ago (Poznik et al., 2013; Scally and Durbin, 2012; Endicott et al.,
2010; Underhill and Kivisild, 2007). The fossil record also suggests that the world had
a diverse set of hominin lineages between ∼800 and 40 thousand years ago (kya).
There was a modern human lineage in Africa (i.e., Omo-Kibish No. 2, Ngaloba, Jebel
Irhoud, Herto), at least one archaic African lineage, H. heidelbergensis, represented by
a number of fossil specimens (i.e., Bodo, Kabwe, Elandsfontein, Saldanha), two archaic
Eurasian lineages (Neanderthals and Denisovans) and a widespread archaic Eurasian
lineage commonly referred to as Homo erectus, which shows considerable temporal and
geographic variation (Meyer et al., 2014; Prufer et al., 2014; Mendez et al., 2013;
Lachance et al., 2012; Hammer et al., 2011; Harvati et al., 2011).
Around 700 kya, and perhaps earlier, H. erectus in Africa gave rise to H.
heidelbergensis, a species more similar to modern humans in terms of body
symmetries, dental adaptations and cognitive factors (Rightmire, 2009).
Archeological and DNA evidence suggests that H. sapiens evolved in Africa about
200 kya, probably from H. heidelbergensis (Rightmire, 2009; Relethford, 2008). H.
heidelbergensis, often referred to as an "archaic" H. sapiens, was a dynamic big-game
hunter, produced sophisticated tools, and by at least 400 kya had the ability to
control fire (Roebroeks and Villa, 2011). Due to special characteristics and the advent
of good quality hunting techniques, H. sapiens was able to flourish in sub-Saharan
Africa, from which they dispersed to Eurasia, Australia, the Americas and Oceania
(DeGiorgio et al., 2009).
1.1.1. Dispersal of modern man
Despite the broad consensus that Africa represents the main place of origin for
Anatomically Modern Humans (AMHs), the routes of dispersal from the
continent remain a subject of considerable debate.
One of the most debated issues concerning the origins of modern humans
is that, roughly 100,000 years ago, the Old World was occupied by a morphologically
diverse group of hominins. In Africa, as well as in the Middle East, there was H.
sapiens; in Asia, Homo erectus; and in Europe, Homo neanderthalensis (Klein, 2008).
However, by about 30,000 years ago this taxonomic diversity had
disappeared and all that remained were anatomically as well as behaviorally
modern humans (Johanson, 2001; Klein, 1999; Tattersall and Schwartz, 1999; Clark
and Willermet, 1997; Stringer and McKie, 1996; Wolpoff and Caspari, 1996; Nitecki
and Nitecki, 1994; Smith and Spencer, 1984). What is disputed is not that modern
humans evolved from earlier hominin species, nor from which archaic human species
they derived, but where, geographically, this took place. Three hypotheses are
currently popular among paleoanthropologists. These include:
(i) The multi-regional hypothesis
(ii) The replacement hypothesis
(iii) The assimilation hypothesis.
Each of these hypotheses is based on fossil, archaeological, anthropological and
genetic evidence (Stringer, 2002; Mellars, 2006; Schick and Tooth, 1994). An
introduction to each of these hypotheses is given below:
The multi-regional hypothesis
Proponents of the multi-regional hypothesis (Fig. 1), or the Regional Continuity
Model, suggest that Homo erectus migrated out of Africa to the various regions of the
world nearly 2.0 million years ago (mya) and then gradually evolved into AMHs,
giving rise to our current worldwide distribution (Wolpoff et al., 1984; Nei, 1995). For
example, Asian H. erectus evolved into Asian modern humans, African H. erectus
evolved into African modern humans, and so on. It has also been reported that the
multi-regional model does not suggest parallel evolution, independent multiple
origins or the simultaneous appearance of characteristics within different regions
(Wolpoff et al., 2000). This hypothesis also states that the regional characteristics of
modern humans can be traced back to H. erectus remains that date to nearly 1 mya
(Nei, 1995). Genomic studies reveal no evidence of a Homo neanderthalensis
mtDNA contribution to AMHs (Hodgson and Disotell, 2008). This might be due to the
high rate of polymorphisms found between Neanderthal and modern human mtDNA
relative to that between any two modern human mtDNA sequences. However, some
sequences of the Homo erectus X chromosome were identified in the genome of modern
humans (Cox et al., 2008). This finding provides some genetic support for the
multi-regional hypothesis.
Figure 1. The multi-regional hypothesis of modern human migration (Stoneking, 2008).
Regions shown in the diagram: Africa, Asia, Australasia and Europe.
It should be noted that the model does not support the possibility of different H.
erectus populations breeding with one another; rather, it holds that breeding took
place mainly within isolated H. erectus populations. Hence, proponents of this
hypothesis conclude that each inhabited region showed a continuous anatomical
sequence leading to the development of modern humans, and that non-African
populations exhibited no special African influence (Stringer, 2002).
The Replacement hypothesis
The Replacement hypothesis, or Out-of-Africa theory, is the primary alternative to
the multi-regional hypothesis (Fig. 2).
Figure 2. Representation of the Replacement hypothesis of modern human migration
(Stoneking, 2008). Regions shown in the diagram: Africa, Asia, Australasia and Europe.
This hypothesis also describes an African origin. However, whereas proponents of the
multi-regional hypothesis focus mainly on H. erectus rather than on AMHs,
proponents of the replacement hypothesis suggest that modern humans originated
from an African H. erectus population about 100,000-200,000 years ago (Nei, 1995) or
perhaps ~150,000 years ago (Forster and Matsumura, 2005).
This indicates that modern humans first expanded inside Africa, then migrated
to the Middle East and then onwards to other regions. Advanced genetic techniques
were used to test this hypothesis (Whale, 2012). DNA obtained from Africans,
Asians, Australians, New Guineans and Europeans was analyzed for restriction
fragment length polymorphisms and it was concluded that the common ancestor of
all modern humans lived in East Africa between 140 and 280 kya (Cann et al., 1987).
The mtDNA sequences obtained from chimpanzees and humans were used to determine
the rate of mtDNA evolution and the results demonstrated that the common ancestor
of all modern humans dates to some 166-249 kya (Vigilant et al., 1991). Other
studies also supported the Out of Africa hypothesis and estimated the age of the
ancestor of modern humans at about 230-298 kya (Hasegawa and Horai, 1991; Ruvolo et
al., 1993). Later, the dispersal of AMHs out of Africa proceeded along a northern route
(the Levant) or a southern route through the Horn of Africa. Recently, an initial
single migration through the southern route has been supported by many
researchers based on an array of different data sets (Chandrasekar et al., 2009; Kumar
et al., 2009; Hudjashov et al., 2007; Mellars, 2006; Forster and Matsumura, 2005;
Macaulay et al., 2005; Kivisild et al., 1999). The Levantine migration had a lesser
impact and appears to have occurred more recently, about 20-10 kya (Forster and
Matsumura, 2005; Winters, 2011). A third migration has also been proposed. In this
case the route passed from North Africa through the narrow Strait of Gibraltar
approximately 40-35 kya, when Neanderthals were still present in western Eurasia
(Winters, 2011). The archeological and molecular genetic evidence supports a single
AMH origin in East Africa (Liu et al., 2006).
Research based on mtDNA and Y-chromosome variation also supports the Out of Africa
hypothesis. The results for Y-chromosome and mtDNA haplogroups show that
Australian Aboriginals and Melanesians derive from founder haplogroups (haplogroups
N and M for mtDNA, and haplogroups F and C for the Y chromosome) that are related
to the initial movement from Africa about 50-70 kya. Australian Aboriginals and the
indigenous populations of Papua New Guinea and Melanesia are related to each other and
once settled together; however, they were later separated by the Timor Strait (Hudjashov
et al., 2007).
The Assimilation model
The Assimilation Model (Figure 3) combines the two former theories in that
AMHs “arose through the integration of an important African role with multiregional
aspects” (Stringer, 2002). According to proponents of the assimilation model, Africa
is the origin of AMHs; however, the model also suggests that the migrations and
replacement of the archaic populations played a pivotal role in the local evolution of
various H. erectus populations into AMHs.
It has also been reported that the genomes of Eurasian AMHs share 1-4% of their
sequence with the Neanderthal genome, and that the Neanderthal genome shows a
greater affinity to the genomes of European AMHs than to those of African AMHs
(Green et al., 2010). Previous studies have also shown that the Neanderthal genome is
as similar to the genome of a French individual as to East Asian (Han Chinese) and
Papuan genomes. This pattern suggests that admixture between AMHs and Neanderthals
occurred soon after modern humans dispersed out of Africa, but prior to the
subsequent divergence of Europeans, East Asians and Papuans (Green et al., 2010).
Figure 3. Diagram representing the Assimilation model of the dispersal of
anatomically modern humans (Stoneking, 2008). Regions shown in the diagram: Africa,
Asia, Australasia and Europe.
1.1.2. Timeline and routes of modern human dispersal
Archaeological and genetic data suggest Africa as the home of AMHs (H. sapiens).
Phylogeographic studies utilizing uniparental non-recombining DNA, namely mtDNA
and the male-specific region of the Y chromosome (MSY), have largely clarified the
initial migration routes of anatomically modern humans (Underhill and Kivisild,
2007). The number of migration events of AMHs out of Africa is still debated, but
studies based on uniparental markers suggest a single migration event
(Oppenheimer, 2012; Underhill and Kivisild, 2007). A simplified sketch of the initial
dispersal of AMHs is presented in Figure 4.
Figure 4. The main migration routes and timing of the out-of-Africa migrations of
humans, adapted from Oppenheimer (2012)
Haak (2015) investigated the massive migration of humans from the steppe towards
Europe using ancient DNA and found support for a steppe origin (the steppe
hypothesis) of at least some of the Indo-European languages of Europe.
The timeline for the migration(s) of anatomically modern humans out of Africa is
controversial. However, several lines of evidence show that modern humans migrated
out of Africa some 100-72 kya and moved eastwards towards the Indian
subcontinent via the Arabian Peninsula (Oppenheimer, 2012; Relethford, 2008). After
the initial dispersal out of Africa, anatomically modern humans traveled southeast and
reached Australia about 60-50 kya (Oppenheimer, 2012; Rasmussen et al., 2010;
Relethford, 2008). The migration of anatomically modern humans to Europe from
the Arabian Peninsula occurred approximately 40-50 kya (Soares et al., 2010;
Relethford, 2008; Novelletto, 2007), and at about 40 kya Central Asia was inhabited
by humans arriving from Pakistan through the East Asian sea coast (Oppenheimer,
2012). Later on, approximately 30-20 kya, the population from Central Asia migrated
westward toward Europe and eastward into Beringia (Oppenheimer, 2012).
Most studies indicate that humans from Beringia reached Alaska
approximately 20-15 kya (Oppenheimer, 2012; Raff et al., 2011; O'Rourke and Raff,
2010). However, there is an alternative hypothesis about human migration to the
Americas, which states that a Pacific coastal route was used for migration from
Siberia to South America, followed by a second migration across the Bering
land bridge into North America (O'Rourke and Raff, 2010; Schurr and Sherry, 2004).
Most studies indicate that the last geographic region colonized by
anatomically modern humans was Oceania, specifically Polynesia (Kayser et al.,
2010). Some evidence has also been offered to suggest that anatomically modern
humans migrated to Southeast Asia and Australia about 60-70 kya along the coasts
of the Indian Ocean (Lahr and Foley, 1994).
It has also been reported that Pakistan was the first geographic region through which
anatomically modern humans migrated along this postulated southern coastal
route (Wolpert, 2000; Qamar et al., 1999).
1.2. The Islamic Republic of Pakistan
Pakistan is home to more than 200 million people and at least 18 ethnic groups that
speak more than 60 local languages, which have been assigned to a wide array of
linguistic stocks including, but not limited to, Indo-Iranian, Indo-Aryan, Dardic,
Tibeto-Burman and Dravidian (Grimes and Grimes, 2000; Newcomb, 1986). Pakistan
occupies the eastern Hindu Kush, western Himalaya and southern Karakoram. All these
famous mountain ranges meet in Pakistan at Jaglot, near Gilgit. Pakistan lies at the
crossroads of West Asia, Central Asia and South Asia (Ali et al., 2005). This region is
marked by a high degree of ethnic diversity, which historically has been attributed,
at least partially, to a long and dynamic history of repeated invasions by Aryans,
Macedonians, Arabs, Mongols and others (Lapidus, 2002; Bernhard, 1983; Birdwood,
1959). It is also believed that the Makran coast of Pakistan and present-day
Afghanistan likely served as a passage for human dispersal in prehistoric times,
making the population dynamics of this region even more interesting (Derenko et al.,
2013). Additionally, the Hindu Kush highlands served as a physical barrier that
channeled trade along the “Silk Route” that linked the Mediterranean Basin and
West Asia to China for more than 16 centuries (Petraglia et al., 2012; Kuzmina 2008;
Elisseeff 2001; Quintana-Murci et al., 1999). Furthermore, Pakistan is one of the South
Asian countries that hosted two well-known civilizations: the Indus Valley or Harappan
civilization, which flourished between 2600 BC and 2000 BC (Kenoyer, 1998), and the
Gandharan civilization, which peaked between 1500 and 1000 BC (Miller, 1985;
Basham, 1963). It is therefore possible that the extant populations of the Hindu Kush
highlands show traces of historic and even prehistoric gene flow from far distant
human populations. Currently, Pakistan is divided into five provinces (Punjab,
Sindh, Baluchistan, Gilgit-Baltistan and Khyber Pakhtunkhwa (KP)) and the Federally
Administered Tribal Areas (FATA) (Fig. 5).
Figure 5. Geographic location of Khyber Pakhtunkhwa Province of Pakistan
Khyber Pakhtunkhwa, where Pashtuns are in the majority, is situated in the
northwestern part of Pakistan and is recognized as the heart of the Gandhara
civilization (Zwalf, 1996). About 300 historic sites in different areas of the province
have been identified (Arif, 2014; Docherty, 2007). Remains of animals and humans,
together with bronze coins, recovered during excavations in Kashmir have shed light
on Gandhara culture (Dani, 1980). The Bronze Age coins, dated to about
3000 BC, were found to be associated with the Alxon Hunnic Kings of Gandhara,
Bactria and other dynasties. Pre-Harappan civilization (4000 BC) sites were also
discovered at Rehman Dheri, located on the trade route in KP that connects South
Asia, Eastern Iran, southern Afghanistan and Central Asia (Khan, 2013a; Durrani et
al., 1991). Furthermore, about 50,000 petroglyphs and inscriptions found along the
Karakoram Highway near Shatial and at Thag Nala near Astor, dating back to the 5th
to 9th century BC (Khan, 2013b), attest to the movement of people from different
regions of the world into Pakistan. The historical view has been that KP was a region
inhabited by Indo-Aryans in 2000 BC (Renfrew, 1996). The mtDNA haplotypic
diversity of the populations of India, Afghanistan, Iran, Turkey and Central
Asia indicates that genetic influx from the Fertile Crescent to the Indian
subcontinent was more frequent than that from East to West (Kaifu et al., 2015;
Quintana-Murci et al., 2004). The historical record documents that KP was ruled by the
Persians in 550 BC, the Macedonian dynasty in 330 BC, the Mauryan Empire in 322 BC,
the Kushana monarchy in 250 BC and the realm of Kabul Shahi in AD 1000, experienced
a Ghaznavid invasion in AD 997, was incorporated into the Turk-Mongol Gurkani
domain in AD 1200 and the Yuan Dynasty in AD 1271, experienced influxes of Pashtuns
beginning in the 16th century and came under the British Empire in the 18th century
(Marbaniang, 2015; Tamimi, 2009; Rome, 2008; Aslamkhan, 1996; Barth, 1956). The
merging of foreign elements with the indigenous inhabitants brought unique social
and cultural features and a high level of diversity to the population of KP (Crews,
2015). Hindko, Saraiki, Khowar, Gujri, Kohistani and Pashto are the primary languages
spoken in different regions of the province (Cunliffe, 2015; Bouckaert et al., 2012) and
their respective areas are illustrated in Figure 6.
The province comprises 27 districts, including Bannu, Buner, Peshawar,
Abbottabad, Mansehra, Shangla, Swabi, Upper Dir, Lower Dir, Tank, Swat,
Nowshera, Mardan, Karak, Tor Ghar, Kohistan, Hangu, Haripur,
Batagram, Lakki Marwat, Kohat, Malakand, Chitral and D.I. Khan, with a population
of approximately 31 million according to the most recent census, held in 2017 (GOP,
2002).
Figure 6. Geographic distribution of primary languages spoken in Khyber
Pakhtunkhwa, Pakistan.
1.3. The Study Area
Swat and Dir districts were selected and explored for the dental morphology and
molecular anthropology of their major ethnic groups. A brief account of both
districts is provided below:
1.3.1. District Swat
Swat is a district located in the Khyber Pakhtunkhwa (KP) province of Pakistan with
a population of around 1.26 million (according to the 1998 census; GOP, 2002). It is
the largest among all the valleys of the Hindu Kush and encompasses an area of some
6226 km2 between 34°30' and 35°55' N latitude and 71°45' and 72°50' E longitude. The
altitude of Swat ranges from 600 m in the south to more than 6000 m in the north,
with the highest peak, Falaksair, attaining an elevation of 6261 m AMSL (Ali et al.,
2012; Ahmad and Ahmad, 2003). The valley borders Indus Kohistan and Shangla
to the east, Chitral and Ghizer to the north, Buner and Malakand Agency to the
south, and Dir to the west (GPO, 1998). The geographic position of District Swat
is presented in Figure 5.
The Swat valley, bounded by the Hindu Raj mountains, occupies an important
position among the Hindu Kush and Himalayan mountains of Pakistan and is famous
for its natural resources and biodiversity (Ahmad et al., 2015). Historical accounts
confirmed by archeologists indicate that the valley was occupied in the prehistoric
period between 2400 and 2100 BC (Ali and Khan, 1991; Stacul, 1969). Several
civilizations have passed through Swat in different waves. Swat was first referred to
as Suvastu, which means ‘good dwelling’ in Sanskrit, in the sacred book Rigveda, the
religious account of the Aryans, while the Latin and Greek
historiographers of Alexander’s army referred to the Swat Valley as Soastos, a name
derived from the Swastu of Vedic origin. In Buddhist literature, Swat is still named
‘Urgyan’ or ‘Orgayan’ (Tucci, 1958). Both these terms, ‘Urgyan’ and ‘Orgayan’, are
phonetic versions of the Sanskrit word ‘Uddyana’. According to the Chinese travelers
Fa Hien, Wicking, Hiuen Tsang and Song Yun, Swat remained under Gandharan
control in the 5th-8th century AD (Hussain 1962; Shah, 1940; McMahoon and Ramsy,
1901). Fa Hien, who explored Swat in 403 AD, called it Won Chang in Chinese, or
‘park’ in English. He also mentioned that the people of Swat spoke an Indo-Aryan
language (McMahoon and Ramsy, 1901). Swat flourished for more than 1000
years under Buddhist and Brahminic rule, whose carvings and
inscriptions are still preserved on rocks, embroidery and wood carvings all over the
area. Ahmed and Sirajuddin (1996) maintain that the major feature of the
vegetational landscape of the area is Sino-Japanese in nature. The Aryans, alleged to
be emigrants from Central Asia, who likely spoke one or several proto-Indo-Iranian
languages, took over the region from Iran to northwest Pakistan in the second
millennium BC, while the mention of Suvastu (modern Swat) in the Rigveda attests to
the Aryan colonization of the Swat valley (Allchin and Allchin, 1982). In 327 BC,
Alexander crossed the Hindu Kush, travelled towards Afghanistan and occupied
Swat (Rome, 2008). At the decline of Greek power, Chandragupta Maurya attacked
the Macedonians and occupied the whole Punjab (Smith, 1914). In the Swat valley,
Chandragupta Maurya established his stronghold at Mura Hill, in Malakand Agency,
which remained the last stronghold of the Gujar Dynasty in the Hindu Kush region
(Anonymous, 1998; Ahmad et al., 2011). He expanded his empire and, during the
reign of his grandson, known as Asoka the Great, Buddhism predominated in
Swat in the 3rd century BC (Khattak, 1997).
After the fall of the Mauryan Dynasty, the Bactrian Greeks took over the whole
region of Gandhara, the Khyber Pass, Hunza and Swat. Swat was retained until the
invasion of the Turk Shahis, who expanded the realm of Kabul from the borders of
Seistan to the north of Punjab during the 7th century AD, and in 745 AD Swat was
completely occupied (Rehman, 1979). After the downfall of the Turks, the Hindu Shahi
dynasty established its rule in AD 822, which lasted until the 11th century AD
(Rehman, 1993). Sultan Mahmud Ghaznavi (Mahmud of Ghazna) occupied the valley of
Swat in the 11th century AD, defeating Raja Gira, whereupon the Pashto language and
Islamic laws were introduced. Later on, the valley was occupied by Dilazak and Swati
Pathans/Pashtuns (Swati, 1997). The Yousafzai Afghan Pashtuns/Pathans placed
their mark on the valley in the 16th century, defeating the Swatis (Rome, 2008; Qasmi,
1939). Today the Swat valley is identified with three ethnic groups: Pashtuns/Pakhtuns,
Gujars and Kohistanis (Barth, 1956).
1.3.2. District Dir
Dir District comprises hilly and mountainous terrain consisting of the main Dir Valley,
several side valleys, narrow mountain gorges and part of the plains of the Ranizai
area. Its high peaks range from 4876 m in the northeast to 3048 m along the borders
with Swat to the east and with Afghanistan to the west (Rahatullah et al., 2011). The
total area of District Dir was 5284 km2 when it was a single district; it is now divided
into two separate districts (i.e., Lower Dir, 1,585 km2, and Upper Dir, 3,699 km2) and
lies in the Hindu Kush range between 71°50 to 71° 83E longitude and 35°10 to 35°16N
latitude (Ali et al., 2008). The census report of 1998 revealed that the total population
of the area is approximately 1.38 million (GOP, 2002). Along the western border, from
north to south, stretches the mountain range known as the Koh-i-Hindu Raj (Fig. 1).
To the east, from north to south, lies the mountain range of Swat and Dir, which
serves as a boundary between the two districts and, in the north, separates Swat
Kohistan from Dir Kohistan (Hazrat et al., 2007). The district is also bounded by Bajaur
Agency to the west, Malakand District to the south and Chitral to the
north (Figure 5). Dir was invaded by Alexander, the Buddhists and the Mughals, but
the most important event was the settlement of the Yousafzai in the 16th century
(Shah, 2013). As in Swat, Pashtuns/Pakhtuns are the major ethnic group of District Dir,
followed by Gujars and Kohistanis, while the majority of the people speak the Pashto
language, followed by Gojari and Kohistani (Bellow, 1994). The selected ethnic groups
of the study area are shown in Figure 7, and a brief historical review of each is given
below.
Figure 7. Graphical representation of the present study population samples
from Swat and Dir districts. The selected populations are Pashtuns (Utmankheil,
Tarklani and Yousafzai), Gujars and Kohistanis.
1.4. The Pashtuns/Pakhtuns
Pashtuns are an Eastern-Iranian-speaking Afghan ethnic group with a widespread
geographic distribution in southern and eastern parts of Afghanistan and in the
northwestern portion of Khyber Pakhtunkhwa as well as in Baluchistan province of
Pakistan (Haber et al., 2012; Caroe, 1976). The terms Afghan, Pukhtun, Pathan and
Pashtun are used synonymously in the literature (Glatzer, 1998). The origins of the
Pashtuns are rather poorly understood, not only in terms of population genetics, but
also in terms of history (Sabitov, 2011).
There are many hypotheses about the origins of, and inter-relationships among, the
various ethnic groups subsumed under the more general term “Pashtun.” Some
historians are of the opinion that Pashtuns are the descendants of Jews (Qamar et al.,
2002; Caroe, 1958). Some European authors maintain that Pashtuns are a
Caucasian ethnic group descended from Armenians, while perhaps the strongest
argument is that Pashtun Afghans are basically Aryans (Elphinstone,
2011; Mirabal et al., 2010; Robson and Lipson, 2002). Some genetic evidence also
suggests that there is a very close relationship between Ashkenazi Jews and
Pashtuns (Bhatti et al., 2016a). It has also been reported that Pashtuns originated
from Greeks (Firasat et al., 2007). Furthermore, the Pashtuns cannot be defined by
their ethnicity alone; they are also defined by speaking Pukhto/Pashto and
by practicing a set of traditional cultural values known as
Pakhtunwali/Pashtunwali, also called Pukhto (Barfield, 2010; Coningham and
Young 2015; Bohner and Lucarini, 2015; Khan, 2008; Nusser and Dickore, 2002;
Caroe 1958). Among the ethnic subgroups of Pashtuns (Fig. 8), the Yousafzai,
Tarklanis and Utmankheils were selected for this dissertation because these ethnic
groups are representative of the study area.
The genealogy of the Pashtuns is summarized in Figure 8.
Afghanan
Qais Khalid Bin Waleed
Sarbani Bitan Ghurghakhti Karlanri
Sharkbun Hussainkhel Krozai Sanzarkheil Bangash

Karshbun Dotani Mattizai Essakheil Mehsuds
Ghoryakhel Khattak Sahak Yasinzai Orakzai
Kasi Stanikzai Kakar Wazir Wardak
Tareen Lodhi Jadoon Utmankheil
Shinwari Niazi Musakheil Afridi
Tarklani Ghilzai Safi Khatak
Yousafzai Zadran
Mohmand
Daudzai
Abdali
Alikozai
Barakzai
Achakazai
Figure 8. The generally accepted genealogy of the Pakhtuns of Khyber
Pakhtunkhwa, Pakistan (modified from Caroe, 1958).
1.4.1. The Yousafzai
The Yousafzai (literally meaning “Sons of Joseph”) are a sub-tribe of Pashtuns
found in the northern areas of KP, Pakistan (Tokayer, 2007). The Yousafzai have
spread over a large area that stretches from the Bajaur Agency, contiguous with the
Durand Line, to the easternmost reaches of Mansehra (Caroe, 1958). The Pakhtuns
residing in Swat, Dir, Buner, Shangla, Mardan, Swabi and Malakand mainly belong
to the Yousafzai sub-tribe (Barth, 1959).
It is clear from history that the Yousafzai inhabited Kabul, along with other Pakhtun
tribes like the Muhammad Zai and Khalil Mumand, but due to clashes with the Mughal
ruler, Mirza Alagh Beg, they migrated from Kabul to Peshawar at the end of the 15th
century under the guidance of their leaders Malik Ahmad and Sheikh Malli
(Sirajuddin, 1970). After expelling the Delazak from Peshawar, the Yousafzai
occupied the Mardan, Swabi, Swat, Buner, Dir and Bajaur valleys, pushing the native
population into other areas such as Hazara or the inaccessible mountain
gorges (Yasin, 2008; Barth 1959; Caroe 1958). Due to the historic position of the
Yousafzai among the Pashtuns, they are the most widely studied and
recognized population in terms of tribal and clan structures, genetic profile, politics,
history, language and marriage practices (Lindholm, 1982; Ahmed, 1976; Barth, 1959;
Caroe, 1958). According to Ilyas et al. (2015), all populations share a similar
demographic history between 1 million and 200 kyr ago. From 200 kyr ago to 20 kyr
ago, the Pathan follow a similar trajectory to other Asian and European populations,
with an inferred effective population size smaller than that of African populations,
reflecting the out-of-Africa bottleneck. Over the last 20 kyr, the Pathan show an
explosion in effective population size, contemporaneous with that of other Eurasian
populations but much greater in magnitude. The very large effective population size
likely reflects admixture between European and Asian lineages giving rise to
modern Pathans rather than an actual increase in census size.
1.4.2. The Utmankheils
The Utmankheils are a Pathan subtribe who inhabit a large tract of country
spread across the hills surrounding the valley of Peshawar, including the country
west and southwest of the junction of the Swat and Dir (Panjkora) rivers, Bajour,
Malakand Agency and some parts of Mardan (Murray, 1899; FATA, 2010; International
Crisis Group, 2006). The Utmankheil appear to have acted in concert with the Tarklani
and Yousafzai in the campaigns just referred to and, at about the same time as the
conquest of Swat and Dir by the Yousafzai, were settled in the country they currently
occupy. The Utmankheil belong to the Karlanri subtribe of Pashtuns and, within the
Karlanri, the origin of the Utmankheil clan is debated, as their current
sub-populations are descended from a child of unknown origin adopted by the
Pashtuns (Barfield, 2010; Caroe, 1958). The Utmankheil are further divided into the
Ismailzai, Bimmarai, Mandal, Muttakai, Alizai, Sanizai, Aseel, Gorai, Boot Khel and
Shamozai clans (Yaad, 1986).
1.4.3. The Tarklanis
The Tarklanis (Tarkanis) are a clan within the Sarbani subtribe of Pashtuns. They are
mainly found in the Federally Administered Tribal Areas (FATA) of Pakistan, while
a large number also reside in Kunar Province of Afghanistan and in Lower Dir
District of Khyber Pakhtunkhwa Province, Pakistan (Rehman et al., 2016; Caroe,
1958; FATA, 2010; Tareekh-e-Kakzai, 1993; International Crisis Group, 2006).
The Tarklanis are further divided into four clans: Mammund, Salarzai,
Isazai and Ismailzai (Yaad, 1986). They came with the Yousafzai from central
Afghanistan, replacing the Dilazak of the Peshawar valley, and moved towards Swat, Dir
and Malakand Agency in the 15th century A.D., where they obtained separate ownership
(Political and Secret Department, 1933). Among the Pashtun ethnic groups of the
present study, the Utmankheils and Tarklanis have not been as widely studied as the
Yousafzai; therefore, a brief account is provided here for these two of the three
sampled populations.
1.5. The Kohistani
The word “Kohistan” literally means “the place of mountains.” Physically,
Kohistan is divided into three areas: Dir Kohistan, Indus Kohistan and
Swat Kohistan, while the people living in all three of these regions are referred to as
“Kohistanis” (Hamayun, 2005). Kohistanis speak an array of Dardic languages and
practice a wide range of agricultural and transhumant herding subsistence strategies
(Bangash, 2012; Barth, 1956). The Kohistanis are commonly thought to be the
descendants of the ancient nomadic herders of the area who were forced into the
mountainous highlands from the low-lying fertile plains of Dir and Swat by Pashtun
invaders from the west during the 16th century A.D. (Shah, 2013; Rome, 2008; Barth,
1956). Prior to the 15th or 16th century, the Kohistanis were non-Muslim, but under
the influence of the Yousafzai immigrants, they converted to Islam (Baart and Sagar,
2002). The population of Kohistanis in Swat and Dir districts is estimated to be
between 60,000 and 70,000 individuals (Hamayun, 2005).
1.6. The Gujars
Gujars, who speak Gujari/Gojri (a lowland Indo-Aryan language), are an ethnic
group found in northern India and the mountainous regions of northern Pakistan,
northern Afghanistan and Kashmir (Grimes and Grimes, 2000; Lalata et al., 1971;
Barth, 1956). The spelling of Gujar is not standardized, and the group may be referred
to by any of the following: Gurjara, Gojar, Gujjar, Goojar, Gujar and Gurjjara.
Gujars are ancient pastoralist and farming communities who herd livestock or dairy
buffalo; most are settled or semi-settled agriculturalists who practice seasonal
transhumance (Gooch, 1992; Barth, 1956). The pastoral Gujars, who speak Gojari along
with other local languages, are said to be the descendants of the ancient Gurjaras.
There are many hypotheses regarding the origin of the Gujars and their
inter-relationships. Some anthropologists identify the Gujars with the Kushans, an
Indo-Scythian tribe (Cunninghum, 1865). They are considered to be Central Asian in
origin, from where they reached India along with the Huns in the 5th century A.D. and
settled in Rajasthan. In the 16th century A.D. they moved from Rajasthan towards
Himachal Pradesh, and then Kashmir and Punjab. It has also been reported that the
Gujars migrated from Georgia, also called Gurjistan (in Persian, Turkish and Arabic),
through Afghanistan and reached India (Tyagi, 2009). Previous genetic work found the
Gujars to be genetically closer to the pastoral, cattle-farming Gola ethnic group in
India than to other Pakistani ethnic groups (Raza et al., 2013). Gujars all over the
sub-continent claim to be indigenous natives since time immemorial. Indeed, many
Gujars also claim with confidence that they are Kshatriyas by origin, descendants of
the Suryavanshi Kshatriyas (Sun Dynasty), and connect themselves with the Hindu deity
Rama, without any trace of so-called foreign origin (Lalata et al., 1971).
According to the 1941 census report of India, the tribe called "Gurjaras" was
established in the area near Mount Abu in Rajasthan around the 6th century A.D. The
"Gurjaras" were Hindus at the time they first appeared in India, and they established
their own kingdom in A.D. 640. It seems that the Gujars successfully resisted the
Arab invasion from the north early in the eighth century A.D. It is alleged that about
A.D. 750 the Chapa dynasty of the Gurjaras, which was in power for about 200
years, was displaced by the Pratiharas in A.D. 1000. They embraced Islam after
being defeated by Mahmud of Ghazni, and their kingdom fully flourished during the
reign of Akbar. In India the Hindu Gujars are assimilated into several other groups
of Hinduism, while in Pakistan the Gujars are considered a tribe (Parishad and
Bharatiya, 1996).
Today the Gujars are prominent in agriculture and urban professions, contribute
substantially to the civil services and hold large tracts of land, especially in the
northern parts of Pakistan and India. The population of Gujars in India is
approximately 30 million, while in Pakistan their population is about 33 million. Due
to food shortages and disasters caused by wars, the Gujars migrated northwards toward
Kashmir and occupied many areas of the region including Rajasthan, Gujarat and
Kathiawar (Wreford, 1943). A portion of the migrating Gujars also moved to the
northern areas of Pakistan, including Swat and Dir, some 400 years ago (Chauhan,
2001; Rome, 2008; Barth, 1956). Although the country is inhabited by a population of
tremendous ethnic diversity, the diversity among the people of this region
remains largely unknown genetically.
1.7. The genetic characterization of humans
Genetic characterization of modern human populations is very important for
investigating or confirming archeological, anthropological and other information
related to human history, genetic polymorphisms, racial biases and medical
relevance (Bodmer, 2015; Macaulay et al., 2005; Renfrew, 2000; Ingman et al., 2000;
Excoffier and Langaney, 1989; Cann et al., 1987). The evaluation of molecular
techniques used to study the genetic structure of human populations and the results
obtained can yield much insight into human health and history (Bodmer, 2015).
Previous studies have interpreted the presence of genetic sub-structure in human
populations as the consequence of the migration patterns of subgroups and genetic drift.
Consequently, individuals of the same group are genetically more similar to each
other than to individuals of another group (Henn et al., 2016;
Novembre, 2011; Tishkoff et al., 2009; Jakobsson et al., 2008; Rosenberg et al., 2002;
Cavalli-Sforza et al., 1994).
Genetic divergence among populations may arise from non-random mating among
isolated populations, while genomic diversity within and among populations
is determined primarily by mutation and by demographic factors such as
effective population size and the extent of migration (i.e., gene flow) among
populations (Slatkin, 1987; Wright, 1951). Population subdivision, expansion
dynamics and migration patterns can be analyzed through the use of different
molecular techniques (Risch et al., 2002). Several other fields have been and remain
actively engaged in elucidating human history and evolution in addition to
molecular evolution and genetic approaches to the origins and distribution of the
human species across the globe.
The human story in the form of recorded text goes back only as far as 4,000 years.
Historical linguistics and the languages spoken today hold the evidence of their
origin for more than 10,000 years (Jobling et al., 2004). Archaeological evidence
provides the ability to study human history, sometimes at great time depth, through
the analysis of such physical remains as bones, teeth, stone tools, pottery, waste
deposits, coins, inscriptions and dwellings left by members of past populations.
Paleontology, however, provides a very deep ancestral record of human beings, while
molecular anthropology is the most recent approach to estimate human history
(Jobling et al., 2004; Cavalli-Sforza et al., 1994).
Genetic variation at the individual level not only yields insight into the past but can
also be used to shape the future through its ramifications for medicine, including
prevention methods, disease susceptibility and response to drug
treatment. Several studies have demonstrated individual differences in terms of
disease risk and response to medicines (Bamshad et al., 2004; Jorde et al., 2001).
Consequently, knowledge of the variation among members of different races at the
genetic level is essential for the effective planning of prevention and treatment
strategies.
At the beginning of the 20th century, genetic differentiation within and across the
various major geographic groups of humanity was explored through ABO blood
group patterning (Landsteiner, 1901). Furthermore, the importance of such genetic
variation only became apparent when individual differences in proteins were
systematically studied in the 1950s and 1960s (Cavalli-Sforza et al., 1994). Genetic
variation has been widely studied since the expansion of evolutionary genetics, the
availability of analytical tools and the development of more effective and economical
means of DNA amplification (Jobling et al., 2004; Cavalli-Sforza et al., 1994).
Recently, variation in uniparental markers found on the Y-chromosome and mtDNA has
been studied to investigate the dispersal and origin of modern humans (Torroni et al.,
2006; Forster, 2004; Jobling and Tyler-Smith, 2003). However, earlier studies
usually focused on particular genes that were investigated for their influence on a
specific phenotypic property or disease risk; therefore, the variation investigated
would have been subject to selection pressures. The completion of the Human Genome
Project and the advent of sequencing technologies, such as Sanger sequencing and Next
Generation Sequencing (NGS), have given molecular geneticists access to large amounts
of information within the genome as a database for investigating human evolution and
diversification (Garrigan and Hammer, 2006; Margulies et al., 2005; Przeworski et al.,
2000; Sanger et al., 1977). The information contained in mtDNA, Y-STRs and dental
morphology also provides very important tools for phylogenetic studies as well as for
the investigation of human origins (Larmuseau et al., 2015; Nesheva, 2014; Bailey,
2002).
1.8. Dental Morphology/Dental Anthropology
Dental anthropology is the study of humans present and past from the evidence
provided by teeth (Hillson, 1996). Teeth provide valuable evidence about prehistoric,
historic and modern populations, not only in terms of the morphological features of the
crown and root but also because they can preserve high-quality DNA for
molecular anthropological analyses (Damgaard et al., 2015; Higgins and Austin,
2013; Brook and Scheers, 2006).
Dental morphology is a field of study that arose in the 19th century and is
used to register, analyze, interpret and understand all aspects of dental crown and
root morphology that yield insight into human groups, their cultural activities,
biological conditions and quality of life (Irish and Scott, 2016; Moreno et al., 2004;
Carabelli, 1842). The traits present on human teeth are used for population-based
studies, can serve as identification markers, and provide the basis for
comparisons of genetic origin, thereby allowing the classification of human groups
in taxonomic, phylogenetic and evolutionary categories by means of their frequency,
expression of sexual dimorphism, bilateral symmetry and morphological
characteristics (Rodriguez, 1999; Rodriguez, 2003). The morphological
traits present on teeth are often preserved in good condition among post-industrial
modern humans because of the presence of enamel, which makes teeth resistant to
unfavorable conditions for a long time (Moreno and Moreno, 2005).
These biological traits are expressed in humans and are transferred to subsequent
generations in a manner much like other genetically controlled traits, such as blood
groups, fingerprint patterns, skin colour and height. They are of varying utility in the
reconstruction of phylogenetic relationships among various species, the study of
evolutionary changes in the dentition and of the impact of diet upon the dentition, and
the estimation of the degree of biological distance observed among various communities
(Scott and Turner, 1997; Walimbe and Kulkarni, 1993).
Teeth have long been used by anthropologists for the reconstruction of life through
the examination of pathological afflictions suffered by members of ancient
populations that shed light on the general health conditions, diet and even the social
status of individuals (Hemphill, 2012; Eshed et al., 2006; Cucina and Tiesler, 2003;
Hillson, 1979). Similarly, the status of dental eruption can be used for the
determination of age at death for infants and juveniles, while both micro- and
macroscopic tooth wear, when calibrated for local conditions, can be used for
estimating adult age at death and for obtaining information regarding the foods consumed
(Teaford and Lytle, 1996; Smith, 1991). Teeth may also be used by forensic
anthropologists for the identification of individuals and in the study of human
evolution and, most recently, certain dental traits have been used for the estimation of
human ancestry (Edgar, 2013; Pretty and Sweet, 2001).
1.8.1. The Birth of Dental Anthropology
The history and origin of dental anthropology go back to the late 19th century, when
researchers first focused on the teeth of mammals and reptiles and compared them
to the human dentition (Osborn, 1888). The early researchers used teeth to categorize
fossils, record pathological status, describe natural variations in human teeth and
compare their presence and frequency in various populations distributed
throughout the world (DeSantis, 2016; Drennan, 1929; Hellman, 1928; Gregory, 1926;
Bolk, 1922; Gregory, 1922; Sullivan, 1920; Osborn, 1907; Owen, 1845). Georg von
Carabelli (1842) was the first researcher who reported and described the presence of
a small accessory cusp on the mesiolingual surface of the protocone of the maxillary
molars of Europeans (Scott and Turner, 1997). This was given the name Carabelli’s
trait and is recorded in most dental anthropological evaluations
(Marado and Campanacho, 2013; Hsu et al., 1999; Reid et al., 1991; Hassanali, 1982;
Townsend and Brown, 1981; Scott, 1980). Variations in enamel and root anatomy
were also observed among various races (Hellman, 1928; Tomes, 1889; Flower, 1885;
Owen, 1845). Ales Hrdlicka (1920) identified the shovel-shaped incisor, which plays
a pivotal role in the classification system and researchers consider it a basic dental
morphological trait in the field of dental anthropology (Scott and Turner, 1997;
Hrdlicka, 1920). Hrdlicka also observed similarities, variations and the level of
shovelling expression in American Indian and Asian populations and its clear
departure from that observed in African and European dentitions (Hrdlicka, 1924;
Hrdlicka, 1920). The identification of stable morphological traits in canines, incisors,
premolars and molars improved the analytical ability of dental morphology-based
investigation of human biological differences (Dahlberg, 1945).
Recently, studies of dental variation in both hominins and modern humans have
improved significantly. Dental anthropological studies have illuminated Plio- and
Pleistocene hominin dental morphology (Gomez-Robles et al., 2008; Gomez-Robles et
al., 2007; Bailey, 2004; Wood et al., 1988; Wood and Engleman, 1988; Wood and
Uytterschaut, 1987; Wood et al., 1983; Wood and Abbott, 1983), provided new
information in the study of Neanderthals (Bailey et al., 2011; Bailey, 2002), supported
microwear-based investigations of dietary variability among hominins (Lucas et al.,
2008; Scott et al., 2005; Teaford and Ungar, 2000), identified behavioral patterns and
wear-related remodeling (Margvelashvili et al., 2013), and clarified the phylogenetic
relationships of the newly discovered hominin Australopithecus sediba and other
hominin species (Irish et al., 2013).
Research based on variations in dental development between modern humans and
ancestral hominins has revealed new insights into dental relationships between
these taxa, along with new techniques to visualize internal and external dental
structure from two-dimensional surfaces using a low-magnification microscope
(DeSantis, 2016; Smith and Tafforeau, 2008).
Single and multiple dental morphological traits are commonly used to investigate
different groups of human populations for phylogenetic relationships (Mihailidis et
al., 2013; Matsumura et al., 2009; Townsend et al., 1990; Kieser, 1984; Mayhall et al.,
1982; Scott and Dahlberg, 1982; Kaul and Prakash, 1981; Kieser and Preston, 1981;
Townsend and Brown, 1981; Scott, 1980; Suzuki and Sakai, 1973). About 100
dental morphological traits have been reported to date, and new
traits are added frequently (Cunha et al., 2012). In 1990, a standardized
methodology was introduced for dental morphological scoring and observation,
following the techniques introduced by Hrdlicka (1920) and Dahlberg (1945) (Scott
and Turner, 2008). A series of rank-scaled reference plaques for 36 dental non-metric
traits was developed, called the Arizona State University Dental Anthropology
System, or ASUDAS. These plaques were accompanied by a set of rules and
guidelines for observers (Turner et al., 1991), which need to be followed carefully to
minimize inter- and intraobserver error and ultimately maximize comparability.
1.8.2. Dental anthropology investigations in South Asia
To determine the biological affinities between prehistoric and living South Asian
populations, it is important to understand the dispersal routes of early humans in
South Asia, also called the Indo-Pak subcontinent. Dental morphological
features are well suited to this purpose because, once they are expressed within a
given tooth, they remain unchanged unless altered by pathological or physical damage.
Because dental features are moderately to highly heritable, these traits provide a
reliable picture of the genetic relationships among past populations and may be
used to test hypotheses about past human migration patterns within and across
continents. Dental anthropology is a recently emerging approach for exploring variation
within and among the various populations of the Indo-Pak subcontinent (Hemphill,
2013; Hemphill, 2012; Hemphill, 2009a; Blaylock, 2008; Sharma, 1983; Kaul and
Prakash, 1981; Sharma and Kaul, 1977). Variation in the frequency of non-metric
dental traits of the permanent teeth has been used to determine biological distances
among South Asian prehistoric skeletal series. Relevant studies have focused on
early agriculturalist Chalcolithic groups of the Deccan Plateau (Lukacs, 1987),
Chalcolithic and Neolithic samples from Mehrgarh, a site located in Baluchistan
Province of Pakistan (Lukacs and Hemphill, 1991; Lukacs, 1986), and Iron Age series
from Sarai Khola, Timargarha (Lukacs, 1983) and Parwak (Ali et al., 2005), all located
in northern Pakistan. The first descriptions of dental morphology of early
Holocene hunters focused on the site of Sarai Nahar Rai in the mid-Ganga Plain of
North India, but the small sample size prevented assessment of biological
relationships (Kennedy et al., 1986). Non-metric dental trait frequencies and inter-group
bio-distances were reported from a site nearby known as Mahadaha (Lukacs and
Hemphill, 1992). Researchers at the Anthropological Survey of India, the University
of Chandigarh, and the University of Sri Venkateswara, Tirupati have also
reported dental morphological trait frequencies from South Asian skeletal samples
and living ethnic groups. The researchers from Chandigarh University
worked on Jats (Kaul and Prakash, 1981), Tibetans (Sharma, 1983), Punjabis (Sharma
and Kaul, 1977) and populations of Andhra Pradesh (Rami-Reddy, 1985). Hindu caste
Vaghelia Rajputs and Garasias, as well as tribal Bhils, were reported from Gujarat,
while caste Marathas and Mahars, along with tribal Madia Gonds and urban mixed-caste
samples from the city of Pune, were reported from the State of Maharashtra in
west-central India (Hemphill et al., 2000; Lukacs and Hemphill, 1992). More recently,
2,455 living individuals were reported from samples of seven populations living in
the northern areas of Pakistan, including the residents of Madak Lasht and Swatis
(Hemphill et al., 2010; Hemphill, 2009b). Additional dental morphology studies have
been conducted among the Khows of Chitral District (Hemphill et al., 2008) and
Awans of Mansehra District (Hemphill, 2012).
1.8.3. Non-metric dental morphological traits
Non-metric dental traits are morphological variants of the root and crown that vary
among populations, and because of this variation researchers can use them to gain
insight into human ancestry (Maula, 1993).
Non-metric traits are usually scored in two ways: (i) traits such as groove
patterns, accessory ridges, supernumerary cusps and roots are recorded as
“presence-absence,” or (ii) as differences in form, such as curvature and angles
(Scott and Turner, 1997; Hillson, 1996). When present, many of these traits vary in
the degree to which a particular morphological structure is expressed (e.g. cusp or
ridge size) (Scott and Turner, 1997).
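As a minimal illustration of presence-absence scoring of a graded trait, the sketch below dichotomizes rank-scale grades at a breakpoint and reports the trait frequency among scorable teeth. The function name, the breakpoint of 2 and the grades listed are hypothetical and are chosen only for illustration; they are not taken from the present study or from the ASUDAS guidelines.

```python
from typing import Iterable, Optional


def trait_frequency(grades: Iterable[Optional[int]], breakpoint: int) -> float:
    """Return the share of scorable teeth graded at or above the breakpoint.

    A grade of None marks a tooth that could not be scored (e.g. worn or
    missing) and is excluded from the denominator.
    """
    scored = [g for g in grades if g is not None]
    if not scored:
        raise ValueError("no scorable observations")
    present = sum(1 for g in scored if g >= breakpoint)
    return present / len(scored)


# Hypothetical grades for one trait in eight individuals (illustration only).
example_grades = [0, 1, 3, None, 2, 4, 0, 5]

# Assumed breakpoint: grades of 2 or more are counted as "present".
frequency = trait_frequency(example_grades, breakpoint=2)
print(f"trait present in {frequency:.1%} of scorable teeth")
```

In this example four of the seven scorable teeth reach the assumed breakpoint. Changing the breakpoint changes the reported frequency, which is why the breakpoint used should always be stated alongside trait frequencies.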
1.8.4. Basic terminology used in dental morphology
Dental anthropologists use basic terms when describing specific regions or
expressions of the dentition. These terms help researchers orient themselves within the
dentition and make it easier to describe morphological traits on a specific tooth.
The terms are mesial: toward the anatomical midline, i.e. the sagittal plane
that runs between the two central incisors; distal: away from the midline; buccal:
toward the cheek; labial: toward the lip; lingual: toward the tongue; and occlusal:
the chewing surface of a tooth (Scott, 1997).
Figure 9. Diagram representing the positional terms for the teeth and jaws
1.8.5. Analysis of dental morphology traits
Dental morphological traits are analyzed with the help of an internationally
recognized system called the Arizona State University Dental Anthropology System
(ASUDAS), which features 36 rank-scale reference plaques that illustrate minimum,
maximum and intermediate expressions of specific traits. The ASUDAS procedures
also help to standardize the observation and scoring of more than 40 specific
crown, root and intraoral osseous morphological traits of the human permanent
dentition (Turner et al., 1991). The most frequently occurring dental morphological
traits are described below. Winging is present in the central incisors of the maxilla
and can be identified when the lateral margins of the antimeres are rotated labially
(Enoki and Dahlberg, 1958) (Fig. 10a). The peg-shaped (reduced and cone-shaped)
character is found in the upper lateral incisors and is very rare compared with other
tooth traits (Scott and Turner, 1997) (Fig. 10b).
Labial convexity of the upper incisors, mostly found in upper incisor 1 (UI1), is
defined as the roundness of the labial surface of UI1 (Nichol et al., 1984; Scott and
Turner, 1997). Shoveling (Figs. 10c & d) is found in canines and upper and lower
incisors with well-differentiated distal and mesial lingual ridges (Hrdlicka, 1920;
Dahlberg, 1956; Scott and Turner, 1997). Double-shoveling occurs in canines, first
premolars and upper and lower incisors, while UI1 is said to be the key tooth for this
trait (Dahlberg, 1956) (Fig. 10e). The interruption groove (IG), also called the
corono-radicular groove, appears in the upper incisors, most commonly in UI2 (Scott
and Turner, 1997) (Fig. 10f). Tuberculum dentale, also known as the median lingual
ridge, is present in upper canines and incisors (Nichol and Turner, 1986) (Fig. 10g).
Figure 10. Morphological traits of canines and incisors with respect to
ASUDAS. (a) Winging of central incisors (b) Peg-shaped upper lateral incisors
(c) Reference plaque for shoveling UI1 (d) Shovel-shaped UI2 (e) Reference
plaque for double-shoveling (f) Interruption groove in upper lateral incisors (g)
Tuberculum dentale (median lingual ridge) for UI1 (h) Reference plaque for
distal accessory ridge upper canines (DAR UC).
The Bushman canine (canine mesial ridge) is common in canines, especially in upper
canines, and is said to be a combination of the mesial marginal ridge of the canine
with a projection of the cingulum on the primary tubercle (Morris, 1975; Scott and
Turner, 1997). The canine distal accessory ridge appears in upper and lower canines
(Morris, 1975; Scott and Turner, 1997) (Fig. 10h).
The “Uto-Aztecan premolar,” also known as the disto-sagittal ridge (Fig. 11a), is found
in the first maxillary premolars (Morris et al., 1978). Premolar mesial and distal
accessory cusps (Fig. 11b) occur in the upper premolars (Turner, 1967). The
number, conformation and position of the lingual cusps are assessed in the
lower premolars (Fig. 11c) (Scott and Turner, 1997; Kraus and Furr, 1953;
Pedersen, 1949).
Figure 11. Morphological traits of premolars (a) Reference plaque for Uto-Aztecan
premolars (b) Premolar accessory cusps (c) Premolar lingual cusp.
The metacone, hypocone, metaconule, Carabelli and parastyle traits are the most
common traits studied in the upper molars (Dahlberg, 1951; Turner, 1979; Scott and
Turner, 1997; Harris, 1977; Bolk, 1916).
The metacone or cusp 3 is a primary cusp of the upper molars (i.e. M1, M2 and M3)
found in the distobuccal quadrant of the tooth (Fig. 12a). The metacone is almost
uniformly fully developed on M1, shows some reduction on M2, and is often reduced
or even absent on M3 (Hillson, 1996). The disto-lingual cusp found on the upper
molars is the hypocone or cusp 4 (Fig. 12b). This trait is most common on M1, while
on M2 and M3 it may be reduced or sometimes absent (Hillson, 1996).
The trait found in the distal fovea between the metacone and hypocone on the distal
marginal ridge of the upper molars is known as the metaconule or cusp 5 (Fig. 12c).
Carabelli’s trait is an extra tubercular structure found at the base of the mesio-lingual
surface of cusp 1 (protocone) in upper M1, M2 and M3 (Fig. 12d). The parastyle
(extra cusp) is a trait found on the buccal surface of the upper molars (Fig. 12e). It
may be very small or may be expressed as a well-defined extra cusp, found mainly
on M3 and rarely on M1 (Hillson, 1996). The parastyle is regarded by some
dental anthropologists as one of the most important features found on the buccal
surface of the molars (Scott and Turner, 1997).
Figure 12. Morphological traits with respect to ASUDAS reference plaques (a)
reference plaque representing the metacone (cusp 3) in upper molars
(b) reference plaque for scoring hypocone (cusp 4) in upper molars (c)
reference plaque for metaconule (cusp 5) in upper molars (d)
reference plaque for Carabelli’s trait (e) reference plaque for scoring
parastyle in upper molars with a well-pronounced example to the
right side of the scale.
The anterior fovea (precuspidal fossa) is a dental morphological trait found on the
anterior part of the occlusal surface of all three mandibular molars, but it is only
scored on lower M1 (Turner et al., 1991; Hrdlicka, 1924). Its identification and
scoring require experienced observers (Scott and Turner, 1997). In some cases it forms
a deep triangular fossa distal to the mesial marginal ridge (Fig. 13.1a). The deflecting
wrinkle is the median occlusal ridge of the metaconid that runs
down from the tip of the cusp toward the central fossa in lower molars M1, M2 and
M3 (Scott et al., 1997) (Fig. 13.1b). The protostylid is an extra cusp or outgrowth
present on the buccal surface of cusp 1 (protoconid) of the lower molars (Turner et
al., 1991) (Fig. 13.1c).
Figure 13.1. Morphological traits (a) reference plaque for the anterior fovea in
lower molars with an example to its right (b) reference plaque for
the deflecting wrinkle with an example to its right side (c)
reference plaque for the protostylid with an example to its right
side.
The groove pattern is identified as the configuration of contacts among different
cusps (Turner et al., 1991), which may take the form of the letters X or Y or the plus
(+) mark (Fig. 13.2a). The Y-pattern is recognized when cusps 2 and 3 are in contact,
the X-pattern when cusps 1 and 4 are in contact, and the +
pattern when all four major cusps are in contact (Turner et al., 1991). The
number of major cusps is commonly reported for the lower molars (Gregory, 1916;
Scott and Turner, 1997). Lower M1 most commonly has five major cusps: the
mesio-buccal (protoconid), mesio-lingual (metaconid), disto-buccal (hypoconid),
disto-lingual (entoconid) and distal (hypoconulid) cusps, although lower molars with
four or three cusps may also be found. The hypoconulid, or cusp 5, is the distal cusp
found on the occlusal surface of the lower molars (Fig. 13.2b). Its size can be scored
in the absence of the entoconulid (Turner et al., 1991).
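The groove-pattern rule described above can be written as a small decision function. This is only a sketch: the function name and the two Boolean inputs are hypothetical, and treating the + pattern as the case in which both the 2-3 and the 1-4 contacts are recorded is a simplifying assumption made here for illustration, not a statement of the ASUDAS procedure.

```python
def classify_groove_pattern(contact_2_3: bool, contact_1_4: bool) -> str:
    """Classify a lower-molar groove pattern from recorded cusp contacts.

    contact_2_3: cusps 2 and 3 are in contact (Y-pattern).
    contact_1_4: cusps 1 and 4 are in contact (X-pattern).
    Both contacts together are treated here as all four major cusps
    meeting, i.e. the + pattern (an assumption of this sketch).
    """
    if contact_2_3 and contact_1_4:
        return "+"
    if contact_2_3:
        return "Y"
    if contact_1_4:
        return "X"
    raise ValueError("no cusp contact recorded; pattern cannot be classified")


# Hypothetical observations for three lower molars (illustration only).
observations = [(True, False), (False, True), (True, True)]
patterns = [classify_groove_pattern(c23, c14) for c23, c14 in observations]
print(patterns)
```

Raising an error when no contact is recorded keeps unscorable teeth from being silently assigned to a pattern.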

Figure 13.2. Dental morphological traits and reference plaques (a) the Y- and X-
pattern (b) reference plaque for scoring the hypoconulid (cusp 5) (c)
reference plaque for scoring the entoconulid (cusp 6) (d) reference
plaque for scoring the metaconulid (cusp 7).
Cusp 6, also called the tuberculum sextum or entoconulid, is found lingual to cusp 5 in
the distal fovea of the lower molars (Fig. 13.2c). Cusps 5 and 6 are graded similarly
in terms of size (Turner et al., 1991). Cusp 6 is located between cusps 4
and 5 in lower M1, M2 and M3. The metaconulid, also known as the tuberculum
intermedium or simply cusp 7, is situated between cusps 2 and 4 in the lingual groove
of the lower molars (Fig. 13.2d). The key tooth for scoring cusp 7 is M1 of the lower
jaw (Turner et al., 1991).
1.9. Mitochondrial DNA (mtDNA)
A typical somatic cell performs many complex metabolic processes that are specific
to that cell, for example the synthesis of a specific protein required for a specific
function, and cellular energy in the form of adenosine triphosphate (ATP) is
required for these activities (Guimaraes-Ferreira, 2014; Davey et al., 2002). The cell
contains many organelles required for essential cellular functions. Among these
organelles, the nucleus and mitochondria are the most important. The DNA within
the nucleus is called nuclear DNA, or the nuclear genome, while the DNA within the
mitochondrion is called mitochondrial DNA (mtDNA). The mitochondrion synthesises some
of its own proteins and, as the main site of ATP production, is known as the powerhouse
of the cell (Butler, 2005; Jobling et al., 2004; Holland and Parsons, 1999). The
endosymbiotic theory of the origin of the mitochondrion is widely accepted. It is based
on a mutual symbiotic relationship between the cell and a bacterium, which eventually
led to the integration of the bacterium to form the mitochondrion (van der Giezen, 2011;
Jobling et al., 2004; Anderson et al., 1981).
Human mtDNA (Fig. 14) is a double-stranded circular molecule with a length of
approximately 16,569 base pairs (bp). It has 37 genes, of which 13 code for proteins,
22 for transfer RNAs (tRNAs) and 2 for ribosomal RNAs (rRNAs) (Ebner et al., 2011). It
also has a non-coding region, the displacement loop (D-loop), also called the control
region, which contains the regulatory sequences for the mtDNA origin of replication and
the promoters for transcription (Chang et al., 2010; Taanman, 1999; Anderson et al.,
1981). The protein-coding genes encode subunits of cytochrome b, cytochrome c oxidase,
ATPase and NADH dehydrogenase (Liu et al., 2011; Mckenzie et al., 2010; Ketmaier and
Bernardini, 2005).
Figure 14. Diagrammatic view of human mtDNA (labelled features: D-loop with np
16024-16383, 57-372 and 438-574; tRNA; cytochrome c).
The major portion of the non-coding region of mtDNA is the D-loop of 1,122 bp,
which contains the hypervariable segments (HVS): HVS-I (nucleotide position
[np] 16024-16383), HVS-II (np 57-372) and HVS-III (np 438-574) (Butler, 2012; Butler,
2005). These regions have mutation rates that are approximately ten times those
observed in the coding sequence (1.64273 × 10⁻⁷ for HVS-I and 2.29640 × 10⁻⁷ for
HVS-II) (Soares et al., 2009).
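The segment lengths implied by these coordinates can be checked with a short calculation. The sketch below is illustrative only: it treats the positions as inclusive, 1-based coordinates on a circular molecule of 16,569 bp, and it assumes that the control region ends at np 576, a boundary that is not stated in the text above. The function name `segment_length` is hypothetical.

```python
MTDNA_LENGTH = 16569  # length of human mtDNA in bp, as given in the text


def segment_length(start: int, end: int, genome_length: int = MTDNA_LENGTH) -> int:
    """Length in bp of an inclusive, 1-based segment on a circular genome."""
    if start <= end:
        return end - start + 1
    # The segment wraps past the last position and continues from position 1.
    return (genome_length - start + 1) + end


# Coordinates of the hypervariable segments quoted in the text.
hv_segments = {
    "HVS-I": (16024, 16383),
    "HVS-II": (57, 372),
    "HVS-III": (438, 574),
}
hv_lengths = {name: segment_length(s, e) for name, (s, e) in hv_segments.items()}

# Assumption: the control region runs from np 16024 to np 576.
d_loop_length = segment_length(16024, 576)

print(hv_lengths)
print(d_loop_length)
```

Under the assumed end point, the wrapped segment gives the D-loop length of 1,122 bp quoted above.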
Mitochondria are present in several hundred copies per cell and their inheritance is
unilineal via the maternal line (Butler, 2005; Jobling et al., 2004; Lightowlers et al.,
1997; Robin and Wong, 1988). Paternal transmission of mtDNA in humans has also
been reported up to the blastocyst stage in embryos (St John et al., 2000), but this
phenomenon is very rare and its contribution is considered negligible (Kraytsberg et
al., 2004).
1.9.1. MtDNA in human lineages
The Cambridge Reference Sequence (CRS), also called the original sequence of
mtDNA, was first obtained from the placenta of a European individual and therefore
reflects the characteristics of a European mtDNA lineage (Achilli et al., 2004;
Anderson et al., 1981).
MtDNA, along with autosomal DNA and Y-chromosomal DNA, has long been
used in studies of evolutionary biology, human history and population genetics
(Kivisild, 2015; Cann et al., 1987; Wallace et al., 1985). The high copy number per cell
(Piko and Matsumoto, 1976; Michaels et al., 1982), lack of recombination, maternal
inheritance (Kivisild, 2015; Hutchison, 1974), and high mutation rate (Brown et al.,
1979), have made mtDNA a unique tool for human evolutionary studies and
population genetics.
Phylogenetic study of mtDNA has played a central role in the identification of the
human maternal ancestor, known as the “mitochondrial Eve,” who inhabited Africa around
124,000-157,000 years ago (Fu et al., 2013; Poznik et al., 2013) and whose descendants
subsequently dispersed to the rest of the Old World and eventually into the New World
as well (Stewart, 2015; Behar et al., 2008; Torroni et al., 2006). The mtDNA migration
pattern is illustrated in Figure 15.
Figure 15. Human migration and haplogroup distribution across the world
(Stewart, 2015)
1.9.2. MtDNA Variation
Human mtDNA differs broadly across the globe, with populations of similar descent
or geographical origin sharing many of the same characteristics. In some cases, these
characteristics may indicate various historical events of the population including
admixtures with other populations or migrations (Whale, 2012).
An mtDNA haplotype is the combination of polymorphisms that differ from the CRS,
are transmitted together from mother to offspring and are not affected by
recombination. Thus, similar mitochondrial haplotypes share a set of common
mutations and can be traced to a common maternal ancestor. Individuals from the
same or similar populations may share the same mtDNA sequences (haplotypes) and
can be clustered together to form haplogroups (Wallace et al., 1999). A haplogroup is
defined by a set of slowly mutating markers shared by peoples of the same geographic
region (Jobling et al., 2004). Haplogroups are mostly continent-specific and are
therefore informative about modern human history and migration paths. MtDNA
haplogroups are indicated by letters of the Latin alphabet, and all of these letters,
except “O,” have been utilized (van Oven and Kayser, 2009). Among these (Fig. 16), L1,
L2, L3, L4, L5 and L6 represent African-specific mtDNA haplogroups and belong to the
“L” clade (Behar et al., 2008).
Figure 16. mtDNA PhyloTree and partitioning scheme representing subtrees
(van Oven et al., 2015).
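The idea of a haplotype as the set of differences from a reference sequence can be sketched in a few lines. The example below is a toy illustration, not the procedure used in this study: the sequences are invented (they are not CRS data), the start position 16020 is arbitrary, only substitutions are handled, and the function name `call_haplotype` is hypothetical.

```python
def call_haplotype(reference: str, sample: str, first_position: int) -> list[str]:
    """List substitutions in `sample` relative to `reference`.

    Both sequences must already be aligned and of equal length (no indels).
    Each variant is written as position followed by the observed base.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    variants = []
    for offset, (ref_base, obs_base) in enumerate(zip(reference, sample)):
        if ref_base != obs_base:
            variants.append(f"{first_position + offset}{obs_base}")
    return variants


# Toy sequences for illustration only; they are NOT real reference data.
toy_reference = "ACCTGATTCA"
sample_a = "ACTTGATTCG"
sample_b = "ACTTGATTCG"
sample_c = "ACCTGACTCA"

haplotype_a = call_haplotype(toy_reference, sample_a, 16020)
haplotype_b = call_haplotype(toy_reference, sample_b, 16020)
haplotype_c = call_haplotype(toy_reference, sample_c, 16020)

print(haplotype_a, haplotype_b, haplotype_c)
```

In this toy case samples A and B carry the same set of differences and would be counted as one haplotype, while sample C represents a second haplotype.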
All of the non-African lineages are said to have originated from L3 about 60,000 to
70,000 years ago (Soares et al., 2012; Behar et al., 2008). The African haplogroups L1,
L2 and L3 are also found in the Makrani population of the Baluchistan province of
Pakistan, among whom frequencies range from 28% to 39.4% (Siddiqi et al., 2015;
Quintana-Murci et al., 2004). Haplogroups M and N separated from haplogroup L3 about
77,000 years ago (Forster and Matsumura, 2005).
The M clade (including haplogroups C, D, E, G, Q and Z) is distributed in Asia,
Indonesia, Australia and the Americas. Indeed, more than 70% of mtDNA lineages
identified among the inhabitants of India belong to haplogroup M (Chandrasekar et
al., 2009; Metspalu et al., 2004).
Similar lineages were also found among South Indian tribal and caste populations, and
they account for all but three lineages among the Chenchus (Kivisild et al., 2003).
Haplogroup M is also common in the populations living in the southern region of
the Makran coast of Pakistan and in northwest India, with frequencies of 30-35%
(Quintana-Murci et al., 2004). On the other hand, haplogroup M is rare or absent
among populations residing west of the Indus Valley, while it is found at
frequencies of less than 12% among the populations of Central Asia (e.g., Uzbeks,
Turkmen and Shugnan) (Quintana-Murci et al., 2004).
Haplogroup N is widely distributed in Europeans and Oceanic populations in
addition to Indians, Native Americans and Asians. The parahaplogroup N* of
haplogroup N includes haplogroups A, I, S, W, X, and Y. Clade R is also included
within the clade N, which is also very common in Europeans and is further divided
into parahaplogroups R* and R0. R* is divided into haplogroups B, F, J, P, and T,
while R0 is further divided into HV, H, V and U, the last of which includes
haplogroup K (Fig. 16). Haplogroups U and M are found at similar frequencies in
Asia, especially in India and Pakistan (Quintana-Murci et al., 2004). Haplogroup H is
the most common and most recent haplogroup of Europe, with a total frequency of
40-45% there, 20% in the Caucasus and 10% in Arabian Gulf populations (Heinz, 2015).
Haplogroup H has been reported in different populations from Pakistan, with
frequencies of 28% among Sindhis, 26.3% among Brahuis, 20.5% among Baluchis,
13% among Hazaras and 12.3% among Burushos (Bhatti et al., 2016a;
Szecsenyi-Nagy et al., 2014; Brandt et al., 2013; Mikkelsen et al., 2008; Quintana-Murci
et al., 2004; Richards et al., 2000). Recently, the South Asian haplogroups M (28%) and
R (8%) and the West Asian haplogroups U (17%), HV (15%), H (9%), K (8%), J (8%), W (4%),
T (3%) and N (3%) were also found among members of the various Pashtun ethnic
groups sampled in northern Pakistan (Bhatti et al., 2016b).
MtDNA variation can be studied by direct sequencing of the control region or
via restriction fragment length polymorphisms (RFLPs). Haplogroups are distinguished
by the combination of HVS-I and HVS-II polymorphisms and RFLPs (Schurr, 2004a).
MtDNA is a very useful tool for characterizing variation and for estimating the time
elapsed since divergence on a branched tree, using either coalescence or distance
methods of estimation (Schurr, 2004b). It also plays a pivotal role in answering
questions related to local populations, in addition to those related to human origins
and evolution in general (Torres, 2016). Therefore, in the present study, mtDNA was
selected to find out more information about the local population samples from Swat
and Dir districts of Pakistan.
1.10. The Y-chromosome
The human Y-chromosome (Fig. 17) is about 60 megabases (Mb) in length and
is inherited uniparentally through the paternal line (Li et al., 2008; Jobling et al.,
2004). The Y-chromosome carries the sex-determining region Y (SRY) gene, which is
responsible for the sex determination mechanism through the activation of
the SOX9 gene, which in turn activates the sex-differentiating glands in males (Jiang
et al., 2013; Foster and Graves, 1994; Berta et al., 1990; Sinclair et al., 1990).
It also carries the male-specific region Y (MSY) and the pseudoautosomal regions
(PARs) (Fig. 17). The MSY encompasses about 95% of the entire Y-chromosome and is
composed of the euchromatic and some of the repeat-rich heterochromatic parts
(Li et al., 2008; Skaletsky et al., 2003). The total size of the euchromatin on the
Y-chromosome is approximately 23 Mb, of which 8 Mb is found on the Yp arm and
14.5 Mb on the Yq arm (Skaletsky et al., 2003).
Figure 17. Modified structure of human Y chromosome (Skaletsky et al., 2003;
Olofsson, 2015).
The Y-chromosome also has three classes of euchromatin sequences known as X-
transposed, X-degenerate and ampliconic (Skaletsky et al., 2003).
The ampliconic sequences span about 10.2 Mb; they are highly repetitive and
polymorphic and include palindromes 1 to 8 (Li et al., 2008; Skaletsky et al.,
2003).
The euchromatin contains Long Interspersed Nuclear Element 1 (LINE 1) sequences,
which account for about 36% of the X-transposed sequences (Skaletsky et al., 2003).
About 400 kb of heterochromatin lies within the euchromatin, while
the major part of the heterochromatin, about 35 Mb, is located on the distal
long arm of the Y-chromosome (Hughes et al., 2012; Skaletsky et al., 2003). It has
also been reported that tandem repeats present in the heterochromatic region
have no transcription factors (Alechine et al., 2016; Skaletsky et al., 2003). The
length polymorphism in the heterochromatin is responsible for human
Y-chromosomal variation (Repping et al., 2006).
A total of 78 transcriptional units have been recognized in the modern MSY, and
17 of these units are present in single copies (Navarro-Costa, 2012; Navarro-
Costa et al., 2010; Skaletsky et al., 2003). The majority of these genes are
responsible for sex determination and sperm formation in the testes (Olofsson et
al., 2015; Li et al., 2008).
The X-transposed regions carry TGIF2LY and PCDH11Y (also called
Protocadherin 11 Y). TGIF2LY is testis-specific, while PCDH11Y assists in
brain development in the fetus (Navarro-Costa, 2012; Skaletsky et al., 2003). The
X-degenerate region contains 27 single-copy genes or pseudogenes,
which code for approximately 15 proteins. Among these 27 genes,
13 show similarities with the exons and introns of the functional X homologue, while
the remaining 14 are transcribed as functional genes and are similar, but not
identical, to their X-linked counterparts. Furthermore, all
twelve ubiquitously expressed MSY proteins are found within the X-degenerate
region and are involved in sex determination and spermatogenesis (Skaletsky et
al., 2003; Lahn and Page, 1997).
The ampliconic region has nine protein-coding gene families, with copy numbers
ranging from 2 to 35, that are expressed only in the testes (Navarro-
Costa, 2012; Navarro-Costa et al., 2010; Skaletsky et al., 2003). Approximately 75
non-coding genes have also been identified in the ampliconic region, of which 65 are
specific to MSY families and 10 are found as single copies (Skaletsky et al., 2003).
1.10.1. Phylogenetic Tree based on human Y-chromosome
The information found on the haploid Y-chromosome is widely used as a molecular
marker in anthropology, genotyping, demography, genealogy, forensics, medicine
and evolutionary studies (Van Oven et al., 2014). Single nucleotide polymorphisms
(SNPs) and short tandem repeats (STRs) are two widely used markers present in the
non-recombining region of the Y-chromosome (Wang et al., 2015). Y-SNPs are slowly
evolving markers with mutation rates of about 3 × 10⁻⁸ per nucleotide per generation
(Xue et al., 2009). These markers are widely used to study the paternal relationship
between individuals and among members of different populations (Van Oven et al.,
2014; Underhill et al., 2000). The biallelic nature of Y-SNPs makes them an important
tool for constructing phylogenetic trees that link all the human reference populations
(Karafet et al., 2008; Consortium, 2002).
The first phylogenetic tree based on Y-SNPs was published in 2002 and was
updated in 2003 (Consortium, 2002; Jobling and Tyler-Smith, 2003). A further revised
tree was published in 2008 (Karafet et al., 2008). Since then, the Y-chromosome tree
has been updated continuously, and the most recent tree is publicly available at
http://www.isogg.org/tree/. The main structure of the most recent and updated
tree is shown in Figure 18.
Figure 18. Structure of the most recent and updated version of human
Y-chromosome phylogenetic tree (Van Oven et al., 2014).
The Y-haplogroups (Y-HGs) are named “A” to “T”, where Y-HG “A” represents the
deepest root of the Y-chromosomal tree (Karafet et al., 2008).
Today, a short revised nomenclature is used in which the first letter
represents the haplogroup or sub-haplogroup and is followed by the marker (e.g.,
R-U106 or R1b-U106), rather than the previous nomenclatural system (e.g.,
R1b1a2a1a1) (Olofsson, 2015).
The second class of mutations found in the NRY consists of microsatellites, or short
tandem repeats (STRs), which have repeat units of 2–6 base pairs (bp) (Willems et al.,
2016; Roewer et al., 2001; Goldstein et al., 1996) and a mutation rate of 3.83 × 10⁻⁴
mutations per generation (mpg) (Willems et al., 2016). Y-STRs have a wide range of
applications in forensics (criminal, sexual assault and paternity cases), in the study
of human history and migration patterns, and in the construction of phylogenetic trees
that link human populations with each other to determine their genetic relatedness and
possible origins (Kareem et al., 2015; Butler, 2011; Underhill and Kivisild, 2007).
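To give a sense of what a rate of this magnitude means for a multi-locus haplotype, the following sketch estimates the probability that at least one locus mutates between father and son. It is a simplified illustration under stated assumptions: every locus is assumed to mutate independently at the single average rate quoted above (real Y-STR loci differ considerably in rate), and the panel size of 27 loci is taken as an example. The function name `prob_any_mutation` is hypothetical.

```python
Y_STR_RATE = 3.83e-4  # mutations per locus per generation, value quoted in the text


def prob_any_mutation(n_loci: int, rate: float = Y_STR_RATE, generations: int = 1) -> float:
    """Probability that at least one of `n_loci` loci mutates.

    Assumes independent loci that all share the same mutation rate.
    """
    return 1.0 - (1.0 - rate) ** (n_loci * generations)


# Example panel size (an assumption for illustration).
p_one_generation = prob_any_mutation(27)
p_ten_generations = prob_any_mutation(27, generations=10)

print(f"one generation:  {p_one_generation:.4f}")
print(f"ten generations: {p_ten_generations:.4f}")
```

Under these assumptions roughly one father-son pair in a hundred would differ at one or more of the 27 loci, which illustrates why Y-STR haplotypes diversify quickly compared with Y-SNPs.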
Thus, Y-STRs are ideal molecular markers: they are transferred from father to son
without recombination, they have a high level of diversity, they are simple to
genotype, they are sensitive to genetic drift and they permit the prediction of
informative haplotypes (Kareem et al., 2015; Marjanovic and Primorac, 2013; Butler,
2012). A haplotype is the genetic information received from lineage
markers such as Y-STRs (Butler, 2012).
The convergence of Y-STR haplotypes among different haplogroups has
compromised the accuracy of haplogroup prediction. Therefore, for samples with
ambiguous Y-STR haplotypes, additional typing with Y-SNPs is a very promising method
for confirming the haplogroup and resolving it more finely (Wang et al., 2013).
1.10.2. Y-chromosomal haplogroup distribution across the globe
Human migration may be predicted through Y-chromosomal haplogroup
distribution. The variation may be due to bottlenecks, founder effects and genetic
drift occurring along the migration routes across different regions (Olofsson, 2015).
The Y-chromosome haplogroups (Y-HGs) A and B-M60 are said to be very frequent
in African populations (Gomes et al., 2010; Hammer et al., 2001; Underhill et al., 2001).
Populations residing in the Horn of Africa and in North Africa have very high
frequencies of Y-HGs E-M96, E-M35, J-M304 and E-M81 (Trombetta et al., 2015;
Bekada et al., 2013; Gomes et al., 2010; Sanchez et al., 2005; Hammer et al., 2001;
Underhill et al., 2001). Y-HG E-M2 has also been reported in sub-Saharan African
populations, with frequencies as high as 80% in West Africa and 60% in Central Africa
(Trombetta et al., 2011).
Y-HG C-M130 is very common among the populations of Oceania and Asia
(Stoneking and Delfin, 2010; Karafet et al., 2008; Hammer et al., 2001; Underhill et al.,
2001). The occurrence of that sub-haplogroup among Native Americans confirms
their Asian origins (Geppert et al., 2011; Karafet et al., 2008; Zegura et al., 2004).
Haplogroup D-M174 is more common in Japan and Central Asia, while O-M175, D-
M174 and N-M231 are frequently distributed in East Asians (Zhong et al., 2011;
Karafet et al., 2008). Individuals from Oceania and Indonesia are limited to Y-HG M-
P256 and S-M230, respectively (Karafet et al., 2008; Hudjashov et al., 2007).
Haplogroups R-M207 and I-M170 are frequently distributed across Europe (Karafet
et al., 2008; Rootsi et al., 2004; Rosser et al., 2000; Semino et al., 2000). Haplogroup R-
M269 is found in Central and Western Europeans (Busby et al., 2012). Y-HG Q-M242
is very common in northern Eurasia and in some Siberian populations, while its sub-
haplogroups are distributed with low frequencies across European, Middle Eastern
and East Asian populations (Karafet et al., 2008).
The Eurasian Y-chromosomal lineages are common in the Indo-Pakistani sub-continent
(Karafet et al., 2008; Sengupta et al., 2006). Y-HG R1a-M417 occurs widely throughout
the Eurasian continent, especially among the populations found in South and
Central Asia (Underhill et al., 2015; Karafet et al., 2008; Novelletto, 2007; Sengupta
et al., 2006; Rosser et al., 2000; Semino et al., 2000). Haplogroups H-M69 and R1a1-M17
are widely distributed in India and Pakistan, while haplogroup R1a1a-M17 is very
common among the populations residing in the tribal areas of Khyber Pakhtunkhwa
province of Pakistan (Lee et al., 2014; Trivedi et al., 2008).
Exploring the information contained in mtDNA, Y-STRs and tooth morphology is very
important for phylogenetic studies, yet no such study has ever been conducted to
investigate the relationships among the different ethnic groups of the Hindu Raj region.
Therefore, the current project was designed to characterize five populations
(Yousafzai, Gujars, Tarklani, Kohistani, Utmankheil) residing in Swat and Dir
districts through dental morphology, mtDNA and Y-STRs with the following
objectives.
Objectives
• To elaborate dental morphological variations among the major ethnic groups
in Swat and Dir districts.
• Genealogical study of the ethnic groups in the area using mitochondrial
hypervariable segments 1 and 2.
• Genetic characterization of Y-chromosomal STR haplotypes in individuals
from Swat and Dir.
• Studying genetic diversity using human dental morphology.
• Statistical and bioinformatics analysis of the data produced.
Chapter 2
MATERIALS AND METHODS

Samples from five ethnically distinct populations, viz. Tarklani, Yousafzai,
Kohistani, Gujar and Utmankheil, were collected from volunteers residing in
different areas of Swat and Dir districts of Khyber Pakhtunkhwa, Pakistan. The
sampling sites are shown in Fig. 19.
Figure 19. Geographic location of the study area. The colored circles represent
location of villages where samples were collected.
Members of three of these population samples (Tarklanis, Yousafzai, and
Utmankheils) are commonly recognized as sub-groups within the Pashtun ethnic
group. Ethnicity was self-declared and first degree relatives were identified and
excluded from the study. All participants gave their informed written consent after
the aims and procedures of the study were explained to them. The present research
was approved by the Institutional Bioethical Committee of Hazara University,
Mansehra, Pakistan (Appendix-I).
2.1. Sample collection for dental morphology study
2.1.1. Collection of dental Casts
A total of 823 dental casts were collected from male and female Tarklani, Yousafzai,
Kohistani, Gujar and Utmankheil volunteers of Swat and Dir districts who had signed
consent forms (Table 1).
Table 1: Details of samples collected from Swat and Dir districts
S.No  Ethnic group  Sampling site               Males  Females  Total No. of casts
1     Gujars        Gabral and Miandam, Swat    85     80       165
2     Kohistani     Bahrain and Kalam, Swat     89     85       174
3     Tarklani      Maidan, Dir                 75     75       150
4     Utmankheil    Maidan, Dir                 75     75       150
5     Yousafzai     Mingora, Swat               94     90       184
The consent form was designed according to the guidelines of the Institutional
Bioethical Committee of Hazara University (Appendix-II). All information about
the geography and ethnic affiliation of the volunteers was recorded for
further analysis. Before sample collection, all participants were asked to clean
their teeth with the toothpaste and brushes provided by the research team, so that
the cavities of the teeth were free from food (Fig. 20). An
optimized procedure was used to reduce the chances of gagging during sample
collection.
Figure 20. Filling, signing of consent form, and cleaning of teeth by volunteer
individuals.
2.1.2. Selection of volunteers
Volunteers for the present research were selected on the basis of ethnicity,
relatedness and condition of teeth. Individuals between 12 and 22 years of age
with teeth in good condition were considered for dental casting. The ethnicity of the
volunteer was self-declared or was provided by the individual’s parents. Those who
did not fulfill the selection criteria were not included in the study sample.
2.1.3. Biosafety Measures
Sterilized and autoclaved dental trays and alginate (Cavex CA37), which is widely
used in dental anthropology research, were selected for dental casting. The alginate
used for template preparation was easy to separate from the sample dental casts. A
mixture of alginate and water in a rubber bowl was prepared in semi-fluid form and
poured into an impression tray of the appropriate size. The alginate-filled impression
tray was placed in the mouth for two minutes and then removed gently. After being
removed from the mouth, the tray was rinsed with water to remove saliva and so avoid
bubbles and erosion of the impression (Fig. 21).
Figure 21. Placement and removal of the alginate-filled impression tray from
the subject’s mouth.
2.1.4. Dental casting and labeling
Fine diestone powder (DentAmerica, CA 91744, U.S.A) was mixed with water
for pouring into the alginate impressions. Two mixtures of plaster (thin and thick)
were prepared to obtain a good-quality dental cast. The thin mixture was
poured into the alginate impression first, to avoid bubbles and to give better
visualization of the traits, and the thick mixture was then added to make
the cast stronger and more resistant to damage. The trays filled with diestone were
kept in a sunny area to dry, and the casts were labeled carefully before being
removed from the trays. The labeled casts were then removed from the trays and dried
thoroughly. Tissue paper was wrapped around the dried casts to protect them from
breakage and the casts were stored for further analysis (Fig. 22).
Figure 22. Pouring of diestone mixture into the alginate impression mold and
labeling of dental casts.
2.1.5. Grading and scoring of dental morphology traits
The Arizona State University Dental Anthropology System (ASUDAS) (Scott and
Turner, 1997; Turner et al., 1991) was followed for the scoring and grading of dental
morphological traits of the samples with the help of 23 reference plaster plaques
(Turner et al., 1991) (Fig. 23). The dental morphology data derived from the five
samples were converted into dichotomized (presence or absence) format for further
analysis.
Figure 23. Scoring of dental morphology traits using the ASUDAS reference
plaques
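The conversion of graded ASUDAS scores into presence/absence data can be sketched as follows. This is a minimal illustration rather than the scoring procedure used in this study: the threshold of 2, the example grades and the function names `dichotomize` and `trait_frequency` are all hypothetical, and the breakpoint actually applied to each trait would have to be taken from the relevant literature.

```python
from typing import List, Optional


def dichotomize(grade: Optional[int], threshold: int) -> Optional[int]:
    """Return 1 (present) if grade >= threshold, otherwise 0 (absent).

    A grade of None marks a tooth that could not be scored and stays None.
    """
    if grade is None:
        return None
    return 1 if grade >= threshold else 0


def trait_frequency(scores: List[Optional[int]]) -> float:
    """Proportion of scorable individuals in which the trait is present."""
    observed = [s for s in scores if s is not None]
    if not observed:
        raise ValueError("no scorable observations")
    return sum(observed) / len(observed)


# Hypothetical grades for one trait in five casts; None = not scorable.
cusp7_grades = [0, 1, 3, None, 2]
cusp7_presence = [dichotomize(g, threshold=2) for g in cusp7_grades]
cusp7_frequency = trait_frequency(cusp7_presence)

print(cusp7_presence, cusp7_frequency)
```

Keeping unscorable teeth as missing values, rather than counting them as absent, prevents wear or damage from lowering the apparent trait frequency.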
2.2. Analyzing the DNA
The DNA was analyzed for both paternal and maternal lineages using saliva as a
source. The methods used in the present study for the collection of saliva and DNA
isolation are described below in detail.
2.2.1. Collection of saliva samples
Volunteers from the five selected populations (Tarklanis, Yousafzai,
Utmankheil, Gujars and Kohistanis) were given instructions before the collection of
saliva samples. Each individual was given two to five minutes to clean their mouth
thoroughly, using a toothbrush provided by the research team, to minimize the
chance of contamination (Fig. 20). After cleaning
their mouth, the individuals were instructed to wait for two minutes until new
epithelial cells were produced. A 5% sucrose solution was then given
and the subject was instructed to keep it in their mouth for two minutes. The
individuals were then asked to spit the solution into a sterile specimen collection
cup, and the saliva was stored in styrofoam coolers in the field until it was
delivered to the research laboratory. The samples were processed directly for
DNA extraction upon delivery to the lab.
2.2.2. Genomic DNA extraction
Good-quality genomic DNA (gDNA) was extracted from the saliva containing
human epithelial cells using the optimized protocol established in our research lab
by Akbar et al. (2015). The materials, chemicals and the preparation of stock solutions
used in the present study are detailed in Appendix-III. A total of 2 ml of saliva was
taken in a 2 ml centrifuge tube and centrifuged at 3578 ×g for two minutes to obtain a
pellet of epithelial cells. 100 µl of cell lysis solution from stock (2 ml lysis buffer +
20 µl β-mercaptoethanol and 2 µl proteinase K) was added to the pellet and vortexed
until the pellet was dissolved in the lysis solution. The sample tube was then kept in
an incubator for one hour at 56ºC. 600 µl of a 1:1 phenol and chloroform solution was
added and, after gentle shaking, the tube was incubated for 5 to 10 minutes at room
temperature, followed by centrifugation at 5590 ×g for 12 minutes. After centrifugation,
500 µl of supernatant was carefully transferred to a sterile 1.5 ml tube. An equal
volume (500 µl) of isopropanol was then added to the supernatant
and the tube was incubated for 20 minutes at -20ºC. After incubation, the sample was
centrifuged at 5590 ×g for 10 minutes and the supernatant was discarded, while the
pellet was washed with 70% ethanol at 3578 ×g for five minutes. The ethanol was
removed from the tube carefully and the pellet was air dried. 50 µl of distilled water
was added to the dried pellet of gDNA and incubation was carried out for five
minutes at 56ºC.
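Because the centrifugation steps are reported as relative centrifugal force (×g), reproducing them on a particular centrifuge requires a conversion to rotor speed. The sketch below uses the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm², with r in centimetres. The rotor radius of 5 cm is an assumption made purely for illustration and is not stated in the protocol; the radius of the rotor actually used should be substituted. The function name `rcf_to_rpm` is hypothetical.

```python
import math


def rcf_to_rpm(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) that gives `rcf` (x g) at the stated rotor radius."""
    return math.sqrt(rcf / (1.118e-5 * radius_cm))


# Assumption: a microcentrifuge rotor with a 5 cm radius (not stated in the text).
ASSUMED_RADIUS_CM = 5.0

# Forces quoted for the pelleting, wash and phase-separation steps.
protocol_rcf = (3578, 5590)
protocol_rpm = {rcf: round(rcf_to_rpm(rcf, ASSUMED_RADIUS_CM)) for rcf in protocol_rcf}

print(protocol_rpm)
```

With the assumed 5 cm radius, the two forces used in the extraction correspond to about 8,000 and 10,000 rpm; a different rotor would give different speeds for the same force.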
2.2.3. Screening of the purified gDNA
The concentration and quality of the purified gDNA were determined with a Qubit
fluorometer (Invitrogen, Life Technologies, cat. number Q32857) using the Qubit
dsDNA HS assay kit (Invitrogen, cat. number Q32854) and with an Agilent 2200
TapeStation instrument using a genomic DNA ScreenTape assay, according to the
instructions provided by the manufacturers. Traditional agarose gel electrophoresis
(1% agarose gel) was also performed to determine the quality of the purified gDNA.
2.2.4. Agarose gel electrophoresis
For gDNA quantification, a 1% agarose gel was prepared by adding one gram of agarose
powder to 100 mL of TAE buffer and heating it for one minute in a microwave oven.
When the temperature of the solution reached 40-45ºC, 15 µL of
ethidium bromide was added and mixed well by shaking. The solution was
poured smoothly, to avoid bubbles, into a gel casting tray with combs and kept at
room temperature until it solidified. The combs were removed from the solid gel, and
the gel was placed in the electrophoresis tank containing the required volume of TAE
buffer. A total of 8 µL (5 µL DNA + 3 µL loading dye) was loaded into each well of the
agarose gel. A voltage of about 80-100 V was applied to the electrophoresis
apparatus until the dye had migrated away from the wells. The presence and position
of the DNA bands were visualized and photographed using a gel documentation system.
2.3. Mitochondrial DNA characterization
2.3.1. PCR Amplification of target DNA
The isolated gDNA was used as a template for the PCR amplification of the mtDNA
control region. A fragment about 450 bp long at nucleotide positions (np) 15974-16424,
covering HVS-I, and a fragment about 550 bp long at np 07-557, covering HVS-II, were
amplified using Taq DNA polymerase. Primers (Table 2) for the present study were
designed from the Cambridge reference sequence, accession No. NC_012920 (Andrews et
al., 1999). The components of the PCR mixture used for amplification are given in
Table 3 below.
Table 2: Details of the primer sequences used in the present study for the
amplification of the target fragment of the mtDNA control region.
S.No.  Oligo name  Sequence (5'-3')               %GC  Tm (ºC)
1      HVS-1 (F)   CTCCACCATTAGCACCCAAAGCTAAG     50   59.5
2      HVS-1 (R)   GATATTGATTTCACGGAGGATGGTGGTC   46   59.9
3      HVS-2 (F)   AGGTCTATCACCCTATTAACCACTCACG   46   60.0
4      HVS-2 (R)   GGTGTCTTTGGGGTTTGGTTGGTTC      52   59.3
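The %GC column of Table 2 can be reproduced directly from the primer sequences. The sketch below is an illustrative check only; the function name `gc_percent` is hypothetical, and the melting temperatures are not recomputed because the formula used to obtain them is not stated.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = sequence.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


# Primer sequences as listed in Table 2.
primers = {
    "HVS-1 (F)": "CTCCACCATTAGCACCCAAAGCTAAG",
    "HVS-1 (R)": "GATATTGATTTCACGGAGGATGGTGGTC",
    "HVS-2 (F)": "AGGTCTATCACCCTATTAACCACTCACG",
    "HVS-2 (R)": "GGTGTCTTTGGGGTTTGGTTGGTTC",
}

# Rounded to whole percentages, as in the table.
gc_table = {name: round(gc_percent(seq)) for name, seq in primers.items()}

for name, value in gc_table.items():
    print(f"{name}: {len(primers[name])} nt, {value}% GC")
```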
Table 3: Components and concentrations of the PCR reaction mixture per sample
S.No.  Reagent                   Volume    Final concentration / amount
1      10X Taq buffer            2.5 µL    1X
2      2 mM dNTPs                2.0 µL    0.16 mM
3      25 mM MgCl2               2.0 µL    2.0 mM
4      10 pmol/µL F-primer       2.0 µL    20 pmol
5      10 pmol/µL R-primer       2.0 µL    20 pmol
6      Taq polymerase (5 U/µL)   0.5 µL    2.5 U
7      DNA template              2.0 µL    30 ng
8      ddH2O                     12 µL     12 µL
       Final volume              25.0 µL
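The final concentrations in Table 3 follow from the dilution relation C1 × V1 = C2 × V2. The sketch below recomputes three of them from the stock concentrations and volumes listed in the table; it is an illustrative check, and the function name `final_concentration` is hypothetical.

```python
def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Concentration after diluting `volume_ul` of stock into `total_ul` (C1*V1 = C2*V2)."""
    return stock * volume_ul / total_ul


# Volumes per reaction, in microlitres, as listed in Table 3.
volumes_ul = {
    "10X Taq buffer": 2.5,
    "dNTPs": 2.0,
    "MgCl2": 2.0,
    "F-primer": 2.0,
    "R-primer": 2.0,
    "Taq polymerase": 0.5,
    "DNA template": 2.0,
    "ddH2O": 12.0,
}
total_ul = sum(volumes_ul.values())

buffer_x = final_concentration(10.0, volumes_ul["10X Taq buffer"], total_ul)  # in X
dntp_mm = final_concentration(2.0, volumes_ul["dNTPs"], total_ul)             # in mM
mgcl2_mm = final_concentration(25.0, volumes_ul["MgCl2"], total_ul)           # in mM

print(total_ul, buffer_x, dntp_mm, mgcl2_mm)
```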
2.3.2. Thermocycling conditions for PCR
The thermocycling conditions for both HVS I and II were as follows: initial
denaturation at 95ºC for four minutes, followed by 35 cycles of denaturation
at 94ºC for 40 seconds, annealing at 56ºC for one minute and extension at 72ºC
for one minute, and a final extension at 72ºC for five minutes. All of the
thermocycling conditions for HVS I and II were the
same, except the annealing temperature, which was 55ºC for HVS II (Fig. 24).
Figure 24. Representation of thermocycling profile for PCR. Figure (A)
represents PCR conditions for HVSI, while figure (B) represents PCR
conditions for HVSII.
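The thermocycling profile can be written down as a small data structure, which also makes it easy to estimate the run time. The sketch below is one reading of the conditions described above (a single initial denaturation, 35 cycles of denaturation, annealing and extension, then a final extension); ramping time between temperatures is ignored, and the names `Step` and `program_seconds` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def program_seconds(initial: Step, cycle: List[Step], n_cycles: int, final: Step) -> int:
    """Total hold time of a PCR program, ignoring ramping between temperatures."""
    return initial.seconds + n_cycles * sum(s.seconds for s in cycle) + final.seconds


initial_denaturation = Step("initial denaturation", 95.0, 4 * 60)
final_extension = Step("final extension", 72.0, 5 * 60)


def hvs_cycle(annealing_temp_c: float) -> List[Step]:
    """One amplification cycle; only the annealing temperature differs between targets."""
    return [
        Step("denaturation", 94.0, 40),
        Step("annealing", annealing_temp_c, 60),
        Step("extension", 72.0, 60),
    ]


hvs1_seconds = program_seconds(initial_denaturation, hvs_cycle(56.0), 35, final_extension)
hvs2_seconds = program_seconds(initial_denaturation, hvs_cycle(55.0), 35, final_extension)

print(hvs1_seconds / 60, hvs2_seconds / 60)
```

On this reading both programs hold for 6,140 seconds, a little over 102 minutes, before ramping time is added.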
2.3.3. Visualization of the PCR Products
The amplified PCR products of the control region were run on 2% agarose gel and
the corresponding bands were detected under UV using a gel documentation
system. The sharp and good quality DNA fragments were selected for further
cleaning.
2.3.4. Elution of PCR Product
The slices of gel containing the desired PCR products were purified according to the
manual provided with the GeneAll Gel Extraction Spin/vacuum (SV) Kit (Cat. No. 102-101).
The kit contains GB solution for melting the gel, wash buffer
and elution buffer. About 500 µl of GB buffer was added to the tube containing the
fine slices of gel with the amplified PCR fragment, and the tube was vortexed and
incubated for 10 minutes at 56°C. When the gel was fully dissolved it was transferred
to an SV column and centrifuged for one minute at 8050 ×g, and the liquid was
discarded from the collection tube. 500 µl of wash buffer (WB) was added to the SV
column and centrifuged for 1.5 minutes at 8050 ×g. The liquid in the collection tube
was discarded and the empty SV column was centrifuged again. The SV column was
transferred to a new 1.5 ml Eppendorf tube; 50 µl of elution buffer was added and
incubated for two minutes at 56°C. After incubation, the tube with the SV column was
centrifuged for 1.5 minutes at 9447 ×g, after which the SV column was discarded. The
purified PCR products were then checked on a 1.5% agarose gel. The PCR products
confirmed by agarose gel electrophoresis were sent to Macrogen, Inc. (Seoul, South
Korea) for sequence analysis. Sequencing was performed using the ABI PRISM® BigDye™
Terminator Cycle Sequencing Kit and sequences were analyzed on a 3730XL Genetic
Analyzer (Applied Biosystems).
3.3. Y-chromosome analysis
The samples from Swat and Dir districts were analyzed for Y-chromosome
characterization to assess the genetic diversity within and among these populations.
3.3.1. Y-STR and Y-SNP datasets
A total of 27 Y-STR loci (DYS19, DYS385, DYS389I, DYS389II, DYS390, DYS391,
DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS449, DYS456, DYS458,
DYS460, DYS481, DYS518, DYS533, DYS570, DYS576, DYS627, DYS635, Y GATA H4,
DYF387S1) were amplified with the Yfiler® Plus PCR Amplification kit
(ThermoFisher Scientific, Cat. No. 4484678). The PCR products were separated and
evaluated according to the manufacturer’s protocols with the modifications of Olofsson
et al. (2015). A portion of each DNA sample was diluted with Tris-EDTA (TE) buffer
so that 1.0 ng of total DNA was contained in a final
volume of 10 µL of TE buffer, which was then added to the multiplex reaction mixture
(Table 4).
Table 4. Components and concentrations of the multiplex PCR reaction.
S.No   Reaction components            Volume per reaction
1      Master mix                     10.0 µL
2      Primer set                     5.0 µL
3      TE buffer (with 1.0 ng DNA)    10 µL
       Total volume                   25 µL
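The dilution step described above (1.0 ng of DNA in 10 µL of TE buffer) comes down to simple arithmetic once the stock concentration is known. The following is a minimal sketch of that calculation; the function name and the stock concentration of 2.0 ng/µL are hypothetical and are not taken from the study.

```python
def dilution_volumes(stock_ng_per_ul, target_ng=1.0, final_ul=10.0):
    """Return (stock DNA volume, TE buffer volume) in µL for one reaction."""
    if stock_ng_per_ul <= 0:
        raise ValueError("stock concentration must be positive")
    dna_ul = target_ng / stock_ng_per_ul      # volume that holds target_ng of DNA
    if dna_ul > final_ul:
        raise ValueError("stock is too dilute for the requested final volume")
    return dna_ul, final_ul - dna_ul          # remainder is made up with TE buffer


# Hypothetical stock quantified at 2.0 ng/µL (illustrative value only)
dna_ul, te_ul = dilution_volumes(stock_ng_per_ul=2.0)
print(f"{dna_ul:.2f} µL DNA + {te_ul:.2f} µL TE buffer")
```

The defaults mirror the 1.0 ng and 10 µL stated in the text; any other stock concentration can be passed in the same way.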
3.3.2. Multiplex PCR profile
The annealing temperature for the multiplex PCR was kept at 61.5°C and the number
of cycles was adjusted to (25-29) to amplify the 27 Y-STR loci (Table 5).
Table 5. Cycling profile for multiplex PCR reaction
Serial number   Operation              Temperature   Time     Cycles
1               Initial denaturation   95°C          1 min
2               Denaturation           94°C          4 s
3               Annealing              61.5°C        1 min    (26-29)
4               Extension              60.0°C        1 min
5               Final extension        60.0°C        22 min
6               Hold                   4°C
Electrophoresis was performed with 1 µL of the amplified products, 0.5 µL of
GeneScan™ 600 LIZ® Size Standard v. 2.0 and 9.5 µL of deionized Hi-Di™
formamide, and the mixture was denatured at 95°C for three minutes. The fragments
were read on an Applied Biosystems® 3500xL Genetic Analyzer (ThermoFisher
Scientific) according to the manufacturer’s recommendations, except that the
injection time was reduced from 24 s to 12 s. The electropherograms were analyzed
using GeneMapper® ID-X v. 1.4 (Thermo Fisher Scientific, Waltham, MA, USA) and the
allelic data were checked manually twice for accuracy.
Initial assignment of Y-chromosome haplogroups was also carried out using
genotypes of Y-chromosome SNPs included on the Infinium® OmniExpressExome-8
v.1.3 BeadChip array; genotyping was performed commercially by AROS Applied
Biotechnology A/S, Denmark. A total of 1,641 Y-SNPs are included on the array, of
which 1,226 passed genotyping filters (call rate ≥ 90%) in the sampled individuals.
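The call-rate filter mentioned above can be illustrated with a short sketch. It assumes that the call rate of a SNP is the proportion of individuals with a non-missing genotype at that SNP; the no-call code "--", the SNP names and the genotypes are hypothetical and do not reproduce the array export format.

```python
MISSING = "--"  # hypothetical no-call code


def call_rate(calls):
    """Proportion of individuals with a called (non-missing) genotype."""
    return sum(c != MISSING for c in calls) / len(calls)


def filter_snps(genotypes, min_call_rate=0.90):
    """Keep SNPs whose call rate is at least min_call_rate."""
    return {snp: calls for snp, calls in genotypes.items()
            if call_rate(calls) >= min_call_rate}


# Toy data: three SNPs typed in ten individuals
genotypes = {
    "snpA": ["A"] * 10,                  # call rate 1.0
    "snpB": ["G"] * 9 + [MISSING],       # call rate 0.9 (passes the threshold)
    "snpC": ["T"] * 8 + [MISSING] * 2,   # call rate 0.8 (removed)
}
kept = filter_snps(genotypes)
print(sorted(kept))
```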
3.4. Statistical Analysis
3.4.1. Dental morphology Analysis
The nonmetric data of the samples from Swat and Dir districts were examined with
neighbor-joining cluster analysis (Saitou and Nei, 1987), multidimensional scaling
(MDS) following Kruskal (1964) and with Guttman’s (1968) coefficient of alienation,
and principal coordinates analysis (PCA) (Gower, 1966). Pairwise distances between
the samples were calculated with C.A.B. Smith’s Mean Measure of Divergence (MMD)
statistic for intergroup comparison. Samples of prehistoric and living individuals
from South Asia, Central Asia and the northern areas of Pakistan were included
alongside the present study populations for comparison.
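As an illustration of the MMD statistic, the sketch below computes it for two samples from per-trait counts. It assumes the Freeman-Tukey angular transformation and the small-sample correction 1/(n1 + 0.5) + 1/(n2 + 0.5); the text does not state which variant of the transformation was used, so this choice and the counts in the example are assumptions made only for illustration.

```python
import math


def freeman_tukey(k, n):
    """Freeman-Tukey angular transformation of k trait-present cases out of n scored."""
    return 0.5 * (math.asin(1 - 2 * k / (n + 1))
                  + math.asin(1 - 2 * (k + 1) / (n + 1)))


def mmd(sample1, sample2):
    """Mean Measure of Divergence between two samples.

    Each sample is a list of (k, n) tuples, one per trait, in the same trait order.
    """
    total = 0.0
    for (k1, n1), (k2, n2) in zip(sample1, sample2):
        diff = freeman_tukey(k1, n1) - freeman_tukey(k2, n2)
        # Squared angular difference minus the sampling-variance correction
        total += diff ** 2 - (1 / (n1 + 0.5) + 1 / (n2 + 0.5))
    return total / len(sample1)


# Hypothetical counts for two traits in two samples
sample_a = [(30, 96), (70, 95)]
sample_b = [(45, 100), (60, 98)]
print(round(mmd(sample_a, sample_b), 4))
```

Because of the correction term, the value for two identical samples is slightly negative rather than zero; such values are usually read as no detectable divergence.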
3.4.2. MtDNA Analysis
All the raw sequences of the mtDNA control region obtained from Macrogen (Seoul,
South Korea) were cleaned using Sequencher® version 5.4.6 (Gene Codes Corporation,
http://www.genecodes.com). The cleaned sequences were aligned and compared
with the rCRS using MAFFT software (version 7) (Katoh and Standley, 2013; Andrews
et al., 1999; Anderson et al., 1981). The aligned sequences were further investigated
for haplotype detection with MitoTool (Fan and Yao, 2011), HaploGrep
(Kloss-Brandstätter et al., 2011) and Mitomaster (Brandon et al., 2009), using
PhyloTree Build 16 (http://www.phylotree.org) as the classification tree, to assess
the quality of the mtDNA data (Van Oven and Kayser, 2009). The haplotypes were
assigned to haplogroups according to PhyloTree (Van Oven, 2015) and published data
(Van Oven et al., 2011; Behar et al., 2008; Metspalu, 2004). The population
statistics, i.e. Genetic Diversity (GD), Power of Discrimination (PD) and Random
Match Probability (RMP), were also calculated (Prieto et al., 2011; Tajima et al.,
1989). Genetic distances between population samples were evaluated as pairwise FST
calculated from haplotype frequencies in Arlequin v. 3.5.1.2 (10,000 permutations;
Excoffier and Lischer, 2010) together with other Pakistani population data (Bhatti
et al., 2016a; Bhatti et al., 2016b; Siddiqi et al., 2015), and visualized through
classical multidimensional scaling (MDS) in the statistical software R (v. 3.2.1).
Median-joining networks of haplotypes were constructed in the program Network
v. 5.0.0.0 (http://www.fluxus-engineering.com).
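The three population statistics named above can be sketched from haplotype counts. The formulas used here are the ones commonly applied in forensic mtDNA studies and are assumed rather than quoted from the cited references: RMP as the sum of squared haplotype frequencies, PD as 1 − RMP, and GD as n/(n − 1) × (1 − RMP). The haplotype labels are hypothetical.

```python
from collections import Counter


def diversity_stats(haplotypes):
    """Return GD, PD and RMP for a list of haplotype labels (one per individual)."""
    n = len(haplotypes)
    freqs = [count / n for count in Counter(haplotypes).values()]
    rmp = sum(p * p for p in freqs)     # random match probability
    pd = 1 - rmp                        # power of discrimination
    gd = n / (n - 1) * (1 - rmp)        # unbiased genetic (haplotype) diversity
    return {"GD": gd, "PD": pd, "RMP": rmp}


# Toy sample of four individuals carrying three distinct haplotypes
stats = diversity_stats(["H1", "H1", "H2", "H3"])
print(stats)
```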
3.4.3. Y-STRs and Y-SNPs analysis
Y-STR data were analyzed by estimating population genetic parameters for the five
samples (Tarklani, Yousafzai, Utmankheil, Gujars and Kohistani) and for the
combined set of all individuals, as previously described (Olofsson et al., 2015).
Genetic distances between samples were evaluated as pairwise FST calculated on the
basis of haplotype frequencies in Arlequin v. 3.5.1.2 with 10,000 permutations
(Excoffier and Lischer, 2010), and visualized through classical
multidimensional scaling (MDS) in R (v. 3.2.1). Median-joining networks of
haplotypes were constructed in the program Network v. 5.0.0.0
(http://www.fluxus-engineering.com), and weights (1-5) were given to the included
loci as previously reported (Olofsson et al., 2015).
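Classical MDS of a pairwise distance matrix, as used above for visualization, can be sketched in plain Python. This is not the R routine used in the study; it is a minimal illustration that double-centres the squared distances and extracts the leading eigenvectors by power iteration, assuming that the leading eigenvalues are positive (that is, the distances are close to Euclidean). The FST values in the example are hypothetical.

```python
import math
import random


def _matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def classical_mds(dist, dims=2, iters=500, seed=0):
    """Classical (Torgerson) MDS of a symmetric distance matrix given as nested lists."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    grand = sum(row) / n
    # Double-centred matrix B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row[i] - row[j] + grand) for j in range(n)]
         for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        v = [rng.random() for _ in range(n)]
        for _ in range(iters):                      # power iteration
            w = _matvec(b, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        lam = sum(vi * wi for vi, wi in zip(v, _matvec(b, v)))  # Rayleigh quotient
        if lam > 1e-12:
            for i in range(n):
                coords[i][d] = v[i] * math.sqrt(lam)
        # Deflate so that the next pass finds the next eigenvector
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords


# Hypothetical pairwise FST values among three populations
fst = [[0.00, 0.02, 0.05],
       [0.02, 0.00, 0.04],
       [0.05, 0.04, 0.00]]
for point in classical_mds(fst):
    print([round(c, 4) for c in point])
```

The orientation and sign of the axes are arbitrary in MDS; only the distances between the plotted points are meaningful.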
We constructed two datasets of previously published Y-STR data to explore the
patrilineal gene pool of Swat and Dir districts in a broader geographic and
ethnographic context. One dataset encompassed 38 population samples specifically
from the Indo-Pakistani subcontinent and Southwest Asia. The other dataset
encompassed 54 worldwide population samples (including the five from the current
study) from the Human Genome Diversity Project (HGDP) panel (Haber et al., 2012;
Perveen et al., 2014; Lee et al., 2014; Roewer et al., 2009; Vermeulen et al., 2009;
Rosenberg, 2006; Qamar et al., 2002; Cann et al., 2002). Details of all comparative
samples included in this study are provided in Table 6. To facilitate the inclusion
of previously published data from a large number of ethnic groups, the dataset was
limited to the 10 Y-STR loci for which all samples had been characterized. The same
package (Arlequin v. 3.5.1.2; Excoffier and Lischer, 2010) was used for the analyses
of molecular variance (AMOVA) between all groups and for further groupings based on
country of origin and ethnicity.
Following standard practice, the multi-copy loci in this kit (DYS385 and DYF387S1)
and haplotypes with duplication events were excluded from the estimation of genetic
distances (FST) and the construction of median-joining networks. Individuals with
haplotypes displaying null or intermediate alleles were also excluded. As is
standard for Y-STR analyses, the alleles of the DYS389II locus were
converted to the DYS389B nomenclature by subtracting the repeat number of
DYS389I from that of DYS389II.
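The exclusion rules and the DYS389B conversion described above can be sketched as follows. The input format (one dictionary per individual with alleles stored as strings, an empty string for a null allele and a comma-separated string for duplicated or multi-copy calls) is a hypothetical convention chosen for the example, not the export format used in the study.

```python
MULTI_COPY = {"DYS385", "DYF387S1"}  # multi-copy loci named in the text


def prepare_haplotype(hap):
    """Return a cleaned haplotype (locus -> int), or None if it must be excluded."""
    out = {}
    for locus, allele in hap.items():
        if locus in MULTI_COPY:
            continue                      # multi-copy loci are dropped
        try:
            value = float(allele)
        except ValueError:
            return None                   # null allele ("") or duplication ("13,14")
        if not value.is_integer():
            return None                   # intermediate allele such as "17.2"
        out[locus] = int(value)
    if "DYS389I" in out and "DYS389II" in out:
        # DYS389B = DYS389II - DYS389I
        out["DYS389B"] = out.pop("DYS389II") - out["DYS389I"]
    return out


example = {"DYS19": "15", "DYS385": "11,14", "DYS389I": "13",
           "DYS389II": "30", "DYS458": "16"}
cleaned = prepare_haplotype(example)
print(cleaned)
```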
All the haplotypes observed in the present study were submitted to the
Y-chromosomal haplotype reference database (YHRD) (Willuweit and Roewer,
2015) under the accession numbers YA004265 to YA004269.
Table 6. Population samples included in the larger comparative analyses. Sample sizes and references to the original
studies are shown.
Population                           No. of individuals   References
Gujar (a)                            20      This study
Kohistani (a)                        20      This study
Tarklani (a)                         20      This study
Utmankheil (a)                       20      This study
Yousafzai (a)                        20      This study
Iran-Ahvaz (b)                       46      Roewer et al., 2009
Iran-Izeh (b)                        50      Roewer et al., 2009
Iran-Rasht (b)                       46      Roewer et al., 2009
Iran-Sari (b)                        46      Roewer et al., 2009
Iran-Masal (b)                       18      Roewer et al., 2009
Azerbaijan-Lenkoran (b)              47      Roewer et al., 2009
Afghanistan-Baluch (b)               13      Haber et al., 2012
Afghanistan-Hazara (b)               60      Haber et al., 2012
Afghanistan-Pashtun (b)              48      Haber et al., 2012
Afghanistan-Tajik (b)                56      Haber et al., 2012
Afghanistan-Uzbek (b)                17      Haber et al., 2012
Pakistan-Punjabi (b)                 300     Perveen et al., 2014
Pakistan-Pathan (b)                  270     Lee et al., 2014
Pakistan-Baluch (BAL) (b)            59      Qamar et al., 2002
Pakistan-Balti (BLT) (b)             13      Qamar et al., 2002
Pakistan-Brahui (BRU) (b)            109     Qamar et al., 2002
Pakistan-Burusho (BSK) (b)           94      Qamar et al., 2002
Pakistan-Hazara (HZR) (b)            23      Qamar et al., 2002
Pakistan-Kalash (KAL) (b)            44      Qamar et al., 2002
Pakistan-Kashmiri (KSR) (b)          12      Qamar et al., 2002
Pakistan-MakraniBaluch (MAKB) (b)    25      Qamar et al., 2002
Pakistan-Negroid Makrani (MAKN) (b)  33      Qamar et al., 2002
Pakistan-Parsi (PRS) (b)             89      Qamar et al., 2002
Pakistan-Pathan (PKH) (b)            94      Qamar et al., 2002
Pakistan-Sindh (SDH) (b)             120     Qamar et al., 2002
Adygei (c)                           7       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Balochi (a)                          25      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Bantu (c)                            19      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Basque (c)                           14      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Bedouin (c)                          26      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
BiakaPygmy (c)                       30      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Brahui (a)                           25      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Burusho (a)                          17      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Cambodian (c)                        6       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Colombian (c)                        5       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Dai (c)                              7       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Daur (c)                             7       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Druze (c)                            13      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
French (c)                           10      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Han (c)                              22      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Hazara (a)                           24      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Italian (c)                          6       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Japanese (c)                         19      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Kalash (a)                           20      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Karitiana (c)                        10      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Lahu (c)                             7       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Makrani (a)                          19      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Mandenka (c)                         15      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
MbutiPygmy (c)                       11      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Melanesian (c)                       6       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Miao (c)                             7       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Mongola (c)                          6       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Mozabite (c)                         20      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Naxi (c)                             8       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Orcadian (c)                         7       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Oroqen (c)                           5       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Palestinian (c)                      16      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Papuan (c)                           10      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Pathan (a)                           17      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Pima (c)                             14      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Russian (c)                          15      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
San (c)                              7       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Sardinian (c)                        15      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
She (c)                              7       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Sindhi (a)                           19      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Surui (c)                            9       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Tu (c)                               7       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Tujia (c)                            9       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Tuscan (c)                           5       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Uygur (c)                            20      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Xibo (c)                             20      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Yakut (c)                            16      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Yi (c)                               7       Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Yoruba (c)                           13      Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Total                                2481
Groups marked with (a) were used both for the regional (MDS, AMOVA) and the worldwide (MDS) analyses. Groups marked with (b)
were used only for the regional analyses (MDS, AMOVA). Groups marked with (c) were used only for the worldwide MDS analysis.
Initial assignment of Y-chromosomal haplogroups was carried out using genotypes
of Y-SNPs included on the Infinium®OmniExpressExome-8 v.1.3 BeadChip array. A
total of 1,641 Y-SNPs are included in the array, of which 1,226 passed genotyping
filters (call rate ≥ 90%) among the individuals included in the study. These were
intersected with the ISOGG Y-DNA SNP index (http://isogg.org/tree/index.html,
version 10.103), resulting in a final set of 331 haplogroup-defining Y-SNPs.
Individual haplogroups were assigned as the most derived haplogroup for which the
individual’s genotype matched the derived allele. Markers in parentheses followed
by an “x” indicate downstream markers for which the samples were typed but were
found to be in an ancestral state.
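The assignment rule described above can be illustrated with a small sketch. The marker names, haplogroup labels and alleles below are placeholders invented for the example; they are not ISOGG markers, and the tree is reduced to a simple parent table.

```python
def depth(hg, parent_of):
    """Number of steps from a haplogroup up to the root of the (toy) tree."""
    steps = 0
    while parent_of[hg] is not None:
        hg = parent_of[hg]
        steps += 1
    return steps


def assign_haplogroup(genotype, markers):
    """Return (most derived haplogroup, downstream markers typed as ancestral)."""
    parent_of = {m["hg"]: m["parent"] for m in markers}
    derived = [m for m in markers if genotype.get(m["snp"]) == m["derived"]]
    if not derived:
        return None, []
    best = max(derived, key=lambda m: depth(m["hg"], parent_of))
    # Immediate downstream markers that were typed but did not show the derived allele
    ancestral = [m["snp"] for m in markers
                 if m["parent"] == best["hg"]
                 and genotype.get(m["snp"]) not in (None, m["derived"])]
    return best["hg"], ancestral


# Placeholder marker table: haplogroup A > A1 > A1a, plus an unrelated haplogroup B
markers = [
    {"snp": "m1", "hg": "A", "parent": None, "derived": "T"},
    {"snp": "m2", "hg": "A1", "parent": "A", "derived": "G"},
    {"snp": "m3", "hg": "A1a", "parent": "A1", "derived": "C"},
    {"snp": "m4", "hg": "B", "parent": None, "derived": "A"},
]
genotype = {"m1": "T", "m2": "G", "m3": "T", "m4": "G"}
result = assign_haplogroup(genotype, markers)
print(result)
```

In this toy case the individual is derived at m1 and m2 and ancestral at the downstream marker m3, so the haplogroup A1 is reported together with m3 as a typed but ancestral marker.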
Chapter 3
RESULTS
The results obtained were analyzed and are arranged in this chapter under three
subheadings: dental morphology, mitochondrial DNA and Y-chromosomal DNA
analyses.
3.1. Dental Morphology
Fourteen maxillary and mandibular tooth-trait combinations were scored
according to ASUDAS. Seven maxillary and seven mandibular traits were selected
for comparative analysis. The maxillary tooth-trait variables include shovelling of
UI1 (SHOVUI1) and UI2 (SHOVUI2), hypocone on UM1 (HYPOUM1) and UM2
(HYPOUM2), median lingual ridge development or tuberculum dentale (MLRUI1), and
presence of metaconule on UM1 (MTCLUM1) and UM2 (MTCLUM2). The
mandibular variables include Y-groove pattern on LM2 (YGRVLM2), major cusp
number on lower molar 1 (CSPNLM1) and lower molar 2 (CSPNLM2), entoconulid
on lower molar 1 (C6LM1) and lower molar 2 (C6LM2), and metaconulid on lower molar
1 (C7LM1) and lower molar 2 (C7LM2).
3.1.1. Dichotomized Individual Trait Frequencies
Dental trait frequencies and the corresponding sample sizes for the five ethnic
groups residing in Swat and Dir districts of Khyber Pakhtunkhwa Province,
Pakistan, and for the additional samples used for comparative analysis are given in
Appendix IV. These comparative samples include both living and prehistoric
individuals (Table 7). The expression of dental traits shows marked variation when
the traits are individually dichotomized into absence and presence only. The dental
trait frequencies obtained from the five population samples of the present study are
described in Table 8. The majority of these traits show moderate frequencies, while a
few traits, i.e. median lingual ridge (MLRUI1), hypocone (HYPOUM1) and major cusp
number (CSPNLM1), were observed at the highest frequencies.
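The dichotomization of graded scores into presence and absence can be sketched as below. The breakpoint of grade 2 and the grades in the example are hypothetical; the breakpoints actually applied to each ASUDAS trait are not restated in this section.

```python
def trait_frequency(grades, breakpoint):
    """Dichotomize graded scores and return (present, scored, percent present).

    grades: list of integer grades, with None for teeth that could not be scored.
    A tooth counts as 'present' when its grade is at or above the breakpoint.
    """
    scored = [g for g in grades if g is not None]
    present = sum(g >= breakpoint for g in scored)
    return present, len(scored), 100.0 * present / len(scored)


# Hypothetical grades for one trait in six individuals (one unobservable)
present, scored, percent = trait_frequency([0, 1, 2, None, 3, 0], breakpoint=2)
print(present, scored, percent)
```

Unscored teeth are left out of the denominator, which is why the sample size can differ from trait to trait within the same population.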
Table 7. Details of the living/modern and prehistoric samples used in this study
for comparative analysis
Sample                                    Abb.     N
Northern Pakistan/Karakoram
  Khows                                   KHO      144
  Madaklasht                              MDK      185
  Wakhis (Gulmit)                         WAKg     162
  Wakhis (Sost)                           WAKs     146
Abbottabad and Mansehra
  Awans                                   AWAm2    93
  Syeds                                   SYD      65
  Gujars                                  GUJ      90
  Tanolis                                 TAN      69
  Karlaars                                KAR      76
  Awans                                   AWA1     167
  Swatis                                  SWT      178
Western Indians (Maharashtra)
  Inamgaon                                INM      41
  Marathas                                MRT      198
  Mahars                                  MHR      195
  Madia Gonds                             MDA      169
Prehistoric Central Asia
  Djarkutan                               DJR      39
  Kuzali                                  KUZ      24
Prehistoric Indus valley
  Neo. Mehrgarh                           NeoMRG   49
  Chl. Mehrgarh                           ChlMRG   25
  Harappa                                 HAR      33
  Timargarha                              TMG      25
  Sarai Khola                             SKH      15
South-Eastern Indians (Andhra Pradesh)
  Pakanati Red.                           PNT      182
  Gompad. Mad.                            GPD      178
  Chenchus                                CHU      194
Swat and Dir (Present study)
  Gujars                                  GUJsw    165
  Kohistan                                KOHsw    174
  Tarklani                                TRKd     150
  Utmankheil                              UTHd     150
  Yousafzai                               YSFsw    150
N = sample size; Abb. = abbreviation
Table 8. Frequencies of dental traits among the five ethnic groups (%).
Traits GUJsw KOHsw TRKd UTHd YSFsw

SHOVUI1 31.25 33.33 44.72 40.88 29.28
SHOVUI2 20.63 17.83 31.58 22.88 15.00
MLRUI1 73.13 62.35 65.84 72.96 76.80
HYPOUM1 96.88 98.76 93.79 94.34 99.45
HYPOUM2 19.50 20.63 13.75 15.09 18.64
MTCLUM1 26.25 25.47 31.68 24.53 16.57
MTCLUM2 10.06 6.96 16.88 13.21 10.17
YGRVLM2 11.88 15.00 12.42 4.49 37.02
CSPNLM1 79.87 85.09 80.75 94.97 86.11
CSPNLM2 12.03 18.75 19.23 21.94 6.08
C6LM1 11.88 6.21 11.18 13.21 5.00
C6LM2 0.00 1.25 2.48 0.65 0.56
C7LM1 14.38 8.70 21.74 12.58 6.11
C7LM2 2.52 1.88 11.80 3.90 0.00
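The ranges quoted in the subsections that follow can be checked directly against Table 8. The sketch below stores the table values and reports, for each trait, the lowest and highest frequency and the groups in which they occur; only the variable and function names are introduced here.

```python
GROUPS = ["GUJsw", "KOHsw", "TRKd", "UTHd", "YSFsw"]

# Frequencies (%) copied from Table 8, in the column order of GROUPS
TABLE8 = {
    "SHOVUI1": [31.25, 33.33, 44.72, 40.88, 29.28],
    "SHOVUI2": [20.63, 17.83, 31.58, 22.88, 15.00],
    "MLRUI1":  [73.13, 62.35, 65.84, 72.96, 76.80],
    "HYPOUM1": [96.88, 98.76, 93.79, 94.34, 99.45],
    "HYPOUM2": [19.50, 20.63, 13.75, 15.09, 18.64],
    "MTCLUM1": [26.25, 25.47, 31.68, 24.53, 16.57],
    "MTCLUM2": [10.06, 6.96, 16.88, 13.21, 10.17],
    "YGRVLM2": [11.88, 15.00, 12.42, 4.49, 37.02],
    "CSPNLM1": [79.87, 85.09, 80.75, 94.97, 86.11],
    "CSPNLM2": [12.03, 18.75, 19.23, 21.94, 6.08],
    "C6LM1":   [11.88, 6.21, 11.18, 13.21, 5.00],
    "C6LM2":   [0.00, 1.25, 2.48, 0.65, 0.56],
    "C7LM1":   [14.38, 8.70, 21.74, 12.58, 6.11],
    "C7LM2":   [2.52, 1.88, 11.80, 3.90, 0.00],
}


def trait_ranges(table):
    """Map each trait to (lowest %, group, highest %, group)."""
    summary = {}
    for trait, values in table.items():
        low, high = min(values), max(values)
        summary[trait] = (low, GROUPS[values.index(low)],
                          high, GROUPS[values.index(high)])
    return summary


ranges = trait_ranges(TABLE8)
for trait, (low, low_grp, high, high_grp) in ranges.items():
    print(f"{trait}: {low:.2f} ({low_grp}) to {high:.2f} ({high_grp})")
```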
3.1.1.1. Shovelling
In all five ethnic groups, shovelling was more frequent in upper incisor one
(SHOVUI1), ranging from 29.28% to 44.72%, than in upper incisor two (SHOVUI2),
ranging from 15.00% to 31.58% (Table 8). The frequency of SHOVUI1 was highest in the
Tarklani (TRKd) sample, accounting for 44.72% of individuals, followed by
Utmankheil (UTHd) 40.88%, Kohistani (KOHsw) 33.33%, Gujar (GUJsw) 31.25% and
Yousafzai (YSFsw) 29.28%. Among all the samples included in this analysis,
shovelling of upper incisor one was most prevalent in the Neolithic inhabitants of
Mehrgarh (NeoMRG), where it was observed in 64.29% of individuals, while no
individual expressed this trait in the Sarai Khola (SKH) sample from the Indus
Valley (Fig. 25A). High frequencies of this trait were also found among the Foothill
samples of Karlaars from Abbottabad (KARa), 57.72%, and Gujars from Mansehra
(GUJm2), 57.42%, while its frequencies in the remaining samples of this aggregate
ranged from 17.73% to 37.76%. The Central Asian samples (DJR, KUZ, MOL, SAP)
collectively show the lowest frequencies of SHOVUI1 compared with the other
samples, ranging from 7.69% to 18.75%. Among the ethnic groups of West-Central
India, the highest prevalence was found in the Madia Gonds (MDA), 49.08%, and the
lowest, 37.50%, in the individuals of Inamgaon (INM). Among the Hindu Kush
highlanders, the Madaklasht (MDK) sample shows the highest frequency, occurring in
40.78% of individuals, while the lowest frequency (19.62%) of this trait was
recorded in the Wakhis (WAKs). The three living Dravidian ethnic groups (PNT, CHU,
GPD) of the southeast Indian samples show SHOVUI1 frequencies of 29.55% to 36%.
YSFsw from Swat District, AWAm1 of the Foothill samples from Mansehra District and
the Pakanati Reddis (PNT) from Andhra Pradesh in southeast India show similar
frequencies of SHOVUI1, at about 29% (Fig. 25A).
The frequency of shovelling in upper incisor two (SHOVUI2) was 31.58% among the
individuals of Tarklani (TRKd), followed by Utmankheil (UTHd) 22.88%, Gujar
(GUJsw) 20.63%, Kohistani (KOHsw) 17.83% and Yousafzai (YSFsw) 15.00% (Table 8).
When the frequency of SHOVUI2 was compared with the other samples included in this
study, the Gujar sample from Swat District (GUJsw) was most similar to the Tanoli
from Mansehra District (TANm2) and the KHO sample of the Hindu Kush highlands, all
within the range of 20% (Fig. 25B). The highest frequency of SHOVUI2 was observed
among the Chalcolithic inhabitants of Mehrgarh (ChlMRG), occurring in 58.33% of
individuals, followed by KARa (57.72%) and GUJm2 (57.42%) of the Foothill samples
from Mansehra District, while this trait was not expressed in the Sarai Khola (SKH)
sample of the Indus Valley, in contrast to the other samples included in this study
(Fig. 25B).
[Figure 25 panels: (A) SHOVUI1, (B) SHOVUI2. Sample groups shown: Present Study (Dir
and Swat), Foothill (Mansehra and Abbottabad), West-Central Indian, Southeast
Indian, Central Asian, Hindu Kush, Indus Valley]
Figure 25. Frequencies of shovelling (SHOVUI) among living Pakistani ethnic
groups, living ethnic groups from peninsular India and samples of
the prehistoric inhabitants of the Indus Valley, South Central Asia
and the major living ethnic groups of Swat and Dir districts (A)
SHOVUI1 (B) SHOVUI2.
3.1.1.2. Median Lingual ridge
Median lingual ridge (MLR) development, also called tuberculum dentale (TD), is a
maxillary dental trait expressed lingually on the incisors (MLRU) and canine
(MLRC). MLR on upper incisor one (MLRUI1) was highly expressed among the five
population samples from Swat and Dir districts, ranging from 62.35% to 76.80%
(Table 8). The highest frequency of MLRUI1/TDUI1 was observed among the individuals
of the Yousafzai (YSFsw) sample, accounting for 76.80%, followed by Gujar (GUJsw)
73.13%, Utmankheil (UTHd) 72.96%, Tarklani (TRKd) 65.84% and Kohistani
(KOHsw) 62.35% (Fig. 26).
[Figure 26 panel: MLRUI1. Sample groups shown: Present Study (Dir and Swat), Foothill
(Mansehra and Abbottabad), West Central Indian, Southeast Indian, Central Asian,
Hindu Kush, Indus Valley]
Figure 26. Frequencies of Median Lingual ridge (MLR) among living Pakistani
ethnic groups, living ethnic groups from peninsular India and
samples of the prehistoric inhabitants of the Indus Valley, South
Central Asia and the major living ethnic groups of Swat and Dir
districts.
The Yousafzai of Swat District (YSFsw) also have the highest frequency of MLRUI1
among all the samples, including the historic, prehistoric and living samples from
Pakistan and peninsular India, as shown in Figure 26. The lowest frequencies of this
trait occurred in Kuzali (KUZ) (15.38%) and Djarkutan (DJR) (17.65%) of Central
Asia, followed by Awan (AWAm2) (17.73%) of the Foothill samples from Mansehra
District.
3.1.1.3. Y-Groove Pattern
Among the five ethnic groups of Swat and Dir districts, the Y-groove pattern, or Y
occlusal groove pattern, in lower molar 2 (YGRVLM2) ranged from a low frequency of
4.49% in Utmankheil (UTHd) to a high frequency of 37.02% in Yousafzai (YSFsw),
while its prevalence was 15.00% in Kohistani (KOHsw), 12.42% in Tarklani (TRKd)
and 11.88% in Gujar (GUJsw) (Fig. 27).
[Figure 27 panel: YGRVLM2. Sample groups shown: Present Study (Dir and Swat), Foothill
(Mansehra and Abbottabad), West Central Indian, Southeast Indian, Central Asian,
Hindu Kush, Indus Valley]
Figure 27. Frequencies of Y-Groove Pattern (YGRVLM2) among living
Pakistani ethnic groups, living ethnic groups from peninsular India
and samples of the prehistoric inhabitants of the Indus Valley,
South Central Asia and the major living ethnic groups of Swat and
Dir districts.
The Pakanati Reddis (PNT) and Gompadhompti Madigas (GPD) of southeast India, the
SKH sample of the Indus Valley and the Yousafzai of Swat District (YSFsw) show the
highest YGRVLM2 frequencies, ranging from 35.73% to 40.61%, whereas the lowest
expression among all the samples was found in the Utmankheil sample from Dir
District (UTHd), at 4.49% (Fig. 27). The Kohistani sample of Swat District (KOHsw),
the Gujar (GUJm2) and Tanoli (TANm2) Foothill samples from Mansehra District and
the Molali (MOL) sample of Central Asia were similar, with about 15% of individuals
expressing the YGRVLM2 trait. The Gujar sample from Swat District (GUJsw), the Awan
sample from Mansehra District (AWAm2) and the Wakhis from Gulmit (WAKg) show nearly
the same frequency, ranging from 11.88% to 12.00% (Fig. 27).
3.1.1.4. Hypocone
Among the five ethnic groups of the present study, the hypocone (cusp 4) was highly
frequent on upper molar 1 (HYPOUM1), ranging from 93.79% to 99.45%, in comparison
with upper molar 2 (HYPOUM2), ranging from 13.75% to 20.63% (Table 8). The highest
frequency of HYPOUM1 was found in YSFsw, accounting for 99.45%, followed by KOHsw
98.76%, GUJsw 96.88%, UTHd 94.34% and TRKd 93.79% (Fig. 28A). HYPOUM1 was highly
expressed in all populations, i.e. the Indus Valley, Hindu Kush, Central Asian and
peninsular Indian (Southeast Indian and West Central Indian) samples and the samples
of the present study (Swat and Dir districts), ranging from 65.85% to 100%, except
the Foothill samples, where its frequency ranged from 17.73% to 57.72% (Fig. 28A).
[Figure 28 panels: (A) HYPOUM1, (B) HYPOUM2. Sample groups shown: Present Study (Swat
& Dir), Foothill (Mansehra & Abbottabad), West Central Indian, Southeast Indian,
Central Asian, Hindu Kush, Indus Valley]
Figure 28. Frequency distribution of hypocone (A) Frequency distribution of
hypocone at upper molar 1 (HYPOUM1) (B) Frequency
distribution of hypocone at upper molar 2 (HYPOUM2).
Among the samples of Swat and Dir districts, the frequency of the hypocone at upper
molar 2 (HYPOUM2) was highest in Kohistani (KOHsw), occurring in 20.63% of
individuals, followed by Gujar (GUJsw) 19.50%, Yousafzai (YSFsw) 18.64%, Utmankheil
(UTHd) 15.09% and Tarklani (TRKd) 13.75% (Table 8). In comparison, all the Central
Asian samples show the highest expression of HYPOUM2 among the samples included in
this study, ranging from 50% to 71.88%, whereas only one of the Indus Valley samples
(ChlMRG) and one of the southeast Indian samples (CHU) show a high prevalence, with
frequencies of 55.56% and 42.78% respectively. No expression of HYPOUM2 was recorded
in the Indus Valley sample from Timargarha (TMG) or in the Inamgaon (INM) sample
from West Central India. The overall results show that HYPOUM1 was more prevalent
than HYPOUM2 among the individuals of all samples included in this study (Fig. 28B).
3.1.1.5. Metaconule
Metaconule (MTCL) (cusp 5) was scored on both upper molar 1 (MTCLUM1) and
upper molar 2 (MTCLUM2). The results show that MTCLUM1 was more frequent, ranging
from 16.57% to 31.68%, than MTCLUM2, ranging from 6.96% to 16.88% (Table 8). Among
the samples from Swat and Dir districts, the frequency of MTCLUM1 was highest in
Tarklani (TRKd), accounting for 31.68%, followed by Gujar (GUJsw) 26.25%, Kohistani
(KOHsw) 25.47%, Utmankheil (UTHd) 24.53% and Yousafzai (YSFsw) 16.57% (Fig. 29A).
The highest expression was observed in the Indus Valley and peninsular Indian
samples and the samples of the present study (Swat and Dir districts), ranging from
14.63% to 46.15%, whereas one sample (SKH) among the Hindu Kush samples also showed
a high frequency, accounting for 33.3% (Fig. 29A). Similar expression of MTCLUM1 was
found in three ethnic groups (GUJsw, UTHd and KOHsw) from Swat and Dir districts,
the Chenchus (CHU) sample from southeast India and two samples (NeoMRG and ChlMRG)
from the Indus Valley, with frequencies ranging from 24% to 26%. The rest of the
samples included in this analysis, i.e. the Foothill, Central Asian and Hindu Kush
(except SKH) samples, show less than 10% MTCLUM1 expression, as shown in Figure 29A.
[Figure 29 panels: (A) MTCLUM1, (B) MTCLUM2. Sample groups shown: Present Study (Swat
& Dir), Foothill (Mansehra & Abbottabad), West Central Indian, Southeast Indian,
Central Asian, Hindu Kush, Indus Valley]
Figure 29. Frequencies of metaconule at upper molars (A) Frequency
distribution of metaconule at upper molar 1 (MTCLUM1) (B)
Frequency distribution of metaconule at upper molar 2
(MTCLUM2).
The frequency of the metaconule in upper molar 2 (MTCLUM2) was 16.88% among the
individuals of Tarklani (TRKd), followed by Utmankheil (UTHd) 13.21%, Yousafzai
(YSFsw) 10.17%, Gujar (GUJsw) 10.06% and Kohistani (KOHsw) 6.96% (Table 8). The
highest expression of this trait was found in NeoMRG and ChlMRG of the Indus
Valley, with frequencies of 40% and 33.33% respectively (Fig. 29B). The lowest
frequency was found among the Kuzali (KUZ) occupants of Central Asia, where this
trait was expressed in just 4.1% of individuals, and it was completely absent in the
prehistoric Indus Valley sample from Timargarha (TMG) and in the Central Asian
sample from Djarkutan (DJR) (Fig. 29B).
3.1.1.6. Major Cusp number
In all five populations sampled from Swat and Dir districts, the major cusp number
trait (CSPN) was more frequent in lower molar 1 (CSPNLM1), ranging from 79.87% to
94.97%, than in lower molar 2 (CSPNLM2), ranging from 6.08% to 21.94% (Table 8). The
frequency of CSPNLM1 was 94.97% in Utmankheil (UTHd), 86.11% in Yousafzai (YSFsw),
85.09% in Kohistani (KOHsw), 80.75% in Tarklani (TRKd) and 79.87% in Gujar (GUJsw)
(Fig. 30A). In comparison, no clear differences were found in the frequency of this
trait among the samples used in this analysis (Fig. 30A).
CSPNLM2 was most frequently observed in the Utmankheil sample (UTHd), where it
occurred in 21.94% of individuals, followed by Tarklani (TRKd) 19.23%, Kohistani
(KOHsw) 18.75%, Gujar (GUJsw) 12.03% and Yousafzai (YSFsw) 6.08% (Fig. 30B).
This trait was markedly expressed in all three ethnic groups from southeast India,
occurring in 37.21% of the Gompadhompti Madigas (GPD), 23.89% of the Pakanati
Reddis (PNT) and 27.75% of the Chenchus (CHU), while it was completely absent in the
prehistoric sample from Harappa (HAR). The most similar frequencies were found in
the Gujar sample from Swat District (GUJsw), the Awan (AWAm1), Karlaars (KARa),
Syed (SYDm2) and Tanoli (TANm2) samples from Mansehra District and the Khowar
sample of the Hindu Kush highlands from Chitral District (KHO), ranging from 10.29%
to 13.77% (Fig. 30B). The Yousafzai sample of Swat District (YSFsw) was most similar
in CSPNLM2 expression to the Neolithic occupants of Mehrgarh (NeoMRG) and the Sarai
Khola (SKH) sample of the Indus Valley, recorded in 6.08% to 6.67% of individuals
(Fig. 30B).
[Figure 30 panels: (A) CSPNLM1, (B) CSPNLM2. Sample groups shown: Present Study (Dir
and Swat), Foothill (Mansehra and Abbottabad), West Central Indian, Southeast
Indian, Central Asian, Hindu Kush, Indus Valley]
Figure 30. Frequency Distribution of major cusps numbers at lower molars
(CSPNLM) among all samples (A) Frequency of major cusps
numbers at lower molar 1 (CSPNLM1) (B) Frequency of major cusps
numbers at lower molar 2 (CSPNLM2)
3.1.1.7. Entoconulid
The entoconulid (cusp 6) was more prevalent on lower molar 1 (C6LM1) than on
lower molar 2 (C6LM2) among the individuals of the five ethnic groups of Swat and
Dir districts (Table 8). The frequency of the entoconulid at lower molar 1 (C6LM1)
was 13.21% among the individuals of Utmankheil (UTHd), followed by Gujar
(GUJsw) 11.88%, Tarklani (TRKd) 11.18%, Kohistani (KOHsw) 6.21% and Yousafzai
(YSFsw) 5.00% (Fig. 31A). The Chalcolithic sample from Mehrgarh (ChlMRG)
shows the highest frequency (21.74%) of all the samples included in this analysis.
The lowest frequency was observed among the Gujar from Mansehra District (GUJm2),
where this trait was recorded in only 1.8% of individuals, while it was completely
absent in the prehistoric Indus Valley sample from Timargarha (TMG) and the Kuzali
(KUZ) sample from Central Asia (Fig. 31A).
The entoconulid at lower molar 2 (C6LM2) was completely absent in Gujar (GUJsw),
and its frequency in the Utmankheil (UTHd), Tarklani (TRKd), Kohistani (KOHsw) and
Yousafzai (YSFsw) samples from Swat and Dir districts ranged from 0.56% to 2.48%,
which was considered the lowest among the trait frequencies observed in this
analysis (Table 8). Half of the samples included in this analysis express this trait
at a very low frequency, ranging from 0.54% to 11.11%, while it was completely absent
in the remaining half, i.e. the Gujar sample from Swat District (GUJsw), three
samples (AWAm1, AWAm2, SWT) from Mansehra District, the prehistoric sample from
Maharashtra (INM), the prehistoric south Central Asian samples (DJR, KUZ, MOL, SAP),
two samples (KHO, WAKs) from the Hindu Kush and three samples (SKH, HAR, NeoMRG)
from the Indus Valley (Fig. 31B).
[Figure 31 panels: (A) C6LM1, (B) C6LM2. Sample groups shown: Present Study (Dir and
Swat), Foothill (Mansehra and Abbottabad), West Central Indian, Southeast Indian,
Central Asian, Hindu Kush, Indus Valley]
Figure 31. Frequency distributions of entoconulid at lower molars (C6LM)
among all samples included in this study (A) Frequency of
entoconulid at lower molar 1 (C6LM1). (B) Frequency of entoconulid
at lower molar 2 (C6LM2)
3.1.1.8. Metaconulid
Among the samples collected from Swat and Dir districts, the frequency of the
metaconulid at lower molar 1 (C7LM1) was 21.74% in Tarklani (TRKd), followed by
Gujar (GUJsw) 14.38%, Utmankheil (UTHd) 12.58%, Kohistani (KOHsw) 8.70% and
Yousafzai (YSFsw) 6.11% (Table 8). When C7LM1 was compared across all the samples
included in this analysis, the highest frequency (24.62%) was observed among the
Chenchus (CHU) of the southeast Indian samples, while Tarklani from Dir District
(TRKd) had the second highest frequency, 21.74% (Fig. 32A). The lowest frequency was
observed among the inhabitants of Sapalli Tepe (SAP), where this trait was recorded
in only 2.63% of individuals, and it was completely absent in the Kuzali (KUZ)
sample of Central Asia (Fig. 32A).
Figure 32. Frequency distributions of the metaconulid at lower molars (C7LM)
among all samples included in this study, grouped as Present Study (Swat and
Dir), Foothill (Mansehra and Abbottabad), West Central Indian, Southeast
Indian, Central Asian, Hindu Kush and Indus Valley. (A) Frequency of the
metaconulid at lower molar 1 (C7LM1). (B) Frequency of the
metaconulid at lower molar 2 (C7LM2).
The metaconulid at lower molar 2 (C7LM2) was observed in 11.80% of Tarklanis
(TRKd), 3.90% of Utmankheils (UTHd), 2.52% of Gujars (GUJsw) and 1.80% of
Kohistanis (KOHsw), while it was completely absent in the Yousafzai (YSFsw)
sample from Swat and Dir districts (Table 8). The Tarklani sample from District Dir
had the highest frequency among all of the samples included in this analysis
(11.80%), followed by the Gompadhompti Madigas (GPD) from southeast India
(10.98%), the prehistoric sample from Timergara (TMG; 10%), and the southeast
Indian CHU (9.28%) and PNT (6.4%) samples, while in the remaining samples the
trait was observed in less than 6% of individuals (Fig. 32B).
The lowest non-zero frequency was observed among the Marathas (MRT) of
west-central India, where the trait was expressed in only 0.51% of individuals,
and it was completely absent in the Yousafzai sample from District Swat (YSFsw),
AWAm2 and TANm2 from District Mansehra, KUZ and SAP from Central Asia, WAKs
from the Hindu Kush, and the SKH, ChlMRG, HAR and NeoMRG samples of the Indus
Valley (Fig. 32B).
Furthermore, the dental morphology data obtained from the five population samples
of Swat and Dir districts were compared with those of other Pakistani, Central Asian
and Indian (living and prehistoric) samples (Table 7), and the results were
used for further analysis.
3.1.2. Mean Measure of Divergence
A mean measure of divergence (MMD) analysis was carried out to determine the
patterns of affinity among the five population samples from Swat and Dir districts,
prehistoric inhabitants of the Indus Valley and South-Central Asia, living
peninsular Indian ethnic groups and other ethnic groups of Pakistan
(Table 8). The distance matrix values for each pairwise group comparison
are given in Table 9, and these values were used for further analysis. High
MMD values represent phenetic divergence between the paired samples, while
low MMD values indicate phenetic similarity between the paired samples.
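The calculation behind a single pairwise MMD value can be sketched as follows. This is a minimal illustration and not the procedure used in this study: it assumes the Freeman-Tukey angular transformation and the usual sample-size correction of Smith's MMD, neither of which is specified in the text above, and the trait counts and function names are hypothetical. Because the correction term is subtracted, the MMD between two very similar samples can be slightly negative, which is why negative entries can appear in an MMD matrix.

```python
import math


def freeman_tukey_angle(k, n):
    """Angular transformation of k trait presences among n scored individuals
    (Freeman-Tukey variant; an assumption, not stated in the thesis text)."""
    return 0.5 * (math.asin(1 - 2 * k / (n + 1))
                  + math.asin(1 - 2 * (k + 1) / (n + 1)))


def mmd(sample_a, sample_b):
    """Return (MMD, standard deviation) for two samples.

    Each sample is a list of (k, n) pairs, one per trait, in the same order.
    """
    r = len(sample_a)
    total = 0.0
    squared_corrections = 0.0
    for (k1, n1), (k2, n2) in zip(sample_a, sample_b):
        correction = 1 / (n1 + 0.5) + 1 / (n2 + 0.5)
        diff = freeman_tukey_angle(k1, n1) - freeman_tukey_angle(k2, n2)
        total += diff ** 2 - correction
        squared_corrections += correction ** 2
    return total / r, math.sqrt(2 * squared_corrections) / r


# Hypothetical counts for two traits in two samples (not data from this study).
group_1 = [(35, 160), (4, 155)]
group_2 = [(10, 120), (15, 118)]
value, sd = mmd(group_1, group_2)
print(f"MMD = {value:.3f}, SD = {sd:.3f}")
```

Comparing a sample with itself makes every squared difference zero, so the result is the negative of the mean correction term.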
3.1.3. Living Northern Pakistanis Only
Inter-sample affinities based upon pairwise MMD values were examined with
neighbor-joining cluster analysis (NJ), multidimensional scaling (MDS), and
principal coordinate analysis (PCA).
100
Table 9: Mean measure of divergence (MMD) distance matrix obtained from the pairwise group comparisons of the five
populations and the other populations used in this study.
AWAm1 AWAm2 ChlMRG CHU DJR GPD GUJm2 GUJsw HAR INM KARa KHO KOHsw KUZ MDA MDK MHR MOL MRT NeoMRG PNT SAP SKH SWT SYDm2 TANm2 TMG TRKd UTHd WAKg WAKs YSFsw
AWAm1 --- 0.005 0.016 0.004 0.014 0.004 0.005 0.004 0.019 0.014 0.005 0.006 0.004 0.021 0.004 0.004 0.004 0.012 0.004 0.011 0.004 0.013 0.027 0.004 0.005 0.005 0.026 0.004 0.004 0.005 0.005 0.004
AWAm2 0.008 --- 0.017 0.004 0.015 0.004 0.005 0.004 0.019 0.014 0.005 0.006 0.004 0.021 0.004 0.004 0.004 0.012 0.004 0.011 0.004 0.014 0.027 0.004 0.005 0.005 0.026 0.044 0.004 0.005 0.005 0.004
ChlMRG 0.087 0.148 --- 0.016 0.026 0.016 0.017 0.016 0.031 0.026 0.017 0.017 0.016 0.033 0.016 0.016 0.016 0.023 0.016 0.023 0.016 0.025 0.039 0.016 0.017 0.017 0.037 0.016 0.016 0.017 0.016 0.016
CHU 0.051 0.069 0.049 --- 0.014 0.003 0.004 0.004 0.018 0.014 0.005 0.005 0.004 0.020 0.004 0.004 0.003 0.011 0.003 0.010 0.003 0.013 0.027 0.004 0.004 0.004 0.025 0.004 0.004 0.004 0.004 0.003
DJR 0.080 0.084 0.058 0.051 --- 0.014 0.015 0.014 0.029 0.024 0.015 0.015 0.014 0.031 0.014 0.014 0.014 0.022 0.014 0.021 0.014 0.023 0.037 0.014 0.015 0.015 0.036 0.014 0.014 0.015 0.014 0.014
GPD 0.052 0.081 0.067 0.004 0.087 --- 0.005 0.004 0.019 0.014 0.005 0.005 0.004 0.021 0.004 0.004 0.004 0.011 0.003 0.011 0.004 0.013 0.027 0.004 0.005 0.005 0.025 0.004 0.004 0.004 0.004 0.004
GUJm2 0.030 0.048 0.083 0.086 0.092 0.097 --- 0.005 0.019 0.015 0.006 0.006 0.005 0.021 0.005 0.005 0.004 0.012 0.004 0.011 0.004 0.014 0.027 0.005 0.005 0.005 0.026 0.005 0.005 0.005 0.005 0.004
GUJsw 0.019 0.032 0.069 0.051 0.127 0.061 0.066 --- 0.019 0.014 0.005 0.005 0.004 0.021 0.004 0.004 0.004 0.011 0.004 0.011 0.004 0.013 0.027 0.004 0.005 0.005 0.026 0.004 0.004 0.004 0.004 0.004
HAR 0.019 0.027 0.054 0.049 0.122 0.059 0.074 -0.016 --- 0.028 0.020 0.020 0.019 0.035 0.019 0.019 0.019 0.026 0.018 0.026 0.019 0.028 0.042 0.019 0.019 0.019 0.040 0.019 0.019 0.019 0.019 0.019
INM -0.004 0.031 0.088 0.065 0.129 0.049 0.028 0.022 0.016 --- 0.015 0.015 0.014 0.030 0.014 0.014 0.014 0.021 0.014 0.021 0.014 0.023 0.036 0.014 0.015 0.015 0.035 0.014 0.014 0.014 0.014 0.014
KARa 0.035 0.083 0.053 0.092 0.103 0.092 0.007 0.063 0.070 0.018 --- 0.006 0.005 0.022 0.005 0.005 0.005 0.012 0.005 0.012 0.005 0.014 0.028 0.005 0.006 0.006 0.026 0.005 0.005 0.005 0.005 0.005
KHO -0.007 0.001 0.072 0.033 0.058 0.046 0.037 0.007 0.011 0.018 0.049 --- 0.005 0.022 0.005 0.005 0.005 0.013 0.005 0.012 0.005 0.014 0.028 0.005 0.006 0.006 0.027 0.005 0.005 0.006 0.006 0.005
KOHsw 0.010 0.020 0.072 0.031 0.095 0.037 0.047 -0.003 -0.007 0.017 0.050 -0.001 --- 0.021 0.004 0.004 0.004 0.011 0.004 0.011 0.004 0.013 0.027 0.004 0.005 0.005 0.026 0.004 0.004 0.004 0.004 0.004
KUZ 0.052 0.044 0.058 0.050 -0.049 0.077 0.065 0.090 0.067 0.080 0.066 0.035 0.064 --- 0.021 0.021 0.021 0.028 0.020 0.027 0.021 0.030 0.043 0.021 0.021 0.021 0.042 0.021 0.021 0.021 0.021 0.021
MDA 0.031 0.049 0.087 0.039 0.127 0.032 0.027 0.047 0.034 0.004 0.046 0.039 0.029 0.098 --- 0.004 0.004 0.011 0.004 0.011 0.004 0.013 0.027 0.004 0.005 0.005 0.026 0.004 0.004 0.005 0.004 0.004
MDK 0.000 0.031 0.095 0.064 0.127 0.054 0.041 0.024 0.043 0.009 0.040 0.005 0.015 0.104 0.037 --- 0.004 0.011 0.004 0.011 0.004 0.013 0.027 0.004 0.005 0.005 0.026 0.004 0.004 0.004 0.004 0.004
MHR 0.016 0.035 0.097 0.053 0.152 0.047 0.045 0.017 0.006 -0.014 0.055 0.021 0.012 0.116 0.006 0.019 --- 0.011 0.003 0.011 0.004 0.013 0.027 0.004 0.004 0.005 0.025 0.004 0.004 0.004 0.004 0.004
MOL 0.055 0.059 0.021 0.040 -0.035 0.079 0.081 0.068 0.058 0.101 0.090 0.026 0.055 -0.035 0.112 0.092 0.113 --- 0.011 0.018 0.011 0.021 0.034 0.011 0.012 0.012 0.033 0.011 0.011 0.012 0.012 0.011
MRT 0.022 0.040 0.103 0.060 0.143 0.046 0.042 0.032 0.009 -0.014 0.045 0.034 0.020 0.096 0.003 0.032 -0.001 0.120 --- 0.010 0.003 0.013 0.027 0.004 0.004 0.004 0.025 0.004 0.004 0.004 0.004 0.003
NeoMRG 0.054 0.123 0.026 0.095 0.175 0.085 0.039 0.067 0.039 0.007 0.018 0.076 0.068 0.148 0.026 0.053 0.030 0.139 0.030 --- 0.011 0.020 0.034 0.011 0.012 0.012 0.032 0.011 0.011 0.011 0.011 0.011
PNT 0.031 0.064 0.063 0.015 0.108 0.005 0.095 0.030 0.015 0.027 0.080 0.029 0.020 0.086 0.031 0.037 0.026 0.084 0.027 0.061 --- 0.013 0.027 0.004 0.004 0.005 0.025 0.004 0.004 0.004 0.004 0.004
SAP 0.096 0.080 0.061 0.064 -0.041 0.106 0.118 0.112 0.101 0.142 0.135 0.061 0.092 -0.051 0.147 0.145 0.157 -0.039 0.154 0.204 0.121 --- 0.036 0.013 0.014 0.014 0.035 0.013 0.013 0.014 0.013 0.013
SKH 0.040 0.018 0.130 0.055 0.067 0.055 0.086 0.052 -0.001 -0.001 0.090 0.041 0.039 -0.026 0.039 0.094 0.037 0.064 0.009 0.113 0.038 0.046 --- 0.027 0.028 0.028 0.0481 0.027 0.027 0.027 0.027 0.027
SWT 0.000 0.029 0.070 0.038 0.098 0.038 0.057 0.014 0.024 0.019 0.056 -0.004 0.008 0.084 0.041 -0.002 0.020 0.062 0.037 0.059 0.018 0.111 0.079 --- 0.005 0.005 0.026 0.004 0.004 0.005 0.004 0.004
SYDm2 0.014 0.019 0.083 0.055 0.056 0.070 -0.003 0.045 0.042 0.023 0.020 0.014 0.025 0.028 0.024 0.033 0.037 0.044 0.035 0.062 0.068 0.075 0.047 0.040 --- 0.005 0.026 0.005 0.005 0.005 0.005 0.004
TANm2 0.013 -0.004 0.126 0.049 0.043 0.065 0.033 0.051 0.047 0.032 0.070 0.008 0.029 0.015 0.037 0.042 0.043 0.038 0.042 0.115 0.063 0.052 0.011 0.039 0.004 --- 0.026 0.005 0.005 0.005 0.005 0.005
TMG -0.014 -0.012 0.086 0.037 0.062 0.041 0.003 0.003 -0.022 -0.051 0.003 -0.007 -0.009 -0.005 -0.001 0.016 -0.009 0.040 -0.016 0.041 0.024 0.069 -0.057 0.021 -0.024 -0.018 --- 0.026 0.026 0.026 0.026 0.025
TRKd 0.037 0.066 0.049 0.048 0.150 0.055 0.051 0.007 0.001 0.016 0.042 0.032 0.012 0.113 0.029 0.036 0.016 0.093 0.029 0.030 0.034 0.148 0.073 0.032 0.042 0.073 -0.001 --- 0.004 0.004 0.004 0.004
UTHd 0.029 0.050 0.075 0.052 0.152 0.056 0.063 0.004 0.005 0.024 0.068 0.020 0.005 0.138 0.037 0.021 0.013 0.092 0.035 0.056 0.036 0.151 0.103 0.017 0.046 0.063 0.018 0.006 --- 0.004 0.004 0.004
WAKg 0.000 0.007 0.109 0.062 0.120 0.066 0.051 0.004 0.005 0.005 0.062 -0.005 0.002 0.084 0.042 0.004 0.012 0.071 0.029 0.077 0.041 0.116 0.052 0.003 0.028 0.024 -0.013 0.028 0.010 --- 0.005 0.004
WAKs 0.010 0.004 0.158 0.089 0.154 0.095 0.068 0.015 0.016 0.025 0.086 0.006 0.015 0.104 0.060 0.016 0.026 0.102 0.043 0.111 0.061 0.145 0.057 0.018 0.041 0.030 -0.003 0.045 0.028 -0.006 --- 0.004
YSFsw 0.008 0.036 0.092 0.063 0.118 0.058 0.081 0.015 0.012 0.025 0.067 0.008 0.015 0.088 0.060 0.014 0.030 0.083 0.036 0.072 0.023 0.125 0.057 0.003 0.063 0.057 0.028 0.047 0.038 0.015 0.024 ---
MMD = below diagonal
MMDsd = above diagonal
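Because Table 9 stores two quantities in one square layout, it can help to separate them before any further analysis. The sketch below is illustrative only: the function name and the dictionary-based data structure are assumptions, and only the first three samples of Table 9 are entered.

```python
def split_combined(labels, rows):
    """Split a combined square table into MMD and SD lookups.

    rows[i][j] holds the MMD when i > j (below the diagonal) and the
    standard deviation when i < j (above the diagonal). Diagonal cells
    are ignored. Keys are unordered pairs of sample codes.
    """
    mmd_values, sd_values = {}, {}
    for i, a in enumerate(labels):
        for j, b in enumerate(labels):
            if i > j:
                mmd_values[frozenset((a, b))] = rows[i][j]
            elif i < j:
                sd_values[frozenset((a, b))] = rows[i][j]
    return mmd_values, sd_values


# First three rows and columns of Table 9 (None marks the diagonal).
labels = ["AWAm1", "AWAm2", "ChlMRG"]
rows = [
    [None, 0.005, 0.016],
    [0.008, None, 0.017],
    [0.087, 0.148, None],
]
mmd_values, sd_values = split_combined(labels, rows)
pair = frozenset(("AWAm1", "ChlMRG"))
print(mmd_values[pair], sd_values[pair])
```

Keying on unordered pairs means a lookup does not depend on which sample is named first.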
3.1.3.1. Neighbor-joining Cluster Analysis
Neighbor-joining cluster analysis revealed that the six samples located at the lower
right are ethnic groups from peninsular India that show close affinities to one
another (Fig. 33).
Figure 33. Neighbor-joining cluster analysis comparing modern populations of
northern Pakistan and peninsular India with the major ethnic groups of
Swat and Dir districts, Pakistan.
As expected, the three Dravidian-speaking samples from Andhra Pradesh (CHU,
GPD, PNT) exhibit closest affinities to one another, as do the three
Indo-Aryan-speaking ethnic group samples from Maharashtra (MDA, MHR, MRT). The
remaining samples fall into four aggregates. The first aggregate includes four of the
five samples (GUJsw, KOHsw, TRKd, UTHd) collected from Swat and Dir districts,
all of which show very close affinities to one another, while the sample of
Yousafzais from Swat (YSFsw) falls into the second aggregate, in which all of the
remaining members except one (SWT: Swatis from Mansehra District) are highland
samples from either Chitral District (KHO, MDK) or Gilgit-Baltistan (WAKg, WAKs).
Apart from YSFsw, the sample with closest affinities to these highland samples is the
sample of Kohistanis from Swat (KOHsw). Reassuringly, the two samples of Wakhis
(WAKg, WAKs) exhibit closest affinities to one another. The third aggregate has only
two “core” members and one peripheral member. The “core” members are the
Awans (AWAm2) and Tanolis (TANm2) from Mansehra District, while the
peripheral member is the sample of Awans (AWAm1) also collected from Mansehra
District. The fourth aggregate includes three members: Gujars (GUJm2) and Syeds
(SYDm2) from Mansehra District, as well as the sample of Karlaars (KARa) collected
from Abbottabad.
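The pairing criterion that drives neighbor-joining can be sketched as follows. This is a simplified illustration of the standard algorithm (branch lengths are omitted and joining stops once three nodes remain). It is not the software used in this study, and the function name is an assumption. The six MMD values in the example are taken from Table 9.

```python
def neighbor_joining(labels, dist):
    """Return the joins made by neighbor-joining as (node_a, node_b, new_node).

    labels: list of sample codes.
    dist: dict mapping frozenset({a, b}) to the pairwise distance.
    """
    nodes = list(labels)
    d = dict(dist)
    joins = []
    while len(nodes) > 3:
        n = len(nodes)
        # Net divergence: sum of distances from each node to all others.
        r = {a: sum(d[frozenset((a, b))] for b in nodes if b != a) for a in nodes}
        # Choose the pair that minimizes the Q criterion.
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * d[frozenset((a, b))] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        new = f"({a},{b})"
        # Distance from the new internal node to every remaining node.
        for c in nodes:
            if c not in (a, b):
                d[frozenset((new, c))] = 0.5 * (
                    d[frozenset((a, c))] + d[frozenset((b, c))] - d[frozenset((a, b))]
                )
        nodes = [c for c in nodes if c not in (a, b)] + [new]
        joins.append((a, b, new))
    return joins


# MMD values from Table 9 for two Dravidian-speaking samples (CHU, GPD)
# and two Indo-Aryan-speaking samples (MDA, MRT).
labels = ["CHU", "GPD", "MDA", "MRT"]
dist = {
    frozenset(("CHU", "GPD")): 0.004, frozenset(("CHU", "MDA")): 0.039,
    frozenset(("CHU", "MRT")): 0.060, frozenset(("GPD", "MDA")): 0.032,
    frozenset(("GPD", "MRT")): 0.046, frozenset(("MDA", "MRT")): 0.003,
}
print(neighbor_joining(labels, dist))
```

With only four samples the two complementary pairs always tie on the criterion, so either (CHU, GPD) or (MDA, MRT) may be reported as the first join. Both pairings match the linguistic grouping described above.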
3.1.3.2. Multidimensional Scaling —Kruskal’s Method
Multidimensional scaling into three dimensions with Kruskal’s method was
accomplished in seven iterations with a stress value of 0.0702 (a very good fit),
accounting for 84.63% of the variance between samples. The three samples of
Dravidian-speaking ethnic groups from southeast India are separated in the upper
right of the array from all other samples (Fig. 34). The sample of Yousafzais (YSFsw)
collected from Swat District is identified as possessing unexpectedly close affinities
to the three Dravidian-speaking samples (PNT, CHU, GPD) from Andhra Pradesh in
southeast peninsular India. The three ethnic groups from Maharashtra (MDA, MRT,
MHR) are found in the lower left, and the Tarklanis from District Dir are interposed
between the tribal Madia Gonds (MDA) of eastern Maharashtra and the high-status
Marathas (MRT) from western Maharashtra (Pune).
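The stress value quoted above measures how well the distances in the fitted configuration reproduce the input dissimilarities. A minimal sketch of Kruskal's stress-1 is given below. It makes one simplifying assumption that should be stated loudly: the raw dissimilarities stand in for the disparities, whereas a non-metric analysis would first obtain disparities by monotone regression. The function name and the coordinates are hypothetical.

```python
import math


def kruskal_stress1(dist, coords):
    """Stress-1 between input dissimilarities and a fitted configuration.

    dist: square matrix (list of lists) of input dissimilarities.
    coords: list of coordinate tuples, one per sample.
    Simplification: dissimilarities are used directly as disparities.
    """
    residual = 0.0
    scale = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            fitted = math.dist(coords[i], coords[j])
            residual += (dist[i][j] - fitted) ** 2
            scale += fitted ** 2
    return math.sqrt(residual / scale)


# Hypothetical three-sample configuration whose distances are 3, 4 and 5.
coords = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
exact = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
print(kruskal_stress1(exact, coords))  # a perfect fit gives zero stress
```

Lower values indicate a closer match between the fitted and the input distances.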
Figure 34. Multidimensional scaling (Kruskal's method) of the major ethnic
groups residing in Swat and Dir districts in comparison with other
living Pakistani and peninsular Indian ethnic groups.
The sample of Utmankheils (UTHd) from Dir possesses closer affinities to the
low-status Mahars (MHR) from western Maharashtra than to the high-status Marathas
(MRT), who live alongside the Mahars in western Maharashtra. In the foreground is
an array of ethnic group samples from the northern foothills of the Indus Valley.
These samples are the members of aggregates three and four described above for the
neighbor-joining analysis. The most divergent are the Karlaars from Abbottabad, as
well as the Gujars and Sayeds, followed by the Tanolis and Awans from Mansehra
District, respectively. Their connection to the remaining samples is through a very
distant affinity between the two samples of Awans (AWAm2 and AWAm1) from
Mansehra District.
The remaining samples may be identified as falling into two aggregates. The first is
found in the middle-left of the array and is composed of the Yousafzais and
Kohistanis from Swat, as well as the sample of Swatis from Mansehra. The second
aggregate is found in the lower left and includes the two Wakhi samples from
Gulmit (WAKg) and Sost (WAKs), the inhabitants of Madak Lasht (MDK) and the
Khows (KHO) from Chitral District, as well as the sample of Awans (AWAm1) from
Mansehra District.
The overall results obtained with Kruskal’s method may be summarized as
follows: 1) The sample of Yousafzais from Swat (YSFsw) has close affinities with
the Dravidian-speaking ethnic groups (CHU, GPD and PNT) of Andhra Pradesh,
India; 2) The Gujar (GUJsw) and Kohistani (KOHsw) samples from Swat are
marked by rather close affinities with an array of highland samples from
Gilgit-Baltistan (WAKs, WAKg) and Chitral District (KHO, MDK), as well as with the
sample of Swatis from Mansehra District; 3) The samples collected from Dir District
(UTHd, TRKd) show affinity to west-central peninsular Indians (MRT, MDA, MHR) and
apparently possess no affinities to the samples collected from Swat (GUJsw, KOHsw,
YSFsw) or to the other samples from Pakistan included in this analysis.
3.1.3.3. Multidimensional Scaling —Guttman’s Method
Multidimensional scaling into three dimensions with Guttman’s method was
accomplished in 11 iterations, with a stress value of 0.0906 (an extremely good fit)
accounting for 92.05% of the variance between samples. In general, the patterning
found in this array (Fig. 35) is similar to that described for MDS with Kruskal’s
method.
Figure 35. Multidimensional scaling (Guttman’s method) of the major ethnic
groups residing in Swat and Dir districts in comparison with other
living Pakistani and peninsular Indian ethnic group samples.
Once again, the three Dravidian-speaking samples from southeast peninsular India
(CHU, GPD, PNT) are clearly distinguished from all other samples, and on the left
side of the array there is an aggregation of highland samples, which includes the
Gujars (GUJsw), Kohistanis (KOHsw) and Yousafzais (YSFsw) from Swat, the
inhabitants of Madak Lasht (MDK), the two Wakhi samples (WAKg, WAKs) and the
Khows (KHO) of Chitral District. The array also includes three of the foothill
samples: Swatis (SWT), Awans (AWAm1) and (possibly) the second sample of Awans
(AWAm2) from Mansehra District. The Tarklani (TRKd) and Utmankheil (UTHd)
samples from Dir once again are closely associated with the three ethnic groups of
west-central India (MHR, MRT, MDA), while the foothill samples from Mansehra
and Abbottabad (TANm2, SYDm2, and KARa) are surprisingly isolated from all of the
other Pakistani population samples, except one of the Awan samples from Mansehra
District (AWAm2).
3.1.3.4. Principal Coordinate Analysis
The first three principal axes generated by principal coordinate analysis (PCA)
capture 83.92% of the total variance among samples (Fig. 36). The first axis accounts
for 43.487% of the variance, the second 25.962%, and the third 14.468%. Axis 1 was
elongated to reflect the fact that the greatest proportion of variance is accounted for
by this axis. This plot shows many similarities to, but also some differences from, the
results obtained by cluster analysis and multidimensional scaling. Occupying an
isolated position in the lower right of the array, the three Dravidian-speaking ethnic
groups from southeast peninsular India (CHU, GPD, PNT) are all identified as
possessing closest affinities to one another and are segregated from all other
samples. Likewise, the three Indo-Aryan-speaking samples from west-central
peninsular India (MDA, MRT, MHR), located at the top of the array, are identified as
possessing closer affinities to one another than to any of the other samples. Most of
the highland samples aggregate together in the upper left side of the array, and this
aggregate includes the inhabitants of Madak Lasht (MDK), the Wakhi (WAKg)
sample from Gulmit, the Khow of Chitral District (KHO) and the Kohistanis
(KOHsw) of Swat, along with the foothill samples of Awans (AWAm1) and Swatis
(SWT) from Mansehra District. The Wakhis from Sost (WAKs) show no affinity with
the other sample of Wakhis from Gulmit (WAKg). However, once again, all of the
remaining samples from Mansehra District show closest, albeit distant, affinities to
one another and stand apart from all other samples included in this analysis.
Intriguingly, the two samples from Dir, the Tarklanis (TRKd) and Utmankheils
(UTHd), show little affinity to one another and act as phenetic “bridges” linking the
highly divergent aggregates to the remaining samples. In the case of the former, it is
the Indo-Aryan-speaking samples from west-central India (MRT, MHR, MDA),
while for the latter it is the Dravidian-speaking samples from southeast India (CHU,
GPD, PNT). This pattern suggests that the Tarklanis and Utmankheils do not have
any particular affinities to any of the other samples and may be considered to
represent phenetic isolates relative to the array of living South Asian ethnic groups
encompassed by the current study. Hence, the overall results obtained from the PCA
show that, among the population samples from Dir and Swat districts examined in
the present study, the Tarklanis and Utmankheils show some close affinities to one
another, the Gujars and Kohistanis are marked by little affinity to one another, while
the Yousafzais are highly isolated from the rest of the samples. This pattern also
shows that the Kohistanis, Gujars, and Utmankheils all exhibit moderate affinities to
one another, with the Yousafzais more divergent. The Tarklanis appear to share no
affinities with the other four sampled ethnic groups from Dir and Swat districts.
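The per-axis variance shares reported above (43.487% + 25.962% + 14.468% = 83.917%, which rounds to the stated 83.92%) come from the eigenvalues of a double-centered distance matrix. A minimal sketch of classical principal coordinate analysis is given below, using NumPy. Several points are assumptions rather than details taken from this study: the input is treated as a matrix of distances that are squared before centering, negative eigenvalues are set to zero when computing shares, and the function name and toy data are hypothetical.

```python
import numpy as np


def principal_coordinates(dist):
    """Classical principal coordinate analysis of a symmetric distance matrix.

    Returns the eigenvalues in descending order and the share of the
    positive-eigenvalue total captured by each axis.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering  # Gower double-centering
    eigvals = np.linalg.eigvalsh(gram)[::-1]        # largest eigenvalue first
    positive = np.clip(eigvals, 0.0, None)          # assumption: drop negatives
    return eigvals, positive / positive.sum()


# Toy example: three samples lying on a line at positions 0, 1 and 3.
toy = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
eigvals, shares = principal_coordinates(toy)
print(np.round(shares * 100, 2))
```

In the toy example all of the variation lies along one line, so the first axis should capture essentially the whole total.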
Figure 36. Principal coordinate analysis (PCA) of the major ethnic groups
residing in Swat and Dir districts in comparison with other living
Pakistani and peninsular Indian ethnic groups.
3.1.4. Living Pakistanis Considered in Light of Living Peninsular Indians and
Prehistoric Inhabitants of the Indus Valley and South-Central Asia
3.1.4.1. Neighbor-joining Cluster Analysis
Neighbor-joining cluster analysis identifies six sample aggregates. Beginning at the
extreme left is an aggregate of eight samples. These eight samples may be further
divided into two sub-aggregates and a “bridge” sample (Fig. 37).
Figure 37. Neighbor-joining cluster analysis of living Pakistani ethnic groups,
other living samples, and prehistoric inhabitants of the Indus Valley and
South-Central Asia, together with the major ethnic groups from Swat and Dir districts.
The first sub-aggregate is composed of four samples. These samples include the two
prehistoric samples from Mehrgarh (NeoMRG, ChlMRG) and two living samples,
the Karlaars (KARa) from Abbottabad and the Gujars from Mansehra District
(GUJm2). The second sub-aggregate is also composed of four samples. These include
the three living Indo-Aryan-speaking ethnic groups from Maharashtra (MDA, MRT,
MHR), which exhibit closest affinities to one another, followed by the prehistoric
sample also from Maharashtra (INM). Occupying a position in between these two
sub-aggregates is the sample of Tarklanis from Dir, which are identified as
possessing somewhat closer affinities to the west-central peninsular Indian samples
than to the living samples of Karlaars (KARa) and Gujars (GUJm2) from the foothills
rimming the northern margin of the Indus Valley of Pakistan.
The second aggregate, found in the upper center of the array, is composed of the
three living Dravidian-speaking ethnic groups from southeast peninsular India
(CHU, GPD, PNT), which show closest affinities to one another and are distantly
separated from all of the other samples included in this analysis.
The third aggregate, found in the upper right, includes the prehistoric sample from
Harappa (HAR), two of the samples from Swat (KOHsw, GUJsw), and the sample of
Utmankheils (UTHd) from Dir District. Intriguingly, it is the Kohistani sample from
Swat (KOHsw) that links the members of this aggregate to the rest of the samples
included in this analysis. The Utmankheils (UTHd) from Dir are identified as the
most divergent, while the prehistoric sample (HAR) is interposed between the
Kohistanis (KOHsw) and Gujars (GUJsw) from Swat.
The fourth aggregate, also found in the center right of the array, encompasses six
samples, all but one of which may be considered highland samples. The Khows
(KHO) from Chitral District serve as the sample that links the members of this
aggregate to the rest of the samples included in this analysis. The remaining samples
are divided into two sub-aggregates. The first is composed of the two Wakhi
samples (WAKg, WAKs), which show closest affinities to one another, while the
second is composed of the inhabitants of Madak Lasht (MDK), the Yousafzai from
Swat (YSFsw), and one of the samples of Awans (AWAm1) from Mansehra District.
The fifth aggregate, found in the lower center, encompasses five samples. These
include the prehistoric sample from Timargara (TMG), three of the samples (SYDm2,
AWAm2, TANm2) from Mansehra and the late prehistoric sample from Sarai Khola
(SKH). Interestingly, it is the prehistoric sample from Dir, Timargara (TMG), that
serves to link these samples to the rest of the samples included in this analysis (apart
from the members of aggregate six), while it is the prehistoric sample from Sarai
Khola (SKH) that serves as the link between the members of this aggregate and the
members of aggregate six.
Aggregate six, which includes four samples found in the lower right of the array, is
composed entirely of the prehistoric samples from south Central Asia (KUZ, MOL,
DJR, SAP). These samples are strongly separated from all of the other samples
included in this analysis, including the members of aggregate five. Of note, Sarai
Khola (SKH) shows much closer affinities to Sayeds (SYDm2), Awans (AWAm2) and
Tanolis (TANm2) from Mansehra District than to either Kuzali (KUZ), the
phenetically most proximate of the south Central Asian samples, or to the prehistoric
Gandharan Grave Culture sample from Timargara (TMG).
Dimension 1 also serves to separate the members of sub-aggregate one (ChlMRG,
NeoMRG, KARa, GUJm2) from all other samples in the lower right. Highland
samples are found in the upper right and stand apart from the other samples by
possessing high scores for Dimension 3. These not only include the two samples
from Dir (TRKd, UTHd), who possess closest affinities to one another and to the
sample of Yousafzais (YSFsw) from Swat District, but also the two Wakhi samples
(WAKg, WAKs), who likewise express closest affinities to one another, the
prehistoric sample from Timargara (TMG), the Khows (KHO) from Chitral District,
Kohistanis (KOHsw) from Swat, and the inhabitants of Madak Lasht also from
Chitral District (MDK). This aggregate also includes the sample of Swatis from
Mansehra District and the prehistoric sample from Harappa (HAR). The
Indo-Aryan-speaking ethnic groups (MRT, MHR, MDA) and the prehistoric inhabitants
(INM) from west-central India are found in the lower right of the array.
3.1.4.2. Multidimensional Scaling—Kruskal’s Method
Multidimensional scaling into three dimensions with Kruskal’s method was
accomplished in 10 iterations, with a stress value of 0.0686 (very good fit) accounting
for 85.14% of the variance between samples. Dimension 1 provides a clear separation
of the south Central Asian samples located in the upper left of the array from all
other samples, with the late prehistoric sample from Sarai Khola (SKH) occupying
the most proximate position to them (Fig. 38).
Figure 38. Multidimensional scaling with Kruskal's method of Smith’s MMD
pairwise distances among living Pakistani ethnic groups, living
ethnic groups of peninsular India and samples of the prehistoric
inhabitants of the Indus Valley, South Central Asia and the major
living ethnic groups of Swat and Dir districts.
However, the three living Dravidian-speaking ethnic groups (CHU, GPD, PNT)
from southeast India do not exhibit any affinities to one another. Instead, the
Gompadhompti Madigas (GPD) occupy a position in the lower right of the array
with closest affinities to the high-status Marathas (MRT) of west-central India. The
middle-status Pakanati Reddis (PNT) are found in the middle-right, with close
affinities to Gujars (GUJsw) from Swat and to the living inhabitants of Madak Lasht
(MDK), while the tribal Chenchus (CHU) possess only distant affinities to the Khows
(KHO) of Chitral District and to one of the samples of Awans (AWAm1) from
Mansehra District.
3.1.4.3. Multidimensional Scaling—Guttman’s Method
Multidimensional scaling into three dimensions with Guttman’s method (Fig. 39)
was accomplished in 11 iterations, with a stress value of 0.0522 (very good fit)
accounting for 90.97% of the variance between samples.
Figure 39. Multidimensional scaling (Guttman’s method) of living Pakistani
and peninsular Indian ethnic groups, prehistoric inhabitants of the
Indus Valley and South Central Asia, as well as samples of the major
ethnic groups from Swat and Dir districts.
The pattern is similar, but not identical, to that described above for MDS with
Kruskal’s method. Once again, Dimension 1 serves to separate the four prehistoric
samples from south Central Asia in the upper left of the array from all other samples
(Fig. 39). However, in this case, the link between these samples and all others is
with the Tanoli (TANm2) sample from Mansehra District, rather than with Sarai
Khola (SKH), which stands out as an isolate from all other samples.
The members of sub-aggregate one of aggregate one identified in the neighbor-
joining cluster tree are found in the lower left, but in this case, it is the Chalcolithic
period sample (ChlMRG) from Mehrgarh that is identified as a distant outlier, while
the earlier Neolithic (NeoMRG) occupants of this site are identified as possessing
close phenetic affinities to the Karlaars (KARa) of Abbottabad District and Gujars of
Mansehra District (GUJm2). Once again, the three living Indo-Aryan-speaking ethnic
groups from Maharashtra (MHR, MDA, MRT), as well as the prehistoric inhabitants
of this same region of peninsular India (INM), occupy the lower right of the array
and possess closest affinities to one another.
The highland samples are more widely dispersed and are divided into two groups.
The first, possessing high values for Dimension 3, occupies the upper right of the
array and includes the two Wakhi samples (WAKg, WAKs), which exhibit closest
affinities to one another, the Yousafzai from Swat (YSFsw), the Utmankheils
(UTHd) and Tarklanis (TRKd) from Dir, the prehistoric inhabitants of Timargara
(TMG) and the sample of Swatis (SWT) from Mansehra District. The second aggregate
of highland samples, possessing lower values for Dimension 3, includes the Khows
(KHO) of Chitral District, one of the samples of Awans (AWAm2) from Mansehra
District, Gujars (GUJsw) from Swat District, the prehistoric sample from Harappa
(HAR), and the inhabitants of Madak Lasht (MDK). However, and rather troublingly,
this aggregate also includes the tribal Chenchus (CHU) from Andhra Pradesh as
well as the two other Dravidian-speaking samples (GPD, PNT) from southeast
India.
3.1.4.4. Principal Coordinate Analysis
The first three principal axes generated by principal coordinate analysis (PCA)
combine to capture 84.75% of the total variance among samples. This plot shows
many similarities with the results obtained by neighbor-joining cluster analysis and
by the two versions of multidimensional scaling. Found in the lower right of the array,
the four prehistoric south Central Asian samples (SAP, DJR, MOL, KUZ) are once
again clearly distinguished from all other samples (Fig. 40). The Neolithic
inhabitants of Mehrgarh (NeoMRG), the Karlaars (KARa) from Abbottabad and the
Gujars (GUJm2) of Mansehra District occupy an isolated position in the upper left,
while the Chalcolithic (ChlMRG) inhabitants of Mehrgarh stand out as an
isolate in the upper foreground of the array. Reassuringly, the three living
Dravidian-speaking ethnic groups (GPD, CHU, PNT) of southeast peninsular India
exhibit closest affinities to one another in the lower left of the array, while the three
living Indo-Aryan-speaking ethnic groups (MDA, MRT, MHR) of west-central India,
as well as the prehistoric inhabitants (INM) of this region, are tightly grouped
together in the center-left and possess secondary affinities to the two samples (TRKd,
UTHd) from Dir District.
Once again, the remaining samples can be divided into two aggregates. Members of
the first aggregate, separated by higher scores for Axis 2, include two Wakhi samples
(WAKg, WAKs), one of the samples of Awans (AWAm1) from Mansehra District
and the prehistoric Gandharan Grave Culture sample from Timargara (TMG). The
second aggregate, with lower scores for Axis 2, includes the Khows (KHO) of Chitral
District, Kohistanis (KOHsw) from Swat, the prehistoric sample (HAR) from
Harappa, Swatis from Mansehra District (SWT), as well as the samples of Gujars
(GUJsw) and Yousafzais (YSFsw) from Swat.
Figure 40. Principal coordinate analysis (PCA) of living Pakistani and
peninsular Indian ethnic groups, samples of prehistoric inhabitants
of the Indus Valley and South Central Asia, as well as samples of
the major living ethnic groups of Swat and Dir districts.
3.2. Mitochondrial DNA analysis
3.2.1. Genomic DNA isolation
Genomic DNA (gDNA) was isolated from saliva collected from individuals belonging
to the five major ethnic groups (Tarklanis, Yousafzais, Kohistanis, Gujars,
Utmankheils) of Swat and Dir districts using a protocol established in our lab, and
it was of good quality and quantity (Fig. 41).
Figure 41. Photographs representing the quality and concentration of gDNA. (A)
Agarose gel electrophoresis. (B) Electropherogram.
3.2.2. PCR amplification
The gDNA was used to amplify the control region (HVSI, HVSII) of mtDNA by PCR
using a set of primers, and the amplified products were separated by electrophoresis
on a 1.5% agarose gel. The amplified fragments and their corresponding band sizes are
given in Figure 42.
Figure 42. Agarose gel electrophoresis photographs of the mtDNA control region.
(a) Amplified PCR fragment of HVSI. (b) Amplified PCR fragment of
HVSII.
The amplified PCR products (HVSI, HVSII) were eluted from the agarose gel for
sequencing, and very good results were obtained (Fig. 43).
Figure 43. Agarose gel electrophoresis photographs (a) eluted PCR products of
mtDNA HVSI (b) eluted PCR products of mtDNA HVSII.
Furthermore, the sequencing results of the PCR products obtained from Macrogen,
Inc. were converted to FASTA format and searched with BLAST against the National
Center for Biotechnology Information (NCBI) database. Mismatched sequences were
removed, while the best-matching and most accurate sequences were used for analysis.
3.2.3. MtDNA haplogroup determination
The mtDNA sequences of the five sampled populations (Gujars=73, Tarklanis=62,
Yousafzais=56, Utmankheils=70, and Kohistanis=37) were used to determine their
respective mtDNA haplogroups, which are described below.
3.2.3.1. MtDNA haplogroup determination in the Gujar population
Samples from 73 Gujars were analyzed for the mtDNA control region (HVSI, HVSII).
A total of 46 different haplotypes were observed, of which 29 were unique and 17
were shared by more than one individual. Haplogroup M6 was the most frequent,
occurring in 7% of individuals. For the Gujar sample, the mtDNA genetic diversity
was 0.9223, the power of discrimination was 0.9097, and the random match
probability was 0.0903 (Table 10).
Table 10. Statistical analysis of the Gujar sample from Swat
Population statistics
Total number of samples 73
No of haplotypes 46
No of unique haplotypes 29
Random match probability 0.0903
Power of discrimination 0.9097
Genetic diversity 0.9223
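The three statistics reported in Table 10 can be computed from haplotype counts. The sketch below assumes the definitions commonly used in forensic mtDNA work (random match probability as the sum of squared haplotype frequencies, power of discrimination as its complement, and Nei's unbiased haplotype diversity); the text does not state which formulas were applied, so this is an illustration rather than the procedure of the study. The function name `diversity_stats` and the sample labels are hypothetical and are not the Gujar data.

```python
from collections import Counter


def diversity_stats(haplotype_labels):
    """Return (rmp, pd, gd) for a list holding one haplotype label per individual.

    Assumed definitions (not confirmed by the thesis text):
      rmp = sum of squared haplotype frequencies
      pd  = 1 - rmp
      gd  = n / (n - 1) * (1 - sum of squared frequencies)
    """
    n = len(haplotype_labels)
    if n < 2:
        raise ValueError("need at least two individuals")
    counts = Counter(haplotype_labels)
    rmp = sum((c / n) ** 2 for c in counts.values())
    pd = 1.0 - rmp
    gd = n / (n - 1) * pd
    return rmp, pd, gd


# Hypothetical sample of 10 individuals: one haplotype seen three times,
# one seen twice, and five seen once each.
sample = ["h1"] * 3 + ["h2"] * 2 + ["h3", "h4", "h5", "h6", "h7"]
rmp, pd, gd = diversity_stats(sample)
print(f"RMP={rmp:.4f}  PD={pd:.4f}  GD={gd:.4f}")
```

Under these assumed definitions the genetic diversity equals the power of discrimination multiplied by n/(n-1); the values in Table 10 are consistent with that relationship (0.9097 x 73/72 is approximately 0.9223).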
The regional affiliations of the haplogroups observed among Gujars are as follows:
42% South Asian, 37% West Eurasian, 11% East Eurasian, 4% Southeast Asian, 2.7% East
Asian, 1.4% Eastern European, and 1.4% North Asian. The South Asian haplogroups
include: M6 (7%), M30 (4%), M37 (4%), M5c (4%), M3 (2.7%), M3a (2.7%), M5 (2.7%),
M52a (2.7%), R5a (2.7%), M30d (1.4%), M35b (1.4%), M3c (1.4%), M53 (1.4%), M54
(1.4%), M7c (1.4%) and R22 (1.4%). West Eurasian haplogroups include: H2a (4%), T2b
(4%), H14a (2.7%), H5 (2.7%), K1a (2.7%), U7a (2.7%), H1 (1.4%), H1a (1.4%), H1e
(1.4%), H3p (1.4%), N (1.4%), T (1.4%), T1a (1.4%), U2a (1.4%), U4a (1.4%), U5b
(1.4%), U7 (1.4%), V9a (1.4%) and W3a (1.4%). East Eurasian haplogroups include: B4a
(5%), D4b (1.4%), D4e (1.4%), D4g (1.4%) and D4p (1.4%). Southeast Asian haplogroups
include: F1 (1.4%), G2b (1.4%) and S (1.4%). The East Asian haplogroup is A (2.7%),
the Eastern European haplogroup is H7i (1.4%), and the North Asian haplogroup is J
(1.4%). The observed haplogroup frequencies, their respective variants, and
geographic origins are provided in Figure 44 and Table 11.
Figure 44. Graphical representation of mtDNA haplogroup frequencies present in
the Gujar sample from District Swat.
Table 11. Haplogroup frequencies and their respective variants found in the Gujar sample from Swat
S.NO Frequency Variants Hg HGO

1 2 A73G, T152C, A234G, A235G, A263G, C309CCT, T310C, AC523d, C560A, T16105C, C16115A, A EA
C16223T, C16290T, T16311C, G16319A, T16362C, T16413A
2 4 A73G, T195C, A263G, C309CCT, T310C, AC523d, C560A, A16100T, T16189C B4a EEA
3 1 C16115A, C16223T, G16274A, A16307T, C16332T, C16355T, T16362C, A16367G, G16384T, A16387G D4b EEA
4 1 T152C, T155A, A165T, A178T, C16083A, T16090A, A16100T, C16223T, G16274A, T16362C D4e EEA
5 1 C151T, T152C, A263G, A290T, C298T, C308T, C315CC, G323T, C324T, C332T, T334TT, C340T, D4g EEA
C349T, C356T
6 1 A73G, T195C, C198T, A263G, C309CCT, T310C, T482C, T489C, AC523d, T16097C, G16110A, D4p EEA
G16414A
7 1 A16183C, T16189C, A16194C, T16195A, C16197G, C16201A, A16203AA, C16205A, T16209A, F1 SEA
C16211A, C16214A, A16230T, C16234A, C16236A, T16243G, C16245G, A16258C, A16265C, A16269C,
T16271A, T16276A, C16282A, A16293C, C16301A, T16304C, C16306A, T16308A, T16311A, C16313A,
A16322T, C16332A, C16339A, A16340T, T16347C, C16358T, T16359C, A16367T, T16368G, T16372A
8 1 G62GG, A73G, G184A, A200G, A263G, T310C, T310TTC, G380T, G389T, A396T, G410T, A425T, G2b SEA
T430C, C445T, C465T, T16094C, T16117A, T16189C, C16192CT, A16194G, T16195G, A16212T,
A16220C, C16223T, C16239G, C16245G, A16258C, A16265T, A16269G, T16276A, A16277C, A16285C,
A16293T, C16294T, C16296T, A16305T, A16316G, A16326C, T16330G, A16333T, T16334A, G16346A,
T16347C, A16351T, T16362C, A16367G, T16368G
9 1 G53GC, A263G, T310C, T310TTC, T16154C, G16156C, C16159T, A16166C, T16189C, A16402C, H1 WE
T16413C
10 2 G71GG, T72G, A263G, C315CC, A16180C, C16256T, T16352C, G16414A H14a WE
11 1 G92A, A111d, G124T, A126T, T131A, G184A, G185T, G187A, A200G, G203T, C231T, A241T, A248T H1a WE
12 1 T89TT, C150T, A263G, C264T, A300G, T310A, G316C, C317CC, G329T, C330T, C332T, A339T, H1e WE
A351T, A357T, A360T
13 3 A263G, C315CC, T16075A, C16223T, C16234T, G16274A H2a WE
14 1 G53GC, A263G, C315CC, T16154C, G16156C, A16166C, C16168A, C16169A, C16174A, C16222T, H3p WE
C16242T, G16273A, T16356C
15 2 A263G, C315CC, G366A, G389A, T408TT, A419C, A426T, A428T, C436A, C438T, A439T, A443T, H5 WE
C445A, A446T, C456T, C462T
16 1 G124T, T125A, T133A, A178T, A215d, G228A H7i EEU
17 1 G184A, A191T, A200G, C222T, A240T, A263G, C295T J NA
18 2 T63A, A73G, C150T, T199C, A263G, C315CC, G366T, C371T, G380A, T391A, A395T, A415T, A419T, K1a WE
T424TT, C438T, C441T, A443T, A451T, T452A, T453G, C459T, C462T, C467A, C476T, A478T,
G16129A, T16224C, C16301T, A16312C, C16321T, C16328T
19 2 A73G, T195C, A263G, T310C, T310TTC, G366A, T414G, T482C, T489C, AC523d, A561C M3 SA
20 3 A73G, T125C, T127C, T195A, A263G, C309CCT, T310C, T489C, AC523d, C560A, T16075A, A16078T, M30 SA
C16223T, C16234T, G16274A, G16414A
21 1 A73G, T195A, A263G, C315CC, T489C, AC523d, C16179d, C16223T, A16302G M30d SA
22 1 T199C, A263G, A278T, A281T, A291T, C311T, G16096C, T16097C, C16223T, T16304C, T16362C M35b SA
23 3 A73G, C151T, T152C, A263G, C309CCT, T310C, T489C, T16075A, C16085A, C16221T, C16223T M37 SA
24 2 C194T, T195C, T204C, G260T, A263G, C271T, A272T, C273A, C315CC, A331T, C332A, C349T, M3a SA
A16074G, T16126C, C16192T, C16223T, A16312G
25 1 A73G, T195C, A263G, C309CCT, T310C, T482C, T489C, AC523d, T16126C, T16154C, C16223T, M3c SA
T16224C
26 2 G53GT, A73G, T195C, A263G, C309CCCT, T310C, T489C, C560A, G16129A, C16223T M5 SA
27 2 A73G, C78CA, G79C, T195C, A237T, A263T, C268T, C269T, A281T, A287T, C16223T, C16266T, M52a SA
A16275G, C16327A, G16390A
28 1 T16154C, A16164C, A16165C, T16189C, C16192T, C16223T, C16294T, A16316G, T16362C, G16384A, M53 SA
T16386A
29 1 A73G, A263G, C315CC, T489C, C560A, A16070C, G16129A, C16223T, T16304C, T16325C, G16414A M54 SA
30 3 A73G, C150T, A263G, C315CC, T489C, C560A, G16110A, C16111A, C16115A, G16118A, T16126C, M5c SA
G16129A, T16209C, C16223T, T16311C
31 5 G54GG, A73G, T152C, A214G, A263G, C315CC, C461T, T489C, AC523d, T16140A, T16152C, M6 SA
T16154C, A16155C, A16164C, A16165C, C16174G, C16223T, G16274A, T16323A, A16351T, T16362C,
C16376T, G16384T, A16387G, C16404T
32 1 T16068C, A16070C, A16074G, A16078T, G16110A, T16117A, T16140d, T16144A, C16147T, A16158d, M7c SA
T16161A, A16171T, T16172C, A16182T
33 1 C16223T N WE
34 1 A73G, A188G, C194T, T204C, G207A, A263G, G316C, C317CC, G329A, A339T, T344C, C353G, R22 SA
A358T
35 2 G85T, G94A, G107T, A111T, G124A, T146C, T152C, A210T, G229T, A263G, G275T, A278T, A281T, R5a SA
C299T, A301T, C307T, C312CT, C312T, T16094C, G16096C, T16097C, C16099T, C16266T, T16304C,
T16311C, T16356C, C16393T, C16404T
36 1 T152C, A263G, C315CC, T455TT, A492T, A515T, CAC516d, C558T, A16066T, C16069T, A16070T, S SEA
A16074G, C16176T, C16185T, C16223T, A16246T, A16309T, G16346T, C16348T, A16402T
37 1 A87G, A263G, C315CC, T16126C, T16143G, C16151G, C16188T, T16189C, A16194G, A16207G, T WE
A16216T, C16234T, T16263C, A16277T, C16279T, A16284T, A16289T, C16294T, G16303T, C16321T,
C16327T, C16337T, T16342A, A16343T, C16353T, A16367G, C16382T, T16386G, A16387G, C16393T,
C16395G
38 1 A73G, T152C, T195C, A263G, C309CCT, T310C, C16174A, C16186T, T16189C, C16294T T1a WE
39 3 A73G, A263G, C285T, T310C, G316C, C317CCCC, T321C, C324G, C332A, C343T, C362A, G366A, T2b WE
T372C, A379C, G380A, T383C, T391A, C394T, C404T, C411G, G429C, T430C, A432C, A448T, T460C,
A464C, T471C, C473A, T474A, T482C, T489A, A492C, A523C, C527G
40 1 A73G, A183G, A188G, C194T, G207A, A263G, C315CC, G545A, G16110A, C16115A, G16129A, U2a WE
A16206C, T16362C
41 1 A73G, T99TT, G124T, T199C, A263G, A270T, C296T, A300T U4a WE
42 1 A73G, C150T, A263G, C315CC, C560A, T16093C, T16094C, T16097C, G16110A, C16111A, C16115A, U5b WE
T16131G, C16270T, G16412C
43 1 C16114A, C16115A, A16309G, A16318T, A16416T U7 WE
44 2 A73G, G94T, G97T, G103T, G121T, C151T, T152C, A183T, G187A, A189T, T208A, T233A, A243T, U7a WE
A249T, G260T, A263G, G275A, T16121C, T16126C, T16131G, T16263C, A16269G, T16288C, T16304A,
A16309G, T16311A, A16318T, T16359C, T16362C, T16372C, T16396C
45 1 T119C, A189G, T195C, T204C, G207A, A263G, C315CC, C516T, C530T, T16093C, T16094C, T16097C, V9a WE
T16105C, G16213A, G16274A, G16319A, T16362C, G16390A
46 1 A73G, G143A, A189G, C194T, T195C, T199C, T204C, G207A, A263G, C315CC, C16223T, C16292T, W3a WE
G16414A
Hg, haplogroup; HGO, haplogroup origin; EA, East Asian; SEA, Southeast Asian; WE, West Eurasian; EEU, Eastern European; NA, North
Asian; SA, South Asian; EEA, East Eurasian.
The haplotypes of the Gujar sample were assigned to mega-haplogroups. The most
common mega-haplogroup was R, which was found in 35 (48%) individuals, followed by
M in 33 (45%) and N in 5 (7%) individuals (Fig. 45).
Figure 45. Mega-haplogroup frequencies observed in the sample of Gujars from
District Swat through the mtDNA control region (HVSI and HVSII).
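The assignment of haplogroups to mega-haplogroups can be sketched as a lookup on the leading letter of each haplogroup label. The letter sets below follow the usual mtDNA phylogeny (R is nested within N but is tallied separately, as in the text); they are an assumption of this sketch, not a description of the software used in the study. The counts are transcribed from the Frequency and Hg columns of Table 11, and the names `macro_haplogroup` and `GUJAR_COUNTS` are hypothetical.

```python
from collections import Counter

# Leading letters grouped by mega-haplogroup (assumed, following the usual
# mtDNA tree; R is a branch of N but is counted on its own here).
R_LETTERS = set("BFHJKPRTUV")
M_LETTERS = set("CDEGMQZ")
N_LETTERS = set("AINOSWXY")


def macro_haplogroup(hg):
    """Map a haplogroup label such as 'M30d' or 'H2a' to 'R', 'M', 'N' or 'other'."""
    first = hg[0].upper()
    if first in R_LETTERS:
        return "R"
    if first in M_LETTERS:
        return "M"
    if first in N_LETTERS:
        return "N"
    return "other"


# Haplogroup -> number of individuals, transcribed from Table 11.
GUJAR_COUNTS = {
    "A": 2, "B4a": 4, "D4b": 1, "D4e": 1, "D4g": 1, "D4p": 1, "F1": 1, "G2b": 1,
    "H1": 1, "H14a": 2, "H1a": 1, "H1e": 1, "H2a": 3, "H3p": 1, "H5": 2, "H7i": 1,
    "J": 1, "K1a": 2, "M3": 2, "M30": 3, "M30d": 1, "M35b": 1, "M37": 3, "M3a": 2,
    "M3c": 1, "M5": 2, "M52a": 2, "M53": 1, "M54": 1, "M5c": 3, "M6": 5, "M7c": 1,
    "N": 1, "R22": 1, "R5a": 2, "S": 1, "T": 1, "T1a": 1, "T2b": 3, "U2a": 1,
    "U4a": 1, "U5b": 1, "U7": 1, "U7a": 2, "V9a": 1, "W3a": 1,
}

tally = Counter()
for hg, n in GUJAR_COUNTS.items():
    tally[macro_haplogroup(hg)] += n

total = sum(tally.values())
for group in ("R", "M", "N"):
    print(group, tally[group], f"{100 * tally[group] / total:.0f}%")
```

With the Table 11 counts this tally gives 35 individuals in R, 33 in M and 5 in N, matching the figures quoted for the Gujar sample.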
Comparing the genetic parameters of the Gujar sample with those of previously
reported populations of Pakistan, we found that the Gujars of Swat have a moderate
number of unique haplotypes (29) relative to other sampled Pakistani ethnic groups
(Table 12). This is reflected in the low genetic diversity (0.922) of the Gujar
sample, which is lower than that of every other reported ethnic group from Pakistan
except the Kalash, whose genetic diversity (0.851) is very low (Table 12).
Table 12. Diversity comparison of the sampled Gujar population from Swat with the other reported ethnic groups of Pakistan.
Parameters               Gujars  Mak    Sk     Pt     Bl     Br     Hz     Hb     Ks     Ps     Sd     Pk
No of samples            73      100    85     230    39     38     23     44     44     44     23     100
No of haplotypes         46      70     63     157    26     22     21     32     12     22     21     77
No of unique haplotypes  29      54     58     128    18     15     19     25     5      12     19     63
Genetic diversity        0.922   0.968  0.957  0.993  0.974  0.952  0.992  0.98   0.851  0.95   0.992  0.992
Gujars, present study; Mak, Makrani; Sk, Saraiki; Pt, Pakhtuns; Bl, Baluch; Br, Brahui; Hz, Hazara; Hb, Hunza Burusho; Ks, Kalash;
Ps, Parsi; Sd, Sindhi; Pk, Pakistan Karachi.
3.2.3.2. MtDNA Haplogroups of the sampled Tarklani population of District Dir
A total of 62 unrelated Tarklanis from District Dir, Pakistan were sampled and
analyzed by HVSI and HVSII sequencing. In all, 42 different haplotypes were
observed, of which 31 were unique and 11 were shared by more than one
individual. The corresponding mtDNA genetic diversity of the Tarklanis was
0.9449, the power of discrimination was 0.9297, and the random match probability
was 0.0703 (Table 13).
Table 13. Statistical analysis of Tarklanis of District Dir
Population statistics
No of haplotypes 42
No of unique haplotypes 31
Random match probability 0.0703
Power of discrimination 0.9297
Genetic diversity 0.9449
Out of the 42 haplotypes, 31 (74%) were observed once, 6 (14%) twice, 2 (5%) three
times, 2 (5%) four times and 1 (2%) five times. The haplotype frequencies, their
respective variants, and geographic association are provided in Table 14.
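The breakdown above is a frequency spectrum: the number of haplotypes observed once, twice, and so on, expressed as a share of all distinct haplotypes. A minimal sketch, using the Frequency column of Table 14 as input (the function name `frequency_spectrum` is hypothetical), is:

```python
from collections import Counter

# Frequency column of Table 14: 31 haplotypes seen once, 6 twice,
# 2 three times, 2 four times and 1 five times.
frequencies = [1] * 31 + [2] * 6 + [3] * 2 + [4] * 2 + [5]


def frequency_spectrum(freqs):
    """Map 'times observed' -> (number of haplotypes, rounded % of all haplotypes)."""
    total = len(freqs)
    spectrum = Counter(freqs)
    return {k: (count, round(100 * count / total))
            for k, count in sorted(spectrum.items())}


spectrum = frequency_spectrum(frequencies)
n_individuals = sum(frequencies)  # each haplotype contributes its frequency

for times, (count, pct) in spectrum.items():
    print(f"observed {times}x: {count} haplotypes ({pct}%)")
print("individuals:", n_individuals)
```

Summing the frequencies recovers the 62 sampled individuals, which is a quick consistency check on the table.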
Table 14. Haplogroup frequencies and their respective variants found among the sampled Tarklanis of Dir District
S.No Frequency Variants Hg HGO

1 1 T146C, T152C, T195C, T16126C, G16129C, C16223T, T16298C, C16327T C4b(C4b1) NA
2 1 G103d, T152C, T195C, A263G, C315CC, G380C H11a WE
3 1 A73G, A263G, C315CC, G389C H1a WE
4 1 A73G, A263G, A281T, A16070C, G16129A, C16259T H1a(H1a3a3) WE
5 1 G62A, A73d, C86T, T152C, A263G, C309CCT, T310C, A16038C, T16046TCG, G16047C H1c WE
6 1 T152C, T195C, A263G, C315CC, G329T, G366T, A402T, C404T, G413T, C427T, T430C, H36 WE
A432T, C433T, C434T, C435T, C436T, A439T, C445T, C456T, T474A, C16278T
7 1 C64T, T195C, A263G, T310C, C315CCCC, T321G, C340A, C362A, G366A, A376C, A379C, H57 WE
G380C, G389A, A16070C, C16072G, A16074G, T16093C, T16126C, T16362C
8 1 A263G, T310C, T310TTC, C16176T, C16184T, T16357C H7a(H7a2) WE
9 1 T16131G, A16166T, C16185A, C16192G, C16205A, G16208A, T16209G, G16213C, H82 WE
A16219G, A16220G, C16223d, A16237G, A16253C, A16258C, C16261G, A16281d,
C16286T, C16287T, T16288G, A16300T, C16301d, G16310C, C16313A
10 1 T152C, T195C, A263G, C315CC, T16090A HV1b(HV1b3) WE
11 1 CAA242d, A263G, C295T, T310C, T310TTC, C462T, T489C, A515T, A16038AC, G16039A, J1(J1) NA
G16042T, G16047GC
12 1 A16210C, C16221G, C16222T, T16224C, C16261T, C16264G, A16265d, C16268T, C16279G, J1b NA
A16280G, C16282G, A16285G
13 1 T88TA, A263G, C271T, C295T, C315CC, C462T, T489C, A512G, T16126C, G16145A, J1b(J1b1b) NA
C16222T, C16261T
14 1 A263G, C315CC, G366T, C469T, C476T, T477A, A490T, A512T, C525T, G526T, C534T, JT(JT) NA
A538T, A16074G, T16126C, T16172TT, A16181G, C16190T, G16196GG, T16209C,
A16216AA, C16223CC, A16237AA, A16241AA, C16262CC, C16270CC, A16285AA,
A16293C, A16293AAC, A16305AA, G16329GG, C16355CC
15 1 A73G, C150T, T199C, A263G, C315CC, G366T, A374C, C375T, T16093C, G16129A, K1a WE
T16224C, C16278T, T16311C
16 1 G143A, T204C, T217C, A263G, C309CCT, T310C, T482C, T489C, T16126C, C16223T, M3a(M3a1) SA
C16260T, T16311C
17 1 A73G, A263G, C315CC, T489C, C511T, A16137G, C16223T, A16289G, C16360T M65a SA
18 1 C16223T, T16231C, T16356C, T16362C M6a(M6a1a) SA
19 1 A73G, C80CA, T125C, T127C, C128T, T146C, T195C, A263G, T310C, T310TTC, A339C P3a Australian
20 1 C16083A, T16090A, T16094C, G16319A P4a Oceanian
21 1 T195C, A263G, T310C, T310TTC, C411G, A432C, T460C, T471A, T528G, C16053G, R0a(R0a2) SA
T16093C, T16126C, A16318AA, T16362C, G16391GG, C16404CC
22 1 G109C, T152C, A263G, T310TTC, T310C, A341T, C362A, G366A, G389A, T391A, A402T, R5a(R5a2a) SA
C411G, A415d, A419C, C420T, C427T, C445T, A446d, A448T, T460A, A464C, A470T,
A472T, C473d, C476T, A479T, C491T, C16083A, T16097C, G16145A, T16304C, T16311C,
T16356C
23 1 C122T, T152C, T195C, G228A, A263G, C271G, A272d R6a SA
24 1 A73G, C91CAG, G92A, T146C, A263G, C309CCT, T310C, G380C, G389A, C411G, T414A, T2 WE
G429A, C431T, C445T, T452A, A472T, C476A, A16051T, A16066T, A16070C, G16096C,
C16099d, A16100T, A16103T, T16126C, T16131G, C16176T, A16177C, C16193T, C16205T,
A16206G, C16214T, A16240G, A16241G, C16251A, C16256A, A16258G, C16261G,
T16263C, A16265C, C16266T, A16293G, C16294T, C16296T, G16303A, C16313T, A16318G,
C16327A, C16328G, A16333C, G16336T, A16340G, C16358A, G16384T, G16391A
25 1 G94C, G109C, A263G, T310C, T310TTC, T321G, A341T, C362A, G366A, G389A, C404T, U2a(U2a1a) SA
A415T, C441T, C447T, A451T, T460C, T471A, T474A, A475T, C476A, C481T, T16126C,
G16129C, T16154C, A16206C, A16230G, T16311C
26 1 A73G, G81GG, T217C, A263G, T310C, C311CTCC, C327A, G329A, C340T, C353A, U2e(U2e1) SA
C362A, G366T, C369T, C378G, C382A, A16087T, T16093C, G16129C, T16136A, A16137T,
T16143A, C16147d, A16149T, C16150G, A16155AC, G16156A, AAA16180d, T16189C,
AT16194d, A16200G, C16201G, A16202T, C16223T, C16225A, A16226C, C16228T,
A16247T, A16254C, A16265C, G16273A, C16278G, C16279G, A16289G, A16333G,
C16339A, G16384C, A16402T
27 1 G79C, T152C, T217C, A263G, C308CT, C315CC, C340T, G16129C, G16145A, T16189C, U2e(U2e1e) SA
A16194C, T16195C, C16197G, C16197CGT, C16201A, C16205A, T16209A, C16211A,
C16214A, C16218A, C16228T, C16236A, C16239A, C16242G, T16243G, C16245A,
A16258C, A16265C, C16270A, T16276A, C16282A, A16293C, C16296T, C16301A,
T16311A, T16315A, A16322T, T16330A, C16332A, C16339A, A16340T, C16344A,
C16348A, C16355T, G16361C, T16362C, A16367G, G16384A, T16386A, A16387G,
G16391A, C16394T, C16395T
28 1 G124T, T127C, T152C, T195C, A263G, G275A, C295T, C296T, A301T U4a(U4a1b1) WE
29 1 G124T, C151T, T152C, A263G, T282A, A291T, C315CC, C317T, T321G, G329T, C332T, U7a WE
A339T, A341T, C356A, C362T, A365T, G366T, C371A, A374T, C386T, G389T, A16309G,
A16318T
30 1 G143A, A189G, T192C, C194T, T195C, T196C, T204C, A263G, C315CC, A432T, C458T, W4a WE
T489TT, C494T, C496T, C509T, A521T, C527G, G529T, C530T, C535T, C16111T, C16185T,
C16223T, C16262CC, C16268CC, C16286A, C16286CT, C16292T, G16319A, G16319GGC,
T16325TT, T16347TT
31 1 C16192T, C16223T, C16292T W6 WE
32 2 G109C, A263G, C315CC, C445A, T16126C, G16145A, C16222T, C16261T H3p WE
33 2 T16304C H5 WE
34 2 T72A, A73G, A77d, G81A, G103d, G109d, C110A, A123T, G124C, C132A, T135G, A176C, J1b(J1b1a1) NA
T177C, G187A, A201T, A202C, G203C, T16126C, G16145A, T16172C, C16261T
35 2 A73G, T152C, T195A, A263G, C315CC, A326C, T405A, T16189C M30b SA
36 2 G16042T, T16046TCG, G16047C, T16126C, C16294T T WE
37 2 G62GG, A73G, T152C, T195C, A263G, C315CC, T16126C, A16163G, C16186T, T16189C, T1a(T1a1'3) WE
C16294T
38 3 A73G, G109C, G121d, T195A, A218C, G247T, G260T, A263G, C16221T, C16223T, M30 SA
C16234T, T16362C
39 3 G109T, A193C, T195C, T224A, T236C, A263G, T310C, C311CTCC, G316A, A328C, C330A, U4a(U4a2a) WE
T334A, C338T, C16355T
40 4 A263G, T310C, T310TTC, G329A, C330A H2a WE
41 4 A16051G, C16085T, A16100T, C16112G, A16113G, A16127d, A16132G, C16147d, U2a WE
C16148T, G16204A, A16206C, G16208A, T16209A, A16216G, A16227C, C16228T
42 5 G97C, T195C, A263G, C315CC, C338A, G366A, C375A, A376C, G380A, T16126C, M3 SA
C16185T, C16223T
Hg, haplogroup; HGO, haplogroup origin; EA, East Asian; SEA, Southeast Asian; WE, West Eurasian; NA, North Asian; SA, South Asian;
EEA, East Eurasian.
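The entries in the Variants column appear to follow the common compact notation of reference base(s), position, and observed base(s), with "d" marking a deletion. Under that assumption, a small parser can classify each entry as a substitution, insertion or deletion. This is an illustrative sketch only; the names `VARIANT_RE` and `parse_variant` are hypothetical, and the classification rule (an insertion is an observed string that extends the reference base) is an assumption about the notation rather than something stated in the text.

```python
import re

# reference base(s), position, then observed base(s) or "d" for a deletion
VARIANT_RE = re.compile(r"^([ACGT]+)(\d+)([ACGT]+|d)$")


def parse_variant(token):
    """Split a token such as 'A263G', 'C315CC' or 'G103d' into its parts."""
    match = VARIANT_RE.match(token.strip())
    if match is None:
        raise ValueError(f"unrecognised variant: {token!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if alt == "d":
        kind = "deletion"
    elif len(alt) > len(ref) and alt.startswith(ref):
        kind = "insertion"  # observed string extends the reference base(s)
    else:
        kind = "substitution"
    return {"ref": ref, "pos": pos, "alt": alt, "kind": kind}


# Row 2 of Table 14 (haplogroup H11a)
row = "G103d, T152C, T195C, A263G, C315CC, G380C"
parsed = [parse_variant(tok) for tok in row.split(",")]
for p in parsed:
    print(p["pos"], p["ref"], "->", p["alt"], p["kind"])
```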
When haplogroups are considered by associated region, those of West Eurasia were
the most common (54%), followed by South Asian (30%), North Asian (11%),
Oceanian (2%) and Australian (2%), respectively (Fig. 46).
Figure 46. Distribution of Tarklani haplogroups by origin.
Among the West Eurasian haplogroups, haplotype frequencies are as follows: U2a
(6.5%), H2a (6.5%), U4a (U4a2a) (5.0%), H5 (3.0%), H3p (3.0%), H11a (1.6%), H1a
(1.6%), H1a (H1a3a3) (1.6%), H1c (1.6%), H36 (1.6%), H57 (1.6%), H7a (H7a2) (1.6%),
H82 (1.6%), HV1b (HV1b3) (1.6%), K1a (1.6%), U4a (U4a1b1) (1.6%), W4a (1.6%), and
W6 (1.6%).
South Asian haplogroups include: M3 (8.0%), M30 (5.0%), M30b (3.0%), M3a (M3a1)
(1.6%), M65a (1.6%), M6a (M6a1a) (1.6%), R0a (R0a2+195) (1.6%), R5a (R5a2a) (1.6%),
R6a (1.6%), U2a (U2a1a) (1.6%), U2e (U2e1) (1.6%) and U2e (U2e1e) (1.6%) (Fig. 47).
Figure 47. Graphical representation of haplogroup frequencies among the
sampled Tarklani individuals from District Dir.
North Asian haplogroups include: J1b (J1b1a1) (3.0%), C4b (C4b1) (1.6%), J1 (1.6%),
J1b (1.6%), J1b (J1b1b) (1.6%) and JT (1.6%). The remaining West Eurasian
haplogroups include: T (3%), T1a (T1a1'3) (3%), T2 (1.6%) and U7a (1.6%). The sole
Australian and Oceanian haplogroups are P3a (1.6%) and P4a (1.6%), respectively, as
shown in Figure 47.
The haplotypes found among the sampled Tarklani individuals from District Dir
were assigned to mega-haplogroups. The most frequent is haplogroup R, which was
found in 46 (74%) of individuals, followed by haplogroup M, which was found in 14
individuals (23%), followed by haplogroup N, which was only found in two of the
sampled individuals (Fig. 48).
Figure 48. Haplogroup frequencies observed among the sampled Tarklani
individuals from District Dir through the mtDNA control region (HVSI, HVSII).
The genetic parameters of the Tarklanis included in the present study were
compared with those of previously reported ethnic groups of Pakistan. The
comparative analysis revealed that the Tarklani sample has a moderate number of
unique haplotypes (31), which is similar to most other ethnic groups of Pakistan.
The highest number (128) was observed among Pathans, a result that was likely a
consequence of the large number of samples considered (Table 15). Furthermore,
among the previously reported ethnic groups from Pakistan, the greatest genetic
diversity was observed in Pathans (0.993), followed by Sindhis (0.992), Hazaras
(0.992), a mixed ethnicity sample from Karachi (0.992) and the Burusho of Hunza
(0.980), while the sample of Tarklanis in the present study had a diversity value
of 0.945, as summarized in Table 15.
Table 15. Diversity comparison of the sampled Tarklanis of District Dir with the other reported ethnic groups of Pakistan.
Parameters         Tarklani  Mak    Sk     Pt     Bl     Br     Hz     Hb     Ks     Ps     Sd     Pk
No of samples      62        100    85     230    39     38     23     44     44     44     23     100
No of haplotypes   42        70     63     157    26     22     21     32     12     22     21     77
Unique haplotypes  31        54     58     128    18     15     19     25     5      12     19     63
Tarklani, present study; Mak, Makrani; Sk, Saraiki; Pt, Pakhtuns; Bl, Baluch; Br, Brahui; Hz, Hazara; Hb, Hunza Burusho; Ks, Kalash;
Ps, Parsi; Sd, Sindhi; Pk, Pakistan Karachi.
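Because the number of unique haplotypes grows with the number of individuals sampled, one simple way to examine the sample-size effect noted for the Pathans is to divide the unique-haplotype count by the sample size. The sketch below does this with the values in Table 15. It is a crude normalization offered as an illustration, not an analysis performed in the study, and the variable names are hypothetical.

```python
# Sample sizes and unique-haplotype counts transcribed from Table 15.
samples = {"Tarklani": 62, "Mak": 100, "Sk": 85, "Pt": 230, "Bl": 39, "Br": 38,
           "Hz": 23, "Hb": 44, "Ks": 44, "Ps": 44, "Sd": 23, "Pk": 100}
unique = {"Tarklani": 31, "Mak": 54, "Sk": 58, "Pt": 128, "Bl": 18, "Br": 15,
          "Hz": 19, "Hb": 25, "Ks": 5, "Ps": 12, "Sd": 19, "Pk": 63}

# Unique haplotypes per sampled individual.
ratio = {pop: unique[pop] / samples[pop] for pop in samples}

# Populations ordered from the highest to the lowest ratio.
ranked = sorted(ratio, key=ratio.get, reverse=True)

for pop in ranked:
    print(f"{pop:<9} {unique[pop]:>4} / {samples[pop]:<4} = {ratio[pop]:.2f}")
```

On this scale the Pathan sample, which has the largest absolute count, no longer ranks first, and the Tarklani value is 0.50.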
3.2.3.3. MtDNA Haplogroup variation among the Utmankheil of District Dir
Samples were obtained from 70 unrelated Utmankheil individuals from District Dir,
Pakistan and analyzed by HVSI and HVSII sequencing. A total of 44 different
haplotypes were observed, of which 33 were unique and 11 were shared by
more than one individual. The corresponding mtDNA genetic diversity of the
Utmankheil was 0.9118, the power of discrimination was 0.8982, and the random
match probability was 0.1018 (Table 16).
Table 16. Statistical analysis of sampled Utmankheil population of District Dir
S. No Population statistics
1 Total number of samples 70
2 No of haplotypes 44
3 No of unique haplotypes 33
4 Random match probability 0.1018
5 Power of discrimination 0.8982
6 Genetic diversity 0.9118
Of the 44 haplotypes, 33 (75%) were observed once, 4 (9%) were observed twice, 4
(9%) were observed three times, 2 (5%) were observed four times and one haplotype
(2%) was observed in nine individuals. The haplotype frequencies, their respective
variants, and geographic association are provided in Table 17.
Table 17. Haplogroup frequencies and their respective variants in the Utmankheil sample from District Dir
S.No Frequency Variants Hg HGO

1 3 A73G, A263G, C271T, C295T, C315CC, C462T, T489C, A512G, C16069T, C16072G, J1b (J1b1b) NA
A16074G, T16126C, G16145A, C16222T, C16261T
2 1 A215T, T233A, A249T, C253T, C258G, A263G, T265C, C268T, C269T, A270T, A272C, U5a WE
G275A, C296T, A297T, C298T, T310C, G316C, C317CCC, G329A, A337T, C343T,
C362T, G366A, A16074G, C16192T, C16256T, C16270T, T16298C, T16368C
3 1 C64CG, A73G, C76d, G79C, G109C, C151T, C170T, G171d, A183G, A189T, A191d, R30b (R30b2) SA
A202G, A211T, A214T, T216A, T16093C, C16292T
4 3 A73G, C150T, T152C, A263G, C295T, C309CCT, T310C, T489C, G564A, C16069T, J2b (J2b1a) SE
T16126C, C16193T, G16274A, C16278T
5 1 A73G, T89TT, A263G, T310C, T310TTC, G16039A, G16042T, G16047GC, A16417d H1e (H1e2c) WE
6 2 G92A, G94A, G107A, G109C, T115A, T119C, C122A, G124T, A126T, G136A, T161C, M49 SEA
G163C, G171T, G187A, T204G, T206G, T220G, T226A, G229A, T250G, A263G,
A302ACC, T310C, A376C, T407G, C420T, G429A, T460C, C16223T, C16234T
7 1 A73G, C150T, T152C, A263G, C315CC, A16247G, A16254G U5b (U5b2a1a2) WE
8 1 A73G, A263G, C315CC, T16126C, A16181G, T16209C, A16417d JT NA
9 3 G68C, A73G, A232d, T254G, A263G, A278T, A291d, T310C, T310TTC, T321G, C332T, U4a (U4a2a) WE
G366A, G380C, G389A, T401A, A415T, C420T, T460C
10 2 A73G, T195C, G207A, A263G, C309CCT, T310C, T482C, T489C, G16023GA, M3 SA
G16047GC, T16126C, C16185T, C16223T, A16417d
11 1 A200G, A263G, C315CC, C16192T, T16243C, T16311C, T16368C H3x WE
12 1 G124A, T152C, A153T, T157A, C170d, G171T, G187A, T16086G, T16093C, A16103G, C4 SEA
T16126G, C16150G, C16223T, T16298C, C16327T
13 2 A200G, A263G, C309CCCT, T310C, G366A, C411G H1a (H1ah1) WE
14 1 A73G, A189G, T195C, G207A, A210G, A263G, C315CC, A365AA, A376C, A397AA, N1a (N1a3a3) WE
A402T, T16172C, C16201T, C16223T, A16265G, T16271C
15 1 A200G, A263G, T310C, G316C, C317CCGG, C320T, C324G, G329A, C330A, C332A, H7c (H7c4) WE
C338A, A341C, C345T, C362A, G366A, A16074G, T16092C, A16183C, T16189C,
C16193CC, T16195G, C16214A, T16243G, A16258C, A16265C, T16276A, C16282A,
A16293C, C16301A, C16306A, C16313A, A16322T, C16332A, C16339A, T16368G,
T16372G, G16384A, T16386A, A16402C, T16413G
16 1 A263G, T310C, C16176T, C16184T, T16357C H7a (H7a2) WE
17 1 C64T, C150T, A263G, C315CC, G366A, T16189C H57 WE
18 4 T16094C, A16100T, T16124G, T16126C, T16152G, A16163G, C16186T, T16189C, T1a WE
T16195G, A16293C, C16294T, G16310A, T16315A, C16365G, A16367G, T16386G,
G16391GG, A16402C, A16405C, T16413G
19 1 A73G, T119C, A189G, T195C, T204C, G207A, A263G, C315CC, C16223T W1 WE
20 1 T152C, T195C, A263G, C299A, A302C, C315CC, A373G, T480C, A16183C, T16189C, HV0 WE
C16193CC, T16195G, C16201A, C16205A, C16214A, T16224C, C16228T, T16243G,
A16258C, G16273A, G16274A, T16276A, C16282A, A16293C, T16297C, T16298C,
T16311A, C16313A, A16322T, T16330A, T16334G, A16335G, A16340T, C16344A,
T16362C, T16368G, T16372G, G16384A, T16386A, A16402C
21 1 A73G, T146C, T152C, T195A, A263G, C309CCT, T310C, T489C, AC523d, G16042T, M30c SA
G16047GC, A16166C
22 1 T16126C, G16153A, C16294T, C16296T T2e WE
23 1 G143A, T152C, G228GA, G228A, A234G, A249d, A263G, C315CC, C16083A, T16090A, F1c(F1c1a) SEA
C16111T, G16129A, T16304C, T16362C, A16420T
24 1 A73G, A263G, C315CC, A16066T, A16070C, A16120T, A16137G, C16174T, C16176T, D4h (D4h1) CA
T16209A, G16213T
25 1 T16126C, C16184T, C16223T, A16237G, G16273A, A16283C M66 SA
26 1 C151T, T152C, A263G, C296A, A16265T, G16273T, A16280T, A16318C, C16332T, H1c WE
C16344T, T16372G, C16375T, C16376T, C16377T, C16382T, T16386A, C16395T,
C16401T, A16405T, C16408A
27 1 G121T, T125d, G136T, A137T, T142C, T146C, T152C, A153C, T172C, A181T, G203A, H1t (H1t1a1) WE
A211T, C16266T, C16267T, A16305T, T16311C, G16329T, C16332T, C16339T, A16340T,
A16350T
28 9 A73G, C150T, T195A, A263G, T310C, C315CCC, A339C, C356CC, A368AA, C375A, M30 SA
A385AA, A390AA, C16223T, C16234T, A16338T, A16405T
29 1 A73G, A82AA, G109A, T115C, C145CC, A156d, A175T, A193T, T195C, C16223T, M3d SA
C16234T, G16303T, T16304A, T16308A, A16318T, A16331T, C16344T, A16351T
30 4 A73G, A263G, C315CC, T471C, T16154C, A16206C, A16230G, T16311C, A16383T, U2a (U2a1a) WE
G16384T, A16387T, C16419A
31 3 A93G, A95C, T152C, A263G, C16256T, A16265T, C16270T, A16299T, A16300AT, H14a WE
C16313T, G16336T, C16337T, A16340T, A16351T, T16352C, C16365T, C16379T,
G16384T, G16388A
32 2 A263G, C315CC, C456T, A16074G, T16304C H5 WE
33 1 C91CA, C151T, T152C, A263G, C296A, T310C, T310TTC, C320T, C324G, A328C, H10e (H10e1a) WE
C338A, C362A, G366A, C369A, C375d, A379C, T391A, C394T, A395C, T16126C,
T16131G, A16166C, C16193T, A16194T, C16201T, G16204T, A16220C, C16221T,
A16226T, C16228T, T16231C, C16239T, A16240T, G16255C, A16258T, C16260T,
C16266T, A16272T
34 1 A263G, T310C, G316C, C317CCCT, T321G, A326d, G329A, C338A, C343T, G347GG, R22 SEA
C362A, G366A, T16090A
35 1 A73G, T152C, T195A, A263G, C309CCT, T310C, T489C, AC523d M30b SA
36 1 A73G, T199C, A263G, C309CCT, T310C, T489C, T16090A, T16094C, G16129A, M33a SA
C16223T
37 1 A73G, T125C, T127C, C128T, T146C, T195C, A263G, C309CCT, T310C, T489C, M12a SA
T16172C, A16180T, A16183C, T16189C
38 1 C96A, G109d, C110A, G143C, T146C, T152C, A153C, A165C, G171T, C182d, C190A, HV1a WE
C16115A, T16209C, C16239T, A16318C, G16319A, A16322C, C16327A, C16332A,
G16336T, A16349C, A16350C, T16352G, T16362C, A16367T, G16370T
39 1 A73G, T152C, T195C, A263G, C309CCT, T310C B4a (B4a1c3a) EEA
40 1 A73G, A263G, C315CC, T489C, A16100T, G16129A, T16189C M1 SA
41 1 A73G, T152C, T199C, C315CC, T471C, C481T U2d SA
42 1 A73G, T217C, A263G, T310C, C315CCCC, C327A, A331C, T16090A, T16093C, U2e SA
T16094C, G16129C
43 1 G62T, A73G, G81A, G92A, G109C, A116T, G124T, T146C, T152C, C170T, A193T, U2b SA
A234G, T239A, A249T
44 1 A200G, T204C, A249G, A263G, C309CCT, T310C, T489C M40a SA
Hg, haplogroup; HGO, haplogroup origin; SEA, Southeast Asian; WE, West Eurasian; EEA, East Eurasian; NA, North Asian; SA, South
Asian; SE, Southern European; CA, Central Asian.
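The regional percentages for a sample can be obtained by summing the Frequency column of the haplogroup table within each origin code. The sketch below does this for the rows of Table 17. The row data are transcribed from the table, T1a is entered under WE on the assumption that its printed code is a variant of that label, and the function name `origin_percentages` is hypothetical.

```python
from collections import Counter

# (haplogroup, number of individuals, origin code) transcribed from Table 17.
# T1a is entered as "WE" here, which is an assumption about its printed code.
TABLE_17 = [
    ("J1b", 3, "NA"), ("U5a", 1, "WE"), ("R30b", 1, "SA"), ("J2b", 3, "SE"),
    ("H1e", 1, "WE"), ("M49", 2, "SEA"), ("U5b", 1, "WE"), ("JT", 1, "NA"),
    ("U4a", 3, "WE"), ("M3", 2, "SA"), ("H3x", 1, "WE"), ("C4", 1, "SEA"),
    ("H1a", 2, "WE"), ("N1a", 1, "WE"), ("H7c", 1, "WE"), ("H7a", 1, "WE"),
    ("H57", 1, "WE"), ("T1a", 4, "WE"), ("W1", 1, "WE"), ("HV0", 1, "WE"),
    ("M30c", 1, "SA"), ("T2e", 1, "WE"), ("F1c", 1, "SEA"), ("D4h", 1, "CA"),
    ("M66", 1, "SA"), ("H1c", 1, "WE"), ("H1t", 1, "WE"), ("M30", 9, "SA"),
    ("M3d", 1, "SA"), ("U2a", 4, "WE"), ("H14a", 3, "WE"), ("H5", 2, "WE"),
    ("H10e", 1, "WE"), ("R22", 1, "SEA"), ("M30b", 1, "SA"), ("M33a", 1, "SA"),
    ("M12a", 1, "SA"), ("HV1a", 1, "WE"), ("B4a", 1, "EEA"), ("M1", 1, "SA"),
    ("U2d", 1, "SA"), ("U2e", 1, "SA"), ("U2b", 1, "SA"), ("M40a", 1, "SA"),
]


def origin_percentages(rows):
    """Return origin code -> (individuals, percentage of the whole sample)."""
    counts = Counter()
    for _hg, n, origin in rows:
        counts[origin] += n
    total = sum(counts.values())
    return {o: (n, round(100 * n / total, 1)) for o, n in counts.most_common()}


by_origin = origin_percentages(TABLE_17)
for origin, (n, pct) in by_origin.items():
    print(f"{origin:<4} {n:>3}  {pct}%")
```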
In the present study, West Eurasian haplogroups were identified in 47% of the
sampled Utmankheil individuals, followed by South Asian (33%), Southeast Asian
(7%), North Asian (6%), Southern European (4%), East Eurasian (1.4%), and Central
Asian (1.4%), respectively (Fig. 49).
Figure 49. Graph representing haplogroup frequencies among the sampled
Utmankheil individuals from District Dir.
The West Eurasian haplogroups include the following haplotypes: T1a (6%), U2a
(U2a1a) (6%), J1b (J1b1b) (4%), U4a (U4a2a) (4%), H14a (4%), H1a (H1ah1) (3%), H5
(3%), JT (1.4%), U5a (1.4%), H1e (H1e2c) (1.4%), U5b (U5b2a1a2) (1.4%), H3x (1.4%),
N1a (N1a3a3) (1.4%), H7c (H7c4) (1.4%), H7a (H7a2) (1.4%), H57 (1.4%), W1 (1.4%),
HV0 (1.4%), T2e (1.4%), H1c (1.4%), H1t (H1t1a1) (1.4%), H10e (H10e1a) (1.4%) and
HV1a (1.4%).
The South Asian haplogroups include the following haplotypes: M30 (13%), M3 (3%),
R30b (R30b2) (1.4%), M30c (1.4%), M66 (1.4%), M3d (1.4%), M30b (1.4%), M33a (1.4%),
M12a (1.4%), M1 (1.4%), U2d (1.4%), U2e (1.4%), U2b (1.4%) and M40a (1.4%). The
Southeast Asian haplogroups include haplotypes M49 (3%), C4 (1.4%), F1c (F1c1a)
(1.4%) and R22 (1.4%). The Southern European haplogroup includes haplotype J2b
(J2b1a) (4%). The Central Asian haplogroup includes haplotype D4h (D4h1) (1.4%),
while the East Eurasian haplogroup includes haplotype B4a (B4a1c3a) (1.4%). The
haplotype frequencies within each haplogroup are given in Figure 49.
The haplotypes of the sampled Utmankheil individuals of District Dir were also
assigned to mega-haplogroups. Among these, haplogroup R was the most frequent,
being found in 45 (64%) of the sampled individuals, followed by haplogroup M in 23
individuals (33%) and haplogroup N in two individuals (3%) (Fig. 50).
Figure 50. The frequency of Mega-haplogroups observed among Utmankheil
individuals sampled from District Dir through mtDNA control
region (HVSI, HVSII).
The genetic parameters of the Utmankheil sample in the current study were
compared to those of previously reported ethnic groups of Pakistan. This comparative
analysis revealed that the Utmankheil have a moderate number of unique
haplotypes (33), which is consistent with most other ethnic groups of Pakistan,
except Pathans, in which the high number of unique haplotypes (128) was likely due
to the large number of samples used (Table 18). Furthermore, low genetic diversity
(0.9118) was observed among the Utmankheil individuals relative to that
observed among previously reported ethnic groups from Pakistan, except for the
Kalash, who were found to have the least genetic diversity (0.851) of all the samples
considered (Table 18).
Table 18. Diversity comparison of the sampled Utmankheil individuals of Dir District with the other reported ethnic groups of Pakistan.
Parameters        Utmankheil  Mak    Sk     Pt     Bl     Br     Hz     Hb     Ks     Ps     Sd     Pk
No of samples     70          100    85     230    39     38     23     44     44     44     23     100
No of haplotypes  44          70     63     157    26     22     21     32     12     22     21     77
Utmankheil, present study; Mak, Makrani; Sk, Saraiki; Pt, Pakhtuns; Bl, Baluch; Br, Brahui; Hz, Hazara; Hb, Hunza Burusho; Ks, Kalash;
Ps, Parsi; Sd, Sindhi; Pk, Pakistan Karachi.
3.2.3.4. Haplogroups of the sampled Yousafzai of District Swat
A total of 56 unrelated Yousafzai samples from District Swat, Pakistan were
analyzed by HVSI and HVSII sequencing. The results revealed 39 different
haplotypes, of which 30 were unique and nine were shared by more than one
individual (Table 20). The corresponding mtDNA genetic diversity of the Yousafzai
is 0.9392, the power of discrimination is 0.9237, and the random match probability
observed is 0.0763 (Table 19).
Table 19. Statistical analysis of Yousafzai of District Swat
1 Total number of samples 56
Out of the 39 haplotypes, 30 (77%) were observed once, six (15%) were observed
twice, one (3%) was observed three times, one (3%) was observed four times and
one (3%) was observed six times. The haplotype frequencies, their respective
variants, and their associated geographic origins are given in Table 20.
Table 20. Haplogroup frequencies and their respective variants among the sampled Yousafzai individuals of Swat District
S.NO Frequency Variants Hg HGO

1 5 A263G, T310C, C315CCCCC, C320T, C324G, A326C, C330A, C332A, C338A, C340A, C345T, B4a EEA
A350C, C362A, G366A, A379C, G380C, T383C, C387CC, G389A, C394T, A402T, A16181T,
A16183C, T16189C
2 6 T152C, A219G, A235T, A238G, C273T, A274T, G275T, T282A, A284G, A297T, C299G, C16225G, H2a WE
A16237G, T16249G, A16252G, A16254G, G16255C, A16269G, C16270A, G16273C
3 1 A87G, T146C, T152C, A189G, T204C, A263G, C309CCT, T310C, C324T, A16165C, T16224C, K2a WE
T16311C, T16413G
4 1 C151T, T152C, T206G, A230AA, A263G, A276AA, T292A, A302AA, C315CC, C320T, G329A, R22 SEA
C330A, A331AA, C349CC
5 2 A87G, T152C, A263G, C309CCT, T310C, G366A, A464C, T482C, A488d, C494A, C505T, C518A, R5a SA
AC523d, C541A, G16096C, T16097C, A16203AA, C16266T, T16304C, T16311C
6 1 A87G, A263G, T310C, C315CC, G329A, C332A, C338A, C362A, T372C, A376C, A379C, G380A, H27d WE
T383C, G389A, T391A, C394T, A395G, T398A, T403A, C411G, A419C, G16129A
7 1 C151T, T152C, A263G, C303A, C315CC, T346C, A16070C H1a WE
8 1 T16126C, C16193T, G16274A J1d WE
9 1 G103d, C150T, T195A, A16070C, A16309G, A16318T, A16343G, T16362C U3b WE
10 2 G16129A, G16213A, T16362C H7h WE
11 1 A16120T, T16126C, T16131G, A16164T, A16170C, A16171G, T16178C, C16179A, A16194C, H4a WE
A16203C, G16204C, A16215C, A16216G, T16217G, C16218G, A16219G, A16220d, C16225G,
A16226T, A16237G, A16247d, A16252G, C16268d, C16270G
12 1 A111d, T152C, A263G, A290C, A301G, C303T, C315CC, T16105C, C16108A, C16115A, U7 WE
A16309G, A16318T
13 1 C96A, A263G, C315CC, C324G, G366A, C375A, A379C, A388G, C394T, A402T, C411G, T414G, U2e SA
T416G, A419C
14 1 T204C, A234G, A263G, T310C, T310TTC, C371T, G16208A, C16223T M14 SA
15 1 G103A, T195C, A263G, A286d, C315CC, T489C, C16223T, T16325C D4k CA
16 3 T204C, T217C, A263G, T310C, C311CTCC, G366A, C375A, G389A, C411G, C16083A, C16099T, M3a SA
T16126C, C16223T, T16311C
17 1 C96A, G103A, A111d, A263G, C315CC B5b EEA
18 1 T117A, G121C, G124T, T129C, A219C, T236A, A259C, G260C, A263G, A270G, G275T U7a WE
19 1 C194T, G207A, A263G, C315CC, C514A H3s WE
20 1 T154C, A263G, C315CC, A376C, G413A, C420T R8b WE
21 2 T195A, A263G, C315CC, T489C, G16035C, A16037C, A16041T, G16042T, G16047GC, A16051G, M30 SA
C16223T
22 2 A263G, C315CC, A432C, A16070C, C16076A, T16086A, C16205T, A16206C, A16212C, U2a SA
A16216G, T16217G
23 2 A87G, T152C, T199C, T204C, A263G, C309CCCT, T310C, C16147G, T16172C, C16223T, N1a WE
C16248T, C16344T, C16355T
24 1 C96A, G109d, A111C, T146C, C150T, T152C, T195A, A263G, C315CC, A374T, T403A, T408A, M30c SA
G410C, C418G
25 1 G109C, C151T, T152C, A263G, C315CC, G16096C, T16097C, T16126C T2b WE
26 1 G107GC, T195C, C198T, A263G, C315CC HV0b WE
27 2 G109C, T152C, A200G, A263G, C315CC, C362A, C371T, C16234T, A16247G, T16304C, H5a WE
T16325G, G16391T
28 1 A200G, A263G, C16223T, C16234T, T16362C M49 SA
29 1 T192d, T195C, A211T, A215G, C231A, T233A U4b WE
30 1 T127C, T146C, T152C, T168C, A202d, T226A, G229A L0d AF
31 1 G94C, A111d, T152C D1j EA
32 1 C16104T, C16223T, T16231C, C16291T, G16319A M6a SA
33 1 T16217C HV2 WE
34 1 T88TA, C242T, A263G, C295T, C315CC, C462T, T489C, T16094C, T16126C J1b NA
35 1 G92A, G94C, C105A, C112d, G121T, G124T, A176d, A218C, A219C, T220C, A234C, A249T, H3c WE
G255C, G260A, A263G
36 1 G16110A, C16223T, A16289G M65a SA
37 1 C16099T, A16100T, C16179T, G16204A, C16221A U2c SA
38 1 T16093C, T16094C, C16099T, A16100T, C16301A, G16303A, T16311C, T16330G, C16344A, H3v WE
T16347G, C16353T, C16358T, A16367C, C16375A, T16386d
39 1 A73G, A189G, C194T, T195C, T204C, G207A, A263G, T310C, C315CCC, C320T, C324G, A326C, W+194 WE
G329A, A335T, C338A, G16129A, C16223T, T16249C, C16292T, C16419A
Hg, haplogroup; HGO, haplogroup origin; WE, West Eurasian; EEA, East Eurasian; NA, North Asian; SA, South Asian; SEA, Southeast
Asian; AF, African; EA, East Asian; CA, Central Asian.
The Yousafzai samples were assigned to haplogroups by geographic origin.
Haplogroups associated with West Eurasian populations were most common at
52%, followed by South Asian haplogroups at 29%, East Eurasian at 11%, and
Southeast Asian, North Asian, African, East Asian and Central Asian
haplogroups at 1.8% each (Fig. 51).
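The origin shares quoted above can be recomputed from the per-haplogroup counts in Table 20. The following is a minimal sketch, not the procedure used in the thesis: the compact `TABLE_20` string is a hand transcription of the Frequency, Hg and HGO columns, and the variable names are illustrative assumptions.

```python
from collections import Counter

# "haplogroup individuals origin-code" entries transcribed from Table 20
# (transcription and layout are assumptions made for this sketch).
TABLE_20 = """
B4a 5 EEA; H2a 6 WE; K2a 1 WE; R22 1 SEA; R5a 2 SA; H27d 1 WE; H1a 1 WE;
J1d 1 WE; U3b 1 WE; H7h 2 WE; H4a 1 WE; U7 1 WE; U2e 1 SA; M14 1 SA;
D4k 1 CA; M3a 3 SA; B5b 1 EEA; U7a 1 WE; H3s 1 WE; R8b 1 WE; M30 2 SA;
U2a 2 SA; N1a 2 WE; M30c 1 SA; T2b 1 WE; HV0b 1 WE; H5a 2 WE; M49 1 SA;
U4b 1 WE; L0d 1 AF; D1j 1 EA; M6a 1 SA; HV2 1 WE; J1b 1 NA; H3c 1 WE;
M65a 1 SA; U2c 1 SA; H3v 1 WE; W+194 1 WE
"""

# Split into [haplogroup, count, origin] triples.
rows = [entry.split() for entry in TABLE_20.split(";") if entry.strip()]

# Sum individuals per geographic origin code.
by_origin = Counter()
for haplogroup, count, origin in rows:
    by_origin[origin] += int(count)

total = sum(by_origin.values())
for origin, n in by_origin.most_common():
    print(f"{origin}: {n} individuals ({100 * n / total:.1f}%)")
```

With this transcription the counts add up to the 56 sampled individuals, and the West Eurasian, South Asian and East Eurasian shares round to the 52%, 29% and 11% given in the text.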
Figure 51. The distribution of Yousafzai haplogroups among the individuals sampled from District Swat by associated geographic origin.
The West Eurasian haplogroups found among the sampled Yousafzai
individuals include haplotypes H2a (11%), H7h (3.6%), N1a (3.6%) and H5a (3.6%),
while the rest of the haplotypes (K2a, HV2, W+194, H3v, H3c, U4b, R8b, HV0b,
T2b, H3s, U7a, U7, H4a, U3b, J1d, H1a and H27d) each occurred in only one individual
(1.8%). The East Eurasian haplogroups include haplotypes B4a (9%)
and B5b (1.8%). The South Asian haplogroups include haplotypes M3a (5.4%), R5a
(3.6%), U2a (3.6%) and M30 (3.6%), while the remaining haplotypes (U2c, M6a,
M49, M65a, M30c, M14, U2e) were each limited to one individual (1.8%). North
Asian, East Asian, African, Central Asian and Southeast Asian haplogroups were
each observed in only one individual (1.8%) in the Yousafzai sample
(Fig. 52).
Figure 52. The frequencies of mtDNA haplotypes of Yousafzai individuals sampled from District Swat with respect to their associated geographic origins (legend: West Eurasian, East Eurasian, South Asian, North Asian, East Asian, African, Central Asian, Southeast Asian).
The haplotypes of the Yousafzai sample from District Swat were assigned to
mega-haplogroups (Fig. 45). Mega-haplogroup R was the most frequent, observed
in 40 (71%) of the sampled Yousafzai individuals, followed by M in 12 (21%),
N in 3 (5%) and L in 1 (2%) (Fig. 53).
Figure 53. The frequency of mega-haplogroups observed in Yousafzai individuals from District Swat through mtDNA control regions (HVSI, HVSII).
The genetic parameters of the sampled Yousafzai population were compared
with those of the other reported Pakistani ethnic groups. This comparison revealed that
the Yousafzai sample from Swat has a moderate number (30) of unique
haplotypes, which is consistent with previously reported ethnic groups of Pakistan,
except Pathans, who have the highest number (128) of unique haplotypes
among the ethnic groups compared, including the Yousafzai of the present study
(Table 21). The large number of unique haplotypes in the reported Pathan
sample is likely a consequence of the large number of samples (230) considered.
The high number of unique haplotypes resulted in high genetic diversity among
Pathans (0.993), followed closely by Sindhis (0.992), Hazaras (0.992), a mixed
ethnicity sample from Karachi (0.992) and the Burushos of Hunza (0.980), in
comparison to the Yousafzai (0.9392) observed in the present study (Table 21).
Table 21. Genetic diversity of the Yousafzai sample from District Swat in comparison to the other reported ethnic groups of Pakistan.
Parameters Yousafzai (present study) Mak Sk Pt Bl Br Hz Hb Ks Ps Sd Pk
No of samples 56 100 85 230 39 38 23 44 44 44 23 100
No of haplotypes 39 70 63 157 26 22 21 32 12 22 21 77
3.2.3.5. Haplogroup distribution among the sampled Kohistanis of District Swat
A total of 37 unrelated Kohistani samples from District Swat were analyzed for
mitochondrial HVSI and HVSII sequencing and 25 different haplotypes were
identified. Of these 25 haplotypes, 15 were unique and 10 were shared by
more than one individual. The corresponding mtDNA genetic diversity of this
population was 0.9176, the power of discrimination was 0.8928 and the random
match probability was 0.1072 (Table 22).
Table 22. Statistical analysis of the Kohistani sample from District Swat
1
Of the 25 haplotypes, 15 (60%) were observed once, nine (36%) were observed twice
and one (4%) was observed four times. The haplotype frequencies and their
respective variants are provided in Table 23.
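As a quick consistency check on these frequency classes, the number of distinct haplotypes and the number of sampled individuals can both be recovered from the class counts. This is an illustrative sketch only; the dictionary layout and names are assumptions, not part of the original analysis.

```python
# Frequency classes for the Kohistani sample:
# {times a haplotype was observed: number of haplotypes in that class}
classes = {1: 15, 2: 9, 4: 1}

# Distinct haplotypes = sum of class sizes.
n_haplotypes = sum(classes.values())

# Individuals = sum over classes of (times observed x haplotypes in class).
n_individuals = sum(times * n for times, n in classes.items())

# Share of haplotypes falling in each class, in percent.
shares = {times: 100 * n / n_haplotypes for times, n in classes.items()}

print(n_haplotypes, n_individuals)
for times, pct in shares.items():
    print(f"observed {times}x: {classes[times]} haplotypes ({pct:.0f}%)")
```

The classes account for 25 haplotypes and 37 individuals, matching the sample size reported for the Kohistanis.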
Table 23. Haplogroup frequencies and the respective variants of the Kohistanis sampled from Swat District
S.No Frequency Variants Hg HGO
1 2 G16129A, T16131G, C16242T, T16356C, A16417G HV12b WE
2 1 T16126C, T16189C J1c WE
3 1 G16096C, T16097C, A16098T, C16099A, G16129A, C16174T, A16177T, C16201A, T16209d, G16213A, H5c WE
C16218T
4 2 T65TT, T72C, A193T, A211T, A215T, A243T, G260T, A263T, C271T, G275T, A278T, A281T, C285T, A286T, H2a WE
C299T, C304T, C16107d, A16109G, T16126A, T16199G, A16215G, C16228T, C16232G, A16233C, A16235C,
C16239G, A16241T, C16256A, C16262T, G16273T, A16277T, A16280T, C16282T
5 1 A16051G, C16056T, C16079T, A16081T, C16083d, T16092C, G16096T, A16098T, C16099d, A16113d, U6a WE
A16116T, G16118T, C16151T
6 1 G79C, T83A, G92A, A95T, C120G, G121T, T152A, T161C, A175T, A176T, T180G, A183C, C186G, G187A, H5 WE
G203T, G205A, G207T, G16156T, C16205A, G16208T, A16227T, C16282T, A16300T, C16301A, G16303T,
T16304C, C16313d, C16364T, T16372A, G16384T, G16388A, T16397TT, C16410T, G16412T, G16414T
7 2 C16223T, C16292T, T16362G, A16383T, G16384d W WE
8 1 A16058T, A16074T, A16078T, G16084T, C16085A, G16089d, A16149T, C16179T, C16193T, C16197T, U4c WE
G16204A, A16206T, A16207T, A16215T, A16219T, A16220T, A16227T, C16228T, A16230d, A16240T,
A16241T, C16245A, G16255A, C16256A, C16259T
9 1 A16051G, T16086C, G16129A, A16206C, C16291T, T16362C U2a SA
10 1 G16129A, C16239T, C16339T, A16343T, C16380T, A16387T, A16399T, A16415T, A16420T H17c WE
11 1 G16084A, G16096C, T16097C, C16174d, A16200T, A16202T, G16204A, C16211T, C16214T, A16219T, F2e SEA
C16236T, A16254T, C16260T
12 1 A16070C, T16075A, T16097d, A16100T, G16129A, G16156T, T16161A, A16175T, A16181T, C16193T, HV4a WE
A16194T, C16201T, G16204T, A16210T, C16211T, A16216T, C16218T, C16221T, A16226C, A16227T,
C16228d, C16239T, A16241T, A16247T, G16255T
13 1 G16129A, C16242T, G16273T, A16281T, A16312T, C16313A, A16333T, A16335T, G16336T, C16337T, H1e WE
A16338T, C16339T, G16361T, T16362TT
14 2 A16070C, G16145A, C16176T, C16223T, C16261T, C16291T, T16311C, A16383T, G16384T M4 SA
15 1 T16102A, T16131d, C16225T, C16239T, A16252C, A16258T, C16264T, C16270T H1b WE
16 1 C16069T, T16126C, C16218T J1 NA
17 1 T16092C, C16188T, T16189C, A16207G, C16234T, A16314T, C16410T H13a WE
18 1 A16037C, C16040T, T16050d, C16052A, C16053T, C16056A, C16057A, C16076A, C16083A, C16085G, H1 WE
T16086A, G16129A, T16131A, C16133G, T16161A
19 4 G54GG, A73G, T152C, A214G, A263G, C315CC, C461T, T489C, AC523d, T16140A, T16152C, T16154C, M6 SA
A16155C, A16164C, A16165C, C16174G, C16223T, G16274A, T16323A, A16351T, T16362C, C16376T,
G16384T, A16387G, C16404T
20 2 A73G, T195C, A263G, T310C, T310TTC, G366A, T414G, T482C, T489C, AC523d, A561C M3 SA
21 2 A87G, A263G, C315CC, T16126C, T16143G, C16151G, C16188T, T16189C, A16194G, A16207G, A16216T, T WE
C16234T, T16263C, A16277T, C16279T, A16284T, A16289T, C16294T, G16303T, C16321T, C16327T,
C16337T, T16342A, A16343T, C16353T, A16367G, C16382T, T16386G, A16387G, C16393T, C16395G
22 2 G62GG, A73G, G184A, A200G, A263G, T310C, T310TTC, G380T, G389T, A396T, G410T, A425T, T430C, G2a SEA
C445T, C465T, T16094C, T16117A, T16189C, C16192CT, A16194G, T16195G, A16212T, A16220C,
C16223T, C16239G, C16245G, A16258C, A16265T, A16269G, T16276A, A16277C, A16285C, A16293T,
C16294T, C16296T, A16305T, A16316G, A16326C, T16330G, A16333T, T16334A, G16346A, T16347C,
A16351T, T16362C, A16367G, T16368G
23 2 T152C, T155A, A165T, A178T, C16083A, T16090A, A16100T, C16223T, G16274A, T16362C D4e EEA
24 2 A73G, T195C, C198T, A263G, C309CCT, T310C, T482C, T489C, AC523d, T16097C, G16110A, G16414A D4p EEA
25 1 A73G, G143A, A189G, C194T, T195C, T199C, T204C, G207A, A263G, C315CC, C16223T, C16292T, W3a WE
G16414A
haplogroup Hg; Haplogroup origin HGO; West Eurasian, WE; East Eurasian, EEA; Northern Asian, NA; South Asian, SA; South East
Asian, SEA.
The Kohistani samples were assigned to haplogroups by associated geographic origin. The most
common haplotypes were of West Eurasian derivation at 54%, followed
by South Asian at 24%, East Eurasian at 11%, Southeast Asian at 8% and
North Asian at 3% of the sampled individuals (Fig. 54).
Figure 54. Haplogroup distribution among the sampled Kohistanis from District
Swat by associated geographic region of origin
The most frequent haplotype in the Kohistani samples was M6, which was observed in
11% of individuals. The most frequent West Eurasian haplotypes
were T, HV12b, H2a and W, each at 5.4%.
The rest of the West Eurasian haplotypes occurred with a frequency of 2.7% and include W3a, U6a,
U4c, J1c, HV4a, H5c, H5, H1e, H1b, H17c, H13a and H1. South Asian haplogroups
include haplotypes M6 at 11%, M4 at 5.4%, M3 at 5.4% and U2a at 2.7%. The East
Eurasian haplogroups include haplotypes D4p and D4e, both of which occur with a
frequency of 5.4%. The Southeast Asian haplogroups include haplotypes G2a at 5.4%
and F2e at 2.7%. The North Asian haplogroup was represented by a single
haplotype, J1, which occurs with a frequency of 2.7%, as shown in Figure 55.
Figure 55. The frequencies of mtDNA haplotypes of Kohistanis sampled from District Swat with respect to their associated geographic regions of origin.
These haplotypes were further assigned to mega-haplogroups. The most frequent
was mega-haplogroup R, which occurred in 20
(54%) individuals, followed by M in 14 (38%) and N in 3 (8%) (Fig. 56).
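The mega-haplogroup tallies can be reproduced from the haplogroup labels and frequencies in Table 23. The sketch below is illustrative only: it assumes a simplified rule that maps each haplogroup to a mega-haplogroup by its leading letter (following the standard mtDNA phylogeny, where for example H, HV, J, T, U and F descend from R, and D and G descend from M), and it is not the software used in the thesis.

```python
from collections import Counter

# Simplified leading-letter rule (an assumption made for this sketch).
MEGA_BY_LETTER = {
    **dict.fromkeys("BFHJKPRTUV", "R"),
    **dict.fromkeys("CDEGMQZ", "M"),
    **dict.fromkeys("AINOSWXY", "N"),
    "L": "L",
}

# Haplogroup -> number of Kohistani individuals, transcribed from Table 23.
kohistani = {
    "HV12b": 2, "J1c": 1, "H5c": 1, "H2a": 2, "U6a": 1, "H5": 1, "W": 2,
    "U4c": 1, "U2a": 1, "H17c": 1, "F2e": 1, "HV4a": 1, "H1e": 1, "M4": 2,
    "H1b": 1, "J1": 1, "H13a": 1, "H1": 1, "M6": 4, "M3": 2, "T": 2,
    "G2a": 2, "D4e": 2, "D4p": 2, "W3a": 1,
}

# Tally individuals per mega-haplogroup.
mega = Counter()
for haplogroup, n in kohistani.items():
    mega[MEGA_BY_LETTER[haplogroup[0]]] += n

total = sum(mega.values())
for name, n in mega.most_common():
    print(f"{name}: {n} ({100 * n / total:.0f}%)")
```

Under this rule the 37 individuals split into R = 20, M = 14 and N = 3, in line with the counts reported above.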
Figure 56. Mega-haplogroup distribution among the sampled Kohistani individuals from District Swat.
The genetic parameters of the Kohistani sample of the present study were compared
with the previously reported ethnic groups of Pakistan. The comparative analysis
revealed that the Kohistanis have a moderate number of unique haplotypes (15),
which is consistent with the reported ethnic groups of Pakistan (Table 24).
Furthermore, low genetic diversity (0.9176) was observed in this Kohistani sample
relative to previously reported ethnic groups from Pakistan, with the exception of
Kalash, which have the least genetic diversity (0.851) as shown in Table 24.
Table 24. The genetic diversity of the sampled Kohistani population from District Swat in comparison with the other reported
ethnic groups of Pakistan.
Parameters Kohistanis (present study) Mak Sk Pt Bl Br Hz Hb Ks Ps Sd Pk
No of samples 37 100 85 230 39 38 23 44 44 44 23 100
No of haplotypes 25 70 63 157 26 22 21 32 12 22 21 77
3.2.4. Overall mtDNA haplogroup distribution among the five sampled ethnic
groups of Swat and Dir districts
The mtDNA control region sequences of 298 individuals from Swat and Dir districts
were collectively analyzed as one geographical population for haplogroup
identification. The overall results revealed 126 haplotypes, of which 75
were unique and 51 were shared. The most common haplogroups among all five
population samples were H2a and M30, each observed with a frequency of 5.37%, followed
by U2a (4.36%), M3 (3.69%), B4a (3.02%), U4a (2.68%), T1a (2.35%), H1a (2.01%), R5a
(1.68%), T (1.68%), H17c (1.34%), T2b (1.34%), U2e (1.34%), U7a (1.34%) and W (1.34%),
while haplogroups H3p, J2b, K1a, M30b, M37, M49, M5c, N1a and R22 were each found
at a frequency of 1.01%. The rest of the haplotypes accounted for
less than 1.01%, as described in Table 25. The most frequently observed haplotype
among the Yousafzai and Tarklani individuals is H2a, while M30 is prevalent among
Utmankheils and M6 was the most common haplogroup among Gujar and Kohistani
individuals from Swat and Dir districts. The frequencies of each of the haplogroups
identified in the five sampled populations are summarized in Table 25.
Table 25. mtDNA haplogroup frequency distribution in the five sampled populations of Dir and Swat Districts.
S.No Hg Gujars (N=73) Yousafzai (N=56) Tarklani (N=62) Kohistani (N=37) Utmankheil (N=70) Total (%)
1 A 2 0 0 0 0 0.67
2 B4a 3 5 0 0 1 3.02
3 B5b 0 1 0 0 0 0.34
4 C4 0 0 0 0 1 0.34
5 C4b 0 0 1 0 0 0.34
6 D1j 0 1 0 0 0 0.34
7 D4b 1 0 0 0 0 0.34
8 D4e 1 0 0 2 0 1.01
9 D4g 1 0 0 0 0 0.34
10 D4h 0 0 0 0 1 0.34
11 D4k 0 1 0 0 0 0.34
12 D4p 1 0 0 2 0 1.01
13 F1 1 0 0 0 0 0.34
14 F1c 0 0 0 0 1 0.34
15 F2e 0 0 0 1 0 0.34
16 G2a 0 0 0 2 0 0.67
17 G2b 1 0 0 0 0 0.34
18 H1 1 0 0 1 0 0.67
19 H10e 0 0 0 0 1 0.34
20 H11a 0 0 1 0 0 0.34
21 H13a 0 0 0 1 0 0.34
22 H14a 2 0 0 0 0 0.67
23 H17c 0 0 0 1 3 1.34
24 H1a 1 1 2 0 2 2.01
25 H1b 0 0 0 1 0 0.34
26 H1c 0 0 1 0 1 0.67
27 H1e 1 0 0 1 1 1.01
28 H1t 0 0 0 0 1 0.34
29 H27d 0 1 0 0 0 0.34
30 H2a 4 6 4 2 0 5.37
31 H36 0 0 1 0 0 0.34
32 H3c 0 1 0 0 0 0.34
33 H3p 1 0 2 0 0 1.01
34 H3s 0 1 0 0 0 0.34
35 H3v 0 1 0 0 0 0.34
36 H3x 0 0 0 0 1 0.34
37 H4a 0 1 0 0 0 0.34
38 H5 2 0 2 1 2 2.35
39 H57 0 0 1 0 1 0.67
40 H5a 0 2 0 0 0 0.67
41 H5c 0 0 0 1 0 0.34
42 H7a 0 0 1 0 1 0.67
43 H7c 0 0 0 0 1 0.34
44 H7h 0 2 0 0 0 0.67
45 H7i 1 0 0 0 0 0.34
46 H82 0 0 1 0 0 0.34
47 HV0 0 0 0 0 1 0.34
48 HV0b 0 1 0 0 0 0.34
49 HV12b 0 0 0 2 0 0.67
50 HV1a 0 0 0 0 1 0.34
51 HV1b 0 0 1 0 0 0.34
52 HV2 0 1 0 0 0 0.34
53 HV4a 0 0 0 1 0 0.34
54 J 1 0 0 0 0 0.34
55 J1 0 0 1 1 0 0.67
56 J1b 0 1 4 0 3 2.68
57 J1c 0 0 0 1 0 0.34
58 J1d 0 1 0 0 0 0.34
59 J2b 0 0 0 0 3 1.01
60 JT 0 0 1 0 1 0.67
61 K1a 2 0 1 0 0 1.01
62 K2a 0 1 0 0 0 0.34
63 L0d 0 1 0 0 0 0.34
64 M1 0 0 0 0 1 0.34
65 M12a 0 0 0 0 1 0.34
66 M14 0 1 0 0 0 0.34
67 M3 2 0 5 2 2 3.69
68 M30 3 2 3 0 8 5.37
69 M30b 0 0 2 0 1 1.01
70 M30c 0 1 0 0 1 0.67
71 M30d 1 0 0 0 0 0.34
72 M33a 0 0 0 0 1 0.34
73 M35b 1 0 0 0 0 0.34
74 M37 3 0 0 0 0 1.01
75 M3a 2 3 1 0 0 2.01
76 M3c 1 0 0 0 0 0.34
77 M3d 0 0 0 0 1 0.34
78 M4 0 0 0 2 0 0.67
79 M40a 0 0 0 0 1 0.34
80 M49 0 1 0 0 2 1.01
81 M5 2 0 0 0 0 0.67
82 M52a 2 0 0 0 0 0.67
83 M53 1 0 0 0 0 0.34
84 M54 1 0 0 0 0 0.34
85 M5c 3 0 0 0 0 1.01
86 M6 5 0 0 4 0 3.02
87 M65a 0 1 1 0 0 0.67
88 M66 0 0 0 0 1 0.34
89 M6a 0 1 1 0 0 0.67
90 M7c 1 0 0 0 0 0.34
91 N 1 0 0 0 0 0.34
92 N1a 0 2 0 0 1 1.01
93 P3a 0 0 1 0 0 0.34
94 P4a 0 0 1 0 0 0.34
95 R0a 0 0 1 0 0 0.34
96 R22 1 1 0 0 1 1.01
97 R30b 0 0 0 0 1 0.34
98 R5a 2 2 1 0 0 1.68
99 R6a 0 0 1 0 0 0.34
100 R8b 0 1 0 0 0 0.34
101 S 1 0 0 0 0 0.34
102 T 1 0 2 2 0 1.68
103 T1a 1 0 2 0 4 2.35
104 T2 0 0 1 0 0 0.34
105 T2b 3 1 0 0 0 1.34
106 T2e 0 0 0 0 1 0.34
107 U2a 1 2 5 1 4 4.36
108 U2b 0 0 0 0 1 0.34
109 U2c 0 1 0 0 0 0.34
110 U2d 0 0 0 0 1 0.34
111 U2e 0 1 2 0 1 1.34
112 U3b 0 1 0 0 0 0.34
113 U4a 1 0 4 0 3 2.68
114 U4b 0 1 0 0 0 0.34
115 U4c 0 0 0 1 0 0.34
116 U5a 0 0 0 0 1 0.34
117 U5b 1 0 0 0 1 0.67
118 U6a 0 0 0 1 0 0.34
119 U7 1 1 0 0 0 0.67
120 U7a 2 1 1 0 0 1.34
121 V9a 1 0 0 0 0 0.34
122 W 0 1 0 2 1 1.34
123 W1 0 0 0 0 1 0.34
124 W3a 1 0 0 1 0 0.67
125 W4a 0 0 1 0 0 0.34
126 W6 0 0 1 0 0 0.34
These haplotypes were further analyzed for mega-haplogroup prediction. Mega-haplogroup
R, which occurred among 186 (62%) of the sampled individuals, was
found to be the most frequent. Other identified mega-haplogroups include M,
which was identified in 96 (32%) individuals, and N, which occurred in 15 (5%)
individuals, while mega-haplogroup L was found in only a single individual
(0.34%) (Fig. 57).
Figure 57. Mega-haplogroup distribution among members of the five sampled ethnic groups of Swat and Dir districts.
The most frequent mtDNA lineages observed in the data from Dir and Swat
Districts were West Eurasian, observed in 133 (45%) individuals, followed by South
Asian, identified in 108 (36%) individuals, and then by East Eurasian 19
(6%), Southeast Asian 12 (4%), North Asian 13 (4%), Southern European 3 (1%), East
Asian 3 (1%), Central Asian 2 (0.8%), Eastern European 2 (0.8%), African 1 (0.34%),
Australian 1 (0.34%) and Oceanian 1 (0.34%) (Table 26 and Fig. 58).
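The percentages in this breakdown follow directly from the counts and the total of 298 individuals. A minimal sketch of that calculation is given here; the dictionary is a transcription of the counts quoted above, and small differences from the printed figures may simply reflect rounding in the source.

```python
# Individuals per associated geographic origin, as quoted in the text.
counts = {
    "West Eurasian": 133, "South Asian": 108, "East Eurasian": 19,
    "Southeast Asian": 12, "North Asian": 13, "Southern European": 3,
    "East Asian": 3, "Central Asian": 2, "Eastern European": 2,
    "African": 1, "Australian": 1, "Oceanian": 1,
}

total = sum(counts.values())  # should equal the 298 sampled individuals

# Percentage share of each origin, listed from most to least frequent.
percent = {origin: 100 * n / total for origin, n in counts.items()}
for origin, pct in sorted(percent.items(), key=lambda item: -item[1]):
    print(f"{origin:18s} {counts[origin]:4d} {pct:6.2f}%")
```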
Table 26. Haplogroups distribution among the individuals of Swat and Dir
district by associated geographic region of origin.
S.No Haplogroup Origin Count %age
1 West Eurasian 133 45
2 South Asian 108 36
3 East Eurasian 19 6
4 South East Asian 12 4
5 Northern Asian 13 4
6 Southern Europe 3 1
7 East Asian 3 1
8 Central Asian 2 0.8
9 Eastern Europe 2 0.8
10 African 1 0.34
11 Australian 1 0.34
12 Oceanian 1 0.34
Figure 58. Haplogroup distribution among the individuals of the
five sampled populations of Swat and Dir district by
associated geographic region of origin.
In the present study, West Eurasian lineages occurred with high frequency,
accounting for 54% of Tarklanis and 54% of Kohistanis, followed by Yousafzais at 52%,
Utmankheils at 47% and Gujars at 37% (Fig. 59A).
The frequency of South Asian lineages was highest among Gujar individuals at 42%,
followed by Utmankheils at 33%, Tarklanis at 30%, Yousafzais at 29% and
Kohistanis at 24% (Fig. 59B). Consequently, all of the sampled ethnic
groups, except Gujars, are most strongly associated with West Eurasian haplogroups. In
contrast, South Asian haplogroups were the most prevalent among Gujar
individuals. The East Eurasian lineages vary in frequency from a high of 11%
among Gujar, Yousafzai and Kohistani individuals to a low of 1% among
Utmankheil individuals, while they were completely absent among the
Tarklanis (Fig. 59C).
Figure 59. Distribution of mtDNA lineages among the five ethnic groups sampled from Districts Swat and Dir: (A) West Eurasian, (B) South Asian, (C) East Eurasian.
3.2.4.1. Diversity comparison among the five sampled ethnic groups of Swat and
Dir districts
The results obtained from the mtDNA control region (HVSI and HVSII) of the five
ethnic groups (Gujars, Tarklanis, Utmankheils, Yousafzai and Kohistanis) were
compared to each other for genetic variation (Table 27).
Table 27. Genetic diversity in the mtDNA data within the five ethnic groups
mtDNA haplotype (shared) G T U Y K Combined

1 (unique) 29 31 33 30 15 138
2 10 6 4 6 9 35
3 5 2 4 1 0 12
4 1 2 2 1 1 7
5 1 1 0 0 0 2
More than 5 0 0 1 1 0 2
Number of haplotypes 46 42 44 39 25 196
Sample size 73 62 70 56 37 298
Unique haplotypes (%age) 0.40 0.50 0.47 0.54 0.40 0.46
Haplotype diversity 0.92 0.94 0.91 0.94 0.91 0.93
Power of discrimination 0.91 0.93 0.90 0.92 0.90 0.91
Random Match Probability 0.0903 0.0703 0.1018 0.0763 0.1072 0.0892
G, Gujars; T, Tarklanis; U, Utmankheil; Y, Yousafzai; K, Kohistanis. Power of discrimination, power to differentiate between any two people chosen at random from the population; calculated as the ratio between the number of different haplotypes and the total number of haplotypes.
The ratio of haplotypes to sampled individuals was 63% among Gujars and Utmankheils
and 67% among Tarklanis and Kohistanis, while the highest ratio (70%)
was found in the Yousafzai population (Table 27). The
percentage of unique haplotypes varied between the five groups, ranging from a low
of 40% among Gujars and Kohistanis to a high of 54% among Yousafzais. The
percentage of unique haplotypes among Tarklanis and Utmankheils was 50% and
47%, respectively (Table 27).
As a result of the differences in unique haplotype frequency, haplotype diversity
also varied between population samples, ranging from 0.91 among Kohistanis and
Utmankheils to 0.94 among Tarklanis and Yousafzais, while Gujar individuals were
marked by a moderate level of diversity (0.92) relative to the other samples included
in the present study (Table 27). The power of discrimination was highest
among Tarklani individuals (0.93), followed by Yousafzais (0.92) and Gujars
(0.91), and was similar among Utmankheils and Kohistanis
(0.90) (Table 27). The lowest random match probabilities were observed among
Tarklanis (0.0703) and Yousafzais (0.0763), in comparison to Gujars (0.0903),
Utmankheils (0.1018) and Kohistanis (0.1072) (Table 27).
3.2.4.2. Mitochondrial Genetic Differentiation
Pairwise FST values showed the highest genetic differentiation between
Kohistanis and Gujars (0.1102) and the lowest positive value between Yousafzai and Kohistanis
(0.0029); the estimate for Tarklanis and Utmankheils was slightly negative (-0.0055) (Table 28).
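Table 28 stores two quantities in one square matrix: FST below the diagonal and the corresponding p-value above it. The following sketch, which is illustrative and not the original analysis script, unpacks the two triangles into one record per pair and then picks out the largest distance, the smallest positive distance and the pairs with p < 0.05. The values are transcribed from Table 28; variable names are assumptions.

```python
names = ["Yousafzai", "Gujar", "Tarklani", "Kohistani", "Utmankheil"]

# Below the diagonal: pairwise FST. Above the diagonal: p-values (Table 28).
table_28 = [
    [None,   0.0090,  0.0541, 0.4505, 0.2793],
    [0.0398, None,    0.0000, 0.0541, 0.0000],
    [0.0164, 0.0621,  None,   0.1532, 0.6306],
    [0.0029, 0.1102,  0.0415, None,   0.1712],
    [0.0040, 0.0495, -0.0055, 0.0451, None],
]

# One record per unordered pair: (group_a, group_b, FST, p-value).
pairs = []
for i in range(len(names)):
    for j in range(i):
        fst_value = table_28[i][j]  # lower triangle
        p_value = table_28[j][i]    # mirrored cell in the upper triangle
        pairs.append((names[j], names[i], fst_value, p_value))

largest = max(pairs, key=lambda rec: rec[2])
smallest_positive = min((rec for rec in pairs if rec[2] > 0), key=lambda rec: rec[2])
significant = [(a, b) for a, b, _, p in pairs if p < 0.05]

print("largest FST:", largest)
print("smallest positive FST:", smallest_positive)
print("pairs with p < 0.05:", significant)
```

Slightly negative FST estimates, such as the Tarklani and Utmankheil value, are conventionally read as no detectable differentiation, which is why the sketch reports the smallest positive value separately.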
Table 28: Pairwise Fst genetic distances (below the diagonal) and
corresponding p-values (above the diagonal) between five ethnic
groups from Swat and Dir districts based on mtDNA sequence data.
Yousafzai Gujar Tarklani Kohistani Utmankheil
Yousafzai * 0.0090 0.0541 0.4505 0.2793
Gujar 0.0398 * 0.0000 0.0541 0.0000
Tarklani 0.0164 0.0621 * 0.1532 0.6306
Kohistani 0.0029 0.1102 0.0415 * 0.1712
Utmankheil 0.0040 0.0495 -0.0055 0.0451 *
* p < 0.05
3.2.4.3. Multi-Dimensional Scaling
An MDS plot was constructed from FST statistics calculated from the mtDNA
control region sequences of the five sampled ethnic groups from Districts Swat and Dir
and 17 published studies from Pakistan (Bhatti et al., 2016; Bhatti and Aslamkhan,
2015; Siddiqi et al., 2015). Each dot in the MDS plot represents the group centroid for
a specific sample (Fig. 60).
Figure 60: MDS plot of the five major ethnic groups of Swat and Dir districts
derived from Fst genetic distances.
The results show that the Yousafzai, Utmankheil, Tarklani and Gujars cluster with
other neighboring populations of Pakistan, while the Kohistanis are isolated from
the rest of the samples.
3.2.4.4. Network Analysis based on mtDNA sequences
Haplotype networks were constructed using the mtDNA control region sequences
obtained from Gujar, Utmankheil, Tarklani, Yousafzai and Kohistani individuals
sampled from Swat and Dir districts, Pakistan. The orange colored nodes represent
the Gujars, which form a star cluster in the network (Fig. 61).
Figure 61. Network analysis of five population samples from Swat and Dir
districts based on mtDNA sequence data.
3.3. Y-chromosome STRs and Y-SNPs analysis
3.3.1. Multiplex performance
The 100 unrelated individuals in this study self-identify as members of one of three
major ethnic groups: Pathans (Pashtuns), Kohistanis, or Gujars. The Pashtuns are
further represented by individuals from three widely recognized paternally-based
divisions, Tarklanis, Utmankheils, and Yousafzais. All of the samples were
successfully genotyped for 27 Y-STR loci. The amplified products were
electrophoresed and electropherograms were generated that could be interpreted
easily (Fig. 62).
Figure 62. An example of a typical electropherogram for the Y-STR multiplex reaction used for the studied populations of Swat and Dir districts.
3.3.2. Genetic diversity
Analyses of the 27 Y-STR loci resulted in the identification of 82 haplotypes of which
75 were unique (Table 29). The frequency of unique haplotypes varied between the
five groups, from 100% (20 out of 20) among Kohistanis to 45% (9 out of 20) among
Utmankheils. Seven haplotypes were shared between two to six individuals and all
but two haplotypes were population-specific (Table 29). The non-population-specific
haplotypes were shared between four and five individuals, respectively. These
include a haplotype shared by three Yousafzai individuals and one Tarklani
individual and a haplotype shared by four Gujars and one Kohistani individual. As a
result of the differences in unique haplotype frequencies, haplotype diversity also
varied between population samples, ranging from 1.00 among Kohistanis to 0.93
among Utmankheils (Table 29). The overall power of discrimination was relatively
high (0.82) for the combined set of individuals but varied widely between the five
ethnic groups, from relatively low (0.60) among Utmankheils to high (1.00) among
Kohistanis.
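The unique-haplotype fractions and power-of-discrimination values for the Y-STR data can be recomputed from the haplotype counts and sample sizes in Table 29. The sketch below assumes that power of discrimination is taken as the number of distinct haplotypes divided by the number of individuals, a reading that reproduces the tabulated values; the data structure and names are illustrative.

```python
# Sample size, distinct haplotypes and unique haplotypes per group (Table 29).
samples = {
    "Kohistanis":  {"n": 20,  "haplotypes": 20, "unique": 20},
    "Gujars":      {"n": 20,  "haplotypes": 17, "unique": 16},
    "Yousafzais":  {"n": 20,  "haplotypes": 17, "unique": 15},
    "Tarklanis":   {"n": 20,  "haplotypes": 18, "unique": 17},
    "Utmankheils": {"n": 20,  "haplotypes": 12, "unique": 9},
    "Combined":    {"n": 100, "haplotypes": 82, "unique": 75},
}

summary = {}
for name, s in samples.items():
    unique_fraction = round(s["unique"] / s["n"], 2)
    # Assumed definition: distinct haplotypes / sampled individuals.
    discrimination = round(s["haplotypes"] / s["n"], 2)
    summary[name] = (unique_fraction, discrimination)
    print(f"{name:12s} unique={unique_fraction:.2f} PD={discrimination:.2f}")
```

The combined sample has 82 rather than 84 haplotypes because two haplotypes are shared across groups and are therefore counted once in the pooled data.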
Information on Y-SNPs was used to assign a Y-chromosomal haplogroup
(Larmuseau et al., 2015; Karafet et al., 2008) to each individual. A relatively large
number of haplogroups was observed (Table 29) and the spectrum of these
haplogroups was consistent with previous studies (Lee et al., 2014;
Chennakrishnaiah et al., 2013; Zhao et al., 2009; Karafet et al., 2008; Sengupta et al.,
2006; Kivisild et al., 2003; Qamar et al., 2002). However, 84% of the studied
individuals carry one of four haplogroups (H1-M69, G2b-M283, L1-M22(xM274), and
R1a-M417,Page7) and there are large differences in the frequencies of these four
haplogroups between the five samples (Table 29).
Table 29: Genetic diversity in the Y-STR (27 loci) and frequencies of Y-SNP haplogroups within five ethnic groups from Dir and
Swat Districts. The values for the Y-SNP haplogroups in brackets represent 90% confidence interval.
Y-STR haplotype Kohistanis Gujars Yousafzais Tarklanis Utmankheils Combined
1 (unique) 20 16 15 17 9 75
2 1 1 2
3 1 1 1 2
4 1 1a
5 1b
6 1 1
Number of haplotypes 20 17 17 18 12 82
Sample size 20 20 20 20 20 100
Unique haplotypes 1.00 0.80 0.75 0.85 0.45 0.75
Haplotype diversity 1.00 0.98 0.99 0.99 0.93 0.99
Power of discrimination 1.00 0.85 0.85 0.90 0.60 0.82
Y-SNP haplogroup Kohistanis Gujars Yousafzais Tarklanis Utmankheils Combined
G2a-L30(xL14, L13,M278) 1 (0.03-0.07) 1 (0.03-0.07) 2 (0.01-0.03)
G2b-M283 2 (0.07-0.13) 16 (0.77-0.83) 18 (0.17-0.19)
H1-M69 10 (0.46-0.54) 1 (0.03-0.07) 11 (0.10-0.12)
J2a-L25 2 (0.07-0.13) 2 (0.01-0.03)
J2b-M241 1 (0.03-0.07) 1 (0.03-0.07) 2 (0.01-0.03)
L1-M22(xM274) 1 (0.03-0.07) 11 (0.51-0.59) 1 (0.03-0.07) 13 (0.12-0.14)
O2-IMS-JST0213554(xP164) 1 (0.03-0.07) 1 (0.006-0.014)
Q-M242(xL56, L57, L214) 2 (0.07-0.13) 2 (0.01-0.03)
Q-L56,L57(xL54) 2 (0.07-0.13) 2 (0.01-0.03)
R-M207,M734,P224,P280(xM173) 1 (0.03-0.07) 2 (0.07-0.13) 1 (0.03-0.07) 4 (0.03-0.05)
R-M734,P224,P280(xM173) 1 (0.03-0.07) 1 (0.006-0.014)
R1a-M417,Page7 5 (0.21-0.29) 3 (0.12-0.18) 16 (0.77-0.83) 16 (0.77-0.83) 2 (0.07-0.13) 42 (0.40-0.44)
a Shared between three Yousafzai and one Tarklani individuals.
b Shared between four Gujar and one Kohistani individuals.
For example, haplogroup G2b-M283 occurs at very high frequency (0.80) among
Utmankheils, but is completely absent among members of three of the other four
population samples. In contrast, haplogroup R1a-M417,Page7 occurs among
members of all five population samples but frequencies range from high (0.80)
among Yousafzais and Tarklanis to a low (0.10) among Utmankheils. Due to small
sample sizes 90% confidence intervals are relatively large and overlap for some
haplogroups (Table 29).
3.3.3. Genetic differentiation
The genetic distances between the five groups, as estimated from the Y-STR markers
using pairwise FST, are mostly very large (0.148 to 0.596) and highly significant;
the exception is the pairwise comparison between Tarklanis and Yousafzais
(Table 30 and Fig. 63).
Table 30. The genetic distances among the five ethnic groups, calculated as
pairwise FST values based on 23 of the 27 STR loci. FST values below
the diagonal and the corresponding P-values above the diagonal.
Gujars Kohistani Tarklanis Utmankheils Yousafzais
Gujars - 0.000±0.0005* 0.000±0.0005* 0.000±0.0005* 0.001±0.0005*
Kohistani 0.148 - 0.000±0.0005* 0.000±0.000* 0.000±0.0005*
Tarklanis 0.393 0.264 - 0.000±0.0005* 0.265±0.005
Utmankheils 0.508 0.445 0.596 - 0.000±0.0005*
Yousafzais 0.352 0.231 0.008 0.550 -
* Significant at the 0.05 significance level with correction for multiple testing (0.05/10 = 0.005)
Despite being considered different ethnic subgroups of Pashtuns, members of these
two groups are not significantly different from each other genetically (FST = 0.008, p =
0.265).
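The corrected threshold of 0.005 in Table 30 corresponds to dividing 0.05 by the ten possible pairwise comparisons among five groups (a Bonferroni-style correction). A minimal sketch of that step follows; the p-values are transcribed from the upper triangle of Table 30 without their standard errors, and the names are illustrative.

```python
from itertools import combinations

groups = ["Gujars", "Kohistani", "Tarklanis", "Utmankheils", "Yousafzais"]

# Number of unordered pairs among five groups, and the corrected threshold.
n_tests = len(list(combinations(groups, 2)))
alpha = 0.05 / n_tests

# Pairwise p-values from the upper triangle of Table 30.
p_values = {
    ("Gujars", "Kohistani"): 0.000,
    ("Gujars", "Tarklanis"): 0.000,
    ("Gujars", "Utmankheils"): 0.000,
    ("Gujars", "Yousafzais"): 0.001,
    ("Kohistani", "Tarklanis"): 0.000,
    ("Kohistani", "Utmankheils"): 0.000,
    ("Kohistani", "Yousafzais"): 0.000,
    ("Tarklanis", "Utmankheils"): 0.000,
    ("Tarklanis", "Yousafzais"): 0.265,
    ("Utmankheils", "Yousafzais"): 0.000,
}

not_significant = [pair for pair, p in p_values.items() if p >= alpha]
print(n_tests, alpha, not_significant)
```

Only the Tarklani and Yousafzai comparison fails to reach the corrected threshold, which is the pair discussed in the preceding sentence.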
Multi-dimensional scaling (MDS) analysis of the pairwise genetic distances
(FST statistics, 27 Y-STR loci) among the five samples in this study
yielded a stress value of 1.857914e-16. The Tarklanis and Yousafzais cluster
together within the MDS plot, while the Utmankheils are isolated from the rest of the
samples and occupy a position in the top right of the plot (Fig. 63).
Figure 63. Multi Dimensional Scaling (MDS) derived for the five
major ethnic groups of Swat and Dir districts.
The genetic structure is also evident in the median joining network of Y-STR
haplotypes in which four distinct groups may be discerned, mainly explained by the
haplogroup assignment of the individuals (Fig. 64 and Table 29).
Figure 64. Median joining network based on the Y-STR haplotypes (23 loci) of
the five population samples. The circle sizes indicate the number of
individuals with shared Y-STR haplotypes (smallest circles = one
individual). The lengths of the connecting branches indicate the
number of mutational steps separating the haplotypes (shortest
branch lengths = one mutational step).
Members of the Tarklani and Yousafzai subgroups of Pashtuns are mostly found
together, being separated by only a few mutational steps. This is in contrast to the
Utmankheils and Gujars who, despite some outliers, form distinct groups separated
by a large number of mutational steps. There are no shared haplotypes within the
Kohistani group; hence they appear more scattered in the network. Nevertheless, the
majority of haplotypes among Kohistanis are still found close together in relative
proximity to the Tarklani/Yousafzai aggregate (Fig. 64).
3.3.4. Genetics, ethnicity and geography
We included population samples from a wider geographic range to examine the
genetic variation in a broader context, but limited the data set to 22 Y-STR loci for a
worldwide data set and 10 Y-STR loci for the closer look at the Indo-Pakistani sub-
continent and Southwest Asia (Table 6). In the AMOVA analysis, c. 90% of the
genetic variation occurs within the 38 population samples from the Indo-Pakistani
sub-continent and Southwest Asia (Table 31A).
Table 31A. AMOVA results when population samples are grouped based on
country of origin.

Source of variation               d.f.   Sum of squares   Variance component   Percentage of variance
Among groups                         3           86.835   0.04903 (Va)          1.53
Among populations within groups     34          531.079   0.25235 (Vb)          7.71
Within populations                1959         5747.079   2.93368 (Vc)         90.75
Total                             1996         6364.993   3.235206

Fixation indices   Value     P-value
FSC (Va)           0.07921   0.00000±0.000005
FST (Vb)           0.09316   0.00000±0.000005
FCT (Vc)           0.01515   0.02713±0.00152

Country       Populations
Pakistan      Gujar, Pakistan_Pathan, Pakistan-KAL (Kalashas), Pakistan-HZR
              (Hazaras), Pakistan-BSK (Burusho), Pakistan-BRU (Brahuis),
              Pakistan-BLT (Baltis), Pakistan-BAL (Baluchis), Pakistan-KSR
              (Kashmiris), Pakistan-MAKB (Makrani-Baluch), Pakistan-MAKN
              (Makrani-Negroid), Pakistan-PRS (Parsis), Pakistan-PKH (Pathan),
              Pakistan-SDH (Sindhis), Kohistanis, Yousafzai, Tarklani,
              Utmankheil, Pakistan-Punjabi, Sindhi (HGDP), Pathan (HGDP),
              Makrani (HGDP), Kalash (HGDP), Hazara (HGDP), Burusho (HGDP),
              Brahui (HGDP), Balochi (HGDP)
Iran          Iran-Ahvaz, Iran-Izeh, Iran-Rasht, Iran-Sari, Iran-Masal
Azerbaijan    Azerbaijan-Lenkoran
Afghanistan   Afghanistan-Baluch, Afghanistan-Hazara, Afghanistan-Pashtun,
              Afghanistan-Tajik, Afghanistan-Uzbek

d.f., degrees of freedom; Va, Vb, Vc, variance components.
When grouping these population samples by country of origin, the genetic variation
among countries only accounts for 1.5% of the variation, whereas 7.7% of the total
variation is explained by differences between population samples within countries
(Table 31A). However, when the 38 samples are instead grouped by ethnic
relationships, differences between the ethnic groups account for 4.5% of the total
variation, and the variation between population samples within the ethnic group
accounts for 4.51% of the total variation (Table 31B).
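The relationship between the variance components and the fixation indices in Tables 31A
and 31B can be illustrated with a short calculation. The sketch below assumes the
standard hierarchical AMOVA definitions, FCT = Va/(Va+Vb+Vc), FSC = Vb/(Vb+Vc) and
FST = (Va+Vb)/(Va+Vb+Vc); the function name `amova_summary` is hypothetical. Because the
inputs are the rounded components transcribed from the tables, the recomputed indices
agree with the tabulated ones only to within rounding.

```python
def amova_summary(va, vb, vc):
    """Percentages of variance and fixation indices from AMOVA variance components.

    va: among groups; vb: among populations within groups; vc: within populations.
    """
    total = va + vb + vc
    return {
        "pct_among_groups": 100 * va / total,
        "pct_among_pops_within_groups": 100 * vb / total,
        "pct_within_pops": 100 * vc / total,
        "FCT": va / total,
        "FSC": vb / (vb + vc),
        "FST": (va + vb) / total,
    }


# Variance components transcribed from Table 31A (grouping by country of origin).
by_country = amova_summary(0.04903, 0.25235, 2.93368)
# Variance components transcribed from Table 31B (grouping by ethnicity).
by_ethnicity = amova_summary(0.14497, 0.14543, 2.93368)

for label, result in (("country", by_country), ("ethnicity", by_ethnicity)):
    print(label, {key: round(value, 5) for key, value in result.items()})
```

Comparing the two FCT values shows the same pattern as the text: grouping by ethnicity
places more of the variation among groups than grouping by country does.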
The comparative AMOVA analysis based upon ethnicity (Table 31B) grouped the 30
relevant population samples into seven aggregates. The first may be designated as
Baluchis and associated ethnic groups. This aggregate includes five samples:
Afghan-Baluch, Pakistan-BAL (Baluchis), Pakistan-MAKB (Makrani-Baluch),
Pakistan-MAKN (Makrani-Negroid) and Pakistan-BRU (Brahui). The second
aggregate may be designated as Pathans. This aggregate also encompasses five
samples: Tarklanis, Yousafzais, Afghanistan-Pashtuns, Pakistan-PKH (Pathans), and
Pakistan-Pathans. The third group is the Utmankheils, whose separation from the
other Pathan groups is justified by the results of the current study. The fourth
aggregate may be designated as Iranians. This aggregate includes seven samples:
Iran-Ahvaz, Iran-Izeh, Iran-Rasht, Iran-Sari, Iran-Masal, Azerbaijan-Lenkoran, and
Pakistan-Parsi. The fifth aggregate may be designated as East Asian derived. This
group includes four samples: Afghanistan-Hazaras, Pakistan-Hazaras, Afghanistan-Uzbeks,
and Pakistan-BLT (Baltis). The sixth aggregate may be designated as
lowland western Indians. The group includes three samples: Pakistan-Punjabis,
Pakistan-SDH (Sindhis), and Gujars. The seventh aggregate may be designated as
Northern Pakistani Highlanders. This aggregate includes four samples: Kohistanis,
Pakistan-BSK (Burushos), Pakistan-KAL (Kalash), and Pakistan-KSR (Kashmiris).
Table 31B. AMOVA results when population samples are grouped based on
ethnicity.

Source of variation               d.f.   Sum of squares   Variance component   Percentage of variance
Among groups                         6          331.036   0.14497 (Va)          4.50
Among populations within groups     31          286.878   0.14543 (Vb)          4.51
Within populations                1959         5747.079   2.93368 (Vc)         90.99
Total                             1996         6364.993   3.22408

Fixation indices   Value     P-value
FSC (Va)           0.04723   0.00000±0.000005
FST (Vb)           0.09007   0.00000±0.000005
FCT (Vc)           0.04497   0.00000±0.000005

Ethnic group              Populations
Baluchi                   Afghanistan-Baluch, Pakistan-BRU (Brahuis), Pakistan-MAKB
                          (Makrani-Baluch), Pakistan-MAKN (Makrani-Negroid),
                          Pakistan-BAL (Baluchis), Baluchi (HGDP), Brahui (HGDP),
                          Makrani (HGDP)
Pathans                   Tarklani, Yousafzai, Afghanistan-Pashtun (Pathans),
                          Pakistan-PKH (Pathan/Pakhtuns), Pakistan_Pathan, Pathan (HGDP)
Utmankheils               Utmankheils
Iranians                  Iran-Ahvaz, Iran-Izeh, Iran-Rasht, Iran-Sari, Iran-Masal,
                          Azerbaijan-Lenkoran, Pakistan-PRS (Parsis)
Mongol-derived            Afghanistan-Hazara, Afghanistan-Tajik, Afghanistan-Uzbek,
                          Pakistan-HZR (Hazaras), Pakistan-BLT (Baltis), Hazara (HGDP)
Lowland Western Indians   Pakistan-Punjabi, Pakistan-SDH (Sindhis), Gujars, Sindhi
                          (HGDP)
Northern Pakistanis       Pakistan-BSK (Burusho), Pakistan-KAL (Kalashas), Pakistan-KSR
                          (Kashmiris), Kohistanis, Burusho (HGDP), Kalash (HGDP)
Multi-dimensional scaling (MDS) analysis was performed on the pairwise genetic distances,
estimated as FST (10 Y-STR loci), for 38 selected population samples from the Indo-Pakistani
sub-continent and neighboring countries, and yielded a stress value of 0.1000367 (Fig. 65).
Figure 65. Multi-dimensional scaling (MDS) analysis for 38 selected
populations from the Indo-Pakistani sub-continent and neighboring
countries.
Despite the inclusion of 38 population samples from the Indo-Pakistani sub-continent
and Southwest Asia, most of the genetic variation in the MDS is still
defined by the five population samples from Dir and Swat Districts. Although this
dataset was limited to 10 STR loci, very large genetic differences between the
samples from Swat and Dir districts may still be observed. The genetic difference
between the Gujars and Kohistanis becomes non-significant when the resolution is
reduced from 27 to 10 Y-STR markers.
Several specific observations may be noted. The Gujar sample and the Baluch ethnic
groups from Afghanistan (Haber et al., 2012) both represent outliers that occupy the
same area in the MDS plot (Fig. 65), whereas the Baluch sample from Pakistan
(Qamar et al., 2002) occupies a more central position. The Kohistanis occupy a more
central position within the MDS plot that is adjacent to a large number of other
sampled ethnic groups from the Indo-Pakistani sub-continent and Southwest Asia.
Noticeably, the Utmankheil sample is separated by very large and highly significant
genetic distances from all other groups and on the MDS plot this sample occupies an
isolated position distant from all other samples. The Pashtun groups Tarklanis and
Yousafzais are marked by very similar genetic distances to all other groups included
in this analysis (Figure 65).
These results are generally mirrored when the MDS is constructed from the
worldwide data set (Fig. 66).
Figure 66. Worldwide multi-dimensional scaling (MDS) analysis of pairwise
genetic distances, estimated as FST (10 Y-STR loci), for 54 population
samples (from HGDP), including the five population samples from
Dir and Swat (stress value = 0.1583562). Axes: Coordinate 1 and Coordinate 2.
3.3.5. Detailed analysis of two Y-chromosomal haplogroups
To get a more detailed picture of the relationships between the five population
samples from Dir and Swat Districts I constructed haplotype (10 Y-STR loci)
networks for individuals assigned to Y-SNP haplogroups (i) G-Page94 [(G2a-
L30(xL14, L13,M278) and G2b-M283)], (ii) H1-M69, and (iii) L1-M22(xM274), and
included previously published datasets from Pakistan (Qamar et al., 2002),
Afghanistan (Haber et al., 2012), and the HGDP (Rosenberg, 2006) (Fig. 67).
Haplogroups G-Page94 and H1-M69 were combined in one network, as the Y-SNP
typing of the previously published Pakistani population samples did not allow for
the distinction between these two haplogroups (Qamar et al., 2002). Individuals
representing these two Y-haplogroups are clearly separated from each other in the
STR-network, thereby demonstrating concordance between the two datasets (Fig.
67A). Most of the Utmankheils possess haplogroup G-Page94 (more specifically,
G2b-M283) and they all cluster closely together (owing to highly similar Y-STR
profiles) and with a couple of individuals from both Afghanistan and Pakistan (Fig.
67A). Only one Kohistani and one Gujar individual have a Y-SNP profile assigned to
the G-Page94 haplogroup, and these two individuals share the same Y-STR
haplotype, which is clearly separated from the haplotypes observed among the
sampled Utmankheil individuals (Fig. 67A).
The Y-STR network with individuals assigned to SNP-haplogroup H1-M69 is more
diffuse and many individuals are separated by a larger number of mutational steps.
However, most of the Kohistanis are found within this network, and many of them
cluster together, sharing the same Y-STR haplotype (Fig. 67A).
The network of STR-haplotypes assigned to SNP-haplogroup L1-M22(xM274) shows
at least two defined groups of individuals (Fig. 67B). All but one of the Gujar
individuals in this network share the same Y-STR haplotype, which is also shared by
a single Kohistani individual, even when the haplotype is extended to the full 27
Y-STR loci (Table 29 and Fig. 64). Only a single Gujar individual is found in the other
sub-group within the network.
Figure 67. Y-chromosome haplogroup-specific networks based on Y-STR
haplotypes (10 loci) with individuals assigned to (A) Y-SNP
haplogroups G-Page94 and H1-M69, and (B) Y-SNP haplogroup L1-
M22(xM274). The circle sizes indicate the number of individuals that
share the same Y-STR profile for these 10 loci. The smallest circles
represent one individual. The lengths of the connecting branches
indicate the number of mutational steps.
Chapter 4
DISCUSSION
The key features of modern humans appeared late in human evolution. Despite the
broad consensus that Africa represents the main, if not nearly exclusive, place of
origin for anatomically modern humans (AMHs), their patterns of dispersal out of
Africa are still poorly understood and continue to challenge researchers. One of the
most hotly debated issues concerning the origins of anatomically modern humans is
the role played some 100,000 years ago by a morphologically diverse array of archaic
hominins. In Africa and in the Middle East there were various transitional forms
spanning late Homo heidelbergensis and H. sapiens; in Asia, Homo erectus; and in
Europe, Homo neanderthalensis (Klein, 2008). However, by 30,000 years ago this
taxonomic diversity had vanished and humans everywhere had evolved into the
anatomically and behaviorally modern form (Klein, 1999; Tattersall and Schwartz, 1999;
Clark and Willermet, 1997; Stringer and McKie, 1996; Wolpoff and Caspari, 1996;
Nitecki and Nitecki, 1994; Smith and Spencer, 1984). Owing to advances in techniques
for survival, H. sapiens was able to flourish in Africa, from where it dispersed to
Eurasia, Australia, the Americas and eventually Oceania (DeGiorgio et al., 2009), but
the routes and patterns of these migrations are poorly understood. However, the
morphological, genetic and archaeological evidence suggests that the dispersal of AMHs
occurred through the Levant and by the southern routes from the Horn of Africa,
through the Arabian Peninsula, into southern Asia
(Reyes-Centeno et al., 2014; Fu et al., 2013; Liu and Zhao, 2006; Lahr and Foley, 1994).
Stone tools found in the Indo-Pakistani subcontinent (also called South Asia),
specifically in the Soan Valley of Pakistan, suggest that humans appeared in the
region at least 200,000–400,000 years ago (Wolpert, 2000) and thus are likely to
have been associated with archaic Homo species. A report based on the fossil
record suggests that modern humans inhabited Pakistan approximately
60,000–70,000 years ago (Hussain, 1997).
Geographically, Pakistan is bordered by the high mountains of the Karakoram,
Himalaya and Hindu Kush ranges and by the Arabian Sea, and is situated at the
crossroads of Asia, at the junction of West Asia, Central Asia and South Asia
(Ali, 2005). This region has high ethnic diversity, which historically has been, at
least partially, attributed to a long and dynamic history of repeated invasions by
Aryans (Bernhard, 1983), Macedonians (Birdwood, 1959), Arabs, and Mongols
(Lapidus, 2002). In addition, the Hindu Kush highlands served as a physical barrier
that channeled trade along the “Silk Route”, the routes of communication between the
populations of the Mediterranean Basin and West Asia and those of China, for more
than 16 centuries (Petraglia et al., 2012; Kuzmina 2007; Quintana-Murci et al., 1999).
It is therefore possible that the extant populations of the Hindu Kush highlands show
traces of historic, and even prehistoric, gene flow from far distant human populations.
Furthermore, Pakistan is a South Asian country that was home to two well-known
civilizations: the Indus or Harappa civilization, which flourished between 2600 BC
and 1900 BC, and the Gandhara civilization, dated to 1500 to 1000 BC (Kenoyer, 2005;
Miller, 1985; Basham, 1963). It is believed that the southern coast of the Persian
Gulf, the territory of present-day Afghanistan and the Makran Coast of Pakistan likely
served as passages for human dispersal out of Africa in prehistoric times, making the
population dynamics of this region even more interesting (Derenko et al., 2013;
Underhill et al., 2001). Moreover, the migration and admixture of new populations
and the exchange of cultural elements along these routes have made the Indo-Pakistani
people more heterogeneous and diverse (Lukacs and Hemphill, 1991). For
example, from the fourth century B.C. onwards, over a period of about 2000 years,
different populations entered Pakistan and settled. These populations were the Greeks,
Scythians, Parthians, Pahlavas, Kushans and the Indo-Aryans (Maloney, 1974; Thapar, 1969).
The Huns arrived at about the close of the Gupta period (Ingalls, 1976). The
Jews and Parsis came later via the western coast. Arabian Muslims, Persian Muslims,
Turks and Afghans each came to the region in different waves and at different times.
Muslim immigration into India and Pakistan began even before the Arab
invasions early in the 8th century A.D. and ended with the establishment of the
Mughal Empire in the 16th century.
It is very difficult to assess how human groups and settlements were formed in
prehistoric times: whether they were indigenous inhabitants or migrants from some
other place and, if they migrated, what routes they followed. These are some of the
big questions scientists seek to answer using the principles of evolution,
dental anthropology and molecular genetics on the basis of past, present and future
(Whale, 2012; Stoneking, 2008).
The multicultural (Rose, 1911) and highly diverse population has made Pakistan an
attractive country for the field of anthropology. In the study area of Swat and Dir
districts of Khyber Pakhtunkhwa, the major ethnic groups (Gujars, Kohistanis and
Pashtuns) are genetically isolated, with little intermarriage, and are hence
highly endogamous (Glatzer, 2002; Qamar et al., 1998; Caroe, 1992).
The evidence obtained from skeletons of the region has not been studied for ancient
DNA (Kennedy, 2000). Although the genetic data available on Pakistani populations
is very limited, it has indicated differences between Pakistanis and other populations
of the world. Most of the earlier studies mentioned Pakistani populations as a single
entity, which is incorrect, because Pakistan is the home of 18 different ethnic groups
(Grimes, 1992; Newcomb, 1986). Therefore each ethnic group of Pakistan ought to be
studied separately. Recently, a few ethnic groups of the country have been studied,
and such studies have demonstrated clear divergences among them (Lee et al., 2014;
Mehdi et al., 1999; Qamar et al., 1999).
The demography and historical perspectives of the different ethnic groups residing
within the Indo-Pakistani subcontinent have been a subject of interest for years, and
three models have been offered as a consequence of these investigations. The
first is known as the Long-Standing Continuity Model. According to proponents of
this model, modern humans migrated to South Asia some 62,000 to
75,000 years ago, commensurate with the initial dispersal of H. sapiens out of
Africa. Proponents of this model claim that once H. sapiens arrived and settled in
South Asia, the resident population of the subcontinent was not significantly
influenced by subsequent gene flow from surrounding populations or large-scale
migrations within the subcontinent (Krithika et al., 2009; Sahoo et al., 2006; Kennedy
et al., 1984). Therefore, the pattern of affinities among members of the living ethnic
groups and ancient inhabitants of South Asia is due to simple isolation-by-distance,
both in time and space.
The second model is the Aryan Invasion Model. This model is predicated on the
creation of war tools, the domestication of horses, and the invention of horse-drawn
chariots, evidence of which has been found in Central Asia during the Bronze Age
(Bryant and Bryant, 2001; Renfrew, 1987). The existence of Indo-Aryan languages in the
northern two-thirds of the subcontinent, together with the account in the RgVeda of
invaders with war horses inhibiting the castles of the noseless Dasus, suggests that
Central Asians invaded the northwestern region of the subcontinent during the mid-2nd
millennium BC (Wheeler, 1968).
The third model is known as the Out of India Model. The creators of this model
believe that the appearance of early agriculture and the presence of complex cities of
the Indus Valley and Doab of north India (McAlpin, 1981) are the consequence of a
proto-Elamo-Dravidian migration from southwestern Iran (i.e., Susa) into the
subcontinent and Central Asia. However, proponents of this model are not in
agreement as to when this dispersal event took place. Consequently, two versions of
this model have been proposed. The first proposes that South Asia represents the
true homeland of the Indo-European languages and the dispersal of populations
bearing these languages occurred in the 3rd millennium BC. The second version suggests
that the entry of Indo-Aryan languages into the subcontinent occurred later, and is
perhaps associated with the appearance of the Iron Age during the 1st millennium
BC.
Exploring the information contained in mtDNA, Y-STRs, and tooth morphology is very
important for phylogenetic studies. The current project was therefore designed to
characterize the five ethnic groups (Yousafzai, Gujar, Tarklani, Kohistani,
Utmankheil) residing in Swat and Dir districts through dental morphology, mtDNA
and Y-STR analysis.
Dental morphology provides an assessment of variations in the cusps, ridges,
grooves and root structures that can be used for reconstruction of biological
relationships among different populations (Hillson, 1996; Dahlberg, 1945; Pedersen,
1949; Moorrees, 1957). These variations are controlled by different genes and are
only slightly affected by environmental factors (Scott and Turner, 1997). The dental
traits exhibit significant differences in frequency among major geographic areas
(Dahlberg, 1951; Dahlberg, 1945; Hrdlicka, 1920) and, in some cases, these
differences are so obvious that Caucasoid, Mongoloid and African dental complexes
can be easily differentiated (Buikstra et al., 1990; Haeussler, 1989; Mayhall et al.,
1982).
The current project was designed to assess, using dental morphology, the nature of the
biological affinities among members of the myriad ethnic groups of South Asia; the five
samples from Swat and Dir districts in the present study are but a small
component of that overall endeavor. The research also analyzed gene flow from
the surrounding region into the South Asian gene pool. Furthermore, the research
based on dental morphological data explored the biological affinities between the
living populations of northern Pakistan, including the population samples studied
here, and the ancient inhabitants of the Indo-Pak subcontinent.
Inter-sample affinities based upon pairwise MMD values were examined with
neighbor-joining cluster analysis (NJ), multidimensional scaling (MDS), and
principal coordinate analysis (PCA).
If the “Long-Standing Continuity Model” is correct, i.e. the inhabitants of South Asia
settled about 75,000 years ago and have since experienced no significant gene flow
from neighboring populations or population movements within the sub-continent, the
patterning of their biological affinities ought to be the consequence of regional and
geographic proximity. The regional structure among peninsular Indians, inhabitants of
the Indus Valley, Central Asians, Himalayan highlanders, and inhabitants of the Hindu
Kush and the northern boundary of the Indus Valley, together with the temporal
provenience of the prehistoric inhabitants of the Indus Valley, prehistoric Central
Asians and all living populations, will jointly pattern their biological affinities.
In contrast, if the Aryan Invasion Model is true, i.e. South Asia was invaded by
Bronze Age Central Asians in the mid-2nd millennium BC, this event should be
reflected by a biological discontinuity within the population of the Indus Valley
commensurate with the dissolution of the Harappan civilization.
Therefore, one ought to expect that all the post-Harappan populations are
descendants of these Aryans from Central Asia. Moreover, if it is correct that the
Indo-European languages dispersed in South Asia due to this Aryan invasion, which
afterward spread to the Upper Doab of North India, this ought to be reflected by
close biological affinities between members of North Indian ethnic groups and their
alleged Central Asian ancestors. However, Dravidian-speaking groups from
southeast India ought to show no genetic affinities with these Central Asian
invaders and little affinity to their North Indian descendants. Finally, the ethnic
groups residing in the Himalaya and Hindu Kush highlands, including members of the
ethnic groups residing in the northern portion of Khyber Pakhtunkhwa (KP) and the
foothills rimming the northern boundary of the Indus Valley, may have biological
affinities to these mid-2nd millennium BC invaders from Central Asia. The populations
of Swat and Dir districts studied here exhibit no affinities to the Central Asian
samples included in this analysis.
However, if the Out of India Model is correct, which states that the rise of complex
cities in the Indus Valley indicates that Indo-European languages arose within the
Indian subcontinent and then dispersed to surrounding regions of Central and
Southwest Asia during the 3rd millennium BC, then the origin of South Asian
populations ought to be attributed to long-term geographical isolation, thereby
reducing the biological distances among the late Bronze Age Central Asians and
post-Chalcolithic populations of the Indus Valley, North India and northern
Pakistan.
Furthermore, the second version of the Out of India Model states that the expansion
of complex cities led to the migration of people out of India, first populating the
Upper Doab of North India and then moving into the neighboring areas of Central and
Southwest Asia, but that this did not occur until the mid-1st millennium BC,
while the entry of Indo-Aryan languages into the subcontinent occurred later, perhaps
in association with the appearance of the Iron Age during the 1st millennium
BC. If this version is correct, the prehistoric Central Asian samples of the late
Bronze Age, since they antedate this proposed migratory event, ought to show no
affinities to any of the South Asian samples included in the present study, because
the dispersal did not take place until the Iron Age.
Variation among the ethnic groups of Swat and Dir districts studied here and the
other groups of northern Pakistan was examined with an array of data reduction
techniques. The results were presented through neighbor-joining cluster analysis
(NJ), multidimensional scaling (MDS), and principal coordinate analysis (PCA). The
results visualized through neighbor-joining cluster analysis reveal a fundamental
split between peninsular Indians on the one hand and the ethnic groups from
northern Pakistan on the other (Fig. 33). Intriguingly, the amount of diversity among
the former appears greater than the diversity among the latter. Whether this is a
reflection of reality, or is the consequence of the greater number of northern
Pakistani samples or their assessment by a greater number of researchers, is unclear.
On the other hand, the samples collected from Swat and Dir districts revealed close
affinity with each other, except for the Yousafzai, who show affinity to the Swati
sample from Mansehra, as shown in Figure 25.
The MDS with Kruskal’s and Guttman’s methods revealed that the Yousafzais from
Swat possess affinities with the Dravidian-speaking ethnic groups from Andhra
Pradesh in southeastern peninsular India, while the Gujar (GUJsw) and Kohistani
(KOHsw) samples from Swat exhibit close affinities with the highland samples from
Chitral and the Swatis of Mansehra District. The Utmankheils and Tarklanis of Dir
District share close affinities with the ethnic groups from Maharashtra, located in
West-Central peninsular India and are distinctly separated from the other Pakistani
samples included in this analysis.
The results obtained from PCA indicate that, within the population samples from Dir
and Swat Districts studied here, the Tarklanis and Utmankheils from Dir show
some affinities to one another, the Gujars and Kohistanis are marked by affinities to
one another, while the Yousafzai are highly isolated from the rest of the samples
phenetically. Furthermore, the Dravidian-speaking samples from southeast India
(CHU, GPD, PNT) and the Indo-Aryan-speaking samples from west-central India
(MDA, MRT, MHR) are segregated away from each other phenetically and are
linked to the remaining samples by very distant affinities to the Utmankheil and
Tarklani samples from Dir, respectively. Most of the highland samples (Madak
Lasht, Wakhis from Gulmit, Khows, and Kohistanis) aggregate together along with
the foothill samples of Awans (AWAm1) and Swatis (SWT) from Mansehra District.
Perhaps all the members of this aggregate ought to be considered the highland
aggregate. If so, then the Wakhi sample from Sost (WAKs), the Yousafzais from Swat
(YSFsw) and even the second sample of Awans from Mansehra (AWAm2) would be
considered members of this aggregate as well (Fig. 28).
An examination of the biological affinities of northern Pakistani ethnic groups in the
context of living ethnic groups from peninsular India and prehistoric samples from
the Indus Valley and southern Central Asia yields several consistent patterns. First,
prehistoric south-Central Asians (DJR, SAP, KUZ, MOL) are clearly separated from
all South Asian samples, both living and prehistoric. Second, peninsular Indian
samples tend to be segregated from the Pakistani samples and tend to aggregate into
separate groups by both region (Andhra Pradesh vs. Maharashtra) and language
(Dravidian vs. Indo-Aryan). Intriguingly, the prehistoric sample from Maharashtra
(INM) consistently exhibits closest affinities to the living ethnic groups (MRT, MHR,
MDA) of this same region of India. This may reflect local population continuity since
the 2nd millennium BC. Northern Pakistanis tend to aggregate into two groups. One
appears to be largely composed of highland samples (KHO, MDK, WAKg, WAKs),
possible highland groups (UTHd, YSFsw, KOHsw), as well as groups from the
foothills (SWT, AWAm1). The other samples, such as SYDm2, AWAm2 and TANm2,
tend to occupy highly anomalous positions. These
samples were scored by another researcher, and their anomalous positions may
be a reflection of inter-observer differences in the scoring of the dental traits.
Among the five population samples from Swat and Dir districts studied here, the
neighbor-joining cluster analysis identifies Gujars, Kohistanis and Utmankheils as
possessing affinities to the ancient Harappans of the Indus Valley and Yousafzais as
having affinities to ethnic groups of the Hindu Kush-Karakoram highlands, while
Tarklanis are identified as exhibiting no close affinities to any of the other samples
from Dir and Swat Districts. The MDS identifies the Pashtun groups (YSFsw, UTHd,
TRKd) as having closest affinities to one another, with Kohistanis somewhat
divergent and Gujars aligning with the ancient Harappans. PCA identifies
Kohistanis, Yousafzais, and Gujars as possessing affinities to one another. When
such results are viewed together, the results obtained from dental morphology
suggest the immigrant Pashtun groups were small in number and appear to have
intermarried extensively with members of the local ethnic groups they encountered,
especially those occupying the highlands. However, the Kohistanis are not closely
related to these immigrants, while the rather close affinities of Gujars to the sample
of Harappans from Cemetery R37 attest to their Indus Valley origins.
Although Pakistan is inhabited by a population of tremendous ethnic diversity,
the members of many of its ethnic groups have remained largely unstudied
genetically. The mtDNA control region of the five population samples from Dir and
Swat Districts was therefore analyzed for genetic characterization.
MtDNA has a distinctive geographic distribution throughout the world’s
population. The sequences of mtDNA haplogroups differ from one another owing to
polymorphic sites (nucleotide variation) in the control region. Studies
based upon HVSI and HVSII of mtDNA have helped to explore the genetic
legacy of some Indian and Pakistani populations (Bhatti et al., 2016; Kivisild et al.,
2003; Roy et al., 2003; Macaulay et al., 1999).
In the present study, a total of 126 different haplotypes were identified among which
the frequency of unique haplotypes was found to be 63% among Gujars , 67%
among Tarklanis, 63% among Utmankheils, 70% among Yousafzais (70%) and 67%
among Kohistanis (67%). The proportion of unique haplotypes among the other
reported populations of Pakistan have been observed with the frequencies of 91% in
Orakzais of Hazara, 77% among Makranis, 74% among Saraikis , 72% among
Burushos, 68% among Pathans, 66% among Baluchis , 60% among Bangashis, 58%
204
among Brahuis, 56% among Sindhis, 52% among Khattaks, 50% among Parsis, 36%
among Mahsuds and 27% among Kalashas (Bhatti et al., 2016a; Hayat et al., 2015;
Saddiqi et al., 2014; Rakha et al., 2011; Quintana-Murci et al., 2004). The difference in
haplotype frequencies among the five ethnic groups from Swat and Dir districts the
other reported ethnic groups from Pakistan is due to the differences in sample size.
A total of 75 unique haplotypes were identified in the five population samples
from Swat and Dir districts; the proportion of unique haplotypes was 63% among
Gujars, 74% among Tarklanis, 75% among Utmankheils, 77% among Yousafzais and
60% among Kohistanis. Such values are consistent with those reported for Burushos (78%), Hazaras (76%),
Makranis (76%), Baluchis (69%) and Brahuis (68%) among the other reported
populations of Pakistan, but unique haplotypes were found to be more common
among Saraikis (92%), Sindhis (90%) and Pathans (81%) (Hayat et al., 2015; Saddiqi et
al., 2014; Rakha et al., 2011; Quintana-Murci et al., 2004).
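The proportions of unique haplotypes discussed above can be derived from a list of per-individual haplotype labels. The following Python sketch is illustrative only and is not the procedure used in this study. It assumes that a "unique" haplotype is one observed in exactly one individual of a sample, it leaves the choice of denominator (distinct haplotypes or sampled individuals) to the caller because the text does not state which was used, and it adds Nei's unbiased haplotype diversity as a complementary summary. The function names and the toy sample are hypothetical.

```python
from collections import Counter


def unique_haplotype_proportion(haplotypes, per="haplotype"):
    """Share of haplotypes observed in exactly one individual.

    per="haplotype":  denominator is the number of distinct haplotypes.
    per="individual": denominator is the number of sampled individuals.
    Which denominator the thesis used is an assumption left to the caller.
    """
    counts = Counter(haplotypes)
    singletons = sum(1 for c in counts.values() if c == 1)
    if per == "haplotype":
        denominator = len(counts)
    elif per == "individual":
        denominator = len(haplotypes)
    else:
        raise ValueError("per must be 'haplotype' or 'individual'")
    return singletons / denominator


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n / (n - 1) * (1 - sum(p_i ** 2))."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two individuals are required")
    counts = Counter(haplotypes)
    sum_p_squared = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_p_squared)


# Hypothetical toy sample of eight individuals (labels are placeholders only).
toy_sample = ["H2a", "H2a", "M30", "U2a", "M6", "M6", "T1a", "W"]
share_per_haplotype = unique_haplotype_proportion(toy_sample)
share_per_individual = unique_haplotype_proportion(toy_sample, per="individual")
diversity = haplotype_diversity(toy_sample)
```

Because the two denominators give different percentages for the same sample, comparisons with published values are only meaningful when the same definition is applied to every population.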
The results obtained in the current study suggest that members of the five ethnic
groups from Swat and Dir districts have experienced substantial admixture,
with their mtDNA reflecting: i) a high proportion of West Eurasian lineages; ii)
moderate to high proportion of South Asian lineages; iii) low proportion of East
Eurasian/East Asian, Southeast Asian and North Asian lineages; and iv) a small
fraction of Southern European, Central Asian, Eastern European, African, Australian
and Oceanic lineages.
Phylogenetic analyses have revealed that Indian and Pakistani populations share
high frequencies of West Eurasian mtDNA haplogroups (Bhatti et al., 2016a; Kivisild
et al., 1999). These haplogroups are also very frequent in the present study,
accounting for 45% of the individuals in the population samples from Swat and Dir
districts. The frequency of West Eurasian lineages is 54% among Tarklanis, 54%
among Kohistanis, 52% among Yousafzais, 47% among Utmankheils and 37% among
Gujars. Similar frequencies, of around 55%, were reported among Pathans, followed
by a much lower 26% among Makranis of Pakistan (Siddiqi et al., 2015; Rakha et al.,
2011). The low frequency of West Eurasian haplogroups in the Makrani population
has been attributed to their African ancestry (Siddiqi et al., 2015). Furthermore, the
frequency of West Eurasian haplogroups was reported to range from 40% to 50%
among ethnic groups of Indian Punjabis and to be around 30% among Kashmiris and
Gujaratis, while the lowest proportions of West Eurasian lineages were reported
among ethnic groups residing in West Bengal, among Indian caste populations and in
some Indian states such as Uttar Pradesh, Kerala, Maharashtra and Tamil Nadu
(Ahmed, 2014; Metspalu et al., 2004; Kivisild et al., 2003). A greater proportion of
West Eurasian lineages was reported among the major ethnic groups of Afghanistan,
with frequencies of 40% among Hazaras, 89% among Tajiks, 74% among Baluchis and
64% among Pashtuns (Whale, 2012). The presence of West Eurasian lineages at high
frequencies suggests that past gene flow into this region likely occurred from
the west through Iran or possibly from the north through Central Asia
(Quintana-Murci et al., 2004), through repeated invasions in the past
(McElreavey et al., 2005).
South Asian lineages are the second most prevalent, accounting for 30% of the
lineages found among the individuals of the five ethnic group samples from Swat
and Dir districts. Frequencies were highest among Gujars at 42%, followed by
Utmankheils at 33%, Tarklanis at 30%, Yousafzais at 29% and Kohistanis at 24%. The
proportion of South Asian lineages among individuals of other reported Pakistani
ethnic groups ranges from a high of 48% among Sindhis, through 39.1% among
Pathans, 36% among Pashtuns and 29.4% among Saraikis, to 24% among Makranis
(Bhatti et al., 2016a; Bhatti et al., 2016b; Hayat et al., 2015; Saddiqi et al., 2014;
Rakha et al., 2011). Low frequencies of South Asian lineages have been reported by
other researchers among the major ethnic groups of Afghanistan, ranging from 15%
among Hazaras, through 13.3% among Baluchis and 7.1% among Pashtuns, to complete
absence among Tajiks (Whale, 2012). A comparison of the frequency of South Asian
lineages among Afghan Pathans (7.1%) with the frequencies among the Tarklanis
(30%), Utmankheils (33%) and Yousafzais (29%) of the present study, as well as with
those among Pathans (36%; Bhatti et al., 2016b) and Pashtuns (29.4%; Rakha et al.,
2011) from Pakistan, suggests that there has been considerable gene flow between
these immigrant groups and the local, indigenous ethnic groups they encountered
once they arrived in Pakistan.
The complete dataset revealed that only 7% of the lineages found among members of
the five ethnic groups of Swat and Dir districts are associated with populations of
East Eurasia/East Asia. Frequencies range from a high of 12.8% among Yousafzais,
through 11% among both Gujars and Kohistanis and 1.4% among Utmankheils, to
complete absence among Tarklanis. Frequencies of East Eurasian/East Asian lineages
previously reported by other researchers among Pakistani ethnic groups range from
a high of 35% among Hazaras of Baluchistan, to 9% among Saraikis, to 6.9% among
Burushos of Hunza, to 5.2% among Pathans, to a low of 2% among Makranis (Bhatti et
al., 2016a; Saddiqi et al., 2014; Rakha et al., 2011; Quintana-Murci et al., 2004).
East Eurasian/East Asian haplogroups have also been reported in the major
ethnic groups of Afghanistan, where frequencies range from a high of 37.5% among
Hazaras, through 31% among Uzbeks, 14.3% among Pashtuns and 13.4% among Baluchis,
to 10.5% among Tajiks. In addition, it has been reported that 37% of lineages observed among
Turkmens from Turkmenistan are of East Eurasian/East Asian derivation (Whale,
2012; Quintana-Murci et al., 2004).
Our examination of haplotypes among the five sampled ethnic groups reveals that
about 4% of these lineages are of Southeast or North Asian origin. These lineages
range from a high of 8% among Kohistanis and 7% among Utmankheils, while
they were scarce among Gujars, Tarklanis and
Yousafzais. Negligible frequencies of Southern European, Central Asian, Eastern
European, African, Australian and Oceanic lineages were found among members of
the five sampled ethnic groups of Swat and Dir districts.
A majority of individuals in current human populations outside Africa
possess mtDNA lineages that can be assigned to mega-haplogroups M, R and N, all of
which are believed to be derived from African lineage L3. Among the 298
individuals from the five sampled ethnic groups of Swat and Dir districts, the
frequencies of mega-haplogroups R, M, N and L were 62%, 32%, 5% and 0.34%,
respectively (Fig. 49). The highest frequencies of mega-haplogroup R occur among
Tarklanis at 74%, followed by Yousafzais at 71%, Utmankheils at 64%, Kohistanis at
54% and Gujars at 48%. Frequencies of mega-haplogroup R previously reported by
other researchers among other ethnic groups of Pakistan range from a high of
63.4% among Pathans of Districts Mardan and Charsada (Tabassum et al., 2016),
followed by 61.3% among Pathans residing within the Federally Administered Tribal
Areas (Rakha et al., 2011), 30.8% among Baluchis (Whale, 2012), 17.3% among
Hazaras, 16.89% in the Hazarwal population of Hazara Division (Akbar et al., 2016),
9.1% among Makranis, 8.7% among Hazaras of Baluchistan, 8% among Pashtuns, 7.7%
among Baluchis, 7.9% among Brahuis, 6.9% among Sindhis (Bhatti et al., 2016a,
2016b), and 2.3% among the Burushos of Hunza (Quintana-Murci et al., 2004). This
haplogroup has also been previously reported by other researchers among the major
ethnic groups of Afghanistan, where mtDNA lineages corresponding to
mega-haplogroup R were found to range from a high of 28.6% among Pashtuns,
through 20% among Uzbeks and 15.8% among Tajiks, to a low of 7.5% among Hazaras.
In India, frequencies of mtDNA belonging to mega-haplogroup R range from a high
of 31% among Koyas, through 8.8% among Gujaratis and 8.77% among Tamils, to 1%
among Chenchus (Ranaweera et al., 2014; Kivisild et al., 2003; Quintana-Murci et
al., 2004).
Lineages belonging to mega-haplogroup M among members of the five sampled
ethnic groups of Dir and Swat Districts occurred with the highest frequency among
Gujars (45%), followed by Kohistanis (38%), Utmankheils (33%), Tarklanis (23%) and
Yousafzais (21%). Frequencies of lineages within mega-haplogroup M among ethnic
groups of Pakistan previously reported by other researchers range from a high of
33% among Baluchis, to 30.9% among Pathans resident within the Federally
Administered Tribal Areas, to 30.4% among Sindhis, to 28% among Pashtuns, to
26.8% among Pashtuns of Charsada and Mardan Districts, to 22.7% among
Burushos from Hunza, to 21.78% in the Hazarwal population of Hazara Division, to
13% among Hazaras, to a low of 9.1% among Makranis (Tabassum et al., 2016; Rakha
et al., 2011; Whale, 2012; Akbar et al., 2016; Bhatti et al., 2016a; Bhatti et al., 2016b;
Quintana-Murci et al., 2004). The proportion of mtDNA lineages assignable to
mega-haplogroup M among members of Afghan ethnic groups reported by other
researchers ranges from a high of 15% among Hazaras, through 13.3% among
Baluchis and 7.1% among Pashtuns, to complete absence among Tajiks (Whale,
2012). Lineages assignable to mega-haplogroup M are predominant among ethnic
groups of peninsular India, occurring in 60-70% of the population, and range from
26% to 64% across the Indian sub-continent (Chandrasekar et al., 2009; Metspalu et
al., 2004; Quintana-Murci et al., 2004; Kivisild et al., 1999).
I observed that lineages attributable to mega-haplogroup N were found among 5%
of the 298 sampled individuals from Dir and Swat Districts. Frequencies were
highest for Kohistanis (8%), Gujars (7%) and Yousafzais (6%), while frequencies
were much lower among Tarklanis and Utmankheils, at 3% each. The
frequency of mtDNA lineages attributable to mega-haplogroup N among other
Pakistani ethnic groups reported by other researchers ranges from a high of 15.56%
in the Hazarwal population of Hazara Division, to 8.6% among Pashtuns from
Districts Charsada and Mardan, to 7.8% among Pathans residing within the
Federally Administered Tribal Areas, to 6.9% among Sindhis, to 5.2% among
Baluchis, to 3% among Pashtuns from Khyber Pakhtunkhwa and Makranis of Sindh,
to 2.6% among Brahuis, to a low of 2.3% among the Burushos of Hunza (Tabassum et
al., 2016; Akbar et al., 2016; Bhatti et al., 2016a; Bhatti et al., 2016b; Whale, 2012;
Rakha et al., 2011; Quintana-Murci et al., 2004). Its prevalence among the major
ethnic groups of Afghanistan has been reported by other researchers as ranging
from a high of 10.5% among Tajiks to 7.5% among Hazaras, with an overall frequency
of 5.9% in the Afghan population as a whole (Whale, 2012). The prevalence of M, N
and R lineages within the five population samples of the present study from Swat
and Dir districts, in other Pakistani populations and in the neighboring populations
of Afghanistan and India may indicate that these areas were among the first places
where humans settled after their dispersal from Africa (Chandrasekar et al., 2009).
No haplotypes were shared among the five sampled ethnic
groups of Dir and Swat Districts, but various specific haplotypes were identified, of
which H2a was the most prevalent, being found in 26.6% of individuals, followed by
M30 (25.3%), U2a (21.4%), M3 (19%), M6 (17.6%), B4a (15.9%), J1b (12.5%), U4a
(12.2%), T1a (10.3%), T (10%), M3a (9.74%), W (8.6%), R5a (7.94%), H17c (7%), D4p
(6.8%), U2e (6.4%), U7a (6.14%), T2b (5.8%), HV12b, H1e (5.4%), G2a (5.4%), M4
(5.4%), D4e (5%) and N1a (5%). Among these haplotypes, H2a was observed to be
the most common among Yousafzai and Tarklani individuals, M6 was the
most common among Gujar and Kohistani individuals, while M30 was the most
common haplotype observed among Utmankheil individuals.
Haplotype H2a is predominant in European and West Eurasian populations
(Brotherton et al., 2013; Loogvali et al., 2004), M6 is frequently reported in the Indus
Valley (Metspalu et al., 2004), M30 is India-specific (Maji, 2009), while U2a is
restricted to South Asia (Quintana-Murci et al., 2004). The prevalence of specific
haplotype H2a among Yousafzais and Tarklanis suggests that the maternal gene
pools of these two populations are derived from West Eurasian populations. The
predominance of haplotype M6 among Gujars and Kohistanis may indicate maternal
gene flow from ethnic groups occupying the Indus Valley, while the high prevalence
of haplotype M30 among Utmankheils suggests some kind of general South Asian
influence on their maternal gene pool. The MDS graphs depict Kohistanis as clear
outliers relative to the four other sampled ethnic groups of Dir and Swat Districts.
Such results may be a consequence of the fact that Kohistanis are highly
endogamous and genetically isolated relative to the other sampled ethnic groups
included in this study.
Our analysis of patrilineal genetic diversity among members of the five sampled
ethnic groups of Dir and Swat Districts yielded several interesting insights. First, the
level of Y-STR haplotype diversity within each ethnic group was found to be
generally high and comparable to average global values (Purps et al., 2014), except
for the Utmankheil sample, which displays less diversity and fewer unique
haplotypes (Table 29). Second, the five groups are marked by an extreme level of
genetic differentiation, both among themselves (Table 30, Fig. 63) and in relation to
other population groups in the region (Fig. 65). Based on all 27 loci, the average FST
between these five ethnic groups is very high at 0.34 (Table 30), with an extreme FST of
0.60 observed between Tarklanis and Utmankheils (Table 30). The mid-range FST
values (0.1-0.3) found between some of the ethnic groups (i.e., Gujar–Kohistani,
Tarklani–Kohistani, Yousafzai–Kohistani) are comparable to genetic distances
reported previously between population groups from the Indo-Pakistani
sub-continent (Perveen, 2014; Seema et al., 2011; Alam et al., 2010) and the Middle East
(Triki‐Fendri, 2015). However, the extreme genetic distances (FST > 0.35) observed
in several of the pairwise comparisons are unusual and higher than those observed
between most human populations, even those occupying different continents
(Purps et al., 2014). Small sample sizes can inflate the genetic distances and, with just
20 sampled individuals from each group, the FST values should be interpreted with
some caution. However, we note that such extreme genetic distances have been
observed previously between other ethnic groups living in relative proximity (Zeng
et al., 2014), when they have experienced prolonged and severe genetic isolation
coupled with long-standing endogamy (Zeng et al., 2014; Roewer, 2013; Gaikwad et
al., 2006). As such, it is perhaps not unexpected to observe such great genetic
distances between the ethnic groups of Swat and Dir districts given their isolated
residential localities, their cultural preferences for endogamous marriages, as well as
their differences in subsistence practices, lifestyles and languages, especially among
Pashtuns, Gujars and Kohistanis (Barth, 1956). The high differentiation could be the
result of male founder effects (see below) and might not be mirrored in
genome-wide autosomal data, but further studies are needed to clarify this. Nevertheless, our
results indicate that isolated lifestyles and cultural preferences can have a very large
impact on genetic distances between geographically closely residing populations.
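To make the notion of a pairwise genetic distance concrete, the following Python sketch computes a simple haplotype-frequency FST between two samples as (H_T - H_S) / H_T, where H_S is the mean within-sample gene diversity and H_T is the gene diversity of the pooled (equally weighted) frequencies. This is an illustrative estimator only. It is an assumption that it is not the procedure behind Table 30, which may rely on a different, AMOVA-type or allele-size-based estimator, and the function names and toy inputs are hypothetical.

```python
from collections import Counter


def haplotype_frequencies(sample):
    """Relative frequency of each haplotype label in one sample."""
    n = len(sample)
    return {hap: count / n for hap, count in Counter(sample).items()}


def gene_diversity(freqs):
    """Expected probability that two random draws differ: 1 - sum(p_i ** 2)."""
    return 1.0 - sum(p * p for p in freqs.values())


def pairwise_fst(sample_a, sample_b):
    """Frequency-based FST = (H_T - H_S) / H_T for two samples.

    H_S: mean within-sample gene diversity.
    H_T: gene diversity of the equally weighted pooled frequencies.
    No small-sample correction is applied (a simplifying assumption).
    """
    freqs_a = haplotype_frequencies(sample_a)
    freqs_b = haplotype_frequencies(sample_b)
    h_s = (gene_diversity(freqs_a) + gene_diversity(freqs_b)) / 2.0
    pooled = {
        hap: (freqs_a.get(hap, 0.0) + freqs_b.get(hap, 0.0)) / 2.0
        for hap in set(freqs_a) | set(freqs_b)
    }
    h_t = gene_diversity(pooled)
    if h_t == 0.0:
        return 0.0  # both samples fixed for the same haplotype
    return (h_t - h_s) / h_t


# Hypothetical toy samples (labels are placeholders, not study data).
fst_identical = pairwise_fst(["a", "a", "b", "b"], ["a", "a", "b", "b"])
fst_fixed_differences = pairwise_fst(["a", "a"], ["b", "b"])
fst_no_sharing = pairwise_fst(["a", "b"], ["c", "d"])
```

The third toy case shows a known property of frequency-based estimators: two samples that share no haplotypes still yield an FST well below 1 when within-sample diversity is high, which is one reason why estimates from small, highly diverse Y-STR samples call for cautious interpretation.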
The genetic distinction between members of these ethnic groups is further
underscored by differences in haplogroup frequencies (Table 29). The only
haplogroup shared by members of all five population samples is R1a-M417, Page7,
which is not surprising as this haplogroup occurs widely throughout the Eurasian
continent, especially among populations found in Central Asia and the Indo-
Pakistani sub-continent (Pamjav et al., 2012; Karafet et al., 2008; Novelletto, 2007;
Sengupta et al., 2006).
It is widely recognized that cultural factors, such as language and group
associations, can sometimes play a role in shaping the genetic structure among
human populations, especially those found in remote areas where populations are
small and isolated physically (Ayub et al., 2009; Gaikwad et al., 2006). The AMOVA
results confirm that this is also the case for the Indo-Pakistani sub-continent, where
4.1% of the genetic variation is explained by ethnicity whereas only 1.6% is
explained by geographic origin. Members of the studied ethnic groups were found to be more
similar genetically to population samples assigned to their respective ethnicity than
to population samples obtained in the same geographic location (Figure 65, Table
30).
Unlike Gujars, Kohistanis and especially Utmankheils, Tarklanis and Yousafzais
cannot be differentiated from each other genetically with the 23 analyzed Y-STR
markers (FST = 0.008, Table 30), and the SNP data show that the majority of these
individuals carry variants of haplogroup R1a-M417, Page7 that are intermingled in a
loosely defined group in the network (Fig. 64). This haplogroup is common today
among Europeans, Central Asians, and many of the ethnic groups of South Asia
(Sengupta et al., 2006; Kivisild et al., 2003; Jobling and Tyler-Smith, 2003). Recent
studies have dissected the R1a-M417, Page7 haplogroup in greater detail (Kivisild et
al., 2015; Pamjav et al., 2012). It is reasonable to hypothesize that the Pakistani
individuals from this study assigned to haplogroup R1a-M417, Page7 belong to one
of the sub-haplogroups of R1a-Z95, such as R1a-Z2125, R1a-M560, or R1a-M780
(Underhill et al., 2015). According to the local people, Tarklanis and Yousafzais are
distinct subgroups of Pashtuns (Fig. 8), but several studies have suggested that there
are cultural and linguistic similarities (Caroe, 1992; Khan, 2008), which is clearly
mirrored in our genetic data. The results suggest that both historic and current gene
flow between members of these sub-groups (i.e., patrilineal clans) prevails despite
their current residency in remote areas of the Hindu Kush-Hindu Raj highlands. In
addition, neither of these two populations was significantly different from Pashtun
individuals from Afghanistan after Bonferroni correction (Fig. 65).
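The Bonferroni correction mentioned above divides the nominal significance level by the number of comparisons performed. The following Python sketch illustrates the idea under stated assumptions: the nominal level of 0.05, the function names and the p-values are hypothetical and are not results from this study.

```python
def number_of_pairs(n_groups):
    """Number of distinct pairwise comparisons among n_groups samples."""
    return n_groups * (n_groups - 1) // 2


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each comparison as significant at the Bonferroni-adjusted level.

    p_values: mapping from a comparison label to its unadjusted p-value.
    The adjusted threshold is alpha divided by the number of comparisons.
    """
    threshold = alpha / len(p_values)
    return {label: p < threshold for label, p in p_values.items()}


# Ten pairwise comparisons arise among five samples, so a nominal level of
# 0.05 corresponds to an adjusted threshold of 0.005 per comparison.
pairs_among_five = number_of_pairs(5)

# Hypothetical p-values for three comparisons (illustration only).
toy_p_values = {"A-B": 0.010, "A-C": 0.030, "B-C": 0.200}
toy_flags = bonferroni_significant(toy_p_values)
```

The correction is conservative, so a non-significant result after adjustment indicates that a difference could not be demonstrated with the available sample, not that the populations are identical.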
Utmankheils are also considered Pashtuns (Fig. 8), but with FST distances ranging
between 0.45 and 0.60 (23 loci) from the other four population samples from Dir and
Swat (Table 30) and distances ranging between 0.21 and 0.56 (10 loci) to populations
from the Indo-Pakistani sub-continent and Southwest Asia, they are genetically very
different from any other sample included in this study (Fig. 65). This is also reflected
in the haplogroup networks where most Utmankheils form a very distinct cluster
within haplogroup G-Page94 (Figs. 64 and 67, Table 29). This haplogroup is common
in the Caucasus but is also found in medium to low frequencies in the Middle East
and southern Europe (Nijjar, 2008; Kivisild et al., 2003). Consequently, the
Utmankheils can be considered a genetic outlier within the Indo-Pakistani sub-
continent or even Eurasia (Fig. 66), at least in regard to the Y-chromosome. Such
results suggest that they either have a different genetic origin than the members of
the other Pashtun sub-tribes included here or that the Utmankheil male lineage has
been subjected to severe genetic drift, due to a male founder effect or genetic
bottleneck followed by isolation. The latter scenario is perhaps supported by the lower
genetic diversity observed among Utmankheils relative to that seen among members
of the other groups (Table 29). These results are particularly interesting in light of
accounts that members of the current Utmankheil clan are all descendants of a single
adopted son of unknown origin (Barfield, 2010; Caroe, 1992). Such an origin could explain
the apparent genetic isolation of the Utmankheil male lineage, although the presence of other
Y-SNP haplogroups in the population sample (Table 29) indicates that at least some
male-mediated gene flow must have occurred in either ancient or recent times or that the
bottleneck was not quite as dramatic as proposed (i.e. one male). We note that our
findings do not question the ethnic descriptions of the Utmankheils as a sub-ethnic
group of the Pashtuns, but rather underline the fact that close cultural associations
may easily arise without a closely shared genetic history.
The Gujar population sample is also highly differentiated genetically but shares
relatively close affinities to Baluchi population samples from Afghanistan and
Pakistan (Fig. 65). This observation could support previously suggested cultural
connections, such as a shared transhumant lifestyle and marriages (Adamec, 2011;
Nijjar, 2008; Barth, 1956) between Gujar and Baluch populations despite rather
profound linguistic differences (Grierson, 1903-1928; Strand, 1973; Morgenstierne,
1932). The high proportion of individuals sharing haplogroup L1-M22(xM274) could
again be the result of strong genetic drift. This haplogroup is today found in West
Asia and the Indo-Pakistani sub-continent (Kivisild et al., 2003; Jobling and
Tyler-Smith, 2003). Although speculative, the data could also indicate recent gene flow
between Gujars and Kohistanis, which may be due to a type of symbiotic relationship
that arose between members of these two ethnic groups, since they share haplotypes
within haplogroups H1-M69, G2a-L30(xL14, L13, M278) and L1-M22(xM274) (Table 29
and Figure 67B). Haplogroup L1-M22(xM274) is found in low frequency among
Kohistanis but is the most frequent haplogroup among Gujars and thus recent
paternal gene flow from Gujars to Kohistanis can be hypothesized. However, more
data are needed to clarify this.
In contrast to the other four ethnic groups included in this study, Kohistanis are
more genetically diverse and not significantly different from a wide array of
population samples from the Indo-Pakistani sub-continent (Table 29, Figs. 65 and
67A). However, the exact relationship within haplogroup H1-M69 (the most frequent
haplogroup within Kohistanis) between Kohistanis and members of the other ethnic
groups of Pakistan and Afghanistan is unclear. This is possibly because the
individuals assigned to the network where haplogroup H1-M69 is included may
encompass a large range of (sub)-haplogroups, depending on the sub-set of Y-SNPs
characterized in individual studies. The result could suggest that the term “Kohistani”
has less biological meaning than the other ethnic group identifiers. After all,
Gujar refers to a specific caste of herders, while Tarklani, Utmankheil and Yousafzai
refer to patrilineal clans. Kohistani merely refers to a resident of a particular region,
a designation that may impose no specific requirements regarding suitable marriage
partners (at least not to the degree seen in the other four ethnic groups); Kohistanis
are therefore found to be genetically admixed in our study.
Conclusions
In the current doctoral thesis, a total of 14 tooth-trait combinations defined by the
Arizona State University Dental Morphology System were investigated in 823
individuals belonging to five ethnic groups (Gujars, Kohistanis, Yousafzais, Tarklanis
and Utmankheils) residing in Dir and Swat Districts. Gujars, Kohistanis and
Utmankheils tended to exhibit affinities to the Chalcolithic era inhabitants of
Harappa, located within the Indus Valley. Yousafzais were found to exhibit close
affinities to ethnic groups residing within the Hindu Kush-Karakoram highlands,
while Tarklanis were found to possess no close affinities to members of the other
samples included in the analysis. These results were confirmed by multidimensional
scaling. Principal coordinate analysis yielded a somewhat different picture. In this
case Kohistanis, Yousafzais and Gujars were identified as possessing affinities to
one another. It was concluded from the results of the dental morphology analysis
that the immigrant Pashtun groups (i.e., Tarklanis, Utmankheils, Yousafzais) were
likely small in number and, upon their arrival in Pakistan, intermarried extensively
with members of local groups, especially those occupying these highlands. On the
other hand, Kohistanis are not closely related to these immigrants but share affinities
to other highland ethnic groups, while the affinities of Gujars provide a clue to their
Indus Valley origin. In short, all three ethnic groups (i.e., Pashtuns, Gujars and
Kohistanis) have retained their distinct identities.
The five sampled ethnic groups were characterized molecularly by screening for
mtDNA haplogroups. The high frequency of West Eurasian haplogroups among
four of the ethnic groups (Yousafzais, Kohistanis, Utmankheils and Tarklanis) reveals that
these populations have greater affinity with the West Eurasian gene pool and are also
closely related to each other, while the presence of South Asian lineages among
Gujar individuals confirms their affinities to ethnic groups of the Indus Valley and
beyond in peninsular India, as attested by the results of dental morphology. The
occurrence of lineages assignable to maternal mega-haplogroup R among
members of the five sampled ethnic groups of Dir and Swat Districts also confirms
that the inhabitants of northern Pakistan share their gene pool with West Eurasians
and Europeans. This genetic influx might be due to the Paleolithic and Neolithic
dispersal of populations from West Eurasia to South Asia through Iran and along the
Arabian Sea coast. Our results show that most of the ethnic groups exhibit a high
proportion of individuals possessing a West Eurasian haplogroup, followed by those
possessing haplogroups of South Asian origin. East Asian, Southeast Asian,
Southern European and Central Asian lineages are all quite rare in the maternal gene
pools of the five sampled ethnic groups of Dir and Swat Districts.
We have also characterized the genetic diversity of paternal lineages for members of
the same five ethnic groups residing within the mountainous Dir and Swat Districts
of the Khyber Pakhtunkhwa Province, Pakistan. With the exception of Tarklanis and
Yousafzais, we have documented extreme levels of genetic differentiation of the
male lineages between the groups. Such differences suggest either a lack of
shared ancestry, perhaps due to several distinct ancient or historical migrations into
this region, and/or bottlenecks and isolation events resulting in severe genetic drift
in the local male gene pools. The Y-STR data presented here do not offer sufficient
resolution to investigate these scenarios further but the results provide a strong
impetus to resolve the demographic history of this region with genome-scale
analyses.
In concurrence with previous studies, we conclude that ethnicity provides a more
accurate predictor of genetic associations than simple geographic propinquity.
However, our data also illustrate a clear exception in that Utmankheils are not
related to the other Pashtun groups analyzed. Thus, either the cultural association is
a more recent phenomenon not explained by shared ancestry, or a founder
event, such as a putative adoption among the Utmankheils followed by strong
genetic drift, simply erased the genetic links but not the cultural ones. The overall
results also indicate that these populations are strongly associated with West
Eurasian and South Asian gene pools.
Recommendations
We analysed non-metric dental traits in the present study; further analysis based on
odontometrics should also be carried out to provide a clearer picture of the five
population samples from Swat and Dir districts. A cohort-based study of all the
ethnic groups in Khyber Pakhtunkhwa Province, Afghanistan and other adjoining
areas is recommended to provide insight into the overall patterns of
biological affinities among members of these ethnic groups. The data produced
provide a sound baseline for elaborating the historical profile and anthropological
standing of the Pakistani people and for developing a sound database for personal
genomics and personalized medicine.
REFERENCES
Achilli, A., C. Rengo, C. Magri, V. Battaglia, A. Olivieri, R. Scozzari, F. Cruciani, M.
Zeviani, E. Briem, V. Carelli and P. Moral. 2004. The molecular dissection of
mtDNA haplogroup H confirms that the Franco-Cantabrian glacial refuge
was a major source for the European gene pool. Am J Hum Genet. 75(5): 910-
918.
Adamec, L.W. 2011. Historical dictionary of Afghanistan: Scarecrow Press.
Ahmad, H. and Sirajuddin. 1996. Ethnobotanical profile of Swat. Proc. First. Train.
Workshop Ethnob. Appl. Conserve. Islamabad pp 202-206.
Ahmad, H., M. Ozturk, W. Ahmad and M. S. Khan. 2015. Status of Natural
Resources in the Uplands of the Swat Valley Pakistan: In Climate Change
Impacts on High-Altitude Ecosystems. Springer International Publishing. pp.
49-98.
Ahmad, H. and R. Ahmad. 2003. Agroecology and biodiversity of the catchment area
of Swat River. The Nucleus 40: 67-76. 271.
Ahmad, H., I. Hussain, I. Ahmad and F. Rahman. 2001. Historical overview and
ethnography of Swat Valley. Lahore Museum Bull. XIV: 23-27.
Ahmed, A. 1976. Millennium and Charisma among Pathans: A Critical Essay in
Social Anthropology. Routledge and Kegan Paul, London.
Ahmed, M. and A. Sirajuddin. 1996. Ethnobotanical profile of Swat. In Proceeding of
first training workshop on Ethnobotany and its application to conservation.
Islamabad, Pakistan.
Ahmed, M. 2014. Ancient Pakistan – an Archaeological History. Amazon, Create Space
Independent Publishing Platform.
Akbar, N., H. Ahmad, M. S. Nadeem, N. Ali and M. Saadiq. 2015. An Efficient
Procedure for DNA Isolation and Profiling of the Hyper Variable MtDNA
Sequences. J Life Sci. 9: 530-534.
Alam, S., E. M. Ali, A. Ferdous, T. Hossain, M. M. Hasan and S. Akhteruzzaman.
2010. Haplotype diversity of 17 Y-chromosomal STR loci in the Bangladeshi
population. Forensic Sci Int-Gen. 4(2): 59-60.
Alechine, E., W. Schempp and D. Corach. 2016. Characterization of the AZF region
of the Y chromosome in Native American haplogroup Q. J Sci Hum Art. 3(4):
7-58.
Ali, H., H. Ahmad, K. B. Marwat, M. Yousaf, B. Gul and I. Khan. 2012. Trade
potential and conservation issues of medicinal plants in District Swat,
Pakistan. Pak J Bot. 44(6): 1905-1912.
Ali, H., J. Shah and A. K. Jan. 2008. Medicinal value of family Ranunculaceae of Dir
District, Pakistan. Pak J Bot. 39(4): 1037-1044.
Ali, I., M. Zahir and M. Qasim. 2005. Archaeological survey of district Chitral. Frontier
Archaeology. 3: 91.
Ali, U. and M. A. Khan. 1991. Origin and diffusion of settlements in Swat valley. Pak J
Geogr. 1(1): 97-115.
Anderson, S., A. T. Bankier, B. G. Barrell, M. H. L. de Bruijn, A. R. Coulson, J.
Drouin, I. C. Eperon, D. P. Nierlich, B. A. Roe, F. Sanger, P. H. Schrier, A. J. H.
Smith, R. Staden and I. G. Young. 1981. Sequence and organization of the human
mitochondrial genome. Nature. 290: 457-465.
Andrews, R. M., I. Kubacka, P. F. Chinnery, R. N. Lightowlers, D. M. Turnbull and
N. Howell. 1999. Reanalysis and revision of the Cambridge reference
sequence for human mitochondrial DNA. Nat Genet. 23: 147.
Anonymous. 1998. District census report Malakand Agency-1998: Population Census
Organization, Government of Pakistan, Islamabad.
Arif, M. 2014. An overview of archaeological research in Gandhara and its adjoining
regions (Colonial and Post Colonial Period). J Asian Civil. 37: 73–78.
Aslamkhan, M. 1996. Sapta Sindhvas: the Land of Seven Rivers. Lahore Mus Bull. 9:
59–67.
Ayub, Q. and C. Tyler-Smith. 2009. Genetic variation in South Asia: assessing the
influences of geography, language and ethnicity for understanding history
and disease risk. Brief Funct Genomic Proteomics. 8: 395-404.
Baart, J. L. and Z. M. Sagar. 2002. The Gawri language of Kalam and Dir Kohistan.
Pp. 1-22.
Bailey, S. E. 2002. A closer look at Neanderthal postcanine dental morphology: the
mandibular dentition. Anat Rec. 269(3): 148-156.
Bailey, S. E. 2004. A morphometric analysis of maxillary molar crowns of Middle-
Late Pleistocene hominins. J Hum Evol. 47(3): 183-198.
Bailey, S. E., M. M. Skinner and J. J. Hublin. 2011. What lies beneath? An evaluation
of lower molar trigonid crest patterns based on both dentine and enamel
expression. Am J Phys Anthropol. 145(4): 505-518.
Bamshad, M., S. Wooding, B. A. Salisbury and J. C. Stephens. 2004. Deconstructing
the relationship between genetics and race. Nat Rev Genet. 5(8): 598-609.
Bangash, S. 2012. Socio-economic conditions of post-conflict Swat: a critical
appraisal. J Peace Dev II FATA Research Centre, Islamabad.
Barfield, T. 2010. Afghanistan: A cultural and political history. Princeton University
Press.
Barth, F. 1956. Ecologic relationships of ethnic groups in Swat, North Pakistan. Am
Anthropol. 58: 1079-1089.
Barth, F. 1959. Political Leadership among Swat Pathans. The Athlone Press, London.
Basham, A. L. 1963. The Wonder That Was India. Orient Longmans Limited, New
Delhi.
BBC, 2010. Afghanistan Country Profile. Retrieved November 10th, 2010, from
http://news.bbc.co.uk/1/hi/world/south_asia/country_profiles/1162668.stm
Behar, D., E. Metspalu, T. Kivisild, S. Rosset and S. Tzur. 2008. Counting the
founders: the matrilineal genetic ancestry of the Jewish Diaspora. PLoS One.
3(4): e2062.
Bekada, A., R. Fregel, M. V. Cabrera, M. J. Larruga, J. Pestano, S. Benhamamouch
and M. A. Gonzalez. 2013. Introducing the Algerian mitochondrial DNA and
Y-chromosome profiles into the North African landscape. PLoS One. 8(2):
e56775.
Bellew, H. W. 1994. A General Report on the Yousafzais. 1864; 3rd edn. Lahore:
Sang-e-Meel Publications. p. 97.
Bernhard, W. 1983. Ethnogenesis of South Asia with special reference to India.
Anthropol Anz. 93-110.
Berta, P., R. J. Hawkins, H. A. Sinclair, A. Taylor, L. B. Griffiths, N. P. Goodfellow
and M. Fellous. 1990. Genetic evidence equating SRY and the testis-determining
factor. Nature. 348: 448-450.
Bhatti, S., M. Aslamkhan, M. Attimonelli, S. Abbas and H. H. Aydin. 2016a.
Mitochondrial DNA variation in the Sindhi population of Pakistan. Aust J
Forensic Sci. 48: 115–130.
Bhatti, S., M. Aslamkhan, S. Abbas, M. Attimonelli, H. H. Aydin and S. M. E. de
Souza. 2016b. Genetic analysis of mitochondrial DNA control region
variations in four tribes of Khyber Pakhtunkhwa, Pakistan. Mitochondr DNA.
1-11.
Birdwood. 1959. A History of the Pathans: Review. Geogr J. 125: 414-416.
Blaylock, S. R. 2008. Are the Kho an indigenous population of the Hindu Kush? A
dental morphology investigation. Unpublished Master’s Thesis. California
State University, Bakersfield.
Bodmer, W. 2015. Genetic characterization of human populations: from ABO to a
genetic map of the British people. Genetics. 199(2): 267-279.
Bohner, J and V. Lucarini. 2015. Prevailing climatic trends and runoff response from
Hindukush-Karakoram-Himalaya, upper Indus basin. arXiv preprint
arXiv:1503.06708.
Bolk, L. 1916. Problems of human dentition. Am J Anthropol. 19: 91-148.
Bolk, L. 1922. On the relationship between reptilian and mammalian teeth.
Odontological Essays IV. J of Anat. 56: 107-136.
Bouckaert, R., P. Lemey, M. Dunn, S. J. Greenhill, A. V. Alekseyenko, A. J.
Drummond, R. D. Gray, M. A. Suchard and Q. D. Atkinson. 2012. Mapping
the origins and expansion of the Indo-European language family. Science.
337: 957–960.
Brandon, M. C., E. Ruiz-Pesini, D. Mishmar, V. Procaccio, M. T. Lott, K. C. Nguyen,
S. Spolim, U. Patil, P. Baldi and D. C. Wallace. 2009. Mitomaster: a
bioinformatics tool for the analysis of mitochondrial DNA sequences. Hum
Mutat. 30(1): 1-6.
Brandt, G., W. Haak, C. J. Adler, C. Roth, A. Szecsenyi-Nagy, S. Karimnia, S.
Moller-Rieker, H. Meller, R. Ganslmeier, S. Friederich, V. Dresely, N. Nicklisch, J. K.
Pickrell, F. Sirocko, D. Reich, A. Cooper and K. W. Alt. 2013. Ancient DNA
reveals key stages in the formation of Central European mitochondrial genetic
diversity. Science. 342: 257–261.
Bridge and R. Allchin. 1982. The Rise of Civilization in India and Pakistan. Cambridge
University Press. p. 306.
Brook, A. H and M. Scheers. 2006. Variations of Tooth Root Morphology in a
Romano-British Population. Dent Anthropol. 19(2): 33-38.
Brotherton, P., W. Haak, J. Templeton, G. Brandt, J. Soubrier, C. J. Adler, S. M.
Richards, C. Der Sarkissian, R. Ganslmeier, S. Friederich and V. Dresely. 2013.
Neolithic mitochondrial haplogroup H genomes and the genetic origins of
Europeans. Nat Commun. 4: 1764.
Brown, W. M., M. George and C. A. Wilson. 1979. Rapid evolution of animal
mitochondrial DNA. Proc Natl Acad Sci. 76: 1967–71.
Bryant, E. and E. F. Bryant. 2001. The quest for the origins of Vedic culture: the Indo-
Aryan migration debate. Oxford University Press.
Buikstra, J. E., S. R. Frankenberg and L. W. Konigsberg. 1990. Skeletal Biological
Distance Studies in American Physical Anthropology: Recent Trends. Am J
Anthropol. 82: 1-7.
Busby, G. B., F. Brisighelli, P. Sanchez-Diz, E. Ramos-Luis, C. Martinez-Cadenas, G.
M. Thomas, G. D. Bradley, L. Gusmao, B. Winney, W. Bodmer and M.
Vennemann. 2012. The peopling of Europe and the cautionary tale of Y
chromosome lineage R-M269. Proc R Soc B. 279(1730): 884-892.
Butler, J. 2005. Forensic DNA Typing: Biology, Technology, and Genetics of STR
Markers. 2nd Edition. London. Elsevier Academic Press.
Butler, J. M. 2011. Y-chromosomal DNA testing. In: Advanced topics in forensic
DNA typing: Methodology. London: Academic Press. 371–403.
Butler, J. M. 2012. Advanced Topics in Forensic DNA Typing: Methodology. Elsevier
Academic Press. ISBN 978-0-12-374513-2.
Cann, H. M., C. De Toma, L. Cazes, F. M. Legrand, V. Morel, L. Piouffre, J. Bodmer,
F. W. Bodmer, B. Bonne-Tamir, A. Cambon-Thomsen and Z. Chen. 2002. A
human genome diversity cell line panel. Science. 296(5566): 261-262.
Cann, R., M. Stoneking and A. Wilson. 1987. Mitochondrial DNA and Evolution.
Nature. 325: 31-36.
Carabelli, G. 1842. Anatomie des Mundes. Braumüller und Seidel, Wien press.
Caroe, O. 1958. The Pathans. Oxford University Press, London.
Caroe, O. 1976. The Pathans: 550 B.C.–A.D. 1957. Oxford University Press, London.
Caroe, O. 1992. The Pathans (1958). Karachi: Oxford University Press.
Cavalli-Sforza, L. L., P. Menozzi, and A. Piazza. 1994. The History and Geography of
Human Genes. Princeton. Princeton University Press.
Chandrasekar, A., S. Kumar, J. Sreenath, N. B. Sarkar, P. B. Urade, S. Mallick, S. S.
Bandopadhyay, P. Barua, S. S. Barik, D. Basu, U. Kiran, P. Gangopadhyay, R.
Sahani, R. V. B. Prasad, S. Gangopadhyay, R. G. Lakshmi, R. R. Ravuri, K.
Padmaja, N. P. Venugopal, M-B. Sharma and R. V. Rao. 2009. Updating
Phylogeny of Mitochondrial DNA Macrohaplogroup M in India: Dispersal of
Modern Human in South Asian Corridor. PLoS One. 4: e7447.
Chang, X., Z. Wang, P. Hao, Y. Y. Li and Y. X. Li. 2010. Exploring mitochondrial
evolution and metabolism organization principles by comparative analysis of
metabolic networks. Genome. 95(6): 339-344.
Chauhan, R. A. H. 2001. A short history of the Gurjars: past and present/by Rana Ali
Hasan Chauhan.
Chennakrishnaiah, S., D. Perez, T. Gayden, L. Rivera, M. Regueiro and J. R. Herrera.
2013. Indigenous and foreign Y-chromosomes characterize the Lingayat and
Vokkaliga populations of Southwest India. Gene. 526(2): 96-106.
Clark, G. A. and C. M. Willermet (eds). 1997. Conceptual Issues in Modern Human
Origins Research. Transaction Publishers, New York.
Coningham, R and R. Young. 2015. The archaeology of South Asia: from the Indus to
Asoka, c. 6500 BCE–200 CE: Cambridge University Press.
Consortium, Y. C. 2002. A nomenclature system for the tree of human
Y-chromosomal binary haplogroups. Genome Res. 12: 339-348.
Cox, M., F. Mendez, T. Karafet, M. Pilkington, S. Kingan, G. Destro-Bisol, B.
Strassmann and M. Hammer. 2008. Testing for Archaic Hominin Admixture
on the X Chromosome: Model Likelihood for the Modern Human RRM2P4
Region from Summaries of Genealogical Topology Under the Structural
Coalescent. Genetics. 178: 427-437.
Crews, R. D. 2015. Afghan Modern: the history of a global nation. Cambridge, MA:
Harvard University Press.
Cucina, A and V. Tiesler. 2003. Dental caries and antemortem tooth loss in the
Northern Peten area, Mexico: A biocultural perspective on social status
differences among the classic Maya. Am J Phys Anthropol. 122: 1-10.
Cunha, C., A. M. Silva, J. D. Irish, G. R. Scott, T. Tome and J. Marquez. 2012.
Hypotrophic roots of the upper central incisors - a proposed new discrete
dental trait. Dent Anthropol. 25(1): 8-14.
Cunliffe, B. 2015. By Steppe, Desert, and Ocean: The Birth of Eurasia. Oxford, United
Kingdom: Oxford University Press.
Cunningham, A. 1865. Archaeological Survey Report of India, 1862-63. Delhi:
Indological Book House. II. p. 73.
Dahlberg, A. A. 1945. The changing dentition of man. J Am Dent Assoc. 32: 676-690.
Dahlberg, A. A. 1951. “The dentition of the American Indian,” in Papers on the
Physical Anthropology of the American Indian. Edited by W. S. Laughlin. New
York: Viking Fund. 138-176.
Dahlberg, A. A. 1956. Materials for the establishment of standards for classification
of tooth characteristics, attributes, and techniques in morphological studies of
the dentition. Chicago: University of Chicago Press.
Damgaard, P. B., A. Margaryan, H. Schroeder, L. Orlando, E. Willerslev and E. M.
Allentoft. 2015. Improving access to endogenous DNA in ancient bones and
teeth. Sci Rep-UK. 5: 11184.
Dani, A. H. 1980. North-West Frontier Burial Rites in their wider archaeological
setting. In: Loofs-Wissowa HHE, editor. The Diffusion of Material Culture
(Asian and Pacific Archaeology Series 9). Manoa: University of Hawaii.
p. 121–150.
Davey, M. J., D. Jeruzalmi, J. Kuriyan and M. O'Donnell. 2002. Motors and switches:
AAA+ machines within the replisome. Nat Rev Mol Cell Bio. 3(11): 826-835.
DeGiorgio, M., M. Jakobsson and N. A. Rosenberg. 2009. Out of Africa: modern
human origins special feature: explaining worldwide patterns of human
genetic variation using a coalescent-based serial founder model of migration
outward from Africa. Proc Natl Acad Sci. 106: 16057-16062.
Derenko, M., B. Malyarchuk, A. Bahmanimehr, G. Denisova and M. Perkova. 2013.
Complete mitochondrial DNA diversity in Iranians. PLoS One. 8: e80673.
DeSantis, L. R. 2016. Dental microwear textures: reconstructing diets of fossil
mammals. Surf Topogr: Metrol Prop. 4(2): 023002.
Docherty, P. 2007. The Khyber Pass: a history of empire and invasion. New York:
Union Square Press.
Drennan, M. R. 1929. The dentition of a Bushman tribe. Ann S Afr Mus. 24: 61-88.
Durrani, F., I. Ali and G. Erdosy. 1991. Further excavations at Rehman Dheri. Ancient
Pakistan. 7:61–151.
Ebner, S., R. Lang, E. Mueller, W. Eder, M. Oeller, A. Moser, J. Koller, B. Paulweber,
J. Mayr, W. Sperl and B. Kofler. 2011. Mitochondrial Haplogroups, Control
Region Polymorphisms and Malignant Melanoma: A Study in Middle
European Caucasians. PLoS One. 6: e27192.
Edgar, H. J. 2013. Estimation of ancestry using dental morphological characteristics. J
Forensic Sci. 58: S3-S8.
Elphinstone, M. 2011. Account of the Kingdom of Caubul, and its dependencies in
Persia, Tartary, and India: comprising a view of the Afghan Nation, and a
history of the Dooraunee Monarchy. Cambridge University Press.
Endicott, P., S. Y. W. Ho and C. Stringer. 2010. Using genetic evidence to evaluate
four palaeoanthropological hypotheses for the timing of Neanderthal and
modern human origins. J Hum Evol. 59: 87–95.
Enoki, K and A. A. Dahlberg. 1958. Rotated maxillary central incisors. Orth J Jpn. 17:
157-159.
Eshed, V., A. Gopher and I. Hershkovitz. 2006. Tooth wear and dental pathology at
the advent of agriculture: New evidence from the Levant. Am J Phys
Anthropol. 130: 145-159.
Excoffier, L and A. Langaney. 1989. Origin and differentiation of human
mitochondrial DNA. Am J Hum Genet. 44: 73–85.
Excoffier, L and H. E. Lischer. 2010. Arlequin suite ver 3.5: a new series of programs
to perform population genetics analyses under Linux and Windows. Mol Ecol
Resour. 10: 564-567.
Fan, L and Y. G. Yao. 2011. MitoTool: A web server for analysis and retrieval for
mitochondrial DNA sequence variations. Mitochondrion. 2: 351–6.
FATA. 2010. Federally administered tribal area. (http://www.fata.gov.pk/).
Firasat, S., S. Khaliq, A. Mohyuddin, M. Papaioannou, C. Tyler-Smith, A. P.
Underhill and Q. Ayub. 2007. Y-chromosomal evidence for a limited Greek
contribution to the Pathan population of Pakistan. Eur J Hum Genet. 15(1):
121-126.
Flower, W. H. 1885. On the size of teeth as a character of race. Jour Roy Anthrop Inst.
14: 183-186.
Forster, P and S. Matsumura. 2005. Did Early Humans Go North or South? Science.
308: 965-966.
Forster, P. 2004. Ice Ages and the mitochondrial DNA chronology of human
dispersals: a review. Philos Trans R Soc Lond B Biol Sci. 359: 255-64.
Foster, J. W and A. J. Graves. 1994. An SRY-related sequence on the marsupial X
chromosome: implications for the evolution of the mammalian testis-determining
gene. P Natl Acad Sci. 91: 1927-1931.
Fu, Q., A. Mittnik, P. L. Johnson, K. Bos, M. Lari, R. Bollongino, C. Sun, L. Giemsch,
R. Schmitz, J. Burger and A. M. Ronchitelli. 2013. A revised timescale for
human evolution based on ancient mitochondrial genomes. Curr Biol. 23(7):
553-559.
Gaikwad, S., T. Vasulu and V. Kashyap. 2006. Microsatellite diversity reveals the
interplay of language and geography in shaping genetic differentiation of
diverse Proto Australoid populations of west central India. Am J Phys
Anthropol. 129: 260-267.
Garrigan, D and M. F. Hammer. 2006. Reconstructing human origins in the genomic
era. Nat Rev Genet. 7: 669-80.
Geppert, M., M. Baeta, C. Nunez, B. Martínez-Jarreta, S. Zweynert, O. W. V. Cruz, F.
González-Andrade, J. González-Solorzano, M. Nagy and L. Roewer. 2011.
Hierarchical Y-SNP assay to study the hidden diversity and phylogenetic
relationship of native populations in South America. Forensic Sci Int-Genet. 5(2):
100-104.
Glatzer, B. 1998. Being Pashtun-being Muslim: concepts of person and war in
Afghanistan. Essays on South Asian Society: Culture and Politics II. 83-94.
Glatzer, B. 2002. The Pashtun tribal system. Concept of tribal society. 5: 265-282.
Goldstein, D. B., L. A. Zhivotovsky, K. Nayar, A. R. Linares, L. L. Cavalli-Sforza and
W. M. Feldman. 1996. Statistical properties of the variation at linked
microsatellite loci: implications for the history of human Y chromosomes. Mol
Biol Evol. 13(9): 1213-1218.
Gomes, V., P. Sanchez-Diz, A. Amorim, A. Carracedo and L. Gusmao. 2010. Digging
deeper into East African human Y chromosome lineages. Hum Genet. 127:
603-613.
Gomez-Robles, A., M. Martinon-Torres, J. M. B. de Castro, L. Prado, S. Sarmiento
and J. L. Arsuaga. 2008. Geometric morphometric analysis of the crown
morphology of the lower first premolar of hominins, with special attention to
Pleistocene Homo. J Hum Evol. 55 (4): 627-638.
Gomez-Robles, A., M. Martinon-Torres, J. M. B. de Castro, A. Margvelashvili, M.
Bastir, J. L. Arsuaga, A. Perez-Perez, F. Estebaranz and L. M. Martinez. 2007.
A geometric morphometric analysis of hominin upper first molar shape. J
Hum Evol. 53(3): 272-285.
Gooch, P. 1992. Transhumant pastoralism in Northern India: the Gujar case.
Nomadic Peoples. pp. 84-96.
Government of Pakistan. 2002. District census report Swat: Population Census
Organization, Government of Pakistan, Islamabad.
Gower, J. C. 1966. Some distance properties of latent root and vector methods used
in multivariate analysis. Biometrika. 53: 325-338.
Green, R., J. Krause, W. A. Briggs, T. Maricic, U. Stenzel, M. Kircher, N. Patterson,
H. Li, W. Zhai, M. Hsi-Yang Fritz, F. N. Hansen, Y. E. Durand, A-S.
Malaspinas and D. J. Jensen. 2010. A Draft Sequence of the Neanderthal Genome.
Science. 328: 710-722.
Gregory, W. K. 1916. Studies in the evolution of the Primates. Part I. Cope-Osborn
theory of trituberculy and the ancestral molar patterns of the Primates. Part II.
Phylogeny of recent and extinct anthropoids, with special reference to the
origin of man. Bull Am Mus Nat Hist. 35: 259-355.
Gregory, W. K. 1922. The origin and evolution of the human dentition. Baltimore:
Williams and Wilkins.
Gregory, W. K. 1926. Paleontology of the human dentition. Am J Phys Anthropol. 9:
401-426.
Grierson, G.A. 1903-1928. Linguistic Survey of India (11 volumes). Calcutta: Office of
the Superintendent Government Printing press, India.
Grimes, B. F. 1992. Ethnologue: languages of the world. Summer Institute of
Linguistics, Dallas.
Grimes, B. F. and J. E. Grimes. 2000. Languages of the world. 14th ed. Ethnologue (1).
Dallas: SIL International.
Guimaraes-Ferreira, L. 2014. Role of the phosphocreatine system on energetic
homeostasis in skeletal and cardiac muscles. Einstein. 12(1): 126-131.
Guttman, L. 1968. A general nonmetric technique for finding the smallest coordinate
space for a configuration of points. Psychometrika. 33(4): 469-506.
Haak, W., I. Lazaridis, N. Patterson, N. Rohland, S. Mallick, B. Llamas, G. Brandt, S.
Nordenfelt, E. Harney, K. Stewardson and Q. Fu. 2015. Massive migration
from the steppe was a source for Indo-European languages in Europe. Nature.
522(7555): 207-211.
Haber, M., D. E. Platt, M. A. Bonab, S. C. Youhanna, D. F. Soria-Hernanz, B.
Martínez-Cruz, B. Douaihy, M. Ghassibe-Sabbagh, H. Rafatpanah, M. Ghanbari, J.
Whale, O. Balanovsky, R. S. Wells, D. Comas, C. Tyler-Smith, P. A. Zalloua and
the Genographic Consortium. 2012. Afghanistan’s ethnic groups share a
Y-chromosomal heritage structured by historical events. PLoS One. 7: e34288.
Haber, M., M. Mezzavilla, Y. Xue and C. Tyler-Smith. 2016. Ancient DNA and the
rewriting of human history: be sparing with Occam's razor. Genome Biol. 17: 1–8.
Haeussler, A. M. 1989. Morphological and Metrical Comparison of San and Central
Sotho Dentitions from Southern Africa. Am J Anthropol. 78: 115-122.
Hamayun, M. 2005. Ethnobotanical profile of Utror and Gabral valleys, district Swat,
Pakistan. Ethnobotanical Leaflets. (1): 1-37. (http://www.siu.edu/~ebl/).
Hammer, M. F., A. E. Woerner, F. L. Mendez, J. C. Watkins and J. D. Wall. 2011.
Genetic evidence for archaic admixture in Africa. P Natl Acad Sci. 108:
15123–28.
Hammer, M. F., M. T. Karafet, J. A. Redd, H. Jarjanazi, S. Santachiara-Benerecetti, H.
Soodyall and L. S. Zegura. 2001. Hierarchical patterns of global human
Y-chromosome diversity. Mol Biol Evol. 18: 1189-1203.
Harris, E. F. 1977. Anthropologic and genetic aspects of the dental morphology of
Solomon Islanders, Melanesia. PhD Dissertation, Arizona State
University, Tempe.
Harvati, K., C. Stringer, R. Grun, M. Aubert, P. Allsworth-Jones and C. A. Folorunso.
2011. The Later Stone Age calvaria from Iwo Eleru, Nigeria: morphology and
chronology. PLoS One. 6: e24024.
Hasegawa, M. and S. Horai. 1991. Time of the Deepest Root for Polymorphism in
Human Mitochondrial DNA. J mol evol. 32:37-42.
Hassanali, J. 1982. Incidence of Carabelli's trait in Kenyan Africans and Asians. Am J
Phys Anthropol. 59(3): 317-319.
Hayat, S., T. Akhtar, M. H. Siddiqi, A. Rakha, N. Haider, M. Tayyab, G. Abbas, A.
Ali, S. Y. A. Bokhari and M. A. Tariq. 2015. Mitochondrial DNA control region
sequences study in Saraiki population from Pakistan. Leg Med (Tokyo).
17: 140–144.
Hazrat, A., J. Shah, M. Ali and I. Iqbal. 2007. Medicinal value of Ranunculaceae of
Dir Valley. Pak J Bot. 39(4): 1037-1044.
Heinz, T. M. 2015. Genetic ancestry of the Bolivian population: PhD thesis.
Universidad de Santiago de Compostela.
Hellman, M. 1928. Racial characters in human dentition part I. A racial distribution
of the Dryopithecus pattern and its modifications in the lower molar teeth of
man. Proc Am Philos Soc. 67(2): 157-174.
Hemphill, B. E. 2009a. Bioanthropology of the Hindu Kush High Lands: A Dental
Morphology Investigation. Pak Herit. 1: 19-36.
Hemphill, B. E. 2009b. The Swatis of Northern Pakistan-emigrants from Central Asia
or colonists from peninsular India? A dental morphometric investigation. Am
J Phys Anthropol. pp 147-147.
Hemphill, B. E. 2013. Grades, gradients, and geography: a dental morphometric
approach to the population history of South Asia. In: Scott GR, Irish JD,
editors. Anthropological Perspectives on Tooth Morphology: Genetics,
Evolution, Variation. Cambridge: Cambridge University Press. pp. 341-387.
Hemphill, B. E., I. Ali and A. Hameed. 2010. Dental Anthropology of the Madaklasht I:
A description and Analysis of Variation in Morphological Features of the
Permanent Tooth Crown. Pak Herit. 2: 1-33.
Hemphill, B. E., J. R. Lukacs and S. R. Walimbe. 2000. Ethnic Identity, Biological
History and Dental Morphology: Evaluating the Indigenous Status of
Maharashtra’s Mahars. Antiquity. 74: 671-681.
Hemphill, B.E. 2012. Tooth Size, Crown Complexity, and the Utility of Combining
Archaeologically-derived Samples with Living Samples for Reconstruction of
Population History. Pak Herit. 4: 49-85.
Hemphill, B.E., I. Ali, S. Blaylock and N. Willits. 2008. Are the Kho an Indigenous
Population of the Hindu Kush?: A Dental Morphometric Approach. In: M.
Tosi and D. Frenez (eds.), South Asian Archaeology 2008, Volume I. Oxford:
Archaeopress-BAR, pp. 127-137.
Hemphill, B.E. 2012. The Awans of Northern Pakistan: Emigrants from Central Asia,
Arabs from Western Afghanistan, or Colonists from Peninsular India? A
Dental Morphometric Investigation. Am J Phys Anthropol (Suppl 54): 163.
Henn, B. M., L. R. Botigue, S. Peischl, I. Dupanloup, M. Lipatov, B. K. Maples and L.
Excoffier. 2016. Distance from sub-Saharan Africa predicts mutational load in
diverse human genomes. Proc Nat Acad Sci USA. 113(4): E440-E449.
Higgins, D and J. J. Austin. 2013. Teeth as a source of DNA for forensic identification
of human remains: a review. Sci Justice. 53(4): 433-441.
Hillson, S. 1979. Diet and Dental Disease. World Archeol. 2: 147-162.
Hillson, S. 1996. Dental Anthropology, Cambridge. Cambridge University Press.
Hodgson, J and T. Disotell. 2008. No Evidence of a Neanderthal Contribution to
Modern Human Diversity. Genome Biol. 9(2): 206.
Holland, M. M and T. J. Parsons. 1999. Mitochondrial DNA sequence analysis -
validation and use for forensic casework. Forensic Sci Rev. (11): 21-50.
Hrdlicka, A. 1920. Shovel-shaped teeth. Am J Anthropol. 3: 429-465.
Hrdlicka, A. 1921. Further studies of tooth morphology. Am J of Phys Anthropol. 4(2):
141-176.
Hrdlicka, A. 1924. New data on the teeth of early man and certain fossil European
apes. Am J Anthropol. 7: 109-132.
Hsu, J. W., P. Tsai, T. H. Hsiao, H. P. Chang, L. M. Lin, K. Liu, H. S. Yu and D.
Ferguson. 1999. Ethnic dental analysis of shovel and Carabelli’s traits in a
Chinese population. Aust Dent J. 44(1): 40-45.
Hudjashov, G., T. Kivisild, P. Underhill, P. Endicott, J. Sanchez, A. Lin, P. Shen, P.
Oefner, C. Renfrew, R. Villems and P. Forster. 2007. Revealing the prehistoric
settlements of Australia by Y chromosome and mtDNA Analysis. Proc Natl
Acad Sci USA. 104: 8726-8730.
Hughes, J. F., H. Skaletsky, G. L. Brown, T. Pyntikova, T. Graves, S. R. Fulton, S.
Dugan, Y. Ding, J. C. Buhay, C. Kremitzki and Q. Wang. 2012. Strict
evolutionary conservation followed rapid gene loss on human and rhesus Y
chromosomes. Nature. 483: 82-86.
Hussain, A. A. 1962. The Story of Swat as told by the Founder Miangul Abdul
Wadud Badshah Sahib to Muhammad Arif Khan. Ferozsons Ltd, Peshawar.
Hussain, J. 1997. A history of the peoples of Pakistan towards independence. Oxford
University Press, Karachi, Pakistan.
Hutchison, C. A., J. E. Newbold, S. S. Potter and M. H. Edgell. 1974. Maternal
inheritance of mammalian mitochondrial DNA. Nature. 251: 536–8.
Ilyas, M., J.S. Kim, J. Cooper, Y. A. Shin, H. M. Kim, Y. S. Cho, S. Hwang, H. Kim, J.
Moon, O. Chung and J. Jun. 2015. Whole genome sequencing of an ethnic
Pathan (Pakhtun) from the north-west of Pakistan. BMC Genomics. 16(1): 172.
Ingalls, D. H. 1976. Kalidasa and the Attitudes of the Golden Age. J Am Ori Soc.
pp. 15-26.
Ingman, M., H. Kaessmann, S. Paabo and U. Gyllensten. 2000. Mitochondrial
genome variation and the origin of modern humans. Nature. 408: 708–713.
International Crisis Group. 2006. Pakistan's Tribal Areas: Appeasing the Militants.
International Crisis Group.
Irish, J. D and G. R. Scott. 2016. Crown wear: identification and categorization. In: A
Companion to Dental Anthropology (ed). New York: Wiley Blackwell.
pp. 415-432.
Irish, J.D., D. Guatelli-Steinberg, S.S. Legge, D.J. de Ruiter and L.R. Berger. 2013.
Dental morphology and the phylogenetic “place” of Australopithecus sediba.
Science. 340(6129): 1233062.
Jakobsson, M., S. W. Scholz, P. Scheet, J. R. Gibbs, J. M. VanLiere, H. C. Fung and Z.
A. Szpiech. 2008. Genotype, haplotype and copy-number variation in
worldwide human populations. Nature. 451: 998-1003.
Jiang, T., C. C. Hou, Y. Z. She and X. W. Yang. 2013. The SOX gene family: function
and regulation in testis determination and male fertility maintenance. Mol Biol
Rep. 40: 2187-2194.
Jobling, M. A. and C. Tyler-Smith. 2003. The human Y chromosome: an evolutionary
marker comes of age. Nat Rev Genet. 4: 598-612.
Jobling, M. A., M. E. Hurles and C. Tyler-Smith. 2004. Human evolutionary genetics:
Origins, Peoples and Disease. New York, Garland Publishing.
Johanson, D. 2001. Origins of Modern Humans: Multiregional or Out of Africa.
American Institute of Biological Sciences.
Jorde, L. B., W. S. Watkins and M. J. Bamshad. 2001. Population genomics: a bridge
from evolutionary history to genetic medicine. Hum Mol Genet. 10: 2199-207.
Kaifu, Y., M. Izuho, T. Goebel, H. Sato and A. Ono. 2015. Emergence and diversity of
modern human behavior in Paleolithic Asia. College Station, TX: Texas A&M
University Press.
Karafet, T. M., F. L. Mendez, M. B. Meilerman, P. A. Underhill and S. L. Zegura.
2008. New binary polymorphisms reshape and increase resolution of the
human Y chromosomal haplogroup tree. Genome Res. 18: 830-838.
Kareem, M. A., O. A. Hussein and H. I. Hameed. 2015. Y-chromosome short tandem
repeat, typing technology, locus information and allele frequency in different
population: A review. Afr J Biotechnol. 14(27): 2175-2178.
Katoh, K and D. M. Standley. 2013. MAFFT multiple sequence alignment software
version 7: improvements in performance and usability. Mol Biol Evol. 30(4):
772-780.
Kaul, V and S. Prakash. 1981. Morphological features of Jat dentition. Am J of Phys
Anthropol. 54(1): 123-127.
Kennedy, K. A. R., N. C. Lovell and C. B. Burrow. 1986. Mesolithic Human Remains
from the Gangetic Plains: Sarai Nahar Rai (Occasional Papers and Theses of
the South Asia Program number 10), Cornell University, Ithaca.
Kennedy, K. A. R., J. Chiment, T. Disotell and D. Meyers. 1984. Principal-components
Analysis of Prehistoric South Asian Crania. Am J of Phys Anthropol. 64(2):
105-118.
Kenoyer, J. M. 2005. Culture Change during the Late Harappa Period at Harappa. In:
Bryant, E. F. and L. L. Patton (Eds.). The Indo-Aryan Controversy: Evidence
and Inference in Indian History (pp. 21-49). London: Routledge.
Ketmaier, V and C. Bernardini. 2005. Structure of the mitochondrial control region of
the Eurasian Otter (Lutra lutra; Carnivora, Mustelidae): Patterns of genetic
heterogeneity and implications for conservation of the species in Italy. J Hered.
96(4): 318-328.
Khan, F. 2013b. Recent discovery of Petroglyphs at Parwak, District Chitral,
Pakistan. J Asian Civil. 36: 101–109.
Khan, M. N. 2013a. The early arrival of Muslims in Ancient Gandhara study based
on numismatic evidence from Kashmir Smast. Gandharan Stud. 7:115–119.
Khan, T. M. 2008. The Tribal Areas of Pakistan, a Contemporary Profile. Sang-e-Meel
Publications.
Khattak, M. H. K. 1997. Buner, The forgotten Part of Ancient Uddiyana. Noble Art
Press Karachi. p.43
Kieser, J. A and C. B. Preston. 1981. The dentition of the Lengua Indians of Paraguay.
Am J Phys Anthropol. 55(4): 485-490.
Kieser, J. A. 1984. Dental morphology of a Griqua skeletal population. Anthropol
Anz. 93-99.
Kivisild, T. 2015. Maternal ancestry and population history from whole
mitochondrial genomes. Investig Genet. 6(1): 3.
Kivisild, T., J. M. Bamshad, K. Kaldma, M. Metspalu, E. Metspalu, M. Reidla, S. Laos,
J. Parik, S. W. Watkins, E. M. Dixon, S. S. Papiha, S. S. Mastana, R. M.
Mir, V. Ferak and R. Villems. 1999. Deep Common Ancestry of Indian and
Western-Eurasian Mitochondrial DNA Lineages. Curr Biol. 9: 1331-1334.
Kivisild, T., S. Rootsi, M. Metspalu, S. Mastana and K. Kaldma. 2003. The genetic
heritage of the earliest settlers persists both in Indian tribal and caste
populations. Am J Hum Genet. 72: 313-332.
Klein, R. 1999. The Human Career: Human Biological and Cultural Origins.
University of Chicago Press.
Klein, R. G. 2008. Out of Africa and the Evolution of Human Behaviour. Evol
Anthropol. 17:267-281.
Kloss-Brandstatter, A., D. Pacher, S. Schonherr, H. Weissensteiner, R. Binna, G.
Specht and F. Kronenberg. 2011. HaploGrep: a fast and reliable algorithm
for automatic classification of mitochondrial DNA haplogroups. Hum Mutat.
32(1): 25–32.
Kraus, B. S and M. Fur. 1953. Lower first premolars. J Dent Res. 32: 554-564.
Kraytsberg, Y., M. Schwartz, A. T. Brown, K. Ebralidse, S. W. Kunz, A. D. Clayton, J.
Vissing and K. Khrapko. 2004. Recombination of human mitochondrial
DNA. Science. 304(5673): 981.
Krithika, S., S. Maji and T.S. Vasulu. 2009. A microsatellite study to disentangle the
ambiguity of linguistic, geographic, ethnic and genetic influences on tribes of
India to get a better clarity of the antiquity and peopling of South Asia. Am J
of Phys Anthropol. 139(4): 533-546.
Kruskal, J. B. 1964. Multidimensional scaling by optimizing goodness of fit to a
nonmetric hypothesis. Psychometrika. 29(1): 1-27.
Kumar, S., R. Reddy Ravuri, P. Koneru, P. B. Urade, N. B. Sarkar, A. Chandrasekar
and R. V. Rao. 2009. Reconstructing Indian-Australian Phylogenetic Link.
Evol Biol. 9: 173-177.
Kuzmina, E. E and V. H. Mair. 2008. The prehistory of the Silk Road: University of
Pennsylvania Press.
Kayser, M. 2010. The human genetic history of Oceania: near and remote views of
dispersal. Curr Biol. 20(4): 194-201.
Lachance, J., B. Vernot, C. C. Elbers, B. Ferwerda, A. Froment, M. J. Bodo, G. Lema,
W. Fu, B. T. Nyambo, R. T. Rebbeck and K. Zhang. 2012. Evolutionary history
and adaptation from high-coverage whole-genome sequences of diverse
African hunter-gatherers. Cell. 150: 457–69.
Lahn, B. T and C. D. Page. 1997. Functional coherence of the human Y chromosome.
Science. 278(5338): 675-680.
Lahr, M. M and R. Foley. 1994. Multiple dispersals and modern human origins. Evol
Anthropol: Issues, News, and Reviews. 3(2): 48-60.
Lalata., Prasada and Pandeya. 1971. Sun-worship in ancient India. Motilal

Banarasidass. p. 245.
Landsteiner, K. 1901. Ueber Agglutinationserscheinungen normalen menschlichen
Blutes. Wien Klin Wchnschr. 14: 1132-1134.
Lapidus, I. M. 2002. A history of Islamic societies: Cambridge University Press.
Larmuseau, M. H., A. Van Geystelen, M. Kayser, M. van Oven and R. Decorte. 2015.
Towards a consensus Y-chromosomal phylogeny and Y-SNP set in forensics
in the next-generation sequencing era. Forensic Sci Int- Genet. 15: 39-42.
Lee, E. Y., J. K. Shin, A. Rakha, E. J. Sim, J. M. Park, Y. N. Kim, I. W. Yang and Y. H.
Lee. 2014. Analysis of 22 Y chromosomal STR haplotypes and Y haplogroup
distribution in Pathans of Pakistan. Forensic Sci Int-Genet. 11: 111-116.
Li, Z., C. J. Haines and Y. Han. 2008. “Micro-deletions” of the human Y chromosome
and their relationship with male infertility. J Genet Genomics. 35(4):193-199.
Lightowlers, R., P. Chinnery, D. Turnbull and N. Howell. 1997. Mammalian
mitochondrial genetics: heredity, heteroplasmy and disease. Trends Genet.
13(11): 450-5.
Lindholm, C. 1982. Generosity and Jealousy: The Swat Pukhtun of Northern
Pakistan. Columbia University Press, New York.
Liu, H. Y., C. P. Liao, T. K. Chuang and C. M. Kao. 2011. Mitochondrial targeting of
human NADH dehydrogenase (ubiquinone) flavoprotein 2 (NDUFV2) and its
association with early-onset hypertrophic cardiomyopathy and
encephalopathy. J Biomed Sci. 18(1): 1.
Liu, H., F. Prugnolle, A. Manica and F. Balloux. 2006. A Geographically Explicit
Genetic Model of Worldwide Human-Settlement History. Am J Hum Genet.
79: 230-237.
Liu, N and H. Zhao. 2006. A non-parametric approach to population structure
inference using multilocus genotypes. Hum Genomics. 2(6): 253.
Loogvali, E. L., U. Roostalu, B. A. Malyarchuk, M. V. Derenko, T. Kivisild, E.

Metspalu, K. Tambets, M. Reidla, H. V. Tolk, J. Parik and E. Pennarun. 2004.
Disuniting uniformity: a pied cladistic canvas of mtDNA haplogroup H in
Eurasia. Mol Biol Evol. 21(11): 2012-2021.
Lucas, P. W., J. P. Constantino and A. B. Wood. 2008. Inferences regarding the diet of
extinct hominins: structural and functional trends in dental and mandibular
morphology within the hominin clade. J Anat. 212: 486–500
Lukacs, J. R., B. E. Hemphill and S. R. Walimbe. 1998. Are Mahars Autochthones of
Maharashtra? Dental Morphology and Population History in South Asia.
Human Dental Development, Morphology, and Pathology: A Tribute to Albert
A. Dahlberg. pp. 119–53.
Lukacs, J. R and B. E. Hemphill. 1991. Dental anthropology of prehistoric

Baluchistan: a morphometric approach to the peopling of South Asia. In:
Kelley, M.A., Larsen, C.S. (Eds.), Advances in Dental Anthropology. Wiley-
Liss, New York. pp. 77–119.
Lukacs, J. R. 1986. Dental morphology and odontometrics of early agriculturalists
from Neolithic Mehrgarh, Pakistan. In: Russell, D. E., J. P. Santoro, D.
Sigogneau-Russell (Eds.). Teeth Revisited: Proceedings of the VIIth
International Symposium on Dental Morphology. Mémoires du Muséum
national d'Histoire naturelle (série C). Paris. 53: 285–303.
Lukacs, J. R. 1987. Biological relationships derived from morphology of permanent

teeth: recent evidence from prehistoric India. Anthrop Anz. 45: 97–116.
Lukacs, J. R and B. E. Hemphill. 1992. Chapter V: Dental Anthropology. In: Kennedy,
K.A.R., et al. (Eds.), Human Skeletal Remains from Mahadaha: A Gangetic
Mesolithic Site. South Asia Occasional Papers and Theses, No. 11. Cornell
University, Ithaca, pp.157–270.
Lukacs, J. R. 1983. Dental anthropology and the origins of two Iron Age populations
from northern Pakistan. Homo Gottingen. 34(1): 1-15.
Macaulay, V., C. Hill, A. Achilli, C. Rengo, D. Clarke, W. Meehan, J. Blackburn, O.

Semino, R. Scozzari, F. Cruciani, A. Taha, N. Kassim Shaari, J. Maripa Raja, P.
Ismail, Z. Zainuddin, W. Goodwin, D. Bulbeck, H. J. Bandelt, S. Oppenheimer,
A. Torroni and M. Richards. 2005. Single, Rapid Coastal Settlement of Asia
Revealed by Analysis of Complete Mitochondrial Genomes. Science. 308:1034-
1036.
Macaulay, V., M. Richards, E. Hickey, E. Vega, F. Cruciani, V. Guida, R. Scozzari, B.

Bonne-Tamir, B. Sykes and A. Torroni. 1999. The emerging tree of west
Eurasian mtDNAs: a synthesis of control-region sequences and RFLPs. Am J
Hum Genet. 64: 232-249.
Maji, S., S. Krithika and T. S. Vasulu. 2009. Phylogeographic distribution of
mitochondrial DNA macrohaplogroup M in India. J Genet. 88(1): 127-139.
Maloney, C. 1974. Peoples of South Asia. Holt, Rinehart and Winston, New York.
Marado, L. M and V. Campanacho. 2013. Carabelli’s trait: definition and review of a

commonly used dental non-metric variable. Cadernos do Geevh. 2(1): 24-39.
Marbaniang, D. 2015. History of Hinduism: Pre-vedic and Vedic Age. Raleigh, NC:
Lulu.com.
Marean, C. W. 2015. An Evolutionary Anthropological Perspective on Modern

Human Origins. Annu Rev Anthropol. 44: 533–56
Margulies, M., M. Egholm, W. E. Altman, S. Attiya, J. S. Bader, L. A. Bemben, J.
Berka, M. S. Braverman, Y. J. Chen, Z. Chen and S. B. Dewell. 2005. Genome
sequencing in microfabricated high-density picolitre reactors.
Nature. 437(7057): 376-380.
Margvelashvili, A., C. P. E. Zollikofer, D. Lordkipanidze, T. Peltomaki and M. S.

Ponce de Leon. 2013. Tooth wear and dentoalveolar remodeling are key
factors of morphological variation in the Dmanisi mandibles. Proc Natl Acad
Sci U S A. 110(43): 17278-17283
Marjanovic, D and D. Primorac. 2013. Forensic Genetics: Theory and Application, 2nd
ed. [Bosnian] Sarajevo: Lelo, d.o.o.
Matsumura, H., H. Ishida, T. Amano, H. Ono and M. Yoneda. 2009. Biological
affinities of Okhotsk-culture people with East Siberians and Arctic people
based on dental characteristics. Anthropol Sci. 117(2): 121-132.
Maula, E. 1993. Hampaat-menneisyyden tietopankki. Hammaslaakarilehti. 7(93):
416-418.
Mayhall, J. T., S. R. Saunders and P. L. Belier. 1982. The dental morphology of North
American whites: a reappraisal. Teeth: form, function, and evolution New
York: Columbia University Press. p. 245-258.
McAlpin, D.W. 1981. Proto-Elamo-Dravidian: The evidence and its implications.

Transactions of the American Philosophical Society. 71(3): 1-155.
McElreavey, K and L. Quintana-Murci. 2005. A population genetics perspective of the
Indus Valley through uniparentally-inherited markers. Ann Hum Biol. 32: 154–162.
McKenzie, M and T. M. Ryan. 2010. Assembly factors of human mitochondrial
complex I and their defects in disease. IUBMB Life. 62(7): 497-502.
McMahon, A. H and A. D. G. Ramsay. 1901. Report on the tribes of Dir, Swat and
Bajawar together with the Uthmankheil and Sam Ranezai. Saeed Book Bank,
Peshawar.
Mehdi, S., R. Qamar, Q. Ayub, S. Khaliq and A. Mansoor. 1999. The Origins of
Pakistani Populations. Genomic Diversity: Springer. pp. 83-90.
Mellars, P. 2006. Going East: New Genetic and Archaeological Perspectives on the
Modern Human Colonization of Eurasia. Science. 313: 796-800.
Mendez, F. L., T. Krahn, B. Schrack, A. M. Krahn, K. R. Veeramah, A. E. Woerner,
F. L. M. Fomine, N. Bradman, M. G. Thomas, T. M. Karafet and M. F.
Hammer. 2013. An African American paternal lineage adds an extremely
ancient root to the human Y chromosome phylogenetic tree. Am J Hum Genet.
92: 454–59.
Metspalu, M., T. Kivisild, E. Metspalu, J. Parik, G. Hudjashov, K. Kaldma, P. Serk,

M. Karmin, D. M. Behar, M. T. P. Gilbert, P. Endicott, S. Mastana, S. S. Papiha,
K. Skorecki, A. Torroni and R. Villems. 2004. Most of the extant mtDNA
boundaries in South and Southwest Asia were likely shaped during the initial
settlement of Eurasia by anatomically modern humans. BMC Genet. 5: 26.
Meyer, M., Q. Fu, A. Aximu-Petri, I. Glocke and B. Nickel. 2014. A mitochondrial

genome sequence of a hominin from Sima de los Huesos. Nature. 505:403–6.
Michaels, G. S., W. W. Hauswirth and J. P. Laipis. 1982. Mitochondrial DNA copy
number in bovine oocytes and somatic cells. Dev Biol. 94: 246–51.
Mihailidis, S., G. Scriven, M. Khamis and G. Townsend. 2013. Prevalence and

patterning of maxillary premolar accessory ridges (MxPARs) in several
human populations. Am J Phys Anthropol. 152(1): 19-30.
Mikkelsen, M., E. Rockenbauer, E. Sorensen, M. Rasmussen, C. Borsting and N.

Morling. 2008. A Mitochondrial DNA SNP Multiplex Assigning Caucasians
into 36 Haplo- and Subhaplogroups. Forensic Sci Int- Genet. 1: 287-289.
Miller, D. 1985. Ideology and the Harappan civilization. J Anthropol Archaeol. 4:
34-71.
Mirabal, S., T. Varljen, T. Gayden, M. Regueiro, S. Vujovic, D. Popovic, M. Djuric, O.
Stojkovic and J. R. Herrera. 2010. Human Y-chromosome short tandem
repeats: a tale of acculturation and migrations as mechanisms for the diffusion
of agriculture in the Balkan Peninsula. Am J Phys Anthropol. 142: 380–390.
Moorrees, C. F. A. 1957. The Aleut dentition. Cambridge, Mass. Harvard University

Press.
Moreno, F., M. S. Moreno, A. C. Diaz, A. E. Bustos and V. J. Rodriguez. 2004.

Prevalence and variability of eight non-metric dental traits in students of Cali,
Colombia. Col Med. 35: 17-23.
Morgenstierne, G. 1932. Report on a linguistic mission to north-western India: Indus

Publication.
Morris, D. H. 1975. Bushmen maxillary canine polymorphism. S Afr J Sci. 71: 333-335.
Morris, D. H., A. A. Dahlberg and S. Glasstone-Hughes. 1978. The Uto-Aztecan
premolar: The anthropology of a dental trait. In: P. M. Butler, K. A. Joysey
(eds.), Development, Function and Evolution of Teeth. London: Academic
Press. pp. 69-79.
Murray, J. W. 1899. A Dictionary of the Pathan Tribes on the North-west Frontier of

India. Office of the Superintendent, Government Print, India.
Navarro-Costa, P. 2012. Sex, rebellion and decadence: the scandalous evolutionary

history of the human Y chromosome. Biochim Biophys Acta. 1822: 1851-1863.
Navarro-Costa, P., E. C. Plancha and J. Goncalves. 2010. Genetic dissection of the

AZF regions of the human Y chromosome: thriller or filler for male
(in)fertility?. J biomed and biotech. 2010:1-18.
Nei, M. 1995. Genetic Support for the out-of-Africa theory of human evolution. Proc
Nat Acad Sci USA. 92: 6720-6722.
Nesheva, D. V. 2014. Aspects of ancient mitochondrial DNA analysis in different

populations for understanding human evolution. Balkan J Med Genet. 17(1): 5-
14.
Newcomb, L. 1986. The Islamic Republic of Pakistan: country profile. Int Demogr. 5:
1-8.
Nichol, C. R and C. G. Turner. 1986. Intra- and interobserver concordance in
classifying dental morphology. Am J Phys Anthropol. 69: 299-315.
Nichol, C. R., C. G. Turner and A. A. Dahlberg. 1984. Variation in the convexity of
the human maxillary incisor labial surface. Am J Phys Anthropol. 63: 361-370.
Nijjar, B. S. 2008. Origins and History of Jats and Other Allied Nomadic Tribes of
India: 900 BC-1947 AD. Atlantic Publishers and Distributors.
Nitecki, M. H and D. V. Nitecki (eds). 1994. Origins of Anatomically Modern

Humans. New York. Plenum Press.
Novelletto, A. 2007. Y chromosome variation in Europe: Continental and local

processes in the formation of the extant gene pool. Ann Hum Biol. 34: 139-172.
Novembre, J. and S. Ramachandran. 2011. Perspectives on human population

structure at the cusp of the sequencing era. Ann Rev Genom Hum G. 12: 245-
274.
Nusser, M and W. B. Dickore. 2002. A tangle in the triangle: vegetation map of the
eastern Hindukush (Chitral, northern Pakistan). Erdkunde. 56:37–59.
Olofsson, J. 2015. Forensic and Population Genetic Studies of the Human Y

Chromosome: PhD Thesis. Faculty of Health and Medical Sciences, University
of Copenhagen.
Olofsson, J. K., H. S. Mogensen, A. Buchard, C. Børsting and N. Morling. 2015.

Forensic and population genetic analyses of Danes, Greenlanders and Somalis
typed with the Yfiler® Plus PCR amplification kit. Forensic Sci Int- Genet. 16:
232-236.
Olofsson, J. K., V. Pereira, C. Børsting and N. Morling. 2015. Peopling of the North
Circumpolar Region–insights from Y chromosome STR and SNP typing of
Greenlanders. PloS one. 10: e0116573.
Oppenheimer, S. 2012. Out-of-Africa, the peopling of continents and islands: tracing
uniparental gene trees across the map. Phil Trans R Soc B. 367: 770-784.
O'Rourke, D. H and J. A. Raff. 2010. The human genetic history of the Americas: the
final frontier. Curr Biol. 20(4): R202-R207.
Osborn, H. F. 1888. The evolution of mammalian molars to and from the

tritubercular type. The Ame Nat. 22(264): 1067-1079.
Osborn, H. F. 1907. Evolution of mammalian molar teeth to and from the triangular
type. New York: Macmillan.
Owen, R. B. 1845. Odontography or a treatise on the comparative anatomy of the

teeth: their physiological relations, mode of development and microscopic
structure in vertebrate animals. London: Hyppolyte Bailliere.
Pamjav, H., T. Feher, E. Nemeth and Z. Padar. 2012. Brief communication: New
Y‐chromosome binary markers improve phylogenetic resolution within
haplogroup R1a1. Am J Phys Anthropol. 149: 611-615.
Parishad and G. Bharatiya. 1996. Gurjara aura Unaka Itihasa me Yogadana Vishaya
para Prathama Itihasa Sammelana. The Packard Humanities Institute. Pp. 34–
65. Retrieved, 2007-05-31.
Pedersen, P. O. 1949. The East Greenland Eskimo dentition: numerical variations and
anatomy, a contribution to comparative ethnic odontography. Copenhagen:
Medd Gronl. 142: 1-244.
Perveen, R., Z. Rahman, M. S. Shahzad, M. Israr and M. Shafique. 2014. Y-STR

haplotype diversity in Punjabi population of Pakistan. Forensic Sci Int-Gen. 9:
e20.
Petraglia, M. D., A. Alsharekh, P. Breeze, C. Clarkson and R. Crassard. 2012.

Hominin dispersal into the Nefud desert and Middle Palaeolithic settlement
along the Jubbah palaeolake, northern Arabia. PLoS One. 7: e49840.
Piko, L and L. Matsumoto. 1976. Number of mitochondria and some properties of

mitochondrial DNA in the mouse egg. Dev Biol. 49:1–10.
Political and Secret Department. 1933. Who is who in Dir, Swat and Bajour Agency.
Government of India Press, New Delhi. pp. 1-30. (www.Mahraka.com).
Poznik, G. D., B. M. Henn, M. C. Yee, E. Sliwerska, G. M. Euskirchen, A. A. Lin, M.

Snyder, L. Quintana-Murci, J. M. Kidd, P. A. Underhill and C. D. Bustamante.
2013. Sequencing Y chromosomes resolves discrepancy in time to common
ancestor of males versus females. Science. 341: 562–65.
Pretty, I. A and D. Sweet. 2001. A look at forensic dentistry part 1: the role of teeth in
determination of human identity. Brit Dent J. 190: 359-365.
Prieto, L., B. Zimmermann, A. Goios, A. Rodriguez-Monge, G. G. Paneto, C. Alves,

A. Alonso, C. Fridman, S. Cardoso, G. Lima and J. M. Anjos. 2011. The GHEP–
EMPOP collaboration on mtDNA population data—A new resource for
forensic casework. Forensic Sci Int-Genet. 5(2): 146-151.
Prufer, K., F. Racimo, N. Patterson, F. Jay, S. Sankararaman, S. Sawyer, A. Heinze, G.

Renaud, H. P. Sudmant, C. De Filippo and H. Li. 2014. The complete genome
sequence of a Neanderthal from the Altai Mountains. Nature. 505: 43–49.
Przeworski, M., R. R. Hudson and A. Di Rienzo. 2000. Adjusting the focus on human
variation. Trends Genet. 16: 296-302.
Purps, J., S. Siegert, S. Willuweit, M. Nagy and C. Alves. 2014. A global analysis of Y-
chromosomal haplotype diversity for 23 STR loci. Forensic Sci Int-Gen. 12: 12-
23.
Qamar, R., Q. Ayub, A. Mohyuddin, A. Helgason, K. Mazhar, A. Mansoor, T. Zerjal,
C. Tyler-Smith and S. Q. Mehdi. 2002. Y-chromosomal DNA variation in
Pakistan. Am J Hum Genet. 70: 1107–1124.
Qamar, R., Q. Ayub, S. Khaliq, A. Mansoor, T. Karafet, S. Q. Mehdi and M. F.
Hammer. 1999. African and Levantine origins of Pakistani YAP+ Y
chromosomes. Hum Biol. 71: 745–755.
Qasmi, A. G. 1939. Tarikh-i-Riyasat-i-Swat. Peshawar: Hamidia Press. 27-29.
Quintana-Murci, L., O. Semino, H. J. Bandelt, G. Passarino and K. McElreavey. 1999.
Genetic evidence of an early exit of Homo sapiens sapiens from Africa through
eastern Africa. Nat Genet. 23: 437-441.
Quintana-Murci, L., R. Chaix, R.S. Wells, D.M. Behar, H. Sayar, R. Scozzari, C.

Rengo, N. Al-Zahery, O. Semino, A. S. Santachiara-Benerecetti and A. Coppa.
2004. Where west meets east: the complex mtDNA landscape of the southwest
and Central Asian corridor. Am J Hum Genet. 74(5): 827-845.
Raff, J. A., D. A. Bolnick, J. Tackney and D. H. O'Rourke. 2011. Ancient DNA
perspectives on American colonization and population history. Am J Phys
Anthropol. 146: 503-514.
Rahatullah, F. Haq, S. A. K. Saeed and S. Rehman. 2011. Diversity and distribution
of ladybird beetles in District Dir Lower, Pakistan. Int J Biodivers Conserv.
3(12): 670-675.
Rami Reddy, V. 1985. Dental eruption of India. Dental anthropology: Applications
and methods. New Delhi: Inter-India Publications. pp. 55-73.
Ranaweera, L., S. Kaewsutthi, A. W. Tun, H. Boonyarit, S. Poolsuwan and P. Lertrit.
2014. Mitochondrial DNA history of Sri Lankan ethnic people: their relations
within the island and with the Indian subcontinental populations. J Hum
Genet. 59(1): 28-36.
Rasmussen, M., Y. Li, S. Lindgreen, J. S. Pedersen, A. Albrechtsen, I. Moltke, M.

Metspalu, E. Metspalu, T. Kivisild, R. Gupta, M. Bertalan, K. Nielsen, M. T.
Gilbert, Y. Wang, M. Raghavan and P. F. Campos. 2010. Ancient human
genome sequence of an extinct Palaeo-Eskimo. Nature. 463(7282):757-62.
Raza, A., S. Firasat, S. Khaliq, A. Abid, S. S. Shah, Q. S. Mehdi and A. Mohyuddin.

2013. HLA class I and II polymorphisms in the Gujar population from
Pakistan. Immunol Invest. 42(8): 691-700.
Rehman, A. 1979. The Last Two Dynasties of Shahis. Center for the Study of the
Civilization of Central Asia, Quaid-e-Azam University, Islamabad. pp. 3–4.
Rehman, A. 1993. Date of Overthrow of Laghman: The Last Turkishahi Ruler of
Kabul. Lahore Museum Bulletin, Lahore. 11. pp. 29–31.
Rehman, A. U and S. Malik. 2016. Evaluation of Tribal Diversity of Pashtuns of

Bajaur Agency, North-West Pakistan, on the Basis of Allelic Polymorphisms
at ABO and Rh Loci. Pak J Zool. 48(3): 697-702.
Reid, C., J. F. V. Reenen and H. T. Groeneveld. 1991. Tooth size and the Carabelli
trait. Am J Phys Anthropol. 84(4): 427-432.
Relethford, J. H. 2008. Genetic evidence and the modern human origins

debate. Heredity. 100: 555-563.
Renfrew, C. 1987. Archaeology and Language: The Puzzle of Indo-European
Origins. Jonathan Cape, London.
Renfrew, C. 1996. Language families and the spread of farming. In: Harris D, editor.
The origins and spread of agriculture and Pastoralism in Eurasia.
Washington, DC: Smithsonian Institution Press. 70–92.
Renfrew, C. 2000. America past, America present: genes and languages in the
Americas and beyond. The McDonald Institute for Archaeological Research,
Cambridge.
Repping, S., K. S. van Daalen, G. L. Brown, M. C. Korver, J. Lange, D. J. Marszalek, T.
Pyntikova, F. van der Veen, H. Skaletsky, D. C. Page and S. Rozen. 2006. High
mutation rates have driven extensive structural polymorphism among human
Y chromosomes. Nat Genet. 38: 463-467.
Reyes-Centeno, H., S. Ghirotto, F. Detroit, D. Grimaud-Hervé, G. Barbujani and K.

Harvati. 2014. Genomic and cranial phenotype data support multiple modern
human dispersals from Africa and a southern route into Asia. Proc Natl Acad
Sci. 111(20): 7248-7253.
Richards, M., V. Macaulay, E. Hickey, E. Vega, B. Sykes, V. Guida, C. Rengo, D.

Sellitto, F. Cruciani, T. Kivisild, R. Villems, M. Thomas, S. Rychkov, O.
Rychkov, Y. Rychkov, M. Gölge, D. Dimitrov, E. Hill, D. Bradley, V. Romano,
F. Cali, G. Vona, A. Demaine, S. Papiha, C. Triantaphyllidis, G. Stefanescu, J.
Hatina, M. Belledi, A. Di Rienzo, A. Novelletto, A. Oppenheim, S. Norby, N.
Al-Zaheri, S. Santachiara-Benerecetti, R. Scozzari, A. Torroni and H-J. Bandelt.
2000. Tracing European Founder Lineages in the Near Eastern mtDNA Pool.
Am J Hum Genet. 67: 1251-1276.
Rightmire, G. P. 2009. Out of Africa: modern human origins special feature: middle
and later Pleistocene hominins in Africa and Southwest Asia. Proc Natl Acad
Sci USA. 106: 16046-16050
Risch, N., E. Burchard, E. Ziv and H. Tang. 2002. Categorization of humans in

biomedical research: genes, race and disease. Genome Biol. 3: 310-318.
Robin, E. D and R. Wong. 1988. Mitochondrial DNA Molecules and Virtual Number
of Mitochondria per Cell in Mammalian Cells. J Cell Physiol. 136: 507-513.
Robson, B. and J. Lipson. 2002. The Afghans: Their history and culture. Cultural
Orientation Resource Center, Center for Applied Linguistics.
Rodriguez, C. D. 2003. Antropologia dental prehispanica: variacion y distancias

biológicas en la poblacion enterrada en el cementerio prehispanico de
Obando, Valle del Cauca, Colombia entre los siglos VIII y XIII d.C. Miami.
Syllaba Press.
Rodriguez, J. V. 1999. Advances of dental anthropology in Colombia. National

University of Colombia (UNAL). Bogota.
Roebroeks, W and P. Villa. 2011. On the earliest evidence for habitual use of fire in
Europe. Proc Natl Acad Sci USA. 108(13): 5209-5214
Roewer, L., M. Krawczak, S. Willuweit, M. Nagy, C. Alves, A. Amorim, K.

Anslinger, C. Augustin, A. Betz, E. Bosch, and A. Caglia. 2001. Online
reference database of European Y-chromosomal short tandem repeat (STR)
haplotypes. Forensic sci int. 118: 106-113.
Roewer, L., M. Nothnagel, L. Gusmao, V. Gomes and M. Gonzalez. 2013. Continent-

wide decoupling of Y-chromosomal genetic variation from language and
geography in native South Americans. PLoS Genet. 9: e1003460.
Roewer, L., S. Willuweit, M. Stoneking and I. Nasidze. 2009. A Y-STR database of

Iranian and Azerbaijanian minority populations. Forensic Sci Int- Genet. 4: e53-
e55.
Rome, S. I. 2008. Swat State (1915-1969) From Genesis to Merger: An Analysis of

Political, Administrative, Socio-Political, and Economic Developments.
Karachi: Oxford University Press.
Rootsi, S., T. Kivisild, G. Benuzzi, H. Help, M. Bermisheva, I. Kutuev, L. Barac, M.

Pericic, O. Balanovsky, A. Pshenichnov and D. Dion. 2004. Phylogeography of
Y-chromosome haplogroup I reveals distinct domains of prehistoric gene flow
in Europe. Am J Hum Genet. 75(1): 128-137.
Rose, H. A., MacLagan and D. Edward. 1911. A Glossary of the Tribes and Castes of
the Punjab and North-West Frontier Province II. Lahore: Samuel T. Weston at
the Civ. and Mili. Gaz. Press, Pp. 272–273.
Rosenberg, N. A. 2006. Standardized subsets of the HGDP‐CEPH Human Genome

Diversity Cell Line Panel, accounting for atypical and duplicated samples and
pairs of close relatives. Ann Human Genet. 70: 841-847.
Rosenberg, N. A., J. K. Pritchard, J. L. Weber, H. M. Cann, K. K. Kidd, L. A.

Zhivotovsky and M.W. Feldman. 2002. Genetic structure of human
populations. Science. 298: 2381-2385.
Rosser, Z. H., T. Zerjal, M. E. Hurles, M. Adojaan and D. Alavantic. 2000. Y-

chromosomal diversity in Europe is clinal and influenced primarily by
geography, rather than by language. Am J Hum Genet. 67: 1526-1543.
Roy, S., C. M. Thakur and P. P. Majumder. 2003. Mitochondrial DNA variation in

ranked caste groups of Maharashtra (India) and its implication on genetic
relationship and origins. Ann Hum Biol. 30: 443–454.
Ruvolo, M., S. Zehr, M. von Dornum, D. Pan, D. Chang and J. Lin. 1993.
Mitochondrial COII Sequences and Modern Human Origins. Mol Biol Evol.
10: 1115-1135.
Sabitov, Z. 2011. The origin of the Pashtuns (Pathans). Russ J Genet Geneal. 2(1): 60-63.
Sahoo, S., A. Singh, G. Himabindu, J. Banerjee, T. Sitalaximi, S. Gaikwad, R. Trivedi,
P. Endicott, T. Kivisild, M. Metspalu and R. Villems. 2006. A prehistory of
Indian Y chromosomes: evaluating demic diffusion scenarios. Proc Natl Acad
Sci USA. 103(4): 843-848.
Saitou, N and M. Nei. 1987. The neighbor-joining method: a new method for
reconstructing phylogenetic trees. Mol Biol Evol. 4(4): 406-425.
Sanchez, J. J., C. Hallenberg, C. Borsting, A. Hernandez and N. Morling. 2005. High

frequencies of Y chromosome lineages characterized by E3b1, DYS19-11,
DYS392-12 in Somali males. Eur J Hum Genet. 13: 856-866.
Sanger, F., S. Nicklen and A. R. Coulson. 1977. DNA sequencing with
chain-terminating inhibitors. Proc Natl Acad Sci USA. 74: 5463-5467.
Scally, A and R. Durbin. 2012. Revising the human mutation rate: implications for
understanding human evolution. Nat Rev Genet. 13: 745–53.
Schick, K. D and P. N. Toth. 1994. Making silent stones speak: Human evolution and
the dawn of technology. Simon and Schuster.
Schurr, T. G. 2004b. Molecular genetic diversity in Siberians and Native Americans
suggests an early colonization of the New World. In: Madsen, D. B. (ed.). Entering
America: Northeast Asia and Beringia before the last glacial maximum. Salt
Lake City: University of Utah Press. pp. 187–238.
Schurr, T. G. 2004a. The peopling of the New World: perspectives from molecular
anthropology. Annu Rev Anthropol. 33: 551–583.
Schurr, T. G and S. T. Sherry. 2004. Mitochondrial DNA and Y chromosome diversity
and the peopling of the Americas: evolutionary and demographic evidence.
Am J Hum Biol. 16(4): 420-439.
Scott, G. R and C. G. Turner. 1997. The Anthropology of Modern Human Teeth.

Dental Morphology and its Variation in Recent Human Populations.
Cambridge: Cambridge University Press.
Scott, G. R. 1980. Population variation of Carabelli's Trait. Hum Biol. 52(1): 63-78.
Scott, R. S., P. S. Ungar, T. S. Bergstrom, C. A. Brown, F. E. Grine, M. Teaford and A.
Walker. 2005. Dental microwear texture analysis shows within-species diet
variability in fossil hominins. Nature. 436(7051): 693-695.
Seema, N. P., A. Geetha and C. Jagannath. 2011. Y-short tandem repeat haplotype
and paternal lineage of the Ezhava population of Kerala, south India. Croat
Med J. 52: 344-350.
Semino, O., G. Passarino, P. J. Oefner, A. A. Lin and S. Arbuzova. 2000. The genetic
legacy of Paleolithic Homo sapiens sapiens in extant Europeans: a Y chromosome
perspective. Science. 290: 1155-1159.
Sengupta, S., L. A. Zhivotovsky, R. King, S. Mehdi and C. A. Edmonds. 2006.
Polarity and temporality of high-resolution Y-chromosome distributions in
India identify both indigenous and exogenous expansions and reveal minor
genetic influence of Central Asian pastoralists. Am J Hum Genet. 78: 202-221.
Shah, G. A. 2013. Administration of Dir under Nawab Shah Jehan. Pak Ann Res J.
1(49): 121-138.
Shah, T. V. 1940. Ancient India. From 900 B.C. to 100 A.D. vol III and IV. Shashikant
and Co. Baroda, India.
Sharma, J. C and V. Kaul. 1977. Dental morphology and odontometry in Panjabis. J

Ind Anthropol Soc. 12: 213–226.
Sharma, J. C. 1983. Dental morphology and odontometry of the Tibetan immigrants.

Am J Phys Anthropol. 61: 495–505.
Siddiqi, M. H., T. Akhtar, A. Rakha, G. Abbas, A. Ali, N. Haider, A. Ali, S. Hayat, S.

Masooma, J. Ahmad and A. M. Tariq. 2015. Genetic characterization of the
Makrani people of Pakistan from mitochondrial DNA control-region data. Leg
Med. 17(2): 134-139.
Sinclair, A. H., P. Berta, M. S. Palmer, J. R. Hawkins, B. L. Griffiths, M. J. Smith, J. W.
Foster, A. M. Frischauf, R. Lovell-Badge and P. N. Goodfellow. 1990. A gene
from the human sex-determining region encodes a protein with homology to a
conserved DNA-binding motif. Nature. 346: 240-244.
Sirajuddin, S. 1970. Sarguzasht-e-Swat (Urdu) Lahore: Al-Hamra Academy. p.53.
Skaletsky, H., T. Kuroda-Kawaguchi, J. P. Minx, S. H. Cordum, L. Hillier, G. L.

Brown, S. Repping, T. Pyntikova, J. Ali, T. Bieri, and A. Chinwalla. 2003. The
male-specific region of the human Y chromosome is a mosaic of discrete
sequence classes. Nature. 423: 825-837.
Slatkin, M. 1987. Gene flow and the geographic structure of natural-populations.
Science. 236: 787–792.
Smith, B. H. 1991. Standards of human tooth formation and dental age assessment.
In: M. Kelley and C. Larsen (eds), Advances in Dental Anthropology. New
York: Wiley-Liss. 19: 143-168.
Smith, F. H and F. Spencer (eds.). 1984. The Origins of Modern Humans: A World
Survey of the Fossil Evidence. New York. Liss.
Smith, T. M and P. Tafforeau. 2008. New visions of dental tissue research: tooth
development, chemistry, and structure. Evol Anthropol. 17(5): 213-226.
Smith, V. A. 1914. The Early History of India from 600 B.C. to the Muhammadan
Conquest: Including the Invasion of Alexander the Great. Clarendon Press.
3.p 117.
Soares, P., F. Alshamali, J. B. Pereira, V. Fernandes, N. M. Silva, C. Afonso, M. D.
Costa, E. Musilova, V. Macaulay, M. B. Richards, V. Cerny and L. Pereira. 2012.
The Expansion of mtDNA Haplogroup L3 within and out of Africa. Mol Biol
Evol. 29: 915–27.
Soares, P., L. Ermini, N. Thompson, M. Mormina, T. Rito, A. Rohl, A. Salas, S.

Oppenheimer, V. Macaulay and M. Richards. 2009. Correcting for Purifying
Selection: An Improved Human Mitochondrial Molecular Clock. Am J Hum
Genet. 84: 740-759.
St John, J., D. Sakkas, K. Dimitriadi, A. Barnes, V. Maclin, J. Ramey, C. Barratt and C.

De Jonge. 2000. Failure of elimination of paternal mitochondrial DNA in
abnormal embryos. Lancet. 355(9199): 200-200.
Stacul, G. 1969. Excavation near Ghaligai (1968) and Chronological Sequence of

Protohistorical Cultures in the Swat Valley (West Pakistan). East and West. 19:
44-91.
Stewart, J. B and P. F. Chinnery. 2015. The dynamics of mitochondrial DNA
heteroplasmy: implications for human health and disease. Nat Rev Genet.
16(9): 530-542.
Stoljarova, M., J.L. King, M. Takahashi, A. Aaspollu and B. Budowle. 2016. Whole
mitochondrial genome genetic diversity in an Estonian population sample. Int
J Legal Med. 130:67–71.
Stoneking, M and F. Delfin. 2010. The human genetic history of East Asia: weaving a
complex tapestry. Curr Biol. 20: R188-R193.
Stoneking, M. 2008. Human origins. The molecular perspective. EMBO Rep. 9 (1): 46–
50.
Strand, R. F. 1973. Notes on the Nūristāni and Dardic Languages. J Am Orient Soc:
297-305.
Stringer, C and R. McKie. 1996. African Exodus: The Origins of Modern Humanity.
New York: Henry Holt.
Stringer, C. 2002. Modern Human Origins: Progress and Prospects. Phil Trans R Soc B.
357: 563-579.
Sullivan, L. R. 1920. Differences in the pattern of the second lower molar tooth. Am J
Phys Anthropol. 3: 255-257.
Suzuki, M and T. Sakai. 1973. Occlusal surface pattern of the lower molars and the
second deciduous molar among the living Polynesians. Am J Phys Anthropol.
39(2): 305-315.
Swati, M. F. 1997. Recent Discovery of Buddhist Sites in the Swat Valley. Athariyat
(Archaeology): A Research Bulletin of the National Heritage Foundation,
Peshawar, Pakistan. 1: 151-84.
Szecsenyi-Nagy, A., G. Brandt, J. Jakucs, B. G. Mende, E. Banffy and K. W. Alt. 2014.

Ancient mitochondrial and Y chromosomal DNA reveals the western
Carpathian Basin as a corridor of the Neolithic expansion. (Presentation)
ISBA6, Basel, Switzerland 27-29th
Taanman, J. W. 1999. The mitochondrial genome: structure, transcription, translation

and replication. BBA-Bioenerge. 1410(2): 103-123.
Tabassum, S., M. Ilyas, I. Ullah, M. Israr and H. Ahmad. 2017. A comprehensive
Y-STR portrait of Yousafzai’s population. Int J Leg Med. pp. 1-2.
Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by
DNA polymorphism. Genetics. 123: 585–95.
Tamimi, M. J. 2009. Hinduism in South Asia: Myth and Reality. Centre for South
Asian Studies, University of the Punjab, Quaid-e-Azam Campus, Lahore. 24:
221–241.
"Tareekh-e-Kakazai" (a.k.a. "Hidayat Afghani Tareekh-e-Kakazai Tarkani"

(Originally Published May, 1933).
Tattersall, I and J. H. Schwartz. 1999. “Hominids and hybrids: The place of

Neanderthals in human evolution.” Proc Natl Acad Sci USA. 96: 7117-7119.
Teaford, M and J. Lytle. 1996. Brief Communication: Diet induced changes in rates of
human tooth microwear: A case study involving stoneground maize. Am J
Teaford, M. F and P. S. Ungar. 2000. Diet and the evolution of the earliest human
ancestors. Proc Natl Acad Sci. 97: 13506-11.
Thapar, R. 1969. A History of India. Part 1. Baltimore: Penguin Press.
Tishkoff, S. A., F. A. Reed, F. R. Friedlaender, C. Ehret, A. Ranciaro, A. Froment,
J. B. Hirbo, A. A. Awomoyi, M. J. Bodo, O. Doumbo and M. Ibrahim. 2009.
The genetic structure and history of Africans and African Americans. Science.
324: 1035-44.
Tokayer, R. M. 2007. Mystery of the Ten Lost Tribes. Nihon-Yudaya, Huuin no

Kodaishi. (http://www.moshiach.com/features/tribes/afghanistan.php).
Tomes, C. S. 1889. A manual of dental anatomy: human and comparative. London:
J and A. Churchill.
Torres, J. B. 2016. A history of you, me, and humanity: mitochondrial DNA in

anthropological research. AIMS Genet. 3(2): 146-156.
Torroni, A., A. Achilli, V. Macaulay, M. Richards and H. J. Bandelt. 2006. Harvesting

the fruit of the human mtDNA tree. Trends Genet. 22: 339–345.
Townsend, G and T. Brown. 1981. The Carabelli trait in Australian aboriginal

dentition. Arch Oral Biol. 26(10): 809-814.
Townsend, G., H. Yamada and P. Smith. 1990. Expression of the entoconulid (sixth
cusp) on mandibular molar teeth of an Australian Aboriginal population. Am
J Phys Anthropol. 82(3): 267-274.
Triki Fendri, S., P. Sanchez Diz, D. Rey Gonzalez, I. Ayadi and A. Carracedo. 2015.
Paternal lineages in Libya inferred from Y chromosome haplogroups. Am J
Trivedi, R., S. Sahoo, A. Singh, G. Bindu, J. Banerjee, M. Tandon, S. Gaikwad, R.

Rajkumar, T. Sitalaximi, R. Ashma and N. B. G. Chainy. 2008. Genetic
imprints of Pleistocene origin of Indian populations: A comprehensive
phylogeographic sketch of Indian Y-chromosomes. Int J Hum Genet. 8(1-2): 97-
118.
Trombetta, B., E. D’Atanasio, A. Massaia, M. Ippoliti, A. Coppa, F. Candilio, V. Coia,

G. Russo, M. J. Dugoujon, P. Moral and N. Akar. 2015. Phylogeographic
refinement and large scale genotyping of human Y chromosome haplogroup
E provide new insights into the dispersal of early pastoralists in the African
continent. Genome Biol Evol. 7(7): 1940-1950.
Trombetta, B., F. Cruciani, D. Sellitto and R. Scozzari. 2011. A new topology of the
human Y chromosome haplogroup E1b1 (E-P2) revealed through the use of
newly characterized binary polymorphisms. PLoS One. 6(1): e16073.
Tucci, G. 1958. Preliminary report on an archaeological survey in Swat. East and
West. 9(4): 279-328.
Turner, C. G. II. 1967. The dentition of Arctic peoples, PhD Dissertation, Madison,
University of Wisconsin.
Turner, C. G., C. R. Nichol and G. R. Scott. 1991. Scoring procedures for key
morphological traits of the permanent dentition: the Arizona State University
Dental Anthropology System. In: Kelly MA, and Larsen CS (eds.). Advances
in Dental Anthropology. New York: Wiley-Liss. p 13-31.
Tyagi, V. P. 2009. Martial races of undivided India. New Delhi: Kalpaz Publications.
Underhill, P. A and T. Kivisild. 2007. Use of Y-chromosome and mitochondrial DNA
population structure in tracing human migrations. Annu Rev Genet. 41: 539–64.
Underhill, P. A., G. D. Poznik, S. Rootsi, M. Jarve, A. A. Lin, J. Wang, B. Passarelli, J.
Kanbar, N. M. Myres, R. J. King and J. Di Cristofaro. 2015. The phylogenetic
and geographic structure of Y-chromosome haplogroup R1a. Eur J Hum Genet.
23(1): 124-31.
Underhill, P. A., P. Shen, A. A. Lin, L. Jin, G. Passarino, H. W. Yang, E. Kauffman, B.
Bonne-Tamir, J. Bertranpetit, P. Francalacci, M. Ibrahim and T. Jenkins. 2000.
Y chromosome sequence variation and the history of human populations. Nat
Genet. 26: 358–361.
Underhill, P. A., G. Passarino, A. A. Lin, P. Shen, M. Mirazon Lahr, A. R. Foley, J. P.
Oefner and L. L. Cavalli-Sforza. 2001. The phylogeography of Y chromosome
binary haplotypes and the origins of modern human populations. Ann Hum
Genet. 65: 43-62.
van der Giezen, M. 2011. Mitochondria and the rise of eukaryotes. BioScience. 61(8):
594-601.
van Oven, M. 2015. PhyloTree Build 17: Growing the human mitochondrial DNA
tree. Forensic Sci Int-Genet. 5: e392-e394.
van Oven, M. and M. Kayser. 2009. Updated comprehensive phylogenetic tree of
global human mitochondrial DNA variation. Hum Mutat. 30(2): e386-e394.
van Oven, M., A. Geystelen, M. Kayser, R. Decorte and M. H. Larmuseau. 2014.
Seeing the wood for the trees: a minimal reference phylogeny for the human Y
chromosome. Hum Mutat. 35(2): 187-191.
Vermeulen, M., A. Wollstein, K. van der Gaag, O. Lao and Y. Xue. 2009. Improving
global and regional resolution of male lineage differentiation by simple
single-copy Y-chromosomal short tandem repeat polymorphisms. Forensic Sci
Int-Gen. 3: 205-213.
Vigilant, L., M. Stoneking, H. Harpending, K. Hawkes and A. Wilson. 1991. African
Populations and the Evolution of Human Mitochondrial DNA. Science.
253: 1503-1507.
Walimbe, S. R and S. S. Kulkarni. 1993. Biological Adaptations in Human Dentition:
An Odontometric Study on Living and Archaeological Populations in
India (Vol. 1). Deccan College Post Graduate and Research Institute, Pune,
India.
Wallace, D. C., K. Garrison and W. C. Knowler. 1985. Dramatic founder effects in
Amerindian mitochondrial DNAs. Am J Phys Anthropol. 68(2): 149-155.
Wallace, D., M. Brown and M. Lott. 1999. Mitochondrial DNA variation in human
evolution and disease. Gene. 238(1): 211-30.
Wang, C. C., L. X. Wang, R. Shrestha, S. Wen, M. Zhang, X. Tong, L. Jin and H. Li.
2015. Convergence of Y chromosome STR haplotypes from different SNP
haplogroups compromises accuracy of haplogroup prediction. J Genet
Genomics. 42(7):403-407.
Wang, C. C., L. X. Wang, R. Shrestha, S. Wen, M. Zhang, X. Tong, L. Jin and H. Li.
2013. Convergence of Y chromosome STR haplotypes from different SNP
haplogroups compromises accuracy of haplogroup prediction. arXiv preprint
arXiv:1310.5413 [q-bio.PE].
Whale, J. W. 2012. Mitochondrial DNA analysis of four ethnic groups of Afghanistan.
PhD Thesis. Portsmouth: University of Portsmouth.
Wheeler, M. 1968. The Indus Civilization. 3rd Edition. Cambridge: Cambridge
University Press.
Willems, T., M. Gymrek, G. D. Poznik, C. Tyler-Smith and Y. Erlich. 2016.
Population-Scale Sequencing Data Enable Precise Estimates of Y-STR
Mutation Rates. Am J Hum Genet. 98(5): 919-933.
Winters, C. 2011. The Gibraltar Out of Africa Exit for Anatomically Modern
Humans. Webmed Central. 2(10):WMC002319.
Wolpert, S. 2000. A new history of India. Oxford University Press, New York.
Wolpoff, M. H and R. Caspari. 1996. Race and Human Evolution: A Fatal Attraction.
New York: Simon and Schuster.
Wolpoff, M. H., X. Z. Wu, and A. G. Thorne. 1984. Modern Homo sapiens Origins: A
General Theory of Hominid Evolution Involving the Fossil Evidence from
East Asia. In: Smith, F.H. and Spencer, F. (eds.). The Origins of Modern
Humans. pp. 411-483.
Wolpoff, M., J. Hawks and R. Caspari. 2000. Multiregional, Not Multiple Origins. Am
J Phys Anthropol. 112: 129-136.
Wood, B. A and C. A. Engleman. 1988. Analysis of the dental morphology of
Plio-Pleistocene hominids. V. Maxillary postcanine tooth morphology. J Anat.
161: 1-35.
Wood, B. A and H. Uytterschaut. 1987. Analysis of the dental morphology of
Plio-Pleistocene hominids. III. Mandibular premolar crowns. J Anat. 154: 121-156.
Wood, B. A and S. A. Abbott. 1983. Analysis of the dental morphology of
Plio-Pleistocene hominids. I. Mandibular molars: crown area measurements
and morphological traits. J Anat. 136: 197-219.
Wood, B. A., S. A. Abbott and S. H. Graham. 1983. Analysis of the dental
morphology of Plio-Pleistocene hominids. II. Mandibular molars--study of
cusp areas, fissure pattern and cross sectional shape of the crown. J Anat. 137:
287-314.
Wood, B. A., S. A. Abbott and H. Uytterschaut. 1988. Analysis of the dental
morphology of Plio-Pleistocene hominids. IV. Mandibular postcanine root
morphology. J Anat. 156: 107-139.
Wreford, R. G. 1941. Census of India. Vol. XXII, Jammu and Kashmir. Ranbir
Government Press, 1943.
Wright, S. 1951. The genetical structure of populations. Ann Eugen. 15: 323–354.
Xue, Y., Q. Wang, Q. Long, L. B. Ng, H. Swerdlow, J. Burton, C. Skuce, R. Taylor, Z.
Abdellah, Y. Zhao, Asan and D. G. MacArthur. 2009. Human Y chromosome
base-substitution mutation rate measured by direct sequencing in a
deep-rooting pedigree. Curr Biol. 19: 1453–1457.
Yaad, L. 1986. Pukhtana Qabily Opejanai. p.86 (www.Khyber.Org).
Yasin, H. M. 2008. Social Welfare Program in the Former State of Swat (The Paradise
Lost). The Dialogue. 4(3): 1-32.
Zegura, S. L., M. T. Karafet, A. L. Zhivotovsky and F. M. Hammer. 2004.
High-resolution SNPs and microsatellite haplotypes point to a single, recent entry
of Native American Y chromosomes into the Americas. Mol Biol Evol. 21: 164-175.
Zeng, Z., R. Garcia-Bertrand, S. Calderon, L. Li and M. Zhong. 2014. Extreme genetic
heterogeneity among the nine major tribal Taiwanese island populations
detected with a new generation Y23 STR system. Forensic Sci Int-Gen. 12: 100-106.
Zhao, Z., F. Khan, M. Borkar, R. Herrera and S. Agrawal. 2009. Presence of three
different paternal lineages among North Indians: a study of 560 Y
chromosomes. Ann Hum Biol. 36: 46-59.
Zhong, H., H. Shi, B. X. Qi, Y. Z. Duan, P. P. Tan, L. Jin, B. Su and Z. R. Ma. 2011.
Extended Y chromosome investigation suggests postglacial migrations of
modern humans into East Asia via the northern route. Mol Biol Evol.
28(1): 717-727.
Zwalf, W. 1996. A Catalogue of the Gandhāra Sculpture in the British Museum, vol.
I, British Museum Press, London.
APPENDIX I
APPENDIX II
APPENDIX III
STOCK REAGENTS
Phenol:Chloroform Mixture (1:1)
For each sample, 200 μL of phenol and 200 μL of chloroform were used.
Lysis Buffer
500 mM Tris-base
250 mM EDTA
5% SDS
Proteinase K, 75 μg/mL of lysis solution
β-mercaptoethanol (14.4 M), 1 μL/mL of lysis solution
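As a rough illustration of the two per-millilitre additions listed above, the following Python sketch scales the Proteinase K and β-mercaptoethanol amounts to a chosen batch of lysis solution. The function name and the 10 mL batch size are hypothetical and are not part of the protocol; the final β-mercaptoethanol molarity is an estimate that assumes the added volume is negligible.

```python
def lysis_additives(lysis_volume_ml: float) -> dict:
    """Scale the per-mL additives of the lysis buffer to a batch volume (hypothetical helper)."""
    if lysis_volume_ml <= 0:
        raise ValueError("lysis volume must be positive")
    proteinase_k_ug = 75.0 * lysis_volume_ml   # 75 ug Proteinase K per mL of lysis solution
    beta_me_ul = 1.0 * lysis_volume_ml         # 1 uL of 14.4 M stock per mL of lysis solution
    # 1 uL of 14.4 M stock in about 1000 uL is a 1:1000 dilution (added volume ignored).
    beta_me_final_mM = 14.4 * 1000.0 / 1000.0
    return {
        "proteinase_k_ug": proteinase_k_ug,
        "beta_me_ul": beta_me_ul,
        "beta_me_final_mM": beta_me_final_mM,
    }


if __name__ == "__main__":
    batch = lysis_additives(10.0)  # assumed 10 mL batch, for illustration only
    print(batch)
```

For the assumed 10 mL batch this gives 750 μg of Proteinase K and 10 μL of β-mercaptoethanol stock.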
50X TAE buffer
M Tris-HCl pH8
0.5 M EDTA
Make up to 1 L with dH2O and autoclave
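A concentrated stock such as this one is normally diluted before use. The sketch below applies the standard relation C1 x V1 = C2 x V2 to work out how much stock and water are needed. The 1X working strength and the 1000 mL final volume are assumptions made for illustration and are not stated in the recipe above; the function name is likewise hypothetical.

```python
def stock_volume_ml(stock_x: float, working_x: float, final_volume_ml: float) -> float:
    """Volume of concentrated stock required, from C1 * V1 = C2 * V2."""
    if stock_x <= 0 or working_x <= 0 or final_volume_ml <= 0:
        raise ValueError("concentrations and volume must be positive")
    if working_x > stock_x:
        raise ValueError("working strength cannot exceed stock strength")
    return working_x * final_volume_ml / stock_x


if __name__ == "__main__":
    final_ml = 1000.0                            # assumed final volume
    stock_ml = stock_volume_ml(50.0, 1.0, final_ml)  # 50X stock to an assumed 1X
    water_ml = final_ml - stock_ml
    print(f"stock: {stock_ml} mL, dH2O: {water_ml} mL")
```

Under these assumptions, 20 mL of 50X stock is made up to 1000 mL with 980 mL of dH2O.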
Bromophenol blue dye
50 ml dH2O
50 g sucrose
1.86 g EDTA
0.1 g bromophenol blue
Dissolve
Adjust volume to 100 ml with dH2O, stir overnight
pH to 8.0
Filter through Whatman filter paper
Store at room temperature
10 mg/ml Ethidium bromide (EtBr)
Add 1 g of ethidium bromide to 100 ml of ddH2O
Stir for several hours until completely dissolved
Store wrapped in aluminum foil at 4 °C
1kb size standard
285 μl 1kb ladder (cat# DM001)
143 μl Ficoll dye
2400 μl 1X TE
APPENDIX IV
Dental morphological trait frequencies of all population samples used in this study
(P = individuals with the trait present, N = individuals scored, F = frequency in %)
ABB. TRAIT P N F ABB. TRAIT P N F ABB. TRAIT P N F
INM SHOVUI1 9 24 37.50 DJR SHOVUI1 3 16 18.75 KARa SHOVUI1 86 149 57.72
- SHOVUI2 4 19 21.05 - SHOVUI2 8 22 36.36 - SHOVUI2 75 151 49.67
- MLRUI1 14 25 56.00 - MLRUI1 3 17 17.65 - MLRUI1 53 115 46.09
- HYPOUM1 27 41 65.85 - HYPOUM1 30 30 100.0 - HYPOUM1 132 143 92.31
- MTCLUM1 6 41 14.63 - MTCLUM1 1 29 3.45 - MTCLUM1 20 138 14.49
- YGRVLM2 7 24 29.17 - YGRVLM2 11 35 31.43 - YGRVLM2 38 116 32.76
- CSPNLM1 32 39 82.05 - CSPNLM1 20 21 95.24 - CSPNLM1 113 147 76.87
- C6LM1 4 37 10.81 - C6LM1 1 20 5.00 - C6LM1 4 106 3.77
- C6LM2 0 24 0.00 - C6LM2 0 36 0.00 - C6LM2 1 87 1.15
- C7LM1 2 36 5.56 - C7LM1 1 32 3.13 - C7LM1 7 109 6.42
- C7LM2 1 25 4.00 - C7LM2 1 39 2.56 - C7LM2 1 89 1.12
MHR SHOVUI1 77 186 41.40 KUZ SHOVUI1 1 13 7.69 SYDm2 SHOVUI1 54 143 37.76
- C6LM1 13 191 6.81 - C6LM1 0 14 0.00 - C6LM1 3 101 2.97
- C6LM2 3 174 1.72 - C6LM2 0 15 0.00 - C6LM2 0 95 0.00
- C7LM1 25 191 13.09 - C7LM1 0 18 0.00 - C7LM1 11 104 10.58
- C7LM2 3 177 1.69 - C7LM2 0 18 0.00 - C7LM2 2 96 2.08
MDA SHOVUI1 80 163 49.08 MOL SHOVUI1 4 25 16.00 TANm2 SHOVUI1 31 149 20.81
- C6LM1 12 158 7.59 - C6LM1 3 33 9.09 - C6LM1 6 126 4.76
- C6LM2 5 152 3.29 - C6LM2 0 35 0.00 - C6LM2 0 114 0.00
- C7LM1 27 165 16.36 - C7LM1 2 39 5.13 - C7LM1 14 126 11.11
- C7LM2 7 158 4.43 - C7LM2 1 36 2.78 - C7LM2 0 112 0.00
MRT SHOVUI1 81 198 40.91 CHU SHOVUI1 64 194 32.99 GUJsw SHOVUI1 50 160 31.25
- C6LM1 17 194 8.76 - C6LM1 13 186 6.99 - C6LM1 19 160 11.88
- C6LM2 5 191 2.62 - C6LM2 1 186 0.54 - C6LM2 0 157 0.00
- C7LM1 15 198 7.58 - C7LM1 48 195 24.62 - C7LM1 23 160 14.38
- C7LM2 1 197 0.51 - C7LM2 18 194 9.28 - C7LM2 4 159 2.52
PNT SHOVUI1 52 176 29.55 GPD SHOVUI1 63 175 36.00 KOHsw SHOVUI1 54 162 33.33
- C6LM1 22 182 12.09 - C6LM1 21 169 12.43 - C6LM1 10 161 6.21
- C6LM2 5 179 2.79 - C6LM2 5 172 2.91 - C6LM2 2 160 1.25
- C7LM1 31 181 17.13 - C7LM1 23 172 13.37 - C7LM1 14 161 8.70
- C7LM2 11 182 6.04 - C7LM2 19 173 10.98 - C7LM2 3 160 1.88
KHO SHOVUI1 33 122 27.05 AWAm1 SHOVUI1 48 162 29.63 TRKd SHOVUI1 72 161 44.72
- C6LM1 4 129 3.10 - C6LM1 7 162 4.32 - C6LM1 18 161 11.18
- C6LM2 0 85 0.00 - C6LM2 0 138 0.00 - C6LM2 4 161 2.48
- C7LM1 12 129 9.30 - C7LM1 12 165 7.27 - C7LM1 35 161 21.74
- C7LM2 1 90 1.11 - C7LM2 4 142 2.82 - C7LM2 19 161 11.80
SKH SHOVUI1 0 9 0.00 MDK SHOVUI1 73 179 40.78 UTHd SHOVUI1 65 159 40.88
- C6LM1 1 14 7.14 - C6LM1 9 177 5.08 - C6LM1 21 159 13.21
- C6LM2 0 15 0.00 - C6LM2 2 165 1.21 - C6LM2 1 154 0.65
- C7LM1 1 15 6.67 - C7LM1 10 176 5.68 - C7LM1 20 159 12.58
- C7LM2 0 15 0.00 - C7LM2 1 163 0.61 - C7LM2 6 154 3.90
TMG SHOVUI1 1 7 14.29 SWT SHOVUI1 59 177 33.33 YSFsw SHOVUI1 53 181 29.28
- C6LM1 0 22 0.00 - C6LM1 8 172 4.65 - C6LM1 9 180 5.00
- C6LM2 1 18 5.56 - C6LM2 0 154 0.00 - C6LM2 1 180 0.56
- C7LM1 2 24 8.33 - C7LM1 20 176 11.36 - C7LM1 11 180 6.11
- C7LM2 2 20 10.00 - C7LM2 1 165 0.61 - C7LM2 0 180 0.00
NeoMRG SHOVUI1 18 28 64.29 WAKg SHOVUI1 37 145 25.52 GUJm2 SHOVUI1 89 155 57.42
- C6LM1 3 37 8.11 - C6LM1 5 140 3.57 - C6LM1 2 110 1.82
- C6LM2 0 44 0.00 - C6LM2 1 115 0.87 - C6LM2 1 90 1.11
- C7LM1 4 40 10.00 - C7LM1 10 143 6.99 - C7LM1 10 110 9.09
- C7LM2 0 43 0.00 - C7LM2 1 113 0.88 - C7LM2 1 90 1.11
ChlMRG SHOVUI1 13 25 52.00 WAKs SHOVUI1 31 158 19.62 AWAm2 SHOVUI1 25 141 17.73
- C6LM1 5 23 21.74 - C6LM1 4 158 2.53 - C6LM1 4 126 3.17
- C6LM2 2 18 11.11 - C6LM2 0 120 0.00 - C6LM2 0 125 0.00
- C7LM1 3 25 12.00 - C7LM1 18 158 11.39 - C7LM1 10 127 7.87
- C7LM2 0 24 0.00 - C7LM2 0 120 0.00 - C7LM2 0 128 0.00
HAR SHOVUI1 2 15 13.33 SAP SHOVUI1 2 19 10.53
- SHOVUI2 4 16 25.00 - SHOVUI2 5 17 29.41
- MLRUI1 8 12 66.67 - MLRUI1 4 17 23.53
- HYPOUM1 16 16 100.0 - HYPOUM1 36 36 100.0
- HYPOUM2 2 18 11.11 - HYPOUM2 23 32 71.88
- MTCLUM1 6 13 46.15 - MTCLUM1 3 37 8.11
- MTCLUM2 4 16 25.00 - MTCLUM2 2 34 5.88
- YGRVLM2 3 31 9.68 - YGRVLM2 7 38 18.42
- CSPNLM1 17 20 85.00 - CSPNLM1 22 28 78.57
- CSPNLM2 0 33 0.00 - CSPNLM2 2 41 4.88
- C6LM1 1 20 5.00 - C6LM1 3 25 12.00
- C6LM2 0 28 0.00 - C6LM2 0 40 0.00
- C7LM1 1 22 4.55 - C7LM1 1 38 2.63
- C7LM2 0 28 0.00 - C7LM2 0 43 0.00
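The F column in the table above is consistent with the percentage of scored individuals showing the trait, that is F = 100 x P / N. The short Python sketch below recomputes F for three rows copied from the first line of the table as a check. The function name is hypothetical, and rounding to two decimals is assumed from the way the values are printed.

```python
def trait_frequency(present: int, scored: int) -> float:
    """Percentage of scored individuals in which the trait is present."""
    if scored <= 0:
        raise ValueError("number scored must be positive")
    if not 0 <= present <= scored:
        raise ValueError("present count must lie between 0 and number scored")
    return round(100.0 * present / scored, 2)


# (sample, trait, P, N, F as tabulated) taken from the SHOVUI1 row of the table
rows = [
    ("INM", "SHOVUI1", 9, 24, 37.50),
    ("DJR", "SHOVUI1", 3, 16, 18.75),
    ("KARa", "SHOVUI1", 86, 149, 57.72),
]

if __name__ == "__main__":
    for sample, trait, p, n, tabulated in rows:
        computed = trait_frequency(p, n)
        print(sample, trait, computed, "matches" if computed == tabulated else "differs")
```

The same check can be run over any other row to catch transcription errors in P, N or F.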