Session 8: Human Evolution
Transcript of Part 2: African Genomics: African Population History
00:00:07.20 So in the second part of this lecture series, 00:00:10.13 I'm going to be discussing 00:00:12.03 African population history 00:00:14.02 based on patterns of genetic diversity. 00:00:18.23 So why do I think it's important 00:00:20.08 that we study African genetic variation? 00:00:22.28 Well, for one, 00:00:24.17 if we want to learn more about modern human origins, 00:00:26.23 we need to be looking in Africa, 00:00:28.15 which is the site of modern human speciation. 00:00:32.05 Secondly, if we want to learn more about African-American ancestry, 00:00:36.14 this will be an important region to study. 00:00:40.21 Third is that Africa is a region 00:00:42.13 with a very high level of infectious disease, 00:00:44.27 with HIV, malaria, and TB being three of the biggest killers, 00:00:49.21 but there's also an increasing level of 00:00:51.23 non-communicable diseases like diabetes, for example, 00:00:55.08 and cardiovascular disease. 00:00:57.11 And African populations have been greatly underrepresented 00:01:00.22 in the biomedical research, 00:01:03.00 and so we really need to give more focus 00:01:05.10 to these populations so that we can come up with better diagnostics 00:01:08.27 and better treatments for these diseases. 00:01:13.11 And lastly, we know that people differ in regards to drug response, 00:01:17.10 and this is likely due to variation at drug metabolizing genes, 00:01:21.02 but again, we currently know very little 00:01:23.03 about the extent of variation among Africans at these loci. 00:01:30.17 So first I have to give you a little bit of information 00:01:32.22 about African population history. 00:01:35.06 There are over 2,000 ethnic groups in Africa 00:01:37.26 speaking distinct languages, 00:01:40.12 and these languages have been classified 00:01:42.15 into four different language families. 00:01:45.17 So in blue are languages 00:01:48.15 classified as Afro-Asiatic. 
00:01:50.27 They're found predominantly in the north and northeast of Africa, 00:01:55.23 and these would include, for example, 00:01:57.13 Semitic languages which are also spoken in the Middle East, 00:02:01.02 and they would also include Cushitic languages 00:02:04.14 spoken in northeast Africa. 00:02:07.02 And then in red we have populations 00:02:10.14 that are speaking Nilo-Saharan languages, 00:02:13.10 these tend to be pastoralist groups, 00:02:15.20 like the Maasai for example, who live in Kenya and Tanzania. 00:02:19.07 And these populations are mainly found 00:02:21.12 in central and eastern Africa 00:02:24.28 although there are a few groups who have migrated 00:02:27.14 to the west of Africa. 00:02:30.00 The most broad-spread language family 00:02:34.27 consists of the Niger-Kordofanian languages, 00:02:37.20 shown in yellow or orange here. 00:02:40.21 And the most common subfamily 00:02:43.18 is the family of Bantu languages. 00:02:47.06 Now, those are thought to have originated in Cameroon or Nigeria 00:02:51.02 around 5,000 years ago, 00:02:53.18 together with the development of iron tool technology, 00:02:56.26 which led to much better methods for practicing agriculture. 00:03:02.27 And so these populations 00:03:04.25 had a technological advantage in a sense, 00:03:07.20 and they were able to rapidly 00:03:09.29 expand across Africa into east Africa 00:03:13.12 and then south Africa, 00:03:15.14 or from west Africa 00:03:20.18 along the western coast into southern Africa. 00:03:24.14 The fourth language family, shown in green here, 00:03:28.05 is classified as Khoisan, 00:03:30.17 and it consists of languages that have click consonants. 00:03:34.03 So these are found predominantly 00:03:36.20 amongst the San hunter-gatherers in southern Africa, 00:03:40.24 and also amongst two groups called the Hadza and the Sandawe, 00:03:45.08 who live in Tanzania. 
00:03:47.18 Now, despite the importance of studying Africa, 00:03:50.23 there have been relatively few genomics studies in that region, 00:03:54.05 and there's a number of reasons for that, 00:03:56.04 and one of which is just the challenges of 00:03:58.16 doing research in areas that sometimes 00:04:00.22 have little infrastructure. 00:04:02.25 And so I wanted to show you some examples of 00:04:05.21 the field work that we've done over the past 12 years. 00:04:08.25 We've mainly been studying 00:04:10.18 minority populations in Africa 00:04:12.13 who practice indigenous lifestyles, 00:04:14.22 and they live in very remote areas, 00:04:16.14 so we have to, for example, have a 4-wheel drive vehicle, 00:04:21.03 and this work has been done not only by myself, 00:04:23.21 but by my students and postdocs 00:04:25.27 and African collaborators over many years. 00:04:30.22 So here's an example, I like this, 00:04:32.13 it shows my postdocs Alessia Ranciaro and Simon Thompson, 00:04:37.06 and they were doing an expedition in Ethiopia in 2010. 00:04:41.04 We basically have to bring all of our lab equipment with us, 00:04:44.28 and I like this because it shows both the outside perspective of the car, 00:04:48.04 and also the inside perspective. 00:04:51.25 These are some of the other challenges that they faced. 00:04:54.20 They were there during the wet season, 00:04:56.03 making it extremely challenging to travel. 00:05:02.01 In each of these regions, 00:05:03.19 we typically start by doing what you could think of as 00:05:06.05 "Town Hall meetings", in which we explain the research 00:05:09.04 to the community, 00:05:11.01 and we explain both the risks and the benefits, 00:05:12.23 and make sure that they understand 00:05:14.11 why we're doing this research, 00:05:16.00 and how it might benefit or not benefit the community. 00:05:19.00 Ultimately though, 00:05:20.25 we have to obtain individual informed consent 00:05:23.12 to do this research.
00:05:27.09 We also measured a number of phenotypes, 00:05:29.11 like height and weight. 00:05:32.12 More recently, we've been looking at more detailed 00:05:34.29 anthropometric, cardiovascular, and metabolic traits. 00:05:41.13 From each of these samples, 00:05:42.24 we typically obtain blood intravenously, 00:05:45.29 and we've started to also obtain RNA. 00:05:50.07 But one of the challenges is processing these samples 00:05:53.12 in regions where there's no electricity. 00:05:55.25 So here's an example where we set up the so-called 00:05:58.19 "Bush Lab": we had to set up our centrifuge 00:06:00.28 and hook it up to the car battery. 00:06:05.02 But in other areas, we can find a local clinic, 00:06:07.08 they'll often have a generator, 00:06:09.06 and so then we're able to hook up a larger centrifuge. 00:06:12.06 One of the ways in which we obtain DNA... 00:06:15.20 and the DNA, I should note, 00:06:17.08 is only present in the white cells of blood, 00:06:19.24 so the first thing we're gonna do is we're gonna 00:06:21.13 break open all the red cells. 00:06:23.27 And we do that by adding a solution 00:06:27.00 that's going to cause them to burst open. 00:06:29.25 Then we're going to spin down the samples in this centrifuge, 00:06:34.04 and we have to repeat this several times, 00:06:36.09 and we're gonna end up with these little pellets at the bottom 00:06:39.01 of the white cells, and that's where we're gonna find the DNA. 00:06:45.10 Here are some other challenges of processing in the field. 00:06:48.13 After we've isolated the DNA, 00:06:50.03 we add another buffer, which is going to 00:06:53.01 preserve the samples at room temperature, 00:06:55.21 but here's a case where Simon Thompson 00:06:57.15 actually had to bring a generator with him 00:06:59.26 and set up the entire lab in the bush 00:07:02.18 when he was studying the Hadza hunter-gatherers of Tanzania.
00:07:08.15 Another very important thing 00:07:11.07 is to increase training and capacity building in Africa 00:07:15.13 so that they can do this research themselves, 00:07:17.21 and that's something that I've spent a lot of time doing, 00:07:20.09 and I think is very important. 00:07:23.23 Also equally important is actually 00:07:25.02 returning results to participants, 00:07:28.01 and it's really surprising how little this is done, 00:07:31.07 but I can assure you that people 00:07:32.26 really appreciate it when we return the results, 00:07:36.06 and I think it's also an ethical obligation 00:07:38.20 so that they can benefit from what we learn from these studies. 00:07:44.13 So I want to start by talking about some of the phenotypic variation 00:07:47.10 that we see in Africa. 00:07:49.03 This is an example of skin melanin levels, 00:07:52.07 or skin pigmentation. 00:07:54.09 So, the higher the value here, 00:07:56.15 the darker the skin color. 00:07:59.01 And I wanna just make the point that 00:08:00.16 we see a lot of variation in skin pigmentation levels 00:08:04.27 across diverse Africans. 00:08:07.18 And one of the things that we're interested in looking at is 00:08:10.10 correlations with vitamin D for example, 00:08:12.10 because we know that vitamin D is produced by UV light, 00:08:16.21 and that people with darker skin 00:08:18.12 may produce less vitamin D, for example. 00:08:21.02 And vitamin D can have important health implications, 00:08:23.15 so this is relevant to know. 00:08:25.29 It's also an interesting trait to look at how people 00:08:28.09 have adapted to different environments. 00:08:31.21 Here are the results of a principal component analysis 00:08:34.17 for a number of cardiovascular traits, 00:08:37.11 and these are different populations. 
00:08:40.27 If the populations cluster close to each other, 00:08:43.19 it means that they're very similar for these traits, 00:08:45.28 and we've color-coded them based on shared language and ethnicity. 00:08:50.26 And what's interesting is that they tend to cluster 00:08:53.04 based on language and culture. 00:08:55.12 So here are the Nilo-Saharan speakers, 00:08:57.06 here are the Afro-Asiatic speakers, 00:08:59.18 and in yellow here are the Niger-Kordofanian speakers, 00:09:04.08 but we see two exceptions. 00:09:06.13 These are two groups that live on the coast of Kenya, 00:09:09.00 in geographic proximity to the Bantu-speaking groups, 00:09:13.10 suggesting that not only are genetic factors important, 00:09:16.02 but environment factors are probably quite important as well. 00:09:22.18 And here we can see tremendous variation 00:09:25.16 for height, weight, and BMI in Africa. 00:09:29.00 And again, we're seeing that 00:09:31.07 populations tend to cluster based on shared ethnicity, 00:09:35.02 and at the extremes 00:09:36.23 we have the very short statured pygmies from central Africa, 00:09:40.13 and then we have the very tall and thin individuals 00:09:43.27 from Kenya and other places... and the Sudan. 00:09:49.02 And so, as we'll talk about in the last section of my lecture series, 00:09:52.18 this may be due to adaptation to different environments. 00:09:58.17 So now I want to tell you about the patterns of 00:10:00.21 genetic variation and genetic structure in Africa, 00:10:04.19 and this is based on a paper that we published several years ago, 00:10:08.14 in which we looked at genome-wide variable markers, 00:10:13.21 and these were genotyped in over 2,500 Africans 00:10:17.06 from 121 ethnic groups 00:10:19.04 that are shown by these dots here. 
00:10:21.12 But note that even though this was 00:10:23.06 more than had ever been done before, 00:10:25.16 it still represents just a fraction of the 00:10:27.16 2,000 ethnic groups in Africa, 00:10:30.06 so we're still missing a lot of the variation. 00:10:33.06 We then looked at 98 African-Americans 00:10:36.20 from 4 regions in the US 00:10:38.22 and a comparative dataset of about 1,500 non-Africans. 00:10:44.13 So let me first tell you about the levels of genetic variation that we saw, 00:10:48.05 and that's indicated by the height of this bar. 00:10:51.03 And I've color-coded this by geographic region, 00:10:53.20 so shown in orange are people from Africa, 00:10:57.01 and as nearly every study has shown, 00:10:59.11 Africans have the highest level of genetic variation. 00:11:02.27 And then we see decreasing variation 00:11:05.00 as we move west to east 00:11:07.00 across Eurasia into 00:11:09.06 East Asia, Oceania, and the Americas. 00:11:13.10 So the patterns of genetic diversity that we're seeing 00:11:16.10 simply reflect our evolutionary and demographic history. 00:11:20.10 We see the highest levels of diversity in Africa, 00:11:22.20 which is the site of origin of modern humans, 00:11:25.11 and then when small groups of people 00:11:27.23 migrated out of Africa within the past 50,000-100,000 years, 00:11:32.03 there was a population bottleneck, 00:11:34.06 and so we see a decrease in genetic diversity. 00:11:37.26 And as humans migrated west to east across Eurasia 00:11:41.07 and into the Americas 00:11:43.00 and into Oceania and so on, 00:11:45.01 there were a series of founding events and again, 00:11:47.16 a concomitant decrease in genetic diversity. 00:11:51.21 So this is a phylogenetic tree 00:11:54.01 constructed based on pair-wise genetic distances 00:11:56.14 between populations. 00:11:58.07 You can't see any details on this tree, 00:12:00.13 I just want to point out some overall trends. 
00:12:02.27 And I've color-coded these such that 00:12:05.24 the populations shown in black, 00:12:08.21 the black branches, are non-Africans, 00:12:12.11 and then the Africans are shown here. 00:12:14.23 So the first thing that you can see from this tree 00:12:16.27 is that non-Africans are distinguished from Africans, 00:12:20.19 and that the non-African populations 00:12:22.19 are clustering by major geographic region. 00:12:25.25 So we have people from India, central Asia, Europe, 00:12:29.07 Middle East, east Asia, and the Americas, 00:12:34.01 and then Oceania. 00:12:36.10 And even within Africa, 00:12:38.13 populations are clustering by major geographic region, 00:12:41.22 so here are populations from the north of Africa, 00:12:44.02 from eastern Africa, 00:12:45.15 from west-central Africa, 00:12:47.17 and then from southern Africa, 00:12:49.10 with one exception: 00:12:51.25 down here, at the root of this tree, 00:12:54.08 we see the San hunter-gatherers from southern Africa, 00:12:58.22 but clustering near the San are the pygmies, 00:13:01.17 who today live in central Africa. 00:13:04.08 And that's really intriguing and maybe telling us something 00:13:06.16 about the history of these populations, 00:13:08.24 and I'll discuss that more in a moment. 00:13:13.14 Now, we can also compare genetic distances, 00:13:17.02 which are shown on the y-axis, 00:13:19.10 to geographic distances between pairs of populations, 00:13:22.20 shown on the x-axis. 00:13:24.28 And we see a significant positive correlation, 00:13:28.19 but we can also see a lot of scatter here. 00:13:32.01 And what that means is that there are some populations 00:13:34.26 that are geographically very close, 00:13:38.17 but they're genetically very different, 00:13:41.17 and those probably represent recent migration events 00:13:44.24 of genetically differentiated populations. 
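[Editor's sketch] The comparison of genetic and geographic distances described here can be illustrated with a small example. The code below is a minimal sketch, not the analysis used in the study: the population names, coordinates, and genetic distances are all made up for illustration, and the Mantel-style permutation test is one common way to assess such a correlation (pairwise distances are not independent of one another), assumed here rather than taken from the lecture.

```python
import math
import random
from itertools import combinations


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def mantel_test(coords, genetic, n_perm=999, seed=0):
    """Correlate geographic and genetic distances; permute labels for a p-value."""
    rng = random.Random(seed)
    names = sorted(coords)
    pairs = list(combinations(names, 2))
    gen = [genetic[pair] for pair in pairs]

    def geo_distances(label_map):
        return [haversine_km(coords[label_map[p]], coords[label_map[q]])
                for p, q in pairs]

    observed = pearson(geo_distances({n: n for n in names}), gen)
    hits = 0
    for _ in range(n_perm):
        shuffled = names[:]
        rng.shuffle(shuffled)
        # Reassign locations to populations while keeping genetic distances fixed.
        if pearson(geo_distances(dict(zip(names, shuffled))), gen) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical populations with made-up (lat, lon) coordinates.
coords = {"A": (0.0, 0.0), "B": (0.0, 10.0), "C": (0.0, 20.0), "D": (0.0, 30.0)}
# Made-up pairwise genetic distances (larger means more differentiated).
genetic = {("A", "B"): 0.010, ("A", "C"): 0.021, ("A", "D"): 0.029,
           ("B", "C"): 0.012, ("B", "D"): 0.020, ("C", "D"): 0.009}

r, p_value = mantel_test(coords, genetic)
print(f"r = {r:.3f}, permutation p = {p_value:.3f}")
```

With real data, pairs that fall far from the overall trend are the interesting ones: they are the candidates for the migration events the lecture goes on to discuss.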
00:13:47.19 And then on the other end of the spectrum, 00:13:50.02 we have some populations that are genetically very similar to each other, 00:13:53.27 but geographically very far apart. 00:13:56.21 And those may reflect, for example, 00:13:58.24 the Bantu people, who migrated from western Africa 00:14:02.03 to eastern and southern Africa, 00:14:03.18 so they've gone quite a long geographic distance, 00:14:07.03 but genetically they're still very similar to each other. 00:14:11.07 So now I want to move away from looking at populations 00:14:13.28 and I want to talk about looking at variation amongst individuals. 00:14:18.20 And the first thing I want to show you is 00:14:20.28 a principal component analysis based on individual genotypes. 00:14:25.07 And so each of these circles 00:14:28.10 actually represents a person, 00:14:30.16 and if they cluster together 00:14:32.21 it means that they're genetically similar to each other. 00:14:35.22 So, as shown here, the first principal component 00:14:38.15 accounts for as much of the variability in the data as possible, 00:14:42.08 and each succeeding component 00:14:44.08 accounts for as much of the remaining variability as possible. 00:14:47.28 So the first principal component 00:14:50.09 essentially is differentiating 00:14:52.24 the African groups 00:14:55.04 from the non-African groups. 00:14:57.14 The second principal component 00:14:59.23 is differentiating the Native Americans, 00:15:03.05 Eastern Asians, 00:15:04.20 and Oceanian populations 00:15:06.12 from the rest of the world. 00:15:07.26 And the third principal component 00:15:09.27 is distinguishing the Hadza hunter-gatherers from Tanzania 00:15:13.11 from the rest of the world. 00:15:15.12 This next result is based on a probabilistic analysis 00:15:20.24 that simultaneously infers ancestral population clusters, 00:15:26.11 which are represented by the different colors shown here, 00:15:29.27 and then we have... 
00:15:31.28 this is actually composed of a series of lines, 00:15:34.21 and each line represents an individual. 00:15:37.18 And an individual can have mixed ancestry 00:15:42.04 from different ancestral population clusters. 00:15:45.18 So what we tend to see outside of Africa, 00:15:48.03 which is shown along the bottom here, 00:15:50.10 is that individuals are clustering 00:15:52.05 by major geographic region. 00:15:54.08 So, in blue we have individuals 00:15:56.26 who self-identify as European or Middle Eastern, 00:16:00.27 and then here we have individuals from southern India, 00:16:04.25 here we have individuals from Pakistan, 00:16:08.09 central Asia, 00:16:09.27 east Asia, 00:16:11.03 Oceania, 00:16:12.29 and the Americas. 00:16:14.27 But what I want you to note is all the colors 00:16:17.27 that we see here in Africa. 00:16:20.22 That's representing the very large amount of genetic diversity, 00:16:24.15 not only within, 00:16:26.11 but among African populations, 00:16:28.15 compared to the whole rest of the globe. 00:16:31.20 I'll just point out a couple of trends. 00:16:35.10 In orange colors are populations from central and west Africa 00:16:38.27 who speak Niger-Kordofanian and Bantu languages. 00:16:43.09 In purple are populations 00:16:45.17 that speak Afro-Asiatic languages 00:16:47.21 and originated from northern or northeast Africa. 00:16:51.26 In red are populations that speak Nilo-Saharan languages 00:16:55.23 and they most likely originated from southern Sudan. 00:17:01.11 We have populations that are speaking Chadic languages, 00:17:05.16 a group called the Fulani who are nomadic pastoralists. 00:17:08.28 Most of the north Africans 00:17:10.27 have a lot of European or Middle Eastern admixture. 00:17:14.23 And then we have the hunter-gatherer groups, 00:17:16.15 like the Hadza, 00:17:18.09 the Sandawe, 00:17:19.23 pygmies from central Africa, 00:17:21.22 and the San hunter-gatherers from southern Africa. 
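[Editor's sketch] The output of the clustering analysis described above can be thought of as one vector of ancestry proportions per individual, with the proportions summing to 1. The code below is a minimal sketch of that data structure only; it does not infer the clusters, which is what the probabilistic analysis itself does. The population labels, cluster names, and numbers are all hypothetical.

```python
from collections import defaultdict

# Each record: (population label, ancestry proportions across inferred clusters).
# All values are made up for illustration.
individuals = [
    ("pop1", {"orange": 0.9, "blue": 0.1, "red": 0.0}),
    ("pop1", {"orange": 0.7, "blue": 0.2, "red": 0.1}),
    ("pop2", {"orange": 0.1, "blue": 0.1, "red": 0.8}),
]


def is_valid(props, tol=1e-9):
    """Proportions must be non-negative and sum to 1."""
    return (all(v >= 0 for v in props.values())
            and abs(sum(props.values()) - 1.0) <= tol)


def dominant_cluster(props):
    """Cluster contributing the largest share of an individual's ancestry."""
    return max(props, key=props.get)


def population_means(records):
    """Average ancestry proportions over the individuals in each population."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for pop, props in records:
        counts[pop] += 1
        for cluster, value in props.items():
            sums[pop][cluster] += value
    return {pop: {c: s / counts[pop] for c, s in clusters.items()}
            for pop, clusters in sums.items()}


means = population_means(individuals)
for pop, props in means.items():
    print(pop, {c: round(v, 2) for c, v in props.items()})
```

An individual with mixed ancestry simply has more than one non-zero entry; averaging within a population gives a single summary per group.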
00:17:26.08 Now, we repeated this analysis within Africa, 00:17:30.01 and again we inferred 14 ancestral population clusters, 00:17:34.15 but for ease of viewing I'm just going to pool individuals together 00:17:37.20 and show them as pie charts. 00:17:39.25 Now, first I'm showing you the 3 populations 00:17:42.13 that had been studied as part of the 00:17:44.07 HapMap and Thousand Genomes Initiative. 00:17:47.06 These are NIH-funded programs 00:17:50.22 to characterize genetic variation 00:17:52.28 across ethnically diverse human populations 00:17:56.02 and making that data publicly available 00:17:58.02 so that it could be used by a wide range of 00:18:00.16 biomedical research scientists. 00:18:04.00 Now, what I want to point out is that 00:18:05.15 when we look at the rest of Africa, 00:18:08.15 we see quite a bit more variation. 00:18:11.27 And so, for example, populations in east Africa 00:18:15.08 look distinct from populations in western Africa, 00:18:19.25 northern, 00:18:21.04 and southern Africa. 00:18:23.00 It's also interesting 00:18:24.16 because we can see remnants of historic migration events. 00:18:27.11 So for example, I mentioned to you the Bantu migration. 00:18:30.18 The people who speak Niger-Kordofanian or Bantu languages 00:18:33.23 are represented by shades of orange, 00:18:35.26 and you can actually see that they appear 00:18:38.13 to have originated, as I said, 00:18:40.11 in Cameroon/Nigeria region, 00:18:43.03 and then they migrated 00:18:45.10 across Africa into eastern Africa, 00:18:48.03 where they admixed with the indigenous populations there, 00:18:51.19 and they also migrated into southern Africa, 00:18:54.05 where they admixed with the populations there. 00:18:57.05 We can also see remnants of migration of individuals 00:19:01.08 from northeast Africa who speak Afro-Asiatic languages 00:19:04.28 into Kenya and Tanzania.
00:19:07.22 We see migration of people who speak Nilo-Saharan languages, 00:19:11.20 originating from southern Sudan. 00:19:13.11 There was one group that went west, 00:19:16.07 and we think that some of these people who speak Chadic languages, 00:19:20.19 which are actually classified as Afro-Asiatic, 00:19:22.25 genetically they look very similar to the Nilo-Saharans. 00:19:26.02 So in fact there may have been a language substitution 00:19:28.15 at some point in the past. 00:19:30.23 And then we have migration of the Nilo-Saharan pastoralists 00:19:34.08 into Kenya and into Tanzania. 00:19:38.23 We can also see that some of the hunter-gatherer groups are very distinct. 00:19:42.27 Here are the Hadza hunter-gatherers, who speak with a click in Tanzania. 00:19:47.08 Here are the Sandawe, who speak with a click, also in Tanzania, 00:19:50.04 but their languages are very divergent from each other. 00:19:53.14 Here are the San hunter-gatherers shown in light green, 00:19:56.05 also speaking with a click, but again, 00:19:58.06 their languages are very differentiated 00:20:00.06 from the other two populations who speak with clicks in Tanzania. 00:20:05.08 And then we have the pygmy populations from central Africa. 00:20:10.13 Interestingly, the pygmy population called Mbuti, 00:20:14.09 who lives the furthest to the east, 00:20:16.26 appears to possibly share some common ancestry with the San. 00:20:22.01 And in fact several pieces of data that we've studied 00:20:26.15 suggest that there could have been a 00:20:28.16 proto Khoisan-Pygmy hunter-gatherer population in Africa 00:20:32.16 that probably existed greater than 50,000 years ago, 00:20:36.01 and then underwent population divergence and differentiation 00:20:40.06 and then migration within the past 50,000 years, 00:20:43.22 but there's still a lot of work to be done 00:20:45.14 to try to differentiate this population history.
00:20:48.23 So next I wanna talk about what we found 00:20:51.05 in terms of African American ancestry. 00:20:53.22 We looked at African Americans 00:20:55.21 originating from four regions in the US: 00:20:58.15 Chicago, Pittsburgh, Baltimore, and North Carolina. 00:21:02.05 Now, not surprisingly, you can see that the majority of ancestry 00:21:06.17 is this western Niger-Kordofanian ancestry, 00:21:10.05 shown in orange. 00:21:12.08 The other major component of their ancestry, 00:21:14.21 which is summarized here, is European ancestry, 00:21:17.19 which ranges from about 0% to greater than 50%. 00:21:22.18 And then we see small amounts of ancestry from other populations, 00:21:25.22 including some other African populations 00:21:29.18 who speak Chadic languages, for example, 00:21:33.09 from western Africa. 00:21:34.27 We see a small amount of ancestry from east Africa, 00:21:37.25 and also very small amounts of 00:21:40.07 east Asian and Native American ancestry, 00:21:43.02 at least in these particular populations. 00:21:45.23 If you look at populations from other regions, 00:21:48.17 you may see more ancestry from those regions. 00:21:54.02 And again, this is reflecting the history of the transatlantic slave trade, 00:22:00.15 originating from west Africa, 00:22:03.10 and actually a very large source of the slave trade 00:22:05.29 was from Angola, 00:22:07.24 and we currently know very little about genetic variation in that region. 00:22:11.12 And that's going to be important to know 00:22:13.28 for some studies in which knowing variation 00:22:17.29 in African ancestral populations will be important 00:22:20.15 for identifying disease risk alleles 00:22:23.28 in African American or Afro-Caribbean populations. 00:22:28.28 I want to tell you about another study that I did with collaborators, 00:22:32.18 in which we looked at 00:22:35.22 over 250,000 single nucleotide polymorphisms, or SNPs. 
00:22:41.11 These are just regions of the genome 00:22:43.18 that differ at a single nucleotide, 00:22:46.24 and we looked at them predominantly 00:22:49.12 in western populations along the coast of Africa, 00:22:54.05 and one group from southern Africa. 00:22:57.10 And when we do this principal component analysis, 00:22:59.22 one of the interesting results 00:23:02.07 is that the distribution really reflects the geography of these populations, 00:23:08.02 and that's not a huge surprise. 00:23:09.29 It means that people who live near each other 00:23:12.06 tend to mate with each other, 00:23:13.28 and people who live further apart are not intermixing as often, 00:23:17.27 and so they tend to be more genetically differentiated. 00:23:23.23 We then did a principal component analysis 00:23:26.25 including the African American individuals, 00:23:30.08 shown here in sort of fuchsia color, 00:23:33.29 and shown in red are Europeans, 00:23:37.03 and then here we have the different west African populations. 00:23:42.01 And we could actually determine 00:23:44.00 the amount of European or African ancestry in any individual 00:23:49.06 -- African American individual -- 00:23:51.19 by looking at their position along principal component 1. 00:23:56.05 So for example, this individual here, 00:23:58.24 this African American individual, 00:24:00.29 appears to have more European ancestry, 00:24:03.17 whereas this African American individual 00:24:06.04 seems to have more west African ancestry. 00:24:11.08 And then, using an approach that was developed by Carlos Bustamante's lab, 00:24:16.15 it was possible to actually scan along chromosomes, 00:24:19.16 so here we're showing 00:24:22.22 the different chromosomes starting at 22, 21, 20, 00:24:25.25 and so on down to chromosome 1. 
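[Editor's sketch] The idea of reading an individual's ancestry from their position along principal component 1, mentioned a moment ago, can be illustrated with simple linear interpolation between two reference groups. This is a minimal sketch under stated assumptions, not the method used in the study: it assumes a two-source admixture model, the function name is hypothetical, and every PC1 coordinate is made up.

```python
def european_fraction_from_pc1(pc1, african_ref, european_ref):
    """Estimate the European ancestry fraction from a PC1 coordinate.

    Assumes admixed individuals fall on the line between the two reference
    group means, at a position proportional to their ancestry fraction.
    """
    african_mean = sum(african_ref) / len(african_ref)
    european_mean = sum(european_ref) / len(european_ref)
    fraction = (pc1 - african_mean) / (european_mean - african_mean)
    # Sampling noise can push a point slightly past either reference mean.
    return min(1.0, max(0.0, fraction))


# Made-up PC1 coordinates for the two reference groups.
west_african_pc1 = [-0.04, -0.06]
european_pc1 = [0.14, 0.16]

# Made-up PC1 coordinates for three hypothetical admixed individuals.
for pc1 in (-0.05, 0.0, 0.10):
    frac = european_fraction_from_pc1(pc1, west_african_pc1, european_pc1)
    print(f"PC1 = {pc1:+.2f} -> estimated European ancestry {frac:.0%}")
```

An individual plotted closer to the European cluster gets a higher estimated European fraction, and one closer to the west African clusters gets a lower one.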
00:24:28.12 And as you scan along the chromosome, 00:24:30.04 at any particular region, 00:24:32.11 you can infer if somebody has African ancestry, 00:24:36.25 which is shown in blue, 00:24:39.12 European ancestry, which is shown in red, 00:24:43.15 or mixed ancestry, which is shown in green. 00:24:47.27 And what we see is that most African Americans 00:24:50.23 have a mixture of ancestry. 00:24:53.01 So they tend to have a lot of, 00:24:54.16 not surprisingly, African ancestry shown in blue. 00:24:58.03 There are regions of mixed ancestry shown in green, 00:25:01.13 but also note that there are some regions of the genome 00:25:04.20 which are only of European ancestry, 00:25:07.26 and this differs quite a bit amongst different individuals. 00:25:10.11 Here's an example of someone who appears 00:25:12.13 to have undergone very recent admixture; 00:25:16.08 they have a lot of African ancestry. 00:25:19.27 Here's someone who has very recent European ancestry, 00:25:23.01 so we see a lot of regions of the genome 00:25:24.24 where they're of mixed ancestry. 00:25:27.20 Here's someone who has... 00:25:29.24 they self-identify as African American, 00:25:31.27 but they have almost no African ancestry, 00:25:34.24 so that goes to show you that there can be a lot of genetic variation 00:25:38.00 that may not always correlate with self-identified ethnicity. 00:25:42.29 The other important point here is that, 00:25:45.29 in the future, 00:25:48.25 the ideal that we have is to develop 00:25:51.02 more personalized medicine 00:25:53.27 that is tailored for the individual. 00:25:56.14 And here's someone that, for example, 00:25:58.16 if they went to the doctor and they self-identified 00:26:00.22 as African American, 00:26:02.20 the doctor might prescribe certain drugs that, say, 00:26:04.26 are more effective in African Americans. 
00:26:07.11 But what if, at that particular position, 00:26:09.27 where they have only European ancestry, 00:26:12.12 what if there was a drug metabolizing enzyme gene 00:26:16.01 at that particular point, 00:26:18.27 and so that would be of pure European ancestry, 00:26:21.24 and so that might be important to know. 00:26:24.00 So this has important implications for 00:26:26.00 the design of future personalized medical approaches for treatment. 00:26:32.25 So in conclusion, people from the same geographic region 00:26:35.29 are genetically more similar to each other, 00:26:38.08 so for example, Asian individuals 00:26:40.15 will be more similar to other Asian individuals, 00:26:43.02 Europeans more similar to other Europeans. 00:26:46.02 But in Africa, 00:26:47.25 there has been more time to accumulate genetic variation, 00:26:50.29 they've had larger effective population sizes 00:26:53.18 so they've maintained a lot of variation, 00:26:55.26 and they've lived in diverse environments, 00:26:58.05 so they tend to be highly differentiated from each other, 00:27:01.05 although we also can see that 00:27:03.16 there's been a history of admixture throughout much of Africa. 00:27:07.29 So therefore, Africans have the highest level of genetic variation, 00:27:12.16 both within and between populations, 00:27:15.01 and we saw that African Americans 00:27:17.05 have ancestry from west Africa and Europe, 00:27:19.21 and that the ancestry varies along chromosomes, 00:27:22.03 which has important implications for personalized medicine. 00:27:26.18 And that concludes this portion of my lecture, 00:27:28.26 and for this section I'd like to acknowledge 00:27:30.23 the many individuals who contributed, 00:27:34.28 together with our funding organizations.