Controlling the Cell Cycle

Transcript of Part 2: Controlling the Cell Cycle: Cdk Substrates

00:00:02.03		So, hello, my name is Dave Morgan. I'm from the University of California in San Francisco.
00:00:06.04		And in this lecture I'm going to go over some of my own work on studies of
00:00:09.23		how the cyclin-dependent kinases drive the events of the eukaryotic cell division cycle.
00:00:15.29		Now it's well established at this point that the major regulators
00:00:19.11		of the eukaryotic cell cycle are the cyclin dependent kinases or Cdks.
00:00:23.13		And the basic idea is that a series of Cdk-cyclin complexes are activated
00:00:28.11		in a specific sequence during the cell cycle
00:00:30.08		to trigger the events of the cell cycle in the appropriate order.
00:00:33.21		And so, for example, S-phase Cdk-cyclin complexes are formed in late mitosis or in late G1
00:00:39.15		and are then activated in the beginning of S-phase to initiate DNA synthesis.
00:00:43.03		And then M-phase Cdk-cyclin complexes form at the end of G2
00:00:47.06		and are activated to initiate the events of mitosis and take the cell to metaphase.
00:00:52.17		So the big question that I want to address today is how is it that these
00:00:55.26		Cdks actually drive these cell cycle events?
00:00:59.05		Now obviously, Cdks are protein kinases, which means that the most likely mechanism
00:01:02.24		by which they promote cell cycle events is through the phosphorylation of other proteins
00:01:07.10		which then bring about those events.
00:01:09.15		And so over the past 10 or 12 years or so, we've dedicated quite a lot of effort
00:01:13.18		to identifying the substrates of the cyclin-dependent kinases
00:01:16.20		in the hope that that will lead us to an answer to this question of
00:01:19.17		how the Cdks actually initiate cell cycle events.
00:01:23.23		So in this lecture I'm going to tell you about two methods that we've used
00:01:26.23		to systematically and comprehensively identify Cdk substrates
00:01:30.09		and then in the second half of the lecture we'll go into some interesting ways
00:01:33.20		in which we use those lists of substrates to address some general questions of
00:01:37.19		Cdk function and phospho-regulation.
00:01:41.07		So the first method we use to identify Cdk substrates
00:01:43.22		began about 10 or 12 years ago in a collaboration with Kevan Shokat, a chemist here at UCSF.
00:01:51.00		Now, Kevan came up with an idea whereby it would be possible to label the specific targets of
00:01:57.04		a protein kinase in a crude cell mixture and this slide attempts to explain that basic method.
00:02:02.09		On the left...let's focus on the left first.
00:02:04.08		On the left is a wild type regular protein kinase like Cdk1 with its cyclin regulatory partner.
00:02:10.08		And typically when one wants to label the targets of a protein kinase, like Cdk1,
00:02:14.28		you simply provide that kinase with a version of ATP in which the gamma phosphate
00:02:20.00		is labeled with a radioactive tag.
00:02:22.28		And then when that protein kinase uses that ATP it will then transfer
00:02:27.24		that P-32 onto its substrates.
00:02:31.11		Now, unfortunately this cannot be used to identify unknown substrates
00:02:34.24		of Cdks because if you take a pure Cdk and some gamma labeled P-32-ATP
00:02:40.16		and put that into a crude cell lysate, you will get not only the labeling of the Cdk's targets
00:02:46.12		but the labeling of all the other protein kinase targets in that lysate because
00:02:49.15		that ATP can be used by any kinase.
00:02:52.22		And so Kevan Shokat's idea was to avoid this problem by using
00:02:56.15		so called analog-sensitive protein kinases.
00:02:58.29		And the strategy is based on the fact that protein kinases
00:03:02.07		tend to contain a large hydrophobic residue in the wall
00:03:05.19		of the adenine binding pocket of their active site.
00:03:08.21		And so the basic strategy is to mutate that large hydrophobic side chain there
00:03:12.28		to a glycine residue resulting in the formation of an extra pocket in
00:03:16.18		the side of that ATP binding site. And as a result
00:03:20.00		this mutant kinase can now use a bulky ATP analog in which extra moieties
00:03:24.25		have been added to the adenine base of the ATP.
00:03:27.28		And so for example, N6-benzyl-ATP can be used by the analog sensitive Cdk1 kinase
00:03:32.28		but cannot be used by a wild type kinase because that bulky ATP analog
00:03:37.04		can't fit into the wild type active site.
00:03:41.14		And so, of course, if you put a radio-label on the gamma phosphate
00:03:44.05		of this bulky ATP analog and then add this kinase to a crude cell lysate
00:03:48.19		what you hope to get is the specific labeling of just the direct targets of
00:03:52.23		that protein kinase and no other kinases in the cell lysate
00:03:56.00		because those other kinases can't use this bulky ATP analog.
00:04:00.12		So this method...we developed this method in collaboration with Kevan Shokat
00:04:04.04		a number of years ago and applied it to the yeast Cdk1 kinase
00:04:08.28		and the results with that are shown in the next slide.
00:04:12.09		So this slide shows an autoradiograph of a protein gel
00:04:15.14		in which we've separated the reaction products
00:04:16.28		from three different reactions, two of which are control reactions
00:04:20.03		and the third of which is the experimental reaction.
00:04:23.09		In the first lane what you see is what happens when you add this radiolabeled N6-benzyl-ATP
00:04:28.16		this bulky ATP to a crude cell extract made from yeast.
00:04:33.05		And the result is that you get very little labeling of anything in that cell extract
00:04:36.14		because that ATP analog cannot be used by the protein kinases in that crude cell extract.
00:04:42.14		The next lane is another control in which we're mixing the purified protein kinase Cdk1-as1
00:04:48.23		together with a cyclin partner and then adding that to some N6-benzyl-ATP
00:04:54.01		that's radio-labeled on its gamma position and the result then is that you
00:04:57.10		see auto-phosphorylation of the cyclin subunit of the Cdk-cyclin complex.
00:05:01.21		And so that results in a background band in the experimental lane over here.
00:05:07.03		But the third lane is really the crucial lane in which all three components
00:05:09.22		have been added. And so we're adding a purified kinase...analog sensitive kinase
00:05:13.14		with the bulky ATP analog and the cell extract and the result
00:05:17.21		is that you see a whole raft of different proteins being radio-labeled
00:05:22.06		in the cell extract and those proteins are presumably the direct targets of
00:05:27.00		Cdk1-cyclin complexes in that lysate.
00:05:31.12		So we obtained this result a number of years ago and then
00:05:33.29		dedicated quite a lot of effort to identifying
00:05:36.03		these various radiolabeled bands in this cell lysate.
00:05:38.05		And to make a very long story short, we ended up using proteomic libraries
00:05:42.24		to individually identify substrates and in the end
00:05:45.27		we came up with a list of about 181 proteins in cell extracts that
00:05:50.04		are rapidly modified by Cdk1-cyclin complexes.
00:05:53.27		And so this list of proteins provided us with our initial list of Cdk substrates.
00:05:59.23		These substrates are involved in a wide range of different cellular processes;
00:06:02.25		many of which are known to be connected to the cell cycle in some way
00:06:05.20		and are likely to represent important targets of Cdk1 throughout the cell cycle.
00:06:11.01		But for various reasons we decided that this list of substrates was
00:06:14.02		incomplete and also, because it was done in vitro we wanted to get
00:06:18.03		another approach that would allow us to identify, comprehensively, a larger number
00:06:23.09		of Cdk substrates that were modified in vivo by Cdk1.
00:06:26.23		And so the second method we've been using more recently has been to use
00:06:30.17		quantitative mass spectrometry approaches to identify all the phosphorylation
00:06:34.10		sites in the cell that are dependent on Cdk1.
00:06:37.18		In other words, phosphorylation sites whose levels decrease abruptly
00:06:40.18		when you inhibit the protein kinase activity of Cdk1.
00:06:45.12		And that method begins again with the analog sensitive Cdk1 mutant.
00:06:49.17		Now, another advantage of these analog sensitive mutants is that not only do they bind
00:06:53.24		bulky ATP analogs, but they also bind bulky inhibitors that can only fit into the active site
00:06:59.24		of the analog sensitive kinase but not the active site of a wild type kinase.
00:07:05.02		And so, for example, this inhibitor here 1-NM-PP1 binds with extremely high affinity to
00:07:10.05		analog sensitive Cdk1 but has essentially no affinity for the wild type kinase
00:07:14.06		or for any other kinase in the yeast cell.
00:07:16.23		And so we could use this analog sensitive Cdk1 to make a yeast strain in which
00:07:22.02		we can inhibit Cdk1 in vivo rapidly and specifically.
00:07:26.22		And so we did that a number of years ago.  We created a yeast strain in which
00:07:29.12		the endogenous Cdk1 protein is replaced with the analog sensitive protein
00:07:33.23		and in that yeast strain it is now possible to almost completely and specifically inhibit
00:07:38.05		Cdk1 activity within minutes by the addition of 1-NM-PP1 to the culture medium.
00:07:44.19		So we used that strain in
00:07:46.08		this quantitative mass spectrometry approach that I want to tell you about
00:07:49.16		which was done in a collaboration with Judit Villen and Steve Gygi of Harvard University.
00:07:55.11		And the basic approach that we used is illustrated in this slide and in the next two slides as well.
00:08:01.11		It begins, as I said, with the analog sensitive yeast strain cdk1-as cells
00:08:05.06		in which Cdk1 can be inhibited specifically with the 1-NM-PP1 inhibitor.
00:08:10.21		And what we do is we grow two parallel cultures of this yeast strain.
00:08:14.14		One culture, the so called light culture, is grown in regular lysine and arginine
00:08:19.21		whereas the so called heavy culture is grown in a different form of lysine and arginine
00:08:23.11		in which carbon-13 and nitrogen-15 have replaced the usual carbon-12 and nitrogen-14.
00:08:29.21		And so, as a result, after growth in this medium for some time, all the proteins in these cells
00:08:34.03		have been labeled with slightly heavier than average
00:08:37.04		lysine and arginine residues which means that all
00:08:39.10		the peptides derived from this culture will have a slightly higher mass
00:08:43.01		in the eventual mass spectrometry analysis
00:08:45.21		and that will allow us to identify the peptides coming from these two lysates.
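
To make the "slightly higher mass" concrete, here is a minimal Python sketch of how far apart the light and heavy forms of one peptide would sit. It assumes the heavy lysine and arginine are fully labeled, with every carbon as carbon-13 and every nitrogen as nitrogen-15; the lecture does not state the exact labeling scheme. The peptide sequence and the function names are made up for illustration.

```python
# Mass differences between heavy and light isotopes, in daltons.
DELTA_13C = 13.0033548 - 12.0000000   # carbon-13 minus carbon-12
DELTA_15N = 15.0001089 - 14.0030740   # nitrogen-15 minus nitrogen-14

# ASSUMPTION: every carbon and nitrogen in the residue is heavy.
# Lysine has 6 C and 2 N; arginine has 6 C and 4 N.
HEAVY_ATOMS = {"K": (6, 2), "R": (6, 4)}


def heavy_mass_shift(peptide):
    """Extra mass (Da) of the heavy form of a peptide over its light form."""
    shift = 0.0
    for residue in peptide:
        if residue in HEAVY_ATOMS:
            n_carbon, n_nitrogen = HEAVY_ATOMS[residue]
            shift += n_carbon * DELTA_13C + n_nitrogen * DELTA_15N
    return shift


def mz_shift(peptide, charge):
    """Spacing between the light and heavy peaks on the m/z axis."""
    return heavy_mass_shift(peptide) / charge


if __name__ == "__main__":
    peptide = "AASPEVK"  # hypothetical tryptic peptide ending in lysine
    print(round(heavy_mass_shift(peptide), 4))
    print(round(mz_shift(peptide, 2), 4))
```

Because trypsin cuts after lysine and arginine, most tryptic peptides carry exactly one labeled residue, so under this assumption the two forms differ by roughly 8 or 10 Da.
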
00:08:50.22		So we treat the heavy culture with the inhibitor 1-NM-PP1 for a brief period, 15 minutes.
00:08:56.03		And then we harvest these cells after the inhibitor treatment.
00:09:00.08		Harvest the cells, mix them together, lyse them, break them open,
00:09:04.03		and then treat all the resulting proteins in those cell lysates
00:09:08.18		with trypsin to break them all down into tryptic peptides.
00:09:12.06		And then Judit Villen in the Gygi lab has developed a wide range of
00:09:15.27		powerful methods for purifying the phospho-peptides out of that tryptic peptide mixture.
00:09:20.20		And we then subject those phospho-peptides
00:09:22.25		to mass spectrometry as shown in the next slide.
00:09:26.23		There are two basic forms of mass spectrometry that are applied to these phospho-peptide mixtures.
00:09:31.07		The first, on top, is to use conventional tandem mass spectrometry methods
00:09:35.15		to actually fragment these peptides and use those fragments
00:09:38.07		to determine their sequence.
00:09:40.02		And so, by this approach we can determine the sequence of all the phospho-peptides
00:09:44.04		coming out of these yeast lysates and just as importantly, we can identify
00:09:47.10		the precise site of the phosphorylation on those peptides.
00:09:50.29		And so by doing this, Judit was able to produce a list
00:09:54.12		of about 10,000 phosphorylation sites on 2,000 different proteins in the yeast lysate.
00:10:00.25		And then, in addition to determining sequence, we also quantify all the peptides
00:10:06.12		and determine the relative amount of the so called light and heavy peptides.
00:10:10.19		What this means is that every peptide coming out of these phospho-peptide mixtures
00:10:15.15		comes in both a light form which originally came from the light medium culture
00:10:19.13		and a heavy form that originally came from the inhibitor-treated heavy culture.
00:10:23.17		And they can be distinguished based on this slight
00:10:25.08		mass difference of their lysines and arginines.
00:10:28.12		And what we're looking for, of course, are peptides that look like this:
00:10:30.29		where the heavy peptide is much less abundant than the light peptide.
00:10:34.16		And that means that that peptide's abundance was inhibited
00:10:37.23		or decreased as a result of Cdk1 inhibition and therefore
00:10:41.02		that phosphorylation site on that peptide represents a
00:10:43.23		Cdk1-dependent phosphorylation site in vivo.
00:10:47.23		And so by applying this approach to the many phosphorylation sites identified here
00:10:52.18		we came up with a list of about 547 phosphorylation sites on about 308 proteins
00:10:58.10		that were clearly Cdk1-dependent and represent likely candidates for Cdk1 targets in vivo.
00:11:04.13		This list of targets included many of the same proteins
00:11:07.02		we had identified in our previous screen in vitro
00:11:09.12		and so for those proteins at least we have very good evidence that these
00:11:11.29		proteins are kinase substrates both in vitro and in vivo.
00:11:17.11		Now the list of substrates includes a wide range of proteins involved in a wide range of processes.
00:11:23.20		I'm not expecting you to see or read any of the gene names on these lists here.
00:11:26.27		This slide is simply meant to illustrate that we have lists of proteins involved in
00:11:30.14		a wide range of interesting processes. Some of these processes are totally expected.
00:11:35.00		For example, DNA replication, spindle behavior, kinetochores and cytokinesis are all
00:11:40.08		processes in which we expect Cdks to be involved in regulating some aspect of those processes.
00:11:45.10		There are also a few surprises here.
00:11:47.27		Protein translation, chromatin structure, and nuclear transport and the secretory pathway
00:11:53.09		all have a number of Cdk substrates involved
00:11:56.08		in those processes and so one might imagine
00:11:58.15		that this will lead to some new understanding of how Cdks might control those processes
00:12:02.17		as well as the more conventional cell cycle regulated processes.
00:12:07.14		But for the rest of this lecture today, I'm not going to talk in detail about
00:12:10.21		any specific substrates or processes, but instead I'm
00:12:13.14		going to tell you how we used our lists of substrates
00:12:16.02		to address some interesting general questions on how cell cycle progression is
00:12:20.10		controlled by Cdks in general.
00:12:23.13		And so we're going to address two questions in the remainder of this lecture.
00:12:26.14		The first one of which is shown on this next slide.
00:12:30.15		And that question is this one: How do different cyclins trigger different cell cycle events?
00:12:35.10		So I told you at the beginning of the lecture that
00:12:37.02		S-phase cyclin-Cdk complexes initiate S-phase and mitotic Cdk-cyclin complexes
00:12:42.00		initiate M-phase and there's good evidence from yeast genetics and elsewhere
00:12:46.00		that S-phase Cdk-cyclin complexes have a better intrinsic ability to
00:12:49.21		initiate S-phase than a mitotic cyclin-Cdk complex.
00:12:54.01		So there's something different about cyclin-Cdk complexes that are activated at S-phase
00:12:58.28		that allows them to more effectively activate the onset of S-phase.
00:13:02.05		And so, what is that difference?
00:13:04.12		Well, one obvious possibility is that the cyclin that associates with the Cdk
00:13:07.21		helps determine the substrate specificity of that Cdk.
00:13:11.20		So in budding yeast, for example, where there's only a single Cdk
00:13:14.20		associating with all these different cyclins,
00:13:16.08		one can imagine that associating with an S-phase cyclin
00:13:19.20		at the beginning of S-phase might target that Cdk for specific substrates involved in S-phase.
00:13:26.01		And so we decided we could address this question on a more global level
00:13:29.15		by actually analyzing the relative phosphorylation rate of
00:13:32.09		a wide range of Cdk substrates using purified S-phase Cdks and M-phase Cdks.
00:13:38.15		And specifically we carried out these studies using the S-phase cyclin Clb5 from budding yeast
00:13:44.08		and the M-phase cyclin Clb2 from budding yeast.
00:13:47.15		And Mart Loog, a post-doc in the lab, basically purified these two kinases
00:13:51.14		and then tested their activity towards about 150 different Cdk substrates
00:13:55.17		to look for substrates that were highly specific for one or the other.
00:13:59.21		Some of his early results are shown in this next slide
00:14:02.07		which gives you an illustration of the sort of thing we found.
00:14:06.03		Here we're looking at autoradiographs of protein gels in which three different proteins
00:14:11.00		listed across the top--Mcm3, Orc2, and Orc6
00:14:14.00		have been treated with either the mitotic cyclin-Cdk complex on the left
00:14:18.14		or the S-phase complex on the right.
00:14:20.22		And you can see, quite clearly, that these three proteins are all phosphorylated
00:14:24.08		much more rapidly by the S-phase Cdk-cyclin complex Clb5.
00:14:29.21		So Mart went ahead and did this exact same reaction with about 150 proteins as I said
00:14:35.06		and the results from those experiments are shown on this slide.
00:14:37.24		So this slide summarizes everything that he found.
00:14:40.15		What we're looking at here is a plot of about 150 proteins
00:14:44.06		each one of which is represented by these little circles on this plot.
00:14:47.10		And these circles are plotted according to the rate of their phosphorylation
00:14:52.02		by Clb2 on that axis and Clb5 on this axis.
00:14:55.22		And so most of the proteins are falling along the diagonal of this plot
00:14:59.21		indicating that they are equally well phosphorylated by both kinases.
00:15:02.24		In other words, they're not cyclin specific targets.
00:15:05.09		However, we found a quite large number of proteins over here on the right
00:15:09.17		especially these red circles here that represent proteins that
00:15:12.25		are far more rapidly phosphorylated by Clb5-Cdk1 than they are by Clb2-Cdk1.
00:15:19.18		So these proteins, and note by the way that this is a log scale here
00:15:23.03		so some of these proteins are 10 or over 100 or even 1000 fold more rapidly phosphorylated by
00:15:28.02		Clb5-Cdk1 than by Clb2-Cdk1.  So these clearly represent proteins that are highly Clb5 specific.
00:15:35.05		That the cyclin is somehow determining or increasing
00:15:37.29		the rate of phosphorylation of these proteins.
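
The reading of that plot can be sketched as a simple calculation: each substrate gets a ratio of its Clb5 rate to its Clb2 rate, and on a log scale the distance from the diagonal is the log of that ratio. The rates, substrate labels, and ten-fold cutoff below are invented for illustration and are not the measured values.

```python
import math


def log10_specificity(rate_clb5, rate_clb2):
    """Distance from the diagonal on a log-log plot (positive favors Clb5)."""
    return math.log10(rate_clb5 / rate_clb2)


def classify(rate_clb5, rate_clb2, min_fold=10.0):
    """Label a substrate by which kinase phosphorylates it faster.

    ASSUMPTION: a ten-fold cutoff, chosen only for illustration.
    """
    fold = rate_clb5 / rate_clb2
    if fold >= min_fold:
        return "Clb5-specific"
    if fold <= 1.0 / min_fold:
        return "Clb2-specific"
    return "not cyclin-specific"


if __name__ == "__main__":
    # Hypothetical relative phosphorylation rates (Clb5, Clb2).
    rates = {
        "substrate_1": (500.0, 0.5),
        "substrate_2": (20.0, 18.0),
        "substrate_3": (100.0, 10.0),
    }
    for name, (clb5, clb2) in rates.items():
        print(name, classify(clb5, clb2), round(log10_specificity(clb5, clb2), 2))
```

Substrates near the diagonal come out with a log ratio close to zero, while a substrate phosphorylated 1000-fold faster by Clb5 sits three log units away from it.
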
00:15:40.14		So what are these proteins? Well, we were satisfied to see that at least five of them
00:15:46.07		are proteins known to be involved in DNA replication, especially Sld2 here.
00:15:50.07		Sld2 is a protein whose phosphorylation is known
00:15:52.25		to be crucial for the initiation of DNA replication.
00:15:55.16		And so these proteins make perfect sense as Clb5 specific targets because those are
00:16:00.04		the proteins that we need to phosphorylate early in S-phase
00:16:02.22		to help drive progression through chromosome duplication.
00:16:07.27		So with this list of cyclin-specific substrates in hand, we next addressed
00:16:12.10		the mechanism underlying this cyclin specificity.
00:16:14.25		Why is it that Clb5-Cdk1 phosphorylates these proteins
00:16:18.06		so much more rapidly than Clb2-Cdk1?
00:16:21.13		Through kinetic studies we discovered that the reason for this higher rate of phosphorylation was
00:16:26.15		that these substrates have a much higher affinity for the Cdk-cyclin complex
00:16:30.05		when Clb5 is associated, suggesting that they might associate with that cyclin subunit.
00:16:36.16		In fact, there are previous suggestions of what
00:16:38.04		the mechanism for this association might be.
00:16:40.25		And those are based on the known crystal structures of Cdk-cyclin complexes from human cells.
00:16:46.06		So this shows the crystal structure of a Cdk-cyclin complex from humans
00:16:50.20		that illustrates very nicely the basic parts of the Cdk-cyclin complex
00:16:55.08		and where cyclin substrates typically associate with this complex.
00:16:59.01		Over on the left is the Cdk catalytic subunit and between these two lobes here
00:17:02.29		is an active site cleft in which you can see this ATP molecule binding right here.
00:17:08.19		Typically a protein substrate would bind along the surface of this protein kinase right here
00:17:12.29		in a way that the serine or threonine hydroxyl would be positioned in such a way
00:17:17.25		to allow the transfer of phosphate from that ATP onto the hydroxyl residue.
00:17:23.10		So, the primary site of substrate association with the Cdk-cyclin complex
00:17:27.00		is of course the active site, the place where that serine or threonine
00:17:30.21		associates with its sequence contacts to be phosphorylated.
00:17:35.16		However, this is probably not the only site of substrate association in a Cdk-cyclin complex.
00:17:39.28		There is considerable evidence from mammals and from yeast as well
00:17:43.13		that there is a docking site on this cyclin itself
00:17:46.20		that can also associate to some extent with parts of the substrate.
00:17:50.10		And this docking site is mostly composed of this large alpha-helix here
00:17:54.18		that contains a number of hydrophobic residues
00:17:57.06		that are together called the hydrophobic patch.
00:17:59.16		It is involved in associating with certain substrates
00:18:01.23		and enhancing activity towards those substrates.
00:18:07.02		So, we obviously hypothesized that perhaps this docking site
00:18:10.16		exists on Clb5 and that it is required
00:18:13.26		for the cyclin-specific phosphorylation that we saw in our experiments.
00:18:18.03		And so to test that the obvious approach was to mutagenize this docking site
00:18:21.09		through a number of single point mutations and then test whether that
00:18:25.08		has any impact on cyclin specificity and that is shown in this slide.
00:18:28.20		And the answer was a definite yes, that mutation of that docking site
00:18:32.27		completely abolishes the Clb5 specificity that we had seen.
00:18:35.23		So here again we're looking at autoradiographs of protein phosphorylation
00:18:39.27		by purified Clb2 on the left two lanes and Clb5 on the right.
00:18:43.20		And we're looking at the phosphorylation of 5 highly Clb5 specific proteins.
00:18:47.20		And you can see that the wild type Clb2, the wt here, phosphorylates these proteins
00:18:52.13		rather poorly whereas wild type Clb5 phosphorylates them extremely well.
00:18:57.18		Once again, indicating how specific these proteins are for Clb5.
00:19:01.03		However, if you mutate the hydrophobic patch or the docking site
00:19:04.15		on Clb5 you find that that specific phosphorylation is almost completely lost
00:19:09.12		indicating that that site is really required for the increased affinity
00:19:12.25		that Clb5-Cdk1 has for these substrates.
00:19:18.20		So we conclude that an interaction, probably simultaneous between
00:19:22.01		this docking site and the active site, allows specific Clb5 substrates to interact
00:19:26.19		with the Clb5-Cdk complex in a high affinity fashion
00:19:30.10		that allows more rapid phosphorylation of those proteins.
00:19:34.20		And so that leads us to at least a partial answer for the question that I first posed
00:19:39.04		which is: How do different cyclins drive different cell cycle events?
00:19:43.07		Well, part of the answer appears to be that the cyclin that associates
00:19:47.01		with the Cdk helps target that Cdk to specific substrates.
00:19:51.01		And so S-phase Cdk-cyclin complexes when they're activated
00:19:54.14		at the end of G1 tend to phosphorylate more rapidly the proteins
00:19:57.25		that are most important for initiating S-phase.
00:20:02.06		OK, now I want to turn to an entirely different sort of general question
00:20:05.05		that we also used our substrate lists to address.
00:20:09.13		And in particular, we used our recent mass spectrometry analysis
00:20:12.11		and our 547 Cdk1-dependent phosphorylation sites to address this question.
00:20:17.20		And this is a much more general question that goes beyond
00:20:21.28		issues of simple cell cycle control but reaches into areas involved in
00:20:25.26		the general issues of phospho-regulation. And the question is this one:
00:20:30.10		How is it that phosphorylation changes the function of a protein?
00:20:33.10		How is it that the addition of a phosphate group to a protein changes that protein's function
00:20:37.20		in a way that allows it to initiate cell cycle events or do other things?
00:20:41.27		And there are typically a couple of different approaches
00:20:44.09		or different mechanisms that are thought to be involved in changing protein function.
00:20:48.03		And the first and possibly most commonly imagined mechanism is this one here
00:20:51.25		the so-called allosteric switch. And the idea with this mechanism is that the placement
00:20:56.13		of a phosphate on a protein in a very specific location
00:20:59.06		causes a precise conformational change in that protein that then
00:21:02.24		initiates some change in its function, its enzymatic activity or its association with something.
00:21:08.21		Now, this mechanism, of course, requires that the position of that phosphorylation site
00:21:11.29		is extremely precise and conserved. In other words,
00:21:15.10		you can't put a phosphate just anywhere on a protein and
00:21:17.23		cause this very precise conformational change.
00:21:20.08		It has to be extremely well positioned
00:21:21.21		and because of that it's very difficult to evolve that sort of phospho-regulation.
00:21:26.23		That kind of phosphorylation cannot appear randomly very easily
00:21:31.22		and achieve the kind of regulation that is required.
00:21:35.08		So the alternative mechanism is this one--which I call bulk electrostatics.
00:21:40.08		And this mechanism suggests that regulation
00:21:43.14		does not require such precise positioning of the phosphorylation.
00:21:47.27		The basic idea here is that the placement of clusters of phosphorylation sites
00:21:51.22		on the surface of the protein, typically on a loop or a disordered region on the surface
00:21:56.01		can result in interesting regulation such as interference with association with another protein
00:22:01.16		or for that matter, promotion of association with phosphate binding proteins.
00:22:06.04		And so this very simple mechanism of phospho-regulation
00:22:08.18		can occur by the placement of clusters of phosphates in a general region
00:22:13.19		of a protein but the exact position of each of those phosphates is not absolutely important,
00:22:18.06		not critical and therefore the position of those phosphates can shift during evolution
00:22:22.05		in different proteins. And so for that reason this mechanism is much more easily evolved.
00:22:28.10		It's very easy to imagine that random mutations
00:22:30.09		could result in the appearance of phosphorylation sites
00:22:32.19		on the surface of certain proteins where they interact with other proteins
00:22:35.17		and that can result in regulatory possibilities that could be selected for.
00:22:41.08		And so, both of these mechanisms are known to be important in different cases.
00:22:46.12		There are examples of proteins that are regulated in both of these ways.
00:22:49.17		But we thought that perhaps our giant list of Cdk substrates would allow us to
00:22:53.11		address the relative importance of these two mechanisms more generally.
00:22:58.06		And so what we did is we took those 547 Cdk1-dependent phosphorylation sites
00:23:03.06		and aligned them with homologous sequences from other species
00:23:07.17		to see how well these phosphorylation sites are actually conserved.
00:23:10.27		And the basic idea here was that if we found that
00:23:13.11		sites are generally extremely well preserved that might argue for the allosteric switch mechanism,
00:23:17.24		but sites that drift during evolution might argue for the bulk electrostatics mechanism.
00:23:23.23		So the next slide gives you an illustration of the sort of thing that we found.
00:23:27.14		Now, here we are aligning a bunch of different protein sequences
00:23:29.29		and I don't expect you to actually read these sequences.
00:23:32.17		The important thing is that there are these little yellow boxes that represent
00:23:35.15		SP or TP di-peptide motifs that are the consensus sequences for Cdk phosphorylation.
00:23:42.09		And along the very top here is the sequence of a part of a protein
00:23:45.27		called Shp1 in which we identified two phosphorylation sites in our mass spectrometry experiments.
00:23:51.24		Those sites are site A and site B.
00:23:53.26		One site is over here in a region of the protein that is known
00:23:57.05		to form into a globular domain and another site is here in a region
00:24:00.21		that's predicted to form a disordered domain.
00:24:03.16		And these other sequences that lie below this top sequence from budding yeast
00:24:06.26		are the sequences of orthologous proteins from various yeast species
00:24:11.25		whose genomes have been sequenced
00:24:13.20		starting with the most closely related yeast here and
00:24:15.25		moving all the way down to the most distantly related yeast at the bottom.
00:24:20.03		And so these protein alignments tell us some very simple things.
00:24:23.19		First of all, site A here is very well conserved in evolution
00:24:27.10		and you can see that almost all of the orthologs of this protein in all these other yeast species
00:24:31.14		contain a likely Cdk consensus site at the exact same position in this highly conserved region.
00:24:38.25		So site A appears to represent an example of the kind of site I mentioned
00:24:42.09		in the left side of the previous slide, a site that is highly conserved in evolution.
00:24:46.27		But site B is not. Site B is very poorly conserved and disappears essentially
00:24:51.15		after a few species and is no longer found at that position in other orthologs.
00:24:57.24		However, if you look in this region of these other species' proteins, you find that
00:25:01.21		SP and TP di-peptide motifs appear scattered throughout this region
00:25:05.08		in a larger number of these yeast homologous proteins,
00:25:08.20		suggesting that even though the initial position here in budding yeast has not been preserved,
00:25:14.09		the Cdk phosphorylation of this region has been conserved in evolution
00:25:18.21		but the exact position of the phosphorylation sites
00:25:20.23		has been shifting dramatically over evolution.
00:25:23.27		So this is obviously consistent with the second idea,
00:25:26.13		that precise phosphorylation site positioning is not
00:25:29.16		required here because these sites might be involved in some more simple, general
00:25:34.03		regulatory mechanism involving association with phosphate binding proteins
00:25:38.03		or interference with protein binding.
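
The distinction between a site conserved at its exact position and a site that drifts within a region can be sketched in Python. The aligned sequences below are toy strings, not the real Shp1 alignment, and the window size and function names are assumptions made for illustration. Only the minimal SP or TP motif mentioned in the lecture is scanned for.

```python
def motif_columns(aligned):
    """Alignment columns holding an S or T whose next non-gap residue is P."""
    cols = set()
    for i, residue in enumerate(aligned):
        if residue in "ST":
            following = next((c for c in aligned[i + 1:] if c != "-"), "")
            if following == "P":
                cols.add(i)
    return cols


def conservation(ortholog, site_col, window=5):
    """Return (exact, regional) conservation of one reference site.

    exact:    the ortholog has an SP/TP motif in the same alignment column.
    regional: the ortholog has an SP/TP motif within `window` columns.
    ASSUMPTION: the window size is arbitrary.
    """
    cols = motif_columns(ortholog)
    exact = site_col in cols
    regional = any(abs(c - site_col) <= window for c in cols)
    return exact, regional


if __name__ == "__main__":
    # Toy alignment; '-' marks a gap. Reference sites sit at columns 2 and 10.
    reference = "AKSPLQ--GATPEKL"
    orthologs = {
        "close_species": "AKSPLQ--GATPEKL",
        "mid_species": "ARSPLQTPGAAPEKL",
        "distant_species": "AKSPLQ--GAAAEKL",
    }
    for site_col in sorted(motif_columns(reference)):
        for name, seq in orthologs.items():
            print(site_col, name, conservation(seq, site_col))
```

In this toy case the site at column 2 behaves like site A, present at the same position in every ortholog, while the site at column 10 behaves like site B: it is lost at its exact position in the more distant sequences, but one of them still carries a motif nearby.
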
00:25:41.11		So we did this exact same alignment for all 547 of our phosphorylation sites
00:25:46.09		and then in the next slide what I'm going to show you
00:25:47.28		is a somewhat complex graphic that illustrates the results that we found from that.
00:25:53.14		And this was done in collaboration with Brian Tuch, a graduate student
00:25:56.22		working in the laboratory of Sandy Johnson at UCSF.
00:26:00.09		And this top plot here represents a hierarchically clustered
00:26:02.27		clustergram as we call it that illustrates the conservation of the precise position of
00:26:08.26		phosphorylation sites in orthologs of the proteins we identified.
00:26:13.29		And so what we're looking at here is a graphic in which there are 547 columns
00:26:18.04		in this graphic, each one of which represents a single Cdk1 dependent phosphorylation site
00:26:23.14		that we identified by mass spectrometry.
00:26:26.17		And then each row in this graphic represents
00:26:29.22		how that phosphorylation site aligns with its orthologs in other species,
00:26:34.06		the same yeast species that I showed in the previous slide
00:26:36.23		starting with the most closely related yeast species
00:26:39.03		and then working to the most distantly related ones.
00:26:41.28		And in each column a yellow box indicates that that phosphorylation site
00:26:47.06		is precisely conserved in its position in that orthologous sequence.
00:26:51.03		In other words, over here on the left this yellow box at the top moves down
00:26:55.16		for a few species and then disappears indicating that this site is only precise...
00:26:59.24		these columns here, these phosphorylation sites
00:27:02.12		are conserved only in the closely related
00:27:04.21		yeast species but then are lost in all distantly related species.
00:27:10.03		And so by looking at this graph you can see that there's only
00:27:12.10		a small group of phosphorylation sites, these ones in here especially
00:27:16.06		and note particularly these ones here that are conserved
00:27:19.04		throughout all the yeast homologs that we identified.
00:27:21.23		And so this small number of phosphorylation sites, perhaps 30 or 40 of them
00:27:26.12		at most, are preserved in large numbers of yeast species, indicating that
00:27:31.29		the precise position of phosphorylation
00:27:33.17		has been conserved in a relatively small number of cases.
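The precise-position clustergram described above can be approximated by a simple presence/absence matrix. The following is a minimal sketch under stated assumptions: the orthologs are already aligned (gaps written as `-`), each site is given as an alignment column, and "precisely conserved" is taken to mean an S or T in that column followed by P as the next non-gap residue. The toy alignment and all names are hypothetical, and the hierarchical clustering step is omitted.

```python
def next_residue(aligned, col):
    """First non-gap residue after alignment column col, or '' if none."""
    for ch in aligned[col + 1:]:
        if ch != "-":
            return ch
    return ""

def site_conserved(aligned, col):
    """True if the aligned sequence has S/T at col, followed by P."""
    return aligned[col] in ("S", "T") and next_residue(aligned, col) == "P"

def conservation_matrix(alignment, site_cols):
    """Rows are species (closest to most distant); columns are sites.

    alignment: list of (species_name, aligned_sequence) tuples.
    site_cols: alignment columns of sites found in the reference species.
    """
    return [[site_conserved(seq, c) for c in site_cols] for _, seq in alignment]

# Hypothetical toy alignment: one site kept everywhere, one lost with distance.
alignment = [
    ("reference", "AKSPLR-TPEK"),
    ("close",     "AKSPLRATPDK"),
    ("distant",   "AKSPLR-AAEK"),
]
matrix = conservation_matrix(alignment, site_cols=[2, 7])
```

In this toy case the first column stays true in every row, like a site conserved throughout the yeasts, while the second column is true only in the closely related species.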
00:27:37.16		OK, so how do we then test the possibility that instead of precise positioning
00:27:44.01		during evolution that we're looking at drifting phosphorylation site positioning?
00:27:48.28		And that required the development of another graphic which is shown below here
00:27:52.17		which I'll take you through slowly because it's a little bit complicated.
00:27:56.16		So in this case, once again, it's another hierarchically clustered graphic in which
00:28:00.22		there are 547 columns, each representing a different phosphorylation site
00:28:05.15		identified in our analyses. But in this case...
00:28:08.17		and once again, the rows represent alignments with orthologs, orthologous proteins
00:28:13.00		from other yeast species. But in this case the yellow box
00:28:16.02		doesn't indicate precise positioning of phosphorylation site, but instead
00:28:19.18		indicates that the ortholog of that particular protein in these other yeast species
00:28:24.03		has a statistically enriched frequency of Cdk consensus sites, SP and TP motifs.
00:28:30.28		In other words, these yellow boxes represent proteins
00:28:33.21		in which the frequency of SP and TP motifs is far greater than that expected by chance.
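One way to ask whether a protein has more SP and TP motifs than expected by chance is a one-sided binomial test. The lecture does not say which statistical test was used, so the sketch below is only an assumption for illustration: the background probability `background_p`, the threshold `alpha`, and the example sequence are all hypothetical.

```python
from math import exp, lgamma, log

def count_sp_tp(seq):
    """Count SP and TP dipeptides in a protein sequence."""
    return sum(1 for i in range(len(seq) - 1)
               if seq[i] in ("S", "T") and seq[i + 1] == "P")

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1.

    Terms are computed in log space so long sequences do not overflow.
    """
    def log_pmf(i):
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1 - p))
    return sum(exp(log_pmf(i)) for i in range(k, n + 1))

def is_enriched(seq, background_p, alpha=0.01):
    """True if the motif count is unlikely under the assumed background rate."""
    n = len(seq) - 1          # number of dipeptide start positions
    k = count_sp_tp(seq)
    return binomial_tail(k, n, background_p) < alpha

# Hypothetical sequence and background rate, for illustration only.
rich = "SPASPKTPESPRTPAASPKK"
rich_count = count_sp_tp(rich)
rich_flag = is_enriched(rich, background_p=0.01)
```

A yellow box in the second clustergram would then correspond to an ortholog for which a test of this kind reports enrichment, regardless of where in the protein the motifs fall.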
00:28:39.00		And so these large numbers of proteins here represent proteins in which
00:28:43.08		even though the precise site of phosphorylation is not conserved
00:28:46.19		as shown up here, these proteins do contain a high frequency of
00:28:50.12		Cdk consensus sites whose positions are clearly drifting during evolution.
00:28:56.04		And so this large number of proteins over here on the right side of this clustergram
00:28:59.24		may represent proteins in which the precise position of phosphorylation
00:29:03.21		does not matter but the regulation of those proteins by phosphorylation
00:29:07.17		is conserved despite that. And so clearly we'd like to think that that evidence
00:29:14.00		tends to suggest that this mechanism on the right here
00:29:16.03		this easily evolved bulk electrostatic mechanism,
00:29:19.05		is a major mechanism by which phospho-regulation can easily be evolved
00:29:23.10		and that drifting phosphorylation sites especially clusters of phosphorylation sites
00:29:26.19		on disordered regions is really crucial for
00:29:30.10		the regulation of many different Cdk substrates.
00:29:34.28		So with that, I want to leave us with the question that we started out this whole lecture with
00:29:39.20		and that is: How do Cdks drive cell cycle events?
00:29:42.04		Well, clearly our list of Cdk substrates, the ones I'm showing here and
00:29:45.21		the many that aren't shown here, probably contains the answer to this question.
00:29:50.06		Clearly it is the detailed analysis of large numbers of these substrates
00:29:53.03		that will lead us to a much better understanding of how Cdks
00:29:57.17		drive the events of the cell cycle and how they alter all these different processes
00:30:02.04		in the cell to initiate cell cycle events.