Introduction to Synthetic Biology and Metabolic Engineering
Transcript of Part 2: Teaching an Old Bacterium New Tricks
00:00:06.07 My name is Kristala Prather and I'm an associate professor of chemical engineering at MIT. 00:00:11.18 In the first part of my presentation, I gave an overview of metabolic engineering and synthetic biology 00:00:16.23 and now I'd like to talk more specifically about work being done in my lab 00:00:20.24 towards expanding the capacity of biology for chemistry, or as I've titled it here, 00:00:25.22 teaching an old bacterium new tricks. 00:00:28.04 So, in my introduction, I gave this maze as an example of how we think about metabolic engineering, 00:00:34.23 where you have the example here of a mouse Wemberly 00:00:38.08 that's lost its pet rabbit Petal and there's a maze of possibilities 00:00:42.16 of how the mouse might get to the rabbit it's looking for. 00:00:45.22 And our goal is to be able to block off, or to obstruct, 00:00:49.24 those pathways which are not going to be productive, 00:00:52.06 or to stimulate, in this case, the mouse to run faster, or in biological terms, 00:00:57.11 to increase the rate at which material will flow through our maze 00:01:00.15 so that we get to the product that we're interested in more quickly. 00:01:03.17 Now, there's another way that we actually think about engineering pathways 00:01:07.15 that also looks at this maze analogy. 00:01:09.25 In this case, our goal is completely different. Now, we actually want to blow the maze up 00:01:16.00 so that rather than forcing the mouse to run from one to the other through all these obstacles, 00:01:21.20 we have a more direct way to get from point A to point B. 00:01:25.00 And that's actually the focus of much of the work that goes on in my lab. 00:01:28.18 When we started thinking about this problem, 00:01:31.23 that is how do we actually get biology to do more chemistry, 00:01:35.19 the question was, well, what kind of targets could we look at? 00:01:38.09 What are good molecules to look at that might be produced by biology? 
00:01:42.00 And in 2004, the US Department of Energy put together a report 00:01:46.02 called "Top Value Added Chemicals from Biomass" 00:01:48.21 where they actually sought to answer that very question. 00:01:51.09 That is to say, if you're using biology, or biomass, 00:01:55.01 as the input for chemicals, what are the right molecules that you'd want to produce. 00:01:59.20 And they came up with a list which is actually called the "top 10" list in the literature 00:02:04.21 of building block molecules. 00:02:06.08 Now, I always find this interesting because it turns out the top ten list actually has 12 lines 00:02:11.24 and a few of these lines have more than one molecule, 00:02:14.09 but nevertheless, it's called the top ten list because that sounds a lot better than the top 14 or 15 list. 00:02:19.12 If we look at this list, we see things for example like glutamic acid, 00:02:23.21 and that's an amino acid. We see aspartic acid, which is also an amino acid, 00:02:28.15 and those were compounds that we weren't really interested in working on 00:02:31.28 because our challenge was to find a pathway that either didn't exist 00:02:36.00 or one that was really, really complicated that we could, again, blow up our maze, 00:02:40.07 if we use that analogy, in order to get to the compound that we're interested in. 00:02:43.27 So, when we looked at this list, we eliminated compounds like that. 00:02:47.22 We also eliminated compounds like glycerol, which it turns out is actually relatively cheap today, 00:02:53.16 but wasn't when this report was first produced. 
00:02:56.16 So, once we went through this process, 00:02:58.12 of saying, well here are things that we're not intellectually interested in, 00:03:01.17 and here are compounds that we don't think really give us the value that we want, 00:03:05.08 we began to focus on a couple of different compounds, 00:03:08.16 one of which is glucaric acid, that's shown here, 00:03:10.25 and I'd like to talk to you today about our work that we've done 00:03:14.06 to be able to produce this compound in a microbe, namely, E. coli. 00:03:19.00 Glucaric acid, as it's shown again here, is a structure that has 6 carbons, 00:03:24.13 so it's actually pretty similar to glucose in how it's arranged, and it is actually a natural product 00:03:30.07 and I mentioned natural products in the first half of my talk 00:03:32.24 as being compounds that are naturally produced by nature. 00:03:35.26 It turns out this is a compound that's produced in fruits and vegetables 00:03:39.09 and also in mammals, but there's no known microbial pathway for it, 00:03:43.10 meaning that if we look at the simplest organisms, 00:03:45.22 the ones that are easiest to think about putting into a factory, 00:03:49.13 there are no microbes like that where we know that glucaric acid is produced. 00:03:53.02 This compound has been studied for therapeutic purposes 00:03:56.10 either as an agent to reduce cholesterol, or even possibly to fight cancer, 00:04:00.25 but we've actually been more interested in its properties as a monomer 00:04:04.04 for different kinds of materials, or as detergents. 00:04:07.11 And the final bullet point on this slide just emphasizes the fact 00:04:10.10 that we actually know how to make this compound chemically, from glucose, 00:04:14.19 but it turns out that that process, the way it exists now, 00:04:17.13 is pretty messy, it requires a lot of harsh materials, 00:04:20.21 and so it's both not economical and not environmentally friendly. 
00:04:24.22 So, we set out to come up with a way to make glucaric acid using biology. 00:04:29.05 This is actually what the natural pathway looks like 00:04:33.10 and hopefully you can see a lot of arrows here that might make you a little bit squeamish 00:04:37.15 if you were going to graduate school and your advisor said, 00:04:40.09 you've got to get this whole thing to work in E. coli. 00:04:43.08 Just to give you a quick overview of this pathway, 00:04:46.10 the compound we're interested in, glucaric acid, is in this box at the top. 00:04:50.00 I mentioned that this is something that could come from glucose 00:04:52.18 and we actually will use glucose as our starting compound as well, 00:04:55.20 and glucose is on this figure. 00:04:57.20 I'll give you a second to look and see if you can find it, 00:04:59.27 because it turns out there are quite a bit of arrows here, 00:05:02.17 but if you look very closely along the left-hand side, then you can see glucose right here. 00:05:07.15 You can also see that all these arrows are going back and forth, 00:05:10.22 you have this interaction with the pentose phosphate pathway, 00:05:13.17 you have another sugar, galactose, which is one input, 00:05:16.13 and you actually have an additional output, which is ascorbic acid. 00:05:19.23 This is a mess, 00:05:20.27 and this is not something that we would really want to think about putting into E. coli. 00:05:25.14 So, our challenge was to figure out, is there a different way for us to get from glucose 00:05:30.26 to the molecule that we're interested in, that would be much simpler, 00:05:33.25 that would have much, much less of this maze-like effect. 00:05:36.22 One of the nice things about having a molecule, however, that is a natural product, 00:05:42.25 is that we could go to the databases and say, is glucaric acid there? 00:05:46.26 That is, in known metabolism, is there an example where glucaric acid has been found 00:05:52.04 to be associated with biology. 
00:05:54.16 And in fact, what we found is that glucaric acid could be produced from a compound 00:05:57.29 called glucuronic acid and it can be produced using an enzyme called uronate dehydrogenase 00:06:03.03 that's actually found in a bacterium called Pseudomonas syringae. 00:06:06.15 But that was sort of the end of the story as far as Pseudomonas was concerned. 00:06:11.00 With our glucuronic acid, now, we could go back to the databases again 00:06:14.20 and ask the same question, that is, do we see glucuronic acid being produced by nature, 00:06:18.23 and in fact what we could find is that glucuronic acid could be produced 00:06:22.00 from a compound called myo-inositol with an enzyme called myo-inositol oxygenase. 00:06:26.28 And that enzyme is found in a number of sources, a number of mammalian sources, 00:06:31.12 and fungal sources, and we actually chose the variant from mouse 00:06:35.04 because it was one that had been shown to work well when it was expressed in E. coli. 00:06:39.15 But that was really the end of the story as far as mammalian biology was concerned, 00:06:43.11 but if we said where else does myo-inositol show up in metabolism, 00:06:47.21 we could actually find a linkage directly from myo-inositol, or to myo-inositol, 00:06:52.00 from glucose, and that was work done by John Frost's lab at Michigan State, 00:06:55.18 where he showed that you could use glucose as the input, 00:06:58.15 you would go through glucose-6-phosphate, 00:07:00.14 and then you would have just a single recombinant enzyme, 00:07:03.09 that is, a yeast myo-inositol-1-phosphate synthase that would produce 00:07:07.08 myo-inositol-1-phosphate and that, in E. coli, was naturally dephosphorylated 00:07:12.06 in order to give the myo-inositol compound that we're interested in. 00:07:14.24 So, now, rather than having this very complex network 00:07:17.24 of 11 or 12 steps, we really only need 3 different enzymes 00:07:21.29 to be expressed in E. coli, although from three very different sources, 00:07:25.24 in order to get the compound that we're interested in.
00:07:27.26 And so we could take advantage of that to actually have the first gene 00:07:31.17 directly PCR-amplified because we knew that would work in E. coli from John Frost's work. 00:07:36.11 The second gene we could take advantage of this DNA synthesis 00:07:39.23 that I talked about in the first part to be able to have this version of the gene synthesized 00:07:44.21 but synthesized in a way that E. coli would be able to produce it more easily 00:07:49.11 than the natural sequence of DNA that would come from mouse, 00:07:52.06 and then we actually had to do a little bit of work to figure out what was the sequence of DNA, 00:07:57.05 or the gene, encoding the uronate dehydrogenase in bacteria. 00:08:01.23 But once we were able to do that, we now had all three of the genes that we needed 00:08:05.18 to put into E. coli to see whether or not it could make glucaric acid. 00:08:09.19 So, when we co-expressed all three of these genes, 00:08:13.28 what we found was exactly what we hoped to find. 00:08:15.28 And that is that we got glucaric acid being produced. 00:08:19.03 And the figure that's shown here shows the titer, or the concentration, in grams per liter, 00:08:24.05 of glucaric acid that we can measure in the culture medium. 00:08:27.13 So, this is actually spit out by the cell into the surrounding medium. 00:08:31.20 And I have two different bars that are shown here, one that has 0.1 millimolar IPTG 00:08:36.10 and one that has 0.05 millimolar IPTG. 00:08:39.04 I want to take just a second and explain what that really means. 00:08:42.07 IPTG in this case is what we'd call an inducer; 00:08:45.15 that means that it's something that we add to the culture that tells the cells 00:08:48.26 you should start making the proteins, or the enzymes, that we're interested in.
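[Editorial sketch, not part of the talk] The shortened route described above, from glucose to glucaric acid, can be written down as a small data structure to make the "three recombinant enzymes from three sources" point concrete. This is only an illustrative sketch: the class and function names are hypothetical, and the labels "native uptake and phosphorylation" and "native dephosphorylation" are shorthand for the steps the talk says E. coli performs on its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    substrate: str
    product: str
    enzyme: str
    source: str  # organism the activity comes from


# Steps as described in the talk; source "E. coli" marks native host activities.
PATHWAY = [
    Step("glucose", "glucose-6-phosphate",
         "native uptake and phosphorylation", "E. coli"),
    Step("glucose-6-phosphate", "myo-inositol-1-phosphate",
         "myo-inositol-1-phosphate synthase (INO1)", "yeast"),
    Step("myo-inositol-1-phosphate", "myo-inositol",
         "native dephosphorylation", "E. coli"),
    Step("myo-inositol", "glucuronic acid",
         "myo-inositol oxygenase (MIOX)", "mouse"),
    Step("glucuronic acid", "glucaric acid",
         "uronate dehydrogenase", "Pseudomonas syringae"),
]


def is_connected(pathway):
    """True if each step's product is the next step's substrate."""
    return all(a.product == b.substrate for a, b in zip(pathway, pathway[1:]))


def recombinant_steps(pathway):
    """Steps that must be introduced into the host (non-native sources)."""
    return [s for s in pathway if s.source != "E. coli"]


if __name__ == "__main__":
    print("connected:", is_connected(PATHWAY))
    for s in recombinant_steps(PATHWAY):
        print(f"{s.enzyme} from {s.source}")
```

Listing the native steps alongside the recombinant ones makes it easy to check that the route is unbroken from glucose to glucaric acid.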
00:08:52.16 And what's shown now is the result on this slide, as something that we see a lot of times, 00:08:56.27 which is that if we have a somewhat higher concentration of our inducer, 00:09:00.21 where we're making more protein, you see that we actually have less of the product 00:09:04.27 than if we have a lower concentration of our inducer. 00:09:07.00 And that's really a core principle of metabolic engineering, which is that 00:09:10.18 changes that we make to the cell have these very broad systems-wide effects 00:09:14.23 that we don't always understand. 00:09:16.08 And so every time we seek to engineer an organism to make a compound we're interested in, 00:09:21.17 we have to go through this trial and error process of trying to identify 00:09:24.24 what really are the best conditions to make the compound that we're interested in. 00:09:28.25 The second thing that I want to point out is that we see, 00:09:31.22 besides glucaric acid being produced, we also find that we have myo-inositol, 00:09:36.13 which is accumulating, meaning we can measure that in the culture medium. 00:09:39.14 And the fact that that myo-inositol is there, it lets us know that the enzyme 00:09:44.06 which is converting myo-inositol to glucuronic acid is a limitation in the system. 00:09:48.22 That is, it's not working the way it's supposed to work, such that all the myo-inositol that's produced 00:09:53.24 is converted to glucuronic acid, and then on to glucaric acid. 00:09:57.20 I always think at this point, there must be a joke in here somewhere. 00:10:01.24 We have a yeast, a mouse and a bacterium 00:10:05.07 and they all go into a bar and I'm not really sure what the end result is here, 00:10:08.27 but we know that glucaric acid comes out somewhere. 00:10:10.26 Unfortunately, it's not quite that easy and we have a lot of challenges that we have to try to address 00:10:16.12 in trying to actually get the cells to make a lot more of this product that we're interested in.
00:10:20.20 The first of those challenges actually comes into place 00:10:24.26 when we actually look at the fact that we have this myo-inositol accumulating, 00:10:28.11 as I pointed out in the first graph, that showed glucaric acid being produced. 00:10:31.25 And in this case now, if we take a closer look at this enzyme, 00:10:35.00 all we're focused on is this one reaction. 00:10:37.21 We can see that this MIOX gene, the myo-inositol oxygenase, 00:10:41.12 takes myo-inositol as its input. It also uses molecular oxygen 00:10:45.19 and the product that's produced now is glucuronic acid. 00:10:48.15 And so we know that the cells are not actually doing this reaction, 00:10:52.29 that is, converting myo-inositol to glucuronic acid, at a fast enough rate 00:10:57.08 to consume all of it. So, if we study that enzyme by itself, 00:11:00.26 the experiment we did in this case was to look at cells producing just this enzyme, 00:11:05.01 so it doesn't have the first enzyme, which gives us myo-inositol, 00:11:08.09 it doesn't have the third enzyme, which actually takes that glucuronic acid 00:11:12.05 and converts it to glucaric acid. 00:11:13.19 Instead, we're looking at this in isolation, and we looked at two different conditions: 00:11:17.20 one where we actually have myo-inositol present in the culture medium 00:11:21.23 as we're growing up the cells and making the protein, 00:11:24.06 and one where it's missing. 00:11:26.04 And the only difference now is that at a point where we measure the activity of the cells, 00:11:30.22 we actually have some cells that saw substrate, that is the myo-inositol, 00:11:34.18 and some that didn't, but at the same time, when we would go to analyze them, 00:11:38.23 we take the cells away, so now there's no myo-inositol, 00:11:42.05 we break open the cells and release the protein and we expose those cells 00:11:46.18 to the same concentration of the substrate.
00:11:48.25 And in doing that and measuring the activity, 00:11:51.10 what we find is that for the cells that were able to previously see the substrate, 00:11:55.23 the activity of that protein is about an order of magnitude higher 00:11:59.03 than the cells that only saw substrate for the first time after the protein had actually been produced. 00:12:04.20 Well, so this actually raised an interesting question for us. 00:12:08.09 And we thought about how we actually solve this problem, 00:12:10.27 and I can tell you the answer is not toss in a lot of myo-inositol, 00:12:14.11 because that's actually cheating. What we want to do is start from glucose, 00:12:17.19 which is going to be a more cheaply available substrate, 00:12:20.02 and make the product that we're interested in. 00:12:22.00 But now we can think about this as engineers and say, well, 00:12:26.03 what information do we have that actually gives us some guidance 00:12:29.14 on how we might actually be able to solve this problem, 00:12:32.04 even if we don't exactly understand the underlying reasons for the phenomenon that we see. 00:12:37.07 And so the first thing that we thought is, ok, what we want then 00:12:40.15 is for that first enzyme, the INO1, to make a lot of the myo-inositol, 00:12:45.14 and then that would be really good 00:12:47.06 because that's what we need for the second enzyme to be effective. 00:12:50.07 The only problem with that is that it sounds really good to say that, 00:12:53.06 but as we've worked on that, that turned out to be a lot easier said than done. 00:12:56.26 At the same time as we were looking at this, 00:12:59.12 we actually came up with another idea.
00:13:02.28 In this case, the idea came from a collaborator, John Dueber, 00:13:07.07 in SynBERC, which is the Synthetic Biology Engineering Research Center, 00:13:10.18 and John's work was looking at something called enzyme colocalization, 00:13:15.05 where the goal here was to be able to take enzymes 00:13:17.22 that normally might be freely dispersed throughout the cell, 00:13:20.15 with no reason for them to be together, 00:13:22.15 and to cause a way for those enzymes to be physically located next to each other. 00:13:26.24 In fact, what happens in this case is that the enzymes, shown here now as MIOX and INO1, 00:13:32.19 are actually exposed, or they have covalently attached to them these tags, 00:13:37.01 and those tags fold into a certain 3-dimensional conformation 00:13:40.28 that can then be recognized by a different piece of a protein. 00:13:45.08 That piece of a protein can then be put into something that we call a scaffold, 00:13:48.16 and if you now have the scaffold in the cell, 00:13:51.08 and you have these enzymes that are tagged with pieces that will recognize that scaffold, 00:13:56.04 that actually causes two enzymes to become located close to each other within the cell. 00:14:01.22 So, our idea here was very simple, that if we couldn't actually change the activity of the enzyme 00:14:06.19 and the way that we could get the upstream enzyme to make much more product, 00:14:10.11 if we actually reduced the distance between the two enzymes, 00:14:13.26 that would give us a higher local concentration of myo-inositol, 00:14:17.18 and maybe if that local concentration was higher, 00:14:19.21 that would give us the higher activity that we had seen before, 00:14:22.13 and that would actually give us higher yields and productivities.
00:14:25.15 And the first way that we tested this was exactly as it's diagrammed on this slide, 00:14:31.02 where we actually had just these two enzymes being recruited to the scaffold, 00:14:35.01 in a one to one ratio, and in doing that, we actually got an increase of about a factor of 3 00:14:40.29 in the amount of glucaric acid that we were producing. 00:14:43.17 Now, as all good scientists, we have to ask ourselves, 00:14:46.27 is this working the way that we want it to work? 00:14:49.10 And I'll remind you that our theory here was that what we would get was not just more glucaric acid, 00:14:55.05 but that that would happen because we would have a higher activity of MIOX, 00:14:58.23 that is, we would have better activation, and that would result in this faster conversion 00:15:02.27 that would give us more of the product that we're interested in. 00:15:05.15 So, we actually needed to test that theory, 00:15:08.07 that is, to measure the activity of this MIOX protein and find out 00:15:12.00 whether or not it actually had higher activity, as we supposed that it might. 00:15:16.25 What's shown now in the upper left-hand corner is the data for the product, or the glucaric acid titer, 00:15:22.19 where the lighter bars here are, well, on the left hand side, I should say, 00:15:26.15 without scaffold, and then on the right hand side, with scaffold. 00:15:29.10 And you can see again, these are two different conditions in terms of how much of this IPTG we use 00:15:34.14 to induce the expression of the proteins. 00:15:36.28 And in the first case now, of these lighter bars, there's no real difference 00:15:41.01 between not having scaffold and having scaffold, 00:15:43.22 on the amount of product that's being produced, 00:15:46.03 and if we actually look at the activity of the protein, 00:15:48.17 there's also no significant difference between the protein activity here and the protein activity in this case as well.
00:15:54.25 However, in our best case, where we actually had an increase of 3-fold 00:15:58.08 in the amount of glucaric acid being produced, that's the darker bar in this case, 00:16:01.29 we can look at the specific activity of the protein and we see about a 30% improvement 00:16:07.15 in the activity of this protein relative to when the scaffolds aren't present. 00:16:11.15 And the p-value is here just to show you that that difference is actually significant. 00:16:15.15 So, now we've actually verified that we have not just higher production of the product that we're interested in, 00:16:21.08 but we're getting that higher production by the mechanism that we had supposed 00:16:25.17 would actually happen. 00:16:27.12 Now, one of the nice things about these scaffolds is that what it allows you to do 00:16:31.16 is to explore different stoichiometries. 00:16:33.22 What I mean by that is you don't just have to have one of one protein 00:16:37.19 and one of a second protein coming together, but you can actually, in that scaffold, 00:16:41.27 dial in the stoichiometry by specifying the number of binding domains that you have 00:16:47.02 for each particular protein. So, this is an example of a different scaffold, 00:16:50.20 where you can see two binding domains for one of the proteins, 00:16:53.29 four binding domains for another protein and a single binding domain for the last protein. 00:16:58.12 And if we put that together, what it actually means is that we have, 00:17:01.10 in this case, four copies of the first gene, the INO1 enzyme, that is, 00:17:06.05 two copies of the second enzyme, and only one copy of that third enzyme. 00:17:09.23 This actually allows us to look at a wide variety of different configurations 00:17:14.20 as well as look at varying the amount of the scaffold that we have 00:17:18.11 and the amount of the enzyme that we have, 00:17:20.03 to look at the effect of that on the productivity. 
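[Editorial sketch, not part of the talk] The idea of "dialing in" stoichiometry by choosing the number of binding domains per enzyme can be illustrated with a few lines of code. This is a minimal sketch under two assumptions that are not stated in the talk: that each binding domain recruits exactly one enzyme copy, and that the enzymes are ordered INO1, MIOX, then uronate dehydrogenase (abbreviated here as "Udh", a hypothetical shorthand). The function names are also hypothetical.

```python
from itertools import product

ENZYMES = ("INO1", "MIOX", "Udh")  # "Udh" = uronate dehydrogenase (shorthand)


def ratio_string(design):
    """Format a scaffold design as an enzyme ratio, e.g. '4:2:1'."""
    return ":".join(str(design[name]) for name in ENZYMES)


def enumerate_scaffolds(max_domains):
    """Yield every design with 1..max_domains binding domains per enzyme."""
    for counts in product(range(1, max_domains + 1), repeat=len(ENZYMES)):
        yield dict(zip(ENZYMES, counts))


if __name__ == "__main__":
    # The example from the talk: four INO1, two of the second enzyme, one of the third.
    example = {"INO1": 4, "MIOX": 2, "Udh": 1}
    print(ratio_string(example))

    designs = list(enumerate_scaffolds(max_domains=4))
    print(len(designs), "candidate designs with up to 4 domains per enzyme")
```

Even with a cap of four domains per enzyme, the design space is already 64 configurations before scaffold induction level is varied, which is why a systematic screen is needed.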
00:17:22.23 And the result of that exercise is shown here, 00:17:25.12 where each of those dots is the average of a triplicate experiment 00:17:29.07 where we have the same amount of enzyme being produced in all cases, 00:17:32.20 but we're looking at a wide variety of scaffold induction levels 00:17:36.07 and also looking at a very wide configuration of different scaffolds themselves, 00:17:41.10 meaning different numbers of binding domains for these enzymes that we're interested in. 00:17:44.18 What we see is that we actually are able to change 00:17:49.00 the activity of this enzyme over a factor of about 7-fold 00:17:52.21 and that actually results in a change in the amount of glucaric acid that we have 00:17:56.24 in a factor of about 5-fold. So, we really have shown that we can use, 00:18:01.04 in this case what's called a synthetic biology device, 00:18:04.02 that is, these protein-protein co-localization mechanisms, 00:18:07.08 to be able to solve a problem with an engineering approach, 00:18:11.05 even if we still don't understand exactly what is it that leads to these differences 00:18:15.21 that we see in the activity of the protein. 00:18:18.08 Now, I want to remind you again of this maze analogy that we had before 00:18:24.03 of a protein, or rather a compound, coming into a maze 00:18:28.02 and having a number of different places that it could go. 00:18:30.17 And I showed a very simple diagram before of the maze having four different entry points. 00:18:35.11 Well, the reality is that this is really what the maze looks like inside the cell, 00:18:39.28 where each of the individual dots in this figure represents a particular chemical, 00:18:44.10 and each of the lines between those dots represents an enzyme 00:18:48.02 that can convert that chemical into something else. 00:18:50.20 So, that means that the networks that we're really talking about are very, very large mazes, 00:18:55.16 not these very simplified ones that I showed you.
00:18:57.26 And if our goal is to have glucose, for example, as a starting molecule, 00:19:01.06 work its way through this maze, and end up with a final compound that we're interested in, 00:19:05.28 we can often have by-products that are being produced. 00:19:08.24 And ideally what we'd like is to, again, knock-out those unproductive routes, 00:19:13.19 which are going to lead to byproduct formation, but the question becomes, 00:19:17.05 what if your byproduct is actually growth? 00:19:19.11 And growth in this case also means the ability to make the enzymes that you need 00:19:24.19 in order to catalyze all these chemical reactions 00:19:27.12 that are going to give you conversion of your starting substrate, glucose, 00:19:30.21 down to your final product, glucaric acid. 00:19:32.28 In this case now, we don't have the option of simply knocking out or deleting growth, 00:19:38.16 because now we're not actually going to make the enzymes that we need 00:19:41.10 and this means that we have to have a different way of solving this problem, 00:19:44.23 or a different approach to dealing with the byproduct that we have. 00:19:48.08 So, what we can do in this case is again, take advantage of these principles of synthetic biology, 00:19:54.05 which are based on design, to think about a control system. 00:19:57.17 In particular what we want is dynamic control of these activities. 00:20:01.22 We like to have our initial condition be fast growth, or growth being favored, 00:20:06.02 such that we actually make not just the cells, but again the proteins that we need, 00:20:10.03 that are going to give us the enzymes that give us the chemical reactions 00:20:13.02 that we need to make the product that we're interested in. 00:20:15.07 And then we want to trigger a switch to a production phase 00:20:18.16 where we say, stop growing now, and instead of growing, 00:20:21.15 use all of that glucose to make the molecule that we want you to make. 
00:20:25.00 I can represent that diagrammatically like this, 00:20:27.26 where if we have our competing activity, initially, when the input is low, 00:20:32.04 that activity will be high, and at some point, I'm going to now add an input 00:20:36.03 that causes the competing activity to be low. 00:20:38.14 You can see that now, specifically, in what we're interested in, which is growth versus production, 00:20:43.24 which is that we want growth to actually start high, 00:20:46.29 and then after awhile, we want growth to go down, and instead we want the production here 00:20:52.21 to actually start to go up. This is actually something called a genetic inverter. 00:20:57.10 It's an inverter because when the input is low, the output is high. 00:21:01.17 When the input is high, the output is low. 00:21:04.05 And there is actually a precedent for this in nature, namely in secondary metabolite production. 00:21:09.03 Now, for secondary metabolites, these are natural products 00:21:12.25 where growth first is favored, and then the cell will naturally make this switch 00:21:17.12 such that you then will have the metabolites being produced later. 00:21:21.00 So, how do we actually make this process happen 00:21:25.20 when we're talking about having a switch for growth 00:21:28.22 where ideally what we're doing is having the cells use glucose for growth initially, 00:21:32.28 and then change that in order to use glucose for product formation 00:21:36.06 at some point after which we apply our trigger. 00:21:39.02 If we look at how glucose is normally used in our cells, 00:21:42.05 it comes in in what's called the PTS system, 00:21:44.11 and that PTS system brings in glucose as glucose-6-phosphate. 00:21:48.21 And it has two different routes that it can go into; 00:21:50.27 glycolysis or the pentose-phosphate pathway 00:21:54.02 and that's actually how that glucose is used by the cells for growth. 
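[Editorial sketch, not part of the talk] The genetic inverter described above (input low, output high; input high, output low) is often idealized with a repressive Hill function. The sketch below uses that common textbook form as an assumption; the talk does not specify a model, and the parameter names and values (`max_output`, `k`, `n`) are purely illustrative.

```python
def inverter_output(inducer, max_output=1.0, k=1.0, n=2.0):
    """Repressive Hill function: output falls as the input signal rises.

    inducer    -- input signal level (arbitrary units, must be >= 0)
    max_output -- output when no input is present
    k          -- input level giving half-maximal output
    n          -- Hill coefficient (steepness of the switch)
    """
    if inducer < 0:
        raise ValueError("inducer level must be non-negative")
    return max_output / (1.0 + (inducer / k) ** n)


def growth_vs_production(inducer):
    """Toy split: the growth share follows the inverter, production gets the rest."""
    growth = inverter_output(inducer)
    return growth, 1.0 - growth


if __name__ == "__main__":
    for level in (0.0, 0.5, 1.0, 2.0, 10.0):
        growth, production = growth_vs_production(level)
        print(f"input={level:5.1f}  growth={growth:.2f}  production={production:.2f}")
```

With no input, the competing activity (growth) is at its maximum; as the input is added, growth falls and the share left for production rises, which is the switch from the growth phase to the production phase.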
00:21:58.06 That's how the glucose is eaten, if we want to think about it that way. 00:22:01.16 And that's the process that we want to compete against. 00:22:03.24 Well, glucose-6-phosphate is the original substrate of our glucaric acid pathway, 00:22:08.19 but we didn't really want to deal with quite this complexity to start with, 00:22:12.12 so we decided to start on a simpler scale 00:22:14.22 and see if we could just address the glucose utilization issue 00:22:17.19 and then what we're doing now is to try to work up to the increasing complexity 00:22:21.27 that's required to deal with glucose-6-phosphate specifically. 00:22:25.11 That can be addressed by the fact that there is actually another way that glucose can come into the cell. 00:22:30.00 It can come in through what's called the galP, or galactose permease, 00:22:33.29 and in this case, it comes in as free glucose. 00:22:36.11 That glucose now has to be converted to glucose-6-phosphate 00:22:39.21 with an enzyme called glucokinase that uses ATP. 00:22:43.15 And because now the glucose has to go through that route, 00:22:46.17 it gives us just a single control point for being able to regulate, 00:22:51.01 that is control, how much of the glucose goes into our endogenous metabolism, or growth, 00:22:56.09 versus what goes into the product that we're interested in. 00:22:58.18 So, we can actually have this system now where we knock-out the PTS system, 00:23:02.25 we apply what we describe as a valve to regulate Glk activity, 00:23:07.06 and in doing that, we're able to modulate how much of the glucose is available 00:23:12.00 for endogenous metabolism, that is for growth, 00:23:14.21 versus how much is available for productivity. 00:23:17.06 And I just want to remind you that when we're talking about modulating the protein, 00:23:21.10 that is, how much of the glucokinase that's available, what we're really talking about 00:23:26.03 is controlling how much of the DNA, or how that DNA is being expressed. 
00:23:30.18 So, we're actually doing all of our manipulations at the level of DNA synthesis, 00:23:34.21 which comes back to how we think about synthetic biology. 00:23:37.14 So, one way that we can actually test this, 00:23:41.27 rather than immediately going to a process where we have to worry about dynamic control, 00:23:46.18 is to look at what we would call static control of the system. 00:23:49.28 And that is that we can replace the natural glucokinase operon, 00:23:54.01 or production system, 00:23:55.20 which naturally consists of two different promoters that are negatively regulated 00:24:00.01 by this protein called FruR, we can get rid of all of that regulation, 00:24:04.16 that is we can replace that DNA, and instead have a library of different promoters 00:24:09.26 where the binding site for FruR is gone, 00:24:12.19 so the only thing that's regulating how much of this protein is produced 00:24:15.25 is the kind of promoter that we use. 00:24:17.26 And by varying the strength of these promoters, by using different variations here, 00:24:22.21 then we can end up with a library of different expression states 00:24:25.16 and ask the question, does that actually affect how much of a heterologous product 00:24:30.08 we could actually produce. 00:24:31.27 Here's now a little bit of characterization of this library. 00:24:35.16 The first thing that we're looking at in this slide is whether or not we actually do have increases in the mRNA, 00:24:40.28 that is, whether or not changing the promoter strength 00:24:43.10 changed the transcription, and then if that corresponded to increases in the protein being produced. 
00:24:48.09 And what's shown in this case now, along the x-axis, is the relative promoter strength, 00:24:52.29 from very low strength, or weak promoters, up to very high strength promoters, 00:24:57.15 and then what's shown on the y-axis, on the left-hand side, 00:25:00.19 is the activity of the protein that we're interested in, glucokinase, 00:25:04.00 and what's shown on the right hand side is the mRNA levels. 00:25:07.05 And you can see now, that activity, which is in the solid circles, 00:25:11.19 does actually go up as we go from low promoter strengths 00:25:15.14 up to high promoter strengths, but it only goes up to a certain point, 00:25:19.01 after which we see it start to decline. 00:25:21.01 The same thing is true for the mRNA, that it actually will go up as we go along this axis here, 00:25:25.20 and it only will go up to a certain point and then it starts to decline as well. 00:25:30.06 These measurements were all done where we use glycerol 00:25:33.20 as a carbon source instead of glucose and that's actually to allow us 00:25:37.00 to decouple growth from measuring the properties of this enzyme 00:25:40.29 just to see if the library is working. 00:25:42.26 And what we actually found when we went to glucose 00:25:45.10 is that when the expression levels were too high here, 00:25:48.07 then these cells no longer grew. So, this cell has high mRNA, 00:25:52.03 but you can see the protein levels are pretty low. 00:25:54.10 And these cells would not grow on glucose. 00:25:56.29 The ones where the protein levels were still pretty high 00:26:00.01 would grow on glucose, except that we did have this gray region here, 00:26:03.29 this stipple region, where we saw the cells could grow, but only very, very poorly. 
00:26:09.17 We could then take the cells that we knew were growing well, in this region here, 00:26:13.25 and then ask, can we actually now, in glucose, 00:26:17.03 relate the growth rate to the activity of this protein, which tells us whether or not it really can control 00:26:23.16 how much of the substrate is available for endogenous growth. 00:26:26.25 The result of that experiment is shown on this slide, 00:26:30.24 where again what we have now is expressed in terms of Glk activity 00:26:34.15 where it goes from a very low activity up to our higher activity 00:26:37.29 and then what's shown on the x-axis is the growth rate of the cells. 00:26:41.15 The native promoter is shown right here in this open triangle 00:26:44.25 and the filled squares will tell you that we're able to actually increase the growth rate 00:26:49.13 of this cell. We can also decrease the growth rate of the cell by changing the glucokinase activity. 00:26:55.08 So, that confirms for us that we actually do have a control point 00:26:58.12 or a specific protein where if we vary the activity of that protein, 00:27:02.22 that actually will tell us, or allow us to control, rather, 00:27:05.26 how the cells are growing. The next question then is, 00:27:09.02 if you can control the growth of the cells, does that actually result in more product being produced. 00:27:14.13 So, in this case we have an example molecule, or a test molecule, gluconate, 00:27:18.02 this can be produced in one single enzymatic step from glucose, 00:27:21.25 and again, the competing reaction here is glucose-6-phosphate, 00:27:24.29 which is actually going to be produced from glucokinase. 
00:27:27.18 What's shown now here is 5-KG, this is 5-ketogluconate, 00:27:31.05 which is just a spontaneous product that we actually get in very, very small amounts, 00:27:35.02 but we want to account for that by making sure that we look at the sum of both of these products, 00:27:39.18 to give us a sense of how much of the flux is coming through this side 00:27:43.17 versus this side of our pathway. 00:27:45.23 And now what's shown in this slide is actually the result of that experiment, 00:27:49.23 where what's shown now is the Glk activity, that is from, lower to higher amounts of that protein, 00:27:55.17 which is controlling how much glucose goes into endogenous metabolism 00:27:59.18 and what's shown on the y-axis is the molar yield, 00:28:02.29 and this is really how much of the glucose that we start with 00:28:06.04 goes into the compound that we're interested in, versus goes into other byproducts, 00:28:10.11 or into cellular growth. 00:28:12.04 And we see this very nice relationship where, when the activity is very low, 00:28:15.23 then we can see that we have a moderate amount of the yield, in this case, 00:28:21.12 that is the product that we're interested in. As we increase the activity, 00:28:25.00 then we get a slight bump, but as the activity goes higher and higher, 00:28:28.21 what we actually find is that we are decreasing the yield, 00:28:31.29 which basically tells us that as we get, now, 00:28:34.17 to the point where we're making more and more of this glucokinase, 00:28:38.06 we have more of the glucose going into growth 00:28:40.17 and less of it going into the product that we're interested in. 00:28:43.00 So, that actually gave us the validation that we needed 00:28:46.18 that the system design that we had envisioned, 00:28:49.12 one in which we could control the activity of this enzyme, 00:28:52.20 was going to be useful. 
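The molar yield described here can be written out as a short calculation. This is a sketch under stated assumptions: the function name `molar_yield` is hypothetical, the yield is taken per mole of glucose consumed (an assumption about the exact denominator), and the numbers in the example are illustrative rather than read from the slide. As in the talk, gluconate and 5-KG are summed.

```python
def molar_yield(gluconate_mol, five_kg_mol, glucose_consumed_mol):
    """Moles of product (gluconate plus 5-ketogluconate) per mole of glucose.

    Assumption: the denominator is glucose consumed. All inputs are in
    the same molar units (for example mmol).
    """
    if glucose_consumed_mol <= 0:
        raise ValueError("glucose consumed must be positive")
    if gluconate_mol < 0 or five_kg_mol < 0:
        raise ValueError("product amounts must be non-negative")
    return (gluconate_mol + five_kg_mol) / glucose_consumed_mol


if __name__ == "__main__":
    # Illustrative values only, not measured data.
    y = molar_yield(gluconate_mol=6.5, five_kg_mol=0.5, glucose_consumed_mol=10.0)
    print(f"molar yield = {y:.2f} mol product / mol glucose")
```

A yield of 1.0 in this definition would mean every mole of glucose ended up as gluconate or 5-KG, with nothing going to growth or other byproducts.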
What I haven't told you so far is that these cells here, 00:28:56.27 although they had the highest yield, did not have the highest concentration. 00:29:00.27 The concentration wasn't very different from the cultures that surrounded it 00:29:04.27 as far as yield was concerned, and they also didn't grow very well. 00:29:08.05 So, that just meant that the cells overall were not happy, 00:29:11.07 and that our original design of having them have a state where they grow very well first, 00:29:15.29 would probably work better in terms of giving us the maximum yield possible. 00:29:20.03 So, the system that we wanted to design here, again, is an inverter, 00:29:24.17 and the way that this will work is that we have this protein, 00:29:27.25 now as an example, GFP, 00:29:29.15 which is being produced by a promoter which is regulated by the lacI protein, 00:29:34.02 or the lacI operator. When lacI is not present, GFP is turned on. 00:29:39.11 We then have lacI, however, under the control of something called the tet promoter, 00:29:43.27 and the tet promoter is responsive to a small molecule 00:29:46.29 such that when you add aTc, this small molecule, it would turn on 00:29:51.10 the expression of lacI and that would turn off our GFP. 00:29:53.29 Now, that you can see by looking at the graph; the first point that we actually have 00:29:58.04 is the fact that in the absence of any aTc, then we have a very high fluorescence, 00:30:03.03 which means that the whole system is on. 00:30:04.27 If we then move to a point where we add aTc, 00:30:08.13 what you'll find in that case is that you can see the GFP levels start to go down 00:30:12.16 as a function of how much aTc we add, 00:30:15.10 and at the point where we've added 100 ng/ml of aTc, we have very little GFP being produced. 
00:30:20.27 We can show that the mechanism of this is working the way we intend it to work 00:30:25.04 by adding an additional protein called IPTG, and what IPTG actually does 00:30:29.23 is to interfere with this lacI binding 00:30:32.13 such that you can recover some of the GFP expression 00:30:34.28 and that's actually shown in the last two points of this graph here, 00:30:38.07 that show that GFP can go back up. 00:30:40.23 So, now we know that our system, our basic inverter is working, 00:30:44.22 and what we have to do in this case now is to integrate that 00:30:47.12 into our cell, that is to change now Glk activity so that it responds in this same way. 00:30:54.06 And what's shown now in this slide is the result of having done exactly that. 00:30:58.25 So, here's now the construct of our inverter, 00:31:01.03 where again, this is really just how the DNA is being constructed, 00:31:04.26 and we're using that to control how Glk is being produced 00:31:07.20 and we can look at the same two properties that we looked at before, 00:31:10.21 which is, is the mRNA changing, that is, is the DNA to mRNA, that transcription process, 00:31:17.05 is that being regulated the way we want it to, 00:31:19.03 and does that correspondingly result in differences in the Glk activity? 00:31:22.25 And the mRNA levels are actually shown at the bottom, 00:31:25.04 where you can see that as we increase the amount of aTc, 00:31:28.00 we actually do see that we start initially with high levels of mRNA, 00:31:31.05 and then those levels of mRNA eventually come down. 00:31:34.05 The top graph here actually shows the response of Glk, 00:31:37.02 where it also starts very high, and then it also will come down to a very, very low level. 
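The inverter logic (aTc turns on lacI, lacI turns off the output, IPTG relieves lacI repression) can be sketched with simple Hill functions. This is a toy steady-state model, not the lab's characterization: the function names, the Hill coefficient, and every constant below are hypothetical choices made only so that the curve falls off by roughly 100 ng/ml of aTc, and the IPTG units are arbitrary.

```python
def hill_activation(x, k, n):
    """Fraction of maximal activation at input level x (0 to 1)."""
    return x**n / (k**n + x**n)


def hill_repression(x, k, n):
    """Fraction of maximal output remaining at repressor level x (1 to 0)."""
    return k**n / (k**n + x**n)


def inverter_output(atc, iptg=0.0, k_atc=10.0, k_laci=0.1, k_iptg=0.1, n=2.0):
    """Relative output (GFP or Glk) of the toy inverter, between 0 and 1.

    atc  : inducer level (treated here as ng/ml; constants are hypothetical)
    iptg : level of IPTG in arbitrary units; it reduces active lacI
    """
    laci_total = hill_activation(atc, k_atc, n)              # aTc induces lacI
    laci_active = laci_total * hill_repression(iptg, k_iptg, n)  # IPTG inactivates lacI
    return hill_repression(laci_active, k_laci, n)           # lacI represses the output


if __name__ == "__main__":
    for atc in (0.0, 1.0, 10.0, 100.0):
        print(f"aTc {atc:6.1f}: output {inverter_output(atc):.3f}")
    print(f"aTc 100 + IPTG: output {inverter_output(100.0, iptg=1.0):.3f}")
```

With these made-up constants the output is fully on with no aTc, close to off at 100, and recovers when IPTG is added, which mirrors the qualitative shape of the response described in the talk.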
00:31:42.09 This is again a characterization in glycerol, where we don't have glucose present, 00:31:46.29 so we're only able to see the response of the cells to Glk 00:31:51.18 when it doesn't really need Glk and that actually tells us, is the system really working. 00:31:55.21 Now, we also want to know that it's actually dynamic. 00:32:00.01 So, the way we tested our static system before was just to change the promoters 00:32:05.13 that were encoding for Glk and then to ask, does that actually give us differences? 00:32:09.13 We now want to know if we have a switch. 00:32:11.13 If we start off with it on and then add this inducer so that we turn it off, 00:32:16.07 that is, we invert the response, do we actually get what we're interested in. 00:32:19.28 And the top graph that's shown here is the response of what happens 00:32:23.02 to the cell growth as we actually add our inducer, 00:32:26.10 where the top part of this is now uninduced, 00:32:28.24 that means that we're not adding anything chemically, 00:32:31.12 and we see that the cells are continuing to grow. 00:32:33.17 If we compare that now to the second line here, 00:32:35.26 where initially they both started off at the same point, 00:32:38.18 we add our inducer, we can see that the cells where we now have turned the gene off 00:32:43.10 by activating our inverter, are growing to a lower point. 00:32:47.00 We can also see a control plot in this very bottom here, which is what happens 00:32:51.14 if we add inducer from the very beginning. 00:32:53.20 That actually means that it turns off gene expression so low 00:32:56.17 that those cells never grow. You can see that the OD stays flat 00:32:59.22 and pretty much close to zero the whole time. 
00:33:02.04 So, we know that again, the response we're looking for, growth, 00:33:05.14 is changing the way we want it to, 00:33:07.05 and just very briefly, what's shown in these bottom slides 00:33:09.21 is that the growth rate again is changing, 00:33:12.00 this is now relative OD between those two. 00:33:14.17 The activity is also changing, it's decreasing, 00:33:17.10 and the mRNA levels are going down as well. 00:33:19.22 Ok, so now we know the system is working exactly the way we want it to work, 00:33:23.19 it was designed in a certain way, we seem to have the output that we're interested in 00:33:27.13 from the design perspective of growth. 00:33:29.06 The question now is, does it actually give us the productivity enhancements that we were looking for. 00:33:34.09 And we're now again looking at the same system as before, 00:33:37.23 where our goal is to make this compound gluconate, 00:33:40.16 and the only difference now on this slide is that I've added now this product acetate 00:33:45.01 which is a byproduct of metabolism, 00:33:47.04 and is a representation of how much glucose flux is actually going down into endogenous metabolism. 00:33:52.28 And if you look at now the charts here on the right hand side, 00:33:56.06 the top one gives us the titers, or the concentrations, 00:33:59.07 and it shows that more glucose is being consumed, that's what's in this white bar here, 00:34:03.05 is how much glucose is consumed. 00:34:04.21 More of that is consumed when the inverter is on; 00:34:07.20 the gray bar is how much product is being produced. 00:34:10.04 We make substantially more product being produced here as well, 00:34:13.05 and then these smaller bars here, the lightest kind of dark gray and the very, very black bar, 00:34:18.29 give us an indication of some of the minor byproducts. 
00:34:21.19 And that's actually represented more easily in the bottom graph here, 00:34:25.04 where again, I'm showing the yield, that is how much of what goes in as glucose 00:34:29.15 is being converted to the glucaric acid product that we're interested in, 00:34:32.18 sorry, in this case the gluconate, or the gluconic acid product that we're interested in. 00:34:36.14 And the open white bars here give us the yield measurements 00:34:40.03 and in this case we've actually increased our yield from about 0.7, 00:34:43.23 and this was actually higher than what we had seen with the other system, 00:34:47.17 which tells us the cells are happier now, 00:34:49.06 and our yield in this case goes up to about 0.8, or a little bit higher than 0.8, 00:34:54.08 so we have about a twenty percent increase in the yield. 00:34:56.17 The grey bars that are shown here is this acetate by-product, 00:34:59.28 and you can see an even larger reduction in the waste going to acetate. 00:35:04.29 So, we have again a twenty percent increase in the yield here, 00:35:08.13 but we also have almost a fifty percent decrease in waste, 00:35:12.12 that is this acetate waste. 00:35:14.03 Now, the last thing that we wanted to look at was the timing of the induction 00:35:19.05 because we do know that based on exactly when we add this inducer to turn off Glk expression, 00:35:24.26 we could have the cell growth go way, way down, 00:35:27.15 I showed you that as a control plot, or if we wait too late, 00:35:30.22 then the cell is not actually able to respond 00:35:33.05 because it's going to stop being very active. 00:35:35.19 So, what we're looking at here now is the OD, or that is the growth, 00:35:39.06 at which we induce, starting from very early induction times, 00:35:42.08 up to later induction times, and then what's shown on the y-axis is the yield 00:35:46.19 relative to an uninduced culture. 
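The percentage figures quoted for yield and waste are relative changes against the uninduced culture, and the arithmetic is easy to check. The sketch below uses a hypothetical helper, `relative_change`; the 0.84 end value and the acetate numbers are illustrative assumptions, not values read from the slide.

```python
def relative_change(before, after):
    """Fractional change from `before` to `after` (0.2 means a 20% increase)."""
    if before == 0:
        raise ValueError("reference value must be non-zero")
    return (after - before) / before


if __name__ == "__main__":
    # Going from a yield of 0.7 to exactly 0.8 is roughly a 14% increase.
    print(f"0.70 -> 0.80: {relative_change(0.70, 0.80):+.1%}")
    # A final yield near 0.84 (assumed) would correspond to about 20%.
    print(f"0.70 -> 0.84: {relative_change(0.70, 0.84):+.1%}")
    # Halving a byproduct yield (illustrative numbers) is a 50% decrease.
    print(f"acetate 0.10 -> 0.05: {relative_change(0.10, 0.05):+.1%}")
```

Read this way, a roughly twenty percent improvement starting from 0.7 implies a final yield somewhat above 0.8, in line with the "a little bit higher than 0.8" wording.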
And we have two different yields that we're looking at, 00:35:50.27 one is the yield of product, and that's shown in the top here, 00:35:53.15 with the squares, and the second is the acetate yield, 00:35:56.22 or, again, a measure of waste that we have here. 00:35:59.29 What we find in this case is that the yield improvements are actually best 00:36:03.13 when we induce earlier. That means give the cells a little bit of time to grow, 00:36:07.03 but don't let them grow too far, and we can see in our best case 00:36:10.20 about a 70% reduction in waste and a 20% increase in product being produced. 00:36:15.29 Let me summarize the story that I've given you about glucaric acid. 00:36:21.06 I started by talking about how we could come up with a new pathway 00:36:24.18 to be able to make this compound that was still a natural product, 00:36:28.16 but whose natural pathway was too cumbersome from being, to be produced in E. coli. 00:36:33.09 What we used in this case is part selection, or bioprospecting, 00:36:36.25 to find the enzymes that we could move from one source into another source, 00:36:41.04 and we're able to do this because once we know the DNA that encodes for those enzymes, 00:36:45.07 we can synthesize that DNA, and easily move it around between organisms. 00:36:49.15 And the second thing I showed you was this example of a synthetic biology device, 00:36:54.02 that was the protein-protein colocalization study, 00:36:56.26 which gave us increases in productivity. 00:36:59.04 And those protein-protein colocalization devices, or the scaffolds, 00:37:02.28 have been shown to be useful in other projects as well, 00:37:05.18 so that we know that they are reusable 00:37:07.14 and modular in a way that makes them very useful 00:37:10.10 for thinking about how do we actually engineer the metabolism of cells 00:37:13.26 to make the products that we're interested in. 
00:37:15.16 And the last part that I showed you was an example 00:37:18.00 of how we might engineer the host, or chassis in the language of synthetic biology 00:37:22.10 to give us further improvements both in the titers, 00:37:25.16 that is the concentrations that we're interested in, 00:37:27.18 and in the flux, or the yield of the product that we want, 00:37:31.00 such that we get more of the substrate that we start with 00:37:33.23 going into more of the product that we're interested in. 00:37:35.29 I'd actually like to end this whole iBio seminar by acknowledging the folks that did the work. 00:37:41.26 I won't go through all the names, 00:37:43.12 but you can see them highlighted here in red, 00:37:45.13 as students who are both currently in the group working on these projects, 00:37:48.26 as well as former students and postdocs in the groups. 00:37:51.18 I've recognized John Dueber as a collaborator, he is still at the University of California at Berkeley, 00:37:56.13 and this work was primarily funded by the National Science Foundation 00:37:59.19 through SynBERC and through the Office of Naval Research through the young investigator program 00:38:03.13 with the last part of it being funded primarily by the National Science Foundation 00:38:07.06 through the career program. I hope you've enjoyed the iBio seminar 00:38:10.18 and thank you very much.