Projects of Zack Booth Simpson

Tuesday, May 19, 2009

Complementary logic ideas

Talking with John this morning about the equivalence between the gates we're proposing and electrical analogs. John points out that our gates are like "half of a tri-state gate". We started thinking about higher-order logic cells using the proposed gates and realized that you can be logically complete assuming that you can mix gates with complementary inputs and only lose some fraction of them to a bi-molecular cancellation. If this is not the case -- if you lose everything -- then there might still be a way to do it with extra translation stages, but I haven't thought that through yet.

(Image update 21 May. Thanks to Erik for pointing out that I forgot the promoter completion domain.)

Assuming that the above gate cancellation reaction is not favorable (or that tethering them reduces the favorability) then you could combine the gates to make buffers, inverters, and a biased-and-gate that doesn't produce a very clean output, but which would have the property that when inputs A & B are + then the output would be + and all other input combinations would give an output of slightly - to very -.

Traveling pulse - a stable orbit

I started hunting around in parameter space trying to get my head around what makes the traveling pulse stable and predictable. I don't yet have a set of exact rules, but what I've learned is that the reactions need to be slow compared to the diffusion. This is achieved by simply lowering the concentration of the gates and resistors appropriately. Next, the pull down gates 1 & 2 are very small compared to the feedback and shutdown gates. Also, the "tired" charging gate is very small so that you can delay the onset of the shutdown.

The biggest point is obvious when you look at the phase diagram: you have to let the system get back into steady-state before another pulse hits it. Also interesting is how perfectly straight are the edges of the phase diagram. I think that this means that the gates are run way out of their linear regions and are running in steady-state most of the time. I'm going to try to make a graph to make sense of that.

I also found that it is easy to make complex patterns form when you push the system really hard, as in the following class-3-like cellular automata. Note that the system was started with symmetric initial conditions and has fully symmetric rules, yet it is symmetric only until it starts to interact with itself; once it reaches the boundaries, it becomes asymmetric. Fascinating. I suppose this is because the "periodicity" of the pattern is not related to the size of the container so the two periods start to alias in some weird sense.

Friday, May 15, 2009

Idea: Cut healthcare costs? Reduce the patent duration.

Brooks has a good essay today about the proposed underwhelming health care cost-cutting measures. I agree that none of the proposed changes sound like enough to take a reasonable bite out of our growing health care costs; and I doubt that for such a big problem there exist many easy fixes. But, there is one very easy fix that would have a huge impact -- cut patent duration times from 20 years to, say, 10. Of course, innovating companies will hate the idea of reducing their patents and boring-old manufacturers will love it but I guarantee that 10 years from now there will be an incredible drop in drug prices.

We have a fundamental problem that no one wants to admit: until some revolution in drug development takes place (e.g., if it turns out that siRNAs are a magic bullet) then we simply cannot have guns, butter, and bandages -- at least we can't have every newfangled "bandage" being made at such an incredible pace.

We have an impossible expectation for our health care that we don't have for any other sector of our economy. We simultaneously want the free market to invent new treatments on a for-profit motive and then we want everyone to have access to the result. In contrast, we don't expect every driver in the country to have access to a Lamborghini just because they exist. We don't expect everyone to have access to the latest iPhone gadget just because they exist. But we do expect -- for good ethical and moral reasons -- that everyone should have access to whatever the latest, best treatments are. While this expectation is understandable, it's nevertheless schizophrenic: "Pharma: go be innovative, invest a lot of money to make amazing drugs! Oh my god, why are they so expensive?" We don't say: "Apple: go be innovative, invest a lot of money to make amazing phones! Oh my god, why are they so expensive?" (Actually some people do, but most just recognize that if the phone is too expensive they'll just do without.)

Health care is always going to involve an insurance middle man be it private, public, or all-messed-up-in-between as it is now. So, health care will always be a collective venture. It is simply irrational to expect that we can collectively afford every possible innovation, just as it would be irrational to expect that we could all collectively own the latest iPhone gadgets. Thus, the systemic way to change the collective system is to simply lower the profit bar. And this can be done by changing one simple variable: the duration of patents. Make patents last 10 years and drug companies won't build as many expensive drugs and, yes, more people will die of things that could have been prevented. But, recognize that this is already the case! The 20 year limit is totally arbitrary. Had it been set at, say, 30 years then there would exist, right now, more amazing but even more expensive drugs and therefore because the number is set at 20 and not 30 we are "heartlessly" letting people go untreated because of an arbitrary number. The number has changed before (upwards) and we can change it again, downwards -- at least for drugs -- if we collectively choose to. It's the only "easy" fix.

Thursday, May 14, 2009

Traveling Pulse Phase Diagrams

Working on understanding the behavior of my amorphous traveling pulse, "Mexican Wave". On the right is a marked up phase diagram of the two states "standing" on the x axis and "tired" on the y axis. The mark ups show the regions where different parts of the circuit are operational. This has helped me get my head around what has to be adjusted to make the system more predictable. One lesson is that the mystery of why the pulse is traveling at different speeds has something to do with the fact that the system does not usually get all the way back down into the same steady-state. The bottom steady-state point "not standing and not tired" should be determined by the relationship of the pull down gates 1 & 2 and the grounding resistors. So, the next thing I'm going to do is try to adjust things so that I give the system enough time to always settle down into that same point. Then I can tackle understanding how the other gates reshape this phase chart.

An observation. The one directional traveling pulse on the left is making a pattern that looks like the branching pattern on a plant stem. This reminds me of a plant branching model Wolfram talked about in NKS.

Wednesday, May 13, 2009

More fun with Traveling Pulse

I started messing around today with the amorphous traveling pulse from yesterday. First thing I did was try creating an asymmetric starting condition by "pipetting" in both a spot of "standing" as yesterday and also a spot of "tired" adjacent to that so that the pulse could travel only in one direction. As before, the x axis is cyclical space which is why the pulse travels off to the left and then reappears on the right.

Inexplicably, the pulse does not always travel at the same velocity. I have no idea why, maybe it's an artifact of the integration but it seems periodic -- like it's accelerating and decelerating in some predictable way.

I then started exploring the parameter space of the circuit, repeated here for reference.

(Drawing revised 19 May)

I started with 3 vs 5. All things being equal, it should be the case that the concentration of gate 5 needs to be greater than the concentration of gate 3 so that it can overpower "standing" when "tired". As the following phase chart of 3 vs 5 illustrates, this is true. Also, as 3 grows so does the pulse width. This is intuitive because the harder p3 works to pull up "standing", the longer it will take for the discharge circuit to overpower it. Graph of P3 vs P5:

Then I started on P3 vs P4. P4 determines how fast it gets "tired" so more P4 should create a narrower pulse width, which is indeed the case. As you would expect, there's a limit: P4 can make the system tired so quickly that the pulse disappears (it becomes tired the instant it stands). However, there's a relationship between P3, the charging circuit, and P4, the "getting tired" drive. As the standing driver is increased, you have to compensate with how fast you become "tired". Makes sense. Ratios in the kind of 5-7 ball park seem to work well given the arbitrary other settings I have. Graph of P3 vs P4:

Crazy things happen when you change the two stabilizing gates p1 and p2. The pull down resistors are set to 0.01 and diffusion to 0.3 in this simulation. As p1 increases the pulse travels slower which makes sense as it is harder to charge standing. (Thanks to Xi for pointing out that I had previously stated this backwards.) At some critical value, the charge circuit can't keep up with the diffusion and pull down sides and the pulse evaporates. Really weird things start happening around p1=0.01 and p2=0.07; it looks like it becomes unstable and pattern forming, which is cool.

Some close ups of instability patterns. They look a bit like Sierpinski triangles which makes some vague sense because the standing and tired are in opposition to each other and can act as some kind of binary counter where diffusion permits the next space over to act as the carry bit. (I say this while waving my hands furiously :-)

Tuesday, May 12, 2009

Traveling Pulse Amorphous Computer

After a few meetings with John, Nam, Xi, Edward, and Andy in the last few weeks I think I have a plausible molecular gate model that can make some interesting amorphous computations. Specifically, I've been trying to make the "Mexican Wave" -- an amorphous pulse wave.

A variable "A" is encoded by the log ratio of the concentration of two RNA species: a sense strand called "A+" and its anti-sense strand called "A-".
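A minimal sketch of this encoding in Python. The natural log, the sign convention (positive when "A+" dominates), and the function name `log_ratio` are assumptions made for illustration; the model above only specifies that the variable is a log ratio of the two concentrations.

```python
import math


def log_ratio(plus_conc, minus_conc):
    """Encode a variable as the log ratio of sense ("+") to anti-sense ("-") strands.

    Assumption: natural log, positive when the "+" strand dominates.
    Both concentrations must be strictly positive.
    """
    if plus_conc <= 0.0 or minus_conc <= 0.0:
        raise ValueError("concentrations must be positive")
    return math.log(plus_conc / minus_conc)
```

With this convention an even mix of the two strands encodes zero, and swapping the two concentrations flips the sign of the variable.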

(Image updated 21 May -- Thanks to Erik for pointing out I left off the promoter completion domain)

Gates are molecular beacons that use promoter disruption to squelch the generation of some output strand. For now, all gates are unary operators. The RNAs can be displaced off the beacons by toe-hold mediated strand displacement. This design is basically Winfree lab's transcriptional circuits but where the gate is a hairpin DNA molecular beacon and where variables are encoded by log ratio of sense and anti-sense instead of as a proportionality to concentration of an ssRNA.

(Note I updated this diagram to change the naming convention on 17 May 2009. Again on 21 May, thanks to Erik for noticing I left off the promoter completion domain.)

Gates are modeled as having hyperbolic production curves and can be built according to one of four choices of sense and anti-sense sequence on the inputs and outputs. As a matter of convention, the sense strand is labeled "+" relative to the ssRNAs, not relative to the DNA because the concentration of the RNAs is the variable of interest in these systems.
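As a rough sketch of what a hyperbolic production curve with the four sense/anti-sense choices could look like in Python: the saturating form `x / (k_half + x)`, the parameter `k_half`, the maximum rate being proportional to gate concentration, and the function name `gate_production` are all assumptions for illustration, not the actual equations of the model.

```python
def gate_production(input_plus, input_minus, gate_conc,
                    input_sense="+", output_sense="+", k_half=1.0):
    """Return (plus_rate, minus_rate) produced by one unary gate.

    Assumed form: a saturating hyperbola in the concentration of whichever
    input strand the gate responds to, scaled by the gate concentration.
    k_half is a hypothetical half-saturation concentration.
    """
    if input_sense not in ("+", "-") or output_sense not in ("+", "-"):
        raise ValueError("sense must be '+' or '-'")
    # Pick the input strand this gate is sensitive to.
    x = input_plus if input_sense == "+" else input_minus
    rate = gate_conc * x / (k_half + x)
    # The gate emits only one of the two output strands.
    return (rate, 0.0) if output_sense == "+" else (0.0, rate)
```

For example, `gate_production(1.0, 0.0, 2.0)` describes a gate that reads the + strand and emits the + strand, running at half of its maximum rate because the input sits exactly at `k_half`.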

To explore the model, I created a circuit that I hoped would make an amorphous pulse propagating wave. Below, I switch into electrical analogy which I do for my own sanity. The charge across the capacitors represents the two variables which I call "standing" and "tired" by analogy with the Mexican Wave. The gates are labeled like "i+o-" meaning "when input is + the output will be -". (I've changed around the naming convention several times, this update is as of 17 May) The gates without inputs are under constitutive promotion and are labeled only by what they output. All nodes are pulled down by the same RNases represented here as resistors to ground from each capacitor. The two variables are assumed to diffuse at equal rates. The only changeable parameter is assumed to be the concentrations of the gates.

(Thanks to Xi and John for help reworking this diagram. I updated it on 19 May.)

This circuit can be thought of like this. "Standing" and "tired" are constantly being pulled low by the gates 1 & 2 against the action of the resistors. If the rest of the gates weren't there, this would ensure the system will be "not standing" and "not tired". Gate 3 puts feedback on "standing" thus a small threshold level of "standing" will generate more until it saturates in steady-state against the resistor. Gate 4 increases "tired" if "standing". Gate 5 is in high concentration relative to the other gates and can thus overpower the "standing" variable when "tired".

Here are the 1D amorphous results. The two plots are "standing" (left) and "tired" (right). The X axis of each is space (cyclical coordinates). The Y axis from bottom to top is increasing time. Blue represents a high ratio of - to + strands. Red represents a high ratio of + to - strands. Black represents an even ratio. At time zero, a pulse of + is added to the "standing" variable representing a manual pipetting operation at some point in space. As time passes (bottom to top) the pulse propagates in both directions at a constant rate until the two pulses hit each other and then stop.
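The cyclical space coordinate and the diffusion of a pipetted spot can be sketched in a few lines of Python. This is not the simulator used for the plots above; it is a minimal explicit-Euler diffusion step on a ring of cells, and the function name `diffuse_step`, the unit cell spacing, and the `dt` parameter are assumptions for illustration. It would be applied to the concentration of each strand separately.

```python
def diffuse_step(values, diffusion, dt=1.0):
    """Advance one explicit-Euler diffusion step on a cyclic 1D grid.

    values: concentration of one species in each cell (unit spacing assumed).
    Indices wrap around, so material leaving the left edge reappears on the right.
    The explicit scheme is only stable when dt * diffusion <= 0.5.
    """
    n = len(values)
    return [
        values[i]
        + dt * diffusion * (values[(i - 1) % n] - 2.0 * values[i] + values[(i + 1) % n])
        for i in range(n)
    ]


# A single pipetted spot at the left edge spreads to both neighbors,
# one of which is the right-most cell because space is cyclical.
spot = [1.0, 0.0, 0.0, 0.0]
after_one_step = diffuse_step(spot, diffusion=0.25)
```

Diffusion alone conserves the total amount of each strand; in the circuit above the gates and the RNase "resistors" would add production and decay terms on top of this step.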

Sunday, May 10, 2009

Understanding Principal Component Analysis via cool Gapminder graphs

Gapminder.org is a wonderful site full of "statistical porn". This chart in particular is a fascinating graph that demonstrates the correlation between income and child mortality rates. It is also a great example to teach about a cool statistical tool: "Principal Component Analysis".

In this graph of regions there is an obvious negative correlation between infant mortality and income illustrated by the fact that the data points scatter along a line from upper left to lower right. In other words, if you knew only the infant mortality rate or the income of a region you could make a reasonable guess at the other.

Principal Component Analysis (PCA) is a statistical tool that’s very useful in situations like this. PCA delivers a new set of axes that are well aligned to correlated data like this -- I've illustrated them here with black and red lines. For each axis, it also returns a “variance strength” which I’ve represented as the length of the black and red axes. (Actually I just hand approximated these axes by eye for the purposes of illustration).

The strongest new axis returned by PCA (the black one) aligns well with the primary axis of the data. In other words, if one were forced to summarize a region with a single number it would be best to do so with the position along this black axis. The zero point on the axis is arbitrary but is usually positioned in the center of the data (the mean). Positive valued points along this black axis would be those regions further toward the lower right and negative valued regions would be those further toward the upper left. Let’s call this new axis “wealth” to separate it in our minds from “income” which is the horizontal axis of the original data set. Increases in “wealth” represent an increase in income and drop in infant mortality simultaneously.

The second axis returned by PCA is shown as the red axis. Countries that lie far off the main diagonal trend-line (black axis) have particularly unique infant mortality rates given their wealth which we’ll assume is because of something unique about their health care systems. Points well below the black axis are regions that have very good health care given their wealth and those above it have particularly poor health care given their wealth.
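For two variables, the two axes and their variances can be computed in closed form from the covariance matrix. The following Python sketch does that with only the standard library; the function name `pca_2d` and the tiny made-up data set in the usage note are for illustration and are not the Gapminder data.

```python
import math


def pca_2d(points):
    """Return (mean, (var_major, axis_major), (var_minor, axis_minor)) for 2D points.

    Uses the sample covariance matrix [[sxx, sxy], [sxy, syy]] and the
    closed-form eigen-decomposition of a symmetric 2x2 matrix.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Angle of the strongest axis measured from the horizontal axis.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    half_trace = 0.5 * (sxx + syy)
    spread = math.hypot(0.5 * (sxx - syy), sxy)
    major = (math.cos(theta), math.sin(theta))
    minor = (-math.sin(theta), math.cos(theta))
    return (mean_x, mean_y), (half_trace + spread, major), (half_trace - spread, minor)
```

Calling `pca_2d([(0.0, 2.0), (1.0, 1.0), (2.0, 0.0)])` on three points lying exactly on a line from upper left to lower right gives a strongest axis pointing toward the lower right and a second-axis variance of zero, which corresponds to the black and red axes described above.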

Because PCA gives us convenient axes that are well aligned to the data, it makes sense to just rotate the graph to align to these new axes as illustrated here. Nothing has changed here, we've simply made the graph easier to read.
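The rotation itself is just a change of coordinates: subtract the mean, then take the dot product of each point with the two new axis directions. A minimal Python sketch, assuming the mean and the unit vector of the strongest axis are already known (the function name `rotate_to_axes` and the example numbers are hypothetical):

```python
import math


def rotate_to_axes(points, mean, major_axis):
    """Re-express 2D points as (position along major axis, position along minor axis).

    major_axis must be a unit vector; the minor axis is taken as the major
    axis rotated by 90 degrees counter-clockwise.
    """
    ax, ay = major_axis
    rotated = []
    for x, y in points:
        dx, dy = x - mean[0], y - mean[1]
        along = dx * ax + dy * ay        # coordinate on the strongest axis
        across = -dx * ay + dy * ax      # coordinate on the second axis
        rotated.append((along, across))
    return rotated


# Example: an axis pointing toward the lower right, as in the graph above.
axis = (math.sqrt(0.5), -math.sqrt(0.5))
new_coords = rotate_to_axes([(0.0, 2.0), (1.0, 1.0), (2.0, 0.0)], (1.0, 1.0), axis)
```

Distances between points are unchanged by this operation, which is the sense in which nothing has changed except the readability of the graph.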

Before you even look at specific regions on these new axes, one could guess that socialist countries would score more negatively along this red axis and those whose economy is heavily biased towards mineral extraction -- where income tends to be very unevenly distributed -- would score more positively. Indeed, this is confirmed. The most obvious outliers below the black axis are Cuba and Vietnam where communist governments have directed the economy to spend disproportionately on health care and the outliers on the other side are: Saudi Arabia, South Africa, and Botswana -- all regions heavily dependent on resource extraction where the mean income statistics hide the reality that few are doing very well while the vast majority are in extreme relative poverty.

One particularly interesting outlier is Washington DC which is located as far along the red axis as is Botswana! In other words, based on this realigned graph, you might guess that the wealth in DC is as unevenly distributed as it is in Botswana. Fascinating! (The observation is probably at least partially explained by the fact that it is the only all urban "state" and urban areas will tend to have wider income distributions than rural/suburban areas.) Also note that all of the points in the United States (orange) are well into positive territory on the red axis -- our health care system is as messed up relative to our wealth as those of Chad, Bhutan, and Kazakhstan -- countries with completely screwed-up governmental agendas. Think of it this way: the degree to which our infant mortality rates are "good" owes everything to our wealth and is despite the variables independent of wealth! In other words, countries that provide average health-care relative to their wealth like El Salvador, Ukraine, Australia and the UK fall right on the black axis but we fall significantly above that line -- roughly the same place as countries that are, independent of their wealth, really messed up like Chad and Kazakhstan. (A caveat: the chart is on a log scale so the comparative analysis is more subtle than I'm making it out here.)

PCA returns not only the direction of the new axes but also the variance of the data along those axes. To understand this, imagine for a moment that all the regions of the world had exactly the same health care given their income; in this case all the points would align perfectly along the main trend line (the black axis) and the variance of the red axis would be zero. In this imaginary case, the data would be “one dimensional”, that is income and infant mortality would be one and the same statement; if you knew one, you'd know the other exactly. Now imagine the opposite scenario. Imagine that there was no relationship at all between income and infant mortality; in that case we would see a scattering of points all over the place and there wouldn’t be any obvious trend lines. Neither of these imaginary scenarios is what we see in the actual data. It isn’t quite a line along the black axis but neither is it a buckshot scattering of points, so we can say the data is somewhere between 1 dimensional and 2 dimensional. If both variances are large and equal to each other, then the system is 2 dimensional while if one of the variances is large while the other is near zero, then we know the system is nearly 1 dimensional. In other words, PCA permits you to summarize complicated data by finding axes of low variance and simply eliminating them. This technique is called “dimensional reduction” and is a very powerful tool for summarizing complicated data sets such as would arise if we looked at more than two variables. For example, we might add car ownership, water accessibility, education, average adult height, etc. to the analysis, at which point performing a dimensional reduction would help to get our heads around any simplifications we might wish to make.
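One simple way to act on those variances is to express each as a fraction of the total and keep only the axes that carry a meaningful share. The Python sketch below illustrates that idea; the function names and the cutoff `min_fraction` are arbitrary choices for illustration, not a standard rule.

```python
def explained_fractions(variances):
    """Return each axis variance as a fraction of the total variance."""
    total = sum(variances)
    if total <= 0.0:
        raise ValueError("total variance must be positive")
    return [v / total for v in variances]


def axes_to_keep(variances, min_fraction=0.2):
    """Return the indices of axes whose share of the variance is at least min_fraction.

    min_fraction is a hypothetical threshold; in practice it is a judgment call.
    """
    fractions = explained_fractions(variances)
    return [i for i, f in enumerate(fractions) if f >= min_fraction]
```

With variances of 9 and 1, for instance, the first axis carries 90% of the total and `axes_to_keep([9.0, 1.0])` keeps only that axis, treating the data as nearly 1 dimensional. With equal variances both axes are kept and the data is treated as fully 2 dimensional.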