Tuesday, April 14, 2009

Molecular computers -- A historical perspective. Part 1

I've been having discussions lately with Andy regarding biological/molecular computers and these discussions have frequently turned to the history of analog and digital computers as a reference -- a history not well-known by biologists and chemists. I find writing blog entries to be a convenient way to develop bite-sized pieces of big ideas and therefore what follows is the first (of many?) entries on this topic.


In order to understand molecular computers -- be they biological or engineered -- it is valuable to understand the history of human-built computers. We begin with analog computers -- devices that are in many ways directly analogous to most biological processes.

Analog computers are ancient. The oldest surviving example is the astonishing Antikythera Mechanism (watch this excellent Nature video about it). Probably built by descendants of Archimedes' school, this device is a marvel of engineering that computed astronomical values such as the phase of the moon. It predated equivalent devices by at least a thousand years -- thus furthering Archimedes' already incredible reputation. Mechanical analog computers all work by the now-familiar idea of inter-meshed gear-work -- input dials are turned and the whirring gears compute the output function by mechanical transformation.


(The Antikythera Mechanism via WikiCommons.)

Mechanical analog computers are particularly fiddly to "program", especially to "re-program". Each program -- as we would call it now -- is hard-coded into the mechanism; indeed, it is the mechanism. Rearranging the gear-work to represent a new function requires retooling each gear, not only to change the relative sizes but also because the wheels will tend to collide with one another if not arranged just so.

Despite these problems, mechanical analog computers advanced significantly over the centuries and by the 1930s sophisticated devices were in use. For example, shown below is the Cambridge Differential Analyzer that had eight integrators and appears to be easily programmable by nerds with appropriately bad hair and inappropriately clean desks. (See this page for more diff. analyzers including modern reconstructions).


(The Cambridge differential analyzer. Image from University of Cambridge via WikiCommons).

There's nothing special about using mechanical devices as a means of analog computation; other sorts of energy transfer are equally well suited to building such computers. For example, MONIAC, built in 1949, was a hydraulic analog computer that simulated an economy by moving water from container to container via carefully calibrated valves.


(MONIAC. Image by Paul Downey via WikiCommons)


By the 1930s, electrical amplifiers were being used for such analog computations. An example is the 1933 Mallock machine, which solved simultaneous linear equations.


(Image by University of Cambridge via WikiCommons)

Electronics have several advantages over mechanical implementations: speed, precision, and ease of arrangement. For example, unlike gear-work, electrical computers can have easily re-configurable functional components. Because the interconnecting wires have small capacitance and resistance compared to the functional parts, the operational components can be conveniently rewired without having to redesign the physical aspects of the mechanism, i.e. unlike gear-work, wires can easily avoid collision.

Analog computers are defined by the fact that variables are encoded by the position or energy level of something -- be it the rotation of a gear, the amount of water in a reservoir, or the charge across a capacitor. Such simple analog encoding is very intuitive: more of the "stuff" (rotation, water, charge, etc.) encodes more of the represented variable. For all its simplicity, however, such analog encoding has serious limitations: range, precision, and serial amplification.

All real analog devices have limited range. For example, a water-encoded variable will overflow when the volume of its container is exceeded.



(An overflowing water-encoded analog variable. Image from Flickr user jordandouglas.)

In order to expand the range of variables encoded by such means, all of the containers -- be they cups, gears, or electrical capacitors -- must be enlarged. Building every variable for the worst-case scenario has obvious cost and size implications. Furthermore, such simple-minded containers only encode positive numbers. To encode negative values requires a sign flag or a second complementary container; either way, encoding negative numbers significantly reduces the elegance of such methods.

Analog variables also suffer from hard-to-control precision problems. It might seem that an analog encoding is nearly perfect -- for example, the water level in a container varies with exquisite precision, right? While it is true that the molecular resolution of the water in the cup is incredibly precise, an encoding is only as good as the decoding. For example, a water-encoded variable might use a small pipe to feed the next computational stage, and as the last drop leaves the source reservoir a meniscus will form due to water's surface tension; therefore the quantity of water passed to the next stage will differ from what was stored in the prior stage. This is but one example of many such real-world complications. For instance, electrical devices suffer from thermal effects that limit precision due to added noise. Indeed, the faster one runs an electrical analog computer the more heat is generated and the more noise pollutes the variables.


(The meniscus of water in a container -- one example of the complications that limit the precision of real-world analog devices. Image via WikiCommons).

Owing to such effects, the precision of all analog devices is usually much less than one might intuit. The theoretical limit of the precision is given by Shannon's formula: precision (the amount of information encoded by the variable, measured in bits) is log2( 1 + S/N ), where S/N is the signal-to-noise ratio. It is worth understanding this formula in detail as it applies to any sort of information storage and is therefore just as relevant to a molecular biologist studying a kinase as it is to an electrical engineer studying a telephone.
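To make the limit concrete, here is a small Python sketch of Shannon's formula. (The function name is mine, invented purely for illustration.)

```python
import math

def analog_precision_bits(signal, noise):
    """Shannon's limit on the information carried by one analog
    reading: log2(1 + S/N), where S/N is the signal-to-noise ratio."""
    return math.log2(1 + signal / noise)

# A seemingly exquisite analog variable with a 1000:1 signal-to-noise
# ratio still resolves only about 10 bits -- roughly three decimal digits.
print(round(analog_precision_bits(1000, 1), 2))  # → 9.97
```

Note how slowly the precision grows: quadrupling the signal-to-noise ratio buys only two more bits, which is why analog stages compound error so quickly when chained in series.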

.... to be continued.

Utility yard fence




In the last few days I've finished up the fence line that separates the backyard from the utility yard. This involved staining more boards with Pinofin, which is as malodorous as it is beautiful. Thanks to Jules for the help with staining! Fortunately she is hard-of-smelling so didn't notice how bad it was!

Saturday, April 11, 2009

Finished workshop drawers


Today I finished attaching the hardware to my new tool drawers. I'm stupidly excited about them as I can put away all my tools and clear out a lot of clutter from my shop.

We ordered the boxes from Drawer Connection. They really did a great job; the boxes are perfectly square, dovetail-joined, glued, sanded, and polyed. As Bruce said, "I'll never build another box again." It's a demonstration to me of how custom web-based CNC construction is the future of a lot of products. We ordered about 30 boxes of all different sizes and the total was only about $1100 including shipping. There's no possible way we could have made them for that.

Thursday, April 9, 2009

Finished utility yard



Finished up the utility yard today, which involved raising the AC units and changing the grade a little bit. This weekend I'm going to stain the pickets and rebuild the rear fence line.

Tuesday, April 7, 2009

The 21st Century Chemical / Biological Lab.

White Paper: The 21st Century Chemical / Biological Lab.

Electronic and computer engineering professionals take for granted that circuits can be designed, built, tested, and improved in a very cheap and efficient manner. Today, the electrical engineer or computer scientist can write a script in a domain-specific language, use a compiler to create the circuit, use layout tools to generate the masks, simulate it, fabricate it, and characterize it, all without picking up a soldering iron. This was not always the case. The phenomenal tool-stack that permits these high-throughput experiments is fundamental to the remarkable improvements of the electronics industry: from 50-pound AM tube-radios to iPhones in less than 100 years!

Many have observed that chemical (i.e. nanotech) and biological engineering are to the 21st century what electronics was to the 20th. That said, chem/bio labs – be they in academia or industry – are still in their “soldering iron” epoch. Walk into any lab and one will see every experiment conducted by hand, transferring micro-liter volumes of fluid in and out of thousands of small ad-hoc containers using pipettes. This sight is analogous to what one would have seen in electronics labs in the 1930s – engineers sitting at benches with soldering iron in hand. For the 21st century promise of chem/nano/bio engineering to manifest itself, the automation that made large-scale electronics possible must similarly occur in chem/bio labs.

The optimization of basic lab techniques is critical to every related larger-scale goal, be it curing cancer or developing bio-fuels. All such application-specific research depends on experiments, and therefore reducing the price and duration of such experiments by large factors will not only improve efficiency but also make possible work that previously was not. While such core tool paths are not necessarily “sexy”, they are critical. Furthermore, a grand vision of chem/bio automation is one that no single commercial company can tackle, as the vision requires both a very long time commitment and a very wide view of technology. It is uniquely suited to the academic environment as it both depends upon and affords cross-disciplinary research towards a common, if loosely defined, goal.

Let me elucidate this vision with a science-fiction narrative:

Mary has a theory about the effect of a certain nucleic acid on a cancer cell line. Her latest experiment involves transforming a previously created cell line by adding newly purchased reagents, an experiment that involves numerous controlled mixing steps and several purifications. In the old-days, she would have begun her experiment by pulling-out a pipette, obtaining reagents out of the freezer, off of her bench, and from her friend's lab and then performed her experiment in an ad hoc series of pipette operations. But today, all that is irrelevant; today, she never leaves her computer.

She begins the experiment by writing a protocol in a chemical programming language. Like the high-level languages used by electrical and software engineers for decades, this language has variables and routines that allow her to easily and systematically describe the set of chemical transformations (i.e. “chemical algorithms”) that will transpire during the experiment. Many of the subroutines of this experiment are well-established protocols such as PCR or antibody separation, and for those Mary need not rewrite the code but merely link in the subroutines for these procedures just as a software engineer would. When Mary is finished writing her script, she compiles it. The compiler generates a set of fluidic gates that are then laid out using algorithms borrowed from integrated circuit design. Before realizing the chip, she runs a simulator and validates the design before any reagents are wasted – just as her friends in EE would do before they sent their designs to “tape out.” Because she can print the chip on a local printer for pennies, she is able to print many identical copies for replicate experiments. Furthermore, because the design is entirely in a script, it can be reproduced next week, next year, or by someone in another lab. The detailed script means that Mary’s successors won’t have to interpret a 10-page hand-waving explanation of her protocol translated from her messy lab notes in the supplementary methods section of the paper she publishes – her script *is* the experimental protocol. Indeed, this abstraction means that, unlike in the past, her experiments can be copyrighted or published under an open source license just as code from software or chip design can be.
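No such chemical programming language exists yet, but a toy sketch of what a protocol script might look like, embedded in Python, can make the idea concrete. Every class, method, and reagent name below is hypothetical, invented purely for illustration:

```python
# A minimal, hypothetical "protocol as code" sketch: each call records a
# fluidic operation that a compiler could later turn into fluidic gates.

class Protocol:
    """Records a sequence of chemical operations for later compilation."""

    def __init__(self, name):
        self.name = name
        self.steps = []

    def mix(self, a, b, ratio=1.0):
        # Combine two fluids; returns a label for the resulting mixture.
        self.steps.append(("mix", a, b, ratio))
        return f"{a}+{b}"

    def incubate(self, sample, minutes, celsius):
        # Hold a sample at a controlled temperature.
        self.steps.append(("incubate", sample, minutes, celsius))
        return sample

    def pcr(self, template, primers, cycles=30):
        # A well-established subroutine "linked in" rather than rewritten.
        self.steps.append(("pcr", template, primers, cycles))
        return f"amplified({template})"

p = Protocol("transform-cell-line")
mixture = p.mix("reagent_A", "cell_line_7")
p.incubate(mixture, minutes=30, celsius=37)
product = p.pcr(mixture, primers="P1/P2")
print(len(p.steps))  # → 3
```

The point of the sketch is the abstraction, not the vocabulary: because the script records every operation, it can be compiled, simulated, replicated, and published exactly as software is.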

Designing and printing the chip is only the first step. Tiny quantities of specific fluids need to be moved into and out of this chip – the “I/O” problem. But Mary’s lab, like any, both requires and generates thousands of isolated chemical and biological reagents, each of which has to be stored separately in a controlled environment and must be manipulated without risking cross-contamination. In the old days, Mary would have used hundreds of costly sterilized pipette tips as she laboriously transferred tiny quantities of fluid from container to container. Each tip would be wastefully disposed of despite the fact that only a tiny portion of it was actually contaminated – such was the cost when everything had to be large enough to be manipulated by hand. In the old days, each of the target containers – from large flasks to tiny plastic vials – would have had to be hand-labeled, resulting in benches piled with tiny cryptic scribbled notes and all of the confusion and inefficiency that results from such clutter. Fortunately for Mary, today all of the stored fluids for her entire lab are maintained in a single fluidic database; she never touches any of them. In this fluidic database, a robotic pipette machine addresses thousands of individual fluids. These fluids are stored inside tubes that are spooled off a single supply and cut to length and end-welded by the machine as needed. Essentially, this fluidic database has merged the concepts of “container” and “pipette” – it simply partitions out a perfectly sized container on demand, and therefore the consumables are cheaper and less wasteful. Also, the storage of these tube-containers is extremely compact in comparison to the endless bottles (mostly filled with air) that one would have seen in the old days. The fluid-filled tubes can be simply wrapped around temperature-controlled spindles and, just like an electronic database or disk drive, the system can optimize itself by “defragmenting” its storage spindles, ensuring there’s always efficient usage of the space. Furthermore, because the fluidic database knows the manifest of its contents, all reagent accounting can be automated and optimized.

Mary has her experiment running. But moving all these fluids around is just a means to an end. Ultimately she needs to collect data about the performance of her new reagent on the cancer line in question. In the old days, she would have run a gel, used a fluorescent microscope, or employed any number of other visualization techniques to quantify her results – any of these measurements would have required a large and expensive machine. But today, most of these measurements are either printed directly on the same chip as the fluidics using printable chemical / electronic sensors, or those that can’t be printed are interfaced to a standardized re-usable sensor array. The development of those standards was crucial to the low capital cost of her equipment. Before far-sighted university engineering departments set those standards, each diagnostic had its own proprietary interface and therefore the industry was dominated by an oligopoly of several companies. But now, the standards have promoted competition and thus the price and capabilities of all the diagnostics have improved.

As Mary’s chemical program executes on her newly minted chip, she gets fluorescent read-outs on one channel and antibody detection on another – all such diagnostics were written into her experimental program in the same way that a “debug” or “trace” statement is placed into a software program. After her experiment runs, the raw sensor data is uploaded to the same terminal where she wrote the program and she begins her analysis without getting out of her chair.

After the experiment, the disposable chip and the temporary plumbing that connected to it are all safely incinerated to avoid any external contamination. In the old days, such safety protocols would have had to be known by every lab member and this would have required a time-consuming certification process. But today, all of these safety requirements are enforced by the equipment itself and therefore there’s much less risk of human error. Furthermore, because of the enhanced safety and lower volumes, some procedures that were once classified as bio-safety level 3 are now BSL 2 and some that were 2 are now 1, meaning that more labs are available to work on important problems.

Mary’s entire experiment from design to data-acquisition took her under 1 hour – comparable to a week by old manual techniques. Thanks to all of this automation, Mary has evaluated her experiment and moved on to her next great discovery much faster than would have been possible before. Moreover, because so little fluid was used in these experiments her reagents last longer and therefore the cost has also fallen. Mary can contemplate larger-scale experiments than anybody dreamed of just a decade ago. Mary also makes many fewer costly mistakes because of the rigor imposed by writing and validating the entire experimental script instead of relying on ad hoc procedures. Finally, the capital cost of the equipment itself has fallen due to standardization, competition, and economies of scale. The combined result of these effects is to make the acquisition of chemical and biological knowledge orders of magnitude faster than was possible just decades ago.

Monday, April 6, 2009

Macro-scale examples of chemical principles

I like macro-scale examples of chemical principles. Here are two I've noticed recently.


I was very slowly pouring popcorn into a pot with a little bit of oil. The kernels did not distribute themselves randomly but instead formed some long chain aggregations because, apparently, the oil made them more likely to stick to each other than to stand alone. This kind of aggregation occurs frequently at the molecular scale when some molecule has an affinity for itself.


This is wheelbarrow chromatography. During a rain, water and leaves fell into this wheelbarrow. Notice that the leaves and the stems separated; apparently the stems are lighter than water and the leaves are heavier. This sort of "phase separation" trick is frequently used by chemists to isolate one type of molecule from another in a complex mixture. Sometimes the gradient of separation might be variable density as in this example, but other times it might be hydrophobicity or affinity to an antibody or many other types of clever chemical separations known generically as "chromatography". Note that the stems clustered. Like the popcorn above, apparently there is some inter-stem cohesion force that results in aggregation as occurs in many chemical solutions.

Porch branches




Bruce and I finished up the porch branches on Friday; they have not yet been stained so the color is different. It's funny -- this is one of the very first details I thought of for the house design and one of the very last to be implemented, so for me this small detail is very important: it collapses some sort of psychic "todo" stack and thereby provides the relief that one feels in crossing out a complicated set of tasks (never mind that the list has grown substantially since then! :-)