Projects of Zack Booth Simpson

Friday, August 7, 2009

Thinking like a chemist

Photo from flickr user zcreem

Suppose you were asked to mass produce a toy train car -- four little wheels connected to a small wooden chassis. Being an inhabitant of the 21st century you'd know exactly what to do: build an assembly line. After some tinkering you'd end up with some sort of jig that held the body in place and attached the wheels in a nice predictable fashion so that it could be repeated rapidly. Henry Ford would be proud of you.

But what if the wheels of the car were as small as atoms and the chassis a single molecule? Then how would you do it?

Such molecular assembly has been done for a long time; it's called chemistry. But chemists don't think about the assembly problem the way Henry Ford did. Their approach is fundamentally different.

A chemist would build the molecular toy car by putting four bazillion copies of the wheels into a bag full of water with a bazillion copies of the chassis. Then they would shake the bag intensely, ensuring that some fraction, perhaps a minuscule portion of the original, would assemble themselves *by accident* into the toy car.

A chemist would then come up with a way to separate the fully assembled cars from all the leftover parts that failed to assemble -- perhaps the vast majority of the pieces. For example, the chemist might dump the contents of the bag out over a pine tree. The un-assembled pieces, being smaller than the fully assembled pieces, might fall through the branches more easily while the larger assembled pieces would be more likely to snag a branch and get stuck on the way down. As a result, the top branches of the tree would be more likely to hold the desired product. Then a chemist would snip off the top half of the tree and shake out the contents and repeat the process, knowing that for each repetition they have purified the sample a little bit. Having never even laid eyes on the target, they would declare that their end product was, say, 99% pure.
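The arithmetic behind that repeated pine-tree trick can be sketched in a few lines of Python. The retention rates below (how often an assembled car or a loose part snags in the top half of the tree) are made-up numbers for illustration, and the function names are hypothetical; the only point is that a modest per-pass bias compounds quickly.

```python
def purify_once(cars, parts, keep_car=0.9, keep_part=0.3):
    """One pass through the 'tree': return what stays in the top half.

    keep_car and keep_part are hypothetical retention fractions.
    """
    return cars * keep_car, parts * keep_part


def purity(cars, parts):
    """Fraction of the sample that is fully assembled cars."""
    return cars / (cars + parts)


def passes_needed(cars, parts, target=0.99, **rates):
    """Repeat the filtering until the sample reaches the target purity."""
    n = 0
    while purity(cars, parts) < target:
        cars, parts = purify_once(cars, parts, **rates)
        n += 1
    return n, cars, parts


if __name__ == "__main__":
    # Start with 1 assembled car per 1000 leftover parts.
    n, cars, parts = passes_needed(1.0, 1000.0)
    print(f"{n} passes, purity {purity(cars, parts):.4f}, "
          f"yield {cars:.3f} of the original cars")
```

The price of each pass is yield: some assembled cars fall through too, so the chemist trades quantity for purity. With quintillions of molecules to start with, that is usually a trade worth making.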

Seems absurd? It is. But it's also incredibly clever -- it permits the manipulation of atomic scale things by exploiting the fact that you are much, much larger than they are and can therefore easily move around quintillions of molecules inside of a "bag" no bigger than a drop of water. As a filter they'd use something like gelatin which at the molecular scale is a lot like the pine tree -- a big furry mess of interconnecting obstacles that would let some things pass through easily while inhibiting the movement of other things. As hard as it is to believe, this technique is accurate enough to allow separation by incredibly tiny differences in mass. Below is a picture of an actual gel; the light pink parts are the molecular batches (in this case DNA) that have been pulled through the gel, which is the purple background. You can make out six tracks in this gel, which are six different runs.

photo from wikicommons Gel_electrophoresis_2.jpg

Below is a picture of someone loading a gel. You can make out 10 "lanes" they are loading for their experiment. Into each lane they are placing a tiny drop of water using a pipette which is just a fancy eye-dropper. Each drop they are inserting into the gel might contain billions upon billions of the molecule of interest.

Photo from flickr user rocksee

Software prototypes

(Images adapted from flickr users Duke TIP and Patrick Beeson)

One way in which software engineering differs from other engineering disciplines is that in software the prototype is often confused with the real thing.

Nobody looks upon a prototype 1/50th scale model of a bridge and says "Looks done, let's drive trucks across it now." Yet in software development such sentiment is commonplace -- "Looks like it's working, all we have to do is clean up the bugs and ship it!"

If this were merely a misunderstanding between the programmers and the business-types it would be understandable, but the unfortunate fact is that it's often the programmers themselves who believe this. This attitude of hack-and-patch contributes to the general lack of quality of software compared to other engineering disciplines, and this is compounded by the perceived low cost of failure. Structural engineers don't say to themselves: "If it doesn't work we'll just patch it in the field." Electrical engineers don't say: "Ahh, sort of works... we'll upgrade it after tape-out." It has never occurred to a mechanical engineer that they could hide their lack of quality control by creating an automatic update system that secretly updates their wares behind their customers' backs.

For software engineering to be done right -- like in every other creative discipline -- time and space have to be allocated for building *disposable* prototypes. There is no progress without failure, but you don't have to subject your customers to your failures either.

Wednesday, July 29, 2009

A month of web coding

(Clever mug from stevenfrank)

I haven't posted anything in the last month as I've been on a web-coding sprint for our new bioinformatic enterprise: "Traitwise". This was the first time I've done any serious web-based development since about 1997. It hasn't changed much -- it's still a disaster.

The list of things that are wrong with web development would fill a book so here's my really short summary.

D/HTML is a disaster. It made a tiny bit of sense in 1994 when pages were, briefly, 99% text and 1% markup. But now, even reasonably simple pages turn into 99% inefficient layout gibberish and 1% content. So the semantic premise of a web page has been inverted. Sarah told me that I should switch to the meta markup languages like HAML and she's probably right, but I didn't find out about that until too late.

CSS is a disaster. Obviously a committee with no programming experience and little vision of where the web was going came up with CSS, because only those ignorant of programming would be dumb enough to use "-" as a delimiter. But that's just a trivial gripe: the entire design of CSS is flawed. The base HTML semantic tags "p", "a", etc. are not useful, and were only useful for about 6 months in 1994. Now every page devolves into zillions of divs and spans, each with a specific class moniker, so you end up with a flat CSS "hierarchy", thus nullifying the CSS design and turning every page into a giant, un-readable mess.

JavaScript is a disaster. Actually, the core language is mostly OK but the portability -- the thing that really matters -- is a disaster. Thanks Microsoft! Oh, and also thanks to Google -- Chrome has broken JS even more. Chrome is the first product from Google that really pisses me off in a Microsoftian kind of way -- making the world a WORSE place instead of a better place. Attention Google, we don't need any more non-standard changes to the JS API. Please, just back away slowly from the browser market and let Firefox and the Internet standards committees lead the way. You're not going to profit from it anyway so please, just say no.

SQL is a (continuing) disaster. The inefficient mapping of the most rudimentary problems onto the relational database model is just plain wrong. The world needs a standard simpler transactional hash object that would do most of what you want as a web developer and be a lot simpler and faster. This is especially relevant given that environments like RoR are just building and mapping all these relationships internally thus nullifying the point of a relational database. RoR might as well just abandon the database and use its own transactional hash-oriented file system. The only reason they don't do so, one assumes, is the momentum of legacy databases.

Flash/Flex/AS3/MXML is a disaster. Let's start with the marketing. What is the name of this product? Flash became Flex but is wrapped in MXML? Or is it AS3? Or is Flash the client-side run-time? Why do you compile .as3 files with an MXML compiler? WTF guys, how about version numbers, try them, they're great. I'm sitting next to Rob, a former Adobe developer, and even he doesn't understand the names -- we had a 15 minute argument sorting out which part of this product line is named what. There are so many confusing and broken things in Flash/Flex/AS3/MXML/WhateverItsCalled I could go on for hours. I wasted untold days sorting out the simplest, stupidest things. I put "FLEXHACK" in my code next to every work-around or counterintuitive hack (like putting +5 on every textWidth); there must be 100 of those tags in my code, and I got tired of putting them in after a while. Really basic concepts of object oriented programming are just flat out violated by their awful Flex design, for example the unclear dynamic typing, the UIComponent mess, and the lack of deep object duplication. But really, I could go on and on.... Like, how about the secret *deleting* of subversion controlled files off my hard-drive that the IDE didn't own -- that was just great! Or the secret deleting of comment lines at the top of the files in the IDE. Or the screwed up caching of a SWF in the wrong folder that frustrated me for several painful hours.

The only thing that wasn't a development disaster in my month-long experience was Ruby on Rails. There were a few odd things about it that threw me off (it was much stricter about type casting than I assumed it would be) but mostly it made perfect sense and Rails is obviously the work of a professional. Everything about Rails from the docs to the APIs demonstrated that the author(s) knew what they were doing. I was impressed.

Thursday, July 9, 2009

The nerd peacemaker

Apparently these devices are also sometimes used to communicate with people and not just end arguments over the population of Mauritania or the mass of Mars, but I wouldn't know.

Wednesday, July 8, 2009

Updated birds and bees

Daddy, where do babies come from?

Honey, when grown-ups love each other very much and decide to have a baby, which they do usually in their late thirties or early forties, they call an embryologist.

Em-bry-ol-ogist?

Yes dear. An embryologist gives the mother-to-be, assuming she's not just the egg-donor mother, in which case she'll be the biological mother but not the legal-mother, special shots which control her ovulation and then extract an egg from her ovaries under anesthesia. This is a tender act that only grown-ups do where the embryologist monitors the mother's follicles with ultrasound and may resort to a hormone antagonist to get the timing just right. Then the daddy-to-be, again assuming he's not just an anonymous donor in a cryobank, goes into a little room and uses provided magazines, or the gay magazines that are hidden under the sink, to auto-erotically put his germ-line in a dixie cup and gives it to a nurse. The two fluids are mixed in the precious beaker-of-love and that makes a zygote -- or, more typically, several zygotes.

Babies comes from goats?

No honey, not goats, ZY-gotes. Zygotes are single-celled fertilized eggs.

So then you have a baby?

Well, some people think so. Others think they're just tissue, like your skin or liver. Either way, the embryos get put in storage and whether or not you believe they are babies, everyone seems okay with freezing them solid.

Babies are frozen? Like popsicles?

Yup, that's how they come. Little babysicles. You defrost them like TV dinners and then you implant them in the mother's uterus, or the surrogate mother's uterus if the legal-mother-to-be can't or doesn't want to carry to term. But before you can do that, the embryos are graded for quality by how fast they grow.

Graded, like in school?

Yeah, like that, except those that don't win this race are destroyed.

You mean they're killed?

Well, some people think that. But then again, those slow ones were probably not going to implant much less make it to term so they would have died anyway.

Teacher said that my friend Tommy's not ever going to accomplish anything because he's slow since he goofs off. Are they going to destroy him?

Well, no, Tommy's not an embryo.

So, they should have destroyed him when he WAS an embryo because he's slow?

Well, no, I mean, unless he was slow as an embryo, then yeah, I guess so. Anyway, pay attention, because this is where the act of love gets complicated since it's at this point where different insurance regimes have different policies. See, the procedure of baby-making is expensive, and the insurance companies in the US don't want to pay for more than one procedure so this puts pressure to implant more than one at a time. This increases the odds that at least one of the embryos grows to term but it's risky because you might get twins or triplets or even more sometimes. But in countries where there's single payer insurance then they just implant one because costs are better controlled. And that, honey, is why you have a fraternal twin brother.

But you said they plant three?

IM-plant three sweetie, not plant three. But yeah, they do, I mean, they did. Your sister or brother didn't attach so he or she didn't come to term.

I'm sad.

Yes honey, but not as sad as mommy and daddy were when the first two attempts at in vitro failed. The deductible almost killed daddy because back then daddy did contract work and could only afford catastrophic.

Cats in a trough?

No honey, "catastrophic" -- that's when the insurance company doesn't want to pay for baby making.

But what about baby sister? Why doesn't she have a twin?

Well, just to clarify, "baby sister" as you call her is actually older than you. See, she came from one of the frozen embryos from the first round of IVF because we switched back to the first embryologist on our third try because that doctor was then in-network to our HMO plan. So actually little sister was conceived two years before you, stayed frozen until you were three and was then unfrozen and implanted last year. She doesn't have a twin because her other two siblings didn't make it.

They died?

Well, yes, if you think they were alive in the first place. But that's something you'll have to decide on your own as you grow up.

I don't like baby making! I'm never-ever going to make babies! I don't want to talk about this anymore! WAHH! NO! NO! NO!

Listen, honey, you behave. Remember, we froze you once and we can freeze you again!

(Tip of the pen to Rob and Steve at lunch today)

Sunday, July 5, 2009

Communication is not a luxury

Drew up this cartoon this afternoon after thinking about how important communication technology is in a society that has committed to it. For example, there are no pay phones anymore.

Friday, July 3, 2009

Understanding logic level conventions

Over lunch John clarified a few things for me about the nomenclature used for transfer functions.

A "transfer function" is the model of how a gate/amplifier behaves. Given an input level (voltage for an electrical device or molarity for a chemical one) the model describes the equilibrium (or steady-state) output level. The above graph illustrates a hypothetical transfer function.

The main point of confusion for me was "What exactly is the definition of 'gain'?" and "By what convention are logic levels defined?"

John pointed out that the word "gain" is an over-used / abused word. Many people over-simplify the transfer function graph above and use 'gain' to mean different things. The gain is the slope -- but as you can see the slope of the function is different at different input levels so there is no such thing as "the" gain for a gate.

In the middle, linear range, the slope is roughly constant over an input domain. When building analog devices it is this roughly-linear region that is of interest and so an analog engineer would probably refer to the approximately-constant slope in this linear region as "the gain".

However, a digital engineer uses the wider non-linear range to encode a binary variable. In this case, we need a convention that defines the logic levels. The electrical convention is that the two places where the slope, aka the "incremental gain", is equal to 1 define the inside bounds of the logic levels. Anything outside of these bounds is considered a valid logic level. Anything inside of them is considered "undetermined". The nominal values (the desired levels to be obtained by any gates) are defined by a "noise margin" outside of these inc. gain=1 points.
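To make the convention concrete, here is a rough Python sketch that finds the two gain=1 points numerically. The transfer function is an assumption on my part: a logistic curve standing in for a non-inverting gate, with invented values for V_MAX, STEEPNESS and MIDPOINT (an inverting gate would use a slope of -1 instead). All function names are hypothetical, and the noise margin is left out.

```python
import math

# Hypothetical non-inverting gate: a logistic curve from 0 V to V_MAX.
# V_MAX, STEEPNESS and MIDPOINT are illustrative values, not a real part.
V_MAX = 5.0
STEEPNESS = 4.0   # per volt
MIDPOINT = 2.5    # volts


def transfer(v_in):
    """Steady-state output level for a given input level."""
    return V_MAX / (1.0 + math.exp(-STEEPNESS * (v_in - MIDPOINT)))


def incremental_gain(v_in, dv=1e-6):
    """Slope of the transfer function at v_in (central difference)."""
    return (transfer(v_in + dv) - transfer(v_in - dv)) / (2.0 * dv)


def unity_gain_point(lo, hi, tol=1e-9):
    """Bisect for the input where the incremental gain crosses 1.

    Assumes (gain - 1) has opposite signs at lo and hi.
    """
    f_lo = incremental_gain(lo) - 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = incremental_gain(mid) - 1.0
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def classify(v_in, v_low_max, v_high_min):
    """Map an input level to a logic value under the gain=1 convention."""
    if v_in <= v_low_max:
        return "0"
    if v_in >= v_high_min:
        return "1"
    return "undetermined"


if __name__ == "__main__":
    # The gain peaks at the midpoint, so search on either side of it.
    v_low_max = unity_gain_point(0.0, MIDPOINT)
    v_high_min = unity_gain_point(MIDPOINT, V_MAX)
    print(f"logic 0 up to {v_low_max:.3f} V, logic 1 from {v_high_min:.3f} V")
    for v in (0.5, 2.5, 4.5):
        print(v, classify(v, v_low_max, v_high_min))
```

Between the two points the gate amplifies small wiggles (gain above 1), which is why that band can't be trusted as a logic level; outside them the gate squashes noise (gain below 1), which is what makes the levels restorable from stage to stage.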