Thursday, May 26, 2011

I'm pleased to announce the first version of my new learn-to-program environment called "". It's a JavaScript environment that lets novice programmers create simple graphical programs with minimal fuss, all from the web browser.

It also features an editable mini-browser with tutorial and challenge sections where more advanced programmers can create content to teach and challenge less experienced programmers.

I haven't added much in the way of tutorials yet, so if you're a programmer please come and help fill it out!

If you've never programmed before, here's your chance to write your first "Hello world!"

Create an account to save your programs or edit the pages. Please report bugs by editing the "bugs and features" page off the home page in the mini-browser.

Tuesday, May 24, 2011

Traitwise, PGP, data standards and the OSI model

I spent the day with Jason Bobe of the Personal Genome Project (PGP). PGP is a public effort to sequence a large number of genomes. Traitwise has formed a partnership with them to serve as part of the phenotyping solution.

Jason's role at PGP puts him at the center of a growing number of people with competing desires to control or provide services around personal health (or "quantified self" as some are calling it). Jason is therefore in a unique position to promote and evangelize.

An interesting part of the conversation focused on the need for open standards for public health and trait-related data. I used the OSI networking model as an example of how conceptual standards can be incredibly useful for increasing innovation. One of the best things about the OSI model is that it didn't attempt to actually define any standards; rather, it serves as a conceptual framework that defines how standards interface with each other. From that model a number of interoperable standards have emerged at all levels, and without it we certainly wouldn't live in the networked world we do.

One aspect of the human data problem is privacy, which has "levels of data privacy", and this, it occurred to us, is somewhat analogous to the OSI layer framework.

Layer 1 - Raw identifiable data (post-privacy)
Layer 2 - Anonymized raw data (HIPAA compliant)
Layer 3 - Algorithmically open data (sand-boxed, machine readable)
Layer 4 - Aggregated data

Layer 1 data is non-private. It requires a consent certificate of some sort to go along with the data.

Layer 2 data is considered by many scientists to be adequately protected in many research circumstances, and indeed HIPAA seems satisfied with this. However, for open projects such as Traitwise, I personally don't think it suffices, since de-anonymization has been demonstrated in many circumstances, such as the infamous AOL search records scandal.

Layer 3 is perhaps the most interesting and least discussed. A Layer 3 system would allow an algorithm written by a researcher to run against raw data but within a sandbox that only allows the aggregated results to emerge. I haven't put a huge amount of thought into this, but it seems plausible to write an API that could enforce such constraints. But, even without such an API, a human code reviewer could accomplish the same thing.

Layer 4 is what Traitwise and others are currently focusing on -- aggregated data that is not deanonymizable.
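To make the Layer 3 idea a little more concrete, here is a rough sketch in Python of what such an API might look like. Everything in it is hypothetical: the class name AggregateOnlySandbox, the MIN_GROUP_SIZE threshold, and the whitelist of aggregates are my own assumptions for illustration, not anything Traitwise or PGP has built. It is also not a real security boundary, since the researcher's functions still run in the same process as the raw data. A real system would need genuine isolation or the human code review mentioned above.

```python
from typing import Any, Callable, Dict, Iterable, List, Mapping

Record = Mapping[str, Any]

# Hypothetical policy: never release a result computed from fewer people than this.
MIN_GROUP_SIZE = 5

# Only these whitelisted aggregates may leave the sandbox (assumed list).
AGGREGATES: Dict[str, Callable[[List[float]], float]] = {
    "count": lambda values: float(len(values)),
    "sum": lambda values: float(sum(values)),
    "mean": lambda values: sum(values) / len(values),
}


class AggregateOnlySandbox:
    """Holds raw records privately and only hands back aggregated numbers."""

    def __init__(self, records: Iterable[Record],
                 min_group_size: int = MIN_GROUP_SIZE) -> None:
        self._records = list(records)
        self._min_group_size = min_group_size

    def run(self,
            select: Callable[[Record], bool],
            value: Callable[[Record], float],
            aggregate: str) -> float:
        """Aggregate value(r) over the records where select(r) is true."""
        if aggregate not in AGGREGATES:
            raise ValueError(f"aggregate {aggregate!r} is not allowed")
        values = [float(value(r)) for r in self._records if select(r)]
        # Refuse small groups, where an "aggregate" could expose an individual.
        if len(values) < self._min_group_size:
            raise PermissionError("group too small to release a result")
        return AGGREGATES[aggregate](values)


# Example with made-up records: ten people, every other one a smoker.
records = [{"age": 30 + i, "smoker": i % 2 == 0} for i in range(10)]
sandbox = AggregateOnlySandbox(records)
mean_smoker_age = sandbox.run(lambda r: r["smoker"], lambda r: r["age"], "mean")
```

The researcher never receives the records themselves, only a single number, and a query that matches too few people is rejected outright. The minimum group size is the piece doing the privacy work here, and choosing it well is a judgment call this sketch does not try to settle.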

This is just one aspect of the data and communications problem, but it is an important one and it was fun talking to Jason about it today.

Monday, May 16, 2011

Freshman Research Initiative -- First Semester Results

We just finished our first semester of the Freshman Research Initiative on Genetic Algorithms for Pattern Design. We paired up students from Computer Science with students from the Fashion/Textiles department and had them create algorithmic patterns that could be bred genetically. The results are on this page. Next semester we're going to print these patterns on fabric and make clothing from them.