You Get To Know How To Fold ‘Em…

by Andrew McAfee on August 12, 2010

When a brand-new protein rolls off the ribosome assembly line within a cell, it’s basically just a strip of amino acids in a pre-determined sequence. It then quickly bends, twists, and folds itself into a convoluted shape, the same one every time. This final folded shape is determined by….

no one knows.

Some of the basic rules are clear, but most of them are not. After 45 years of vigorous research we still have only the dimmest understanding of protein folding. As Joachim Pietzsch writes, “Of all the molecules found in living organisms, proteins are the most important. They are used to support the skeleton, control senses, move muscles, digest food, defend against infections and process emotions.” Because they’re so vital, we’d love to know more about what makes them tick (or, in this case, form).

So why not use computers to simulate the folding process, thereby gaining a better understanding? Why not write an application that takes a given protein’s fresh-off-the-assembly-line shape, applies all known folding rules to it, and tests to see which ones get the molecule into its final (known) shape? Programs like Rosetta do exactly this, but they run up against a nasty problem: even simple proteins are so complex that the fastest simulations can’t test all possibilities. Pietzsch explains:

“Cyrus Levinthal calculated in 1969 that finding the [final shape] by simple trial and error would be impossible. He said that even if a protein only consisted of 100 amino acids and each of these flexible residues could only take on two different spatial orientations, the protein could theoretically adopt as many as 10^30 possible conformations. Assuming a protein could try out 100 billion different conformations per second, it would still take 100 billion years to try all possibilities.”
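
The arithmetic behind Levinthal's estimate is easy to check. The sketch below uses only the figures from the quote (100 residues, two orientations each, 100 billion trials per second); the length of a year in seconds is my own assumption. With those inputs the count of conformations is 2^100, or roughly 1.27 × 10^30, and the search time comes out at hundreds of billions of years, the same order of magnitude as the quoted figure.

```python
# Back-of-the-envelope check of Levinthal's estimate, using the figures in the quote.
RESIDUES = 100                 # amino acids in the hypothetical protein
STATES_PER_RESIDUE = 2         # spatial orientations each residue can take
TRIALS_PER_SECOND = 100e9      # 100 billion conformations tried per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: a 365.25-day year

# Every residue chooses independently, so the possibilities multiply.
conformations = STATES_PER_RESIDUE ** RESIDUES

# Time needed to visit every conformation once at the assumed rate.
seconds_needed = conformations / TRIALS_PER_SECOND
years_needed = seconds_needed / SECONDS_PER_YEAR

print(f"conformations: {conformations:.2e}")
print(f"years to try them all: {years_needed:.2e}")
```

Adding even one more residue doubles the count, which is why brute force stops being an option almost immediately.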

Rosetta and other protein folding algorithms do much better than simple trial and error; they incorporate all the rules and ‘tricks’ that we know about. But as Pietzsch writes, “Any realistic hope of cracking the folding code… is probably a very long way off.”

If the smartest biochemists and fastest computers have made so little progress on this bitterly difficult problem over half a century, it seems ludicrous to think that novices will be able to contribute much. But a paper published earlier this month (pdf) in Nature shows that amateurs can fold proteins better than anyone or anything else when they’re given the right training and incentives, and when they’re given digital tools that allow them to experiment, collaborate, and self-organize.

A few years back a team at the University of Washington took cues from both the phenomenon of massively multiplayer online roleplaying games and the concept of crowdsourcing scientific problems and developed Foldit, a protein folding game. Foldit presents protein folding as a visual or spatial challenge to the player, whose goal is basically to arrange an on-screen protein into the smallest possible shape that obeys all the game’s rules.

A set of starter puzzles familiarizes players with the game’s interface, rules, and solving aids. After completing these, players are ready to tackle actual proteins. As they do so, they can work alone or join groups, many of which are open to all comers. They can also read and contribute to a wiki about the game and its strategy.

Players strive to get high scores on each puzzle; since correctly folded proteins are in the lowest possible energy state, a player’s Foldit score is the opposite of the energy of the molecule they’ve created. Keeping score, of course, leads to rivalry and competition as people and groups strive to outdo each other and be recognized as the best at the game. There are no cash rewards.
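
To make the score-versus-energy relationship concrete, here is a minimal sketch. The function name, the fold names, and the energy values are all hypothetical, and the real game may well scale or offset its scores in ways this post doesn't describe. The only point is that maximizing a score defined as negative energy picks out the same fold as minimizing the energy.

```python
def foldit_style_score(energy):
    """Hypothetical scoring rule: the score is the negative of the energy."""
    return -energy

# Made-up energies for three candidate folds (lower energy = more stable).
candidate_energies = {"fold_a": -12.5, "fold_b": -40.0, "fold_c": 3.2}

scores = {name: foldit_style_score(e) for name, e in candidate_energies.items()}

# The highest-scoring fold and the lowest-energy fold are the same fold.
best_by_score = max(scores, key=scores.get)
lowest_energy = min(candidate_energies, key=candidate_energies.get)

print(best_by_score, lowest_energy)
```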

The first public beta Foldit downloads became available in May of 2008, and since then more than 57,000 people have played the game. How well are they doing? The Nature paper reports the results of ten blind challenges — prediction puzzles involving proteins whose final, folded shapes were known to the paper’s authors but not “contained within publicly available databases for the duration of the puzzles.”

As the authors write, “we hypothesized that human spatial reasoning could improve both the sampling of conformational space and the determination of when to pursue suboptimal conformations if the [random elements of current algorithms] were replaced with human decision making while retaining the deterministic Rosetta algorithms as [tools for players].”

Was this hypothesis correct? In three of the ten challenges the best Foldit players and the best current simulations performed similarly — that is, the two approaches got about equally close to the final folded shape of the protein. In five other challenges, the best result from Foldit was substantially better than the best a superfast computer alone could do. In only two of the ten cases did the simulation do substantially better. These appeared to be the two hardest puzzles; neither Foldit players nor computers alone were able to get very close to the correct shape.

What explains this astonishing result? How can it be that crowds of people playing a game on desktop computers do better at this important task than supercomputers programmed by superscientists?

Are the scientists themselves the ones playing Foldit? Maybe, but they’re not the best at it; none of the five highest-scoring players took chemistry classes beyond high school.

The team behind Foldit realized that even ‘normal’ people have a number of interesting attributes that make us well-suited to tackle protein folding challenges.

  • We are particularly strong at spatial reasoning, or literally seeing solutions. As is the case with all primates, a substantial portion of our brains is devoted to processing visual signals. This means that when a puzzle is posed to us in spatial terms, we can apply a lot of our cranial horsepower to it. And protein folding is, at its heart, the work of folding an initial shape into a smaller shape.
  • We have intuition, especially after lots of experience in a given domain like Foldit. We develop a sixth sense for smart moves in the game, and even though most of us couldn’t explain where the idea for a particularly far out move came from, protein folding seems often to rely on such moves. It’s very hard to program computers to make intelligent-yet-far-out moves. We can program in randomness (and the best protein folding programs do just that), but it’s hard if not impossible to program in smart, intuitive randomness. As the Nature team writes, “We found that Foldit players were particularly adept at solving puzzles requiring substantial backbone remodelling… stochastic Monte Carlo trajectories [in other words, random computer guesses] are unlikely to [find] the coordinated backbone… shifts needed…”
  • We have great adaptivity. We can change our strategies and approaches over time based on what we learn and intuit about what’s working well, and what’s not. The Nature paper has some cool graphs showing how people change their mix of Foldit ‘moves’ after the first hour, first day, and first week of playing. Again, it’s hard to program computers to do this well.
  • In addition to being highly visual, we humans also inherently tend toward collaboration. We form teams and share knowledge among members pretty effectively, thanks to the gift of language. Many scholars believe that language is what most fundamentally separates us from all other animals. And technologies like wikis are a big step forward in facilitating collaboration within geographically dispersed groups.
  • While collaborating, we exercise a high degree of self-organization. The Foldit researchers found that “Within teams, there is often a division of labor: some players specialize in early-stage openings, others in middle- and end-game polishing.” I would bet that these roles were not assigned by team captains; instead, people fell into them unconsciously over time, and also fell into effective workflows within teams. This is exactly what’s happened with Wikipedia, and I’d be surprised if the situation were radically different within Foldit.
  • Finally, we love competition. The desire to get ahead of a rival or be on the game’s leader board can be a powerful motivation. Foldit’s designers did a great job of tapping into this motivation. They also made the game engaging to play and provided frequent feedback to players, thereby increasing intrinsic motivation as well.
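
For readers wondering what “stochastic Monte Carlo trajectories” look like in practice, here is a toy sketch. It is not Rosetta's algorithm: the energy function, the parameter values, and the function names are all my own assumptions, chosen only to show the general Metropolis idea of proposing a random change, always accepting it if the energy drops, and occasionally accepting it if the energy rises.

```python
import math
import random

def toy_energy(angles):
    """Hypothetical energy: zero (the minimum) when every angle is 0."""
    return sum(1.0 - math.cos(a) for a in angles)

def metropolis_search(n_angles=10, steps=5000, step_size=0.3,
                      temperature=0.1, seed=0):
    """Random search over a list of angles using the Metropolis rule."""
    rng = random.Random(seed)
    angles = [rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
    energy = toy_energy(angles)
    start_energy = energy
    best_energy = energy
    for _ in range(steps):
        # Propose a small random change to ONE randomly chosen angle.
        i = rng.randrange(n_angles)
        candidate = list(angles)
        candidate[i] += rng.gauss(0.0, step_size)
        cand_energy = toy_energy(candidate)
        delta = cand_energy - energy
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            angles, energy = candidate, cand_energy
            best_energy = min(best_energy, energy)
    return start_energy, best_energy

start, best = metropolis_search()
print(f"start energy: {start:.3f}, best energy found: {best:.3f}")
```

Notice that each proposal nudges a single angle. A move that requires many angles to shift together in a coordinated way is very unlikely to emerge from a sequence of such independent nudges, which is roughly the weakness the Nature team says human players compensate for.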

Taking advantage of all these features simultaneously led to better outcomes in the dauntingly difficult domain of protein folding. By studying what people did to rack up high scores in Foldit, we may be able to improve how well computers alone work in this domain. As the authors write, “More in-depth analysis of player strategies should provide further insight… and could lead to improved automation algorithms for protein structure prediction.” I’ll go a big step beyond that statement: studying how people succeed at Foldit might help us better understand not only computer simulation of protein folding, but perhaps also protein folding itself.

The Foldit team did science that was both rigorous and creative, and they deserve at least as much attention as they’re getting. They also deserve credit for realizing that when faced with a nasty problem, the smart approach is not always to retrench – to rely more heavily on established experts and powerful computers.

Instead, when the tools needed for effective problem solving can be widely and cheaply distributed, the responsibility for problem solving can be, too. And as the Foldit results and lots of other evidence show, expertise (for problem solving, innovation, etc.) is emergent. It’s out there in large quantities, and in hard-to-predict places. A problem solving approach that lets pockets of enthusiasm and expertise manifest themselves and find each other can yield surprisingly large rewards, even in the unlikeliest places.

Where else have you seen widely distributed problem solving work well? Where else should it be applied? Leave a comment, please, and let us know…

p.s. If you’re not an American older than about 35 or a big Kenny Rogers fan, the title of this post might not make much sense. In which case this link and this one might be helpful…

http://www.amazon.com/Language-Instinct-How-Mind-Creates/dp/0060976519
  • http://twitter.com/deb_lavoy deb louison lavoy

    This is simply awesome validation of our intuition that many heads are better than one. I think the great challenge is to enable environments where people can have a shared picture of the challenge at hand, and the tools to aggregate, iterate and deliberate on its solution. (and I can't wait). This seems to build on the talk you gave at palantir – a great one.

  • JAJansenJr

    I think it would be very interesting to have successful “folders” articulate in words what “tricks” they find useful in folding a protein. It would be further useful to compare these “tricks” to known rules for intermolecular interactions, viz. the rules that electrostatic interactions may dominate the folding forces (cf. the Hellmann-Feynman theorem).

  • http://twitter.com/esauve Eric Sauve

    Hello, this would seem like a false truth. It does not seem like crowdsourcing solved the problem here, but there happen to be a FEW of the crowd who were able to match/ out-perform the computer. This would seem normal. It IS interesting that the people succeeding have no formal expert training, but again, while it offers promise, it still doesn't help in the creation of a general understanding about how to fold proteins.

  • http://infocognito.blogspot.com/ Aaron Taylor

    This seems to really relate to the mathematical “P versus NP” problem, and to a very recent NYTimes article that describes how the power of social media has come into play in a process to prove/disprove one mathematician’s claim to having answered the problem. He posted his claim, and the blogosphere came alive with a new wiki and other sites for collaborative review and dissection of the claim. This can be seen in direct contrast to the scientific approach used in our recent past, which seemed so advanced at the time – discussions shared through emails, face-to-face meetings that evolved slowly into web-meetings; but now, the collaboration/discussions are nearly instantaneous, with the capability of continual input and updating of information at a fantastic rate.

