I read a little while back that a team at Stanford and elsewhere had developed a program that could accurately diagnose breast cancer by doing digital image processing of tissue samples. I thought this was an impressive achievement, but it’s only after I read the paper itself (nicely summarized here) that I realized just how impressive it is, and how significant.
Here’s a high-level, simplified outline of the effort:
- The team used a set of 248 breast cancer samples from women in the Netherlands to train the program, called C-Path, to grade the severity of the disease by digitally ‘looking at’ slides of tissue. The team had to manually train C-Path to recognize the difference between two types of tissue (epithelial and stromal), but beyond that they just let the program draw its own conclusions about what was important in the images.
- C-Path came up with a set of 6642 important features for such images. Again, these were not pre-identified, or even suggested, by previous research or human pathologists or computer scientists — they were just the aspects of these images, as a group, that C-Path deemed to be noteworthy or potentially significant. As David Rimm observes in his writeup of the research, these aspects included not only the standard elements of a picture of a bunch of cells (the nuclei, for example), but also “higher-level contextual, relational, and global image features that might not make sense to pathologists.” In other words, C-Path saw different things in the images than a human analyst would.
- The researchers then used the training set of 248 samples to teach C-Path which of these features were associated with cancer. To be a bit more precise, they took as the outcome of interest whether or not the women were still alive five years after providing the sample, then used machine learning to identify which of the 6642 features were most strongly associated with this outcome. The result was a C-Path ‘score’ that predicted the 5-year survival rate.
- Most of the features were irrelevant in predicting the survival rate, but some of them were highly relevant. What’s perhaps most interesting is that some of the relevant ones had not previously been identified. In other words, they were not things that today’s human pathologists consciously look for or know to be important. Three features of stromal tissue, for example, turned out to be important. According to the study’s authors, “these findings implicate stromal morphologic structure as a previously unrecognized prognostic determinant for breast cancer.”
- The trained C-Path software was then set to work analyzing a separate set of 328 Canadian samples. The software predicted survival rates for these women without knowing the actual outcomes in advance. The human researchers, who did know the actual survival rates, found that these predictions were highly accurate.
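The pipeline in the bullets above can be sketched roughly in code. To be clear, this is a toy illustration, not the study’s method: the data is simulated, the feature count is shrunk, and the screening-plus-linear-score model is a stand-in for the more sophisticated machine learning the researchers actually used. The idea it demonstrates is the same, though: start with thousands of candidate image features, let the 5-year survival outcome tell you which few matter, and combine those into a predictive score.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the study's data: rows are tissue samples, columns are
# image features (the real pipeline extracted 6,642; we simulate 200).
n_samples, n_features = 248, 200
X = rng.normal(size=(n_samples, n_features))

# Let a handful of "informative" features actually drive the outcome,
# mimicking the finding that only a few of the thousands mattered.
informative = [3, 17, 42]
signal = X[:, informative].sum(axis=1)
# Binary outcome of interest: alive five years after providing the sample.
survived_5yr = (signal + rng.normal(scale=0.5, size=n_samples)) > 0
y = survived_5yr.astype(float)

# Step 1 — screening: rank every feature by the strength of its
# association (here, absolute correlation) with the survival outcome.
corr = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(n_features)])
top = np.argsort(-np.abs(corr))[:10]

# Step 2 — scoring: combine the selected features into a single linear
# "score" via least squares, analogous in spirit to the C-Path score.
w, *_ = np.linalg.lstsq(X[:, top], y - y.mean(), rcond=None)
score = X[:, top] @ w + y.mean()

# The screening step should rediscover the truly informative features.
recovered = sorted(j for j in informative if j in top)
print("informative features recovered by screening:", recovered)
```

In the real study, of course, step 2 would be validated on a held-out set (the 328 Canadian samples) rather than on the training data, which is exactly what makes the reported accuracy meaningful.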
So now we have a digital pathologist that can accurately diagnose breast cancer by looking at a slide of tissue, just as humans do. Is this a big deal?
I think it’s a huge deal, for a few reasons. First of all, I bet C-Path will be more accurate than the average human pathologist. Remember, it’s already learned to look for important things that humans have never known to look for. As the editors of Science write in their introduction to the paper, “The C-Path score yielded information above and beyond that from many other measures of cancer severity including pathology grade, estrogen receptor status, tumor size, and lymph node status.”
And with more training, the program is only going to get better. I’d wager that a well-designed study would show that C-Path beats the average well-trained and experienced human pathologist at diagnosing breast cancer if the same sets of slides are presented to both the computer and the person. In fact, I’ll go further: I’ll bet it would beat the world’s best human at this task.
Whether or not it’s more accurate, C-Path will definitely be more consistent than any group of human pathologists, or even a single one over time. Medical diagnosticians don’t agree with each other, and are maddeningly inconsistent over time. For important tasks like figuring out what’s wrong with us we need accuracy, standards, and consistency. Computers have always been good at the latter two, and are now getting much better at the former.
C-Path is also cheaper and more scalable. The marginal cost of one additional digital diagnosis is pretty close to zero, and the costs of setting up and training the system will be spread widely if it’s successful. This will drive down the prices of digital diagnosis over time. Human diagnosis, meanwhile, will continue to be labor intensive, and so expensive.
C-Path diagnoses can be done around the world. Imagine that the program and others like it run as SaaS platforms in the Cloud, and that smartphones in the developing world have cameras that can record an image with enough resolution to permit diagnosis. Imagine how much health outcomes could be improved.
I think there will be many other programs like C-Path soon. There are plenty of extraordinarily powerful computers available now. They’re getting cheaper over time, so more and more labs and teams can afford them. These teams are increasingly cross-disciplinary — composed of digital, medical, and data geeks.
And even though we’re in an era of big data, C-Path needed a surprisingly small amount of data to get as good as it did — good enough to recognize things that no human pathologist had noticed. It reached its insights and predictive capability from a sample size of less than 250 slides.
Rimm writes that “The human brain is exceptional at pattern recognition, and pathology is one specialized field in medicine in which this skill is honed to an extreme degree.” I certainly agree, but as Erik and I stress in Race Against the Machine, the pattern recognition abilities of computers are improving by leaps and bounds these days. In domains like playing Jeopardy! and playing chess, computers are just flat better already. In domains like medical diagnosis, I predict they soon will be.
If and when that happens, it’ll be very, very interesting to watch how readily the medical establishment acknowledges that fact. I bet they’ll be a lot less honest with themselves, and with us, than Jeopardy! champion Ken Jennings was. He concluded his loss against the Watson supercomputer by welcoming his new computer overlords.
Will our highly regarded and well-paid medical professionals do the same? We’ll soon find out…