Detecting Cancer with AI

How to Listen

Listen on Google Podcasts, Apple Podcasts, or via the SoundCloud media player above.

In this Episode

October is Breast Cancer Awareness Month. In this episode, we highlight an innovative technology for doctors treating patients with breast cancer. SwRI engineers and pathologists from UT Health San Antonio combined their expertise to develop a computer algorithm that quickly detects and analyzes breast cancer tumor cells on pathology images. The algorithm holds promise for faster, more accurate cancer detection, diagnosis, and treatment. The team’s brilliant work is grabbing global attention.

Listen now as we learn about potentially life-saving artificial intelligence.

TRANSCRIPT

Below is a transcript of the episode, modified for clarity.

Lisa Peña (LP): Artificial intelligence, or AI, is showing promise in the fight against breast cancer. October is National Breast Cancer Awareness Month. And today, we are highlighting a computer algorithm that detects breast cancer tumor cells. The technology could one day save lives. It's AI for your health on this episode of Technology Today.

[MUSIC PLAYING]

We live with technology, science, engineering, and the results of innovative research every day. Now, let's understand it better. You're listening to the Technology Today podcast, presented by Southwest Research Institute. Hello, and welcome to Technology Today. I'm Lisa Peña. October is National Breast Cancer Awareness Month. According to the National Cancer Institute, about 1 in 8 women will develop breast cancer over her lifetime.

Our guests today are experts in two different fields. But they combined their knowledge to develop a new and potentially life-saving technique to identify and analyze breast cancer tumor cells. Dr. Bradley Brimhall is a pathologist from UT Health San Antonio, and David Chambers is a Southwest Research Institute engineer. They are part of an innovative team that trained a computer algorithm to sort through slides of tissue and recognize which ones were cancerous. And their outstanding work won first place in an international competition. Thanks for joining us, Brad and David.

Dr. Bradley Brimhall (BB): Thanks.

David Chambers (DC): Thank you.

LP: So let's start with David. Let's talk about this algorithm. What exactly does it do?

DC: Well, the algorithm is designed to provide an estimation of the cancer cellularity. The cellularity is a particular way of quantifying the cancer, which is important for the prognosis and, sometimes, the diagnosis of specific cancers.

LP: So you're talking about cancer cellularity. And you had mentioned assigning a number to it. And Dr. Brimhall, if you'd like to expand on that.

BB: Yeah, quantifying the amount of cancer in a single digital image from a slide. And part of that involves recognizing that it's cancer, and secondly, quantifying the percentage of the slide that is cancer, or are cancer cells.

LP: So why is it so important to recognize these cancer cells quickly and accurately?

BB: Yeah, I mean, it's important diagnostically. We do that with our eyes currently as we look through a microscope. But our ability to quantify is limited just because we're human and we don't get things to the level of precision that a computer can get. So that additional information, as well as other information from the cancer cells themselves, can be very powerful diagnostically, but also prognostically. It could enable us to say, this patient may do a little better than average. This patient may not do as well. And maybe we need to adjust therapy accordingly. So it could be very powerful.

LP: So essentially, the algorithm looks at the cells and can determine how much, or how many cells in a slide may contain cancer. Am I getting that right?

DC: Yes. The technical definition is actually based on the area occupied by the cancer cells versus the overall area of any tissue that's not malignant.

LP: So you're looking at a percentage?

DC: Yes. And also, it's a fairly tricky and difficult problem as far as medical imaging goes. Because the cancer cells themselves individually can look a lot like the healthy cells that they are derived from or mutated from. And so you have to be able to use overall image information to really distinguish which cells are healthy, and which cells are not.
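As a rough illustration of the percentage being discussed, the sketch below computes cellularity from two pixel masks. It assumes cellularity is the share of the tissue area occupied by malignant cells; the function name `cellularity_percent` and the tiny example masks are hypothetical and are not taken from the team's code or data.

```python
def cellularity_percent(malignant_mask, tissue_mask):
    """Percentage of the tissue area that is occupied by malignant pixels.

    Both arguments are equally sized 2D lists of 0/1 values (hypothetical
    inputs): malignant_mask marks cancer-cell pixels, tissue_mask marks
    pixels that contain any tissue at all.
    """
    tissue_area = 0
    malignant_area = 0
    for mal_row, tis_row in zip(malignant_mask, tissue_mask):
        for mal, tis in zip(mal_row, tis_row):
            if tis:
                tissue_area += 1
                if mal:
                    malignant_area += 1
    if tissue_area == 0:
        # No tissue in the patch, so there is nothing to score.
        return 0.0
    return 100.0 * malignant_area / tissue_area


# Made-up 4x4 patch: 10 tissue pixels, 3 of them malignant.
tissue = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
malignant = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
score = cellularity_percent(malignant, tissue)
print(f"{score:.1f}%")
```

The hard part described in the conversation is producing the malignant mask in the first place; once a mask exists, the score itself is simple arithmetic.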

LP: So Brad, right now, pathologists are doing that work.

BB: Yes. We take the tissue, we process the tissue, and then we stain it, put it on a slide, and look at it through a microscope. And then we determine, are the cells malignant or not malignant? Which are the malignant cells, which are not? And we can do that, but our ability to quantify is not as good as a computer. And we're focused on what we're focused on, so we can't focus on the entire slide at the same time, or a large area at the same time.

So I think there's pieces of information that remain unused simply because we can't process it within the human brain. So we're pretty good, but I think that this kind of technology could be a real powerful adjunct to help us do things even better.

LP: So as a pathologist, when you're there analyzing these slides with your own eyes under a microscope, what is the weight of that? Obviously, you're feeling someone's life is in your hands at that moment. What does that feel like?

BB: Yeah, well, you want to get things right. So use all the information you can gather from your eyes, as well as other clinical information that's available, and then you can use that to make a diagnosis. And then once you've made the diagnosis, in many cases of cancer, in this case, you'll want to grade it, or you want to give it some additional prognostic information. So far, that's pretty, I mean, usually, we have low grade, high grade, medium grade, intermediate. But we're unable to grade things more precisely than that. And some of the grading systems are somewhat subjective. In other words, do the cells look, you know, how abnormal are they? The terminology we use is, what is the level of atypia in the cells? How abnormal do they look?

Well, that's pretty subjective. I mean, you can take extreme cases and say those cells look very bad. And people can agree on that. And then some of these cells don't look very bad at all. And then you have in the middle, where it's just a lot of disagreement, which makes a lot of sense. So I think computers could be a very big help for what we do.

LP: Yeah, something like an algorithm is a little more definitive.

BB: Yes, especially when it's combined with what we're doing. Yeah.

LP: OK, so you are both part of a team of pathologists and engineers that came together to make this algorithm happen. So let's kind of start from the beginning. How did you develop the algorithm? What was the process of working together to achieve this? How did this come together?

DC: Well, we at SwRI, started really looking to apply our expertise in artificial intelligence and machine learning to look at what we can do in the medical field. It is something that we had done previously, I'm told, and something that had been sort of stagnant for some time. But given our resurgence in AI research and a lot of the other things that we were doing, it seems to make sense. We actually started by identifying a couple challenges to participate in.

And one of the challenges that we identified was the BreastPathQ cancer cellularity challenge. And it started off as just me, a team of one person, really looking through the data, seeing what I could do with it. And I think very quickly, I found out that I was not going to be able to look at these and tell you which cells were cancerous and which cells were not. I couldn't tell you which cells or which type, and what other tissues were.

So it was very clear that if we were going to do well in something that we were trying to achieve, we were going to need the subject matter expertise. So a group leader, Hakima Ibaroudene at Southwest Research Institute, and some of the other managers, our director, Ryan Lamm, and I think Kase Saylor, a manager, also helped us to identify, through contacts we had, a couple of researchers at UT Health San Antonio.

LP: So one of them was Dr. Brimhall.

DC: And one was Dr. Brimhall.

LP: So they call you in and say, we want to develop a computer algorithm. And you're a pathologist. So what were your thoughts on being able to help an engineering team with this task?

BB: Yeah, I don't think we knew until we met them. So it was intriguing. So we said, well, we need to talk to the folks from SwRI. And so they came over and visited us on campus at the medical school. And we were very impressed with all three of them. It was David, Donald, and Hakima. And they showed us the breast cancer cellularity challenge. And we sat down with them, spent a few hours going over these cells are cancer, these cells are not. And I know David took copious notes. And I don't know exactly, he taught the computer somehow to recognize those things.

LP: So how do you teach someone with an untrained eye to identify cancer tumor cells in a slide? What does that look like?

BB: That's a four-year residency training program. So probably about half of that time is spent with a microscope looking at tissue sections from actual patients and learning from those who knew before. So it's kind of an apprentice-like system. And you know, you get better with time. But even so, even the best of us, like I say, there's certain things that we can't quantify well. Our eyes are good at many, many things. And we could recognize patterns well, we're good at recognizing patterns.

But some of the more detailed things about the cells are different. I mean, the roundness of a nucleus, for example. I mean, how round is the nucleus? We can give extreme examples. We can say it's not round at all. It's spindled. Or, it's very round, it's circular, or something in between. But we can't, there's not, I can't tell you it's 68% circular, for example.
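One common way to put a number on "how round" a shape is, offered here only as an illustration and not as the measure the team used, is circularity: 4π times the area divided by the perimeter squared. It equals 1 for a perfect circle and falls toward 0 as the outline becomes more elongated. The sketch below applies it to a traced outline; `circularity` and `ellipse_outline` are hypothetical helper names, and the outlines are synthetic ellipses rather than real nuclei.

```python
import math


def circularity(points):
    """Circularity 4*pi*A / P**2 of a closed polygon given as (x, y) vertices."""
    n = len(points)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the outline
        twice_area += x0 * y1 - x1 * y0  # shoelace formula term
        perimeter += math.hypot(x1 - x0, y1 - y0)
    if perimeter == 0:
        return 0.0
    area = abs(twice_area) / 2.0
    return 4.0 * math.pi * area / perimeter ** 2


def ellipse_outline(a, b, n=360):
    """Synthetic outline: n points on an ellipse with semi-axes a and b."""
    return [
        (a * math.cos(2 * math.pi * k / n), b * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


round_nucleus = circularity(ellipse_outline(1.0, 1.0))     # near-circular outline
spindled_nucleus = circularity(ellipse_outline(4.0, 1.0))  # elongated outline
print(f"round: {round_nucleus:.0%}, spindled: {spindled_nucleus:.0%}")
```

A score like this turns "kind of round" into a repeatable percentage that can be computed for every nucleus in an image.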

LP: But yet, you took your years of training and kind of condensed it into a crash course of a few hours. And somehow, David and the engineering team, they were able to, they were able to identify which cells were cancerous, more or less, because then you had to teach a computer. So what was the process of learning that? Maybe, David, you might want to explain that. What was it that Dr. Brimhall, what was his method that made it click for you?

DC: So I think the advantage we have is that breast cancer, there are only a handful of types. And we only had a handful of types represented. So fortunately, I didn't need four years of looking at things under a microscope. It was very interesting to learn. I think I really got a good foundational basis. It was also eye-opening about the way pathology is done as a whole. And some cases are so hard, that you really can't tell. And I think there were several that we looked at in our data set, where we asked an additional pathologist to look at it.

And I got to learn these things. To a pathologist, they've come up with mechanisms for describing them. Sometimes, it's just, I can't explain it, but I know this looks bad. So I got a lot of the groundwork just by going and speaking with our pathologist partners. So I was able to learn which types of cells were which. I was able to recognize some different forms of the cancer after that.

And then, beyond that, I spent a good long time reading and looking at examples. So we did not have specific examples where it was laid out how many of the cells, or if each individual cell we're looking at is cancerous or not. But I had a bunch of images where I had a number. I had a 0 to 100% of this is how much cancer is here. And I learned to tell, based on reading and based on talking with Brad, what was what. And if I had a question, I cheated and sent an email.

LP: OK, so it was a little mix of everything. But it sounds like you had a really strong teacher and a student really willing to learn here. And it was sort of a dream team that came together. And then you were able to create this algorithm. So what was the next step for you? Once you could understand what you were looking at in those slides, well, you had to teach the computer to do the same.

DC: Sure. Then, after that point, I'm in my comfort zone. I spend my days training AIs to look at images. So what I looked at when I saw the problem is, here is, what we want as the output is a percentage. And I could look at the entire image and guess the percentage based off of that. But what I'd really, really like is an algorithm that gave me a picture of which pixels are cancerous, and which aren't.

At the end of the day, you end up with a, you want the algorithm to give you a white piece of paper that has black dots where there are cancer cells. And I designed my neural network to do just that. I designed it to work with labels, where we went in and we labeled individual cells. I also designed that pathway for the neural network we designed to learn from just overall labels of, hey, here's an image.

I've given you some examples of which individual cells or groups of cells are cancerous. Now I want you to learn and guess on your own based on, I've given you a percentage. I've given you 45% of cells, approximately, or 45% of this area of cells is cancerous. And I was able to use both sources of information.
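The idea of learning from both sources, a few hand-labeled pixels plus one overall percentage per image, can be sketched as a training loss with two terms. This is a minimal illustration under stated assumptions and not the team's actual loss: `combined_loss`, its `weight` parameter, and the choice of cross-entropy plus squared error are all hypothetical, and for simplicity the predicted fraction is averaged over every pixel rather than over tissue pixels only.

```python
import math


def combined_loss(pred, pixel_labels, image_fraction, weight=1.0):
    """Loss mixing sparse pixel labels with one image-level percentage.

    pred:           2D list of predicted cancer probabilities per pixel.
    pixel_labels:   2D list of 1 (cancer), 0 (not cancer), or None (unlabeled).
    image_fraction: overall cancer fraction for the image, between 0 and 1.
    """
    eps = 1e-7
    bce_sum = 0.0
    labeled = 0
    prob_sum = 0.0
    count = 0
    for p_row, l_row in zip(pred, pixel_labels):
        for p, label in zip(p_row, l_row):
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            prob_sum += p
            count += 1
            if label is not None:
                # Binary cross-entropy, only where a pixel label exists.
                bce_sum -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
                labeled += 1
    pixel_term = bce_sum / labeled if labeled else 0.0
    # The mean predicted probability serves as the predicted cancer fraction.
    predicted_fraction = prob_sum / count
    image_term = (predicted_fraction - image_fraction) ** 2
    return pixel_term + weight * image_term


labels = [[1, None], [None, 0]]      # two labeled pixels, two unlabeled
good = [[0.9, 0.9], [0.1, 0.1]]      # agrees with the labels
bad = [[0.1, 0.1], [0.9, 0.9]]       # contradicts the labels
loss_good = combined_loss(good, labels, image_fraction=0.5)
loss_bad = combined_loss(bad, labels, image_fraction=0.5)
print(loss_good < loss_bad)
```

When an image has no pixel labels at all, only the image-level term contributes, which is how images that come with nothing but a percentage can still guide the predicted mask.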

It's really cool. I mean, I've probably given the talk on the technical details probably 20 times now. And I could talk about it for 20 more minutes, if we had time.

LP: Well, you know, our audience can't see you. But you have a huge smile on your face. So this really lights you up. And it sounds like, you know, this is really exciting for you. So what about this was that exciting spark, that thing that sparks it for you? What makes it so awesome for you?

DC: So I think the awesome thing is that this particular application, the problem was unique. I mean, you get a lot of repeat problems in computer vision. So, for example, detecting objects. And I worked with autonomous vehicles and driver safety systems. And detecting an object is the same every time. You have an image and you want to make a box out of it. This was a completely new kind of problem.

So it was exciting to go in and to actually redesign the neural network to do exactly what I wanted it to do. That moment of, well, if we made a neural network that tried to make this guess of the mask, well, then we can see how it's behaving. So we actually had a way of kind of looking into what's generally a black box. And I can pull this image out, and I can say, hey, the score, and we did this as part of training the neural network. I actually looked at neural network-generated images of where cancer cells were against the real image.

And I sent side-by-sides to Brad. And we actually looked at, well, this is what it missed. And this is what it was doing right. And we were able to make some tweaks and improve things based off of that.

LP: So let's talk about that moment where you realized, this is working. What did that feel like for you?

DC: Well, I think there were a couple moments of realizing that it was working. There was, this competition had three phases. There was a training phase only. There was a validation phase. And then, thereafter, was the final test phase. And I felt decently confident going into the training phase.

We were at the top of the leaderboard right away. The validation phase, I think, probably, by the third submission, we were in the top two. And then, I think I waited, it must have been, it seemed like it was years, but it was probably about six weeks, where I was refreshing the leaderboard every few hours. And it finally popped up. And I think my labmates probably were scared because I shouted out so loud. Yeah, we had topped the competition.

LP: All right, well-deserved excitement, I would say. So Dr. Brimhall, then you get word, probably, that your work is moving up in the competition. And what was that moment like for you, that you contributed to this?

BB: Yeah, I received the link to the leaderboard. So I was able to watch it as well. And it was, remember, this was over the holidays last year, so a quieter time. And I kept getting these emails, here we are. And I'd go check. Yeah, it's at third place, second place. And the first place I heard by email and that was really exciting. I mean, this is an international competition; really good groups. We met some of them; you know, a group from Tokyo that were very good. So these were really, really sharp people. So it really does mean a lot to take first place in the world.

LP: Yeah, that's huge.

BB: Huge.

LP: OK, so a little bit more on the competition. This was the BreastPathQ: Cancer Cellularity Challenge. It was conducted by the American Association of Physicists in Medicine, the National Cancer Institute, and SPIE, the International Society for Optics and Photonics.

And as you mentioned, you found out you won early this year. So there were about 100 competing submissions from around the world. Your algorithm performed the best, matching the findings of human pathologists at the highest rate. So what do you think contributed to that accuracy, the best in the world?

DC: Well, I'd say that the algorithm was designed really well, but that's me taking too much credit. I think that the real secret sauce is that we made a way to peer into it and compare it to what a subject matter expert, what someone like Brad is able to do. I mean, I think fusing that expertise with what we knew we could do well was critical.

LP: So what does this win mean for you now? You've actually had a busy year. What do you foresee this win culminating in?

DC: Well, I think we're looking at doing more and more medical imaging. I think digital pathology was not something that was on our radar at all. But this is something that we found. And we were fortunate. I mean, I've heard Brad say a few times that you don't know who your neighbors are. So to get in touch with UT Health San Antonio, we're thrilled. We're actually looking forward to doing additional research.

So we're looking at making some other kinds of quantifications that you would make. So particularly, we're looking at additional kinds of stains and analysis that has to be done for breast cancer. It's been really exciting. And I think we've got more to look forward to in the future.

BB: Yeah, definitely. We're looking at, for example, the inflammatory cell response to cancers. That's something that people have known about, but it's been difficult to quantify the specific inflammatory cells that come, and they recognize the cancer. They can't necessarily kill it, but they're there. And there is some correlation, we know, between the amount of inflammatory response and how well the patient will perform with the cancer.

But I can see a day where the pathologists would have a monitor, like we currently do, attached to a camera over the microscope. But in this case, the pathologist could select an area of what he or she sees on the computer screen. And the artificial intelligence could then be applied to that to gather additional information that we would have a difficult time quantifying, and in some cases, even recognizing.

And that information may well, in fact, I believe it will, prove to be of prognostic value. We're already gathering information about these cancers in terms of their DNA structure, if you will, you know, the mutation analysis information. But this, in this case, you're actually looking at the cells themselves. You're looking at the phenotype, not just what the genes are saying, but the phenotype of the cells. And I think there's prognostic information there. So this could really change things, if we can get that constructed. I think we have the talent here to do it.

LP: How far off do you think this is to becoming reality?

BB: I think it could be within one or two years. You could have, and the nice thing is it would grow. So you'd start off with simpler algorithms. And it will grow rapidly, I think. The amount of-- think about it this way. In the case of doing genetic analysis on cancer, that requires additional money. Because you have to take a sample and submit it for genetic analysis. And that's worthwhile, in many cases. It's worth the money to find out more about the cancer.

In this case, the nice thing is, the information's sitting in front of us right now. There's no additional cost, per se, to pull additional information out of these images. We may be able to do some immuno-stains. That might cost a little, but the cost is pretty minimal compared to large scale DNA sequencing. So this is information that we're just not gathering that's sitting right in front of us. But we can't process it as well as a computer can process it.

LP: So it's not just about quantifying how many cells are cancerous in a particular slide. There's much more to it. And there's more information you can gather. And what are some examples of that information?

BB: Like I say, just the appearance of the cells themselves. I mean, there are factors that we look at and we kind of describe the shape of the cells, for example. There are cancers where the shape makes a difference.

But we just say, they look kind of long. They look kind of circular. It's very, we use very, that's what we do as humans. We categorize that kind of terminology. We say, well, that looks kind of round. Or, that looks kind of oval. But we don't sit down and do complicated mathematics. Not to say, here's the, it's an oval with this particular equation to describe it. We don't have time to do that. And I'm not sure, I think I'd be terrible at doing it anyway because I'm not a mathematician.

But a computer could do that rapidly and synthesize it for hundreds of cells, or thousands of cells. And provide additional information that could be prognostically useful for us to be able to help the patient to anticipate what's going to happen, A, and B, to perhaps modify therapy to say, this may be a more, this therapy may work better for this cancer than it would for that one because of these differences that we've identified in the slides themselves.

LP: OK, so being able to quickly identify and analyze these cells means faster treatment and better treatment options for patients.

BB: Yeah, absolutely.

LP: OK, and you started with breast cancer tumor cells. But does this extend to other types of cancer? Is this all cancers?

BB: Potentially all kinds, yeah. I mean, even cancers that you would think of as not being in tissue. For example, leukemias, they're still in the peripheral blood. The peripheral blood could be analyzed, anything where we can get a pathology image, which is basically any kind of cancer. Yes.

LP: So one to two years, we could see this happening. That's the reality?

BB: Yeah, I think we could have some of the earlier forms. Like I said, I think it will grow after that because we're going to find numerous things. Like, I see it not just as a single algorithm, but as potentially hundreds of them. But as David can tell you, these things really seem to run pretty fast. They don't take a lot of time to run.

LP: And David, you've used this algorithm in other applications. So what had it been used for before?

DC: Well, I think if you look at the particular method that we used, this is just, it's everywhere in computer vision, essentially. Our method is a fully convolutional neural network. The general application is doing segmentation. So for driving, for example, this algorithm had been used for identifying lanes and lane markings, keeping your car exactly where it needs to be on the road.

This has been used for off-road terrain segmentation, meaning, identify the grass that you can drive through, versus the water or the bushes that you shouldn't drive through. This really spans a giant list of potential applications. And we made some very heavy modifications to it, but underneath, the technology really remains the same. And it's a really fascinating technology, actually.

The algorithm itself, or the method itself relies on a network of learned parameters that's based on looking at examples and creating the desired output, and making a complex mapping between the two of them.

And when you do this, you're actually solving for millions of parameters. I want to say, and I wish I had written it down so that I wasn't off. But you're talking about things that are on the order of 10 million different solvable, tunable little knobs that you have to figure out in order to make your algorithm work the way you want it to work.
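To give a feel for where millions of "tunable knobs" come from, the sketch below counts the learnable parameters in a stack of convolutional layers. Each layer has one weight per kernel position per input channel per output channel, plus one bias per output channel. The layer sizes are hypothetical and are not the architecture described in this episode.

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Learnable parameters in one 2D convolution layer (weights plus biases)."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    biases = out_channels
    return weights + biases


# Hypothetical fully convolutional stack: (in_channels, out_channels, kernel_size).
# The widths are illustrative only.
layers = [
    (3, 64, 3), (64, 128, 3), (128, 256, 3), (256, 512, 3), (512, 512, 3),
    (512, 256, 3), (256, 128, 3), (128, 64, 3),
    (64, 1, 1),  # final 1x1 layer producing a one-channel mask
]

total = sum(conv_params(*layer) for layer in layers)
print(f"{total:,} learnable parameters")
```

Even this small stack lands in the millions, because the count grows with the product of the input and output channel widths at every layer.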

LP: So you could tune those knobs for many applications. And in this particular instance, you tuned it to identify cancer cells.

DC: Yeah, tuned it just right, I'd say.

LP: Yeah. That's amazing. So for each of you, what are your hopes for this algorithm in the future?

DC: So I'd like to use this as a jumping-off point for doing some of the other things that need to be done, or that would be beneficial. So tedious tasks are one thing. So, if you look at what the algorithm does, I think that the contributions are it's repeatable.

So between pathologists, you can have disagreement. And you can have biases. One might generally say things that are higher than they are. One might generally say things that are lower than they are. They might agree in terms of the way they rank things. But if you ended up with one pathologist on one day, and another on the next day, well, now you have a problem. I see that as something that's important.
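The situation described here, two readers who rank cases the same way but differ by a systematic offset, can be shown with two simple statistics: a rank correlation and a mean difference. The scores below are invented for illustration, and the helper names `ranks`, `spearman`, and `mean_offset` are hypothetical.

```python
def ranks(values):
    """Rank positions (1 = smallest). Assumes there are no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(a, b):
    """Spearman rank correlation for two equal-length lists without ties."""
    n = len(a)
    d_squared = sum((x - y) ** 2 for x, y in zip(ranks(a), ranks(b)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def mean_offset(a, b):
    """Average amount by which scores in b exceed scores in a."""
    return sum(y - x for x, y in zip(a, b)) / len(a)


# Invented cellularity scores (percent) from two readers on the same five slides.
reader_a = [10, 25, 40, 60, 80]
reader_b = [20, 35, 55, 70, 90]

agreement = spearman(reader_a, reader_b)
bias = mean_offset(reader_a, reader_b)
print(agreement, bias)
```

Here the rank agreement is perfect even though the second reader scores every slide higher, which is the kind of reader-to-reader bias a repeatable algorithm avoids.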

The next thing that would be important is that algorithms like this can remove some of the tedious work, or some of the things that just aren't done. I mean, a pathologist is dealing with, I don't know, dozens of cases a day, potentially?

BB: Oh, yeah. Yeah.

DC: Yeah. And so they, to free up some of their time so they can do something that is incredibly beneficial by automating tasks that are tedious, or tasks that you wouldn't otherwise do-- I think that adds a lot of value.

And then I think that the last thing that you really do is this has the potential to see things that we haven't seen yet. If we can use algorithms like this to make new kinds of correlations that we hadn't seen before, that opens up the door to, we could treat someone differently and get a better outcome.

And I think that the outcomes is what it's all about. So taking the load off of a very tired pathologist will probably lead to some better outcomes, I should really hope.

LP: So your insight, Brad?

BB: No, I agree. And I'm glad David brought up the idea of some of these very tedious tasks. I mean, some of those are more in the line of looking for infectious organisms like tuberculosis, for example. A pathologist could easily spend half an hour looking at a single slide on a very high magnification trying desperately to see if they can find a tuberculosis organism, a mycobacterium. And sometimes you can see them.

And it's easy to get distracted during that half hour or so. So it'd be so much better to have a computer that doesn't get bored and doesn't lose, it has perfect attention. And it can handle those tedious tasks and look for that. So that's definitely a time-saver. And then also, just to identify features of the tumor that we have not been able to really examine very well in the past.

We often count mitotic figures as a way of grading tumors. And so those are cells that are in the middle of mitosis. And so it's a surrogate way of looking at how fast the tumor is growing. Because if you see a lot of dividing cells, then you know it's growing rapidly.

But that can take time too, to sit down. And I'm not sure how accurately we count those because we just pick the areas. We pick 10 different areas and count them up, and say, well, you know, this is the number of mitoses that we see in this particular image.

I think a computer could do a lot better job. It could sample the tumor much more thoroughly.

LP: Really opening doors in pathology.

BB: Oh, yeah.

LP: How does it feel to be a part of this?

BB: Well, I think it's a lot of fun. I mean, I'm really glad that we were able to hook up and meet because we're in the same town. And it's amazing to me, like, like David said, sometimes you don't even know your neighbors. And so I'm glad this all worked out the way it did. I think it's going to lead to some really good things for both SwRI, as well as the University. But more importantly, I think it's going to have a very positive impact for the people that we're serving, the patients, and the people that are really in need.

LP: All right, well, I'm definitely excited to see where this goes. The potential is going to be amazing, I think. So thank you both for joining us and for the lifesaving and really brilliant work you're doing. And as I said, I know this technique is going to help a lot of people. And I think that's the best kind of science.

DC: Thank you.

BB: Thanks.

LP: And that wraps up this episode of Technology Today.

Subscribe to the Technology Today podcast to hear in-depth conversations with people like David Chambers and Dr. Brimhall, changing our world and beyond through science, engineering, research, and technology.

Connect with Southwest Research Institute on Facebook, Instagram, Twitter, and LinkedIn. Check out the Technology Today Magazine at technologytoday.swri.org. And now is a great time to become an SwRI problem solver. Visit our career page at swri.jobs. Thanks for listening.

[MUSIC PLAYING]

As part of SwRI's Human Performance Initiative, we are developing AI solutions for biomedical and health applications. Computer algorithms can improve health diagnostics, medical imaging and disease detection.

Episode 12: Detecting Cancer with AI
