AI Helps Predict Risk of Lung Nodules Likely to Become Cancerous

By Richard Dargan

Artificial intelligence (AI) can help predict which small lung nodules will go on to become cancerous, potentially speeding appropriate treatment to patients, according to a study presented Monday.

Results released in 2011 from the National Lung Screening Trial (NLST), a landmark National Cancer Institute study, showed that CT was better than chest radiographs for lung cancer screening. The trial spurred an expansion of Medicare to cover lung cancer screening for high-risk patients such as longtime smokers. The decision was based on the potential of lung cancer screening to catch cancers earlier, when they are more treatable. However, expansion of CT in this setting also created the challenge of an increase in false positive results.

"As more people were screened, the number of false positives increased, resulting in a lot of unnecessary testing," said study co-author Pritam Mukherjee, PhD, from the Stanford Center for Biomedical Informatics Research (BMIR) at the Stanford University School of Medicine, CA. "Follow-up testing can be expensive and invasive and may expose the patient to more radiation, even though only a very small fraction of biopsies prove to be cancerous."

Machine Learning Model Aids Cancer Prediction

Dr. Mukherjee and colleagues used NLST data to develop a machine learning (ML) algorithm that can mine CT images for prognostic information on the risk of lung nodules becoming cancerous.

They used CT scans from more than 1,000 patients who screened positive with lung nodules more than 4 millimeters (mm) in diameter in the NLST. Of those, 553 subjects were diagnosed with cancer during the study and 585 were never diagnosed with cancer during the study but were demographically similar to the cancer-positive group.

"CT scans have many features that can't be discerned by the naked eye," Dr. Mukherjee said. "By training our model on almost 1,200 patients with CT scans positive for lung nodules, we were able to distinguish between patients who would go on to have cancer and those who wouldn't with reasonably high accuracy."

The researchers built two-stage ML models for cancer prediction using CT images from one, two and three screening time points, respectively. The first ML stage, common to all three models, detected nodules and predicted malignancy scores. The second ML stage used the popular algorithm XGBoost to predict cancer probability from the locations and malignancy scores of the patient's lung nodules, as predicted by the first stage.
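
The presentation does not spell out how the first-stage outputs were encoded for the second stage. As a rough illustration only, the sketch below shows one way per-nodule locations and malignancy scores could be flattened into a fixed-length, patient-level feature vector of the kind a gradient-boosted classifier such as XGBoost expects. The `Nodule` class, the `patient_features` function, the normalized coordinates and the choice of keeping the three highest-scoring nodules are all assumptions made for this example, not details from the study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Nodule:
    # Hypothetical stage-one output for a single detected nodule.
    x: float  # assumed normalized location within the scan (0 to 1)
    y: float
    z: float
    malignancy: float  # assumed stage-one malignancy score (0 to 1)


def patient_features(nodules: List[Nodule], top_k: int = 3) -> List[float]:
    """Flatten the top_k highest-scoring nodules into a fixed-length vector."""
    # Rank nodules from most to least suspicious and keep at most top_k.
    ranked = sorted(nodules, key=lambda n: n.malignancy, reverse=True)[:top_k]
    features: List[float] = []
    for n in ranked:
        features.extend([n.malignancy, n.x, n.y, n.z])
    # Pad with zeros so every patient yields the same number of features.
    features.extend([0.0] * (4 * top_k - len(features)))
    return features


# Example: one scan with two detected nodules (made-up values).
scan = [Nodule(0.2, 0.5, 0.7, 0.10), Nodule(0.6, 0.4, 0.3, 0.85)]
print(patient_features(scan))
```

Under the same assumptions, vectors from two or three screening time points could simply be concatenated before being passed to the second-stage classifier.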

"Much CT data has no information on where the lesions are," Dr. Mukherjee said. "We tried to look at the CT scan to find out where the tumor is and try to make judgments based on its appearance."

The results showed that it is possible to predict, from screening lung CT scans alone, whether a patient with lung nodules larger than 4 mm has cancer or will develop it in subsequent years. Further, prediction performance improves when CT imaging data from multiple screening time points are incorporated into the model.

"What differentiates our study is that we're not just interested in the current state of the tumor. We are also trying to see if the patient will develop cancer three or four years down the line," Dr. Mukherjee said. "This information could help doctors make treatment decisions."

Dr. Mukherjee said the model may reduce the number of false-positive screens, resulting in less cost and risk for patients with screen-detected lung cancer. The researchers plan to validate their models on additional data and assess their performance across different centers and groups of patients. They also plan to study the method in combination with genetic and pathological data to further improve diagnosis.

"As more patients are followed up and we have more scans, the technique will get better," Dr. Mukherjee said.