Optellum’s technology has been independently validated by world-leading lung cancer experts interested in verifying our claims and quantifying benefits to their own patient populations. Below is a selection of publications from high-quality pulmonary and respiratory journals.

American Journal of Respiratory and Critical Care Medicine
Assessing the Accuracy of a Deep Learning Method to Risk Stratify Indeterminate Pulmonary Nodules

Pierre P. Massion 1,2, Sanja Antic 1, Sarim Ather 3, Carlos Arteta 4, Jan Brabec 5, Heidi Chen 6, Jerome Declerck 4, David Dufek 5, William Hickes 3, Timor Kadir 4, Jonas Kunst 5, Bennett A. Landman 7, Reginald F. Munden 8, Petr Novotny 4, Heiko Peschl 3, Lyndsey C. Pickup 4, Catarina Santos 4, Gary T. Smith 9,10, Ambika Talwar 3, and Fergus Gleeson 3

The management of indeterminate pulmonary nodules (IPNs) remains challenging, resulting in invasive procedures and delays in diagnosis and treatment. Strategies to decrease the rate of unnecessary invasive procedures and optimize surveillance regimens are needed.

To develop and validate a deep learning method to improve the management of IPNs.

A Lung Cancer Prediction Convolutional Neural Network model was trained using computed tomography images of IPNs from the National Lung Screening Trial, internally validated, and externally tested on cohorts from two academic institutions.

Measurements and Main Results:
The areas under the receiver operating characteristic curve in the external validation cohorts were 83.5% (95% confidence interval [CI], 75.4–90.7%) and 91.9% (95% CI, 88.7–94.7%), compared with 78.1% (95% CI, 68.7–86.4%) and 81.9% (95% CI, 76.1–87.1%), respectively, for a commonly used clinical risk model for incidental nodules. Using 5% and 65% malignancy thresholds defining low- and high-risk categories, the overall net reclassifications in the validation cohorts for cancers and benign nodules compared with the Mayo model were 0.34 (Vanderbilt) and 0.30 (Oxford) as a rule-in test, and 0.33 (Vanderbilt) and 0.58 (Oxford) as a rule-out test. Compared with traditional risk prediction models, the Lung Cancer Prediction Convolutional Neural Network was associated with improved accuracy in predicting the likelihood of disease at each threshold of management and in our external validation cohorts.
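The area under the receiver operating characteristic curve can be read as the probability that a randomly chosen malignant nodule receives a higher score than a randomly chosen benign one. The following is a minimal sketch of that pairwise calculation in plain Python. The function name `auc_pairwise` and the toy scores are illustrative assumptions, not code or data from the study.

```python
def auc_pairwise(malignant_scores, benign_scores):
    """Estimate AUC as the share of (malignant, benign) pairs ranked correctly.

    Ties count as half a correct ranking. This is the rank-based
    (Mann-Whitney) view of AUC, shown here for illustration only.
    """
    if not malignant_scores or not benign_scores:
        raise ValueError("need at least one malignant and one benign score")
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant_scores) * len(benign_scores))


# Hypothetical toy scores, not study data.
malignant = [0.9, 0.8, 0.4]
benign = [0.7, 0.3, 0.2]
toy_auc = auc_pairwise(malignant, benign)
print(f"{toy_auc:.3f}")  # 8 of 9 pairs ranked correctly -> 0.889
```

The double loop is quadratic in the number of nodules, which is acceptable for a sketch; a rank-based implementation would be preferable on larger cohorts.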

This study demonstrates that this deep learning algorithm can correctly reclassify IPNs into low- or high-risk categories in more than a third of cancers and benign nodules when compared with conventional risk models, potentially reducing the number of unnecessary invasive procedures and delays in diagnosis.

early detection; risk stratification; neural networks; lung cancer; computer-aided image analysis

External validation of a convolutional neural network artificial intelligence tool to predict malignancy in pulmonary nodules

David R Baldwin,1 Jennifer Gustafson,2 Lyndsey Pickup,3 Carlos Arteta,3 Petr Novotny,4 Jerome Declerck,3 Timor Kadir,3 Catarina Figueiras,2 Albert Sterba,5 Alan Exell,6 Vaclav Potesil,3 Paul Holland,7 Hazel Spence,7 Alison Clubley,7 Emma O’Dowd,1 Matthew Clark,8 Victoria Ashford-Turner,9 Matthew EJ Callister,9 Fergus V Gleeson2

Estimation of the risk of malignancy in pulmonary nodules detected by CT is central in clinical management. The use of artificial intelligence (AI) offers an opportunity to improve risk prediction. Here we compare the performance of an AI algorithm, the lung cancer prediction convolutional neural network (LCP-CNN), with that of the Brock University model, recommended in UK guidelines.

A dataset of incidentally detected pulmonary nodules measuring 5–15 mm was collected retrospectively from three UK hospitals for use in a validation study. Ground truth diagnosis for each nodule was based on histology (required for any cancer), resolution, stability or (for pulmonary lymph nodes only) expert opinion. There were 1397 nodules in 1187 patients, of which 234 nodules in 229 (19.3%) patients were cancer. Model discrimination and performance statistics at predefined score thresholds were compared between the Brock model and the LCP-CNN.

The area under the curve for LCP-CNN was 89.6% (95% CI 87.6 to 91.5), compared with 86.8% (95% CI 84.3 to 89.1) for the Brock model (p≤0.005). Using the LCP-CNN, we found that 24.5% of nodules scored below the lowest cancer nodule score, compared with 10.9% using the Brock score. Using the predefined thresholds, we found that the LCP-CNN gave one false negative (0.4% of cancers), whereas the Brock model gave six (2.5%), while specificity statistics were similar between the two models.
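The "scored below the lowest cancer nodule score" statistic can be illustrated with a short sketch: find the minimum score among cancers, then count how many nodules fall strictly below it. The function name `fraction_below_lowest_cancer`, the use of all nodules as the denominator, and the toy values are assumptions made for illustration; the abstract does not specify the exact computation.

```python
def fraction_below_lowest_cancer(scores, is_cancer):
    """Share of all nodules scoring strictly below the lowest-scoring cancer.

    Assumption: the denominator is every nodule in the dataset. Nodules
    below this cut-off could be ruled out without missing any cancer
    in the same dataset.
    """
    cancer_scores = [s for s, c in zip(scores, is_cancer) if c]
    if not cancer_scores:
        raise ValueError("need at least one cancer to define the cut-off")
    lowest_cancer = min(cancer_scores)
    below = sum(1 for s in scores if s < lowest_cancer)
    return below / len(scores)


# Hypothetical toy scores, not study data.
scores = [0.01, 0.02, 0.10, 0.50, 0.03, 0.90]
is_cancer = [False, False, True, True, False, True]
print(fraction_below_lowest_cancer(scores, is_cancer))  # 3 of 6 nodules -> 0.5
```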

The LCP-CNN score has better discrimination and allows a larger proportion of benign nodules to be identified without missing cancers than the Brock model. This has the potential to substantially reduce the proportion of surveillance CT scans required and thus save significant resources.

Artificial Intelligence Tool for Assessment of Indeterminate Pulmonary Nodules Detected with CT

Roger Y. Kim, Jason L. Oke, Lyndsey C. Pickup, Reginald F. Munden, Travis L. Dotson, Christina R. Bellinger, Avi Cohen, Michael J. Simoff, Pierre P. Massion, Claire Filippini, Fergus V. Gleeson, Anil Vachani

Limited data are available regarding whether computer-aided diagnosis (CAD) improves assessment of malignancy risk in indeterminate pulmonary nodules (IPNs).

To evaluate the effect of an artificial intelligence–based CAD tool on clinician IPN diagnostic performance and agreement for both malignancy risk categories and management recommendations.

Materials and Methods:
This was a retrospective multireader multicase study performed in June and July 2020 on chest CT studies of IPNs. Readers used only CT imaging data and provided an estimate of malignancy risk and a management recommendation for each case without and with CAD. The effect of CAD on average reader diagnostic performance was assessed using the Obuchowski-Rockette and Dorfman-Berbaum-Metz method to calculate estimates of area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. Multirater Fleiss κ statistics were used to measure interobserver agreement for malignancy risk and management recommendations.
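For readers unfamiliar with the Fleiss κ statistic, the sketch below shows the standard textbook calculation from a table of rating counts. It assumes every case is rated by the same number of readers; the function name `fleiss_kappa` and the toy table are illustrative and are not taken from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for counts[i][j] = readers who put case i in category j.

    Assumes every case received the same number (at least 2) of ratings.
    """
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every case needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Observed agreement: average share of agreeing reader pairs per case.
    per_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(per_case) / n_cases

    # Chance agreement from the overall share of ratings in each category.
    total = n_cases * n_raters
    shares = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    expected = sum(p * p for p in shares)
    if expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy table: 4 cases, 3 readers, 3 management categories
# (no action, CT surveillance, diagnostic procedure). Not study data.
toy_counts = [[3, 0, 0], [0, 3, 0], [0, 2, 1], [1, 1, 1]]
print(round(fleiss_kappa(toy_counts), 3))
```

A value of 1 indicates perfect agreement, and values near 0 indicate agreement no better than chance.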

A total of 300 chest CT scans of IPNs with maximal diameters of 5–30 mm (50.0% malignant) were reviewed by 12 readers (six radiologists, six pulmonologists) (patient median age, 65 years; IQR, 59–71 years; 164 [55%] men). Readers’ average AUC improved from 0.82 to 0.89 with CAD (P < .001). At malignancy risk thresholds of 5% and 65%, use of CAD improved average sensitivity from 94.1% to 97.9% (P = .01) and from 52.6% to 63.1% (P < .001), respectively. Average reader specificity improved from 37.4% to 42.3% (P = .03) and from 87.3% to 89.9% (P = .05), respectively. Reader interobserver agreement improved with CAD for both the less than 5% (Fleiss κ, 0.50 vs 0.71; P < .001) and more than 65% (Fleiss κ, 0.54 vs 0.71; P < .001) malignancy risk categories. Overall reader interobserver agreement for management recommendation categories (no action, CT surveillance, diagnostic procedure) also improved with CAD (Fleiss κ, 0.44 vs 0.52; P = .001).
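Sensitivity and specificity at a malignancy risk threshold can be sketched as follows. The example assumes a case counts as test-positive when the estimated risk is at or above the threshold; that convention, the function name `sensitivity_specificity`, and the toy values are assumptions for illustration rather than details from the study.

```python
def sensitivity_specificity(risks, is_malignant, threshold):
    """Return (sensitivity, specificity) for one risk threshold.

    Assumption: a case is called positive when risk >= threshold.
    """
    tp = fn = tn = fp = 0
    for risk, malignant in zip(risks, is_malignant):
        positive = risk >= threshold
        if malignant:
            if positive:
                tp += 1
            else:
                fn += 1
        else:
            if positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one malignant and one benign case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical reader estimates, not study data.
risks = [0.02, 0.10, 0.70, 0.90, 0.04, 0.50]
is_malignant = [False, False, True, True, True, False]
for threshold in (0.05, 0.65):
    sens, spec = sensitivity_specificity(risks, is_malignant, threshold)
    print(f"threshold {threshold:.2f}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

A low threshold such as 5% favors sensitivity (few cancers missed), while a high threshold such as 65% favors specificity (few benign nodules flagged), which matches the pattern in the figures reported above.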

Use of computer-aided diagnosis improved estimation of indeterminate pulmonary nodule malignancy risk on chest CT scans and improved interobserver agreement for both risk stratification and management recommendations.