Rain is a machine learning service that uses a large forest of randomized decision trees to detect lung nodules in computed tomography (CT) scans. I built this system as part of the incubation phase for a new venture that sought to improve lung cancer detection.

This work was recently described in an SAJC article, "Lung cancer in India: Current status and promising strategies" (mirror download):


"Imaging Computation and algorithms that allows faster, more accurate and consistent evaluation of lesions is not only progressing exponentially but is crucial in lung cancer for two reasons. First it will be able to distinguish whether a solitary pulmonary nodule is benign or malignant.[16,17] And secondly it has the potential to allow early diagnosis, which will allow us to diagnose lung cancer when it is operative with curative intent.[17,18] At the forefront of this approach are Tomas Vykruta and Joe Bertolami from the Microsoft Kinect project. We are in discussion with them to devise computer algorithm to distinguish pulmonary tuberculosis from lung cancer with a high degree of accuracy."



Motivation

In India, 70% of new lung cancer cases each year are detected at stage 4, which suggests that awareness and access to providers are major problems, particularly in remote areas of the country. Motivated by these statistics, my team and I focused on solutions that would:

  1. Create a pipeline for medical AI research in India

  2. Develop new strategies using machine learning for automated lung cancer detection

  3. Deploy services to remote regions to help patients without easy access to screening services



Technology

This system builds upon my work with randomized decision forests and was designed to integrate with existing picture archiving and communication systems (PACS) already deployed in hospitals. The Chennai lossless 3D image codec was used to efficiently store and transfer the CT data.
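To give a feel for the general technique, here is a minimal sketch of a randomized decision forest in plain Python. It is not Rain's implementation. The two features (a normalized density and a compactness score per candidate region), the synthetic data, and every function name are assumptions made up for illustration. Each tree is grown on a bootstrap sample and tries only a handful of random splits per node, and the forest averages the trees' votes.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def build_tree(rows, labels, rng, depth=0, max_depth=4, n_candidates=8):
    """Grow one randomized tree; each node tries a few random splits."""
    if depth == max_depth or len(set(labels)) <= 1:
        return sum(labels) / len(labels)  # leaf: fraction of positives
    best = None
    for _ in range(n_candidates):
        f = rng.randrange(len(rows[0]))  # random feature index
        t = rng.choice(rows)[f]          # random threshold from the data
        left = [i for i, r in enumerate(rows) if r[f] < t]
        right = [i for i, r in enumerate(rows) if r[f] >= t]
        if not left or not right:
            continue
        score = (len(left) * gini([labels[i] for i in left])
                 + len(right) * gini([labels[i] for i in right]))
        if best is None or score < best[0]:
            best = (score, f, t, left, right)
    if best is None:
        return sum(labels) / len(labels)
    _, f, t, left, right = best
    lo = build_tree([rows[i] for i in left], [labels[i] for i in left],
                    rng, depth + 1, max_depth, n_candidates)
    hi = build_tree([rows[i] for i in right], [labels[i] for i in right],
                    rng, depth + 1, max_depth, n_candidates)
    return (f, t, lo, hi)


def predict_tree(node, row):
    """Walk from the root to a leaf and return its positive fraction."""
    while isinstance(node, tuple):
        f, t, lo, hi = node
        node = lo if row[f] < t else hi
    return node


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample of the training set."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng))
    return forest


def predict_forest(forest, row):
    """Average the per-tree scores into a single score in [0, 1]."""
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)


def make_synthetic(n, rng):
    """Hypothetical (density, compactness) features; NOT real CT data."""
    rows, labels = [], []
    for _ in range(n):
        y = rng.randrange(2)
        cx, cy = (0.8, 0.8) if y else (0.2, 0.3)
        rows.append((rng.gauss(cx, 0.1), rng.gauss(cy, 0.1)))
        labels.append(y)
    return rows, labels


data_rng = random.Random(42)
train_rows, train_labels = make_synthetic(200, data_rng)
forest = train_forest(train_rows, train_labels)
print("nodule-like candidate score:", predict_forest(forest, (0.8, 0.8)))
print("background candidate score:", predict_forest(forest, (0.2, 0.3)))
```

A real detector would compute features from CT voxels rather than from two hand-picked numbers, and would likely use far more trees. The sketch only shows how randomized splits and vote averaging fit together.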



More Information

The source code for Rain is not yet available. In the meantime, check out my blog for posts related to machine learning.