The treatment of diseases using cutting-edge technologies is one of the prominent features of the healthcare industry. If a piece of technology can make a difference, it will get its share of the action and a chance to prove its worth.
In this regard, neural networks have earned a well-deserved spotlight. Different types of neural networks have proven effective at detecting and classifying cancer before it is too late.
In this article, we will:
- Review the state of skin cancer diagnosis technologies;
- Describe our case study.
Skin cancer is known for its deadliness. If not treated properly, this type of cancer can spread to other parts of the body and, in the worst case, become fatal.
At the time of writing, skin cancer is among the most common types of cancer. According to a Centers for Disease Control and Prevention study, the United States healthcare system deals with over 1.5 million new cases every year.
However, its treatment workflow leaves a lot to be desired.
The most common problem with skin cancer treatment is a late diagnosis. This is a common occurrence due to a combination of technical and management issues.
- The current healthcare system is overloaded and riddled with bottlenecks in patient management and especially medical testing. In other words, things are going way too slow.
- This is bad news when it comes to cancer because timely diagnosis is one of the keys to effective treatment.
- In addition to this, there is a lack of trained personnel to satisfy demand.
To make things worse, the technology behind diagnosis is not efficient enough to handle things.
- Detection and classification are the most critical and time-sensitive stages.
- Cancer diagnosis relies on a long series of clinical screenings, dermoscopic analysis, biopsies, and histopathological examinations. At best, this sequence takes months to complete.
- The whole process involves numerous professionals and continuous testing, yet it is only about 77% accurate.
Sounds grim, right? Well, there’s hope.
The rapid development of artificial intelligence and machine learning technologies, especially neural networks, can be a game-changer in cancer classification.
Our company was approached to develop a neural network solution for skin cancer diagnosis. Here’s how we achieved it.
The central machine learning component in the skin cancer diagnosis process is a convolutional neural network, or CNN (in case you want to know more about it, here’s an article).
- A CNN can classify skin cancer with a higher level of accuracy and efficiency than current methods.
The gist of the system is the way it draws on the body of cancer research knowledge and public health databases to do its job.
- Human medical professionals mostly rely on their knowledge, experience, and manual handling of the results data.
- However, they are prone to human error.
- On the other hand, neural networks are capable of processing large quantities of data and taking more factors into consideration.
Here’s how it works:
- Classification stage: the detected anomalies are further assessed with different filters.
- The key requirement is to gather as much data as possible in order to make an accurate recognition.
- After this, medical professionals verify the resulting data against the available databases and then add it to the patient’s health record.
Introducing a neural network into the process of skin cancer classification can significantly help in the following ways:
- Streamline cancer diagnosis workflow - make it faster, more efficient and cost-effective.
- Lessen the dependence on various medical professionals in the diagnosis process.
- Reduce the delivery time of clinical testing results.
- Increase the accuracy of clinical testing results.
The key requirements for the development of a skin cancer diagnosis neural network were the following:
- System scalability;
- Accuracy of the results;
- Accessible interface with effective visualizations;
- Cost-effectiveness of the infrastructure.
The main challenges of implementing neural networks in the cancer diagnosis workflow were the following:
- Data processing takes time to complete, while, at the same time, there is a need to get results as soon as possible;
- The algorithms require time for refinement and optimization;
- Classification requires significant computational resources for input data;
- The maintenance of such infrastructure is quite expensive.
In order to deal with these challenges, we decided to build the entire system on a cloud platform.
- This approach handles the scalability and time-sensitivity issues.
- At the same time, the use of the cloud platform allows limiting spending to only the resources actually used.
The system itself consists of the following elements:
- Image input
- Convolutional neural network for classification
- Cloud Datastore
- Integration with relevant databases
- Browser-based dashboard with results and visualizations
The system was developed with the following tools:
- HAM10000 dataset;
- ImageNet pre-trained models;
- TensorFlow for the VGG16 CNN;
- Apache Beam for the data processing pipeline;
- D3 visualization package.
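As a rough illustration of how the ImageNet pre-trained models and the TensorFlow VGG16 network from the list above can fit together, here is a minimal transfer learning sketch. It is not the project's actual model. The input size, the frozen base, the size of the dense layer, the dropout rate, and the optimizer are all assumptions made for the example, and `build_classifier` is a hypothetical helper name. The seven class abbreviations are the diagnostic categories used in the HAM10000 metadata.

```python
from typing import Optional

# Diagnostic categories in HAM10000 (abbreviations from the dataset metadata).
HAM10000_CLASSES = ["akiec", "bcc", "bkl", "df", "mel", "nv", "vasc"]


def build_classifier(num_classes: int = len(HAM10000_CLASSES),
                     weights: Optional[str] = "imagenet"):
    """Build a VGG16-based classifier (hypothetical helper, assumed settings)."""
    # Imported here so this module can be loaded without TensorFlow installed.
    import tensorflow as tf

    # VGG16 convolutional base without its original 1000-class ImageNet head.
    base = tf.keras.applications.VGG16(
        weights=weights, include_top=False, input_shape=(224, 224, 3)
    )
    base.trainable = False  # assumption: keep the pre-trained filters frozen

    # Small classification head; the layer sizes are illustrative only.
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model
```

Calling `build_classifier()` downloads the ImageNet weights on first use. Input images would typically be resized to 224x224 and passed through `tf.keras.applications.vgg16.preprocess_input` before training or inference.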
The information transformation is performed in the following sequence:
- Input images are uploaded to the Cloud Storage and sent to the CNN;
- The convolutional neural network processes the input images:
  - The anomaly detection algorithm rounds up the suspicious elements;
  - The classification algorithm determines the type of anomaly.
- The results of the processing are then saved to the database;
- After that, the results are summarized and visualized.
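The sequence above can be sketched in plain Python. This is only an illustration of the data flow, not the production pipeline: ordinary functions stand in for the Apache Beam pipeline and the CNN, a list stands in for the database, and every name (`Finding`, `run_pipeline`, `toy_detect`, `toy_classify`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Tuple


@dataclass
class Finding:
    image_id: str
    suspicious: bool          # result of the anomaly detection step
    label: str                # result of the classification step
    score: Optional[float]    # classifier confidence, None if nothing was flagged


def run_pipeline(
    images: Dict[str, Sequence[float]],
    detect: Callable[[Sequence[float]], bool],
    classify: Callable[[Sequence[float]], Tuple[str, float]],
    database: List[Finding],
) -> Dict[str, int]:
    """Detect, classify, store each result, then summarize the stored results."""
    for image_id, pixels in images.items():
        suspicious = detect(pixels)
        if suspicious:
            label, score = classify(pixels)
        else:
            label, score = "no_anomaly", None
        database.append(Finding(image_id, suspicious, label, score))

    # Summary step: count findings per label, ready for a dashboard to plot.
    summary: Dict[str, int] = {}
    for finding in database:
        summary[finding.label] = summary.get(finding.label, 0) + 1
    return summary


# Toy stand-ins for the two CNN stages (thresholds are arbitrary).
def toy_detect(pixels: Sequence[float]) -> bool:
    return max(pixels) > 0.5


def toy_classify(pixels: Sequence[float]) -> Tuple[str, float]:
    mean = sum(pixels) / len(pixels)
    return ("melanoma", mean) if mean > 0.6 else ("nevus", 1 - mean)
```

For example, `run_pipeline({"a": [0.1, 0.2], "b": [0.9, 0.7]}, toy_detect, toy_classify, [])` stores two findings and returns one count for `no_anomaly` and one for `melanoma`.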
The CNN was trained on HAM10000, a publicly available skin lesion dataset.
The classification takes the following criteria into account:
- anatomic location of the lesion;
- patient profile characteristics (age, gender, etc);
- lesion size, scale, and scale-crust;
- telangiectasia and other vascular features;
- pink blush;
- blue-grey ovoids;
- dirt trails;
- purple blotches;
- pale areas;
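One way to picture these criteria as classifier input is a small record that is flattened into a numeric vector. The sketch below is an assumption about how such data could be encoded, not the encoding used in the project: the field names, the units, and the `to_vector` layout are hypothetical, and the scale and scale-crust criteria are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Visual features from the list above that are recorded as present or absent.
DERMOSCOPIC_FEATURES = [
    "telangiectasia",
    "pink_blush",
    "blue_grey_ovoids",
    "dirt_trails",
    "purple_blotches",
    "pale_areas",
]


@dataclass
class LesionRecord:
    anatomic_location: str
    age: int
    sex: str
    size_mm: float
    observed: Set[str] = field(default_factory=set)  # subset of DERMOSCOPIC_FEATURES

    def to_vector(self, locations: List[str]) -> List[float]:
        """One-hot location, then age, sex flag, size, then one flag per feature."""
        one_hot = [1.0 if self.anatomic_location == loc else 0.0 for loc in locations]
        profile = [float(self.age), 1.0 if self.sex == "female" else 0.0, self.size_mm]
        flags = [1.0 if name in self.observed else 0.0 for name in DERMOSCOPIC_FEATURES]
        return one_hot + profile + flags
```

A vector like this is the kind of tabular input that tree-based classifiers work with, while the image itself goes to the CNN.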
The image classification algorithms included:
- a decision forest classifier;
- a random forest classifier.
The results were visualized as a confusion matrix, with the area under the receiver operating characteristic curve (AUROC) as the key metric.
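To make these two evaluation tools concrete, here is a small dependency-free sketch of how a confusion matrix and a one-vs-rest AUROC can be computed. The function names are hypothetical and this is not the project's evaluation code. AUROC is computed here from its rank interpretation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half.

```python
from typing import Dict, List, Sequence


def confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                     labels: Sequence[str]) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, predicted in zip(y_true, y_pred):
        matrix[index[truth]][index[predicted]] += 1
    return matrix


def binary_auroc(is_positive: Sequence[bool], scores: Sequence[float]) -> float:
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auroc(y_true: Sequence[str],
                      class_scores: Sequence[Dict[str, float]],
                      labels: Sequence[str]) -> Dict[str, float]:
    """AUROC per class, treating that class as positive and all others as negative."""
    return {
        label: binary_auroc(
            [truth == label for truth in y_true],
            [scores[label] for scores in class_scores],
        )
        for label in labels
    }
```

The pairwise loop is quadratic in the number of samples, which is fine for an illustration. For a real test set, a library routine such as scikit-learn's `roc_auc_score` would be the usual choice.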
The scalability challenges are handled by the distributed services of the cloud infrastructure.
- Google Cloud Platform’s autoscaling features provide high scalability, so the system can add capacity as the workload grows.
- The data processing workflow consistency is provided by the Apache Beam framework.
The use of cloud infrastructure cut the maintenance costs by half and made the system cost-effective. With the cloud solution, the maintenance costs are limited to the resources actually used.
One of the key requirements of the project was to refine the image recognition models of the convolutional neural network in order to provide more accurate results.
- The CNN was trained on the HAM10000 dataset. This dataset includes samples of different types of skin cancer and their identifying features.
The biggest challenge of the project was to integrate different input, analysis and visualization tools into a coherent whole.
- The skin cancer diagnosis workflow is not designed with distributed services in mind. At the same time, its elements can be presented as such.
- Cloud computing power gives enough scalability to process input data in a shorter span of time.
The other big challenge of the project was interface accessibility.
- In order to maintain its usefulness, the system needed an accessible user interface.
- The key requirement was to present the data required in a clearly structured and digestible form.
- In order to figure out the most appropriate design scheme, we carried out extensive user testing.
- The resulting interface is an interactive dashboard with data visualization and reporting features.
Applying a convolutional neural network to the process of skin cancer diagnosis was a worthwhile test of the technology's capabilities.
Overall, the results are as follows:
- The average time of delivering test results is 24 hours.
- With the CNN classification, the results can be delivered in a matter of hours (one hour average, depending on the amount of input data).
- The accuracy of results averages 90% (on testing data).
- In addition to that, the more the system is used, the more efficient it gets at processing clinical data.
- The operational costs are reduced by half.
As a result, we’ve built a system that is capable of processing large quantities of data in a relatively short time.