Automating cancer diagnosis using advanced deep learning techniques for multi-cancer image classification
TL;DR Summary
This paper automates multi-cancer diagnosis by evaluating ten CNN models on segmented medical images across seven cancer types. DenseNet121 proved most effective, achieving 99.94% validation accuracy, demonstrating deep learning's high potential for efficient and accurate multi-cancer diagnosis.
Abstract
Yogesh Kumar, Supriya Shrivastav, Kinny Garg, Nandini Modi, Katarzyna Wiltos, Marcin Woźniak & Muhammad Fazal Ijaz

Cancer detection poses a significant challenge for researchers and clinical experts due to its status as the leading cause of global mortality. Early detection is crucial, but traditional cancer detection methods often rely on invasive procedures and time-consuming analyses, creating a demand for more efficient and accurate solutions. This paper addresses these challenges by utilizing automated cancer detection through AI-based techniques, specifically focusing on deep learning models. Convolutional Neural Networks (CNNs), including DenseNet121, DenseNet201, Xception, InceptionV3, MobileNetV2, NASNetLarge, NASNetMobile, InceptionResNetV2, VGG19, and ResNet152V2, are evaluated on image datasets for seven types of cancer: brain, oral, breast, kidney, Acute Lymphocytic Leukemia, lung and colon, and cervical cancer.
In-depth Reading
1. Bibliographic Information
- Title: Automating cancer diagnosis using advanced deep learning techniques for multi-cancer image classification
- Authors: Yogesh Kumar, Supriya Shrivastav, Kinny Garg, Nandini Modi, Katarzyna Wiltos, Marcin Woźniak, and Muhammad Fazal Ijaz. The authors are affiliated with various institutions, including universities in India and Poland, indicating a collaborative international research effort.
- Journal/Conference: The paper carries "Received," "Accepted," and "Published online" dates (published online 23 October 2024), indicating a journal publication. The publisher is Springer Nature, a reputable publisher of scientific journals. The specific journal name is not given in the excerpt, but the context suggests a peer-reviewed venue in computational medicine or engineering.
- Publication Year: 2024
- Abstract: The paper tackles the challenge of cancer detection, a leading cause of global mortality. It proposes an automated multi-cancer diagnosis system using deep learning to overcome the limitations of traditional invasive and time-consuming methods. The authors evaluate ten different Convolutional Neural Network (CNN) models on image datasets for seven types of cancer: brain, oral, breast, kidney, Acute Lymphocytic Leukemia (ALL), lung and colon, and cervical cancer. The methodology involves image segmentation and contour feature extraction. The models are evaluated on metrics like accuracy, loss, and Root Mean Square Error (RMSE). The key finding is that the DenseNet121 model performed best, achieving a validation accuracy of 99.94%, a loss of 0.0017, and low RMSE values, demonstrating the high potential of AI for accurate cancer detection.
- Original Source Link: The paper is available at /files/papers/68ebccbf00e47ee3518bc8a5/paper.pdf. It is an Open Access article published under a Creative Commons license.
2. Executive Summary
- Background & Motivation (Why):
- Core Problem: Traditional cancer diagnosis methods are often invasive, slow, and prone to human error, particularly when dealing with multiple cancer types simultaneously. Early and accurate detection is critical for improving patient outcomes, creating a pressing need for more efficient and reliable diagnostic tools.
- Importance & Gaps: Cancer remains a primary cause of death worldwide. Manual interpretation of medical images (like histopathology slides) is subjective and cannot easily scale. Previous AI-based studies often focused on a single cancer type or used smaller, less diverse datasets, limiting their real-world applicability (generalizability). This paper addresses the gap by developing a framework for multi-cancer classification using a variety of publicly available datasets.
- Innovation: The paper's novelty lies in its comprehensive comparative analysis of ten well-established deep learning models on a combined dataset of seven different cancers. It also employs a detailed image preprocessing pipeline, including advanced segmentation and feature extraction techniques, to enhance the models' ability to identify cancerous regions accurately.
- Main Contributions / Findings (What):
- Primary Contributions:
- Development and evaluation of a deep learning framework using 10 different transfer learning models for classifying seven types of cancer from histopathology and medical images.
- Implementation of an advanced image preprocessing pipeline, including grayscale conversion, Otsu binarization, noise removal, watershed transformation for segmentation, and contour feature extraction.
- A rigorous comparative analysis of the models based on multiple performance metrics (accuracy, loss, RMSE, precision, recall, F1-score); a short metric-computation sketch follows this summary.
- Key Conclusions:
- The study demonstrates that deep learning models can achieve exceptionally high accuracy in multi-cancer classification.
- The DenseNet121 model emerged as the most effective classifier, achieving the highest validation accuracy (99.94%) and the lowest validation loss (0.0017).
- The results confirm that AI-based techniques, particularly deep learning, are powerful tools that can support clinicians by providing early, accurate, and automated cancer diagnoses, potentially reducing mortality rates.
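Since these metrics recur throughout the paper, here is a hedged sketch of how they can be computed for a multi-class classifier with NumPy and scikit-learn; the `y_true`/`y_prob` arrays are made-up toy values for illustration, not the paper's outputs.

```python
# Toy example: accuracy, macro precision/recall/F1, and RMSE for a
# 3-class classifier. Values are invented purely to make the code runnable.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = np.array([0, 1, 2, 2, 1, 0])            # ground-truth class ids
y_prob = np.array([[0.9, 0.05, 0.05],            # softmax outputs per sample
                   [0.1, 0.8, 0.1],
                   [0.2, 0.2, 0.6],
                   [0.1, 0.3, 0.6],
                   [0.3, 0.6, 0.1],
                   [0.7, 0.2, 0.1]])
y_pred = y_prob.argmax(axis=1)                   # predicted class per sample

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro")

# RMSE between one-hot targets and predicted class probabilities.
one_hot = np.eye(y_prob.shape[1])[y_true]
rmse = np.sqrt(np.mean((one_hot - y_prob) ** 2))
```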
3. Prerequisite Knowledge & Related Work
- Foundational Concepts:
- Histopathology Images: These are microscopic images of tissue that have been stained to highlight different cellular structures. Pathologists examine these images to diagnose diseases like cancer. In this paper, they are the primary input data for the AI models.
- Deep Learning (DL): A subfield of machine learning based on artificial neural networks with many layers (hence "deep"). DL models can automatically learn complex patterns and features from large amounts of data, like images, without being explicitly programmed.
- Convolutional Neural Network (CNN): A specialized type of deep learning network designed for processing grid-like data, such as images. CNNs use special layers called convolutional layers to automatically and adaptively learn spatial hierarchies of features, from simple edges and textures to complex objects. They are the standard for image classification tasks.
- Transfer Learning: A technique where a model developed for one task is reused as the starting point for a model on a second task. In this paper, the authors use models (like VGG19 and DenseNet121) that were pre-trained on a massive image dataset (ImageNet) and fine-tune them on the specific cancer datasets. This saves significant training time and often improves performance, especially with limited medical data (a minimal sketch follows this list).
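A minimal transfer-learning sketch in Keras (assuming TensorFlow): DenseNet121 pre-trained on ImageNet is reused as a frozen feature extractor, and only a small dense head is trained on the cancer images. The 60- and 10-unit head mirrors the DenseNet121 entry in Table 4; the 224x224 input size and the Adam optimizer are assumptions, not settings confirmed by the paper.

```python
import tensorflow as tf

base = tf.keras.applications.DenseNet121(
    weights="imagenet",            # reuse ImageNet features (transfer learning)
    include_top=False,             # drop the original 1000-class head
    input_shape=(224, 224, 3),     # assumed input size
    pooling="avg")                 # -> (None, 1024) feature vector
base.trainable = False             # freeze pre-trained weights initially

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(60, activation="relu"),   # head sizes from Table 4
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```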
- Previous Works: The paper reviews several prior studies that used deep learning for diagnosing specific cancers. The "Background" section and Table 1 summarize this landscape, highlighting the progress and remaining gaps.
- Acute Lymphocytic Leukemia (ALL): Anilkumar et al. (2022) used models like AlexNet and achieved 94.12% accuracy, but on a small dataset.
- Brain Tumors: Saedi et al. (2023) used a 2D CNN on MRI images, reaching 96.47% accuracy, but again with a limited dataset. Mohsen et al. (2018) also worked on brain tumors with a CNN, achieving 96.97% accuracy on an even smaller set of 66 images.
- Lung and Colon Cancer: Kwon et al. (2023) used a DNN on liquid biopsies, showing high AUC but on a small patient cohort. Hadiyoso et al. (2023) used VGG16 on a large histopathological dataset (25,000 images), achieving a high accuracy of 98.96%.
- Breast Cancer: Abunasser et al. (2023) proposed a BCCNN model on the BreakHis dataset, reaching 98.28% accuracy but noted it was time-consuming.
- Cervical Cancer: Kalbhor and Shinde (2023) used ResNet and GoogleNet on the Herlev dataset, achieving up to 96.01% accuracy but noted a lack of generalization.

Below is a transcription of the comparative analysis from the paper's Table 1.
**Table 1. Comparative analysis.** (Manual Transcription; some cells are missing or interleaved in the source extraction and are left blank.)

| Author's Name | Cancer | Dataset | Techniques | Outcomes | Limitations |
| --- | --- | --- | --- | --- | --- |
| Anilkumar et al. (2022)10 | ALL | Images collected from American Society of Haematology (ASH) | AlexNet, LuekNet | Accuracy = 94.12% | Small dataset |
| Saedi et al. (2023)13 | Brain | 3264 Magnetic Resonance Imaging (MRI) brain images | 2D CNN; Autoencoder | 2D CNN: Accuracy = 96.47%, AUC-ROC = 0.99; Autoencoder: Accuracy = 95.63%, AUC-ROC = 1 | Limited dataset |
| Kwon et al. (2023)25 | Lung and colon | 95 lung cancers, 96 colorectal cancers | DNN, CNN | Accuracy = 0.92 | The parameters of the model need to be fine-tuned |
| Bansal et al. (2022)28 | Oral | Histopathological dataset | | Accuracy = 92.41%, Loss = 0.03, RMSE = 0.09 | Real-time images are of low resolution, which hindered the performance of the models |
| Raza et al. (2023)17 | Breast | BUSI dataset | DeepBraestCancerNet DL model | Accuracy = 99.63%, F1 Score = 99.50%, Recall = 100%, Precision = 99.50% | More images should be taken |
| Uhm et al. (2021)24 | Kidney | TCIA dataset | | | |
| Nawaz et al. (2018)18 | | | | | Anisotropic diffusion method does not address all types of noise in images |

4. Methodology (Core Technology & Implementation)
- Principles: The core idea is to automate the classification of seven different types of cancer from medical images using deep learning. The methodology is built on the principle that pre-trained CNNs (transfer learning) can effectively learn the distinguishing visual features of different cancer cells after a systematic preprocessing and segmentation pipeline enhances the relevant parts of the images.
- Steps & Procedures: The overall workflow, as illustrated in Fig. 1, consists of several key stages:
Fig. 1: A schematic of the system design workflow for multi-cancer classification, covering preprocessing of the multi-cancer datasets, image segmentation (with multi-step details), feature extraction (e.g., area, perimeter, pixel intensity), the application of multiple deep learning classifier models, and an evaluation process based on accuracy, loss, RMSE, and other metrics, culminating in cancer-type prediction.

- Data Collection: Images for seven cancer types are collected from various public datasets.
- Data Pre-processing and Segmentation: Raw images are processed to isolate and highlight the regions of interest (cancerous cells).
- Feature Extraction: Geometric and textural features are computed from the segmented regions.
- Classification: Ten different CNN models are trained and evaluated on the processed images.
- Evaluation: The models' performance is measured using various metrics to identify the best-performing architecture.
- Datasets: The study utilizes seven publicly available datasets for different cancer types. Table 2 provides a breakdown of the image distribution for training, validation, and testing (roughly a 70%/15%/15% split; a code sketch of this split follows the table).
- Acute Lymphocytic Leukemia (ALL): 3,256 images of peripheral blood smears from Taleqani Hospital, Iran.
- Brain Tumor: 3,064 T1-weighted MRI images from 233 patients, covering glioma, meningioma, and pituitary tumors.
- Lung and Colon Cancer: 25,000 histopathological images, generated and augmented from 750 original images of lung and colon tissue.
- Kidney Tumor: 12,446 DICOM images from PACS in Dhaka, Bangladesh, representing cysts, normal cases, stones, and tumors.
- Cervical Cancer: 4,049 images of isolated cells from the SIPaKMeD Database.
- Oral Cancer: 5,192 histopathological images, including normal and Oral Squamous Cell Carcinoma (OSCC) cases.
- Breast Cancer: 7,909 histopathology images from 82 patients, including benign and malignant cases.
**Table 2. Dataset distribution of various types of cancer.** (Manual Transcription)

| Cancer Type | Training Set | Validation Set | Test Set | Total Images |
| --- | --- | --- | --- | --- |
| Lung and Colon Cancer | 17,500 | 3,750 | 3,750 | 25,000 |
| Breast Cancer | 5,536 | 1,186 | 1,186 | 7,909 |
| Brain Cancer | 2,144 | 459 | 459 | 3,064 |
| Kidney Cancer | 8,712 | 1,866 | 1,866 | 12,446 |
| Cervical Cancer | 2,834 | 607 | 607 | 4,049 |
| Acute Lymphocytic Leukemia | 2,280 | 488 | 488 | 3,256 |
| Oral Cancer | 4,946 | 120 | 126 | 5,192 |
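A hedged sketch of the roughly 70/15/15 split behind Table 2, using scikit-learn's `train_test_split`; the `paths`/`labels` placeholders are hypothetical, not the paper's data loaders.

```python
from sklearn.model_selection import train_test_split

# Placeholder data: hypothetical image paths and binary class ids.
paths = [f"img_{i:04d}.png" for i in range(20)]
labels = [i % 2 for i in range(20)]

# First carve out 30%, then split that 30% half-and-half into val/test,
# stratifying so each subset keeps the overall class proportions.
train_p, rest_p, train_y, rest_y = train_test_split(
    paths, labels, test_size=0.30, stratify=labels, random_state=42)
val_p, test_p, val_y, test_y = train_test_split(
    rest_p, rest_y, test_size=0.50, stratify=rest_y, random_state=42)
```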
- Data Pre-processing and Segmentation: This crucial pipeline, visualized in Fig. 2, prepares the images for the deep learning models; a consolidated code sketch follows the final (watershed) step below.
Fig. 2: Segmentation pipelines for images of the seven cancer types (Acute Lymphocytic Leukemia, brain, breast, cervical, kidney, lung and colon, oral). Each cancer type is shown across six processing stages: the original image, grayscale conversion, Otsu binarization, noise removal, distance transform, and watershed transformation, giving a visual overview of the multi-step segmentation applied for cancer detection.
- Grayscale Conversion: The RGB (color) images are converted to single-channel grayscale images to reduce computational complexity while retaining essential luminance information, using the standard luminance weighting:

$$I_{gray} = 0.299\,R + 0.587\,G + 0.114\,B$$

- $I_{gray}$: intensity of the pixel in the grayscale image.
- $R, G, B$: intensities of the red, green, and blue channels of the original pixel.
- Otsu Binarization: This is an automatic thresholding method that converts a grayscale image into a binary (black and white) image, separating the foreground (e.g., cells) from the background. It works by finding the threshold that maximizes the variance between the two classes of pixels:

$$\sigma_B^{2}(t) = \omega_0(t)\,\omega_1(t)\,\left[\mu_0(t) - \mu_1(t)\right]^{2}$$

- $\sigma_B^{2}(t)$: the between-class variance for a given threshold $t$.
- $\omega_0(t), \omega_1(t)$: probabilities of a pixel belonging to the background and foreground, respectively.
- $\mu_0(t), \mu_1(t)$: mean intensity values of the background and foreground pixels.
- Noise Removal: A Gaussian filter is applied to smooth the image and remove random variations in intensity (noise):

$$I'(x, y) = \sum_{i=-k}^{k} \sum_{j=-k}^{k} I(x+i,\, y+j)\, \frac{1}{2\pi\sigma^{2}}\, e^{-\frac{i^{2}+j^{2}}{2\sigma^{2}}}$$

- $I'(x, y)$: the new intensity value of the pixel at $(x, y)$.
- $I(x+i, y+j)$: intensity of a neighboring pixel.
- $\sigma$: standard deviation of the Gaussian function, controlling the amount of smoothing.
- The double summation represents a convolution operation over a neighborhood of size $(2k+1) \times (2k+1)$.
- Distance Transform: This technique calculates, for each pixel, the distance to the nearest non-zero pixel, which helps in separating touching objects. The paper provides a formula for this step, though it appears to be non-standard for a typical distance transform and may represent a different calculation.
- Critique: The paper's formula seems incorrect for a standard Euclidean distance transform: it computes the L2-norm of pixel intensities in a local neighborhood, not the distance to a specific feature. A true distance transform assigns each pixel $p$ its distance to the closest "object" pixel (e.g., a white pixel in a binary image): $D(p) = \min_{q \in S} \lVert p - q \rVert_2$, where $S$ is the set of object pixels.
- Watershed Transformation: This is an advanced segmentation algorithm often used to separate touching or overlapping objects in an image, such as cells in a cluster. It treats the image like a topographic map, where pixel intensities are elevations. "Flooding" the map from its local minima (markers) allows the algorithm to find the boundaries (watershed lines) that separate different catchment basins (objects).
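A consolidated sketch of the above pipeline with OpenCV, in the order shown in Fig. 2 (grayscale, Otsu, Gaussian denoising, distance transform, marker-based watershed); the file name, kernel sizes, and the 0.7-of-peak foreground threshold are illustrative assumptions, not values from the paper.

```python
import cv2
import numpy as np

img = cv2.imread("histopathology_slide.png")            # hypothetical input

# 1. Grayscale conversion (applies the 0.299R + 0.587G + 0.114B weighting).
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# 2. Otsu binarization: threshold chosen to maximize between-class variance.
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# 3. Noise removal with a 5x5 Gaussian filter (sigma derived by OpenCV).
denoised = cv2.GaussianBlur(binary, (5, 5), 0)

# 4. Distance transform: each non-zero pixel receives its Euclidean distance
#    to the nearest zero pixel, so cell centers become bright peaks.
dist = cv2.distanceTransform(denoised, cv2.DIST_L2, 5)

# 5. Watershed: seed markers from strong distance peaks (sure foreground)
#    and a dilated mask (sure background), then flood to split touching cells.
_, sure_fg = cv2.threshold(dist, 0.7 * dist.max(), 255, cv2.THRESH_BINARY)
sure_fg = sure_fg.astype(np.uint8)
sure_bg = cv2.dilate(denoised, np.ones((3, 3), np.uint8), iterations=3)
unknown = cv2.subtract(sure_bg, sure_fg)

_, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1            # keep label 0 free for the "unknown" region
markers[unknown == 255] = 0
markers = cv2.watershed(img, markers)   # boundary pixels are labeled -1
```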
- Feature Extraction: After segmentation, contour features are extracted to quantify the shape and properties of the detected regions. These features, summarized in Table 3, include Area, Perimeter, Width, Height, Aspect Ratio, and pixel intensity values (Min/Max/Mean Color). These quantitative descriptors can help the model differentiate between the morphologies of various cancer types; a short extraction sketch follows the table below.
**Table 3. Contour characteristics of images of various classes of cancer dataset.** (Manual Transcription)

| Parameters | ALL | Kidney | Cervical | Brain | Breast | Oral | Lung and Colon |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Area | 2.0 | 2.0 | 4.0 | 6.5 | 0.0 | 2.0 | 0.5 |
| Perimeter | 6.0 | 6.82 | 7.65 | 13.07 | 10.0 | 5.65 | 3.414 |
| Epsilon | 0.60 | 0.682 | 0.765 | 1.307 | 1.0 | 0.565 | 0.3414 |
| Width | 3 | 2 | 3 | 6 | 1 | 3 | 2 |
| Height | 2 | 4 | 4 | 4 | 6 | 3 | 2 |
| Aspect Ratio | 1.5 | 0.5 | 0.75 | 1.5 | 0.16 | 1.0 | 1 |
| Extent | 0.33 | 0.25 | 0.33 | 0.27 | 0.0 | 0.22 | 0.125 |
| Diameter | 1.59 | 1.59 | 2.25 | 2.87 | 0.0 | 1.59 | 0.79 |
| Min Value | 128.0 | 128.0 | 127.0 | 128.0 | 128.0 | 98.0 | 128.0 |
| Max Value | 130.0 | 166.0 | 136.0 | 137.0 | 135.0 | 184.0 | 134.0 |
| Min Value Loc | (12,510) | (262,480) | (420,502) | (316,317) | (311,506) | (90,437) | (105,511) |
| Max Value Loc | (11,510) | (261,478) | (421,502) | (317,317) | (311,511) | (91,437) | (106,511) |
| Mean Color | 129.16 | 148.5 | 130.625 | 132.0 | 132.0 | 137.4 | 131.0 |
| Extreme leftmost point | (11,510) | (261,477) | (419,502) | (315,316) | (311,506) | (89,437) | (105,510) |
| Extreme rightmost point | (13,511) | (262,480) | (421,502) | (320,319) | (311,506) | (91,437) | (106,511) |
| Extreme topmost point | (11,510) | (261,477) | (420,502) | (315,316) | (311,506) | (90,436) | (105,510) |
| Extreme bottommost point | (11,511) | (262,480) | (420,504) | (318,319) | (311,511) | (90,438) | (105,511) |
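A sketch of how the contour features in Table 3 can be computed with OpenCV on a segmented binary mask; the file name is a placeholder. Epsilon follows the Table 3 pattern of 10% of the perimeter, and Diameter matches the equivalent circular diameter $\sqrt{4 \cdot \text{Area} / \pi}$.

```python
import math
import cv2

mask = cv2.imread("segmented_mask.png", cv2.IMREAD_GRAYSCALE)  # placeholder
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

for c in contours:
    area = cv2.contourArea(c)
    perimeter = cv2.arcLength(c, True)        # closed-contour arc length
    epsilon = 0.1 * perimeter                 # approximation tolerance (Table 3)
    x, y, w, h = cv2.boundingRect(c)          # bounding-box width and height
    aspect_ratio = w / h
    extent = area / (w * h)                   # contour area / bounding-box area
    diameter = math.sqrt(4 * area / math.pi)  # equivalent circular diameter
    # Min/Max intensity values and their locations within the bounding box.
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(mask[y:y + h, x:x + w])
```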
- Applied Classifiers: Ten different pre-trained CNN architectures were used. Table 4 in the paper details their layered structure and parameter counts. The hyperparameters (like the learning rate) were optimized using techniques such as grid search and early stopping to prevent overfitting; a hedged training sketch follows Table 4 below.
**Table 4. Layered architecture of applied classifiers.** (Manual Transcription; rows after the InceptionV3 base layer are fragmentary in the source and transcribed as-is.)

| Models | Layer | Output Shape | Param # |
| --- | --- | --- | --- |
| DenseNet121 | Densenet121 | (None, 1, 1, 1024) | 7,037,504 |
| | Flatten_1 | (None, 1024) | 0 |
| | Dense_1 | (None, 60) | 61,500 |
| | Dense_2 | (None, 10) | 610 |
| | Total parameters | | 7,099,614 |
| | Trainable parameters | | 7,015,966 |
| | Non-trainable parameters | | 83,648 |
| DenseNet201 | Densenet201 | (None, 1, 1, 1920) | 18,321,984 |
| | Flatten_2 | (None, 1920) | 0 |
| | Dense_3 | (None, 60) | 115,260 |
| | Dense_4 | (None, 10) | 610 |
| | Total parameters | | 18,437,854 |
| | Trainable parameters | | 18,208,798 |
| | Non-trainable parameters | | 229,056 |
| InceptionResNetV2 | InceptionResNetV2 | (None, 8, 8, 1536) | 54,336,736 |
| | Flatten_3 | (None, 98304) | 0 |
| | Dense_5 | (None, 60) | 5,898,300 |
| | Dense_6 | (None, 10) | 610 |
| | Total parameters | | 60,235,646 |
| | Trainable parameters | | 60,175,102 |
| | Non-trainable parameters | | 60,544 |
| InceptionV3 | InceptionV3 | (None, 8, 8, 2048) | 21,802,784 |
| | Flatten_4 | | 0 |
| | Dense_13 | (None, 60) | 3,104,700 |
| | Dense_14 | (None, 10) | 610 |
| | Total parameters | | 7,375,026 |
| | Trainable parameters | | 7,338,288 |
| | Non-trainable parameters | | 36,738 |
| | Dense_16 | (None, 60) | |
| | Dense_18 | (None, 60) | 1,505,340 |
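A hedged sketch of the hyperparameter search described above: a small grid over learning rates, each run regularized by Keras `EarlyStopping`. The learning-rate grid, patience, epoch budget, and the `train_ds`/`val_ds` tf.data pipelines are assumptions for illustration, not settings reported in the paper.

```python
import tensorflow as tf

def build_model(lr: float) -> tf.keras.Model:
    # DenseNet121 backbone with the Table 4 head (60 -> 10 units).
    base = tf.keras.applications.DenseNet121(
        weights="imagenet", include_top=False,
        input_shape=(224, 224, 3), pooling="avg")
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Dense(60, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

# Stop training once validation loss plateaus, keeping the best weights.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

# Grid search: train one model per candidate learning rate, keep the best.
best_acc, best_model = 0.0, None
for lr in (1e-3, 1e-4, 1e-5):
    model = build_model(lr)
    history = model.fit(train_ds, validation_data=val_ds,  # assumed datasets
                        epochs=50, callbacks=[early_stop])
    val_acc = max(history.history["val_accuracy"])
    if val_acc > best_acc:
        best_acc, best_model = val_acc, model
```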