• Open access
  • Published: 24 July 2023

Deep learning applications to breast cancer detection by magnetic resonance imaging: a literature review

  • Richard Adam 1 ,
  • Kevin Dell’Aquila 1 ,
  • Laura Hodges 1 ,
  • Takouhie Maldjian 1 &
  • Tim Q. Duong 1  

Breast Cancer Research volume 25, Article number: 87 (2023)

Deep learning analysis of radiological images has the potential to improve diagnostic accuracy of breast cancer, ultimately leading to better patient outcomes. This paper systematically reviewed the current literature on deep learning detection of breast cancer based on magnetic resonance imaging (MRI). The literature search was performed from 2015 to Dec 31, 2022, using PubMed. Other databases included Semantic Scholar, ACM Digital Library, Google search, Google Scholar, and pre-print depositories (such as Research Square). Articles that did not use deep learning (such as texture analysis) were excluded. PRISMA guidelines for reporting were used. We analyzed different deep learning algorithms, methods of analysis, experimental design, MRI image types, types of ground truths, sample sizes, numbers of benign and malignant lesions, and performance in the literature. We discussed lessons learned and challenges to broad deployment in clinical practice, and suggested future research directions.

Breast cancer is the most common cancer and the second leading cause of cancer death in women. One in eight American women (13%) will be diagnosed with breast cancer in their lifetime, and one in 39 women (3%) will die from breast cancer (American Cancer Society Statistics, 2023). The American Cancer Society recommends yearly screening mammography for early detection of breast cancer for women, which may begin at age 40 [ 1 ]. About 2%–5% of women in the general population in the US have a lifetime risk of breast cancer of 20% or higher [ 1 ], although it can vary depending on the population being studied and the risk assessment method used. The ACS recommends yearly breast magnetic resonance imaging (MRI) in addition to mammography for women with 20–25% or greater lifetime risk [ 1 ]. Early detection and treatment are likely to result in better patient outcomes.

MRI is generally more sensitive and offers more detailed pathophysiological information but is less cost effective compared to mammography for population-based screening [ 2 , 3 ]. Breast MRI utilizes high-powered magnets and radio waves to generate 3D images. Cancer yield from MRI alone averages 22 cancers for every 1000 women screened, a rate of cancer detection roughly 10 times that achieved with screening mammography in average-risk women, and roughly twice the yield achieved with screening mammography in high-risk women [ 4 ]. Many recent studies have established contrast-enhanced breast MRI as a screening modality for women with a hereditary or familial increased risk for the development of breast cancer [ 5 ].

Interpretation of breast cancer on MRI relies on the expertise of radiologists. The growing demand for breast MRI and the shortage of radiologists have increased radiologists’ workload [ 6 , 7 ], leading to long wait times and delays in diagnosis [ 8 , 9 ]. Machine learning methods show promise in assisting radiologists by improving the accuracy of breast MRI interpretation and supporting clinical decision-making, which could in turn improve patient outcomes [ 10 , 11 ]. By analyzing large datasets of MRIs, machine learning algorithms can learn to identify and classify suspicious areas, potentially reducing the number of false positives and false negatives [ 11 , 12 ] and thus improving diagnostic accuracy. A few studies have shown that machine learning can outperform radiologists in detecting breast cancer on MRIs [ 13 ]. Machine learning could also help to prioritize worklists in a radiology department.

In recent years, deep learning (DL) methods have revolutionized the field of computer vision with a wide range of applications, from image classification and object detection to semantic segmentation and medical image analysis [ 14 ]. Deep learning is superior to traditional machine learning because of its ability to learn from unstructured or unlabeled data [ 14 ]. Unlike traditional machine learning algorithms, which require time-consuming data labeling, deep learning algorithms are more flexible and adaptable as they can learn from data that are not labeled or structured [ 15 ]. There have been a few reviews on deep learning breast cancer detection. Oza et al. reviewed detection and classification on mammography [ 16 ]. Saba et al. [ 17 ] presented a compendium of state-of-the-art techniques for diagnosing breast cancers and other cancers. Hu et al. [ 18 ] provided a broad overview of the research and development of artificial intelligence systems for clinical breast cancer image analysis, discussing the clinical role of artificial intelligence in risk assessment, detection, diagnosis, prognosis, and treatment response assessment. Mahoro et al. [ 10 ] reviewed the applications of deep learning to breast cancer diagnosis across multiple imaging modalities. Sechopoulos et al. [ 19 ] discussed the advances of AI in the realm of mammography and digital tomosynthesis. AI-based workflows integrating multiple datastreams, including breast imaging, can support clinical decision-making and help facilitate personalized medicine [ 20 ]. To our knowledge, there is currently no review that systematically compares different deep learning studies of breast cancer detection using MRI. Such a review would be important because it could help to delineate the path forward.

Figure  1 shows a graphic representation of a deep learning workflow. The input layer represents the breast cancer image that serves as input to the CNN. The multiple convolutional layers are stacked on top of the input layer. Each convolutional layer applies filters or kernels to extract specific features from the input image. These filters learn to detect patterns such as edges, textures, or other relevant features related to breast cancer. After each convolutional layer, activation functions like rectified linear unit (ReLU) are typically applied to introduce nonlinearity into the network. Following some of the convolutional layers, pooling layers are used to downsample the spatial dimensions of the feature maps. Common pooling techniques include max-pooling or average pooling. Pooling helps reduce the computational complexity and extract the most salient features. After the convolutional and pooling layers, fully connected layers are employed. These layers connect all the neurons from the previous layers to the subsequent layers. Fully connected layers enable the network to learn complex relationships between features. The final layer is the output layer, which provides the classification or prediction. In the case of breast cancer detection, it might output the probability or prediction of malignancy or benignity.

figure 1

The input layer represents the breast cancer image that serves as input to the CNN. The multiple convolutional layers are stacked on top of the input layer. Pooling layers are used to downsample the spatial dimensions of the feature maps. Fully connected layers are then employed to connect all the neurons from the previous layers to the subsequent layers. The final layer is the output layer, which provides the classification
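The workflow in Figure 1 can be illustrated with a minimal forward pass written in plain Python. This is only a sketch under stated assumptions: the 6 × 6 toy image, the hand-picked edge kernel, and the dense-layer weights are hypothetical and are not learned from data, and real models use many kernels, many layers, and a deep learning framework rather than nested lists.

```python
import math


def conv2d(image, kernel):
    """Valid 2D cross-correlation of a single-channel image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(fmap):
    """Rectified linear unit applied element-wise (introduces nonlinearity)."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max-pooling; downsamples each spatial dimension by `size`."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]


def dense_sigmoid(features, weights, bias):
    """One fully connected output unit; the sigmoid maps the score to (0, 1)."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def forward(image, kernel, weights, bias):
    """Convolution -> ReLU -> pooling -> flatten -> fully connected -> probability."""
    fmap = max_pool(relu(conv2d(image, kernel)))
    flat = [v for row in fmap for v in row]
    return dense_sigmoid(flat, weights, bias)


# Hypothetical toy input: a 6x6 image with a bright vertical stripe.
image = [[0, 0, 1, 1, 0, 0] for _ in range(6)]
# Hypothetical hand-picked kernel that responds to dark-to-bright vertical edges.
edge_kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

probability = forward(image, edge_kernel, weights=[0.5, 0.5, 0.5, 0.5], bias=-3.0)
print(probability)
```

In a trained network the kernel values, dense weights, and bias would be fitted by backpropagation, and the output would be interpreted as the predicted probability of malignancy.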

The goal of this study was to review the current literature on deep learning detection of breast cancer using breast MRI. We included literature in which DL was used for both primary screening setting and as a supplemental detection tool. We compared different deep learning algorithms, methods of analysis, types of ground truths, sample size, numbers of benign and malignant lesions, MRI image types, and performance indices, among others. We also discussed lessons learned, challenges of deployment in clinical practice and suggested future research directions.

Materials and methods

No ethics committee approval was required for this review.

Search strategy and eligibility criteria

PRISMA guidelines for reporting were adopted in our systematic review. The literature search was performed from 2017 to Dec 31, 2022, using the following key words: “breast MRI,” “breast magnetic resonance imaging,” “deep learning,” “breast cancer detection,” and “breast cancer screening.” The databases searched included PubMed, Semantic Scholar, ACM Digital Library, Google search, Google Scholar, and pre-print depositories (such as Research Square). We noted that many of the computing or machine learning journals were found on sites other than PubMed. Some were full-length peer-reviewed conference papers, in contrast with small conference abstracts. Articles that did not use deep learning (such as texture analysis) were excluded. Only original articles written in English were selected. Figure 2 shows the flowchart demonstrating how articles were included and excluded for our review. The search and initial screening for eligibility were performed by RA and independently verified by KD and/or TD. This study did not review DL prediction of neoadjuvant chemotherapy response, which has recently been reviewed [ 21 ].

figure 2

PRISMA selection flowchart

The PubMed search yielded 59 articles, of which 22 were review articles, 30 were not related to breast cancer detection on MRI, and two had unclear/unconventional methodologies. Five articles from the PubMed search remained after these exclusions (Fig. 2 ). In addition, 13 articles were found in databases other than PubMed, because many computing and machine learning journals are not indexed by PubMed. A total of 18 articles were included in our study (Table 1 ). Two of the studies stated that the patient populations were moderate/high risk [ 22 , 23 ] or high risk [ 23 ], while the remaining papers did not state whether the dataset was from screening or supplemental MRI.
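The selection counts reported above can be checked with a few lines of arithmetic. The variable names are illustrative only; the numbers are those stated in the text.

```python
# Counts as reported in the text; variable names are illustrative.
pubmed_hits = 59
pubmed_excluded = {
    "review articles": 22,
    "not related to breast cancer detection on MRI": 30,
    "unclear/unconventional methodology": 2,
}

pubmed_included = pubmed_hits - sum(pubmed_excluded.values())  # 59 - 54
other_database_articles = 13
total_included = pubmed_included + other_database_articles

print(pubmed_included, total_included)  # 5 18
```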

In this review, we first summarized the individual papers and then generalized the lessons learned. We then discussed challenges of deployment in the clinic and suggested future research directions.

Summary of individual papers

Adachi et al. [ 13 ] performed a retrospective study using RetinaNet as a CNN architecture to analyze and detect breast cancer in MIPs of DCE fat-suppressed MRI images. Images of breast lesions were annotated with a rectangular region-of-interest (ROI) and labeled as “benign” or “malignant” by an experienced breast radiologist. The AUCs, sensitivities, and specificities of four readers were also evaluated, as were those of the readers combined with the CNN. RetinaNet alone had a higher area under the curve (AUC) and sensitivity (0.925 and 0.926, respectively) than any of the readers. In two cases, the AI system misdiagnosed normal breast as malignancy, which may be the result of variations in normal breast tissue. Invasive ductal carcinoma near the axilla was missed by AI, possibly because it was mistaken for a normal axillary lymph node. A wider variety of data and larger training datasets could alleviate these problems.

Antropova et al. [ 24 ] compared MIP derived from the second post-contrast subtraction T1-weighted image to the central slice of the second post-contrast image with and without subtraction. The ground truth was ROIs based on radiology assessment with biopsy-proven malignancy. MIP images showed the highest AUC. Feature extraction and classifier training for each slice for DCE-MRI sequences, with slices in the hundreds, would have been computationally expensive at the time. MIP images, in widespread use clinically, contain enhancement information throughout the tumor volume. Because MIP images represent volume data without requiring a large number of slices, they are faster and computationally less expensive to process. MIP (AUC = 0.88) outperformed the single-slice images, and the subtracted single-slice DCE image (AUC = 0.83) outperformed the unsubtracted single-slice DCE image (AUC = 0.80). The subtracted DCE image is derived from two timepoints (the pre-contrast image is subtracted from the post-contrast image), and this yielded a higher AUC. Using multiple slices and/or multiple timepoints could further increase the AUC with DCE images, possibly even exceeding that of the MIP image (0.88). This would be an area for further exploration.
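The two image operations discussed here, subtraction and maximum intensity projection, can be sketched as follows. This is an illustration under stated assumptions rather than the authors’ pipeline: volumes are represented as nested lists indexed `volume[slice][row][col]`, the projection is taken along the slice axis, and the toy intensities are hypothetical.

```python
def subtract(post, pre):
    """Voxel-wise subtraction image: post-contrast minus pre-contrast."""
    return [[[p - q for p, q in zip(post_row, pre_row)]
             for post_row, pre_row in zip(post_slice, pre_slice)]
            for post_slice, pre_slice in zip(post, pre)]


def mip(volume):
    """Maximum intensity projection along the slice axis.

    Collapses volume[slice][row][col] into a 2D image[row][col] by keeping,
    for each pixel position, the brightest voxel across all slices.
    """
    n_rows, n_cols = len(volume[0]), len(volume[0][0])
    return [[max(sl[r][c] for sl in volume) for c in range(n_cols)]
            for r in range(n_rows)]


# Hypothetical toy volumes: 2 slices of 2x2 voxels.
pre = [[[10, 10], [10, 10]],
       [[10, 10], [10, 10]]]
post = [[[10, 50], [10, 10]],
        [[10, 10], [30, 10]]]

subtraction_mip = mip(subtract(post, pre))
print(subtraction_mip)  # [[0, 40], [20, 0]]
```

The projection keeps enhancement from every slice in a single 2D image, which is why one MIP can stand in for hundreds of slices, at the cost of losing depth information.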

Ayatollahi et al. [ 22 ] performed a retrospective study using 3D RetinaNet as a CNN architecture to analyze and detect breast cancer in ultrafast TWIST DCE-MRI images. They used 572 images (365 malignant and 207 benign) taken from 462 patients. Bounding boxes drawn around the lesion in the images were used as ground truth. They found a detection rate of 0.90 and a sensitivity of 0.95 with tenfold cross validation.

Feng et al. [ 23 ] implemented the Knowledge-Driven Feature Learning and Integration (KFLI) model using DWI and DCE-MRI data from 100 high-risk female patients with 32 benign and 68 malignant lesions, segmented by two experienced radiologists. They reported 0.85 accuracy. The model comprises a sequence division module and an adaptive weighting module. The sequence division module, based on lesion characteristics, is used for feature learning, and the adaptive weighting module is used for automatic feature integration while improving the performance of cooperative diagnosis. The model reports the contribution of each sub-sequence and guides radiologists to focus on characteristic-related sequences with high contribution to lesion diagnosis. This can save radiologists time and help them better understand the output of the deep networks. As such, it can extract sufficient and effective features from each sub-sequence for a comprehensive diagnosis of breast cancer. This ensemble of a deep network and domain knowledge achieved high sensitivity, specificity, and accuracy.

Fujioka et al. [ 25 ] used 3D MIPs from the early phase (1–2 min) of axial fat-suppressed DCE images, and compared the performance of CNN models with that of two human readers (Reader 1 = breast surgeon with 5 years of experience and Reader 2 = radiologist with 20 years of experience) in distinguishing benign from malignant lesions. The highest AUC achieved with deep learning was with the InceptionResNetV2 CNN model, at 0.895. Mean AUC across the different CNN models was 0.830, and the range was 0.750–0.895, comparable to the human readers. False-positive masses tended to be relatively large with a fast pattern of strong enhancement, and false-negative masses tended to be relatively small with a medium to slow pattern of enhancement. The one false-positive and one false-negative non-mass enhancing lesions observed were also incorrectly diagnosed by the human readers. The main limitation of their study was the small sample size.

Haarburger et al. [ 26 ] performed an analysis of 3D whole-volume images on a larger cohort ( N = 408 patients), yielding an AUC of up to 0.89 and accuracy of 0.81, further establishing the feasibility of using 3D DCE whole images. Their method involved feeding DCE images from 5 timepoints (before contrast and 4 times post-contrast) and T2-weighted images to the algorithms. The multicurriculum ensemble consisted of a 3D CNN that generates feature maps and a classification component that performs classification based on the aggregated feature maps. AUCs ranged from 0.50 to 0.89 depending on the CNN model used. Multiscale curriculum training improved a simple 3D ResNet18 from an AUC of 0.50 to an AUC of 0.89 (ResNet18 curriculum). A radiologist with 2 years of experience demonstrated an AUC of 0.93 and accuracy of 0.93. An advantage of the multicurriculum ensemble is that it eliminates the need for pixelwise segmentation of individual lesions: only coarse localization coordinates are needed for Stage 1 training (performed in 3D in this case) and one global label per breast for Stage 2 training, where Stage 2 involved predictions on whole 3D images in this study. The high performance of this model can be attributed to the large amount of context and global information provided. Their 3D data use whole breast volumes without time-consuming and cost-prohibitive lesion segmentation. A major drawback of 3D images is that they require more RAM and many patients to train the model.

Herent et al. [ 27 ] used T1-weighted fat-suppressed post-contrast MRI in a CNN model that detected and then characterized lesions ( N  = 335). Lesion characterization consisted of diagnosing malignancy and lesion classification. Their model, therefore, performed three tasks and thereby was a multitask technique, which limits overfitting. ResNET50 neural network performed feature extraction from images, and images were processed by the algorithm’s attention block which learned to detect abnormalities. Images were fed into a second branch where features were averaged over the selected regions, then fitted to a logistic regression to produce the output. On an independent test set of 168 images, a weighted mean AUC of 0.816 was achieved. The training dataset consisted of 17 different histopathologies, of which most were represented as very small percentages of the whole dataset of 335. Several of the listed lesion types represented less than 1% of the training dataset. This leads to the problem of overfitting. The authors mention that validation of the algorithm by applying it to 3D images in an independent dataset, rather than using the single 2D images as they did, would show if the model is generalizable. The authors state that training on larger databases and with multiparametric MRI would likely increase accuracy. This study shows good performance of a supervised attention model with deep learning for breast MRI.

Hu et al. [ 28 ] used multiparametric MR images (a DCE-MRI sequence and a T2-weighted MRI sequence) in a CNN model including 616 patients with 927 unique breast lesions, 728 of which were malignant. A pre-trained CNN extracted features from both DCE and T2w sequences depicting lesions that were classified as benign or malignant by support vector machine classifiers. Sequences were integrated at different levels using image fusion, feature fusion, and classifier fusion. Feature fusion from multiparametric sequences outperformed DCE-MRI alone. The feature fusion model had an AUC of 0.87, sensitivity of 0.78, and specificity of 0.79. CNN models that combined separate T2w and DCE images into RGB images (image fusion) or that aggregated the probability-of-malignancy outputs from DCE and T2w classifiers (classifier fusion) did not perform significantly better than the CNN model using DCE alone. Although other studies have demonstrated that single-sequence MRI is sufficient for high CNN performance, this study demonstrates that multiparametric MRI (as fusion of features from DCE-MRI and T2-weighted MRI) offers enough information to outperform single-sequence MRI.
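The three integration levels can be contrasted with a short sketch. Everything here is a hypothetical stand-in: `extract_features` replaces the pre-trained CNN with two summary statistics, the toy images are invented, and averaging is only one possible way to aggregate classifier outputs, assumed here for illustration.

```python
def extract_features(image):
    """Hypothetical stand-in for a pre-trained CNN feature extractor.

    Returns [mean intensity, max intensity] instead of deep features.
    """
    flat = [v for row in image for v in row]
    return [sum(flat) / len(flat), max(flat)]


def image_fusion(dce, t2w):
    """Image-level fusion: stack the sequences as channels of one image."""
    return [[(a, b) for a, b in zip(dce_row, t2w_row)]
            for dce_row, t2w_row in zip(dce, t2w)]


def feature_fusion(dce, t2w):
    """Feature-level fusion: concatenate per-sequence feature vectors
    so that a single classifier sees both."""
    return extract_features(dce) + extract_features(t2w)


def classifier_fusion(p_dce, p_t2w):
    """Classifier-level fusion: combine per-sequence malignancy probabilities.

    A plain average is assumed here; other aggregation rules are possible.
    """
    return (p_dce + p_t2w) / 2.0


# Hypothetical 2x2 lesion crops from the two sequences.
dce = [[0.0, 1.0], [0.0, 1.0]]
t2w = [[0.5, 0.5], [0.5, 0.5]]

print(image_fusion(dce, t2w)[0])      # [(0.0, 0.5), (1.0, 0.5)]
print(feature_fusion(dce, t2w))       # [0.5, 1.0, 0.5, 0.5]
print(classifier_fusion(0.75, 0.25))  # 0.5
```

The difference lies in where the sequences meet: before feature extraction, after feature extraction but before classification, or after classification.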

Li et al. [ 29 ] used 3D CNNs in DCE-MR images to differentiate between benign and malignant tumors from 143 patients. In 2D and 3D DCE-MRI, a region-of-interest (ROI) and volume-of-interest (VOI) were segmented, and enhancement ratios for each MR series were calculated. The AUC value of 0.801 for the 3D CNN was higher than the value of 0.739 for 2D CNN. Furthermore, the 3D CNN achieved higher accuracy, sensitivity, and specificity values of 0.781, 0.744, and 0.823, respectively. The DCE-MRI enhancement maps had higher accuracy by using more information to diagnose breast cancer. The high values demonstrate that 3D CNN in breast cancer MR imaging can be used for the detection of breast cancer and reduce manual feature extraction.

Liu et al. [ 30 ] used a CNN to analyze and detect breast cancer on T1 DCE-MRI images from 438 patients, 131 from I-SPY clinical trials and 307 from Columbia University. Segmentation was performed through an automated process involving fuzzy C-means clustering after seed points were manually indicated. This study included analysis of commonly excluded image features such as background parenchymal enhancement, slice images of breast MRI, and axilla/axillary lymph node involvement. The methods also minimized annotations done at pixel level, to maximize automation of visual interpretation. These objectives increased efficiency, decreased subjective bias, and allowed for complete evaluation of the whole image. Obtaining images with multiple timepoints from multiple institutions made the algorithm more generalizable. The CNN model achieved an AUC of 0.92, accuracy of 0.94, sensitivity of 0.74, and specificity of 0.95.

Marrone et al. [ 31 ] used CNN to evaluate 42 malignant and 25 benign lesions in 42 women. ROIs were obtained by an experienced radiologist, and manual segmentation was performed. Accuracy of up to 0.76 was achieved. AUC as high as 0.76 was seen on pre-trained AlexNet versus 0.73 on fine-tuning of pre-trained AlexNet where the last trained layers were replaced by untrained layers. The latter method could allow reduced number of training images needed. The training from scratch AlexNet model is accomplished when AlexNet pre-trained on the ImageNet database is used to extract a feature vector from the last internal CNN layer, and a new supervised training is employed, which yielded the lowest AUC of 0.68 and accuracy of 0.55.

Rasti et al. [ 32 ] analyzed DCE-MRI subtraction images from MRI studies ( N  = 112) using a multi-ensemble CNN (ME-CNN) functioning as a CAD system to distinguish benign from malignant masses, producing 0.96 accuracy with their method. The ME-CNN is a modular and image-based ensemble, which can stochastically partition the high-dimensional image space through simultaneous and competitive learning of its modules. It also has the advantages of fast execution time in both training and testing and a compact structure with a small number of free parameters. Among several promising directions, one could extend the ME-CNN approach to the pre-processing stage, by combining ME-CNN with recent advances in fully autonomous CNNs for semantic segmentation.

Truhn et al. [ 33 ] used T2-weighted images with one pre-contrast and four post-contrast DCE images in 447 patients with 1294 enhancing lesions (787 malignant and 507 benign) manually segmented by a breast radiologist. Deep learning with CNN demonstrated an AUC of 0.88, which was inferior to prospective interpretation by one of the three breast radiologists (7–25 years of experience) reading cases in equal proportion (0.98). When only half of the dataset was used for training ( n = 647), the AUC was 0.83. The authors speculate that their model could improve its performance with training on a greater number of cases.

Wu et al. [ 34 ] trained a CNN model to analyze and detect lesions from DCE T1-weighted images from 130 patients, 71 of whom had malignant lesions and 59 of whom had benign tumors. A fuzzy C-means clustering-based algorithm automatically segmented 3D tumor volumes from DCE images after rectangular regions of interest were placed by an expert radiologist. An objective of the study was to demonstrate that single-sequence MRI at multiple timepoints provides sufficient information for CNN models to accurately classify lesions.

Yurtakkal et al. [ 35 ] utilized DCE images of 98 benign and 102 malignant lesions, producing 0.98 accuracy, 1.00 sensitivity, and 0.96 specificity. The multi-layer CNN architecture utilized consisted of six groups of convolutional, batch normalization, rectified linear activation function layers, and five max-pooling followed by one dropout layer, one fully connected layer, and one softmax layer.

Zheng et al. [ 36 ] used a dense convolutional long short-term memory (DC-LSTM) on a dataset of lesions obtained through a university hospital ( N  = 72). The method was inspired by DenseNet and built on convolutional LSTM. It first uses a three-layer convolutional LSTM to encode DCE-MRI as sequential data and extract time-intensity information then adds a simplified dense block to reduce the amount of information being processed and improve feature reuse. This lowered the variance and improved accuracy in the results. Compared to a ResNet-50 model trained only on the main task, the combined model of DC-LSTM + ResNet improved the accuracy from 0.625 to 0.847 on the same dataset. Additionally, the authors proposed a latent attributes method to efficiently use the information in diagnostic reports and accelerate the convergence of the network.

Jiejie Zhou et al. [ 37 ] evaluated 133 lesions (91 malignant and 62 benign) using ResNET50, which is similar to ResNET18 used by Truhn et al. [ 33 ] and Haarburger et al . [ 26 ]. Their investigation demonstrated that deep learning produced higher accuracy compared to ROI-based and radiomics-based models in distinguishing between benign and malignant lesions. They compared the metrics resulting from using five different bounding boxes. They found that using the tumor alone and smaller bounding boxes yielded the highest AUC of 0.97–0.99. They also found that the inclusion of a small amount of peritumoral tissue improved accuracy compared to smaller boxes that did not include peritumoral tissue (tumor alone boxes) or larger input boxes (that include tissue more remote from peritumoral tissue), with accuracy of 0.91 in the testing dataset. The tumor microenvironment influences tumor growth, and the tumor itself can alter its microenvironment to become more supportive of tumor growth. Therefore, the immediate peritumoral tissue, which would include the tumor microenvironment, was important in guiding the CNN to accurately differentiate between benign and malignant tumors. This dynamic peritumoral ecosystem can be influenced by the tumor directing heterogeneous cells to aggregate and promote angiogenesis, chronic inflammation, tumor growth, and invasion. Recognizing features displayed by biomarkers of the tumor microenvironment may help to identify and grade the aggressiveness of a lesion. This complex interaction between the tumor and its microenvironment may potentially be a predictor of outcomes as well and should be included in DL models that require segmentation. In DL models using whole images without segmentation of any sort, the peritumoral tissue would already be included, which would preclude the need for precise bounding boxes.
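The idea of enlarging a tumor bounding box to include a small rim of peritumoral tissue can be sketched as below. The box convention `(row_min, col_min, row_max, col_max)`, the margin of one pixel, and the image size are assumptions chosen for illustration; they are not the box definitions used in the study.

```python
def expand_box(box, margin, image_shape):
    """Grow a box by `margin` pixels on every side, clipped to the image.

    box is (row_min, col_min, row_max, col_max) with inclusive bounds;
    image_shape is (n_rows, n_cols).
    """
    r0, c0, r1, c1 = box
    n_rows, n_cols = image_shape
    return (max(0, r0 - margin), max(0, c0 - margin),
            min(n_rows - 1, r1 + margin), min(n_cols - 1, c1 + margin))


def crop(image, box):
    """Extract the pixels inside an inclusive bounding box."""
    r0, c0, r1, c1 = box
    return [row[c0:c1 + 1] for row in image[r0:r1 + 1]]


# Hypothetical 6x6 image whose pixel value encodes its position (row * 10 + col).
image = [[r * 10 + c for c in range(6)] for r in range(6)]
tumor_box = (2, 2, 3, 3)  # tumor-alone box

peritumoral_box = expand_box(tumor_box, margin=1, image_shape=(6, 6))
print(peritumoral_box)               # (1, 1, 4, 4)
print(len(crop(image, tumor_box)))   # 2 rows
print(len(crop(image, peritumoral_box)))  # 4 rows
```

Varying `margin` gives a family of input boxes, from tumor alone to progressively more surrounding tissue, which is the kind of comparison described above.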

Juan Zhou et al. [ 38 ] used 3D deep learning models to classify and localize malignancy from cases ( N  = 1537) of MRIs. The deep 3D densely connected networks were utilized under image-level supervision (weakly supervised). Since 3D weakly supervised approach was not well studied compared to 2D methods, the purpose of this study was to develop a 3D deep learning model that could identify malignant cancer from benign lesions and could localize the cancer. The model configurations of global average pooling (GAP) and global max-pooling (GMP) that were used both achieved over 0.80 accuracy with AUC of 0.856 (GMP) and 0.858 (GAP) which demonstrated the effectiveness of the 3D DenseNet deep learning method in MRI scans to diagnose breast cancer. The model ensemble achieved AUC of 0.859.
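The two pooling configurations differ only in how a 3D feature map is collapsed into a single value per channel before classification. A minimal sketch, assuming a feature map stored as nested lists indexed `[depth][row][col]` and hypothetical activation values:

```python
def global_average_pool(feature_map):
    """GAP: mean activation over the whole 3D feature map of one channel."""
    flat = [v for plane in feature_map for row in plane for v in row]
    return sum(flat) / len(flat)


def global_max_pool(feature_map):
    """GMP: strongest single activation in the 3D feature map of one channel."""
    return max(v for plane in feature_map for row in plane for v in row)


# Hypothetical 2x2x2 feature map with one strongly activated voxel.
feature_map = [[[0, 0], [0, 8]],
               [[0, 0], [0, 0]]]

print(global_average_pool(feature_map))  # 1.0
print(global_max_pool(feature_map))      # 8
```

GMP responds to the single most suspicious location, whereas GAP reflects how much of the volume is activated; with image-level labels only, either value can feed the classifier, and the location of high activations offers a coarse localization.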

Summary of lessons learned

Most studies were single-center studies, but they came from around the world, with the majority coming from the US, Asia, and Europe. All studies except one [ 33 ] were retrospective studies. The sample size of each study ranged from 42 to 690 patients, generally small for DL analysis. Sample sizes for patients with benign and malignant lesions were comparable and were not skewed toward either benign or malignant lesions, suggesting that these datasets were not from high-risk screening patients, because a high-risk screening dataset would have contained very few (typically < 5%) positive cases.

Image types

Most studies used private datasets as their image source. ISPY-1 data were the only public dataset noted ( https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=20643859 ). Most studies involved DCE data acquisition, but most analyses included only a single post-contrast MRI. For those that used multiple post-contrast MRI dynamics, most fed each dynamic into a separate independent channel, which does not make optimal use of the relationships between imaging dynamics. Some studies used subtraction of post- and pre-contrast images or signal enhancement ratio (SER) [ 24 , 32 , 35 ]. Three studies utilized MIP DCE images to minimize computation cost [ 13 , 24 , 25 ]. However, collapsing images by MIP has drawbacks (e.g., enhanced vascular structures collapsed into a single plane may be mistaken for cancer). There were only five studies [ 23 , 26 , 28 , 33 , 36 ] that utilized multiparametric data types (e.g., DCE, T2-weighted, and DWI). Although combining multiple types of MRIs should improve performance, this has not been conclusively demonstrated in practice.

Types of DL architectures

RetinaNet and KFLI are optimized for object detection, while VGGNet, InceptionResNet, and AlexNet are designed for image classification (see review [ 16 , 17 , 39 ]). LSTM is used for time-series modeling. DenseNet, on the other hand, can be used for a wide range of tasks, including image classification, object detection, and semantic segmentation. Ensemble methods, which combine multiple models, are useful for boosting the overall performance of a system. U-Net and R-Net are specialized deep learning models for semantic segmentation tasks in medical image analysis. U-Net uses an encoder–decoder architecture to segment images into multiple classes, while R-Net is a residual network that improves the accuracy and efficiency of the segmentation task.

The most used algorithm is CNN or CNN-based. There is no consensus that certain algorithms are better than others. Given the fact that different algorithms were tested on different datasets, it is not possible to conclude that a particular DL architecture performs better than others. Careful comparison of multiple algorithms on the same datasets is needed. Thus, we only discussed potential advantages and disadvantages of each DL architecture. Performance indices could be misleading.

Although each model has its own unique architecture and design principles, most of the above-mentioned methods utilized convolutional layers, pooling layers, activation functions, and regularization techniques (such as dropout and batch normalization) for model optimization. Additionally, the use of pre-trained models and transfer learning has become increasingly popular, allowing researchers to leverage knowledge learned from large datasets such as ImageNet to improve model performance on smaller, specialized datasets. However, the literature on transfer learning in breast cancer MRI detection is limited. A relatively new deep learning method known as the transformer has found exciting applications in medical imaging [ 40 , 41 ].

Ground truths

Ground truths were either based on pathology (i.e., benign versus malignant cancer), radiology reports, radiologist annotation (ROI contoured on images), or a bounding box, with reference to pathology or clinical follow-up (i.e., absence of a positive clinical diagnosis). While the gold standard is pathology, imaging or clinical follow-up without adverse change over a prescribed period has been used as empiric evidence of non-malignancy. This is an acceptable form of ground truth.

Only four out of 18 studies provided heatmaps of the regions that the DL algorithms consider important. Heatmaps present these regions visually in color, showing whether the area of activity makes sense anatomically or is artifactual (e.g., a biopsy clip, motion artifact, or a location outside of the breast). Heatmaps are important for interpretability and explainability of DL outputs.

Performance

All studies included some performance indices, and most included AUC, accuracy, sensitivity, and specificity. AUC ranged from 0.5 to 1.0, with the majority around 0.8–0.9. Other metrics also varied over a wide range. DL training methods varied, and they included the leave-one-out method, the hold-out method, and splitting the dataset (such as 80%/20% training/testing) with cross validation. Most studies utilized five- or tenfold cross validation for performance evaluation, but some used a single hold-out method, and some did not include cross validation. Cross validation is important to avoid unintentional skewing of data due to the partition into training and testing sets. Different training methods could affect performance. Interpretation of these metrics needs to be made with caution as there could be study reporting bias, small sample size, and overfitting, among others. High performance indices are necessary for adoption of DL algorithms in clinical use. However, good performance indices alone are not sufficient. Other measures, such as heatmaps and accumulated experience that builds trust, are needed for widespread clinical adoption of DL algorithms.
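The performance indices named above, and the way k-fold cross validation partitions a dataset, can be made concrete with a short sketch. The labels and scores are hypothetical (1 = malignant, 0 = benign), the 0.5 decision threshold is an assumption, and the fold assignment is a simple interleaved split without the shuffling or stratification that would normally be used in practice.

```python
def confusion_counts(labels, preds):
    """Return (TP, FP, TN, FN) for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp, fp, tn, fn):
    return tp / (tp + fn)  # fraction of malignant cases detected


def specificity(tp, fp, tn, fn):
    return tn / (tn + fp)  # fraction of benign cases correctly cleared


def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)


def auc(labels, scores):
    """AUC as the probability that a malignant case outscores a benign one
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


# Hypothetical labels and model scores for eight lesions.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

counts = confusion_counts(labels, preds)
print(sensitivity(*counts), specificity(*counts), accuracy(*counts))  # 0.75 0.75 0.75
print(auc(labels, scores))  # 0.9375
```

Sensitivity, specificity, and accuracy depend on the chosen threshold, whereas AUC summarizes performance across all thresholds, which is one reason the indices reported by different studies are not directly comparable.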

DL detection of axillary lymph node involvement

Accurate assessment of axillary lymph node involvement in breast cancer patients is also essential for prognosis and treatment planning [ 42 , 43 ]. Current radiological staging of nodal metastasis has poor accuracy. DL detection of lymph node involvement is challenging because lymph nodes are small and ground truths are difficult to obtain. Only a few studies have reported the use of DL to detect lymph node involvement [ 44 , 45 , 46 ].

Challenges for DL to achieve routine clinical applications

Although deep learning is a promising tool in the diagnosis of breast cancer, there are several challenges that need to be addressed before routine clinical applications can be broadly realized.

Data availability: One of the major challenges in medical image diagnosis (and breast cancer MRI in particular) is the availability of large, diverse, and well-annotated datasets. Deep learning models require a large amount of high-quality data to learn from, but, in many cases, medical datasets are small and imbalanced. In medical image diagnosis, it is important to have high-quality annotations of images, which can be time-consuming and costly to obtain. Annotating medical images requires specialized expertise, and there may be inconsistencies between different experts. This can lead to challenges in building accurate and generalizable models. Medical image datasets can lack diversity, which can lead to biased models. For example, a model trained on images with inadequate representation of racial or ethnicity subgroups may not be broadly generalizable. Private medical datasets obtained from one institution could be non-representative of certain racial or ethnic subgroups and, therefore, may not be generalizable. Publicly available data are unfortunately limited, one of which can be found on cancerimagingarchive.net. Collaborative learning facilitating training of DL models by sharing data without breaching privacy can be accomplished with federated learning [ 47 ].

Interpretability, explainability, and generalizability [ 48 ]: Deep learning models are often seen as “black boxes” that can be difficult to interpret. This is especially problematic in medical image diagnosis, where it is important to understand why a particular diagnosis is made. Recent research has focused on methods that explain the decision-making process of deep learning models, such as attention mechanisms or heatmaps that highlight relevant regions in the MRI image. Despite these efforts, the explainability of these models is still limited [ 49 ], which can make it difficult for clinicians to understand and trust a model's decision. In addition, deep learning models may perform well on the datasets on which they were trained but may not generalize well to new datasets or to patients with different characteristics. This can lead to challenges in deploying the model in a real-world setting.

Ethical concerns: Deep learning models can be used to make life-or-death decisions, such as the diagnosis of cancer. This raises ethical concerns about the safety, responsibility, privacy, fairness, and transparency of these models [ 50 ]. There are also social implications (including but not limited to equity) of using artificial intelligence in health care. This needs to be addressed as we develop more and more powerful DL algorithms.

Perspectives and conclusions

Artificial intelligence has the potential to revolutionize breast cancer screening and diagnosis, helping radiologists to be more efficient and more accurate, ultimately leading to better patient outcomes. It can also help to reduce the need for biopsy and for unnecessary testing and treatment. However, several challenges have so far precluded broad deployment in clinical practice. Large, diverse, and well-annotated image datasets need to be readily available for research. Deep learning results need to be more accurate, interpretable, explainable, and generalizable. One future research direction is to incorporate other clinical data and risk factors, such as age, family history, or genetic mutations, into the model to improve diagnostic accuracy and enable personalized medicine. Another is to assess the impact of deep learning on health outcomes, which could encourage investment by hospital administrators and other stakeholders. Finally, it is important to address the ethical, legal, and social implications of using artificial intelligence.

Availability of data and materials

Not applicable.

References

Saslow D, Boetes C, Burke W, Harms S, Leach MO, Lehman CD, Morris E, Pisano E, Schnall M, Sener S, et al. American Cancer Society guidelines for breast screening with MRI as an adjunct to mammography. CA Cancer J Clin. 2007;57(2):75–89.

Feig S. Comparison of costs and benefits of breast cancer screening with mammography, ultrasonography, and MRI. Obstet Gynecol Clin North Am. 2011;38(1):179–96.

Kumar NA, Schnall MD. MR imaging: its current and potential utility in the diagnosis and management of breast cancer. Magn Reson Imaging Clin N Am. 2000;8(4):715–28.

Lehman CD, Smith RA. The role of MRI in breast cancer screening. J Natl Compr Canc Netw. 2009;7(10):1109–15.

Mann RM, Kuhl CK, Moy L. Contrast-enhanced MRI for breast cancer screening. J Magn Reson Imaging. 2019;50(2):377–90.

Batchu S, Liu F, Amireh A, Waller J, Umair M. A review of applications of machine learning in mammography and future challenges. Oncology. 2021;99(8):483–90.

Wuni AR, Botwe BO, Akudjedu TN. Impact of artificial intelligence on clinical radiography practice: futuristic prospects in a low resource setting. Radiography (Lond). 2021;27(Suppl 1):S69–73.

Skegg D, Paul C, Benson-Cooper D, Chetwynd J, Clarke A, Fitzgerald N, Gray A, St George I, Simpson A. Mammographic screening for breast cancer: prospects for New Zealand. N Z Med J. 1988;101(852):531–3.

Wood DA, Kafiabadi S, Busaidi AA, Guilhem E, Montvila A, Lynch J, Townend M, Agarwal S, Mazumder A, Barker GJ, et al. Deep learning models for triaging hospital head MRI examinations. Med Image Anal. 2022;78:102391.

Mahoro E, Akhloufi MA. Applying deep learning for breast cancer detection in radiology. Curr Oncol. 2022;29(11):8767–93.

Deo RC, Nallamothu BK. Learning about machine learning: the promise and pitfalls of big data and the electronic health record. Circ Cardiovasc Qual Outcomes. 2016;9(6):618–20.

Meyer-Base A, Morra L, Tahmassebi A, Lobbes M, Meyer-Base U, Pinker K. AI-enhanced diagnosis of challenging lesions in breast MRI: a methodology and application primer. J Magn Reson Imaging. 2021;54(3):686–702.

Adachi M, Fujioka T, Mori M, Kubota K, Kikuchi Y, Xiaotong W, Oyama J, Kimura K, Oda G, Nakagawa T, et al. Detection and diagnosis of breast cancer using artificial intelligence based assessment of maximum intensity projection dynamic contrast-enhanced magnetic resonance images. Diagnostics (Basel). 2020;10(5):330.

LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44.

Koh DM, Papanikolaou N, Bick U, Illing R, Kahn CE Jr, Kalpathi-Cramer J, Matos C, Marti-Bonmati L, Miles A, Mun SK, et al. Artificial intelligence and machine learning in cancer imaging. Commun Med (Lond). 2022;2:133.

Oza P, Sharma P, Patel S, Bruno A. A bottom-up review of image analysis methods for suspicious region detection in mammograms. J Imaging. 2021;7(9):190.

Saba T. Recent advancement in cancer detection using machine learning: systematic survey of decades, comparisons and challenges. J Infect Public Health. 2020;13(9):1274–89.

Hu Q, Giger ML. Clinical artificial intelligence applications: breast imaging. Radiol Clin North Am. 2021;59(6):1027–43.

Sechopoulos I, Teuwen J, Mann R. Artificial intelligence for breast cancer detection in mammography and digital breast tomosynthesis: state of the art. Semin Cancer Biol. 2021;72:214–25.

Sheth D, Giger ML. Artificial intelligence in the interpretation of breast cancer on MRI. J Magn Reson Imaging. 2020;51(5):1310–24.

Khan N, Adam R, Huang P, Maldjian T, Duong TQ. Deep learning prediction of pathologic complete response in breast cancer using MRI and other clinical data: a systematic review. Tomography. 2022;8(6):2784–95.

Ayatollahi F, Shokouhi SB, Mann RM, Teuwen J. Automatic breast lesion detection in ultrafast DCE-MRI using deep learning. Med Phys. 2021;48(10):5897–907.

Feng H, Cao J, Wang H, Xie Y, Yang D, Feng J, Chen B. A knowledge-driven feature learning and integration method for breast cancer diagnosis on multi-sequence MRI. Magn Reson Imaging. 2020;69:40–8.

Antropova N, Abe H, Giger ML. Use of clinical MRI maximum intensity projections for improved breast lesion classification with deep convolutional neural networks. J Med Imaging (Bellingham). 2018;5(1):014503.

Fujioka T, Yashima Y, Oyama J, Mori M, Kubota K, Katsuta L, Kimura K, Yamaga E, Oda G, Nakagawa T, et al. Deep-learning approach with convolutional neural network for classification of maximum intensity projections of dynamic contrast-enhanced breast magnetic resonance imaging. Magn Reson Imaging. 2021;75:1–8.

Haarburger C, Baumgartner M, Truhn D, Broeckmann M, Schneider H, Schrading S, Kuhl C, Merhof D. Multi scale curriculum CNN for context-aware breast MRI malignancy classification. In: Medical Image Computing and Computer Assisted Intervention (MICCAI); 2019: 495–503.

Herent P, Schmauch B, Jehanno P, Dehaene O, Saillard C, Balleyguier C, Arfi-Rouche J, Jegou S. Detection and characterization of MRI breast lesions using deep learning. Diagn Interv Imaging. 2019;100(4):219–25.

Hu Q, Whitney HM, Giger ML. A deep learning methodology for improved breast cancer diagnosis using multiparametric MRI. Sci Rep. 2020;10(1):10536.

Li J, Fan M, Zhang J, Li L: Discriminating between benign and malignant breast tumors using 3D convolutional neural network in dynamic contrast enhanced-MR images. In: SPIE Medical Imaging : SPIE; 2017: 10138.

Liu MZ, Swintelski C, Sun S, Siddique M, Desperito E, Jambawalikar S, Ha R. Weakly supervised deep learning approach to breast MRI assessment. Acad Radiol. 2022;29(Suppl 1):S166–72.

Marrone S, Piantadosi G, Fusco R, Petrillo A, Sansone M, Sansone C: An investigation of deep learning for lesions malignancy classification in breast DCE-MRI. In: International Conference on Image Analysis and Processing (ICIAP) 2017: 479–489.

Rasti R, Teshnehlab M, Phung SL. Breast cancer diagnosis in DCE-MRI using mixture ensemble of convolutional neural networks. Pattern Recogn. 2017;72:381–90.

Truhn D, Schrading S, Haarburger C, Schneider H, Merhof D, Kuhl C. Radiomic versus convolutional neural networks analysis for classification of contrast-enhancing lesions at multiparametric breast MRI. Radiology. 2019;290(2):290–7.

Wu W, Wu J, Dou Y, Rubert N, Wang Y. A deep learning fusion model with evidence-based confidence level analysis for differentiation of malignant and benign breast tumors using dynamic contrast enhanced MRI. Biomed Signal Process Control. 2022;72:103319.

Yurttakal AH, Erbay H, İkizceli T, Karaçavuş S. Detection of breast cancer via deep convolution neural networks using MRI images. Multimed Tools Appl. 2020;79:15555–73.

Zheng H, Gu Y, Qin Y, Huang X, Yang J, Yang G-Z: Small lesion classification in dynamic contrast enhancement MRI for breast cancer early detection. In: International Conference on Medical Image Computing and Computer-Assisted Intervention; 2018.

Zhou J, Zhang Y, Chang KT, Lee KE, Wang O, Li J, Lin Y, Pan Z, Chang P, Chow D, et al. Diagnosis of benign and malignant breast lesions on DCE-MRI by using radiomics and deep learning with consideration of peritumor tissue. J Magn Reson Imaging. 2020;51(3):798–809.

Zhou J, Luo LY, Dou Q, Chen H, Chen C, Li GJ, Jiang ZF, Heng PA. Weakly supervised 3D deep learning for breast cancer classification and localization of the lesions in MR images. J Magn Reson Imaging. 2019;50(4):1144–51.

Alzubaidi L, Zhang J, Humaidi AJ, Al-Dujaili A, Duan Y, Al-Shamma O, Santamaria J, Fadhel MA, Al-Amidie M, Farhan L. Review of deep learning: concepts, CNN architectures, challenges, applications, future directions. J Big Data. 2021;8(1):53.

Li J, Chen J, Tang Y, Wang C, Landman BA, Zhou SK. Transforming medical imaging with transformers? A comparative review of key properties, current progresses, and future perspectives. Med Image Anal. 2023;85:102762.

Moutik O, Sekkat H, Tigani S, Chehri A, Saadane R, Tchakoucht TA, Paul A. Convolutional neural networks or vision transformers: Who will win the race for action recognitions in visual data? Sensors (Basel). 2023;23(2):734.

Chang JM, Leung JWT, Moy L, Ha SM, Moon WK. Axillary nodal evaluation in breast cancer: state of the art. Radiology. 2020;295(3):500–15.

Zhou P, Wei Y, Chen G, Guo L, Yan D, Wang Y. Axillary lymph node metastasis detection by magnetic resonance imaging in patients with breast cancer: a meta-analysis. Thorac Cancer. 2018;9(8):989–96.

Ren T, Cattell R, Duanmu H, Huang P, Li H, Vanguri R, Liu MZ, Jambawalikar S, Ha R, Wang F, et al. Convolutional neural network detection of axillary lymph node metastasis using standard clinical breast MRI. Clin Breast Cancer. 2020;20(3):e301–8.

Ren T, Lin S, Huang P, Duong TQ. Convolutional neural network of multiparametric MRI accurately detects axillary lymph node metastasis in breast cancer patients with pre neoadjuvant chemotherapy. Clin Breast Cancer. 2022;22(2):170–7.

Golden JA. Deep learning algorithms for detection of lymph node metastases from breast cancer: helping artificial intelligence be seen. JAMA. 2017;318(22):2184–6.

Gupta S, Kumar S, Chang K, Lu C, Singh P, Kalpathy-Cramer J. Collaborative privacy-preserving approaches for distributed deep learning using multi-institutional data. Radiographics. 2023;43(4):e220107.

Miotto R, Wang F, Wang S, Jiang X, Dudley JT. Deep learning for healthcare: review, opportunities and challenges. Brief Bioinform. 2018;19(6):1236–46.

Holzinger A, Langs G, Denk H, Zatloukal K, Muller H. Causability and explainability of artificial intelligence in medicine. Wiley Interdiscip Rev Data Min Knowl Discov. 2019;9(4):e1312.

Smallman M. Multi scale ethics-why we need to consider the ethics of AI in healthcare at different scales. Sci Eng Ethics. 2022;28(6):63.

Author information

Authors and affiliations.

Department of Radiology, Albert Einstein College of Medicine and the Montefiore Medical Center, 1300 Morris Park Avenue, Bronx, NY, 10461, USA

Richard Adam, Kevin Dell’Aquila, Laura Hodges, Takouhie Maldjian & Tim Q. Duong

Contributions

RA performed literature search, analyzed data, and wrote paper. KD performed literature search, analyzed data, and edited paper. LH analyzed literature and edited paper. TM analyzed literature and edited paper. TQD wrote and edited paper, and supervised.

Corresponding author

Correspondence to Tim Q. Duong.

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Adam, R., Dell’Aquila, K., Hodges, L. et al. Deep learning applications to breast cancer detection by magnetic resonance imaging: a literature review. Breast Cancer Res 25, 87 (2023). https://doi.org/10.1186/s13058-023-01687-4

Received: 09 March 2023

Accepted: 11 July 2023

Published: 24 July 2023

DOI: https://doi.org/10.1186/s13058-023-01687-4

  • Machine learning
  • Artificial intelligence
  • Texture feature analysis
  • Convolutional neural network
  • Dynamic contrast enhancement

Breast Cancer Research

ISSN: 1465-542X

Literature review of breast cancer detection using machine learning algorithms

Basem S. Abunasser, Mohammed Rasheed J. AL-Hiealy, Ihab S. Zaqout, Samy S. Abu-Naser; Literature review of breast cancer detection using machine learning algorithms. AIP Conf. Proc. 22 May 2023; 2808 (1): 040006. https://doi.org/10.1063/5.0133688

Cancer is the leading cause of non-accidental deaths worldwide; nearly 10 million people died from cancer globally in 2020. Breast cancer (BC) is a common and fatal disease among women worldwide and ranks fourth in fatality among cancers such as cervical, colorectal, and brain tumors. In addition, the number of new breast cancer patients is expected to increase by 70% in the next 20 years. Therefore, early and accurate diagnosis plays a pivotal role in improving prognosis and increasing the survival rate of cancer patients from 30 to 50%. With technical advances in healthcare, machine learning and deep learning play an important role in processing and analyzing large numbers of medical images. The aim of this study is to identify studies on the application of classification techniques in diagnosing BC and to analyze them from four perspectives: classification techniques used, dataset used, programming language used, and best accuracy. We conducted a systematic literature review of 32 selected studies published between 2002 and 2020. The results showed that, among the classification techniques examined, artificial neural networks, support vector machines, and k-nearest neighbor were the most widely used. Moreover, artificial neural networks, support vector machines, and group classifiers performed better than other techniques, with average accuracy values between 83.45% and 99.30%. Most of the selected studies used the Wisconsin CSV dataset, and a few used different types of images such as mammography, ultrasound, and micro images.

  • Online ISSN 1551-7616
  • Print ISSN 0094-243X

  • Open access
  • Published: 31 January 2024

Unveiling promising breast cancer biomarkers: an integrative approach combining bioinformatics analysis and experimental verification

  • Ali Golestan 1 , 2 ,
  • Ahmad Tahmasebi 3 ,
  • Nafiseh Maghsoodi 1 , 2 ,
  • Seyed Nooreddin Faraji 4 ,
  • Cambyz Irajie 1 &
  • Amin Ramezani 1 , 2  

BMC Cancer volume 24, Article number: 155 (2024)

Background

Breast cancer remains a significant health challenge worldwide, necessitating the identification of reliable biomarkers for early detection, accurate prognosis, and targeted therapy.

Materials and methods

Breast cancer RNA expression data from the TCGA database were analyzed to identify differentially expressed genes (DEGs). The top 500 up-regulated DEGs were selected for further investigation using random forest analysis to identify important genes. These genes were evaluated based on their potential as diagnostic biomarkers, their overexpression in breast cancer tissues, and their low median expression in normal female tissues. Various validation methods, including online tools and quantitative Real-Time PCR (qRT-PCR), were used to confirm the potential of the identified genes as breast cancer biomarkers.

Results

The study identified four overexpressed genes (CACNG4, PKMYT1, EPYC, and CHRNA6) among 100 genes with higher importance scores. qRT-PCR analysis confirmed the significant upregulation of these genes in breast cancer patients compared to normal samples.

Conclusions

These findings suggest that CACNG4, PKMYT1, EPYC, and CHRNA6 may serve as valuable biomarkers for breast cancer diagnosis, and PKMYT1 may also have prognostic significance. Furthermore, CACNG4, CHRNA6, and PKMYT1 show promise as potential therapeutic targets. These findings have the potential to advance diagnostic methods and therapeutic approaches for breast cancer.

Introduction

Breast cancer has become the most common cancer in women, surpassing lung cancer as the leading cause of cancer incidence. The high incidence rate of the disease, with more than 2.3 million new cases each year, continues to be a cause for concern [ 1 ]. Due to the variation in molecular traits, histological features, and clinical outcomes, breast cancer is classified into several subtypes, providing valuable insights into the disease and aiding treatment planning. Breast cancer is typically divided into six subgroups based on their molecular characteristics: basal-like, claudin-low, normal-like, luminal A and B, and HER2-positive. These subgroups have unique molecular profiles that distinguish their characteristics. The basal-like and claudin-low subtypes of triple-negative breast cancer (TNBC) lack expression of estrogen receptor (ER), progesterone receptor (PR), and HER2. These subtypes are associated with a higher risk of disease relapse and a greater likelihood of developing visceral metastases [ 2 ].

Biomarkers play a crucial role in identifying and predicting outcomes as well as therapeutic approaches for breast cancer. However, some commonly used biomarkers, including carcinoembryonic antigen (CEA), CA 15-3, and CA 27–29, have insufficient sensitivity and specificity, making them unsuitable for detecting breast cancer. They are recommended for monitoring disease progression and evaluating treatment response, particularly in patients with metastatic breast cancer [ 3 ]. On the other hand, biomarkers such as ER, PR, and HER2 have been extensively used in the management of breast cancer. They provide valuable information for prognosis and serve as targets for targeted therapy and hormone therapy [ 4 ]. In the pursuit of advancing breast cancer diagnosis and treatment, it is crucial for researchers to gain a comprehensive understanding of the molecular pathways that underlie breast carcinogenesis. Despite years of dedicated research into breast cancer patients, the overall 5-year survival rate remains unsatisfactory [ 5 ]. Consequently, there is a significant need for the discovery of reliable and novel biomarkers to aid in the early detection of breast cancer, enhance prognostic accuracy, enable precise prediction of disease behavior, and facilitate the development of targeted therapeutic approaches.

High throughput gene expression technologies provide comprehensive genetic information on cancer samples and identify changes in disease progression [ 6 , 7 , 8 ]. High throughput data like genomics, epigenomics, and transcriptomics in online databases were mined to identify potentially novel cancer-associated biomarkers. Recently, machine learning models such as support vector machine (SVM) and random forest have become attractive strategies for obtaining gene signatures.

The study identified new genes associated with breast cancer using large-scale transcriptomics data and the random forest technique. The expression of these genes in breast cancer tissues was validated using qRT-PCR and compared to normal tissues.

Data collection and differential expression analysis

The RNA expression data for breast cancer were obtained from TCGA using the TCGAbiolinks package [ 7 ]. Differential expression analysis between breast cancer and normal samples was then performed using the edgeR Bioconductor package [ 7 ]. Differentially expressed genes (DEGs) were identified based on an absolute fold change > 2 and a false discovery rate (FDR) < 0.01. The top 500 up-regulated DEGs were selected for further analysis.
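The selection thresholds can be illustrated with a short Python sketch. This is not the authors' R workflow. It assumes that an absolute fold change > 2 corresponds to |log2 fold change| > 1, that the FDR is a Benjamini-Hochberg adjusted p-value, and that up-regulated genes are ranked by log2 fold change; the source does not state the ranking criterion. The function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (FDR), returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_up_degs(genes, log2fc, pvalues, top_n=500, lfc_cut=1.0, fdr_cut=0.01):
    """Return up to top_n up-regulated DEG names.

    A gene is a DEG when |log2FC| > lfc_cut and FDR < fdr_cut; only genes
    with positive log2FC are kept, ranked by log2FC (an assumed criterion).
    """
    fdr = benjamini_hochberg(pvalues)
    up = [
        (lfc, g)
        for g, lfc, q in zip(genes, log2fc, fdr)
        if abs(lfc) > lfc_cut and q < fdr_cut and lfc > 0
    ]
    up.sort(reverse=True)
    return [g for _, g in up[:top_n]]
```

The adjustment is computed over all tested genes before filtering, so the FDR threshold keeps its meaning across the full set of comparisons.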

To examine the altered expression of the selected genes, we assessed their expression levels in the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) database [ 9 ].

Screening of important genes based on the random forest method

A random forest analysis was conducted to identify key genes. The expression data obtained from the TCGA database were first normalized as log2-transformed upper-quartile fragments per kilobase of transcript per million mapped reads (UQ-FPKMs). Feature selection was then performed using the random forest classifier in the R package 'randomForest' to identify the most important gene features among the up-regulated DEGs. The random forest model's Gini index was used to measure how well each gene discriminated between normal and cancer samples [ 10 ]. A higher Gini index value indicates greater relevance and importance of the gene in the classification process. Finally, the genes were ranked by importance, and the top 100 genes with the highest Gini index values were selected as candidate feature genes for further investigation.
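To make the Gini-based ranking concrete, the sketch below computes the Gini impurity decrease for the best single threshold split on one gene. It is an illustrative simplification in Python, not the 'randomForest' implementation: a random forest accumulates such decreases over all splits that use a gene and averages them across trees. The function names and the toy gene labels are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_gini_decrease(values, labels):
    """Largest impurity decrease from one threshold split on a gene's expression."""
    n = len(labels)
    parent = gini_impurity(labels)
    pairs = sorted(zip(values, labels))
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical expression values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / n
        best = max(best, parent - weighted)
    return best


def rank_genes(expression, labels):
    """Rank genes (name -> expression list) by single-split Gini decrease, highest first."""
    scored = [(best_gini_decrease(vals, labels), gene) for gene, vals in expression.items()]
    scored.sort(reverse=True)
    return [(gene, score) for score, gene in scored]
```

A gene whose expression separates tumor from normal samples cleanly yields a large decrease, while a gene unrelated to the class labels yields a small one.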

Expression profiling analysis

The online database GEPIA2 ( http://gepia2.cancer-pku.cn/#index ) is a valuable resource that provides data derived from the Genotype-Tissue Expression (GTEx) and the TCGA databases. This database contains a comprehensive collection of RNA sequencing data for both cancer and normal tissues [ 11 ]. The GEPIA2 database was used to determine the expression levels of selected genes and their profiles based on pathological stages. Overexpressed genes in breast cancer tissues were identified in comparison to normal tissues. The UCSC Xena ( http://xena.ucsc.edu ) platform [ 12 ] and the UALCAN web resource ( https://ualcan.path.uab.edu/ ) [ 13 ] were also used to compare tumor samples derived from TCGA and normal samples to validate the upregulation of selected genes in the breast cancer sample types.

Subcellular localization study

GeneCards ( https://www.genecards.org/ ) is a comprehensive and integrative database that provides information on all predicted and known human genes, including concise genomic, transcriptomic, proteomic, genetic, and functional information [ 14 ]. The GeneCards database was employed to undertake an initial evaluation of the subcellular location of each protein.

Clinico-pathological variables associated with selected genes

Breast Cancer Gene-Expression Miner v4.8 analysis (bc-GenExMiner v4.8) was employed to assess the association between the expression pattern of selected genes and various clinico-pathological variables in breast cancer. These variables included Scarff-Bloom-Richardson grade status (SBR1, SBR2, and SBR3), BRCA1/2 status (Wild type and Mutated), and PAM50 subtypes (Basal-like, HER-2, Luminal A, Luminal B, and Normal breast-like) [ 15 ]. To determine statistically significant differences between groups, we employed Welch's test followed by the Dunnett-Tukey-Kramer's test. We considered a p-value of less than 0.05 as significant. The UALCAN web resource ( https://ualcan.path.uab.edu/ ) [ 13 ] was used to assess further clinico-pathological features, including nodal metastasis status, TP53 mutation status, and the patient’s gender. Student’s t-test was employed to assess the differences in transcriptional expression.
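For two groups, Welch's test reduces to the unequal-variance t statistic with Welch-Satterthwaite degrees of freedom, sketched below in Python. This covers only the two-group case and does not reproduce bc-GenExMiner's multi-group procedure or the Dunnett-Tukey-Kramer post hoc test. The function name is hypothetical, and converting the statistic to a p-value requires the t distribution, which is outside the standard library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    a and b are lists of expression values for two groups; neither group is
    assumed to share the other's variance.
    """
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

Unlike Student's t-test, the degrees of freedom here shrink when the two groups have very different variances or sizes, which makes the test more conservative in that setting.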

Functional enrichment analysis

GEPIA2 was used to identify genes with similar expression patterns, ranked by Pearson correlation coefficient (PCC). This facilitated the identification of closely related genes [ 9 ]. In addition, cBioPortal was used to identify genes positively associated with our selected genes through co-expression network analysis. The FunRich tool 3.1.3 [ 16 ] was then used to perform Gene Ontology (GO) and biological pathway enrichment analyses on the overlapping genes obtained from the GEPIA2 and cBioPortal databases. Furthermore, we conducted gene set enrichment analysis (GSEA) utilizing the GSEA software to investigate hallmark gene sets showing significant enrichment [ 17 ]. The expression levels of shared genes sourced from the GEPIA2 and cBioPortal databases were employed to assess the correlation between a gene set and a specific phenotype.

Genetic alteration and somatic mutation analysis

The cBioPortal database ( https://www.cbioportal.org/ ) was used to assess the genetic alterations of the selected genes [ 9 ]. The spectrum of genomic alterations, including mutations and putative copy-number alterations (CNAs), was analysed using default parameters and the GISTIC (Genomic Identification of Significant Targets in Cancer) algorithm. Additionally, the COSMIC database (cancer.sanger.ac.uk) [ 18 ] was employed to investigate the somatic mutations in the candidate genes.

Survival analysis

The association between mRNA expression levels of the selected genes and overall survival (OS) outcomes in breast cancer patients was investigated using the Kaplan-Meier (KM) plotter database ( https://kmplot.com/analysis/ ) [ 19 ]. Statistical significance in the analysis was determined using a log-rank p-value threshold of less than 0.05.
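The survival curves behind such an analysis are Kaplan-Meier estimates. The sketch below shows the product-limit estimator in Python for a single group, assuming follow-up times and event indicators (1 = death, 0 = censored) are available. It does not reproduce the KM plotter's cutoff selection or the log-rank test, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    Returns a list of (time, survival_probability) at each distinct time
    where at least one event occurred. events: 1 = death, 0 = censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # deaths and censored subjects both leave the risk set
    return curve
```

In an expression-based analysis, one curve would be estimated for patients with high expression of a gene and another for patients with low expression, and the two would then be compared with a log-rank test.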

Assessment of selected genes as potential therapeutic targets

To assess the potential impact of the selected genes on breast cancer cell growth and survival and to explore their suitability as potential therapeutic targets, the Cancer Dependency Map ( https://depmap.org/portal/ ) was used [ 20 ].

In vitro mRNA expression quantification

Tissue sample preparation

Fifty-five female breast cancer patients were included in this study, conducted at MRI Hospital in Shiraz, Iran, between 2015 and 2019. Patients were selected based on molecular pathology tests, biopsy results, and imaging techniques. Surgical procedures were performed to obtain samples of breast cancer tissues (BCT) as well as non-tumoral adjacent tissues (NTAT), serving as the normal control. The collected tissues were promptly frozen and stored at -70°C. Before tissue collection, none of the patients had undergone any form of treatment. Careful collection of tumor tissues from non-necrotic areas ensured that over 90% of the samples were of high quality. The research protocol received ethical approval from the Research Ethics Committee of Shiraz University of Medical Sciences (Approval ID: IR.SUMS.REC.1401.215).

RNA extraction and cDNA synthesis

The RNX-Plus buffer (CinnaGen, Iran) was used to extract total RNA from snap-frozen tissues, following the manufacturer's instructions. To prevent potential contamination with genomic DNA, DNase I treatment was applied during the RNA extraction process. Afterward, cDNA synthesis was conducted using a cDNA synthesis kit (Fermentas, Lithuania), which combined oligo-dT primers and random hexamers.

The ABI StepOne instrument (Applied Biosystems, USA) was used to perform the qRT-PCR experiments in 48-well microtitre plates. Each reaction had a total volume of 20 µL and contained primers specific for the target genes and ABI SYBR Green master mix. The amplification program comprised an initial denaturation step at 95 °C for 10 minutes, followed by 45 cycles of denaturation at 95 °C for 20 seconds and annealing/extension at 60 °C for 60 seconds. To ensure accurate quantification, the CtNorms algorithm was applied to normalize amplification efficiency [ 21 ]. Melt curve analysis was performed to confirm the specificity of the qRT-PCR products. Relative expression was calculated using the 2^(-ΔΔCt) method, a widely used approach for comparing gene expression levels between samples. Specific primers for the target genes (CACNG4, PKMYT1, EPYC, and CHRNA6) and the internal control gene (Actin Beta) were designed using Allele ID 7 software (see Table 1 ). Each sample was tested in triplicate, and the final result was determined from the average Ct value. To validate the qRT-PCR results and exclude the possibility of genomic DNA contamination, additional PCR reactions were conducted on extracted RNA samples without reverse transcription. This step was implemented to confirm that the PCR products originated from complementary DNA (cDNA) and not from genomic DNA.
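The relative quantification step can be sketched as follows. This is a minimal illustration of the standard 2^(-ΔΔCt) calculation, assuming triplicate Ct values are averaged before normalization; the function name and all Ct values are hypothetical, and the efficiency correction by CtNorms is not reproduced here.

```python
from statistics import mean

def fold_change(ct_target_tumor, ct_ref_tumor, ct_target_normal, ct_ref_normal):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of replicate Ct values:
    target gene and reference gene (e.g. Actin Beta), in tumor and in
    matched normal tissue.
    """
    # dCt: target gene normalized to the reference gene within each tissue
    d_ct_tumor = mean(ct_target_tumor) - mean(ct_ref_tumor)
    d_ct_normal = mean(ct_target_normal) - mean(ct_ref_normal)
    # ddCt: tumor relative to normal
    dd_ct = d_ct_tumor - d_ct_normal
    return 2 ** (-dd_ct)

# Hypothetical triplicate Ct values for one patient (not data from this study).
fc = fold_change(
    ct_target_tumor=[24.0, 24.1, 23.9],
    ct_ref_tumor=[18.0, 18.0, 18.0],
    ct_target_normal=[26.0, 26.1, 25.9],
    ct_ref_normal=[18.0, 18.0, 18.0],
)
```

In this toy case the target gene crosses the threshold two cycles earlier in tumor than in normal tissue after normalization, which corresponds to an approximately four-fold higher expression.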

Clinico-pathological data collection

Clinico-pathological data, including age, human epidermal growth factor receptor 2 (HER2) status, progesterone receptor (PR) status, estrogen receptor (ER) status, lymph node (LN) involvement, and molecular breast cancer subtype (luminal, HER2 overexpressed, and TNBC), were collected from the patients' medical records. The data were then compiled and analyzed to assess their association with CACNG4, PKMYT1, EPYC, and CHRNA6 expression patterns in the breast cancer tissue samples. This step was important in determining the potential clinical relevance of these genes as biomarkers for breast cancer diagnosis and prognosis. Finally, to assess the potential correlation between the expression patterns of these genes, a nonparametric Spearman correlation coefficient was calculated using the expression data obtained from the qRT-PCR analysis.

Statistical analysis

The data were analyzed using GraphPad Prism 9.4.0 software (GraphPad Software, Inc., USA). A paired t-test was employed to compare the mean expression of CACNG4, PKMYT1, EPYC, and CHRNA6 between BCT and NTAT tissues. The Mann-Whitney test was used to compare the normalized expression ratio across the clinico-pathological features of the study population, and the Kruskal-Wallis test was applied to examine the variations among breast cancer subtypes. The nonparametric Spearman correlation coefficient was used to measure the expression correlation between these genes (CACNG4, PKMYT1, EPYC, and CHRNA6).
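As an illustration of the correlation measure named above, the following minimal sketch computes Spearman's coefficient as the Pearson correlation of average ranks. The analysis in this study was run in GraphPad Prism; the helper names and input values below are hypothetical and only show what the statistic measures.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current block of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # 1-based average rank of the tie block
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)

# Hypothetical relative expression values for two genes in five samples.
gene_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_2 = [2.0, 4.0, 8.0, 16.0, 32.0]   # monotonically increasing with gene_1
rho_up = spearman_rho(gene_1, gene_2)
rho_down = spearman_rho(gene_1, gene_2[::-1])
tied = average_ranks([10, 20, 20, 30])
```

Because the coefficient depends only on ranks, a monotonic but non-linear relationship such as the one in this toy example still yields a coefficient of essentially 1.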

Screening of important genes

The RNA expression data were extracted from the TCGA database to identify the differentially expressed genes (DEGs) between breast invasive carcinoma (BRCA) and normal samples. A random forest algorithm was then used to score the importance of the up-regulated DEGs and identify important gene features (Additional files 1 & 2 ). The 100 genes with higher importance scores were selected and subjected to a stepwise filtering process. First, genes overexpressed in breast cancer tissues compared to normal tissues were identified using the UCSC Xena server, UALCAN, and GEPIA2 databases. GEPIA2 was then used to identify genes with a low median expression in normal female tissues. Finally, the relevant scientific literature was reviewed to identify a novel panel of potential diagnostic biomarkers among the remaining DEGs. Through this screening process, four genes, namely CACNG4, PKMYT1, EPYC, and CHRNA6, were identified, and the prognostic and therapeutic implications of these selected genes were then investigated using various databases. Based on the METABRIC database, CACNG4, PKMYT1, and CHRNA6 exhibited differential expression in breast cancer tissues compared to normal tissues, whereas EPYC did not (Additional file 3 ).

Profiles of mRNA expression

The GEPIA2 database was used to compare the expression levels of the selected genes between breast cancer patient tissues and normal subjects. The analysis revealed higher expression of CACNG4, PKMYT1, EPYC, and CHRNA6 in BRCA tissues than in normal breast tissues, although the difference was statistically significant only for CACNG4 and PKMYT1, as depicted in Fig. 1A. The differential expression of these four genes was also analyzed using the UCSC Xena tool and the UALCAN web resource ( https://ualcan.path.uab.edu/ ). In both resources, the expression levels of all four selected genes were significantly higher in breast cancer than in normal tissues, as presented in Additional file 4 . In addition, the expression level of PKMYT1 ( P = 0.006) differed across tumor stages, whereas the expression levels of CACNG4 ( P = 0.06), EPYC ( P = 0.4), and CHRNA6 ( P = 0.2) did not show statistically significant differences, as illustrated in Fig. 1B.

figure 1

Expression analysis and stage correlation in Breast Invasive Carcinoma (BRCA) patients from GEPIA2 Database. A Expression level of CACNG4 , PKMYT1 , EPYC , and CHRNA6 between BRCA and normal breast tissues. B Correlation with Tumor Stages. TPM, transcripts per million

Prediction of subcellular localization

The GeneCards database was employed to determine the subcellular localization of the identified proteins. Considering the significance of cell membrane proteins as therapeutic targets, the identified genes were investigated for subcellular localization. CACNG4 and CHRNA6 were predicted to be localized to the plasma membrane. In contrast, PKMYT1 was predicted to be localized in the nucleus and cytosol, while EPYC was predicted to be located in the extracellular matrix (Additional file 5 ).

Investigating the correlation of expression with clinico-pathological parameters

We investigated the expression levels of CACNG4, PKMYT1, EPYC, and CHRNA6 in breast cancer patients categorized by Scarff-Bloom-Richardson grade (SBR1, SBR2, and SBR3), BRCA1/2 status (wild type and mutated), and PAM50 subtype (Basal-like, HER-2, Luminal A, Luminal B, and Normal breast-like) using the Breast Cancer Gene-Expression Miner v4.8 database (bc-GenExMiner v4.8) (see Additional files 6 & 7 ). The results revealed significant differences in the expression levels of CACNG4, PKMYT1, and CHRNA6 among SBR grades: CACNG4 (SBR1 and SBR2 > SBR3; SBR1 = SBR2), PKMYT1 (SBR3 > SBR2 > SBR1), and CHRNA6 (SBR1 and SBR2 > SBR3; SBR1 = SBR2). However, no difference was observed in the expression of EPYC among SBR1, SBR2, and SBR3, as displayed in Additional file 7 (Supplementary Fig. 4A). When comparing BRCA1/2 status, no significant differences in the expression levels of CACNG4 and EPYC were found between wild-type and mutated BRCA1/2. In contrast, the expression level of PKMYT1 was higher in the mutated group than in the wild-type group, and the expression level of CHRNA6 was lower in the mutated group than in the wild-type group, as shown in Additional file 7 (Supplementary Fig. 4A). Furthermore, this analysis revealed that the expression of CACNG4, PKMYT1, EPYC, and CHRNA6 varied across BRCA subtypes relative to normal breast-like: CACNG4 (HER-2, Luminal A, and Luminal B > normal breast-like; basal-like < normal breast-like), PKMYT1 (basal-like, HER-2, and Luminal B > normal breast-like; Luminal B < normal breast-like), EPYC (Luminal A and Luminal B > normal breast-like; basal-like < normal breast-like; HER-2 = normal breast-like), and CHRNA6 (HER-2, Luminal A, Luminal B, and basal-like > normal breast-like), as can be seen in Additional file 7 (Supplementary Fig. 4A).
To supplement our findings, we analyzed the expression of CACNG4, PKMYT1, EPYC, and CHRNA6 in relation to nodal metastasis status (Normal, N0, N1, N2, N3), TP53 mutation status (Normal, TP53 mutant, and TP53 non-mutant), and patient gender (Normal cases, Male and Female patients) using the UALCAN database. The findings revealed significant variations in the mRNA expression levels of CACNG4, PKMYT1, EPYC, and CHRNA6 by nodal metastasis status (N0, N1, N2, N3 > Normal) and TP53 mutation status (TP53 mutant and TP53 non-mutant > Normal) (Additional file 7 , Supplementary Fig. 4B). Furthermore, the evaluation by gender demonstrated that CACNG4, PKMYT1, and CHRNA6 mRNA expression levels were significantly higher in male and female patients than in normal cases. Notably, the expression level of EPYC mRNA was higher in female patients than in normal cases (p < 0.05), whereas there was no significant difference in EPYC mRNA expression between male patients and normal cases (p > 0.05), as illustrated in Additional file 7 (Supplementary Fig. 4B).

Functional and pathway enrichment analysis

The GEPIA2 database (BRCA dataset) and the cBioPortal dataset (TCGA, PanCancer Atlas) were used to select the genes co-expressed with CACNG4, PKMYT1, EPYC, and CHRNA6. The gene lists from these two databases were intersected to identify the common genes. The common co-expressed genes were subjected to gene ontology (GO) and pathway analysis using the FunRich tool (version 3.1.1) (detailed data are supplied in Additional files 8 to 12). GO enrichment analysis categorizes gene functions into three distinct groups: biological process (BP), molecular function (MF), and cellular component (CC). Based on GO analysis, the common co-expressed genes of CACNG4 were enriched in the plasma membrane, transport activity, and signal transduction subcategories (Additional file 8 ).

Similarly, the PKMYT1 common co-expressed genes were enriched in the nucleus, DNA binding, and cell cycle subcategories (Additional file 9 ). In contrast, the EPYC common co-expressed genes were enriched in the extracellular matrix, extracellular matrix constituent, and cell growth subcategories (Additional file 10 ). In addition, the CHRNA6 common co-expressed genes were enriched in the cytoplasm, transcription regulation activity, and cell communication subcategories (Additional file 11 ). Our biological pathway enrichment analysis revealed that CACNG4 was significantly associated with the ErbB receptor signaling network and the mTOR signaling pathway, while PKMYT1 was enriched in the DNA replication and cell cycle pathways. Furthermore, EPYC was mainly associated with epithelial-to-mesenchymal transition, and CHRNA6 was involved in the ErbB receptor signaling network and signal transduction. We also conducted GSEA to pinpoint the most notable hallmark gene sets. The results of the GSEA showed enrichment in 5 gene sets (Additional file 13 ).
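The two computational steps described in this section, intersecting co-expressed gene lists and testing a category for over-representation, can be sketched as follows. This is a minimal illustration that assumes a one-sided hypergeometric test, which is a common choice for GO enrichment; whether it matches the exact settings used by FunRich in this study is an assumption. All gene names and counts are placeholders.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    population : number of background genes
    annotated  : background genes carrying the GO term
    sample     : number of genes in the query list
    overlap    : query genes carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Placeholder co-expression lists from the two resources (not real results).
gepia2_coexpressed = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
cbioportal_coexpressed = {"GENE_B", "GENE_C", "GENE_E"}
common_genes = gepia2_coexpressed & cbioportal_coexpressed

# Toy enrichment: all 3 query genes fall in a term annotating 3 of 10 background genes.
p_enriched = hypergeom_enrichment_p(population=10, annotated=3, sample=3, overlap=3)
# Requiring an overlap of at least 0 is always satisfied, so the p-value is 1.
p_trivial = hypergeom_enrichment_p(population=100, annotated=10, sample=5, overlap=0)
```

In practice the p-values from many GO terms would also need a multiple-testing correction before any term is called enriched.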

Genetic alterations and somatic mutations in selected genes

The cBioPortal database was used to investigate the frequency of genetic alterations in CACNG4, PKMYT1, EPYC, and CHRNA6 in BRCA. Our findings indicate an overall alteration frequency of 37% across all queried genes, as shown in Additional file 14 . Notably, CACNG4 was the most frequently altered gene (16% of cases), with the alterations corresponding to mRNA upregulation. For CHRNA6, amplification was the most common alteration (6.55%). Furthermore, the COSMIC database was used to analyze the mutations of CACNG4, PKMYT1, EPYC, and CHRNA6 in breast cancer. Additional file 14 provides details on the types of mutations observed in these four genes. For all four genes, missense mutations accounted for the largest proportion of cases.

Prognostic potential of selected genes

Kaplan-Meier plotter analysis showed that higher expression of PKMYT1 was significantly associated with unfavorable overall survival (OS) in BRCA (OS hazard ratio (HR) = 1.38, log-rank P = 0.00074), as illustrated in Fig. 2A. In contrast, high expression of CHRNA6 was associated with a potentially favorable prognosis (OS HR = 0.81, log-rank P = 0.025) (Fig. 2B). There was no statistically significant difference in OS for CACNG4 (OS HR = 0.84, log-rank P = 0.06) (Fig. 2C) or EPYC (OS HR = 1.04, log-rank P = 0.72) (Fig. 2D).

figure 2

Survival analysis of query genes in BRCA patients from the Kaplan-Meier plotter database. Overall survival curves of (A) PKMYT1 , (B) CHRNA6 , (C) CACNG4 , and (D) EPYC were analyzed. A log-rank p -value below the 0.05 threshold indicates a statistically significant association. HR, Hazard ratio

Assessment of tumor cell line dependency on CACNG4, PKMYT1, EPYC, and CHRNA6

Tumor cell growth and survival depend on the expression of certain genes, which makes those genes candidate therapeutic targets. The Cancer Dependency Map (DepMap) is a resource that can help identify genes associated with tumor cell viability. In this study, the DepMap tool was used to evaluate the importance of the identified biomarkers for breast tumor cell growth and survival. Specifically, siRNA and CRISPR screening data were analyzed to determine the likelihood that breast cancer cell lines depend on the identified genes, as indicated by the dependency scores. A lower Chronos score suggests a higher likelihood that the gene of interest is essential in a particular cell line. Among our identified genes, PKMYT1 exhibited a significant dependency score in the CRISPR knockout data (a lower Chronos score), while CACNG4, EPYC, and CHRNA6 did not display substantial dependency scores, as indicated in Additional file 15 .
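The interpretation of dependency scores can be illustrated with a short sketch. It assumes the usual DepMap convention in which a gene effect score near 0 indicates no effect of knockout and a score near -1 is comparable to the median of common essential genes. The cutoff of -0.5, the helper name, and all gene labels and scores below are illustrative assumptions, not values from this study.

```python
from statistics import mean

# Assumed cutoff for calling a gene a likely dependency; not taken from the source.
DEPENDENCY_CUTOFF = -0.5

def dependent_genes(scores_by_gene, cutoff=DEPENDENCY_CUTOFF):
    """Return {gene: mean score} for genes whose mean score across cell lines
    falls below the cutoff (lower score = stronger dependency)."""
    return {
        gene: mean(scores)
        for gene, scores in scores_by_gene.items()
        if mean(scores) < cutoff
    }

# Placeholder Chronos-style scores across three breast cancer cell lines.
hypothetical_scores = {
    "GENE_X": [-1.0, -0.5, -1.5],   # consistently negative: likely dependency
    "GENE_Y": [0.0, 0.25, -0.25],   # centred on zero: no evidence of dependency
}
flagged = dependent_genes(hypothetical_scores)
```

Averaging across cell lines is only one possible summary; inspecting per-cell-line scores would show whether a dependency is selective for particular subtypes.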

mRNA expression quantification

qRT-PCR analysis of tissues from 55 breast cancer patients showed that CACNG4, PKMYT1, EPYC, and CHRNA6 mRNA levels were significantly higher in breast cancer tissues (BCT) than in non-tumoral adjacent tissues (NTAT) ( P < 0.0001). As depicted in Fig. 3 , paired t-tests comparing the expression of CACNG4, PKMYT1, EPYC, and CHRNA6 between breast tumors and normal tissues revealed approximately 5.55-fold, 2.31-fold, 2.32-fold, and 2.14-fold increases in breast tumors, respectively. Notably, PCR reactions performed on extracted RNA samples without reverse transcription showed no amplification.

figure 3

The relative gene expression levels of CACNG4 , PKMYT1 , EPYC , and CHRNA6 were compared between non-tumoral adjacent tissues (NTAT) and breast cancer tissues (BCT) using the qRT-PCR technique. The findings revealed a significant upregulation of mRNA expression levels for CACNG4 , PKMYT1 , EPYC , and CHRNA6 in BCT compared to NTAT (**** indicates a p-value less than 0.0001)

Association of gene expression pattern with clinico-pathological characteristics

An evaluation of the clinico-pathological features of the breast cancer patients revealed that the mean age was 57.6 ± 11.5 (37-76) years. Most patients (77.5%) were diagnosed with early-stage breast cancer, specifically stages I and II, and 47% had positive lymph node involvement. Fig.  4 and Additional file 16 show the association between the relative expression of CACNG4 , PKMYT1 , EPYC , and CHRNA6 genes with clinico-pathological features, including age, ER, PR, HER2 status, TNM stages, and histological grades, which were assessed using the Mann-Whitney test. Analysis of CACNG4 mRNA expression unveiled a significant upregulation in patients with grade III breast cancer compared to patients with grade I and II tumors. Additionally, a significant increase in CACNG4 mRNA expression was observed in ER-positive breast cancer patients compared to ER-negative cases, as illustrated in Fig.  4 A. PKMYT1 mRNA expression was up-regulated in patients aged 50 years and older, as well as in patients with HER2-positive status, as displayed in Fig.  4 B. Analysis revealed a significant increase in EPYC mRNA expression levels in hormone receptor-positive (HR+) breast cancer patients compared to HR-negative (HR-) individuals. Moreover, a substantial upregulation of EPYC mRNA expression was observed in late-stage (III+IV) breast cancer patients compared to early-stage (I+II) patients, as depicted in Fig.  4 C. Fig.  4 D shows that CHRNA6 mRNA expression was higher in patients over 50 years of age, as well as in patients with PR-positive and HER2-positive status. The association of these identified genes with lymph node status and breast cancer subtypes was also evaluated, but no significant differences were observed. Therefore, these findings were not included in Fig.  4 . Nonparametric Spearman correlation analysis was conducted to assess the strength of association between the identified genes. 
A significant positive correlation was observed between CACNG4 and EPYC mRNA expression levels (Spearman's correlation coefficient = 0.87, P value < 0.0001), and between CACNG4 and CHRNA6 mRNA expression levels (Spearman's correlation coefficient = 0.5, P value = 0.0008). However, no significant correlations were observed between PKMYT1 and EPYC , PKMYT1 and CACNG4 , PKMYT1 and CHRNA6 , as well as EPYC and CHRNA6 mRNA expression levels in breast cancer patients (Fig.  5 ).

figure 4

Association of gene expression patterns of CACNG4 , PKMYT1 , EPYC , and CHRNA6 with clinico-pathological features in breast cancer patients. The expression patterns of these genes with age, ER, PR, and HER2 status, TNM stages, and histological grades. Statistical significance was indicated using asterisks (* p < 0.05, ** p < 0.01, and *** p < 0.001). ER: Estrogen Receptor; PR: Progesterone Receptor; HER2: Human Epidermal Growth Factor Receptor 2; and TNM: Tumor, Node, Metastasis

figure 5

Correlation analysis. Spearman correlation analysis based on qRT-PCR results of 55 breast cancer patients showed a significant positive correlation between CACNG4 and EPYC mRNA expression (Spearman's correlation = 0.87, P < 0.0001), and a positive correlation was observed between CACNG4 and CHRNA6 mRNA expression (Spearman's correlation = 0.5, P = 0.0008)

There is an urgent need to characterize new biomarkers that can facilitate early detection of breast cancer and overcome the limitations of mammography and the challenges of current tumor biomarkers such as CA-125 and CEA [ 22 , 23 ].

In this study, RNA expression data were obtained from TCGA to identify DEGs between BRCA and normal samples. The up-regulated genes were then analyzed using the random forest algorithm to identify the most important genes. These key genes were further filtered based on their overexpression in breast cancer tissues, low median expression in normal female tissues, and potential as novel diagnostic biomarkers. Four genes were identified from this screening: CACNG4, PKMYT1, EPYC, and CHRNA6. Integrated online bioinformatics databases were used to gain insight into the diagnostic, prognostic, and therapeutic roles of these potential biomarkers. Analysis using the UCSC Xena tool confirmed higher expression of these four genes in breast cancer tissues than in normal tissues. In vitro quantification in breast tumor tissues further confirmed the overexpression of these newly identified BRCA biomarkers. The association of these genes with various clinico-pathological parameters in breast cancer patients suggests that they could be used as potential therapeutic biomarkers. Biological pathway analysis revealed the involvement of these genes in the regulation of key cellular processes, including cell growth, which is critical for cancer development and progression. In addition, analysis of the COSMIC and cBioPortal databases showed that aberrant expression of these genes in breast cancer is associated with mutations and genetic alterations. These findings provide valuable insights for researchers investigating the molecular mechanisms of breast cancer. They also provide clinicians with potential targets that could improve diagnostic accuracy and contribute to the development of more effective treatment strategies.

The identification of CACNG4 as a potential breast cancer biomarker is an important step towards improving clinical outcomes for breast cancer patients. As a type I transmembrane AMPA receptor regulatory protein, CACNG4 plays a critical role in regulating both channel gating and trafficking of AMPA receptors [ 24 ].

Amplification of CACNG4 has been shown to contribute to increased breast cancer cell motility, transformation, and metastasis, highlighting the importance of targeted therapies that can disrupt its actions [ 25 ]. As CACNG4 is located on the plasma membrane, antibody-based therapies have the potential to inhibit its function and impede breast cancer progression, providing a viable and valuable approach for the development of novel treatment strategies. Additionally, our findings in the biological pathway analysis revealed that CACNG4 is involved in the ErbB receptor signaling network and the mTOR signaling pathway, both of which have been implicated in cancer metastasis and poor prognosis based on studies by Drago et al. and Tian et al. [ 26 , 27 ]. It is worth noting that the molecular function of CACNG4 is voltage-gated calcium channel activity. Studies have shown that calcium channel antagonists have anti-proliferative effects on various cell types, including vascular, retinal pigment, and prostate cancer cells. Therefore, targeting Ca2+ pumps or channels has been suggested as a potential therapeutic approach for the treatment of breast cancer [ 28 ].

The protein kinase PKMYT1 (Protein Kinase, Membrane Associated Tyrosine/Threonine 1), a member of the WEE kinase family, negatively regulates the G2/M transition of the cell cycle and has been implicated in the development and progression of several cancers, including hepatic, glioblastoma, colorectal, and non-small cell lung cancers [ 29 ]. Overexpression of PKMYT1 in these cancers is typically associated with poor prognosis and disease progression [ 30 ]. Based on Kaplan-Meier plotter database analysis, PKMYT1 overexpression is also associated with a poor prognosis in breast cancer patients. Liu et al. also reported that PKMYT1 overexpression was linked to poor prognosis, suggesting that it may be an appealing therapeutic target for breast carcinoma [ 29 ]. A study by Zhang et al. demonstrated that PKMYT1 upregulation promotes tumor progression and correlates with poorer overall survival in patients with esophageal squamous cell carcinoma (ESCC) [ 31 ]. In this study, FunRich analysis revealed that the genes co-expressed with PKMYT1 are enriched in the cell cycle and DNA replication pathways, indicating that overexpression of this gene could promote breast tumorigenesis. Upregulation of this protein is crucial for the development of some cancers, such as glioblastoma, colon cancer, and hepatic carcinoma [ 32 ], and promotes gastric cancer (GC) cell proliferation and apoptosis resistance [ 33 ]. This may be due to the effects of PKMYT1 on enhancing the AKT/mTOR signaling pathway in promoting carcinogenesis and the progression of cancer cells through other pathways, such as activation of Notch signaling [ 34 ]. In the Cancer Dependency Map analysis, lower dependency scores correspond to a higher likelihood that the gene is essential for cell survival or growth.
PKMYT1 was identified as critical for breast cancer cell line survival, suggesting that it is a viable target for therapeutic intervention in breast cancer patients. In another study, PKMYT1 was identified as a promising target to enhance the radiosensitivity of lung adenocarcinoma (LUAD), which further suggests that PKMYT1 could be an attractive target for anticancer therapy [ 35 ].

Epiphycan (EPYC) is a member of the small leucine-rich repeat proteoglycan family. Epiphycan, also known as dermatan sulfate proteoglycan 3, interacts with collagen fibrils and other extracellular matrix proteins and regulates fibrillogenesis. It has been suggested that EPYC is involved in bone formation, maintaining joint integrity, and establishing the organized structure of cartilage through matrix organization [ 36 ]. According to the GeneCards database, the EPYC protein is secreted into the extracellular matrix. Studies have shown that insufficient expression of EPYC can lead to corneal dystrophy and hearing loss [ 37 ]. However, there have been very few studies on the role of EPYC in cancer. FunRich analysis revealed that genes co-expressed with EPYC are mainly involved in epithelial-mesenchymal transition (EMT), a process in which breast cancer cells acquire mobility, leading to progression and metastasis [ 38 ]. A study by Deng et al. investigated the effects of EPYC overexpression on the proliferation, invasion, and metastasis of ovarian cancer cells [ 36 ]. In the current study, EPYC was found to be positively co-expressed with COL11A1 and MMP13. Overexpression of COL11A1 is often associated with an aggressive tumor phenotype and a poor prognosis in many solid tumor types, including pancreatic, breast, ovarian, and colorectal cancers [ 39 ]. MMP-13 may be vital for the invasion and metastasis of breast cancer cells [ 40 ] and may be helpful as a prognostic marker when assessed simultaneously with lymph node status and HER2 expression [ 41 ]. Additionally, Spearman correlation analysis revealed a significant positive correlation between the mRNA expression levels of CACNG4 and EPYC, indicating a strong association between the expression of these two genes.

The CHRNA6 gene encodes an alpha subunit of neuronal nicotinic acetylcholine receptors, which function as ion channels and play a crucial role in neurotransmission in the nervous system. This protein is activated by acetylcholine and exogenous nicotine and mediates dopaminergic neurotransmission. In this study, the CHRNA6 protein was predicted to be localized to the plasma membrane based on the GeneCards database, indicating that antibody-targeted therapy could be helpful. There is currently no in silico or experimental study on the effects of CHRNA6 on cell proliferation and tumor progression, so it could be a potential novel biomarker in cancer studies. This study investigates for the first time the mRNA expression of CHRNA6 in breast tumor tissues. FunRich analysis revealed that CHRNA6 interacts with other molecules and is involved in the ErbB receptor signaling network and signal transduction. This pathway plays a crucial role in regulating cell growth and differentiation, and its dysregulation has been implicated in various cancers [ 27 ]. The analysis of clinico-pathological databases showed that HER2 upregulation and PR downregulation were associated with high CHRNA6 expression, and BRCA1/2 mutation was associated with low CHRNA6 expression, suggesting that CHRNA6 may be a potential diagnostic biomarker in breast cancer. The co-expression of CHRNA6 with TLR7 and OLR1 was investigated and confirmed. Survival analysis showed that TLR7 expression had a significant impact on survival [ 42 ]. OLR1 overexpression was associated with a poor prognosis in breast cancer and might represent a potential therapeutic target for breast cancer patients [ 43 ].

This retrospective study has some limitations. First, although new breast cancer-associated biomarkers are predicted, their mechanism of action remains unclear. Second, the results need to be validated by a larger sample size and more experimental studies. Therefore, additional prospective clinical and large-scale studies are needed to validate these results.

The integration of bioinformatics databases could help to find and select novel diagnostic, prognostic, and therapeutic biomarkers. Through bioinformatics analysis and qRT-PCR validation, we confirmed the upregulation of CACNG4 , PKMYT1 , EPYC , and CHRNA6 in breast cancer. The co-expression and GO enrichment analyses shed light on the potential mechanisms of these genes in breast cancer development and progression. We propose that CACNG4 , PKMYT1 , and CHRNA6 hold promise as potential targets for both the diagnosis and treatment of breast cancer, while EPYC has the potential to be used only as an effective diagnostic biomarker.

Availability of data and materials

The data are available from the corresponding author upon reasonable request.

The datasets analysed during the current study are publicly available at:

1. The raw data were obtained from the TCGA database ( https://portal.gdc.cancer.gov/ ) using the TCGAbiolinks R package.

2. The online database GEPIA2 ( http://gepia2.cancer-pku.cn/#index ).

3. The UCSC Xena ( http://xena.ucsc.edu ) platform.

4. The UALCAN web resource ( https://ualcan.path.uab.edu/ ).

5. GeneCards database ( https://www.genecards.org/ ).

6. Breast Cancer Gene-Expression Miner v4.8 analysis (bc-GenExMiner v4.8).

7. The cBioPortal database ( https://www.cbioportal.org/ ).

8. The COSMIC database (cancer.sanger.ac.uk).

9. Kaplan-Meier (KM) plotter database ( https://kmplot.com/analysis/ ).

10. The Cancer Dependency Map ( https://depmap.org/portal/ ).

Abbreviations

BC: Breast Cancer

BRCA: Breast Invasive Carcinoma

ER: Estrogen Receptor

PR: Progesterone Receptor

TNBC: Triple-Negative Breast Cancer

SBR: The Scarff–Bloom–Richardson grade

OS: Overall Survival

DEGs: Differentially Expressed Genes

GO: Gene Ontology

TCGA: The Cancer Genome Atlas

BP: Biological Process

CC: Cellular Component

MF: Molecular Function

GSEA: Gene Set Enrichment Analysis

Sung H, Ferlay J, Siegel RL, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin. 2021;71(3):209–49. https://doi.org/10.3322/caac.21660 .

Deng JL, Xu YH, Wang G. Identification of Potential Crucial Genes and Key Pathways in Breast Cancer Using Bioinformatic Analysis. Front Genet. 2019;10:695. https://doi.org/10.3389/fgene.2019.00695 .

Hing JX, Mok CW, Tan PT, et al. Clinical utility of tumour marker velocity of cancer antigen 15–3 (CA 15–3) and carcinoembryonic antigen (CEA) in breast cancer surveillance. Breast. 2020;52:95–101. https://doi.org/10.1016/j.breast.2020.05.005 .

Li X, Gou J, Li H, et al. Bioinformatic analysis of the expression and prognostic value of chromobox family proteins in human breast cancer. Sci Rep. 2020;10(1):17739. https://doi.org/10.1038/s41598-020-74792-5 .

Qiu N, He Y, Zhang S, et al. Cullin 7 is a predictor of poor prognosis in breast cancer patients and is involved in the proliferation and invasion of breast cancer cells by regulating the cell cycle and microtubule stability. Oncol Rep. 2018;39(2):603–10. https://doi.org/10.3892/or.2017.6106 .

Samad A, Haque F, Nain Z, et al. Computational assessment of MCM2 transcriptional expression and identification of the prognostic biomarker for human breast cancer. Heliyon. 2020;6(10):e05087. https://doi.org/10.1016/j.heliyon.2020.e05087 .

Rossing M, Sørensen CS, Ejlertsen B, et al. Whole genome sequencing of breast cancer. Apmis. 2019;127(5):303–15. https://doi.org/10.1111/apm.12920 .


Akrami S, Tahmasebi A, Moghadam A, et al. Integration of mRNA and protein expression data for the identification of potential biomarkers associated with pancreatic ductal adenocarcinoma. Comput Biol Med. 2023;157:106529.

Gao J, Aksoy BA, Dogrusoz U, et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal. 2013;6(269):pl1. https://doi.org/10.1126/scisignal.2004088 .

Liaw, A. and M. Wiener, Classification and Regression by RandomForest. Forest, 2001. 23

Tang Z, Kang B, Li C, et al. GEPIA2: an enhanced web server for large-scale expression profiling and interactive analysis. Nucleic Acids Res. 2019;47(W1):W556-w560. https://doi.org/10.1093/nar/gkz430 .

Goldman MJ, Craft B, Hastie M, et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat Biotechnol. 2020;38(6):675–8. https://doi.org/10.1038/s41587-020-0546-8 .

Chandrashekar DS, Karthikeyan SK, Korla PK, et al. UALCAN: An update to the integrated cancer data analysis platform. Neoplasia. 2022;25:18–27. https://doi.org/10.1016/j.neo.2022.01.001 .

Safran M, Rosen N, Twik M, et al. The GeneCards Suite. In: Abugessaisa I, Kasukawa T, editors., et al., Practical Guide to Life Science Databases. Singapore: Springer Nature Singapore; 2021. p. 27–56.


Jézéquel P, Gouraud W, Ben Azzouz F, et al. bc-GenExMiner 4.5: new mining module computes breast cancer differential gene expression analyses. Database (Oxford). 2021;2021. https://doi.org/10.1093/database/baab007 .

Pathan M, Keerthikumar S, Ang CS, et al. FunRich: An open access standalone functional enrichment and interaction network analysis tool. Proteomics. 2015;15(15):2597–601. https://doi.org/10.1002/pmic.201400515 .

Subramanian A, Tamayo P, Mootha VK, et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci. 2005;102(43):15545–50. https://doi.org/10.1073/pnas.0506580102 .

Tate JG, Bamford S, Jubb HC, et al. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 2018;47(D1):D941–7. https://doi.org/10.1093/nar/gky1015 .


Győrffy B. Discovery and ranking of the most robust prognostic biomarkers in serous ovarian cancer. GeroScience. 2023. https://doi.org/10.1007/s11357-023-00742-4 .

Tsherniak A, Vazquez F, Montgomery PG, et al. Defining a cancer dependency map. Cell. 2017;170(3):564-576.e16. https://doi.org/10.1016/j.cell.2017.06.010 .

Ramezani A. CtNorm: Real time PCR cycle of threshold (Ct) normalization algorithm. J Microbiol Methods. 2021;187: 106267. https://doi.org/10.1016/j.mimet.2021.106267 .

Heywang-Köbrunner SH, Hacker A, Sedlacek S. Advantages and disadvantages of mammography screening. Breast Care (Basel). 2011;6(3):199–207. https://doi.org/10.1159/000329005 .

Afzal S, Hassan M, Ullah S, et al. Breast Cancer; Discovery of Novel Diagnostic Biomarkers, Drug Resistance, and Therapeutic Implications. Front Mol Biosci. 2022;9:783450. https://doi.org/10.3389/fmolb.2022.783450 .

Bissen D, Foss F, Acker-Palmer A. AMPA receptors and their minions: auxiliary proteins in AMPA receptor trafficking. Cell Mol Life Sci. 2019;76(11):2133–69. https://doi.org/10.1007/s00018-019-03068-7 .

Kamran M, Bhattacharya U, Omar M, et al. ZNF92, an unexplored transcription factor with remarkably distinct breast cancer over-expression associated with prognosis and cell-of-origin. NPJ Breast Cancer. 2022;8(1):99. https://doi.org/10.1038/s41523-022-00474-2 .

Tian T, Li X, Zhang J. mTOR Signaling in Cancer and mTOR Inhibitors in Solid Tumor Targeting Therapy. Int J Mol Sci. 2019;20(3). https://doi.org/10.3390/ijms20030755 .

Drago JZ, Ferraro E, Abuhadra N, et al. Beyond HER2: Targeting the ErbB receptor family in breast cancer. Cancer Treat Rev. 2022;109:102436. https://doi.org/10.1016/j.ctrv.2022.102436 .

Kanwar N, Nair R, Wang D-Y, et al. The calcium channel subunit CACNG4 plays a role in breast cancer metastasis. Cancer Res. 2013;73(8_Supplement):5116.

Liu Y, Qi J, Dou Z, et al. Systematic expression analysis of WEE family kinases reveals the importance of PKMYT1 in breast carcinogenesis. Cell Prolif. 2020;53(2):e12741. https://doi.org/10.1111/cpr.12741 .

Asquith CRM, Laitinen T, East MP. PKMYT1: a forgotten member of the WEE1 family. Nat Rev Drug Discov. 2020;19(3):157. https://doi.org/10.1038/d41573-019-00202-9 .

Zhang Q, Zhao X, Zhang C, et al. Overexpressed PKMYT1 promotes tumor progression and associates with poor survival in esophageal squamous cell carcinoma. Cancer Manag Res. 2019;11:7813–24. https://doi.org/10.2147/cmar.s214243 .

Wang XM, Li QY, Ren LL, et al. Effects of MCRS1 on proliferation, migration, invasion, and epithelial mesenchymal transition of gastric cancer cells by interacting with Pkmyt1 protein kinase. Cell Signal. 2019;59:171–81. https://doi.org/10.1016/j.cellsig.2019.04.002 .

Zhang QY, Chen XQ, Liu XC, et al. PKMYT1 Promotes Gastric Cancer Cell Proliferation and Apoptosis Resistance. Onco Targets Ther. 2020;13:7747–57. https://doi.org/10.2147/ott.s255746 .

Sun QS, Luo M, Zhao HM, et al. Overexpression of PKMYT1 indicates the poor prognosis and enhances proliferation and tumorigenesis in non-small cell lung cancer via activation of Notch signal pathway. Eur Rev Med Pharmacol Sci. 2019;23(10):4210–9. https://doi.org/10.26355/eurrev_201905_17925 .

Long HP, Liu JQ, Yu YY, et al. PKMYT1 as a Potential Target to Improve the Radiosensitivity of Lung Adenocarcinoma. Front Genet. 2020;11:376. https://doi.org/10.3389/fgene.2020.00376 .

Deng L, Wang D, Chen S, et al. Epiphycan predicts poor outcomes and promotes metastasis in ovarian cancer. Front Oncol. 2021;11:653782. https://doi.org/10.3389/fonc.2021.653782 .

Hanada Y, Nakamura Y, Ishida Y, et al. Epiphycan is specifically expressed in cochlear supporting cells and is necessary for normal hearing. Biochem Biophys Res Commun. 2017;492(3):379–85. https://doi.org/10.1016/j.bbrc.2017.08.092 .

Sun X, Wang M, Wang M, et al. Exploring the Metabolic Vulnerabilities of Epithelial–Mesenchymal Transition in Breast Cancer. Front Cell Dev Biol. 2020;8. https://doi.org/10.3389/fcell.2020.00655 .

Nallanthighal S, Heiserman JP, Cheon DJ. Collagen Type XI Alpha 1 (COL11A1): a novel biomarker and a key player in cancer. Cancers (Basel). 2021;13(5). https://doi.org/10.3390/cancers13050935 .

Kotepui M, Punsawad C, Chupeerach C, et al. Differential expression of matrix metalloproteinase-13 in association with invasion of breast cancer. Contemp Oncol (Pozn). 2016;20(3):225–8. https://doi.org/10.5114/wo.2016.61565 .


Shi S, Xu C, Fang X, et al. Expression profile of Toll-like receptors in human breast cancer. Mol Med Rep. 2020;21(2):786–94. https://doi.org/10.3892/mmr.2019.10853 .

Sun X, Fu X, Xu S, et al. OLR1 is a prognostic factor and correlated with immune infiltration in breast cancer. Int Immunopharmacol. 2021;101(Pt B):108275. https://doi.org/10.1016/j.intimp.2021.108275 .


Acknowledgements

We would also like to express our appreciation to our esteemed colleague, Ms. Mahmoodi, for her invaluable contributions in the execution of experiments.

Ali Golestan, a medical biotechnology student, conducted the research as part of his Ph.D. thesis. The study was funded by Shiraz University of Medical Sciences (Grant No. 24453). The funder was not involved in the study design, data collection, analysis, decision to publish, or manuscript preparation.

Author information

Authors and affiliations.

Department of Medical Biotechnology, School of Advanced Medical Sciences and Technologies, Shiraz University of Medical Sciences, Shiraz, Iran

Ali Golestan, Nafiseh Maghsoodi, Cambyz Irajie & Amin Ramezani

Shiraz Institute for Cancer Research, School of Medicine, Shiraz University of Medical Sciences, Shiraz, Iran

Ali Golestan, Nafiseh Maghsoodi & Amin Ramezani

Institute of Biotechnology, Shiraz University, Shiraz, Iran

Ahmad Tahmasebi

Department of Pathology, School of Medicine, Shiraz University of Medical Sciences, Shiraz, Iran

Seyed Nooreddin Faraji


Contributions

The study was conceived and designed by A.R. and A.G. Data curation and formal analysis were performed by A.G. and A.T. The initial draft was written by A.G. Funding was obtained by A.R. All authors reviewed and edited previous versions of the manuscript and approved the final version.

Corresponding author

Correspondence to Amin Ramezani .

Ethics declarations

Ethics approval and consent to participate.

The research protocol received ethical approval from the Research Ethics Committee of Shiraz University of Medical Sciences (Approval ID: IR.SUMS.REC.1401.215), and the patient’s written informed consent was obtained before the study began.

Consent to publication

Not applicable.

Competing interests

All authors state that they have no conflicts of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

 List of the differentially expressed genes (DEGs) and the Gini index of the random forest model for each of DEGs.

Additional file 2: Supplementary Fig. 1A.

Volcano plot of tumor vs. healthy samples. The red points represent the 500 upregulated genes.

Additional file 3: Supplementary Table 1.

Altered expression of selected genes based on METABRIC database.

Additional file 4: Supplementary Fig. 2A.

Expression levels of CACNG4 , PKMYT1 , EPYC and CHRNA6 in different breast cancer sample types based on the UCSC Xena server from TCGA dataset. High expression levels of identified genes in primary tumor (Blue) compared to normal tissues (Green).

Additional file 5: Supplementary Fig. 3.

Subcellular Localization Prediction. The GeneCards database was used to evaluate each protein’s subcellular location.

Additional file 6: Supplementary Table 2.

The correlation between CACNG4 , PKMYT1 , EPYC and CHRNA6 expression levels and clinico-pathological parameters.

Additional file 7: Supplementary Fig. 4A.

The expression analysis of CACNG4 , PKMYT1 , EPYC and CHRNA6 with clinical characteristics of BC patients. A: SBR; B: BRCA1/2 status; C: PAM50 subtypes via bc-GenExMiner v4.8. These graphs were generated by comparing significant changes between normal variables and other variables. Abbreviation: BC, breast cancer; SBR, Scarff Bloom and Richardson grade status.

Additional file 8.

 Gene ontology and biological pathway of co-expressed genes with CACNG4.

Additional file 9.

 Gene ontology and biological pathway of co-expressed genes with PKMYT1.

Additional file 10.

 Gene ontology and biological pathway of co-expressed genes with EPYC.

Additional file 11.

 Gene ontology and biological pathway of co-expressed genes with CHRNA6.

Additional file 12: Supplementary Fig. 5A.

Venn diagram represents the intersection of genes between the cBioPortal database and the GEPIA2 database. 56 co-expressed genes for CACNG4 , 49 co-expressed genes for PKMYT1 , 129 co-expressed genes for EPYC and 150 co-expressed genes for CHRNA6 based on the FunRich analysis tool.

Additional file 13:

  Supplementary Table 3 . GSEA of hallmark gene sets. The results of the GSEA showed enrichment in 5 gene sets.

Additional file 14:

  Supplementary Fig. 6. Assessment of alteration frequency and mutation types. A. Histogram of the frequency of alterations in queried genes. Queried genes are altered in 404 (37%) of queried patients/samples. According to the cBio Cancer Genomics Portal database, high mRNA expression is a more frequent alteration of CACNG4 , PKMYT1 and EPYC in breast cancer than copy number variation, whereas for CHRNA6 amplification is the most frequent copy number alteration. B. An overview of the types of mutation observed. Pie charts showing the mutation types of CACNG4 , PKMYT1 , EPYC and CHRNA6 in BC based on results from the COSMIC database. BC, breast cancer.

Additional file 15:

  Supplementary Fig. 7. Tumor cell line dependency based on the DepMap tool. A: The essential role of the indicated genes ( CACNG4 , PKMYT1 , EPYC and CHRNA6 ) in tumor cell line panels, established via DepMap from the CRISPR (blue) and RNAi (violet) databases. B: The Chronos dependence scores in breast cancer cells. A lower Chronos score indicates a higher likelihood that the gene of interest is essential in a particular cell line. A gene is not essential if it has a score of 0 (dotted line); a value of −1 is comparable to the average of all pan-essential genes (red line).

Additional file 16:

  Supplementary Table 4. Association of the gene expression patterns of CACNG4 , PKMYT1 , EPYC , and CHRNA6 with clinico-pathological features in breast cancer patients. This file shows the expression patterns of these genes by age, ER status, PR status, HER2 status, TNM stage, histological grade and molecular subtype. A P value less than 0.05 was considered statistically significant. SD: Significantly Different; NS: Non-Significant; ER: Estrogen Receptor; PR: Progesterone Receptor; HER2: Human Epidermal Growth Factor Receptor 2; and TNM: Tumor, Node, Metastasis.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Golestan, A., Tahmasebi, A., Maghsoodi, N. et al. Unveiling promising breast cancer biomarkers: an integrative approach combining bioinformatics analysis and experimental verification. BMC Cancer 24 , 155 (2024). https://doi.org/10.1186/s12885-024-11913-7


Received : 04 October 2023

Accepted : 23 January 2024

Published : 31 January 2024

DOI : https://doi.org/10.1186/s12885-024-11913-7


  • Breast cancer
  • Biomarker identification
  • Bioinformatic analysis
  • Differentially expressed genes

ISSN: 1471-2407


ScienceDaily

Novel technique has potential to transform breast cancer detection

An innovative breast imaging technique provides high sensitivity for detecting cancer while significantly reducing the likelihood of false positive results, according to a study published today in Radiology: Imaging Cancer , a journal of the Radiological Society of North America (RSNA). Researchers said the technique has the potential to offer more reliable breast cancer screening for a broader range of patients.

Mammography is an effective screening tool for early detection of breast cancer, but its sensitivity is reduced in dense breast tissue. This is due to the masking effect of overlying dense fibroglandular tissue. Since almost half of the screening population has dense breasts, many of these patients require additional breast imaging, often with MRI, after mammography.

Low-dose positron emission mammography (PEM) is a novel molecular imaging technique that provides improved diagnostic performance at a radiation dose comparable to that of mammography.

For the study, 25 women, median age 52, recently diagnosed with breast cancer, underwent low-dose PEM with the radiotracer fluorine 18-labeled fluorodeoxyglucose (18F-FDG). Two breast radiologists reviewed PEM images taken one and four hours after 18F-FDG injection and correlated the findings with lab results.

PEM displayed comparable performance to MRI, identifying 24 of the 25 invasive cancers (96%). Its false positive rate was only 16%, compared with 62% for MRI.

Along with its strong sensitivity and low false-positive rate, PEM could potentially decrease downstream healthcare costs as this study shows it may prevent further unnecessary work up compared to MRI. Additionally, the technology is designed to deliver a radiation dose comparable to that of traditional mammography without the need for breast compression, which can often be uncomfortable for patients.

"The integration of these features -- high sensitivity, lower false-positive rates, cost-efficiency, acceptable radiation levels without compression, and independence from breast density -- positions this emerging imaging modality as a potential groundbreaking advancement in the early detection of breast cancer," said study lead author Vivianne Freitas, M.D., M.Sc., assistant professor at the University of Toronto. "As such, it holds the promise of transforming breast cancer diagnostics and screening in the near future, complementing or even improving current imaging methods, marking a significant step forward in breast cancer care."

Low-dose PEM offers potential clinical uses in both screening and diagnostic settings, according to Dr. Freitas.

"For screening, its ability to perform effectively regardless of breast density potentially addresses a significant shortcoming of mammography, particularly in detecting cancers in dense breasts where lesions may be obscured," she said. "It also presents a viable option for patients at high risk who are claustrophobic or have contraindications for MRI."

The technology could also play a crucial role in interpreting uncertain mammogram results, evaluating the response to chemotherapy and ascertaining the extent of disease in newly diagnosed breast cancer, including involvement of the other breast.

Dr. Freitas, who is also staff radiologist of the Breast Imaging Division of the Toronto Joint Department of Medical Imaging, University Health Network, Sinai Health System and Women's College Hospital, is currently researching PEM's ability to reduce the high rates of false positives typically associated with MRI scans. Should PEM successfully lower these rates, it could significantly lessen the emotional distress and anxiety linked to false positives, Dr. Freitas said. Additionally, it might lead to a decrease in unnecessary biopsies and treatments.

More studies are needed to determine low-dose PEM's exact role and efficacy in the clinical setting.

"While the full integration of this imaging method into clinical practice is yet to be confirmed, the preliminary findings of this research are promising, particularly in demonstrating the capability of detecting invasive breast cancer with low doses of fluorine-18-labeled FDG," Dr. Freitas said. "This marks a critical first step in its potential future implementation in clinical practice."


Story Source:

Materials provided by Radiological Society of North America . Note: Content may be edited for style and length.

Journal Reference :

  • Vivianne Freitas, Xuan Li, Anabel Scaranelo, Frederick Au, Supriya Kulkarni, Sandeep Ghai, Samira Taeb, Oleksandr Bubon, Brandon Baldassi, Borys Komarov, Shayna Parker, Craig A. Macsemchuk, Michael Waterston, Kenneth O. Olsen, Alla Reznik. Breast Cancer Detection Using a Low-Dose Positron Emission Digital Mammography System . Radiology: Imaging Cancer , 2024; 6 (2) DOI: 10.1148/rycan.230020

  • Open access
  • Published: 07 February 2024

Breast cancer screening and early diagnosis in China: a systematic review and meta-analysis on 10.72 million women

  • Mengdan Li 1 ,
  • Hongying Wang 2 ,
  • Ning Qu 3 ,
  • Haozhe Piao 4 &

BMC Women's Health volume  24 , Article number:  97 ( 2024 ) Cite this article


The incidence of breast cancer among Chinese women has gradually increased in recent years. This study aims to analyze the situation of breast cancer screening programs in China and compare the cancer detection rates (CDRs), early-stage cancer detection rates (ECDRs), and the proportions of early-stage cancer among different programs.

We conducted a systematic review and meta-analysis of studies in multiple literature databases. Studies that were published between January 1, 2010 and June 30, 2023 were retrieved. A random effects model was employed to pool the single group rate, and subgroup analyses were carried out based on screening model, time, process, age, population, and follow-up method.

A total of 35 studies, including 47 databases, satisfied the inclusion criteria. Compared with opportunistic screening, the CDR (1.32‰, 95% CI: 1.10‰–1.56‰) and the ECDR (0.82‰, 95% CI: 0.66‰–0.99‰) were lower for population screening, but the proportion of early-stage breast cancer (80.17%, 95% CI: 71.40%–87.83%) was higher. In subgroup analysis, the CDR of population screening was higher in the urban group (2.28‰, 95% CI: 1.70‰–2.94‰), in the breast ultrasonography (BUS) in parallel with mammography (MAM) group (3.29‰, 95% CI: 2.48‰–4.21‰), and in the second screening follow-up group (2.47‰, 95% CI: 1.64‰–3.47‰), and the proportion of early-stage breast cancer was 85.70% (95% CI: 68.73%–97.29%), 88.18% (95% CI: 84.53%–91.46%), and 90.05% (95% CI: 84.07%–94.95%), respectively.

There were significant differences between opportunistic and population screening programs. The results of these population screening studies were influenced by the screening process, age, population, and follow-up method. In the future, China should carry out more high-quality and systematic population-based screening programs to improve screening coverage and service.


Breast cancer is the most common malignant tumor in the world [ 1 ]. In China, the incidence of breast cancer and the disease burden continue to increase [ 2 ]. Improving the early diagnosis of breast cancer followed by effective treatment is an effective measure to reduce breast cancer mortality [ 3 , 4 , 5 ]. Western countries began to standardize the breast cancer screening process earlier than China, and have successively implemented screening programs [ 6 , 7 , 8 ]. For large-scale cancer screening, cases must be effectively detected, especially early cases [ 9 ].

Over the past 10 years, provinces and cities in China have successively launched several population-based breast cancer screening programs. Notably, two national cancer screening programs [ 10 , 11 ] have persisted and yielded considerable social benefits. However, because China's population is large and widely dispersed and equipment is in short supply, it is difficult to unify breast cancer screening strategies across programs. Several problems have also emerged: the starting age was not standardized, some screening programs had a short duration and no follow-up surveys, and the types of screening equipment differed in some institutions.

Published studies on breast cancer screening in China mainly focused on risk factors and screening techniques. Most of the data were from a single province, part of a region, or a single program. There is no comprehensive analysis of all studies, let alone analysis of early diagnosis. We aimed to analyze the current situation of breast cancer screening in China. Therefore, in the present study, we systematically analyzed the population and opportunistic breast cancer screening programs in China, and compared the cancer detection rates (CDRs), early-stage cancer detection rates (ECDRs), and the proportions of early-stage cancer. Subgroup analysis of population screening was conducted based on screening time, screening process, target population, and follow-up method.

The review protocol was registered in the Open Science Framework ( https://doi.org/10.17605/OSF.IO/EABPH ).

Search strategy

We searched relevant articles in databases including PubMed, EMBASE, Web of Science, China National Knowledge Infrastructure (CNKI), Chinese Scientific Journals Full Text Database (CQVIP), and Wanfang Data. Articles published between January 1, 2010 and June 30, 2023 were considered for inclusion. The search keywords included “breast cancer” OR “breast tumors” AND “screening” AND “China” OR “Chinese” (Table S 1 ). In addition, we manually searched systematic reviews and references. This study was conducted and reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [ 12 ] (Table S 2 , PRISMA checklist).

Study selection

A literature database was created in Endnote® (version X6; Thomson Reuters, Inc., Philadelphia, PA) bibliographic software to store the retrieved studies and exclude duplicates. To prevent bias, two authors (LMD and ZB) independently screened the titles and abstracts. Any disagreements were resolved by discussion with the third author. Finally, the full texts of the preliminarily selected articles were examined, and irrelevant articles were excluded according to the inclusion and exclusion criteria.

The inclusion and exclusion criteria were as follows: (1) the subjects were from mainland China and voluntarily participated in breast cancer screening; (2) studies of patients with breast cancer or precancerous lesions were excluded; (3) the overall sample size was ≥ 1000; (4) the screening process, methods, and detection indicators were clearly defined, especially with respect to the detection rates of breast cancer and early-stage breast cancer; and (5) when two or more studies were conducted in the same study population, the most recent article or the article with the largest sample size was included.

Quality assessment

To assess the quality and validity of the included studies, a modified quality assessment tool based on ten aspects was used [ 13 ]. For each aspect, a score of 0 (high risk) or 1 (low risk) was given, so the total score ranged from 0 to 10. Studies with 8 to 10 points were considered to be of high quality, and studies with 4 to 7 points were considered to be of moderate quality. Furthermore, studies with points below 7 were considered low quality and excluded from the research.

Data extraction

The included studies were read in detail by two authors (LMD and ZB). The following variables were extracted: first author, year of publication, characteristics of the screening programs (screening mode, screening time, target population, province, age range, screening process, follow-up method), the number of screening participants, and the numbers of detected breast cancers and early breast cancers.

Statistical analysis

Stata (Version 14.0; Stata Corp., College Station, TX) was used for the pooled analysis. A random effects model was used to pool the results (CDRs, ECDRs, and the proportions of early-stage cancer) with their 95% confidence intervals (CIs). The heterogeneity of the selected studies was assessed using the I² index.
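The pooling step can be illustrated with a minimal Python sketch. It is not the authors' Stata procedure: it assumes a DerSimonian-Laird random effects estimator applied to logit-transformed proportions, and the function name `pool_random_effects` and the example counts are hypothetical. It requires every study to have at least one event and one non-event.

```python
import math

def pool_random_effects(events, totals):
    """Pool single-group proportions with a DerSimonian-Laird random effects
    model on the logit scale (an assumed method, chosen for illustration)."""
    # Logit-transformed proportions and their approximate variances.
    y = [math.log(e / (n - e)) for e, n in zip(events, totals)]
    v = [1 / e + 1 / (n - e) for e, n in zip(events, totals)]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1

    # Between-study variance (tau^2) and the I^2 heterogeneity index in percent.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random effects weights, pooled logit, and its standard error.
    w_re = [1 / (vi + tau2) for vi in v]
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def expit(t):
        return 1 / (1 + math.exp(-t))

    return {
        "pooled": expit(y_re),
        "ci": (expit(y_re - 1.96 * se), expit(y_re + 1.96 * se)),
        "tau2": tau2,
        "i2": i2,
    }

# Hypothetical example: detected cancers and women screened in three programs.
result = pool_random_effects([10, 100, 35], [10_000, 10_000, 20_000])
print(f"pooled rate = {result['pooled'] * 1000:.2f} per 1000, I2 = {result['i2']:.0f}%")
```

An I² value close to 100%, as reported for the pooled CDRs in this review, means that almost all of the observed variation reflects differences between studies rather than sampling error, which is the usual motivation for subgroup analyses.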

Population screening refers to the systematic and organized examination of all women in the target group, whether at a national, regional, or unit level. Opportunistic screening involves women voluntarily choosing to undergo examination at medical institutions or being examined on recommendation during routine medical consultations. Furthermore, subgroup analyses were carried out to explore the main sources of heterogeneity in population screening by screening time (before 2012 and in or after 2012), screening age (< 40, 40–49, 50–59 and ≥ 60 years old), residence (urban/rural), geographical region (north and south), Human Development Index (HDI) (< 0.75, 0.75–0.79 and ≥ 0.8), screening process, and follow-up method. According to the population screening methods, the screening process was divided into three main categories: (i) subjects underwent clinical breast examination (CBE) as initial screening, and some of them then underwent breast ultrasonography (BUS) or mammography (MAM) according to the results; (ii) subjects underwent BUS as initial screening, and some of them then underwent MAM according to the results; and (iii) subjects underwent BUS in parallel with MAM as initial screening. Follow-up involves tracking women who received positive screening results through various methods to obtain the final diagnosis and results. We divided the studies into three types: (i) no follow-up; (ii) inquiry follow-up by telephone or interview after 1 year; and (iii) second screening after 1 year.
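The subgroup definitions above amount to assigning each screening database to one stratum per variable. The following Python sketch shows one way to encode the cut-offs; the function names are hypothetical, and treating the middle HDI band as the half-open interval from 0.75 up to (but not including) 0.8 is an assumption, since the text does not say how values between 0.79 and 0.8 are handled.

```python
def screening_period(year):
    """Screening time stratum: before 2012 versus in or after 2012."""
    return "<2012" if year < 2012 else ">=2012"

def age_band(age):
    """Screening age stratum in years."""
    if age < 40:
        return "<40"
    if age < 50:
        return "40-49"
    if age < 60:
        return "50-59"
    return ">=60"

def hdi_band(hdi):
    """HDI stratum; the middle band is assumed to cover 0.75 <= HDI < 0.8."""
    if hdi < 0.75:
        return "<0.75"
    if hdi < 0.8:
        return "0.75-0.79"
    return ">=0.8"

# Hypothetical database: screened in 2014, women aged 45, provincial HDI 0.78.
print(screening_period(2014), age_band(45), hdi_band(0.78))
```

Encoding the cut-offs once and reusing them for every database keeps the strata mutually exclusive, which matters when the same data feed several subgroup tables.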

Literature search and study characteristics

A total of 4,602 studies were initially identified in the databases. During the screening stage, 3,083 studies were excluded due to duplication, and an additional 1,250 studies were excluded based on title and abstract reviews. In the eligibility evaluation stage, 269 studies were assessed in full text, and 234 studies were excluded according to the inclusion and exclusion criteria. Finally, 35 studies [ 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 ] (five in English and 30 in Chinese), comprising 47 databases and a total of 12,984,958 participants, were included in the analysis. A flowchart of the screening process is shown in Fig.  1 .

figure 1

Flowchart of the screening process in our meta-analysis
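The counts reported for the selection process can be checked with simple arithmetic. The Python sketch below uses the numbers stated in the text; the function name `prisma_flow` and the dictionary keys are hypothetical.

```python
def prisma_flow(identified, duplicates, excluded_on_screening, excluded_on_full_text):
    """Return the number of records remaining after each selection stage."""
    after_deduplication = identified - duplicates
    full_text_assessed = after_deduplication - excluded_on_screening
    included = full_text_assessed - excluded_on_full_text
    return {
        "after_deduplication": after_deduplication,
        "full_text_assessed": full_text_assessed,
        "included": included,
    }

# Counts taken from the literature search described above.
flow = prisma_flow(
    identified=4602,
    duplicates=3083,
    excluded_on_screening=1250,
    excluded_on_full_text=234,
)
print(flow)
```

The stage totals reproduce the 269 full-text assessments and 35 included studies reported in the text.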

Of these 47 databases, 39 were from population screening and 8 were from opportunistic screening. The databases covered a number of provinces in China; three were from national screening programs and five were from multi-center screening programs. We conducted subgroup analysis of the population screening databases. The screening population was urban in 19 databases and rural in 20. Screening took place before 2012 in 21 databases and in or after 2012 in 18. In terms of screening process, 22 population screening databases used BUS followed by MAM, and 12 used BUS in parallel with MAM as initial screening. Other characteristics of the databases are summarized in Table  1 .

Risk of bias

We assessed the quality of the studies using the modified quality assessment tool. The scores ranged from 7 to 10. Of the 35 studies, six were considered to be of medium quality and 29 of high quality. Our evaluation found that each included study had established its own quality control program and required all physicians and technicians to be trained accordingly (the physicians were responsible for making the diagnosis). Details are shown in Table 1.

Comparison of breast cancer screening effect between opportunistic and population screening programs in China

The detailed data on breast cancer detection by screening model are summarized in Table 2. Overall, the CDR in the opportunistic screening group, with a sample size of 224,240, was 11.99‰ (95% CI: 5.14‰–21.57‰; I² = 98%), and the CDR in the population screening group, with a sample size of 12,760,718, was 1.32‰ (95% CI: 1.10‰–1.56‰; I² = 99%). When we defined TNM stage 0–II as early-stage breast cancer, the ECDR of the opportunistic screening group was 4.90‰ (95% CI: 1.02‰–11.37‰) and the proportion of early-stage cancer was 72.42% (95% CI: 57.28%–85.57%); the ECDR of the population screening group was 0.82‰ (95% CI: 0.66‰–0.99‰) and the proportion of early-stage cancer was 80.17% (95% CI: 71.40%–87.83%). The forest plots of pooled data by screening model are shown in Figures S1–3.
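
Pooled detection rates with confidence intervals and I² values of this kind come from a meta-analysis of proportions. As a rough illustration of how such figures can be obtained, the sketch below pools per-database detection rates using a logit transformation and the DerSimonian-Laird random-effects estimator. Both choices are assumptions made for this example rather than a restatement of the method used in this study, and the counts are invented.

```python
import math


def expit(x: float) -> float:
    """Inverse of the logit transformation."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_detection_rates(cases, screened):
    """Pool proportions with a DerSimonian-Laird random-effects model.

    Returns (pooled rate, lower 95% limit, upper 95% limit, I^2 in percent).
    Every database must have 0 < cases < screened.
    """
    # Logit-transformed rates and their approximate within-study variances.
    y = [math.log(c / (n - c)) for c, n in zip(cases, screened)]
    v = [1.0 / c + 1.0 / (n - c) for c, n in zip(cases, screened)]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1

    # Between-study variance (tau^2) and the I^2 heterogeneity statistic.
    c_term = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c_term) if c_term > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects estimate and 95% CI, back-transformed to proportions.
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return expit(mu), expit(mu - 1.96 * se), expit(mu + 1.96 * se), i2


# Hypothetical counts for three screening databases (not taken from the study).
cases = [30, 12, 95]
screened = [20000, 15000, 30000]
rate, low, high, i2 = pool_detection_rates(cases, screened)
print(f"CDR = {rate * 1000:.2f} per 1,000 "
      f"(95% CI: {low * 1000:.2f}-{high * 1000:.2f}); I2 = {i2:.0f}%")
```

When the individual databases report very different rates, as in the invented example, I² is high, which is consistent with the 98–99% values reported for the real data.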

The breast cancer screening effect based on population in population screening programs

We further divided the population screening programs into those for urban women and those for rural women. The CDR of urban women (2.28‰, 95% CI: 1.70‰–2.94‰) was higher than that of rural women (0.70‰, 95% CI: 0.57‰–0.3‰) (Table 3). At the same time, more stage 0–II breast cancer was detected by population screening in urban women, with an ECDR of 1.60‰ (95% CI: 1.19‰–2.06‰) and a proportion of early-stage cancer of 85.70% (95% CI: 68.73%–97.29%) (Table 4). Regarding age, the CDR gradually increased with screening age, reaching 1.76‰ (95% CI: 1.03‰–2.68‰) in the ≥ 60 age group (Table 3). To explore variations in breast cancer screening programs across different populations, we conducted subgroup analyses based on geographic location and provincial HDI level. The results showed a slightly higher CDR in north China than in south China, although the difference was not pronounced (Table S3). Additionally, within the urban population, regions with an HDI ≥ 0.8 exhibited relatively higher CDR and ECDR (Tables S3–5).

The breast cancer screening effect based on screening process in population screening programs

The potential sources of heterogeneity in population screening were assessed by estimating the detection rates for the different screening processes. Overall, the CDR was 3.29‰ (95% CI: 2.48‰–4.21‰) in the BUS in parallel with MAM screening group, which was higher than in the CBE followed by BUS or MAM group (0.48‰, 95% CI: 0.40‰–0.56‰) and in the BUS followed by MAM group (0.94‰, 95% CI: 0.70‰–1.20‰) (Table 3). The BUS in parallel with MAM screening group also had a significant advantage in the early detection of breast cancer, with an ECDR of 2.49‰ (95% CI: 1.89‰–3.16‰) (Table 4).

The breast cancer screening effect based on screening time in population screening programs

Based on screening time, we further divided the data into two periods: before 2012, and in or after 2012. In the earlier period, the CDR was 1.38‰ (95% CI: 1.08‰–1.71‰), and in the later period, the CDR was 1.26‰ (95% CI: 0.96‰–1.60‰) (Table 3). Similarly, there was little change in the ECDR and the proportion of early-stage cancer between the two periods (Table 4).

The breast cancer screening effect based on follow-up method in population screening programs

Regarding the follow-up method, the CDR in the no follow-up group was 0.81‰ (95% CI: 0.51‰–1.17‰), which was lower than in the interview follow-up group (1.41‰, 95% CI: 1.15‰–1.69‰) and the second screening group (2.54‰, 95% CI: 1.65‰–3.61‰). The results are summarized in Table  3 . Compared to the follow-up screening groups, the ECDR (0.46‰, 95% CI: 0.30‰–0.64‰) and the proportion of early-stage cancer (71.10%, 95% CI: 59.07%–81.95%) in the no follow-up group were also lower (Table  4 ).

Discussion

Currently, various breast cancer screening models exist in China, such as population screening, opportunistic screening and physical examinations. In contrast, a growing number of countries in the European Union are implementing organized screening programs [ 49 ]. Organized screening has typically been subjected to rigorous health technology assessment (HTA) to evaluate its benefits, cost-effectiveness, and potential harms (although some screening techniques have no adverse reactions) [ 50 ]. By this standard, China does not yet have an organized breast cancer screening program in the true sense. The CDR and early diagnosis are important indicators for evaluating the quality of cancer screening programs [ 51 ]. Our study showed that the CDRs of the two screening models were ≥ 3 times higher than the incidence reported by the Chinese cancer registry [ 2 ], and more early-stage breast cancers were detected through screening. Compared with patients with late-stage cancer, those diagnosed with early-stage cancer are more likely to receive curative treatment and have lower treatment costs [ 52 , 53 ]. Notably, 51.2% of breast cancer patients in the United States were diagnosed with stage I cancer, and more than 84.0% of diagnosed patients had stage 0–II cancer [ 54 ]. In high-income Asian countries such as Singapore and Japan, more than 85% of breast cancers were diagnosed at stage II [ 55 , 56 ]. These data indicate that the proportion of early-stage breast cancer in China is still low. Our findings indicated that the CDR of opportunistic screening was about nine times that of population screening. However, the proportion of early-stage breast cancer was lower, potentially because most women participating in opportunistic screening already had noticeable symptoms. Similar results were obtained in other countries [ 57 ].
Meanwhile, most opportunistic screening depends on individual willingness or the extent of available primary healthcare services, and therefore offers no guarantee of regular screening. Consequently, population-based organized screening holds greater potential to enhance screening coverage and reduce cancer incidence and mortality rates. Nonetheless, China faces numerous challenges in executing high-quality organized screening, including existing infrastructure, resource limitations, and public acceptance of centralized healthcare.

The CDR of population screening in a meta-analysis is affected by various factors, such as screening strategy, age, and population. We therefore conducted subgroup analyses to examine the relationship between the CDR and these factors. In China, the strategy of population breast cancer screening has changed from using one method alone to using multiple methods in combination. In our study, among the three screening strategies, the CDR for BUS in parallel with MAM was the highest. The proportion of stage 0–II breast cancers was 88.18%, which was consistent with data from other countries [ 58 , 59 ]. Given China's vast population, diverse economic levels, and disparate resource allocations across regions, implementing a standardized screening strategy poses challenges [ 60 ]. BUS, being cost-effective, has become the main screening method, especially in rural areas of China [ 61 ]. At the same time, Chinese women often have smaller breasts with a higher proportion of dense breast tissue [ 62 ]. The latest breast cancer screening guidelines in China recommend BUS combined with MAM for average-risk women with dense breasts or high-risk women [ 63 ]. In the future, it will be crucial to conduct cost-effectiveness and survival benefit analyses across diverse population screening programs, and to establish a systematic national breast cancer screening strategy to standardize the implementation of organized screening programs.

We also found a disparity in preliminary effectiveness of breast cancer screening between urban and rural areas. The CDR in urban areas was about three times higher than that in rural areas, and the proportion of early-stage breast cancer (stage 0–II) could reach 85%. This can be attributed to several factors. Firstly, the incidence of breast cancer in urban areas is higher than that in rural areas. Secondly, women in urban areas possess greater awareness of cancer screening, and have easier access to better medical resources, leading to more diagnoses of early-stage breast cancers [ 64 ]. Furthermore, the screening results were also closely related to the geographical location and economic status of the regions. The stage at diagnosis strongly influences the treatment strategies and prognosis of patients with cancer. In China, breast cancer patients consistently have a lower survival rate in rural areas than in urban areas [ 65 ]. Enhancing the proportion of early diagnosis might narrow the survival gap among diverse populations. Williams et al. [ 66 ] found that women living in non-metropolitan or rural areas were 11% more likely to be diagnosed with late-stage breast cancer than women living in metropolitan or urban areas. The current results suggest that providing free screening services alone cannot compensate for the deficiency in preventive care for low-income and uninsured women [ 67 ]. To benefit more women in rural areas, increased clinical services, including follow-ups and medical insurance, are imperative [ 68 , 69 ].
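
The "about nine times" and "about three times" comparisons in this discussion follow directly from the pooled CDRs reported in the results. The sketch below reproduces that arithmetic; the function and variable names are illustrative.

```python
def rate_ratio(rate_a: float, rate_b: float) -> float:
    """Ratio of two detection rates expressed in the same unit (here per mille)."""
    return rate_a / rate_b


# Pooled CDRs (per mille) as reported in the results.
opportunistic, population = 11.99, 1.32
urban, rural = 2.28, 0.70

print(round(rate_ratio(opportunistic, population), 1))  # 9.1
print(round(rate_ratio(urban, rural), 1))  # 3.3
```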

Chinese women tend to develop breast cancer at an earlier age than Western women. Our findings demonstrated disparities in the starting age of screening, indicating the absence of a standardized criterion for population-based screening in China. Most programs began recruiting at 35 or 40 years of age, and the detection rate rose gradually with age. However, further survival analysis was lacking, and the benefits of screening at different ages remain uncertain. Determining the optimal starting age for screening will require considerably more data [ 70 ].

The incidence of breast cancer among Chinese women has gradually increased in recent years [ 2 ]. Interestingly, we found that the CDR and ECDR of population screening programs did not change significantly over the decade. This trend could potentially stem from publication bias. Moreover, it might be associated with screening management. Although the screening coverage of regions and populations has increased rapidly, there has not been a substantial improvement in follow-up methods and service quality. When we further analyzed the follow-up methods, fewer than 60% of the population screening programs conducted follow-up, and of these, only 43% were published in or after 2012. Without standardization of follow-up management, most high-risk subjects were missed during the program, which substantially reduced the effectiveness of screening [ 71 ]. Addressing this issue entails fostering collaboration with cancer registration departments to promptly collect breast cancer incidence, mortality, and survival data. Such data will serve as a critical foundation for conducting comprehensive breast cancer research and health economic evaluations.

Multiple real-world studies have evidenced the positive impact of cancer screening on reducing mortality rates [ 72 , 73 , 74 ]. However, a recent meta-analysis on cancer screening suggested that current evidence does not unequivocally establish the life-saving benefits of common cancer screening tests [ 75 ]. This prompts us to prudently reassess both the benefits and drawbacks of screening [ 76 ]. Notably, not all cancers are suitable for screening. Hence, blindly adopting foreign screening guidelines might not be ideal. Instead, the focus should be on developing screening programs tailored to the specific characteristics of Chinese women. Furthermore, this study corroborates the positive impact of opportunistic screening on elevating breast cancer detection rates. Future endeavors should emphasize heightened publicity and educational campaigns aimed at enhancing women's awareness of breast health and fostering their active participation in screening. Since 2017, our team has carried out a population-based breast cancer screening and intervention technology research program across Liaoning, Shandong, and Shanghai. The program established the first “Program Team-Community-Subjects” network interaction platform to standardize the screening and follow-up process, and applied the latest imaging techniques (digital breast tomography, ultrasonic elastography, and micropore imaging) to compare with conventional techniques (full-field digital mammography and breast ultrasound) in breast cancer screening. This program evaluates the optimal screening strategy for Chinese women and provides a reference for breast cancer screening in China and globally.

Limitations

This study has several limitations. First, most of the studies only calculated the CDR without describing TNM staging or early-stage breast cancer detection. Therefore, the number of studies that could be included was limited, and the description of the CDR may suffer from inclusion bias. Second, the physicians or technicians involved in screening were required to have uniform technical training or qualifications, but we did not perform subgroup analyses by the facilities used or the professional titles of the diagnosing physicians, which may have introduced some bias into the results. Third, the purpose of cancer screening is to find not only early-stage cancer but also precancerous lesions, especially those that can be treated. The detection of precancerous breast lesions should ideally also have been analyzed, but the relevant data in the available studies were limited, so we did not include them.

Conclusions

In conclusion, there were significant differences in the detection rates of breast cancer and early-stage breast cancer between opportunistic and population screening programs among Chinese women. The results of these population screening studies were influenced by various factors including the screening process, age, population, and follow-up method. Moving forward, China's breast cancer prevention and control efforts should emphasize the advancement of population-based organized screening programs, complemented by opportunistic screening. This strategic approach aims to expand screening coverage and improve screening services.

Availability of data and materials

The original contributions are included in the article. Further inquiries can be directed to the corresponding author.

References

Siegel RL, Miller KD, Fuchs HE, Jemal A. Cancer statistics, 2022. CA Cancer J Clin. 2022;72(1):7–33.

Zheng R, Zhang S, Zeng H, Wang S, Sun K, Chen R, et al. Cancer incidence and mortality in China, 2016. Journal of the National Cancer Center. 2022;2(1):1–9.

Pace LE, Keating NL. A systematic assessment of benefits and risks to guide breast cancer screening decisions. JAMA. 2014;311(13):1327–35.

Choi E, Jun JK, Suh M, Jung KW, Park B, Lee K, et al. Effectiveness of the Korean National Cancer Screening Program in reducing breast cancer mortality. NPJ Breast Cancer. 2021;7(1):83.

Duffy SW, Tabár L, Yen AM, Dean PB, Smith RA, Jonsson H, et al. Mammography screening reduces rates of advanced and fatal breast cancers: Results in 549,091 women. Cancer. 2020;126(13):2971–9.

Qaseem A, Lin JS, Mustafa RA, Horwitch CA, Wilt TJ, Forciea MA, et al. Screening for Breast Cancer in Average-Risk Women: A Guidance Statement From the American College of Physicians. Ann Intern Med. 2019;170(8):547–60.

Fitzgerald SP. Breast-Cancer Screening-Viewpoint of the IARC Working Group. N Engl J Med. 2015;373(15):1479.

Mainiero MB, Moy L, Baron P, Didwania AD, diFlorio RM, Green ED, et al. ACR Appropriateness Criteria® Breast Cancer Screening. J Am Coll Radiol. 2017;14(11S):S383–S390.

Kopans DB. Misinformation and Facts about Breast Cancer Screening. Curr Oncol. 2022;29(8):5644–54.

Chen WQ, Li N, Shi JF, Ren JS, Chen HD, Li J, et al. Progress in early diagnosis and early treatment of urban cancer in China. China Cancer. 2019;28:3.

Huang J, Yang XH, Liu A, Zhou WJ. Problems and countermeasures in the implementation of national cervical and breast screening program for women in rural areas. Chin Gen Prac. 2020;23:7.

Panic N, Leoncini E, de Belvis G, Ricciardi W, Boccia S. Evaluation of the endorsement of the preferred reporting items for systematic reviews and meta-analysis (PRISMA) statement on the quality of published systematic review and meta-analyses. PLoS ONE. 2013;8(12):e83138.

Hoy D, Brooks P, Woolf A, Blyth F, March L, Bain C, et al. Assessing risk of bias in prevalence studies: modification of an existing tool and evidence of interrater agreement. J Clin Epidemiol. 2012;65(9):934–9.

Yang S, Wang D, Wang XF. Analysis and Discussion of Cervical and Breast Cancer Screening Results in Women of Different Age Groups. Systems Medicine. 2023;8(2):157–60.

Wu S, Liang D, Shi J, Li D, Liu Y, Hao Y, et al. Evaluation of a population-based breast cancer screening in North China. J Cancer Res Clin Oncol. 2023;149(12):10119–30.

Wu L, Chen GZ, Ma YZ, Li TT, Xia JH, Liu GC. Analysis of breast cancer screening status among women aged 35–64 years in Guangdong Province in 2021. Chinese Journal of Woman and Child Health Research. 2023;34(7):51–8.

Han T, Gong HS, Quan SM, Chen L, Li ZH, Xiao D. Analysis of breast cancer screening among rural women in Qinba area. Cancer Research and Clinic. 2023;35(1):44–7.

Wu JM, Wu JL, Reng WH, Ma L, Pang XP, Zao YX. Analysis of Breast Cancer Screening and Influencing Factors in Some Rural Regions of China. Med Soc. 2022;35(6):54–9.

Zhou TH, Gu XY, Yao F, Song SM. Analysis for screening results of breast cancer in Urumqi from 2014 to 2018. Practical Oncology Journal. 2021;35(5):391–5.

Xiao BL. Analysis of breast cancer screening results of rural women in Meizhou city. China Practical Medical. 2021;16(25):188–91.

Shen SJ, Xu YL, Zhou YD, Reng GS, Jiang J, Jiang HZ, et al. A comparative study of breast cancer mass screening and opportunistic screening in Chinese women. Chinese Journal of Surgery. 2021;59(2):109–15.

Shang GXH, Luo Q. Analysis of breast cancer screening results from 2017 to 2020 in Sanming City. Strait Journal of Preventive Medicine. 2021;27(6):102–4.

Ma L, Lian ZQ, Zhao YX, Di JL, Song B, Ren WH, et al. Breast ultrasound optimization process analysis based on breast cancer screening for 1 501 753 rural women in China. Chinese Journal of Oncology. 2021;43(4):497–503.

Lin HZ, Miu HZ, Lian ZQ, Wang X. Empirical research of three-level breast cancer screening and diagnosis mode based on ultrasound examination in somewhere of Guangdong province. International Medicine and Health Guidance News. 2021;27(7):983–7.

Zhao YX, Ma L, Lian ZQ, Wang LH, Wang X, Wu JL. Ultrasound-based breast cancer screening in Chinese rural women in 2014: A multi-center data analysis. Chinese Journal of Cancer Prevention and Treatment. 2020;27(3):172–8.

Yang YP, Gong C, Liu FT, Mei JS, Hu Y, Yun MM, et al. Preliminary results of the SYSU strategy for screening breast cancer in southern Chinese women. SCIENTIA SINICA Vitae. 2020;50(10):1114–20.

Yang L, Zhang X, Liu S, Li HC, Zhang Q, Wang N, et al. Breast cancer screening in urban Beijing, 2014–2019. Chinese Journal of Preventive Medicine. 2020;54(9):974–80.

Wang R, Meng XJ, Wei L. Analysis of breast cancer screening results of women in Shihezi area. Women’s Health Research. 2020;7:186–7.

Liu GM, Wang QJ, Wang QX, Wang FJ, Kang FX. Screening results of cervical cancer and breast cancer in Daxing District, Beijing from 2013 to 2019. Modern Medicine Journal of China. 2020;22(7):39–43.

Lili X, Zhiyu L, Yinglan W, Aihua W, Hongyun L, Ting L, et al. Analysis of breast cancer cases according to county-level poverty status in 3.5 million rural women who participated in a breast cancer screening program of Hunan province, China from 2016 to 2018. Medicine (Baltimore). 2020;99(17):e19954.

Huang XX, Huang XX, Chen ZW, Wu JB, Wang HM. Epidemiological analysis and screening mode of breast cancer screening in Fujian province from 2015 to 2018. National Medical Journal of China. 2020;100(30):2367–71.

Ding ST, Ma XJ, He XP, Chen S, Zhang Y, Qing J. Analysis of breast cancer screening results of 24693 people of physical examination combined with ultrasound and selective complementary mammography. China Medical Herald. 2020;17(29):98–101.

Wu MQ, Li P, Huang XM, Li Z, Deng LJ. Analysis of screening results of breast cancer high-risk groups in the First Division of Xinjiang Production and Construction Corps. China Preventive Medicine. 2018;19(1):75–7.

Shen SY, Yang YP, Wang HL, Hu Y, Gu R, Liu FT, et al. Opportunistic screening compared with organized screening in Guangzhou for breast cancer. Lingnan Modern Clinics in Surgery. 2017;17(5):511–5.

Huang Y, Dai H, Song F, Li H, Yan Y, Yang Z, et al. Preliminary effectiveness of breast cancer screening among 1.22 million Chinese females and different cancer patterns between urban and rural women. Sci Rep. 2016;6:39459.

Shen S, Zhou Y, Xu Y, Zhang B, Duan X, Huang R, et al. A multi-centre randomised trial comparing ultrasound vs mammography for screening breast cancer in high-risk Chinese women. Br J Cancer. 2015;112(6):998–1004.

Mo M, Zheng Y, Liu GY, Fang H, Zhang XH, Zhai LF, et al. Cost-effectiveness analysis of two breast cancer screening modalities in Shanghai, China. Chinese Journal of Oncology. 2015;37(12):944–51.

Ma HM. A pooled analysis and comparison of breast cancer screening programs and screening schemes in shandong. Master: Shan Dong University; 2015.

Yu HY, Li WP, Wang X, Zhang X, Lian ZQ, Xu J. Evaluation of breast screening in city women from 2006 to 2011. Chinese Journal of Cancer Prevention and Treatment. 2013;20(12):894–7.

Xu J, Wang X, Ma HM, Xia JH. Primary efficacy of physical examination combined with ultragraphy and complemented with mammography for breast cancer screening. Chinese Journal of Cancer Prevention and Treatment. 2013;20(17):1295–9.

Shi SD, Zhao FH, Zhang YZ, Li XL, Qiao YL. The rural women cervical cancer and breast cancer screening practices and discussion. The Medical Forum. 2013;17:2185–7.

Mo M, Liu GY, Zheng Y, Di LF, Ji YJ, Lv LL, et al. Performance of breast cancer screening methods and modality among Chinese women: a report from a society-based breast screening program (SBSP) in Shanghai. Springerplus. 2013;2(1):276.

Gong YH, Zhang HY, Xie YG, Zhang CH, Shen HX. Early detection and study of breast cancer in healthy women. Maternal & Child Health Care of China. 2013;28(4):618–21.

Yang ZH, Dai HJ, Yan H, Chen H. Comparison of pathological characteristics in screen-detected and unscreened breast cancer. Tumori. 2012;32(1):56–9.

Kuang XM, Xiao L, He YX, Yang CM, Huang HY. Analysis of results of opportunistic breast screening in 5722 women and its preventive significance. Practical Preventive Medicine. 2012;19(11):1679–80.

Huang Y, Kang M, Li H, Li JY, Zhang JY, Liu LH, et al. Combined performance of physical examination, mammography, and ultrasonography for breast cancer screening among Chinese women: a follow-up study. Curr Oncol. 2012;19(Suppl 2):eS22-30.

Han LL, Qi QQ, Wang C, Zhang Y, Dong CY, Wang LY. Comparison of detection rates of cervical cancer and breast cancer in Beijing. Maternal & Child Health Care of China. 2011;26(16):2426–8.

Xu GW, Hu YS, Kan X. The Preliminary Report of Breast Cancer Screening for 100 000 Women in China. China Cancer. 2010;19(9):565–8.

Gianino MM, Lenzi J, Bonaudo M, Fantini MP, Siliquini R, Ricciardi W, et al. Organized screening programmes for breast and cervical cancer in 17 EU countries: trajectories of attendance rates. BMC Public Health. 2018;18(1):1236.

Dominitz JA, Levin TR. What Is Organized Screening and What Is Its Value? Gastrointest Endosc Clin N Am. 2020;30(3):393–411.

Sun L, Legood R, Sadique Z, Dos-Santos-Silva I, Yang L. Cost-effectiveness of risk-based breast cancer screening programme, China. Bull World Health Organ. 2018;96(8):568–77.

Blumen H, Fitch K, Polkus V. Comparison of Treatment Costs for Breast Cancer, by Tumor Stage and Type of Service. Am Health Drug Benefits. 2016;9(1):23–32.

Cancer Research UK. Survival. 2020. Available online: https://www.cancerresearchuk.org/about-cancer/breast-cancer/survival (Accessed 23 Aug 2022).

Zeng H, Ran X, An L, Zheng R, Zhang S, Ji JS, et al. Disparities in stage at diagnosis for five common cancers in China: a multicentre, hospital-based, observational study. The Lancet Public Health. 2021;6(12):e877–87.

Kubo M, Kumamaru H, Isozumi U, Miyashita M, Nagahashi M, Kadoya T, et al. Annual report of the Japanese Breast Cancer Society registry for 2016. Breast Cancer. 2020;27(4):511–8.

Wong JZY, Chai JH, Yeoh YS, Mohamed Riza NK, Liu J, Teo YY, et al. Cost effectiveness analysis of a polygenic risk tailored breast cancer screening programme in Singapore. BMC Health Serv Res. 2021;21(1):379.

Vanier A, Leux C, Allioux C, Billon-Delacour S, Lombrail P, Molinié F. Are prognostic factors more favorable for breast cancer detected by organized screening than by opportunistic screening or clinical diagnosis? A study in Loire-Atlantique (France). Cancer Epidemiol. 2013;37(5):683–7.

Yang L, Wang S, Zhang L, Sheng C, Song F, Wang P, et al. Performance of ultrasonography screening for breast cancer: a systematic review and meta-analysis. BMC Cancer. 2020;20(1):499.

Gartlehner G, Thaler K, Chapman A, Kaminski-Hartenthaler A, Berzaczy D, Van Noord MG, et al. Mammography in combination with breast ultrasonography versus mammography for breast cancer screening in women at average risk. Cochrane Database Syst Rev. 2013;2013(4):CD009632.

Ren W, Chen M, Qiao Y, Zhao F. Global guidelines for breast cancer screening: A systematic review. Breast. 2022;64:85–99.

Omidiji OA, Campbell PC, Irurhe NK, Atalabi OM, Toyobo OO. Breast cancer screening in a resource poor country: Ultrasound versus mammography. Ghana Med J. 2017;51(1):6–12.

Sung H, Ren J, Li J, Pfeiffer RM, Wang Y, Guida JL, et al. Breast cancer risk factors and mammographic density among high-risk women in urban China. NPJ Breast Cancer. 2018;4:3.

He J, Chen WQ, Li N, Shen HB, Li J, Wang Y, et al. China guideline for the screening and early detection of female breast cancer(2021, Beijing). Zhonghua Zhong Liu Za Zhi. 2021;43(4):357–82.

Liu LY, Wang F, Yu LX, Ma ZB, Zhang Q, Gao DZ, et al. Breast cancer awareness among women in Eastern China: a cross-sectional study. BMC Public Health. 2014;14:1004.

Zeng H, Chen W, Zheng R, Zhang S, Ji JS, Zou X, et al. Changing cancer survival in China during 2003–15: a pooled analysis of 17 population-based cancer registries. Lancet Glob Health. 2018;6(5):e555–67.

Williams F, Thompson E. Disparity in Breast Cancer Late Stage at Diagnosis in Missouri: Does Rural Versus Urban Residence Matter? J Racial Ethn Health Disparities. 2016;3(2):233–9.

Obeng-Gyasi S, Obeng-Gyasi B, Tarver W. Breast Cancer Disparities and the Impact of Geography. Surg Oncol Clin N Am. 2022;31(1):81–90.

Jadav S, Rajan SS, Abughosh S, Sansgiry SS. The Role of Socioeconomic Status and Health Care Access in Breast Cancer Screening Compliance Among Hispanics. J Public Health Manag Pract. 2015;21(5):467–76.

Moubadder L, Collin LJ, Nash R, Switchenko JM, Miller-Kleinhenz JM, Gogineni K, et al. Drivers of racial, regional, and socioeconomic disparities in late-stage breast cancer mortality. Cancer. 2022;128(18):3370–82.

He J, Chen WQ, Li N, Shen HB, Li J, Wang Y, et al. China guideline for the screening and early detection of female breast cancer(2021, Beijing). China Cancer. 2021;43(4):357–82.

Walker MJ, Meggetto O, Gao J, Espino-Hernandez G, Jembere N, Bravo CA, et al. Measuring the impact of the COVID-19 pandemic on organized cancer screening and diagnostic follow-up care in Ontario, Canada: A provincial, population-based study. Prev Med. 2021;151:106586.

Li N, Tan F, Chen W, Dai M, Wang F, Shen S, et al. One-off low-dose CT for lung cancer screening in China: a multicentre, population-based, prospective cohort study. Lancet Respir Med. 2022;10(4):378–91.

Schopper D, de Wolf C. How effective are breast cancer screening programmes by mammography? Review of the current evidence. Eur J Cancer. 2009;45(11):1916–23.

Miles A, Cockburn J, Smith RA, Wardle J. A perspective from countries using organized screening programs. Cancer. 2004;101(S5):1201–13.

Bretthauer M, Wieszczy P, Løberg M, Kaminski MF, Werner TF, Helsingen LM, et al. Estimated Lifetime Gained With Cancer Screening Tests. JAMA Intern Med. 2023;183(11):1196.

Myers ER, Moorman P, Gierisch JM, Havrilesky LJ, Grimm LJ, Ghate S, et al. Benefits and Harms of Breast Cancer Screening: A Systematic Review. JAMA. 2015;314(15):1615–34.

Acknowledgements

Not applicable.

Funding

This work was supported by the National Key Research and Development Program (2016YFC1303000).

Author information

Authors and affiliations

Department of Liaoning Office for Cancer Prevention and Control, Cancer Hospital of China Medical University, Liaoning Cancer Hospital & Institute, No.44 Xiaoheyan Road, Dadong District, Shenyang, Liaoning, 110042, China

Mengdan Li & Bo Zhu

Department of School of Public Health, China Medical University, Shenyang, Liaoning, 110122, China

Hongying Wang

Department of Radiology, Cancer Hospital of China Medical University, Liaoning Cancer Hospital & Institute, Shenyang, Liaoning, 110042, China

Department of Neurosurgery, Cancer Hospital of China Medical University, Liaoning Cancer Hospital & Institute, No.44 Xiaoheyan Road, Dadong District, Shenyang, Liaoning, 110042, China

Haozhe Piao

Contributions

PHZ and ZB conceived this study and take responsibility for all aspects of it. ZB and LMD designed the study and conceived this article. ZB and LMD retrieved and screened the relevant literature, with further contributions from QN. LMD wrote the manuscript. LMD completed all the data extraction and statistical analysis, supported by WHY. All authors contributed to critical revisions and approved the final version of the article.

Corresponding authors

Correspondence to Haozhe Piao or Bo Zhu.

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Table S1. The summary of detailed search keywords. Table S2. PRISMA checklist. Table S3. The pooled breast cancer detection rates in different subgroups of organized screening programs (China, 2010–2023). Table S4. The pooled early-stage (0–II) breast cancer detection rates in different subgroups of organized screening programs (China, 2010–2023). Table S5. The pooled proportion of early-stage (0–II) breast cancer in different subgroups of organized screening programs (China, 2010–2023). Figure S1. Forest plot of the pooled breast cancer detection rate (China, 2010–2023): (A) opportunistic screening; (B) population screening. Figure S2. Forest plot of the pooled early-stage (0–II) cancer detection rate (China, 2010–2023): (A) opportunistic screening; (B) population screening. Figure S3. Forest plot of the pooled proportion of early-stage (0–II) cancer (China, 2010–2023): (A) opportunistic screening; (B) organized screening.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Li, M., Wang, H., Qu, N. et al. Breast cancer screening and early diagnosis in China: a systematic review and meta-analysis on 10.72 million women. BMC Women's Health 24 , 97 (2024). https://doi.org/10.1186/s12905-024-02924-4

Received : 28 September 2023

Accepted : 22 January 2024

Published : 07 February 2024

DOI : https://doi.org/10.1186/s12905-024-02924-4

  • Breast cancer
  • Opportunistic screening
  • Population screening
  • Early diagnosis

BMC Women's Health

ISSN: 1472-6874

Novel Technique Has Potential to Transform Breast Cancer Detection

Low-dose PEM offers strong sensitivity with a low false-positive rate.

Vivianne Freitas, MD, MSc

An innovative breast imaging technique provides high sensitivity for detecting cancer while significantly reducing the likelihood of false positive results, according to a study published today in Radiology: Imaging Cancer. Researchers said the technique has the potential to offer more reliable breast cancer screening for a broader range of patients.

Mammography is an effective screening tool for early detection of breast cancer, but its sensitivity is reduced in dense breast tissue. This is due to the masking effect of overlying dense fibroglandular tissue. Since almost half of the screening population has dense breasts, many of these patients require additional breast imaging, often with MRI, after mammography.

Low-dose positron emission mammography (PEM) is a novel molecular imaging technique that provides improved diagnostic performance at a radiation dose comparable to that of mammography.

For the study, 25 women, median age 52, recently diagnosed with breast cancer, underwent low-dose PEM with the radiotracer fluorine 18-labeled fluorodeoxyglucose (18F-FDG). Two breast radiologists reviewed PEM images taken one and four hours post 18F-FDG injection and correlated the findings with lab results.

PEM displayed comparable performance to MRI, identifying 24 of the 25 invasive cancers (96%). Its false positive rate was only 16%, compared with 62% for MRI.

Along with its strong sensitivity and low false-positive rate, PEM has the advantage of low cost compared with MRI, making it a more accessible option for widespread use. Additionally, the technology is designed to deliver a radiation dose comparable to that of traditional mammography without the need for breast compression, which can often be uncomfortable for patients.

“The integration of these features—high sensitivity, lower false-positive rates, cost-efficiency, acceptable radiation levels without compression, and independence from breast density—positions this emerging imaging modality as a potential groundbreaking advancement in the early detection of breast cancer,” said study lead author Vivianne Freitas, MD, MSc, assistant professor at the University of Toronto. “As such, it holds the promise of transforming breast cancer diagnostics and screening in the near future, complementing or even improving current imaging methods, marking a significant step forward in breast cancer care.”

Images obtained in a 50-year-old female patient with a new biopsy-proven malignant lesion in the left breast. (A) Craniocaudal mammogram of the right breast does not show any lesion. (B) The malignant lesion corresponds with a 7.0-cm irregular and spiculated mass on the left craniocaudal mammogram. US-guided core-needle biopsy revealed grade 2 invasive lobular carcinoma. (C) The bilateral positron emission mammographic craniocaudal color image obtained 1 hour after intravenous injection of 185 MBq of fluorine 18–labeled fluorodeoxyglucose (18F-FDG) shows a mass with intense uptake in the left breast with known cancer and no abnormal uptake in the right breast. Positron emission mammographic craniocaudal images of the left breast obtained (D) 1 hour and (E) 4 hours after intravenous injection of 185 MBq of 18F-FDG show no substantial visual difference in uptake of the known cancer. (F) Axial contrast-enhanced fat-saturated subtracted T1-weighted MR image with maximum intensity projection reconstruction obtained 90 seconds after intravenous injection of 0.1 mmol of gadolinium-based contrast material per kilogram of body weight also shows the enhancing mass corresponding to known malignancy (arrow) and marked bilateral background parenchymal enhancement, with multiple nonspecific foci of enhancement in the contralateral breast. The patient opted for bilateral mastectomy, which confirmed left-sided malignancy and no malignancy in the contralateral breast.

https://doi.org/10.1148/rycan.230020 © RSNA 2024

Low-dose PEM offers potential clinical uses in both screening and diagnostic settings, according to Dr. Freitas.

“For screening, its ability to perform effectively regardless of breast density potentially addresses a significant shortcoming of mammography, particularly in detecting cancers in dense breasts where lesions may be obscured,” she said. “It also presents a viable option for patients at high risk who are claustrophobic or have contraindications for MRI.”

Technique Also Aids in Evaluating Treatment Response, Disease Recurrence

The technology could also play a crucial role in interpreting uncertain mammogram results, evaluating the response to chemotherapy and ascertaining the extent of disease in newly diagnosed breast cancer, including involvement of the other breast.

Dr. Freitas, who is also staff radiologist of the Breast Imaging Division of the Toronto Joint Department of Medical Imaging, University Health Network, Sinai Health System and Women’s College Hospital, is currently researching PEM’s ability to reduce the high rates of false positives typically associated with MRI scans. Should PEM successfully lower these rates, it could significantly lessen the emotional distress and anxiety linked to false positives, Dr. Freitas said. Additionally, it might lead to a decrease in unnecessary biopsies and treatments.

More studies are needed to determine low-dose PEM’s exact role and efficacy in the clinical setting.

“While the full integration of this imaging method into clinical practice is yet to be confirmed, the preliminary findings of this research are promising, particularly in demonstrating the capability of detecting invasive breast cancer with low doses of fluorine-18-labeled FDG,” Dr. Freitas said. “This marks a critical first step in its potential future implementation in clinical practice.”

For More Information

Access the Radiology: Imaging Cancer article, “Breast Cancer Detection Using a Low-Dose Positron Emission Digital Mammography System,” and the related commentary, “Low-Dose Positron Emission Mammography: A Novel, Promising Technique for Breast Cancer Detection.”

International Conference on Advanced Intelligent Systems for Sustainable Development

AI2SD 2019: Advanced Intelligent Systems for Sustainable Development (AI2SD’2019), pp 247–254

Machine Learning Techniques for Breast Cancer Diagnosis: Literature Review

  • Djihane Houfani 15 ,
  • Sihem Slatnia 15 ,
  • Okba Kazar 15 ,
  • Noureddine Zerhouni 16 ,
  • Abdelhak Merizig 15 &
  • Hamza Saouli 15  
  • Conference paper
  • First Online: 06 February 2020

Part of the Advances in Intelligent Systems and Computing book series (AISC,volume 1103)

Breast cancer is one of the major diseases causing a high number of deaths among women. To decrease these numbers, early diagnosis is an important task in the medical process. Machine learning (ML) techniques are an effective way to classify data, especially in the medical field, where they are widely used in diagnosis and decision making. In this paper, we present a review of the most recent publications that employ machine learning approaches in breast cancer diagnosis. The classification models discussed here are based on various ML techniques applied to different datasets.

  • Breast cancer
  • Medical diagnosis
  • Machine learning
  • Classification

Author information

Authors and affiliations.

LINFI Laboratory, University of Biskra, Biskra, Algeria

Djihane Houfani, Sihem Slatnia, Okba Kazar, Abdelhak Merizig & Hamza Saouli

Institut FEMTO-ST, UMR CNRS 6174 - UFC/ENSMM/UTBM, Besançon, France

Noureddine Zerhouni

Corresponding author

Correspondence to Djihane Houfani .

Editor information

Editors and affiliations.

Faculty of Sciences and Techniques of Tangier, Abdelmalek Essaâdi University, Tangier, Morocco

Mostafa Ezziyyani

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper.

Houfani, D., Slatnia, S., Kazar, O., Zerhouni, N., Merizig, A., Saouli, H. (2020). Machine Learning Techniques for Breast Cancer Diagnosis: Literature Review. In: Ezziyyani, M. (eds) Advanced Intelligent Systems for Sustainable Development (AI2SD’2019). AI2SD 2019. Advances in Intelligent Systems and Computing, vol 1103. Springer, Cham. https://doi.org/10.1007/978-3-030-36664-3_28

DOI : https://doi.org/10.1007/978-3-030-36664-3_28

Published : 06 February 2020

Publisher Name : Springer, Cham

Print ISBN : 978-3-030-36663-6

Online ISBN : 978-3-030-36664-3

eBook Packages : Intelligent Technologies and Robotics (R0)

Analysis of Breast Cancer Detection Using Different Machine Learning Techniques

Siham A. Mohammed

9 Taiz University, Taiz, Yemen

Sadeq Darrab

11 University of Magdeburg, Magdeburg, Germany

Salah A. Noaman

10 Aden University, Aden, Yemen

Gunter Saake

Data mining algorithms play an important role in the prediction of early-stage breast cancer. In this paper, we propose an approach that improves the accuracy and enhances the performance of three different classifiers: Decision Tree (J48), Naïve Bayes (NB), and Sequential Minimal Optimization (SMO). We also validate and compare the classifiers on two benchmark datasets: the Wisconsin Breast Cancer (WBC) dataset and the Breast Cancer dataset. Imbalanced classes are a major problem in the classification phase: because the probability that an instance belongs to the majority class is significantly higher, algorithms are much more likely to assign new observations to the majority class. We address this problem with a data-level approach, which consists of resampling the data in order to mitigate the effect of class imbalance. For evaluation, 10-fold cross-validation is performed. The efficiency of each classifier is assessed in terms of true positive, false positive, ROC curve, standard deviation (Std), and accuracy (AC). Experiments show that using a resample filter enhances the classifiers' performance: SMO outperforms the others on the WBC dataset, and J48 is superior to the others on the Breast Cancer dataset.

Introduction

Breast cancer is the second leading cause of death among women worldwide [ 1 ]. In 2019, 268,600 new cases of invasive breast cancer were expected to be diagnosed in women in the U.S., along with 62,930 new cases of non-invasive breast cancer [ 2 ]. Early detection is the best way to increase the chance of treatment and survival. Data mining has become a popular tool for knowledge discovery and shows good results in marketing, social science, finance and medicine [ 19 , 20 ]. Recently, multiple classification algorithms have been applied to medical datasets to perform predictive analysis of patients and their medical diagnoses [ 6 , 9 , 10 , 21 ], for example by using machine learning techniques to assess tumor behavior in breast cancer patients. One problem is class imbalance in the training data, since the probability of not having the disease is higher than that of having it. This paper compares three different classifiers, J48, NB, and SMO, with respect to accuracy in the detection of breast cancer. Our aim is to prepare the dataset with a suitable method that manages the imbalanced data and the missing values in order to enhance the classifiers' performance. All tasks were conducted using Weka 3.8.3.

The remainder of this paper is organized as follows. Section  2 presents literature review. Section  3 introduces the datasets. Section  4 describes the research methodology including pre-processing experiments, classification and performance evaluation criteria. The experimental results are presented in Sect.  5 . Finally, Sect.  6 shows the conclusion and future work.

Literature Review

In recent years, several studies have applied data mining algorithms to different medical datasets to classify breast cancer. These algorithms show good classification results and have encouraged many researchers to apply them to challenging tasks. In [ 21 ], a convolutional neural network (CNN) was used to predict and classify invasive ductal carcinoma in breast histology images with an accuracy of almost 88%. Moreover, data mining is widely used in medical fields to predict and classify abnormal events and to create a better understanding of incurable diseases such as cancer. The outcomes of using data mining for classification are promising for breast cancer detection. Therefore, a data mining approach is used in this work. Some studies related to this approach are listed in Table 1.

Table 1.

Breast cancer detection research using different machine learning algorithms.

The datasets that are used in this paper are available at the UCI Machine Learning Repository [ 13 ].

WBC Dataset

The WBC dataset contains 699 instances and 11 attributes; 458 instances are benign and 241 are malignant [ 14 ]. The value of the Bare Nuclei attribute is missing in 16 records. Data preprocessing is therefore essential for this dataset, and both the class imbalance and the missing values have to be managed.

Breast Cancer Dataset

The features of this dataset are computed from a digitized image of a fine needle aspirate (FNA) of a breast tumor. The target feature records the prognosis (i.e., malignant or benign). The dataset contains 286 instances and 10 attributes; 201 instances are no-recurrence events and 85 are recurrence events. The value of the node-caps attribute is missing in 8 records.

Research Methodology

The two datasets used in this work contain missing values and imbalanced classes. Therefore, before the experiments are performed, a large part of this work is devoted to preprocessing the data in order to enhance the classifiers' performance. Preprocessing focuses on managing the missing values and the class imbalance. To manage the missing values, all instances that contain them are removed. The class imbalance can be addressed by adjusting either the classifier or the balance of the training set; here, the resample filter is used to rebalance the data artificially. Then, 10-fold cross-validation is applied, and finally the three classifiers are compared.

Preprocessing Phase

First, the data were discretized using the discretize filter, and then instances with missing values were removed from the dataset. Second, instances were resampled using the resample filter in order to maintain the class distribution in the subsample and to bias the class distribution toward a uniform distribution. Section 5 shows that this step improves the classifiers' performance. Third, 10-fold cross-validation was applied, and experiments were carried out with three classifiers, Naïve Bayes, SMO and J48, as illustrated in Fig. 1.
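The two data-level steps described above (removing records with missing values, then resampling toward a more uniform class distribution) can be sketched in plain Python. This is only an illustration and not the Weka filters used in the paper: the function names, the "?" missing-value marker, and the `bias_to_uniform` parameter are assumptions made for the example.

```python
import random
from collections import Counter, defaultdict

MISSING = "?"  # assumed marker for a missing attribute value


def drop_missing(rows):
    """Keep only records that contain no missing attribute value."""
    return [row for row in rows if MISSING not in row]


def class_counts(rows, label_index=-1):
    """Count how many records fall into each class."""
    return Counter(row[label_index] for row in rows)


def resample(rows, label_index=-1, bias_to_uniform=1.0, seed=0):
    """Draw about len(rows) records with replacement.

    bias_to_uniform=0.0 keeps the original class proportions;
    bias_to_uniform=1.0 draws the same number of records from each class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_index]].append(row)
    n, k = len(rows), len(by_class)
    sample = []
    for members in by_class.values():
        original_share = len(members) / n
        # Blend the observed class share with the uniform share 1/k.
        target_share = (1 - bias_to_uniform) * original_share + bias_to_uniform / k
        count = round(target_share * n)
        sample.extend(rng.choice(members) for _ in range(count))
    rng.shuffle(sample)
    return sample
```

Because the sampling is done with replacement, the same record can occur several times in the resampled data, which is worth keeping in mind when the result is later split into folds.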

Fig. 1. Proposed breast cancer detection model using Breast Cancer and WBC datasets.

Figure 1 shows the data preprocessing, which consists of three steps: discretization, instance resampling and removal of the missing values. After that, 10-fold cross-validation is applied, and the three classifiers are evaluated on the prepared datasets.

Training and Classification

In order to minimize the bias associated with the random sampling of the training data, we use 10-fold cross-validation after the preprocessing phase. In k-fold cross-validation, the original dataset is randomly partitioned into k equal-size subsets. The classification model is trained and tested k times. Each time, a single subset is retained as the validation data for testing the model, and the remaining k−1 subsets are used as training data. Three classification techniques were selected: Naïve Bayes (NB), a decision tree built with the J48 algorithm, and Sequential Minimal Optimization (SMO). The NB classifier is a probabilistic classifier based on Bayes' rule. It works by estimating, for each class value, the probability that a given instance belongs to that class [ 15 ]. The J48 algorithm [ 16 ] uses the concept of information entropy and works by splitting the data on each attribute into smaller subsets in order to examine entropy differences. It is Weka's implementation of the C4.5 algorithm [ 17 ]. The SMO model implements John Platt's sequential minimal optimization algorithm for training a support vector classifier. This implementation globally replaces all missing values and transforms nominal attributes into binary ones. It also normalizes all attributes by default [ 18 ].
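The k-fold procedure described above can be sketched as follows. This is a minimal, unstratified illustration in plain Python rather than the Weka evaluation used in the paper; `k_fold_indices`, `cross_validate`, and the caller-supplied `fit` and `predict` callables are hypothetical names introduced for the example.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(samples, labels, fit, predict, k=10):
    """Return the mean accuracy over k folds.

    fit(X, y) must return a model; predict(model, x) must return a label.
    """
    scores = []
    for train, test in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, samples[i]) == labels[i] for i in test)
        scores.append(hits / len(test))
    return sum(scores) / len(scores)
```

Each record is used for testing exactly once and for training k−1 times, which is the property the paragraph above relies on.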

Performance Evaluation Criteria

In this study, we use five performance measures to evaluate all the classifiers: true positive, false positive, ROC curve, standard deviation (Std) and accuracy (AC).

$$\mathrm{AC} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP, TN, FP and FN denote true positive, true negative, false positive and false negative, respectively.
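As a small worked illustration of these counts, the sketch below derives TP, TN, FP and FN from paired actual and predicted labels and computes accuracy together with the true positive and false positive rates. The function names and the toy labels ("M" for malignant, "B" for benign) are assumptions made for the example, not data from the paper.

```python
def confusion_counts(actual, predicted, positive):
    """Return (TP, TN, FP, FN) with `positive` treated as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Share of all instances that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def true_positive_rate(tp, fn):
    """Share of actual positives that were predicted positive."""
    return tp / (tp + fn)


def false_positive_rate(fp, tn):
    """Share of actual negatives that were predicted positive."""
    return fp / (fp + tn)


# Toy labels: three malignant and five benign cases.
actual = ["M", "M", "M", "B", "B", "B", "B", "B"]
predicted = ["M", "M", "B", "B", "B", "B", "B", "M"]
tp, tn, fp, fn = confusion_counts(actual, predicted, positive="M")
```

For these toy labels the counts are TP = 2, TN = 4, FP = 1 and FN = 1, so the accuracy is 6/8 = 0.75.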

Experimental Results

First, the three classification algorithms were tested on the WBC and Breast Cancer datasets without the preprocessing techniques. The best results were recorded for J48 on the Breast Cancer dataset (75.52%) and for SMO on the WBC dataset (96.99%). After the preprocessing techniques were applied, accuracy increased to 98.20% with J48 on the Breast Cancer dataset and to 99.56% with SMO on the WBC dataset.

Experiment Using the Breast Cancer Dataset

First, the three classifiers are tested on the original data (without any preprocessing). The results show that J48 is the best, with 75.52% accuracy, while the accuracies of NB and SMO are 71.67% and 69.58%, respectively. Next, we apply the discretization filter and remove the records with missing values. The results improve for NB (75.53%) and SMO (72.66%), while J48 reaches 74.82%. After that, the resample filter is applied seven times. The performance of the classifiers improves, as shown in Table 2.

Table 2.

Performance of the classifiers in the Breast Cancer Dataset.

As Table 2 illustrates, the more often the resample filter is applied, the higher the accuracy. That is because the data is imbalanced and the filter maintains the class distribution. For the Breast Cancer dataset, J48 outperforms the others with 98.20%. Accuracy measures for the J48 classifier are shown in Table 3, and the ROC curve of J48 is shown in Fig. 2.

Table 3.

Accuracy measures for J48 in the Breast Cancer Dataset.

Fig. 2. J48 ROC curve in Breast Cancer Dataset.

To measure the performance of the proposed model, we compare the obtained results with the study in [ 9 ]. The same dataset and three classifiers, including the J48 algorithm, are used to evaluate the model's performance. According to the results, the J48 classifier of the proposed model achieves higher accuracy than the other classifiers. This is because the proposed model uses the resample filter in the preprocessing phase rather than the feature selection technique used in [ 9 ], as illustrated in Table 4.

Table 4.

Comparison of accuracy measures for the Breast Cancer Dataset.

Experiment Using the WBC Dataset

The same experiments were applied to the WBC dataset. After the preprocessing techniques are applied, all algorithms show higher classification accuracy, and applying the resample filter several times improves the classification accuracy further. The SMO classifier achieves 99.56% accuracy, compared with 99.12% for Naïve Bayes and 99.24% for J48. Results are shown in Table 5.

Table 5.

Performance of the classifiers in WBC dataset.

On the WBC dataset, SMO is superior to the others with 99.56%. Accuracy measures for the SMO classifier are shown in Table 6, and the ROC curve of SMO is shown in Fig. 3.

Table 6.

Accuracy measures for SMO in WBC Dataset.

Fig. 3. SMO ROC curve in WBC Dataset.

For the WBC dataset, our proposed method is compared with two studies [ 6 , 10 ]. The results show that the SMO classifier performs better because our model employs preprocessing and resampling. Thus, preprocessing and resampling techniques play an important role in increasing the SMO accuracy compared with the other techniques in [ 6 , 10 ]. Details are shown in Table 7.

Table 7.

Comparison of accuracy measures for the WBC Dataset.

Breast cancer is one of the significant causes of death in women. Early detection of breast cancer plays an essential role in saving women's lives. Breast cancer detection can be done with the help of modern machine learning algorithms. In this paper, we focus on how to deal with imbalanced data that have missing values, using resampling techniques to enhance the classification accuracy of breast cancer detection. In our work, three classification algorithms, J48, NB, and SMO, are applied to two different breast cancer datasets. Results show that using the resample filter in the preprocessing phase enhances the classifiers' performance. In the future, the same experiments will be applied to different classifiers and different datasets.

Contributor Information

Ying Tan, Email: ytan@pku.edu.cn.

Yuhui Shi, Email: shiyh@sustc.edu.cn.

Milan Tuba, Email: tuba@np.ac.rs.

Siham A. Mohammed.

Sadeq Darrab.

Salah A. Noaman, Email: s-salah-17@hotmail.com.

Gunter Saake.

Breast cancer screening and early diagnosis in China: a systematic review and meta-analysis on 10.72 million women

Affiliations.

  • 1 Department of Liaoning Office for Cancer Prevention and Control, Cancer Hospital of China Medical University, Liaoning Cancer Hospital & Institute, No.44 Xiaoheyan Road, Dadong District, Shenyang, Liaoning, 110042, China.
  • 2 Department of School of Public Health, China Medical University, Shenyang, Liaoning, 110122, China.
  • 3 Department of Radiology, Cancer Hospital of China Medical University, Liaoning Cancer Hospital & Institute, Shenyang, Liaoning, 110042, China.
  • 4 Department of Neurosurgery, Cancer Hospital of China Medical University, Liaoning Cancer Hospital & Institute, No.44 Xiaoheyan Road, Dadong District, Shenyang, Liaoning, 110042, China.
  • 5 Department of Liaoning Office for Cancer Prevention and Control, Cancer Hospital of China Medical University, Liaoning Cancer Hospital & Institute, No.44 Xiaoheyan Road, Dadong District, Shenyang, Liaoning, 110042, China.
  • PMID: 38321439
  • PMCID: PMC10848517
  • DOI: 10.1186/s12905-024-02924-4

Background: The incidence of breast cancer among Chinese women has gradually increased in recent years. This study aims to analyze the situation of breast cancer screening programs in China and compare the cancer detection rates (CDRs), early-stage cancer detection rates (ECDRs), and the proportions of early-stage cancer among different programs.

Methods: We conducted a systematic review and meta-analysis of studies in multiple literature databases. Studies that were published between January 1, 2010 and June 30, 2023 were retrieved. A random effects model was employed to pool the single group rate, and subgroup analyses were carried out based on screening model, time, process, age, population, and follow-up method.

Results: A total of 35 studies, including 47 databases, satisfied the inclusion criteria. Compared with opportunistic screening, the CDR (1.32‰, 95% CI: 1.10‰-1.56‰) and the ECDR (0.82‰, 95% CI: 0.66‰-0.99‰) were lower for population screening, but the proportion of early-stage breast cancer (80.17%, 95% CI: 71.40%-87.83%) was higher. In subgroup analysis, the CDR of population screening was higher in the urban group (2.28‰, 95% CI: 1.70‰-2.94‰), in the breast ultrasonography (BUS) in parallel with mammography (MAM) group (3.29‰, 95% CI: 2.48‰-4.21‰), and in the second screening follow-up group (2.47‰, 95% CI: 1.64‰-3.47‰), and the proportion of early-stage breast cancer was 85.70% (95% CI: 68.73%-97.29%), 88.18% (95% CI: 84.53%-91.46%), and 90.05% (95% CI: 84.07%-94.95%), respectively.

Conclusion: There were significant differences between opportunistic and population screening programs. The results of these population screening studies were influenced by the screening process, age, population, and follow-up method. In the future, China should carry out more high-quality and systematic population-based screening programs to improve screening coverage and service.

Keywords: Breast cancer; Early diagnosis; Opportunistic screening; Population screening.

© 2024. The Author(s).
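The pooling step named in the Methods (a random effects model applied to single-group rates) can be illustrated with a DerSimonian-Laird estimator on logit-transformed proportions. This is a sketch under stated assumptions: the abstract does not say which transformation or estimator the authors used, the function name is hypothetical, and the inputs must be hypothetical or user-supplied counts with at least two studies and 0 < events < total in each.

```python
import math


def pool_rates_random_effects(events, totals):
    """DerSimonian-Laird pooling of proportions on the logit scale.

    Returns (pooled_rate, ci_low, ci_high, tau2), where tau2 is the
    estimated between-study variance and the interval is a 95% CI.
    """
    y, v = [], []
    for e, n in zip(events, totals):
        y.append(math.log(e / (n - e)))  # logit of the study rate
        v.append(1 / e + 1 / (n - e))    # approximate variance of the logit
    w = [1 / vi for vi in v]             # fixed-effect (inverse-variance) weights
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q measures how far the studies scatter around the fixed-effect mean.
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    w_re = [1 / (vi + tau2) for vi in v]  # random-effects weights
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def expit(x):
        """Map a logit back to a proportion."""
        return 1 / (1 + math.exp(-x))

    return expit(y_re), expit(y_re - 1.96 * se), expit(y_re + 1.96 * se), tau2
```

A detection rate in per mille, as reported in the Results, would be the returned proportion multiplied by 1000. When the study rates agree closely, tau2 shrinks to zero and the result coincides with a fixed-effect pooled rate.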

Publication types

  • Meta-Analysis
  • Systematic Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breast Neoplasms* / epidemiology
  • China / epidemiology
  • Early Detection of Cancer / methods
  • Mammography / methods
  • Mass Screening
  • Ultrasonography, Mammary

Grants and funding

  • 2016YFC1303000/National Key Research and Development Program of China

COMMENTS

  1. A literature review on the imaging methods for breast cancer

    A literature review on the imaging methods for breast cancer. Int J Physiol Pathophysiol Pharmacol, v.14 (3); 2022. PMC9301184.

  2. Breast Cancer Detection and Diagnosis Using Mammographic Data

    This review aimed to survey both traditional ML and DL literature with particular application for breast cancer diagnosis. The review also provided a brief insight into some well-known DL networks. Methods: In this paper, we present an overview of ML and DL techniques with particular application for breast cancer.

  3. Deep learning applications to breast cancer detection by magnetic

    This paper systematically reviewed the current literature on deep learning detection of breast cancer based on magnetic resonance imaging (MRI). The literature search was performed from 2015 to Dec 31, 2022, using Pubmed.

  4. Breast Cancer—Epidemiology, Classification, Pathogenesis and Treatment

    The article presents a review of the literature on breast carcinoma, a disease affecting women worldwide. Abstract: Breast cancer is the most commonly diagnosed malignant tumor in women in the world, as well as the leading cause of death from malignant tumors. The incidence of breast cancer is constantly increasing in all regions of the world.

  5. A Systematic Literature Review of Breast Cancer Diagnosis Using Machine

    1 Introduction: According to the GLOBOCAN 2020 report, 19.3 million cancer cases and 10 million deaths were recorded in 2020 [ 1, 2 ]. The number of female breast cancer cases has surpassed that of lung cancer, with 2.3 million new cases predicted [ 3, 4, 5, 6, 7 ].

  6. A Comprehensive Review on Breast Cancer Detection ...

    Abstract: The incidence and mortality rate of Breast Cancer (BC) are global problems for women, with over 2.1 million new diagnoses each year worldwide. There is no age range, race, or ethnicity threshold, as all women are susceptible; however, no permanent remedy has been developed for it.

  7. Breast cancer detection using artificial intelligence techniques: A

    Breast cancer detection using artificial intelligence techniques: A systematic literature review. Ali Bou Nassif, Manar Abu Talib, Qassim Nasir, Yaman Afadar, Omar Elgendy. https://doi.org/10.1016/j.artmed.2022.102276

  8. Literature review of breast cancer detection using machine learning

    Cancer is the leading cause of non-accidental deaths worldwide. Specifically, nearly 10 million people died globally from cancer in the year 2020. Breast Cancer (BC) is a common and fatal disease among women worldwide, and ranks fourth among the fatal diseases among various cancers, such as cervical, colorectal, and cervical tumors and brain ...

  9. Systematic Review of Computing Approaches for Breast Cancer Detection

    This paper introduces the findings of a systematic review that seeks to examine the state-of-the-art CAD systems for breast cancer detection. This review is based on 118 publications published in 2018-2021 and retrieved from major scientific publication databases while using a rigorous methodology of a systematic review.

  10. Breast cancer detection using artificial intelligence techniques: A

    Introduction. Breast cancer is one of the major causes of death in women around the world. According to the American Cancer Society, 41,760 women and more than 500 men died from breast cancer recently. Breast cancer occurs in four main types: normal, benign, in-situ carcinoma and invasive carcinoma [1]. A benign tumor involves a minor change in the breast structure.

  11. An anatomization on breast cancer detection and ...

    This paper aims to review Artificial neural networks, Multi-Layer Perceptron Neural network (MLP) and Convolutional Neural network (CNN) employed to detect breast malignancies for early diagnosis of breast cancer based on their accuracy in order to identify which method is better for the diagnosis of breast cell malignancies.

  12. Breast cancer detection using artificial intelligence techniques: A

    Breast cancer detection using artificial intelligence techniques: A systematic literature review. Ali Bou Nassif*, Manar Abu Talib, Qassim Nasir, Yaman Afadar, Omar Elgendy. {anassif, mtalib, nasir, u17104387, u16104886}@sharjah.ac.ae. University of Sharjah, UAE. * Corresponding author.

  13. Deep Learning Based Methods for Breast Cancer Diagnosis: A ...

    Breast cancer is one of the precarious conditions that affect women, and a substantive cure has not yet been discovered for it. With the advent of Artificial intelligence (AI), recently, deep learning techniques have been used effectively in breast cancer detection, facilitating early diagnosis and therefore increasing the chances of patients' survival. Compared to classical machine learning ...

  14. Breast Mammograms Diagnosis Using Deep Learning: State of ...

    Usually, screening (mostly mammography) is used by radiologists to manually detect breast cancer. The likelihood of identifying suspected cases as false positives or false negatives is significant, contingent on the experience of the radiologist and the kind of imaging screening device/method utilized. The confirmation of the type of tumour seen by the radiologist is sent for histological ...

  15. Machine Learning Techniques for Breast Cancer Diagnosis: Literature Review

    ... neural networks can be weak in generalizing and can get stuck in local optima. Haifeng W. et al. [10] designed an SVM-based ...

  16. A Comprehensive Survey on Deep-Learning-Based Breast Cancer Diagnosis

    Simple Summary: Breast cancer was diagnosed in 2.3 million women, and around 685,000 deaths from breast cancer were recorded globally in 2020, making it the most common cancer. Early and accurate detection of breast cancer plays a critical role in improving the prognosis and bringing the patient survival rate to 50%.

  17. Breast cancer detection using artificial intelligence techniques: A

    Breast cancer detection using artificial intelligence techniques: A systematic literature review. Authors: Ali Bou Nassif 1 , Manar Abu Talib 2 , Qassim Nasir 3 , Yaman Afadar 4 , Omar Elgendy 5. Affiliations: 1 University of Sharjah, United Arab Emirates. 2 University of Sharjah, United Arab Emirates.

  18. Unveiling promising breast cancer biomarkers: an integrative approach

    Breast cancer remains a significant health challenge worldwide, necessitating the identification of reliable biomarkers for early detection, accurate prognosis, and targeted therapy. Breast cancer RNA expression data from the TCGA database were analyzed to identify differentially expressed genes (DEGs). The top 500 up-regulated DEGs were selected for further investigation using random forest ...

  19. (PDF) A Systematic Review on Breast Cancer Detection Using Deep

    This paper aims to provide a detailed survey dealing with the screening techniques for breast cancer with pros and cons. The applicability of deep learning techniques in breast cancer detection is ...

  20. Healthcare

    Breast cancer survival has increased significantly over the last few decades due to more effective strategies for prevention and risk modification, advancements in imaging detection, screening, and multimodal treatment algorithms. However, many have observed disparities in benefits derived from such improvements across populations and demographic groups. This review summarizes published works ...

  21. Novel technique has potential to transform breast cancer detection

    Breast Cancer Detection Using a Low-Dose Positron Emission Digital Mammography System. Radiology: Imaging Cancer, 2024; 6 (2). DOI: 10.1148/rycan.230020

  22. A Systematic Review on Breast Cancer Detection Using Deep ...

    Article, 20 March 2023. 1 Introduction: Breast cancer is categorized among the most frequently reported cancers in the world. It has been reported in both males and females. However, its frequency in females is far higher.

  23. Breast cancer screening and early diagnosis in China: a systematic

    The incidence of breast cancer among Chinese women has gradually increased in recent years. This study aims to analyze the situation of breast cancer screening programs in China and compare the cancer detection rates (CDRs), early-stage cancer detection rates (ECDRs), and the proportions of early-stage cancer among different programs. We conducted a systematic review and meta-analysis of ...

  24. Breast Cancer—Epidemiology, Risk Factors, Classification, Prognostic

    Simple Summary: Breast cancer is the most common cancer among women. It is estimated that 2.3 million new cases of BC are diagnosed globally each year.

  25. Breast cancer detection using artificial intelligence techniques: A

    Breast cancer is one of the most common cancer types. According to the National Breast Cancer Foundation, in 2020 alone, more than 276,000 new cases of invasive breast cancer and more than 48,000 ...

  26. Novel Technique Has Potential to Transform Breast Cancer Detection

    February 09, 2024. An innovative breast imaging technique provides high sensitivity for detecting cancer while significantly reducing the likelihood of false positive results, according to a study published today in Radiology: Imaging Cancer. Researchers said the technique has the potential to offer more reliable breast cancer ...

  27. Machine Learning Techniques for Breast Cancer Diagnosis: Literature Review

    Various machine learning techniques are used for cancer classification, such as k-nearest neighbor, multi-layer perceptron, support vector machine, and neural network techniques etc. Neural network techniques are very useful for cancer detection. The rest of the paper is organized as follows. Section 2 is an overview on breast cancer disease.

  28. Analysis of Breast Cancer Detection Using Different Machine Learning

    Section 2 presents literature review. ... Breast cancer detection can be done with the help of modern machine learning algorithms. In this paper, we focus on how to deal with imbalanced data that have missing values using resampling techniques to enhance the classification accuracy of detecting breast cancer. In our work, three classifiers ...

  29. Breast cancer screening and early diagnosis in China: a systematic

    Background: The incidence of breast cancer among Chinese women has gradually increased in recent years. This study aims to analyze the situation of breast cancer screening programs in China and compare the cancer detection rates (CDRs), early-stage cancer detection rates (ECDRs), and the proportions of early-stage cancer among different programs.