Reviewing the Role of Artificial Intelligence in Cancer



Introduction
The human body comprises trillions of cells, and the chance of cancer arising in any part of it is significant. Non-communicable diseases (NCDs) are now a major cause of death globally, and cancer is expected to rank as the leading cause of death in every country [1]. Cancer incidence and mortality are growing rapidly worldwide [2]. According to a 2015 World Health Organization (WHO) report, cancer is among the top-ranking causes of death before the age of 70-75 years in 91 of 172 countries and ranks third or fourth in 22 other countries.
Early detection and prompt diagnosis play a vital role in tackling cancer. Of the imaging techniques available for cancer screening and diagnosis, mammography, ultrasound, and thermography top the list. Mammography is one of the most important early diagnostic methods for breast cancer, but it is less effective in dense breasts. For this reason, ultrasound (diagnostic sonography) is recommended in such cases [3]. In recent years, technological advances in medical imaging and the discovery of minimally invasive biomarkers have shown promise in addressing such challenges across a wide spectrum that includes cancer detection, therapeutics, and monitoring. However, one of the major challenges lies in interpreting the large volume of data that these advances generate.
Over the past few years, the potential of machine learning (ML) in precision oncology has become more apparent. Applications of deep learning (DL), a subset of ML, have been reported across a wide array of tasks, including diagnosis, prognosis, and prediction [4][5][6][7]. DL has shown impressive performance in the classification of image data in varied clinical fields, and advances in DL have greatly improved its efficiency and precision in oncology. Examples include detecting and classifying skin lesions, identifying and categorizing lung cancers, and detecting breast cancer metastases. All of these image-based DL techniques primarily employ convolutional neural networks (CNNs) [8][9][10][11]. Different ML approaches have been used in oncology, including the analysis of datasets from varied sources using both supervised and unsupervised learning [12][13].

Concepts of Artificial Intelligence, Machine Learning and Deep Learning
Artificial intelligence (AI), together with the mathematical systems for estimation and classification that underpin it, is a field of computer science conceived around the beginning of the 1940s. AI is primarily centered on mathematical models that mimic the functioning of the human brain. ML is a branch of AI in which a system learns from large numbers of data samples and produces results for classification and regression tasks [14][15][16]. As the size of the data set increases, learning improves, and it becomes possible via ML to estimate unknowns and to predict outputs. As formally defined by Tom Mitchell: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E" [17]. ML is seen as a new field of research for biomedical studies, with a variety of applications in subdomains such as cancer detection and monitoring [18]. ML algorithms are dynamic in nature and generally improve as more data are added to the dataset. Most ML algorithms are represented as mathematical models in which data samples are described by observed variables, termed features, and by outcome variables, termed labels [14][15][16][17][18][19]. The algorithm is optimized through a process termed training, which uses a training set of data available from previous measurements. Once trained, the model predicts labels by extracting and analyzing the relevant features, even for newly added data samples.

Classification of Machine Learning Techniques
ML techniques can be classified by label type and by feature type. By label type, they fall into three categories: (i) supervised, (ii) unsupervised, and (iii) reinforcement learning. By feature type, they are classified as (i) handcrafted and (ii) non-handcrafted feature-based techniques.

Supervised Learning
Supervised learning applies to data sets that have been labeled by researchers in the relevant field or by industry professionals. Algorithms, often combined with feature engineering, are trained to reduce the prediction error, that is, the difference between the predicted labels and the known labels. Commonly employed algorithms include linear and logistic regression, naive Bayes classification [19], support vector machines (SVMs) [14], and random forests [20]. Histopathology-based classification is one example of supervised learning: pathological images are labeled by an expert as cancer versus non-cancer and by Gleason grade [21][22][23].
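The idea of training against known labels to reduce prediction error can be illustrated with a minimal sketch of logistic regression fitted by gradient descent. The data set below is a toy assumption (two invented features, a lesion size and a texture score, with invented labels), and the function names `train_logistic_regression` and `predict` are hypothetical; it is not taken from any of the cited studies.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(features, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by gradient descent on the mean log-loss."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - y  # difference between predicted and known label
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    score = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return 1 if score >= 0.5 else 0


# Toy, invented data: [lesion size (cm), texture score]; label 1 = malignant.
X = [[0.5, 0.2], [0.8, 0.1], [1.0, 0.3], [2.5, 0.8], [3.0, 0.9], [2.8, 0.7]]
y = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic_regression(X, y)
predictions = [predict(weights, bias, x) for x in X]
```

In practice the labels would come from expert annotation, and performance would be assessed on held-out data rather than on the training set used here.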

Unsupervised Learning
Unsupervised learning applies to datasets that are not explicitly labeled; the algorithm separates the data into different classes based on the input features alone. Commonly employed algorithms include k-means clustering [19], which finds groups or clusters in the provided data, principal component analysis (PCA) [24], and autoencoders [25]. For example, unsupervised learning has been applied to recognize patterns of immunohistochemical staining for histone modifications in tissue samples. That study inferred that different patterns are associated with different risks of cancer recurrence, largely independently of clinical parameters such as tumor stage or PSA [26].
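As a rough illustration of clustering without labels, the following sketch implements plain k-means on a handful of invented two-dimensional points. The initial centroids are fixed by hand so that the result is deterministic; the data and the function name `kmeans` are assumptions made for this example only.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then update."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [
                sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids
            ]
            clusters[distances.index(min(distances))].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Invented feature vectors forming two visibly separate groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
```

No label is supplied at any point; the grouping emerges from the distances between feature vectors alone.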

Reinforcement Learning
Reinforcement learning applies to problems in which an algorithm learns from the consequences of its own decisions rather than from labeled examples. The algorithm involves an agent that chooses each prospective step based on the present and past features of the dataset. The agent learns which action to select at each stage so as to maximize the expected reward.
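A minimal sketch of this idea is tabular Q-learning on a four-state chain in which only reaching the last state is rewarded. Everything here is a toy assumption: the environment, the reward, and the constants are invented, and exploration is simplified by sweeping over every state-action pair instead of sampling actions at random.

```python
N_STATES = 4        # states 0..3; state 3 is the terminal goal
ACTIONS = (-1, +1)  # index 0 = move left, index 1 = move right
GAMMA = 0.9         # discount factor
ALPHA = 0.5         # learning rate


def step(state, action):
    """Deterministic toy environment: reward 1 on entering the goal state."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_learning(sweeps=50):
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for state in range(N_STATES - 1):  # the terminal state has no actions
            for a_idx, action in enumerate(ACTIONS):
                next_state, reward = step(state, action)
                target = reward + GAMMA * max(q[next_state])
                q[state][a_idx] += ALPHA * (target - q[state][a_idx])
    return q


q_table = q_learning()
# Greedy policy: the action index with the highest learned value per state.
policy = [row.index(max(row)) for row in q_table[:-1]]
```

The learned policy selects the action with the highest expected discounted reward in each state, which in this toy chain means always moving toward the goal.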

Feature-Based ML Techniques
These techniques use handcrafted or non-handcrafted features for ML. Handcrafted feature-based implementations rely on deriving a defined set of explicit features. As mentioned earlier, these features are specified by biomedical experts, who look for them in their own diagnostic or decision process; feature selection therefore depends largely on subject-matter experts with extensive knowledge of the topic. For example, a dataset may be confined to features captured from glands and nuclei in stained histopathology slides, such as the number of nuclei per unit area, together with their shape and statistical properties. These techniques involve a preprocessing step in which suitable image processing algorithms, such as edge or object detection, are applied [27]. Non-handcrafted feature-based techniques derive their features directly from the raw data. The algorithm learns to extract its own features, even without explicit labeling of those features, and tries to minimize the prediction error. These methods scale with data, that is, their performance improves as larger data sets are used for training, although the resulting features may not be interpretable by humans [5-25-28].
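To make the notion of a handcrafted feature concrete, the sketch below computes one such feature, the number of nuclei per unit area, from a binary mask by counting connected foreground regions. The mask, the pixel area, and the function names `count_objects` and `nuclei_density` are illustrative assumptions; a real pipeline would first need a segmentation step to produce the mask.

```python
from collections import deque


def count_objects(mask):
    """Count 4-connected foreground regions in a binary mask (list of lists)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                count += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one region
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


def nuclei_density(mask, pixel_area_mm2):
    """Handcrafted feature: objects per square millimetre of imaged tissue."""
    total_area = len(mask) * len(mask[0]) * pixel_area_mm2
    return count_objects(mask) / total_area


# Invented 4 x 5 mask in which 1 marks a pixel belonging to a nucleus.
mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
density = nuclei_density(mask, pixel_area_mm2=0.01)
```

A non-handcrafted approach would skip this explicit definition and let the model learn its own features from the raw pixels.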
The large volume of clinical laboratory reports and medical data generated is generally in the form of unstructured text, which is incomprehensible to a computer program, whereas image, electrophysiological (EP), and genetic data are mostly machine-readable, so ML algorithms can be applied readily once the data have been preprocessed. Applying ML methodologies to data samples involves several basic stages. Captured data may have quality issues, and ML algorithms become more suitable once preprocessing is complete. Such biomedical data may contain outliers, missing values, and noise from duplicate records, all of which degrade data quality. The performance of ML algorithms improves as data quality improves. Figure 1 summarizes the workflow of a biomedical imaging system that implements machine learning and deep learning methodology. Different techniques are available for data preprocessing that focus on modifying the data to better fit a specific ML method [16]. These techniques include feature selection, dimensionality reduction, and feature extraction. Dimensionality reduction is applied when the number of features in a dataset is large; ML algorithms perform better, and significant performance improvement can be obtained, when the dimensionality is lower [29]. Feature selection comprises selecting a subset of all features that is sufficient to capture a significant amount of the relevant information in a dataset.

apjcb.waocp.com Shankargouda Patil, et al: Artificial Intelligence and Cancer

Deep Learning
Deep learning is a branch of machine learning that primarily deals with algorithms derived from the structure and functioning of the human brain: artificial neural networks (ANNs). ANNs have been applied to a diversity of classification and pattern recognition problems. Numerous hidden layers produce neural connections based on mathematical concepts that are generally used in ANN systems. Researchers have shown that biomedical systems are mostly nonlinear, which makes ANNs a valuable computing tool for research in biology. AI has been applied to various aspects of cancer for a few decades [30]. With ongoing research, computational methods have become more effective than before.

AI Applications in Cancer Imaging
AI-based deep learning algorithms have been implemented to identify complex patterns in medical and clinical images. They interpret images to complement clinical decisions, supporting judgments that are often hard for humans to make. AI also enables various data streams to be gathered into dynamically integrated diagnostic systems; these streams include radiographic images, pathology, genomics, electronic health records, and social networks. Recognizing complicated patterns in cross-sectional radiographic images produced by MRI and CT scanning is challenging for human readers, whereas computers can potentially be trained to produce results rapidly. ML can be applied to MRI datasets or other digitally captured images. Figure 2 describes the phases of cancer detection in image processing. Low-level tasks form the initial stage of image analysis, such as segmentation and registration; they are mathematically formulated using statistical and biomechanical modeling and are targeted at computer vision-based image processing problems. Higher-level tasks have provided relevant information for prostate cancer detection, characterization, and grading. AI-based cancer imaging offers great applicability and flexibility and supports three major tasks: detection, classification, and treatment monitoring of tumors.
Computer-aided detection (CADe) is the term used for detection tasks that involve finding objects in radiographs. CADe has been used as a companion aid in recognizing missed cancers in low-dose CT screening [31] and in identifying brain tumor progression in MRI images with high sensitivity [32]. CADe has also aided mammography, from spotting microcalcification clusters to identifying early-stage breast masses [33]. Recent studies have shown CADe to be effective in reducing diagnostic constraints such as inter-rater bias, poor reproducibility of reports among biomedical professionals, and time and labor requirements [34][35].
AI-based applications can characterize tumors efficiently through automated segmentation. Images of the entire body can potentially be interpreted by AI algorithms to perform segmentation tasks, and performance can be enhanced in identifying organ structures that most personnel other than an expert pathologist would not detect. Radiologic data are being used to train AI to diagnose suspicious lesions and classify them as benign or malignant. Recent research has focused mainly on tumor extension and multifocality in breast MRI [36]. Newer computerized approaches to lesion assessment rely on volume-based analysis tools in contrast-enhanced magnetic resonance mammography (MRM) [37]. From a data perspective, advances in genomics offer scope for collaboration with AI-based imaging efforts [38]. The World Health Organization (WHO) criteria and the Response Evaluation Criteria in Solid Tumors (RECIST) were developed principally to address the difficulties of conventional manual tracking of tumors. Various biomarkers are also being studied and implemented for cancer treatment, in addition to their use as an alternative for continuous tracking of cancer. Investigation of circulating tumor DNA (ctDNA) released from tumor cells represents the current state of the art in this field and enhances the tracking of disease evolution [39][40][41][42]. AI could enable unified treatment that connects molecular and pathological information with image-based findings to aid decision making. The various fields in which AI can be implemented in cancer research are listed in Table 1.

Applications of AI in Lung Cancer
Biomedical imaging and ML have given research a new dimension. Detecting lung cancer at an early stage is always important, and ML has added new features and possibilities for enhancing lung cancer diagnosis and tracking treatment response. Various models are being designed to promote early-stage detection and to enable AI to categorize lung nodules meaningfully into two classes, benign or malignant [43][44][45].
The National Lung Screening Trial (NLST) demonstrated a 20% reduction in lung cancer mortality among current and former smokers screened with low-dose CT (LDCT) [46]. The NLST also exposed a number of constraints in distinguishing the early stages of lung cancer, which can potentially be addressed by computational approaches [46][47][48][49]. To date, there is no validated and verified approach for categorizing nodules as malignant or benign. Classical biostatistics and ML methodologies have been applied to address various obstacles in lung cancer screening. ML has opened up multiple possibilities and newer techniques for recognizing biomarkers, minimizing false-positive imaging outcomes, and categorizing benign and malignant nodules more precisely [50]. In a recent study, four quantitatively scored semantic features (short-axis diameter, contour, concavity, and texture) were used in an ML model to classify nodules as benign or malignant in the lung cancer screening setting. The model classified the nodules with an accuracy of 74.3% [44]. Image-based biomarkers can be extracted from radiographs and offer insight into the underlying pathophysiology of a tumor. Clinical and biomedical practice relies on size-based measurement, which provides an estimate of prognostic factors such as survival and recurrence rates [51][52][53][54]. AI methods are being explored to quantify the phenotypic characteristics of radiographic images, and their association with specific mutations, using predefined algorithms and deep learning, a process termed radiomics [54]. Research on numerous cancer types, including lung cancer, has reported results with P < 3.53 × 10⁻⁶ [54].
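Accuracy figures such as the one quoted above are derived from the counts of correct and incorrect classifications. As a minimal sketch, assuming invented labels and predictions (1 = malignant, 0 = benign) and a hypothetical helper `classification_metrics`, accuracy, sensitivity, and specificity can be computed as follows.

```python
def classification_metrics(true_labels, predicted_labels):
    """Accuracy, sensitivity and specificity for binary labels (1 = malignant)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Invented example: eight nodules with known and predicted classes.
true_labels = [1, 1, 1, 0, 0, 0, 0, 1]
predicted_labels = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = classification_metrics(true_labels, predicted_labels)
```

Reporting sensitivity and specificity alongside accuracy matters in screening, where false positives and missed cancers carry very different costs.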

Applications of AI in Breast Cancer
Statistical reports indicate that breast cancer is the most frequently diagnosed cancer [55]. Breast cancer is a heterogeneous disease, with wide variation in tumor size, prognosis, etiology, and response to treatment. Recent advances in imaging and in computer systems have resulted in a rapid rise in the potential use of AI for numerous tasks in breast imaging. These AI applications are mostly used for diagnosis and for predicting treatment response and prognosis [56-57-66-71-58-65].
Breast cancer screening is supported by computer-aided detection (CADe) and computer-aided diagnosis (CADx). A large amount of work has been done in this field over recent decades [71][72][73]. CADe is mostly applied to assist the interpretation of mammograms and has been part of regular biomedical practice since 1990 [72][73]. The challenges radiologists face in detecting cancer include complex noise (incomplete visual search patterns, camouflaging normal anatomic background), fatigue, the assessment of subtle and complicated disease states, the huge volume of image data, and image quality. CADe remains an active research field in mammography, aiming to automate the identification of breast lesions on MRI, 3D ultrasound, and tomosynthesis images by combining predefined algorithms with deep learning methodology [74][75][76][77]. CNN models have been applied to the interpretation of mammograms [56], and studies show that deep learning methodology [59] provides great flexibility in the CAD of breast lesions on ultrasound, MRI, and mammography [74][75][76][77].
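The core operation of the CNN models mentioned above is a small filter slid across the image. The following sketch shows a single "valid" two-dimensional convolution (strictly, cross-correlation, as in most DL frameworks) followed by a ReLU activation. The image and the hand-set edge kernel are invented for illustration; in a trained CNN the kernel weights are learned from data.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation of an image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Rectified linear unit: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


# Invented 4 x 4 image with a dark-to-bright vertical edge in the middle.
image = [[0, 0, 1, 1] for _ in range(4)]
# Hand-set kernel that responds to a left-to-right increase in intensity.
edge_kernel = [[-1, 1],
               [-1, 1]]
feature_map = relu(conv2d(image, edge_kernel))
```

The resulting feature map is non-zero only where the edge lies, which is the sense in which a convolutional layer localizes a pattern in an image.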
Computer vision-based deep learning algorithms have frequently been applied in the past few years to quantify breast density and to characterize parenchymal patterns in breast images, both of which are significant biomarkers for cancer risk estimation, and ultimately to guide treatment management. Higher breast density is associated with a higher risk of breast cancer and can also obscure lesions, hindering their identification. Volumetric estimates of density are increasingly applied [78][79][80]. In full-field digital mammography (FFDM), tissue is categorized into different classes based on differences in X-ray attenuation between fibroglandular and fat tissue. A further image-based risk factor is the variability in parenchymal patterns, which depends on the spatial distribution of dense tissue. Deep learning-based texture analysis of parenchymal patterns, including in carriers of BRCA1/BRCA2 gene mutations, has been used to assess breast cancer risk, achieving an AUC of around 0.82 [71-81-82].
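The AUC quoted above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the scores and labels are invented, and `roc_auc` is a hypothetical helper, not the evaluation code of the cited studies.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Invented risk scores for three positive and three negative cases.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.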
The research community has been actively working since the 1980s to advance ML techniques for CADx, with the aim of classifying breast lesions as benign or malignant [73]. AI-based CADx performs computerized tumor classification and basic characterization of the kind carried out by an expert radiologist. Such software can be used to characterize a suspicious lesion, predict the prognosis of cancer, and provide the specialist with a patient tracking system. AI-based software systems are being used extensively in breast cancer, and captured image data are being successfully classified based on tumor size, kinetics, texture, shape, and morphology [86].

Applications of AI in CNS Tumors
CNS tumors present with a wide spectrum of pathology and are possibly more diverse than tumors of any other part of the human body. This wide range of diagnoses demands highly specific and accurate assessment by imaging modalities. One of the most important biomarkers for determining prognosis in CNS tumors is isocitrate dehydrogenase (IDH). The presence of an IDH mutation can be effectively recognized using ML methods, including deep CNNs trained on conventional MR images [87][88]. Technically similar work has been done on other brain tumors: studies have demonstrated that algorithms trained to extract radiomics features from conventional MRI can generate predictive models for pituitary adenoma subtypes and pediatric brain tumors. Various challenges arise when distinguishing between tumor types. One of the major challenges in the diagnosis of CNS tumors is differentiating primary CNS lymphoma from glioblastoma, because their imaging phenotypes are similar. Results have shown that radiomics models using image texture-based features can enhance the discrimination between glioblastoma and primary CNS lymphoma [87][88][89]. Interestingly, a similar diagnostic dilemma often arises when evaluating histopathology slides of these same two disease processes [90]. Recent research has applied AI to brain tumors with a focus on efficiently categorizing their biological and histopathologic subclasses [87][88][89][90][91]. New AI models that classify tumors accurately require training and testing on large amounts of corresponding data. With increasing accuracy in discriminating among multiple tumor types, AI would thereby help improve the quality of treatment [92].
Treatment of tumors depends on the accurate classification of tumor subclasses. MR imaging is very useful in characterizing CNS neoplasms. These tumors may present with different degrees of contrast enhancement, may be associated with hemorrhage and peritumoral edema, or may be poorly demarcated from adjacent bone, blood vessels, fat, or surgical packing materials. Automatic identification of CNS tumors is expected to require robust density-based algorithms that describe the tumor and relate it to its microenvironment, which plays a major role. Recent methods for automatic and semiautomatic detection of CNS tumors have mainly been applied to conventional MR imaging, ultrasound, and PET images [93]. Models are being created for applications such as treatment planning for stereotactic radiosurgery [94], volume-based detection of residual tumor after surgery, and tracking of tumor growth over time [93]. Algorithms that automatically detect tumors in patients with numerous intracranial lesions could be of great advantage in monitoring metastases, growth rate, and response to treatment over time. For skull-base lesions, which are mostly irregular in shape and extend across extracranial and intracranial compartments, AI could help with automatic volumetric reconstruction and with detecting subtle variations in growth that an ordinary observer often misses. Spatial classification of the heterogeneous tissues present in both tumor lesions and treatment-related changes remains a challenge. However, ML makes it possible to combine multiple imaging features and thereby build a tissue classifier that is not only accurate but also takes the heterogeneity of treated tumors into account.
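Volume-based tracking of the kind described above reduces, once a segmentation mask is available, to counting tumor voxels and scaling by the voxel size. The sketch below assumes invented masks and an invented voxel spacing; the names `tumor_volume_mm3` and `percent_change` are hypothetical, and the segmentation step itself is taken as given.

```python
def tumor_volume_mm3(mask, voxel_size_mm):
    """Volume of a binary 3D mask: number of tumor voxels times voxel volume."""
    voxel_count = sum(v for slice_ in mask for row in slice_ for v in row)
    dx, dy, dz = voxel_size_mm
    return voxel_count * dx * dy * dz


def percent_change(baseline, follow_up):
    """Relative change of the follow-up value with respect to baseline."""
    return 100.0 * (follow_up - baseline) / baseline


# Invented masks: two slices of 2 x 2 voxels, 1 = voxel labeled as tumor.
baseline_mask = [[[1, 1], [0, 0]],
                 [[1, 0], [0, 0]]]
follow_up_mask = [[[1, 1], [1, 0]],
                  [[1, 1], [1, 0]]]
voxel_size_mm = (1.0, 1.0, 2.0)  # assumed in-plane spacing and slice thickness

baseline_volume = tumor_volume_mm3(baseline_mask, voxel_size_mm)
follow_up_volume = tumor_volume_mm3(follow_up_mask, voxel_size_mm)
growth = percent_change(baseline_volume, follow_up_volume)
```

Comparing volumes across time points in this way is what allows small changes in irregularly shaped lesions to be quantified consistently.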
One example is the differentiation of radionecrosis from recurrent brain tumors using texture features extracted from conventional MRI [95][96]. Susceptibility-weighted and perfusion-weighted MRI sequences can also be integrated to differentiate between recurrence and radionecrosis in patients with high-grade gliomas. Another area where ML is being explored is the discovery of imaging biomarkers. Research on imaging biomarkers concentrates on finding associations between radiological features and histologic features. One such example is the use of supervised machine learning to predict the methylation status of O6-methylguanine-DNA methyltransferase (MGMT) in preoperative glioblastoma multiforme tumors, where the model achieved a maximum area under the receiver operating characteristic (ROC) curve of 0.85 (95% CI: 0.78-0.91). Radiomics-based systems are also being designed using conventional and diffusion MR imaging characteristics to predict patient survival [97].

Applications of AI in Prostate Cancer
The clinical heterogeneity of prostate cancer with respect to tumor size and aggressiveness (ranging from very small to extremely destructive tumors), together with its high recurrence and a mortality rate that varies from patient to patient, poses many challenges. Supervised ML techniques are frequently applied to imaging modalities such as ultrasound (US) to find suspicious lesions and to extract the fullest possible benefit for cancer studies. Deep learning-based applications in prostate cancer would be beneficial for treatment and for generating high-performance results.
Multi-parametric magnetic resonance imaging (mpMRI) provides the soft-tissue contrast needed to identify suspicious prostate lesions and gives insight into tissue properties. Studies show that mpMRI is a promising imaging technique for prostate cancer because of its potential to find lesions and to provide features relevant to surgery. The identification and classification of prostate tumors with AI models has gained flexibility with advances in CADe and CADx systems [98]. Used together with PI-RADS, CAD systems could improve the feasibility and accuracy of mpMRI [99]. Initial work on mpMRI-based CADx systems focused on supervised learning models that combined feature extraction with simple classification, and reported that feature extraction plays a major role in the outcomes of CAD-based systems. CNNs have extended this work and have reported good performance in prostate cancer identification and treatment. Additional CNN-based features such as auto-windowing have been added for better classification and normalization of mpMRI images [99].

Limitations and Future Prospects of AI in Cancer
AI continues to prove its potential and efficacy at various stages of confronting disease, such as early detection, treatment planning, and prediction of future outcomes. Despite the advances being made in AI and its applications in oncology, numerous limitations and setbacks need to be addressed. These include issues with data access, generalizability, the development of real-world applications, interpretability (the "black box" problem), and challenges pertaining to education and expertise in the field. Although various studies have shown AI to be efficient in diagnosing and predicting the outcome of various cancers, the generalizability of these AI applications needs to be validated, as most studies are confined to a particular disease type in a specific population, with data obtained from a particular institution or repository. Efforts need to be made to promote medical data sharing among institutes and to carry out multiple external validations. Numerous attempts are being made to develop real-world applications; however, AI training is data-hungry and requires a multi-faceted approach from institutions worldwide. Apart from data shortages arising from patient privacy issues and the dearth of data-sharing facilities in institutions, obtaining complete data of the required quality is yet another obstacle. Training AI to diagnose or predict a specific disease with proper data from various populations would strengthen its ability to perform accurately, irrespective of where the application is used. Another major challenge currently faced when using AI in the medical domain is interpreting how an AI model arrived at its solution. This limited ability to understand precisely the logic behind these algorithms is termed the "black box" problem.
Various methods such as saliency maps, sensitivity analysis, feature visualization, and class activation mapping are being used to tackle these issues. Further research is needed to decode these AI algorithms and extract human-understandable explanations from them. Such explanations may pave the way for newer and more efficient methods of understanding disease processes, diagnosis, and prediction of cancer.
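One simple form of sensitivity analysis is occlusion: each pixel is masked in turn and the resulting drop in the model's output is recorded. The sketch below illustrates the principle only. The `toy_model` function is a hypothetical stand-in for a trained classifier (it is hard-coded to depend on just two pixels), and `occlusion_map` is an assumed helper name.

```python
def occlusion_map(model, image, baseline=0.0):
    """Drop in model score when each pixel is replaced by `baseline`."""
    reference = model(image)
    heatmap = []
    for r, row in enumerate(image):
        heat_row = []
        for c in range(len(row)):
            occluded = [list(rw) for rw in image]  # copy, then mask one pixel
            occluded[r][c] = baseline
            heat_row.append(reference - model(occluded))
        heatmap.append(heat_row)
    return heatmap


def toy_model(image):
    """Hypothetical classifier: uses only the centre pixel and its right neighbour."""
    return 0.75 * image[1][1] + 0.25 * image[1][2]


# Invented 3 x 3 input in which every pixel has intensity 1.
image = [[1.0, 1.0, 1.0] for _ in range(3)]
heatmap = occlusion_map(toy_model, image)
```

Pixels with a large score drop are those the model relies on most; for a real network the same loop would typically mask patches rather than single pixels.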
In conclusion, the increasing incidence and mortality of cancer call for further medical and technological advances that would aid early detection and better treatment. Advances in machine learning and artificial intelligence have reached a point where they are being incorporated into most fields of science, including medicine. On the basis of the wide research being done, AI is proving to be a reliable adjunct to medical professionals and promises to improve detection and therapeutic methods significantly. However, more interdisciplinary research is required to generalize the clinical application of AI, machine learning, and deep learning across all cancer types and across the different fields of oncology. Such research should also aim to overcome the challenges being faced and, collectively, to benefit patients and improve clinical outcomes.