Machine Learning Assignment Sample
An investigative study on the use of machine learning (ML) models for predicting leukaemia
1. Introduction
Leukaemia is one of the common kinds of blood cancer found in all age groups, especially children. Leukaemia affects the white blood cells of the body. Blood has three components: white blood cells, red blood cells and platelets. The role of platelets is to clot the blood and control bleeding. Red blood cells, or erythrocytes, carry oxygen from the lungs to the tissues of the body. White blood cells, or leukocytes, fight the diseases and infections that attack the body. Leukaemia is a disease in which white blood cells are produced in very large quantities, as a result of which the immune system of the human body is compromised. In 2018, the United States reported 60,000 cases of Leukaemia. There are different kinds of Leukaemia, which hematologists identify by staining blood slides.
Figure 1: Different kinds of Leukaemia
Detecting Leukaemia at an early stage has been a major challenge for hematologists, doctors and researchers. A patient with Leukaemia shows certain symptoms, such as loss of weight, fever, enlarged lymph nodes and pallor. However, all these symptoms are also associated with several other diseases. Moreover, the symptoms of Leukaemia are mild in the early stages, making it extremely difficult to distinguish Leukaemia from other diseases and detect it. Traditionally, Leukaemia is diagnosed either by microscopically evaluating the Peripheral Blood Smear (PBS) or by analyzing bone marrow samples from the patient.
However, in the past two decades the use of machine learning (ML) and other computer aided diagnostic (CAD) methods has become very popular. These methods analyze images of the blood smear and use them for diagnosing, differentiating and counting Leukaemia cells. Machine learning is a well-developed branch of Artificial Intelligence (AI) comprising mathematical relations and algorithms that help in clinical research. So far, machine learning techniques have shown outstanding outcomes wherever they have been applied in the medical sciences and are therefore seen as a remarkable success in the field. As researchers in medical science became able to produce high-quality data, the need arose for better and more advanced modes of data analysis, because the traditional methods were not enough.
2. Methodology
The present review seeks to identify studies on the detection and diagnosis of Leukaemia using various machine learning techniques for PBS image analysis. The systematic review strategy is based on previous studies selected according to the established search criteria.
2.1 Search Criteria
By surveying the e-databases that deal with scientific articles, the author has chosen articles which answer the following questions:
How far has machine learning been effective in diagnosing Leukaemia and classifying the same using PBS imaging?
Which machine learning algorithm has been the most effective in the Peripheral Blood Smear (PBS) analysis?
Most articles on these questions were found on platforms such as Scopus, Web of Science, PubMed and ScienceDirect. The keywords used for the search were: “Leukaemia”, “Leukaemia Diagnosis”, “Leukaemia detection” and “Machine Learning and Leukaemia”.
2.2 Inclusion and exclusion criteria
Articles published between 2011 and 2020 were extracted from these databases. Duplicate articles available on more than one platform were excluded. The inclusion and exclusion criteria for the review are tabulated below:
2.3 Data Extraction
We extracted the details of the methods and results of each article into specially designed forms. Data were extracted by two researchers, and disagreements were resolved through discussion. The extracted data elements included the title of the article, the country, the year of publication, the population studied, the machine learning technique, the evaluation method, and the results.
3. Literature Review
Leukaemia has been described as a category of “hematological malignancies” that manifest as tumorous proliferation, or an increased life span, of the white blood cells in the bone marrow of the human body [1]. White blood cells, or leukocytes, are highly differentiated and play an important role in the immune system. The malignancy of Leukaemia also varies: some forms are relatively mild while others are highly aggressive, leaving the immature white blood cells incapable of performing their functions [2]. Normal hematopoiesis is suppressed by the crowding of leukemic cells (i.e., the excessive production of white blood cells), making it difficult for the body to fight infections, transport oxygen and control bleeding [3]. Based on how rapidly the disease progresses, Leukaemia is classified into acute and chronic forms. In acute Leukaemia, the disease develops very rapidly, meaning that the number of immature white blood cells increases at a very fast rate. In chronic Leukaemia, the disease develops comparatively slowly and the mature white blood cells are able to perform some of their normal functions [2]. Apart from the rate of progression, the disease is further classified, based on “the type of affected cell from which the malignancy develops”, into myelogenous and lymphoid forms of Leukaemia.
To diagnose Leukaemia, a wide variety of information is required including “morphology, cell phenotyping, cytochemistry, cytogenetics, and molecular genetics” [4]. Despite the immense growth of the medical sciences, visual examination of the PBS remains at the forefront of the diagnostic techniques associated with Leukaemia. However, because such microscopic examination is time-consuming and prone to human error [4], there is a need for an automated process for the detection and diagnosis of Leukaemia.
Researchers recently investigated whether Leukaemia can be detected automatically from microscopic blood smear images [5]. The proposed methods of automated detection include “sequential image pre-processing, cell segmentation, feature extraction, and cell classification” [6].
In addition to the clustering segmentation method, many authors have used thresholding-based techniques to segment WBCs. In particular, Joshi et al. [7] reported the usage of Otsu’s global thresholding on an enhanced greyscale image. To differentiate blasts in a microscopic blood smear image, they extracted the area, perimeter, and circularity from the equivalent binary image and employed the K-nearest neighbor decision algorithm for classification.
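The thresholding, shape-feature and nearest-neighbour steps described above can be sketched in a few lines of plain Python. This is a minimal illustration only, not the implementation of Joshi et al.: the tiny greyscale image, the assumption that nuclei are darker than the background, the edge-counting estimate of the perimeter, and the labelled training vectors are all made up for the example.

```python
import math
from collections import Counter

def otsu_threshold(pixels, levels=256):
    """Return the grey level that maximises the between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of grey levels at or below the candidate threshold
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def shape_features(mask):
    """Area, perimeter (exposed pixel edges) and circularity of a binary mask."""
    rows, cols = len(mask), len(mask[0])
    area = perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            area += 1
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                outside = not (0 <= rr < rows and 0 <= cc < cols)
                if outside or not mask[rr][cc]:
                    perimeter += 1
    circularity = 4 * math.pi * area / perimeter ** 2 if perimeter else 0.0
    return area, perimeter, circularity

def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy 4x4 greyscale image: a dark 2x2 "nucleus" on a bright background.
image = [
    [200, 200, 200, 200],
    [200,  50,  50, 200],
    [200,  50,  50, 200],
    [200, 200, 200, 200],
]
t = otsu_threshold([px for row in image for px in row])
# Assumption: stained nuclei are darker than the background.
mask = [[1 if px <= t else 0 for px in row] for row in image]
features = shape_features(mask)

# Hypothetical labelled feature vectors, invented purely for illustration.
toy_training = [((4, 8, 0.79), "blast"), ((16, 16, 0.79), "non_blast")]
label = knn_predict(toy_training, features, k=1)
```

In practice the three features sit on very different scales, so they would normally be standardised before any distance is computed; the sketch skips that step for brevity.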
Threshold techniques can sometimes fail to produce relevant and precise results because they ignore spatial information. Consequently, researchers often combine them with mathematical morphology and other techniques to process images. Using an edge-based technique and seeded watershed techniques, Wang et al. [8] developed a segmentation algorithm capable of detecting the nuclei of cells during different phases of their cycle. Recent research has shown that morphological and texture features, combined with supervised learning algorithms, are the most commonly used methodologies for feature extraction and classification. The multilayer perceptron and the SVM have proven more accurate than other classifiers [9]. Neoh et al. [10] compared the classification performance of an SVM and a multilayer perceptron using 80 feature descriptors containing color, shape, and texture information. Both classifiers achieved similar results, with over 95% accuracy, but the multilayer perceptron achieved slightly higher accuracy.
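Comparisons of this kind rest on scoring each classifier's predictions against the same held-out labels. The sketch below shows one way such scores might be computed; the function names, the label "blast" for the positive class, and the example predictions are assumptions made for illustration and are not taken from any of the reviewed studies.

```python
def confusion_counts(y_true, y_pred, positive="blast"):
    """Count true/false positives and negatives for a two-class problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def scores(y_true, y_pred, positive="blast"):
    """Accuracy, sensitivity and specificity from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }

# Invented labels and predictions for two hypothetical classifiers.
y_true = ["blast", "blast", "normal", "normal"]
pred_a = ["blast", "normal", "normal", "normal"]
pred_b = ["blast", "blast", "normal", "blast"]
comparison = {"classifier_a": scores(y_true, pred_a),
              "classifier_b": scores(y_true, pred_b)}
```

Reporting sensitivity and specificity alongside accuracy matters for diagnostic tasks, because two classifiers with the same accuracy can miss very different numbers of positive cases.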
4. Results
The search strategy uncovered 116 articles from the four reliable databases. Using the abstract, full text, and inclusion and exclusion criteria, we selected 17 full-text articles that met the eligibility and inclusion criteria for the present study [11]. The selection process followed the PRISMA flow chart (Figure 2). The systematic search covered the past five years, since the application of ML methods to blood smear image analysis has only recently emerged. A review of the articles shows that the use of ML methods in PBS image analysis has increased over time: seven articles were published in 2020, five in 2019, and four in 2018 [12-15]. Several articles published before 2015 are also discussed in the literature review section [16-20].
Figure 2: PRISMA Flowchart
Staining quality plays an important role in detecting Leukaemia in peripheral blood slides, yet few quality standards are available. Most studies use public datasets, which have been made available to researchers so that they can design and develop ML algorithms. ML techniques have been used to diagnose and classify acute lymphoblastic Leukaemia (ALL) using ALL-IDB, one of the best-known datasets, which is published in two versions. Some studies have also used the Benchmark for the development of Machine Learning algorithms Leukaemia dataset. Studies have generally focused on testing their proposed models on homogeneous or private databases [21,22]. As a result, diagnosing Leukaemia with different characteristics remains a major challenge for a robust detection and classification model. Several studies have combined these datasets into one cross-dataset to present a robust model and produce reliable and valid results. Sharif has created a system to diagnose various leukocytes that is highly precise and efficient. Local datasets were also used by some researchers. ML was the most common method of diagnosing and classifying all types of Leukaemia [23]. Figure 3 shows how various types of Leukaemia may be diagnosed using PBS-based image processing. Some articles have also counted leukocytes by image analysis.
Figure 3: Different types of Leukaemia processing using PBS and ML
“Examining the methods adopted by the reviewed studies indicated that two categories of machine vision techniques have been used in PBS image analysis; machine learning and its important subclass, deep learning, are two categories of learning algorithms. The first strategy relies on selective image feature extraction” [24]. These methods commonly use mathematical and machine learning algorithms to extract features from a large volume of images. Feature extraction in this sense is meant to produce a set of descriptors of an image. Patterns in the images can then be discovered by determining the relationships between these descriptors. The most effective classification performance is determined by analyzing several classes of features with ML algorithms. Characteristics such as cell structure, nucleus structure and chromatin composition can be extracted from the cytomorphological structure. Other characteristics may also be considered.
Segmentation is among the most common tasks in natural and medical image analysis. To improve classification rates, researchers use various types of segmentation. Segmentation is a pre-processing step, the first in the process, used to select and extract features from an image. Segmentation intended to isolate a cell and separate its nucleus from its cytoplasm provides an accurate understanding of the structure and features of the blast, and ML analysis of the origin of the blast then identifies the type of Leukaemia. Some researchers have segmented the images and extracted features of each segment to diagnose Leukaemia. Others have extracted the features of the whole image.
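As a small illustration of what a set of descriptors can look like, the sketch below computes four first-order statistical descriptors of the grey levels in an image region. The choice of descriptors (mean, standard deviation, skewness and entropy) and the function name are assumptions made for this example; the reviewed studies use a much wider range of shape, color and texture features.

```python
import math
from collections import Counter

def intensity_descriptors(pixels):
    """First-order statistics of a flat list of grey levels."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(variance)
    # Skewness is undefined for a constant region; report 0.0 in that case.
    third_moment = sum((p - mean) ** 3 for p in pixels) / n
    skewness = third_moment / std ** 3 if std else 0.0
    # Shannon entropy (in bits) of the grey-level distribution.
    counts = Counter(pixels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"mean": mean, "std": std, "skewness": skewness, "entropy": entropy}

# Hypothetical region with two equally frequent grey levels.
region = [0, 0, 255, 255]
descriptors = intensity_descriptors(region)
```

Each region or cell is reduced to a short numeric vector of this kind, and a classifier then looks for relationships between those vectors and the known labels.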
5. Discussion
Leukaemia diagnosis at the early stages is most commonly based on microscopic examination of PBS images. Manual examination of these smears can, however, lead to improper diagnosis and nonstandard reporting [23, 25]. Furthermore, examining these smears takes a lot of time and effort, which affects diagnostic accuracy. It is therefore important to have a method that offers a precise diagnosis without being influenced by technician experience, operator fatigue or job demands [26].
A comprehensive systematic review of PBS image analysis via ML methods had not been conducted, according to a search of scientific databases. Different authors conducted studies to determine the efficacy of machine learning in diagnosing and classifying Leukaemia based on PBS imaging [27-30].
Through comparisons with previous studies, the present study answered the questions posed at the outset.
A number of factors in smear preparation affect the quality of the images produced in the laboratory. These issues make it difficult to detect and monitor blood smears precisely. Preprocessing is required because ML methods cannot process these smear images directly [31]. Preprocessing the data allows ML algorithms to detect Leukaemia more accurately. A set of preprocessing techniques for dataset preparation is therefore recommended for precise Leukaemia detection with minimal error using ML methods.
Preliminary processing of blood smears by ML methods is based primarily on selecting effective features. The main problem lies in selecting the blood cell features used to determine Leukaemia when the researcher controls the selection and analysis. While some studies have found that color and shape are important for identifying blast cells, others have found that texture and different texture metrics are important. Manually selected features have not been presented in medical texts as a definitive method for the differential diagnosis of Leukaemia. The selection of several important features from a very large set depends solely on the algorithm, and the method of the algorithm determines the efficiency of the feature selection. Methods that extract fewer cell features diagnose Leukaemia with lower precision. One could employ hybrid algorithms or swarm intelligence to extract features that are more effective for Leukaemia detection and diagnosis, paying attention to broader coverage of the feature space [32]. The Leukaemia detection process should employ a varied set of geometrical, statistical, and morphological features. Classical ML requires features to be extracted and selected manually. If enough images are available, the DL method is more appropriate than classical ML because of how it operates [33].
Different studies have found the lack of comprehensive datasets of Leukaemia smear images to be a major challenge in Leukaemia diagnosis using ML algorithms. This constraint causes problems for ML methods, such as overfitting. Studies have shown that diagnostic errors are higher with smaller datasets because of the data-driven nature of these methods. Because they rely on small or local datasets, the findings of many studies cannot be confirmed. The reviewed studies did not provide a comprehensive dataset with sufficient data for ML-based Leukaemia diagnosis and classification, which is a prerequisite for a robust ML method. Certain techniques increase the amount of data by processing the original images into new images while maintaining the features of the originals. Such augmentation techniques may alleviate this problem in DL.
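One simple family of such techniques generates flipped and rotated copies of each image, which leaves the cell content intact while multiplying the number of training samples. The sketch below is an assumption-laden illustration in plain Python, with images represented as nested lists and all function names chosen for the example; real pipelines typically add further transformations such as shifts, scaling or color jitter.

```python
def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def augment(img):
    """Return the 8 flip/rotation variants of an image, original included."""
    variants = []
    current = [row[:] for row in img]
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants

# Toy 2x2 "image"; every variant contains exactly the same pixel values.
toy_image = [[1, 2],
             [3, 4]]
augmented = augment(toy_image)
```

Because blood cells have no preferred orientation on a slide, these particular transformations are unlikely to change the correct label, which is the property any augmentation step needs to preserve.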
Nowadays, methods based on ML are very commonly used for determining and counting blood components. “It is thought that, in the near future, bone marrow transplant laboratories could replace traditional devices with applications and software based on ML, especially DL, to offer a timely method and assist a diagnosis with high certainty and low detection error in the early stages”.
6. Conclusions
Analysis of blood smear images plays a crucial role in diagnosing several blood diseases, including Leukaemia. The early diagnosis of Leukaemia can lead to a revolution in the medical and diagnostic sciences, especially through the early initiation of treatment. The use of machine learning can aid in diagnosing Leukaemia at its very onset and help in quickly classifying its different sub-types. Machine learning algorithms can help in better detection and prevention of Leukaemia. It is, therefore, recommended that machine learning algorithms be developed to analyze blood smears, as highly developed image processing methods combined with novel pre-processing techniques can help achieve extremely high accuracy levels.
References
- American Dental Association [ADA] (2012). The ADA Practical Guide to Patients with Medical Conditions, ed. L. L. Patton (New York, NY: Wiley).
- Serfontein, W. (2011). Cancer Diagnosed: What Now? 2nd Edn. Bloomington: Xlibris.
- Daniels, R., and Nicoll, L. H. (2012). Contemporary Medical Surgical Nursing, 2nd Edn. New York, NY: Cengage Learning.
- Inaba, H., Greaves, M., and Mullighan, C. G. (2013). Acute lymphoblastic leukaemia. Lancet 381, 1943–1955. doi: 10.1016/S0140-6736(12)62187-4
- Alsalem, M. A., Zaidan, A. A., Zaidan, B. B., Hashim, M., Madhloom, H. T., Azeez, N. D., et al. (2018). A review of the automated detection and classification of acute leukaemia: Coherent taxonomy, datasets, validation and performance measurements, motivation, open challenges and recommendations. Methods Programs 158, 93–112.
- Bodzas, A. (2019). Diagnosis of Malignant Haematopoietic Diseases based on the Automation of Blood Microscopic Image Analysis. Master’s thesis, Technical University of Ostrava, Ostrava, CZ.
- Joshi, M. D., Karode, A. H., and Suralkar, S. R. (2013). White blood cells segmentation and classification to detect acute Leukaemia. J. Emerg. Trends Technol. Computer Sci. 2, 147–151.
- Wang, M., Zhou, X., Li, F., Huckins, J., King, R., and Wong, S. (2008). Novel Cell Segmentation and Online SVM for Cell Cycle Phase Identification in Automated Microscopy. 24, 94–101. doi: 10.1093/bioinformatics/btm530
- Aljaboriy, S., Sjarif, N., and Chuprat, S. (2019). Segmentation and detection of acute Leukaemia using image processing and machine learning techniques: a review. AUS 26, 511–531. doi: 10.4206/aus.2019.n26.2.60
- Neoh, S., Srisukkham, W., Zhang, L., Todryk, S., Greystoke, B., Lim, C., et al. (2015). An intelligent decision support system for leukaemia diagnosis using microscopic blood images. Rep. 5:14938. doi: 10.1038/srep14938
- Ahmed, A. S., Morsy, M., and Abou-Elsoud, M. E. A. (2016). Microscopic digital image segmentation and feature extraction of acute Leukaemia. J. Sci. Eng. Appl. 5, 228–233. doi: 10.7753/IJSEA0505.1001
- Bagasjvara, R. G., Candradewi, I., Hartati, S., and Harjoko, A. (2016). Automated detection and classification techniques of Acute Leukaemia using image processing: A review. Paper Presented at the 2nd International Conference on Science and Technology-Compute, Yogyakarta. 35–43. doi: 10.1109/ICSTC.2016.7877344
- Batchelor, B. G., and Waltz, F. M. (2001). Intelligent machine vision: techniques, implementations, and applications. New York, NY: Springer.
- Chan, Y. K., Tsai, M. H., Huang, D. C. H., Zheng, Z. H., and Hung, K. D. (2010). Leukocyte nucleus segmentation and nucleus lobe counting. BMC Bioinformatics 11:558. doi: 10.1186/1471-2105-11-558
- Chen, J., Ying, H., Liu, X., Gu, J., Feng, R., Chen, T., et al. (2020). A transfer learning based super-resolution microscopy for biopsy slice images: the joint methods perspective. IEEE/ACM Trans. Comput. Biol. Bioinform. (in press). doi: 10.1109/TCBB.2020.2991173
- Chen, T., Xu, J., Ying, H., Chen, X., Feng, R., Fang, X., et al. (2019). Prediction of Extubation Failure for Intensive Care Unit Patients Using Light Gradient Boosting Machine. IEEE Access 7, 150960–150968. doi: 10.1109/ACCESS.2019.2946980
- Chiaretti, S., Zini, G., and Bassan, R. (2014). Diagnosis and Subclassification of Acute Lymphoblastic Leukaemia. J. Hematol. Infect. Dis. 6:e2014073. doi: 10.4084/MJHID.2014.073
- Díaz, G., and Manzanera, A. (2009). “Automatic Analysis of Microscopic Images in Hematological Cytology Applications,” in Biomedical Image Analysis and Machine Learning Technologies: Applications and Techniques, eds F. A. González and E. Romero (Landisville, PA: Yurchak Printing Inc), 167–196.
- Fairchild, M. D. (2005). Color Appearance Models, 2nd Edn. Chichester: John Wiley & Sons.
- Gao, W., Zhu, Y., Zhang, W., Zhang, K., and Gao, H. (2019). A hierarchical recurrent approach to predict scene graphs from a visual-attention-oriented perspective. Intellig. 35, 496–516. doi: 10.1111/coin.12202
- Hariprasath, S., Dharani, T., and Santh, M. (2019). Detection of acute lymphocytic Leukaemia using statistical features. Paper Presented at the 4th International Conference on Current Research in Engineering Science and Technology, Trichy. Available online at: http://www.internationaljournalssrg.org/uploads/specialissuepdf/ICCREST/2019/ECE/IJECE-ICCREST-P102-JRCE1119.pdf
- James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An Introduction to Statistical Learning: With Applications in R. New York, NY: Springer.
- Katz, A. J., Chia, V. M., and Schoonen, W. M. (2015). Acute lymphoblastic Leukaemia: an assessment of international incidence, survival, and disease burden. Cancer Causes Control 26, 1627–1642. doi: 10.1007/s10552-015-0657-6
- Kazemi, F., Najafabadi, T., and Araabi, B. (2016). Automatic Recognition of Acute Myelogenous Leukaemia in Blood Microscopic Images Using K-means Clustering and Support Vector Machine. Med. Signals Sens. 6, 183–193.
- Manisha, P. (2012). Leukaemia: a review article. J. Adv. Res. Pharm. Bio Sci. 2, 397–407.
- Moradiamin, M., Samadzadehaghdam, N., Kermani, S., and Talebi, A. (2015). Enhanced recognition of acute lymphoblastic Leukaemia cells in microscopic images based on feature reduction using principle component analysis. Biomed. Technol. 2:128–136.
- Nailon, W. H. (2010). “Texture analysis methods for medical image characterisation,” in Biomedical Imaging, ed. Y. Mao (London: Intech Publishing), 75–100.
- Pisner, D. A., and Schnyer, D. M. (2020). “Chapter 6 – Support vector machine,” in Machine Learning, eds A. Mechelli and S. Vieira (Cambridge, MA: Academic Press), 101–121.
- Putzu, L., Caocci, G., and Di Ruberto, C. (2014). Leucocyte classification for leukaemia detection using image processing techniques. Intellig. Med. 62, 179–191. doi: 10.1016/j.artmed.2014.09.002
- Shafique, S., and Thesin, S. (2018). Acute lymphoblastic Leukaemia detection and classification of its subtypes using pretrained deep convolutional neural networks. Cancer Res. Treatment 17:1533033818802789. doi: 10.1177/1533033818802789
- Tharwat, A. (2018). Deep belief networks and cortical algorithms: A comparative study for supervised classification. Comput. Inform. 15, 81–93. doi: 10.1016/j.aci.2018.08.003
- Wan, S., and Mak, M. W. (2015). Machine Learning for Protein Subcellular Localization Prediction. Boston: De Gruyter.
- Wang, Q., Bi, S., Sun, M., Wang, Y., Wang, D., and Yang, S. (2019). Deep learning approach to peripheral leukocyte recognition. PLoS ONE 14:e0218808. doi: 10.1371/journal.pone.0218808
- Wiernik, P. H. (2001). Adult Leukaemia (Atlas of Clinical Oncology). Hamilton: BC Decker Inc.
- Zayegh, A., and Bassam, N. (2018). “Neural Network Principles and Applications,” in Digital Systems, Ed. R. J. Tocci (London: Pearson), doi: 10.5772/intechopen.80416