Microcalcification and Macrocalcification Detection in Mammograms Based on GLCM and ODCM Texture Features Using SVM Classifier

Breast cancer is a common cancer in women and the second leading cause of cancer deaths worldwide. A mammogram is a low-dose X-ray image that captures changes in the internal structure of the breast caused by the formation of masses and microcalcifications. These images play a significant role in the early detection of breast cancer. In pattern recognition, texture analysis is commonly used to classify images by content or to segment them based on variations in gray-level or colour intensity. Texture analysis can likewise be used to identify masses and microcalcifications in mammograms. The Grey Level Co-occurrence Matrix (GLCM) technique introduced by Haralick was initially used in the study of remote sensing images. Radiologists find it difficult to identify a mass in a mammogram because masses are surrounded by pectoral muscle and blood vessels. In breast cancer screening, radiologists usually miss approximately 10% to 30% of tumors because of the ambiguous margins of tumors resulting from long-time diagnosis. A computer-aided detection (CAD) system is developed to aid radiologists in detecting mammographic masses, which indicate the presence of breast cancer. In this paper, the input image is first pre-processed (noise removal, pectoral muscle removal, thresholding, and contrast enhancement), the suspicious mass is then detected, and features are extracted from the detected mass. A feature extraction method based on the grey level co-occurrence matrix and optical density features, called GLCM-OD features, is used to describe the local texture characteristics and the discrete photometric distribution of each region of interest (ROI). Finally, a support vector machine (SVM) is used to classify abnormal regions, with features selected according to their individual performance. The results show that the proposed system achieves excellent detection performance using the SVM classifier.


I. INTRODUCTION
Breast cancer is a common form of cancer among women, with nearly 1.7 million new cases diagnosed in 2015, and it is the second leading cause of cancer deaths worldwide [1]. This represents about 12% of all new cancer cases and 25% of all cancers in women [1]. Early detection of breast cancer is a key factor for successful cancer treatment. For every two women newly diagnosed with breast cancer in India, one woman dies of the disease [1], [2]. Among women, breast cancer is one of the most common and deadliest forms of cancer worldwide.

Ten years ago, cervical cancer had the highest cancer mortality rate in India and breast cancer the second highest; within ten years the statistics have changed, and breast cancer now tops the list of cancer-related mortality. The Indian breast cancer scenario is more worrying than that of Western countries or even of neighbours such as China. Across Indian cities, breast cancer alone accounts for 25% to 31% of all cancers in women [1]. A significant age shift has also been observed: the average age of developing breast cancer in India has moved from 50-70 years to 30-50 years, and cancers in the young tend to be more aggressive. For the year 2012 alone, GLOBOCAN (WHO) estimated 70,218 deaths of women in India due to breast cancer, more than in any other country in the world; China was second with 47,984 deaths and the US third with 43,909 deaths [2]. The vast difference in mortality is alarming and calls for action. For the year 2015, it is projected that there will be an estimated 155,000 new cases of breast cancer in India and that about 76,000 women are expected to die of the disease. Breast cancer is related to hormones and to the factors that modify the risk of this cancer.

Computer-aided detection (CAD) systems act as a second opinion for radiologists, supporting better reading and understanding of mammography images.
Mammography is currently the most effective imaging modality used by radiologists for the screening of breast cancer [3]. In this paper we explore an automated technique for mammogram mass detection. The proposed method removes noise, separates the background region from the breast profile region, and removes the pectoral muscle to accentuate the breast profile region. A Computer Aided Diagnosis system is implemented in the MATLAB environment for classifying malignant masses in digital mammograms using Support Vector Machines (SVM) [16], [17]. The proposed method achieves an accuracy of 95%, which is considered a good result when compared with similar works in the same research field.

Jiří Grim, Petr Somol, Michal Haindl and Jan Daneš [2] proposed a new approach to the diagnostic evaluation of screening mammograms based on local statistical texture models. The local evaluation tool has the form of a multivariate probability density of gray levels in a suitably chosen search window. The density function, in the form of a Gaussian mixture, is estimated from data obtained by scanning the mammogram with the search window. Al Mutaz M. Abdalla, Safaai Dress, Nazar Zaki [14] proposed a textural feature analysis of breast tissue on mammograms that detects masses in digital mammograms based on second-order statistics. The textural features of the segmented region of interest (ROI) are extracted using gray level co-occurrence matrices (GLCM) computed for four spatial orientations. A further approach [21] is distinct from existing approaches, which tend to concentrate on the morphology of individual microcalcifications and on global (statistical) cluster features: a set of microcalcification graphs is generated to represent the topological structure of microcalcification clusters at different scales.

II. METHODS & METHODOLOGY

A. Image Preprocessing
The proposed CAD system for mammographic mass detection comprises four major stages: preprocessing, detection of the suspicious mass region, feature extraction, and classification. Fig. 1 shows an overview of the proposed mass detection scheme, and the following subsections present each component in detail.

Fig.1: Block diagram of proposed mammographic mass detection scheme
The preprocessing stage consists of the following steps: noise reduction, pectoral muscle removal, separation of the breast region from the background, and contrast enhancement.

Noise reduction
Noise reduction is performed with a median filter, which removes the salt-and-pepper noise present in the input mammogram image.

The pectoral muscle appears as the brightest pixels in the top corners of the mammogram image. Because the pectoral muscle is a brighter region, it may affect the detection results [14]. Hence, the largest connected component is used to locate the pectoral region, and adaptive thresholding is then applied in order to remove the pectoral muscle from the input mammogram image. Consequently, the breast region is obtained by removing the pectoral muscle from the foreground.
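The median filtering step can be illustrated with a minimal sketch. It is written in plain Python rather than the MATLAB environment used in this work, and the 3x3 window size and the edge-replication border policy are assumptions made for the example; the function name `median_filter_3x3` is hypothetical.

```python
from statistics import median

def median_filter_3x3(image):
    """Apply a 3x3 median filter to a 2-D list of gray levels.

    Assumption: border pixels are handled by replicating the nearest
    edge pixel; the paper does not state its border policy.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = []
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    # Clamp coordinates so the window stays inside the image.
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    window.append(image[rr][cc])
            out[r][c] = median(window)
    return out

# Toy image with one "salt" pixel (255) and one "pepper" pixel (0).
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 0, 10],
    [10, 10, 10, 10],
]
cleaned = median_filter_3x3(noisy)
```

Because each isolated outlier is outvoted by its neighbours inside the window, both impulse pixels in the toy image are replaced by the surrounding gray level.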

Thresholding
Thresholding is performed to separate the breast region from the background. The Otsu thresholding method is applied to the digital mammogram to find the foreground of concern, which contains the breast region in most mediolateral oblique (MLO) views of mammograms [9].
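As an illustration of how Otsu's method selects a threshold, the following sketch searches for the gray level that maximizes the between-class variance of the histogram. It is a plain Python sketch under stated assumptions (8-bit integer gray levels, a toy image, a hypothetical function name `otsu_threshold`), not the implementation used in this work.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form the background class; the rest form the
    foreground class. Assumes integer gray levels in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of background pixels so far
    sum_bg = 0    # sum of background gray levels so far
    for t in range(levels):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor 1 / total**2.
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

# Toy image: dark background (5-6) and a bright region (200-210).
image = [
    [5, 6, 5, 200],
    [6, 5, 210, 205],
    [5, 6, 200, 210],
]
flat = [p for row in image for p in row]
t = otsu_threshold(flat)
mask = [[1 if p > t else 0 for p in row] for row in image]
```

In the resulting mask, 1 marks the bright (foreground) pixels and 0 marks the background.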

Contrast enhancement
Contrast enhancement is performed to increase the contrast of the mammogram image, using the adaptive histogram equalization technique. Taken together, the preprocessing stage reduces noise, removes the pectoral muscle, separates the breast region from the background, and increases the contrast of the input mammogram image.

1. Grey-level co-occurrence matrix
Some pattern recognition methods use a gray level co-occurrence matrix (GLCM) to extract characteristics [9], [14]. Statistical distributions of observed combinations of intensities at specified positions relative to each other in an image are used to obtain statistical textural features. These features can be classified into first-order, second-order, and higher-order statistics according to the number of intensity points (connected pixels) considered in each combination. The GLCM technique is used to compute second-order statistical textural features.

A GLCM is a matrix in which the number of rows and columns is equal to the number of gray levels in the image. The matrix element $P(i, j \mid \Delta x, \Delta y)$ is the relative frequency with which two pixels, separated by a pixel distance $(\Delta x, \Delta y)$, occur within a given neighborhood, one with intensity $i$ and the other with intensity $j$. Equivalently, the matrix element $P(i, j \mid d, \theta)$ contains the second-order statistical probability values for changes between gray levels $i$ and $j$ at a particular displacement distance $d$ and a particular angle $\theta$. Figure 1 illustrates the geometrical relationships of GLCM measurements made for four distances $d$ ($d = \max\{|\Delta x|, |\Delta y|\}$) and angles of $\theta = 0$, $\pi/4$, $\pi/2$ and $3\pi/4$ radians under the assumption of angular symmetry.

GLCM texture [9] considers the relation between two pixels at a time, called the reference pixel and the neighbour pixel. The neighbour pixel is chosen to be the one to the right of each reference pixel.
This can also be expressed as a (1,0) relation: 1 pixel in the x direction, 0 pixels in the y direction. Each pixel within the window becomes the reference pixel in turn, starting in the upper left corner and proceeding to the lower right. Pixels along the right edge have no right-hand neighbour, so they are not used in the count.
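The counting procedure for the (1,0) relation can be sketched as follows. This is a minimal plain Python illustration, not the implementation used in this work: the 4-level toy image, the helper names (`glcm`, `normalize`, `contrast`, `energy`), and the mapping of angles to offsets (rows increasing downward) are assumptions. The sketch counts ordered pairs only; under the assumption of angular symmetry one would additionally add the transpose of the count matrix.

```python
# Assumed offsets (dx, dy) for the four directions, with rows increasing downward.
OFFSETS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}

def glcm(image, levels, dx=1, dy=0):
    """Count co-occurrences of gray levels (i, j) at offset (dx, dy).

    image[r][c] is the reference pixel and image[r + dy][c + dx] is the
    neighbour. Pairs whose neighbour falls outside the image are skipped.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + dy, c + dx
            if 0 <= nr < rows and 0 <= nc < cols:
                counts[image[r][c]][image[nr][nc]] += 1
    return counts

def normalize(counts):
    """Turn raw counts into relative frequencies P(i, j)."""
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]

def contrast(p):
    """Second-order contrast: sum of P(i, j) * (i - j)^2."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

def energy(p):
    """Angular second moment: sum of P(i, j)^2."""
    return sum(v * v for row in p for v in row)

# Toy image with 4 gray levels (0..3).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
counts = glcm(image, levels=4, dx=1, dy=0)
p = normalize(counts)
glcm_contrast = contrast(p)
glcm_energy = energy(p)
```

For this toy image the horizontal count matrix has 12 pairs in total, since each of the 4 rows contributes 3 reference pixels that have a right-hand neighbour.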

2. Optical density co-occurrence matrix
The background information is considered in the discrete texture feature category because it transforms an intensity into an optical density value. The optical density transformation for each pixel $(i, j)$ of an object region is defined as

$$OD_{ij} = \log\left(\frac{I_{ij}}{I_o}\right)$$

where $I_{ij}$ is the intensity value of the pixel and $I_o$ is the average background intensity. The background is the ROI excluding the pixels that belong to the object region.

This study proposes two complex feature extraction methods to achieve a complete description of quantitative characteristics. The first feature extraction module adopts GLCM features and optical density features. This type of complex texture feature extraction extracts information on both the local intensity relation and the discrete photometric distribution. The proposed scheme computes four co-occurrence matrices with a distance of one pixel in four directions: left diagonal, right diagonal, vertical, and horizontal [9]. The second complex feature extraction method is similar to the first module, but translates the gray level co-occurrence matrix into the optical density co-occurrence matrix (ODCM) to characterize the photometric textures [14].

The optical density co-occurrence matrix is a co-occurrence matrix of the optical density image. An optical density image is obtained by converting the intensities of the gray level image into optical density values and linearly mapping all the optical density values, one by one, to an image with 8-bit depth [9]. The minimum optical density value is mapped to 0, and the maximum optical density value is mapped to 255. After the gray level image has been transformed into the optical density image, the differences between gray level values are enlarged, enhancing the simulated mass region.
Since the background represents the surrounding normal tissues in an ROI with appropriate thresholding, an optical density image can serve as a graph that represents the degree of malignant tissue based on the intensity (the lighter area represents greater possibility of malignant tissue) [14]. Finally, the two proposed methods combining texture features and optical density features use seventy-six statistics to achieve a complete description of characteristics.
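The optical density transformation and the linear mapping to 8 bits can be sketched as follows. This is a minimal plain Python illustration under stated assumptions: the logarithm base (base 10 here) is not specified in the text, all intensities are assumed to be strictly positive, the object region is given as a binary mask, and the function name `optical_density_image` is hypothetical.

```python
import math

def optical_density_image(roi, object_mask):
    """Convert an ROI to an 8-bit optical density image.

    roi         : 2-D list of positive gray levels
    object_mask : 2-D list, truthy where the pixel belongs to the object
    Assumption: base-10 logarithm; the source only writes "log".
    """
    # Average background intensity I_o: ROI pixels outside the object region.
    background = [
        roi[r][c]
        for r in range(len(roi))
        for c in range(len(roi[0]))
        if not object_mask[r][c]
    ]
    i_o = sum(background) / len(background)

    # OD_ij = log(I_ij / I_o) for every pixel.
    od = [[math.log10(v / i_o) for v in row] for row in roi]

    # Linear mapping: minimum OD -> 0, maximum OD -> 255.
    lo = min(min(row) for row in od)
    hi = max(max(row) for row in od)
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    return [[round((v - lo) * scale) for v in row] for row in od]

# Toy ROI: background intensity 50, two object pixels (200 and 80).
roi = [
    [50, 50, 50],
    [50, 200, 50],
    [50, 50, 80],
]
object_mask = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
od_img = optical_density_image(roi, object_mask)
```

The resulting 8-bit image can then be passed to the same co-occurrence computation as the gray level image to obtain an ODCM.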

C. Classification
Several classification methods have been developed in recent years. One classification technique that is widely used for the diagnosis of breast tumors is the Support Vector Machine (SVM) [16]. SVM is a prominent learning algorithm that is inspired by statistical learning theory and has emerged in the machine learning community over the last few decades [16], [17]. Consequently, the proposed classification module uses a reduced set of features, selected with the support vector machine classifier after a performance comparison of classifiers. The single-stage SVM classifier labels each image from the testing dataset as benign or malignant by comparing it with the images of the training dataset. The SVM classifier achieves an accuracy of 95% with a sensitivity of 0.9, which is considered better performance than that of other experimental methods.
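For reference, the two reported figures of merit can be computed from predicted and true labels as sketched below. The labels in the example are hypothetical toy values chosen only to exercise the formulas; they are not data or results from this work, and the function name `accuracy_and_sensitivity` is an assumption.

```python
def accuracy_and_sensitivity(true_labels, predicted_labels, positive="malignant"):
    """Return (accuracy, sensitivity) for a two-class problem.

    accuracy    = (TP + TN) / (TP + TN + FP + FN)
    sensitivity = TP / (TP + FN)
    """
    tp = tn = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    return accuracy, sensitivity

# Hypothetical toy labels (not results from the paper).
truth = ["malignant"] * 4 + ["benign"] * 6
preds = (["malignant"] * 3 + ["benign"]        # one malignant case missed
         + ["benign"] * 5 + ["malignant"])     # one benign case flagged
acc, sens = accuracy_and_sensitivity(truth, preds)
```

Sensitivity depends only on the malignant cases, so a classifier can have high accuracy and still miss tumors; reporting both values, as done here, avoids that ambiguity.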

A. Datasets
The mammogram datasets were taken from the Mammographic Image Analysis Society (MIAS) database, for both benign and malignant breast cancer images [20]. A set of benign and malignant images is used for training and undergoes the training process, which includes preprocessing and feature extraction. The testing process likewise uses a set of benign and malignant breast cancer images. The images are digitized at a 50 micron pixel edge, with each pixel represented by an 8-bit word. The database contains 322 digitized images. The database has been reduced to a 200 micron pixel edge and padded/clipped so that all the images are 1024x1024. The MIAS database is used for this work because it is freely available and has been widely used for mammogram classification. It is made up of medio-lateral oblique views of both the right and left breasts of women.

International Journal of Advanced Engineering Research and Science (IJAERS)
[Vol-4, Issue-6, Jun-2017] https://dx.doi.org/10.22161/ijaers.4.6.2 ISSN: 2349-6495(P) | 2456-1908(O) www.ijaers.com

IV. DISCUSSION AND CONCLUSION
The proposed method is an automatic CAD system for mammographic mass detection that uses complex texture features to classify the suspected mass region. It preprocesses the input mammogram image to acquire the breast region and suppresses the effects of noise using a median filter. Two feature extraction methods are then applied, one using the grey-level co-occurrence matrix and one using the optical density co-occurrence matrix; each combines co-occurrence features with optical density features to describe both the grey level characteristics of local textures and the photometric discrete textures. The optical density image enhances grey level differences relative to the normal tissue intensity, which strengthens the description of the shape of the suspicious area for feature extraction in the CAD system. Classification using a support vector machine achieves satisfactory detection, with a sensitivity of 0.9 and an accuracy of 95% for both feature extraction methods. The ODCM optical density features, which can increase the mass detection rate of the CAD system for dense breasts, are proposed in this study to reduce the burden on radiologists and conserve resources. Future work comprises an automatic CAD system for microcalcification identification in mammograms that adapts various texture features with a suitable classifier to increase accuracy and sensitivity, reduce false positive rates, and improve the overall performance of the system.