Tsallis Entropy In Bi-level And Multi-level Image Thresholding

Abstract: The maximum entropy principle plays a relevant role in image processing, in particular in thresholding and image segmentation. Different entropic formulations are available for this purpose; one of them is based on the Tsallis non-extensive entropy. Here, we discuss its use for bi-level and multi-level thresholding.


Introduction
In 1988, Constantino Tsallis proposed, in a paper entitled "Possible generalization of Boltzmann-Gibbs statistics" [1], a new concept of entropy, known today as the "Tsallis entropy". This entropy was embedded in a generalization of classical statistics, formulated for a non-extensive thermodynamics. For systems with long-range interactions or long-time memory, Tsallis used an approach inspired by multifractal concepts. Just as the scaling functions of universal multifractals depend on a multifractality index, the Tsallis entropy depends on a dimensionless parameter; when this parameter tends to 1, the entropy recovers the expression of the Boltzmann-Gibbs entropy.
Today, a rapidly increasing number of natural and artificial systems are studied by means of the Tsallis entropy [2]. Of all the studies referring to it, a large part, more than a thousand, concern its application in image processing. The Tsallis entropy enters image processing through the problem of image segmentation, a task which aims at separating the image pixels into classes, for instance into pixels pertaining to objects and pixels pertaining to the background. This very important process is the first step towards understanding the components of the image and recognizing and extracting their features [3].
Segmentation can be performed by image thresholding, usually classified as bi-level or multi-level. Bi-level thresholding separates the pixels into two classes, one containing the pixels with gray levels below the threshold, the other those with gray levels above it. Multi-level thresholding generalizes this to several classes [3]. Here, we discuss the use of the Tsallis entropy in bi-level and multi-level image thresholding.

Tsallis entropy and non-additivity
Let us have a discrete set of probabilities $\{p_i\}$, where $i$ is a discrete random variable. The condition on the probabilities is $\sum_i p_i = 1$. For any real parameter $q$, the Tsallis entropy is defined as:

$$S_q = \frac{k}{q-1}\left(1 - \sum_i p_i^q\right) \tag{1}$$

The parameter $q$ appearing in (1) is sometimes named the "entropic index"; $k$ is a constant.
In the limit $q \to 1$, the usual Boltzmann-Gibbs entropy is recovered, namely:

$$S_1 = -k \sum_i p_i \ln p_i$$

When this entropy is used in information theory, it assumes the form of the Shannon entropy ($k = 1$):

$$S = -\sum_i p_i \ln p_i$$

The Tsallis entropy has been used along with the principle of maximum entropy to derive the Tsallis distributions. For instance, the q-Gaussians are the distributions maximizing the Tsallis entropy, playing the same role as the Gaussians in the Boltzmann-Gibbs theory. The Boltzmann-Gibbs and Shannon (BGS) statistics applies naturally to systems having short-range microscopic interactions and microscopic memory. Systems obeying BGS statistics are called extensive systems.
Let us consider a physical system decomposed into two independent systems A and B, having joint probability $p(A,B) = p(A)\,p(B)$. The BGS entropy is additive:

$$S(A,B) = S(A) + S(B)$$

When the Tsallis entropy is evaluated, taking $k = 1$, we have instead:

$$S_q(A,B) = S_q(A) + S_q(B) + (1-q)\,S_q(A)\,S_q(B)$$

so that $q$ measures the departure from additivity.
In fact, Constantino Tsallis and Alfred Renyi both proposed entropies that, for $q \to 1$, reduce to the Shannon entropy. The Renyi entropy is defined as [4]:

$$R_q = \frac{1}{1-q}\,\ln \sum_i p_i^q$$

This entropy is additive, and the parameter $q$ makes it more or less sensitive to the shape of the probability distribution [5]. The link between the Tsallis and Renyi entropies is given in [1]; for $k = 1$:

$$R_q = \frac{\ln\left[1 + (1-q)\,S_q\right]}{1-q}$$

Let us assume two independent systems A and B again. Since the Renyi entropy is additive:

$$R_q(A,B) = R_q(A) + R_q(B)$$

Inserting the link between the two entropies gives $1 + (1-q)\,S_q(A,B) = \left[1 + (1-q)\,S_q(A)\right]\left[1 + (1-q)\,S_q(B)\right]$, which is again the non-additive composition rule of the Tsallis entropy.
An entropy of the same form had been introduced earlier by Havrda and Charvát. It differs from that proposed independently by Tsallis in the normalization factor: the Havrda and Charvát entropy is normalized to 1, whereas the Tsallis entropy is not normalized. As noted in [7], for the use made in that reference both entropies yield the same result, and for this reason the entropy is also named the Tsallis-Havrda-Charvát entropy.
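The composition rule for independent systems and the link between the Renyi and Tsallis entropies can be checked numerically. The following is a minimal sketch with $k = 1$; the function names and the two example distributions are illustrative assumptions, not taken from the references.

```python
import math

def tsallis_entropy(p, q, k=1.0):
    """Tsallis entropy S_q = k * (1 - sum_i p_i**q) / (q - 1)."""
    if abs(q - 1.0) < 1e-12:
        # Limit q -> 1: Boltzmann-Gibbs form, -k * sum_i p_i ln p_i
        return -k * sum(pi * math.log(pi) for pi in p if pi > 0)
    return k * (1.0 - sum(pi ** q for pi in p if pi > 0)) / (q - 1.0)

def renyi_entropy(p, q):
    """Renyi entropy R_q = ln(sum_i p_i**q) / (1 - q), for q != 1."""
    return math.log(sum(pi ** q for pi in p if pi > 0)) / (1.0 - q)

def joint_distribution(pa, pb):
    """Joint probabilities p(A,B) = p(A) p(B) of two independent systems."""
    return [x * y for x in pa for y in pb]

# Arbitrary illustrative distributions (assumed, not from the paper)
pa = [0.5, 0.3, 0.2]
pb = [0.7, 0.3]
q = 0.8

s_a = tsallis_entropy(pa, q)
s_b = tsallis_entropy(pb, q)
s_ab = tsallis_entropy(joint_distribution(pa, pb), q)

# Non-additive composition rule for k = 1
pseudo_additive = s_a + s_b + (1.0 - q) * s_a * s_b

# Renyi entropy obtained from the Tsallis entropy (k = 1)
renyi_from_tsallis = math.log(1.0 + (1.0 - q) * s_a) / (1.0 - q)

print(abs(s_ab - pseudo_additive) < 1e-12)
print(abs(renyi_entropy(pa, q) - renyi_from_tsallis) < 1e-12)
```

Evaluating `tsallis_entropy` at a value of `q` very close to 1 gives a result close to the Boltzmann-Gibbs branch, which is a quick way to see the limit at work.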

Entropy, information and images
Before Tsallis proposed his entropy, the maximum entropy principle was already considered a powerful method for image processing and reconstruction. As explained in [8], the method had the privileged position of being the only consistent method for combining different data into a single image. In fact, the maximum entropy method allows incorporating extra knowledge about the object represented in the image.
In 2000, Takuya Yamano generalized Shannon's information theory in a non-additive way, in a work [9] where he explored the consequences of adopting a non-additive information content and a non-additive entropy. In 2003, this generalization was extended to image processing, specifically to image segmentation [10].
One of the simplest methods used for segmentation is thresholding. In [11], a survey of thresholding methods is given, which groups them according to the information the algorithms manipulate. Among them, we find methods based on the histogram of the gray levels and methods based on clustering, where the gray-level samples are clustered in two parts, background and objects. There are also the entropy-based methods: as noted in [10], Kapur et al. [12] assumed two probability distributions, one for the object and the other for the background, and maximized the total entropy of the partitioned image in order to obtain the threshold level. In [10], the authors used a method similar to the maximum entropy sum method of Kapur et al., but based on the Tsallis entropy.

Entropy of an image
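Let us consider an image having $k$ gray levels. Let $f(x,y)$ be the gray value of the pixel located at the point $(x,y)$. The gray levels $i = 1, \dots, k$ have a probability distribution $\{p_i\}$, with $\sum_{i=1}^{k} p_i = 1$. This distribution is estimated by the normalized histogram of the image:

$$p_i = \frac{n_i}{n}$$

where $n_i$ is the number of pixels having gray level $i$ and $n$ is the total number of pixels. The Tsallis entropy is:

$$S_q = \frac{1}{q-1}\left(1 - \sum_{i=1}^{k} p_i^q\right)$$

Let us stress again that the Tsallis entropy is a function of the parameter $q$.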
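As an illustration, the entropy of a small image can be computed from its normalized histogram. This is a minimal sketch: the image is a made-up list of lists, the gray levels are indexed from 0 to `levels - 1` instead of from 1 to $k$, and the function names are assumptions made for this example.

```python
def gray_level_probabilities(image, levels):
    """Normalized histogram: p_i = (pixels with gray level i) / (all pixels)."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    n = sum(counts)
    return [c / n for c in counts]

def tsallis_entropy(p, q):
    """Tsallis entropy with k = 1, for q != 1; empty gray levels are skipped."""
    return (1.0 - sum(pi ** q for pi in p if pi > 0)) / (q - 1.0)

# Toy 4x4 image with four gray levels, each covering one quarter of the pixels
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]

p = gray_level_probabilities(image, levels=4)
s2 = tsallis_entropy(p, q=2.0)
print(p, s2)
```

For this uniform histogram and $q = 2$, the entropy is $1 - 4 \cdot (1/4)^2 = 0.75$.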

Entropic segmentation via thresholding
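Let us consider again an image having $k$ gray levels, with a distribution of probabilities $\{p_i\}$, $i = 1, \dots, k$. Let us assume a bi-level threshold $t$ for the gray levels. In [10], two classes, A and B, were introduced, with probability distributions:

$$p^A:\ \frac{p_1}{P^A}, \frac{p_2}{P^A}, \dots, \frac{p_t}{P^A}; \qquad p^B:\ \frac{p_{t+1}}{P^B}, \frac{p_{t+2}}{P^B}, \dots, \frac{p_k}{P^B} \tag{15}$$

In (15), $P^A = \sum_{i=1}^{t} p_i$ and $P^B = \sum_{i=t+1}^{k} p_i$. The Tsallis entropy for each distribution is:

$$S_q^A(t) = \frac{1}{q-1}\left[1 - \sum_{i=1}^{t}\left(\frac{p_i}{P^A}\right)^q\right], \qquad S_q^B(t) = \frac{1}{q-1}\left[1 - \sum_{i=t+1}^{k}\left(\frac{p_i}{P^B}\right)^q\right]$$

The total entropy is:

$$S_q(t) = S_q^A(t) + S_q^B(t) + (1-q)\,S_q^A(t)\,S_q^B(t)$$

In [10], this entropy, which is a function of the threshold $t$, is maximized to obtain the threshold level, $t^* = \arg\max_t S_q(t)$. Since the final result depends on $q$, this coefficient can be used as a tuning parameter.
Renyi entropy based image thresholding [13][14][15] and Tsallis entropy based image thresholding [10] are two important global threshold selection approaches in image segmentation. An equivalence relationship between the two approaches has been revealed: with the same parameter $q$, the two approaches obtain the same threshold.
In the previous discussion we have talked about gray tones. Of course, images have colors: all that we have said can be applied to each color channel (red, green and blue). In some previous papers [16][17][18], the reader can find an example of thresholding on color tones for a specific application: the digital restoration of manuscripts and drawings.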
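The threshold selection can be sketched as an exhaustive search over $t$. The code below is an illustration under stated assumptions, not the implementation of [10]: gray levels are indexed from 0, class A contains the levels $0, \dots, t$, thresholds leaving an empty class are skipped, $q \ne 1$, and the histogram is invented for the example. A Renyi criterion (sum of the Renyi entropies of the two classes) is included to illustrate the equivalence mentioned above.

```python
import math

def class_power_sum(p, lo, hi, q):
    """Return sum_i (p_i / P)**q over lo <= i < hi, or None if the class is empty."""
    mass = sum(p[lo:hi])
    if mass <= 0.0:
        return None
    return sum((pi / mass) ** q for pi in p[lo:hi] if pi > 0)

def tsallis_total(x_a, x_b, q):
    """Total entropy S_A + S_B + (1 - q) S_A S_B from the class power sums."""
    s_a = (1.0 - x_a) / (q - 1.0)
    s_b = (1.0 - x_b) / (q - 1.0)
    return s_a + s_b + (1.0 - q) * s_a * s_b

def renyi_total(x_a, x_b, q):
    """Sum of the Renyi entropies of the two classes."""
    return (math.log(x_a) + math.log(x_b)) / (1.0 - q)

def best_threshold(p, q, criterion):
    """Exhaustive search for the threshold t maximizing the given criterion."""
    best_t, best_val = None, -math.inf
    for t in range(len(p) - 1):
        x_a = class_power_sum(p, 0, t + 1, q)
        x_b = class_power_sum(p, t + 1, len(p), q)
        if x_a is None or x_b is None:
            continue
        val = criterion(x_a, x_b, q)
        if val > best_val:
            best_t, best_val = t, val
    return best_t

# Invented bimodal histogram: two dark levels and two bright levels
p = [0.25, 0.25, 0.0, 0.0, 0.0, 0.0, 0.25, 0.25]
t_tsallis = best_threshold(p, 0.8, tsallis_total)
t_renyi = best_threshold(p, 0.8, renyi_total)
print(t_tsallis, t_renyi)
```

For this histogram every threshold lying in the gap between the two modes gives the same entropy, and the search returns the first of them; both criteria select the same value.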

Multi-level thresholding
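In computer vision and image processing, the reduction of a gray-level image to a binary image can be obtained through a clustering-based image thresholding. Examples are given in Figures 1 and 2. However, we can extend the method to multi-level thresholding. Let us consider again an image having $k$ gray levels, with a distribution of probabilities $\{p_i\}$, $i = 1, \dots, k$. With $m$ thresholds $t_1 < t_2 < \dots < t_m$ (and $t_0 = 0$, $t_{m+1} = k$), the gray levels are divided into $m+1$ classes, the $j$-th class having probability $P_j = \sum_{i=t_{j-1}+1}^{t_j} p_i$. The Tsallis entropy for each distribution and the total entropy are:

$$S_q^{(j)} = \frac{1}{q-1}\left[1 - \sum_{i=t_{j-1}+1}^{t_j}\left(\frac{p_i}{P_j}\right)^q\right], \qquad j = 1, \dots, m+1$$

$$1 + (1-q)\,S_q(t_1, \dots, t_m) = \prod_{j=1}^{m+1}\left[1 + (1-q)\,S_q^{(j)}\right]$$

The second relation follows from applying the non-additive composition rule of the Tsallis entropy repeatedly to independent classes; for $m = 1$ it gives $S_q^{(1)} + S_q^{(2)} + (1-q)\,S_q^{(1)}\,S_q^{(2)}$. The thresholds are those maximizing the total entropy.
In [19], multi-level thresholding in image segmentation is obtained by combining the Tsallis entropy and Particle Swarm Optimization (PSO). PSO is a computational optimization method which iteratively tries to improve a candidate solution. In [20], the authors proposed a multi-level thresholding method for image segmentation using the artificial bee colony approach to reduce the processing time.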
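A minimal sketch of multi-level thresholding by exhaustive search is given below. It assumes that the total entropy of several classes is obtained by applying the non-additive composition rule repeatedly, that is $1 + (1-q)S_q = \prod_j [1 + (1-q)S_q^{(j)}]$; this choice, the 0-based gray levels, the function names and the example histogram are assumptions of the sketch. The exhaustive search grows combinatorially with the number of thresholds, which is why optimizers such as those of [19] and [20] are used in practice.

```python
from itertools import combinations

def class_entropy(p, lo, hi, q):
    """Tsallis entropy (k = 1, q != 1) of the class lo <= i < hi; None if empty."""
    mass = sum(p[lo:hi])
    if mass <= 0.0:
        return None
    return (1.0 - sum((pi / mass) ** q for pi in p[lo:hi] if pi > 0)) / (q - 1.0)

def total_entropy(entropies, q):
    """Repeated composition: 1 + (1 - q) S = prod_j [1 + (1 - q) S_j]."""
    prod = 1.0
    for s in entropies:
        prod *= 1.0 + (1.0 - q) * s
    return (prod - 1.0) / (1.0 - q)

def multilevel_thresholds(p, q, m):
    """Exhaustive search for m thresholds; each t is the last level of its class."""
    k = len(p)
    best, best_val = None, float("-inf")
    for ts in combinations(range(k - 1), m):
        bounds = [0] + [t + 1 for t in ts] + [k]
        ents = [class_entropy(p, bounds[j], bounds[j + 1], q) for j in range(m + 1)]
        if any(e is None for e in ents):
            continue  # skip partitions that leave a class empty
        val = total_entropy(ents, q)
        if val > best_val:
            best, best_val = ts, val
    return best

# Invented histogram with three groups of gray levels: {0, 1}, {4, 5}, {7, 8}
p = [1 / 6, 1 / 6, 0.0, 0.0, 1 / 6, 1 / 6, 0.0, 1 / 6, 1 / 6]
thresholds = multilevel_thresholds(p, 0.8, 2)
print(thresholds)
```

With two thresholds the search separates the three groups of the example histogram; for one threshold, `total_entropy` reduces to the bi-level expression.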

The Two-Dimensional Histogram
As previously told, $f(x,y)$ is the gray value of the pixel located at the point $(x,y)$. For segmentation, we can also use the average gray value of the neighborhood of each pixel. Let $g(x,y)$ be the average of the neighborhood of the pixel located at the point $(x,y)$. For instance, we can use the integer part of the arithmetic mean of the gray values of the given pixel at $(x,y)$ and of its eight nearest neighboring pixels:

$$g(x,y) = \left\lfloor \frac{1}{9} \sum_{a=-1}^{1} \sum_{b=-1}^{1} f(x+a,\, y+b) \right\rfloor$$

While computing the average gray value, it is necessary to disregard the two rows from the top and bottom and two columns from the sides [15]. The gray value of the pixel, $f(x,y)$, and the average of its neighborhood, $g(x,y)$, are used to construct a two-dimensional histogram. The normalized histogram is approximated by the formula [15]:

$$p(i,j) = \frac{n_{ij}}{n} \tag{30}$$

where $n_{ij}$ is the number of pixels having $f(x,y) = i$ and $g(x,y) = j$, and $n$ is the total number of pixels considered.
Let us discuss the threshold to distinguish object and background. The threshold is obtained through a vector $(t,s)$, where $t$, for $f(x,y)$, represents the threshold of the gray level of the pixel and $s$, for $g(x,y)$, represents the threshold of the average gray level of the pixel's neighborhood. Using (30), we find a surface that will have two peaks and one valley. The object and background correspond to the peaks and can be separated by selecting the vector $(t,s)$ that maximizes a suitable criterion function $U(t,s)$.
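The construction of the two-dimensional histogram can be sketched as follows. In this illustration the border rows and columns are simply skipped, which is one reading of the prescription of [15]; the 0-based gray levels, the function name and the toy image are assumptions of the example.

```python
def two_dim_histogram(image, levels):
    """Normalized 2D histogram p(i, j) of gray value f = i and 3x3 average g = j."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    n = 0
    # Border pixels are skipped: they lack a complete 3x3 neighborhood
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            total = sum(image[x + a][y + b] for a in (-1, 0, 1) for b in (-1, 0, 1))
            g = total // 9  # integer part of the arithmetic mean
            counts[image[x][y]][g] += 1
            n += 1
    return [[c / n for c in row] for row in counts]

# Toy image: a dark half (level 0) next to a bright half (level 3)
image = [[0, 0, 0, 3, 3, 3] for _ in range(4)]
hist = two_dim_histogram(image, levels=4)

for i, row in enumerate(hist):
    for j, value in enumerate(row):
        if value > 0:
            print(i, j, value)
```

Pixels far from the boundary between the two halves fall on the diagonal of the histogram ($f = g$), whereas pixels next to the boundary fall off the diagonal, which is where edges and noise are expected.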
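Using the vector $(t,s)$, the domain of the histogram is divided into four quadrants. The first and third quadrants contain information about edges and noise alone, and therefore they are ignored in the calculation. The quadrants which contain the object and the background are the second and fourth, namely those with $i \le t$, $j \le s$ and with $i > t$, $j > s$; they are considered to be independent distributions, so the probability values in each case must be normalized in order to have a total probability equal to 1. The normalization is accomplished by using the a posteriori class probabilities:

$$P^A(t,s) = \sum_{i=1}^{t} \sum_{j=1}^{s} p(i,j), \qquad P^B(t,s) = \sum_{i=t+1}^{k} \sum_{j=s+1}^{k} p(i,j)$$

In [12], the contribution of the quadrants which contain the edges and noise is assumed negligible, hence it is approximated:

$$P^B(t,s) \approx 1 - P^A(t,s)$$

The distributions and entropies are:

$$p^A(i,j) = \frac{p(i,j)}{P^A(t,s)}, \quad i \le t,\ j \le s; \qquad p^B(i,j) = \frac{p(i,j)}{P^B(t,s)}, \quad i > t,\ j > s$$

$$S_q^A(t,s) = \frac{1}{q-1}\left[1 - \sum_{i=1}^{t} \sum_{j=1}^{s} \left(\frac{p(i,j)}{P^A(t,s)}\right)^q\right], \qquad S_q^B(t,s) = \frac{1}{q-1}\left[1 - \sum_{i=t+1}^{k} \sum_{j=s+1}^{k} \left(\frac{p(i,j)}{P^B(t,s)}\right)^q\right]$$

Again, the total entropy is:

$$S_q(t,s) = S_q^A(t,s) + S_q^B(t,s) + (1-q)\,S_q^A(t,s)\,S_q^B(t,s) \tag{38}$$

The entropy (38) is a function of the thresholds $t$ and $s$; we can find them by maximizing the entropy. The parameter $q$ can be used as a tuning parameter.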
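The search for the vector $(t,s)$ can be sketched by brute force on a small two-dimensional histogram. In this illustration each of the two quadrants is normalized by its own probability (the approximation $P^B \approx 1 - P^A$ is not used), levels are indexed from 0, $q \ne 1$, and the histogram and function names are invented for the example.

```python
def quadrant_entropy(hist, i_range, j_range, q):
    """Tsallis entropy (k = 1) of the normalized quadrant; None if it is empty."""
    cells = [hist[i][j] for i in i_range for j in j_range if hist[i][j] > 0]
    mass = sum(cells)
    if mass <= 0.0:
        return None
    return (1.0 - sum((c / mass) ** q for c in cells)) / (q - 1.0)

def tsallis_threshold_2d(hist, q):
    """Exhaustive search for the pair (t, s) maximizing the total entropy."""
    k = len(hist)
    best, best_val = None, float("-inf")
    for t in range(k - 1):
        for s in range(k - 1):
            # Quadrant A: i <= t, j <= s; quadrant B: i > t, j > s
            s_a = quadrant_entropy(hist, range(0, t + 1), range(0, s + 1), q)
            s_b = quadrant_entropy(hist, range(t + 1, k), range(s + 1, k), q)
            if s_a is None or s_b is None:
                continue
            val = s_a + s_b + (1.0 - q) * s_a * s_b
            if val > best_val:
                best, best_val = (t, s), val
    return best

# Invented 8x8 histogram with two peaks on the diagonal
k = 8
hist = [[0.0] * k for _ in range(k)]
for i in (0, 1, 6, 7):
    hist[i][i] = 0.25

t_best, s_best = tsallis_threshold_2d(hist, 0.8)
print(t_best, s_best)
```

The cost of this search grows quickly with the number of gray levels, so for real images the quadrant sums would normally be accumulated incrementally rather than recomputed for every pair.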

Discussion
As we have seen, the Tsallis entropy is easy to use for discrete data and the related frequencies. For this reason, it has assumed a relevant role in several numerical applications. In particular, we find it used in medical image processing, a quite dynamic branch of image processing [21]. This processing is based, among several other methods, on clustering unwanted lesions or regions in a noisy background and on highlighting the edges of poorly illuminated images. Segmentation is often used as the starting point of the analysis. In [22], the Tsallis entropy is used to set the parameters of a segmentation based on a pulse-coupled neural network (PCNN). Pulse-coupled networks are models proposed for high-performance biomimetic image processing.
A feature extraction method for image processing via PCNN and Tsallis entropy is presented in [23] as well. Some of the most recent papers using the Tsallis entropy in medical image processing range from thresholding to the problem of image registration [24][25][26][27].
As for the role of the Tsallis entropy in pattern recognition, [28] compared its effectiveness with that of the classic Boltzmann-Gibbs-Shannon entropy and proposed a multi-q approach to improve pattern analysis. The experiments in [28] show that the Tsallis entropy with the multi-q approach has great advantages over the Boltzmann-Gibbs-Shannon entropy for pattern classification, and that the approach improves the image recognition rates. As explained in [28], this happens because the Tsallis entropy, evaluated for different values of the parameter $q$, encodes much more information about the given probability distribution than the Boltzmann-Gibbs-Shannon entropy alone.
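The idea of using several values of $q$ can be illustrated by collecting the corresponding entropies in a feature vector. The sketch below is only a reading of that idea, not the method of [28]: the function names, the choice of the $q$ values and the example distribution are assumptions.

```python
import math

def tsallis_entropy(p, q):
    """Tsallis entropy with k = 1; the q = 1 case is the Shannon entropy."""
    if abs(q - 1.0) < 1e-12:
        return -sum(pi * math.log(pi) for pi in p if pi > 0)
    return (1.0 - sum(pi ** q for pi in p if pi > 0)) / (q - 1.0)

def multi_q_features(p, q_values):
    """Feature vector made of the Tsallis entropies of p for several q."""
    return [tsallis_entropy(p, q) for q in q_values]

# Invented distribution and an arbitrary set of entropic indices
p = [0.5, 0.25, 0.125, 0.125]
features = multi_q_features(p, [0.5, 1.0, 1.5, 2.0])
print(features)
```

A single Shannon entropy gives one number per distribution, whereas the vector above gives one number per value of $q$, each weighting rare and frequent events differently.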