Biometric application and classification of individuals using postural parameters

580 | Page, June 5, 2013
Dhouha Maatar 1, Régis Fournier, Zied Lachiri, Amine Naitali 2
1 UR, Traitement du Signal de l'Image et Reconnaissance des Formes (UR-TSIRF), École Nat. Des Ingénieurs de Tunis, Tunisie. doha.maatar@gmail.com, Zied.lachiri@enit.rnu.tn
2 Laboratoire Image, signaux et Systèmes Intelligents (LISSI), Université Paris Est Créteil (UPEC), Paris, France. naitali@u-pec.fr, rfournier@u-pec.fr
ABSTRACT


INTRODUCTION
Being able to determine effectively and exactly the identity of an individual, or to determine their physical category according to age, gender, size or weight, has become a critical issue because access control, security and monitoring are now matters of great importance. Indeed, our identity is verified daily by multiple organizations: when we access our workplace, when we use our credit card, when we connect to a computer network. Biometrics consists of methods for identifying humans based upon one or more physical or behavioral traits. The main desirable properties of a biometric are to be universal, measurable, unique, permanent, powerful, difficult to falsify or reproduce, and well accepted by users. Biometric identifiers are the distinctive, measurable characteristics used to identify individuals. They fall into two categories: physiological and behavioral characteristics. Physiological characteristics are related to the shape of the body and include fingerprint, face, DNA, palm print, hand geometry and iris. Behavioral characteristics are related to the behavior of a person and include typing rhythm, gait and voice.
Biometric science has recently been growing. It has become the focus of much research and tends to be associated with high-security technology, but the high cost of biometric technologies has long hampered their development. The challenge is therefore to find biometric identifiers that ensure effective and exact recognition at the lowest technology cost. Research on biometrics and classification uses several recognition approaches, such as linear discriminant analysis (LDA), applied to multimodal face and iris recognition [21] or to face recognition alone [20,27]. LDA is also used to study age-related changes in postural control during quiet standing [11]. The recognition approaches K-Nearest Neighbors (KNN), Support Vector Machine (SVM), linear discriminant analysis (LDA) and Neural Networks (NN) are also used to classify human postural and gestural movements [24].
In this study, we compare the SVM, KNN and LDA methods for identifying persons and classifying them by age, gender, height and weight using stabilometric parameters.
We therefore consider stabilometry (postural sway control) as a behavioral biometric identifier. Postural sway is generally quantified by the displacement of the center of pressure (CoP) over time. This displacement is measured while the subject stands still on a low-cost platform based on a magnetic field [15,16].
Postural sway control is studied by analyzing the stabilogram, which is the representation of the displacement of the center of pressure (CoP) in the anteroposterior (AP) and mediolateral (ML) directions.
To study human stability, several studies analyze the stabilogram, which is known to be a non-stationary and nonlinear signal [10]. For that purpose they use different decomposition methods, such as the Empirical Mode Decomposition (EMD), which extracts intrinsic mode functions (IMF) from the stabilogram [3,4,17,18]. This decomposition allows discrimination between elderly and control subjects [17,18]. Standard Fourier measures are also used in stability analysis to demonstrate a relationship between fear of falling and the strategies used for human postural control [13].
The wavelet method is used in several studies to determine the critical point and both the short-term and long-term diffusion coefficients from the stabilogram diffusion control [6,8,29]. Wavelet analysis has also been used to discriminate chronic ankle instability [19], and to decompose the signal into multiple timescales and extract the energy in order to study the effect of vision and age on postural stability [12].
Similarly, the discrete wavelet method is used to decompose the CoP time series into different timescales [25].
Another method, the PCA decomposition used in several previous studies [15,17], decomposes the stabilogram into three timescale components (trend, rambling, trembling). This method is close to that of a previous study suggesting the presence of two timescale components (rambling and trembling) in the CoP time series [28].
Other studies do not decompose the stabilogram signal but extract specific parameters, such as the RMS, the mean velocity of body sway and the mean CoP amplitude, to study the effect of factors such as aging on stability [2,7,14,23].
In this study, discrete wavelet decomposition, mPCA decomposition and time-frequency analysis are performed to extract stabilometric parameters, which are used to identify subjects and classify them by age, gender, height and weight. This is done with the pattern recognition approaches SVM, LDA and KNN. This paper is organized as follows: the experimental protocol is described in section 2. The decomposition methods, the parameters and the classifiers are presented in section 3. Finally, our results and discussions are presented in section 4.

2.1 Experimental protocol
Experimental measures were recorded while an individual stood upright on an electromagnetic platform [15]. After calibration and correction phases were applied to these measures, we obtained the CoP displacement in the horizontal plane. The representation of this CoP displacement in the mediolateral (ML) or anteroposterior (AP) direction is called the stabilogram [17].
The measures are taken with subjects standing in orthostatic position with their arms by their sides for 30 seconds. Each recorded signal is sampled at 60 Hz. The measures are evaluated for twenty-five healthy subjects (8 females and 17 males, aged between 19 and 42 years) under four types of measures:
- FO_EO: feet apart and eyes open, fixating a point placed on the wall in front of the subject.
- FT_EO: feet together and eyes open.
- FO_EC: feet apart and eyes closed.
- FT_EC: feet together and eyes closed.
The four types of measures are gathered in a set, and several sets are recorded for each subject; together these sets constitute the database.

Decomposition methods

mPCA decomposition
The stabilogram, known to be a nonlinear and non-stationary signal, appears to result from the superposition of several signals with different characteristics [10]. These signals can be differentiated by their temporal and dimensional characteristics.
As developed by Fournier [15], the mPCA decomposition is performed in two steps.
First, the signal is considered to be composed of a deterministic signal with slow fluctuations and a high-frequency signal. The two signals must have distinct temporal and dimensional characteristics. In particular, the embedding dimension Dc of the high-frequency signal in phase space must be higher than the dimension Dd of the deterministic signal. Thus, representing the signal in a phase space of dimension Dr (with Dr > Dd) and projecting it onto the first Dp principal axes, obtained by principal component analysis, retains only the signal from the deterministic system. After the deterministic signal is subtracted from the original one, the chaotic signal with the highest frequency is extracted.
Second, the resulting deterministic signal is divided into trend and rambling. The separation is performed by searching for a polynomial approximation. The chaotic signal resulting from the first decomposition is the trembling.
So, the signal is decomposed into three distinct components, namely trend, rambling and trembling (Figure.

Discrete wavelet decomposition
The wavelet transform is one of the most powerful mathematical tools for signal analysis.
It is particularly suitable for analyzing non-stationary signals. The wavelet transform also has the advantage of analyzing signals in a multi-scale manner by varying the scale coefficient (representing frequency) [29].
The wavelet function is defined at scale a and location b as:

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right)$$

The functions $\psi_{a,b}(t)$ are also known as ''child wavelets'' and are derived from a basis function referred to as the ''mother wavelet,'' $\psi(t)$ [1].
The wavelet transform is given by:

$$T(a,b) = \int_{-\infty}^{\infty} x(t)\,\psi^{*}_{a,b}(t)\,dt$$

where x(t) is the time series data and T(a,b) is the ''wavelet coefficient'' (WC) at timescale $t_a$ and time instant b [29].
The decomposition of a signal requires a discrete wavelet transform to determine the WCs; in that case both the timescale $t_a$ and the time instant b are discrete:

$$a = 2^{j} \quad \text{and} \quad b = k\,2^{j}$$

where j = 1,…, J are the discrete levels of timescales at which the signal can be represented, so that the timescale range at level j is $t_j$, and k = 0,…, K(j) are the discrete locations.
''Detail WCs'' are the WCs at each of the discrete levels of timescales, defined as:

$$T(j,k) = \int_{-\infty}^{\infty} x(t)\,\psi^{*}_{j,k}(t)\,dt, \qquad \psi_{j,k}(t) = 2^{-j/2}\,\psi\!\left(2^{-j}t - k\right)$$

Similarly, ''approximation WCs'' are determined as:

$$S(j,k) = \int_{-\infty}^{\infty} x(t)\,\phi_{j,k}(t)\,dt, \qquad \phi_{j,k}(t) = 2^{-j/2}\,\phi\!\left(2^{-j}t - k\right)$$

where $\phi_{j,k}(t)$ is a scaling function of the wavelet, associated with the smoothing of the signal, and S(j,k) is the ''approximation WC'' at level j and location k. These scaling functions derive from a basis function referred to as the ''father wavelet,'' $\phi(t)$ [5]. In this study, the Daubechies wavelet (db2) is used to decompose the stabilogram into detail WCs and approximation WCs at three timescale levels (Figure.3).
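The three-level db2 decomposition described above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the periodic extension at the signal borders, the function names `dwt_step` and `wavedec_db2`, and the synthetic test signal are all assumptions made for the example. Library implementations may treat the borders differently.

```python
import math

# Daubechies db2 scaling (low-pass) filter; the wavelet (high-pass) filter
# follows from the quadrature-mirror relation g[m] = (-1)^m * h[3 - m].
_S3, _NORM = math.sqrt(3.0), 4.0 * math.sqrt(2.0)
H = [(1 + _S3) / _NORM, (3 + _S3) / _NORM, (3 - _S3) / _NORM, (1 - _S3) / _NORM]
G = [(-1) ** m * H[3 - m] for m in range(4)]


def dwt_step(x):
    """One analysis level with periodic extension (an assumption of this sketch)."""
    n = len(x)
    if n % 2 or n < 4:
        raise ValueError("signal length must be even and at least 4")
    approx, detail = [], []
    for k in range(n // 2):
        window = [x[(2 * k + m) % n] for m in range(4)]
        approx.append(sum(h * v for h, v in zip(H, window)))
        detail.append(sum(g * v for g, v in zip(G, window)))
    return approx, detail


def wavedec_db2(x, levels=3):
    """Return (approximation at the last level, [cd1, ..., cd_levels])."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        details.append(detail)
    return approx, details


# Synthetic stand-in for a 30 s stabilogram sampled at 60 Hz (1800 samples).
fs, duration = 60, 30
signal = [math.sin(2 * math.pi * 0.5 * n / fs) + 0.2 * math.sin(2 * math.pi * 7 * n / fs)
          for n in range(fs * duration)]
ca3, (cd1, cd2, cd3) = wavedec_db2(signal, levels=3)
```

With this filter pair the transform is orthogonal, so the energy of the signal is split between `ca3`, `cd1`, `cd2` and `cd3` without loss, which is convenient when energies per timescale are compared.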

2.3 Features description

mPCA and wavelet parameters
Each signal has its related analytic signal Z(t), defined as:

$$Z(t) = s(t) + i\,h(t)$$

where s(t) is the original signal and h(t) is the Hilbert transform of s(t), defined as:

$$h(t) = \frac{1}{\pi}\,\mathrm{P.V.}\int_{-\infty}^{\infty}\frac{s(\tau)}{t-\tau}\,d\tau$$

where P.V. is the Cauchy principal value [3].
The trajectory of the stabilogram's analytic signal in the complex plane can be visualized (Figure.4). As shown in Figure.4, the trajectory in the complex plane does not show a unique rotation center but a multiplicity of centers.
However, the trajectory in the complex plane of the trembling (or of the rambling) resulting from the PCA decomposition highlights a unique rotation center (respectively Figure.

Figure.9: Trajectory in the complex plane (s,h,t) and projection in the plane (s,h) related to cd2.
Because the trembling and rambling trajectories, and likewise the detail wavelet coefficient trajectories, have a unique rotation center, a specific parameter is defined: the area of the circle in which 95% of the data points are located [4]. This feature is calculated for the AP and ML directions, for the four measurement situations (FO_EO, FO_EC, FT_EO, FT_EC) and for each of the two components resulting from the mPCA decomposition, trembling and rambling. These parameters (2 directions × 4 situations × 2 components = 16 values of area) are labelled mPCA parameters.
The feature is also calculated for the AP and ML directions, for the four measurement situations (FO_EO, FO_EC, FT_EO, FT_EC) and for each of the two components cd2 and cd3 resulting from the wavelet decomposition. These parameters (16 values of area) are labelled wavelet parameters.
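A possible way to compute this area feature is sketched below. The paper does not specify how the circle is placed, so the sketch assumes it is centred on the centroid of the trajectory and takes as radius the distance that encloses 95% of the points. The function names `analytic_signal` and `circle_area_95` are hypothetical, and the naive O(N²) Fourier transform is only meant for short illustrative signals.

```python
import cmath
import math


def analytic_signal(s):
    """Analytic signal Z = s + i*h computed through a naive O(N^2) DFT."""
    n = len(s)
    spectrum = [sum(s[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    # Keep DC (and Nyquist for even n), double positive frequencies, zero the rest.
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            weight = 1.0
        elif k < (n + 1) // 2:
            weight = 2.0
        else:
            weight = 0.0
        spectrum[k] *= weight
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def circle_area_95(z, fraction=0.95):
    """Area of the centroid-centred circle enclosing `fraction` of the points.

    Centring on the centroid is an assumption of this sketch.
    """
    n = len(z)
    centre = sum(z) / n
    radii = sorted(abs(p - centre) for p in z)
    radius = radii[math.ceil(fraction * n) - 1]
    return math.pi * radius ** 2


# Illustrative input: a pure cosine, whose analytic signal traces a unit circle.
s = [math.cos(2 * math.pi * 4 * t / 64) for t in range(64)]
z = analytic_signal(s)
area = circle_area_95(z)
```

For a component with a single rotation center the trajectory stays close to one circle, so a single area value summarizes it; this would not hold for the raw stabilogram, whose trajectory has several centers.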

Classical parameters
Balance control was assessed using several features extracted directly from the stabilogram.
For all features described in this section, CoP denotes the signal of the center of pressure's displacement in the AP direction or in the ML direction.

Mean displacement
The first feature is the mean displacement of the CoP, calculated for the AP and ML directions.

Mean velocity
The second feature is the mean velocity of the CoP in the AP and ML directions (VAP, VML).
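The exact formulas for these two features did not survive in the text, so the sketch below uses definitions that are common in posturography and should be read as assumptions: the mean displacement is taken as the mean absolute distance of the CoP from its average position, and the mean velocity as the path length along one axis divided by the recording duration. The function names are hypothetical.

```python
def mean_displacement(cop):
    """Mean absolute distance of the CoP from its average position (assumed definition)."""
    centre = sum(cop) / len(cop)
    return sum(abs(v - centre) for v in cop) / len(cop)


def mean_velocity(cop, fs):
    """Path length along one axis divided by the recording duration (assumed definition)."""
    path = sum(abs(b - a) for a, b in zip(cop, cop[1:]))
    duration = (len(cop) - 1) / fs
    return path / duration


# Tiny illustrative series sampled at 2 Hz (values are arbitrary).
cop_ap = [0.0, 1.0, 0.0, -1.0, 0.0]
d_ap = mean_displacement(cop_ap)
v_ap = mean_velocity(cop_ap, fs=2)
```

The same two functions would be applied separately to the AP and ML series of each recording.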

Confidence ellipse area
The 95% confidence ellipse area (CEA) estimates the area of the ellipse that encloses approximately 95% of the points of the CoP path on the platform.
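One common way to estimate this area, assumed here because the original formula is missing, is to model the CoP points as bivariate Gaussian and scale the square root of the covariance determinant by the 95% quantile of a chi-square distribution with two degrees of freedom. The function name `confidence_ellipse_area` is hypothetical.

```python
import math


def confidence_ellipse_area(ap, ml, level=0.95):
    """Area of the `level` confidence ellipse under a bivariate Gaussian assumption."""
    n = len(ap)
    if n != len(ml) or n < 2:
        raise ValueError("need two series of equal length with at least 2 samples")
    mean_ap, mean_ml = sum(ap) / n, sum(ml) / n
    var_ap = sum((a - mean_ap) ** 2 for a in ap) / (n - 1)
    var_ml = sum((m - mean_ml) ** 2 for m in ml) / (n - 1)
    cov = sum((a - mean_ap) * (m - mean_ml) for a, m in zip(ap, ml)) / (n - 1)
    det = var_ap * var_ml - cov ** 2
    # Exact quantile of the chi-square distribution with 2 degrees of freedom.
    chi2 = -2.0 * math.log(1.0 - level)
    return math.pi * chi2 * math.sqrt(max(det, 0.0))


# Illustrative points only, not measured data.
cea = confidence_ellipse_area([1.0, -1.0, 0.0, 0.0], [0.0, 0.0, 1.0, -1.0])
```

Other conventions exist (for example an F-distribution factor that depends on the number of samples); they give close values for long recordings.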

Centroidal Frequency
The centroidal frequency of the CoP in the AP (CFAP) and ML (CFML) directions is half of the number of zero crossings per second of the time series.

Power spectral density
This parameter is the average power in a given frequency band. The spectrum used is estimated by Welch's method. The mean power spectral density of the CoP is defined in the AP (PSDAP) and ML (PSDML) directions.
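The zero-crossing definition of the centroidal frequency given in this section can be sketched as follows. Removing the mean before counting sign changes is an assumption of this example (a CoP series is rarely centred on zero), and the function name is hypothetical.

```python
import math


def centroidal_frequency(cop, fs):
    """Half the number of zero crossings per second of the mean-removed series."""
    centre = sum(cop) / len(cop)
    centred = [v - centre for v in cop]
    crossings = sum(1 for a, b in zip(centred, centred[1:]) if a * b < 0)
    duration = len(cop) / fs
    return 0.5 * crossings / duration


# A 1 Hz sine recorded for 30 s at 60 Hz should give a value close to 1 Hz.
fs = 60
sine = [math.sin(2 * math.pi * 1.0 * n / fs + 0.1) for n in range(30 * fs)]
cf = centroidal_frequency(sine, fs)
```

For a pure sine the feature recovers the oscillation frequency, which is why it can be read as a dominant sway frequency.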

Classifiers
In this study, we used three supervised classification methods.

K nearest neighbours (KNN)
The k-nearest neighbour algorithm (kNN) is a method for classifying objects based on the closest training examples in the feature space.
It is amongst the simplest of all machine learning algorithms. It classifies a data point by assigning it the label most frequently represented among the k nearest training data points [9]. The object is thus classified by a majority vote of its neighbours. If k = 1, the object is simply assigned to the class of its nearest neighbour.
The kNN algorithm can separate multiple classes and distinguish nonlinear data classes. This classifier is nonlinear because its decision boundary is nonlinear.
Despite its speed, stability and scalability, kNN is sensitive to outliers and noise.
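The majority vote described above can be sketched in a few lines. The Euclidean distance, the function name `knn_predict` and the toy feature vectors are assumptions of this example; the paper does not state which distance was used.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=1):
    """Label of `query` by majority vote among its k nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy two-class data (arbitrary values, not stabilometric features).
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["A", "A", "A", "B", "B", "B"]
label_near_a = knn_predict(train_x, train_y, (0.2, 0.1), k=1)
label_near_b = knn_predict(train_x, train_y, (5.5, 5.5), k=3)
```

With k = 1, as used for subject recognition in this study, the vote reduces to copying the label of the single closest training example.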

Linear Discriminant Analysis (LDA)
The idea of LDA is to calculate, for each class to be identified, a linear function of the attributes. The class whose function yields the highest score is the predicted class. Linear Discriminant Analysis maximizes the ratio of the between-class variance to the within-class variance in any particular data set, thereby guaranteeing maximal separability [20,24]. Initially, a hyperplane is calculated to best separate the two known classes. This separation is achieved so that the intra-class variation is minimal and the inter-class variation is maximal. The unknown sample is projected onto the linear combinations and associated with the class that is closest to it in this space. LDA is thus a linear classifier, as its decision boundaries are affine hyperplanes. The decision boundary is the surface separating the points assigned to one class from the points assigned to another class.
LDA easily handles the case where the within-class frequencies are unequal, and its performance has been examined on randomly generated test data. This method also naturally handles problems with more than two classes, and it can provide probability estimates for each of the candidate classes.
Linear discriminant analysis is a powerful technique. However, because its decision boundaries are linear, it assumes that the classes are linearly separable.
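For two classes and two features, the variance-ratio criterion has a closed-form solution (Fisher's discriminant), which the sketch below illustrates. It is a simplified stand-in for the classifier used in the study: the restriction to two dimensions, the midpoint threshold (which assumes equal class priors) and the function names are assumptions.

```python
def fit_lda_2d(class0, class1):
    """Fisher discriminant for 2-D points: returns the direction w and the bias."""
    def mean(points):
        return [sum(p[d] for p in points) / len(points) for d in range(2)]

    m0, m1 = mean(class0), mean(class1)
    # Pooled within-class scatter matrix [[s00, s01], [s01, s11]].
    s00 = s01 = s11 = 0.0
    for points, m in ((class0, m0), (class1, m1)):
        for p in points:
            dx, dy = p[0] - m[0], p[1] - m[1]
            s00 += dx * dx
            s01 += dx * dy
            s11 += dy * dy
    det = s00 * s11 - s01 * s01
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    diff = (m1[0] - m0[0], m1[1] - m0[1])
    # w = Sw^-1 (m1 - m0), using the closed-form inverse of a 2x2 matrix.
    w = ((s11 * diff[0] - s01 * diff[1]) / det,
         (-s01 * diff[0] + s00 * diff[1]) / det)
    mid = ((m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2)
    bias = -(w[0] * mid[0] + w[1] * mid[1])  # threshold at the midpoint of the means
    return w, bias


def predict_lda(w, bias, p):
    """Return 1 if the point falls on the class-1 side of the hyperplane, else 0."""
    return 1 if w[0] * p[0] + w[1] * p[1] + bias > 0 else 0


# Toy data (arbitrary values).
class0 = [(1, 2), (2, 1), (2, 3), (3, 2)]
class1 = [(6, 7), (7, 6), (7, 8), (8, 7)]
w, bias = fit_lda_2d(class0, class1)
```

The decision boundary `w·p + bias = 0` is the affine hyperplane mentioned in the text; with more features the same formula applies with a general matrix inverse.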

Support vector machines (SVM)
The support vector machine (SVM) is a major evolution of supervised classification algorithms [26]. The principle of SVM is to find a hyperplane that best separates two or more training classes in the n-dimensional data space. The objective is to maximize the distance between the hyperplane and the samples lying at the borders of the classes. These borders are defined by a subset of the training data, called the support vectors. They can be defined linearly or from a family of functions (polynomial, spline). The distance between the support vectors of the classes, or borders, is the margin of the hyperplane. When no hyperplane separates the data, the samples are mapped into a higher-dimensional space. SVM minimizes the risk of overfitting, which allows the classification rule to generalize. This technique also has the advantage of being particularly robust to high-dimensional problems [22].
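The margin-maximization idea can be illustrated with a linear SVM trained by plain subgradient descent on the regularized hinge loss. This solver is a simple substitute chosen for readability, not the method used in the study (which may rely on kernels and a dedicated optimizer); the hyperparameters `lam`, `lr` and `epochs`, the function names and the toy data are all assumptions.

```python
def train_linear_svm(xs, ys, lam=0.01, lr=0.01, epochs=2000):
    """Minimize lam/2*|w|^2 + mean(max(0, 1 - y*(w.x + b))) by subgradient descent.

    Labels in `ys` must be +1 or -1.
    """
    n, dim = len(xs), len(xs[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for x, y in zip(xs, ys):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1.0:  # only margin violators contribute to the hinge term
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict_svm(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy linearly separable data (arbitrary values).
xs = [(0, 0), (0, 1), (1, 0), (3, 3), (3, 4), (4, 3)]
ys = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(xs, ys)
```

Only the points that violate the margin influence each update, which mirrors the role of the support vectors in defining the separating hyperplane.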

Results
The parameters described above are used initially to recognize 10 subjects. The classifiers used for this task are LDA and KNN with k = 1. In a second step, these parameters are used to classify 25 subjects into groups, each grouping containing two classes (by age, by gender, by weight and by height). These classifications are performed by three classifiers: LDA, KNN and SVM. Each application is carried out according to seven observations, each corresponding to a combination of different types of parameters. The observations are defined as:
1) All the chosen parameters (classical + wavelet + mPCA).
2) Classical parameters (classical).
3) Parameters from the mPCA decomposition (mPCA).
4) Parameters of the wavelet decomposition (wavelet).
5) Combination of the classical parameters and parameters from the wavelet decomposition (classical + wavelet).
6) Combination of the classical parameters and parameters from the mPCA decomposition (classical + mPCA).
7) Combination of the wavelet and mPCA parameters (wavelet + mPCA).
A correct recognition rate (CR) is calculated for each observation and each classification method.
Two further rates are computed for each application:
- An observational average rate (CRO), related to each observation: the average of the correct rates of all classifiers for one observation.
- A classifier average rate (CRC), related to each classifier: the average of the correct rates of all observations for one classifier.
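The seven observations are the non-empty combinations of the three parameter families, and CRO and CRC are row and column averages of the CR table. A minimal sketch follows; the CR values in `cr` are made up for illustration and are not results from this study, and the enumeration order differs slightly from the numbered list in the text.

```python
from itertools import combinations

families = ("classical", "wavelet", "mPCA")
# All non-empty subsets of the three families: 2^3 - 1 = 7 observations.
observations = [" + ".join(c) for r in (3, 1, 2) for c in combinations(families, r)]


def average_rates(cr):
    """cr[observation][classifier] -> CR in percent; returns (CRO, CRC) dictionaries."""
    cro = {obs: sum(row.values()) / len(row) for obs, row in cr.items()}
    classifiers = list(next(iter(cr.values())))
    crc = {clf: sum(row[clf] for row in cr.values()) / len(cr) for clf in classifiers}
    return cro, crc


# Hypothetical CR values, for illustration only.
cr = {
    "classical": {"LDA": 90.0, "KNN": 80.0},
    "wavelet": {"LDA": 70.0, "KNN": 60.0},
}
cro, crc = average_rates(cr)
```

CRO therefore ranks the parameter combinations independently of the classifier, while CRC ranks the classifiers independently of the parameters.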

Subjects recognition
Table.1 presents the results of the recognition of the 10 subjects. These results show that LDA provides the best recognition performance (CR = 80.43%), obtained with the observation combining all parameters. The highest CRC corresponds to the LDA classifier (CRC = 66.30%). The highest CRO rate (67.39%) is provided by the combination of all parameters, while the lowest CRO (47.83%) is related to the observation of the mPCA parameters.

Age classification
The healthy subjects are divided into two groups according to their ages: control subjects (mean age 22.5 ± 2.5 y) and adult subjects (mean age 34.5 ± 7.5 y).

Gender classification
The healthy subjects are now divided into two groups according to their gender. The female subjects' mean age is 24.5 ± 5.5 y and the male subjects' mean age is 31 ± 11 y. Table.3 presents the performance of each classification method (LDA, KNN and SVM) in distinguishing between women and men among the 25 subjects studied. The results show that SVM provides the best CR (85.87%), obtained with the combination of all parameters. The best CRC is related to SVM (77.48%). The combination of all parameters is associated with the highest CRO (80.07%), while the mPCA parameters provide the lowest CRO (68.48%).

Weight classification
The subjects are now divided into two groups according to their weight: a fat group (13 subjects with weight varying between 72 and 105 kg) and a thin group (12 subjects with weight varying between 52 and 66 kg), noting that no subject suffers from obesity. Table.4 shows the results of the classification of the subjects by weight, that is, the performance of each classification method (LDA, KNN and SVM) in distinguishing between the fat group and the thin group. The best classification performance is associated with the observation combining all parameters. The SVM classification method also provides the highest CRC rate (84.32%). The best CRO rate (88.77%) is associated with the combination of all parameters. The lowest CRO (69.57%) is provided by the mPCA parameters.

Height classification
The healthy subjects are now divided into two groups according to their height: a tall group (14 subjects with height varying between 174 and 192 cm) and a small group (11 subjects with height varying between 160 and 172 cm).
Table.5 shows the results of the classification of the subjects by height, that is, the performance of each classification method (LDA, KNN and SVM) in distinguishing between the tall group and the small group.

Figure.1: Displacement of the center of pressure in (a) the horizontal plane, (b) stabilogram in the mediolateral (ML) direction and (c) stabilogram in the anteroposterior (AP) direction.

Figure.8: (a) Trajectory in the complex plane (s,h,t) and (b) projection in the plane (s,h) related to cd2.

- A classification according to age (adult / control).
- A classification by gender (male / female).
- A classification according to weight (thin / fat).
- A classification according to height (tall / small).

Table.2: Results of the age classification.
Table.2 shows the classification results of the 25 subjects by age (control subjects: mean age 22.5 ± 2.5 y; adult subjects: mean age 34.5 ± 7.5 y). The results of Table.2 show that the LDA classifier provides the best classification performance (CR = 93.48%). This performance is associated with the observation of the classical parameters. The SVM classification method provides the highest CRC rate (86.65%), whereas the lowest CRC (81.96%) corresponds to KNN. The highest CRO rate (89.13%) is related to the classical parameters, while the lowest CRO (64.85%) is related to the mPCA parameters.