Extended sparse representation-based classification method for face recognition

  • Special Issue Paper
  • Machine Vision and Applications

Abstract

In sparse representation algorithms, a test sample is assumed to be sufficiently represented using only the training samples from its own class. However, because of variations in facial expression, illumination and pose, the other classes also influence the linear representation of the test sample to varying degrees. To represent a test sample more accurately, we propose a new sparse representation-based classification method that strengthens the discriminative property of different classes and obtains a better representation coefficient vector. Our method introduces a weight matrix that assigns higher weights to small deviations and lower weights to large deviations. We also improve the constraint term on the representation coefficients, which enhances the distinctiveness of different classes and contributes positively to classification. In addition, motivated by the ProCRC algorithm, we take into account the deviation between the linear combination of all training samples and that of each class, which further guarantees a discriminative representation of the test sample. Experimental results on the ORL, FERET, Extended-YaleB and AR databases show that the proposed method has better classification performance than the compared methods.

References

  1. Liu, W., Zha, Z.J., Wang, Y., Lu, K., Tao, D.: $p$-Laplacian regularized sparse coding for human activity recognition. IEEE Trans. Ind. Electron. 63(8), 5120–5129 (2016)
  2. Xu, Y., Fei, L., Wen, J., Zhang, D.: Discriminative and robust competitive code for palmprint recognition. IEEE Trans. Syst. Man Cybern. Syst. PP(99), 1–10 (2016)
  3. Chen, G., Tao, D., Wei, L., Liu, L., Jie, Y.: Label propagation via teaching-to-learn and learning-to-teach. IEEE Trans. Neural Netw. Learn. Syst. 28(6), 1452–1465 (2017)
  4. Yong, X., Li, X., Yang, J., Lai, Z., Zhang, D.: Integrating conventional and inverse representation for face recognition. IEEE Trans. Cybern. 44(10), 1738–1746 (2014)
  5. Yong, X., Fang, X., Li, X., Yang, J., You, J., Liu, H., Teng, S.: Data uncertainty in face recognition. IEEE Trans. Cybern. 44(10), 1950–1961 (2014)
  6. Chen, X., Ziarko, W.: Experiments with rough set approach to face recognition. Int. J. Intell. Syst. 26(6), 499–517 (2011)
  7. Fang, Y., Lin, W., Fang, Z., Chen, Z., Lin, C.W., Deng, C.: Visual acuity inspired saliency detection by using sparse features. Inf. Sci. Int. J. 309(C), 1–10 (2015)
  8. Du, B., Wang, Z., Zhang, L., Zhang, L., Liu, W., Shen, J., Tao, D.: Exploring representativeness and informativeness for active learning. IEEE Trans. Cybern. PP(99), 1–13 (2015)
  9. Liu, W., Ma, T., Xie, Q., Tao, D., Cheng, J.: LMAE: a large margin auto-encoders for classification. Sig. Process. 141, 137–143 (2017)
  10. Liu, W., Tao, D., Cheng, J., Tang, Y.: Multiview Hessian discriminative sparse coding for image annotation. Comput. Vis. Image Underst. 118(1), 50–60 (2014)
  11. Fang, Y., Wang, J., Narwaria, M., Le Callet, P., Lin, W.: Saliency detection for stereoscopic images. IEEE Trans. Image Process. 23(6), 2625–2636 (2014)
  12. Du, B., Xiong, W., Wu, J., Zhang, L., Zhang, L., Tao, D.: Stacked convolutional denoising auto-encoders for feature representation. IEEE Trans. Cybern. 47(4), 1017–1027 (2016)
  13. Gong, C., Liu, T., Tao, D., Fu, K., Tu, E., Yang, J.: Deformed graph laplacian for semisupervised learning. IEEE Trans. Neural Netw. Learn. Syst. 26(10), 2261–2274 (2015)
  14. Liu, T., Tao, D.: On the performance of manhattan nonnegative matrix factorization. IEEE Trans. Neural Netw. Learn. Syst. 27(9), 1851–1863 (2016)
  15. Du, B., Wang, N., Wang, N., Zhang, L., Zhang, L., Zhang, L.: Hyperspectral signal unmixing based on constrained non-negative matrix factorization approach. Neurocomputing 204(C), 153–161 (2016)
  16. Liu, W., Yang, X., Tao, D., Cheng, J., Tang, Y.: Multiview dimension reduction via Hessian multiset canonical correlations. Inf. Fusion 41, 119–128 (2017)
  17. Liu, T., Gong, M., Tao, D.: Large-cone nonnegative matrix factorization. IEEE Trans. Neural Netw. Learn. Syst. 28(9), 2129–2142 (2017)
  18. Yu, J., Hong, C., Rui, Y., Tao, D.: Multi-task autoencoder model for recovering human poses. IEEE Trans. Ind. Electron. PP(99), 1 (2017)
  19. Gong, C., Tao, D., Maybank, S.J., Liu, W., Kang, G., Yang, J.: Multi-modal curriculum learning for semi-supervised image classification. IEEE Trans. Image Process. 25(7), 3249–3260 (2016)
  20. Bo, D., Zhang, M., Zhang, L., Ruimin, H., Tao, D.: PLTD: patch-based low-rank tensor decomposition for hyperspectral images. IEEE Trans. Multimed. 19(1), 67–79 (2017)
  21. Liu, W., Zhang, L., Tao, D., Cheng, J.: Support vector machine active learning by Hessian regularization. J. Vis. Commun. Image Represent. 49, 47–56 (2017)
  22. Yang, X., Liu, W., Tao, D., Cheng, J.: Canonical correlation analysis networks for two-view image recognition. Inf. Sci. Int. J. 385(C), 338–352 (2017)
  23. Fang, Y., Wang, Z., Lin, W.: Video saliency incorporating spatiotemporal cues and uncertainty weighting. In: IEEE International Conference on Multimedia and Expo, pp. 1–6 (2013)
  24. Bo, D., Zhao, R., Zhang, L., Zhang, L.: A spectral-spatial based local summation anomaly detection method for hyperspectral images. Signal Process. 124(C), 115–131 (2016)
  25. Tao, D., Li, X., Wu, X., Maybank, S.J.: Geometric mean for subspace selection. IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 260–274 (2009)
  26. Li, L., Liu, S., Peng, Y., Sun, Z.: Overview of principal component analysis algorithm. Optik Int. J. Light Electron Opt. 127(9), 3935–3944 (2016)
  27. Gong, C., Tao, D., Fu, K., Yang, J.: Fick’s law assisted propagation for semisupervised learning. IEEE Trans. Neural Netw. Learn. Syst. 26(9), 2148–2162 (2015)
  28. Chen, G., Liu, T., Tang, Y., Jian, Y., Jie, Y., Tao, D.: A regularization approach for instance-based superset label learning. IEEE Trans. Cybern. PP(99), 1–12 (2017)
  29. Yu, J., Yang, X., Fei, G., Tao, D.: Deep multimodal distance metric learning using click constraints for image ranking. IEEE Trans. Cybern. PP(99), 1–11 (2016)
  30. Fang, Y., Fang, Z., Yuan, F., Yang, Y., Yang, S., Xiong, N.N.: Optimized multioperator image retargeting based on perceptual similarity measure. IEEE Trans. Syst. Man Cybern. Syst. 47(11), 2956–2966 (2017)
  31. Gong, C., Tao, D., Chang, X., Yang, J.: Ensemble teaching for hybrid label propagation. IEEE Trans. Cybern. PP(99), 1–15 (2017)
  32. Yong, X., Zhong, A., Yang, J., Zhang, D.: LPP solution schemes for use with face recognition. Pattern Recognit. 43(12), 4165–4176 (2010)
  33. Yu, J., Rui, Y., Tang, Y.Y., Tao, D.: High-order distance-based multiview stochastic learning in image classification. IEEE Trans. Cybern. 44(12), 2431 (2014)
  34. Roweis, S.T., Saul, L.K.: Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500), 2323–2326 (2000)
  35. Belkin, M., Niyogi, P.: Laplacian Eigenmaps for Dimensionality Reduction and Data Representation. MIT Press, Cambridge (2003)
  36. Ahonen, T., Hadid, A., Pietikainen, M.: Face description with local binary patterns: application to face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 28(12), 2037–2041 (2006)
  37. Wright, J., Ganesh, A., Zhou, Z., Wagner, A., Ma, Y.: Demo: robust face recognition via sparse representation. In: IEEE International Conference on Automatic Face and Gesture Recognition, pp. 1–2 (2009)
  38. Naseem, I., Togneri, R., Bennamoun, M.: Linear regression for face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 32(11), 2106–2112 (2010)
  39. Yong, X., Zhang, D., Yang, J., Yang, J.Y.: A two-phase test sample sparse representation method for use with face recognition. IEEE Trans. Circuits Syst. Video Technol. 21(9), 1255–1262 (2011)
  40. Zhang, L., Yang, M., Feng, X.: Sparse representation or collaborative representation: which helps face recognition? In: IEEE International Conference on Computer Vision, pp. 471–478 (2012)
  41. Deng, W., Jiani, H., Guo, J.: Extended SRC: undersampled face recognition via intraclass variant dictionary. IEEE Trans. Pattern Anal. Mach. Intell. 34(9), 1864–1870 (2012)
  42. Tang, X., Feng, G., Cai, J.: Weighted group sparse representation for undersampled face recognition. Neurocomputing 145(18), 402–415 (2014)
  43. Timofte, R., Van Gool, L.: Adaptive and weighted collaborative representations for image classification. Pattern Recognit. Lett. 43(1), 127–135 (2014)
  44. Wu, J., Timofte, R., Van Gool, L.: Learned collaborative representations for image classification. In: IEEE Winter Conference on Applications of Computer Vision, pp. 456–463 (2015)
  45. Yong, X., Zhong, Z., Jian, Y., You, J., Zhang, D.: A new discriminative sparse representation method for robust face recognition via regularization. IEEE Trans. Neural Netw. Learn. Syst. PP(99), 1–10 (2016)
  46. Cai, S., Zhang, L., Zuo, W., Feng, X.: A probabilistic collaborative representation based approach for pattern classification. In: Computer Vision and Pattern Recognition, pp. 2950–2959 (2016)
  47. Amaldi, E., Kann, V.: On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems. Theor. Comput. Sci. 209(1–2), 237–260 (1998)
  48. Liu, T., Tao, D.: Classification with noisy labels by importance reweighting. IEEE Trans. Pattern Anal. Mach. Intell. 38(3), 447 (2016)
  49. Candès, E.J., Romberg, J.K., Tao, T.: Stable signal recovery from incomplete and inaccurate measurements. Commun. Pure Appl. Math. 59(8), 1207–1223 (2010)
  50. Fang, Y., Lin, W., Chen, Z., Tsai, C.M., Lin, C.W.: A video saliency detection model in compressed domain. IEEE Trans. Circuits Syst. Video Technol. 24(1), 27–38 (2014)
  51. Huang, W., Wang, X., Jin, Z., Li, J.: Penalized collaborative representation based classification for face recognition. Appl. Intell. 4(4), 12–19 (2015)
  52. Xu, Y., Zhu, Q., Chen, Y., Pan, J.S.: An improvement to the nearest neighbor classifier and face recognition experiments. Int. J. Innov. Comput. Inf. Control 9(2), 543–554 (2013)
  53. Yong, X., Zhu, Q., Fan, Z., Qiu, M., Chen, Y., Liu, H.: Coarse to fine K nearest neighbor classifier. Pattern Recognit. Lett. 34(9), 980–986 (2013)
  54. Yong, X., Fang, X., You, J., Chen, Y., Liu, H.: Noise-free representation based classification and face recognition experiments. Neurocomputing 147(1), 307–314 (2015)
  55. Yong, X., Fan, Z., Zhu, Q.: Feature space-based human face image representation and recognition. Opt. Eng. 51(1), 7205 (2012)
  56. Yong, X., Li, X., Yang, J., Zhang, D.: Integrate the original face image and its mirror image for face recognition. Neurocomputing 131(7), 191–199 (2014)
  57. ORL: Face database. http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html. Accessed 1 Mar 2017
  58. FERET: Face database. http://www.itl.nist.gov/iad/humanid/feret/feret_master.html. Accessed 1 Mar 2017
  59. YaleB: Face database. http://vision.ucsd.edu/content/yale-face-database. Accessed 1 Mar 2017
  60. AR: Face database. http://web.mit.edu/emeyers/www/face_databases.html#ar. Accessed 1 Mar 2017

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Nos. 61672333, 61402274, 61703096, 41471280), China Postdoctoral Science Foundation (No. 2017M611655), the Program of Key Science and Technology Innovation Team in Shaanxi Province (No. 2014KTC-18), the Key Science and Technology Program of Shaanxi Province (No. 2016GY-081), the National Natural Science Foundation of Jiangsu Province (No. BK20170691), the Fundamental Research Funds for the Central Universities (Nos. GK201803059, GK201803088), and the Interdisciplinary Incubation Project of Learning Science of Shaanxi Normal University.

Author information

Corresponding author

Correspondence to Shigang Liu.

Appendices

Appendix 1: The derivative over \(\beta \) of \(\frac{1}{2}{\left( {y - {\mathbf{X}}\beta } \right) ^\mathrm{T}}{\mathbf{W}}\left( {y - {\mathbf{X}}\beta } \right) + \gamma \sum _{i = 1}^c \sum _{j = 1}^c {\beta _i^\mathrm{T}{\mathbf{X}}_i^\mathrm{T}{{\mathbf{X}}_j}{\beta _j}} + \lambda \sum _{i = 1}^c {\left\| {{\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}} \right\| _2^2} \)

First, \(\frac{d}{{d\beta }}\left( {\frac{1}{2}{{\left( {y - {\mathbf{X}}\beta } \right) }^\mathrm{T}}{\mathbf{W}}\left( {y - {\mathbf{X}}\beta } \right) } \right) = - {{\mathbf{X}}^\mathrm{T}}{\mathbf{W}}\left( {y - {\mathbf{X}}\beta } \right) \).

Next, let \(f\left( \beta \right) = \gamma \sum _{i = 1}^c {\sum _{j = 1}^c {\beta _i^\mathrm{T}{\mathbf{X}}_i^\mathrm{T}{{\mathbf{X}}_j}{\beta _j}} } \). We first compute the partial derivatives \(\frac{{\partial f}}{{\partial {\beta _k}}}\), \(k = 1, \ldots ,c\), and then stack them to obtain \(\frac{{df}}{{d\beta }}\). By the polarization identity,

$$\begin{aligned} \beta _i^\mathrm{T}{\mathbf{X}}_i^\mathrm{T}{{\mathbf{X}}_j}{\beta _j}= & {} {\left( {{{\mathbf{X}}_i}{\beta _i}} \right) ^\mathrm{T}}{{\mathbf{X}}_j}{\beta _j} \\= & {} \frac{1}{2}\left( {\left\| {{{\mathbf{X}}_i}{\beta _i} + {{\mathbf{X}}_j}{\beta _j}} \right\| _2^2 - \left\| {{{\mathbf{X}}_i}{\beta _i}} \right\| _2^2 - \left\| {{{\mathbf{X}}_j}{\beta _j}} \right\| _2^2} \right) . \end{aligned}$$

So \(f\left( \beta \right) \) can be rewritten as

$$\begin{aligned}&f\left( \beta \right) = \gamma \sum _{i = 1}^c {\sum _{j = 1}^c {\beta _i^\mathrm{T}{\mathbf{X}}_i^\mathrm{T}{{\mathbf{X}}_j}{\beta _j}} } \\&\quad = \frac{\gamma }{2}\left[ {\sum _{\begin{array}{c} \scriptstyle i = 1, \ldots ,c \\ \scriptstyle i \ne k \end{array}} {\left( {\left\| {{{\mathbf{X}}_i}{\beta _i} + {{\mathbf{X}}_k}{\beta _k}} \right\| _2^2 - \left\| {{{\mathbf{X}}_i}{\beta _i}} \right\| _2^2 - \left\| {{{\mathbf{X}}_k}{\beta _k}} \right\| _2^2} \right) } } \right. \\&\qquad + \sum _{\begin{array}{c} \scriptstyle j = 1, \ldots ,c \\ \scriptstyle j \ne k \end{array}} {\left( {\left\| {{{\mathbf{X}}_k}{\beta _k} + {{\mathbf{X}}_j}{\beta _j}} \right\| _2^2 - \left\| {{{\mathbf{X}}_k}{\beta _k}} \right\| _2^2 - \left\| {{{\mathbf{X}}_j}{\beta _j}} \right\| _2^2} \right) } \\&\qquad \left. { + \sum _{\begin{array}{c} \scriptstyle i = 1, \ldots ,c \\ \scriptstyle i \ne k \end{array}} {\sum _{\begin{array}{c} \scriptstyle j = 1, \ldots ,c \\ \scriptstyle j \ne k \end{array}} {\left( {\left\| {{{\mathbf{X}}_i}{\beta _i} + {{\mathbf{X}}_j}{\beta _j}} \right\| _2^2 - \left\| {{{\mathbf{X}}_i}{\beta _i}} \right\| _2^2 - \left\| {{{\mathbf{X}}_j}{\beta _j}} \right\| _2^2} \right) } } } \right] \\&\quad = \gamma \sum _{\begin{array}{c} \scriptstyle i = 1, \ldots ,c \\ \scriptstyle i \ne k \end{array}} {\left( {\left\| {{{\mathbf{X}}_i}{\beta _i} + {{\mathbf{X}}_k}{\beta _k}} \right\| _2^2 - \left\| {{{\mathbf{X}}_i}{\beta _i}} \right\| _2^2 - \left\| {{{\mathbf{X}}_k}{\beta _k}} \right\| _2^2} \right) } \\&\qquad + \frac{\gamma }{2}\sum _{\begin{array}{c} \scriptstyle i = 1, \ldots ,c \\ \scriptstyle i \ne k \end{array}} {\sum _{\begin{array}{c} \scriptstyle j = 1, \ldots ,c \\ \scriptstyle j \ne k \end{array}} {\left( {\left\| {{{\mathbf{X}}_i}{\beta _i} + {{\mathbf{X}}_j}{\beta _j}} \right\| _2^2 - \left\| {{{\mathbf{X}}_i}{\beta _i}} \right\| _2^2 - \left\| {{{\mathbf{X}}_j}{\beta _j}} \right\| _2^2} \right) } } . \end{aligned}$$

In the last expression only the first sum depends on \({\beta _k}\), so \(\frac{{\partial f}}{{\partial {\beta _k}}}\) is computed as follows:

$$\begin{aligned} \frac{{\partial f}}{{\partial {\beta _k}}}&= \frac{\partial }{{\partial {\beta _k}}}\left( {\gamma \sum _{i = 1}^c {\sum _{j = 1}^c {\beta _i^\mathrm{T}{\mathbf{X}}_i^\mathrm{T}{{\mathbf{X}}_j}{\beta _j}} } } \right) \\&=\frac{\partial }{{\partial {\beta _k}}}\left( {\gamma \sum _{\begin{array}{c} \scriptstyle i = 1, \ldots ,c \\ \scriptstyle i \ne k \end{array}} {\left( {\left\| {{{\mathbf{X}}_i}{\beta _i} + {{\mathbf{X}}_k}{\beta _k}} \right\| _2^2 - \left\| {{{\mathbf{X}}_i}{\beta _i}} \right\| _2^2 - \left\| {{{\mathbf{X}}_k}{\beta _k}} \right\| _2^2} \right) } } \right) \\&= \gamma \sum _{\begin{array}{c} \scriptstyle i = 1, \ldots ,c \\ \scriptstyle i \ne k \end{array}} {\left( {2{\mathbf{X}}_k^\mathrm{T}\left( {{{\mathbf{X}}_i}{\beta _i} + {{\mathbf{X}}_k}{\beta _k}} \right) - 2{\mathbf{X}}_k^\mathrm{T}{{\mathbf{X}}_k}{\beta _k}} \right) } \\&=\gamma \sum _{\begin{array}{c} \scriptstyle i = 1, \ldots ,c \\ \scriptstyle i \ne k \end{array}} {\left( {2{\mathbf{X}}_k^\mathrm{T}{{\mathbf{X}}_i}{\beta _i}} \right) } = 2\gamma \left[ {\left( {\sum _{i = 1, \ldots ,c} {{\mathbf{X}}_k^\mathrm{T}{{\mathbf{X}}_i}{\beta _i}} } \right) - {\mathbf{X}}_k^\mathrm{T}{{\mathbf{X}}_k}{\beta _k}} \right] \\&= 2\gamma {\mathbf{X}}_k^\mathrm{T}{\mathbf{X}}\beta - 2\gamma {\mathbf{X}}_k^\mathrm{T}{{\mathbf{X}}_k}{\beta _k} . \end{aligned}$$

Thus, the derivative of \(f\left( \beta \right) \) with respect to \(\beta \) is \(\frac{{df}}{{d\beta }} = \left[ \begin{array}{c} \frac{{\partial f}}{{\partial {\beta _1}}} \\ \vdots \\ \frac{{\partial f}}{{\partial {\beta _c}}} \end{array} \right] = \left[ \begin{array}{c} 2\gamma {\mathbf{X}}_1^\mathrm{T}{\mathbf{X}}\beta - 2\gamma {\mathbf{X}}_1^\mathrm{T}{{\mathbf{X}}_1}{\beta _1} \\ \vdots \\ 2\gamma {\mathbf{X}}_c^\mathrm{T}{\mathbf{X}}\beta - 2\gamma {\mathbf{X}}_c^\mathrm{T}{{\mathbf{X}}_c}{\beta _c} \end{array} \right] =2\gamma {{\mathbf{X}}^\mathrm{T}}{\mathbf{X}}\beta - 2\gamma {\mathbf{M}}\beta \),

where \({\mathbf{M}} = \left( {\begin{matrix} {{\mathbf{X}}_1^\mathrm{T}{{\mathbf{X}}_1}} &{} \ldots &{} 0 \\ \vdots &{} \ddots &{} \vdots \\ 0 &{} \cdots &{} {{\mathbf{X}}_c^\mathrm{T}{{\mathbf{X}}_c}} \\ \end{matrix} } \right) .\)
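As a sanity check on the closed form \(2\gamma {{\mathbf{X}}^\mathrm{T}}{\mathbf{X}}\beta - 2\gamma {\mathbf{M}}\beta \), the following NumPy sketch compares it with a central finite-difference gradient. The class sizes, the random data and the helper names `block_diag_gram` and `cross_term` are illustrative assumptions, not taken from the paper. The sketch sums over pairs with \(i \ne j\), which is the part of the double sum that the decomposition above differentiates; treating the double sum this way is an assumption of the sketch.

```python
import numpy as np

def block_diag_gram(X, sizes):
    """M = blockdiag(X_1^T X_1, ..., X_c^T X_c) for the column blocks of X."""
    M = np.zeros((X.shape[1], X.shape[1]))
    start = 0
    for n in sizes:
        sl = slice(start, start + n)
        M[sl, sl] = X[:, sl].T @ X[:, sl]
        start += n
    return M

def cross_term(X, sizes, beta, gamma):
    """gamma * sum over i != j of beta_i^T X_i^T X_j beta_j."""
    edges = np.cumsum([0] + list(sizes))
    parts = [X[:, a:b] @ beta[a:b] for a, b in zip(edges[:-1], edges[1:])]
    total = sum(parts)  # equals X @ beta
    return gamma * (total @ total - sum(p @ p for p in parts))

rng = np.random.default_rng(0)
sizes = [3, 2, 4]                      # hypothetical samples per class
X = rng.standard_normal((6, sum(sizes)))
beta = rng.standard_normal(sum(sizes))
gamma = 0.5                            # hypothetical regularization weight

# Closed form from the appendix: 2*gamma*(X^T X - M) beta.
analytic = 2 * gamma * (X.T @ X - block_diag_gram(X, sizes)) @ beta

# Central finite differences, one coordinate at a time.
eps = 1e-6
numeric = np.zeros_like(beta)
for k in range(beta.size):
    step = np.zeros_like(beta)
    step[k] = eps
    numeric[k] = (cross_term(X, sizes, beta + step, gamma)
                  - cross_term(X, sizes, beta - step, gamma)) / (2 * eps)

max_abs_diff = np.max(np.abs(analytic - numeric))
```

If `max_abs_diff` is at the level of floating-point noise, the closed form and the numerical gradient agree for this random instance.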

As for \(\frac{\partial }{{\partial \beta }}\left( {\lambda \sum _{i = 1}^c {\left\| {{\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}} \right\| _2^2} } \right) \), we first rewrite \(\sum _{i = 1}^c {\left\| {{\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}} \right\| _2^2} \) in a form that is easier to differentiate. Since \({\mathbf{X}}\beta = \left[ {{{\mathbf{X}}_1}, \ldots ,{{\mathbf{X}}_c}} \right] \left[ \begin{matrix} {\beta _1} \\ {\beta _2} \\ \vdots \\ {\beta _c} \\ \end{matrix} \right] = {{\mathbf{X}}_1}{\beta _1} + \cdots + {{\mathbf{X}}_c}{\beta _c}\), we have \({\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}={{\mathbf{X}}_1}{\beta _1} + \cdots + {{\mathbf{X}}_{i - 1}}{\beta _{i - 1}} + {{\mathbf{X}}_{i + 1}}{\beta _{i + 1}} + \cdots + {{\mathbf{X}}_c}{\beta _c}\). Letting \({{\mathbf{S}}_i}=\left[ {0, \ldots ,{{\mathbf{X}}_i}, \ldots ,0} \right] \) and \({{\mathbf{Z}}_i}={\mathbf{X}} - {{\mathbf{S}}_i}=\left[ {{{\mathbf{X}}_1}, \ldots ,{{\mathbf{X}}_{i - 1}},0,{{\mathbf{X}}_{i + 1}}, \ldots ,{{\mathbf{X}}_c}} \right] \), this can be written compactly as \({\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}={{\mathbf{Z}}_i}\beta \). Therefore, the derivative with respect to \(\beta \) of \(\lambda \sum _{i = 1}^c {\left\| {{\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}} \right\| _2^2} \) is

$$\begin{aligned}&\frac{\partial }{{\partial \beta }}\left( {\lambda \sum _{i = 1}^c {\left\| {{\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}} \right\| _2^2} } \right) \\&\quad = \frac{\partial }{{\partial \beta }}\left( {\lambda \sum _{i = 1}^c {\left\| {{{\mathbf{Z}}_i}\beta } \right\| _2^2} } \right) =2\lambda \left[ {\sum _{i = 1}^c {{{\left( {{{\mathbf{Z}}_i}} \right) }^\mathrm{T}}{{\mathbf{Z}}_i}} } \right] \beta . \end{aligned}$$
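The sum \(\sum _{i = 1}^c {\mathbf{Z}}_i^\mathrm{T}{{\mathbf{Z}}_i}\) can also be expressed through \({{\mathbf{X}}^\mathrm{T}}{\mathbf{X}}\) and the block-diagonal matrix \({\mathbf{M}}\) defined earlier. The short derivation below is an added worked step, not part of the original appendix. It uses only \(\sum _{i = 1}^c {{\mathbf{S}}_i} = {\mathbf{X}}\) and \(\sum _{i = 1}^c {\mathbf{S}}_i^\mathrm{T}{{\mathbf{S}}_i} = {\mathbf{M}}\), both of which follow from the definition of \({{\mathbf{S}}_i}\).

$$\begin{aligned} \sum _{i = 1}^c {\mathbf{Z}}_i^\mathrm{T}{{\mathbf{Z}}_i}&= \sum _{i = 1}^c {\left( {{\mathbf{X}} - {{\mathbf{S}}_i}} \right) ^\mathrm{T}}\left( {{\mathbf{X}} - {{\mathbf{S}}_i}} \right) \\&= c\,{{\mathbf{X}}^\mathrm{T}}{\mathbf{X}} - {{\mathbf{X}}^\mathrm{T}}\left( {\sum _{i = 1}^c {{\mathbf{S}}_i}} \right) - {\left( {\sum _{i = 1}^c {{\mathbf{S}}_i}} \right) ^\mathrm{T}}{\mathbf{X}} + \sum _{i = 1}^c {\mathbf{S}}_i^\mathrm{T}{{\mathbf{S}}_i} \\&= \left( {c - 2} \right) {{\mathbf{X}}^\mathrm{T}}{\mathbf{X}} + {\mathbf{M}}. \end{aligned}$$

In this form the matrix can be assembled from \({{\mathbf{X}}^\mathrm{T}}{\mathbf{X}}\) alone, without building each \({{\mathbf{Z}}_i}\) explicitly.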

Finally, the derivative with respect to \(\beta \) of \(\frac{1}{2}{\left( {y - {\mathbf{X}}\beta } \right) ^\mathrm{T}}{\mathbf{W}}\left( {y - {\mathbf{X}}\beta } \right) + \gamma \sum _{i = 1}^c {\sum _{j = 1}^c {\beta _i^\mathrm{T}{\mathbf{X}}_i^\mathrm{T}{{\mathbf{X}}_j}{\beta _j}} + \lambda \sum _{i = 1}^c {\left\| {{\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}} \right\| _2^2} } \) is

$$\begin{aligned}&\frac{\partial }{{\partial \beta }}\left( \frac{1}{2}{{\left( {y - {\mathbf{X}}\beta } \right) }^\mathrm{T}}{\mathbf{W}}\left( {y - {\mathbf{X}}\beta } \right) \right. \\&\qquad \left. + \,\gamma \sum _{i = 1}^c {\sum _{j = 1}^c {\beta _i^\mathrm{T}{\mathbf{X}}_i^\mathrm{T}{{\mathbf{X}}_j}{\beta _j}} + \lambda \sum _{i = 1}^c {\left\| {{\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}} \right\| _2^2} } \right) \\&\quad = - {{\mathbf{X}}^\mathrm{T}}{\mathbf{W}}\left( {y - {\mathbf{X}}\beta } \right) +2\gamma {{\mathbf{X}}^\mathrm{T}}{\mathbf{X}}\beta - 2\gamma {\mathbf{M}}\beta \\&\qquad + \,2\lambda \left[ {\sum _{i = 1}^c {{{\left( {{{\mathbf{Z}}_i}} \right) }^\mathrm{T}}{{\mathbf{Z}}_i}} } \right] \beta . \end{aligned}$$
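Setting this gradient to zero gives a linear system in \(\beta \), with right-hand side \({{\mathbf{X}}^\mathrm{T}}{\mathbf{W}}y\). The NumPy sketch below solves that system for random data and confirms that the gradient vanishes at the solution. It is a minimal illustration under stated assumptions: \({\mathbf{W}}\) is taken as a fixed diagonal matrix with made-up positive weights (how the paper computes \({\mathbf{W}}\) is not shown in this section), the system matrix is assumed nonsingular, and all function names and parameter values are hypothetical.

```python
import numpy as np

def class_edges(sizes):
    """Column boundaries of the class blocks X_1, ..., X_c inside X."""
    return np.cumsum([0] + list(sizes))

def block_diag_gram(X, sizes):
    """M = blockdiag(X_1^T X_1, ..., X_c^T X_c)."""
    e = class_edges(sizes)
    M = np.zeros((X.shape[1], X.shape[1]))
    for a, b in zip(e[:-1], e[1:]):
        M[a:b, a:b] = X[:, a:b].T @ X[:, a:b]
    return M

def z_gram_sum(X, sizes):
    """sum_i Z_i^T Z_i, where Z_i is X with the i-th class block set to zero."""
    e = class_edges(sizes)
    G = np.zeros((X.shape[1], X.shape[1]))
    for a, b in zip(e[:-1], e[1:]):
        Z = X.copy()
        Z[:, a:b] = 0.0
        G += Z.T @ Z
    return G

def gradient(X, sizes, W, y, beta, gamma, lam):
    """Gradient expression from Appendix 1, evaluated at beta."""
    M = block_diag_gram(X, sizes)
    return (-X.T @ W @ (y - X @ beta)
            + 2 * gamma * (X.T @ X - M) @ beta
            + 2 * lam * z_gram_sum(X, sizes) @ beta)

def solve_beta(X, sizes, W, y, gamma, lam):
    """Solve gradient(beta) = 0, i.e. A beta = X^T W y (A assumed nonsingular)."""
    M = block_diag_gram(X, sizes)
    A = X.T @ W @ X + 2 * gamma * (X.T @ X - M) + 2 * lam * z_gram_sum(X, sizes)
    return np.linalg.solve(A, X.T @ W @ y)

rng = np.random.default_rng(1)
sizes = [3, 2, 4]                               # hypothetical samples per class
X = rng.standard_normal((8, sum(sizes)))        # columns are training samples
y = rng.standard_normal(8)                      # test sample
W = np.diag(rng.uniform(0.5, 1.5, size=8))      # made-up diagonal weights
gamma, lam = 0.1, 0.5                           # hypothetical parameters

beta_hat = solve_beta(X, sizes, W, y, gamma, lam)
grad_norm = np.linalg.norm(gradient(X, sizes, W, y, beta_hat, gamma, lam))
```

A direct solve is used here for brevity; for large training sets a factorization that exploits symmetry may be preferable.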

Appendix 2: Proof that our objective function is convex

According to [49], a function is convex provided it satisfies certain conditions. Specifically, suppose f is twice differentiable, that is, its second derivative or Hessian \({\nabla ^2}f\) exists and is continuous at each point in \({\mathbf{dom}}f\), where \({\mathbf{dom}}f\) is open. Then f is convex if and only if \({\mathbf{dom}}f\) is convex and the Hessian of f is positive semidefinite, i.e., \({\nabla ^2}f\left( x \right) \succeq 0\) for all \(x \in {\mathbf{dom}}f\). The following example helps to explain and prove the convexity of the objective function.

Example 1

Consider the quadratic function \(f:{{\mathbf{R}}^n} \rightarrow {\mathbf{R}}\), with \({\mathbf{dom}}f = {{\mathbf{R}}^n}\), given by \(f\left( x \right) = \left( {{1 \big / }2} \right) {x^\mathrm{T}}{\mathbf{P}}x + {q^\mathrm{T}}x + r\), where \({\mathbf{P}}\) is a symmetric matrix of size \(n \times n\), \(q \in {{\mathbf{R}}^n}\), and \(r \in {\mathbf{R}}\). Since \({\nabla ^2}f\left( x \right) ={\mathbf{P}}\) for all x, f is convex if and only if \({\mathbf{P}} \succeq 0\).
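A minimal numerical illustration of Example 1, using two small hand-picked matrices that are not from the paper: a symmetric matrix is positive semidefinite exactly when its smallest eigenvalue is non-negative, so the convexity of the quadratic can be checked with an eigenvalue routine. The helper name `is_psd` and the tolerance are assumptions of this sketch.

```python
import numpy as np

def is_psd(P, tol=1e-10):
    """Return True if P is symmetric and its smallest eigenvalue is >= -tol."""
    P = np.asarray(P, dtype=float)
    if not np.allclose(P, P.T):
        return False
    return bool(np.linalg.eigvalsh(P).min() >= -tol)

# Eigenvalues 1 and 3: the quadratic (1/2) x^T P x + q^T x + r is convex.
P_convex = np.array([[2.0, 1.0],
                     [1.0, 2.0]])

# Eigenvalues -1 and 3: the quadratic is not convex.
P_nonconvex = np.array([[1.0, 2.0],
                        [2.0, 1.0]])

convex_flag = is_psd(P_convex)
nonconvex_flag = is_psd(P_nonconvex)
```

The vector \(q\) and the scalar \(r\) do not appear in the check because they do not affect the Hessian.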

Let \(g\left( \beta \right) =\frac{1}{2}{\left( {y - {\mathbf{X}}\beta } \right) ^\mathrm{T}}{\mathbf{W}}\left( {y - {\mathbf{X}}\beta } \right) + \gamma \sum _{i = 1}^c \sum _{j = 1}^c \beta _i^\mathrm{T}{\mathbf{X}}_i^\mathrm{T}{{\mathbf{X}}_j}{\beta _j} + \lambda \sum _{i = 1}^c {\left\| {{\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}} \right\| _2^2} \).

Then, according to the theorem and example above, \(g\left( \beta \right) \) is convex if \({\nabla ^2}g\left( \beta \right) \succeq 0\), that is, if \({\nabla ^2}g\left( \beta \right) \) is a positive semidefinite matrix. A real symmetric matrix is positive semidefinite if all of its principal minors are greater than or equal to zero. From Eq. (7), we get \(\nabla g\left( \beta \right) \), i.e., \(\nabla g\left( \beta \right) = - {{\mathbf{X}}^\mathrm{T}}{\mathbf{W}}\left( {y - {\mathbf{X}}\beta } \right) +2\gamma {{\mathbf{X}}^\mathrm{T}}{\mathbf{X}}\beta - 2\gamma {\mathbf{M}}\beta +2\lambda \left[ {\sum _{i = 1}^c {{{\left( {{{\mathbf{Z}}_i}} \right) }^\mathrm{T}}{{\mathbf{Z}}_i}} } \right] \beta \), and differentiating once more gives \({\nabla ^2}g\left( \beta \right) = {{\mathbf{X}}^\mathrm{T}}{\mathbf{WX}} + 2\gamma {{\mathbf{X}}^\mathrm{T}}{\mathbf{X}} - 2\gamma {\mathbf{M}}+2\lambda \sum _{i = 1}^c {{{\left( {{{\mathbf{Z}}_i}} \right) }^\mathrm{T}}{{\mathbf{Z}}_i}} \). Because \({\nabla ^2}g\left( \beta \right) \) satisfies the above conditions for positive semidefiniteness, we conclude that our objective function is convex.
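The positive semidefiniteness of the Hessian can be examined numerically for a given data set. The sketch below assembles \({{\mathbf{X}}^\mathrm{T}}{\mathbf{WX}} + 2\gamma \left( {{{\mathbf{X}}^\mathrm{T}}{\mathbf{X}} - {\mathbf{M}}} \right) + 2\lambda \sum _{i = 1}^c {\mathbf{Z}}_i^\mathrm{T}{{\mathbf{Z}}_i}\), the matrix obtained by differentiating the gradient of Appendix 1 once more, and computes its smallest eigenvalue. The data, the class labels, the diagonal weights in \({\mathbf{W}}\) and the parameter values are all made up for illustration. The values are chosen with \(\lambda \ge \gamma \), a choice made for this sketch, and the check says nothing about other parameter settings.

```python
import numpy as np

rng = np.random.default_rng(2)
labels = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])  # hypothetical class of each training column
X = rng.standard_normal((8, labels.size))       # columns are training samples
W = np.diag(rng.uniform(0.5, 1.5, size=8))      # made-up positive diagonal weights
gamma, lam = 0.1, 0.5                           # hypothetical values with lam >= gamma

gram = X.T @ X
same_class = labels[:, None] == labels[None, :]
M = np.where(same_class, gram, 0.0)             # within-class blocks X_i^T X_i

# sum_i Z_i^T Z_i, where Z_i is X with the columns of class i zeroed out.
z_sum = np.zeros_like(gram)
for class_id in np.unique(labels):
    Z = X * (labels != class_id)
    z_sum += Z.T @ Z

hessian = X.T @ W @ X + 2 * gamma * (gram - M) + 2 * lam * z_sum
min_eig = np.linalg.eigvalsh(hessian).min()
```

A non-negative `min_eig` (up to rounding) indicates that the Hessian is positive semidefinite for this particular instance.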

About this article

Cite this article

Peng, Y., Li, L., Liu, S. et al. Extended sparse representation-based classification method for face recognition. Machine Vision and Applications 29, 991–1007 (2018). https://doi.org/10.1007/s00138-018-0941-z

  • DOI: https://doi.org/10.1007/s00138-018-0941-z
