
FedSH: Towards Privacy-Preserving Text-Based Person Re-Identification

Published: 06 November 2023

Abstract

Text-based person re-identification (ReID) enables canonical applications such as searching for and tracking targets in large-scale surveillance images using textual descriptions. Yet existing text-based person ReID systems rely on centralized model training that gathers images captured by different institutes' cameras into one place, which poses severe privacy threats to sensitive institutional information. This work therefore explores privacy-preserving text-based person ReID and proposes the FedSH framework, which tailors the federated learning paradigm to distributed searching-knowledge extraction. Specifically, FedSH addresses the limitations of poor local model generalization and obscured entity boundaries, caused by inner-institute data homogeneity and inter-institute data heterogeneity, by building multi-granularity feature representations and a semantically self-aligned network. Meanwhile, it reduces the communication burden introduced by multi-modal embeddings by updating only common representation subspaces during federated learning. Experimental results on two public benchmarks demonstrate that our method achieves up to 16.47% and 16.02% improvement in the Rank-1 metric compared with six state-of-the-art (SoTA) baselines and six ablation studies. We believe that our work will inspire the community to investigate the potential of applying federated learning to real-world image retrieval and ReID scenarios.
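The communication-reduction idea above, updating only a common representation subspace rather than the full multi-modal model, can be illustrated with a minimal FedAvg-style sketch. The parameter names, the three-client setup, and the plain NumPy averaging below are illustrative assumptions, not the authors' actual FedSH implementation: only parameters under a hypothetical `shared_embed.` prefix are exchanged and averaged, while modality-specific backbones remain on each institute's side.

```python
import numpy as np

# Hypothetical parameter groups per client; names and shapes are illustrative,
# not the actual FedSH architecture.
def init_client_params(rng):
    return {
        "image_backbone.w": rng.normal(size=(256, 128)),  # stays local
        "text_backbone.w":  rng.normal(size=(256, 128)),  # stays local
        "shared_embed.w":   rng.normal(size=(128, 64)),    # communicated
    }

SHARED_PREFIX = "shared_embed."  # the assumed "common representation subspace"

def federated_round(clients_params):
    """Average only shared-subspace parameters across clients (FedAvg-style),
    leaving modality-specific weights on each client to cut communication cost."""
    shared_keys = [k for k in clients_params[0] if k.startswith(SHARED_PREFIX)]
    for k in shared_keys:
        avg = np.mean([p[k] for p in clients_params], axis=0)
        for p in clients_params:
            p[k] = avg.copy()
    return clients_params

rng = np.random.default_rng(0)
clients = [init_client_params(rng) for _ in range(3)]
clients = federated_round(clients)

# After the round, all clients agree on the shared embedding, but their
# image/text backbones remain institute-specific.
assert all(np.allclose(c["shared_embed.w"], clients[0]["shared_embed.w"])
           for c in clients)
```

Communicating only the shared cross-modal subspace is the design choice that keeps per-round traffic proportional to the common embedding size rather than to the full vision and language backbones.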

Cited By

  • (2024) "Text-and-Image Learning Transformer for Cross-modal Person Re-identification," ACM Transactions on Multimedia Computing, Communications, and Applications. https://doi.org/10.1145/3686160. Online publication date: 15-Oct-2024.
  • (2024) "Prototypical Prompting for Text-to-image Person Re-identification," in Proceedings of the 32nd ACM International Conference on Multimedia, pp. 2331–2340. https://doi.org/10.1145/3664647.3681165. Online publication date: 28-Oct-2024.

Information

Published In

IEEE Transactions on Multimedia, Volume 26, 2024
10405 pages
Publisher

IEEE Press

Publication History

Published: 06 November 2023

Qualifiers

  • Research-article

