Privacy Preserving Image Captioning

Jan 1, 2024 · Paula Arguello, Jhon Lopez, Carlos Hinojosa, Karen Sanchez, Hoover Rueda-Chacón, Henry Arguello

Proposed end-to-end model. The optical encoder is a camera with a refractive lens whose surface is parametrized by a linear combination of Zernike polynomials. The decoder consists of a convolutional feature extractor and an LSTM network with attention, which produces a caption from the privacy-preserving image.
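To make the lens parametrization concrete, here is a minimal NumPy sketch of how a set of Zernike coefficients could be turned into a point spread function (PSF). It is an illustration under stated assumptions, not the authors' implementation: the choice of six low-order modes, the treatment of coefficients as phase in radians, and the far-field (Fraunhofer) propagation model are all assumptions, and the names `zernike_basis` and `psf_from_coeffs` are hypothetical.

```python
import numpy as np

def zernike_basis(n_pix):
    """A few low-order (unnormalized) Zernike modes sampled on the unit disk."""
    x = np.linspace(-1.0, 1.0, n_pix)
    xx, yy = np.meshgrid(x, x)
    rho = np.hypot(xx, yy)
    theta = np.arctan2(yy, xx)
    aperture = (rho <= 1.0).astype(float)  # circular pupil mask
    modes = [
        2 * rho**2 - 1,                           # defocus
        rho**2 * np.cos(2 * theta),               # astigmatism (0 deg)
        rho**2 * np.sin(2 * theta),               # astigmatism (45 deg)
        (3 * rho**3 - 2 * rho) * np.cos(theta),   # coma (x)
        (3 * rho**3 - 2 * rho) * np.sin(theta),   # coma (y)
        6 * rho**4 - 6 * rho**2 + 1,              # primary spherical
    ]
    return np.stack(modes) * aperture, aperture

def psf_from_coeffs(coeffs, n_pix=64):
    """Map Zernike coefficients to a normalized PSF (Fraunhofer approximation)."""
    basis, aperture = zernike_basis(n_pix)
    # Assumption: coefficients are phase amplitudes in radians.
    phase = np.tensordot(coeffs, basis, axes=1)
    pupil = aperture * np.exp(1j * phase)
    pupil = np.pad(pupil, n_pix // 2)  # zero-pad for finer PSF sampling
    field = np.fft.fftshift(np.fft.fft2(pupil))
    psf = np.abs(field) ** 2
    return psf / psf.sum()

# Usage: a flat lens versus an aberrated lens.
flat_psf = psf_from_coeffs(np.zeros(6))
blur_psf = psf_from_coeffs(np.array([3.0, 1.0, -1.0, 0.5, 0.5, 2.0]))
```

In an end-to-end setup, the coefficients would be trainable parameters updated together with the captioning network; the fixed values above are arbitrary and only show that stronger aberrations spread the PSF energy.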

Scene captioning, the task of describing visual information using text, typically relies on high-resolution images, raising privacy concerns due to potential exposure of sensitive data like facial attributes. This project introduces a novel solution to protect privacy by integrating optics design with algorithms, using a learned refractive lens to obscure sensitive visual attributes in the acquired image while still enabling accurate scene captions. 📸🔒

Key Features

  • Privacy-Preserving Lens: A specially designed lens that distorts sensitive attributes, such as faces, ethnicity, and gender, so privacy is protected optically, before the image is ever recorded.
  • Optimized Deep Learning: Combines an optimized refractive lens with a deep network architecture for end-to-end scene captioning from distorted images. 🤖🔍
  • Comprehensive Evaluation: Validated through extensive simulations and hardware experiments, showing a better balance between privacy and utility than the other privacy-preserving approaches compared.

Published Papers

📌 Optics lens design for privacy-preserving scene captioning: Paula Arguello, Jhon Lopez, Carlos Hinojosa, Henry Arguello. 🏆 BEST PAPER AWARD ICIP 2022 🏆

📌 Learning to Describe Scenes via Privacy-aware Optical Lens: Paula Arguello, Jhon Lopez, Karen Sanchez, Carlos Hinojosa, Hoover Rueda-Chacón, Henry Arguello. LatinX at CVPR 2024.

📌 Learning to Describe Scenes via Privacy-aware Designed Optical Lens: Paula Arguello, Jhon Lopez, Karen Sanchez, Carlos Hinojosa, Fernando Rojas-Morales, Henry Arguello. IEEE Transactions on Computational Imaging.

Results

Ablation against other privacy-preserving approaches

Qualitative results on two test-set samples. Insets display the SSIM and METEOR between the distorted and original images.
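As a rough illustration of the SSIM numbers reported in the insets, the sketch below computes a simplified, single-window SSIM between two grayscale images. This is an assumption-laden stand-in: standard SSIM averages the same formula over local windows, and the exact settings used for the figures are not stated here. The function name `global_ssim` is hypothetical.

```python
import numpy as np

def global_ssim(a, b, data_range=1.0):
    """Single-window SSIM over the whole image (simplified; no local windows)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Standard stabilizing constants from the SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a**2 + mu_b**2 + c1) * (var_a + var_b + c2)
    )

# Usage: compare an image with a box-blurred copy of itself.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
distorted = sum(
    np.roll(np.roll(image, dy, axis=0), dx, axis=1)
    for dy in (-1, 0, 1) for dx in (-1, 0, 1)
) / 9.0
score = global_ssim(image, distorted)
```

In the privacy setting, a lower SSIM between the distorted and original images indicates stronger visual distortion, while METEOR tracks how well the caption quality is preserved.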

Robustness to deconvolution

Evaluation of the robustness of our lens-protected images against deconvolution attacks. Qualitative results show that the identities of individuals cannot be recovered after applying non-blind (Wiener) and blind (DeblurGAN-v2) deconvolution.
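For readers unfamiliar with the non-blind attack, the following is a minimal sketch of textbook Wiener deconvolution with a known PSF. It is not the authors' evaluation code: the constant noise-to-signal ratio `nsr`, the circular-convolution blur model, and the Gaussian PSF (a stand-in for the learned lens PSF, which is not reproduced here) are all assumptions, and the function names are hypothetical.

```python
import numpy as np

def gaussian_psf(n, sigma):
    """Centered, normalized Gaussian PSF on an n x n grid (stand-in PSF)."""
    ax = np.arange(n) - n // 2
    g = np.exp(-(ax**2) / (2.0 * sigma**2))
    psf = np.outer(g, g)
    return psf / psf.sum()

def blur(image, psf):
    """Circular convolution of image with a centered PSF of the same shape."""
    otf = np.fft.fft2(np.fft.ifftshift(psf))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * otf))

def wiener_deconvolve(blurred, psf, nsr=1e-2):
    """Non-blind Wiener filter, assuming a constant noise-to-signal ratio."""
    otf = np.fft.fft2(np.fft.ifftshift(psf))
    wiener = np.conj(otf) / (np.abs(otf) ** 2 + nsr)
    return np.real(np.fft.ifft2(wiener * np.fft.fft2(blurred)))

# Usage: blur a random test image, then attempt to invert the blur.
rng = np.random.default_rng(0)
sharp = rng.random((64, 64))
psf = gaussian_psf(64, sigma=1.5)
observed = blur(sharp, psf)
restored = wiener_deconvolve(observed, psf, nsr=1e-3)
```

The attack can only restore frequencies that the PSF does not suppress too strongly, so how much it recovers depends entirely on the PSF being inverted.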

Hardware Implementation

(Top) Experimental hardware setup for our proposed privacy-preserving image captioning approach. (Bottom) PSFs and qualitative results on an example image acquired with a conventional camera (left), our proof-of-concept camera (middle), and the simulated camera (right).

Related works that you may be interested in 👇

🔗 PrivHAR: Recognizing Human Actions From Privacy-Preserving Lens: Carlos Hinojosa, Miguel Marquez, Henry Arguello, Ehsan Adeli, Li Fei-Fei, Juan Carlos Niebles.

🔗 Learning Privacy-preserving Optics for Human Pose Estimation: Carlos Hinojosa, Juan Carlos Niebles, Henry Arguello.

Authors

  • Paula Arguello: B.S. Systems Engineer
  • Jhon Lopez: Ph.D.(c) in Computer Science, Universidad Industrial de Santander
  • Carlos Hinojosa: Postdoctoral Researcher at KAUST
  • Karen Sanchez: Postdoctoral Researcher at KAUST
  • Hoover Rueda-Chacón: Assistant Professor, Universidad Industrial de Santander
  • Henry Arguello: Professor at Universidad Industrial de Santander, Colombia