Adversarial Robustness of Perceptual Hashing Systems: A Unified Security Evaluation Framework

Avijit Roy, CUNY John Jay College

Date of Award

Spring 6-2026

Document Type

Thesis

Degree Name

Master of Science (MS)

Department/Program

Digital Forensics and Cybersecurity

Language

English

First Advisor or Mentor

Shweta Jain

Second Reader

Fatma Najar

Abstract

Social network platforms, child safety organizations, and image provenance systems use perceptual hashing to identify known child sexual abuse material (CSAM), support content moderation and reverse image search, and verify image integrity. Perceptual hashing works by producing similar fingerprints for visually similar images, even after common transformations such as compression, resizing, or minor brightness changes. This useful similarity-preserving property also creates an adversarial attack surface, as attackers can use AI-assisted or conventional image manipulation techniques to move a hash across a matching threshold while maintaining visual similarity, often without access to specialized hardware.
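The similarity-preserving property described above can be illustrated with a toy example. The sketch below uses a plain average hash on a small grayscale grid, which is a deliberately simpler technique than the DCT-based and neural algorithms evaluated in the thesis; the function names (`average_hash`, `brighten`, `hamming_distance`) and the synthetic 8x8 image are assumptions made for illustration only.

```python
# Toy average hash: a simplified stand-in, NOT one of the evaluated algorithms.
# Each pixel contributes one bit: 1 if it is brighter than the image mean.

def average_hash(pixels):
    """Return an integer whose bits encode 'pixel > mean' in row-major order."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def brighten(pixels, delta):
    """Add a constant brightness offset, clipped to the 0..255 range."""
    return [[min(255, max(0, p + delta)) for p in row] for row in pixels]

def hamming_distance(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")

# Hypothetical 8x8 grayscale gradient (values 0..189, so +10 does not clip).
image = [[(r * 8 + c) * 3 for c in range(8)] for r in range(8)]
brighter = brighten(image, 10)
inverted = [[189 - p for p in row] for row in image]

# A uniform brightness shift moves every pixel and the mean together,
# so the fingerprint is unchanged; inverting the image flips every bit.
same = hamming_distance(average_hash(image), average_hash(brighter))
different = hamming_distance(average_hash(image), average_hash(inverted))
print(same, different)
```

Because every bit depends only on a comparison against a global statistic, small benign edits leave most bits untouched. That same tolerance is what leaves room for deliberately chosen perturbations.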

The security failures produced by adversarial attacks are asymmetric in their operational consequences. Evasion attacks generate false negatives: content that should match a known reference hash may avoid detection, causing investigative or review leads to be missed. Collision and near-collision attacks generate false positives, where legitimate images are falsely matched or moved close enough to match a target hash. In high-volume or automated settings, these false positives can increase human review burden, create reviewer fatigue, and raise privacy concerns when legitimate user images are escalated for review.

This thesis evaluates six hashing configurations on the same 500-image ImageNette validation subset under a matched experimental protocol: pHash-64, PDQ, pHash-256, NeuralHash, SmartHash-WI (whole image), and SmartHash-OV (overlapping blocks). All six were assessed under evasion and near-collision attacks. Exact collision was additionally evaluated for NeuralHash and pHash-64, the two algorithms for which prior exact-match baselines exist in the literature. Near-collision reflects the threshold-based matching criterion commonly used by perceptual hashing systems, where an alert may be triggered when a hash falls within a chosen distance of a target hash rather than matching it exactly. These false-positive and false-negative outcomes are evaluated separately because they create different operational burdens for review and investigation.

The results show that all six evaluated algorithms were vulnerable to evasion under the tested white-box or surrogate-gradient settings. At operational thresholds, evasion succeeded for every tested image, with high structural similarity between the original and modified images. Near-collision results show that exact-match collision alone is not sufficient for assessing false-positive risk in threshold-based image matching systems. For pHash-64, the exact collision success rate of 42.2% increased to 99.4% under the threshold-based near-collision criterion at T = 0.10. NeuralHash also showed high near-collision susceptibility at the same threshold, while pHash-256, PDQ, and the two SmartHash variants showed lower near-collision success under the specific surrogate methods evaluated here.
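The gap between exact collision and threshold-based near-collision can be made concrete with a short sketch. It assumes that T denotes a normalized Hamming distance (differing bits divided by hash length) and that a match is declared when the distance is at or below T; the abstract does not spell out either convention, and the function names and the 64-bit example values are hypothetical.

```python
# Minimal sketch of exact vs. threshold-based matching.
# ASSUMPTIONS: T is a normalized Hamming distance and matching uses "<=".

def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hashes stored as integers."""
    return bin(a ^ b).count("1")

def exact_collision(candidate: int, target: int) -> bool:
    """Exact-match criterion: every bit must agree."""
    return candidate == target

def near_collision(candidate: int, target: int, n_bits: int, threshold: float) -> bool:
    """Threshold criterion: the fraction of differing bits is at most threshold."""
    return hamming_distance(candidate, target) / n_bits <= threshold

N_BITS = 64
T = 0.10                              # threshold value quoted in the abstract
target = 0x0123456789ABCDEF           # hypothetical 64-bit target hash
five_off = target ^ 0b11111           # differs from target in 5 bits (5/64 = 0.078)
seven_off = target ^ 0b1111111        # differs from target in 7 bits (7/64 = 0.109)

print(exact_collision(five_off, target))            # False: not bit-identical
print(near_collision(five_off, target, N_BITS, T))  # True: within the threshold
print(near_collision(seven_off, target, N_BITS, T)) # False: outside the threshold
```

Under these assumptions, a 64-bit hash may differ from the target in up to 6 bits and still trigger a match at T = 0.10, which is why an attack that rarely reaches an exact collision can still succeed often under the threshold criterion. Evasion is the complementary condition: a modified image evades detection when its hash falls outside the threshold around its own reference hash.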

The findings in this thesis are best read as a threat model characterization: they define the adversarial boundary of an operationally mature technology. Adversarial robustness is influenced more by pipeline structure, optimization landscape, and quantization design than by hash length alone. DCT-based systems with simple thresholding were generally easier to evade, while adaptive quantization in SmartHash reduced near-collision success under the tested surrogate because the source and target images define different quantization bin boundaries. The results indicate that perceptual hashes should not be treated as standalone security mechanisms in high-stakes image matching or provenance workflows. They are better understood as one component in a broader review pipeline that may require additional verification, auditing, and system-level safeguards. Perceptual hashing has demonstrated genuine operational value at a scale that few alternative approaches can match at comparable computational cost. It enables low-latency content matching across the upload volumes of major platforms and performs reliably under ordinary, non-adversarial conditions.

Recommended Citation

Roy, Avijit, "Adversarial Robustness of Perceptual Hashing Systems: A Unified Security Evaluation Framework" (2026). CUNY Academic Works.
https://academicworks.cuny.edu/jj_etds/396

Included in

Cybersecurity Commons, Forensic Science and Technology Commons, Information Security Commons
