Perceptual Hash Algorithm

Perceptual hashing algorithms are designed to create hashes based on the content's characteristics that a human might perceive, such as the appearance of an image, rather than its byte-level composition. Unlike traditional cryptographic hashing (e.g., SHA or MD5) that produces completely different outputs for even the slightest data alteration, perceptual hashing produces similar hashes for similar data, allowing for approximate matching. These algorithms are widely used in tasks like duplicate image detection, content moderation, and copyright infringement detection, where slight variations (like resizing or color adjustments) should not lead to totally different hashes. The algorithms often involve downscaling the image, converting it to grayscale, and extracting key features that capture the essence of the visual information.

The concept of perceptual hashing emerged in the early 2000s, as the need for approximate matching in multimedia content detection grew. The technique gained wider recognition in the mid-2000s, as digital content proliferation required efficient ways to identify similar or derivative works.

One of the early key contributions to the field was the development of pHash (perceptual hash) by Johannes Buchmann and others, which became a popular open-source library for image comparison. Additionally, many advancements came from research in multimedia content retrieval and image processing communities, as the need for efficient algorithms in these areas grew.

Perceptual Hash Algorithm

Key Contributors

Newsletter

Academic Papers

A novel liver image classification method using perceptual hash-based convolutional neural network

Learning to break deep perceptual hashing: The use case neuralhash

An overview of perceptual hashing

A novel perceptual hash algorithm for multispectral image authentication

A new perceptual hashing method for verification and identity classification of occluded faces