Photo Recognition

SIFT (Scale-Invariant Feature Transform) was invented by David Lowe, of the Computer Science Department at the University of British Columbia in Vancouver, Canada. It is a computer vision technique that lets a computer recognize elements in an image. The general principle is that it associates with each image a number of keypoints that it can later detect in other images. It has received quite a bit of attention for automatic panorama stitching, where it has shown great precision and convenience.

In gpuViewer, SIFT is combined with a proprietary indexing scheme to locate an image in the database, given another version of it. This can be used to find the original given a web version, a scan of a print or a variation. SIFT keypoints are very robust and can usually still be located after classic photo manipulations such as cropping, contrast and saturation adjustments, conversion to black and white or sepia, and even sharpening operations.

Downloading the SIFT detector

By default, gpuViewer doesn’t use SIFT: you must download the SIFT detector yourself from the University of British Columbia at www.cs.ubc.ca/~lowe/keypoints/. Make sure you meet the conditions the university imposes:

This software for the detection of invariant keypoints is being made available for individual research use only. Any commercial use or any redistribution of this software requires a license from the University of British Columbia.

For now, this should be the case for everybody, as gpuViewer is at an R&D stage and made available for review and comment purposes only.

Installing the SIFT detector

Extract the file siftWin32.exe from the siftDemoV4.zip archive you just downloaded, and copy it alongside gpuViewer.exe in the directory where gpuViewer was installed. By default, this is C:\Program Files\webphotomag\ .

Once siftWin32.exe is in the right place, gpuViewer will detect it the next time it starts and add new menu entries to the application:

  • the main menu gets a new SIFT submenu
  • the photo right-click menu gets new items
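If the new menus do not appear, a quick check is to confirm that both executables really sit in the same directory. The sketch below is only an illustration in Python and is not part of gpuViewer: the helper name is made up for the example, and the default path is simply the default install location quoted above.

```python
from pathlib import Path

# Default install location quoted in this page; pass another directory
# if gpuViewer was installed elsewhere.
DEFAULT_INSTALL_DIR = r"C:\Program Files\webphotomag"


def missing_executables(install_dir=DEFAULT_INSTALL_DIR):
    """Return the names of the expected executables that are not in install_dir.

    Hypothetical helper for illustration only; gpuViewer performs its own
    detection when it starts.
    """
    expected = ["gpuViewer.exe", "siftWin32.exe"]
    folder = Path(install_dir)
    return [name for name in expected if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_executables()
    if missing:
        print("Missing from the install directory:", ", ".join(missing))
    else:
        print("Both executables are in place; restart gpuViewer.")
```

An empty list means the copy step worked, and the SIFT menus should show up after the next restart.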

Building the recognition database

gpuViewer will create a new file in the main catalog’s directory, alongside index.vg. It is called SIFT.vg and stores all keypoints and indexing data. This file will grow much larger than index.vg: expect a little under 128K per image. This is still quite reasonable compared to the space taken by the images themselves (for our test databases, the overhead is less than 2%).

To trigger the creation of this database, use the menu SIFT/Build Database. Unlike the population of the main database, you can’t use gpuViewer while it builds the SIFT database: this is a computationally intensive task, and it has been coded to use all the resources it can get so as to finish as soon as possible. Actually, once I have figured out a way to keep computing while the database is committing data to disk, this will be even more intensive…

The best thing to do is to let gpuViewer update its SIFT database when you aren’t using the computer: the indexing rate can be on the order of a thousand images an hour on a powerful machine. That’s fast, but it still adds up to a long time for large collections… Fortunately, two things will help in the future: there is room for optimisation, and this process could easily be spread over several machines. But we’re not there yet.
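To plan a build, the two figures quoted in this section (a little under 128K per image, and about a thousand images an hour on a powerful machine) can be turned into rough upper bounds. This is only a back-of-the-envelope sketch in Python: the function name is made up, "128K" is assumed to mean 128 KiB, and the real rate depends on your machine and your images.

```python
def estimate_build(image_count, kib_per_image=128, images_per_hour=1000):
    """Rough upper bounds for the size of SIFT.vg and the time to build it.

    The defaults are the figures quoted in the text; "128K" is assumed to
    mean 128 KiB.
    """
    size_gib = image_count * kib_per_image / (1024 * 1024)
    hours = image_count / images_per_hour
    return size_gib, hours


for count in (5_000, 25_000, 100_000):
    size_gib, hours = estimate_build(count)
    print(f"{count:>7} images: up to {size_gib:.1f} GiB, about {hours:.0f} hours")
```

For a collection of 25,000 images, this gives roughly 3 GiB of keypoint data and about 25 hours of indexing.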

At any time, you can cancel the building of the database by pressing the Cancel button. Give it a long press: gpuViewer isn’t paying much attention, as it is very busy at this point… Once the button has changed state, the click has been registered; it will then take a few more seconds before the data has been committed to disk. Please be patient.

Showing keypoints

OK, now you have a database (even if it might be partial). Zoom in on a picture until it is at the preview stage (about 500 pixels): it displays more resolution than a thumbnail, but still isn’t at full resolution. Right-click and choose “SIFT: Select Keypoints”.

If this image has already been indexed, keypoints will show up immediately; if not, they’ll be calculated first. Note that there are two images in the screenshot above. The one with green keypoints only is the one we selected: they all match. The next one shows the same subject, and its keypoints have various colours:

  • green means we have a full descriptor match
  • grey means we don’t have a match
  • red means we have an actual SIFT match, but the short descriptor (used for indexing) doesn’t match

This last case isn’t a sign of a problem, even if the colour might suggest so: these descriptors are useful to confirm a match but not to suggest one (gpuViewer starts by running quickly through the index, gathering potential matches that get thoroughly compared afterwards).
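The two-pass search and the colour code described above can be illustrated with a toy sketch in Python. Everything in it is an assumption made for the example: gpuViewer’s indexing scheme is proprietary, so the "short descriptor" here is just a coarse bucketing of the first few values, descriptors have 8 values to keep the example readable, and the distance threshold is arbitrary.

```python
from collections import Counter, defaultdict


def short_key(descriptor, length=4, step=64):
    """Hypothetical short descriptor: coarse buckets of the first few values."""
    return tuple(value // step for value in descriptor[:length])


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_index(database):
    """Map each short key to the ids of the images that contain it."""
    index = defaultdict(set)
    for image_id, descriptors in database.items():
        for descriptor in descriptors:
            index[short_key(descriptor)].add(image_id)
    return index


def keypoint_colour(full_match, short_match):
    """Colour code from the text: grey = no match, green = full match,
    red = full descriptors match but the short descriptor does not."""
    if not full_match:
        return "grey"
    return "green" if short_match else "red"


def find_similar(query, database, index, max_candidates=10, max_sq_dist=400):
    # Pass 1: quick run through the index, one vote per shared short key.
    votes = Counter()
    for descriptor in query:
        for image_id in index.get(short_key(descriptor), ()):
            votes[image_id] += 1

    # Pass 2: thorough comparison of full descriptors, candidates only.
    results = []
    for image_id, _ in votes.most_common(max_candidates):
        colours = []
        for candidate in database[image_id]:
            partners = [q for q in query
                        if squared_distance(candidate, q) <= max_sq_dist]
            same_key = any(short_key(q) == short_key(candidate) for q in partners)
            colours.append(keypoint_colour(bool(partners), same_key))
        matched = sum(colour != "grey" for colour in colours)
        if matched:
            results.append((image_id, matched, colours))
    results.sort(key=lambda item: item[1], reverse=True)
    return results


# Toy data: two images in the database, one query with two keypoints.
database = {
    "holiday.jpg": [[10, 10, 10, 10, 200, 200, 200, 200],
                    [60, 70, 10, 10, 90, 90, 90, 90]],
    "portrait.jpg": [[250, 250, 250, 250, 0, 0, 0, 0]],
}
index = build_index(database)
query = [[12, 9, 11, 10, 198, 201, 200, 199],
         [66, 62, 12, 9, 91, 88, 90, 92]]
print(find_similar(query, database, index))
```

In this toy data, the second keypoint of holiday.jpg is close to the second query keypoint, but their first values fall on either side of a bucket boundary, so the short keys differ. It could not have suggested the image in the first pass, yet it still confirms the match in the second one: that is the "red" case.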

Once you have highlighted the keypoints of a photo, all photos that have keypoints will display them, giving a visual indication of how similar gpuViewer finds them to the selected photo.

If you want to temporarily hide the keypoints, hold Ctrl. If you want to hide them for good, use the menu SIFT/Hide keypoints.

Finding similar images

In some cases, what you need is to find the original, or a variation, of a given photo in your database. To do this, right-click on the image and select “SIFT: Find images with similar keypoints”.


This is the result of searching for the images similar to the one selected above. Here, many views of the same landscape have been returned, all showing the same place over several days. Sometimes, results will also include totally unrelated pictures: to the human eye they are different, but to gpuViewer they have enough similarities. The reverse can happen too: sometimes you’d expect photos of the same subject to be similar enough, but gpuViewer won’t consider them a match. Still, the one you’re looking for will be at the top of the list: it’s a bit like Google, the machines don’t understand things as we do, but they’re great at narrowing things down.

That’s why I talk about “photo recognition”: the feature currently lets you look for a specific picture, not for an object contained in it. I expect that some form of object search will be possible, but we’re not there yet; we’re just getting started ;) The great thing is that you can look for images similar to any one of the resulting images, right there from the selection view, by repeating the operation above. In this way, it is possible to approach the Holy Grail of a “find a photo of this object” feature…

Searching via the clipboard

This is what makes photo recognition versatile: if you can get an image into the clipboard, you can search for it. Whether you copy from your website, scan a print or take a screen capture, it can all be used. You then only have to go to the menu SIFT/Search image in clipboard. gpuViewer will start by analyzing the clipboard content to extract keypoints, then match the result against your database.

Please note that keypoint detection is done at the resolution of the clipboard content. Passing in a large image will take longer: more pixels need to be analyzed, they will result in more keypoints, and more potential matches will need to be checked. For a scanned print, a 150 dpi scan of a postcard-sized image is quite enough data for this process.
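To get a feel for how quickly the amount of data grows with scan resolution, here is a small worked example in Python. It assumes a postcard of 6 x 4 inches, which is my own assumption for the example (the text above only says "postcard sized"), and the function name is made up.

```python
def scan_size(width_in, height_in, dpi):
    """Pixel dimensions and total pixel count of a scanned print."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px


# Assumption: a postcard is taken to be 6 x 4 inches.
for dpi in (150, 300, 600):
    width_px, height_px, total = scan_size(6, 4, dpi)
    print(f"{dpi} dpi: {width_px} x {height_px} = {total / 1e6:.2f} megapixels")
```

Each doubling of the scan resolution quadruples the number of pixels to analyze, so scanning at 600 dpi instead of 150 dpi means sixteen times more pixels.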

Navigating the selection

This selection view works just like the calendar view: you can zoom in and out and drag the canvas around. Yes, down to 100% view if you like. There is a current image and the left and right arrow keys move to previous and next images.

When photos reach their preview size, the keypoints appear and let you visually check how similar gpuViewer finds them to the one searched for. As mentioned before, you can hide these keypoints using the menu SIFT/Hide keypoints.

Usually, you will be interested in other shots from the same session, and will want to jump to the calendar view. Right-click on the picture you’re interested in and select Locate in calendar, and voilà!

The ‘C’ key takes you to the calendar, at the place you left it; ‘Z’ takes you to the selection, in the same way. Using these keys, you can switch between these views easily. You can also use the View menu to do so.

Reindexing penalty

When new keypoints have been added to the database, the first “photo recognition” call will take a few seconds longer: there is a secondary index that needs rebuilding. This is why your first match will often take longer than the next ones.
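The behaviour described here is typical of an index that is rebuilt lazily: additions only mark it as out of date, and the next search pays for the rebuild. The toy Python class below illustrates that general pattern only. It is not gpuViewer’s code, and every name in it is made up for the example.

```python
class KeypointStore:
    """Toy store with a secondary index that is rebuilt lazily.

    Hypothetical illustration; names and structure are not gpuViewer's.
    """

    def __init__(self):
        self.keypoints = {}      # image id -> list of descriptor keys
        self._secondary = {}     # descriptor key -> set of image ids
        self._dirty = False      # True when the index is out of date
        self.rebuild_count = 0

    def add(self, image_id, keys):
        # Adding is cheap: record the data and mark the index as stale.
        self.keypoints[image_id] = list(keys)
        self._dirty = True

    def _rebuild(self):
        self._secondary = {}
        for image_id, keys in self.keypoints.items():
            for key in keys:
                self._secondary.setdefault(key, set()).add(image_id)
        self._dirty = False
        self.rebuild_count += 1

    def search(self, key):
        # Only the first search after an addition pays for the rebuild.
        if self._dirty:
            self._rebuild()
        return sorted(self._secondary.get(key, ()))


store = KeypointStore()
store.add("a.jpg", ["k1", "k2"])
store.add("b.jpg", ["k2"])
print(store.search("k2"), store.rebuild_count)  # first search: rebuilds
print(store.search("k1"), store.rebuild_count)  # second search: no rebuild
```

The advantage of this design is that adding many images in a row costs a single rebuild, paid once at the first search, instead of one rebuild per image.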