Data Mining - What Hidden Information Do Your Photos Contain?

Time was when a photo was just a captured moment in time. /end nostalgia

Nowadays, though, what people do not realize is the sheer amount of “extra” information embedded in “that picture you just uploaded to Flickr/Facebook/Photobucket”, especially if you are uploading from a “smart phone”, as more and more people are now.

Most photos now contain embedded GPS data, and this information can survive a resize or upload process. At the time of writing, images tested from Facebook appear to have the EXIF data stripped out (thumbs up for Facebook, maybe), and it appears PHP GD by default replaces all EXIF data with its own (bug, maybe?).

For unsanitized images, however, you can discern a wealth of information, such as:

  1. Make of camera
  2. Model of camera
  3. Software version
  4. Unix timestamp of time taken
  5. DateTime stamp of time taken
  6. Focal length used
  7. Shutter speed
  8. Whether the flash was used
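
To give a feel for how little effort it takes to pull out fields like the make and model, here is a minimal sketch in Python using only the standard library. It assumes you already have the TIFF-formatted block that follows the "Exif" header inside a JPEG, and it only reads the text tags in the first directory. The function name read_basic_exif and the sample bytes are made up for illustration; a real reader would handle many more tags and types.

```python
import struct

# Standard EXIF/TIFF tag numbers for a few of the text fields listed above.
ASCII_TAGS = {0x010F: "Make", 0x0110: "Model", 0x0131: "Software", 0x0132: "DateTime"}


def read_basic_exif(tiff: bytes) -> dict:
    """Return the ASCII tags found in the first directory (IFD0) of a TIFF block."""
    # The first two bytes say whether numbers are little- or big-endian.
    order = {b"II": "<", b"MM": ">"}.get(tiff[:2])
    if order is None:
        raise ValueError("not a TIFF/EXIF block")
    magic, ifd_offset = struct.unpack(order + "HI", tiff[2:8])
    if magic != 42:
        raise ValueError("bad TIFF magic number")

    (count,) = struct.unpack(order + "H", tiff[ifd_offset:ifd_offset + 2])
    found = {}
    for i in range(count):
        # Each directory entry is 12 bytes: tag, type, count, value-or-offset.
        start = ifd_offset + 2 + 12 * i
        tag, typ, n, value = struct.unpack(order + "HHI4s", tiff[start:start + 12])
        if tag in ASCII_TAGS and typ == 2:  # type 2 = ASCII string
            if n <= 4:
                raw = value[:n]  # short strings are stored inline
            else:
                (offset,) = struct.unpack(order + "I", value)
                raw = tiff[offset:offset + n]
            found[ASCII_TAGS[tag]] = raw.rstrip(b"\x00").decode("ascii", "replace")
    return found


# Hypothetical hand-built sample: two entries, little-endian.
make, model = b"Apple\x00", b"iPhone 4\x00"
data_start = 8 + 2 + 2 * 12 + 4  # header + entry count + entries + next-IFD offset
sample = (
    b"II" + struct.pack("<HI", 42, 8) + struct.pack("<H", 2)
    + struct.pack("<HHII", 0x010F, 2, len(make), data_start)
    + struct.pack("<HHII", 0x0110, 2, len(model), data_start + len(make))
    + struct.pack("<I", 0) + make + model
)
print(read_basic_exif(sample))
```

The camera settings and GPS values live in sub-directories that are read the same way, which is why off-the-shelf tools can dump everything in one go.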

And if GPS is embedded:

  1. Longitude
  2. Latitude
  3. Altitude
  4. GPS timestamp
  5. Direction facing when photo taken
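
EXIF does not store latitude and longitude as plain decimals. Each is kept as three fractions (degrees, minutes, seconds) plus a reference letter (N/S or E/W). The sketch below shows the conversion to the signed decimal form that mapping sites accept. The function name dms_to_decimal and the sample coordinates are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert ((deg_num, deg_den), (min_num, min_den), (sec_num, sec_den))
    plus a reference letter into signed decimal degrees."""
    degrees, minutes, seconds = (Fraction(num, den) for num, den in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # South and west are expressed as negative values.
    if ref in ("S", "W"):
        decimal = -decimal
    return float(decimal)


# Hypothetical sample values, not taken from a real photo.
latitude = dms_to_decimal(((51, 1), (30, 1), (0, 1)), "N")
longitude = dms_to_decimal(((0, 1), (7, 1), (30, 1)), "W")
print(latitude, longitude)
```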

There is yet more data, such as the colour profile used and the image resolution. In my tests, the coordinates in photos taken with my iPhone 4 were within 10 meters of where I was actually standing when I took the picture, and the photos also recorded which direction I was facing.

So one more thing to note in your application’s “data sanity” is to strip EXIF tags from uploaded images, lest your contributors’ private details be leaked from your application.
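
As a rough idea of what stripping involves, here is a minimal Python sketch using only the standard library. It walks the segments at the start of a JPEG and drops any APP1 segment that begins with the "Exif" header, copying everything else through untouched. The name strip_exif is hypothetical, and the sketch makes assumptions: it only handles JPEG, and it leaves other metadata (XMP, comments, and so on) in place. In a real application a well-tested image library is likely the safer choice.

```python
import struct

EXIF_HEADER = b"Exif\x00\x00"


def strip_exif(jpeg_bytes: bytes) -> bytes:
    """Return a copy of a JPEG with its EXIF (APP1) segments removed."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing SOI marker)")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            raise ValueError("malformed JPEG: expected a segment marker")
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:
            # Start of scan: everything from here on is image data.
            out += jpeg_bytes[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        segment = jpeg_bytes[pos:pos + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == EXIF_HEADER
        if not is_exif:
            out += segment
        pos += 2 + length
    # No scan marker found; keep whatever is left unchanged.
    out += jpeg_bytes[pos:]
    return bytes(out)
```

One thing to watch for: the orientation flag also lives in the EXIF data, so photos taken with the phone held sideways may display rotated once it is gone, unless the pixels are rotated before stripping.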

For example:

  1. User uploads photo for competition
  2. Site uses resized photo on competition page to allow visitor voting
  3. Malicious user saves image from site (or just uses the copy from their browser cache) and gets GPS data from photo
  4. Malicious user now knows exactly where the photo was taken, as well as the time.

And it doesn’t have to be a malicious user; it could be anyone or anything. If you want to check your images for EXIF data, you can use my tool here:

No data is stored, and images are deleted immediately after processing. You use this at your own risk, however: if you misuse the tool, you accept all liability for any legal action that follows. You have been warned.