Friday, April 19, 2013

Similar image identification theory

Let's assume we've got thousands of images. If we're applying post-processing techniques to these images, we're probably also in possession of near-duplicates. What's a good method to find these duplicates?

Let's consider what "fixes" we might apply to a photo.


  • JPEG compression level - introduces artifacts, may shift colors slightly, causes banding, and changes file size.
  • De-noise - reduces noise, generally reduces file size a little, removes some detail/micro-contrast.
  • Exposure/color correction - changes the average brightness of each image channel (red/green/blue). Color correction changes the ratio of one channel to another across samples of the image.
  • Scale - Changing the size of the image means we have to look at areas of the image proportional to the width and height.
  • HDR - HDR is, I think, misunderstood. The term means capturing more stops of light than a digital camera normally permits. The act of displaying that many levels of light on a screen which only shows eight stops of light per channel is tone mapping. Tone mapping can be done in many ways and can be a type of local contrast manipulation - i.e. making a dark area a little brighter to show its details better. Unlike simply lowering the contrast of the entire image, which effectively does the same thing, local contrast manipulation means a solidly dark or bright area may not be the same brightness throughout. E.g. for a dark area surrounded by a bright area, the dark area may be brighter overall, but not uniformly so, in order to keep contrast high at the border with the brighter part of the image. More on playing with this in a future post ;)
  • Cropping - this can be quite difficult or at least intensive to check.

So, looking at these, we can rule out:
  • file-size and checksums
  • image size and aspect ratio checks
  • pixel by pixel comparisons - noise and jpeg artifacts will render this unusable
One check that can be made is comparing the average of small clumps of pixels against the overall image average. If the clump size is kept proportional to the image's width and height, then even re-sized images can be matched.

Assume the picture is divided into sections by a grid. The average brightness of each section is then compared to the overall average (summing the red, green and blue values of each pixel). If a section is brighter than the overall average, call it '1'; if darker, '0'. If it's within a threshold of the average, a third value can be used.
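To make that concrete, here's a minimal sketch of the idea (separate from the full script further down); the function name, parameters, and 5% threshold are purely illustrative.
___________________________________________________
import numpy

def grid_signature(pixels, grid=10, threshold=0.05):
    # pixels: 2D numpy array of per-pixel brightness (e.g. R+G+B summed)
    h, w = pixels.shape
    overall = numpy.average(pixels)
    bits = []
    for gy in range(grid):
        for gx in range(grid):
            # average brightness of this grid section
            tile = pixels[gy*h//grid:(gy+1)*h//grid, gx*w//grid:(gx+1)*w//grid]
            local = numpy.average(tile)
            if abs(local - overall) / overall < threshold:
                bits.append("x")   # too close to the overall average to call
            elif local > overall:
                bits.append("1")   # brighter than the overall average
            else:
                bits.append("0")   # darker than the overall average
    return "".join(bits)
___________________________________________________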

Here's how it works:


I've compared some pictures and added them below.
The first and third are the same shot - just with color and exposure corrections and some curves applied. The picture in the center is from a slightly different position, and the position of my baby's head is different. The transparent yellow/green overlays highlight the sections where the local:global exposure ratio is different.




Writing out the flags for the exposure checks:
Picture 1:
00000x010000x11101000001111100000000010000000x0x00000000111001110010x11100011111x000001111x110111011
Picture 2:
00000x0100001111010000x1111100001000010000000x0x0000000011100111001011110x011111x000001111x110111011
Picture 1-color corrected:
00000001000001111100000011110000010000000000010000000011000011110001110111110111xx000001x1x110111011

I'm using a 5% threshold value - i.e. if the difference between local and global exposure of a tile is under 5%, that "bit" is marked "x" and not compared against the alternate image string.

In the images above, the colored sections show the tiles which were found to be different. As we can see, the color-corrected image shows only a slight difference. This can be considered within a threshold, and the image added to a collection of similar images for later review.

Here's what the Python code for the image comparison looks like.

Requirements:
Python, NumPy, PIL (or Pillow).
___________________________________________________
from PIL import Image
import numpy
#------------------------------ DEFINED CONSTANTS ----------------------------
grid=10  # 10x10 dot sample
debugger=0
unclear=1  # permit use of 'x' in hash string if variance threshold is met
localcontrast=0
unclearvariance=0.05
#------------------------------ DEFINED CONSTANTS ----------------------------

def debug(string):
    global debugger
    if (debugger==1):
        print(string)


def imgsum(filename):


    def fuzzsamxy(px,py,prx,pry): # average brightness (x3) in a (prx,pry) radius around (px,py)
        x1=max(px-prx,0)
        x2=min(px+prx+1,ax) # quirk: array[a:b] operates on indexes a to b-1
        y1=max(py-pry,0)
        y2=min(py+pry+1,ay)
        pavg=int(3*numpy.average(im1arr[x1:x2,y1:y2]))
        return pavg

    global grid
    #read image
    debug("Opening "+filename)
    img_a = Image.open(filename)
    im1arr = numpy.asarray(img_a)
    ax,ay,colors=im1arr.shape
    debug("Size of image"+str([ax, ay]))
    debug("Determining average brightness")
    avg = int(3*numpy.average(im1arr))
    debug("Grid comparison")
    cx=0
    signature=""
    radius=(ax//(grid*4))
    debug("radius of :"+str(radius))
    while cx < grid:
        cy = 0
        while cy < grid:
            chkpntx=int((float(cx)+0.5)*ax)//grid
            chkpnty=int((float(cy)+0.5)*ay)//grid
            radx=ax//(grid*2)
            rady=ay//(grid*2)
            if (localcontrast==1):
                avg = fuzzsamxy(chkpntx,chkpnty,radx,rady) # local reference: average over the whole tile
            # sample a small neighbourhood around the tile centre (min() caps the radius at 1 pixel)
            sampavg = fuzzsamxy(chkpntx,chkpnty,min(1,radx//8),min(1,rady//8))
            if float(abs(sampavg-avg))/avg < unclearvariance and unclear==1:
                signature=signature+"x"
            else:
                if sampavg > avg:
                    signature=signature+"1"
                else:
                    signature=signature+"0"

            cy = cy + 1
        cx = cx + 1
    return(signature)

def stringdiffs(str1, str2):
    if len(str1) != len(str2):
        return -1
    a=0
    cum=0
    while a<len(str1):
        if str1[a] != str2[a]:
            if (str1[a] != "x" and str2[a] != "x"):
                cum = cum + 1
        a = a + 1
    return [cum,(float(cum)/len(str1))]

##########################  MAIN HERE ################################
# replace with your own pictures
print("identifying image 1")
id1=imgsum('1.jpg')
print("identifying image 3")
id3=imgsum('1b.jpg')
print("identifying image 4")
id4=imgsum('2.jpg')

# output the ID strings - these "hash" values are simple strings
print (id1)
print (id3)
print (id4)

# stringdiffs(str1, str2) computes the total number of differences and the fraction of the total checks
print("ID differential similar stuff    :"+str(stringdiffs(id1,id3)))
print("ID differential little different :"+str(stringdiffs(id1,id4)))



___________________________________________________________

And here's the output:
>pythonw -u "imageiden2b.py"
identifying image 1
identifying image 3
identifying image 4
00000x010000x11101000001111100000000010000000x0x00000000111001110010x11100011111x000001111x110111011
00000x0100001111010000x1111100001000010000000x0x0000000011100111001011110x011111x000001111x110111011
00000001000001111100000011110000010000000000010000000011000011110001110111110111xx000001x1x110111011
ID differential similar stuff    :[1, 0.01]
ID differential little different :[18, 0.18]
>Exit code: 0    Time: 1.896


Averaging 0.63 seconds per image isn't bad, considering each image is fully loaded and all pixels are used in the averages. NumPy is extremely efficient at speeding this up; accessing individual pixels instead of using NumPy's built-in array averages is several times slower.
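If you want to verify that claim on your own machine, here's a rough, illustrative benchmark (the array size and iteration count are arbitrary):
___________________________________________________
import timeit
import numpy

# a fake 300x300 RGB image of random bytes, purely for timing purposes
arr = numpy.random.randint(0, 256, (300, 300, 3)).astype(numpy.uint8)

def numpy_avg():
    return numpy.average(arr)

def pixel_avg():
    # same average computed by touching every pixel individually
    total = 0
    h, w, c = arr.shape
    for y in range(h):
        for x in range(w):
            total += int(arr[y, x, 0]) + int(arr[y, x, 1]) + int(arr[y, x, 2])
    return float(total) / (h * w * c)

print("numpy average :", timeit.timeit(numpy_avg, number=3))
print("per-pixel loop:", timeit.timeit(pixel_avg, number=3))
___________________________________________________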

From this point, it's pretty simple to create a main function that, given an image, outputs its filename and image-hash string, so it's trivial to use this to build a list of hash strings.
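For example, something along these lines would do it (the folder name and extension filter are assumptions):
___________________________________________________
import os

def hash_folder(folder):
    # return {filename: signature string} for every JPEG in the folder, printing as it goes
    hashes = {}
    for name in sorted(os.listdir(folder)):
        if name.lower().endswith((".jpg", ".jpeg")):
            hashes[name] = imgsum(os.path.join(folder, name))
            print(name + "\t" + hashes[name])
    return hashes

# e.g. hashes = hash_folder("photos")
___________________________________________________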

The next difficulty is more of a UI/layout problem. Using stringdiffs to identify similar images, I believe forming collections and letting the user commit actions to the images in each collection would be efficient (a rough sketch of that grouping follows below).

Either way, the above functions would permit someone to handle it in a manner of their choosing.
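As one possible illustration of that grouping step (not part of the script above), images could be greedily clustered by their stringdiffs fraction against an arbitrary cutoff:
___________________________________________________
def group_similar(hashes, cutoff=0.05):
    # hashes: dict of filename -> signature string (e.g. from the hash_folder sketch above)
    groups = []                      # each group is a list of filenames
    for name, sig in hashes.items():
        for group in groups:
            result = stringdiffs(sig, hashes[group[0]])
            if result != -1 and result[1] < cutoff:
                group.append(name)   # close enough to the group's first image
                break
        else:
            groups.append([name])    # no match - start a new collection
    return groups
___________________________________________________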

For lower contrast images, I've also added a flag to the function for doing local contrast comparisons instead of global contrast. The difference check here will be the tile's average vs the center of the tile - though this has some weaknesses.


Sunday, April 7, 2013

Alternative to rocks

I've been using the website "Alternativeto.net" for a while now. If I had to pick a social online feature I like, it would be an easy win - ahead of Facebook, Blogger, Twitter, etc. It's pretty simple. People add software entries and say what each one is an alternative to. Other people then vote on how good an alternative it is.

As a result you've got a huge database of alternatives to many common/uncommon software packages.

Microsoft Office 365 costs an annual Benjamin. Want an alternative to Microsoft Office? http://alternativeto.net/software/microsoft-office-suite/
The first 5 options presented by alternativeto.net are free.
So many free options. Um - yay?

Well, that's a really common example - let's try some other options:

Photoshop: http://alternativeto.net/software/adobe-photoshop/
Notables: Gimp, Paint.net 

Visual Studio: http://alternativeto.net/software/visual-studio/
Notables: Eclipse, NetBeans, Aptana Studio, CodeBlocks

Adobe Lightroom: http://alternativeto.net/software/adobe-lightroom/
Notables: Picasa, RawTherapee, UFRaw, Bibble

Notepad: http://alternativeto.net/software/notepad/
Notables: Notepad++, GVim, Sublime Text, SciTE

PERL: http://alternativeto.net/software/perl/
Notables: Python, C#, Go

SNES9X: http://alternativeto.net/software/snes9x/
Notables: ZSNES, MESS, Snes9x ES

Skype: http://alternativeto.net/software/skype/
Notables: Google Voice and Video chat, ooVoo, Voxeet

Hugin: http://alternativeto.net/software/hugin/
Notables: Microsoft ICE, Photosynth, PTgui

...

Well, the list goes on, as you can see. Even more esoteric software, such as Super Nintendo emulators and panoramic image creators, is listed with many options. This is great - you can try as many free/trial options as you like and just use the one that best fits your needs.

Wednesday, April 3, 2013

An ARM and a leg (up)...

Five years ago, I plunked down about $250 for my first tablet. That's right - March of 2008. That tablet was the Nokia N800 - fondly called the Nokia Internet Tablet.

It wasn't very well received, but it did have a dedicated following. I don't remember what I was using to convert videos at the time, but I had streamlined it to drag-and-drop to make the N800 my primary multimedia device. The Evince PDF reader, a CBR viewer, several giant JPEGs of Google Maps, and eventually a WebKit browser made this a rather nice little (somewhat pokey) device.

The OS at the time was Maemo. Maemo was interesting - quite efficient given the CPU - however I don't think they ever invested enough time to make it as stable as it could have been. Widgets, taskbars, Gizmo/Skype calls - its feature set was quite well rounded - and the fact that it took not just an SD card but also a mini-SD was fantastic. All built on Debian-based ARM Linux.

The hardware was decent at the time - a 5-inch 800x480 resolution screen, which isn't bad even today. The CPU was a 330MHz Texas Instruments OMAP 2 (ARM11-based). It was rather pokey, but it definitely showed me that ARM was capable of general-purpose processing.

Today we're pushing vastly more powerful CPUs, and I'll have to give Android/iOS the nod for a more friendly and functional OS.

Cortex-A8 - This was a pretty good jump over ARM11, and it really felt like ARM was starting to become competitive - and by competitive, I mean with x86 CPUs. ARM increased the frequency and made the core dual-issue (able to fetch, decode, and execute two instructions at once).

Cortex-A9 - Another nice jump over the previous CPU. The A9 design's big wins: out-of-order execution and multi-core capability. This CPU was powerful enough not only to make a big splash at the time, but to be used and overused for the last two years. I say overused because I'm really looking forward to the Cortex-A15 :)

Not everyone stuck to ARM's own designs. Several manufacturers developed their own CPU designs with ARM-compatible instruction sets, one of the more notable being Qualcomm. Let's look at its last couple of designs - the Snapdragon S3 and S4.

The Snapdragon S3 made the jump to multicore, raised the frequency, and doubled addressable RAM to 2GB. Maybe not quite as powerful as the Cortex-A9, but definitely ahead of the A8.

Already this is sounding like several of today's lower-power x86 CPUs. With the S4, the die shrank from 45nm to 28nm. While the Scorpion core was competitive with Cortex-A9 designs in performance, the Snapdragon S4 went further: each core is 3-issue, increasing parallelism, and it has a longer 11-stage pipeline (compared to the 8-stage A9) in hopes of keeping clock speeds high. The S4 supports quad-core configurations. With the enhanced capabilities of this CPU, Qualcomm is looking at derivative designs to compete with the upcoming Cortex-A15.

The S4's advantages over the Cortex-A9 include the 3-issue-wide decode (vs. 2), a more advanced floating point unit (VFPv4 vs. v3), 128-bit-wide NEON (i.e. SIMD instructions) vs. 64-bit, and an L0 cache (not sure how that works...). The Cortex-A15, which is starting to get some play (Google's Nexus 10 tablet), implements a lot of these advantages.

And hey - it's efficient enough to not melt butter...
http://www.youtube.com/watch?v=4mSD_EhgGSc&feature=player_embedded

With the S4 derivatives and Cortex-A15-based CPUs (and even the old Cortex-A9), the mobile space has made some big jumps. People are questioning whether ARM can compete with x86. Well, it can, and does. If you're comfortable with an app's performance, it doesn't matter what CPU is powering it. One CPU competes with another if the user doesn't notice or mind the difference between the two - and that's very much the case with many applications. If I'm browsing the web, playing an HD movie, or reading books or magazines (content consumption), ARM is fine - as is the interface we normally associate it with, i.e. touchscreens for phones and tablets.

Can ARM compete at a higher level? I think it's possible - BUT, OS support can get hairy. Android isn't quite ready to replace my desktop, and Linux support is spotty among ARM CPU manufacturers. Here's hoping that situation improves to give us more choice.