This is probably not the right forum for your question, but I though someone might want to see one way to approach a problem like this. There are lots of things like this going on in the RAW converters / retouching tools we use for photo editing.
I am too lazy to dig up any literature for you. Problems like this is probably best understood by trying yourself first. A lot of it is art rather than hard science.
I think you are trying to use the wrong tool for the job: autocorrelation is a statistical measure and is mostly useful for, say, characterizing noise. Sometimes it can work for other problems, just because the math turns out to yield the same equations. That is just beginners luck, and it won't work well in this case.
First, when you compute the autocorrelation (and ignore the statistical definition) for a given shift you essentially do the following:
Here I2 is the same as I1, except it is shifted.
c = I1(:)'*I2(:) / length(I1(:))
The correlation coefficient, c, is "big" when the image and its shifted version are similar, but you need to be very careful about what "big" means. As you already found out.
[Warning: I do not have Matlab on this computer, so this is just from the top of my head. ]
You are better off doing some more normalization; to see if the images I1 and I2 are similar (i.e. I2 is a shifted version of I1), do something like this:
The first part is essentially the same as the correlation coefficient; for positive images this will be between 0 and 1. You only get 1 if I2 is a scaled version of I1.
s = (I1(:)'*I2(:)) / sqrt(I1(:)'*I1(:) * I2(:)'*I2(:))
Now, instead of inspecting the entire image which only works in trivial cases. Rather, you normally want to find out if you have two (or more) locations that looks suspiciously alike.
Rather you would start by extracting a patch (say, 11x11 pixels) at location (x,y):
Now you want to compute "s"... for each pixel in the image. You could loop and apply the above, or you could use convolution. First the numerator:
P = img(y+1:y+11, x+1:x+11);
This is essentially what you get when you use xcorr2, except it only applies a small part of the image (P). This is why the autocorrelation almost works.
num = conv2(img, P, 'same');
The denominator can be computed as follows:
Now you get s pixel for pixel:
den = sqrt(P(:)'*P(:) * conv2(img.*img, ones(11,11), 'same'));
It will have a "1" at the location where P was extracted.
Now you just need to threshold it to see if there are other large values.
- - -
Of course, you need to do this for many possible location of the patch P:
[ Compute s and find large values ]
This will be very expensive. Essentially, this is a big part of many of the video encoding algorithms. These does not usually look in the entire image, but rather limits the search to the neighborhood where P was extracted.
Since it is so expensive you can get hardware that is specialized for it. Now you could also use graphics cards or similar to make it more tolerable.
- - -
A more reasonable way to proceed is to be smarter about which locations you compare. Essentially, you want to compute s only at locations that are similar.
[ Easy case: if you only want to locate places with exact copies, just compute a hash of the pixel values. ]
For something that is reasonable to toy with in Matlab, invent some way to characterize your square.
Say we want to checks locations with roughly the same brightness. In that case we can start from the average:
The rest is just usual Matlab stuff:
mu = conv2(img, ones(11,11)/121, 'same');
This is a bit difficult to do well in Matlab, but binary search should work well.
[H,W] = size(img);
[mu_s, inx] = sort(mu);
xx = repmat((1:W), [H,1]); xx = xx(inx);
yy = repmat((1:H)', [1,W]); yy = yy(inx);
[ Find indices so mu_s(i) is close to mu(y,x) ]
[ For each index compute s with patch at xx(i),yy(i)]
Hope this was useful to someone.