You're on the right track—yes, introducing multiple vanishing points is exactly the issue with rotating the whole camera instead of shifting the lens or sensor. Whether or not it's noticeable depends on the subject, but once you're working with lines, especially architecture or interiors, it can get messy fast.
Tilt-shift lenses absolutely can be used for flat stitching. When you shift left and right (or up and down), you're effectively mimicking the kind of parallel movement you'd get from a technical camera. And because the sensor stays put (nothing rotates), you avoid introducing new vanishing points. Stitching those images is usually seamless if everything's aligned and the subject is relatively flat.
Rotating 180° around the sensor axis, though (if you mean flipping the whole camera from one side to the other), can work in some cases, but that's still a rotation, not a lateral translation. You're likely to introduce subtle perspective shifts unless everything in the scene is at a uniform distance. So yes, it can work, especially with longer focal lengths and distant scenes, but it's a bit of a gamble if you're aiming for architectural precision.
Maybe Matt @MGrayson can add some diagrams to share here. It would be a great teaching aid for those exploring this thread.
Oooof. This is a hard one. I mean visually. Mathematically, it's simple, but connecting the math to the camera, lens, and final image is a mess. I'm going to start with the words and add diagrams later.
What's our goal? To create a single large image that our camera (sensor + lens) *can't* capture, and do this by combining a bunch of smaller images that it *can* capture. There are two cases here that differ HUGELY as to why our camera can't do this in a single capture:
1. Our lens's image circle is much larger than the sensor. A bigger sensor *could* give us the desired final image, but *our* sensor is too small.
2. Our lens's image circle just barely covers the sensor, and our lens isn't wide-angle enough.
The first case is what view/tech cameras and shift lenses provide. Those lenses have large image circles - the entire image is there already behind the lens, but the poor sensor is just too small to grab it. On an 8x10 camera, a 200mm lens is wide angle, and its image circle has to cover at least the 8x10 film. But a 200mm lens is a 200mm lens. A crop the size of your sensor from the center of that 8x10 image will look EXACTLY the same as what your camera would capture with a 200mm lens.
Solution? Keep the lens right where it is and move the sensor around "sampling" this larger image. The captures are flat rectangular crops of a single larger flat virtual image, so they can be combined easily with scotch tape or glue onto a large piece of paper. Hence the name "flat stitching". Note that the *lens* has to stay fixed and the *sensor* has to move around. If you fix the sensor and move the lens around, you're doing almost, but not exactly, the same thing. The large virtual image changes *slightly* each time the lens moves. Foreground objects will shift relative to background objects.
The lens makes the image. The sensor just samples it.
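If code makes that concrete for anyone, here's a rough Python sketch of flat stitching under some made-up assumptions: you know each frame's sensor shift in millimetres and the pixel pitch, the frames are all the same size, and exposure matches. The whole point is that every frame gets placed by a pure translation - no warping anywhere.

```python
import numpy as np

def flat_stitch(frames, pixel_pitch_mm):
    """frames: list of (img, shift_x_mm, shift_y_mm), where img is an HxWx3
    array and the shifts are the sensor's offsets from the lens axis.
    The lens never moves, so every frame is a translated crop of the same
    virtual image; assembling them needs nothing but offsets."""
    h, w = frames[0][0].shape[:2]

    # Physical sensor shift -> pixel offset on the shared image plane.
    # (The sign convention here is arbitrary; flip it to taste.)
    offsets = [(int(round(sx / pixel_pitch_mm)), int(round(sy / pixel_pitch_mm)))
               for _, sx, sy in frames]

    xs = [ox for ox, _ in offsets]
    ys = [oy for _, oy in offsets]
    canvas = np.zeros((max(ys) - min(ys) + h, max(xs) - min(xs) + w, 3),
                      dtype=frames[0][0].dtype)

    for (img, _, _), (ox, oy) in zip(frames, offsets):
        y0, x0 = oy - min(ys), ox - min(xs)
        canvas[y0:y0 + h, x0:x0 + w] = img   # scotch tape, in array form
    return canvas
```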
In the second (more typical) case, our only option is to point the camera around like a flashlight "illuminating" the desired scene. (That metaphor is clear, right? The flashlight - with the same spread as our lens's FoV - just sends the light in the opposite direction, so what it illuminates is exactly what the camera captures.) There are two problems here. First, now that the lens *must* move, how do we guarantee that the images could, even theoretically, combine to make our desired final image (foreground can't move against background)? And second, *how* in God's name do we combine them afterwards?
If everything is sufficiently far away we can ignore the first problem as there *is* no foreground. Otherwise, we have to be very careful to rotate the camera about - well - the point that the camera "sees" through. For a pinhole camera, it's the pinhole. For a very simple lens, it's the aperture. For a real-life modern lens, it's the entrance pupil (photographers usually call it the "nodal point", though strictly speaking the nodal points are something else). It's where *you* see the aperture when you're looking into the lens from the front. Rotating around that point swings the aperture, but doesn't change the Point of View (it literally *is* the point of view). A nodal rail slides the camera back far enough that this point sits over the tripod head's rotation axis, so that's what you end up rotating around. (For tilting up and down, this will be a problem unless you use special equipment, e.g., a gimbal mount - never mind that for now.)
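A tiny numerical sanity check of that claim, with a toy pinhole camera I'm making up for the purpose: put a near point and a far point on the same line of sight, then rotate the camera about the pinhole versus sliding it sideways. The rotation keeps them stacked on one pixel; the slide pulls them apart (parallax).

```python
import numpy as np

def project(point, cam_pos, cam_R, f=1.0):
    """Toy pinhole projection: world point -> (x, y) on an image plane at
    distance f, for a camera at cam_pos with orientation matrix cam_R."""
    p = cam_R.T @ (point - cam_pos)        # world -> camera coordinates
    return f * p[:2] / p[2]

def rot_y(deg):
    t = np.radians(deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

near = np.array([0.0, 0.0, 2.0])     # two points on the same ray from the pinhole
far  = np.array([0.0, 0.0, 10.0])
origin, I = np.zeros(3), np.eye(3)

print(project(near, origin, I), project(far, origin, I))        # identical pixels
print(project(near, origin, rot_y(10)),
      project(far, origin, rot_y(10)))                          # still identical: no parallax
shift = np.array([0.1, 0.0, 0.0])
print(project(near, shift, I), project(far, shift, I))          # now they split apart
```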
Once we guarantee that all our sample images come from the same viewpoint, we have to stitch them together. Now we have the "changing vanishing point" problem. When you ask the computer to do this (and if the computer is in a good mood) you see weirdly distorted shapes that don't look at all like the rectangular images you thought you were seeing during capture. Remember our goal! We want to reconstruct the flat image that we *would* see if the lens's image circle were larger and we had a larger sensor. That flat image lies on a single plane. But now we're *changing* the plane of the sensor each time we rotate the camera. The weird shapes you see from the stitching programs are exactly what you get when you project a rectangle in one plane (the current sensor) onto a non-parallel second plane (the final image's plane). The shape is weird, but the content isn't. This is a piece of exactly what you want to see. It just might not be what you *thought* you were seeing - and the distortion affects resolution, sometimes drastically.
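For the geometrically inclined, that projection is short enough to write down. A sketch (Python again, with R standing for the camera's rotation about the no-parallax point and f the focal length, both assumed known): take a pixel on the rotated sensor, turn it into a ray, and find where that ray pierces the reference plane of the final image. Run the four corners of a frame through it and you get the stretched quadrilateral the stitcher shows you.

```python
import numpy as np

def to_reference_plane(u, v, f, R):
    """Pixel (u, v) on a sensor rotated by R (about the entrance pupil) ->
    the point where its ray crosses the un-rotated reference plane z = f."""
    ray = R @ np.array([u, v, f])        # the ray's direction in the reference frame
    return f * ray[:2] / ray[2]          # scale it until it hits z = f

# Four corners of a 36x24 mm frame, f = 50 mm, camera swung 30 degrees:
t = np.radians(30)
R = np.array([[np.cos(t), 0, np.sin(t)],
              [0, 1, 0],
              [-np.sin(t), 0, np.cos(t)]])
for u, v in [(-18, -12), (18, -12), (18, 12), (-18, 12)]:
    print(to_reference_plane(u, v, 50, R))   # not a rectangle any more
```

One side of the frame gets stretched far more than the other, which is exactly the resolution hit mentioned above.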
How does the computer do this? I'll save it for a later post - and after I've made some diagrams. Apologies, Darr.
Matt
Ok, I'll give away the secret. Flat stitching is easy because shifting is an "isometry" of the plane (literally "same-distance") - it moves things rigidly without changing their shape or size. So fitting the pieces together is a simple matter of alignment. Rotation about a point, however, is an isometry of a sphere centered around that point. Project all your images onto that sphere, then slide them around until they align (remember, the computer doesn't know which way you were aiming the camera, so it has to move things around to find out where they go in the final image). Sliding them around on the sphere does not change their size and shape and, more importantly, where two images overlap, they *look identical* - just like the flat stitch pieces did. Reread that sentence until it is part of your DNA*. So the computer perfectly aligns the spherical segments (they look like bulging rectangles) and then projects the resulting mess back onto the original plane. Tadaah! Stitched image.
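In code, that whole pipeline is shorter than the paragraph above. A sketch with the same toy setup (each frame i has an assumed-known orientation R_i and focal length f): pixels go out to unit vectors on the sphere, rotations just slide things rigidly around that sphere, and at the end everything is projected back down to the reference plane.

```python
import numpy as np

def pixel_to_sphere(u, v, f, R):
    """Pixel (u, v) from a frame with orientation R -> a unit vector on the
    viewing sphere. Two frames that saw the same scene point land on the
    same unit vector, whichever way the camera was aimed - that is the
    'look identical on the sphere' fact."""
    d = R @ np.array([u, v, f])
    return d / np.linalg.norm(d)

def sphere_to_plane(d, f):
    """Unit vector on the sphere -> point on the flat reference plane z = f.
    Only directions in front of the camera (d[2] > 0) land on the plane."""
    return f * d[:2] / d[2]
```

The real stitcher's hard work is estimating the R_i by matching features in the overlaps (plus blending and lens-distortion correction); the geometry itself is just these two little maps.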
(And if you're only doing a one-row stitch, you can project onto a cylinder, as rotation about a fixed axis is also an isometry (rigid motion) of the cylinder. Stitching programs will often offer this choice. Cylinders can be projected onto a plane, or unrolled. The latter is what a 360-degree panorama camera, like the Noblex, does. These days they're called spinners.)
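Same idea for the one-row case, if you want it as a formula. A sketch of the plane-to-cylinder map, assuming the rotation axis is vertical, passes through the no-parallax point, and the cylinder radius is set to f:

```python
import numpy as np

def plane_to_cylinder(u, v, f):
    """Sensor point (u, v), focal length f -> unrolled cylinder coordinates.
    Rotating the camera about the vertical axis only shifts theta, so
    single-row frames align by sliding sideways on the unrolled cylinder -
    which is what a swing-lens or spinner camera records directly."""
    theta = np.arctan2(u, f)            # angle around the rotation axis
    h = f * v / np.hypot(u, f)          # height on a cylinder of radius f
    return f * theta, h                 # (horizontal position when unrolled, height)
```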
* It is not true of the original pictures you took because of, e.g., converging lines when you change the camera angle. After projection onto the sphere, all those differences go away. Yes, this is a miracle.