You definitely don't need a bunker - you could shoot it handheld on a fairground ride if you want - but the less movement, the better the results will be.
Your photos clearly show an improvement over a single image. That said, the uprezzed image isn't a million miles away - smooth out the noise, get rid of the halos and sharpen it appropriately and it would be very close. In terms of resolution, though, it's not as impressive a (relative) 'jump' as I've seen between Olympus hi-res shots and the uprezzed (Olympus) originals (though of course both are much lower resolution than what's possible with the 100MP Fuji), which again leads me to think that movement is an issue.
While I'm sure Fuji has some clever averaging going on in software to help deal with the issue, I still stand by the notion that it's pretty much impossible to stop the camera moving by a few thousandths of a millimetre during one exposure, let alone 16, and this movement has to impact the final image. Years ago I bolted a Leica laser down to a huge Foba studio stand and pointed it at a thin piece of frosted plastic a few feet away. On the other side of the plastic I set up a camera on a studio stand and filmed the dot. When I played the video back I could see the dot moving (not much, but enough). Or was it the camera moving? Or both? You see the problem. Since then I've always taken a camera manufacturer's MP claim with a pinch of salt (as in, if it says 50MP you're obviously going to get 50MP, but what those pixels actually represent is a bit up in the air - more so if it's a windy day).
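To put "a few thousandths of a millimetre" in perspective, here's a rough back-of-envelope sketch. The sensor width (43.8mm) and horizontal pixel count (11648) are my assumptions for a 100MP medium format sensor, not figures from this thread, and it treats the movement as a straight shift of the image at the sensor plane, which is a simplification (rotation and magnification will change the real numbers).

```python
# Back-of-envelope: how big is a few microns of movement in pixel terms?
# ASSUMPTIONS (not from the post): a 43.8 mm wide sensor with 11648 pixels
# across, and movement treated as a pure image shift at the sensor plane.
SENSOR_WIDTH_MM = 43.8
PIXELS_ACROSS = 11648


def pixel_pitch_um(sensor_width_mm, pixels_across):
    """Return the centre-to-centre pixel spacing in microns."""
    return sensor_width_mm * 1000.0 / pixels_across


def movement_in_pixels(movement_um, pitch_um):
    """Express an image shift in microns as a fraction of one pixel."""
    return movement_um / pitch_um


pitch = pixel_pitch_um(SENSOR_WIDTH_MM, PIXELS_ACROSS)
for movement_um in (1.0, 2.0, 3.0):
    shift_px = movement_in_pixels(movement_um, pitch)
    print(f"{movement_um:.0f} um of movement = {shift_px:.2f} px "
          f"(pixel pitch {pitch:.2f} um)")
```

On those assumed numbers the pixel pitch comes out under 4 microns, so a shift of 2 microns is already more than half a pixel, which is in the same ballpark as the sub-pixel steps a multi-shot mode relies on.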
All said and done, the effect of that movement is impossible to quantify precisely outside of a lab, so it's something of a moot point in terms of practical day-to-day use. Ultimately, if the photographer is happy with the image, that's all that matters.