We present a variational model for fusing an arbitrary number of images while preserving their salient information and enhancing contrast for visualization. We propose to use the structure tensor to describe the geometry of all the inputs simultaneously. The basic idea is that the structure tensor of the fused image should approximate the structure tensor obtained from the multiple inputs. At the same time, the fused image should appear ‘natural’ and ‘sharp’ to a human interpreter. We therefore combine the geometry merging of the inputs with perceptual enhancement and intensity correction. This is achieved by minimizing a functional that implicitly takes into account a set of human vision characteristics.
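To make the notion of a structure tensor shared by several inputs concrete, the following is a minimal sketch, not the model itself. It assumes the common Di Zenzo-style construction, in which the per-pixel tensor is the sum over all inputs of the outer products of their gradients; the abstract does not state the exact construction, and the function name `structure_tensor` and the use of NumPy finite differences are illustrative choices.

```python
import numpy as np

def structure_tensor(images):
    """Per-pixel 2x2 structure tensor of a stack of equally sized 2-D images.

    Assumption: the tensor is the sum over images of grad(I) grad(I)^T
    (Di Zenzo-style); the source does not specify the construction.
    Returns the three distinct entries (jxx, jxy, jyy), each of shape (H, W).
    """
    stack = np.asarray(images, dtype=float)       # shape (N, H, W)
    # Finite-difference derivatives along rows (y) and columns (x).
    gy, gx = np.gradient(stack, axis=(1, 2))
    jxx = (gx * gx).sum(axis=0)
    jxy = (gx * gy).sum(axis=0)
    jyy = (gy * gy).sum(axis=0)
    return jxx, jxy, jyy

# Two inputs with purely horizontal gradients of magnitude 1 and 2.
ramp = np.tile(np.arange(5.0), (4, 1))
jxx, jxy, jyy = structure_tensor([ramp, 2.0 * ramp])
```

In this toy case the horizontal entry `jxx` accumulates the squared gradients of both inputs, while `jxy` and `jyy` vanish because neither input varies vertically. A fused image whose own tensor matches these entries would reproduce the combined edge strength and orientation of the inputs.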