Accurate segmentation of anatomical structures in medical images is important in recent imaging based studies. In the past years, multi-atlas patch-based label fusion methods have achieved a great success in medical image segmentation. In these methods, the appearance of each input image patch is first represented by an atlas patch dictionary (in the image domain), and then the latent label of the input image patch is predicted by applying the estimated representation coefficients to the corresponding anatomical labels of the atlas patches in the atlas label dictionary (in the label domain). However, due to the generally large gap between the patch appearance in the image domain and the patch structure in the label domain, the estimated (patch) representation coefficients from the image domain may not be optimal for the final label fusion, thus reducing the labeling accuracy. To address this issue, we propose a novel label fusion framework to seek for the suitable label fusion weights by progressively constructing a dynamic dictionary in a layer-by-layer manner, where the intermediate dictionaries act as a sequence of guidance to steer the transition of (patch) representation coefficients from the image domain to the label domain. Our proposed multi-layer label fusion framework is flexible enough to be applied to the existing labeling methods for improving their label fusion performance, i.e., by extending their single-layer static dictionary to the multi-layer dynamic dictionary. The experimental results show that our proposed progressive label fusion method achieves more accurate hippocampal segmentation results for the ADNI dataset, compared to the counterpart methods using only the single-layer static dictionary.