Several methods have been developed to measure dynamic functional connectivity (dFC) in fMRI data. These methods are often based on a sliding-window analysis, which aims to capture how the brain's functional organization varies over the course of a scan. Many studies aim to compare dFC across groups, such as younger versus older people. However, spurious group differences in measured dFC may be caused by other sources of heterogeneity between people. For example, the shape of the haemodynamic response function (HRF) and the level of measurement noise have been found to vary with age. We use a generic simulation framework for fMRI data to investigate the effect of such heterogeneity on estimates of dFC. Our findings show that, even when true dFC does not differ, individual differences in measured dFC can result from other (non-dynamic) features of the data, such as differences in neural autocorrelation, HRF shape, connectivity strength and measurement noise. We also find that common dFC methods, such as k-means and multilayer modularity approaches, can detect spurious group differences in dynamic connectivity when their hyperparameters are set inappropriately. fMRI studies therefore need to consider alternative sources of heterogeneity across individuals before concluding that dFC differs between them.
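As an illustration of the sliding-window idea, the following minimal Python sketch simulates two signals whose true correlation is constant over time and then estimates their correlation within successive windows. It is not the simulation framework used in this work: the AR(1) signal model, the function names (`simulate_static_pair`, `sliding_window_correlation`) and all parameter values are assumptions chosen for brevity, and the sketch omits HRF convolution and measurement noise.

```python
import numpy as np


def simulate_static_pair(n_timepoints, rho, phi, rng):
    """Simulate two AR(1) series with a fixed (static) correlation.

    Hypothetical toy model: both series share the autocorrelation
    coefficient phi, and their innovations have correlation rho, so the
    stationary correlation between the series is constant over time.
    """
    cov = np.array([[1.0, rho], [rho, 1.0]])
    innovations = rng.multivariate_normal(np.zeros(2), cov, size=n_timepoints)
    x = np.zeros((n_timepoints, 2))
    x[0] = innovations[0]
    for t in range(1, n_timepoints):
        x[t] = phi * x[t - 1] + innovations[t]
    return x


def sliding_window_correlation(x, window, step=1):
    """Pearson correlation between the two columns of x in each window."""
    starts = range(0, x.shape[0] - window + 1, step)
    return np.array([np.corrcoef(x[s:s + window].T)[0, 1] for s in starts])


# Compare the variability of windowed correlations for two (assumed)
# autocorrelation levels; the underlying connectivity is static in both.
rng = np.random.default_rng(0)
for phi in (0.0, 0.8):
    x = simulate_static_pair(300, rho=0.5, phi=phi, rng=rng)
    dfc = sliding_window_correlation(x, window=30)
    print(f"phi={phi}: std of windowed correlation = {dfc.std():.3f}")
```

Because each window contains only a finite number of samples, the windowed correlation fluctuates even though the underlying connectivity never changes. Varying `phi`, `rho` or `window` in this sketch gives a rough sense of how non-dynamic features of the data could alter such fluctuations.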