A dynamic treatment regime (DTR) is a set of decision rules applied across multiple stages of treatment. The rules are tailored to the individual: at each stage, they take an individual's observed characteristics as input and return a treatment decision for that individual. Dynamic weighted ordinary least squares (dWOLS) is a theoretically robust and easily implemented method for estimating an optimal DTR. As with many related DTR methods, the dWOLS estimators of treatment effects can be non-regular when the true treatment effects are zero or very small, which invalidates Wald-type and standard bootstrap confidence intervals. Motivated by an analysis of the effect of diet in infancy on measures of weight and body size in later childhood, a setting in which the exposure is distant in time and its effect is likely to be small, we investigate the m-out-of-n bootstrap, used with dWOLS, as a means of obtaining valid inference for the optimal DTR. We report an extensive simulation study comparing choices of the resample size m in settings where the treatment effect estimators are likely to be non-regular. We illustrate the methodology using data from the PROmotion of Breastfeeding Intervention Trial to study the effect of solid food intake in infancy on long-term health outcomes.
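To make the resampling idea concrete, the following is a minimal Python sketch of an m-out-of-n bootstrap confidence interval for a generic scalar estimator. It is not the dWOLS analysis itself: the function name `m_out_of_n_ci`, the toy estimator (the absolute value of a sample mean, which is non-regular when the true mean is zero), the simulated data, and the fixed resample size m are all assumptions made for illustration. The interval uses the common construction in which the bootstrap distribution of sqrt(m)(estimate on resample - estimate on full data) stands in for the sampling distribution of sqrt(n)(estimate - truth).

```python
import numpy as np


def m_out_of_n_ci(data, estimator, m, n_boot=1000, alpha=0.05, seed=0):
    """Sketch of an m-out-of-n bootstrap interval for a scalar estimator.

    Hypothetical helper for illustration; `estimator` maps an array of
    observations to a single number.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= n")

    theta_hat = estimator(data)

    # Draw m observations with replacement and re-estimate; store the
    # root sqrt(m) * (theta*_m - theta_hat) for each resample.
    roots = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=m)
        roots[b] = np.sqrt(m) * (estimator(data[idx]) - theta_hat)

    # Invert the quantiles of the root, rescaling by sqrt(n).
    q_lo, q_hi = np.quantile(roots, [alpha / 2, 1 - alpha / 2])
    return theta_hat - q_hi / np.sqrt(n), theta_hat - q_lo / np.sqrt(n)


# Toy non-regular example (assumed, not from the paper): |mean| with true mean 0.
sample = np.random.default_rng(1).normal(loc=0.0, scale=1.0, size=200)
lower, upper = m_out_of_n_ci(sample, lambda x: abs(x.mean()), m=100)
print(f"95% m-out-of-n interval: ({lower:.3f}, {upper:.3f})")
```

Setting m equal to n recovers a standard n-out-of-n bootstrap of the same form. The value m = 100 here is arbitrary, so the sketch says nothing about how the resample size should be chosen.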