Bagging (Breiman, 1994a) is a technique that tries to improve a learning algorithm's performance by using bootstrap replicates of the training set (Efron & Tibshirani, 1993, Efron, 1979). The computational requirements for estimating the resultant generalization error on a test set by means of cross-validation are often prohibitive, for leave-one-out cross-validation one needs to train the underlying algorithm on the order of mν times, where m is the size of the training set and ν is the number of replicates. This paper presents several techniques for estimating the generalization error of a bagged learning algorithm without invoking yet more training of the underlying learning algorithm (beyond that of the bagging itself), as is required by cross-validation-based estimation. These techniques all exploit the bias-variance decomposition (Geman, Bienenstock & Doursat, 1992, Wolpert, 1996). The best of our estimators also exploits stacking (Wolpert, 1992). In a set of experiments reported here, it was found to be more accurate than both the alternative cross-validation-based estimator of the bagged algorithm's error and the cross-validation-based estimator of the underlying algorithm's error. This improvement was particularly pronounced for small test sets. This suggests a novel justification for using bagging—more accurate estimation of the generalization error than is possible without bagging.