There are many applications in which particle filters outperform traditional signal processing algorithms. Some of these applications include tracking, joint detection and estimation in wireless communication, and computer vision. However, particle filters are not used in practice for these applications mainly because they cannot satisfy real-time requirements. This paper presents an efficient resampling architecture for parallel particle filtering. The proposed architecture is flexible such that it supports various modes of parallel resampling operations with up to four processing elements. The resampling algorithm is developed in order to compensate for possible error caused by finite precision quantization in the resampling step. Communication between the processing elements after resampling is identified as an implementation bottleneck, and therefore, concurrent buffering is incorporated in order to speed up communication of particles among processing elements. The flexible resampling mechanism is implemented in 0.35 μ m CMOS process and its complexity and performance are analyzed.