Although the left occipito-temporal region is consistently activated during visual word form processing across numerous studies in different languages, the mechanisms by which second-language word forms are processed in this region remain unclear. To examine this question, 16 Chinese-English and 14 English-Chinese late bilinguals performed lexical decision tasks on visually presented words in both their native and second languages (L1 and L2) during functional magnetic resonance imaging. Here we demonstrate that visual word form processing for L1 versus L2 engaged spatially distinct areas of the left occipito-temporal region. Specifically, activation in this region was more medial and posterior for L2 than for L1 in Chinese-English bilinguals, whereas it was more lateral and anterior for L2 than for L1 in English-Chinese bilinguals. In addition, for Chinese-English bilinguals, higher L2 proficiency was correlated with more lateral recruitment of the occipito-temporal region, suggesting that higher L2 proficiency is associated with greater involvement of L1-preferred mechanisms. For English-Chinese bilinguals, higher L2 proficiency was correlated with more lateral and anterior activation of the occipito-temporal region, suggesting that higher L2 proficiency is associated with greater involvement of L2-preferred mechanisms. Taken together, our results indicate that L1 and L2 recruit spatially different areas of the occipito-temporal region in visual word processing when the two scripts belong to different writing systems, and that the spatial organization of this region for L2 visual word processing is dynamically modulated by L2 proficiency. Specifically, higher L2 proficiency in Chinese-English bilinguals is associated with assimilation to native-language mechanisms, whereas higher L2 proficiency in English-Chinese bilinguals is associated with accommodation to second-language mechanisms.