Parsing multisensory information from a complex external environment is a fundamental skill for all organisms. However, different organizational schemes currently exist for how multisensory information is processed in human (supramodal; organized by cognitive demands) versus nonhuman primate (organized by modality and cognitive demands) lateral prefrontal cortex (LPFC). Functional magnetic resonance imaging results from a large cohort of healthy controls (N = 64; Experiment 1) revealed a rostral-caudal stratification of LPFC for auditory versus visual attention during an audio-visual Stroop task. The stratification existed despite behavioral and functional evidence of increased interference from visual distractors. Increased functional connectivity was also observed between rostral LPFC and auditory cortex across independent samples (Experiments 2 and 3) and multiple methodologies. In contrast, the caudal LPFC was preferentially activated during visual attention but functioned in a supramodal capacity for resolving multisensory conflict. The caudal LPFC also did not exhibit increased connectivity with visual cortices. Collectively, these findings closely mirror previous nonhuman primate studies suggesting that visual attention relies on flexible use of a supramodal cognitive control network in caudal LPFC, whereas rostral LPFC is specialized for directing attention to auditory inputs (i.e., human auditory fields).