The present study examines the articulation and acoustics of the typologically rare and understudied 'whistled' fricative in Xitsonga, a Southern Bantu language. Using ultrasound imaging and video recording, we examine the lingual and labial articulation of the whistled fricative. For the acoustic analysis, we employ multitaper spectral analysis, which yields reliable spectral estimates. The results reveal an interplay between multiple articulators in the production of the sound: a retroflex lingual gesture and a narrowing of the lower lip toward the upper teeth. Acoustically, the spectra of the whistled fricative are more peaked and compact than those of the acoustically similar palatoalveolar fricative, and the differences emerge most clearly in two acoustic parameters, dynamic amplitude (Ad) and M2 (variance). The acoustic differences are also reflected in F2 and F3 of the surrounding vowels. Additionally, the 'whistled' fricative in Xitsonga is not quite whistled, contrary to the label given to the sound in previous studies. Building on these articulatory and acoustic results, we discuss two aerodynamic models for whistled fricatives in Southern Bantu languages and conclude that the whistled fricative in Xitsonga is best characterized as a retroflex segment accompanied by weak whistling.