Language is inherently combinatorial, and parallels of this combinatorial capacity are found in nonhuman systems, with animals combining sounds and calls into larger meaningful structures. However, further analogous examples are central to unveiling the diversity, distribution, and evolutionary drivers of combinatoriality. Here, we provide evidence for internal "meaning-refining" acoustic variation within a larger stereotyped signal in pied babblers (Turdoides bicolor). Using acoustic analyses, we demonstrate that males produce 2 long, raucous, "cry-like" structures, both starting with a wind-up segment that grades into repetitions of A/single-note or AB/double-note motifs. Behavioral observations indicated that, consistent with the similarities in their larger stereotyped structure, both variants function overall in recruiting group members during locomotion, but the internal A or AB substructure specifies the "precise" form of recruitment, from approaching the caller's announced location to following the caller over longer distances. Playbacks of cries from a stationary loudspeaker further supported the conclusion that the 2 variants elicit different responses, with more individuals approaching the loudspeaker in response to single-note cries than to double-note cries. Additionally, despite similarities in overall distance travelled, group movement was directional only in response to single-note cries, whereas it was undefined in response to double-note cries. We suggest that the overall structure of the 2 cry variants conveys the same general meaning, with embedded variation refining this meaning. These results further illustrate the variability of generative mechanisms outside of human language and lend support to the hypothesis that combinatorial structuring may have emerged in species with limited or fixed vocal repertoires in order to enhance communicative output.