At the onset of vocal development, both songbirds and humans produce variable vocal babbling with broadly distributed acoustic features. Over development, these vocalizations differentiate into the well-defined, categorical signals that characterize adult vocal behaviour. A broadly distributed signal is ideal for vocal exploration, that is, for matching vocal production to the statistics of the sensory input. The developmental transition to categorical signals is a gradual process during which the vocal output becomes differentiated and stable. But does it require categorical input?We trained juvenile zebra finches with playbacks of their own developing song, produced just a few moments earlier, updated continuously over development. Although the vocalizations of these self-tutored (ST) birdswere initially broadly distributed, birds quickly developed categorical signals, as fast as birds that were trained with a categorical, adult song template. By contrast, siblings of those birds that received no training (isolates) developed phonological categories much more slowly and never reached the same level of category differentiation as their ST brothers. Therefore, instead of simply mirroring the statistical properties of their sensory input, songbirds actively transform it into distinct categories. We suggest that the early self-generation of phonological categories facilitates the establishment of vocal culture by making the song easier to transmit at the micro level, while promoting stability of shared vocabulary at the group level over generations.