Research on an Animal Speech and Singing Synthesis Model


In recent years, artificial intelligence has made significant advances in music and sound generation. Deep learning has enabled computers to produce highly complex and diverse music and sound effects, opening new paths for composition and audio production: a variety of widely used AI music platforms generate compositions across many styles and levels of complexity, and real-time audio processing lets AI algorithms generate or transform sound on the fly for live streaming, gaming, and virtual reality.

Out of an interest in combining this emerging technology with sound art, I taught myself some programming and algorithmic skills over spring break and experimented with running some of the open-source model frameworks available on GitHub. One of them, a timbre-imitation (singing voice conversion) framework called So-VITS-SVC, appealed to me, so I tried something a little more exotic with it: I trained the model on recordings of animal calls I had made myself, which let me explore voice generation models that mimic the timbre of a particular animal's vocalizations. I wrote this experiment up as a paper, and on my first submission attempt I was pleased to have it accepted by the 6th CONF-CDS conference. I will also be applying this technique to part of the creation of this Creative Sound Project EL2.
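One practical step in training a voice conversion model on field recordings is slicing long takes into short voiced clips and discarding the silence between calls. The sketch below is a minimal, illustrative energy-based splitter in plain Python, not the actual So-VITS-SVC preprocessing tooling; the function name, frame size, and threshold are my own assumptions:

```python
def slice_on_silence(samples, frame_size=1024, threshold=0.02, min_frames=2):
    """Split a mono waveform (floats in [-1, 1]) into voiced segments.

    Frames whose RMS energy falls below `threshold` count as silence;
    runs of at least `min_frames` consecutive voiced frames become one
    segment. Returns (start, end) sample indices for each segment.
    """
    segments = []
    start = None      # sample index where the current voiced run began
    n_frames = 0      # voiced frames seen in the current run
    for i in range(0, len(samples), frame_size):
        frame = samples[i:i + frame_size]
        rms = (sum(s * s for s in frame) / len(frame)) ** 0.5
        if rms >= threshold:
            if start is None:
                start = i
            n_frames += 1
        else:
            # Silence: close out the current run if it was long enough.
            if start is not None and n_frames >= min_frames:
                segments.append((start, i))
            start, n_frames = None, 0
    if start is not None and n_frames >= min_frames:
        segments.append((start, len(samples)))
    return segments


if __name__ == "__main__":
    import math
    # Synthetic example: 4096 samples of silence, a 440 Hz "call",
    # then silence again (16 kHz sample rate assumed).
    silence = [0.0] * 4096
    call = [0.5 * math.sin(2 * math.pi * 440 * t / 16000) for t in range(4096)]
    samples = silence + call + silence
    print(slice_on_silence(samples))  # one segment covering the call
```

In practice one would run something like this over each field recording, write every returned segment out as its own audio file, and use those short clips as the training set for the voice conversion model.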

