
From Words to Sound: Neural Audio Synthesis of Guitar Sounds with Timbral Descriptors

This website is a companion to the paper From Words to Sound: Neural Audio Synthesis of Guitar Sounds with Timbral Descriptors, presented by The Sound of AI - Open Source Research group at the 3rd Conference on AI Music Creativity 2022. It contains a few examples of generated audio samples and describes how they were obtained.

You can try out the sound generation tool in our webapp.

Examples of audio samples generated by voice and text inputs

These examples were obtained using the end-to-end workflow of the app as described in the paper, by recording a voice prompt and generating the corresponding sound.

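Voice Query | Audio
Give me a bright guitar | (audio sample)
Give me a blue guitar sound | (audio sample)
Rich guitar tone | (audio sample)
Give me a dark metallic sound | (audio sample)
A soft acoustic sound | (audio sample)
A noisy percussive guitar | (audio sample)
Give me a warm hollow sound | (audio sample)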

Examples of audio samples generated by tweaking the timbral descriptors and latent variables

These examples showcase the sound design possibilities opened up by adjusting the sliders that represent timbral characteristics and latent space parameters.

Meaningful Parameters Manipulated | Audio
Very Low Attack, Very High Decay, No Low-Mid, Max Hi-Mids and High | (audio sample)
Max Inharmonicity, Odd harmonics, Very Short Decay | (audio sample)
Very high Z0, Long ATK, high RMS for ATK and DEC, Even Harmonics, lots of Mids and Hi-Mids | (audio sample)

Examples of audio samples generated by tweaking the latent variables

These examples showcase the sounds obtained by manipulating the Z parameters while keeping all the other timbral sliders unchanged. Each example is shown together with the initial sound and the prompt used to obtain it.

Prompt: “Mellow Hollow Sparse Sound”

z0 = 0.342, z1 = 1

Initial Generated Sample:

Z0 | Z1 | Resulting Sound
0.2 | 0.5 | (audio sample)
1.0 | 1.0 | (audio sample)
1.0 | 0.0 | (audio sample)

Prompt: “Metallic Guitar Sound”

z0 = 0.356, z1 = 0.486

Initial Generated Sample:

Z0 | Z1 | Resulting Sound
0.5 | 0.0 | (audio sample)
1.0 | 1.0 | (audio sample)

Prompt: “Bright Guitar”

z0 = 0.586, z1 = 0.11

Initial Generated Sample:

Z0 | Z1 | Resulting Sound
1.0 | 0.1 | (audio sample)
0.3 | 1.0 | (audio sample)
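
The latent-variable variations listed in this section can be thought of as copies of one parameter set in which only z0 and z1 change. The following is a minimal sketch of that idea, not the app's actual code: the `SynthParams` class, the `latent_variations` helper, and the timbral slider names and values are all hypothetical placeholders. Only the z0 and z1 numbers are taken from the "Bright Guitar" example on this page.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class SynthParams:
    """Hypothetical container for slider values (not the app's real API)."""
    z0: float
    z1: float
    # Placeholder for the timbral sliders, which stay fixed in these examples.
    timbral: dict = field(default_factory=dict)


def latent_variations(initial, z_pairs):
    """Return copies of `initial` in which only z0 and z1 are overridden."""
    return [replace(initial, z0=z0, z1=z1) for z0, z1 in z_pairs]


# z0 and z1 come from the "Bright Guitar" example; the timbral values are
# made-up placeholders used only to show that they are left untouched.
initial = SynthParams(z0=0.586, z1=0.11, timbral={"attack": 0.5, "decay": 0.5})
variations = latent_variations(initial, [(1.0, 0.1), (0.3, 1.0)])

for v in variations:
    unchanged = v.timbral == initial.timbral
    print(f"z0={v.z0}, z1={v.z1}, timbral sliders unchanged: {unchanged}")
```

Keeping the parameter set immutable and deriving each variation from the initial one makes it easy to compare every resulting sound against the same starting point.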