godmother

The song "Godmother" uses stems from Jlin and Holly's track "Expand" put through a voice transformer network trained to turn Mat's voice into Holly's voice. It's called "Godmother" because the word Göttin came up in a Swiss article about Spawn. The song's music video was produced from pictures of Jlin and Holly taken for the album cover, which is a composite of the entire ensemble.

godmother (album mix)

Since it was 2018 and pix2pix was the only thing that worked, I tracked down a pix2pix vocoder called become-yukarin, which had been trained on hours of vocaloid data. Holly and Mat came up with a speech training set with broad phonetic coverage, as well as a singing dataset - the more the merrier.

The neural vocoder would divide a sound into harmonic and inharmonic spectra, plus a fundamental frequency (f0), then train a pix2pix network on these components. I experimented with sample rates, test data, super-resolution (sr), and f0 transfer - the latter being the most useful, as you could pitch up a recording into another register.
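To make the f0 transfer idea concrete, here is a minimal sketch of the two kinds of f0 handling that the sample labels further down refer to ("copy f0, x1.2", "copy f0, x2", "convert f0"). It is not taken from become-yukarin. The function names are hypothetical, the f0 contour is assumed to be a list of per-frame values in Hz with 0 marking unvoiced frames, and the conversion uses log-domain mean and variance matching, a common voice-conversion technique that may or may not be what was actually used here.

```python
import math
from statistics import mean, pstdev


def scale_f0(f0, factor):
    """Copy the source f0 contour, multiplying voiced frames by `factor`.

    Unvoiced frames (assumed to be marked with 0) stay at 0.
    A factor of 2 shifts the contour up by one octave.
    """
    return [f * factor if f > 0 else 0.0 for f in f0]


def log_f0_stats(f0):
    """Mean and standard deviation of log-f0 over voiced frames.

    Assumes the contour contains at least one voiced frame.
    """
    logs = [math.log(f) for f in f0 if f > 0]
    return mean(logs), pstdev(logs)


def convert_f0(f0, source_stats, target_stats):
    """Map a source contour onto the target speaker's log-f0 statistics.

    Hypothetical helper: each voiced frame is normalized with the source
    speaker's log-f0 mean/std and re-expanded with the target speaker's.
    """
    src_mean, src_std = source_stats
    tgt_mean, tgt_std = target_stats
    # Fall back to a pure shift if the source contour has no variation.
    ratio = tgt_std / src_std if src_std > 0 else 1.0
    converted = []
    for f in f0:
        if f > 0:
            converted.append(math.exp((math.log(f) - src_mean) * ratio + tgt_mean))
        else:
            converted.append(0.0)
    return converted
```

In this sketch, `scale_f0(contour, 2)` corresponds to pitching a recording up into another register, while `convert_f0` moves the contour into the target speaker's typical range using statistics gathered from each speaker's training recordings.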

I believe most of the texture of Godmother is a result of the vocoder, although the pix2pix component does make the voices sound more like Holly. The stems were in stereo, so I processed them individually, resulting in a lovely stereo field as these insect-like voices fly around your head.
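As a rough illustration of the stereo handling, the sketch below reads "processed them individually" as running each channel through the mono conversion on its own, which is an assumption. The `convert` argument is a hypothetical stand-in for the whole vocoder and pix2pix pipeline, and `process_stereo` is an illustrative name, not something from the original code.

```python
def process_stereo(frames, convert):
    """Run a mono conversion on each channel of a stereo signal.

    frames:  list of (left, right) sample pairs
    convert: hypothetical callable taking a list of mono samples and
             returning a list of converted mono samples
    """
    left = [l for l, _ in frames]
    right = [r for _, r in frames]

    # Each channel is converted independently, so any differences the
    # network introduces between channels end up in the stereo image.
    out_left = convert(left)
    out_right = convert(right)

    # Guard against the two outputs differing in length.
    n = min(len(out_left), len(out_right))
    return list(zip(out_left[:n], out_right[:n]))
```

Because the two channels never see each other, small differences in how each one is resynthesized become differences in the stereo field, which is one plausible source of the voices appearing to move around.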

speaking + singing network, 44k

network trained on a combination of speaking + singing datasets

br_mat 00 br_mat 01 br_mat 02 br_mat 03 br_mat 04
br_lyra 00 br_lyra 01 br_lyra 02 br_lyra 03 br_lyra 04

fling_across_the_yard holly_2018_master jlin_drums_summed
jlin_mix holly_2_master xpand_master

singing 20 singing 21 singing 22 singing 23 singing 24
singing 25 singing 26 singing 27 singing 28 singing 29

voice 20 voice 21 voice 22 voice 23 voice 24
voice 25 voice 26 voice 27 voice 28 voice 29

speaking, 44k

speaking conversion + speaking super resolution

br_mat 00 br_mat 01 br_mat 02 br_mat 03 br_mat 04
br_lyra 00 br_lyra 01 br_lyra 02 br_lyra 03 br_lyra 04
fling_across_the_yard holly_2018_master jlin_drums_summed

singing dataset thru speaking network

singing 20 singing 21 singing 22 singing 23 singing 24
singing 25 singing 26 singing 27 singing 28 singing 29

speaking network with converted f0

voice 20 voice 21 voice 22 voice 23 voice 24
voice 25 voice 26 voice 27 voice 28 voice 29

speaking dataset with input f0

voice 20 voice 21 voice 22 voice 23 voice 24
voice 25 voice 26 voice 27 voice 28 voice 29

speaking conversion + singing super-resolution

br_mat 00 br_mat 01 br_mat 02 br_mat 03 br_mat 04
br_lyra 00 br_lyra 01 br_lyra 02 br_lyra 03 br_lyra 04
fling_across_the_yard holly_2018_master jlin_drums_summed

monotone singing, 44k

singing conversion using source f0

br_mat 00 br_mat 01 br_mat 02 br_mat 03 br_mat 04
br_lyra 00 br_lyra 01 br_lyra 02 br_lyra 03 br_lyra 04
fling_across_the_yard holly_2018_master jlin_drums_summed

singing conversion using transformed (i.e. monotone) f0

br_mat 00 br_mat 01 br_mat 02 br_mat 03 br_mat 04
br_lyra 00 br_lyra 01 br_lyra 02 br_lyra 03 br_lyra 04
fling_across_the_yard holly_2018_master jlin_drums_summed

mat → holly dataset

singing 00 singing 01 singing 02 singing 03 singing 04
singing 05 singing 06 singing 07 singing 08 singing 09
singing 10 singing 12 singing 13 singing 14 singing 15
singing 16 singing 17 singing 18 singing 19 singing 20
singing 21 singing 22 singing 23 singing 24 singing 25
singing 26 singing 27 singing 28 singing 29 singing 30
singing 31 singing 32 singing 33 singing 34 singing 35
singing 36 singing 37 singing 38 singing 40 singing 41
singing 42

speaking, 24k

copy f0 from source recording

mat talking input copy f0 copy f0, x1.2 copy f0, x2 convert f0 convert f0, x2
mat singing input copy f0 copy f0, x1.2 copy f0, x2 convert f0 convert f0, x2
lyra singing input copy f0 copy f0, x1.2 copy f0, x2 convert f0 convert f0, x2
jlin music input copy f0 copy f0, x1.2 copy f0, x2 convert f0 convert f0, x2

phrase 1: "a moth zig-zagged..."

mat mat + sr mat → holly mat → holly + sr
holly → mat holly → mat + sr holly holly + sr

phrase 2: "I assume moisture..."

mat mat + sr mat → holly mat → holly + sr
holly → mat holly → mat + sr holly holly + sr

mat tests

mat test 1 mat → holly mat → holly + sr
mat test 2 mat → holly mat → holly + sr
mat test 3 mat → holly mat → holly + sr
mat test 4 mat → holly mat → holly + sr
mat test 5 mat → holly mat → holly + sr
mat test 6 mat → holly mat → holly + sr

br song tests

br mat → holly + sr #1 + sr #2 + sr #3 + sr #4 + sr #5
br lyra → holly + sr #1 + sr #2 + sr #3 + sr #4 + sr #5

jlin tests

jlin → holly #1 + sr
jlin → holly #2 + sr
jlin → holly #3 + sr
