Transformation

pyvoimooo.pvoccombo(in, fs[, ps][, ps_max][, fw][, pe][, wm_gain][, efx_smile_alpha][, efx_smile_alpha_high_shelf][, efx_inteligibility_scaling][, efx_inteligibility_e0db][, efx_inteligibility_e0db_autobias][, mode][, specs][, ampenvs])

The Phase Vocoder with effect Combinations. This is the transformation engine we recommend using first.

It uses a Pitch adaptive Overlap-Add (POLA) process.

Parameters
  • in (array<float>) – The input waveform to extract the frames from.

  • fs (int, Hz) – The sampling rate.

  • ps (float, optional) – Pitch scaling coefficient (def. 1.0). WARNING: This should not go above ps_max (see below). Expect artefacts otherwise.

  • ps_max (float, optional) – Maximum pitch scaling value (def. 2.0). WARNING: The higher the value, the bigger the internal processing windows. Depending on the sound being processed, windows that are too big can produce an audible reverberation effect.

  • dpss (2D array<float>, optional) –

    Time stamped pitch scaling coefficients. This has priority over ps.

    It must have two dimensions, with shapes [2,N], where the first row [0,:] are the time instants [s] of the pitch scaling coefficients of the second row [1,:]. WARNING: This cannot be a list of arrays.

    The time instants must be in ascending order.

    The pitch scaling coefficient used at each frame is linearly interpolated between the given neighboring values (constant extrapolation is used for any time instant outside of the given time range).

  • psmv_mean_cent (float, cent, optional) – Shift of the mean pitch, expressed on a scale in cents (def. none applied). This takes priority over ps and dpss. Please consider the warnings of ps.

  • psmv_var_coef (float, coefficient>=0, optional) – Pitch variance scaling (def. none applied). The scaling is made using a median value, on a linear scale in cents. This takes priority over ps and dpss. Please consider the warnings of ps.

  • psmv_forcemean (float, Hz, optional) – Force the mean value for the variance scaling. See psmv_var_coef

  • pst (float, Hz, optional) – Set pitch target (def. none). This takes priority over ps and dpss. Please consider warnings of ps.

  • fw (float, optional) – Frequency warping coefficient (def. 1.0). This warps the amplitude spectral envelope of the spectrum (push the smooth shapes higher (with fw>1.0), or lower (with fw<1.0)).

  • pe (bool, optional) – When a pitch scaling is applied, preserve the spectral envelope as is (def. true)

  • wm_gain (float, dB, optional) – Gain of the visual watermarking (def. -12dB)

  • efx_smile_alpha (float, optional) – Alpha parameter of the Smile effect. The bigger the value, the stronger the smile effect (def. 1.0, in [1.0,+inf) )

  • efx_smile_alpha_high_shelf (bool, optional) – Activate or deactivate the extra high-shelf of the Smile effect (def. true)

  • efx_inteligibility_scaling (float, optional) – Strength of the Intelligibility effect: 0.0 means no effect, 1.0 is the maximum effect (def. 0.0, in [0.0,1.0] )

  • efx_inteligibility_e0db_autobias (float, dB, optional) – Bias for the automatic audio level correction of the Intelligibility effect (should not be changed) (def. +10)

  • efx_inteligibility_e0db (float, dB, optional) – Audio level correction of the Intelligibility effect (should not be changed) (def. not used, overwritten by efx_inteligibility_e0db_autobias)

  • efx_denoiser_gate_coef_autobias (float, dB, optional) – Bias for the automatic audio level of the Denoiser effect. The higher the value, the more denoised the sound. (def. -128.0)

  • efx_denoiser_gate_coef (float, dB, optional) – Audio level for the Denoiser effect (should not be changed) (def. not used, overwritten by efx_denoiser_gate_coef_autobias)

  • mode (string, optional) – onestep or analysis or synthesis or denoisenn (def. onestep)

  • denoisenn_modelpath (string, optional) – For mode='denoisenn' above: the path to the neural network model. It has to be the root filename, which is extended with .json and .norm. E.g. ../mymodelpath points to ../mymodelpath.json and ../mymodelpath.norm

  • specs (list<array<complex<float>>>) – The complex spectrum, at each frame (as provided by analysis mode), which can be modified.

  • ampenvs (list<array<float>>) – The amplitude envelope, at each frame (as provided by analysis mode), which can be modified.

Returns

  • syn (array<float>) - The synthesized waveform.

  • tts (array<float>,seconds) - The time instant of the center of the analysis window at each frame (needs mode='analysis').

  • f0ss (array<float>,Hz) - The f_0 values at each frame (needs mode='analysis').

  • f0confs (array<float>) - A confidence factor for f_0 (needs mode='analysis').

  • specs (list<array<complex<float>>>) - The complex spectrum at each frame, which can be modified (needs mode='analysis').

  • ampenvs (list<array<float>>) - The amplitude envelope at each frame, which can be modified (needs mode='analysis').
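The dpss layout described in the parameters above can be prepared and checked with plain NumPy before calling pvoccombo(). The following is only a sketch: make_dpss() and scaling_at() are hypothetical helper names that are not part of pyvoimooo, and np.interp is used here merely to mimic the documented behaviour (linear interpolation between neighbors, constant extrapolation outside the time range), not to reproduce the library's internals.

```python
import numpy as np

def make_dpss(times_s, coefs):
    """Stack time instants [s] and pitch scaling coefficients into a [2,N] array.

    Hypothetical helper, not part of pyvoimooo.
    """
    times_s = np.asarray(times_s, dtype=float)
    coefs = np.asarray(coefs, dtype=float)
    if times_s.ndim != 1 or times_s.shape != coefs.shape:
        raise ValueError("times and coefficients must be 1D and of equal length")
    if np.any(np.diff(times_s) < 0):
        raise ValueError("time instants must be in ascending order")
    return np.vstack([times_s, coefs])  # one 2D array, not a list of arrays

def scaling_at(dpss, frame_times_s):
    """Mimic the documented rule: linear inside the range, constant outside."""
    return np.interp(frame_times_s, dpss[0, :], dpss[1, :])

dpss = make_dpss([0.5, 1.0, 2.0], [1.0, 1.5, 0.8])
frame_times = np.array([0.0, 0.75, 1.5, 3.0])
print(scaling_at(dpss, frame_times))
```

The resulting array would then be passed as dpss=dpss. Given the warning on ps, it seems prudent to keep ps_max at or above the largest coefficient in dpss[1,:].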

Examples:

syn, tts, f0s, f0confs, specs, envs = vmo.pvoccombo(wav, fs, mode='analysis')

An example of analysis and synthesis steps:

features = vmo.pvoccombo(wav, fs, mode='analysis')

tts = features[1]
specs = features[4]
ampenvs = features[5]

for fi in range(len(ampenvs)):
    dftlen = (len(ampenvs[fi])-1)*2  # DFT length, assuming each envelope holds dftlen/2+1 bins
    fleft = int(2000*dftlen/float(fs))
    fright = int(6000*dftlen/float(fs))
    ampenvs[fi][fleft:fright] *= 0.125

syn = vmo.pvoccombo(wav, fs, mode='synthesis', specs=specs, ampenvs=ampenvs)
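The loop above converts frequencies in Hz into indices of the amplitude envelope. A minimal sketch of that mapping follows, assuming each envelope is a half spectrum of dftlen/2+1 bins spanning 0 Hz to fs/2 (an assumption about the layout, in line with the pola_syn() example of this section). hz_to_bin() is a hypothetical helper, and a flat synthetic envelope stands in for the analysis output.

```python
import numpy as np

def hz_to_bin(freq_hz, nbins, fs):
    """Index of the half-spectrum bin at or just below freq_hz.

    Assumes nbins = dftlen/2 + 1 bins covering [0, fs/2].
    """
    dftlen = 2 * (nbins - 1)
    return int(freq_hz * dftlen / float(fs))

fs = 16000
env = np.ones(513)                       # synthetic flat envelope (dftlen = 1024)
lo = hz_to_bin(2000.0, len(env), fs)     # first bin of the band
hi = hz_to_bin(6000.0, len(env), fs)     # first bin after the band
env[lo:hi] *= 0.125                      # attenuate the 2-6 kHz band by about 18 dB
```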

Complete example for scaling the f0 variance:

import sys
import os
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt
#plt.ion()

os.environ["VOIMOOO_LICENSE_ID"]="THIS-IISS-INVA-LIDL-ICEN-SEID"

# Load Voimooo python wrapper
sys.path.append('.')
import pyvoimooo as vmo

# F0 analysis and scaling parameters

v_th = 0.1 # voicing threshold

expr_scale = 3. # 0 for flat, 1 for original, >1 for amplified expressivity

down_r = 2. # reduction on down pitch: 1 for no reduction, 2 for half-range reduction, ...
up_l = 2.5 # limit for the deviation of the scaled pitch [multiplier of the original f0]
p_off = 1. # constant static pitch shift of the average f0 [multiplier of the original f0]


# Input and output files

# wget http://apps-download.alta-voce.tech/data/db/Diversity/wav/07001104.fr.f.NCSE.F11n4.wav
input_file = '07001104.fr.f.NCSE.F11n4.wav' # path to the input file
out_file = input_file+'.pvoccombo_f0_scaling.wav'

wav, fs = vmo.readwav(input_file)
transf = wav

# Analysis
v_tss, v = vmo.voicing(transf, fs) #voicing analysis

# F0 analysis on voiced segments of the original signal
dum_syn, tts, f0s, dum_f0conf, dum_specs, dum_env = vmo.pvoccombo(transf, fs, mode='analysis')

voiced = np.interp(tts,v_tss,v)
voiced_filt = f0s*(voiced>v_th)

voiced_median = np.median(voiced_filt[voiced_filt>0])
voiced_std = np.std(voiced_filt[voiced_filt>0])

print('Original voiced median: ',voiced_median)
print('Original voiced std: ',voiced_std)

voiced_filt[voiced<=v_th] = voiced_median

fig = plt.figure(figsize=(20,5))

plt.subplot(121)
plt.plot(tts,f0s)

plt.subplot(122)
plt.plot(tts,voiced_filt)
plt.show()


# Processing: F0 scaling

#express_amp
voiced_median = voiced_median * p_off
pitch_scale = np.absolute((voiced_median + expr_scale*(f0s - voiced_median))/f0s)
pitch_scale[voiced<v_th] = 1

# correction for pitch up
scaled_std = np.std(pitch_scale * f0s)
arr_1 = pitch_scale * f0s # scaled f0 values
sc_2 = up_l * voiced_median # upper frequency limit
scaled_f0s = np.minimum(arr_1,sc_2)
pitch_scale = scaled_f0s / f0s

# correction for pitch down
down_val = pitch_scale[pitch_scale < 1.] # take pitch shift value between 0 and 1
down_val = ((down_val - 1.) / down_r) + 1. # shift start/end to 0 then divide and shift back
pitch_scale[pitch_scale < 1.] = down_val

fig = plt.figure(figsize=(20,5))

plt.subplot(121)
plt.plot(tts,pitch_scale)

filt_pitch_scale = signal.medfilt(pitch_scale, kernel_size=25)

plt.subplot(122)
plt.plot(tts,filt_pitch_scale)
plt.show()

dpss = np.vstack([tts, filt_pitch_scale])

transf = vmo.pvoccombo(transf, fs, dpss=dpss, ps_max=3.0)

# F0 analysis on voiced segments of the transformed signal

# Redo the voicing analysis first, so that the statistics below use the transformed signal
v_tss, v = vmo.voicing(transf, fs) #voicing analysis
dum_syn, tts, scaled_f0s, dum_f0conf, dum_specs, dum_env = vmo.pvoccombo(transf, fs, mode='analysis')

voiced = np.interp(tts,v_tss,v)
voiced_filt = scaled_f0s*(voiced>v_th)

voiced_median = np.median(voiced_filt[voiced_filt>0])
voiced_std = np.std(voiced_filt[voiced_filt>0])

print('Transformed voiced median: ',voiced_median)
print('Transformed voiced std: ',voiced_std)


fig = plt.figure(figsize=(20,5))

plt.subplot(121)
plt.plot(tts,scaled_f0s)

plt.subplot(122)
plt.plot(tts, f0s)
plt.show()

# Write out file
vmo.writewav(out_file, fs, transf)
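The core of the example above is the computation of the per-frame pitch scaling factors. A condensed restatement of that step follows, run on synthetic f0 values so that it can be tried without the library. expressivity_scale() is a hypothetical helper name, the parameter names are borrowed from the example, and the voicing decision is reduced to a boolean mask, which is a simplification.

```python
import numpy as np

def expressivity_scale(f0s, voiced, expr_scale=3.0, up_l=2.5, down_r=2.0):
    """Per-frame pitch scaling factors that widen (or flatten) f0 around its median.

    f0s    : f0 values [Hz], one per frame (must be > 0 on voiced frames)
    voiced : boolean mask, True where the frame is voiced
    """
    f0s = np.asarray(f0s, dtype=float)
    voiced = np.asarray(voiced, dtype=bool)
    median = np.median(f0s[voiced])
    scale = np.ones_like(f0s)            # unvoiced frames keep a factor of 1
    # Scale the deviation from the median, as a ratio to the original f0
    scale[voiced] = np.abs(median + expr_scale * (f0s[voiced] - median)) / f0s[voiced]
    # Limit the scaled f0 to up_l times the median
    scale[voiced] = np.minimum(scale[voiced] * f0s[voiced], up_l * median) / f0s[voiced]
    # Reduce downward shifts: pull factors below 1 towards 1 by a factor down_r
    down = scale < 1.0
    scale[down] = (scale[down] - 1.0) / down_r + 1.0
    return scale

f0s = np.array([0.0, 180.0, 200.0, 220.0, 300.0, 0.0])
voiced = f0s > 0
print(expressivity_scale(f0s, voiced))
```

With expr_scale=1.0 every factor is 1.0, which is a quick way to check that the function leaves the pitch untouched before trying larger values.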

Complete example for making the voice more “afraid”:

import sys
import os
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt
#plt.ion()

os.environ["VOIMOOO_LICENSE_ID"]="THIS-IISS-INVA-LIDL-ICEN-SEID"

# Load Voimooo python wrapper
sys.path.append('.')
import pyvoimooo as vmo

# Voicing and pitch modulation parameters

v_th = 0.1 # voicing threshold

mod_amp = 0.3 # modulation depth of the pitch scaling (0.3 means +/-30%)

mod_freq = 8.5 # modulation frequency [Hz]
mod_rnd = 0.2 # peak-to-peak range of the random deviation added to mod_freq
rnd_freq = 10 # number of random deviation values drawn per second


# Input and output files

# wget http://apps-download.alta-voce.tech/data/db/Diversity/wav/07001104.fr.f.NCSE.F11n4.wav
input_file = '07001104.fr.f.NCSE.F11n4.wav' # path to the input file
out_file = input_file+'.pvoccombo_f0_scaling.wav'

wav, fs = vmo.readwav(input_file)
transf = wav
nsamp = len(wav)

# Analysis
v_tss, v = vmo.voicing(transf, fs) #voicing analysis


# F0 analysis on voiced segments of the original signal
dum_syn, tts, f0s, dum_f0conf, dum_specs, dum_env = vmo.pvoccombo(transf, fs, mode='analysis')

voiced = np.interp(tts,v_tss,v)
voiced_filt = f0s*(voiced>v_th)

# Modulator: a sinusoid whose frequency is slightly randomized over time
nrnd = rnd_freq*int(nsamp/fs)
rnd_times = np.linspace(0, nsamp/fs, num=nrnd)
rnd_values = mod_rnd*(np.random.random_sample(nrnd)-0.5)
rand_mod = np.interp(tts, rnd_times, rnd_values)
mod = mod_amp*np.sin((mod_freq+rand_mod)*(2*np.pi*tts))


# Processing
pitch_scale=1+mod
pitch_scale[voiced<v_th] = 1

filt_pitch_scale = signal.medfilt(pitch_scale, kernel_size=25)

ps_max=np.max(filt_pitch_scale)

plt.plot(tts,pitch_scale)
plt.show()

dpss = np.vstack([tts, filt_pitch_scale]) # must be a [2,N] array, not a list of arrays

transf = vmo.pvoccombo(transf, fs, dpss=dpss, ps_max=ps_max)


# Write out file

vmo.writewav(out_file, fs, transf)
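The modulator used in this example is a fast pitch vibrato. Its deterministic part can be sketched on synthetic frame times as follows. vibrato_scale() is a hypothetical helper, the random frequency deviation and the voicing mask of the full example are left out, and the 5 ms frame spacing is only an assumption for the illustration.

```python
import numpy as np

def vibrato_scale(tts, mod_amp=0.3, mod_freq=8.5):
    """Pitch scaling factors oscillating around 1.0 at mod_freq [Hz]."""
    tts = np.asarray(tts, dtype=float)
    return 1.0 + mod_amp * np.sin(2.0 * np.pi * mod_freq * tts)

tts = np.arange(400) * 0.005              # synthetic frame times, 5 ms apart
pitch_scale = vibrato_scale(tts)
dpss = np.vstack([tts, pitch_scale])      # [2,N] array, as required for dpss
ps_max = float(np.max(pitch_scale))       # largest factor actually requested
```

Deriving ps_max from the curve itself, as done in the example, keeps the requested scaling within the limit while avoiding needlessly large processing windows.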

New in version 0.17.7.

pyvoimooo.pola_ana(in, fs[, timestep_seconds][, winlen_seconds][, frame_type])

The analysis step of a Pitch adaptive Overlap-Add (POLA) transformation. It extracts frames that can be modified and then merged using pola_syn().

It uses a Pitch adaptive Overlap-Add (POLA) process.

Parameters
  • in (array<float>) – The input waveform to extract the frames from.

  • fs (int, Hz) – The sampling rate.

  • timestep_seconds (float, optional, seconds) – Time step between two successive frames (def. 0.005s)

  • winlen_seconds (float, optional, seconds) – Length of the analysis window (def. 3 periods of the fundamental frequency)

  • frame_type (string, optional) – time or spec (def. time)

The length of the DFT is always the next power of two above the winlen.
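As a rough illustration of this rule, the sketch below derives a window length in samples and the matching DFT length. The odd-length rounding is taken from the pola_syn() example, "above" is read here as "greater than or equal to", and dft_length() is a hypothetical helper, so the exact values used internally may differ.

```python
def dft_length(winlen_seconds, fs):
    """Window length in samples and the power-of-two DFT length covering it."""
    winlen = 1 + 2 * int(0.5 * winlen_seconds * fs)   # odd length, as in the pola_syn() example
    dftlen = 1 << (winlen - 1).bit_length()           # smallest power of two >= winlen
    return winlen, dftlen

winlen, dftlen = dft_length(0.050, 32000)
nbins = dftlen // 2 + 1    # number of bins of a half spectrum of that DFT length
```

Under these assumptions, a 50 ms window at 32 kHz gives 1601 samples and a DFT length of 2048.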

Note

A good practice is to gather all the optional arguments into a dict() and pass it as argument to both pola_ana() and pola_syn() since they have to be common (see the example below).

Returns

  • tss (array<float>,seconds) - Time instants of analysis (the center time of each frame)

  • frames (list<array<float>>) - The frames

  • f0ss (array<float>,Hz) - The f_0 values at each frame.

See pola_syn() below for a complete example.

pyvoimooo.pola_syn(tss, frames, f0s, wavlen, fs, **kwargs)

The synthesis step of a Pitch adaptive Overlap-Add (POLA) transformation. It resynthesizes frames that have been extracted using pola_ana() and modified as desired.

It uses a Pitch adaptive Overlap-Add (POLA) process.

Parameters
  • tss (array<float>, second) – Times of analysis, as provided by pola_ana() (non-modified).

  • frames (list<array<float>>) – The frames, as provided by pola_ana() and modified as desired.

  • f0s (array<float>,Hz) – The f_0 values at each frame, as provided by pola_ana() (non-modified).

  • wavlen (int, number of samples) – The number of samples in the synthesized waveform.

  • fs (int, Hz) – The sampling rate.

  • kwargs – Extra arguments to choose the type of frame extraction, as given to pola_ana() (non-modified).

Returns

  • syn (array<float>) - The synthesized waveform.

Complete example:

import sys
import numpy as np
import matplotlib.pyplot as plt
#plt.ion()

import pysndfile # Get it from pip

# Load Voimooo python wrapper
sys.path.append('.')
import pyvoimooo as vmo

# Read the source file
wav, fs, enc = pysndfile.sndio.read('../test/snd/eng-usa.f.arctic_a0487_32khz.wav')
wavts = np.arange(len(wav))/float(fs)

# Prepare a dict with the options that _have_ to be common between analysis and synthesis stages
opts = dict()
opts['frame_type'] = 'spec' # 'time'
opts['timestep_seconds'] = 0.010
opts['winlen_seconds'] = 0.050

# Run the analysis
tss, frames, f0s = vmo.pola_ana(wav, fs, **opts)

# Modify frames
dftlen = (len(frames[0])-1)*2
winlen = 1+2*int(0.5*opts['winlen_seconds']*fs)
framesnew = list()
for fr in frames:
    fr = fr.astype('complex128')
    if 1:
        # Robot: keep the magnitude and replace the phase by a linear phase
        # (a delay of half a window)
        fr = np.abs(fr)
        fr = fr*np.exp(-2j*((winlen-1)/2)*np.pi*np.arange(dftlen//2+1)/float(dftlen))
    else:
        # Cepstral compensation: smoothed magnitude, original phase
        rcc = np.fft.irfft(np.log(abs(fr)))
        rcc[int(dftlen/2):] = 0
        rcc[1:] *= 2
        rcc[1:2] = 0
        fr = np.exp(np.real(np.fft.rfft(rcc, dftlen)))*np.exp(1j*np.angle(fr))
    framesnew.append(fr.astype('complex64'))

frames = framesnew # Just forget about the original frames

# Synthesize the result
syn = vmo.pola_syn(tss, frames, f0s, len(wav), fs, **opts)

# Write down the synthesis
pysndfile.sndio.write('eng-usa.f.arctic_a0487_32khz.pola.wav', syn, fs)

# Plot the features
plt.subplot(211)
plt.plot(wavts, wav, 'k')
plt.plot(tss, f0s/100.0, 'b')
plt.plot(wavts, syn, 'r')
plt.subplot(212)
vmax = 20*np.log10(np.max(np.abs(frames)))
vmin = vmax-70.0
plt.imshow(20*np.log10(np.abs(frames)).T, origin='lower', aspect='auto', interpolation='none', cmap='jet', extent=[0.0, 1.0, 0.0, fs/2], vmin=vmin, vmax=vmax)
plt.show()
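The "Robot" branch of this example keeps the magnitude of each frame and replaces its phase with a linear phase. The NumPy-only sketch below shows what that phase term does to a single frame: it turns the frame into a symmetric pulse centered in the window. The magnitude spectrum is synthetic, and the DFT length of 2048 is an assumption matching a 50 ms window at 32 kHz.

```python
import numpy as np

fs = 32000
winlen = 1 + 2 * int(0.5 * 0.050 * fs)    # 1601 samples, odd
dftlen = 2048
delay = (winlen - 1) // 2                 # center of the window, in samples

# Synthetic magnitude spectrum standing in for np.abs(fr) of one 'spec' frame
bins = np.arange(dftlen // 2 + 1)
mag = 1.0 / (1.0 + bins / 50.0)

# Same phase term as in the "Robot" branch: a pure delay of `delay` samples
phase = np.exp(-2j * np.pi * delay * bins / float(dftlen))
frame_time = np.fft.irfft(mag * phase, dftlen)

peak = int(np.argmax(np.abs(frame_time)))  # position of the pulse in the frame
```

Since every frame then carries the same pulse shape at the same position, whatever the original phase was, the natural phase structure of the voice is lost, which presumably explains the robotic character of the result.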

New in version 0.8.9.

pyvoimooo.pitch_scaling_snm(in, fs[, sps][, dpss][, spvs][, ep][, ses])

Pitch scale the given waveform using a sinusoidal model.

It uses a Pitch adaptive Overlap-Add (POLA) process.

Parameters
  • in (array<float>) – The input waveform to transform.

  • fs (int, Hz) – The sampling rate.

  • sps (float, optional) –

    Static pitch scaling factor (def. 1.0).

    E.g. If sps is 2.0, the pitch curve of the whole waveform will be twice as high as the original.

  • dpss (2D array<float>, optional) –

    Time stamped pitch scaling coefficients.

    It must have two dimensions, with shapes [2,N], where the first row [0,:] are the time instants [s] of the pitch scaling coefficients of the second row [1,:]. WARNING: This cannot be a list of arrays.

    The time instants must be in ascending order.

    The pitch scaling coefficient used at each frame is linearly interpolated between the given neighboring values (constant extrapolation is used for any time instant outside of the given time range).

  • spvs (float, optional) –

    Static pitch variance scaling factor (def. 1.0).

    E.g. If spvs is 2.0, the variance of the pitch curve of the whole waveform will be twice as wide as in the original.

  • ep (boolean, optional) – Preserve the amplitude spectral envelope (def. true).

  • ses (float, optional) –

    Static envelope scaling factor (def. 1.0).

    E.g. If ses is 2.0, the envelope of the whole spectrum will be stretched with a factor 2.

Returns

  • syn (array<float>) - The transformed waveform.

  • tss (array<float>) - The time stamps of the f0 values.

  • f0s (array<float>) - The f_0 values estimated during processing.

Example:

syn, tts, f0s = vmo.pitch_scaling_snm(wav, fs, sps=2.0)
syn, tts, f0s = vmo.pitch_scaling_snm(wav, fs, dpss=[[0.0, 1.5, 2.0], [1.0, 0.8, 1.5]])
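The scaling factors sps and dpss are plain frequency ratios. When a shift is specified in musical units, the standard equal-temperament conversion below gives the corresponding ratio. The two helpers are hypothetical conveniences and not part of pyvoimooo.

```python
def semitones_to_ratio(semitones):
    """Frequency ratio for a shift of `semitones` (12 per octave)."""
    return 2.0 ** (semitones / 12.0)

def cents_to_ratio(cents):
    """Frequency ratio for a shift of `cents` (1200 per octave)."""
    return 2.0 ** (cents / 1200.0)

sps_octave_up = semitones_to_ratio(12)    # 2.0
sps_fifth_up = semitones_to_ratio(7)      # about 1.498
sps_tone_down = semitones_to_ratio(-2)    # about 0.891
```

For instance, sps=semitones_to_ratio(7) would request a shift of a perfect fifth upwards.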

New in version 0.10.1.

pyvoimooo.freqwarp_pola(in, fs[, gfs][, static_freqs][, dynamic_times][, dynamic_freqs][, f0min][, f0max])

Warp the spectral envelope in frequency domain.

It uses a Pitch adaptive Overlap-Add (POLA) process.

Parameters
  • in (array<float>) – The input waveform to transform.

  • fs (int, Hz) – The sampling rate.

  • gfs (float, optional) –

    Global frequency scaling parameter (def. 1.0).

    E.g. If gfs is 2.0, the envelope bin at 1kHz will be warped/shifted to 2kHz.

    This can be combined with the SFW and DFW options below.

  • static_freqs (2D array<float>, Hz, optional) –

    Static Frequency Warping (SFW) parameters.

    It should have two dimensions, with shapes [2,N], where the first row [0,:] are the input frequencies and the second row [1,:] are the corresponding output frequencies.

    The frequencies must be in ascending order.

    The warping function follows the same principle as for gfs. The warped frequencies between two given points in static_freqs are linearly interpolated between the given neighbor values.

    For a traditional use of frequency warping, two points should be part of static_freqs, one at zero frequency and one at Nyquist (please see the example below).

    This option is exclusive with the dynamic_freqs option.

  • dynamic_times (array<float>, seconds, optional) –

    Dynamic Frequency Warping (DFW) parameters time stamps.

    Time stamps of dynamic_freqs values (see below).

    This parameter has to be used jointly with the dynamic_freqs parameter. This option is exclusive with the static_freqs option.

  • dynamic_freqs (list< 2D array<float> >, Hz, optional) –

    Dynamic Frequency Warping (DFW) parameters.

    A list of 2D arrays as in static_freqs. The time stamps of each element of this list are in dynamic_times.

    This parameter has to be used jointly with the dynamic_times parameter. This option is exclusive with the static_freqs option.

  • f0min (float, Hz, optional) – The minimal f_0 value.

  • f0max (float, Hz, optional) – The maximal f_0 value.

Returns

  • syn (array<float>) - The transformed waveform.

  • tss (array<float>) - The time stamps of the f0 values.

  • f0s (array<float>) - The f_0 values estimated during processing.

Examples:

syn, tts, f0s = vmo.freqwarp_pola(wav, fs, gfs=0.5, static_freqs=[[0,2000,fs/2],[0,4000,fs/2]])
syn, tss, f0s = vmo.freqwarp_pola(wav, fs, dynamic_times=[0.0, 2.0], dynamic_freqs=[[[0,3000,fs/2],[0,2500,fs/2]],[[0,2000,fs/2],[0,4000,fs/2]]])
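The piecewise-linear mapping defined by static_freqs can be inspected before running the transformation. The sketch below evaluates it with np.interp for the anchor points of the first example. warp_frequencies() is a hypothetical helper, it only computes the input-to-output frequency map, and how the library resamples the envelope along that map is not shown here.

```python
import numpy as np

def warp_frequencies(freqs_hz, static_freqs):
    """Map input frequencies to output frequencies through the anchor points."""
    anchors = np.asarray(static_freqs, dtype=float)   # shape [2,N]
    if np.any(np.diff(anchors[0]) <= 0):
        raise ValueError("input frequencies must be in ascending order")
    return np.interp(freqs_hz, anchors[0], anchors[1])

fs = 16000
static_freqs = [[0, 2000, fs / 2], [0, 4000, fs / 2]]
warped = warp_frequencies([0.0, 1000.0, 2000.0, 5000.0, 8000.0], static_freqs)
print(warped)
```

With anchors at zero frequency and at Nyquist, the map covers the whole band: here the 0-2 kHz range is stretched to 0-4 kHz and the rest is compressed into 4-8 kHz.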

New in version 0.10.1.

pyvoimooo.smile(in, fs[, alpha][, alphas][, f0min][, f0max][, gender][, anchorfreqs][, shelf])

Transform a waveform using the SMILE algorithm.

It uses a Pitch adaptive Overlap-Add (POLA) process.

Parameters
  • in (array<float>) – The input waveform to transform.

  • fs (int, Hz) – The sampling rate.

  • alpha (float, optional) – The alpha value for scaling the effect (in [0.8,1.4], def. 1.0).

  • alphas (2D array<float>, optional) –

    Time stamped alpha values.

    It must have two dimensions, with shapes [2,N], where the first row [0,:] are the time instants [s] of the alpha values of the second row [1,:].

    The time instants must be in ascending order.

    The alpha value used at each frame is then linearly interpolated between the given neighbor values (constant extrapolation is used for any time instant outside of the given range in alphas).

  • f0min (float, Hz, optional) – The minimal f_0 value.

  • f0max (float, Hz, optional) – The maximal f_0 value.

  • gender (str, optional) – ‘male’ or ‘female’, specify the gender (def. to None, an average value).

  • anchorfreqs (1D array<float>, optional) –

    Custom attachment frequencies. A size-4 vector that defines the two values where the frequencies are preserved (values at index 0 and 3) and the two values defining the interval of the frequencies that are warped (values at index 1 and 2).

    Given FN the N-th formant frequency, set them to [F1/2, F2, F3, F5] in order to fall back on the alternative.

    Set the first value to -1 in order to disable these custom values and re-use the internal values.

  • shelf (1D array<float>, optional) –

    Custom shelf parameters. A size-2 vector with the custom frequency [Hz] and gain [dB].

    Set the custom frequency to -1 in order to disable these custom values and re-use the internal values.

Returns

  • syn (array<float>) - The transformed waveform.

  • tss (array<float>) - The time stamps of the f0 values.

  • f0s (array<float>) - The f_0 values estimated during processing.

  • envs_ori (list<array<float>>) - The spectral envelopes of the analysed file.

  • envs_new (list<array<float>>) - The transformed spectral envelopes applied to obtain the resulting file.

Examples:

syn, tts, f0s, envs_ori, envs_new = vmo.smile(wav, fs, alpha=1.2)
syn, tts, f0s, envs_ori, envs_new = vmo.smile(wav, fs, alphas=[[0.0, 1.5, 2.0], [1.0, 1.0, 1.2]])
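A time-stamped alphas array can be built in the same way as other [2,N] inputs of this module. The sketch below fades the effect in and out, clipping the peak to the range documented for alpha. alpha_ramp() is a hypothetical helper, and it assumes that alpha=1.0 (the default) leaves the voice unchanged.

```python
import numpy as np

ALPHA_MIN, ALPHA_MAX = 0.8, 1.4     # range documented for alpha

def alpha_ramp(duration_s, alpha_peak, fade_s=0.5):
    """[2,N] array: alpha goes 1.0 -> alpha_peak -> 1.0 with linear fades."""
    if duration_s <= 2 * fade_s:
        raise ValueError("duration must be longer than the two fades")
    peak = float(np.clip(alpha_peak, ALPHA_MIN, ALPHA_MAX))
    times = np.array([0.0, fade_s, duration_s - fade_s, duration_s])
    values = np.array([1.0, peak, peak, 1.0])
    return np.vstack([times, values])

alphas = alpha_ramp(3.0, 1.6)       # the requested 1.6 is clipped to 1.4
```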

New in version 0.10.1.

pyvoimooo.intelligibility(in, fs[, IOEC0dB][, scaling])

Transform a waveform using the Intelligibility algorithm.

It uses a Pitch adaptive Overlap-Add (POLA) process.

Parameters
  • in (array<float>) – The input waveform to transform.

  • fs (int, Hz) – The sampling rate.

  • IOEC0dB (float, dB, optional) – The IOEC0dB compression reference (def. -10.0 dB).

  • scaling (float, optional) – The scaling value for scaling the effect (in [0.0,1.0], def. 1.0).

Returns

  • syn (array<float>) - The transformed waveform.

Examples:

syn = vmo.intelligibility(wav, fs, IOEC0dB=-15.0, scaling=1.0)

New in version 0.16.6.

pyvoimooo.denoise(in, fs[, gate_coef_db][, gate_coef_auto_bias_db])

Denoise a waveform using a spectral noise gate.

It uses a Pitch adaptive Overlap-Add (POLA) process.

Parameters
  • in (array<float>) – The input waveform to transform.

  • fs (int, Hz) – The sampling rate.

  • gate_coef_db (float, dB, optional) – The noise threshold (def. -46.0 dB).

  • gate_coef_auto_bias_db (float, dB, optional) – The threshold bias used with semi-automatic gate threshold (def. -12.0 dB).

Returns

  • syn (array<float>) - The transformed waveform.

Examples:

syn = vmo.denoise(wav, fs, gate_coef_db=-15.0)
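For readers unfamiliar with the principle, a textbook-style spectral noise gate applied to a single frame can be sketched as follows. This is not the algorithm implemented by denoise(): spectral_gate() is a hypothetical function, the threshold is taken relative to the strongest bin of the frame (an assumption), and no windowing or overlap-add is performed.

```python
import numpy as np

def spectral_gate(frame, gate_db=-46.0):
    """Zero the DFT bins whose level is below gate_db relative to the peak bin.

    Illustration only; the reference level is an assumption.
    """
    spec = np.fft.rfft(frame)
    mag = np.abs(spec)
    threshold = mag.max() * 10.0 ** (gate_db / 20.0)
    spec[mag < threshold] = 0.0
    return np.fft.irfft(spec, len(frame))

fs = 16000
n = np.arange(1024)
rng = np.random.default_rng(0)
tone = np.sin(2 * np.pi * 1000.0 * n / fs)            # 1 kHz falls exactly on bin 64
noisy = tone + 1e-3 * rng.standard_normal(len(n))     # weak white noise added
clean = spectral_gate(noisy, gate_db=-46.0)
```

Raising the threshold removes more of the weak components, at the risk of also removing quiet parts of the signal itself.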

New in version 0.17.8.

pyvoimooo.pitch_scaling_doubledelay(in, fs[, sps][, dpss][, spvs])

Deprecated since version 0.20.1: Please use pvoccombo() or pitch_scaling_snm()