spafe.features.pncc

based on https://github.com/supikiti/PNCC/blob/master/pncc.py

spafe.features.pncc.asymmetric_lawpass_filtering(rectified_signal, lm_a=0.999, lm_b=0.5)
spafe.features.pncc.mean_power_normalization(transfer_function, final_output, lam_myu=0.999, L=80, k=1)
spafe.features.pncc.medium_time_power_calculation(power_stft_signal, M=2)
spafe.features.pncc.medium_time_processing(power_stft_signal, nfilts=22)
spafe.features.pncc.pncc(sig, fs=16000, num_ceps=13, pre_emph=0, pre_emph_coeff=0.97, power=2, win_len=0.025, win_hop=0.01, win_type='hamming', nfilts=26, nfft=512, low_freq=None, high_freq=None, scale='constant', dct_type=2, use_energy=False, dither=1, lifter=22, normalize=1)

Compute the power-normalized cepstral coefficients (PNCC features) from an audio signal.

Parameters:
  • sig (array) – a mono audio signal (Nx1) from which to compute features.
  • fs (int) – the sampling frequency of the signal we are working with. Default is 16000.
  • num_ceps (int) – number of cepstra to return. Default is 13.
  • pre_emph (int) – apply pre-emphasis if 1. Default is 0.
  • pre_emph_coeff (float) – coefficient of the pre-emphasis filter [1 -pre_emph_coeff] (0 = none). Default is 0.97.
  • power (int) – spectrum power. Default is 2.
  • win_len (float) – window length in sec. Default is 0.025.
  • win_hop (float) – step between successive windows in sec. Default is 0.01.
  • win_type (str) – window type to apply for the windowing. Default is “hamming”.
  • nfilts (int) – the number of filters in the filterbank. Default is 26.
  • nfft (int) – number of FFT points. Default is 512.
  • low_freq (int) – lowest band edge of mel filters (Hz). Default is None, in which case 0 is used.
  • high_freq (int) – highest band edge of mel filters (Hz). Default is None, in which case fs / 2 is used (8000 for fs = 16000).
  • scale (str) – choose if max bins amplitudes ascend, descend or are constant (=1). Default is “constant”.
  • dct_type (int) – type of DCT used - 1 or 2 (or 3 for HTK or 4 for feac). Default is 2.
  • use_energy (bool) – overwrite C0 with true log energy. Default is False.
  • dither (int) – 1 = add offset to spectrum as if dither noise. Default is 1.
  • lifter (int) – apply liftering if value > 0. Default is 22.
  • normalize (int) – apply normalization if 1. Default is 1.
Returns:

2d array of PNCC features (num_frames x num_ceps)

Return type:

(array)
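
As a rough illustration of how the time-based parameters above translate into samples, the sketch below converts win_len and win_hop to sample counts and applies a first-order pre-emphasis filter [1 -pre_emph_coeff]. The helper names frame_sizes and pre_emphasis are hypothetical and not part of spafe; the library's own rounding, padding and edge handling may differ.

```python
def frame_sizes(fs=16000, win_len=0.025, win_hop=0.01):
    """Convert window length and hop from seconds to samples (hypothetical helper)."""
    return int(round(win_len * fs)), int(round(win_hop * fs))


def pre_emphasis(sig, pre_emph_coeff=0.97):
    """Apply y[n] = x[n] - pre_emph_coeff * x[n-1]; the first sample is kept unchanged."""
    if not sig:
        return []
    out = [sig[0]]
    for n in range(1, len(sig)):
        out.append(sig[n] - pre_emph_coeff * sig[n - 1])
    return out


# With the documented defaults: 25 ms windows and 10 ms hops at 16 kHz.
frame_len, frame_step = frame_sizes()

# A constant signal is strongly attenuated after the first sample.
emphasized = pre_emphasis([1.0, 1.0, 1.0])
```

With the default values this gives a 400-sample window advanced in steps of 160 samples.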

spafe.features.pncc.temporal_masking(rectified_signal, lam_t=0.85, myu_t=0.2)
spafe.features.pncc.weight_smoothing(final_output, medium_time_power, N=4, L=128)
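
The helper functions listed on this page are not described further. As a hedged sketch of one of the steps, the usual PNCC formulation computes a "medium-time" power by averaging the short-time power of each channel over the 2M + 1 frames centred on the current one. The function medium_time_power_sketch below is a hypothetical pure-Python illustration of that averaging, not spafe's implementation; treating frames outside the signal as zero is an assumption.

```python
def medium_time_power_sketch(power, M=2):
    """Average each channel over frames m-M..m+M.

    `power` is a list of frames, each a list of per-channel power values.
    Frames outside the signal are counted as zero (an assumption), so the
    divisor is always 2*M + 1.
    """
    num_frames = len(power)
    num_channels = len(power[0]) if num_frames else 0
    width = 2 * M + 1
    out = []
    for m in range(num_frames):
        row = []
        for ch in range(num_channels):
            total = 0.0
            for k in range(m - M, m + M + 1):
                if 0 <= k < num_frames:
                    total += power[k][ch]
            row.append(total / width)
        out.append(row)
    return out


# Five frames, one channel, constant power of 5.0.
smoothed = medium_time_power_sketch([[5.0]] * 5, M=2)
```

Only the middle frame of this short input sees a full five-frame window; the values near the edges are lower because of the zero-padding assumption.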

Example:

import scipy.io.wavfile
import spafe.utils.vis as vis
from spafe.features.pncc import pncc


# read wave file
fs, sig = scipy.io.wavfile.read('../test.wav')

# compute pnccs (pass fs and num_ceps by keyword: the second positional argument is fs)
pnccs = pncc(sig, fs=fs, num_ceps=13)

# visualize features
vis.visualize(pnccs, 'PNCC Index','Frame Index')