NF-FastMNMF

A. A. Nugraha, K. Sekiguchi, M. Fontaine, Y. Bando, and K. Yoshii, "Flow-Based Fast Multichannel Nonnegative Matrix Factorization for Blind Source Separation," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Singapore, Singapore, 2022, pp. 501-505.

Abstract

This paper describes a blind source separation method for multichannel audio signals, called NF-FastMNMF, based on the integration of the normalizing flow (NF) into the multichannel nonnegative matrix factorization with jointly-diagonalizable spatial covariance matrices, a.k.a. FastMNMF. Whereas the NF of flow-based independent vector analysis, called NF-IVA, acts as the demixing matrices to transform an $M$-channel mixture into $M$ independent sources, the NF of NF-FastMNMF acts as the diagonalization matrices to transform an $M$-channel mixture into a spatially-independent $M$-channel mixture represented as a weighted sum of $N$ source images. This diagonalization enables the NF, which has been used only for determined separation because of its bijective nature, to be applicable to non-determined separation. NF-FastMNMF has time-varying diagonalization matrices that are potentially better at handling dynamical data variation than the time-invariant ones in FastMNMF. To have an NF with richer expression capability, the dimension-wise scalings using diagonal matrices originally used in NF-IVA are replaced with linear transformations using upper triangular matrices; in both cases, the diagonal and upper triangular matrices are estimated by neural networks. The evaluation shows that NF-FastMNMF performs well for both determined and non-determined separations of multiple speech utterances by stationary or non-stationary speakers from a noisy reverberant mixture.
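The difference between the diagonal and upper triangular transformations mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. The matrices below are random stand-ins for the network-estimated ones, and the names `W_diag`, `W_triu`, and `log_abs_det_triangular` are hypothetical. The sketch only shows a general property: an upper triangular matrix mixes channels where a diagonal one only rescales them, yet both are invertible when their diagonal entries are nonzero, and both have a log-determinant equal to the sum of the log-magnitudes of the diagonal entries.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 4  # number of channels (illustrative value, not taken from the paper)

# Random complex stand-in for an M-channel observation at one time-frequency bin.
x = rng.standard_normal(M) + 1j * rng.standard_normal(M)

# Random stand-in for transform parameters (assumption: in the paper these
# are estimated by neural networks, which this sketch does not model).
raw = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))

W_diag = np.diag(np.diag(raw))  # dimension-wise scaling only
W_triu = np.triu(raw)           # upper triangular: also mixes channels


def log_abs_det_triangular(W):
    """log|det W| of a diagonal or triangular matrix, from its diagonal."""
    return float(np.sum(np.log(np.abs(np.diag(W)))))


for W in (W_diag, W_triu):
    y = W @ x                      # forward transform
    x_rec = np.linalg.solve(W, y)  # inverse transform recovers the input
    assert np.allclose(x_rec, x)
    # The cheap diagonal formula agrees with the general log-determinant.
    _, logdet = np.linalg.slogdet(W)
    assert np.isclose(logdet, log_abs_det_triangular(W))
```

Under these assumptions, the upper triangular form adds off-diagonal parameters without making the inverse or the log-determinant more expensive to express than in the diagonal case.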


Reference

A. A. Nugraha, K. Sekiguchi, M. Fontaine, Y. Bando, and K. Yoshii, “Flow-Based Fast Multichannel Nonnegative Matrix Factorization for Blind Source Separation,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Singapore, Singapore, 2022, pp. 501-505, doi: 10.1109/ICASSP43922.2022.9747718.


Audio Samples

  • For listening purposes, all audio files are stereo, obtained by taking the first two channels of the estimated multichannel source images.
  • The order of the separated sources may not be the same as that of the reference sources because we do not apply a source permutation solver.
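The two notes above can be sketched in Python as follows. This is an illustrative sketch, not the processing used to prepare the samples on this page: the array layouts `(channels, samples)` and `(n_sources, samples)`, the function names `to_stereo` and `best_permutation`, and the correlation-based matching are all assumptions. No permutation solver is applied to the samples here; the second function only shows one way a listener could align separated sources with references.

```python
import itertools

import numpy as np


def to_stereo(image):
    """Keep the first two channels of a (channels, samples) source image."""
    return image[:2]


def best_permutation(estimates, references):
    """Find which estimate best matches each reference.

    Both inputs have shape (n_sources, samples). Returns a tuple p where
    p[i] is the index of the estimate assigned to reference i. All
    permutations are tried, which is fine for a handful of sources.
    """
    def unit(s):
        # Remove the mean and normalize each signal to unit energy.
        s = s - s.mean(axis=1, keepdims=True)
        return s / (np.linalg.norm(s, axis=1, keepdims=True) + 1e-12)

    # sim[i, j]: absolute normalized correlation of reference i, estimate j.
    sim = np.abs(unit(references) @ unit(estimates).T)
    n = sim.shape[0]
    return max(
        itertools.permutations(range(n)),
        key=lambda p: sum(sim[i, p[i]] for i in range(n)),
    )
```

For example, given `p = best_permutation(estimates, references)`, the reordered array `estimates[list(p)]` lists the estimates in the order of the references.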

Stationary Data

PCAFETER_12dB -- 447o030b_446o0315_444o030t
 
Mixture Sources
 
Methods 3 microphones 4 microphones 7 microphones
IVA-BP n/a
NF-IVA ($\mathbf{W}_{k^{\prime\prime},ft}$: diagonal; flow block number: 1) n/a
NF-IVA ($\mathbf{W}_{k^{\prime\prime},ft}$: diagonal; flow block number: 2) n/a
NF-IVA ($\mathbf{W}_{k^{\prime\prime},ft}$: upper triangular; flow block number: 1) n/a
NF-IVA ($\mathbf{W}_{k^{\prime\prime},ft}$: upper triangular; flow block number: 2) n/a
FastMNMF-BP
NF-FastMNMF ($\mathbf{W}_{k^{\prime\prime},ft}$: diagonal; flow block number: 1)
NF-FastMNMF ($\mathbf{W}_{k^{\prime\prime},ft}$: diagonal; flow block number: 2)
NF-FastMNMF ($\mathbf{W}_{k^{\prime\prime},ft}$: upper triangular; flow block number: 1)
NF-FastMNMF ($\mathbf{W}_{k^{\prime\prime},ft}$: upper triangular; flow block number: 2)
 



Non-stationary Data

PCAFETER_12dB -- 447o030b_446o0315_444o030t
 
Mixture Sources
 
Methods 3 microphones 4 microphones 7 microphones
IVA-BP n/a
NF-IVA ($\mathbf{W}_{k^{\prime\prime},ft}$: diagonal; flow block number: 1) n/a
NF-IVA ($\mathbf{W}_{k^{\prime\prime},ft}$: diagonal; flow block number: 2) n/a
NF-IVA ($\mathbf{W}_{k^{\prime\prime},ft}$: upper triangular; flow block number: 1) n/a
NF-IVA ($\mathbf{W}_{k^{\prime\prime},ft}$: upper triangular; flow block number: 2) n/a
FastMNMF-BP
NF-FastMNMF ($\mathbf{W}_{k^{\prime\prime},ft}$: diagonal; flow block number: 1)
NF-FastMNMF ($\mathbf{W}_{k^{\prime\prime},ft}$: diagonal; flow block number: 2)
NF-FastMNMF ($\mathbf{W}_{k^{\prime\prime},ft}$: upper triangular; flow block number: 1)
NF-FastMNMF ($\mathbf{W}_{k^{\prime\prime},ft}$: upper triangular; flow block number: 2)
 
