This repo is an implementation of TAANet.


TAANet

A PyTorch implementation of TAANet on the WSJ0-2mix dataset, as described in the paper "Time-domain Adaptive Attention Network for Single-channel Speech Separation".

This implementation is based on:

  1. https://github.com/kaituoxu/Conv-TasNet

  2. https://github.com/yluo42/TAC

Thanks to Kaituo Xu and Yi Luo for sharing their code.


Step 1:

Generate .json files containing the wav paths and lengths.

python ./utils/preprocess.py
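For reference, a minimal sketch of what such a manifest script might look like. The `[path, num_samples]` entry layout is an assumption borrowed from the Conv-TasNet preprocessing this repo is based on; the actual utils/preprocess.py may differ:

```python
import json
import os
import wave

def build_manifest(wav_dir, out_json):
    """Collect [path, num_samples] pairs for every wav under wav_dir
    and dump them to a JSON manifest (hypothetical stand-in for
    utils/preprocess.py)."""
    entries = []
    for name in sorted(os.listdir(wav_dir)):
        if not name.endswith(".wav"):
            continue
        path = os.path.join(wav_dir, name)
        with wave.open(path, "rb") as f:
            # Store the length in samples so the loader can batch
            # utterances of similar duration.
            entries.append([path, f.getnframes()])
    with open(out_json, "w") as f:
        json.dump(entries, f, indent=2)
    return entries
```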

Step 2:

Train the model on the training and validation sets.

CUDA_VISIBLE_DEVICES=0 python train_taanet.py
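Like Conv-TasNet, training optimizes negative SI-SNR with utterance-level permutation-invariant training (uPIT). A NumPy sketch of the metric for the two-speaker case, for illustration only (the repo computes this in PyTorch):

```python
import numpy as np

def si_snr(est, ref, eps=1e-8):
    """Scale-invariant SNR in dB between an estimate and a reference."""
    est = est - est.mean()
    ref = ref - ref.mean()
    # Project the estimate onto the reference; the residual counts as noise.
    s_target = (est @ ref) / (ref @ ref + eps) * ref
    e_noise = est - s_target
    return 10 * np.log10((s_target @ s_target + eps) / (e_noise @ e_noise + eps))

def pit_si_snr(ests, refs):
    """Average SI-SNR under the better of the two speaker permutations."""
    direct = si_snr(ests[0], refs[0]) + si_snr(ests[1], refs[1])
    swapped = si_snr(ests[0], refs[1]) + si_snr(ests[1], refs[0])
    return max(direct, swapped) / 2
```

The training loss is the negative of this value; SI-SNRi is this metric minus the SI-SNR of the unprocessed mixture.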

Step 3:

Evaluate the trained model on the test set.

CUDA_VISIBLE_DEVICES=0 python evaluate_taanet.py

We obtain an SI-SNRi of 20.7 dB and an SDRi of 20.9 dB on the WSJ0-2mix test set; the trained model is available at saves/temp/temp_best.pth.tar.

Step 4:

Separate the mixed speech with the trained model.

CUDA_VISIBLE_DEVICES=0 python separate_taanet.py
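The separation script saves each estimated source as a wav file. A stdlib sketch of that final step (`write_wav` is a hypothetical helper; the script's actual I/O code may differ), peak-normalizing so the cast to int16 cannot clip:

```python
import wave

import numpy as np

def write_wav(path, samples, rate=8000):
    """Save a float waveform as mono 16-bit PCM, peak-normalizing
    if it exceeds [-1, 1] to avoid clipping."""
    samples = np.asarray(samples, dtype=np.float64)
    peak = np.abs(samples).max()
    if peak > 1.0:
        samples = samples / peak
    pcm = (samples * 32767.0).astype(np.int16)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)     # 2 bytes = 16-bit PCM
        f.setframerate(rate)  # WSJ0-2mix is 8 kHz
        f.writeframes(pcm.tobytes())
```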

Some separation samples can be found in the samples directory.


Please cite the following reference if you use this repository in your project.

@article{cai2022,
author = {Cai, Jingxiang and Wang, Kunpeng and Yao, Juan and Zhou, Hao},
title = {Time-domain Adaptive Attention Network for Single-channel Speech Separation},
journal = {},
year = {2022},
pages = {}
}