Preliminary setup

Preparation steps

Visit our demo for audio samples.
We also provide the pretrained models.
** Update note: Thanks to Rishikesh (ऋषिकेश), our interactive TTS demo is now available on Colab Notebook.

The pretrained models are hosted on Google Drive, so downloading them from mainland China requires a proxy or VPN.

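  1. Download the datasets
    i. Download and extract the LJ Speech dataset, then rename the dataset folder or create a link to it: ln -s /path/to/LJSpeech-1.1/wavs DUMMY1
    ii. For the multi-speaker setting, download and extract the VCTK dataset and downsample the wav files to 22050 Hz. Then rename the dataset folder or create a link to it: ln -s /path/to/VCTK-Corpus/downsampled_wavs DUMMY2
    Linux supports two kinds of file links:
    • Soft (symbolic) link, similar to a shortcut on Windows
    ln -s original_name link_name
    
    • Hard link, a second name for the same file data (it behaves like a copy, but no data is duplicated)
    ln original_name link_name
    
    If the original file is deleted, a soft link loses its target and becomes unusable.
    If the original file is deleted, a hard link is unaffected because it points directly at the file content.
    Further reading: An In-Depth Look at Linux's 3 "Copy" Commands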
Administrator@AUTOBVT-Q90417J MINGW64 /e/vits (main)
$ ln -s "E:\vits\LJSpeech-1.1\wavs" DUMMY1
/* Use the command above (quoted path). The resulting DUMMY1 folder contains the files from the wavs folder directly, with no nested wavs folder. */
Administrator@AUTOBVT-Q90417J MINGW64 /e/vits (main)
$ ln -s "E:\vits\LJSpeech-1.1\wavs" DUMMY1
/* If the wav files inside DUMMY1 are deleted and the command above is run again, a wavs folder appears inside DUMMY1. */
Administrator@AUTOBVT-Q90417J MINGW64 /e/vits (main)
$ ln -s E:\vits\LJSpeech-1.1\wavs DUMMY1
/* If the wav files inside DUMMY1 are deleted and the unquoted command above is run, the result is: */
ln: failed to create symbolic link 'DUMMY1/vitsLJSpeech-1.1wavs': No such file or directory
/* With an empty DUMMY1 folder created first, the command below gives: */
Administrator@AUTOBVT-Q90417J MINGW64 /e/vits (main)
$ ln -s E:\vits\LJSpeech-1.1\wavs DUMMY1
ln: failed to create symbolic link 'DUMMY1/vitsLJSpeech-1.1wavs': No such file or directory
/* Without creating an empty DUMMY1 folder, the command below gives: */
Administrator@AUTOBVT-Q90417J MINGW64 /e/vits (main)
$ ln -s E:\vits\LJSpeech-1.1\wavs DUMMY1
ln: failed to create symbolic link 'DUMMY1': No such file or directory
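The failures with the unquoted path are consistent with how a POSIX-style shell such as Git Bash handles backslashes: outside quotes, a backslash escapes the next character and is then dropped, which matches the vitsLJSpeech-1.1wavs fragment in the error messages. The sketch below uses Python's shlex module, which follows POSIX-style splitting rules. It is an approximation of Bash for illustration, not Bash itself.

```python
import shlex

# Unquoted: each backslash escapes the following character and disappears.
unquoted = shlex.split(r'ln -s E:\vits\LJSpeech-1.1\wavs DUMMY1')

# Double-quoted: a backslash is kept unless it precedes a quote or a backslash.
quoted = shlex.split(r'ln -s "E:\vits\LJSpeech-1.1\wavs" DUMMY1')

print(unquoted[2])  # E:vitsLJSpeech-1.1wavs
print(quoted[2])    # E:\vits\LJSpeech-1.1\wavs
```

Quoting the Windows path, as in the first two commands of the session, keeps the separators intact.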
  2. Build Monotonic Alignment Search, and run preprocessing if you use your own dataset.
# Cython-version Monotonic Alignment Search
cd monotonic_align
python setup.py build_ext --inplace

# Preprocessing (g2p) for your own datasets. Preprocessed phonemes for LJ Speech and VCTK have been already provided.
# python preprocess.py --text_index 1 --filelists filelists/ljs_audio_text_train_filelist.txt filelists/ljs_audio_text_val_filelist.txt filelists/ljs_audio_text_test_filelist.txt 
# python preprocess.py --text_index 2 --filelists filelists/vctk_audio_sid_text_train_filelist.txt filelists/vctk_audio_sid_text_val_filelist.txt filelists/vctk_audio_sid_text_test_filelist.txt
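Returning to the two link types described in the dataset step, the difference in behavior after the original file is deleted can be checked with a small experiment. This is a minimal sketch using only the standard library; the file names are made up for illustration, and on Windows os.symlink may need administrator rights or Developer Mode.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.txt")
    soft = os.path.join(tmp, "soft_link.txt")
    hard = os.path.join(tmp, "hard_link.txt")

    with open(original, "w") as f:
        f.write("hello")

    os.symlink(original, soft)  # like: ln -s original.txt soft_link.txt
    os.link(original, hard)     # like: ln original.txt hard_link.txt

    os.remove(original)

    soft_usable = os.path.exists(soft)   # follows the link; the target is gone
    soft_is_link = os.path.islink(soft)  # the dangling link itself remains
    with open(hard) as f:
        hard_content = f.read()          # the data is still reachable

print(soft_usable, soft_is_link, hard_content)
```

For DUMMY1 and DUMMY2 a soft link is the natural choice, since hard links to directories are generally not allowed.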

Now let's go back and take a look at the datasets.

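LJ Speech dataset

Description
This is a public domain speech dataset consisting of 13,100 short audio clips of a single speaker reading passages from 7 non-fiction books. A transcription is provided for each clip. Clips vary in length from 1 to 10 seconds and have a total length of approximately 24 hours.

The texts were published between 1884 and 1964 and are in the public domain. The audio was recorded in 2016-17 by the LibriVox project and is also in the public domain.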
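The figures above imply an average clip length of roughly 6.6 seconds. The sketch below does that arithmetic and also shows one way to measure the real durations once the dataset is extracted. The helper name clip_durations is hypothetical, and it assumes the clips are plain PCM wav files that Python's standard wave module can open.

```python
import wave
from pathlib import Path

def clip_durations(wav_dir):
    """Return {file name: duration in seconds} for every .wav file in wav_dir."""
    durations = {}
    for path in sorted(Path(wav_dir).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            durations[path.name] = wav.getnframes() / wav.getframerate()
    return durations

# Rough average implied by the description: about 24 hours over 13,100 clips.
approx_mean_seconds = 24 * 3600 / 13100
print(round(approx_mean_seconds, 1))  # 6.6
```

Calling clip_durations("DUMMY1") after the link is created would give per-clip values whose sum can be compared with the stated total.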

Homepage: The LJ Speech Dataset

Introduction
ljspeech

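After some searching online:
LJ Speech dataset, version 1.0
Link: https://pan.baidu.com/s/1OGDXtmNtKn-5258HfabTGA
Extraction code: jkre
LJ Speech dataset, version 1.1
Dataset: http://data.keithito.com/data/speech/LJSpeech-1.1.tar.bz2 (downloads quickly with Xunlei)
Baidu Netdisk link: https://pan.baidu.com/s/197LRZLNBb5gyREpYsMpkCg Extraction code: 7o1a

At this point I have not downloaded the official pretrained models.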

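VCTK dataset

Description
The CSTR VCTK Corpus includes speech data uttered by 110 English speakers with various accents. Each speaker reads about 400 sentences, selected from a newspaper, the Rainbow Passage, and an elicitation paragraph used for the Speech Accent Archive.
The texts were selected with a greedy algorithm that increases contextual and phonetic coverage.
All speech data was recorded with the same setup: an omnidirectional microphone (DPA 4035) and a small-diaphragm condenser microphone with very wide bandwidth (Sennheiser MKH 800), at a 96 kHz sampling frequency and 24 bits, in a hemi-anechoic chamber at the University of Edinburgh.
All recordings were converted to 16 bits and downsampled to 48 kHz.
The corpus was originally intended for HMM-based text-to-speech synthesis systems, especially speaker-adaptive HMM-based synthesis, which uses average voice models trained on multiple speakers together with speaker adaptation techniques. It is also suitable for DNN-based multi-speaker text-to-speech systems and waveform modeling. The idea resembles synthesizing a specific face from PCA-extracted facial features plus an average face.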
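Since the corpus is distributed at 48 kHz and the dataset step asks for 22050 Hz wav files, the downsampling uses a non-integer ratio. The sketch below only works out that ratio and the resulting sample count. It does not resample audio, and the round-up convention for the output length is an assumption.

```python
from fractions import Fraction

SOURCE_RATE = 48_000  # VCTK rate stated in the description
TARGET_RATE = 22_050  # rate requested in the dataset step

ratio = Fraction(TARGET_RATE, SOURCE_RATE)  # reduced to lowest terms automatically
up, down = ratio.numerator, ratio.denominator

def resampled_length(n_samples):
    """Number of output samples after resampling by up/down, rounded up."""
    return -(-n_samples * up // down)

print(up, down)                  # 147 320
print(resampled_length(48_000))  # 22050
```

A polyphase resampler such as scipy.signal.resample_poly(x, up, down) could take these two integers directly. Command-line tools such as sox or ffmpeg are another common route.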

Homepage: CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit (version 0.92)

Introduction
About the VCTK dataset
-MNIST dataset
vctk

Training examples

# LJ Speech
python train.py -c configs/ljs_base.json -m ljs_base

# VCTK
python train_ms.py -c configs/vctk_base.json -m vctk_base

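Training test

Testing with the LJ Speech and VCTK datasets

With the LJ Speech dataset downloaded and the link to the dataset folder created, but without the pretrained models, I ran training directly: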

# LJ Speech
python train.py -c configs/ljs_base.json -m ljs_base

Command-line output:

Administrator@AUTOBVT-Q90417J MINGW64 /e/vits (main)
$ python train.py -c configs/ljs_base.json -m ljs_base
DEBUG:numba.core.byteflow:bytecode dump:
>          0    NOP(arg=None, lineno=1054)
           2    LOAD_FAST(arg=0, lineno=1054)
           4    LOAD_CONST(arg=1, lineno=1054)
           6    BINARY_SUBSCR(arg=None, lineno=1054)
           8    LOAD_FAST(arg=0, lineno=1054)
          10    LOAD_CONST(arg=2, lineno=1054)
          12    BINARY_SUBSCR(arg=None, lineno=1054)
          14    COMPARE_OP(arg=4, lineno=1054)
          16    LOAD_FAST(arg=0, lineno=1054)
          18    LOAD_CONST(arg=1, lineno=1054)
          20    BINARY_SUBSCR(arg=None, lineno=1054)
          22    LOAD_FAST(arg=0, lineno=1054)
          24    LOAD_CONST(arg=3, lineno=1054)
          26    BINARY_SUBSCR(arg=None, lineno=1054)
          28    COMPARE_OP(arg=5, lineno=1054)
          30    BINARY_AND(arg=None, lineno=1054)
          32    RETURN_VALUE(arg=None, lineno=1054)
DEBUG:numba.core.byteflow:pending: deque([State(pc_initial=0 nstack_initial=0)])
DEBUG:numba.core.byteflow:stack: []
DEBUG:numba.core.byteflow:dispatch pc=0, inst=NOP(arg=None, lineno=1054)
DEBUG:numba.core.byteflow:stack []
DEBUG:numba.core.byteflow:dispatch pc=2, inst=LOAD_FAST(arg=0, lineno=1054)
DEBUG:numba.core.byteflow:stack []
DEBUG:numba.core.byteflow:dispatch pc=4, inst=LOAD_CONST(arg=1, lineno=1054)
DEBUG:numba.core.byteflow:stack ['$x2.0']
DEBUG:numba.core.byteflow:dispatch pc=6, inst=BINARY_SUBSCR(arg=None, lineno=1054)
DEBUG:numba.core.byteflow:stack ['$x2.0', '$const4.1']
DEBUG:numba.core.byteflow:dispatch pc=8, inst=LOAD_FAST(arg=0, lineno=1054)
DEBUG:numba.core.byteflow:stack ['$6binary_subscr.2']
DEBUG:numba.core.byteflow:dispatch pc=10, inst=LOAD_CONST(arg=2, lineno=1054)
DEBUG:numba.core.byteflow:stack ['$6binary_subscr.2', '$x8.3']
DEBUG:numba.core.byteflow:dispatch pc=12, inst=BINARY_SUBSCR(arg=None, lineno=1054)
DEBUG:numba.core.byteflow:stack ['$6binary_subscr.2', '$x8.3', '$const10.4']
DEBUG:numba.core.byteflow:dispatch pc=14, inst=COMPARE_OP(arg=4, lineno=1054)
DEBUG:numba.core.byteflow:stack ['$6binary_subscr.2', '$12binary_subscr.5']
DEBUG:numba.core.byteflow:dispatch pc=16, inst=LOAD_FAST(arg=0, lineno=1054)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6']
DEBUG:numba.core.byteflow:dispatch pc=18, inst=LOAD_CONST(arg=1, lineno=1054)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6', '$x16.7']
DEBUG:numba.core.byteflow:dispatch pc=20, inst=BINARY_SUBSCR(arg=None, lineno=1054)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6', '$x16.7', '$const18.8']
DEBUG:numba.core.byteflow:dispatch pc=22, inst=LOAD_FAST(arg=0, lineno=1054)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6', '$20binary_subscr.9']
DEBUG:numba.core.byteflow:dispatch pc=24, inst=LOAD_CONST(arg=3, lineno=1054)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6', '$20binary_subscr.9', '$x22.10']
DEBUG:numba.core.byteflow:dispatch pc=26, inst=BINARY_SUBSCR(arg=None, lineno=1054)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6', '$20binary_subscr.9', '$x22.10', '$const24.11']
DEBUG:numba.core.byteflow:dispatch pc=28, inst=COMPARE_OP(arg=5, lineno=1054)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6', '$20binary_subscr.9', '$26binary_subscr.12']
DEBUG:numba.core.byteflow:dispatch pc=30, inst=BINARY_AND(arg=None, lineno=1054)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6', '$28compare_op.13']
DEBUG:numba.core.byteflow:dispatch pc=32, inst=RETURN_VALUE(arg=None, lineno=1054)
DEBUG:numba.core.byteflow:stack ['$30binary_and.14']
DEBUG:numba.core.byteflow:end state. edges=[]
DEBUG:numba.core.byteflow:-------------------------Prune PHIs-------------------------
DEBUG:numba.core.byteflow:Used_phis: defaultdict(<class 'set'>, {State(pc_initial=0 nstack_initial=0): set()})
DEBUG:numba.core.byteflow:defmap: {}
DEBUG:numba.core.byteflow:phismap: defaultdict(<class 'set'>, {})
DEBUG:numba.core.byteflow:changing phismap: defaultdict(<class 'set'>, {})
DEBUG:numba.core.byteflow:keep phismap: {}
DEBUG:numba.core.byteflow:new_out: defaultdict(<class 'dict'>, {})
DEBUG:numba.core.byteflow:----------------------DONE Prune PHIs-----------------------
DEBUG:numba.core.byteflow:block_infos State(pc_initial=0 nstack_initial=0):
AdaptBlockInfo(insts=((0, {}), (2, {'res': '$x2.0'}), (4, {'res': '$const4.1'}), (6, {'index': '$const4.1', 'target': '$x2.0', 'res': '$6binary_subscr.2'}), (8, {'res': '$x8.3'}), (10, {'res': '$const10.4'}), (12, {'index': '$const10.4', 'target': '$x8.3', 'res': '$12binary_subscr.5'}), (14, {'lhs': '$6binary_subscr.2', 'rhs': '$12binary_subscr.5', 'res': '$14compare_op.6'}), (16, {'res': '$x16.7'}), (18, {'res': '$const18.8'}), (20, {'index': '$const18.8', 'target': '$x16.7', 'res': '$20binary_subscr.9'}), (22, {'res': '$x22.10'}), (24, {'res': '$const24.11'}), (26, {'index': '$const24.11', 'target': '$x22.10', 'res': '$26binary_subscr.12'}), (28, {'lhs': '$20binary_subscr.9', 'rhs': '$26binary_subscr.12', 'res': '$28compare_op.13'}), (30, {'lhs': '$14compare_op.6', 'rhs': '$28compare_op.13', 'res': '$30binary_and.14'}), (32, {'retval': '$30binary_and.14', 'castval': '$32return_value.15'})), outgoing_phis={}, blockstack=(), active_try_block=None, outgoing_edgepushed={})
DEBUG:numba.core.interpreter:label 0:
    x = arg(0, name=x)                       ['x']
    $const4.1 = const(int, 0)                ['$const4.1']
    $6binary_subscr.2 = getitem(value=x, index=$const4.1, fn=<built-in function getitem>) ['$6binary_subscr.2', '$const4.1', 'x']
    $const10.4 = const(int, -1)              ['$const10.4']
    $12binary_subscr.5 = getitem(value=x, index=$const10.4, fn=<built-in function getitem>) ['$12binary_subscr.5', '$const10.4', 'x']
    $14compare_op.6 = $6binary_subscr.2 > $12binary_subscr.5 ['$12binary_subscr.5', '$14compare_op.6', '$6binary_subscr.2']
    $const18.8 = const(int, 0)               ['$const18.8']
    $20binary_subscr.9 = getitem(value=x, index=$const18.8, fn=<built-in function getitem>) ['$20binary_subscr.9', '$const18.8', 'x']
    $const24.11 = const(int, 1)              ['$const24.11']
    $26binary_subscr.12 = getitem(value=x, index=$const24.11, fn=<built-in function getitem>) ['$26binary_subscr.12', '$const24.11', 'x']
    $28compare_op.13 = $20binary_subscr.9 >= $26binary_subscr.12 ['$20binary_subscr.9', '$26binary_subscr.12', '$28compare_op.13']
    $30binary_and.14 = $14compare_op.6 & $28compare_op.13 ['$14compare_op.6', '$28compare_op.13', '$30binary_and.14']
    $32return_value.15 = cast(value=$30binary_and.14) ['$30binary_and.14', '$32return_value.15']
    return $32return_value.15                ['$32return_value.15']

DEBUG:numba.core.byteflow:bytecode dump:
>          0    NOP(arg=None, lineno=1060)
           2    LOAD_FAST(arg=0, lineno=1060)
           4    LOAD_CONST(arg=1, lineno=1060)
           6    BINARY_SUBSCR(arg=None, lineno=1060)
           8    LOAD_FAST(arg=0, lineno=1060)
          10    LOAD_CONST(arg=2, lineno=1060)
          12    BINARY_SUBSCR(arg=None, lineno=1060)
          14    COMPARE_OP(arg=0, lineno=1060)
          16    LOAD_FAST(arg=0, lineno=1060)
          18    LOAD_CONST(arg=1, lineno=1060)
          20    BINARY_SUBSCR(arg=None, lineno=1060)
          22    LOAD_FAST(arg=0, lineno=1060)
          24    LOAD_CONST(arg=3, lineno=1060)
          26    BINARY_SUBSCR(arg=None, lineno=1060)
          28    COMPARE_OP(arg=1, lineno=1060)
          30    BINARY_AND(arg=None, lineno=1060)
          32    RETURN_VALUE(arg=None, lineno=1060)
DEBUG:numba.core.byteflow:pending: deque([State(pc_initial=0 nstack_initial=0)])
DEBUG:numba.core.byteflow:stack: []
DEBUG:numba.core.byteflow:dispatch pc=0, inst=NOP(arg=None, lineno=1060)
DEBUG:numba.core.byteflow:stack []
DEBUG:numba.core.byteflow:dispatch pc=2, inst=LOAD_FAST(arg=0, lineno=1060)
DEBUG:numba.core.byteflow:stack []
DEBUG:numba.core.byteflow:dispatch pc=4, inst=LOAD_CONST(arg=1, lineno=1060)
DEBUG:numba.core.byteflow:stack ['$x2.0']
DEBUG:numba.core.byteflow:dispatch pc=6, inst=BINARY_SUBSCR(arg=None, lineno=1060)
DEBUG:numba.core.byteflow:stack ['$x2.0', '$const4.1']
DEBUG:numba.core.byteflow:dispatch pc=8, inst=LOAD_FAST(arg=0, lineno=1060)
DEBUG:numba.core.byteflow:stack ['$6binary_subscr.2']
DEBUG:numba.core.byteflow:dispatch pc=10, inst=LOAD_CONST(arg=2, lineno=1060)
DEBUG:numba.core.byteflow:stack ['$6binary_subscr.2', '$x8.3']
DEBUG:numba.core.byteflow:dispatch pc=12, inst=BINARY_SUBSCR(arg=None, lineno=1060)
DEBUG:numba.core.byteflow:stack ['$6binary_subscr.2', '$x8.3', '$const10.4']
DEBUG:numba.core.byteflow:dispatch pc=14, inst=COMPARE_OP(arg=0, lineno=1060)
DEBUG:numba.core.byteflow:stack ['$6binary_subscr.2', '$12binary_subscr.5']
DEBUG:numba.core.byteflow:dispatch pc=16, inst=LOAD_FAST(arg=0, lineno=1060)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6']
DEBUG:numba.core.byteflow:dispatch pc=18, inst=LOAD_CONST(arg=1, lineno=1060)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6', '$x16.7']
DEBUG:numba.core.byteflow:dispatch pc=20, inst=BINARY_SUBSCR(arg=None, lineno=1060)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6', '$x16.7', '$const18.8']
DEBUG:numba.core.byteflow:dispatch pc=22, inst=LOAD_FAST(arg=0, lineno=1060)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6', '$20binary_subscr.9']
DEBUG:numba.core.byteflow:dispatch pc=24, inst=LOAD_CONST(arg=3, lineno=1060)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6', '$20binary_subscr.9', '$x22.10']
DEBUG:numba.core.byteflow:dispatch pc=26, inst=BINARY_SUBSCR(arg=None, lineno=1060)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6', '$20binary_subscr.9', '$x22.10', '$const24.11']
DEBUG:numba.core.byteflow:dispatch pc=28, inst=COMPARE_OP(arg=1, lineno=1060)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6', '$20binary_subscr.9', '$26binary_subscr.12']
DEBUG:numba.core.byteflow:dispatch pc=30, inst=BINARY_AND(arg=None, lineno=1060)
DEBUG:numba.core.byteflow:stack ['$14compare_op.6', '$28compare_op.13']
DEBUG:numba.core.byteflow:dispatch pc=32, inst=RETURN_VALUE(arg=None, lineno=1060)
DEBUG:numba.core.byteflow:stack ['$30binary_and.14']
DEBUG:numba.core.byteflow:end state. edges=[]
DEBUG:numba.core.byteflow:-------------------------Prune PHIs-------------------------
DEBUG:numba.core.byteflow:Used_phis: defaultdict(<class 'set'>, {State(pc_initial=0 nstack_initial=0): set()})
DEBUG:numba.core.byteflow:defmap: {}
DEBUG:numba.core.byteflow:phismap: defaultdict(<class 'set'>, {})
DEBUG:numba.core.byteflow:changing phismap: defaultdict(<class 'set'>, {})
DEBUG:numba.core.byteflow:keep phismap: {}
DEBUG:numba.core.byteflow:new_out: defaultdict(<class 'dict'>, {})
DEBUG:numba.core.byteflow:----------------------DONE Prune PHIs-----------------------
DEBUG:numba.core.byteflow:block_infos State(pc_initial=0 nstack_initial=0):
AdaptBlockInfo(insts=((0, {}), (2, {'res': '$x2.0'}), (4, {'res': '$const4.1'}), (6, {'index': '$const4.1', 'target': '$x2.0', 'res': '$6binary_subscr.2'}), (8, {'res': '$x8.3'}), (10, {'res': '$const10.4'}), (12, {'index': '$const10.4', 'target': '$x8.3', 'res': '$12binary_subscr.5'}), (14, {'lhs': '$6binary_subscr.2', 'rhs': '$12binary_subscr.5', 'res': '$14compare_op.6'}), (16, {'res': '$x16.7'}), (18, {'res': '$const18.8'}), (20, {'index': '$const18.8', 'target': '$x16.7', 'res': '$20binary_subscr.9'}), (22, {'res': '$x22.10'}), (24, {'res': '$const24.11'}), (26, {'index': '$const24.11', 'target': '$x22.10', 'res': '$26binary_subscr.12'}), (28, {'lhs': '$20binary_subscr.9', 'rhs': '$26binary_subscr.12', 'res': '$28compare_op.13'}), (30, {'lhs': '$14compare_op.6', 'rhs': '$28compare_op.13', 'res': '$30binary_and.14'}), (32, {'retval': '$30binary_and.14', 'castval': '$32return_value.15'})), outgoing_phis={}, blockstack=(), active_try_block=None, outgoing_edgepushed={})
DEBUG:numba.core.interpreter:label 0:
    x = arg(0, name=x)                       ['x']
    $const4.1 = const(int, 0)                ['$const4.1']
    $6binary_subscr.2 = getitem(value=x, index=$const4.1, fn=<built-in function getitem>) ['$6binary_subscr.2', '$const4.1', 'x']
    $const10.4 = const(int, -1)              ['$const10.4']
    $12binary_subscr.5 = getitem(value=x, index=$const10.4, fn=<built-in function getitem>) ['$12binary_subscr.5', '$const10.4', 'x']
    $14compare_op.6 = $6binary_subscr.2 < $12binary_subscr.5 ['$12binary_subscr.5', '$14compare_op.6', '$6binary_subscr.2']
    $const18.8 = const(int, 0)               ['$const18.8']
    $20binary_subscr.9 = getitem(value=x, index=$const18.8, fn=<built-in function getitem>) ['$20binary_subscr.9', '$const18.8', 'x']
    $const24.11 = const(int, 1)              ['$const24.11']
    $26binary_subscr.12 = getitem(value=x, index=$const24.11, fn=<built-in function getitem>) ['$26binary_subscr.12', '$const24.11', 'x']
    $28compare_op.13 = $20binary_subscr.9 <= $26binary_subscr.12 ['$20binary_subscr.9', '$26binary_subscr.12', '$28compare_op.13']
    $30binary_and.14 = $14compare_op.6 & $28compare_op.13 ['$14compare_op.6', '$28compare_op.13', '$30binary_and.14']
    $32return_value.15 = cast(value=$30binary_and.14) ['$30binary_and.14', '$32return_value.15']
    return $32return_value.15                ['$32return_value.15']

Traceback (most recent call last):
  File "train.py", line 23, in <module>
    from models import (
  File "E:\vits\models.py", line 10, in <module>
    import monotonic_align
  File "E:\vits\monotonic_align\__init__.py", line 3, in <module>
    from .monotonic_align.core import maximum_path_c
ModuleNotFoundError: No module named 'monotonic_align.monotonic_align'

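This run created E:\vits\monotonic_align\__pycache__ and E:\vits\__pycache__.
The ModuleNotFoundError says there is no module named 'monotonic_align.monotonic_align', most likely because the compiled Cython extension that monotonic_align\__init__.py imports has not been built yet.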
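Before rerunning training, it can help to check whether the compiled extension is actually present. The import in the traceback (from .monotonic_align.core import maximum_path_c) suggests the built file should sit in monotonic_align/monotonic_align/ under the repository root. The helper below is a hypothetical sketch based on that assumption, using the extension suffixes of the running interpreter.

```python
from importlib.machinery import EXTENSION_SUFFIXES
from pathlib import Path

def find_built_core(repo_root):
    """Return compiled 'core' extension files under monotonic_align/monotonic_align."""
    package_dir = Path(repo_root) / "monotonic_align" / "monotonic_align"
    # EXTENSION_SUFFIXES holds names such as '.pyd' on Windows or '.so' on Linux.
    candidates = [package_dir / ("core" + suffix) for suffix in EXTENSION_SUFFIXES]
    return [path for path in candidates if path.is_file()]
```

An empty list from find_built_core(r"E:\vits") would be consistent with the ModuleNotFoundError above.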
Build Monotonic Alignment Search and run preprocessing

# Cython-version Monotonic Alignment Search
cd monotonic_align
python setup.py build_ext --inplace

Command-line output:

Administrator@AUTOBVT-Q90417J MINGW64 /e/vits (main)
$ cd monotonic_align

Administrator@AUTOBVT-Q90417J MINGW64 /e/vits/monotonic_align (main)
$ python setup.py build_ext --inplace
Compiling core.pyx because it changed.
[1/1] Cythonizing core.pyx
running build_ext
building 'monotonic_align.core' extension
C:\Program Files\Python38\lib\site-packages\Cython\Compiler\Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: E:\vits\monotonic_align\core.pyx
  tree = Parsing.p_module(s, pxd, full_module_name)
error: Unable to find vcvarsall.bat

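Two problems were reported:
1. C:\Program Files\Python38\lib\site-packages\Cython\Compiler\Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release!

If you do not want the code compiled with Python 2 semantics, specify the language level explicitly. One option is to add a directive line at the top of every .py or .pyx file to be compiled, namely # cython: language_level=3. With thousands of .py or .pyx files that becomes impractical, so you can instead set it once in setup.py: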

cythonize(module_item,compiler_directives={'language_level': '3'})

The above is taken from "Cython directive 'language_level' not set, using 2 for now (Py2)" and
"Cython: [FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2)] solution".
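For the per-file route mentioned above, adding the directive by hand does not scale, but it can be scripted. The helper below is a hypothetical sketch, not part of Cython or VITS. It prepends the directive to every .pyx file under a folder, and as a simplification it only inspects the first line to decide whether a language level is already set.

```python
from pathlib import Path

DIRECTIVE = "# cython: language_level=3"

def add_language_level(root):
    """Prepend the directive to every .pyx file under root that lacks one."""
    changed = []
    for path in sorted(Path(root).rglob("*.pyx")):
        text = path.read_text(encoding="utf-8")
        if "language_level" in text.split("\n", 1)[0]:
            continue  # first line already sets a language level
        path.write_text(DIRECTIVE + "\n" + text, encoding="utf-8")
        changed.append(path)
    return changed
```

Setting compiler_directives once in setup.py remains the simpler choice when a single build script compiles all modules.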
2. When running a Python file that uses a Cython module, you may see the following error:

error: Unable to find vcvarsall.bat

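The cause is that the Visual C++ compiler, which vcvarsall.bat is meant to set up, could not be found. Most solutions require installing Visual Studio.
The mapping between mainstream CPython versions and the Visual C++ and Visual Studio versions, with download links for each Visual Studio release: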

CPython                   Visual C++   Visual Studio        Visual Studio download
2.6, 2.7, 3.0, 3.1, 3.2   9.0          Visual Studio 2008   x86 download, x64 download
3.3, 3.4                  10.0         Visual Studio 2010   x86 download, x64 download
3.5                       14.0         Visual Studio 2015   download

The table above is taken from "Cython error: Unable to find vcvarsall.bat".
"No Visual Studio needed: fix 'Unable to find vcvarsall.bat' with one command" offers another approach:
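Before trying either fix, a quick look at which C compilers are visible on PATH can narrow things down. This is a minimal sketch with a hypothetical helper name. Note that cl is typically on PATH only inside a Visual Studio developer prompt, and the build tooling may locate MSVC by other means, so a None result is only a hint.

```python
import shutil

def find_on_path(candidates):
    """Return (name, full path) for the first candidate found on PATH, else None."""
    for name in candidates:
        location = shutil.which(name)
        if location:
            return name, location
    return None

# 'cl' is the MSVC compiler driver; 'gcc' covers MinGW-style toolchains.
print(find_on_path(["cl", "gcc"]))
```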

Runtime environment

  • Windows 10 (64-bit)
  • Python 3.7

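1. Install Anaconda. Its package and environment management saves a lot of time and effort, letting us focus on code rather than on obscure environment or dependency problems.
2. In the Anaconda prompt, run: conda install libpython

I installed it with pip instead: pip install libpython
Command-line output: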

$ pip install libpython
Collecting libpython
  Downloading libpython-0.2.tar.gz (15 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Requirement already satisfied: requests in c:\program files\python38\lib\site-packages (from libpython) (2.28.2)
Requirement already satisfied: idna<4,>=2.5 in c:\program files\python38\lib\site-packages (from requests->libpython) (3.4)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in c:\program files\python38\lib\site-packages (from requests->libpython) (1.26.14)
Requirement already satisfied: certifi>=2017.4.17 in c:\program files\python38\lib\site-packages (from requests->libpython) (2022.12.7)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\program files\python38\lib\site-packages (from requests->libpython) (3.0.1)
Building wheels for collected packages: libpython
  Building wheel for libpython (setup.py): started
  Building wheel for libpython (setup.py): finished with status 'done'
  Created wheel for libpython: filename=libpython-0.2-py3-none-any.whl size=14410 sha256=c8c0bf0dbd5502f14e73d0da51314ce2507c4e118dc866d6722720c3f5c8c743
  Stored in directory: c:\users\administrator\appdata\local\pip\cache\wheels\f8\0e\ae\9a8610c41be91787c7899e435d6bcb161fa8df32ea3d371ecf
Successfully built libpython
Installing collected packages: libpython
Successfully installed libpython-0.2

Back to "Build Monotonic Alignment Search and run preprocessing" to see what happens now:

Administrator@AUTOBVT-Q90417J MINGW64 /e/vits (main)
$ cd monotonic_align

Administrator@AUTOBVT-Q90417J MINGW64 /e/vits/monotonic_align (main)
$ python setup.py build_ext --inplace
running build_ext
building 'monotonic_align.core' extension
error: Unable to find vcvarsall.bat

Even after uninstalling libpython, the following warning no longer appears:

C:\Program Files\Python38\lib\site-packages\Cython\Compiler\Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: E:\vits\monotonic_align\core.pyx
  tree = Parsing.p_module(s, pxd, full_module_name)

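The warning reappears only after deleting the local repository and starting over, presumably because Cython regenerates the C source only when core.pyx has changed (the first run printed "Compiling core.pyx because it changed", the second did not).
It seems that installing libpython on Windows 7 may not solve the problem...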

# Preprocessing (g2p) for your own datasets. Preprocessed phonemes for LJ Speech and VCTK have been already provided.
# python preprocess.py --text_index 1 --filelists filelists/ljs_audio_text_train_filelist.txt filelists/ljs_audio_text_val_filelist.txt filelists/ljs_audio_text_test_filelist.txt 
# python preprocess.py --text_index 2 --filelists filelists/vctk_audio_sid_text_train_filelist.txt filelists/vctk_audio_sid_text_val_filelist.txt filelists/vctk_audio_sid_text_test_filelist.txt

Command-line output:

$ python preprocess.py --text_index 1 --filelists filelists/ljs_audio_text_train_filelist.txt filelists/ljs_audio_text_val_filelist.txt filelists/ljs_audio_text_test_filelist.txt
START: filelists/ljs_audio_text_train_filelist.txt
Traceback (most recent call last):
  File "preprocess.py", line 20, in <module>
    cleaned_text = text._clean_text(original_text, args.text_cleaners)
  File "E:\vits\text\__init__.py", line 53, in _clean_text
    text = cleaner(text)
  File "E:\vits\text\cleaners.py", line 98, in english_cleaners2
    phonemes = phonemize(text, language='en-us', backend='espeak', strip=True, preserve_punctuation=True, with_stress=True)
  File "C:\Program Files\Python38\lib\site-packages\phonemizer\phonemize.py", line 206, in phonemize
    phonemizer = BACKENDS[backend](
  File "C:\Program Files\Python38\lib\site-packages\phonemizer\backend\espeak\espeak.py", line 45, in __init__
    super().__init__(
  File "C:\Program Files\Python38\lib\site-packages\phonemizer\backend\espeak\base.py", line 39, in __init__
    super().__init__(
  File "C:\Program Files\Python38\lib\site-packages\phonemizer\backend\base.py", line 77, in __init__
    raise RuntimeError(  # pragma: nocover
RuntimeError: espeak not installed on your system

This run created E:\vits\text\__pycache__ and E:\vits\__pycache__.
Solutions:
RuntimeError: espeak not installed on your system [Solved]
RuntimeError: espeak not installed on your system #44
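A quick way to see what the phonemizer backend might be missing is to check whether an eSpeak executable is on PATH and whether a library override is set. The sketch below is a hypothetical helper. The PHONEMIZER_ESPEAK_LIBRARY variable is read by recent phonemizer releases, and whether the version installed here honors it is an assumption.

```python
import os
import shutil

def espeak_status(env=None):
    """Collect hints about whether phonemizer could locate eSpeak."""
    env = os.environ if env is None else env
    return {
        "espeak_on_path": shutil.which("espeak"),
        "espeak_ng_on_path": shutil.which("espeak-ng"),
        # Assumption: the installed phonemizer reads this variable, which
        # should hold the path to the eSpeak shared library.
        "library_override": env.get("PHONEMIZER_ESPEAK_LIBRARY"),
    }

print(espeak_status())
```

If all three values are None, installing eSpeak or eSpeak NG and making it discoverable is the likely next step, as the linked solutions describe.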

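To be continued

This is as far as I got with jaywalnut310/vits: the dependencies it installs seriously interfere with the normal use of Whisper. I will try again on Linux later.
Related to jaywalnut310/vits: "End-to-end speech synthesis model VITS, trained on Japanese data", Ikaros/vits-japanese

Starting with the next post, I will study:
CjangCjengh/vits
A Voice Model Training Tutorial That Even Shimoe Koharu Can Follow (下江小春也能看懂的语音模型训练教程)
[VITS / Speech Synthesis] Quickly Fit Your Voice Model Using a Pretrained Model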