MURAL - Maynooth University Research Archive Library



    Efficient Approaches for Voice Change and Voice Conversion Systems


    Ye, Yuhang (2019) Efficient Approaches for Voice Change and Voice Conversion Systems. Masters thesis, National University of Ireland Maynooth.

    Yuhang-Ye-11126876-MEngSc-EE-MU.pdf

    Download (3MB) | Preview

    Abstract

    This thesis presents the study and design of voice change and voice conversion systems. A voice change system manipulates a speaker’s voice so that it is perceived as not having been spoken by that speaker, whereas a voice conversion system modifies a speaker’s voice so that it is perceived as being spoken by a specific target speaker. The thesis has two parts. The first part develops a low-latency, low-complexity voice change system (comprising frequency/pitch scale modification and formant scale modification algorithms) that can run on smartphones of 2012, which had very limited computational capability. Although some low-complexity voice change algorithms have been proposed and studied, real-time implementations are very rare. According to the experimental results, the proposed voice change system achieves the same quality as the baseline approach while requiring much less computation and meeting the real-time requirement. Moreover, the proposed system has been implemented in C and released as a commercial software application. The second part investigates a novel low-complexity voice conversion system (from a source speaker A to a target speaker B) that improves perceptual quality and speaker identity without introducing large processing latencies. The proposed scheme manipulates the spectrum directly using an effective and physically motivated method, Continuous Frequency Warping and Magnitude Scaling (CFWMS), to guarantee high perceptual naturalness and quality. In addition, a trajectory limitation strategy is proposed to prevent frame-by-frame discontinuities and so further enhance speech quality. The experimental results show that the proposed method outperforms the conventional baseline solutions in both objective and subjective tests.
    Item Type: Thesis (Masters)
    Keywords: Efficient Approaches; Voice Change; Voice Conversion; Systems
    Academic Unit: Faculty of Science and Engineering > Electronic Engineering
    Item ID: 13829
    Depositing User: IR eTheses
    Date Deposited: 13 Jan 2021 15:38
    URI: https://mu.eprints-hosting.org/id/eprint/13829
    Use Licence: This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA).
