VOICE QUALITY CONVERSION DEVICE, VOICE QUALITY CONVERSION METHOD AND PROGRAM

Foreign code F170009223
File No. (S2016-0415-N0)
Posted date Sep 14, 2017
Country WIPO
International application number 2017JP006478
International publication number WO 2017146073
Date of international filing Feb 22, 2017
Date of international publication Aug 31, 2017
Priority data
  • P2016-032488 (Feb 23, 2016) JP
Title VOICE QUALITY CONVERSION DEVICE, VOICE QUALITY CONVERSION METHOD AND PROGRAM
Abstract The invention provides a voice quality conversion device that can convert voice quality to that of a target speaker even when the input speaker has not been identified in advance. The device includes a parameter learning unit which prepares a probability model that takes as variables voice information based on a voice, speaker information corresponding to the voice information, and phoneme information expressing a phoneme in the voice, and that expresses, by means of parameters, the coupling-energy relationship among the voice information, the speaker information and the phoneme information. The parameter learning unit determines the parameters by learning, by sequentially inputting voice information and the corresponding speaker information into the probability model. The device also includes a voice quality conversion processing unit which, on the basis of the parameters determined by the parameter learning unit and the speaker information of a target speaker, performs voice quality conversion processing on voice information based on the voice of an input speaker.
Scope of claims [claim1]
1. A voice quality conversion device that converts the voice of an input speaker into the voice quality of a target speaker, comprising:
a parameter learning unit which prepares a probability model that takes as variables voice information based on a voice, speaker information corresponding to the voice information, and phoneme information expressing a phoneme in the voice, and that expresses, by means of parameters, the coupling-energy relationship among the voice information, the speaker information and the phoneme information, and which determines the parameters by learning, by sequentially inputting the voice information and the speaker information corresponding to the voice information into the probability model; and
a voice quality conversion processing unit which performs voice quality conversion processing on the voice information based on the voice of the input speaker, on the basis of the parameters determined by the parameter learning unit and the speaker information of the target speaker.
[claim2]
2. The voice quality conversion device according to claim 1, wherein
the parameters consist of seven parameters: M, expressing the strength of the relationship between the voice information and the phoneme information; V, expressing the strength of the relationship between the phoneme information and the speaker information; U, expressing the strength of the relationship between the speaker information and the voice information; a set A of projection matrices determined by the speaker information; a bias b of the voice information; a bias c of the phoneme information; and a deviation σ of the voice information, and
these seven parameters are related to one another by formulas (A) to (D), where v denotes the voice information, h the phoneme information and s the speaker information.
[claim3]
3. A voice quality conversion method for converting the voice of an input speaker into the voice quality of a target speaker, comprising:
a parameter learning step of determining parameters by learning, in which, for a probability model that takes as variables voice information based on a voice, speaker information corresponding to the voice information, and phoneme information expressing a phoneme in the voice, and that expresses, by means of the parameters, the coupling-energy relationship among the voice information, the speaker information and the phoneme information, the voice information and the speaker information corresponding to the voice information are sequentially input into the probability model; and
a voice quality conversion processing step of performing voice quality conversion processing on the voice information based on the voice of the input speaker, on the basis of the parameters determined in the parameter learning step and the speaker information of the target speaker.
[claim4]
4. A program that causes a computer to execute:
a parameter learning step of determining parameters by learning, in which, for a probability model that takes as variables voice information based on a voice, speaker information corresponding to the voice information, and phoneme information expressing a phoneme in the voice, and that expresses, by means of the parameters, the coupling-energy relationship among the voice information, the speaker information and the phoneme information, the voice information and the speaker information corresponding to the voice information are sequentially input into the probability model; and
a voice quality conversion processing step of performing voice quality conversion processing on the voice information based on the voice of an input speaker, on the basis of the parameters determined in the parameter learning step and the speaker information of a target speaker.
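As an illustration of the kind of model the claims describe, the following is a minimal sketch of an energy function over voice information v, phoneme information h and speaker information s, using the parameter names of claim 2 (M, V, U, A, b, c, and the deviation, written sigma here). The functional form is an assumption made only for illustration: formulas (A) to (D) are not reproduced in this record, so this should not be read as the applicant's actual model. The helper names (Params, energy, dot, matvec) are hypothetical.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def dot(x: Vector, y: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def matvec(m: Matrix, x: Vector) -> Vector:
    """Matrix-vector product for a row-major list-of-lists matrix."""
    return [dot(row, x) for row in m]


@dataclass
class Params:
    """The seven parameters named in claim 2 (shapes are assumptions)."""
    M: Matrix        # voice-phoneme coupling, shape (D, H)
    V: Matrix        # phoneme-speaker coupling, shape (H, S)
    U: Matrix        # speaker-voice coupling, shape (S, D)
    A: List[Matrix]  # one (D, D) projection matrix per speaker
    b: Vector        # bias of the voice information, length D
    c: Vector        # bias of the phoneme information, length H
    sigma: Vector    # deviation of the voice information, length D


def energy(p: Params, v: Vector, h: Vector, s: Vector) -> float:
    """Assumed energy: a Gaussian term on the projected voice vector
    plus one pairwise coupling term per pair of variables."""
    d = len(v)
    # Assumption: the speaker vector s selects (or mixes) the projections in A.
    a_s = [[sum(s[k] * p.A[k][i][j] for k in range(len(s)))
            for j in range(d)] for i in range(d)]
    v_proj = matvec(a_s, v)
    u = [x / sg for x, sg in zip(v_proj, p.sigma)]  # deviation-scaled voice
    quad = 0.5 * sum(((x - bi) / sg) ** 2
                     for x, bi, sg in zip(v_proj, p.b, p.sigma))
    return (quad
            - dot(p.c, h)                # phoneme bias
            - dot(u, matvec(p.M, h))     # voice-phoneme coupling
            - dot(h, matvec(p.V, s))     # phoneme-speaker coupling
            - dot(s, matvec(p.U, u)))    # speaker-voice coupling


if __name__ == "__main__":
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    doubled = [[2.0, 0.0], [0.0, 2.0]]
    p = Params(M=zeros, V=zeros, U=zeros, A=[identity, doubled],
               b=[0.0, 0.0], c=[0.0, 0.5], sigma=[1.0, 1.0])
    v, h = [1.0, 2.0], [0.0, 1.0]
    print(energy(p, v, h, [1.0, 0.0]))  # 2.0 (speaker 0)
    print(energy(p, v, h, [0.0, 1.0]))  # 9.5 (speaker 1)
```

In this toy setting the same voice vector receives a different energy when only the speaker vector is swapped, which is the dependence on speaker information that the conversion step relies on. Learning the parameters and inferring h are outside the scope of the sketch.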
Applicant (※All designated countries except for US in the data before July 2012)
  • THE UNIVERSITY OF ELECTRO-COMMUNICATIONS
Inventor
  • NAKASHIKA TORU
  • MINAMI YASUHIRO
IPC(International Patent Classification)
Specified countries (WO2017146073)
National States: AE AG AL AM AO AT AU AZ BA BB BG BH BN BR BW BY BZ CA CH CL CN CO CR CU CZ DE DJ DK DM DO DZ EC EE EG ES FI GB GD GE GH GM GT HN HR HU ID IL IN IR IS JP KE KG KH KN KP KR KW KZ LA LC LK LR LS LU LY MA MD ME MG MK MN MW MX MY MZ NA NG NI NO NZ OM PA PE PG PH PL PT QA RO RS RU RW SA SC SD SE SG SK SL SM ST SV SY TH TJ TM TN TR TT TZ UA UG US UZ VC VN ZA ZM ZW
ARIPO: BW GH GM KE LR LS MW MZ NA RW SD SL SZ TZ UG ZM ZW
EAPO: AM AZ BY KG KZ RU TJ TM
EPO: AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
OAPI: BF BJ CF CG CI CM GA GN GQ GW KM ML MR NE SN ST TD TG
Please contact us by e-mail or fax if you are interested in this patent.