VOICE QUALITY CONVERSION DEVICE, VOICE QUALITY CONVERSION METHOD AND PROGRAM

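Foreign patent code F170009223
Reference number (S2016-0415-N0)
Posted date September 14, 2017
Country of filing World Intellectual Property Organization (WIPO)
International application number 2017JP006478
International publication number WO 2017146073
International filing date February 22, 2017 (2017.2.22)
International publication date August 31, 2017 (2017.8.31)
Priority data
  • Japanese Patent Application No. 2016-032488 (2016.2.23) JP
Title of invention (English) VOICE QUALITY CONVERSION DEVICE, VOICE QUALITY CONVERSION METHOD AND PROGRAM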
Abstract (English) In order to provide a voice quality conversion device that can convert voice quality to that of a target speaker even without identifying the input speaker in advance, the present invention includes a parameter learning unit which: using as variables voice information based on a voice, speaker information corresponding to the voice information, and phoneme information expressing a phoneme in the voice, prepares a probability model that expresses, by means of a parameter, the combining energy relationship among the voice information, the speaker information and the phoneme information; and determines the parameter by learning, by sequentially inputting to the probability model voice information and the speaker information corresponding to that voice information. In addition, the present invention includes a voice quality conversion processing unit which, on the basis of the parameter determined by the parameter learning unit and the speaker information of a target speaker, performs voice quality conversion processing on voice information based on the voice of an input speaker.
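The kind of probability model the abstract describes can be illustrated with a toy example. The sketch below is not the patented method. It is a minimal energy-based model, assumed purely for illustration, in which voice information `v`, phoneme units `h` and a one-hot speaker vector `s` are tied together by pairwise coupling terms. The class name `ToyEnergyModel`, the matrices `W`, `U`, `V`, the biases `b`, `c`, the unit-variance Gaussian form for `v`, and the random (untrained) parameter values are all hypothetical. Parameter learning and estimation of an unknown input speaker are omitted.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class ToyEnergyModel:
    """Hypothetical pairwise energy model over voice v, phoneme units h, speaker s."""

    def __init__(self, n_voice, n_phoneme, n_speaker, seed=0):
        rng = np.random.default_rng(seed)
        # Random placeholders; a real system would learn these from data.
        self.W = 0.1 * rng.standard_normal((n_voice, n_phoneme))    # voice-phoneme coupling
        self.U = 0.1 * rng.standard_normal((n_voice, n_speaker))    # voice-speaker coupling
        self.V = 0.1 * rng.standard_normal((n_phoneme, n_speaker))  # phoneme-speaker coupling
        self.b = np.zeros(n_voice)    # voice bias
        self.c = np.zeros(n_phoneme)  # phoneme bias

    def energy(self, v, h, s):
        # Lower energy means the (v, h, s) combination is more probable.
        return (0.5 * np.sum((v - self.b) ** 2)
                - v @ self.W @ h - v @ self.U @ s
                - h @ self.V @ s - self.c @ h)

    def phoneme_posterior(self, v, s):
        # p(h_j = 1 | v, s) implied by the energy above.
        return sigmoid(v @ self.W + self.c + self.V @ s)

    def voice_mean(self, h, s):
        # Mean of p(v | h, s) implied by the energy above (unit variance).
        return self.b + self.W @ h + self.U @ s

    def convert(self, v, s_in, s_target):
        # Estimate phoneme units under the input speaker, then regenerate
        # the voice features under the target speaker (mean-field, no sampling).
        h = self.phoneme_posterior(v, s_in)
        return self.voice_mean(h, s_target)


model = ToyEnergyModel(n_voice=4, n_phoneme=3, n_speaker=2)
v = np.ones(4)
s0 = np.array([1.0, 0.0])  # hypothetical input speaker
s1 = np.array([0.0, 1.0])  # hypothetical target speaker
same = model.convert(v, s0, s0)
converted = model.convert(v, s0, s1)
```

In this toy setup the phoneme estimate is identical in both calls, so `converted` and `same` differ only by the speaker term `U @ (s1 - s0)`: the speaker component changes while the phoneme component is held fixed.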
Outline of related art and contending technology (English) BACKGROUND ART
Conventionally, in the field of voice quality conversion, a technique that converts only the speaker-related information of an input speaker's voice to that of an output speaker while preserving the phonological information, parallel voice quality conversion has been the mainstream approach. It trains the model on parallel data, that is, pairs of utterances with the same content spoken by the input speaker and the output speaker. Various statistical approaches to parallel voice quality conversion have been proposed, including methods based on GMM (Gaussian Mixture Model), NMF (Non-negative Matrix Factorization) and DNN (Deep Neural Network) (see Patent Document 1). Parallel voice quality conversion achieves relatively high accuracy thanks to the parallel constraint, but because the training data must consist of input-speaker and output-speaker utterances with matching content, convenience is lost.
On the other hand, non-parallel voice quality conversion, which does not use such parallel data when training the model, has been attracting attention. Non-parallel voice quality conversion is less accurate than parallel voice quality conversion, but because it can be trained on free utterances, it is highly convenient and practical. Non-Patent Document 1 describes a technique in which individual parameters are learned in advance from the voices of the input speaker and the output speaker, enabling voice quality conversion in which a speaker included in the training data serves as the input speaker or the target speaker.
  • Applicant (English)
  • Note: for records posted before July 2012, all designated states other than the United States
  • THE UNIVERSITY OF ELECTRO-COMMUNICATIONS
  • Inventor (English)
  • NAKASHIKA Toru
  • MINAMI Yasuhiro
International Patent Classification (IPC)
Designated states National States: AE AG AL AM AO AT AU AZ BA BB BG BH BN BR BW BY BZ CA CH CL CN CO CR CU CZ DE DJ DK DM DO DZ EC EE EG ES FI GB GD GE GH GM GT HN HR HU ID IL IN IR IS JP KE KG KH KN KP KR KW KZ LA LC LK LR LS LU LY MA MD ME MG MK MN MW MX MY MZ NA NG NI NO NZ OM PA PE PG PH PL PT QA RO RS RU RW SA SC SD SE SG SK SL SM ST SV SY TH TJ TM TN TR TT TZ UA UG US UZ VC VN ZA ZM ZW
ARIPO: BW GH GM KE LR LS MW MZ NA RW SD SL SZ TZ UG ZM ZW
EAPO: AM AZ BY KG KZ RU TJ TM
EPO: AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
OAPI: BF BJ CF CG CI CM GA GN GQ GW KM ML MR NE SN ST TD TG
If you wish to obtain a license or are interested in the content of this patent, please contact us at the address below.