
Voice interaction apparatus, its processing method, and program

Foreign code F200010122
File No. 5788
Posted date May 18, 2020
Country United States of America
Application number 201815883240
Gazette No. 10416957
Date of filing Jan 30, 2018
Gazette Date Sep 17, 2019
Priority data
  • P2017-040580 (Mar 3, 2017) JP
Title Voice interaction apparatus, its processing method, and program
Abstract A voice interaction apparatus includes voice recognition means for recognizing a voice of a user, response-sentence generation means for generating a response sentence to the voice of the user based on the recognized voice, filler generation means for generating a filler word to be inserted in a conversation, output means for outputting the generated response sentence and the generated filler word, and classification means for classifying the generated response sentence into one of predetermined speech patterns indicating predefined speech types. When the output means outputs, after the user utters a voice subsequent to the first response sentence, the filler word and outputs a second response sentence, the classification means classifies the first response sentence into one of the speech patterns, and the filler generation means generates the filler word based on the speech pattern into which the first response sentence has been classified.
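The flow described in the abstract (classify the first response sentence into a predetermined speech pattern, then generate the filler word from that pattern) can be sketched as follows. This is a minimal illustration: the pattern names, filler words, and function names are assumptions for demonstration, not taken from the patent.

```python
# Illustrative sketch of the abstract's flow; the speech patterns and
# filler words below are assumptions, not taken from the patent.

def classify_response(sentence: str) -> str:
    """Toy classifier mapping a response sentence to a speech pattern."""
    if sentence.rstrip().endswith("?"):
        return "question"
    if len(sentence.split()) <= 2:
        return "backchannel"
    return "statement"

# Candidate filler words per speech pattern (hypothetical table).
FILLERS_BY_PATTERN = {
    "question": ["well,", "let me see,"],
    "statement": ["so,", "uh,"],
    "backchannel": ["mm,", "right,"],
}

def generate_filler(first_response: str) -> str:
    """Generate a filler word conditioned on the first response's speech pattern."""
    pattern = classify_response(first_response)
    return FILLERS_BY_PATTERN[pattern][0]

print(generate_filler("Where would you like to go?"))  # prints "well,"
```

The point of the design is that the filler is no longer a fixed stopgap: it is selected from candidates conditioned on how the preceding system utterance was classified, so it is more likely to fit the conversational context.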
Outline of related art and contending technology
BACKGROUND
The present disclosure relates to a voice interaction apparatus that performs a voice interaction with a user, and its processing method and a program therefor.
A voice interaction apparatus that inserts filler words (i.e., words for filling silences in conversations) to prevent silences in conversations from being unnaturally prolonged has been known (see Japanese Unexamined Patent Application Publication No. 2014-191030).
However, the present inventors have found the following problem. That is, the aforementioned voice interaction apparatus outputs a formal (i.e., perfunctory) filler word as a word for filling a silence when a waiting time occurs in a conversation. Therefore, there is a possibility that the inserted filler word may not fit well with the content (e.g., meaning) of the conversation and hence make the conversation unnatural.
Scope of claims [claim1]
1. A voice interaction apparatus comprising:
circuitry configured to:
recognize a voice of a user;
generate a first response sentence to the voice of the user based on the recognized voice;
generate a filler word to be inserted in a conversation with the user;
output the first response sentence and the filler word;
classify the first response sentence into one of predetermined speech patterns indicating predefined speech types; and
when outputting, after the user utters a voice subsequent to the outputting of the first response sentence, the filler word and outputting a second response sentence:
classify the first response sentence into one of the predetermined speech patterns, and
generate the filler word based on the predetermined speech pattern into which the first response sentence has been classified.

[claim2]
2. The voice interaction apparatus according to claim 1, wherein the circuitry is configured to:
store table information including the predetermined speech patterns and information about types of feature values associated with the predetermined speech patterns;
calculate a feature value of a preceding or subsequent speech based on information about a type of a feature value associated with the predetermined speech pattern into which the first response sentence has been classified; and
generate the filler word based on the calculated feature value.

[claim3]
3. The voice interaction apparatus according to claim 2, wherein the information about the type of the feature value includes at least one of prosodic information of the preceding speech, linguistic information of the preceding speech, linguistic information of the subsequent speech, and prosodic information of the subsequent speech.

[claim4]
4. The voice interaction apparatus according to claim 2, wherein the circuitry is configured to:
store filler form information associated with respective feature values of filler types, each of which includes at least one filler word and indicates a type of the at least one filler word;
narrow down a number of filler types based on the predetermined speech pattern into which the first response sentence has been classified;
select one filler type associated with the calculated feature value from among the narrowed-down number of filler types; and
generate the filler word by selecting the filler word included in the selected filler type.
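Claims 2 through 4 describe a table-driven selection: each speech pattern is associated with a type of feature value (claim 2), the computed feature value is used after the filler types have been narrowed down by the speech pattern, and one filler type is selected from which a filler word is drawn (claim 4). A minimal sketch of that pipeline, assuming toy feature computation and hypothetical table contents:

```python
# Illustrative sketch of claims 2-4; all table contents, thresholds, and
# names are assumptions for demonstration, not taken from the patent.

FEATURE_TABLE = {        # claim 2: speech pattern -> type of feature value
    "question": "prosodic_preceding",
    "statement": "linguistic_subsequent",
}

FILLER_TYPES = {         # claim 4: filler type -> (feature threshold, filler words)
    "hesitation": (0.5, ["uh", "um"]),
    "acknowledgement": (0.0, ["yeah", "mm"]),
}

TYPES_BY_PATTERN = {     # claim 4: narrow down filler types by speech pattern
    "question": ["hesitation"],
    "statement": ["hesitation", "acknowledgement"],
}

def compute_feature(feature_type: str, speech: str) -> float:
    """Toy feature value: normalized utterance length stands in for the
    prosodic/linguistic features named in claim 3."""
    return min(len(speech) / 40.0, 1.0)

def select_filler(pattern: str, speech: str) -> str:
    """Narrow filler types by speech pattern, select one type by the
    computed feature value, and return a filler word from that type."""
    feature = compute_feature(FEATURE_TABLE[pattern], speech)
    candidates = TYPES_BY_PATTERN[pattern]   # narrowed-down filler types
    # pick the filler type with the largest threshold not exceeding the feature
    chosen = max((t for t in candidates if FILLER_TYPES[t][0] <= feature),
                 key=lambda t: FILLER_TYPES[t][0])
    return FILLER_TYPES[chosen][1][0]

print(select_filler("statement", "ok"))  # prints "yeah"
```

The two-stage selection (narrow by pattern, then pick by feature value) keeps the filler consistent both with the type of the preceding system utterance and with measurable properties of the surrounding speech.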

[claim5]
5. A processing method for voice interaction, comprising:
recognizing a voice of a user;
generating a first response sentence to the voice of the user based on the recognized voice;
generating a filler word to be inserted in a conversation with the user;
outputting the first response sentence and the filler word; and
when outputting, after the user utters a voice subsequent to the outputting of the first response sentence, the filler word and outputting a second response sentence:
classifying the first response sentence into one of predetermined speech patterns indicating predefined speech types, and
generating the filler word based on the predetermined speech pattern into which the first response sentence has been classified.

[claim6]
6. A non-transitory computer readable medium that stores a program for voice interaction which when executed causes a computer to perform a method comprising:
recognizing a voice of a user;
generating a first response sentence to the voice of the user based on the recognized voice;
generating a filler word to be inserted in a conversation with the user;
outputting the first response sentence and the filler word; and
when outputting, after the user utters a voice subsequent to the outputting of the first response sentence, the filler word and outputting a second response sentence:
classifying the first response sentence into one of predetermined speech patterns indicating predefined speech types, and
generating the filler word based on the predetermined speech pattern into which the first response sentence has been classified.
  • Inventor, and Inventor/Applicant
  • KAWAHARA Tatsuya
  • TAKANASHI Katsuya
  • NAKANISHI Ryosuke
  • WATANABE Narimasa
  • KYOTO UNIVERSITY
  • TOYOTA MOTOR
IPC (International Patent Classification)
Please contact us by e-mail or facsimile if you are interested in this patent.
