Solution search system and method, and solution search program

Foreign code F190009889
File No. 08135-KR
Posted date Aug 23, 2019
Country Republic of Korea
Application number 20157026885
Gazette No. 20150137073
Date of filing Mar 17, 2014
Gazette Date Dec 8, 2015
International application number JP2014001506
International publication number WO2014156044
Date of international filing Mar 17, 2014
Date of international publication Oct 2, 2014
Priority data
  • P2013-066768 (Mar 27, 2013) JP
  • 2014JP01506 (Mar 17, 2014) WO
Title Solution search system and method, and solution search program
Abstract In order to obtain a solution of a combinatorial bandit problem rapidly and efficiently, a solution search system searches for a combination of test objects 5 which is expected to output an optimal result from among at least two test objects each of which outputs a result based on a predetermined probability distribution. The system includes: a record superiority comparing part 1 configured to obtain a past record, for each of the test objects 5, based on an accumulation of the results having been output, and to compare the records of the test objects 5 in terms of superiority and inferiority in relation to the records of all the test objects 5; a controlling part 3 configured to perform a control to increase or decrease a measurement variable, for each of the test objects 5, based on the superiority/inferiority of the record having been compared by the record superiority comparing unit, and a latest result having been output from the test object 5; and an output instructing part 4 configured to instruct the test object 5, the measurement variable of which has exceeded a threshold value, to output a result. The output instructing part 4 determines, as a desired combination, a combination of the test objects 5 to which the largest number of the output instructions have finally been given after repetition of the output instructions.
(From EP2983098 A1)
Outline of related art and contending technology BACKGROUND ART
Conventionally, the bandit problem has been known as a typical example of a problem of searching for a solution that maximizes an expected value. In the bandit problem, a player repeatedly selects one of N different action options as needed, with the aim of maximizing the total expected reward. After each selection, the player is given a reward as a result drawn from the probability distribution associated with the selected action.
Suppose there are a plurality of slot machines, each of which pays out coins (the reward) according to some probability distribution when its lever is pulled. The probability distribution of the payout (the probability of winning) differs from one slot machine to another, and the player is assumed not to know these probabilities. The most common way to find out the probability of winning of each slot machine is to first play every machine a number of times and to judge that the machine that actually paid out the most has the highest probability of winning.
However, identifying the slot machine with the highest probability of winning in this way requires playing each slot machine a considerable number of times, and as a result a large investment is needed. For this reason, a search algorithm is needed that can find out the probability of winning of each slot machine efficiently, with as small an investment as possible.
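The cost of the conventional approach described above can be illustrated with a minimal Python sketch. The function name `explore_uniformly`, the winning probabilities, and the number of plays are hypothetical values chosen for illustration and do not come from the application; the sketch only shows that the investment grows with the number of machines times the number of trial plays.

```python
import random

def explore_uniformly(win_probs, plays_per_machine, rng):
    """Play every machine equally often; return (index of the apparent best, total plays)."""
    wins = []
    for p in win_probs:
        # Each play pays out with the machine's probability p, which the player does not know.
        wins.append(sum(rng.random() < p for _ in range(plays_per_machine)))
    # The machine with the most observed wins is judged to be the best one.
    best = max(range(len(win_probs)), key=lambda k: wins[k])
    return best, plays_per_machine * len(win_probs)

rng = random.Random(0)  # fixed seed so the run is repeatable
best, cost = explore_uniformly([0.2, 0.8, 0.5], plays_per_machine=200, rng=rng)
print(best, cost)  # cost is 3 machines x 200 plays = 600 plays
```

Every play spent on a poor machine is part of the investment, which is why a method that concentrates plays on promising machines is of interest.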
Such a search can be recast as a bandit problem of maximizing the total expected reward as described above (for example, see Non-Patent Document 1). Among bandit problems, the problem of selecting, from N different options, a combination of two or more actions that is expected to output the best result has recently attracted attention under the name combinatorial bandit problem. The combinatorial bandit problem arises not only when selecting a combination of slot machines from which a higher payout can be expected, but also in various other fields: for example, selecting the optimum combination of channels that maximizes the amount of data transmitted in a cognitive radio communication system, selecting the optimum combination of advertisements that maximizes the number of clicks in Internet advertising, and selecting the portfolio with the largest return in financial investment. This application also covers a more general combinatorial reward maximization problem, in which there are a large number of players and the reward of each player is determined by the selections of all the players (for example, by means of a payoff matrix). In the present specification, however, for the sake of simplicity, a combinatorial reward maximization problem in which the slot machines are independent of one another (in particular, a combination of two) will be described as an example.
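For the simplified case of independent slot machines, the choice of a combination can be sketched as follows. This is only an illustration under the assumption that the reward of a combination is the sum of the estimated rewards of its members; the function name `best_combination` and the numbers are hypothetical, and the exhaustive enumeration is not the search method claimed below.

```python
from itertools import combinations

def best_combination(estimated_rewards, size=2):
    """Return the tuple of machine indices whose summed estimated reward is largest."""
    return max(
        combinations(range(len(estimated_rewards)), size),
        key=lambda combo: sum(estimated_rewards[k] for k in combo),
    )

# Four machines with assumed estimated winning probabilities.
print(best_combination([0.2, 0.8, 0.5, 0.6]))  # (1, 3)
```

Enumerating every combination becomes expensive as the number of machines grows, and it presupposes reliable estimates, which is the part the bandit setting makes costly.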
Scope of claims [claim1]
1. A search system for searching for a test object that is expected to output the best result from among at least two test objects each of which outputs a result based on a probability distribution, the search system comprising: record superiority comparing means for obtaining a record for each of the test objects based on an accumulation of the results output so far, and for comparing the superiority or inferiority of the record of each test object in relation to the records of all the test objects; control means for performing, for each test object, control to increase or decrease a measurement variable based on the superiority or inferiority of the record compared by the record superiority comparing means and on the latest result output from the test object; and output instruction means for instructing a test object whose measurement variable exceeds a defined threshold value to output a result, wherein the output instruction means specifies, as the solution, at least one test object to which the largest number of output instructions have finally been given after repetition of the output instructions.

[claim2]
2. The search system according to claim 1, which searches for a combination of test objects that is expected to output the best result from among at least three test objects each of which outputs a result based on a probability distribution, wherein the output instruction means specifies, as the solution, the combination of test objects to which the largest number of output instructions have finally been given after repetition of the output instructions.

[claim3]
3. The search system according to claim 1, wherein a combination of test objects whose probability distributions change over time in accordance with the output instructions from the output instruction means is searched for, the search being performed using at least two of the search systems.

[claim4]
4. The search system according to any one of claims 1-3, wherein the record superiority comparing means raises the record of one test object toward the superior side when the one test object outputs a superior result and another test object outputs an inferior result, and lowers the record of the one test object toward the inferior side when the one test object outputs an inferior result and another test object outputs a superior result.

[claim5]
5. The search system according to any one of claims 1-4, wherein the record superiority comparing means obtains, as an internal resource value, the difference between the record of each test object and the average of the records of all the test objects, and wherein the control means: increases the measurement variable when the internal resource value is positive and the latest result output from the test object is superior; maintains the measurement variable when the internal resource value is positive and the latest result is inferior; increases the measurement variable when the internal resource value is 0 and the latest result is superior; decreases the measurement variable when the internal resource value is 0 and the latest result is inferior; maintains the measurement variable when the internal resource value is negative and the latest result is superior; and decreases the measurement variable when the internal resource value is negative and the latest result is inferior.
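
The six cases of claim 5 can be written as a small Python function, reading the internal resource value as positive, zero, or negative and the latest result as superior or inferior. This is a sketch for illustration only: the function name `update_measurement_variable` and the fixed `step` size are assumptions, since the claim does not state by how much the measurement variable is increased or decreased.

```python
def update_measurement_variable(x, internal_resource, latest_is_superior, step=1.0):
    """Return the new measurement variable for one test object.

    internal_resource: record of this object minus the average record of all objects.
    latest_is_superior: True if the latest result from this object was superior.
    step: assumed increment size (not specified in the claim).
    """
    if internal_resource > 0:
        # Positive: increase on a superior result, otherwise keep the value.
        return x + step if latest_is_superior else x
    if internal_resource == 0:
        # Zero: increase on a superior result, decrease on an inferior one.
        return x + step if latest_is_superior else x - step
    # Negative: keep the value on a superior result, decrease on an inferior one.
    return x if latest_is_superior else x - step

print(update_measurement_variable(0.0, 0.5, True))    # 1.0
print(update_measurement_variable(0.0, -0.5, False))  # -1.0
```

Read this way, an object with an above-average record is never pushed down by a single inferior result, and an object with a below-average record is never pushed up by a single superior one.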

[claim6]
6. The search system according to any one of claims 1-5, wherein the control means performs the control so that the sum of the measurement variables assigned to the respective test objects is kept constant.
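
One simple way to keep the sum of the measurement variables constant, as required by claim 6, is to shift every variable by the same offset after each update. The claim does not say how the constraint is enforced, so the uniform shift, the function name `keep_sum_constant`, and the default target of 0 are all assumptions made for this sketch.

```python
def keep_sum_constant(variables, target_sum=0.0):
    """Shift all measurement variables equally so that they add up to target_sum."""
    offset = (target_sum - sum(variables)) / len(variables)
    return [x + offset for x in variables]

print(keep_sum_constant([3.0, 1.0, -1.0]))  # [2.0, 0.0, -2.0]
```

Under such a constraint, raising the variable of one test object necessarily lowers the variables of the others, so the objects compete for a fixed total.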

[claim7]
7. A search program for searching for a test object that is expected to output the best result from among at least two test objects each of which outputs a result based on a probability distribution, the search program causing a computer to execute: a record superiority comparing step of obtaining a record for each of the test objects based on an accumulation of the results output so far, and comparing the superiority or inferiority of the record of each test object in relation to the records of all the test objects; a control step of performing, for each test object, control to increase or decrease a measurement variable based on the superiority or inferiority of the record compared in the record superiority comparing step and on the latest result output from the test object; and an output instruction step of instructing a test object whose measurement variable exceeds a defined threshold value to output a result, wherein in the output instruction step, at least one test object to which the largest number of output instructions have finally been given after repetition of the output instructions is specified as the solution.

[claim8]
8. The search program according to claim 7, wherein in the record superiority comparing step, the record of one test object is raised toward the superior side when the one test object outputs a superior result and another test object outputs an inferior result, and the record of the one test object is lowered toward the inferior side when the one test object outputs an inferior result and another test object outputs a superior result.

[claim9]
9. The search program according to claim 7 or 8, wherein in the record superiority comparing step, the difference between the record of each test object and the average of the records of all the test objects is obtained as an internal resource value, and wherein in the control step: the measurement variable is increased when the internal resource value is positive and the latest result output from the test object is superior; the measurement variable is maintained when the internal resource value is positive and the latest result is inferior; the measurement variable is increased when the internal resource value is 0 and the latest result is superior; the measurement variable is decreased when the internal resource value is 0 and the latest result is inferior; the measurement variable is maintained when the internal resource value is negative and the latest result is superior; and the measurement variable is decreased when the internal resource value is negative and the latest result is inferior.

[claim10]
10. The search program according to any one of claims 7-9, wherein in the control step, the control is performed so that the sum of the measurement variables assigned to the respective test objects is kept constant.

[claim11]
11. A search method for searching for a test object that is expected to output the best result from among at least two test objects each of which outputs a result based on a probability distribution, the search method comprising: a record superiority comparing step of obtaining a record for each of the test objects based on an accumulation of the results output so far, and comparing the superiority or inferiority of the record of each test object in relation to the records of all the test objects; a control step of performing, for each test object, control to increase or decrease a measurement variable based on the superiority or inferiority of the record compared in the record superiority comparing step and on the latest result output from the test object; and an output instruction step of instructing a test object whose measurement variable exceeds a defined threshold value to output a result, wherein in the output instruction step, at least one test object to which the largest number of output instructions have finally been given after repetition of the output instructions is specified as the solution.

[claim12]
12. The search method according to claim 11, wherein in the record superiority comparing step, the record of one test object is raised toward the superior side when the one test object outputs a superior result and another test object outputs an inferior result, and the record of the one test object is lowered toward the inferior side when the one test object outputs an inferior result and another test object outputs a superior result.

[claim13]
13. The search method according to claim 11 or 12, wherein in the record superiority comparing step, the difference between the record of each test object and the average of the records of all the test objects is obtained as an internal resource value, and wherein in the control step: the measurement variable is increased when the internal resource value is positive and the latest result output from the test object is superior; the measurement variable is maintained when the internal resource value is positive and the latest result is inferior; the measurement variable is increased when the internal resource value is 0 and the latest result is superior; the measurement variable is decreased when the internal resource value is 0 and the latest result is inferior; the measurement variable is maintained when the internal resource value is negative and the latest result is superior; and the measurement variable is decreased when the internal resource value is negative and the latest result is inferior.

[claim14]
14. The search method according to any one of claims 11-13, wherein in the control step, the control is performed so that the sum of the measurement variables assigned to the respective test objects is kept constant.
Applicant
  • RIKEN
Inventor
  • KIM, SONG-JU
  • AONO MASASHI
  • NAMEDA ETSUSHI
  • HARA MASAHIKO
IPC(International Patent Classification)
