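Foreign patent code F190009996
Reference number S2018-0508-C0
Date of posting October 30, 2019
Filing country World Intellectual Property Organization (WIPO)
International application number 2019JP013793
International publication number WO 2019189661
International filing date March 28, 2019 (2019.3.28)
International publication date October 3, 2019 (2019.10.3)
  • Japanese Patent Application No. 2018-066282 (2018.3.29) JP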
Summary of the invention (English) Provided are a method and a device that can efficiently create a learning dataset that is used for machine learning and targets a plurality of objects. Object information is associated with a visual marker. A learning dataset generation jig is used that consists of a base part, which is provided with an area serving as a guide for positioning a target object, and a marker fixed on the base part. The target object is positioned using the area as a guide, and in this condition an image group of the entire object including the marker is acquired. The object information associated with the visual marker is acquired from the acquired image group. A reconfigured image group is generated from this image group by performing a concealment process on a region corresponding to the visual marker or the learning dataset generation jig. A bounding box is set in the reconfigured image group on the basis of the acquired object information. Information relating to the bounding box, the object information, and the estimated position information and location information of the target object are associated with a captured image, and a learning dataset for performing object recognition and location/position estimation for the target object is generated.
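The flow described in the summary (look up object information from the marker, conceal the marker region, set a bounding box, and associate everything with the captured image) can be sketched roughly as follows. This is a minimal illustration and not the method claimed in the application: the names `Annotation`, `conceal_region`, `bounding_box`, `make_record`, and the `OBJECT_INFO` table are hypothetical, the image is a plain nested list, concealment is a simple fill, and the object outline in image coordinates is assumed to be given rather than computed from the marker.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


@dataclass
class Annotation:
    """One dataset record tied to a captured image (hypothetical layout)."""
    object_name: str
    bounding_box: Box
    position: Tuple[float, float, float]  # estimated position information
    location: Tuple[float, float, float]  # estimated location information


# Hypothetical table: marker ID -> object information registered in advance.
# The outline is assumed to be already expressed in image coordinates.
OBJECT_INFO: Dict[int, dict] = {
    7: {"name": "bottle", "outline": [(4, 1), (6, 1), (6, 3), (4, 3)]},
}


def conceal_region(image: Image, region: Box, fill: int) -> Image:
    """Return a copy of the image with the given region overwritten by `fill`."""
    x0, y0, x1, y1 = region
    out = [row[:] for row in image]
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            out[y][x] = fill
    return out


def bounding_box(points: List[Tuple[int, int]]) -> Box:
    """Axis-aligned box enclosing the outline points of the object."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def make_record(image, marker_id, marker_region, position, location):
    """Conceal the marker and build the annotation for one captured image."""
    info = OBJECT_INFO[marker_id]
    reconfigured = conceal_region(image, marker_region, fill=0)
    record = Annotation(info["name"], bounding_box(info["outline"]),
                        position, location)
    return reconfigured, record


image = [[9] * 8 for _ in range(5)]  # dummy 8x5 captured image
reconfigured, record = make_record(
    image, 7, (0, 0, 2, 2), (0.1, 0.2, 0.3), (0.0, 0.0, 90.0))
print(record)
```

A real pipeline would detect the marker in each image and derive both the marker region and the object outline from it; the sketch hard-codes them only to keep the example self-contained.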
Overview of prior art and competing technologies (English) BACKGROUND ART
Conventionally, robots equipped with artificial intelligence (hereinafter "AI") have been used to automate work in factories and similar sites. In recent years, with the development of machine learning and deep learning, AI that utilizes machine learning has been developed rapidly for production systems as well, as part of the effort toward full factory automation. The need to automate work with robots exists in every industry. In particular, the food industry and the distribution industry are expected to grow in the future, and the need for robot automation there is high. However, many of the products handled in the food and distribution industries are flexible and change shape in complicated ways when handled, which makes them difficult to handle with the robot hand provided on a robot. In addition, the goods handled in these industries were once few in kind and were produced mainly in large volumes of a small number of types, whereas today high-mix low-volume production and variable-type, variable-volume production are required, so that accurately and quickly recognizing many types of goods is also difficult. That is, it is necessary to recognize a wide variety of goods in a short time and to perform sorting operations accurately, such as gift assortment or the removal of defective items. For these reasons, factory automation by robots has in reality not yet progressed sufficiently in the food and distribution industries.
In the prior art, learning data on a product has been collected by photographing the target object, identifying the object by hand in the captured image, and drawing a bounding box. The position and orientation data of the target object have also been entered by hand. It is therefore conceivable to automate this manual work by photographing the object together with markers. However, in methods that photograph an object with markers, the positional relationship between the target object and the markers and the number of markers have not been studied sufficiently. As a result, a marker may appear within the target object or its bounding box, or a marker may be hidden by the target object. If a marker appears within the target object or the bounding box, the learning data is not of high quality, and the marker may also be learned as a feature of the object. If a marker is hidden by the target object, the accuracy of object recognition decreases.
Regarding object recognition techniques, a database construction system that automatically collects learning data is known in the field of recognizing objects in the vicinity of a vehicle (see Patent Document 1). This system uses the detection result of one sensor as supervisor data and automatically collects supervised learning data for machine learning that recognizes objects from the output of another sensor. However, the database construction system disclosed in Patent Document 1 performs object recognition using a plurality of types of sensors, and it requires a sensor that has already been trained to perform the detection.
Furthermore, an image generation method is known that generates, from an image whose shooting conditions are unknown, a training image that can be used in machine learning for image recognition (see Patent Document 2). This method generates a new, third training image from two images. However, the image generation method disclosed in Patent Document 2 creates images that differ from the real environment, so a high-quality learning dataset cannot be produced.
In addition, a method is known that estimates position and posture from RGB camera images using a 3D model of the object (see Non-Patent Document 1). However, the method disclosed in Non-Patent Document 1 always requires a 3D model of the object in advance, and it generates learning images that differ from the real environment, so it likewise cannot produce a high-quality learning dataset.
As for the problem that a marker appears in a captured image and is learned as a feature of the object, it is conceivable to conceal the marker in the image. In this regard, a technique is known that, during video shooting, conceals a moving object in real time while leaving the scene actually being photographed unchanged (see Patent Document 3). This technique detects a moving object in the video, calculates its moving direction and speed, and speculatively acquires the background image for the case where the object has moved in the calculated direction at the calculated speed. The speculatively acquired background image is then used to conceal the moving object in real time. However, the technique disclosed in Patent Document 3 can conceal only moving objects; an object that is not moving cannot be concealed.
  • Applicant (English)
  • * For records posted before July 2012: all designated states except the United States
  • Inventor (English)
  • TOMOCHIKA, Keita
  • KIYOKAWA, Takuya
  • OGASAWARA, Tsukasa
  • DING, Ming