INFORMATION EXTRACTION APPARATUS, INFORMATION EXTRACTION METHOD, AND INFORMATION EXTRACTION PROGRAM (New Technology Presentation Meeting)

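Foreign patent code F160008836
Reference number (S2015-0247-N0)
Date posted August 18, 2016
Country of filing World Intellectual Property Organization (WIPO)
International application number 2015JP084974
International publication number WO 2016098739
International filing date December 14, 2015 (Heisei 27)
International publication date June 23, 2016 (Heisei 28)
Priority data
  • Japanese Patent Application No. 2014-253058 (2014.12.15) JP
Title of invention (English) INFORMATION EXTRACTION APPARATUS, INFORMATION EXTRACTION METHOD, AND INFORMATION EXTRACTION PROGRAM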
Abstract (English) Provided is an information extraction apparatus that, even when the specification of a structured document has been altered, can easily and reliably extract after the alteration the same predetermined information that was extracted before it. The information extraction apparatus (100) comprises a control unit (120) that extracts portions that differ between a plurality of structured documents as variable elements and extracts elements within a predetermined range from each of the variable elements as peripheral information, and a memory unit (140) that sets at least one of the variable elements as an object to be extracted and stores the variable elements and the peripheral information for at least the object to be extracted. The control unit re-extracts the variable elements and the peripheral information from the plurality of structured documents, calculates a degree of likeness between the variable elements and the peripheral information before and after re-extraction on the basis of the re-extracted variable elements and peripheral information and the variable elements and peripheral information stored in the memory unit, and identifies, from among the re-extracted variable elements, the variable element corresponding to the object to be extracted on the basis of the calculated degree of likeness.
Claims (English) [claim1]
1. An information extraction apparatus comprising:
a control unit that acquires a plurality of structured documents, extracts portions that differ between the acquired documents as variable elements, and extracts elements within a predetermined range from each variable element as peripheral information; and
a memory unit that sets at least one of the variable elements as an object to be extracted and stores the variable element and the peripheral information for at least the object to be extracted,
wherein the control unit acquires the plurality of structured documents again, re-extracts portions that differ between the re-acquired documents as variable elements, re-extracts elements within the predetermined range from each re-extracted variable element as peripheral information, calculates a similarity between the variable elements and the peripheral information before and after re-extraction on the basis of the re-extracted variable elements and peripheral information and the variable elements and peripheral information stored in the memory unit, and identifies, from among the re-extracted variable elements, the variable element corresponding to the object to be extracted on the basis of the calculated similarity.
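
The pairing of a variable element with its peripheral information, and its storage in the memory unit, can be pictured with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the document is taken to be already flattened into a list of element strings, the "predetermined range" is modeled as a fixed number of neighbouring elements (`radius`), and `Record`, `peripheral_info`, `build_records`, and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A variable element together with its peripheral information."""
    value: str         # text of the variable element
    peripheral: tuple  # neighbouring elements inside the window


def peripheral_info(elements, index, radius=2):
    """Return the elements within `radius` positions of elements[index],
    excluding the variable element itself (assumed meaning of the range)."""
    start = max(0, index - radius)
    before = elements[start:index]
    after = elements[index + 1:index + 1 + radius]
    return tuple(before + after)


def build_records(elements, variable_indices, radius=2):
    """Pair each variable element with its peripheral information."""
    return {
        i: Record(elements[i], peripheral_info(elements, i, radius))
        for i in variable_indices
    }


# Hypothetical flattened elements of one structured document.
elements = ["<div>", "Price", "<span>", "1,200 yen", "</span>", "</div>",
            "<p>", "Stock", "12", "</p>"]
records = build_records(elements, variable_indices=[3, 8], radius=2)

# Stand-in for the memory unit: keep the element chosen as the object to be extracted.
memory = {"price": records[3]}
```

Storing the neighbours alongside the value is what later allows the same item to be located again after the document layout changes, since either the value or its surroundings may still resemble the stored record.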
[claim2]
2. The information extraction apparatus according to claim 1, wherein the variable element having the highest similarity to the variable element of the object to be extracted is identified from among the re-extracted variable elements.
[claim3]
3. The information extraction apparatus according to claim 1, wherein a similarity between the re-extracted variable elements and the variable elements stored in the memory unit is calculated, a similarity between the re-extracted peripheral information and the peripheral information stored in the memory unit is calculated, and the variable element corresponding to the object to be extracted is identified from among the re-extracted variable elements on the basis of the similarity of the variable elements and the similarity of the peripheral information.
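
Claims 2 and 3 together describe scoring each re-extracted candidate on two similarities and taking the best one. The following Python sketch shows one way this could look; it is not the patent's formula. The use of `difflib.SequenceMatcher` as the similarity measure, the linear weighting with `weight`, and the names `text_similarity`, `combined_similarity`, and `find_target` are all assumptions, as is the sample data.

```python
from difflib import SequenceMatcher


def text_similarity(a, b):
    """Similarity in [0, 1] between two strings (difflib ratio)."""
    return SequenceMatcher(None, a, b).ratio()


def combined_similarity(stored, candidate, weight=0.5):
    """Weighted mix of element similarity and peripheral similarity.

    `stored` and `candidate` are (value, peripheral) pairs, where
    `peripheral` is a sequence of neighbouring element strings.
    """
    value_sim = text_similarity(stored[0], candidate[0])
    peripheral_sim = SequenceMatcher(
        None, list(stored[1]), list(candidate[1])
    ).ratio()
    return weight * value_sim + (1 - weight) * peripheral_sim


def find_target(stored, candidates, weight=0.5):
    """Return the re-extracted candidate most similar to the stored one."""
    return max(candidates, key=lambda c: combined_similarity(stored, c, weight))


# Hypothetical data: the markup around the price changed from <span> to <b>.
stored = ("1,200 yen", ("Price", "<span>", "</span>", "</div>"))
candidates = [
    ("1,350 yen", ("Price", "<b>", "</b>", "</div>")),
    ("12", ("<p>", "Stock", "</p>")),
]
best = find_target(stored, candidates)
```

In this example the first candidate wins because both its value and part of its surroundings still resemble the stored record, even though the enclosing tags changed.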
[claim4]
4. The information extraction apparatus according to claim 1, wherein the re-extracted variable element and the variable element stored in the memory unit are each divided into a numeric part and a character part, and the similarity of the variable elements is determined on the basis of the similarity of the numeric parts and the similarity of the character parts.
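
A short Python sketch of the numeric/character split in claim 4 follows. The claim does not state how each part is scored, so the choices below are assumptions: digits are concatenated and compared by relative closeness as integers, the remaining characters are compared with `difflib.SequenceMatcher`, and the two scores are mixed with a hypothetical `numeric_weight`. The function names are likewise illustrative.

```python
import re
from difflib import SequenceMatcher


def split_parts(text):
    """Split text into (numeric part, character part)."""
    digits = "".join(re.findall(r"[0-9]", text))
    letters = re.sub(r"[0-9]", "", text)
    return digits, letters


def numeric_similarity(a, b):
    """Relative closeness of two digit strings read as integers."""
    if not a and not b:
        return 1.0  # neither element has a numeric part
    if not a or not b:
        return 0.0  # only one element has a numeric part
    x, y = int(a), int(b)
    if x == y:
        return 1.0
    return 1.0 - abs(x - y) / max(x, y)


def element_similarity(a, b, numeric_weight=0.5):
    """Mix the numeric-part and character-part similarities."""
    num_a, txt_a = split_parts(a)
    num_b, txt_b = split_parts(b)
    num_sim = numeric_similarity(num_a, num_b)
    txt_sim = SequenceMatcher(None, txt_a, txt_b).ratio()
    return numeric_weight * num_sim + (1 - numeric_weight) * txt_sim
```

Treating the parts separately means that a price moving from "1,200 yen" to "1,350 yen" still scores highly: the unit text is identical and the numbers are close, whereas a plain string comparison would penalize every changed digit equally.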
[claim5]
5. The information extraction apparatus according to claim 1, wherein the variable elements are extracted by calculating the difference (diff) between the plurality of structured documents.
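
As a rough illustration of extracting variable elements by a diff, the sketch below compares two documents that share a template and reports only the portions that differ. It assumes the documents are already flattened into lists of element strings and uses Python's standard `difflib`; the function name `variable_elements` and the sample pages are hypothetical, and the patent does not prescribe a particular diff algorithm.

```python
from difflib import SequenceMatcher


def variable_elements(doc_a, doc_b):
    """Return the differing portions of two element sequences as
    (elements from doc_a, elements from doc_b) pairs."""
    matcher = SequenceMatcher(None, doc_a, doc_b, autojunk=False)
    return [
        (doc_a[a1:a2], doc_b[b1:b2])
        for tag, a1, a2, b1, b2 in matcher.get_opcodes()
        if tag != "equal"  # keep only replaced, inserted, or deleted runs
    ]


# Two hypothetical pages built from the same template.
page_a = ["<div>", "Price", "<span>", "1,200 yen", "</span>", "</div>"]
page_b = ["<div>", "Price", "<span>", "980 yen", "</span>", "</div>"]

diff = variable_elements(page_a, page_b)
```

Here the shared markup and the label "Price" are treated as fixed, and only the price text is reported as a variable element.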
[claim6]
6. The information extraction apparatus according to claim 1, further comprising:
a display unit that displays the extracted variable elements; and
an input unit that inputs the object to be extracted selected by a user from among the displayed variable elements.
[claim7]
7. The information extraction apparatus according to claim 1, wherein the target document is acquired a plurality of times, and portions that differ with a predetermined frequency between the documents acquired the plurality of times are excluded from the variable elements as exclusion elements.
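
One possible reading of claim 7 is that content which changes on nearly every fetch of the same document (for example a rotating banner) should not be treated as a variable element. The Python sketch below follows that reading under several assumptions that are not in the source: snapshots of the same document are aligned position by position and have equal length, the "predetermined frequency" is modeled as a minimum change count `min_changes`, and the names `exclusion_positions` and `drop_excluded` are hypothetical.

```python
from collections import Counter


def exclusion_positions(snapshots, min_changes):
    """Positions whose content changed at least `min_changes` times
    between consecutive snapshots of the same document."""
    changes = Counter()
    for previous, current in zip(snapshots, snapshots[1:]):
        for pos, (old, new) in enumerate(zip(previous, current)):
            if old != new:
                changes[pos] += 1
    return {pos for pos, count in changes.items() if count >= min_changes}


def drop_excluded(variable_positions, excluded):
    """Remove exclusion elements from the list of variable positions."""
    return [pos for pos in variable_positions if pos not in excluded]


# Three hypothetical fetches of the same page; only the banner id changes.
snapshots = [
    ["<div>", "ad-17", "Price", "1,200 yen"],
    ["<div>", "ad-42", "Price", "1,200 yen"],
    ["<div>", "ad-08", "Price", "1,200 yen"],
]
excluded = exclusion_positions(snapshots, min_changes=2)
remaining = drop_excluded([1, 3], excluded)
```

With these inputs the banner position is excluded and the price position remains a candidate for extraction.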
[claim8]
8. An information extraction method comprising:
a step of acquiring a plurality of structured documents;
a step of extracting portions that differ between the acquired documents as variable elements;
a step of extracting elements within a predetermined range from each variable element as peripheral information;
a step of setting at least one of the variable elements as an object to be extracted and storing, in a memory unit, the variable element and the peripheral information for at least the object to be extracted;
a step of acquiring the plurality of structured documents again;
a step of re-extracting portions that differ between the re-acquired documents as variable elements;
a step of re-extracting elements within the predetermined range from each re-extracted variable element as peripheral information;
a step of calculating a similarity between the variable elements and the peripheral information before and after re-extraction on the basis of the re-extracted variable elements and peripheral information and the variable elements and peripheral information stored in the memory unit; and
a step of identifying, from among the re-extracted variable elements, the variable element corresponding to the object to be extracted on the basis of the calculated similarity.
[claim9]
9. The information extraction method according to claim 8, wherein the variable element having the highest similarity to the variable element of the object to be extracted is identified from among the re-extracted variable elements.
[claim10]
10. The information extraction method according to claim 8, wherein a similarity between the re-extracted variable elements and the variable elements stored in the memory unit is calculated, a similarity between the re-extracted peripheral information and the peripheral information stored in the memory unit is calculated, and the variable element corresponding to the object to be extracted is identified from among the re-extracted variable elements on the basis of the similarity of the variable elements and the similarity of the peripheral information.
[claim11]
11. The information extraction method according to claim 8, wherein the re-extracted variable element and the variable element stored in the memory unit are each divided into a numeric part and a character part, and the similarity of the variable elements is determined on the basis of the similarity of the numeric parts and the similarity of the character parts.
[claim12]
12. The information extraction method according to claim 8, wherein the variable elements are extracted by calculating the difference (diff) between the plurality of structured documents.
[claim13]
13. The information extraction method according to claim 8, further comprising:
a step of displaying the extracted variable elements; and
a step of inputting the object to be extracted selected by a user from among the displayed variable elements.
[claim14]
14. The information extraction method according to claim 8, wherein the target document is acquired a plurality of times, and portions that differ with a predetermined frequency between the documents acquired the plurality of times are excluded from the variable elements as exclusion elements.
[claim15]
15. An information extraction program for causing a computer to execute:
a step of acquiring a plurality of structured documents;
a step of extracting portions that differ between the acquired documents as variable elements;
a step of extracting elements within a predetermined range from each variable element as peripheral information;
a step of setting at least one of the variable elements as an object to be extracted and storing, in a memory unit, the variable element and the peripheral information for at least the object to be extracted;
a step of acquiring the plurality of structured documents again;
a step of re-extracting portions that differ between the re-acquired documents as variable elements;
a step of re-extracting elements within the predetermined range from each re-extracted variable element as peripheral information;
a step of calculating a similarity between the variable elements and the peripheral information before and after re-extraction on the basis of the re-extracted variable elements and peripheral information and the variable elements and peripheral information stored in the memory unit; and
a step of identifying, from among the re-extracted variable elements, the variable element corresponding to the object to be extracted on the basis of the calculated similarity.
  • Applicant (English)
  • * For records posted in or before July 2012: all designated states other than the United States
  • INTER-UNIVERSITY RESEARCH INSTITUTE CORPORATION RESEARCH ORGANIZATION OF INFORMATION AND SYSTEMS
  • Inventor (English)
  • SAKAMOTO KAZUNORI
  • HONIDEN SHINICHI
International Patent Classification (IPC)
Designated states National States: AE AG AL AM AO AT AU AZ BA BB BG BH BN BR BW BY BZ CA CH CL CN CO CR CU CZ DE DK DM DO DZ EC EE EG ES FI GB GD GE GH GM GT HN HR HU ID IL IN IR IS JP KE KG KN KP KR KZ LA LC LK LR LS LU LY MA MD ME MG MK MN MW MX MY MZ NA NG NI NO NZ OM PA PE PG PH PL PT QA RO RS RU RW SA SC SD SE SG SK SL SM ST SV SY TH TJ TM TN TR TT TZ UA UG US UZ VC VN ZA ZM ZW
ARIPO: BW GH GM KE LR LS MW MZ NA RW SD SL SZ TZ UG ZM ZW
EAPO: AM AZ BY KG KZ RU TJ TM
EPO: AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
OAPI: BF BJ CF CG CI CM GA GN GQ GW KM ML MR NE SN ST TD TG