Method and a system for predicting protein functional site, a method for improving protein function, and a function-modified protein

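Foreign patent code F110005274
Reference number E04401US2
Date posted August 29, 2011
Country of filing United States of America
Application number 34520503
Publication number 20030105615
Publication number 7231301
Filing date January 16, 2003
Publication date June 5, 2003
Publication date June 12, 2007
International application number JP1998000430
International publication number WO1998033900
International filing date February 2, 1998
International publication date August 6, 1998
Priority data
  • Japanese Patent Application 1997-019248 (1997.1.31) JP
  • Japanese Patent Application 1997-019249 (1997.1.31) JP
  • Japanese Patent Application 1997-332100 (1997.12.2) JP
  • Japanese Patent Application 1998-018699 (1998.1.30) JP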
  • 1998WO-JP00430 (1998.2.2) WO
  • 1999US-09355486 (1999.9.20) US
  • 2000US-09697138 (2000.10.27) US
Title of invention (English) Method and a system for predicting protein functional site, a method for improving protein function, and a function-modified protein
Summary of invention (English) (US7231301)
The present application provides a method for predicting the functional site of a protein using data of the entire proteins of an organism of which genome data or cDNA data is known.
More specifically, the present application provides a method for predicting a protein functional site, comprising the steps of calculating the frequency of occurrence of an oligopeptide in the entire proteins, calculating the value of each amino-acid residue contributing to the frequency of occurrence as the representative value of the function, and predicting the protein functional site by using the representative value of function as an indicator.
The present application also provides a system for predicting a functional site that automatically performs said method.
Additionally, the present application provides a method for preparing a function-modified protein comprising subjecting the amino-acid residues composing the functional site identified by the method described above to artificial mutation, and a novel thermophilic DNA polymerase prepared by the method.
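As a rough illustration of the first operation summarized above, counting how often each oligopeptide occurs across all protein sequences of an organism, the following Python sketch tallies overlapping fragments of a fixed length. The function name `oligopeptide_frequencies` and the three toy sequences are hypothetical and do not come from the patent.

```python
from collections import Counter


def oligopeptide_frequencies(proteins, length):
    """Count every overlapping oligopeptide of the given length in all proteins."""
    counts = Counter()
    for seq in proteins:
        # Slide a window of `length` residues along each sequence.
        for start in range(len(seq) - length + 1):
            counts[seq[start:start + length]] += 1
    return counts


# Toy sequences for illustration only (not real proteome data).
toy_proteins = ["MKVLA", "MKVIA", "AKVLA"]
freqs = oligopeptide_frequencies(toy_proteins, 3)
for peptide, count in sorted(freqs.items()):
    print(peptide, count)
```

A real run would read the full set of protein sequences estimated from the genome or cDNA data instead of the toy list.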
Claims (English) [claim1]
1. A method for predicting functional site of a functionally unknown protein obtained from an organism, in which amino acid sequences for all proteins expressed by the organism are estimated from known cDNA, said method comprises:
(1) determining in the amino acid sequences from all proteins of the organism, the frequency of occurrence of each amino acid and the frequency of occurrence of individual oligopeptides produced by permutations of twenty amino acids, and determining the smallest length (n) of oligopeptides having criteria of
  among oligopeptides of length (n), the number of oligopeptides which occur once in all of the proteins is smaller than the number of oligopeptides which occur twice in all of the proteins, and
  among oligopeptides of length (n+1), the number of oligopeptides which occur once in all of the proteins is larger than the number of oligopeptides which occur twice in all of the proteins;
(2) determining from all of the proteins of the organism, the frequency of occurrence of an Aji-oligopeptide of length (n+1), which is a fragment of the protein for predicting the amino-acid residues responsible for functional activity, and contains the j-th amino-acid residue Aj (n+1 <= j <= L-n) from the N-terminus of the amino acid sequence (length of L) of the protein, wherein the j-th amino-acid residue Aj is the i-th residue Aji from the N-terminus of the Aji-oligopeptide,
  the Aji-oligopeptide is aj1aj2 . . . Aji . . . ajnaj(n+1),
  1 <= i <= n+1,
  Aj is Aji and Aj is the i-th residue of the oligopeptide, and
  aj1 is Aj-i+1, . . . , aj(n+1)=Aj-i+(n+1), and
determining from all of the proteins of the organism, the frequency of occurrence of an Xji-oligopeptide of length (n+1), wherein the Xji-oligopeptide is aj1aj2 . . . Xji . . . ajnaj(n+1), and further wherein
  1 <= i <= n+1,
  n+1 <= j <= L-n; and
  the i-th residue Xji is any amino acid, and
  aj1 is Aj-i+1, . . . , aj(n+1)=Aj-i+(n+1);
(3) calculating ratio value Yji of the frequency of occurrence of the Aji-oligopeptide to that of the Xji-oligopeptide;
(4) determining mean value Yj of the value Yji, wherein
  (Equation image 10 not included in text)
(5) determining Zj, wherein Zj value is defined as the representative value of the function of the j-th amino-acid residue Aj of the amino acid sequence (length of L), and wherein Zj=f(Yj), and function f is a monotonously decreasing function or a monotonously increasing function; and
(6) repeating steps (2) to (5) sequentially and determining the Zj value of each Aj of all the amino-acid residues at positions between n+1 <= j <= L-n in the amino acid sequence (length of L), thereby predicting the degree of involvement of each amino-acid residue of said sequence in the function of the protein by using Zj value as an indicator.
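Steps (1) to (6) of claim 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: Yj is taken to be the arithmetic mean of the n+1 values Yji (the equation image is not reproduced in the text), the identity function stands in for the monotonic function f, a ratio with a zero denominator is set to 0, "any amino acid" is read as the twenty standard residues, and every function name and toy sequence is hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty standard residues


def count_oligopeptides(proteome, k):
    """Frequency of every overlapping oligopeptide of length k."""
    counts = Counter()
    for seq in proteome:
        for start in range(len(seq) - k + 1):
            counts[seq[start:start + k]] += 1
    return counts


def once_and_twice(proteome, k):
    """Number of distinct k-long oligopeptides occurring exactly once and exactly twice."""
    occurrences = Counter(count_oligopeptides(proteome, k).values())
    return occurrences[1], occurrences[2]


def smallest_length_n(proteome, max_length=10):
    """Step (1): smallest n with once < twice at length n and once > twice at n+1."""
    for n in range(1, max_length):
        once_n, twice_n = once_and_twice(proteome, n)
        once_n1, twice_n1 = once_and_twice(proteome, n + 1)
        if once_n < twice_n and once_n1 > twice_n1:
            return n
    return None  # criterion not met within max_length (assumed fallback)


def representative_values(target, proteome, n, f=lambda y: y):
    """Steps (2) to (6): Zj for every position n+1 <= j <= L-n (1-based keys)."""
    counts = count_oligopeptides(proteome, n + 1)
    z_values = {}
    for j0 in range(n, len(target) - n):      # 0-based index of residue Aj
        ratios = []
        for i0 in range(n + 1):               # Aj sits at 0-based offset i0 in the window
            start = j0 - i0
            window = target[start:start + n + 1]
            a_freq = counts[window]           # Aji-oligopeptide frequency
            x_freq = sum(                     # Xji: offset i0 replaced by any residue
                counts[window[:i0] + x + window[i0 + 1:]] for x in AMINO_ACIDS
            )
            ratios.append(a_freq / x_freq if x_freq else 0.0)  # Yji
        y_j = sum(ratios) / len(ratios)       # assumed arithmetic mean Yj
        z_values[j0 + 1] = f(y_j)             # Zj = f(Yj)
    return z_values


# Toy data for illustration only.
toy_proteome = ["ACD", "AED", "ACE"]
n = smallest_length_n(toy_proteome)
z = representative_values("ACD", toy_proteome, n)
print(n, z)
```

In this toy case only position 2 satisfies n+1 <= j <= L-n. Its two windows, "CD" and "AC", give ratios 1/2 and 2/3, because "ED" and "AE" also occur in the toy proteome.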
[claim2]
2. The method according to claim 1, wherein the Zj value (n+1 <= j <= L-n) of each amino-acid residue in the amino acid sequence (length of L) is expressed in a distribution chart.
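One simple way to express the Zj values as a distribution chart is a text bar chart with one line per residue position. The sketch below is only an assumption about what such a chart could look like; the function name `distribution_chart`, the bar scaling, and the example Zj values are made up for illustration.

```python
def distribution_chart(z_values, width=40):
    """Return a text bar chart: position, Zj value, and a bar scaled to the largest Zj."""
    if not z_values:
        return ""
    top = max(z_values.values())
    lines = []
    for position in sorted(z_values):
        z = z_values[position]
        # Bar length is proportional to Zj; no bar is drawn if the maximum is not positive.
        bar = "#" * (round(width * z / top) if top > 0 else 0)
        lines.append(f"{position:>5} {z:8.3f} {bar}")
    return "\n".join(lines)


# Hypothetical Zj values for positions 3 to 6 (made-up numbers).
example = {3: 0.25, 4: 1.0, 5: 0.5, 6: 0.75}
print(distribution_chart(example, width=20))
```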
[claim3]
3. A system for automatically predicting functional site of a functionally unknown protein obtained from an organism, in which amino acid sequences for all proteins expressed by the organism are estimated from known cDNA, which comprises:
(a) an outer memory unit for memorizing the amino acid sequences of all proteins of the organism and an existing protein data base;
(b) a first calculation/memory unit for calculating the frequency of occurrence of each amino acid and the frequency of occurrence of individual oligopeptides produced by permutations of twenty amino acids, in the amino acid sequences of all of the proteins of the organism, and a memory unit for storing the calculation results therein;
(c) a second calculation/memory unit for calculating the smallest length (n) of oligopeptides having the criteria among the individual oligopeptides of which the frequencies of the occurrences being memorized in the unit (b) of
  among oligopeptides of length (n), the number of oligopeptides which occur once in all of the proteins is smaller than the number of oligopeptides which occur twice in all of the proteins, and
  among oligopeptides of length (n+1), the number of oligopeptides which occur once in all of the proteins is larger than the number of oligopeptides which occur twice in all of the proteins, and
a memory unit for storing the calculation results therein;
(d) a third calculation/memory unit for calculating from all of the proteins of the organism, the frequency of occurrence of an Aji-oligopeptide of length (n+1), which is a fragment of the protein for predicting the amino-acid residues responsible for functional activity, and contains the j-th amino-acid residue Aj (n+1 <= j <= L-n) from the N-terminus of the amino acid sequence (length of L) of the protein, wherein the j-th amino-acid residue Aj is the i-th residue Aji from the N-terminus of the Aji-oligopeptide,
  the Aji-oligopeptide is aj1aj2 . . . Aji . . . ajnaj(n+1),
  1 <= i <= n+1,
  Aj is Aji and Aj is the i-th residue of the oligopeptide, and
  aj1 is Aj-i+1, . . . , aj(n+1)=Aj-i+(n+1), and
calculating from all of the proteins of the organism, the frequency of occurrence of an Xji-oligopeptide of length (n+1), wherein the Xji-oligopeptide is aj1aj2 . . . Xji . . . ajnaj(n+1), and further wherein
  1 <= i <= n+1,
  n+1 <= j <= L-n; and
  the i-th residue Xji is any amino acid, and
  aj1 is Aj-i+1, . . . , aj(n+1)=Aj-i+(n+1), and
a memory unit for storing the calculation results therein;
(e) a fourth calculation/memory unit for calculating ratio value Yji of the frequency of occurrence of the Aji-oligopeptide to that of the Xji-oligopeptide, and a memory unit for storing the calculation results therein;
(f) a fifth calculation/memory unit for calculating mean value Yj of the value Yji, wherein
  (Equation image 11 not included in text)
and a memory unit for storing the calculation results therein; and
(g) a sixth calculation/memory unit for determining Zj, wherein Zj value is defined as the representative value of the function of the j-th amino-acid residue Aj of the amino acid sequence (length of L), and wherein Zj=f(Yj), and function f is a monotonously decreasing function or a monotonously increasing function, and a memory unit for storing the calculation results therein;
wherein said system causes said first through sixth units to sequentially, in order from said first to said sixth units, to perform the respective calculations so as to determine the Zj value of each Aj of all the amino-acid residues at positions between n+1 <= j <= L-n in the amino acid sequence (length of L), thereby predicting the degree of involvement of each amino-acid residue of said sequence in the function of the protein by using Zj value as an indicator.
[claim4]
4. The system according to claim 3, the system being equipped with a display unit displaying the Zj value (n+1 <= j <= L-n) of each amino-acid residue in the amino acid sequence (length of L) in a distribution chart.
[claim5]
5. A computer-readable medium on which a program is stored, said program causing a computer to execute a method for predicting functional site of a functionally unknown protein obtained from an organism, in which amino acid sequences for all proteins expressed by the organism are estimated from known cDNA, said method comprises:
(1) determining in the amino acid sequences from all proteins of the organism, the frequency of occurrence of each amino acid and the frequency of occurrence of individual oligopeptides produced by permutations of twenty amino acids, and determining the smallest length (n) of oligopeptides having criteria of
  among oligopeptides of length (n), the number of oligopeptides which occur once in all of the proteins is smaller than the number of oligopeptides which occur twice in all of the proteins, and
  among oligopeptides of length (n+1), the number of oligopeptides which occur once in all of the proteins is larger than the number of oligopeptides which occur twice in all of the proteins;
(2) determining from all of the proteins of the organism, the frequency of occurrence of an Aji-oligopeptide of length (n+1), which is a fragment of the protein for predicting the amino-acid residues responsible for functional activity, and contains the j-th amino-acid residue Aj (n+1 <= j <= L-n) from the N-terminus of the amino acid sequence (length of L) of the protein, wherein the j-th amino-acid residue Aj is the i-th residue Aji from the N-terminus of the Aji-oligopeptide,
  the Aji-oligopeptide is aj1aj2 . . . Aji . . . ajnaj(n+1),
  1 <= i <= n+1,
  Aj is Aji and Aj is the i-th residue of the oligopeptide, and
  aj1 is Aj-i+1, . . . , aj(n+1)=Aj-i+(n+1), and
determining from all of the proteins of the organism, the frequency of occurrence of an Xji-oligopeptide of length (n+1), wherein the Xji-oligopeptide is aj1aj2 . . . Xji . . . ajnaj(n+1), and further wherein
  1 <= i <= n+1,
  n+1 <= j <= L-n; and
  the i-th residue Xji is any amino acid, and
  aj1 is Aj-i+1, . . . , aj(n+1)=Aj-i+(n+1);
(3) calculating ratio value Yji of the frequency of occurrence of the Aji-oligopeptide to that of the Xji-oligopeptide;
(4) determining mean value Yj of the value Yji, wherein
  (Equation image 12 not included in text)
(5) determining Zj, wherein Zj value is defined as the representative value of the function of the j-th amino-acid residue Aj of the amino acid sequence (length of L), and wherein Zj=f(Yj), and function f is a monotonously decreasing function or a monotonously increasing function; and
(6) repeating steps (2) to (5) sequentially and determining the Zj value of each Aj of all the amino-acid residues at positions between n+1 <= j <= L-n in the amino acid sequence (length of L), thereby predicting the degree of involvement of each amino-acid residue of said sequence in the function of the protein by using Zj value as an indicator.
[claim6]
6. A program recorded on a computer-readable medium for causing a computer to execute a method for predicting functional site of a functionally unknown protein obtained from an organism, in which amino acid sequences for all proteins expressed by the organism are estimated from known cDNA, said method comprises:
(1) determining in the amino acid sequences from all proteins of the organism, the frequency of occurrence of each amino acid and the frequency of occurrence of individual oligopeptides produced by permutations of twenty amino acids, and determining the smallest length (n) of oligopeptides having criteria of
  among oligopeptides of length (n), the number of oligopeptides which occur once in all of the proteins is smaller than the number of oligopeptides which occur twice in all of the proteins, and
  among oligopeptides of length (n+1), the number of oligopeptides which occur once in all of the proteins is larger than the number of oligopeptides which occur twice in all of the proteins;
(2) determining from all of the proteins of the organism, the frequency of occurrence of an Aji-oligopeptide of length (n+1), which is a fragment of the protein for predicting the amino-acid residues responsible for functional activity, and contains the j-th amino-acid residue Aj (n+1 <= j <= L-n) from the N-terminus of the amino acid sequence (length of L) of the protein, wherein the j-th amino-acid residue Aj is the i-th residue Aji from the N-terminus of the Aji-oligopeptide,
  the Aji-oligopeptide is aj1aj2 . . . Aji . . . ajnaj(n+1),
  1 <= i <= n+1,
  Aj is Aji and Aj is the i-th residue of the oligopeptide, and
  aj1 is Aj-i+1, . . . , aj(n+1)=Aj-i+(n+1), and
determining from all of the proteins of the organism, the frequency of occurrence of an Xji-oligopeptide of length (n+1), wherein the Xji-oligopeptide is aj1aj2 . . . Xji . . . ajnaj(n+1), and further wherein
  1 <= i <= n+1,
  n+1 <= j <= L-n; and
  the i-th residue Xji is any amino acid, and
  aj1 is Aj-i+1, . . . , aj(n+1)=Aj-i+(n+1);
(3) calculating ratio value Yji of the frequency of occurrence of the Aji-oligopeptide to that of the Xji-oligopeptide;
(4) determining mean value Yj of the value Yji, wherein
  (Equation image 13 not included in text)
(5) determining Zj, wherein Zj value is defined as the representative value of the function of the j-th amino-acid residue Aj of the amino acid sequence (length of L), and wherein Zj=f(Yj), and function f is a monotonously decreasing function or a monotonously increasing function; and
(6) repeating steps (2) to (5) sequentially and determining the Zj value of each Aj of all the amino-acid residues at positions between n+1 <= j <= L-n in the amino acid sequence (length of L), thereby predicting the degree of involvement of each amino-acid residue of said sequence in the function of the protein by using Zj value as an indicator.
  • Inventors/Applicants (English)
  • DOI HIROFUMI
  • HIRAKI HIDEAKI
  • KANAI AKIO
  • JAPAN SCIENCE AND TECHNOLOGY AGENCY
International Patent Classification (IPC)
Reference information (research projects, etc.) ERATO DOI Bioasymmetry AREA