Approval prediction device, approval prediction method, and program

Foreign code F170009036
File No. E096P02WO
Posted date Apr 26, 2017
Country EPO
Application number 13844515
Gazette No. 2905363
Gazette No. 2905363
Date of filing Sep 27, 2013
Gazette Date Aug 12, 2015
Gazette Date Feb 19, 2020
International application number JP2013076248
International publication number WO2014054526
Date of international filing Sep 27, 2013
Date of international publication Apr 10, 2014
Priority data
  • P2012-219730 (Oct 1, 2012) JP
  • 2013JP76248 (Sep 27, 2013) WO
Title Approval prediction device, approval prediction method, and program
Abstract According to an aspect of the present invention, similarity centrality measures that are centrality measures of proteins that a protein similarity network includes are calculated, interaction centrality measures that are centrality measures of the proteins that the protein-protein interaction network includes are calculated, a rejection score that represents probability of a compound to be validated to be classified as a rejected drug is calculated using classifiers that use, as training data, the approval attributes of the respective drugs, the sum and average of the similarity centrality measures per target for each drug, and the sum and average of the interaction centrality measures per target for each drug, and the rejection score is output.
Outline of related art and contending technology BACKGROUND ART
Conventional technologies for predicting off-targets and side effects of existing compounds have been disclosed.
As for the identification of protein functions according to Non Patent Literature 1, a technology for detecting off-targets of drugs by grouping proteins according to the similarities between their ligands has been disclosed where unexpected relations between drugs, such as methadone, emetine and loperamide, are found in that they antagonize receptors not previously reported in the literature.
As for the identification of drug targets according to Non Patent Literature 2, a technology has been disclosed where off-target effects are investigated using the side-effects caused by marketed drugs as a starting point and drugs are grouped according to their side effects to group the drugs having indications and structures, which makes it possible to determine additional protein targets for the drugs that were not known before.
As for the prediction of new molecular targets of known drugs according to Non Patent Literature 3, a technology has been disclosed where proteins are grouped according to the similarity of their ligands and off-target effects are investigated to find other targets in addition to the reported targets.
As for the prediction of drug target interaction networks according to Non Patent Literature 4, a technology has been disclosed where information on protein sequences and drug targets is correlated to create a new resource referred to as "pharmacological space" and, using this resource, known additional targets for known drugs are revealed and the drug targets are classified into four classes: enzymes, ion channels, G-protein-coupled receptors, and nuclear receptors.
As for the large-scale prediction of drug activity according to Non Patent Literature 5, a technology has been disclosed where a drug target-adverse effect network that is used to predict and explain the side effects of marketed drugs is created and, from various unintended interactions between drugs and certain proteins, adverse effects that could not previously be explained can be discovered.
The drug-induced liver injury prediction system according to Non Patent Literature 6 is a prediction system for identifying compounds with a high potential to cause liver injury. A technology has been disclosed where the prediction target is limited to the liver and the propensity of a given type of compound to cause liver injury is predicted based on investigations of the scientific literature. The drug-induced liver injury prediction system predicts some proteins and pathways that have the potential to cause harmful effects on the liver.
Scope of claims [claim1]
1. An approval prediction apparatus (100) for drug development comprising an output unit, a storage unit (106), and a control unit (102), wherein
the storage unit (106) includes:
a similarity network information storage unit (106b) configured for storing similarity network information on a protein similarity network that is constructed according to the similarity between proteins, said similarity between proteins being found with a protein signature-based algorithm determining the similarity between two sequence information, said similarity network information being determined as a graph comprising:
- nodes, each node representing a protein, and
- edges, such that two nodes are connected by an edge only if the nodes share considerable protein sequence similarity; a drug target storage unit (106c) configured for storing drug information containing approval attributes of drugs on approval or rejection and protein information on the proteins targeted by the drugs in association with each other; and
an interaction network information storage unit (106d) configured for storing interaction network information on a protein-protein interaction network that is constructed based on interactions between pre-selected proteins, said interaction between proteins meaning that said proteins are similar, said interaction network information being determined as a graph comprising:
- nodes, each node representing a protein, and
- edges such that two nodes are connected if the corresponding proteins have an interaction; and the control unit (102) includes:
a similarity centrality measure calculating unit (102b) configured for calculating, based on the similarity network information stored in the similarity network information storage unit (106b), similarity centrality measures that are centrality measures containing:
- a degree centrality, defined as an index representing how much the node is directly connected to nodes in the similarity network,
- a betweenness centrality B(v), using the following expression (1) formed of Sij denoting the number of shortest paths between a node i and a node j, and Sij(v) denoting the fraction of shortest paths passing through a node v:
$B(v) = \sum \frac{S_{ij}(v)}{S_{ij}}$ with $i \neq j$, $v \neq i$ and $v \neq j$,
- a closeness centrality C(v), using the following expression (2) formed of d(v,i) denoting the distance represented at the step between a node v and the node i:
$C(v) = \frac{1}{\sum d(v,i)}$ with $i \neq v$,
and
- a Burt's constraint C(i) of the proteins included in the protein similarity network, using the following expression (3) formed of piqpqj denoting a product of the proportional strength of the node j's relationship with the node i and the proportional strength of the node j's relationship with the node q:
$C(i) = \sum_{j} \left( p_{ij} + \sum_{q} p_{iq} p_{qj} \right)^{2}$ with $q \neq i, j$ and $j \neq i$;
an interaction centrality measure calculating unit (102e) configured for calculating, based on the interaction network information stored in the interaction network information storage unit (106d), interaction centrality measures that are centrality measures containing said degree centrality, betweenness centrality calculated with expression (1), closeness centrality calculated with expression (2), and Burt's constraint of the proteins included in the protein-protein interaction network calculated with expression (3);
a rejection score calculating unit (102f) configured for calculating during the drug development a rejection score that represents a probability of a compound to be validated to be classified as a rejected drug, using classifiers that use, as training data, the approval attributes of the respective drugs stored in the drug target storage unit (106c), the sum and average of the similarity centrality measures per target for each drug that are calculated by the similarity centrality measure calculating unit (102b), and the sum and average of the interaction centrality measures per target for each drug that are calculated by the interaction centrality measure calculating unit (102e), said similarity centrality measures used as parameters for several machine learning classifiers; and
a rejection score outputting unit (102g) configured for outputting, via the output unit, the rejection score that is calculated by the rejection score calculating unit (102f).
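
The degree centrality and expressions (1) and (2) of claim 1 can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the implementation of the claimed apparatus: the network is taken to be unweighted and undirected, expression (1) is summed over unordered pairs (i, j), the number of shortest paths through v is divided by the number of all shortest paths between i and j, and the function names `bfs` and `centralities` and the four-protein example graph are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs(graph, source):
    """Return (distances, shortest-path counts) from source to each reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                # Every shortest path to u extends to a shortest path to w.
                sigma[w] += sigma[u]
    return dist, sigma


def centralities(graph):
    """Degree, betweenness (expression 1) and closeness (expression 2) per node."""
    info = {v: bfs(graph, v) for v in graph}
    result = {}
    for v in graph:
        dist_v, sigma_v = info[v]
        total_distance = sum(d for node, d in dist_v.items() if node != v)
        closeness = 1 / total_distance if total_distance else 0.0
        betweenness = 0.0
        others = [n for n in graph if n != v]
        for i, j in combinations(others, 2):  # assumption: unordered pairs
            dist_i, sigma_i = info[i]
            if j not in dist_i or v not in dist_i or j not in dist_v:
                continue  # i, j and v are not all mutually reachable
            if dist_i[v] + dist_v[j] == dist_i[j]:
                # v lies on at least one shortest path between i and j.
                betweenness += sigma_i[v] * sigma_v[j] / sigma_i[j]
        result[v] = {
            "degree": len(graph[v]),
            "betweenness": betweenness,
            "closeness": closeness,
        }
    return result


# Hypothetical four-protein chain A - B - C - D (illustrative only).
graph = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
result = centralities(graph)
print(result["B"])
```

In this toy chain, node B lies on the shortest paths A-C and A-D, so its betweenness is 2, and its closeness is 1/(1 + 1 + 2) = 0.25.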

[claim2]
2. An approval prediction apparatus (100) for drug development comprising an output unit, a storage unit (106), and a control unit (102), wherein
the storage unit (106) includes:
a similarity network information storage unit (106b) configured for storing similarity network information on a protein similarity network that includes proteins having similarity, said similarity between proteins being found with a protein signature-based algorithm determining the similarity between two sequence information, said similarity network information being determined as a graph comprising:
- nodes, each node representing a protein, and
- edges, such that two nodes are connected by an edge only if the nodes share considerable protein sequence similarity; and a drug target storage unit (106c) configured for storing drug information containing approval attributes of drugs on approval or rejection and protein information on the proteins targeted by the drugs in association with each other; and
wherein the control unit (102) includes:
a similarity centrality measure calculating unit (102b) configured for calculating, based on the similarity network information stored in the similarity network information storage unit (106b), similarity centrality measures that are centrality measures containing:
- a degree centrality, defined as an index representing how much the node is directly connected to nodes in the similarity network,
- a betweenness centrality B(v), using the following expression (1) formed of Sij denoting the number of shortest paths between a node i and a node j, and Sij(v) denoting the fraction of shortest paths passing through a node v:
$B(v) = \sum \frac{S_{ij}(v)}{S_{ij}}$ with $i \neq j$, $v \neq i$ and $v \neq j$,
- a closeness centrality C(v), using the following expression (2) formed of d(v,i) denoting the distance represented at the step between a node v and the node i:
$C(v) = \frac{1}{\sum d(v,i)}$ with $i \neq v$,
and
- a Burt's constraint C(i) of the proteins that the protein similarity network includes, using the following expression (3) formed of piqpqj denoting a product of the proportional strength of the node j's relationship with the node i and the proportional strength of the node j's relationship with the node q:
$C(i) = \sum_{j} \left( p_{ij} + \sum_{q} p_{iq} p_{qj} \right)^{2}$ with $q \neq i, j$ and $j \neq i$;
an approval determining unit (102c) configured for obtaining, based on the approval attributes of the drugs targeting the proteins according to the protein information stored in the drug target storage unit (106c), which are the proteins included in the protein similarity network, during the drug development a determination result representing whether the proteins to be validated, which are proteins that the similarity network includes, are within a range of targets of approved drugs or a range of targets of rejected drugs, using the similarity centrality measures of the proteins to be validated that are calculated by the similarity centrality measure calculating unit (102b); and
a determination result outputting unit (102d) configured for outputting, via the output unit, the determination result that is obtained by the approval determining unit (102c).
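
Expression (3), Burt's constraint, can be sketched in Python as follows. This is a hedged illustration rather than the claimed implementation. It assumes an unweighted, undirected network in which the proportional strength p_ij equals 1/degree(i) when i and j are linked and 0 otherwise, and it sums over the direct neighbours j of node i, as common implementations do, although the expression itself only states j ≠ i. The names `proportional_strength` and `burt_constraint` and both example graphs are hypothetical.

```python
def proportional_strength(graph, i, j):
    """p_ij for an unweighted, undirected graph (assumption): 1/deg(i) if linked, else 0."""
    return 1 / len(graph[i]) if j in graph[i] else 0.0


def burt_constraint(graph, i):
    """Burt's constraint of node i following expression (3)."""
    total = 0.0
    for j in graph[i]:  # assumption: j ranges over the direct neighbours of i
        indirect = sum(
            proportional_strength(graph, i, q) * proportional_strength(graph, q, j)
            for q in graph
            if q != i and q != j
        )
        total += (proportional_strength(graph, i, j) + indirect) ** 2
    return total


# Hypothetical examples: a triangle and a star with three leaves.
triangle = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}
star = {"hub": {"x", "y", "z"}, "x": {"hub"}, "y": {"hub"}, "z": {"hub"}}

print(burt_constraint(triangle, "A"))
print(burt_constraint(star, "hub"))
print(burt_constraint(star, "x"))
```

The hub of the star has the lowest constraint because its neighbours are not connected to one another, whereas each leaf depends entirely on a single contact.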

[claim3]
3. The approval prediction apparatus (100) according to claim 1 or 2, wherein
the storage unit (106) further includes a protein sequence information storage unit (106a) configured for storing sequence information on amino acid sequences of the proteins, and
the control unit (102) further includes a similarity network information storing unit (102a) configured for creating, when the similarity is detected between the proteins using a signature-based algorithm and based on the sequence information stored in the protein sequence information storage unit (106a), the protein similarity network including the proteins between which the similarity is detected and for storing the similarity network information on the protein similarity network in the similarity network information storage unit (106b).
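
The network-building step of claim 3 can be sketched as below. The claim does not specify the signature-based algorithm, so this sketch swaps in a simple stand-in: each sequence is reduced to its set of 3-residue substrings, and two proteins are linked when the Jaccard similarity of those sets reaches a threshold. The stand-in, the threshold of 0.5, the function names, and the example sequences are all assumptions made for illustration.

```python
from itertools import combinations


def kmers(sequence, k=3):
    """Set of length-k substrings, used here as a stand-in 'signature'."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_similarity_network(sequences, threshold=0.5, k=3):
    """Link two proteins only if their signature similarity reaches the threshold."""
    signatures = {name: kmers(seq, k) for name, seq in sequences.items()}
    graph = {name: set() for name in sequences}
    for a, b in combinations(sequences, 2):
        if jaccard(signatures[a], signatures[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical amino acid sequences (illustrative only).
sequences = {
    "P1": "MKTAYIAKQR",
    "P2": "MKTAYIAKQL",
    "P3": "GGSWWPLLNE",
}
network = build_similarity_network(sequences)
print(network)
```

Here P1 and P2 differ only in their last residue and are connected, while P3 shares no substring with either and stays isolated.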

[claim4]
4. The approval prediction apparatus (100) according to claim 2, wherein, based on the approval attributes of the drugs targeting the proteins according to the protein information stored in the drug target storage unit (106c), which are the proteins that the protein similarity network includes, the approval determining unit (102c) is configured for generating a determination result representing that the proteins to be validated are within the range of targets of rejected drugs when the degree centrality contained in the similarity centrality measures of the proteins to be validated that are calculated by the similarity centrality measure calculating unit (102b) is high, the closeness centrality is low, and the Burt's constraint is extremely low.
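
The rule in claim 4 can be written as a simple predicate. The claim gives no numeric cut-offs for "high", "low", or "extremely low", so the thresholds below are purely hypothetical placeholders, as are the function name and the dictionary keys.

```python
def in_rejected_range(measures, degree_high=10, closeness_low=0.1,
                      constraint_very_low=0.05):
    """True when a protein's similarity centrality measures match the claim 4 pattern.

    All three thresholds are hypothetical; the source does not state values.
    """
    return (
        measures["degree"] >= degree_high
        and measures["closeness"] <= closeness_low
        and measures["constraint"] <= constraint_very_low
    )


# Hypothetical measures for two proteins to be validated.
hub_like = {"degree": 25, "closeness": 0.02, "constraint": 0.01}
peripheral = {"degree": 2, "closeness": 0.30, "constraint": 0.60}

print(in_rejected_range(hub_like))
print(in_rejected_range(peripheral))
```

In practice the cut-offs would presumably be derived from the distribution of measures over targets of approved and rejected drugs rather than fixed by hand.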

[claim5]
5. An approval prediction method for drug development executed by an approval prediction apparatus (100) including an output unit, a storage unit (106), and a control unit (102), wherein
the storage unit (106) includes:
a similarity network information storage unit (106b) configured for storing similarity network information on a protein similarity network that is constructed according to the similarity between proteins, said similarity between proteins being found with a protein signature-based algorithm determining the similarity between two sequence information, said similarity network information being determined as a graph comprising:
- nodes, each node representing a protein, and
- edges, such that two nodes are connected by an edge only if the nodes share considerable protein sequence similarity; a drug target storage unit (106c) configured for storing drug information containing approval attributes of drugs on approval or rejection and protein information on the proteins targeted by the drugs in association with each other; and
an interaction network information storage unit (106d) configured for storing interaction network information on a protein-protein interaction network that is constructed based on interactions between pre-determined proteins, said interaction between proteins meaning that said proteins are similar, said interaction network information being determined as a graph comprising:
- nodes, each node representing a protein, and
- edges such that two nodes are connected if the corresponding proteins have an interaction; the method executed by the control unit (102) comprising:
a similarity centrality measure calculating step (SB-1) of, based on the similarity network information stored in the similarity network information storage unit (106b), calculating similarity centrality measures that are centrality measures containing a degree centrality, a betweenness centrality calculated with expression (1), a closeness centrality calculated with expression (2), and a Burt's constraint of the proteins that the protein similarity network includes calculated with expression (3);
an interaction centrality measure calculating step (SB-2) of, based on the interaction network information stored in the interaction network information storage unit (106d), calculating interaction centrality measures that are centrality measures containing:
- the degree centrality, defined as an index representing how much the node is directly connected to nodes in the similarity network,
- betweenness centrality B(v), using the following expression (1) formed of Sij denoting the number of shortest paths between a node i and a node j, and Sij(v) denoting the fraction of shortest paths passing through a node v:
$B(v) = \sum \frac{S_{ij}(v)}{S_{ij}}$ with $i \neq j$, $v \neq i$ and $v \neq j$,
- closeness centrality C(v), using the following expression (2) formed of d(v,i) denoting the distance represented at the step between a node v and the node i:
$C(v) = \frac{1}{\sum d(v,i)}$ with $i \neq v$,
and
- Burt's constraint C(i) of the proteins that the protein similarity network includes, using the following expression (3) formed of piqpqj denoting a product of the proportional strength of the node j's relationship with the node i and the proportional strength of the node j's relationship with the node q:
$C(i) = \sum_{j} \left( p_{ij} + \sum_{q} p_{iq} p_{qj} \right)^{2}$ with $q \neq i, j$ and $j \neq i$;
a rejection score calculating step (SB-3) of calculating during the drug development a rejection score that represents a probability of a compound to be validated to be classified as a rejected drug, using classifiers that use, as training data, the approval attributes of the respective drugs stored in the drug target storage unit (106c), the sum and average of the similarity centrality measures per target for each drug that are calculated at the similarity centrality measure calculating step (SB-1), and the sum and average of the interaction centrality measures per target for each drug that are calculated at the interaction centrality measure calculating step (SB-2), said similarity centrality measures used as parameters for several machine learning classifiers; and
a rejection score outputting step (SB-4) of outputting, via the output unit, the rejection score that is calculated at the rejection score calculating step (SB-3).
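
The rejection score calculating step (SB-3) can be illustrated with a small classifier sketch. The claim speaks of several machine learning classifiers without naming them, so a single logistic regression trained by stochastic gradient descent is used here purely as a stand-in. The two features (an averaged degree and an averaged Burt's constraint, both scaled to the range 0 to 1), the training rows, and all function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=2000, lr=0.1):
    """Fit a logistic regression by stochastic gradient descent (stand-in classifier)."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            error = p - y
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def rejection_score(model, x):
    """Probability that a compound with features x is classified as rejected."""
    weights, bias = model
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


# Hypothetical training data: [average degree, average constraint], scaled to 0..1.
rows = [
    [0.90, 0.10], [0.80, 0.20], [0.85, 0.05],  # rejected drugs (label 1)
    [0.20, 0.80], [0.10, 0.90], [0.30, 0.70],  # approved drugs (label 0)
]
labels = [1, 1, 1, 0, 0, 0]

model = train_logistic(rows, labels)
score_high = rejection_score(model, [0.88, 0.12])
score_low = rejection_score(model, [0.15, 0.85])
print(score_high, score_low)
```

The returned value lies between 0 and 1 and can be passed to the outputting step as the rejection score; a compound whose features resemble those of the rejected training drugs receives the higher score.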

[claim6]
6. An approval prediction method for drug development executed by an approval prediction apparatus (100) including an output unit, a storage unit (106), and a control unit (102), wherein
the storage unit (106) includes:
a similarity network information storage unit (106b) configured for storing similarity network information on a protein similarity network that includes proteins having similarity, said similarity between proteins being found with a protein signature-based algorithm determining the similarity between two sequence information, said similarity network information being determined as a graph comprising:
- nodes, each node representing a protein, and
- edges, such that two nodes are connected by an edge only if the nodes share considerable protein sequence similarity; and a drug target storage unit (106c) configured for storing drug information containing approval attributes of drugs on approval or rejection and protein information on the proteins targeted by the drugs in association with each other; and
an interaction network information storage unit (106d) configured for storing interaction network information on a protein-protein interaction network that is constructed based on interactions between pre-selected proteins, said interaction between proteins meaning that said proteins are similar, said interaction network information being determined as a graph comprising:
- nodes, each node representing a protein, and
- edges such that two nodes are connected if the corresponding proteins have an interaction; and the control unit (102) includes:
a similarity centrality measure calculating unit (102b) configured for calculating, based on the similarity network information stored in the similarity network information storage unit (106b), similarity centrality measures that are centrality measures containing:
- a degree centrality, defined as an index representing how much the node is directly connected to nodes in the similarity network,
- a betweenness centrality B(v), using the following expression (1) formed of Sij denoting the number of shortest paths between a node i and a node j, and Sij(v) denoting the fraction of shortest paths passing through a node v:
$B(v) = \sum \frac{S_{ij}(v)}{S_{ij}}$ with $i \neq j$, $v \neq i$ and $v \neq j$,
- a closeness centrality C(v), using the following expression (2) formed of d(v,i) denoting the distance represented at the step between a node v and the node i:
$C(v) = \frac{1}{\sum d(v,i)}$ with $i \neq v$,
and
- a Burt's constraint C(i) of the proteins included in the protein similarity network, using the following expression (3) formed of piqpqj denoting a product of the proportional strength of the node j's relationship with the node i and the proportional strength of the node j's relationship with the node q:
$C(i) = \sum_{j} \left( p_{ij} + \sum_{q} p_{iq} p_{qj} \right)^{2}$ with $q \neq i, j$ and $j \neq i$;
an approval determining step (SA-2) of, based on the approval attributes of the drugs targeting the proteins according to the protein information stored in the drug target storage unit (106c), which are the proteins that the protein similarity network includes, obtaining during the drug development a determination result representing whether the proteins to be validated, which are proteins that the similarity network includes, are within a range of targets of approved drugs or a range of targets of rejected drugs, using the similarity centrality measures of the proteins to be validated that are calculated at the similarity centrality measure calculating step (SA-1); and
a determination result outputting step (SA-3) of outputting, via the output unit, the determination result that is obtained at the approval determining step (SA-2).

[claim7]
7. A computer program product having a non-transitory tangible computer readable medium including programmed instructions for causing, when executed by an approval prediction apparatus (100) including an output unit, a storage unit (106), and a control unit (102), wherein
the storage unit (106) includes:
a similarity network information storage unit (106b) configured for storing similarity network information on a protein similarity network that is constructed according to the similarity between proteins, said similarity between proteins being found with a protein signature-based algorithm determining the similarity between two sequence information, said similarity network information being determined as a graph comprising:
- nodes, each node representing a protein, and
- edges, such that two nodes are connected by an edge only if the nodes share considerable protein sequence similarity; a drug target storage unit (106c) configured for storing drug information containing approval attributes of drugs on approval or rejection and protein information on the proteins targeted by the drugs in association with each other; and
an interaction network information storage unit (106d) configured for storing interaction network information on a protein-protein interaction network that is constructed based on interactions between pre-determined proteins, said interaction between proteins meaning that said proteins are similar, said interaction network information being determined as a graph comprising:
- nodes, each node representing a protein, and
- edges such that two nodes are connected if the corresponding proteins have an interaction; the approval prediction apparatus (100) to perform an approval prediction method comprising:
a similarity centrality measure calculating step (SB-1) of, based on the similarity network information stored in the similarity network information storage unit (106b), calculating similarity centrality measures that are centrality measures containing:
- a degree centrality, defined as an index representing how much the node is directly connected to nodes in the similarity network,
- a betweenness centrality B(v), using the following expression (1) formed of Sij denoting the number of shortest paths between a node i and a node j, and Sij(v) denoting the fraction of shortest paths passing through a node v:
$B(v) = \sum \frac{S_{ij}(v)}{S_{ij}}$ with $i \neq j$, $v \neq i$ and $v \neq j$,
- a closeness centrality C(v), using the following expression (2) formed of d(v,i) denoting the distance represented at the step between a node v and the node i:
$C(v) = \frac{1}{\sum d(v,i)}$ with $i \neq v$,
and
- a Burt's constraint C(i) of the proteins that the protein similarity network includes, using the following expression (3) formed of piqpqj denoting a product of the proportional strength of the node j's relationship with the node i and the proportional strength of the node j's relationship with the node q:
$C(i) = \sum_{j} \left( p_{ij} + \sum_{q} p_{iq} p_{qj} \right)^{2}$ with $q \neq i, j$ and $j \neq i$;
an interaction centrality measure calculating step (SB-2) of, based on the interaction network information stored in the interaction network information storage unit (106d), calculating interaction centrality measures that are centrality measures containing said degree centrality, betweenness centrality calculated with expression (1), closeness centrality calculated with expression (2), and Burt's constraint of the proteins that the protein-protein interaction network includes calculated with expression (3);
a rejection score calculating step (SB-3) of calculating during the drug development a rejection score that represents probability of a compound to be validated to be classified as a rejected drug, using classifiers that use, as training data, the approval attributes of the respective drugs stored in the drug target storage unit (106c), the sum and average of the similarity centrality measures per target for each drug that are calculated at the similarity centrality measure calculating step (SB-1), and the sum and average of the interaction centrality measures per target for each drug that are calculated at the interaction centrality measure calculating step (SB-2), said similarity centrality measures used as parameters for several machine learning classifiers; and
a rejection score outputting step (SB-4) of outputting, via the output unit, the rejection score that is calculated at the rejection score calculating step (SB-3).
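
The training features named in the rejection score calculating step, the sum and average of each centrality measure over a drug's targets, can be sketched as follows. This is a minimal illustration only: the function name `drug_features`, the feature naming scheme, and the centrality values are hypothetical.

```python
from statistics import mean


def drug_features(targets, centrality):
    """Sum and average of every centrality measure over the targets of one drug."""
    measures = sorted({m for t in targets for m in centrality[t]})
    features = {}
    for m in measures:
        values = [centrality[t][m] for t in targets]
        features[f"{m}_sum"] = sum(values)
        features[f"{m}_avg"] = mean(values)
    return features


# Hypothetical centrality values for two target proteins (illustrative only).
centrality = {
    "P1": {"degree": 4, "closeness": 0.25},
    "P2": {"degree": 2, "closeness": 0.75},
}

features = drug_features(["P1", "P2"], centrality)
print(features)
```

Computing the same features separately for the similarity network and the interaction network would give one feature vector per drug, which can then be paired with the drug's approval attribute as a training example.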

[claim8]
8. A computer program product having a non-transitory tangible computer readable medium including programmed instructions for causing, when executed by an approval prediction apparatus (100) including an output unit, a storage unit (106), and a control unit (102),
wherein the storage unit (106) includes:
a similarity network information storage unit (106b) configured for storing similarity network information on a protein similarity network that includes proteins having similarity, said similarity between proteins being found with a protein signature-based algorithm determining the similarity between two sequence information, said similarity network information being determined as a graph comprising:
- nodes, each node representing a protein, and
- edges, such that two nodes are connected by an edge only if the nodes share considerable protein sequence similarity; and a drug target storage unit (106c) configured for storing drug information containing approval attributes of drugs on approval or rejection and protein information on the proteins targeted by the drugs in association with each other;
the approval prediction apparatus (100) to perform an approval prediction method comprising:
a similarity centrality measure calculating step (SA-1) of, based on the similarity network information stored in the similarity network information storage unit (106b), calculating similarity centrality measures that are centrality measures containing:
- a degree centrality, defined as an index representing how much the node is directly connected to nodes in the similarity network,
- a betweenness centrality B(v), using the following expression (1) formed of Sij denoting the number of shortest paths between a node i and a node j, and Sij(v) denoting the fraction of shortest paths passing through a node v:
$B(v) = \sum \frac{S_{ij}(v)}{S_{ij}}$ with $i \neq j$, $v \neq i$ and $v \neq j$,
- a closeness centrality C(v), using the following expression (2) formed of d(v,i) denoting the distance represented at the step between a node v and the node i:
$C(v) = \frac{1}{\sum d(v,i)}$ with $i \neq v$,
and
- a Burt's constraint C(i) of the proteins that the protein similarity network includes, using the following expression (3) formed of piqpqj denoting a product of the proportional strength of the node j's relationship with the node i and the proportional strength of the node j's relationship with the node q:
$C(i) = \sum_{j} \left( p_{ij} + \sum_{q} p_{iq} p_{qj} \right)^{2}$ with $q \neq i, j$ and $j \neq i$;
an approval determining step (SA-2) of, based on the approval attributes of the drugs targeting the proteins according to the protein information stored in the drug target storage unit (106c), which are the proteins that the protein similarity network includes, obtaining during the drug development a determination result representing whether the proteins to be validated, which are proteins that the similarity network includes, are within a range of targets of approved drugs or a range of targets of rejected drugs, using the similarity centrality measures of the proteins to be validated that are calculated at the similarity centrality measure calculating step (SA-1); and
a determination result outputting step (SA-3) of outputting, via the output unit, the determination result that is obtained at the approval determining step (SA-2).
  • Applicant
  • JAPAN SCIENCE AND TECHNOLOGY AGENCY
  • Inventor
  • DA SILVA LOPES, Tiago Jose
  • KITANO, Hiroaki
  • KAWAOKA, Yoshihiro
IPC(International Patent Classification)
Specified countries Contracting States: AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
Reference ( R and D project ) ERATO KAWAOKA Infection-induced Host Responses AREA
Please contact us by e-mail or facsimile if you are interested in this patent.
