In-Silico Methodologies for Cancer Multidrug Optimization

Drug combinations are considered an effective strategy for controlling complex diseases such as cancer. Combining drugs can decrease side effects and counter adaptive resistance, thereby increasing the likelihood of defeating complex diseases in a synergistic way. Combinations achieve this by overcoming factors such as off-target activities, network robustness, bypass mechanisms, cross-talk across compensatory escape pathways, and the mutational heterogeneity that results in alterations within multiple molecular pathways. Most of the effective drug combinations used in the clinic were found through experience, and the molecular mechanisms underlying them are often unclear, which makes it difficult to suggest new combinations. Computational approaches have been proposed to reduce the search space, identify the most promising combinations and prioritize their experimental evaluation. In this paper, we review the methods, techniques and hypotheses developed for in silico drug combination discovery in cancer and discuss their limitations and challenges.


Introduction
The target of cancer therapy is a heterogeneous population of malignant cells, each distinguished by a different degree of aggressiveness and response to therapy. Cancer cells are often resistant to apoptosis and develop resistance to cytotoxic agents, so the disease progresses despite therapy. A pre-existing subpopulation of malignant cells may not respond to a drug and may escape treatment [1,2]. Cancer cells have mechanisms to overcome perturbations. Therapies targeting only one pathway can fail in clinical trials or be defeated by mutations at a receptor. Although some tumors are initially sensitive to targeted therapies, they ultimately become resistant due to mutations in the target or bypass of the targeted pathway. Drug combinations can reduce the prospect of tumor resistance and be more effective when targeting heterogeneous populations of malignant cells [3]. They are designed to control complex diseases. Targeted drug combinations may also avoid the side effects associated with high doses of single drugs. They counter pathway restitution and increase cancer cell killing while minimizing overlapping toxicity and allowing a reduced dosage of each drug [4,5]. Most of the effective drug combinations used in the clinic were discovered through experience. Finding them systematically requires labor-intensive and time-consuming "brute force" screening of all possible combinations among the confirmed individual drugs. The molecular mechanisms underlying these drug combinations are often unclear, which makes it difficult to propose new ones. It is not practical to screen all possible drug combinations, since the number of possible combinations increases exponentially with the number of single drugs. In silico methods have been developed to predict new drug combinations before they are formulated and tested in the lab [6,7].
Some computational approaches are based on detailed mathematical modelling that concentrates on established cancer pathways and metabolic network reconstructions. Other methods use the transcriptional responses to drugs, such as gene expression profiles before and after drug treatment. Some reviews discuss drug combination effects and their applications to efficacy and toxicity [96], and the challenges in drug combination discovery [97]. Sayed Ali Madani Tonekaboni et al [98] reviewed predictive approaches for drug combination discovery in cancer. They extensively discussed methods for quantifying experimental drug combination effects, data sources for predictive drug combination approaches, and assessment strategies. Their discussion of computational methods for drug combination prediction, however, was narrow and drew on only a few articles. In this review, we present many of these computational methods in greater detail and discuss how they are applied to biological systems. Proposed approaches for improving them are also discussed.

Framework of Computational approaches of drug combination discovery
Computational approaches developed for predicting drug combinations can be categorized along several dimensions: the types of data used by the computational method, the types of diseases addressed, and the computational methods used for prediction.
Data that can be used to predict the efficacy of drug combinations include:
- Drug target data, such as molecular profiling of the tumor (genomic, proteomic), signaling pathways and interaction networks [14].
- Similarity of drugs' chemical structures and biochemical properties, and functional classifications of drugs such as the Anatomical Therapeutic Chemical (ATC) classification.
- Responses of cancer cells to single and combinatorial therapies, which are used to discover targets with similar responses to a drug and drugs with similar mechanisms of action [91].
Drug combination prediction methods have been applied to many complex diseases, such as acute myocardial infarction, cancer, hypertension, type 2 diabetes mellitus, Parkinson's disease and schizophrenia [21].
To validate predicted drug combinations, in vitro experiments can be analyzed with the Chou-Talalay method [87], the Loewe additivity model [88] or the Bliss independence model [89] to determine the effect of the combination. The general workflow of drug combination prediction is shown in Figure 1.
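As a rough illustration of how two of these reference models are used to flag synergy, the sketch below computes the expected effect under Bliss independence and a Loewe-style additivity index. The function names and the example numbers are illustrative assumptions and are not taken from the cited works.

```python
def bliss_expected(effect_a, effect_b):
    """Expected fractional inhibition (0..1) if the two drugs act independently."""
    return effect_a + effect_b - effect_a * effect_b


def bliss_excess(observed, effect_a, effect_b):
    """Observed minus expected effect: > 0 suggests synergy, < 0 antagonism."""
    return observed - bliss_expected(effect_a, effect_b)


def loewe_index(dose_a, dose_b, iso_dose_a, iso_dose_b):
    """Loewe additivity index.

    dose_a, dose_b: doses used together in the combination.
    iso_dose_a, iso_dose_b: single-agent doses that give the same effect alone.
    A value below 1 suggests synergy, above 1 antagonism.
    """
    return dose_a / iso_dose_a + dose_b / iso_dose_b


# Hypothetical numbers: the drugs alone inhibit 30% and 40% of cells,
# and the combination is observed to inhibit 70%.
excess = bliss_excess(0.70, 0.30, 0.40)
index = loewe_index(1.0, 2.0, 4.0, 8.0)
```

In this made-up case the Bliss expectation is 0.58, so the observed 0.70 exceeds it, and the Loewe index is 0.5; both readings point toward synergy.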
In this review, we concentrate on computational methods applied to the prediction of drug combinations for treating cancer. Combining anti-cancer drugs reduces drug resistance and tumour metastatic growth [99]. Survival rates for most metastatic cancers are low, and the process of developing new anti-cancer drugs is costly [100]. Therefore, new approaches that combine anti-cancer drugs are being considered.
Computational methods applied in drug combination prediction can be grouped into five categories: mathematical optimization methods, statistical modelling, search algorithms, machine learning methods and systems biology approaches. Mathematical methods apply mathematical models and statistical tests to predict the synergistic effect of a drug combination. They aim to discover the relationship between the combination effect and the transcriptomic changes produced by the individual drugs via direct mathematical models.
The main advantages of statistical approaches are simplicity, ease of evaluating drug combination components and their interactions, low data input and low computational demands. Noise in statistical models can be estimated, but it may still obscure the model [83]. Statistical methods focus on defining the shape of the response surface and describe the response in terms of the input variables, which allows continuous analysis of the data. However, statistical models are accurate only in the region covered by the experimental input data, so predicting responses outside this region can lead to inexact results [84]. Additionally, all statistical models make assumptions about the features of the data, and the quality of the experimental data cannot be fully captured.
Search algorithms do not require any assumption about the relationship between variables, which makes it easier to deal with highly complex, nonlinear systems. One limitation of this approach is that measurement errors and variability in noisy biological systems may affect the execution of the algorithm [11]. Moreover, there is always a chance that the algorithm will converge to a local minimum or maximum. Search algorithms identify the optimum on a discretized grid, so they always require data discretization. Too fine a grid may degrade the performance of the search algorithm, whereas too coarse a grid may require additional experiments to locate the optimum.
Machine learning algorithms are flexible when fitting large data sets with nonlinear models. They do not require general assumptions about the input data or about the relation between input and output parameters. This suits biological research, where the response surface is often unknown and complex. Additionally, machine learning models can be more accurate than statistical modelling techniques because they accommodate higher levels of complexity. However, machine learning approaches require a large amount of input data, and experimental data gathered from diverse sources may not be sufficient for deriving accurate predictive models. The lack of transparency of machine learning methods may hinder subsequent analysis. Model training may take a long time when it is applied to complex networks and executed iteratively using the experimental data.
Systems biology approaches can integrate multiple types of data (topology, efficacy, drug targets, etc.). They can reveal temporal and spatial dynamics at the level of individual components [85]. Network topology data provide insights into drug interactions and signaling pathway interactions. Additionally, the use of simplified network models and the ability to incorporate data from external sources may help to increase the information obtained from minimal experimental data. However, incomplete hypotheses may lead to erroneous conclusions.

Mathematical Optimization methods
The Dialogue for Reverse Engineering Assessments and Methods (DREAM) consortium [8] launched two community challenges. The challenges aimed at developing in silico methods to rank drug combinations from most synergistic to most antagonistic. In the first DREAM drug combination challenge, 91 combinations were tested on the OCI-LY3 human diffuse large B-cell lymphoma cell line [6]. The AstraZeneca-Sanger Drug Combination DREAM Challenge was launched using 85 cancer cell lines and 11,759 drug combination screens of 118 drugs [8]. The predictive models were designed to differentiate synergistic, additive and antagonistic combinations and to predict new synergistic combinations in silico [9][10][11][12][13][14][15][16][17].
The two best-performing methods in the DREAM challenge [9], namely DIGRE and IUPUI_CCBB, applied mathematical methods to rank drug combinations.
In the DIGRE method, Yang et al [9] hypothesized that if a cell is treated with compound (a) and then compound (b), compound (a) would alter the cell transcriptome and thereby modify the effect of compound (b). The transcriptional expression profiles after treatment with the individual drugs were compared. A gene-gene interaction network based on KEGG pathways was constructed. A compound-compound similarity score was computed from the differentially expressed genes of each drug. Drug-response curve information and the compound-compound similarity score were used to estimate the combinatorial effect of the compound pair. Performance was measured with the probabilistic concordance index (PC-index) [18], on which the method scored approximately 0.61.
For IUPUI_CCBB [9], the differentially expressed genes were identified for each drug treatment. Genes that were significantly differentially expressed by both drugs with the same direction of regulation were labelled synergistic genes. Genes that were significantly differentially expressed by both drugs with opposite directions of regulation were labelled antagonistic genes. The numbers of synergistic and antagonistic genes were used to compute an interaction score. Performance was measured with the probabilistic concordance index (PC-index) [18], on which the method scored approximately 0.61.
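The gene-counting idea described for IUPUI_CCBB can be sketched in a few lines. The source states only that the two counts were combined into an interaction score; the normalized difference used below, the function name and the gene labels are assumptions made for illustration.

```python
def interaction_score(de_a, de_b):
    """Score a drug pair from the direction of shared differentially expressed genes.

    de_a, de_b: dict mapping gene -> +1 (up-regulated) or -1 (down-regulated),
    containing only the genes that are significantly changed by each drug.
    Returns a value in [-1, 1]; the normalization is an illustrative assumption.
    """
    shared = set(de_a) & set(de_b)
    if not shared:
        return 0.0
    synergistic = sum(1 for g in shared if de_a[g] == de_b[g])  # same direction
    antagonistic = len(shared) - synergistic                    # opposite direction
    return (synergistic - antagonistic) / len(shared)


# Hypothetical differential expression calls for two drugs.
drug_a = {"G1": 1, "G2": -1, "G3": 1, "G5": 1}
drug_b = {"G1": 1, "G2": -1, "G3": -1, "G4": 1}
score = interaction_score(drug_a, drug_b)
```

Here three genes are shared, two change in the same direction and one in opposite directions, so the toy score is 1/3.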
Lee et al [13] applied hypergeometric and Kolmogorov-Smirnov statistical tests to the prediction of drug combinations. They hypothesized that drugs that regulate two different disease-specific pathways could be synergistic. A list of genes whose expression is significantly related to a disease state was identified, and the pathways highly enriched in this list were determined. Gene expression pattern matching was used to measure the similarity between the gene list and the drug perturbation profiles in CMap [19]. They ranked drugs and performed in vitro validation on non-small cell lung cancer cells.
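A hypergeometric test of the kind mentioned above asks how surprising the overlap between a gene list and a pathway is. The sketch below is a generic upper-tail calculation using only the standard library; the function name and the example counts are hypothetical and do not reproduce the cited analysis.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, list_size, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    population: number of genes in the background.
    pathway_size: number of background genes annotated to the pathway.
    list_size: number of genes in the disease-related list.
    overlap: number of list genes that fall in the pathway.
    """
    upper = min(pathway_size, list_size)
    total = comb(population, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, a 100-gene pathway,
# a 200-gene disease list, and 8 genes in common.
p_value = hypergeom_enrichment_p(20000, 100, 200, 8)
```

With these made-up counts only about one gene would be expected to overlap by chance, so the tail probability is small and the pathway would be called enriched.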
Wei et al [20] generated a gene expression signature of glucocorticoid (GC) sensitivity/resistance in acute lymphoblastic leukemia (ALL) cells. They queried the Connectivity Map database [19] with this signature to retrieve profiles that overlap with it. They performed gene set enrichment analysis based on the Kolmogorov-Smirnov statistical test to rank profiles according to their similarity to the signature. There was a strong connection between glucocorticoid sensitivity and the profile of the mammalian target of rapamycin (mTOR) inhibitor sirolimus (rapamycin). Rapamycin is an FDA-approved drug that is combined with glucocorticoids in organ transplant patients. The authors showed that rapamycin increases glucocorticoid sensitivity through down-regulation of MCL1.
Kaifang Pang et al [21] built a network of drug-target interactions from the DrugBank database. They formulated the problem of finding the optimal drug combination as a mixed integer linear program. Given an input disease gene set, the algorithm maximizes the coverage of the disease genes and minimizes the off-target set. They applied their approach to the EPAM pathway, predicted a combination of five drugs and validated the predicted combination against the literature.
Yu-Ching Hsu et al [22] hypothesized that drugs with synergistic effects perturb similar genes and biological functions. They constructed three scoring systems that measure the commonly perturbed genes between two drug treatments, the similarity in enriched gene sets between the two drugs, and the commonly perturbed genes within the similar enriched gene sets. The third score achieved a probabilistic concordance index (PC-index) [9] of 0.663 when applied to the gold standard drug pairs of the DREAM challenge. They applied their method to the Connectivity Map (CMap) dataset [19] and identified novel synergistic drug pairs for breast cancer.
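The coverage objective described for this approach can be made concrete on a toy example. The sketch below replaces the mixed integer linear program with plain exhaustive enumeration, which is only feasible for a handful of drugs, and folds the two goals into one weighted score; the weighting, the function name and the drug and gene labels are all illustrative assumptions.

```python
from itertools import combinations


def best_combination(drug_targets, disease_genes, max_drugs, off_target_penalty=1.0):
    """Enumerate small drug sets and score them by disease-gene coverage.

    score = (# disease genes hit) - off_target_penalty * (# other genes hit)
    This weighted sum is an assumption standing in for the MILP objective.
    """
    best_score, best_combo = float("-inf"), ()
    for size in range(1, max_drugs + 1):
        for combo in combinations(sorted(drug_targets), size):
            hit = set().union(*(drug_targets[d] for d in combo))
            score = len(hit & disease_genes) - off_target_penalty * len(hit - disease_genes)
            if score > best_score:  # keep the first best-scoring combination
                best_score, best_combo = score, combo
    return best_combo, best_score


# Hypothetical drug-target map and disease gene set.
targets = {"D1": {"A", "B"}, "D2": {"C", "X"}, "D3": {"C"}}
combo, score = best_combination(targets, {"A", "B", "C"}, max_drugs=2)
```

In this toy case D1 plus D3 covers all three disease genes with no off-target hits, so it beats D1 plus D2, which also hits the unrelated gene X.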

Statistical modelling methods
Statistical modelling embodies a set of assumptions concerning the generation of some sample data and similar data from a larger population. Ying Li et al [56] proposed a method that uses propensity score matching (PSM) [55] for drug combination prediction, assuming that one drug could reduce the adverse drug reactions (ADRs) of the other. They extracted drug-ADR pairs that were reported in FAERS [57]. A logistic regression [101] was applied, and the propensity score was estimated as the predicted probability of receiving the drug for each case report. They assessed the predicted interaction score against a set of known drug-drug interactions (DDIs) and their ADRs. The resulting receiver operating characteristic (ROC) curves gave an AUC of 0.80.
Using data provided in the drug combination DREAM challenge, Yiyi Liu and Hongyu Zhao [65] computed two kinds of similarity between two drugs: structural similarity and gene expression similarity. They applied logistic regression models [101] to predict synergistic combinations. They performed 3-fold cross validation for classifiers built on four gene expression similarity measures, each used individually and combined with the structural similarity measure. The AUC ranged from 0.43 to 0.81.
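Because several of the studies reviewed here report an AUC, a small reminder of what the number measures may help. The sketch below computes the AUC as the fraction of positive-negative pairs that a score ranks correctly, counting ties as half; the scores and labels are made up.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: predicted synergy scores (higher means more likely synergistic).
    labels: 1 for a known synergistic pair, 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions for four drug pairs.
auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

An AUC of 0.5 corresponds to random ranking, which is why a value such as 0.43 indicates a classifier that does no better than chance.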
Weiss et al [76] used a statistical design of experiment (DOE) approach with an orthogonal array composite design (OACD). Through a series of designed experiments and data analysis based on regression modelling, they were able to identify a set of effective and synergistic drugs for viability inhibition of renal carcinoma cells. They used an initial pool of 10 drugs.
Pivetta et al [103] used a DOE approach to sample 60 data points for each two-drug combination in order to predict their synergism. They investigated the correlation between the concentrations of the two drugs and the cytotoxicity of their combination. They applied third- and fourth-order linear regression models. The fit to the experimental data of the training set was very good (R from 0.9972 to 0.9993), whereas the fit to the validation and test data sets was lower (R from 0.8870 to 0.9869). Because of over-fitting, irregularly shaped response surfaces were produced.
Ning et al [104] proposed a method that allows the Hill model to be applied to drug combinations at non-fixed dose ratios. The half maximal inhibitory concentration (IC50) is represented as a function of the vector of drug proportions in a combination. Their model was applied to three drugs for the inhibition of viability of lung cancer cells. The model achieved an improved fit over the traditional Hill response surface model.
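For readers unfamiliar with the Hill model, the sketch below shows the single-agent dose-response form and one simple way an IC50 can depend on mixture proportions. The mixture formula is the purely additive (Loewe-type) baseline, included as an assumption for illustration; it is not the model of Ning et al, and all names and numbers are hypothetical.

```python
def hill_response(dose, ic50, hill_slope, e_max=1.0):
    """Fractional inhibition at a given dose under the Hill model."""
    if dose <= 0:
        return 0.0
    return e_max * dose ** hill_slope / (ic50 ** hill_slope + dose ** hill_slope)


def additive_mixture_ic50(proportions, ic50s):
    """IC50 of a mixture if the drugs were purely additive (assumed baseline).

    proportions: fraction of the total dose contributed by each drug (sums to 1).
    ic50s: single-agent IC50 values in the same units.
    """
    return 1.0 / sum(p / ic50 for p, ic50 in zip(proportions, ic50s))


# Hypothetical three-drug mixture at unequal proportions.
mix_ic50 = additive_mixture_ic50([0.5, 0.3, 0.2], [2.0, 6.0, 4.0])
effect = hill_response(dose=mix_ic50, ic50=mix_ic50, hill_slope=1.5)
```

By construction the response at a dose equal to the IC50 is half of the maximum effect; a fitted mixture IC50 lower than this additive baseline would point toward synergy at that ratio.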

Search algorithms
Search algorithms have been used in drug combination prediction. They are heuristic, global optimization techniques that optimize a function given input values and selected criteria.
In Calzolari et al [106], the space of drug combinations is represented by a tree. Single drugs are at the bottom of the tree and the combinations are at the top. Starting at the bottom of the tree, the algorithm tests all drug pairs and incrementally adds drugs to the most effective pair. If adding a drug does not enhance the efficacy, the algorithm returns to the previous node. This algorithm was applied to apoptosis in lymphoma cells. The predicted drug combinations achieved more cell viability inhibition than random combinations.
Tse et al [107] modelled the response of tumor cells to chemotherapeutic drugs. They combined an adaptive elitist genetic algorithm with an iterative dynamic programming (IDP) strategy. The adaptive elitist genetic algorithm identifies the global optimum. IDP is a local search strategy based on Bellman's dynamic programming that searches the neighbourhood of the optimal solutions for improvements. The algorithm was applied to the optimization of three-drug chemotherapeutic combinations and outperformed the separate application of IDP and of the adaptive elitist genetic algorithm.
Zinner et al [11] attempted to find the most effective drug combinations from a set of 19 drugs based on hill climbing search. The first generation was composed of 18 random combinations. At each iteration, a fitness function measures the inhibitory effect of each combination. The iterative process changes one element at a time and accepts combinations with an improved fitness. The algorithm does not necessarily identify the optimal drug combination, but it identifies drug combinations that are better than randomly chosen ones.
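The one-change-at-a-time search described above can be sketched as follows. This is a generic hill climber from a single random start, not the published protocol: the fitness function stands in for a laboratory measurement, and its weights, the per-drug cost and all names are hypothetical.

```python
import random


def hill_climb(n_drugs, fitness, n_steps=200, seed=0):
    """Flip one drug in or out per step and keep the change only if fitness improves."""
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(n_drugs)]  # 1 = drug included
    current_fit = fitness(current)
    for _ in range(n_steps):
        candidate = current[:]
        i = rng.randrange(n_drugs)
        candidate[i] = 1 - candidate[i]  # change a single element
        candidate_fit = fitness(candidate)
        if candidate_fit > current_fit:
            current, current_fit = candidate, candidate_fit
    return current, current_fit


# Hypothetical single-drug benefits and a fixed toxicity cost per included drug.
BENEFIT = [0.9, 0.1, 0.6, 0.2]
COST = 0.3


def toy_fitness(combo):
    """Stand-in for a measured inhibitory effect (no drug interactions modelled)."""
    return sum(c * (b - COST) for c, b in zip(combo, BENEFIT))


best, best_fit = hill_climb(len(BENEFIT), toy_fitness)
```

Because this toy fitness has no interactions between drugs, the climber reaches the global optimum; with real, interacting drugs the same procedure can stall at a local optimum, which is the limitation noted in the text.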
Park et al [105] applied a function that represents the biological system's response to drug combinations. Gaussian process (GP) regression was used to select data points to be tested. The next input is selected such that the expected information gain is maximized. This approach was applied to optimize a drug combination targeting three nodes in the epidermal growth factor receptor (EGFR) signaling network. This approach outperformed the genetic algorithm by identifying a more effective combination.
Kevin Matlock et al [42] formulated the design of targeted combination therapies as an optimization problem. Their goal was to maximize the efficacy on heterogeneous tumor cells while minimizing the toxicity to normal cells. They created a probabilistic target inhibition map (PTIM) [43][44][45][46] that models tumor proliferation. They generated a set of PTIM models for different breast cancer and B-cell lymphoma cell lines from the GDSC database [47]. They performed an accelerated lexicographical search to find the optimal solution and achieved minimum cancer sensitivities of 0.45.
Patrycja Nowak-Sliwinska et al [48] developed a drug combination screening method. It starts from an initial set of drugs that target non-overlapping endothelial cell signaling pathways and applies an iterative approach of experimental testing and feedback system control (FSC) analysis. They generated dose-response curves to select the input dose of each drug in a combination. They applied a differential evolution (DE) algorithm, which predicts new combinations to be tested in vitro. They identified the effective combinations after ten iterations.
Wang et al [77] defined the difference in response between three control cell lines and three breast cancer cell lines to four chemotherapeutic compounds using Latin hypercube sampling. Each agent was considered at seven doses and many drug combinations were tested. The differential evolution search algorithm was used to identify the global optimum after only one experimental iteration.
Weiss et al [78] applied differential evolution to iteratively test the endothelial cell viability. They identified an optimal effective three-drug combination which was translated successfully into several in vivo tumor models. While some drugs showed synergistic interactions, others showed antagonistic behaviour. In another study, Weiss et al [79] identified drug combinations that showed strong synergistic activity on human endothelial (ECRF24) and human ovarian carcinoma (A2780) cell viability inhibition.
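Several of the studies above rely on differential evolution. A minimal sketch of the classic rand/1/bin variant is given below, with a toy objective in place of an experimental readout; the target doses, bounds, parameter defaults and names are all assumptions, and members are updated in place within a generation for brevity.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, generations=50,
                           f=0.8, cr=0.9, seed=0):
    """Minimize `objective` over box-bounded dose vectors (rand/1/bin sketch)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    history = [min(scores)]  # best score after each generation
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one dose comes from the mutant
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the allowed dose range
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            if trial_score <= scores[i]:  # greedy selection
                pop[i], scores[i] = trial, trial_score
        history.append(min(scores))
    best_i = min(range(pop_size), key=scores.__getitem__)
    return pop[best_i], scores[best_i], history


# Toy objective: squared distance to a hypothetical "ideal" three-drug dose vector.
TARGET = [0.2, 0.7, 0.5]


def toy_objective(doses):
    return sum((x - t) ** 2 for x, t in zip(doses, TARGET))


best, best_score, history = differential_evolution(toy_objective, [(0.0, 1.0)] * 3)
```

In an experimental setting each call to the objective corresponds to a measured combination, so the population size and the number of generations directly determine the laboratory workload.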

Machine learning methods
Machine learning algorithms are data driven. Using machine learning approaches, predictive models can be created by learning associations between input data (drug-drug, drug-target and target-target) and the effect of a drug combination. Machine learning approaches can be very effective in biological applications because they are capable of predicting the behaviour of highly complex, nonlinear systems. They require a large amount of input data for model training compared with other methods such as statistical modelling or search algorithms. Integration of pharmacological and omic data types plays a key role in successful machine learning prediction [12,16]. Given sufficient data, machine learning methods could therefore be an effective tool for the drug combination problem. In supervised machine learning methods, the inhibitory effects of drug combinations are labelled (effective/ineffective or antagonistic/additive/synergistic). Unsupervised methods use unlabelled drug combination effects. Semi-supervised learning uses a mixture of labelled and unlabelled drug combination effects.

Supervised methods
The application of supervised machine learning methods to drug combination discovery has suffered from a lack of labelled drug combination data. Few studies have used supervised learning methods to predict the label of drug combinations.
Zhao et al [17] designed a set of predictive features to predict novel drug combinations. The features include target proteins and corresponding downstream pathways, medical indication areas, therapeutic effects as represented in the Anatomical Therapeutic Chemical (ATC) Classification System and side effects. They predicted a set of effective combinations using F-measure maximization. They performed 5-fold cross validation to evaluate the performance of these features. 60% of the predicted effective combinations have been recognized as synergistic combinations in the literature.
Li et al [12] proposed the probability ensemble approach (PEA), which combines chemical and pharmacological features of drugs. Six drug-drug similarity features (drug chemical structure, ATC code, target side effect, target sequence, target-target interaction in PPI, and Gene Ontology semantic) for a query drug pair were integrated using a Bayesian network to calculate a likelihood ratio (LR), which estimates the similarity to a known drug pair. LRs are summed separately over two sets: approved effective drug combinations (EDCs) and undesirable drug-drug interactions (UDDIs). The similarity scores to the EDC and UDDI classes are used to decide the class of the new drug pair. The area under the receiver operating characteristic curve (AUC) was 0.90. The performance of PEA was evaluated using external literature validation and experimental validation of 55 novel predicted drug pairs against human non-small cell lung cancer A549 cells. Of these, 39 effective drug combinations were confirmed (71% accuracy).
Yi Xiong et al [49] developed a computational method for Prediction of effective Drug Combinations using a Stochastic Gradient Boosting algorithm, termed PDC-SGB. They integrated six features to describe the drug combinations: molecular 2D structures, structural similarity, anatomical therapeutic similarity, protein-protein interaction, chemical-chemical interaction, and disease pathways. They applied 10-fold cross validation on the training data set and achieved an AUC of 0.9775.
Xiangyi Li et al [50] built a model for predicting synergistic anti-cancer drug combinations based on drug target network features and pharmacogenomics features. The gene expression profiles of drug perturbation from the DREAM Challenge [9] were used as the training dataset, while the gene expression profiles of anticancer drug perturbation from the Connectivity Map [19] were used as a test dataset. Their model integrated 21 features, including drug chemical structure similarity, drug target network features and drug pharmacogenomics features. A random forest (RF) algorithm was applied to distinguish synergistic from non-synergistic drug pairs for each feature combination. They identified 28 potentially synergistic drug combinations, three of which had been reported in the literature to be effective.
Gayvert et al [51] presented a computational approach for predicting synergistic combinations using single-drug efficacy. They utilized a high-throughput drug screen performed by Held et al [52]. For each drug pair, they built a feature set from the mean and difference of the single-agent dose responses in each tested cell line, together with features representing the similarity of the two drugs' efficacies in melanoma cell lines. They trained random forest models [53] on 780 drug combinations. Their model achieved an AUC of 0.8663 for predicting synergy and an AUC of 0.8809 for predicting genotype-selective efficacy in the context of BRAF melanomas. Their predictions were compared to an independent high-throughput screen [54], and a significant number of the predicted combinations overlapped with this dataset.
Yin Liu et al [58] used pharmacogenomic profiling data to identify combination therapies that may inhibit tumor growth. They used the CCLE dataset [7] and determined the drug response of each cell line. A decision tree was applied to identify genomic alterations that may influence drug sensitivity across the cell lines. They identified a subset of genes whose expression was correlated with drug sensitivity in cells that have particular genomic alterations. They performed experimental validation using two lung adenocarcinoma (LUAD) cell lines, in which cell proliferation was inhibited more by the drug combination.
Pivetta et al [103] applied an artificial neural network (ANN) to predict the synergism of two anti-cancer drug combinations. The authors represented drug synergy as the net multidrug effect index (NMDEI), which is the difference between the non-algebraic additive effect and the experimentally obtained activity of the drugs in combination. The ANN models showed accurate fits. In the experimental validation of the combinations predicted to have maximum synergistic effect against human acute T-lymphoblastic leukemia cells (CCRF-CEM), the combinations showed high cytotoxic activity at lower drug doses.
Minji Jeon et al [108] predicted the synergy between two drugs using genomic and pharmacological information and drug targets. Extremely Randomized Trees (ERT) achieved the best performance. They deduced synergy rules and validated their results against the literature.

Unsupervised methods
In unsupervised methods, no drug combination labels are available, and the aim is to infer the hidden structure of the given data [59].
Huang et al [15] assumed that synergistic drugs can inhibit modules of disease signaling networks in a complementary way. A drug-drug interaction network was built from cell line transcriptional expression data and divided into communities using a Bayesian nonnegative matrix factorization approach. A disease-specific signaling network was built by combining genomic profiles and interactome data. They defined a synergy score that prioritizes the drug pairs that perturb the disease-specific signaling network with similar function. To evaluate their method, they applied it to lung adenocarcinoma and estrogen receptor (ER) positive breast cancer, and they presented literature evidence for their proposed effective drug combinations.
The method of Parkkinen and Kaski [14] extends the original CMap methodology [19]. They matched the patterns of drug perturbation profiles from multiple cell lines. They defined a measure based on group factor analysis (GFA) and probabilistic latent factor models to retrieve the drugs whose profiles are most relevant to a single query signature. The authors discovered functionally and chemically similar drugs, and their method performed better than the original CMap.

Semi-supervised methods
The lack of a sufficient number of labelled drug combination effects calls for semi-supervised learning methods in drug combination discovery.
Sun et al [16] applied a semi-supervised method named Ranking-system of Anti-Cancer Synergy (RACS). Drug pairs were represented by a set of 14 pharmacological and genomic features that differ between labelled and unlabelled samples. These features include the distance between targets in the PPI network and the proportion of unrelated pathways regulated by the targets of the two agents. Using a manifold ranking method proposed by Zhou et al [67], the drug pairs were ranked by their similarity to the labelled samples. The authors validated their method using data provided in the drug combination DREAM challenge [9] and achieved an AUC of 0.85.
Chen et al [68] developed an algorithm called Network-based Laplacian regularized Least Square Synergistic drug combination prediction (NLLSS). They hypothesized that principal drugs that obtain a synergistic effect with similar adjuvant drugs are often similar. In a synergistic combination, the principal drug is the one that shows activity against the disease, whereas the adjuvant drug shows no effect on the disease on its own. The authors used the framework of the Laplacian Regularized Least Square (LapRLS) classifier [69]. They computed drug similarity based on drug-target interactions and drug chemical structures, and derived a score to assess the probability that a drug combination is synergistic. A receiver operating characteristic (ROC) curve was used to evaluate the performance. They carried out experimental validation of the top 10 potential drug combinations in the three datasets and found 7 synergistic combinations.
Machine learning algorithms have not been used frequently for the drug combination optimization problem. This is probably due to the relatively large data requirements for model training. As more drug target and genomic data become available through online databases, interest in applying machine learning techniques to combinatorial drug design is likely to increase.
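The manifold ranking step can be illustrated with the standard iterative update, in which scores spread from the labelled pairs to their neighbours over a normalized similarity graph. The sketch below assumes a precomputed similarity matrix between drug pairs; in RACS that similarity would be derived from the 14 features, which is not reproduced here, and the toy matrix and parameter values are hypothetical.

```python
def manifold_rank(similarity, labelled, alpha=0.9, iterations=100):
    """Rank items by iterating f <- alpha * S f + (1 - alpha) * y.

    similarity: symmetric matrix (list of lists) with a zero diagonal.
    labelled: indices of the known synergistic drug pairs (the queries).
    S is the symmetrically normalized similarity D^(-1/2) W D^(-1/2).
    """
    n = len(similarity)
    degree = [sum(row) for row in similarity]
    s = [
        [
            similarity[i][j] / (degree[i] * degree[j]) ** 0.5
            if degree[i] > 0 and degree[j] > 0 else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]
    y = [1.0 if i in labelled else 0.0 for i in range(n)]
    f = y[:]
    for _ in range(iterations):
        f = [
            alpha * sum(s[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
            for i in range(n)
        ]
    return f


# Hypothetical similarities between four drug pairs; pair 0 is labelled synergistic.
W = [
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.1, 0.0],
    [0.1, 0.1, 0.0, 0.1],
    [0.0, 0.0, 0.1, 0.0],
]
scores = manifold_rank(W, labelled={0})
```

Pair 1, which is strongly similar to the labelled pair, receives a higher score than pair 2, and pair 3, which is connected to the labelled pair only through pair 2, ranks last.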

Systems biology based methods
The systems biology approach aims to predict cell behaviour based on developing a detailed topology map of cellular pathways and interactions. Network modelling approaches have been applied in conjunction with other optimization techniques in order to help predict model constants or train network relationships based on experimental data and to simplify or expedite the development of the network model.
Paola Vera-Licona et al [41] developed OCSANA to identify and prioritize optimal combinations of interventions that perturb the paths from source nodes to target nodes in signaling pathways. OCSANA scores each node based on the lengths of the paths from that node to the targets, the type of effect the node has on the target nodes (activation/inhibition), the side effects of the node, the number of paths that include the node and the number of targets the node can reach. They applied their method to the EGFR network, the ErbB family network and HER2+BCN, and achieved better performance than Berge's algorithm [109]. In their method, not all paths between source and target nodes are tested because of the computation time required; only paths of specified lengths are considered.
Iadevaia et al [80] developed a mass action model of the insulin-like growth factor 1 (IGF-1) signaling network in a breast cancer cell line. They measured changes in protein phosphorylation after stimulation with IGF-1. The unknown model parameters were identified using particle swarm optimization. They fitted their model to the time courses of six proteins based on 126 experimental data points. Model predictions were averaged over three randomly sampled sets of the estimated parameters. They identified five targets in the PI3K/AKT and MAPK pathways whose inhibition could optimally inhibit aberrant signaling. This prediction was experimentally validated.
Based on a mass-action model of heregulin-induced HER2/3 signaling, Faratian et al [110] predicted that PIK3CA inhibition should be combined with RTK inhibitors in tumor cases that have low PTEN.
Mass-action modelling approaches produce specific values for a large number of parameters, which can be not practical in large scale network reconstructions [114].
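To make the parameter burden concrete, the following minimal Python sketch integrates a single reversible binding reaction, L + R <-> LR, under mass-action kinetics with a forward Euler scheme. The reaction, the rate constants k_on and k_off, the initial concentrations and the step size are hypothetical illustration values and are not taken from [80], [110] or [114]. Even this one reaction needs two rate constants, so a pathway with hundreds of reactions needs correspondingly many fitted values.

```python
def simulate_binding(ligand, receptor, k_on, k_off, dt=0.001, steps=20000):
    """Forward-Euler integration of L + R <-> LR under mass-action kinetics.

    All arguments are hypothetical illustration values (arbitrary units).
    Returns the final (ligand, receptor, complex) concentrations.
    """
    complex_ = 0.0
    for _ in range(steps):
        # Net rate: association minus dissociation.
        rate = k_on * ligand * receptor - k_off * complex_
        ligand -= rate * dt
        receptor -= rate * dt
        complex_ += rate * dt
    return ligand, receptor, complex_


if __name__ == "__main__":
    # With L0 = R0 = 1, k_on = 1 and k_off = 0.5, the analytical equilibrium
    # satisfies (1 - c)^2 = 0.5 * c, i.e. a complex concentration of c = 0.5.
    lig, rec, cpx = simulate_binding(1.0, 1.0, k_on=1.0, k_off=0.5)
    print(f"ligand={lig:.4f} receptor={rec:.4f} complex={cpx:.4f}")
```

In a fitting setting such as the one described for [80], an optimizer would repeatedly call a simulator of this kind with candidate rate constants and compare the output with measured time courses.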
Sahin et al [111] generated a Boolean model of ERBB signaling of G1/S cell cycle transition. They applied knockdowns of the network proteins, model deduction based on proteomic data and experimental validation with RNAi. They predicted a drug combination that target c-MYC and ERBB2 and reduce ERBB2 breast cancer resistance.
Ranran Zhang et al. [112] constructed a Boolean logic model of apoptosis signaling in leukemic T-cell large granular lymphocytes. They used experimental validation to investigate the effect of two defined species that dominate apoptosis, sphingosine kinase 1 and NFκB.
To overcome the limitation of representing each species as either one or zero, as in [112], Aldridge et al. [113] extended the logic-based approach to handle intermediate activity states using fuzzy logic.
The results of logic-based modelling approaches are difficult to interpret because they assign discrete values to continuous variables such as the concentration of active species [115].
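The logic-based approach can be illustrated with a toy Boolean network. The sketch below is not the model of [111] or [112]: the node names, the update rules and the initial state are hypothetical and were chosen only to show how simulated knockdowns can reveal a bypass route that a single intervention leaves open.

```python
# Toy Boolean network. Node names and rules are hypothetical illustrations.
RULES = {
    "RECEPTOR": lambda s: s["RECEPTOR"],          # external input, held constant
    "BYPASS": lambda s: s["BYPASS"],              # external input, held constant
    "KINASE": lambda s: s["RECEPTOR"],
    "TF": lambda s: s["KINASE"] or s["BYPASS"],   # two routes activate TF
    "PROLIF": lambda s: s["TF"],                  # phenotype read-out
}

INITIAL = {"RECEPTOR": True, "BYPASS": True,
           "KINASE": False, "TF": False, "PROLIF": False}


def simulate(initial, knockdowns=(), max_steps=20):
    """Synchronously update all nodes until a fixed point (or max_steps).

    Knocked-down nodes are clamped to False throughout the simulation.
    """
    knocked = set(knockdowns)
    state = {n: (False if n in knocked else bool(v)) for n, v in initial.items()}
    for _ in range(max_steps):
        new = {n: (False if n in knocked else bool(rule(state)))
               for n, rule in RULES.items()}
        if new == state:
            break
        state = new
    return state


if __name__ == "__main__":
    for knock in [(), ("RECEPTOR",), ("BYPASS",), ("RECEPTOR", "BYPASS")]:
        print(knock, "-> PROLIF =", simulate(INITIAL, knock)["PROLIF"])
```

In this toy network the read-out stays active under either single knockdown and switches off only when both inputs are removed, which is the kind of combination effect such models are used to search for. The binary states also show the limitation noted above: partial inhibition cannot be represented without an extension such as fuzzy logic.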
To discover drug combinations that target resistant melanoma cells, Korkut et al. [82] used a computational tool termed the pathway extraction and reduction algorithm (PERA). They integrated signaling network pathways, proteomic data and five phenotypic responses of cell cycle progression after perturbation with two-drug combinations or single drugs. The belief propagation (BP) algorithm was used to search the space of network models based on probability distributions that represent the set of network models with the lowest error [81]. The authors predicted that targeting c-Myc in combination with either BRAF- or MEK-targeted therapies would be effective. The predicted combinations were experimentally validated.
Xiao-Dong Zhang et al. [23] hypothesized that drugs that target the same functional network motif could be combined to improve therapeutic efficacy. They extracted drug combinations from the Drug Combination Database [25]. Using FANMOD [24], they identified motifs that are significantly enriched with the targets of combinatorial drugs. They calculated the therapeutic similarity between individual drugs that target an interacting protein pair using Anatomical Therapeutic Chemical (ATC) codes. Some of the predicted drug combinations were found to be in clinical use for breast cancer [26], some were found to be effective anticancer therapy for CML [27], and some affect the dissemination of hepatocellular carcinoma (HCC) cells [28]. They also recommended some previously unreported drug pairs as combinations.
Jordi Serra-Musach et al. [35] tried to identify synergistic drug combinations that maximize the perturbation of the cancer network. They collected proteins and their interactions, expression data from cancer cell lines, IC50 values for unique drugs, the mutational status of proto-oncogenes and tumor suppressor genes, and genetic, genomic, and molecular alterations identified in cancer cell lines. Cancer network activity (CAN) was defined based on weighted communicability [36]. Their study focused on PI3K-mTOR signaling in breast cancer. When the inhibitory effect of the predicted combinations was assessed, synergism was observed in nine of 12 instances.
Samira Jaeger et al. [37] represented the potential cross-talk between two therapeutic signaling networks as a network. They applied a network efficiency measure [38] to compute pathway cross-talk inhibition (PCI), defined as the reduction of network efficiency after a pharmacologic intervention. They used experimental validation, the drug combination index, and the dose reduction index to assess the performance of their method. Experimental validation of ten novel proposed combinations confirmed synergistic behaviour for seven of them in at least one of four tested breast cancer cell lines.
Francesca Vitali et al. [39] applied network-based modelling to identify multi-target drugs for triple negative breast cancer. They constructed a network of disease proteins (DPs) and their interactors, and selected bridging nodes as target proteins (TPs). They ranked the target combinations by applying the Topological Score of Drug Synergy (TSDS) [40]. They extracted a list of approved drugs that interact with the defined TPs. A score termed the Path-EFF-index, which measures the effect of a drug in a specific pathway, is assigned to each drug and drug combination. They experimentally validated the synergistic effect of one of the proposed combinations.
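As an illustration of an efficiency-based score of this kind, the Python sketch below computes the global efficiency of an unweighted, undirected graph (the mean of inverse shortest-path lengths over all ordered node pairs) and the relative drop in efficiency when a set of target nodes is removed. It is a simplified sketch under stated assumptions and not the implementation of [37]: the toy graph, the function names, the choice to keep removed nodes in the normalization, and the use of a relative rather than absolute reduction are all assumptions.

```python
from collections import deque


def global_efficiency(adj, removed=frozenset()):
    """Mean inverse shortest-path length over all ordered node pairs.

    adj maps each node to an iterable of neighbours (undirected graph).
    Removed nodes stay in the normalisation but contribute no paths
    (an assumption of this sketch).
    """
    nodes = list(adj)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for src in nodes:
        if src in removed:
            continue
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            u = queue.popleft()
            for v in adj[u]:
                if v not in removed and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for d in dist.values() if d > 0)
    return total / (n * (n - 1))


def crosstalk_inhibition(adj, targets):
    """Relative reduction of global efficiency after removing target nodes."""
    before = global_efficiency(adj)
    if before == 0.0:
        return 0.0
    after = global_efficiency(adj, frozenset(targets))
    return 1.0 - after / before


if __name__ == "__main__":
    # Hypothetical cross-talk network: two pathways joined by a bridge node X.
    toy = {
        "P1a": ["P1b"], "P1b": ["P1a", "X"], "X": ["P1b", "P2a"],
        "P2a": ["X", "P2b"], "P2b": ["P2a"],
    }
    for targets in (["P1a"], ["X"], ["P1b", "P2a"]):
        print(targets, round(crosstalk_inhibition(toy, targets), 3))
```

In the toy graph, removing the bridge node reduces efficiency more than removing a peripheral node, which reflects the intuition that interventions on cross-talk links are the most disruptive.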
Pal and Berlow [44] used tumor drug sensitivities and kinase inhibitor profiles to predict sensitivity to drug combinations for four canine osteosarcoma cell lines. The authors assumed that inhibiting a superset of an effective set of kinases will also be effective. They also assumed that if blocking a set of kinases is not effective, then blocking any subset of that set will not be effective either. They identified drug combinations that inhibit a minimal set of kinases with minimal side effects, and achieved a high level of accuracy. Both subsets and supersets of the target profile of a proposed new drug must be present in the data; otherwise the prediction becomes indeterminate, because the method assigns 0 or 1 irrespective of the actual profile of the new drug.
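The two monotonicity assumptions described above can be written as a small inference rule. The following Python sketch is an illustration only and not the authors' algorithm [44]: the kinase names, the training data and the function name are hypothetical, and returning None for an undetermined case is a choice made here to make the data gap visible.

```python
def predict_sensitivity(targets, training):
    """Infer binary sensitivity for a target set from the two rules.

    training: list of (frozenset_of_kinases, effective) pairs.
    Returns 1 if the query is a superset of a known effective set,
    0 if it is a subset of a known ineffective set,
    and None if neither rule applies (undetermined).
    """
    query = frozenset(targets)
    if any(query >= known for known, effective in training if effective):
        return 1
    if any(query <= known for known, effective in training if not effective):
        return 0
    return None


if __name__ == "__main__":
    # Hypothetical kinase inhibition data (K1..K4 are placeholder names).
    training = [
        (frozenset({"K1", "K2"}), True),
        (frozenset({"K3"}), False),
        (frozenset({"K1", "K3"}), False),
    ]
    # The profile of a two-drug combination is the union of the drug profiles.
    combo = frozenset({"K1"}) | frozenset({"K2", "K4"})
    print(predict_sensitivity(combo, training))
    print(predict_sensitivity({"K1"}, training))
    print(predict_sensitivity({"K2", "K4"}, training))
```

The last query has neither a known effective subset nor a known ineffective superset in the training data, which corresponds to the indeterminate case noted above.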
Jing Tang et al. [60] presented a computational strategy named TIMMA (Target Inhibition inference using Maximization and Minimization Averaging). Their method is based on the work of Pal and Berlow [44]. In TIMMA, a drug-target inhibition profile is built for each drug, and the inhibition profile of a drug combination is the union of the inhibition profiles of the individual drugs. A set of cancer-specific targets is identified using a sequential forward floating search (SFFS) algorithm [61]. A new maximization and minimization averaging rule was applied to overcome the limitations of PKIM, so that TIMMA achieved enhanced prediction accuracy in cross-validation and a significant reduction in computation time. The authors identified effective drug combinations for breast and pancreatic cancer cells. They also reported an R implementation of the algorithm (TIMMA-R), which is much faster than the original MATLAB code [62].
D. Chen et al. [63] proposed a model based on pathway-pathway interactions (WWI), built on the assumption that drugs targeting related pathways are more likely to form synergistic drug combinations. The authors built two networks: a protein-protein interaction (PPI) network based on the HPRD database [64] and a WWI network based on the KEGG database. For each drug pair, they defined a score that indicates the connectivity of the pathways perturbed by the individual drugs on the WWI network and of the drug targets on the PPI network. They used receiver operating characteristic (ROC) curves to estimate the performance of their scoring system in predicting SDPs, achieving an AUC of 0.75.
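The union of inhibition profiles and the idea of averaging a maximum and a minimum can be sketched as follows. This is a loose, simplified reading and not the published TIMMA algorithm [60]: the rule implemented here (average the highest sensitivity observed among subsets of the query target set and the lowest sensitivity observed among its supersets) is an assumption, and the target names and sensitivity values are hypothetical.

```python
def combination_profile(*profiles):
    """Inhibition profile of a combination: the union of the drug profiles."""
    return frozenset().union(*profiles)


def predict_sensitivity(query, observed):
    """Estimate sensitivity for a target set from observed target sets.

    observed: dict mapping frozenset_of_targets -> sensitivity in [0, 1].
    Assumed rule: subsets give a lower bound (their maximum), supersets
    give an upper bound (their minimum); the estimate averages the bounds
    that are available. Returns None if no bound exists.
    """
    query = frozenset(query)
    if query in observed:
        return observed[query]
    lower = [s for t, s in observed.items() if t <= query]
    upper = [s for t, s in observed.items() if t >= query]
    bounds = []
    if lower:
        bounds.append(max(lower))
    if upper:
        bounds.append(min(upper))
    return sum(bounds) / len(bounds) if bounds else None


if __name__ == "__main__":
    # Hypothetical single-drug and multi-target observations.
    observed = {
        frozenset({"K1"}): 0.2,
        frozenset({"K2"}): 0.4,
        frozenset({"K1", "K2", "K3"}): 0.9,
    }
    combo = combination_profile({"K1"}, {"K2"})
    print(sorted(combo), predict_sensitivity(combo, observed))
```

Unlike the binary rule of Pal and Berlow described above, an averaging rule of this kind yields a graded estimate even when only one of the two bounds is available.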
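For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen positive pair receives a higher score than a randomly chosen negative pair. The Python sketch below computes it directly from that definition; the scores and labels are hypothetical and are unrelated to the data of [63].

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for positive (e.g. synergistic) pairs, 0 for negative pairs.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical scores for five drug pairs and their synergy labels.
    scores = [0.9, 0.7, 0.6, 0.4, 0.2]
    labels = [1, 1, 0, 1, 0]
    print(roc_auc(scores, labels))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of positive and negative pairs.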

Conclusions
With the rapid growth of high-throughput data, in silico methodologies have been successfully applied to drug combination prediction. The joint application of mathematics, computer science, and biology can greatly speed up the discovery of optimal drug combinations [86]. We reviewed five computational approaches applied in drug combination optimization. Second-order linear regression models have been successfully applied to drug combination prediction, whereas higher-order linear regression models resulted in over-fitting of the data [103]. As genomic and drug target data become more available, greater attention will be paid to applying machine learning techniques to drug combination discovery. Drug metabolism processes such as absorption, transport, metabolism, and clearance are very important for treatment efficacy [71]. For example, overexpression of P-glycoprotein (P-gp) results in drug resistance [72], and it has been reported that its inhibition can improve drug efficacy [73][74][75] and contribute to synergistic drug effects. Drug metabolism processes should therefore be considered in future combinatorial drug prediction models. Interactions between drugs are not handled by the existing computational methods. To be more applicable, future predictive computational approaches must also consider other data types such as pharmacokinetic parameters. Molecular data such as mutation, copy number variation and methylation data should be used more widely alongside transcriptomic profiles, interaction networks and biological pathways to predict the efficacy of drug combinations on target cells. The functions of target proteins differ between their wild-type and mutated forms. Consequently, targeting a mutated protein may produce a stronger synergistic effect than targeting its wild-type form, and vice versa. To make the predictions more applicable in the clinic, there is a need to discover the similarities between in vitro and patient data in a personalized setting.
Methodologies that depend on data available for only a few cell lines in restricted tissue types are less practical in clinical settings [70]. Successful drug combinations used in clinical practice suggest that more attention should be given to targets outside tumor cells. Of 521 drug combination trials for non-small-cell lung carcinoma [92], 184 combine drugs that have targets inside tumor cells, 110 combine tumor-cell-targeting drugs with angiogenic agents and 94 combine them with immune-targeting agents. Agents that stimulate an antitumor immune response have led to dramatically improved survival in various tumor types [95]. Metastasis leads to advanced cancer that comprises various subclonal tumors, each with independent genetic drivers and responses to drugs [93]. More attention should be given to drug combination approaches that target subclonal populations that are resistant to the primary therapy [94]. Epigenetic changes in cell state can produce cell populations that contribute to the development of resistant tumor-cell populations [93]. Combination therapies that decrease the plasticity of tumor cells, preserve sensitized tumor cell states, or target epigenetic deregulation aim to help prevent drug resistance and tumor evolution.