Artificial Intelligence in Drug Discovery

Artificial intelligence (AI) systems, specifically machine learning (ML) technologies, learn to accomplish tasks and identify patterns from the data they are given instead of relying on hand-written rules. This data-driven learning has been critical in helping biotechnology and pharmaceutical companies, as well as academic researchers, develop new AI technologies that improve drug design and discovery. More efficient AI-based processes have helped in recent years to reduce the cost and time of bringing a new drug to market, which average 2.6 billion USD and 12 years.

Automation and discovery have been the two major applications of AI at the industrial level. AI for automation has helped with repetitive work that is usually easy for humans, sometimes requiring no expertise at all, and has removed employees from dangerous jobs. Discovery tasks, on the other hand, are those that even experts may not be able to accomplish with high accuracy. For example, predicting which small molecule could interact with a target protein is not a trivial task without the help of AI. Such tasks need expertise, patience, and trial and error, combined with experimental effort. As human experts, we also have limited capacity to consider the huge space of possibilities behind each discovery task, which can include millions of chemical compounds that are candidates for a new drug against a target protein. Here, AI can help us identify relationships and patterns that human experts do not necessarily know. For example, MatchMaker is an AI-enabled deep learning engine that predicts the polypharmacology of small molecules as the foundation for small molecule drug discovery. Cyclica developed MatchMaker in 2018 and continually improves how it uses information about small molecules and proteins, drawn from drug-target interaction (DTI) data in public and private databases, to predict which small molecule can interact with which protein. MatchMaker is an example of a successful AI technology designed primarily for hit discovery in the early stages of drug discovery (Figure 1).


Figure 1. Schematic representation of the application of AI in different stages of the drug discovery process.

Successful AI technologies in healthcare and drug discovery are mainly built using machine learning algorithms. Let’s consider three main tasks in building a supervised ML model: 

1) Processing the data and providing features and labels (or continuous values) for an ML system. 

2) Training the ML model and letting it learn the relationships between features of each data point (characteristics of data points) and their labels. 

3) Assessing the performance of the models using proper testing strategies, in addition to all software-related requirements like data, code and model versioning. 
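The three tasks above can be sketched in a few lines of Python. This is a toy illustration only, not how MatchMaker or any Cyclica model works: the data is synthetic, the two numeric features and the labeling rule are made up, and the nearest-centroid classifier is just a simple stand-in for a real ML model. All function names are hypothetical.

```python
import random

def make_toy_dataset(n, seed=0):
    """Task 1: build (features, label) pairs.
    Synthetic assumption: label is 1 when the two made-up features sum above 1.0."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.random(), rng.random()]
        y = 1 if x[0] + x[1] > 1.0 else 0
        data.append((x, y))
    return data

def train_centroids(train):
    """Task 2: learn one mean feature vector (centroid) per label."""
    sums, counts = {}, {}
    for x, y in train:
        s = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def predict(centroids, x):
    """Assign the label whose centroid is closest to x (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))

def accuracy(centroids, test):
    """Task 3: fraction of held-out examples that are predicted correctly."""
    correct = sum(1 for x, y in test if predict(centroids, x) == y)
    return correct / len(test)

data = make_toy_dataset(400)
train, test = data[:300], data[300:]  # keep the test examples out of training
model = train_centroids(train)
print(f"held-out accuracy: {accuracy(model, test):.2f}")
```

The key design point is the held-out split: performance is measured only on examples the model never saw during training. In real drug discovery settings the split strategy itself needs care, which is part of what "proper testing strategies" refers to.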

These tasks are necessary for building AI technologies for the different steps of drug discovery and development, from hit discovery to guiding clinical trials. Each task is complex and draws on software development, algorithm design, statistics, and domain knowledge such as structural bioinformatics and medicinal chemistry. It is therefore evident that this is not a job that one expert, or a group of experts who all share the same specialty, can do successfully. The success of companies like Cyclica relies on experts from different fields collaborating to build and properly use AI technologies like MatchMaker to guide the drug discovery and design process. This collaboration also lets these companies avoid the mistakes that experts in one field can make when they lack a proper understanding of the other aspects of AI technology development.

Data, as the core of building machine learning models and technologies, has determined the progress made in applying AI to drug discovery. Applications like hit discovery for proteins, as in MatchMaker, have succeeded by relying on millions of public and private data points available to academic and industrial teams. Other applications, such as protein design, lead optimization, translatability prediction, and guiding clinical trials in later stages of drug development, have progressed as the right data became accessible. As new data becomes available through databases, or as new technologies generate higher-quality data in greater quantity, we will see even further progress in the application of AI in drug discovery. The rest is up to us: knowing how to use that data to build successful AI technologies.

Over the next month, as part of our AI Drug Discovery campaign, you’ll hear more about the collaborative environment we have built at Cyclica, as well as the pitfalls of AI in drug discovery and how we have avoided them.


  1. Paul, Debleena, et al. "Artificial intelligence in drug discovery and development." Drug discovery today 26.1 (2021): 80.
  2. Chan, HC Stephen, et al. "Advancing drug discovery via artificial intelligence." Trends in pharmacological sciences 40.8 (2019): 592-604.
  3. MacKinnon, Stephen Scott, S. A. Madani Tonekaboni, and Andreas Windemuth. "Proteome‐Scale Drug‐Target Interaction Predictions: Approaches and Applications." Current Protocols 1.11 (2021): e302.
  4. Sugiyama, Michael G., et al. "Multiscale interactome analysis coupled with off-target drug predictions reveals drug repurposing candidates for human coronavirus disease." Scientific reports 11.1 (2021): 1-18.
Dr. Ali Madani, Director of Machine Learning

Ali develops new deep learning models to improve drug-target interaction prediction. He completed his Ph.D. in Computational Biology at the University of Toronto, developing new feature selection approaches from omics profiles of patient tumors that are predictive of their survival and their response to drugs.
