Industrial clients at ESRF fast-track drug discovery

Researchers from the company Idorsia Pharmaceuticals Ltd have rapidly optimized a weak hit compound against SARS-CoV-2, increasing its potency nearly 1,000-fold. They used artificial intelligence, computational chemistry, high-throughput chemistry and structural biology at the ESRF. The results are out in the Journal of Medicinal Chemistry and show the strong collaboration between the ESRF and industry.

It all started with a molecule that bound weakly to SARS-CoV-2, the virus responsible for COVID-19. “About two years ago, we had identified this molecule, a diazepane scaffold, through artificial intelligence and computational screening and thought we would investigate further”, explains Julien Hazemann, first author of the publication and former researcher at Idorsia. The compound could potentially inhibit the virus’s main protease (Mpro), a critical enzyme for viral replication.

To increase the molecule’s potency so that it would bind more strongly to Mpro, the team from Idorsia used computational simulations, high-throughput chemistry and structural biology at the ESRF, in collaboration with the company Expose GmbH. This approach, called hit-to-lead optimisation, has been used in antiviral drug discovery for the last ten years, but this is the first time the techniques have been integrated in such a tight and effective way in a global effort.

First, the researchers employed computational docking and molecular dynamics simulations to predict how structural changes to the molecule might improve binding to Mpro.

Using high-throughput medicinal chemistry, they synthesized and tested a focused library of analogues. These steps dramatically improved the original compound, increasing its potency nearly 1,000-fold.
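To put the 1,000-fold figure in perspective, here is a minimal sketch of the arithmetic, assuming potency is expressed as an IC50 value (the article does not say which measure was used). The concentrations are invented for illustration and are not taken from the publication; the function names are hypothetical.

```python
import math

def fold_improvement(ic50_before_nm: float, ic50_after_nm: float) -> float:
    """Ratio of IC50 values; a larger ratio means the optimised compound is more potent."""
    return ic50_before_nm / ic50_after_nm

def pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    return -math.log10(ic50_nm * 1e-9)

# Illustrative values only, NOT from the publication.
hit_ic50_nm = 10_000.0   # weak starting compound
lead_ic50_nm = 10.0      # optimised compound

fold = fold_improvement(hit_ic50_nm, lead_ic50_nm)
delta_pic50 = pic50(lead_ic50_nm) - pic50(hit_ic50_nm)
print(f"{fold:.0f}-fold improvement, {delta_pic50:.1f} log units")
```

On a logarithmic scale, a 1,000-fold gain corresponds to three orders of magnitude, which is why medicinal chemists often quote such improvements in log units.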

However, predicting how a molecule behaves computationally was only one piece of the puzzle. Throughout the process, the researchers came to the ESRF’s macromolecular crystallography beamline ID23-1 to collect high-resolution X-ray diffraction data of the Mpro–inhibitor complexes. They were able to visualise how the inhibitor binds within the active site of the protease. “The ESRF has been crucial in this research, from the beginning, when we scanned the candidate compound, to the end, when we saw how the action takes place”, explains Daniel Ritz, senior director of biology at Idorsia.

One notable feature of this study is the small number of compounds the scientists needed, thanks to their highly targeted methodology.

Read more on ESRF website

Artificial intelligence explores the underground

Researchers at the Paul Scherrer Institute PSI have shown that artificial neural networks have the potential to determine very precisely the characteristics of rock layers, like their mineralogical composition, solely on the basis of drill core images. This could speed up future geological investigation efforts while simultaneously optimising costs. 

Underground investigations are often time-consuming and costly. Yet without knowledge of the properties and characteristics of the layers located deep below the surface, many important questions cannot be answered: Can data for future explorations around the deep geological repository be predicted quickly and reliably? Is a particular underground site suitable for obtaining deep geothermal heat and power, or for extracting natural gas? Are the geological conditions at a depth of 1,500 metres suitable for storing carbon dioxide? To make it easier to answer these and other questions, Romana Boiger, from the Laboratory for Waste Management in the PSI Center for Nuclear Engineering and Sciences, is working to establish new tools from the area of artificial intelligence for geological investigations.

Boiger’s attention is focused on so-called artificial neural networks in particular. These consist of several layers of interconnected artificial neurons, which are ultimately mathematical functions that process input data and deliver a result. What makes them special is that artificial neural networks are capable of learning. For example, an artificial neural network that is supposed to distinguish between apples and pears can be trained by presenting it with images of apples and pears and simultaneously providing the correct interpretation. After a certain number of training runs, the artificial neural network is able to correctly classify even unfamiliar pictures of apples and pears.
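The learning principle can be illustrated with a single artificial neuron. The following is a minimal sketch, not the method used at PSI: instead of images it uses two invented numerical features per fruit, and all names and values are assumptions made for this example.

```python
import math
import random

def neuron(weights, bias, features):
    """One artificial neuron: weighted sum of the inputs passed through a sigmoid."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, epochs=500, lr=0.5, seed=0):
    """Fit the neuron by gradient descent on the logistic loss."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1) for _ in samples[0]]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = neuron(weights, bias, x) - y  # gradient of the loss w.r.t. z
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

# Hypothetical features per fruit: (roundness, elongation), both in [0, 1].
# Label 1 = apple, label 0 = pear. Values are invented for illustration.
train_x = [(0.9, 0.2), (0.8, 0.3), (0.95, 0.1), (0.3, 0.9), (0.4, 0.8), (0.2, 0.95)]
train_y = [1, 1, 1, 0, 0, 0]

weights, bias = train(train_x, train_y)

def classify(features):
    """Label an example using the trained neuron."""
    return "apple" if neuron(weights, bias, features) >= 0.5 else "pear"

# Examples that were not part of the training set.
print(classify((0.85, 0.25)), classify((0.35, 0.85)))
```

Each training pass nudges the weights so that the neuron's output moves closer to the correct label. A real network stacks many such neurons in layers and learns directly from pixel values.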

In her research, Boiger, a mathematician with a focus on data science and machine learning, uses a special type of artificial neural network called a convolutional neural network (CNN). CNNs are especially well suited to the identification and analysis of patterns and simple features in images.

Scientifically uncharted territory

One novel application of CNNs is the subject of the study Boiger and colleagues published in May 2024 in the Swiss Journal of Geosciences. It is the result of an interdisciplinary collaboration between scientists from PSI and experts in geology and engineering at Nagra. In a first step, they used CNNs to analyse images of drill cores taken from the Trüllikon borehole in Northern Switzerland. This was part of Nagra’s site investigation programme to identify a suitable site for a deep geological repository. The test interval was selected from 55 metres of drill core from a depth of between 770 and 939 metres. “We wanted to find out if it’s possible to accurately determine the lithological formations and above all the mineralogical composition of the rock – such as the proportions of calcite, clay, and silicates – solely on the basis of drill core images.” Studies already exist that investigate lithology in this way, determining properties that can be observed with the naked eye, without the help of a microscope. Determining mineralogy in this way, on the other hand, is scientifically uncharted territory. “No one had ever done it this way before.”

For her research, Boiger used artificial neural networks that had already been trained. They had previously learned to distinguish between images of vehicles, animals, people, and fruit – as well as geological formations and rocks – using images from the ImageNet database, a collection of more than 14 million images.

The CNN models thus already had a certain knowledge base when they were presented with the Trüllikon drill cores. The 10 cm thick drill cores from various geological units, known as formations, were systematically photographed after washing. The photographs were then cut into slices. Boiger and colleagues proceeded step by step: They expanded the pre-trained CNN by a few layers, which they then specifically trained to distinguish between lithological formations on the basis of the images. This resulted in a new, larger CNN model. It was then expanded again by a few layers – and finally trained to recognise the mineralogical composition.
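The staged procedure described above can be sketched schematically. The example below does no actual training; it only tracks which layers would be adjusted at each stage. The class, the layer names, and the choice to freeze the stage-one layers before stage two are assumptions of this sketch, not details confirmed by the study.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Layer:
    name: str
    trainable: bool = True

@dataclass
class StagedModel:
    layers: List[Layer] = field(default_factory=list)

    def freeze(self) -> None:
        """Mark every existing layer as fixed so later training leaves it unchanged."""
        for layer in self.layers:
            layer.trainable = False

    def extend(self, names: List[str]) -> None:
        """Append new, trainable layers on top of the current stack."""
        self.layers.extend(Layer(name) for name in names)

    def trainable_layers(self) -> List[str]:
        """Names of the layers that the next training run would update."""
        return [layer.name for layer in self.layers if layer.trainable]

# Stage 0: a CNN pre-trained on ImageNet (layer names are placeholders).
model = StagedModel([Layer("pretrained_conv_1"), Layer("pretrained_conv_2")])
model.freeze()

# Stage 1: add a few layers and train only those to tell formations apart.
model.extend(["formation_dense", "formation_output"])
stage1 = model.trainable_layers()

# Stage 2: extend again and train for mineralogical composition.
# Freezing the stage-1 layers here is an assumption of this sketch.
model.freeze()
model.extend(["mineralogy_dense", "mineralogy_output"])
stage2 = model.trainable_layers()

print(stage1)
print(stage2)
```

Reusing a pre-trained network in this way means that only the newly added layers need to be fitted to the comparatively small set of drill core images.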

Read more on PSI website

Image: Romana Boiger wants to use artificial intelligence to improve the exploration of deep earth layers and the analysis of drill cores.

Credit: Paul Scherrer Institute PSI/Markus Fischer

AI finds a cheaper way to make green hydrogen

Researchers at the University of Toronto are using artificial intelligence to accelerate scientific breakthroughs in the search for sustainable energy. They used the Canadian Light Source (CLS) at the University of Saskatchewan (USask) to confirm that an AI-generated “recipe” for a new catalyst offered a more efficient way to make hydrogen fuel.   

To create green hydrogen, you pass electricity that’s been generated from renewable resources between two pieces of metal in water. This causes oxygen and hydrogen gases to be released. The problem with this process is that it currently requires a lot of electricity and the metals used are rare and expensive.

Researchers are searching for the right alloy, or combination of metals, that would act as a catalyst to make this reaction more efficient and affordable. Traditionally, this search would involve trial and error in the lab, but when you are trying to find the proverbial needle in a haystack, this approach takes too much time.

“We’re talking about hundreds of millions or billions of alloy candidates, and one of them could be the right answer,” said Jehad Abed. He was part of a team that developed a computer program to significantly speed up this search. Their findings were published in the Journal of the American Chemical Society. At the time of this project, Abed was a PhD student under the supervision of Edward Sargent at the University of Toronto working alongside scientists at Carnegie Mellon University.

The AI program the team developed evaluated more than 36,000 different metal oxide combinations, running virtual simulations to assess which combination of ingredients might work best. Abed then tested the program’s top candidate in the lab to see if its predictions were accurate.
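In outline, such a virtual screen enumerates candidate combinations, scores each one, and passes only the best on to the lab. The sketch below illustrates that loop under loud assumptions: the element names, descriptor values, target value and scoring function are all invented placeholders and do not represent the team's program or real chemistry.

```python
from itertools import combinations

# Hypothetical per-element descriptors (invented numbers, placeholder names).
DESCRIPTOR = {"M1": 0.90, "M2": 0.40, "M3": 0.30, "M4": 0.50, "M5": 0.62, "M6": 0.57}

TARGET = 0.55  # hypothetical "ideal" descriptor value for the catalyst

def predicted_score(combo):
    """Stand-in for a simulation or learned model; lower is better."""
    mean = sum(DESCRIPTOR[m] for m in combo) / len(combo)
    return abs(mean - TARGET)

def screen(elements, size, top_k):
    """Enumerate all combinations of the given size and keep the best-scoring few."""
    candidates = combinations(sorted(elements), size)
    ranked = sorted(candidates, key=predicted_score)
    return ranked[:top_k]

shortlist = screen(DESCRIPTOR, size=3, top_k=3)
for combo in shortlist:
    print(combo, round(predicted_score(combo), 4))
```

With six placeholder elements there are only 20 three-way combinations. The same loop applied to tens of thousands of candidates is what makes computational pre-selection so much faster than trial and error at the bench.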

The team used the CLS’s ultra-bright X-rays to analyze the catalyst’s performance during a reaction. “What we needed to do is use that very bright light at the Canadian Light Source to shine it on our material and see how the atomic arrangements would change and respond to the amount of electricity that we put in,” said Abed. The researchers also used the Advanced Photon Source at Argonne National Laboratory near Chicago.

Read more on CLS website

Fundamentally different

Large research facilities at PSI such as the X-ray free-electron laser SwissFEL and the Swiss Light Source SLS – especially after the upgrade SLS 2.0 – deliver unimaginably vast amounts of data. Artificial intelligence is helping to evaluate data efficiently and exploit the facilities’ full potential for research.

Proteins are the workhorses of life. As tiny molecular machines, they are found in every cell and have a role in nearly all biological processes – from metabolism to cellular communication. Their diversity is enormous, because in the human body alone there are hundreds of thousands of different proteins, each with its own function. Proteins are important targets for drugs, and understanding their structure and function is an important task in biological research. One challenge in drug development is to find, if possible, an active agent that interacts with just one type of protein, to the exclusion of all the rest.

To achieve such a feat, one must first understand the language of proteins. The basis of this protein language is a kind of alphabet. It essentially consists of 20 building blocks analogous to letters. In proteins, however, the building blocks are not letters but amino acids. Each protein is built up from a certain sequence of these amino acids; the sequence in turn largely determines its properties. Researchers would now like to know which protein sequence leads to which property. This is where so-called large language models such as GPT-4 come into play. The AI chatbot ChatGPT, which has been causing a stir since 2022, is built on models of this GPT family. Both were developed by the company OpenAI. ChatGPT uses an extensive dataset of texts created by humans to learn the patterns and structures of language. When the user enters a question or task, the model produces a response based on its understanding of the contexts and patterns that it learned during training. In this way it can write poems, novels and even programming code.

Flurin Hidber, a doctoral candidate supervised by Xavier Deupi, an expert in bioinformatics and protein structure at PSI, uses AI in protein research. Hidber uses a sophisticated model similar to ChatGPT that is trained to predict amino acids in protein sequences, instead of generating human-like language. This unique ability does not merely mimic the predictive capabilities of language models in AI, but rather provides valuable insights into the structure and function of proteins. Pharmaceutical researchers could use these to tailor medications and significantly shorten the process of trial and error in the laboratory, which in the end yields only a small proportion of drug candidates with promising properties.
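The idea of predicting an amino acid from its context can be shown with a deliberately simple stand-in. Real protein language models are large neural networks trained on vast numbers of sequences; the sketch below merely counts which residue appears between two neighbours. The training sequences are invented toy strings, not real proteins, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes

def train_context_model(sequences):
    """Count which residue appears between each (left, right) neighbour pair."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(1, len(seq) - 1):
            counts[(seq[i - 1], seq[i + 1])][seq[i]] += 1
    return counts

def predict_masked(counts, left, right):
    """Return the most frequent residue seen in this context, or None if unseen."""
    context = counts.get((left, right))
    if not context:
        return None
    return context.most_common(1)[0][0]

# Invented toy sequences, NOT real proteins.
training = ["MKTAYIAK", "MKTAYLAK", "GKTAYIAR"]
model = train_context_model(training)

# Which residue most often sits between K and A in the training data?
print(predict_masked(model, "K", "A"))  # prints T
```

A neural language model generalises this idea: it takes the whole surrounding sequence into account rather than just two neighbours, which lets it capture long-range patterns related to structure and function.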

An ambitious goal

Deupi and Hidber are working towards an ambitious goal: being able to determine the precise amino acid sequence that leads to a desired protein property. One focus of their research is light-sensitive proteins, a speciality of Deupi’s group and a research subject at SwissFEL. These proteins occur in many organisms, from microbes to humans, and have medical potential. Hidber’s use of AI to predict the properties of light-sensitive proteins solely on the basis of the sequence of their building blocks represents a significant advance in this field.

Through the precise prediction of the light-absorption properties of proteins, Hidber’s work could pave the way for the development of molecules with tailored properties – a step that could have a profound impact on optogenetics. This scientific technique employs light to control and monitor the activity of certain cells in living organisms, such as nerve cells in the brain. Researchers insert genes for light-sensitive proteins into these cells so they can precisely influence the cells’ behaviour by irradiating them with light.

This technology could contribute to the understanding and treatment of neurological diseases, since it provides a tool that can be used to investigate and control the activity of specific brain cells with unprecedented precision. For the future, Deupi and Hidber have set themselves the goal of reversing this process. They want to design new proteins with properties tailored to meet specific requirements, for example proteins that react to light of a particular colour. This blueprint could then be checked experimentally, and hopefully confirmed by colleagues in the laboratory.

The topic of protein dynamics is also at the heart of Cecilia Casadei’s research. The physicist has developed a new algorithm that enables more efficient evaluation of measurements at X-ray free-electron laser facilities such as SwissFEL. The building blocks of life often perform ultrafast movements. Investigating these with precision is crucial to gain a better understanding of proteins. In the long run, this can provide valuable information about disease processes and enable the development of novel medical approaches.

Read more on PSI website

Image: Xavier Deupi (left) and Flurin Hidber from the research group for Condensed Matter Theory want to better understand how the function of proteins is related to their structure. They are targeting light-sensitive proteins in particular. 

Credit: Paul Scherrer Institute/Markus Fischer; AI image generation: Studio HübnerBraun/Midjourney