Using Artificial Intelligence to Explore the Biological World

Determining the shape of proteins is known as the ‘protein folding problem’ and has stood as a grand challenge in biology for the past 50 years.

In 2020 the organisers of the biennial Critical Assessment of Protein Structure Prediction (CASP) competition recognised the AI system AlphaFold as a solution to this grand challenge. The area of Artificial Intelligence is rapidly changing, impacting many different medical sectors, and in this blog, I explore an application that may dramatically change the development of novel therapeutics.

The heart of the biological revolution

Since the discovery in 1953 of the double helical structure of DNA, and its role in encoding genetic information, genetics has exploded as a science, resulting in the development of the biotechnology industry and the use of genomics as a powerful health information source.

To carry out its function, a DNA sequence must be converted into messages that can be used to produce proteins, which are the complex functional molecules within our bodies. The linear DNA sequence encodes, and is eventually translated into, a linear sequence of amino acids, which are the building blocks of proteins.
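As a rough illustration of that mapping from a linear DNA sequence to a linear amino acid sequence, here is a minimal Python sketch. It rests on several assumptions: it skips the intermediate messenger step and reads the coding DNA directly, the codon table is only a small excerpt of the standard genetic code, and the example sequence is made up for illustration.

```python
# A small excerpt of the standard genetic code (DNA codon -> one-letter amino acid).
# "*" marks a stop codon. A real table has 64 entries; this one is deliberately partial.
CODON_TABLE = {
    "ATG": "M",  # methionine (also the usual start codon)
    "GCC": "A",  # alanine
    "AAA": "K",  # lysine
    "TGG": "W",  # tryptophan
    "GAA": "E",  # glutamate
    "TTT": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "TAA": "*", "TAG": "*", "TGA": "*",  # stop codons
}


def translate(dna):
    """Read a coding DNA string three letters at a time and return the amino acid string."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # raises KeyError for codons outside the excerpt
        if amino_acid == "*":
            break  # a stop codon ends the protein
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical 15-letter sequence: four coding codons followed by a stop codon.
print(translate("ATGGCCAAATGGTAA"))
```

The output is still a one-dimensional string; nothing in this step says anything about the shape the chain will eventually adopt.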

Proteins are involved in almost every important activity within a living organism: fighting off invading pathogens, digesting food, building structures such as muscle fibres or hair, providing oxygen to cells, and acting as messengers between cells. Proteins can undertake this vast array of different functions as a direct consequence of their structure. Although they are composed of a string of amino acids in a particular order, they do not remain one-dimensional but instead fold into complex three-dimensional shapes. Since there are 20 different types of amino acid, each with specific chemical properties, and proteins can range in size from tens to thousands of amino acids, a folded protein can display a vast range of chemical and functional characteristics.

Understanding how proteins fold, and into what shape, in order to provide their exquisite functional specificity is therefore essential to understanding how organisms function.

The protein-folding problem

Although proteins are almost completely defined through their linear structure, it has been all but impossible to predict into what shape a given protein will fold. In 1972, Nobel prize-winner Christian Anfinsen predicted that one day it would be possible to determine a protein’s three-dimensional shape based solely on its linear sequence. But for nearly 50 years this problem remained a grand challenge for biologists.

The problem is that a protein can theoretically fold into about 10^300 different conformations, and it would take an impossibly long time for a protein molecule to sample every possible conformation. It is tempting to assume that proteins fold into the correct conformation as they are synthesised, block by block, but this does not appear to be the case, and unfolded (denatured) proteins can almost always be coaxed back into their correctly folded state. In many cases there are specific structures that can be identified within proteins, such as α-helices and β-sheets, that form secondary building blocks and contribute to the final structure.
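To get a feel for "impossibly long", the following back-of-the-envelope Python sketch estimates how long an exhaustive search of roughly 10^300 conformations would take. Only the conformation count comes from the text above; the sampling rate and the age of the universe are illustrative assumptions, and the arithmetic is done on exponents because the raw numbers are too large for ordinary floating point.

```python
import math

LOG10_CONFORMATIONS = 300          # from the article: about 10^300 conformations
LOG10_SAMPLES_PER_SECOND = 13      # assumption: ~10^13 conformations tried per second
AGE_OF_UNIVERSE_SECONDS = 4.35e17  # assumption: roughly 13.8 billion years


def log10_seconds_to_search(log10_conformations, log10_rate):
    """Return log10 of the seconds needed to try every conformation once."""
    # Dividing powers of ten is the same as subtracting their exponents.
    return log10_conformations - log10_rate


log10_seconds = log10_seconds_to_search(LOG10_CONFORMATIONS, LOG10_SAMPLES_PER_SECOND)
log10_universe_ages = log10_seconds - math.log10(AGE_OF_UNIVERSE_SECONDS)

print(f"Exhaustive search: about 10^{log10_seconds} seconds")
print(f"Equivalent to about 10^{log10_universe_ages:.0f} ages of the universe")
```

Even if the assumed sampling rate were wrong by many orders of magnitude, the conclusion would not change: a protein cannot be finding its shape by trying every option in turn.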

But the key question remains: out of all the possible configurations, how does each protein spontaneously fold into one particular shape, allowing it to carry out its specific biological role? Given the importance of the 3D structure of a protein, attempts at the rational development of proteins as therapeutics are often hindered by this problem.

AlphaFold

Every two years, the organisers of CASP hold a competition to determine the most effective computational system for predicting protein structure. The competition is straightforward: competitors are given linear amino acid sequences for around 100 proteins and are required to predict their structures, which are then measured against the experimentally determined conformations. In recent years, a new AI system, AlphaFold, developed by London-based DeepMind, has outclassed all opposition, predicting the structure of the test proteins to within the width of about one atom. Previously, the structures of about 3,500 human proteins had been painstakingly determined using experimental techniques such as X-ray crystallography and NMR, whereas thanks to AlphaFold predicted 3D structures are now available for virtually all of the roughly 20,000 human proteins.
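One simple way to picture how a prediction is measured against a known conformation is the root-mean-square deviation (RMSD) between matching atoms. The Python sketch below is an illustration only: it is not CASP's own scoring procedure, which is more involved, it assumes the two structures have already been superimposed, and the coordinates (in ångströms) are invented for the example.

```python
import math


def rmsd(predicted, reference):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    Assumes both structures are already superimposed in the same coordinate frame.
    """
    if not predicted or len(predicted) != len(reference):
        raise ValueError("structures must be non-empty and the same length")
    squared_error = 0.0
    for p, r in zip(predicted, reference):
        squared_error += sum((a - b) ** 2 for a, b in zip(p, r))
    return math.sqrt(squared_error / len(predicted))


# Hypothetical three-residue fragment: one position per residue.
predicted_structure = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
known_structure = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]

print(f"RMSD: {rmsd(predicted_structure, known_structure):.2f} angstroms")
```

A deviation on the order of an ångström is comparable to the size of a single atom, which is the scale of accuracy described above.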

The artificial intelligence of AlphaFold

Transformers are a neural network architecture that has been used extensively in ML systems since its introduction by Google Brain in 2017. AlphaFold’s development team created a new type of transformer designed specifically to work with three-dimensional structures.

In simple terms, a folded protein can be visualised as a ‘spatial graph’ in which amino acids are the nodes, and edges connect components in close proximity. AlphaFold attempts to interpret the structure of this graph, while reasoning with the virtual graph that it’s building. The model is structured to maximize information flow through recursive hypotheses that create increasingly accurate predictions of the underlying physical structure of the protein, and can determine highly accurate structures in a matter of days.
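The ‘spatial graph’ picture can be made concrete with a short Python sketch that builds such a graph from residue positions. This illustrates the idea only and is not AlphaFold's internal representation: the 8 ångström cutoff for "close proximity" is an assumed value, and the coordinates are invented.

```python
import math
from itertools import combinations

CONTACT_CUTOFF = 8.0  # assumption: residues within 8 angstroms count as "in close proximity"


def build_spatial_graph(coords, cutoff=CONTACT_CUTOFF):
    """Return an adjacency map: residue index -> set of indices of nearby residues."""
    graph = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            graph[i].add(j)  # edges are undirected, so record both directions
            graph[j].add(i)
    return graph


# Hypothetical positions for five residues (one point per amino acid, in angstroms).
residues = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (11.4, 0.0, 0.0),
    (0.0, 5.0, 0.0),
]

for node, neighbours in build_spatial_graph(residues).items():
    print(node, sorted(neighbours))
```

Residues that are far apart in the linear sequence can still be neighbours in this graph, which is exactly the information a one-dimensional sequence does not show directly.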

There are a few potential drawbacks, of course: because AlphaFold was trained on publicly available datasets of known protein structures, it may not accurately predict the shapes of unusual new proteins. Nor does it reveal the mechanism or rules of protein folding, which would be needed for the protein folding problem to be considered solved from an academic perspective.

The future

DeepMind plans to release structures for nearly every protein whose genetic sequence is known to science, over one hundred million. The contribution of AI to structural biology (and most importantly to the design of innovative medicines) has begun, and Plextek is looking to play its part through its capabilities in the field. We are refining and developing our expertise in Machine Learning and Artificial Intelligence in order to provide our clients with state-of-the-art smart systems for medical purposes that use computational processes to improve performance and utility over time.

For an initial chat, please get in touch.