About Me

Hi! Before I explain that white thing I’m holding above, let me introduce myself. I'm Arthur Declercq, a PhD student at CompOmics, Ghent University. I graduated in Biomedical Sciences, where I discovered two things: a love for immunology and a knack for working with computers rather than lab equipment. This led me to bioinformatics, where I now develop AI algorithms to study proteins. My work focuses on identifying epitopes, tiny protein fragments presented on MHC molecules (yes, that’s the white thing!), which are crucial for designing vaccines against cancer and infections.

When I'm not at my desk, you’ll probably find me on the move. Running has become my passion, whether I’m racking up kilometers with the Ghent Running Crew or tackling trails. Beyond running, I love heading out with my tent for a backpacking adventure. And if you ever catch me sitting still, it’s likely with a good book in my hands.

Reach out anytime if you’re into bioinformatics, outdoor adventures, or simply want to chat and exchange ideas!

Experience

Current Position

Predoctoral Researcher at VIB-UGent Center for Medical Biotechnology
2024 – Present

"Working on data-driven bioinformatics solutions for mass spectrometry-based immunopeptidomics."

Education

  • PhD in Bioinformatics – Ghent University
    2021 – ...
  • Deep learning specialisation – Coursera
    2021 – 2022
  • Master in Biomedical Sciences – Ghent University
    2019 – 2021
  • Ethical hacking from scratch in Python – Udemy
    2020 – 2021
  • Honours Programme in life sciences – Ghent University
    2018 – 2020
  • Bachelor in Biomedical Sciences – Ghent University
    2016 – 2019

Skills

Python, Machine Learning, Immunopeptidomics, Bioinformatics, Proteomics, Data Analysis

Photo by Nathalie Dolmans

“Measuring instruments in the biotech world get better every day; with our software, we try to give the development of mRNA vaccines a boost.”

🔗 Read Article

Publications

TIMS²Rescore: A DDA-PASEF optimized data-driven rescoring pipeline based on MS²Rescore

Authors: Arthur Declercq ∙ Robbe Devreese ∙ Jonas Scheid ∙ Caroline Jachmann ∙ Tim Van Den Bossche ∙ Annica Preikschat ∙ David Gomez-Zepeda ∙ Jeewan Babu Rijal ∙ Aurélie Hirschler ∙ Jonathan R Krieger ∙ Tharan Srikumar ∙ George Rosenberger ∙ Dennis Trede ∙ Christine Carapito ∙ Stefan Tenzer ∙ Juliane S Walz ∙ Sven Degroeve ∙ Robbin Bouwmeester ∙ Lennart Martens ∙ Ralf Gabriels

Journal: Journal of Proteome Research | Year: February 2025

The high throughput analysis of proteins with mass spectrometry (MS) is highly valuable for understanding human biology, discovering disease biomarkers, identifying therapeutic targets, and exploring pathogen interactions. To achieve these goals, specialized proteomics subfields – such as plasma proteomics, immunopeptidomics, and metaproteomics – must tackle specific analytical challenges, such as an increased identification ambiguity compared to routine proteomics experiments. Technical advancements in MS instrumentation can counter these issues by acquiring more discerning information at higher sensitivity levels, as is exemplified by the incorporation of ion mobility and parallel accumulation - serial fragmentation (PASEF) technologies in timsTOF instruments. In addition, AI-based bioinformatics solutions can help overcome ambiguity issues by integrating more data into the identification workflow. Here, we introduce TIMS2Rescore, a data-driven rescoring workflow optimized for DDA-PASEF data from timsTOF instruments. This platform includes new timsTOF MS2PIP spectrum prediction models and IM2Deep, a new deep learning-based peptide ion mobility predictor. Furthermore, to fully streamline data throughput, TIMS2Rescore directly accepts Bruker raw mass spectrometry data, and search results from ProteoScape and many other search engines, including MS Amanda and PEAKS. We showcase TIMS2Rescore performance on plasma proteomics, immunopeptidomics (HLA class I and II), and metaproteomics data sets. TIMS2Rescore is open-source and freely available at https://github.com/compomics/tims2rescore.

View Manuscript

MHCquant2 refines immunopeptidomics tumor antigen discovery

Authors: Jonas Scheid ∙ Steffen Lemke ∙ Naomi Hoenisch-Gravel ∙ Anna Dengler ∙ Timo Sachsenberg ∙ Arthur Declercq ∙ Ralf Gabriels ∙ Jens Bauer ∙ Marcel Wacker ∙ Leon Bichmann ∙ Lennart Martens ∙ Marissa L. Dubbelaar ∙ Sven Nahnsen ∙ Juliane S. Walz

Journal: Research Square | Year: December 2024

The identification of human leukocyte antigen (HLA)-presented peptides as targets of anti-cancer T cell response is pivotal for the development of novel immunotherapies. Mass spectrometry (MS)-based immunopeptidomics enables the detection of these peptides, yet confident identifications and thus implementation in immunotherapy design are hampered by the high diversity and low abundance of naturally presented HLA peptides. Here, we introduce MHCquant2, a Nextflow-based open-source pipeline that leverages OpenMS tools and peptide property predictors (DeepLC, MS2PIP) for highly sensitive and scalable HLA peptide identification and quantification across various MS platforms. MHCquant2 increased peptide identifications up to 27% with a significant expansion of low-abundant peptides, outperforming state-of-the-art pipelines. Using MHCquant2 we build a comprehensive benign tissue repository comprising re-analyzed data from available benign immunopeptidomes and a novel benignMHCquant2 dataset, adding more than 160,000 novel naturally presented HLA peptides. First applications of this benign repository and the MHCquant2 pipeline enabled (i) the refinement of tumor-associated antigens, (ii) the detection of novel, high-frequent tumor-exclusive peptide antigens for multiple tumor entities, and (iii) the identification and quantification of mutation-derived low-abundant neoepitopes. MHCquant2 refines tumor antigen discovery in immunopeptidomics, paving the way for the implementation of off-the-shelf and personalized immunotherapy design.

View Manuscript

Maximizing immunopeptidomics-based bacterial epitope discovery by multiple search engines and rescoring

Authors: Patrick Willems ∙ Fabien Thery ∙ Laura Van Moortel ∙ Margaux De Meyer ∙ An Staes ∙ Adillah Gul ∙ Lyudmila Kovalchuke ∙ Arthur Declercq ∙ Robbe Devreese ∙ Robbin Bouwmeester ∙ Ralf Gabriels ∙ Lennart Martens ∙ Francis Impens

Journal: bioRxiv | Year: November 2024

Mass spectrometry-based discovery of bacterial immunopeptides presented by infected cells allows untargeted discovery of bacterial antigens that can serve as vaccine candidates. Reliable identification of bacterial epitopes by such immunopeptidomics approaches is however challenged by their extremely low abundance. Here, we describe an optimized bioinformatical framework to enhance the confident identification of bacterial immunopeptides. Immunopeptidomics data of cell cultures infected with the foodborne model pathogen Listeria monocytogenes were searched by four different search engines, PEAKS, Comet, Sage and MSFragger, followed by data-driven rescoring with MS2Rescore. Compared to standard single search-engine results, this integrated workflow boosted the number of identified immunopeptides on average by 27% and led to the high confident detection of 18 additional bacterial peptides (+27%) matching 15 different Listeria proteins (+36%). Despite an overall large agreement between the search engines, a small number of conflicts (< 1%) in spectra-to-peptide assignments revealed ambiguous identifications that served as a quality filter. Finally, we show compatibility of our workflow with sensitive timsTOF data acquisition and find that rescoring, now with inclusion of ion mobility features, identifies 76% more peptides compared to orbitrap-based acquisition. Together, our results demonstrate how integration of multiple search engine results along with data-driven rescoring maximizes the identification of immunopeptides, boosting the detection of high confident bacterial epitopes for vaccine development.

View Manuscript

MS2Rescore: Data-Driven Rescoring Dramatically Boosts Immunopeptide Identification Rates

Authors: Arthur Declercq ∙ Robbin Bouwmeester ∙ Aurélie Hirschler ∙ Christine Carapito ∙ Sven Degroeve ∙ Lennart Martens ∙ Ralf Gabriels

Journal: Molecular & Cellular Proteomics | Year: August 2022

Immunopeptidomics aims to identify major histocompatibility complex (MHC)-presented peptides on almost all cells that can be used in anti-cancer vaccine development. However, existing immunopeptidomics data analysis pipelines suffer from the nontryptic nature of immunopeptides, complicating their identification. Previously, peak intensity predictions by MS2PIP and retention time predictions by DeepLC have been shown to improve tryptic peptide identifications when rescoring peptide-spectrum matches with Percolator. However, as MS2PIP was tailored toward tryptic peptides, we have here retrained MS2PIP to include nontryptic peptides. Interestingly, the new models not only greatly improve predictions for immunopeptides but also yield further improvements for tryptic peptides. We show that the integration of new MS2PIP models, DeepLC, and Percolator in one software package, MS2Rescore, increases spectrum identification rate and unique identified peptides with 46% and 36% compared to standard Percolator rescoring at 1% FDR. Moreover, MS2Rescore also outperforms the current state-of-the-art in immunopeptide-specific identification approaches. Altogether, MS2Rescore thus allows substantially improved identification of novel epitopes from existing immunopeptidomics workflows.

View Manuscript

MS2Rescore 3.0 Is a Modular, Flexible, and User-Friendly Platform to Boost Peptide Identifications, as Showcased with MS Amanda 3.0

Authors: Louise M. Buur ∙ Arthur Declercq ∙ Marina Strobl ∙ Robbin Bouwmeester ∙ Sven Degroeve ∙ Lennart Martens ∙ Viktoria Dorfer ∙ Ralf Gabriels

Journal: Journal of Proteome Research | Year: March 2024

Rescoring of peptide–spectrum matches (PSMs) has emerged as a standard procedure for the analysis of tandem mass spectrometry data. This emphasizes the need for software maintenance and continuous improvement for such algorithms. We introduce MS2Rescore 3.0, a versatile, modular, and user-friendly platform designed to increase peptide identifications. Researchers can install MS2Rescore across various platforms with minimal effort and benefit from a graphical user interface, a modular Python API, and extensive documentation. To showcase this new version, we connected MS2Rescore 3.0 with MS Amanda 3.0, a new release of the well-established search engine, addressing previous limitations on automatic rescoring. Among new features, MS Amanda now contains additional output columns that can be used for rescoring. The full potential of rescoring is best revealed when applied on challenging data sets. We therefore evaluated the performance of these two tools on publicly available single-cell data sets, where the number of PSMs was substantially increased, thereby demonstrating that MS2Rescore offers a powerful solution to boost peptide identifications. MS2Rescore’s modular design and user-friendly interface make data-driven rescoring easily accessible, even for inexperienced users. We therefore expect the MS2Rescore to be a valuable tool for the wider proteomics community. MS2Rescore is available at https://github.com/compomics/ms2rescore.

View Manuscript

Updated MS²PIP web server supports cutting-edge proteomics applications

Authors: Arthur Declercq ∙ Robbin Bouwmeester ∙ Cristina Chiva ∙ Eduard Sabidó ∙ Aurélie Hirschler ∙ Christine Carapito ∙ Lennart Martens ∙ Sven Degroeve ∙ Ralf Gabriels

Journal: Nucleic Acids Research | Year: July 2023

Interest in the use of machine learning for peptide fragmentation spectrum prediction has been strongly on the rise over the past years, especially for applications in challenging proteomics identification workflows such as immunopeptidomics and the full-proteome identification of data independent acquisition spectra. Since its inception, the MS²PIP peptide spectrum predictor has been widely used for various downstream applications, mostly thanks to its accuracy, ease-of-use, and broad applicability. We here present a thoroughly updated version of the MS²PIP web server, which includes new and more performant prediction models for both tryptic- and non-tryptic peptides, for immunopeptides, and for CID-fragmented TMT-labeled peptides. Additionally, we have also added new functionality to greatly facilitate the generation of proteome-wide predicted spectral libraries, requiring only a FASTA protein file as input. These libraries also include retention time predictions from DeepLC. Moreover, we now provide pre-built and ready-to-download spectral libraries for various model organisms in multiple DIA-compatible spectral library formats. Besides upgrading the back-end models, the user experience on the MS²PIP web server is thus also greatly enhanced, extending its applicability to new domains, including immunopeptidomics and MS3-based TMT quantification experiments. MS²PIP is freely available at https://iomics.ugent.be/ms2pip.

View Manuscript

Thunder-DDA-PASEF enables high-coverage immunopeptidomics and is boosted by MS2Rescore with MS2PIP timsTOF fragmentation prediction model

Authors: David Gomez-Zepeda ∙ Danielle Arnold-Schild ∙ Julian Beyrle ∙ Arthur Declercq ∙ Ralf Gabriels ∙ Elena Kumm ∙ Annica Preikschat ∙ Mateusz Krzysztof Łącki ∙ Aurélie Hirschler ∙ Jeewan Babu Rijal ∙ Christine Carapito ∙ Lennart Martens ∙ Ute Distler ∙ Hansjörg Schild ∙ Stefan Tenzer

Journal: Nature Communications | Year: March 2024

Human leukocyte antigen (HLA) class I peptide ligands (HLAIps) are key targets for developing vaccines and immunotherapies against infectious pathogens or cancer cells. Identifying HLAIps is challenging due to their high diversity, low abundance, and patient individuality. Here, we develop a highly sensitive method for identifying HLAIps using liquid chromatography-ion mobility-tandem mass spectrometry (LC-IMS-MS/MS). In addition, we train a timsTOF-specific peak intensity MS2PIP model for tryptic and non-tryptic peptides and implement it in MS2Rescore (v3) together with the CCS predictor from ionmob. The optimized method, Thunder-DDA-PASEF, semi-selectively fragments singly and multiply charged HLAIps based on their IMS and m/z. Moreover, the method employs the high sensitivity mode and extended IMS resolution with fewer MS/MS frames (300 ms TIMS ramp, 3 MS/MS frames), doubling the coverage of immunopeptidomics analyses, compared to the proteomics-tailored DDA-PASEF (100 ms TIMS ramp, 10 MS/MS frames). Additionally, rescoring boosts the HLAIps identification by 41.7% to 33%, resulting in 5738 HLAIps from as little as one million JY cell equivalents, and 14,516 HLAIps from 20 million. This enables in-depth profiling of HLAIps from diverse human cell lines and human plasma. Finally, profiling JY and Raji cells transfected to express the SARS-CoV-2 spike protein results in 16 spike HLAIps, thirteen of which have been reported to elicit immune responses in human patients.

View Manuscript

Bioinformatics Pipeline for Processing Single-Cell Data

Authors: Arthur Declercq ∙ Nina Demeulemeester ∙ Ralf Gabriels ∙ Robbin Bouwmeester ∙ Sven Degroeve ∙ Lennart Martens

Book: Mass Spectrometry Based Single Cell Proteomics | Year: June 2024

Single-cell proteomics can offer valuable insights into dynamic cellular interactions, but identifying proteins at this level is challenging due to their low abundance. In this chapter, we present a state-of-the-art bioinformatics pipeline for single-cell proteomics that combines the search engine Sage (via SearchGUI), identification rescoring with MS2Rescore, quantification through FlashLFQ, and differential expression analysis using MSqRob2. MS2Rescore leverages LC-MS/MS behavior predictors, such as MS2PIP and DeepLC, to recalibrate scores with Percolator or mokapot. Combining these tools into a unified pipeline, this approach improves the detection of low-abundance peptides, resulting in increased identifications while maintaining stringent FDR thresholds.

View Book Chapter

psm_utils: A high-level Python API for parsing and handling peptide-spectrum matches and proteomics search results

Authors: Ralf Gabriels ∙ Arthur Declercq ∙ Robbin Bouwmeester ∙ Sven Degroeve ∙ Lennart Martens

Journal: Journal of Proteome Research | Year: December 2022

A plethora of proteomics search engine output file formats are in circulation. This lack of standardized output files greatly complicates generic downstream processing of peptide-spectrum matches (PSMs) and PSM files. While standards exist to solve this problem, these are far from universally supported by search engines. Moreover, software libraries are available to read a selection of PSM file formats, but a package to parse PSM files into a unified data structure has been missing. Here, we present psm_utils, a Python package to read and write various PSM file formats and to handle peptidoforms, PSMs, and PSM lists in a unified and user-friendly Python-, command line-, and web-interface. psm_utils was developed with pragmatism and maintainability in mind, adhering to community standards and relying on existing packages where possible. The Python API and command line interface greatly facilitate handling various PSM file formats. Moreover, a user-friendly web application was built using psm_utils that allows anyone to interconvert PSM files and retrieve basic PSM statistics. psm_utils is freely available under the permissive Apache2 license at https://github.com/compomics/psm_utils.

View Manuscript

Ionmob: a Python package for prediction of peptide collisional cross-section values

Authors: David Teschner ∙ David Gomez-Zepeda ∙ Arthur Declercq ∙ Mateusz K Łącki ∙ Seymen Avci ∙ Konstantin Bob ∙ Ute Distler ∙ Thomas Michna ∙ Lennart Martens ∙ Stefan Tenzer ∙ Andreas Hildebrandt

Journal: Bioinformatics | Year: August 2023

Including ion mobility separation (IMS) into mass spectrometry proteomics experiments is useful to improve coverage and throughput. Many IMS devices enable linking experimentally derived mobility of an ion to its collisional cross-section (CCS), a highly reproducible physicochemical property dependent on the ion’s mass, charge and conformation in the gas phase. Thus, known peptide ion mobilities can be used to tailor acquisition methods or to refine database search results. The large space of potential peptide sequences, driven also by posttranslational modifications of amino acids, motivates an in silico predictor for peptide CCS. Recent studies explored the general performance of varying machine-learning techniques, however, the workflow engineering part was of secondary importance. For the sake of applicability, such a tool should be generic, data driven, and offer the possibility to be easily adapted to individual workflows for experimental design and data processing. We created ionmob, a Python-based framework for data preparation, training, and prediction of collisional cross-section values of peptides. It is easily customizable and includes a set of pretrained, ready-to-use models and preprocessing routines for training and inference. Using a set of ≈21 000 unique phosphorylated peptides and ≈17 000 MHC ligand sequences and charge state pairs, we expand upon the space of peptides that can be integrated into CCS prediction. Lastly, we investigate the applicability of in silico predicted CCS to increase confidence in identified peptides by applying methods of re-scoring and demonstrate that predicted CCS values complement existing predictors for that task. The Python package is available at github: https://github.com/theGreatHerrLebert/ionmob.

View Manuscript

Quality control for the target decoy approach for peptide identification

Authors: Elke Debrie ∙ Milan Malfait ∙ Ralf Gabriels ∙ Arthur Declercq ∙ Adriaan Sticker ∙ Lennart Martens ∙ Lieven Clement

Journal: Journal of Proteome Research | Year: January 2023

Reliable peptide identification is key in mass spectrometry (MS) based proteomics. To this end, the target decoy approach (TDA) has become the cornerstone for extracting a set of reliable peptide-to-spectrum matches (PSMs) that will be used in downstream analysis. Indeed, TDA is now the default method to estimate the false discovery rate (FDR) for a given set of PSMs, and users typically view it as a universal solution for assessing the FDR in the peptide identification step. However, the TDA also relies on a minimal set of assumptions, which are typically never verified in practice. We argue that a violation of these assumptions can lead to poor FDR control, which can be detrimental to any downstream data analysis. We here therefore first clearly spell out these TDA assumptions, and introduce TargetDecoy, a Bioconductor package with all the necessary functionality to control the TDA quality and its underlying assumptions for a given set of PSMs.

View Manuscript
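
For readers unfamiliar with the target decoy approach mentioned in the abstract above, here is a minimal sketch of the basic FDR estimate it relies on: the number of decoy PSMs above a score threshold divided by the number of target PSMs above it. This is an illustration only and not the TargetDecoy package; the function name `estimate_fdr` and the toy scores are assumptions made for this example.

```python
from typing import List, Tuple


def estimate_fdr(psms: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate the FDR among target PSMs scoring at or above `threshold`.

    `psms` holds (score, is_decoy) pairs. Hypothetical helper for illustration.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # no accepted targets, so nothing to estimate
    return decoys / targets


# Toy data (invented): higher scores are better, True marks a decoy hit.
psms = [
    (9.1, False), (8.7, False), (8.2, False), (7.9, True),
    (7.5, False), (6.8, True), (6.1, False), (5.5, True),
]

fdr_at_7 = estimate_fdr(psms, 7.0)  # 1 decoy and 4 targets pass the threshold
```

The estimate is only meaningful when the assumptions discussed in the abstract hold, for example that decoys and incorrect target matches are equally likely to pass a given score.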

Intensity and retention time prediction improves the rescoring of protein-nucleic acid cross-links

Authors: Arslan Siraj ∙ Robbin Bouwmeester ∙ Arthur Declercq ∙ Luisa Welp ∙ Aleksandar Chernev ∙ Alexander Wulf ∙ Henning Urlaub ∙ Lennart Martens ∙ Sven Degroeve ∙ Oliver Kohlbacher ∙ Timo Sachsenberg

Journal: Proteomics | Year: April 2024

In protein-RNA cross-linking mass spectrometry, UV or chemical cross-linking introduces stable bonds between amino acids and nucleic acids in protein-RNA complexes that are then analyzed and detected in mass spectra. This analytical tool delivers valuable information about RNA-protein interactions and RNA docking sites in proteins, both in vitro and in vivo. The identification of cross-linked peptides with oligonucleotides of different length leads to a combinatorial increase in search space. We demonstrate that the peptide retention time prediction tasks can be transferred to the task of cross-linked peptide retention time prediction using a simple amino acid composition encoding, yielding improved identification rates when the prediction error is included in rescoring. For the more challenging task of including fragment intensity prediction of cross-linked peptides in the rescoring, we obtain, on average, a similar improvement. Further improvement in the encoding and fine-tuning of retention time and intensity prediction models might lead to further gains, and merit further research.

View Manuscript

Projects

MS²Rescore

Modular and user-friendly platform for AI-assisted rescoring of peptide identifications.

github.com/compomics/ms2rescore

MS²PIP

MS2 Peak Intensity Prediction - Fast and accurate peptide fragmentation spectrum prediction.

github.com/compomics/ms2pip

Mumble

Mumble is a Python-based tool designed to find candidate Unimod modifications for mass shifts.

github.com/compomics/mumble
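
To give a feel for the idea behind Mumble, the sketch below matches an observed mass shift against a small table of modification masses within a tolerance. It is not Mumble's code or API: the function `candidate_modifications`, the default tolerance, and the six-entry table are assumptions for illustration, with monoisotopic masses taken from a handful of common Unimod entries.

```python
from typing import Dict, List, Tuple

# Monoisotopic mass shifts in Da for a few common modifications.
# A real tool would consult the full Unimod database instead.
MODIFICATIONS: Dict[str, float] = {
    "Deamidated": 0.984016,
    "Methyl": 14.015650,
    "Oxidation": 15.994915,
    "Acetyl": 42.010565,
    "Carbamidomethyl": 57.021464,
    "Phospho": 79.966331,
}


def candidate_modifications(
    mass_shift: float, tolerance: float = 0.01
) -> List[Tuple[str, float]]:
    """Return (name, mass error) for modifications within `tolerance` Da of the shift."""
    hits = [
        (name, mass_shift - mass)
        for name, mass in MODIFICATIONS.items()
        if abs(mass_shift - mass) <= tolerance
    ]
    # Closest match first.
    return sorted(hits, key=lambda hit: abs(hit[1]))


# A shift of about +15.995 Da matches oxidation in this small table.
oxidation_hits = candidate_modifications(15.995)
```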

MHC-3PO

MHC binding affinity predictions using large language models and graph neural networks.

Coming soon

Baxerna

Contributor to Baxerna, a next-generation bacterial vaccine development pipeline to generate novel pathogen-specific mRNA vaccines.

baxerna.eu

Website

Creating my own website using the Quarto framework.

github.com/arthurdeclercq.github.io