cv

General Information

Name
José Guilherme de Almeida
Date of Birth
17th November 1994
Languages
English (proficient), Portuguese (native), Spanish (beginner)

Technical skills

  • Programming
    • Advanced/developer: Python and R
    • Intermediate/hobbyist: HTML, CSS, JavaScript
    • Beginner/learning: C
  • Machine-learning
    • Tabular ML frameworks (scikit-learn, caret)
    • Deep-learning frameworks (torch, MONAI, lightning, huggingface)
  • Large language models
    • LLM chatbot/agentic frameworks with langchain/ollama and RAG-based applications using Chroma/Weave
    • Orchestration of workflows with OpenAI/Gemini/Anthropic/Ollama APIs
  • Computer-vision
    • scikit-image, OpenCV, torchvision
  • Data science
    • Frequentist methods (hypothesis testing, multivariate analyses)
    • Bayesian methods (MCMC)
    • Data manipulation (pandas, polars, tidyverse)
  • Data visualization
    • ggplot2 (R)
    • d3.js (JavaScript)
  • Workflow
    • Version control (git)
    • Containerisation (Docker)
    • Workflow management (snakemake)

Work experience

  • 2022-present
    Clinical AI researcher
    Champalimaud Centre for the Unknown
    • Development of supervised, semi-supervised and self-supervised deep-learning methods for clinical image classification and segmentation (CT, MRI)
    • Development of generative AI methods for data harmonization and synthetic data generation
    • Automation of dataset organization and curation for large multi-centre datasets (>10,000 radiology studies)
    • Supervision and training of BSc and MSc students
    • International collaboration with multiple research groups
  • 2017-2022
    Doctoral fellow
    EMBL-EBI
    • Development of machine-learning, deep-learning and multiple-instance-learning methods to detect, characterize and classify cells and patients
    • Statistical and Bayesian modelling of longitudinal targeted sequencing experiments to uncover the genetic and non-genetic factors driving clonal expansion
    • Phylogenetic and phylodynamic modelling of the lifelong trajectories of clones using single-cell colonies in healthy individuals
  • 2016-2017
    Student researcher (MSc thesis)
    CNC-UC
    • Development of machine-learning protocols to determine hot-spots (important residues) in the binding interfaces of proteins
    • Structural and statistical analysis of large collections of protein-protein complexes and structural characterization of complexes with no known structure

Education

  • 2017-2022
    PhD in Computational Biology
    EMBL-EBI + Cambridge University, Cambridge, UK
    • Thesis: Computational analyses of blood cells: somatic evolution and morphology
  • 2015-2017
    MSc in Cell and Molecular Biology (with honours)
    Universidade de Coimbra
    • Thesis: Computational methods for the understanding of protein-based interactions
  • 2012-2015
    BSc in Biochemistry
    Universidade de Coimbra
    • Final project: Differential expression of PD-1/PD-L1 in individuals with chronic myeloid leukaemia