
Research/Data Scientist
Research/Data Scientist with 10+ years of experience building machine learning models for academic and real-world applications. During my career, I have built ML models for tabular data (e.g., time series, genomics, and proteomics), imaging data (e.g., histopathology and radiology), and textual data (unstructured free text), covering both supervised learning (classification and prediction) and unsupervised learning (clustering), with a special focus on the medical and biomedical domains. Along the way, I have worked with statistical concepts (e.g., bias, variance, optimization, and model convergence), text mining practices (e.g., curation, semantic ontologies, lexical tokenization, and syntactic parsing), and image processing techniques (e.g., segmentation, annotation, and convolution).
Additionally, I have worked on enhancing such ML models (e.g., through feature selection, feature engineering, and dimensionality reduction) and simplifying them (through probability-based explanation and visualization). To do so, I combine my programming languages (Python, R, C++, and Java, in order of fluency) with database and data-handling skills (e.g., SQL, PostgreSQL, Pandas, and graph databases) and CI/CD pipeline skills (GitHub and GitLab), along with the ability to leverage big data analytics, HPC, and cloud platforms (AWS and Azure). Recently, I have been heavily engaged in leveraging LLMs, including Llama, Gemma, and Mistral, for RAG tasks, as well as fine-tuning such models for specific, tailored tasks (e.g., structuring unstructured data, text summarization, question answering, vision-language modeling, and generating synthetic data).
Postdoctoral Researcher at the Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg