Pierre GUILLAUME
Machine Learning Engineer
I am passionate about Machine Learning, Deep Learning and Computer Science.
My interest in these advanced technologies has encouraged me to spend a significant portion of my time learning and gradually improving my abilities in these fields.

My experience
I specialized in image processing, transitioned to a research role in Natural Language Processing (NLP), and subsequently worked in the Criteo AI Lab, focusing on bidding models.
Jan 2020 - Aug 2022
Machine Learning researcher @ LSE EPITA
Research on audio, hate speech and toxic comment detection (NLP)
Publications
Sep 2020 - Jan 2021
Machine Learning engineer @ Alsid
Implementation of an end-to-end ML solution for anomaly detection in large-scale Active Directory environments.
Big Data Feature Engineering (30 million features) and Unsupervised Learning (Clustering, Autoencoders, PCA, Anomaly detection)
Jan 2022 - Dec 2023
Machine Learning engineer @ Criteo (Criteo AI Lab)
Working on bidding models (Click / Visit / Sales / Install prediction models ...).
Processing billions of events with PySpark to fit bidding models.
Worked with Google Chrome teams regarding the implementation of Privacy Sandbox.
Jan 2022 - Dec 2023
Machine Learning engineer freelance
Implementation of custom Machine Learning / Deep Learning models for different companies
MLOps: Deployment of Machine Learning / Deep Learning models (Vertex AI / Hugging Face / OVH ...)
Projects
User churn prediction / optimization platform with ML model

ML model to predict user churn (1 day / 3 days / 7 days).
ML model to predict conversions.
Full stack deployment on GCP.
Front end app + REST API.
Tech : Python, BigQuery, Golang, Rust, React (TypeScript), GCP
Data loss prevention platform with LLM

NLP model (xlm-roberta-base) to detect sensitive / leaked data on the ChatGPT platform.
Model Deployment.
Front end app + REST API + Chrome plugin.
Zilliz vector DB.
Tech : Python, React JS, Flask, GCP, Zilliz
Machine Learning model for company growth forecast

Built a model to predict a company's future growth based on historical earnings.
Feature engineering on NYSE data.
Top 10 predictions: +40% each year.
Tech : Python
Cyber LLM

Alpaca-LoRA LLM fine-tuned on cybersecurity datasets to detect malware, infected files and dangerous code.
Model deployment.
Tech : Python
Hate speech / toxic comments detection

Comparison of state-of-the-art models.
Publication of 2 papers.
Tech : Python
Japanese Woodblock prints generation

Web scraping to build the dataset (10k+ images).
Pix2Pix model implementation to generate woodblock prints from an input image or drawing.
Tech : Python
Skills
Machine Learning / Deep Learning / Data science
Tabular datasets (high-cardinality features / numerical features / categorical features)
NLP (models / LLM fine-tuning / data augmentation)
Machine Learning on very large datasets under performance constraints
Software
Spark / Backend / Programming
Papers