Artificial Intelligence Engineer at Quinta (formerly Quicktext) (2024-10 – Present)
Co-developed Q-agents leveraging Small Language Models with a Markovian state approach, architected end-to-end RAG and ETL pipelines, engineered a Model Context Protocol server, and developed analytics dashboards for MLOps evaluation.
- Co-developed Q-agents built on Small Language Models (SLMs). Implemented a Markovian state approach that decouples decision-making from conversation history, reducing hallucinations and cutting operational costs to ~$0.001 per turn on average while maintaining ~5s latency.
- Architected an end-to-end RAG and ETL pipeline to scrape, semantically chunk, and ingest unstructured data (websites, PDFs, images) into a vector DB, pairing it with a fine-tuned LLM to power a Q&A system supporting 85% automation across 38 languages.
- Engineered a unified Model Context Protocol (MCP) server to enable real-time Q&A and end-to-end booking natively across ChatGPT, Claude, and internal agent workflows.
- Developed analytics dashboards to monitor scraping performance, system usage, and operational costs. Engineered an accompanying MLOps evaluation pipeline in n8n to analyze production logs, scoring LLM responses using custom metrics and selective Chain-of-Thought (CoT).
- Established a comprehensive internal benchmark to rapidly test and validate the performance of new models, prompt engineering techniques, and agent workflows.
- Contributed to an end-to-end Audio-LLM architecture (audio-to-token) that eliminates intermediate transcription to reduce latency for conversational AI.
- Technologies: FastAPI, LangChain, LangGraph, n8n, Supabase, ChromaDB, Docker, Azure, Redis, Hugging Face, LoRA/QLoRA, Twilio, Power BI, Excel.
Machine Learning Engineer at Solyntek (2024-01 – 2024-07)
Designed and developed AI-powered safety monitoring modules, covering model architecture design, training, and post-training quantization. Built Kafka-driven data pipelines for real-time ingestion.
- Designed and developed AI-powered safety monitoring modules, including model architecture design, training, and post-training quantization to optimize inference performance.
- Built Kafka-driven data pipelines for real-time ingestion and integration, ensuring scalable, low-latency transmission across systems.
- Technologies: Python, C++, PyTorch, OpenCV, Docker, Apache Kafka, Azure, Jira, GitHub Actions (CI/CD).
AI Research Intern at Hematology Laboratory Hospital Farhat Hached (2023-06 – 2023-09)
Developed an automated application for blood cell counting and Acute Lymphoblastic Leukemia (ALL) cell detection to streamline diagnostic workflows.
- Developed an automated application for blood cell counting and Acute Lymphoblastic Leukemia (ALL) cell detection, streamlining diagnostic workflows.
- Technologies: Python, YOLO, Faster R-CNN, ViT, TensorFlow, Keras, OpenCV, Labelme.
Software Engineer Intern at Proxym Group (2022-06 – 2022-08)
Built an intranet employee rating engine and monitoring dashboard combining ML with rule-based formulas.
- Built an intranet employee rating engine and monitoring dashboard combining ML (XGBoost, Random Forests) with rule-based formulas.
- Technologies: Django, XGBoost, Scikit-learn, JavaScript, HTML, CSS.