Ouael Ben Amara

Ph.D. Candidate in Computer Science

University of Michigan-Dearborn

Specializing in probabilistic databases and query execution engines, with publications at premier A*/A-tier data management conferences including SIGMOD 2024 and EDBT 2022. Research focuses on relational probabilistic programming, knowledge compilation, query learning, and machine learning integration. Seeking full-time research positions in fundamental ML research, optimization problems, and low-level ML algorithms.

SIGMOD 2024 (A*)
3× ACM Badges
SIGMOD 2026 Submission
3.9/4.0 GPA

About Me

Ph.D. Candidate in Computer Science at the University of Michigan-Dearborn (Expected 2026) with a 3.9/4.0 CGPA. Specializing in probabilistic databases and relational probabilistic programming. Published at SIGMOD 2024 (A* conference), earning all three ACM reproducibility badges. Investigating the integration of symbolic and subsymbolic reasoning for transparent and explainable AI systems.

Skills

Expert in Python, C, C++, and Rust for research applications. Advanced proficiency in PyTorch, Keras, TensorFlow, PyMC3, and CUDA programming. Specialized in probabilistic databases, Variational Inference, MCMC (Gibbs sampling), LLMs, RAG, vector databases, and semantic search. Deep expertise in knowledge compilation and probabilistic inference.

Achievements

Published at SIGMOD 2024 (A*) and EDBT 2022 (A-tier), with a paper under review at SIGMOD 2026. StarfishDB project awarded all three ACM reproducibility badges (Available, Evaluated & Reusable, Reproducible). Contributed to scikit-learn. Co-founded DoVisual (Memorality) and won multiple competitions, including the Arab Bank and MENA Chamber of Commerce competitions. Multiple academic honors, including Honors List and Dean's List.

Experience

Research Assistant

The University of Michigan-Dearborn, USA
2020 - Present
  • StarfishDB Project: Developed query execution engine for relational probabilistic programming published at SIGMOD 2024 (A* conference), earning all three ACM reproducibility badges (Available, Evaluated and Reusable, Reproducible)
  • Designed and implemented novel algorithms for probabilistic query execution, achieving significant performance improvements over existing probabilistic programming systems
  • Developed knowledge compilation techniques for efficient probabilistic inference in relational databases, enabling scalable query processing over complex probabilistic models
  • Contributed to the definition and implementation of Gamma Probabilistic Databases for learning from query-answers (EDBT 2022, A-tier conference)
  • Developed query learning methods using JIT-compilation techniques for probabilistic inference
  • Paper under review at SIGMOD 2026 on advanced query learning approaches and knowledge compilation for probabilistic databases
  • Research Proposals: Principally developed two proposals on 'Explainable AI through Declarative Probabilistic Programming' and 'Learning Adaptive Hierarchical Partitions for Efficient Large-Scale Vector Retrieval'
  • Independent Project: Developed an open-source Bayesian Point Cloud Registration project exploring conjugacy and MCMC methods for 3D rigid transformations, achieving superior performance compared to standard ICP methods while also providing uncertainty quantification
  • Investigating integration of symbolic and subsymbolic reasoning for transparent and explainable AI systems

Founder, DoVisual (Memorality)

Paris, France
June 2023 - June 2025
  • Co-founded a computer vision startup developing DoVisual, a platform enabling users to build machine learning pipelines using predefined nodes maintained and verified by expert agents
  • Specialized in deepfake detection, pose estimation, scene understanding, and object tracking
  • Advanced image/media manipulation detection and unsupervised image clustering with vectorization and semantic search capabilities
  • Won Arab Bank competition and MENA Chamber of Commerce pitching competition
  • Selected for Station F founders program and participated in GAIA Saudi Arabia

NLP Engineer

Benten Technologies, Virginia, USA
May 2022 - Aug 2022
  • Designed and developed a chatbot based on the Rasa framework
  • Developed NLU and intent classifiers for the chatbot
  • Implemented speech-to-text and text-to-speech modules
  • Participated in mobile app development and embedded system design for smart speakers
  • Incorporated BlenderBot with conditional generation using Hugging Face

Teaching Assistant

University of Michigan-Dearborn, MI, USA
Sept 2021 - Present
  • Main Lecturer for C++ Course: Independently taught an entire C++ programming course, serving as the primary instructor with full responsibility for course delivery and content creation
  • Designed and developed comprehensive course materials including lecture slides, programming assignments, and practice problems
  • Created and graded exams, assessments, and project assignments for the C++ course
  • Graded for database management systems and software engineering courses, and taught the software engineering tools lab

Fintech Project Founder

KernelSnap, Tunis, Tunisia
June 2019 - 2022
  • Developed an automated identity verification process for fintech companies using Keras and OpenCV
  • Integrated live OCR processing, face recognition, and anomaly detection
  • Handled marketing and business development including pitching and client acquisition

Machine Learning Intern

Slide Money, Tunis, Tunisia
June 2019 - Sept 2019
  • Implemented a Hidden Markov Model as an initial baseline model
  • Developed an RNN-based asset trading bot that integrates various metrics

Publications

StarfishDB: A Query Execution Engine for Relational Probabilistic Programming

A novel query execution engine for relational probabilistic programming. Awarded all three ACM reproducibility badges: Available, Evaluated and Reusable, and Reproducible. Features vectorized execution and JIT compilation for high-performance probabilistic inference.

Ouael Ben Amara, Sami Hadouaj, Niccolò Meneghetti

SIGMOD 2024 (A* Conference) | United States
June 2024

Advanced Query Learning Methods for Probabilistic Databases

Novel approaches to query learning and knowledge compilation techniques for probabilistic databases, extending the StarfishDB framework with advanced optimization methods.

Under Review | SIGMOD 2026

United States
2026

Reproducibility Report: StarfishDB: A Query Execution Engine for Relational Probabilistic Programming

Independent validation and reproducibility assessment of the StarfishDB system, confirming all experimental results and contributing to the ACM reproducibility initiative.

Independent Validation | ACM SIGMOD Reproducibility 2025

United States
2025

Gamma Probabilistic Databases: Learning from Exchangeable Query-Answers

Introduces a new probabilistic database framework that learns from query answers while maintaining exchangeability, with applications in statistical relational learning.

Niccolò Meneghetti, Ouael Ben Amara

EDBT 2022 (A-Tier Conference)
January 2022

Learning From Query-Answers Using JIT-Compilation

Presents optimization techniques for query execution in probabilistic databases using just-in-time compilation, demonstrating performance improvements in probabilistic inference.

Poster Presentation at Northeastern Database Day

Boston, USA
2024

Speech Recognition for COVID-19 Keywords Using Machine Learning

Developed a machine learning approach for recognizing COVID-19 related keywords in speech, aiding in medical diagnosis and monitoring systems.

Wael Ben Amara, Amani Touihri, Salma Hamza

International Journal of Scientific Research in Computer Science and Engineering, Vol.8, Issue.4, pp.51-57
2020

scikit-learn Bug Fix

Open-source contribution to the scikit-learn machine learning library, fixing a critical bug in the codebase.

Contribution to scikit-learn

United States
2020

Featured Work

StarfishDB

A query execution engine for relational probabilistic programming published at SIGMOD 2024 (A* conference). Awarded all three ACM reproducibility badges. Built on Apache Arrow and ClangJIT with vectorized execution and JIT compilation for high-performance probabilistic inference.

C++ | Apache Arrow | ClangJIT | Probabilistic DB | SIGMOD 2024

Bayesian Point Cloud Registration

Open-source project exploring conjugacy and MCMC methods for 3D rigid transformations. Achieves superior performance compared to standard ICP methods while also providing uncertainty quantification, demonstrating the power of probabilistic approaches in computer vision.

Python | MCMC | Bayesian Methods | 3D Vision | PyMC3

DoVisual (Memorality)

Computer vision startup platform enabling users to build ML pipelines using predefined nodes maintained by expert agents. Specialized in deepfake detection, pose estimation, scene understanding, object tracking, and semantic search with vectorization capabilities.

Computer Vision | ML Pipelines | Deepfake Detection | Startup

Education

Ph.D. Candidate in Computer Science

The University of Michigan - Dearborn | Dearborn, USA

2023 - Present (Expected 2026) | 3.9/4.0 CGPA

Focus: Probabilistic Databases

Master of Science

The University of Michigan - Dearborn | Dearborn, USA

Winter 2023 | 3.9/4.0 CGPA

Software Engineering Degree

Mediterranean Institute of Technology | Tunisia

December 2022 | 3.48/4.0 CGPA

Academic Achievements

Awarded the Honors List in Fall 2017, Spring 2019, and Spring 2020; Dean's List in Spring 2018, Fall 2018, and Fall 2019.