Hi, I’m Abdullah!

Sr. ML Engineer at Atlassian  |  Ex-Meta  |  PhD

RL & multi-agent AI for Confluence/Jira Search. On the Central AI team, I’m building and improving SMART Answer generation using RL fine-tuned LLMs and a multi-agent AI architecture. Previously at Meta on Ads Ranking, I fine-tuned LLaMA 3 for large-scale suggestive ad generation.

I write about RecSys, agentic AI, RL fine-tuning, and what actually ships at scale.

The Missing Layer in Agentic AI: Why Evaluation Is the Next Enterprise Platform

Executive Summary: Agentic AI is entering enterprise deployment faster than its evaluation infrastructure is maturing. Most teams can now observe traces and benchmark outcomes, but they still cannot reliably grade how agents behave in production across coordination quality, trajectory correctness, and safety compliance. That missing layer is becoming a strategic bottleneck for executive teams deciding where to place platform bets. As of June 2026, the market has largely solved two layers: observability (OpenTelemetry GenAI conventions, AgentOps, OWASP AOS) and benchmark comparison (HAL, GAIA, SWE-bench). The unresolved layer sits between them: an open, framework-agnostic evaluation protocol that takes any OTel-compatible trace and scores agent behavior end-to-end. That gap is not only a research problem; it is now a platform opportunity with direct implications for deployment risk, governance, and competitive advantage. ...

June 9, 2026 · 18 min · Abdullah Al Mamun

The Evaluation of RecSys, Part 3: The Deep Learning Era (NCF, Wide & Deep, DeepFM, DIN, DLRM, AdaTT)

Part 3 of the series: how DNNs transformed RecSys from 2016 onward, covering the architectures and intuition behind NCF, Wide & Deep, DeepFM, DIN, DLRM, and AdaTT, and where each shines.

March 12, 2025 · 11 min · Abdullah Al Mamun

The Evaluation of RecSys, Part 2: Factorization Machines and XGBoost

Part 2 of the series: how Factorization Machines generalized MF to arbitrary features, how XGBoost handled non-linear ranking, and the limitations that pushed the field toward deep neural networks.

March 11, 2025 · 8 min · Abdullah Al Mamun

The Evaluation of RecSys, Part 1: From Content-Based Filtering to Matrix Factorization

Part 1 of a 4-part series tracing how RecSys evolved from content-based filtering through collaborative filtering to matrix factorization, and where each technique falls short, setting up the next breakthrough.

March 1, 2025 · 6 min · Abdullah Al Mamun