RetailOS — AI-first retail operations for MENA
I built an AI-first retail SaaS with a paying customer in Egypt. Forecasting, inventory, POS sync. The interesting part isn't the AI — it's the boring plumbing that kept the tenant paying month after month when the forecasting was wrong and the POS sync hiccupped.
The problem
Independent retailers across MENA run on a stack that looks like this: a POS terminal from 2014, an Excel file for stock, a WhatsApp group for orders, and a human brain for forecasting. The tooling that exists for them is either (a) built for enterprise with a price tag to match, or (b) a generic POS that doesn't speak to anything else.
Mr. Donut, a retail chain in Egypt, had three specific pains:
- Stockouts and dead stock at the same time. Different SKUs, same week. The team was guessing wrong in both directions.
- No visibility across locations. HQ didn't know what was selling where until end-of-month manual reports.
- Daily decisions needed tomorrow's data. They knew what happened three weeks late, which is the same as not knowing.
The path to paid was: stop them from guessing. Give them a forecast for the next 7 days per SKU per location. Show them where the margin is leaking. Do it without replacing their POS.
Architecture
The stack had three constraints that drove every decision: realtime sync (the tenant wanted today's data today), cheap LLM usage (MENA margins are thin), and single-operator maintainability (I ship it, I run it, I can't page someone).
A few architectural calls that mattered more than I expected:
- Supabase realtime instead of polling. The dashboard feels live because it is live — no "pull every 30s" hack. Edge functions subscribe to row changes and push deltas.
- Forecast service as a separate FastAPI process. Python's stats ecosystem is better than Node's for time-series work. Keeping it out of the main Next.js app meant I could iterate on the forecast model without redeploying the UI.
- Cache-first LLM usage. The LLM isn't in the hot path for most requests. Summaries, daily digests, and explanations are generated once and cached in Postgres. Users almost never pay latency for a model call.
- Multi-tenant from day one even with one tenant. Row-level security on every query. It cost me two days up front and saved me weeks later.
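The cache-first pattern above can be sketched roughly as follows. This is a minimal illustration, not the production code: it uses an in-memory SQLite database as a stand-in for Postgres, and the table name `llm_cache`, the function names, and the `generate` callback (which would wrap the real model call) are all hypothetical. The cache is keyed by tenant so that one tenant's summaries are never served to another.

```python
import sqlite3
from typing import Callable


def init_cache(conn: sqlite3.Connection) -> None:
    # Hypothetical cache table; in production this would live in Postgres.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS llm_cache (
            tenant_id TEXT NOT NULL,
            cache_key TEXT NOT NULL,
            body      TEXT NOT NULL,
            PRIMARY KEY (tenant_id, cache_key)
        )
        """
    )


def get_or_generate(
    conn: sqlite3.Connection,
    tenant_id: str,
    cache_key: str,
    generate: Callable[[], str],
) -> str:
    """Return the cached text if present; otherwise call the model once and store it."""
    row = conn.execute(
        "SELECT body FROM llm_cache WHERE tenant_id = ? AND cache_key = ?",
        (tenant_id, cache_key),
    ).fetchone()
    if row is not None:
        return row[0]  # cache hit: no model call, no added latency

    body = generate()  # cache miss: the only place the LLM is invoked
    conn.execute(
        "INSERT OR REPLACE INTO llm_cache (tenant_id, cache_key, body) VALUES (?, ?, ?)",
        (tenant_id, cache_key, body),
    )
    conn.commit()
    return body
```

A key such as `"daily-digest:2024-01-01"` (an assumed naming scheme) means each digest is generated at most once per tenant per day, and every later dashboard load reads it straight from the database.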
The stack
By the numbers
The MAPE (mean absolute percentage error) of ~4% on 7-day forecasts is lower than anything I've seen published for a multi-SKU retail setting of this size. Part of it is that the domain is easier than, say, fashion retail: donuts have a consistent weekly seasonality and a short shelf life that forces fast adjustment. Part of it is a boring baseline: we start with a naive weekly average and only let the model adjust within bounds.
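For readers who want the mechanics, here is a minimal sketch of the three pieces mentioned above: MAPE, a naive weekly-average baseline, and a model that may only adjust the baseline within bounds. The function names, the four-week lookback, and the ±20% bound are illustrative assumptions, not the values used in RetailOS.

```python
from typing import List, Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Days with zero actual sales are skipped."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("need at least one non-zero actual value")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def naive_weekly_baseline(history: Sequence[float], weeks: int = 4) -> List[float]:
    """Forecast the next 7 days for one SKU at one location.

    `history` is daily units sold, oldest first. Each forecast day is the
    average of the same weekday over the last `weeks` weeks.
    """
    if len(history) < 7 * weeks:
        raise ValueError("not enough history for the requested lookback")
    recent = history[-7 * weeks:]
    # recent[d::7] picks every 7th day starting at offset d, i.e. one weekday.
    return [sum(recent[d::7]) / weeks for d in range(7)]


def bounded_forecast(
    baseline: Sequence[float],
    model: Sequence[float],
    max_adjust: float = 0.2,  # assumed bound: model may move the baseline by at most 20%
) -> List[float]:
    """Clamp the model's forecast to within +/- max_adjust of the baseline."""
    result = []
    for b, m in zip(baseline, model):
        low, high = b * (1 - max_adjust), b * (1 + max_adjust)
        result.append(min(max(m, low), high))
    return result
```

The clamp is what keeps a misbehaving model from producing an absurd order: the worst case is a forecast that is off by the bound, never one that is off by an order of magnitude.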
What I got wrong
Over-engineering the forecast
I spent the first month trying to use a proper time-series model (Prophet, then a custom ARIMA wrapper). Both outperformed the naive weekly average by only ~0.3 percentage points of MAPE. For that little gain, I should have just shipped the naive baseline, gotten live data flowing, then iterated. The tenant didn't care about MAPE. They cared about whether today's order was right.
Ignoring the POS driver rabbit hole
The first POS I integrated with had a quirky local API that failed writes silently: no error, no data. My sync worker assumed success unless it got an error. For two weeks, the tenant saw yesterday's numbers instead of today's and didn't tell me because they thought that's how it worked. I fixed it by adding an end-to-end checksum on the sync loop: hash the row count and last-updated-at on both sides, alert if they diverge. It should have been there from day one.
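The checksum idea can be sketched in a few lines. This is an illustration under assumptions, not the actual worker: `TableState`, `fingerprint`, and `check_sync` are hypothetical names, the `alert` callback stands in for whatever paging or messaging channel is used, and timestamps are assumed to be timezone-aware so both sides normalize to UTC before hashing.

```python
import hashlib
from datetime import datetime, timezone
from typing import Callable, NamedTuple


class TableState(NamedTuple):
    row_count: int
    last_updated_at: datetime  # must be timezone-aware


def fingerprint(state: TableState) -> str:
    """Hash the row count and last-updated-at into a single comparable value."""
    if state.last_updated_at.tzinfo is None:
        raise ValueError("last_updated_at must be timezone-aware")
    # Normalize to UTC so the same instant hashes identically on both sides.
    stamp = state.last_updated_at.astimezone(timezone.utc).isoformat()
    payload = f"{state.row_count}|{stamp}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def check_sync(
    source: TableState,
    replica: TableState,
    alert: Callable[[str], None],
) -> bool:
    """Return True if both sides agree; otherwise call `alert` and return False."""
    if fingerprint(source) == fingerprint(replica):
        return True
    alert(
        f"sync diverged: source={source.row_count} rows @ {source.last_updated_at}, "
        f"replica={replica.row_count} rows @ {replica.last_updated_at}"
    )
    return False
```

Run after every sync pass, a check like this turns a silent write failure into an alert within one cycle. Row count plus last-updated-at is a coarse signal: it will miss an in-place edit that leaves both values unchanged, but it is cheap and catches dropped writes.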
Realtime everywhere
I built every table with realtime subscriptions because it felt cool. Then I realized some tables (daily summary rollups) don't need realtime — they change once a day. The overhead of maintaining realtime channels for low-write tables was non-trivial. I now default to no realtime unless proven necessary.
The lesson
The thing that kept Mr. Donut paying wasn't the forecast. It was the checksum, the realtime dashboard, and the fact that when something broke I had observability to tell me before they did. In SaaS, the features you sell are not the features that retain. The features that retain are the ones the user never consciously notices — until they're missing.
If I'd spent less time tuning the forecast and more time on the boring stuff, I would've shipped two months earlier.
Visit retailos.one →