Earlier this year, at my desk, I set an intention: what learning journey excites me most right now? What problem feels most “right” to solve? One stood out. As a busy, neophilic New Yorker, I’m often mentally drained by trivial decisions that still somehow demand non-trivial research. Why isn’t there an AI decision-maker that does the legwork, truly understands my preferences, and recommends action-ready options while I keep final say? It didn’t sound impossible, right? So why doesn’t it exist?
Thinking it through, I realized I was imagining a new category of consumer app: a truly personal, intelligent AI decision-maker. To wedge into the problem, I needed the lowest-friction entry point: restaurant bookings. Anyone who’s lived in NYC knows the pain. Balancing context, preferences, a taste for exploration (we’re all neophiles), and actual availability quickly spirals into logistics and 10 open tabs (Resy, OpenTable, The Infatuation, Google Maps), plus calls and group texts. The recent shift of restaurants from Resy to OpenTable only makes it worse: fragmented platforms, scattered information, and even more friction in decision-making.
On top of that, restaurant booking apps aren’t designed with diners in mind. They serve the industry. That’s why a booked-out $150 omakase in the West Village keeps resurfacing on your Resy feed. The data is structured, text-heavy, and siloed across platforms; personalization is basically non-existent. I realized I’m perfectly set up to attack this: a decision-making PhD, data science skills, and well-trained taste from years of overspending on omakase and Mexican-inspired fusion cocktails. Maybe this problem intrigues me because I am a “maximizer,” but a real solution should also work for “satisficers”: people who just want a clean, no-frills recommendation they can act on now.
2025 made AI agents and yet another AI coding IDE impossible to avoid. Personally, I wanted to test how far I could push vibe coding with no formal SWE training, just data science skills and high agency for learning. It felt right: solve my own pain point in a way that scales to others, learn aggressively, translate ambiguity into code, and see how far AI can carry me. So I took out a piece of paper, jotted down the technical challenges and ways to solve them, and started building.
From idea to data
The first challenge was getting comprehensive, structured coverage of NYC restaurants. Booking platforms like Resy and OpenTable already expose rich, standardized listings of participating spots. I programmatically collected publicly visible data from these sources, plus the Google Places API for reviews, and editorial inputs from sites like The Infatuation. I then cleaned, deduped, and joined everything into a single table with thousands of restaurants.
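To make that concrete, here is a minimal pandas sketch of the dedupe-and-join step; the file names and columns (name, address, rating) are hypothetical stand-ins for the real fields:

```python
import pandas as pd

def add_keys(df: pd.DataFrame) -> pd.DataFrame:
    # Normalize the fields used as a dedupe/join key across platforms.
    df["name_key"] = df["name"].str.lower().str.strip()
    df["addr_key"] = df["address"].str.lower().str.replace(r"\s+", " ", regex=True)
    return df

resy = add_keys(pd.read_json("resy_listings.json"))
opentable = add_keys(pd.read_json("opentable_listings.json"))
places = add_keys(pd.read_json("google_places.json"))

# Stack the booking platforms, then keep the richest record per venue.
listings = pd.concat([resy, opentable], ignore_index=True)
listings["n_populated"] = listings.notna().sum(axis=1)
listings = (
    listings.sort_values("n_populated", ascending=False)
            .drop_duplicates(subset=["name_key", "addr_key"], keep="first")
)

# Attach Google Places ratings and review counts on the same key.
restaurants = listings.merge(
    places[["name_key", "addr_key", "rating", "review_count"]],
    on=["name_key", "addr_key"],
    how="left",
)
```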
On top of that database, I run LLM-powered extraction to identify key attributes (ambience, occasion, cuisine nuances, menu highlights, etc.) and store the results in a vector database in Supabase for fast retrieval and ranking. With that foundation, flexible prompts like “cozy date spot with natural wine” map cleanly to concrete attributes, so the system can recommend with precision, not hallucination.
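The extraction-and-embedding loop looks roughly like this; the model choices, table, and column names are illustrative, not the exact production setup:

```python
from openai import OpenAI
from supabase import create_client

llm = OpenAI()  # assumes OPENAI_API_KEY in the environment
db = create_client("https://<project>.supabase.co", "<service-key>")

listing_text = "Candlelit East Village izakaya with a natural wine list..."
restaurant_id = "hypothetical-uuid"

def extract_attributes(text: str) -> str:
    # Ask for a fixed attribute schema so outputs stay comparable across rows.
    resp = llm.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{
            "role": "user",
            "content": "From this restaurant listing, extract ambience, "
                       "occasion, cuisine nuances, and menu highlights as "
                       "JSON:\n" + text,
        }],
        response_format={"type": "json_object"},
    )
    return resp.choices[0].message.content

def embed(text: str) -> list[float]:
    out = llm.embeddings.create(model="text-embedding-3-small", input=text)
    return out.data[0].embedding

# Store the structured attributes and their embedding on the restaurant row;
# `attribute_embedding` is assumed to be a pgvector column.
attrs = extract_attributes(listing_text)
db.table("restaurants").update(
    {"attributes": attrs, "attribute_embedding": embed(attrs)}
).eq("id", restaurant_id).execute()
```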
Recommender prototyping
I spent two weeks building a recommender to test whether a personalized model can work off a single user’s Resy history alone. Most recommenders fall into two camps: content-based models, which score items by their attributes against the user’s known tastes, and collaborative models, which learn from behavioral interactions across users and items (e.g., simple collaborative filtering or a model-based two-tower approach). For the first pass, I chose a content-based model: this is a classic single-user cold-start problem, and I simply don’t have enough data for any collaborative model yet!
I pulled my own Resy history (118 visits, not too shabby) and built a composite taste vector from text fields (cuisine, vibe, occasion, menu), then ranked restaurants by cosine similarity to that vector. Eyeballing the results showed a faint signal, but I had no framework for iteration.
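A stripped-down sketch of that first pass, assuming precomputed embeddings for my visits and the candidate pool:

```python
import numpy as np

def composite_taste_vector(visit_embeddings: np.ndarray) -> np.ndarray:
    # One embedding per past visit (built from cuisine/vibe/occasion/menu
    # text); the "taste" is just their normalized mean.
    v = visit_embeddings.mean(axis=0)
    return v / np.linalg.norm(v)

def rank_by_cosine(taste: np.ndarray, candidates: np.ndarray) -> np.ndarray:
    # On unit-normalized vectors, cosine similarity is just a dot product.
    normed = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return np.argsort(normed @ taste)[::-1]  # candidate indices, best first
```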
After some thinking, I defined the evaluation framework first. Using a rolling window over my history, I trained on the first t bookings to generate top-k recommendations, scored the next booking as “ground truth,” and iterated t forward. The metric is NDCG (Normalized Discounted Cumulative Gain), where the gain is log-discounted by rank, so putting the right pick at rank 1 matters far more than rank 10.
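Since each window has exactly one ground-truth item, NDCG collapses to a neat closed form. A sketch of the loop, where `recommend` stands in for whatever model is under test:

```python
import numpy as np

def ndcg_at_k(recommended_ids: list, true_id, k: int = 10) -> float:
    # With a single ground-truth item, the ideal DCG is 1, so NDCG reduces
    # to 1 / log2(rank + 1) when the next booking appears in the top-k.
    top = list(recommended_ids[:k])
    if true_id not in top:
        return 0.0
    rank = top.index(true_id) + 1
    return 1.0 / np.log2(rank + 1)

def rolling_eval(history: list, recommend, k: int = 10, min_train: int = 5) -> float:
    # Train on the first t bookings, score booking t+1 as ground truth,
    # then slide t forward through the whole history.
    scores = [
        ndcg_at_k(recommend(history[:t], k=k), history[t], k=k)
        for t in range(min_train, len(history))
    ]
    return float(np.mean(scores))
```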
With the evaluation in place, I improved the recommender by replacing the composite vector with attribute-specific embeddings, applying attribute-specific weights learned through hyperparameter optimization, and switching the distance to Mahalanobis to account for the differing variance within each feature block. This configuration delivered an NDCG ~20× above random, and it stays robust with as few as ~15 past visits, a strong result for a cold-start setting. The next step, as data accumulates, is to layer in collaborative models and blend everything into a hybrid ranker.
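Conceptually, the scoring step became something like this; the attribute blocks, inverse covariances, and weights are illustrative placeholders for the fitted values:

```python
import numpy as np

def weighted_mahalanobis_score(candidate: dict, taste: dict,
                               inv_cov: dict, weight: dict) -> float:
    # One embedding block per attribute (e.g. cuisine, vibe, occasion, menu).
    # Each block has its own inverse covariance, so distances are scaled by
    # that block's variance; weights come from hyperparameter optimization.
    score = 0.0
    for attr, w in weight.items():
        d = candidate[attr] - taste[attr]
        score += w * np.sqrt(d @ inv_cov[attr] @ d)
    return score  # lower = closer to the user's taste
```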
Database design
Excited by the real signal, I committed to building an MVP that could credibly serve 1k+ users. Having never designed a DB from scratch, naturally, I took a crash course on YouTube. I quickly sketched the domain in DBML (drawing out ERDs with clear primary/foreign keys). The result? A clean relational PostgreSQL schema hosted on Supabase: one source of truth for restaurants, clear user and preferences tables, sessions and options management for recommendations, and custom indexes to speed up large queries.
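A slimmed-down slice of what the DBML looked like; the table and column names here are hypothetical, not the real DDL:

```dbml
Table restaurants {
  id uuid [pk]
  name text
  neighborhood text
  attributes jsonb
}

Table users {
  id uuid [pk]
  email text [unique]
}

Table preferences {
  id uuid [pk]
  user_id uuid [ref: > users.id]
  attribute_weights jsonb
}

Table sessions {
  id uuid [pk]
  user_id uuid [ref: > users.id]
  created_at timestamp
}

Table options {
  id uuid [pk]
  session_id uuid [ref: > sessions.id]
  restaurant_id uuid [ref: > restaurants.id]
  rank int
}
```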
As highly personalized recommendations are the whole point, I designed the structure to grow with them. Today it supports content-based ranking; tomorrow, the stored data can support training richer models without much rewrite. The exercise forced me to clarify the product’s core purpose, development path, and ideal end state, so I could design the leanest architecture that can self-sustain and grow.
Booking integration
The core vision of the product is actionable recommendations, which means enabling in-platform booking for that magical flow (none of the existing competitors offer it). That turned out to be the hardest piece. Reservation platforms aren’t exactly designed for third-party flows, and I don’t have a SWE background at all. Despite the unknowns, I put on my PhD hat and attacked this as an applied research project, with Cursor and GPT-5 as my resourceful thinking partners. First, I mapped the end-to-end user flow: authentication, cookies, session lifecycle, retries, booking requirements, and failure modes. Then I used tools like mitmproxy and DevTools to trace endpoint behavior, and built a user-authorized, headless-browser-mediated flow that keeps users in control while letting the app finish the last mile.
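To give a flavor of the pattern without exposing the real endpoints, here is a minimal sketch with Playwright: the user authenticates once themselves, and the app reuses that session to finish the booking. The URL and selectors are purely illustrative:

```python
from playwright.sync_api import sync_playwright

booking_url = "https://example.com/slot/123"  # hypothetical deep link

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    # storage_state carries the user's own cookies/tokens, captured after
    # they log in themselves; the app never sees their password.
    context = browser.new_context(storage_state="user_session.json")
    page = context.new_page()
    page.goto(booking_url)
    page.click("text=Reserve")  # illustrative selector, not the real one
    page.wait_for_url("**/confirmation*", timeout=15_000)
    # Persist any refreshed tokens for the next booking.
    context.storage_state(path="user_session.json")
    browser.close()
```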
After countless trials and a good amount of head-banging, the auth-and-booking flow finally clicked, on both Resy and OpenTable. It was reliable and indistinguishable from a human user. Watching the first confirmation email land in my inbox didn’t feel like magic; it just felt earned. Finally!
If I had to describe the feeling, it would be like a night ascent without a map: no formal training, loose rocks everywhere, but every step upward revealed the route to the summit a bit more. When I finally made it to the top, the emotion wasn’t fireworks so much as clarity: with intention, patience, grit, and an intuition for asking the right questions, ambiguous (even adversarial) puzzles unravel themselves like a treasure map pointing toward its own solution. I felt relief, and a calm, profound confidence that I can "gradient ascend" whatever mountain I choose next.
Backend & Vibe code
By this point, two months had passed and the big hurdles were cleared: data quality, recommender validity, booking integration. I just needed to build the app and make it real. This is also where I stalled the longest. I was scared and genuinely unsure how to proceed. I’d used every trick I knew as a data scientist / AI person, but the next task felt daunting: how do I build this thing when I don’t even know the difference between FastAPI and Next.js, or the basic flow between backend and frontend? Luckily, my cofounder owned design and frontend in Lovable, which gave me time to sit with the fear and ask: how do you build an app with no SWE training at all?
A few YouTube videos later, I felt more grounded. Okay, it’s just a system of components. Start simple. Make a plan for the pieces. There’s no better way to learn than doing, so I clocked into Cursor and started the real vibe-coding journey.
This task-master video actually saved me. If I don’t know what needs to be built, why not ask AI to figure it out from the PRD? The domain knowledge + task-master + Claude 4 combo proved unstoppable. task-master turned my vague requirements into step-by-step tickets in JSON, and Claude followed them, spitting out files paired with a test_*.py so I could pytest my way through, even just with mock data. This way, large, ambiguous objectives were broken down into smaller, concrete instructions, leaving little room for the AI to improvise and hallucinate.
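The shape of those generated tests, roughly: exercise an endpoint against mock data and judge by outcome. The module paths and route here are hypothetical:

```python
# test_recommendations.py
from fastapi.testclient import TestClient

from app.main import app                 # hypothetical module layout
from app.services import recommender     # hypothetical service module

client = TestClient(app)

def test_recommendations_returns_ranked_options(monkeypatch):
    # Stub the ranker so the test runs without a database.
    monkeypatch.setattr(
        recommender, "rank",
        lambda user_id, k: [{"id": "r1"}, {"id": "r2"}],
    )
    resp = client.get("/recommendations?k=2")
    assert resp.status_code == 200
    assert [r["id"] for r in resp.json()] == ["r1", "r2"]
```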
Before I knew it, a real backend was taking shape: a FastAPI app with routers, models, and services for the core flows (auth, search, recommendations, booking). Middleware handled JWT and CSRF cleanly. With the Supabase MCP, Claude handled database integration with astounding ease. Health checks, structured logs, and fallbacks made the whole thing feel actually real.
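The skeleton looked roughly like this; the module layout and app name are illustrative:

```python
# app/main.py
from fastapi import FastAPI

from app.routers import auth, search, recommendations, booking  # hypothetical

app = FastAPI(title="restaurant-decider")  # placeholder name

app.include_router(auth.router, prefix="/auth")
app.include_router(search.router, prefix="/search")
app.include_router(recommendations.router, prefix="/recommendations")
app.include_router(booking.router, prefix="/booking")

@app.get("/health")
def health():
    # Liveness probe backing the health checks mentioned above.
    return {"status": "ok"}
```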
This period was a psychological transformation. At first, I felt absolutely overwhelmed by the amount of code generated and felt compelled to examine each file. An innate distrust of a silicon-based token-predicting machine. I quickly gave up: one, my mental CPU simply can’t process that much information; two, I don’t understand the stack well enough to critique the code. But I do know how to instruct and run tests, and I can evaluate quality by outcome: does it do what I want it to achieve?
Like many things in life, the second you give up control and just go with the flow, you get to fully enjoy the experience. I discovered that, at least for me, accepting that I’m getting intellectually dommed by Claude is a rocket booster to inhuman progress. The flow state moved from “thinking through the logic” to “designing the next prompt given the current state.” By delegating details to AI, I freed my brain for higher-level planning and orchestration. Of course, having a coffee break while letting AI run loose is not something I recommend: you still need to monitor it and steer it away from overcomplicating simple solutions. Maybe it’s easier as a non-engineer, since I carry less baggage? I now visualize vibe coding as riding a magic carpet: there is a delicate balance between trust and skepticism. Steer it well and you fly; let go completely and you may end up in ruins you’ll have to climb out of. A trust exercise.