
Natural Language Workout Logging with Gemini Flash

Ahad Khan, Agentic AI Engineer
February 28, 2025
Tags: Gemini, ChromaDB, FastAPI

Most workout apps make you tap through endless menus to log a single exercise. What if you could just say "I did 3 sets of bench press with 185 lbs" and have it automatically parsed, stored, and queryable?

That's exactly what I built with the AI Gym Memory System.

The Architecture

The system has four core components:

  1. Gemini Flash Service — Takes raw natural language input and extracts structured intent (log vs. query) plus entities (exercise, reps, sets, weight, date)
  2. Activity Parser — Normalizes the extracted data into a consistent schema
  3. Embedding Service — Converts each activity into a vector representation for semantic search
  4. Dual Storage — PostgreSQL for structured transactional data, ChromaDB for vector similarity search
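
To make the Activity Parser step more concrete, here is a minimal sketch of what normalizing into a consistent schema could look like. The `Activity` fields, the `EXERCISE_ALIASES` table, and the choice to store weight in kilograms are all assumptions made for illustration; the post does not describe the actual schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

LB_PER_KG = 2.20462

# Hypothetical alias table; a real system would need a much larger one.
EXERCISE_ALIASES = {
    "bench": "bench press",
    "bench press": "bench press",
    "deadlifts": "deadlift",
    "deadlift": "deadlift",
}


@dataclass(frozen=True)
class Activity:
    """Assumed normalized schema for one logged exercise."""
    exercise: str
    sets: Optional[int]
    reps: Optional[int]
    weight_kg: Optional[float]
    performed_on: date


def normalize_activity(raw: dict, today: date) -> Activity:
    """Map loosely structured extractor output onto the Activity schema."""
    name = str(raw["exercise"]).strip().lower()
    exercise = EXERCISE_ALIASES.get(name, name)

    # Convert pounds to kilograms so every stored weight uses one unit.
    weight_kg = None
    if raw.get("weight") is not None:
        weight_kg = float(raw["weight"])
        if str(raw.get("unit", "kg")).lower() in ("lb", "lbs"):
            weight_kg = weight_kg / LB_PER_KG
        weight_kg = round(weight_kg, 1)

    def to_int(value):
        return int(value) if value is not None else None

    return Activity(
        exercise=exercise,
        sets=to_int(raw.get("sets")),
        reps=to_int(raw.get("reps")),
        weight_kg=weight_kg,
        # Fall back to the current day when no date was mentioned.
        performed_on=raw.get("date") or today,
    )


activity = normalize_activity(
    {"exercise": " Bench ", "sets": "3", "weight": 185, "unit": "lbs"},
    today=date(2025, 2, 28),
)
```

Keeping a single weight unit in the stored record is one way to make later comparisons (such as "heaviest deadlift") straightforward, at the cost of converting back for display.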

Why Gemini Flash?

Speed. Workout logging needs to feel instant, and Gemini Flash provides sub-second intent extraction, which keeps the interaction conversational. It handles ambiguity well too: "Did back and bis today" is correctly mapped to "back" and "biceps" exercises.
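
Whatever the model returns still has to be checked before it reaches the database. The sketch below assumes the model is prompted to answer with a JSON object containing an `intent` ("log" or "query") and a list of `entities`; that response shape and the `parse_extraction` helper are hypothetical, and the actual Gemini call is deliberately left out.

```python
import json

VALID_INTENTS = {"log", "query"}


def parse_extraction(reply: str) -> dict:
    """Validate a model reply assumed to hold a JSON object with intent and entities."""
    # Tolerate extra text around the JSON object by slicing from the
    # first "{" to the last "}".
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(reply[start:end + 1])

    intent = data.get("intent")
    if intent not in VALID_INTENTS:
        raise ValueError(f"unexpected intent: {intent!r}")

    entities = data.get("entities", [])
    if not isinstance(entities, list):
        raise ValueError("entities must be a list")
    return {"intent": intent, "entities": entities}


# Example reply, written by hand for illustration (not real model output).
reply = (
    'Sure: {"intent": "log", "entities": '
    '[{"exercise": "bench press", "sets": 3, "weight": 185, "unit": "lbs"}]}'
)
result = parse_extraction(reply)
```

Rejecting anything outside the two known intents keeps a malformed reply from being silently treated as a log entry.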

Semantic Memory Retrieval

The real magic is in querying. Users can ask temporal questions naturally:

  • "What did I train last Tuesday?" → Temporal filter + semantic search
  • "Show me my heaviest deadlift this month" → Aggregation + temporal filter
  • "How has my bench press progressed?" → Time-series semantic retrieval
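
The temporal half of these queries comes down to turning a phrase into a concrete date or date range. As a rough illustration, assuming "last Tuesday" means the most recent Tuesday strictly before today and "this month" means the first of the month through today, the resolution could look like this (both helper names are made up for the example):

```python
from datetime import date, timedelta

WEEKDAYS = [
    "monday", "tuesday", "wednesday", "thursday",
    "friday", "saturday", "sunday",
]


def last_weekday(name: str, today: date) -> date:
    """Most recent occurrence of the named weekday strictly before `today`."""
    target = WEEKDAYS.index(name.lower())
    # A zero difference means "today is that weekday", so go back a full week.
    days_back = (today.weekday() - target) % 7 or 7
    return today - timedelta(days=days_back)


def this_month(today: date) -> tuple[date, date]:
    """Inclusive range from the first day of the current month to `today`."""
    return today.replace(day=1), today


today = date(2025, 2, 28)  # a Friday
tuesday = last_weekday("Tuesday", today)
month_start, month_end = this_month(today)
```

The resulting dates can then be used as a filter on stored activities before or alongside the semantic search.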

The hybrid query system combines semantic similarity (via ChromaDB), keyword matching, and temporal filtering with configurable thresholds.
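
A simplified way to picture the hybrid scoring, using plain Python in place of ChromaDB: filter by date first, then blend cosine similarity with keyword overlap and drop anything under a threshold. The `Record` type, the 0.7/0.3 weighting, the `min_score` default, and the two-dimensional toy embeddings are assumptions chosen for the example, not values from the actual system.

```python
from dataclasses import dataclass
from datetime import date
from math import sqrt


@dataclass
class Record:
    text: str
    embedding: list[float]
    performed_on: date


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of query words that also appear in the record text."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_search(query, query_embedding, records, start, end,
                  semantic_weight=0.7, min_score=0.3):
    """Temporal filter, then a weighted semantic + keyword score with a cutoff."""
    hits = []
    for record in records:
        if not (start <= record.performed_on <= end):
            continue
        score = (
            semantic_weight * cosine(query_embedding, record.embedding)
            + (1 - semantic_weight) * keyword_overlap(query, record.text)
        )
        if score >= min_score:
            hits.append((score, record))
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return hits


# Toy data with hand-picked embeddings, for illustration only.
records = [
    Record("deadlift 3x5 at 140 kg", [1.0, 0.0], date(2025, 2, 25)),
    Record("bench press 3x8 at 80 kg", [0.0, 1.0], date(2025, 2, 26)),
    Record("deadlift 1x3 at 150 kg", [1.0, 0.0], date(2025, 1, 10)),
]
hits = hybrid_search("deadlift", [1.0, 0.0], records,
                     start=date(2025, 2, 1), end=date(2025, 2, 28))
```

Here only the February deadlift survives: the January one falls outside the date range, and the bench press entry scores below the threshold.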

What's Next

I'm exploring voice input via Whisper and a mobile-first Streamlit UI for on-the-go logging. The goal is to make workout tracking feel as natural as talking to a training partner.

Check out the source code if you want to try it yourself.