The goal — what I'm exploring
GeminiGPT is where I explore making AI chat useful beyond a single thread. The question: what makes a chat assistant feel like it actually remembers you? What I'm testing is semantic memory (every message embedded and searchable across all your past conversations), plus document understanding and token-by-token streaming, all built on a bring-your-own-key design so your data stays yours.
How it uses AI
GeminiGPT wraps Gemini 2.5 Flash with streaming and function calling. The interesting AI piece is memory: every message is embedded and stored in a LanceDB vector index, so semantic search can pull relevant context from any past conversation, not just the current thread. A custom Node.js pipeline parses uploaded PDF and DOCX files so the model can reason over your own documents, and a bring-your-own-key design keeps your Gemini API key client-side so your conversations stay private to you.
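
To make the memory idea concrete, here is a minimal sketch of cross-conversation recall. It is not the app's code: `MemoryStore`, `toyEmbed`, and the conversation IDs are hypothetical names, a plain array stands in for the LanceDB index, and the "embedding" is a hashed bag of words that only matches shared vocabulary, where the real app would call an embedding model.

```typescript
// Hypothetical record shape; a plain array stands in for the LanceDB index.
type MemoryRecord = {
  conversationId: string;
  text: string;
  vector: number[];
};

// Toy embedding (an assumption for illustration): hash each word into one of
// `dims` buckets and count occurrences. It captures word overlap, not meaning.
function toyEmbed(text: string, dims = 64): number[] {
  const vector = new Array<number>(dims).fill(0);
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  for (const word of words) {
    let hash = 0;
    for (let i = 0; i < word.length; i++) {
      hash = (hash * 31 + word.charCodeAt(i)) >>> 0;
    }
    vector[hash % dims] += 1;
  }
  return vector;
}

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

class MemoryStore {
  private records: MemoryRecord[] = [];

  // Embed and store every message, whichever conversation it came from.
  remember(conversationId: string, text: string): void {
    this.records.push({ conversationId, text, vector: toyEmbed(text) });
  }

  // Rank all stored messages against the query and return the closest topK.
  recall(query: string, topK = 3): MemoryRecord[] {
    const queryVector = toyEmbed(query);
    return this.records
      .map((record) => ({ record, score: cosineSimilarity(queryVector, record.vector) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, topK)
      .map((scored) => scored.record);
  }
}

// Usage: a question asked in a new thread pulls context from an older one.
const store = new MemoryStore();
store.remember("chat-1", "My dog Biscuit is a corgi");
store.remember("chat-2", "Deploy the Docker container on Railway");
store.remember("chat-3", "I prefer TypeScript over Python");
const hits = store.recall("what is my dog called", 1);
// hits[0].text is "My dog Biscuit is a corgi" (from "chat-1")
```

The point of the sketch is the shape of the flow: embed on write, embed the query on read, rank by similarity across every conversation. A vector database takes over the ranking step once the index grows beyond what a linear scan can handle.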
How it works
- Responses stream token-by-token from Gemini 2.5 Flash over Socket.IO WebSockets, so answers render as they generate — no waiting on the full completion.
- Every message is embedded and indexed in a LanceDB vector database, enabling semantic search that recalls context across all of your past chats, not just the current one.
- Drop in PDFs and DOCX files — a custom Node.js pipeline parses them server-side so the model can reason over your documents.
- Bring-your-own-key design: your Gemini API key stays client-side, so usage and conversations remain private.
- Chats persist in SQLite; the whole app ships as a Docker container deployed on Railway with CI through GitHub.
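
The streaming step in the list above can be sketched as a small relay loop. This is an illustration under assumptions, not the app's code: `fakeModelStream` stands in for the Gemini response stream, the event names `chat:chunk` and `chat:done` are hypothetical, and `emit` is a plain callback where the server would presumably pass a Socket.IO `socket.emit`.

```typescript
// Callback with the same (event, payload) shape as a socket emit call.
type Emit = (event: string, payload: unknown) => void;

// Stand-in for a streaming model response: yields text chunks one at a time.
async function* fakeModelStream(chunks: string[]): AsyncGenerator<string> {
  for (const chunk of chunks) {
    await Promise.resolve(); // simulate an async gap between chunks
    yield chunk;
  }
}

// Forward each chunk to the client as soon as it arrives, then send the
// assembled text once the stream ends. Event names are hypothetical.
async function relayStream(stream: AsyncIterable<string>, emit: Emit): Promise<string> {
  let fullText = "";
  for await (const chunk of stream) {
    fullText += chunk;
    emit("chat:chunk", { text: chunk });
  }
  emit("chat:done", { text: fullText });
  return fullText;
}

// Usage: record what a client would receive, in order.
const sent: string[] = [];
const finished = relayStream(fakeModelStream(["Hel", "lo", "!"]), (event, payload) => {
  sent.push(`${event}:${JSON.stringify(payload)}`);
});
// Once `finished` resolves, `sent` holds three chat:chunk entries followed by
// one chat:done entry carrying the full text "Hello!".
```

Emitting per chunk is what lets the UI render text while the model is still generating. The final `chat:done` event gives the server a natural point to persist and embed the complete message.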
