Chat Streaming (SSE)

The chat system uses Server-Sent Events (SSE) for real-time streaming of AI responses, creating the “typing” effect users expect.

| Feature | SSE | WebSocket |
| --- | --- | --- |
| Direction | Server → Client | Bidirectional |
| Complexity | Simple HTTP | Requires upgrade handshake |
| Reconnection | Automatic | Manual |

For AI chat, we only need server → client. SSE is simpler and works perfectly.

| Event | When | Data |
| --- | --- | --- |
| `message_start` | Stream begins | `{ conversation_id, message_id }` |
| `message_delta` | Each token chunk | `{ content: "token text" }` |
| `message_end` | Stream complete | `{ tokens_used, cost_usd, citations }` |
| `error` | On failure | `{ code, message }` |
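
Assuming standard SSE framing (`event:` and `data:` lines, frames separated by a blank line), a minimal parser for one frame might look like this — a sketch, not the actual parsing in `api/chatClient.ts`:

```typescript
// Parse a single SSE frame into its event name and JSON payload.
// Minimal sketch: ignores `id:` and `retry:` fields and multi-line data.
function parseSSEFrame(frame: string): { event: string; data: any } {
  let event = "message"; // SSE default event name when none is given
  let data = "";
  for (const line of frame.split("\n")) {
    if (line.startsWith("event:")) event = line.slice(6).trim();
    else if (line.startsWith("data:")) data += line.slice(5).trim();
  }
  return { event, data: data ? JSON.parse(data) : null };
}
```

For example, `parseSSEFrame('event: message_delta\ndata: {"content":"hel"}')` yields the event name `message_delta` and the payload `{ content: "hel" }`.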
  1. User sends message → optimistic update (message appears immediately)
  2. Frontend POSTs to /api/v1/chat/stream with Accept: text/event-stream
  3. Backend creates StreamingResponse with async generator
  4. NutritionChatAgent builds RAG context → streams tokens from LLM
  5. Frontend accumulates tokens via useSSE hook → renders in real-time
  6. On complete → full message saved to state with citations and metadata
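
The token accumulation in steps 5–6 can be modeled as a pure reducer over the events from the table above. This is a sketch; the field names and state shape are assumptions, not the actual types in `useChat.ts`/`useSSE.ts`:

```typescript
// Hypothetical in-flight message state, derived from the event payloads above.
interface DraftMessage {
  id: string | null;
  content: string;
  done: boolean;
  citations: unknown[];
}

const initialDraft: DraftMessage = { id: null, content: "", done: false, citations: [] };

// Fold one SSE event into the draft message.
function applyEvent(msg: DraftMessage, event: string, data: any): DraftMessage {
  switch (event) {
    case "message_start":
      return { ...msg, id: data.message_id };
    case "message_delta":
      return { ...msg, content: msg.content + data.content }; // append token chunk
    case "message_end":
      return { ...msg, done: true, citations: data.citations ?? [] };
    default:
      return msg;
  }
}
```

Keeping this step pure makes the "typing" effect trivial to render: each `message_delta` produces a new state, and React re-renders with the longer `content` string.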
Frontend (TypeScript):

| File | Purpose |
| --- | --- |
| `hooks/useChat.ts` | All chat state management |
| `hooks/useSSE.ts` | SSE streaming state |
| `api/chatClient.ts` | `streamMessage()` with SSE parsing |
| `components/chat/ChatInput.tsx` | Fixed input at viewport bottom |
Backend (Python):

| File | Purpose |
| --- | --- |
| `api/v1/chat.py` | SSE endpoint + conversation CRUD |
| `agents/nutrition_chat.py` | AI agent with streaming |
| `services/context_builder.py` | RAG context assembly |

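
The endpoint itself is Python (a FastAPI `StreamingResponse` wrapping an async generator), but the frames it emits can be sketched language-agnostically. Here it is in TypeScript for consistency with the other examples; the payload values are placeholders, not real output:

```typescript
// Encode one SSE frame: event name, JSON data, blank-line terminator.
function sseFrame(event: string, data: object): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Sketch of the frame sequence api/v1/chat.py produces for a short response.
function chatStream(tokens: string[]): string[] {
  const frames = [sseFrame("message_start", { conversation_id: "c-1", message_id: "m-1" })];
  for (const token of tokens) {
    frames.push(sseFrame("message_delta", { content: token }));
  }
  frames.push(sseFrame("message_end", { tokens_used: tokens.length, cost_usd: 0.0001, citations: [] }));
  return frames;
}
```

One frame per token keeps latency low: the client can paint each `message_delta` the moment it arrives instead of waiting for the full completion.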
  • Optimistic updates — user messages appear before API response
  • Abort streaming — cancel mid-response via AbortController
  • Citations — knowledge sources returned in message_end
  • Meal references — referenced meals rendered as mini-cards
  • Fixed input — chat input stays at viewport bottom (position: fixed)
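
The abort feature maps directly onto `AbortController`: the controller's signal is handed to `fetch`, and calling `abort()` tears down the in-flight stream. A minimal sketch — the wrapper name and shape are hypothetical, and the real wiring lives in `api/chatClient.ts`:

```typescript
// Hypothetical cancellable stream handle around the SSE request.
function createChatStream(url: string) {
  const controller = new AbortController();
  const start = () =>
    // Passing the signal lets abort() cancel both the pending request
    // and any in-progress read of the response body.
    fetch(url, {
      method: "POST",
      headers: { Accept: "text/event-stream" },
      signal: controller.signal,
    });
  return { start, abort: () => controller.abort(), signal: controller.signal };
}
```

When `abort()` fires, the pending `fetch` (or body read) rejects with an `AbortError`, which the UI can treat as a user cancellation rather than a failure.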