Gemini Image Generation in Workshop Chat #19

Closed
opened 2026-03-20 22:26:01 +00:00 by replit · 1 comment
Owner

What & Why

Timmy can currently only respond with text. Enabling image generation via Gemini lets visitors ask Timmy to create visuals (concept art, diagrams, illustrations) as part of their Workshop sessions. This opens a new category of creative work and makes the 'capable AI agent' persona credible.

Done looks like

  • Timmy detects image generation requests ('draw', 'illustrate', 'create an image of', 'visualize') during job evaluation or execution
  • Image jobs routed through the Gemini image generation endpoint (already built: /api/gemini/generate-image)
  • Generated image stored (base64 in DB) and returned as part of job result
  • In Workshop chat, image responses render inline with a download button
  • Image generation priced separately (higher cost than text jobs) and shown in cost estimate
  • Error case (Gemini unavailable) falls back gracefully with text explanation
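
A minimal sketch of the graceful-fallback criterion, assuming the Gemini call is wrapped in a function that resolves to a base64 string and rejects when the service is unavailable. `generateWithFallback`, the `JobResult` shape, and the wording of the fallback message are hypothetical and are not taken from the codebase.

```typescript
// Hypothetical result shape: either an image payload or a text explanation.
type JobResult =
  | { mediaType: "image"; imageBase64: string }
  | { mediaType: "text"; text: string };

// Runs the supplied image generator and converts any failure into a
// text result instead of letting the job fail outright.
async function generateWithFallback(
  generateImage: () => Promise<string>,
): Promise<JobResult> {
  try {
    const imageBase64 = await generateImage();
    return { mediaType: "image", imageBase64 };
  } catch (err) {
    // Keep the reason so the visitor sees why no image was produced.
    const reason = err instanceof Error ? err.message : String(err);
    return {
      mediaType: "text",
      text: `Image generation is unavailable right now (${reason}).`,
    };
  }
}
```

The caller would pass a closure around the real request to `/api/gemini/generate-image`; the job then completes in both cases, and only the result type differs.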

Out of scope

  • Image editing or iteration (generation only)
  • Video generation
  • Storing images permanently on VPS (base64 in DB, max 7 days TTL)

Tasks

  1. Image request detection — Add image intent detection to eval phase; flag image jobs with media_type: "image" and route to Gemini image path.
  2. Image storage — Store base64 image data in job_media table (job_id, media_type, data, created_at, expires_at) with 7-day expiry.
  3. Image delivery endpoint — Expose GET /api/jobs/:id/media to serve the image; include URL in job completion WebSocket event.
  4. Frontend image rendering — Handle image-type job results in Workshop chat; render inline with styled container and download affordance.
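
The storage task could be sketched as follows. The column names come from the task list above; the TypeScript types, the `buildJobMediaRow` and `isExpired` helpers, and the choice to compute `expires_at` in application code rather than in the database are assumptions for illustration.

```typescript
// Row shape mirroring the job_media columns named in task 2.
// Column types are assumed; the issue only lists the names.
interface JobMediaRow {
  job_id: string;
  media_type: "image";
  data: string; // base64-encoded image bytes
  created_at: Date;
  expires_at: Date;
}

// 7-day TTL expressed in milliseconds.
const MEDIA_TTL_MS = 7 * 24 * 60 * 60 * 1000;

// Builds a row whose expiry is exactly seven days after creation.
function buildJobMediaRow(
  jobId: string,
  base64: string,
  now: Date = new Date(),
): JobMediaRow {
  return {
    job_id: jobId,
    media_type: "image",
    data: base64,
    created_at: now,
    expires_at: new Date(now.getTime() + MEDIA_TTL_MS),
  };
}

// True once the row has reached or passed its expiry time.
function isExpired(row: JobMediaRow, now: Date = new Date()): boolean {
  return now.getTime() >= row.expires_at.getTime();
}
```

A delivery endpoint such as `GET /api/jobs/:id/media` could call `isExpired` before serving the row and treat an expired image as missing.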

Relevant files

  • artifacts/api-server/src/routes/gemini.ts
  • artifacts/api-server/src/lib/agent.ts
  • artifacts/api-server/src/routes/jobs.ts
  • artifacts/api-server/src/lib/pricing.ts
  • the-matrix/js/ui.js
  • the-matrix/js/session.js
replit added the ai, backend, workshop, frontend labels 2026-03-20 22:26:01 +00:00
claude was assigned by Rockachopa 2026-03-22 23:37:39 +00:00
Collaborator

PR created: http://143.198.27.163:3000/replit/timmy-tower/pulls/110

Implemented all 4 tasks from the issue:

  1. Image request detection: detectImageRequest() regex matches draw/illustrate/visualize/create an image of etc.; image jobs flagged with media_type: "image" during eval
  2. Image storage: New job_media table (migration 0010) stores base64 image data with 7-day TTL; works for both standalone jobs and session requests
  3. Image delivery endpoint: GET /api/jobs/:id/media serves job images; GET /api/sessions/:id/requests/:requestId/media serves session request images; URLs included in job completion events and session responses
  4. Frontend image rendering: session.js detects mediaType: image in responses, fetches from mediaUrl, renders inline with styled container and download button
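
For readers without access to the PR, the detection step might look roughly like this. The function name `detectImageRequest` and the `media_type: "image"` flag are from the comment above; the actual pattern in the PR is not shown here, so the regex, the `classifyJob` helper, and the `"text"` value for non-image jobs are assumptions.

```typescript
// Hypothetical keyword pattern; the real regex in the PR may differ.
// \b keeps "draw" from matching inside words such as "drawback".
const IMAGE_INTENT =
  /\b(draw|illustrate|visuali[sz]e|(create|generate|make) an? (image|picture|illustration) of)\b/i;

// Returns true when the prompt looks like an image generation request.
function detectImageRequest(prompt: string): boolean {
  return IMAGE_INTENT.test(prompt);
}

type MediaType = "image" | "text";

// Flags the job during eval so it can be routed to the Gemini image path.
function classifyJob(prompt: string): { prompt: string; media_type: MediaType } {
  return { prompt, media_type: detectImageRequest(prompt) ? "image" : "text" };
}
```

Keyword matching of this kind is cheap but will miss phrasings outside the list (for example "sketch me a logo"), so the pattern would need tuning against real Workshop prompts.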

Bonus: Image pricing uses a flat rate (default $0.04 via IMAGE_GENERATION_FLAT_RATE_USD); estimate endpoint returns image-specific pricing; Gemini stub mode returns placeholder PNG when credentials are absent.
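
A sketch of how the flat rate could be resolved, assuming the variable is read as a decimal string. The variable name and the $0.04 default are from the comment above; `resolveImageRateUsd` and the rule of falling back to the default on a missing or invalid value are assumptions.

```typescript
// Default flat rate stated in the PR summary.
const DEFAULT_IMAGE_RATE_USD = 0.04;

// Reads IMAGE_GENERATION_FLAT_RATE_USD from the given environment map.
// Assumption: missing, non-numeric, or non-positive values use the default.
function resolveImageRateUsd(env: Record<string, string | undefined>): number {
  const raw = env["IMAGE_GENERATION_FLAT_RATE_USD"];
  const parsed = raw === undefined ? NaN : Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_IMAGE_RATE_USD;
}
```

On the server this would be called as `resolveImageRateUsd(process.env)`; taking the environment as a parameter keeps the function easy to test.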


Reference: replit/timmy-tower#19