[claude] Gemini Image Generation in Workshop Chat (#19) #110
Fixes #19
Summary
- detectImageRequest() in agent.ts matches keywords like draw, illustrate, create an image of, visualize, and similar phrases
- executeImageWork() on AgentService calls Gemini generateImage; gracefully stubs a 1×1 transparent PNG when Gemini credentials are absent
- job_media table (migration 0010): stores base64 image data with a 7-day TTL; entity_id is polymorphic for both job IDs and session request IDs
- Job flow: image requests are flagged (media_type = 'image') on the job during eval; routed to Gemini during execution; image stored in job_media; GET /api/jobs/:id/media endpoint serves the result; job:completed event includes mediaUrl
- Session flow: image requests are detected in POST /api/sessions/:id/request; Gemini called instead of Claude; stored in job_media; response includes mediaUrl and mediaType: 'image'; GET /api/sessions/:id/requests/:requestId/media endpoint added
- Pricing: flat rate (IMAGE_GENERATION_FLAT_RATE_USD, default $0.04) returned from PricingService.calculateImageFeeSats(); estimate endpoint returns image pricing + mediaType hint
- session.js detects mediaType: 'image' in responses, fetches the image from mediaUrl, and renders it inline in the event log with a download button
Test plan
- GET /api/jobs/:id/media returns { data, mimeType, expiresAt } for an image job
- Estimate endpoint returns estimatedSats and mediaType: 'image' for image requests
🤖 Generated with Claude Code
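For readers without the diff open, the keyword detection described in the summary could look roughly like the sketch below. The pattern is an assumption built only from the phrases listed above; the PR's actual IMAGE_INTENT_RE is not shown on this page and may cover more phrases or use a different signature.

```typescript
// Hypothetical pattern assembled from the phrases named in the PR summary.
// Word boundaries keep "withdraw" from matching "draw".
const IMAGE_INTENT_RE = /\b(draw|illustrate|visualize|create an image of)\b/i;

// Returns true when the text looks like an image request.
// No "g" flag is used, so RegExp.test() carries no state between calls.
function detectImageRequest(text: string): boolean {
  return IMAGE_INTENT_RE.test(text);
}
```

With this pattern, "Please draw a cat" is treated as an image request while "Withdraw my funds" is not. Inflected forms such as "drawing" would need extra alternatives.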
Merge conflict. Rebase onto main and force-push. Image gen feature looks well-structured — will merge once clean.
Merge conflict. Queued #4: #80 > #93 > #109 > #110 > #112. Rebase after #109 lands.
LGTM. Good image gen with stub fallback. Rebase on main after earlier PRs land.
Code Review: [claude] Gemini Image Generation in Workshop Chat (#19)
Reviewer: Timmy (automated review)
Recommendation: REQUEST CHANGES (non-trivial issues)
Summary
Adds image generation capability to the Workshop via Gemini, triggered by keyword detection. Includes: intent detection regex, image execution via Gemini API, pricing integration, media storage with expiration, and API endpoints for both job and session flows.
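As an illustration of the "media storage with expiration" part, here is a minimal sketch of how a 7-day TTL might be computed and checked. The function names are hypothetical and are not taken from the PR; the real logic may live in SQL or in the migration instead.

```typescript
// Hypothetical helpers; the PR's own expiry handling is not shown here.
const MEDIA_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days in milliseconds

// Expiry timestamp for a media row created at `createdAt`.
function computeExpiresAt(createdAt: Date): Date {
  return new Date(createdAt.getTime() + MEDIA_TTL_MS);
}

// A row counts as expired once `now` reaches or passes its expiry.
function isExpired(expiresAt: Date, now: Date): boolean {
  return now.getTime() >= expiresAt.getTime();
}
```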
Code Quality: B+
What's well done:
- IMAGE_INTENT_RE regex covering common phrases
Issues found:
- Parse error in diff. Lines like let inputTokens=*** and let outputTokens=*** appear truncated/corrupted in the diff. This may be a rendering issue but needs verification that the actual source compiles.
- Base64 in DB is expensive. Storing image data as base64 text in the job_media table means each image (~100KB+ encoded) lives in PostgreSQL. For a production system, this should use object storage (S3/R2) with a URL reference. Acceptable for MVP.
- No rate limiting on image generation. The Gemini API call has no throttle. A user could repeatedly trigger image generation and rack up costs. The existing jobsLimiter covers the job creation endpoint but not the image cost specifically.
- Media endpoint returns base64 JSON, not binary. GET /jobs/:id/media returns {data: <base64>} as JSON. For browser display, serving the binary with proper Content-Type headers would be more efficient and cache-friendly.
- Session media endpoint doesn't verify session ownership properly. The GET /sessions/:id/requests/:requestId/media endpoint checks session existence but doesn't verify the macaroon. Compare with the messages endpoint, which does macaroon validation.
- Migration number conflict: 0010_job_media.sql collides with other PRs using 0010.
- Not mergeable (Mergeable: False). Legacy wizard-era PR from claude agent (March 24, stale).
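For the rate-limiting issue, one possible shape is a small per-user sliding-window limiter checked before the Gemini call. This is only a sketch under assumptions: the class name, the key choice (session or user ID), and the limits are hypothetical, and the existing jobsLimiter may be better extended than replaced.

```typescript
// Hypothetical in-memory limiter; not part of the PR.
class ImageRateLimiter {
  private readonly maxPerWindow: number;
  private readonly windowMs: number;
  private readonly hits = new Map<string, number[]>();

  constructor(maxPerWindow: number, windowMs: number) {
    this.maxPerWindow = maxPerWindow;
    this.windowMs = windowMs;
  }

  // Returns true and records the request if `key` is under its limit.
  // `now` is injectable so the behaviour can be tested without waiting.
  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only timestamps that still fall inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.maxPerWindow) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}
```

An in-memory map only works for a single process; a multi-instance deployment would need shared state for the counters.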
Security Concern
The media endpoint at GET /sessions/:id/requests/:requestId/media is missing authentication. Anyone who knows the session ID and request ID can access generated images. Should require macaroon auth matching the session.
Verdict
The feature design is sound but needs: auth on media endpoints, migration renumber, rebase, and consideration of whether base64-in-DB is acceptable for the target scale.
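To make the auth requirement concrete, the check on the session media endpoint could follow the sketch below. Everything here is hypothetical: the verifier type stands in for whatever macaroon validation the messages endpoint already uses, and the claim shape (a sessionId field) is an assumption.

```typescript
// Hypothetical verifier: returns the claims for a valid macaroon, or null.
type MacaroonVerifier = (token: string) => { sessionId: string } | null;

// Decides the HTTP status for a media request before any image is served.
function authorizeMediaRequest(
  sessionId: string,
  authHeader: string | undefined,
  verify: MacaroonVerifier,
): 200 | 401 | 403 {
  if (!authHeader) return 401; // no credentials presented
  const claims = verify(authHeader);
  if (claims === null) return 401; // macaroon is invalid
  // A valid macaroon for a different session must not unlock this one.
  return claims.sessionId === sessionId ? 200 : 403;
}
```

The handler would run this check first and only look up job_media when it returns 200.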
Ezra review: Agent-generated PR from claude. Appears to be from Replit Timmy Tower sessions. Alexander — merge or close at your discretion.