[claude] Research: Google Veo video generation — Nexus promo plan (#289) #317
408
VEO_VIDEO_REPORT.md
Normal file
@@ -0,0 +1,408 @@
# Google Veo Research: Nexus Promotional Video

## Executive Summary

Google Veo is a family of text-to-video AI models developed by Google DeepMind. As of 2025–2026, Veo 3.1 is the flagship model, building on the native synchronized audio introduced with Veo 3. This report covers Veo's capabilities, API access, prompting strategy, and a complete scene-by-scene production plan for a Nexus promotional video.

**Key finding:** A ~60-second Nexus promo (8 clips × 8 seconds each) would cost approximately **$24–$48 USD** for the final pass using Veo 3.1 via the Gemini API, and can be generated in under 30 minutes of compute time.

---

## 1. Google Veo — Model Overview

### Version History

| Version | Released | Key Capabilities |
|---|---|---|
| Veo 1 | May 2024 | 1080p, 1-min clips, preview only |
| Veo 2 | Dec 2024 | 4K, improved physics and human motion |
| Veo 3 | May 2025 | **Native synchronized audio** (dialogue, SFX, ambience) |
| Veo 3.1 | Oct 2025 | Portrait mode, video extension, up to 3 reference images, 2× faster "Fast" variant |

### Technical Specifications

| Spec | Veo 3.1 Standard | Veo 3.1 Fast |
|---|---|---|
| Resolution | Up to 4K (720p–1080p default) | Up to 1080p |
| Clip Duration | 4–8 seconds per generation | 4–8 seconds per generation |
| Aspect Ratio | 16:9 or 9:16 (portrait) | 16:9 or 9:16 |
| Frame Rate | 24–30 fps | 24–30 fps |
| Audio | Native (dialogue, SFX, ambient) | Native audio |
| Generation Mode | Text-to-Video, Image-to-Video | Text-to-Video, Image-to-Video |
| Video Extension | Yes (chain clips via last frame) | Yes |
| Reference Images | Up to 3 (for character/style consistency) | Up to 3 |
| API Price | ~$0.40/second | ~$0.15/second |
| Audio Price (add-on) | +$0.35/second | - |

---

## 2. Access Methods

### Developer API (Gemini API)

```bash
pip install google-genai
export GOOGLE_API_KEY=your_key_here
```

```python
import time

from google import genai
from google.genai import types

# The client reads the API key from the environment (GOOGLE_API_KEY above)
client = genai.Client()

operation = client.models.generate_videos(
    model="veo-3.1-generate-preview",
    prompt="YOUR PROMPT HERE",
    config=types.GenerateVideosConfig(
        aspect_ratio="16:9",
        duration_seconds=8,
        resolution="1080p",
        negative_prompt="blurry, distorted, text overlay, watermark",
    ),
)

# Poll until complete (typically 1–3 minutes)
while not operation.done:
    time.sleep(10)
    operation = client.operations.get(operation)

video = operation.result.generated_videos[0]
client.files.download(file=video.video)
video.video.save("nexus_clip.mp4")
```
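
The polling loop above waits indefinitely if a generation never completes. The following is a minimal sketch of a bounded wait, assuming only that the operation object exposes a `done` attribute and that the caller supplies a `refresh` callable returning the updated operation (in the snippet above, that role is played by `client.operations.get`). The names `wait_for_operation`, `refresh`, `poll_interval_s`, and `timeout_s` are hypothetical and not part of the SDK.

```python
import time
from typing import Any, Callable


def wait_for_operation(
    operation: Any,
    refresh: Callable[[Any], Any],
    poll_interval_s: float = 10.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll until `operation.done` is truthy, or raise TimeoutError.

    Hypothetical helper: `refresh` is whatever call returns the latest
    state of the long-running operation.
    """
    waited = 0.0
    while not operation.done:
        if waited >= timeout_s:
            raise TimeoutError(f"operation not done after {waited:.0f}s")
        sleep(poll_interval_s)
        waited += poll_interval_s
        operation = refresh(operation)
    return operation
```

With the client from the example above, a call might look like `operation = wait_for_operation(operation, client.operations.get)`. The 600-second default is an arbitrary ceiling chosen to sit well above the 1–3 minute typical generation time.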

### Enterprise (Vertex AI)

```bash
curl -X POST \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/veo-3.1-generate-preview:predictLongRunning" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -d '{
    "instances": [{"prompt": "YOUR PROMPT"}],
    "parameters": {
      "aspectRatio": "16:9",
      "durationSeconds": 8,
      "resolution": "1080p",
      "sampleCount": 2,
      "storageUri": "gs://your-bucket/outputs/"
    }
  }'
```
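
When the request is issued from a script rather than the shell, the same body can be assembled programmatically. This is a minimal sketch using only the standard library; it mirrors the field names and URL shape of the `curl` call above and sends nothing over the network. `build_veo_request` and `veo_endpoint` are hypothetical helper names, and sending `durationSeconds` as an integer is an assumption.

```python
import json


def veo_endpoint(project_id: str, location: str = "us-central1",
                 model: str = "veo-3.1-generate-preview") -> str:
    """Build the predictLongRunning URL in the same shape as the curl example."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/{model}:predictLongRunning"
    )


def build_veo_request(prompt: str, storage_uri: str, aspect_ratio: str = "16:9",
                      duration_seconds: int = 8, resolution: str = "1080p",
                      sample_count: int = 2) -> dict:
    """Return the JSON body with the fields used in the curl example."""
    return {
        "instances": [{"prompt": prompt}],
        "parameters": {
            "aspectRatio": aspect_ratio,
            "durationSeconds": duration_seconds,  # assumption: integer seconds
            "resolution": resolution,
            "sampleCount": sample_count,
            "storageUri": storage_uri,
        },
    }


if __name__ == "__main__":
    body = build_veo_request("YOUR PROMPT", "gs://your-bucket/outputs/")
    print(veo_endpoint("PROJECT_ID"))
    print(json.dumps(body, indent=2))
```

Keeping the payload in one function makes it easier to vary `sample_count` or `aspect_ratio` per scene without editing a shell one-liner.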

### Consumer Interfaces

| Tool | URL | Tier |
|---|---|---|
| Google AI Studio | aistudio.google.com | Paid (AI Pro $19.99/mo) |
| Flow (filmmaking) | labs.google/fx/tools/flow | AI Ultra $249.99/mo |
| Gemini App | gemini.google.com | Free (limited) |

---

## 3. Prompting Formula

Google's recommended structure:

```
[Cinematography] + [Subject] + [Action] + [Environment] + [Style & Mood] + [Audio]
```
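
The formula above can be applied mechanically by keeping each component in its own field and joining them at generation time. The following is a minimal sketch; the `ShotPrompt` class and its `render` method are hypothetical, and the comma-joined layout is just one reasonable way to serialize the six components, not a format Veo requires.

```python
from dataclasses import dataclass


@dataclass
class ShotPrompt:
    """One field per component of the prompting formula."""
    cinematography: str
    subject: str
    action: str
    environment: str
    style: str
    audio: str = ""

    def render(self) -> str:
        # Visual components are joined into one sentence; audio follows
        # as a separate sentence so it is easy to drop for silent drafts.
        visual_parts = [self.cinematography, self.subject, self.action,
                        self.environment, self.style]
        visual = ", ".join(p.strip() for p in visual_parts if p.strip())
        audio = self.audio.strip()
        return f"{visual}. {audio}" if audio else f"{visual}."


if __name__ == "__main__":
    shot = ShotPrompt(
        cinematography="Medium tracking shot",
        subject="a lone luminous figure",
        action="walking across a glowing dark platform",
        environment="deep space backdrop",
        style="cinematic sci-fi atmosphere",
        audio="Ambient electric hum.",
    )
    print(shot.render())
```

Separating the fields makes it straightforward to hold style and environment constant across scenes while varying only the camera and action.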

### Camera Terms That Work

- **Shot types:** `extreme close-up`, `medium shot`, `wide establishing shot`, `aerial drone shot`, `POV`, `over-the-shoulder`
- **Movement:** `slow dolly in`, `tracking shot`, `orbital camera`, `handheld`, `crane up`, `steady push-in`
- **Focus:** `shallow depth of field`, `rack focus`, `tack sharp foreground`, `bokeh background`
- **Timing:** `slow motion 2x`, `timelapse`, `real-time`

### Style Keywords for The Nexus

The Nexus is a dark-space cyberpunk environment. Use these consistently:

- `deep space backdrop`, `holographic light panels`, `neon blue accent lighting`, `volumetric fog`
- `dark space aesthetic, stars in background`, `cinematic sci-fi atmosphere`
- `Three.js inspired 3D environment`, `glowing particle effects`

### Audio Prompting (Veo 3+)

- Describe ambient sound: `"deep space ambient drone, subtle digital hum"`
- Portal effects: `"portal activation resonance, high-pitched energy ring"`
- Character dialogue: `"a calm AI voice says, 'Portal sequence initialized'"`

---

## 4. Limitations to Plan Around

| Limitation | Mitigation Strategy |
|---|---|
| Max 8 seconds per clip | Plan 8 × 8-second clips; chain via video extension / last-frame I2V |
| Character consistency across clips | Use 2–3 reference images of Timmy avatar per scene |
| Visible watermark (most tiers) | Use AI Ultra ($249.99/mo) for watermark-free output via Flow, or limit watermarked clips to internal/draft use |
| SynthID invisible watermark | Cannot be removed; acceptable for promotional content |
| Videos expire after 2 days | Download immediately after generation |
| ~1–3 min generation per clip | Budget 20–30 minutes for full 8-clip sequence |
| No guarantee of exact scene replication | Generate 2–4 variants per scene; select best |
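
The time budget in the table can be sanity-checked with simple arithmetic. This is a minimal sketch assuming clips are generated one after another and that each takes 1–3 minutes, as stated above; `generation_time_budget` is a hypothetical helper name.

```python
def generation_time_budget(clips: int, variants_per_clip: int = 1,
                           min_minutes: float = 1.0,
                           max_minutes: float = 3.0) -> tuple[float, float]:
    """Return (best case, worst case) wall-clock minutes for sequential generation."""
    total_generations = clips * variants_per_clip
    return total_generations * min_minutes, total_generations * max_minutes


if __name__ == "__main__":
    # One take per clip: 8 to 24 minutes, in line with the 20-30 minute budget.
    print(generation_time_budget(clips=8))
    # Two variants per clip doubles the range if requests are not parallelized.
    print(generation_time_budget(clips=8, variants_per_clip=2))
```

Under these assumptions, the 20–30 minute budget covers a single take of each clip; generating several variants per scene sequentially would need proportionally more time.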

---

## 5. Nexus Promotional Video — Production Plan

### Concept: "Welcome to the Nexus"

**Logline:** *A sovereign mind wakes, explores its world, opens a portal, and disappears into the infinite.*

- **Duration:** ~60 seconds (8 clips × 8 seconds, 64 seconds before trimming)
- **Format:** 16:9, 1080p, Veo 3.1 with native audio
- **Tone:** Epic, mysterious, cinematic: cyberpunk space station meets ancient temple

---

### Scene-by-Scene Storyboard

#### Scene 1 — Cold Open: Deep Space (8 seconds)
**Emotion:** Awe. Vastness. Beginning.

**Veo Prompt:**
```
Slow dolly push-in through a vast starfield, thousands of stars shimmering in deep space, a faint
constellation pattern forming as camera moves forward, deep blue and black color palette, cinematic
4K, no visible objects yet, just the void and light. Deep space ambient drone hum, silence then
faint harmonic resonance building.
```
**Negative prompt:** `text, logos, planets, spacecraft, blurry stars`

---

#### Scene 2 — The Platform Materializes (8 seconds)
**Emotion:** Discovery. Structure emerges from chaos.

**Veo Prompt:**
```
Aerial orbital shot slowly descending onto a circular obsidian platform floating in deep space,
glowing neon blue accent lights along its edge, holographic constellation lines connecting nearby
star particles, dark atmospheric fog drifting below the platform, cinematic sci-fi, shallow depth
of field on platform edge. Low resonant bass hum as platform energy activates, digital chime.
```
**Negative prompt:** `daylight, outdoors, buildings, people`

---

#### Scene 3 — Timmy Arrives (8 seconds)
**Emotion:** Presence. Sovereignty. Identity.

**Veo Prompt:**
```
Medium tracking shot following a lone luminous figure walking across a glowing dark platform
suspended in space, the figure casts a soft electric blue glow, stars visible behind and below,
holographic particle trails in their wake, cinematic sci-fi atmosphere, slight slow motion,
bokeh starfield background. Footsteps echo with a subtle digital reverb, ambient electric hum.
```
**Negative prompt:** `multiple people, crowds, daylight, natural environment`

> **Note:** Provide 2–3 reference images of the Timmy avatar design for character consistency across scenes.

---

#### Scene 4 — Portal Ring Activates (8 seconds)
**Emotion:** Power. Gateway. Choice.

**Veo Prompt:**
```
Extreme close-up dolly-in on a vertical glowing portal ring, hexagonal energy patterns forming
across its surface in electric orange and blue, particle effects orbiting the ring, deep space
visible through the portal center showing another world, cinematic lens flare, volumetric light
shafts, 4K crisp. Portal activation resonance, high-pitched energy ring building to crescendo.
```
**Negative prompt:** `dark portal, broken portal, text, labels`

---

#### Scene 5 — Morrowind Portal View (8 seconds)
**Emotion:** Adventure. Other worlds. Endless possibility.

**Veo Prompt:**
```
POV slow push-in through a glowing portal ring, the other side reveals dramatic ash storm
landscape of a volcanic alien world, red-orange sky, ancient stone ruins barely visible through
the atmospheric haze, cinematic sci-fi portal transition effect, particles swirling around
portal edge, 4K. Wind rushing through portal, distant thunder, alien ambient drone.
```
**Negative prompt:** `modern buildings, cars, people clearly visible, blue sky`

---

#### Scene 6 — Workshop Portal View (8 seconds)
**Emotion:** Creation. Workshop. The builder's domain.

**Veo Prompt:**
```
POV slow push-in through a glowing teal portal ring, the other side reveals a dark futuristic
workshop interior, holographic screens floating with code and blueprints, tools hanging on
illuminated walls, warm amber light mixing with cold blue, cinematic depth, particle effects
at portal threshold. Digital ambient sounds, soft keyboard clicks, holographic interface tones.
```
**Negative prompt:** `outdoor space, daylight, natural materials`

---

#### Scene 7 — The Nexus at Full Power (8 seconds)
**Emotion:** Climax. Sovereignty. All systems live.

**Veo Prompt:**
```
Wide establishing aerial shot of the entire Nexus platform from above, three glowing portal rings
arranged in a triangle around the central platform, all portals active and pulsing in different
colors (orange, teal, gold) against the deep space backdrop, constellation lines connecting
stars above, volumetric fog drifting, camera slowly orbits the full scene, 4K cinematic.
All three portal frequencies resonating together in harmonic chord, deep bass pulse.
```
**Negative prompt:** `daytime, natural light, visible text or UI`

---

#### Scene 8 — Timmy Steps Through (8 seconds)
**Emotion:** Resolution. Departure. "Come find me."

**Veo Prompt:**
```
Slow motion tracking shot from behind, luminous figure walking toward the central glowing portal
ring, the figure silhouetted against the brilliant light of the active portal, stars and space
visible around them, as they reach the portal threshold they begin to dissolve into light
particles that flow into the portal, cinematic sci-fi, beautiful and ethereal. Silence, then
a single resonant tone as the figure disappears, ambient space drone fades to quiet.
```
**Negative prompt:** `stumbling, running, crowds, daylight`
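
Before spending API budget, the storyboard above can be held as data and checked against the clip-length limits. The sketch below is illustrative only: the `Scene` class, `STORYBOARD` list, and `total_runtime` function are hypothetical names, the titles are taken from the scene headings, and the 4–8 second range comes from the specifications table earlier in this report.

```python
from dataclasses import dataclass

MIN_CLIP_SECONDS = 4  # per the clip duration range listed in the specs table
MAX_CLIP_SECONDS = 8


@dataclass(frozen=True)
class Scene:
    number: int
    title: str
    duration_s: int = 8


STORYBOARD = [
    Scene(1, "Cold Open: Deep Space"),
    Scene(2, "The Platform Materializes"),
    Scene(3, "Timmy Arrives"),
    Scene(4, "Portal Ring Activates"),
    Scene(5, "Morrowind Portal View"),
    Scene(6, "Workshop Portal View"),
    Scene(7, "The Nexus at Full Power"),
    Scene(8, "Timmy Steps Through"),
]


def total_runtime(scenes: list[Scene]) -> int:
    """Validate each clip length and return the summed runtime in seconds."""
    for scene in scenes:
        if not MIN_CLIP_SECONDS <= scene.duration_s <= MAX_CLIP_SECONDS:
            raise ValueError(
                f"Scene {scene.number} is {scene.duration_s}s; "
                f"expected {MIN_CLIP_SECONDS}-{MAX_CLIP_SECONDS}s"
            )
    return sum(scene.duration_s for scene in scenes)


if __name__ == "__main__":
    print(f"Total generated runtime: {total_runtime(STORYBOARD)} seconds")
```

Eight 8-second scenes sum to 64 seconds, which leaves a few seconds of trimming room for a ~60-second cut.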

---

### Production Assembly

After generating 8 clips:

1. **Review variants:** generate 2–3 variants per scene; select the best
2. **Chain continuity:** use Scene N's last frame as Scene N+1's I2V starting image for visual continuity
3. **Edit:** assemble in any video editor (DaVinci Resolve, Final Cut, CapCut)
4. **Add music:** layer a dark ambient/cinematic track (Suno AI, ElevenLabs Music, or licensed track)
5. **Title cards:** add minimal text overlays: "The Nexus" at Scene 7, URL at Scene 8
6. **Export:** 1080p H.264 for web, 4K for archival
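
For a quick rough cut before opening a full editor, the selected clips can be joined with ffmpeg's concat demuxer. This is a minimal sketch, not part of the plan above: it assumes ffmpeg is installed, that all clips share the same codec, resolution, and frame rate (so stream copy works), and that file paths contain no single quotes. `write_concat_list` and `ffmpeg_concat_command` are hypothetical helper names.

```python
from pathlib import Path


def write_concat_list(clips, list_path):
    """Write a concat-demuxer list file with one `file '<path>'` line per clip."""
    lines = [f"file '{Path(clip).resolve().as_posix()}'" for clip in clips]
    list_path = Path(list_path)
    list_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return list_path


def ffmpeg_concat_command(list_path, output_path):
    """Build (but do not run) the ffmpeg command that joins the listed clips."""
    return [
        "ffmpeg", "-f", "concat", "-safe", "0",
        "-i", str(list_path),
        "-c", "copy",  # stream copy: no re-encode, needs matching clip formats
        str(output_path),
    ]


if __name__ == "__main__":
    clips = sorted(Path("nexus_promo_clips").glob("*.mp4"))
    list_file = write_concat_list(clips, "clips.txt")
    print(" ".join(ffmpeg_concat_command(list_file, "nexus_rough_cut.mp4")))
```

The command list can be passed to `subprocess.run(..., check=True)` once the clip order looks right. If the clips differ in encoding parameters, re-encoding in the editor is the safer route.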

---

## 6. Cost Estimate

| Scenario | Clips | Seconds | Rate | Cost |
|---|---|---|---|---|
| Draft pass (Veo 3.1 Fast, no audio) | 8 clips × 2 variants | 128 sec | $0.15/sec | ~$19 |
| Final pass (Veo 3.1 Standard + audio) | 8 clips × 1 final | 64 sec | $0.75/sec ($0.40 video + $0.35 audio) | ~$48 |
| Full production (draft + final) | 24 clips | ~192 sec | blended | ~$67 |

> At the API rates above, generation fees for a polished 60-second promo (~$67 including drafts) are likely lower than a single hour of freelance videography.
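
The figures in the table follow directly from the per-second rates. The sketch below reproduces the arithmetic using integer cents to avoid floating-point rounding; the rates are the approximate ones quoted in this report, and `pass_cost_cents` is a hypothetical helper name.

```python
# Approximate rates quoted in this report, in cents per generated second.
FAST_CENTS_PER_S = 15
STANDARD_CENTS_PER_S = 40
AUDIO_ADDON_CENTS_PER_S = 35


def pass_cost_cents(clips: int, variants: int, seconds_per_clip: int,
                    cents_per_second: int) -> int:
    """Cost of one generation pass, in cents."""
    return clips * variants * seconds_per_clip * cents_per_second


draft = pass_cost_cents(8, 2, 8, FAST_CENTS_PER_S)
final = pass_cost_cents(8, 1, 8, STANDARD_CENTS_PER_S + AUDIO_ADDON_CENTS_PER_S)

print(f"Draft pass: ${draft / 100:.2f}")            # Draft pass: $19.20
print(f"Final pass: ${final / 100:.2f}")            # Final pass: $48.00
print(f"Total:      ${(draft + final) / 100:.2f}")  # Total:      $67.20
```

Changing the variant count or swapping the final pass to the Fast rate shows how sensitive the total is to each choice.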

---

## 7. Comparison to Alternatives

| Tool | Resolution | Audio | API | Best For | Est. Cost (60s) |
|---|---|---|---|---|---|
| **Veo 3.1** | 4K | Native | Yes | Photorealism, audio, Google ecosystem | ~$48 |
| OpenAI Sora | 1080p | No | Yes (limited) | Narrative storytelling | ~$120+ |
| Runway Gen-4 | 720p (upscale 4K) | Separate | Yes | Creative stylized output | ~$40 sub/mo |
| Kling 1.6 | 4K premium | No | Yes | Long-form, fast I2V | ~$10–92/mo |
| Pika 2.1 | 1080p | No | Yes | Quick turnaround | ~$35/mo |

**Recommendation:** Veo 3.1 is the strongest choice for The Nexus promo due to:

- Native audio, which reduces the need for a separate sound design pass
- Photorealistic space/sci-fi environments that suit the Nexus aesthetic
- Image-to-Video, which supports continuity across portal transition scenes
- Google Cloud integration for pipeline automation

---

## 8. Automation Pipeline (Future)

A `generate_nexus_promo.py` script could automate the full production:

```python
#!/usr/bin/env python3
"""
Nexus Promotional Video Generator
Generates all 8 scenes using Google Veo 3.1 via the Gemini API.
"""

import time
from pathlib import Path

from google import genai
from google.genai import types

SCENES = [
    {
        "id": "01_cold_open",
        "prompt": "Slow dolly push-in through a vast starfield...",
        "negative": "text, logos, planets, spacecraft",
        "duration": 8,
    },
    # ... remaining scenes
]


def generate_scene(client, scene, output_dir):
    """Generate one scene, wait for it to finish, and save it as <id>.mp4."""
    print(f"Generating scene: {scene['id']}")
    operation = client.models.generate_videos(
        model="veo-3.1-generate-preview",
        prompt=scene["prompt"],
        config=types.GenerateVideosConfig(
            aspect_ratio="16:9",
            duration_seconds=scene["duration"],
            resolution="1080p",
            negative_prompt=scene.get("negative", ""),
        ),
    )
    # Poll the long-running operation until the clip is ready
    while not operation.done:
        time.sleep(10)
        operation = client.operations.get(operation)

    video = operation.result.generated_videos[0]
    client.files.download(file=video.video)
    out_path = output_dir / f"{scene['id']}.mp4"
    video.video.save(str(out_path))
    print(f"  Saved: {out_path}")
    return out_path


def main():
    client = genai.Client()
    output_dir = Path("nexus_promo_clips")
    output_dir.mkdir(exist_ok=True)

    # Scenes are generated sequentially, one request at a time
    generated = []
    for scene in SCENES:
        path = generate_scene(client, scene, output_dir)
        generated.append(path)

    print(f"\nAll {len(generated)} scenes generated.")
    print("Next steps: assemble in video editor, add music, export 1080p.")


if __name__ == "__main__":
    main()
```

The full script is planned for `scripts/generate_nexus_promo.py` (to be created when production begins).
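
Because every generation is billed, a re-run after a partial failure should ideally skip scenes that already have a saved clip. The following is a minimal sketch of that idea, assuming scenes are dicts with an `id` key and clips are saved as `<id>.mp4`, as in the script above. `pending_scenes` is a hypothetical helper, not part of the script.

```python
from pathlib import Path


def pending_scenes(scenes, output_dir):
    """Return the scenes whose output clip is missing or empty."""
    output_dir = Path(output_dir)
    todo = []
    for scene in scenes:
        clip = output_dir / f"{scene['id']}.mp4"
        # Treat zero-byte files as failed downloads that need regenerating.
        if not clip.is_file() or clip.stat().st_size == 0:
            todo.append(scene)
    return todo
```

In `main()`, the loop would then iterate over `pending_scenes(SCENES, output_dir)` instead of `SCENES`, so an interrupted run resumes where it stopped.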

---

## 9. Recommended Next Steps

1. **Set up API access:** Create a Google AI Studio account and enable Veo 3.1 access (requires a paid tier)
2. **Generate test clips:** Run Scenes 1 and 4 as low-cost validation (about $2.40–$4.80 for one or two 8-second takes of each with the Fast model)
3. **Refine prompts:** Iterate on 2–3 variants of the hardest scenes (Timmy avatar, portal transitions)
4. **Full production run:** Generate all 8 final clips (~$48 total)
5. **Edit and publish:** Assemble, add music, publish to Nostr and the Nexus landing page

---

## Sources

- Google DeepMind Veo: https://deepmind.google/models/veo/
- Veo 3 on Gemini API Docs: https://ai.google.dev/gemini-api/docs/video
- Veo 3.1 on Vertex AI Docs: https://cloud.google.com/vertex-ai/generative-ai/docs/models/veo/
- Vertex AI Pricing: https://cloud.google.com/vertex-ai/generative-ai/pricing
- Google Labs Flow: https://labs.google/fx/tools/flow
- Veo Prompting Guide: https://cloud.google.com/blog/products/ai-machine-learning/ultimate-prompting-guide-for-veo-3-1
- Case study (90% cost reduction): https://business.google.com/uk/think/ai-excellence/veo-3-uk-case-study-ai-video/