Thalx — Project Overview
Generated by the Miles agent team | 2026-03-12
# Thalx: AI Content Transformation to Reels

## Overview

**Thalx** is a SaaS platform that transforms any content (YouTube URL, audio, video, PDF) into **short 9:16 videos** ready to publish on YouTube Shorts, Instagram Reels, and TikTok. A fully automated 11-step pipeline covers everything from ingestion of the source content to publication on social networks.

---

## Tech Stack

| Layer | Technology | Deploy |
|-------|------------|--------|
| Frontend | SvelteKit 5 + Tailwind 4 | Vercel (auto-deploy from master) |
| Backend API | FastAPI + arq (Redis queue) | VPS (systemd) |
| LLM | Claude Agent SDK (Max subscription) | No API key |
| DB + Auth | Supabase (PostgreSQL + RLS) | Supabase Cloud |
| TTS | ElevenLabs (cost-controlled) | API |
| Stock footage | Pexels API (cached) | API |
| AI video gen | Sora/Kling/Runway (stub) | Pending |
| Video render | Remotion SSR (React + H.264) | Local Node.js |
| Transcription | faster-whisper | Local CPU |
| Queue | arq + Upstash Redis | Serverless |
| Upload | Supabase Storage | Cloud |
| YouTube proxy | IPRoyal residential proxy | Rotating IPs |

---

## Full Pipeline (11 steps)

```
YouTube URL / Audio / Video / PDF
    |
 1. INGESTION ─── yt-dlp, file upload, PDF parse
    |
 2. TRANSCRIPTION ── faster-whisper (local, CPU)
    |
 3. ANALYSIS ─── Claude haiku → core_message, key_points,
    |            quotes, emotional_arc, hooks, footage_keywords
    |
 4. SCRIPT GENERATION (2 phases in parallel x3 formats)
    |   4a. Content workspace → narrative scripts (what is SAID)
    |   4b. Production workspace → animation specs (how it LOOKS)
    |   Merge → unified scenes
    |
 5. TTS ─── ElevenLabs (cost-controlled, 3-layer limiting)
    |
 6. FOOTAGE ─── Pexels stock (default) or AI video gen
    |
 7. RENDER ─── Remotion SSR (mood palettes, energy animations,
    |          transition-aware, brand-aware)
    |
 8. UPLOAD ─── Supabase Storage (bucket "outputs")
    |
 9. PUBLISH ─── YouTube / Instagram / TikTok
    |
10. SAVE ─── published_videos table
    |
11. CLEANUP
```

---

## Skills System (new, 2026-03-12)

Claude prompts are no longer hardcoded. They are **versioned skill files** with metadata for automatic evaluation.

### 5 Skills

| Skill | Model | Description |
|-------|-------|-------------|
| `analysis` | haiku | Extracts core_message, key_points, quotes, emotional_arc, hooks, footage_keywords |
| `youtube_script` | sonnet | Narrative YouTube script (8-12 scenes, 5-10 min) |
| `reel_script` | sonnet | Narrative Instagram Reel script (5-7 scenes, 30-45s) |
| `tiktok_script` | sonnet | Narrative TikTok script (4-6 scenes, 15-25s) |
| `animation_spec` | sonnet | Visual direction for the render engine (mood, energy, transitions) |

### Skill File Format

Each skill is a `.md` file with YAML frontmatter:

```yaml
---
name: reel_script
version: 1
model: sonnet
expected_fields: [format, target_duration_seconds, scenes]
eval:
  scene_count: [5, 7]
  duration_range: [30, 45]
  required_scene_types: [hook, cta]
  max_caption_words: 4
---
(prompt text here)
```

### Location

```
backend/app/skills/
  loader.py        # load_skill("name") → Skill dataclass
  validators.py    # 10 quality checks
  v1/              # version 1 of the skills
    analysis.md
    youtube_script.md
    reel_script.md
    tiktok_script.md
    animation_spec.md
  fixtures/        # golden inputs for evaluation
    sample_transcript_01.json
    sample_analysis_01.json
```

---

## Eval Framework

### 10 Validators

| Check | What it validates |
|-------|-------------------|
| `valid_json` | Output parses as JSON (tolerant of trailing text) |
| `required_fields` | All expected fields are present |
| `scene_count` | Number of scenes is within the range |
| `duration_range` | Total duration is within the target |
| `scene_types` | Has a hook and a CTA |
| `footage_concrete` | Footage queries are longer than 2 words and not abstract |
| `caption_length` | Captions are within the word limit |
| `analysis_counts` | Enough key_points, hooks, and keywords |
| `emotional_arc` | Arc is one of the valid values |
| `energy_range` | Scene energy is between 0.0 and 1.0 |

### Manual Eval (CLI)

```bash
cd ~/thalx-web-app/backend
venv/bin/python scripts/run_evals.py                      # all skills
venv/bin/python scripts/run_evals.py --skill reel_script  # a single skill
venv/bin/python scripts/run_evals.py --runs 3             # multiple runs
```

### Auto-Eval (daily, no human intervention)

**Cron: daily at 8:03 AM** → `scripts/auto_eval.py`

Automatic loop:

1. Runs the 5 skills against Claude
2. If any skill fails → Claude improves the prompt automatically (up to 2 attempts)
3. Re-evaluates to verify the fix
4. Backs up the original prompt (.md.bak)
5. Sends a report to Roberto on Slack via Miles

### Latest Eval (2026-03-12)

| Skill | Checks | Pass Rate | Latency |
|-------|--------|-----------|---------|
| analysis | 4/4 | **100%** | 10s |
| animation_spec | 3/3 | **100%** | 31s |
| reel_script | 7/7 | **100%** | 35s |
| tiktok_script | 7/7 | **100%** | 25s |
| youtube_script | 7/7 | **100%** | 56s |

**Overall: 28/28 checks passing (100%)**

### Tests (no Claude calls)

23 unit tests against golden fixtures:

```bash
venv/bin/python -m pytest tests/test_skills.py -v  # 23 passed
```

---

## Render Engine (Remotion)

The render engine translates the animation spec into actual video:

- **Mood palettes**: 9 gradient palettes (urgent, calm, explosive, mysterious, triumphant, etc.)
- **Brand-aware**: if the job has a brand_pack, brand colors are used as the dominant palette
- **Energy-driven**: fade speed, slide distance, and hook scale are parameterized by energy (0-1)
- **Transition-aware**: zoom burst, slide/sweep, dissolve, hard cut
- **Key moment pulse**: subtle flash on scenes with key_moment
- **Compositions**: Reel (1080x1920), TikTok (1080x1920), YouTube (1920x1080)

---

## Content Advisor (agents-claude)

A new agent on the Miles team that analyzes YouTube channels to recommend which content to transform into reels.

### Methodology

1. **Channel Health Assessment**: last 10-15 videos, engagement rate, posting frequency
2. **Content-to-Reel Scoring** (1-10): hook potential (30%), visual richness (20%), shareability (20%), conciseness (15%), timeliness (15%)
3. **Competitor Gap Analysis**: 3-5 channels in the same niche
4. **Trend-to-Reel Mapping**: current trends x relevance to the channel

### Tools

- `transcribe_youtube` with IPRoyal proxy (residential rotating)
- Web search for trends
- Playwright for browsing
- Notion for project context

---

## YouTube Transcription Tool

A tool shared between agents-claude and Thalx:

- **Fast path**: native subtitles via `youtube_transcript_api` + proxy (~1s)
- **Fallback**: yt-dlp audio download + faster-whisper
- **Metadata**: YouTube oEmbed API (title, channel) + duration estimated from the transcript
- **Proxy**: IPRoyal residential rotating (PROXY_URL in .env)
- **Bandwidth**: ~50KB per native transcript = ~20,000 videos per 1GB

---

## Frontend Architecture

| Page | Function |
|------|----------|
| `/process` | Wizard (source type + config + file upload) |
| `/jobs` | Job list with status pills |
| `/jobs/[id]` | Job detail with SSE progress bar + outputs |
| `/results` | Completed jobs with download/copy URL |
| `/architecture` | System architecture view |
| `/setup` | Diagnostics |
| `/auth/*` | Login/register (Supabase Auth) |

---

## Repos and URLs

| Resource | URL |
|----------|-----|
| Main app | `~/thalx-web-app` (GitHub: aguirrerjg/thalx-web-app) |
| Architecture docs | `~/insight-engine-arch` (GitHub: aguirrerjg/insight-engine-arch) |
| Frontend prod | https://thalx-web-app.vercel.app |
| Backend API prod | https://playgrounds.digitalhubassist.ai/thalx-api/api/v1/ |
| Agents | `~/agents-claude` (GitHub: aguirrerjg/agents-claude) |

---

## Pending

### High priority

- [ ] Apply migrations 006 + 008 + 009 in the Supabase Dashboard
- [ ] Configure BACKEND_URL and SUPABASE_SERVICE_KEY in Vercel env vars
- [ ] ElevenLabs: renew/upgrade quota
- [ ] Full E2E test via the UI (YouTube URL → render → results)
- [ ] Deploy backend (systemd service or Docker on VPS)

### Medium priority

- [ ] Captology (streaks, badges, surprise outputs)
- [ ] AI video gen: connect a real API (Sora/Kling/Runway)
- [ ] Frontend: UI to create/select brand packs
- [ ] Add more evaluation fixtures (diverse transcripts)
- [ ] Eval runner: semantic quality scoring with Claude
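
The structural checks listed under Eval Framework can be written as small pure functions. The sketch below is a minimal illustration and not the project's `validators.py`. It assumes a parsed script shaped like `{"scenes": [...]}` where each scene carries `type`, `duration_seconds`, `energy`, and `caption` keys. Those key names and the function name `check_script` are hypothetical. The thresholds come from the `eval` block of the `reel_script` skill file.

```python
from typing import Any

# Thresholds copied from the `eval` block of the reel_script skill file.
REEL_EVAL = {
    "scene_count": [5, 7],
    "duration_range": [30, 45],
    "required_scene_types": ["hook", "cta"],
    "max_caption_words": 4,
}


def check_script(script: dict[str, Any], rules: dict[str, Any]) -> dict[str, bool]:
    """Run a subset of the structural checks and return pass/fail per check.

    The scene keys used here (type, duration_seconds, energy, caption)
    are assumptions made for this sketch.
    """
    scenes = script.get("scenes", [])
    min_scenes, max_scenes = rules["scene_count"]
    min_secs, max_secs = rules["duration_range"]
    total_secs = sum(s.get("duration_seconds", 0) for s in scenes)
    scene_types = {s.get("type") for s in scenes}
    return {
        "scene_count": min_scenes <= len(scenes) <= max_scenes,
        "duration_range": min_secs <= total_secs <= max_secs,
        "scene_types": set(rules["required_scene_types"]) <= scene_types,
        "caption_length": all(
            len(s.get("caption", "").split()) <= rules["max_caption_words"]
            for s in scenes
        ),
        "energy_range": all(0.0 <= s.get("energy", 0.0) <= 1.0 for s in scenes),
    }


# Hypothetical reel script: 5 scenes, 35 seconds in total.
sample = {
    "scenes": [
        {"type": "hook", "duration_seconds": 5, "energy": 0.9, "caption": "Stop scrolling"},
        {"type": "point", "duration_seconds": 8, "energy": 0.6, "caption": "Here is why"},
        {"type": "point", "duration_seconds": 8, "energy": 0.5, "caption": "It gets better"},
        {"type": "point", "duration_seconds": 8, "energy": 0.7, "caption": "One more thing"},
        {"type": "cta", "duration_seconds": 6, "energy": 0.8, "caption": "Follow for more"},
    ]
}

results = check_script(sample, REEL_EVAL)
failed = [name for name, ok in results.items() if not ok]
print("failed checks:", failed)
```

Returning one boolean per check, rather than raising on the first failure, lets a runner report every failing check for a skill in a single pass.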
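
The Content-to-Reel score in the Content Advisor methodology can be read as a weighted average of five sub-scores, each on a 1-10 scale. The sketch below shows that arithmetic with the weights listed there. The function name `reel_score`, the dictionary keys, and the sample sub-scores are illustrative assumptions, not the agent's actual code.

```python
# Weights from the Content-to-Reel Scoring step; they sum to 1.0.
WEIGHTS = {
    "hook_potential": 0.30,
    "visual_richness": 0.20,
    "shareability": 0.20,
    "conciseness": 0.15,
    "timeliness": 0.15,
}


def reel_score(subscores: dict[str, float]) -> float:
    """Combine five 1-10 sub-scores into one weighted score on the same scale."""
    missing = WEIGHTS.keys() - subscores.keys()
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    for name in WEIGHTS:
        if not 1 <= subscores[name] <= 10:
            raise ValueError(f"{name} must be between 1 and 10")
    return round(sum(WEIGHTS[k] * subscores[k] for k in WEIGHTS), 2)


# Hypothetical sub-scores for a single video.
example = {
    "hook_potential": 8,
    "visual_richness": 6,
    "shareability": 7,
    "conciseness": 9,
    "timeliness": 5,
}
print(reel_score(example))
```

Because the weights sum to 1.0, the combined score stays on the same 1-10 scale as its inputs, so videos can be ranked directly by it.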