Vision, audio, video generation, and multimodal LLM integration patterns. Use when processing images, transcribing audio, generating speech, generating AI video (Kling v3, Sora 2, Veo 3.1 std/lite/fast, Runway Gen-4.5 via `gen4_turbo`), or building multimodal AI pipelines.
This skill does not declare a tool allowlist. The agent host applies whatever default tools are available at runtime.
SKILL.md / Manifest
https://raw.githubusercontent.com/yonatangross/orchestkit/main/plugins/ork/skills/multimodal-llm/SKILL.mdRegistry
github (via claudemarketplaces.com)