An open index of curated prompts for image & video generation models.
Make it a winter scene with snow
Change the sky to sunset
Make it look like a painting
Add dramatic lighting
Please open a document before running this script
2. Create a new project or select an existing one
**Dashboard (`/dashboard`)**: Protected user dashboard with profile information
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Create a new issue with detailed information about your problem
Viral Xiaohongshu covers, Keynote-grade PPT illustrations, heatmaps / flowcharts / SWOT diagrams, logic diagrams for WeChat Official Account articles: one Skill handles them all.
12 built-in visual styles, from Apple Keynote minimalism to Chinese Songti typography with generous whitespace, from cyberpunk neon to magazine color-clash, each broken down by real designers' methodology into hex codes / fonts / Do & Don'ts.
- "Draw a Xiaohongshu cover: title 'I use AI to read 100 articles every day', xhs-vibrant style"
- "Make a PPT illustration: key message 'Context, not Prompts', minimal style"
- "Draw a product roadmap, dark-atmospheric style"
- "Batch-generate 6 illustrations for this article of mine, sketch-notes style"
- "A full slide deck: turn this markdown into a 12-page deck"
Text-to-image diffusion models have demonstrated remarkable capability in generating realistic images from arbitrary text prompts. However, they often produce inconsistent results for compositional prompts such as "two dogs" or "a penguin on the right of a bowl". Understanding these inconsistencies is crucial for reliable image generation.
In this paper, we highlight the significant role of initial noise in these inconsistencies, where certain noise patterns are more reliable for compositional prompts than others. Our analyses reveal that different initial random seeds tend to guide the model to place objects in distinct image areas, potentially adhering to specific patterns of camera angles and image composition associated with the seed.
To improve the model's compositional ability, we propose a method for mining these reliable cases, resulting in a curated training set of generated images without requiring any manual annotation. By fine-tuning text-to-image models on these generated images, we significantly enhance their compositional capabilities. For numerical composition, we observe relative increases of 29.3% and 19.5% for Stable Diffusion and PixArt-α, respectively. Spatial composition sees even larger gains, with 60.7% for Stable Diffusion and 21.1% for PixArt-α.
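The seed-mining step described above can be sketched in plain Python. Everything here is an illustrative assumption rather than the paper's actual pipeline: the function name `mine_reliable_seeds`, the `(seed, passed)` evaluation format, and the thresholds are hypothetical, and the boolean verdicts stand in for an automatic checker (such as an object detector confirming that "two dogs" really produced two dogs).

```python
from collections import defaultdict

def mine_reliable_seeds(evaluations, min_trials=5, threshold=0.8):
    """Rank seeds by how often their generations satisfy the
    compositional prompt (correct object count / position).

    evaluations: iterable of (seed, passed) pairs, where `passed`
    is True when an automatic checker confirms the generated image
    matches the prompt. Returns the seeds whose success rate over
    at least `min_trials` generations meets `threshold`.
    """
    trials = defaultdict(lambda: [0, 0])  # seed -> [passes, total]
    for seed, passed in evaluations:
        trials[seed][1] += 1
        if passed:
            trials[seed][0] += 1
    return sorted(
        seed
        for seed, (passes, total) in trials.items()
        if total >= min_trials and passes / total >= threshold
    )

# Hypothetical detector verdicts for the prompt "two dogs":
# seed 42 succeeds 5/6 times, seed 7 only 2/6 times.
evals = [(42, True)] * 5 + [(42, False)] + [(7, True)] * 2 + [(7, False)] * 4
print(mine_reliable_seeds(evals))  # → [42]
```

Images generated from the surviving seeds would then form the curated, annotation-free fine-tuning set the abstract describes.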
At least 48 GB of VRAM for running the CogVLM2 server, or 16 GB for the Int4-quantized version.
4. Generate a dataset with these seeds
**#113** [Emerging from Architectural Blueprint](https://wm4n.github.io/nano-banana-prompt/prompts/113-emerging-from-architectural-blueprint/) — `City & Architecture` `3D & Miniature`
**#111** [Cartoon Character Sticker](https://wm4n.github.io/nano-banana-prompt/prompts/111-cartoon-character-sticker/) — `Character & Portrait`
**#109** [Taking Photo Under Cherry Blossom](https://wm4n.github.io/nano-banana-prompt/prompts/109-taking-photo-under-cherry-blossom/) — `Photo & Cinematic`
"Infographic & UI"
cover: "https://..." # use first step's image for the card
**Thumbnail Studio** is an advanced, AI-driven thumbnail prompt generator designed for YouTube content creators. It removes the complexity of prompt writing, letting you build high click-through-rate (CTR) visual recipes with just a few button clicks.
**🖼️ Live Canvas Preview**: The font you select is rendered on the canvas in real time