Homunculus: AI-Driven Animation

60-second animation showcasing metamorphosis through innovative AI techniques with exceptional character consistency

Homunculus AI animation screenshot showing the emotive creature

Project Overview

For a job application with Pale Blue Dot Films, I created "Homunculus"—a 60-second AI-driven animation showcasing the metamorphosis of a geometric egg into a complex, emotive creature. The project challenged me to explore the theme of "Metamorphosis and Reaction" through an evolving, expressive character using cutting-edge AI technologies.

Although the company suggested spending only 2-3 hours on the task, I invested 2-3 days to achieve exceptional quality, focusing in particular on character consistency—one of the most persistent challenges in AI-generated video. By developing a workflow that combined Stable Diffusion for frame generation, Runway for video synthesis, and personal performance capture for emotional authenticity, I created a visually compelling and emotionally resonant piece.

What makes this project significant is how it demonstrates the potential of AI in animation production while highlighting my ability to solve complex technical problems, maintain artistic vision, and deliver polished results under tight deadlines. The final animation achieved remarkable character consistency and emotional depth, qualities that are notoriously difficult to achieve in AI-generated content.

🎯 Goal

Create a visually striking and emotionally resonant AI animation that demonstrates technical proficiency while maintaining character consistency.

⏱️ Timeline

2-3 days in August 2024, for a project initially scoped as 2-3 hours.

🧠 Role

Creator, Director, Technical Artist, AI Engineer, and Motion Performer for the entire production.

🛠️ Tools & Technologies

Runway, Stable Diffusion, ComfyUI, Adobe Photoshop, Adobe After Effects, Video-to-Video Processing.

Challenge & Solution

The Challenge

Creating an emotionally resonant AI animation presented several significant technical challenges:

  • Limited Control: Text-to-video AI models like Runway provide very limited control over camera movement, character positioning, and scene composition.
  • Character Consistency: Maintaining consistent character appearance across the entire animation is one of the most persistent problems in AI-generated video.
  • Emotional Authenticity: Creating genuine emotional expressions in AI-generated characters is extremely difficult, as text prompts often result in generic or exaggerated emotions.
  • Time Constraints: While the application suggested spending 2-3 hours on the task, achieving professional-quality results demanded a significantly larger time investment.
  • Technical Limitations: Working with early-generation AI video tools presented various technical constraints in resolution, frame rate, and visual artifacts.

The Solution

I developed a comprehensive approach that addressed each challenge with innovative technical solutions:

  • Hybrid Workflow: Rather than relying solely on text-to-video generation, I created a hybrid workflow combining Stable Diffusion for key frames with Runway for video synthesis.
  • First/Last Frame Control: To gain more precise control over the animation, I used Stable Diffusion and Photoshop to create detailed first and last frames, then fed these into Runway to guide the video generation.
  • Performance Capture: For the most emotionally intense sequences, I filmed myself performing the movements and expressions, then processed this through my video-to-video stylization workflow to maintain authentic performance while achieving the desired aesthetic.
  • Extended Timeline: I made the strategic decision to invest 2-3 days rather than 2-3 hours, recognizing that quality and character consistency required meticulous attention and multiple iterations.
  • Post-Production Refinement: Final polish in After Effects enhanced color, smoothed transitions, and ensured visual coherence throughout the piece.
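The first/last-frame technique above can be thought of as a simple contract around the image-to-video backend: each segment is anchored by two hand-refined key frames plus a prompt. A minimal sketch of that idea, where `segment_request` is a hypothetical helper for illustration and the file names and fields are invented (Runway's actual API is not shown here):

```python
def segment_request(first_frame: str, last_frame: str, prompt: str, seconds: int) -> dict:
    """Bundle the inputs for one image-to-video generation pass.

    `first_frame`/`last_frame` are paths to the Stable Diffusion + Photoshop
    key frames that anchor the segment; the backend call itself (e.g. Runway's
    image-to-video endpoint) is omitted because its exact API varies.
    """
    return {
        "keyframes": [
            {"position": "first", "image": first_frame},
            {"position": "last", "image": last_frame},
        ],
        "prompt": prompt,
        "duration_seconds": seconds,
    }

# One request per storyboard beat keeps generation controllable segment by segment.
request = segment_request("egg_first.png", "egg_last.png", "geometric egg cracking open", 4)
```

Anchoring both endpoints constrains the model far more than a text prompt alone, which is what makes per-segment consistency tractable.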

Process & Methodology

My approach to this project followed a comprehensive production pipeline that blended traditional animation workflows with cutting-edge AI techniques. The process evolved through distinct phases, each with its own set of challenges and solutions.

1. Pre-Production

The foundation of the project began with careful concept development and planning:

  • Prompt Selection: After reviewing the options provided by Pale Blue Dot Films, I selected the "Metamorphosis and Reaction" prompt for its potential to showcase emotional expressivity.
  • Concept Development: I developed the concept of a geometric egg shape hatching into a complex, suffering creature to create a clear emotional arc.
  • Character Design: Carefully selected visual style and character physique that would work well with AI generation while maintaining consistency.
  • Storyboarding: Created a detailed storyboard mapping each beat of the animation to plan camera movements, transformations, and emotional progression.
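A storyboard like this can be kept as structured data so each beat's timing and prompt stay in sync across tools. A minimal sketch, where the beat names, durations, and prompts are illustrative placeholders rather than the actual storyboard:

```python
# Storyboard beats as data: each entry drives one generated video segment.
# Beat names, durations, and prompts are invented for illustration.
STORYBOARD = [
    {"beat": "dormant_egg",  "seconds": 10, "prompt": "geometric egg, still, dim light"},
    {"beat": "first_cracks", "seconds": 12, "prompt": "geometric egg, hairline cracks, inner glow"},
    {"beat": "hatching",     "seconds": 14, "prompt": "shell fragments bursting outward"},
    {"beat": "emergence",    "seconds": 14, "prompt": "creature unfolding, trembling limbs"},
    {"beat": "reaction",     "seconds": 10, "prompt": "creature's face, anguished expression"},
]

def validate_storyboard(beats, target_seconds=60):
    """Check that the beats exactly fill the target runtime."""
    total = sum(b["seconds"] for b in beats)
    if total != target_seconds:
        raise ValueError(f"storyboard runs {total}s, expected {target_seconds}s")
    return total
```

Keeping timings in one place makes it easy to confirm the segments will add up to the required 60 seconds before any generation time is spent.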
2. Character & Environment Generation

The technical production began with creating consistent visual elements:

  • AI Model Selection: Selected appropriate AI models based on their capability to generate consistent characters and environments.
  • Key Frame Creation: Used Stable Diffusion to generate high-quality character renderings for critical moments in the animation.
  • Manual Refinement: Applied Photoshop adjustments to key frames to ensure character consistency and enhance emotional expression.
  • Environment Design: Generated background environments that would complement the character's emotional journey without distracting from it.
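One common way to push Stable Diffusion toward a consistent character across key frames is to lock a shared character descriptor (and, where the tooling allows, a fixed seed) and vary only the per-shot details. A sketch of that idea, with every descriptor and setting invented for illustration rather than taken from the project:

```python
# Character consistency via a locked base descriptor plus per-shot variation.
# All descriptor text and settings below are illustrative placeholders.
CHARACTER = "pale segmented homunculus, translucent skin, large amber eyes"
STYLE = "soft volumetric light, muted palette, film grain"
BASE_SEED = 1234  # reusing one seed across key frames further stabilizes identity

def keyframe_prompt(shot_detail: str) -> str:
    """Compose a key-frame prompt that keeps character and style terms fixed."""
    return f"{CHARACTER}, {shot_detail}, {STYLE}"

# These prompts would be passed to a Stable Diffusion pipeline with seed=BASE_SEED.
prompts = [keyframe_prompt(d) for d in ("curled inside shell", "limbs unfolding")]
```

Because only the middle clause changes, the model sees the same identity and style tokens in every key frame, which narrows its drift between shots.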
3. Video Creation & Performance Capture

With the visual elements established, I moved to creating the actual animation:

  • Runway Implementation: Utilized Runway's image-to-video capabilities, using the created key frames as starting and ending points for greater control.
  • Performance Recording: For the most emotionally intense sequence, I filmed myself performing the agonized movements and expressions to capture authentic emotion.
  • Video-to-Video Processing: Processed the performance footage through my established stylization workflow to maintain the performance while matching the visual style.
  • Video Segmentation: Created individual video segments following the storyboard structure to maintain control over each section of the narrative.
4. Composition & Post-Production

The final phase brought all elements together into a cohesive piece:

  • Sequence Assembly: Stitched together the various video segments according to the storyboard's narrative flow.
  • AI Cohesion Process: Applied my own AI post-processing pass across the assembled footage to unify the look of the segments and minimize stylistic inconsistencies.
  • After Effects Refinement: Added finishing touches in After Effects, including transitions, subtle effects, and stabilization.
  • Color Grading: Applied careful color grading to enhance mood, emphasize emotional states, and ensure visual consistency throughout the piece.
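Stitching the per-beat segments back together in storyboard order can be done with ffmpeg's concat demuxer before the After Effects pass. A sketch of that assembly step, with placeholder file names; it assumes all segments share the same codec, resolution, and frame rate, which stream copy requires:

```python
import subprocess
from pathlib import Path

def write_concat_list(segment_paths, list_path="segments.txt"):
    """Write an ffmpeg concat-demuxer file listing segments in storyboard order."""
    lines = [f"file '{Path(p).as_posix()}'" for p in segment_paths]
    Path(list_path).write_text("\n".join(lines) + "\n")
    return list_path

def stitch(segment_paths, output="homunculus_cut.mp4"):
    """Losslessly concatenate segments that share codec, resolution, and fps."""
    list_path = write_concat_list(segment_paths)
    subprocess.run(
        ["ffmpeg", "-f", "concat", "-safe", "0", "-i", list_path, "-c", "copy", output],
        check=True,
    )
```

Stream copy (`-c copy`) avoids a re-encode, so the generated segments reach color grading without an extra generation of compression artifacts.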

Results & Impact

The final animation successfully demonstrated the potential of AI in animation production, highlighting both technical innovation and emotional storytelling. My decision to invest additional time resulted in a polished piece that showcased the capabilities of AI-driven animation when guided by artistic vision and technical expertise.

Key Achievements

The project delivered several significant technical and creative accomplishments:

  • Character Consistency: Achieved exceptional character consistency throughout the animation—a notorious challenge in AI-generated video that was successfully overcome through my hybrid workflow approach.
  • Emotional Authenticity: Created genuinely expressive character performances by incorporating performance capture into the AI workflow, resulting in more nuanced and believable emotions.
  • Technical Innovation: Developed a novel approach combining multiple AI tools (Stable Diffusion, Runway) with traditional techniques (performance capture, After Effects) to overcome the limitations of any single method.
  • Visual Quality: Produced a final animation with minimal flickering and high visual fidelity, demonstrating professional standards despite the experimental nature of the technology.
Homunculus AI animation screenshot showing the emotive creature

The final "Homunculus" animation showcasing the metamorphosis from geometric form to emotive creature

"We genuinely appreciate your talent and the thoughtful approach you took in addressing the prompts."

— Shawn Hardin, Producer & Co-Director, Pale Blue Dot Films

Reflection & Learnings

This project provided valuable insights into the current capabilities and limitations of AI-driven animation, as well as strategies for overcoming those limitations to create emotionally resonant work.

What Worked Well

  • Hybrid Approach: Combining multiple AI tools with traditional techniques proved more effective than relying on any single method.
  • Key Frame Control: Using manually created/edited key frames to guide video generation significantly improved quality and consistency.
  • Performance Integration: Incorporating actual human performance for emotional sequences added authenticity that pure AI generation couldn't achieve.

Challenges & Solutions

  • Control Limitations: Text-to-video AI often produced unexpected results; I overcame this by using a more controlled hybrid approach with key frames.
  • Time Investment: Quality required significantly more time than suggested; I made the strategic decision to prioritize quality over strict time adherence.
  • Technical Constraints: Early AI video tools have various limitations; I developed workarounds through multi-stage processing and manual refinement.

Future Applications

  • Workflow Refinement: The hybrid workflow developed for this project could be further optimized for future AI animation projects.
  • Prompt Engineering: Learned specific prompt techniques that could be applied to future text-to-video AI projects for better results.
  • Emotional Performance: The performance capture approach could be expanded to create more nuanced character animations in future work.

Personal Takeaway

This project reinforced my belief that while AI tools offer incredible capabilities, they remain most powerful when guided by human creativity and technical expertise. The limitations I encountered with text-to-video generation pushed me to develop innovative workarounds, resulting in techniques I'll continue to refine in future projects. Though I generally avoid pure text-to-video approaches due to their limited control, this project forced me to develop techniques for getting more predictable results from these tools—skills that will prove valuable as AI video generation continues to evolve.