Search Results

Search Phrase = HeyGen


Main Site Search Results (6)

Bible Search Results (0)


Main Site Search Results

1: Automated Video Production Pipeline


Automated Video Production Pipeline


Description

This video guides you through setting up an automated video production pipeline, from selecting and testing brand voices using ElevenLabs to pairing them with digital avatars in HeyGen. By following the steps, you'll learn how to catalog and integrate voices, match them with visual characters, and generate preview videos for evaluation. Once you complete the video, you'll be able to efficiently create, test, and organize multiple spokesperson options for your brand's automated content generation. This process empowers you to streamline video production and build a scalable library of branded video assets.

 


Outcomes

Following are the key things you will be able to do after you watch this demo:

  • Identify suitable brand voices using generative AI tools.

  • Catalog and organize voice and avatar options for efficient selection.

  • Integrate third-party voices into video production platforms.

  • Pair voices with digital avatars to create compelling spokesperson combinations.

  • Generate and preview automated video content for evaluation.

  • Document and track production assets for streamlined workflow.

  • Select and finalize top spokesperson options for automated content generation.

 


Summary

  • Introduction to Automated Video Production Pipeline (00:00:00 – 00:00:59)
    Josh kicks off the demo by outlining the goal: selecting brand-aligned voices and digital doubles (either your own clone or hired actors), organizing those assets, and laying out the end-to-end steps needed to spin up a fully automated video production pipeline.

  • Content Sequencing Concept and Cloning (00:00:59 – 00:02:20)
    He explains the core idea of building a repeatable sequence of content—cloning a finished production over and over—so you can continually generate new videos by plugging different scripts into the same automated workflow.

  • Defining Digital Doubles and Voice Types (00:02:20 – 00:03:11)
    Josh clarifies terminology (digital twin vs. digital double), walks through the two main “buckets” of voice assets (personality-based clones vs. spokesperson avatars), and discusses how to mix and match them depending on your brand needs.

  • Selecting Platforms for Generative AI and Deployment (00:03:11 – 00:04:00)
    He emphasizes the importance of vetting your generative-AI tools—voice engines and video avatars—and making sure they’re compatible with your target platforms before committing to any given solution.

  • Brand-Focused Workflow and SRT Utilization (00:04:00 – 00:05:25)
    Josh decides to focus on one streamlined method for this demo, using a single SRT transcript file as the “source of truth” for automation—underscoring that a clean, well-formatted SRT is absolute gold when you’re architecting an automated pipeline.

  • Importing SRT and Leveraging Automation (00:05:25 – 00:07:40)
    He shows how to import the SRT into the voice-generation platform, highlighting how the time-coded script drives every subsequent step—from audio rendering to scene assembly.
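
Since the time-coded SRT drives every later step, it helps to see its shape in code. A minimal parser sketch (hypothetical; the video itself shows no code) that turns SRT blocks into cues a pipeline can iterate over:

```python
import re

def parse_srt(text):
    """Parse an SRT transcript into (start, end, text) cues.

    Times are returned in seconds so downstream steps (audio
    rendering, scene assembly) can align to the same timeline.
    """
    def to_seconds(ts):
        h, m, rest = ts.split(":")
        s, ms = rest.split(",")
        return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000

    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip malformed blocks
        start, end = [to_seconds(t.strip()) for t in lines[1].split("-->")]
        cues.append((start, end, " ".join(lines[2:])))
    return cues
```

A clean SRT in, a list of timed cues out; everything after this step can key off those timestamps.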

  • Setting Up Voice Design in ElevenLabs (00:07:40 – 00:11:49)
    A step-by-step walkthrough of testing voice presets, tweaking text lengths, integrating third-party voices, and crafting voice-design prompts to nail down the exact tone and style you want.

  • Managing Credits and Reviewing Generated Audio (00:11:49 – 00:15:46)
    Josh demonstrates how to monitor and conserve your generation credits, preview the rendered audio, swap out placeholder text, and ensure you’re only spending resources on polished clips.

  • Applying Voiceover and Text Overlays to Video (00:15:46 – 00:19:08)
    He attaches the finalized voice track to the video timeline, adds and styles text overlays (centering, contrast adjustments), and assembles the basic video composition ready for export.

  • Enhancing Prompts with AI Tools for Voice Design (00:19:08 – 00:22:04)
    Introduces additional AI utilities for brainstorming and refining your voice-design prompts—showing how to iterate until you get a sample that truly matches your brand voice.

  • API Key Handling and Asset Export Configuration (00:22:04 – 00:27:28)
    A practical guide on securely copying your ElevenLabs API key, configuring export settings (e.g., 4K output), and organizing all generated files into branded folders for easy access.
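
The API-key step above pairs well with keeping the key out of your scripts entirely. A sketch that reads the key from an environment variable, assuming ElevenLabs' public text-to-speech endpoint and `xi-api-key` header (the helper name and return shape are my own):

```python
import os

# Public ElevenLabs API base at the time of writing; confirm against the docs.
API_BASE = "https://api.elevenlabs.io/v1"

def build_tts_request(voice_id, text, model_id="eleven_multilingual_v2"):
    """Assemble a text-to-speech request without hard-coding the key.

    Reading ELEVENLABS_API_KEY from the environment keeps the key out
    of scripts, exports, and version control.
    """
    key = os.environ.get("ELEVENLABS_API_KEY")
    if not key:
        raise RuntimeError("Set ELEVENLABS_API_KEY before rendering audio")
    return {
        "url": f"{API_BASE}/text-to-speech/{voice_id}",
        "headers": {"xi-api-key": key, "Content-Type": "application/json"},
        "json": {"text": text, "model_id": model_id},
    }
```

The dict can be passed straight to an HTTP client; the point is that the key lives in the environment, not the pipeline files you organize into branded folders.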

  • Frame Rate Considerations and Quality Checks (00:27:28 – 00:31:42)
    Notes the default 25 fps setting, explains how frame rate impacts perceived motion, and walks through checking your export quality to avoid any unexpected artifacts.

  • Avatar Adjustments, Project Naming, and Fallbacks (00:31:42 – 01:05:16)
    Josh covers fine-tuning avatar scale and positioning, updating project names for consistency, and setting up fallback workflows if you need to swap voices or visuals mid-pipeline.

  • Avatar Replacement and Cataloging (00:31:42 – 00:34:06)
    Pair your chosen voice with visuals by replacing the default avatar, browsing through the 21 “looks” in each category, using the snipping tool to capture promising thumbnails, and logging each candidate’s name and category in your tracking spreadsheet.

  • Avatar Testing and Video Formatting (00:34:07 – 00:36:24)
    Brainstorm voice–visual combinations (e.g. “August”), select a portrait-mode avatar, preview the static image, upload any custom avatars into the pipeline, drag your source video beneath the avatar layer, and confirm the composition and framing.

  • Voice-Avatar Sync and Quality Comparison (00:36:24 – 00:37:39)
    Generate audio samples to compare HeyGen vs. ElevenLabs quality, force-refresh the clip to confirm it’s using the intended voice (e.g. Ryan Kirk), and watch for the spinning indicator to verify successful render.

  • Preview Generation and File Labeling (00:38:10 – 00:39:11)
    Render a 4K preview of the voice-avatar pairing, then label the export asset with your convention (e.g. 001_RyanKirk_CharlieAvatar) so each test remains organized and easily identifiable.
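
The 001_RyanKirk_CharlieAvatar convention can be enforced with a tiny helper so every export label comes out identical; hypothetical code, not from the video:

```python
def export_name(test_number, voice, avatar):
    """Build an export label like 001_RyanKirk_CharlieAvatar.

    Zero-padding keeps files sorted in render order; stripping
    spaces keeps labels filesystem-safe across platforms.
    """
    def clean(s):
        return "".join(s.split())
    return f"{test_number:03d}_{clean(voice)}_{clean(avatar)}"
```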

  • Pipeline Duplication for Variant Testing (00:39:11 – 00:41:15)
    Duplicate the entire sequence to create “Test 002,” swap in a new avatar (such as Colton), explore lifestyle/UGC categories, and note how background removal and frame size affect the final look.

  • Background Removal and Frame Adjustments (00:41:15 – 00:42:32)
    Apply the background-remover tool to avatars with built-in backgrounds, observe any cut-offs (like arms being cropped), tweak the canvas framing, and decide between static vs. transparent backgrounds based on brand needs.

  • Third-Party Voice Integration Workflow (00:42:32 – 00:44:03)
    In the “My Voices” tab, toggle on integrated voices (e.g. Charlie), heart your favorites so they surface first, preview each sample, and ensure the API integration is active before proceeding.

  • Voice Audition Labeling and Mood Board Documentation (00:44:03 – 00:47:09)
    Name each audition (e.g. 002_CharlieAvatar), update your mood board with snipped thumbnails, record which browser tab or category each came from, and keep this documentation up to date for reproducibility.

  • Frame Rate and Credit Management (00:47:09 – 00:48:06)
    Note the default 25 fps setting—mismatches can cause audio sync issues—toggle off “Avatar 4” if you’re on an unlimited plan, and monitor your generation credits to avoid unexpected limits.
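
The sync issue from a frame-rate mismatch is easy to quantify. A toy calculation (my own illustration, not from the video): footage shot at one rate but interpreted on a timeline at another stretches or compresses playback, pulling picture away from audio.

```python
def sync_drift_seconds(n_frames, source_fps, timeline_fps):
    """Seconds of picture drift when n_frames recorded at source_fps
    are interpreted on a timeline running at timeline_fps.

    Positive means the picture runs long relative to the audio.
    """
    return n_frames / timeline_fps - n_frames / source_fps
```

One minute of 30 fps footage (1,800 frames) dropped onto a 25 fps timeline already drifts by 12 seconds, which is why checking the default matters.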

  • Styling and Folder Organization (00:48:06 – 00:49:29)
    Adjust text overlay colors to maintain contrast (match your brand palette), create new folders for each batch, and standardize your output directory structure so you know exactly where each rendered clip lives.

  • Option Preview and Cataloging Workflow (00:49:30 – 00:55:51)
    Refresh thumbnails, scroll through voice-avatar combos, assign option numbers, screenshot grids of candidates, and log each pairing’s status (“Yes,” “Maybe,” “No”) in your spreadsheet.
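
The "Yes," "Maybe," "No" catalog above is just tabular data, so a plain CSV keeps it portable outside any one tool. A sketch with hypothetical column names (the video only shows a spreadsheet):

```python
import csv
import io

# Hypothetical column names; adapt to your own tracking sheet.
FIELDS = ["option", "voice", "avatar", "status"]

def write_catalog(pairings):
    """Serialize (option, voice, avatar, status) rows to CSV text,
    where status is "Yes", "Maybe", or "No" as in the demo."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(FIELDS)
    writer.writerows(pairings)
    return buf.getvalue()

def shortlist(pairings):
    """Return only the definite-yes pairings for the next batch."""
    return [p for p in pairings if p[3] == "Yes"]
```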

  • Iteration Process and Consistency Notes (00:55:51 – 00:57:23)
    Always regenerate every variation (never reuse stale renders), note any limitations (e.g. animated text can cover on-screen elements), and keep your naming and documentation consistent so the pipeline remains bullet-proof.

  • Ranking Options and Visual Separators (00:57:24 – 01:02:40)
    Introduce visual separators in your catalog (e.g. blank rows), rank the top voice-avatar combos, screenshot your “definite yes” list, and preserve those as templates for future batches.

  • Additional Voice Integration: Amelia (01:02:40 – 01:04:33)
    Search for “Amelia” in your voice library, verify whether it’s built-in or needs third-party integration, add it to favorites, preview the sample, and record its ID for consistent reuse.

  • Final Voice Candidate Integration (01:04:33 – 01:05:16)
    Confirm Amelia’s render, then search for any last candidates (e.g. “Analore”), heart and test them, catalog the results, and ensure each new voice is fully integrated into the pipeline.

  • Pipeline Finalization and Duplication for Scale (01:05:16 – 01:08:34)
    In closing, he recaps that once you’ve chosen your voices and avatars, you can literally duplicate this entire process—scripts, audio, video, assets—to churn out a full social-media content library on autopilot.



2: Generative AI Audio Clone




Description

In this video, Josh Lomelino demonstrates how to create an AI-powered digital voice replica using ElevenLabs, enabling content creators to rapidly generate high-quality audio and video content at scale. By training the system with a consistent audio sample, users can produce automated voice performances that sound like their own, allowing them to create lectures, demos, and other content quickly and efficiently. The method involves uploading 1-3 hours of controlled audio recordings, fine-tuning voice settings, and integrating with platforms like HeyGen to automate video production. After watching this tutorial, viewers will be able to develop their own AI voice clone, streamline content creation, and overcome time constraints by generating multiple scripts and videos with minimal manual effort.


Outcomes

Here are the key things you will be able to do after you watch this demo:

  1. Train an AI voice synthesis system using personal audio recordings

  2. Generate consistent voice replicas with controlled audio samples

  3. Optimize AI-generated voice settings for natural-sounding output

  4. Integrate voice cloning technology with video production platforms

  5. Create automated content at scale using text-to-speech technologies

  6. Manage AI voice generation credits efficiently

  7. Export and store audio files in multiple formats for different applications

  8. Prototype and refine scripts using AI voice technology

  9. Develop a workflow for rapid content creation across lectures, demos, and presentations

  10. Leverage AI tools to overcome time constraints in content production


 

Summary

  • Creating a Voice Replica Using AI 0:09

    • Josh Lomelino discusses the use of AI-powered voice synthesis to create a voice replica, emphasizing the challenge of matching human recordings.

    • He highlights the effectiveness of using text prompts to quickly prototype, test, and revise scripts or generate finished audio files.

    • Josh mentions his preference for the ElevenLabs tool, which offers a Studio mode for producing longer-form audio tracks.

    • He shares his initial struggles with the tool and how contacting their support provided helpful suggestions.

  • Training the System for Consistent Output 1:24

    • Josh explains the importance of training the system with a consistent audio sample to avoid unnatural variations in volume and tone.

    • He describes his initial mistake of using diverse recordings from different sessions, which led to inconsistent results.

    • Josh emphasizes the need for a controlled environment with a single, consistent audio sample for better results.

    • He plans to demonstrate the settings that produce the best results for replicating his voice in the user interface.

  • Optimizing Generated Audio Files 2:56

    • Josh advises generating audio sparingly to avoid exhausting monthly credits and recommends starting with smaller sections of text.

    • He explains the process of refining the output and generating both WAV and MP3 audio files for different applications.

    • Josh mentions the importance of storing both WAV and MP3 files for secure storage and project organization.

    • He notes that it may take several attempts to develop a method that works well for the user.

  • Exporting and Integrating Audio Files 4:19

    • Josh describes two methods for uploading audio files to virtual avatars: exporting both WAV and MP3 versions or integrating the ElevenLabs API directly with HeyGen.

    • He prefers using the WAV file for higher quality and to avoid double compression but acknowledges the need to export MP3 for larger tracks.

    • Josh explains the integration of the ElevenLabs API with HeyGen, which allows for rapid development of prototypes and large volumes of content.

    • He mentions the need to break up scripts into manageable sections for efficient processing by the software.

  • Automating Video Production with AI 6:02

    • Josh discusses the ability to produce videos at scale by automating both audio and video avatars from text.

    • He highlights the productivity gains from using AI to generate video scripts and produce audio and video automatically.

    • Josh notes the cost of AI-generated voice and the strategy of using high-quality audio only when necessary.

    • He explains the use of draft versions of scripts with HeyGen's voice replica to refine the script without incurring additional costs.

  • Finalizing and Exporting Scripts 8:04

    • Josh describes the process of finalizing scripts and either reading and recording them manually or using the ElevenLabs integration within HeyGen.

    • He mentions the use of a side-by-side display setup with a Google document and video avatar performance for quick edits.

    • Josh emphasizes the usefulness of this method for high-end projects that require detailed polishing and iteration.

    • He concludes the demo by encouraging the use of digital voice replicas to scale beyond time constraints and improve productivity.

 



3: Batch Producing Avatars


Keywords: batch, avatar, digital double, production, lighting, setup, color correction, video editing, project, HeyGen, encoder




Description

In this tutorial, Josh Lomelino demonstrates a comprehensive workflow for efficiently batch producing multiple virtual avatars with consistent lighting and color quality. Viewers will learn how to set up precise video editing project settings, create a master sequence with multiple camera angles, and use Adobe Media Encoder to render individual clips for avatar training. The technique allows content creators to scale their avatar production, quickly export multiple versions of their digital doubles, and maintain a well-organized project structure that enables future edits and refinements. By following this method, users can streamline their avatar creation process, saving significant time and producing high-quality, professional virtual representations.


Outcomes

Following are the key things you will be able to do after you watch this demo:

  1. Configure video editing project settings to match camera specifications

  2. Create a systematic numbering and organization system for avatar sequences

  3. Set up multiple camera angles within a single project

  4. Use Adobe Media Encoder to batch render avatar clips

  5. Export individual video files for virtual avatar training

  6. Implement color correction and LUT modifications across multiple clips

  7. Organize project files for efficient content production

  8. Develop a scalable workflow for mass avatar creation

  9. Troubleshoot and remove performance anomalies in avatar recordings

  10. Back up and preserve digital asset production files


 

Summary

  • Setting Up Lighting and Color Values 0:08

    • Josh Lomelino explains the importance of setting up lighting and color values once to achieve consistent results over time.

    • He emphasizes the need to test lighting and color values before batch producing a group of avatars.

    • Josh mentions the flexibility to make further adjustments later using LUT color modifications or color correction tools.

    • The workflow allows for the efficient production of 10 to 50 avatars, ensuring visual polish from the start.

  • Consistency in Project Settings 1:42

    • Josh highlights the necessity of matching video editing project settings to the specifications of the recording camera.

    • He provides an example of setting up a project for a Logitech 4k camera and ensuring consistency in frame size and frame rate.

    • Josh advises checking file properties to extract frame size and frame rate if unsure.

    • Consistency in project settings is crucial for mass producing different clips.

  • Creating a Master Sequence 2:59

    • Josh sets up a master sequence to serve as a template for duplicating sequences as needed.

    • He uses a clear numbering system for sequences, labeling each avatar with a specific outfit and camera angle.

    • Examples include Avatar 001, direct address, no hands, and Avatar 001, three-quarter view.

    • Josh organizes sequences in a dedicated folder called a bin for project organization.

  • Batch Rendering with Adobe Media Encoder 4:56

    • Josh explains the process of adding clips to a Batch Render Queue using Adobe Media Encoder.

    • He selects in and out points for each camera angle, creating dedicated files for each angle.

    • Josh configures the encoder to render only the specified in and out range on the timeline.

    • Each camera angle should be exported as an individual MP4 file, specifying the folder location and file name.
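
The same per-angle export can be scripted outside Media Encoder. A sketch (my own command-line alternative, not the workflow Josh demonstrates) that turns labeled in/out timecodes into ffmpeg commands, one MP4 per camera angle:

```python
def ffmpeg_cut_commands(source, ranges):
    """Build one ffmpeg command per camera-angle range.

    `ranges` holds (label, in_tc, out_tc) tuples with HH:MM:SS
    timecodes; each range becomes its own MP4, mirroring the
    one-file-per-angle export in the demo. Output seeking
    (-ss/-to after -i) trades speed for frame accuracy.
    """
    return [
        f"ffmpeg -i {source} -ss {in_tc} -to {out_tc} "
        f"-c:v libx264 -c:a aac {label}.mp4"
        for label, in_tc, out_tc in ranges
    ]
```

Feeding the commands to a shell or subprocess call gives a repeatable batch render that also preserves the naming system.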

  • Finalizing and Organizing Project Files 6:40

    • Josh emphasizes the importance of organizing project files, including original source files, rendered clips, and project files.

    • He advises saving the video editing project frequently as a fail-safe for future edits.

    • Josh highlights the need to review source footage for any performance anomalies and correct them.

    • The workflow allows for the removal of outdated avatars and recreation without problematic movements.

  • Backing Up and Scaling Content Production 8:25

    • Josh frequently backs up his entire project folder by compressing it into a zip file for disaster recovery.

    • He mentions the time investment upfront to create polished assets and resolve hiccups.

    • Josh advises starting with manual methods and gradually scaling to more advanced techniques.

    • The well-organized project structure saves time, enables content production scaling, and supports high-performance results.




4: Automate Everything with Text Prompt


Keywords: Automated, performance, text, video, Otter AI, voice clone, ElevenLabs, HeyGen, audio, multilingual




Description

In this video, Josh demonstrates how to create fully automated video performances directly from text using tools like Otter AI, ElevenLabs, and HeyGen. Viewers will learn how to generate high-quality voice clones, prototype video scripts, and produce professional-looking content with minimal effort by leveraging AI-powered voice and video generation technologies. The workflow allows content creators to transform written or spoken text into polished video presentations quickly and efficiently. By following Josh's method, users can generate multiple video iterations, edit audio precisely, and create digital avatars that replicate their voice and performance with remarkable accuracy.


Outcomes

Following are the key things you will be able to do after you watch this demo:

  1. Generate video scripts from transcribed audio using AI tools

  2. Create high-quality voice clones with consistent audio recordings

  3. Prototype video content using free and paid AI platforms

  4. Optimize voice training for digital avatars

  5. Manage content production across multiple AI environments

  6. Edit audio tracks with minimal credit consumption

  7. Develop a systematic workflow for automated video creation

  8. Replicate personal performance using digital voice technology

  9. Transform text-based content into professional video presentations

  10. Implement cost-effective strategies for video and audio generation


 

Summary

  • Creating a Fully Automated Performance from Text 0:08

    • Josh Lomelino explains the process of creating a fully automated performance directly from text, including generating audio prompts using Otter AI.

    • He describes how he brainstorms ideas while walking and exports the subtitle transcript file, SRT, to process it with AI tools like Claude or ChatGPT.

    • Josh mentions breaking up long scripts into manageable blocks of 1800 characters and generating a year's worth of content for various platforms.

    • He emphasizes the use of text, whether written manually or spoken and transcribed, to craft a video script using two primary methods.
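
The 1,800-character block splitting mentioned above can be sketched as a small helper that breaks at sentence ends so no block cuts mid-sentence (the limit comes from the video; the function itself is my own):

```python
import re

def split_script(text, limit=1800):
    """Split a script into blocks of at most `limit` characters.

    Breaking at sentence boundaries keeps each block readable when
    pasted into a voice or video generator on its own. A single
    sentence longer than `limit` is kept whole rather than cut
    mid-word.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    blocks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > limit:
            blocks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        blocks.append(current)
    return blocks
```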

  • Generating High-Quality Voice Clones 1:51

    • Josh discusses creating a high-quality voice clone using ElevenLabs, initially finding the results artificial but later perfecting the settings.

    • He highlights the importance of using a consistent audio clip for training the voice digital double, ideally around three hours of spoken audio.

    • Josh explains the challenges of recording consistently for three hours and how he stitches together previous demo recordings to create a large audio clip.

    • He stresses the need for meticulous tracking of audio settings to ensure uniformity and avoid sudden changes in volume or tonal quality.

  • Optimizing Audio Recording for Consistency 3:36

    • Josh shares his experience of recording multiple live sessions with an audience, which infused the audio with personality and energy.

    • He explains the importance of having consistently dialed-in audio for generating a high-quality performance, as the AI listens to everything in the audio track.

    • Josh mentions the time and cost involved in using ElevenLabs, which can take six to eight hours to analyze a voice and build a model.

    • He advises against using cheaper models, such as the Multilingual v1 model or Turbo v2.5, and recommends upgrading to the Multilingual v2 model for better results.

  • Using HeyGen for Cost-Effective Prototyping 5:35

    • Josh introduces HeyGen as an alternative for creating generative content when ElevenLabs burns through credits too quickly.

    • He explains how he trains HeyGen on his voice by uploading a 10- to 15-minute audio clip and generates unlimited videos for free, depending on the subscription plan.

    • Josh describes the process of creating prototypes, making real-time adjustments to the script, and rendering multiple takes.

    • He mentions using his phone in split-screen mode while walking to make adjustments on the fly and then copying and pasting the revised script into HeyGen.

  • Switching Between HeyGen and ElevenLabs 7:44

    • Josh explains how he can switch the voice in HeyGen to the high-quality production voice in ElevenLabs with the click of a button.

    • He highlights the downside of using HeyGen, which is the risk of losing all credits if there are issues with the audio track in the final video.

    • Josh prefers using the Studio tool in ElevenLabs for targeted editing, which allows regenerating just portions of the audio without redoing the entire clip.

    • He mentions the benefit of being able to download the WAV file and MP3 file from the Studio tool in ElevenLabs as a fail-safe.

  • Organizing Video Production Phases 9:21

    • Josh describes his workflow of treating production as two phases: the cheap, free voice phase and the final phase.

    • He explains the process of pasting the text directly into the HeyGen editor, listening to the prototype, and resolving issues before creating a new file in HeyGen.

    • Josh organizes his videos into two folders, a prototype folder and a final folder, so his methods stay easy to navigate.

    • He mentions using the Multilingual v2 model for cost-effective throwaway tests and training his voice with HeyGen for free prototyping.

  • Leveraging Digital Doubles for High-Quality Videos 10:34

    • Josh shares how he uses his digital doubles to replicate a performance of his voice and generate a corresponding video composite.

    • He explains how he creates a script using Otter AI during a walk, copies and pastes it into his automated workflow, and produces a high-end video with minimal effort.

    • Josh highlights the benefits of this workflow, which allows him to deliver excellence without skipping a beat, even when small inconsistencies would have derailed the process before.

    • He concludes by mentioning the next steps in the following videos, which will cover adding automated visual elements on screen behind the virtual avatar.



5: AI Tools Overview and Links


AI Tools Overview and Links


Otter AI

Otter AI is a powerful transcription and collaboration tool that solves one of the biggest bottlenecks for membership owners and content creators: turning raw ideas and recordings into publish-ready content quickly. Instead of spending hours manually transcribing podcasts, coaching calls, or brainstorming sessions, Otter automatically converts audio into accurate, searchable text that can be repurposed into blog posts, course modules, captions, or marketing emails. For creators juggling multiple platforms and constant content demands, Otter removes the friction of documentation and frees up time to focus on engaging their audience, scaling their community, and generating revenue.

Otter AI Affiliate Link Signup (use this link)

 


HeyGen

HeyGen is an AI video creation platform that eliminates the need for expensive equipment, on-camera talent, and complex editing—solving a major pain point for membership owners and content creators who need consistent, professional-looking videos to engage their audiences. With HeyGen, you can instantly turn scripts into high-quality talking-head videos using realistic AI avatars, complete with voiceovers and multilingual capabilities. This allows creators to scale their content output, personalize training or marketing messages, and maintain a polished brand presence without the cost or time traditionally required for video production.

HeyGen Affiliate Link Signup (use this link)

 


ElevenLabs

ElevenLabs is an advanced AI voice generation platform that solves the challenge of producing high-quality, natural-sounding audio for membership owners and content creators without the need to sit in a chair and record your voice over and over. It allows creators to instantly convert written content—like course modules, podcasts, or marketing scripts—into realistic, human-like narrations in multiple voices and languages. This not only speeds up content production but also ensures a consistent, professional sound across all audio materials, helping creators deliver a polished experience that builds trust, increases engagement, and scales their content library effortlessly.

ElevenLabs Affiliate Link Signup (use this link)




6: Create HeyGen Templates


Create HeyGen Templates


HeyGen



Bible Search Results

There are no Bible search results.