
FluxNote Upload Assets: How to Add Your Images, Audio, and Brand Files in 3 Minutes

You can upload images, logos, and audio samples directly into FluxNote to personalize your AI videos. The platform accepts common formats including JPG, PNG, MP3, and MP4, with a 100MB limit per file and no total storage cap, even on the Free plan. Your first branded video takes about 3 minutes from upload to export.

Last updated: May 14, 2026

Why FluxNote's asset system is built for creators, not enterprises

Most AI video platforms treat asset upload as an afterthought, burying it in complex project management interfaces. FluxNote starts from the opposite assumption: you have a logo, a brand color, and a voice sample you want to use right now.

Our upload flow is a single drag-and-drop zone accessible from every video creation screen, with no digging through settings. The technical limit is 100MB per file, which covers short 4K source clips, high-res logos, and full-length audio tracks.

We don't count these uploads against your plan's video or image credits; storage is unlimited because your brand assets shouldn't be a metered resource. This reflects our core design principle: tools for creation should get out of the way.

When you upload a face photo for PuLID identity locking or a logo for watermarking, that asset is available in your personal library within 2 seconds. There's no 'processing' delay for standard formats.

The system automatically generates optimized versions for preview while retaining the original file for final render. This is why our time-to-first-video metric is ~3 minutes—uploading your assets doesn't reset the clock.

Supported formats: What you can (and can't) upload today

FluxNote accepts the following file formats for upload:

  • Images: JPG, PNG, WebP (max 100MB).
  • Video: MP4, MOV, AVI (max 100MB, 60 seconds for source clips).
  • Audio: MP3, WAV, M4A (max 100MB).
  • Text: SRT for subtitles, TXT for script import.

We do not support PSD, AI, EPS, or other layered design files—you must export to PNG or JPG. This is intentional.

FluxNote is a video generation platform, not a design suite. If you need to edit a complex graphic, use Figma or Photoshop, then upload the final flat image.
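The format list and size limit above can be checked locally before you drag a file into the upload zone. The following is a minimal Python sketch, not FluxNote's own validation code: the function name `check_upload`, the treatment of `.jpeg` as JPG, and reading "100MB" as 100 * 1024 * 1024 bytes are all assumptions made for illustration.

```python
from pathlib import PurePath

# Assumption: "100MB" is read as 100 MiB; the guide does not say which unit is meant.
MAX_FILE_BYTES = 100 * 1024 * 1024

# Extensions listed in this guide, grouped by asset type.
# ".jpeg" is assumed to be treated the same as ".jpg".
SUPPORTED = {
    "image": {".jpg", ".jpeg", ".png", ".webp"},
    "video": {".mp4", ".mov", ".avi"},
    "audio": {".mp3", ".wav", ".m4a"},
    "text": {".srt", ".txt"},
}


def check_upload(filename: str, size_bytes: int) -> tuple[bool, str]:
    """Return (True, asset_type) if the file looks uploadable, else (False, reason)."""
    if size_bytes > MAX_FILE_BYTES:
        return False, "file is larger than 100MB"
    ext = PurePath(filename).suffix.lower()
    for kind, extensions in SUPPORTED.items():
        if ext in extensions:
            return True, kind
    return False, f"'{ext or 'no extension'}' is not on the supported list"


if __name__ == "__main__":
    print(check_upload("logo.png", 2_000_000))
    print(check_upload("brand.psd", 2_000_000))
```

A layered file such as `brand.psd` is rejected by the sketch for the same reason the guide gives: export it to PNG or JPG first.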

For voice cloning, we accept a clean MP3 or WAV sample at least 30 seconds long. The system extracts the voice profile in about 60 seconds, and that cloned voice then joins your library of 350+ ElevenLabs voices and 13 OpenAI voices.
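If your sample is a WAV file, you can confirm it meets the 30-second minimum before uploading. This is a small sketch using only Python's standard `wave` module; the helper names are hypothetical, and MP3 files are not covered because the standard library cannot read them.

```python
import wave

MIN_SAMPLE_SECONDS = 30.0  # minimum sample length stated in this guide


def wav_duration_seconds(source) -> float:
    """Duration of a WAV file, given a path or an open binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(source, minimum: float = MIN_SAMPLE_SECONDS) -> bool:
    """True if the WAV sample is at least `minimum` seconds long."""
    return wav_duration_seconds(source) >= minimum
```

For example, `is_long_enough("my_voice.wav")` returns `False` for a 20-second recording, which tells you to record a longer take before uploading. The length check says nothing about background noise, so a clean recording still matters.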

One common question: can you upload a font file? Not directly. You select from our 8+ animated caption styles (karaoke, kinetic, word-by-word) but cannot upload custom TTF or OTF files.

This is a platform limitation we're transparent about. If brand typography is non-negotiable, you would need to add text in post-production using a video editor.

For most users creating faceless videos, UGC-style ads, or social reels, our built-in fonts cover the aesthetic needs.

Walkthrough: Uploading your brand kit in 5 numbered steps (3 min total)

Here’s the exact process to go from nothing to a video with your branded assets.

Step 1: After signing up (no credit card required), click 'New Video' in the top right.

Step 2: In the script editor, scroll down to the 'Assets' panel below the video preview. You'll see three icons: 'Upload Image', 'Upload Audio', 'Upload Video'.

Step 3: Drag and drop your logo file (PNG with transparent background works best). The system confirms upload with a checkmark.

Step 4: Click the 'Brand' tab in the same panel. Here you can set a default watermark position (top left, bottom right) and pick a brand color hex code. This color will auto-apply to caption styles.

Step 5: To use an uploaded asset, simply click on it in the library while in the 'Scene' editor. Drag it onto the timeline or set it as a background.

Total time: under 3 minutes for first-time users.
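Step 4 asks for a brand color as a hex code. If you copy the value from a style guide or design file, it can help to normalize it first. The sketch below is illustrative only: `normalize_hex` is a hypothetical helper, and whether the Brand tab accepts 3-digit shorthand or a missing `#` is not stated in this guide, so the function converts everything to the 6-digit `#RRGGBB` form.

```python
import re

# Optional leading "#", then either 3 or 6 hexadecimal digits.
_HEX_COLOR = re.compile(r"#?([0-9a-fA-F]{3}|[0-9a-fA-F]{6})")


def normalize_hex(color: str) -> str:
    """Return the color as '#RRGGBB' in uppercase, or raise ValueError."""
    match = _HEX_COLOR.fullmatch(color.strip())
    if match is None:
        raise ValueError(f"not a hex color: {color!r}")
    digits = match.group(1)
    if len(digits) == 3:
        # Expand shorthand: "fa0" becomes "ffaa00".
        digits = "".join(ch * 2 for ch in digits)
    return "#" + digits.upper()
```

With this helper, both `1a2b3c` and `#1A2B3C` come out as `#1A2B3C`, so the same value is entered every time.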

For voice cloning, the process is separate: go to 'Voice Library' in the main dashboard, click 'Add Clone', upload your MP3, and name it. The clone is ready in ~60 seconds.

All uploaded assets are available across every plan, including Free. There is no 'brand kit' premium upsell—this is core functionality.

Privacy, storage, and what happens to your files

A common worry: 'Are my uploaded assets used to train AI models?' No. FluxNote does not use your uploaded images, audio, or video clips to train any of our 11 AI video models or 19 AI image models.

Your assets are stored encrypted at rest and are only processed to generate the videos you request. When you delete a file from your library, it is removed from our active servers and backup rotation within 30 days.

We don't sell, license, or analyze your content for any purpose other than providing the FluxNote service. This is a contractual commitment in our Terms of Service.

For teams concerned about compliance: we accept Data Processing Agreements (DPAs) for Pro and Max plan subscribers.

Regarding storage limits: there are none. You can upload 100GB of source footage if you need to, though we recommend keeping your library organized. The 100MB per-file limit and the 60-second cap on source clips are there for processing stability. If you need to use a longer source video, split it into clips under 60 seconds, each under 100MB.
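Splitting a longer source video comes down to simple arithmetic: choose the smallest number of equal-length clips that each stay under the limit. The sketch below assumes a 59-second target to leave a one-second margin under the 60-second cap; the function name `clip_ranges` and the margin are illustrative choices, not FluxNote settings.

```python
import math


def clip_ranges(duration_s: float, max_clip_s: float = 59.0) -> list[tuple[float, float]]:
    """Split a video of `duration_s` seconds into equal (start, end) ranges.

    Each range is at most `max_clip_s` seconds long. The default of 59.0 is an
    assumed safety margin under the guide's 60-second source-clip limit.
    """
    if duration_s <= 0 or max_clip_s <= 0:
        raise ValueError("durations must be positive")
    count = math.ceil(duration_s / max_clip_s)
    length = duration_s / count
    return [(i * length, min((i + 1) * length, duration_s)) for i in range(count)]


if __name__ == "__main__":
    for start, end in clip_ranges(150):
        print(f"{start:.1f}s to {end:.1f}s")
```

A 150-second source becomes three 50-second clips rather than two 59-second clips and a 32-second remainder. The ranges can then be entered as trim points in whichever video editor you use.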

All asset uploads occur over HTTPS. We don't require biometric data or government ID for voice cloning, which simplifies privacy concerns for most creators.

Your cloned voice is only accessible from your account.

Pro workflows: Chaining uploads with AI models for custom output

Uploading an asset is rarely the end goal. The power is in combining your uploads with FluxNote's AI models.

Example 1: Use an uploaded product photo with the 'Image to Video' animation feature. Select the image, then choose a model like Sora 2 Pro or Veo 3.1 to generate a 5-second product reveal.

Example 2: Upload a 10-second clip of yourself talking. Use the PuLID face identity model to lock that face onto an AI-generated presenter in a different video. This creates a consistent avatar without daily filming.

Example 3: Upload a background music track (MP3) and a logo. In a studio template like 'Business Reels,' set the logo as a watermark and the audio as the soundtrack. The AI will generate a video matching the beat of your track.

The key is that uploaded assets become first-class inputs to the AI, not just overlays. When you upload an image, the system reads its composition and can suggest complementary styles from models like FLUX 2 Pro or GPT Image 2 for additional scenes.

This turns your brand library into a creative partner. For faceless video creators, uploading a set of 5-10 stock image backgrounds once means every new video can have consistent visual branding without searching for assets.

Troubleshooting: When uploads fail or don't look right

If your upload fails, check these points in order:

  1) File size under 100MB.
  2) Format is on the supported list (e.g., not a HEIC photo from iPhone; convert to JPG).
  3) Internet connection is stable.

If the upload progresses but then stalls, refresh the page. Your partially uploaded file is saved in draft state.

If an uploaded image looks blurry in the final video, the issue is usually resolution. For best results with AI expansion, upload images at least 1920x1080 pixels. The AI can upscale, but starting with low-res assets yields poor results.

If your uploaded audio (voice clone sample) is rejected, ensure it's at least 30 seconds of clear speech with minimal background noise. A podcast excerpt or meeting recording usually works. If your cloned voice sounds off in the video, try re-uploading a cleaner sample, since the model is sensitive to audio quality.

For watermark positioning: if your logo appears cropped, re-upload a PNG with transparent padding around the design. The system treats the non-transparent area as the logo boundary.

Still stuck? Use the in-app chat support (all plans) or email. Max plan users get priority queue for support tickets.

We resolve 95% of upload issues within one business day.

Pro Tips

  • Upload a PNG logo with transparent background—the system auto-sizes it for watermarking without white boxes.
  • For voice cloning, record your sample in a quiet room on a phone. A 30-second clear monologue works better than a 2-minute noisy conversation.
  • Organize assets into projects immediately. Use the 'Project' folder system to keep logos, audio, and videos for each client or channel separate.
  • If you hit the 100MB file limit for a video source, use a free tool like HandBrake to trim the clip to under 60 seconds and compress it to under 100MB with minimal visible quality loss.
  • Set your brand color hex code in the Brand tab once. It will apply to all new videos automatically, saving manual color picking per scene.
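When compressing a clip to fit the 100MB limit, the target bitrate follows from the file size and duration. The sketch below works that out under stated assumptions: 1 MB is taken as 1,000,000 bytes, audio is assumed to use 128 kbps, and 5% of the budget is held back for container overhead. The function name and these defaults are illustrative, so adjust them to your encoder's settings.

```python
def max_video_kbps(
    duration_s: float,
    size_limit_mb: float = 100.0,
    audio_kbps: float = 128.0,
    headroom: float = 0.95,
) -> float:
    """Highest average video bitrate (kbps) that keeps a clip under the size limit.

    Assumes 1 MB = 1,000,000 bytes, so 1 MB = 8,000 kilobits.
    `headroom` reserves part of the budget for container overhead (assumed 5%).
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    total_kbps = size_limit_mb * 8000 / duration_s * headroom
    video_kbps = total_kbps - audio_kbps
    if video_kbps <= 0:
        raise ValueError("clip is too long for this size limit")
    return video_kbps


if __name__ == "__main__":
    print(round(max_video_kbps(60)))  # budget for a 60-second clip
```

For a 60-second clip the budget is roughly 12,500 kbps of video, which is why trimming to the clip length limit first usually leaves little compression to do.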
