TechEarl

How to Add or Replace the Audio Track in a Video with ffmpeg

Combine a silent video with an audio file, or swap out an existing soundtrack, using ffmpeg. The -map and -shortest flags that make it work, plus how to fix audio that drifts out of sync.

Ishan Karunaratne · 8 min read

To combine a silent (or to-be-replaced) video with a separate audio file, point ffmpeg at both inputs, copy the video as-is, and explicitly map one stream from each:

bash
ffmpeg -i video.mp4 -i audio.m4a -c:v copy -c:a aac -map 0:v:0 -map 1:a:0 -shortest out.mp4

That is the whole job for the common case. out.mp4 has the picture from video.mp4 and the sound from audio.m4a. The video is copied byte-for-byte (no quality loss, no slow re-encode), and the audio is encoded to AAC so it sits cleanly in an MP4 container. The rest of this page is what each flag is doing and how to handle the cases that bite people: replacing an existing soundtrack, and audio that lands a fraction of a second out of sync.
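If you want to try the command before pointing it at real footage, ffmpeg's built-in lavfi test sources can generate throwaway inputs. This is a minimal sketch, assuming an ffmpeg build that includes libx264 and the native AAC encoder; the helper name make_sample_inputs and the 5-second and 8-second durations are made up for illustration.

```shell
# Hypothetical helper: writes a 5-second silent test-pattern video and an
# 8-second 440 Hz tone, using the same file names as the commands on this page.
make_sample_inputs() {
  # testsrc is ffmpeg's built-in test pattern; -pix_fmt yuv420p keeps the
  # H.264 output playable in most players.
  ffmpeg -y -f lavfi -i "testsrc=duration=5:size=320x240:rate=25" \
    -c:v libx264 -pix_fmt yuv420p video.mp4
  # sine generates a plain tone; it is deliberately longer than the video
  # so the effect of -shortest is easy to see.
  ffmpeg -y -f lavfi -i "sine=frequency=440:duration=8" \
    -c:a aac audio.m4a
}
```

Call make_sample_inputs in an empty directory (the -y flag overwrites existing files of the same name), then run the mux command against the two files it writes.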

I reach for this constantly: a screen recording that captured no microphone, a slideshow render that needs a music bed, a clip whose original audio I want gone and replaced. They are all the same operation: muxing one video stream and one audio stream into a single file.

What each flag does

Four flags carry the command. Skip any one of them and you get a subtly wrong result, so it is worth knowing why each is there.

  • -c:v copy keeps the video stream exactly as it is. ffmpeg moves the encoded video packets straight into the new container without decoding and re-encoding them. It is instant and lossless. Re-encoding video here would be slow and would throw away quality for no reason, because you are not changing the picture.
  • -c:a aac encodes the audio to AAC, the audio codec with the broadest player support inside MP4. If your source audio is already AAC (an .m4a very often is), you can use -c:a copy instead and skip the re-encode entirely. If the audio is WAV, MP3, Opus, or anything else that MP4 either does not accept or that players handle unevenly inside it, let ffmpeg re-encode it to AAC as shown. When in doubt, -c:a aac is the safe choice.
  • -map 0:v:0 picks the first video stream from input 0 (the video file). The syntax reads input-index colon stream-type colon stream-index, so 0:v:0 is "input 0, video stream 0".
  • -map 1:a:0 picks the first audio stream from input 1 (the audio file).
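The choice between -c:a copy and -c:a aac can be made by asking ffprobe what the audio file actually contains. The following is a rough sketch, not a definitive recipe; audio_codec_of and pick_audio_flag are hypothetical helper names, and the rule "copy only when it is already AAC" is an assumption that matches the advice above.

```shell
# Hypothetical helpers; the function names are illustrative, not part of ffmpeg.

# Print the codec name of the first audio stream in a file (e.g. "aac", "mp3").
audio_codec_of() {
  ffprobe -v error -select_streams a:0 \
    -show_entries stream=codec_name \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Choose the value for -c:a: copy AAC untouched, re-encode everything else.
pick_audio_flag() {
  case "$1" in
    aac) echo "copy" ;;
    *)   echo "aac" ;;
  esac
}

# Usage (assumes video.mp4 and audio.m4a exist):
#   flag=$(pick_audio_flag "$(audio_codec_of audio.m4a)")
#   ffmpeg -i video.mp4 -i audio.m4a -c:v copy -c:a "$flag" \
#     -map 0:v:0 -map 1:a:0 -shortest out.mp4
```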

Without the -map flags, ffmpeg falls back to its automatic stream selection: across all inputs it takes the video stream with the highest resolution and the audio stream with the most channels. That can grab the wrong audio when the video input already has a soundtrack. Mapping explicitly removes the guesswork: you are telling ffmpeg precisely which streams go into the output and in what order. Once you have more than one input, get in the habit of mapping by hand.
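Before writing -map arguments it helps to see which streams each input really carries. A small sketch using ffprobe, which ships alongside ffmpeg; list_streams is a hypothetical wrapper name, and the exact column order of the output depends on ffprobe, so treat the output as something to read rather than parse.

```shell
# Hypothetical helper: print one line per stream (index, codec, type) so you
# can see which stream indices exist before choosing what to map.
list_streams() {
  ffprobe -v error \
    -show_entries stream=index,codec_type,codec_name \
    -of csv=p=0 "$1"
}

# Usage:
#   list_streams video.mp4
#   list_streams audio.m4a
```

If list_streams video.mp4 reports an audio stream next to the video, automatic selection has two audio candidates to choose between, which is exactly the situation where explicit mapping matters.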

Why -shortest matters

-shortest tells ffmpeg to stop writing the output as soon as the shorter of the mapped streams ends. Audio and video files are rarely the exact same length. If your video runs 40 seconds and the audio track is 60 seconds, without -shortest ffmpeg keeps writing for the full 60 seconds. The video stream has already ended by then, so for the final 20 seconds most players show a frozen last frame under the leftover audio. The reverse is just as ugly: a long video with short audio leaves a silent tail.

-shortest trims to the shorter input so there is no frozen frame and no dead air at the end. If you genuinely want the output to run the full length of the longer stream (say, music continuing over a held frame), leave -shortest off, but that is the exception. For the everyday "make these line up" case, keep it on.
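To know in advance how much -shortest will cut, you can compare the two input durations with ffprobe. This is a sketch under the assumption that the container-level duration is close enough for the comparison; duration_of and shorter_of are hypothetical helper names.

```shell
# Hypothetical helpers; names are illustrative.

# Print a file's container duration in seconds.
duration_of() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Print the smaller of two durations; awk does the decimal comparison,
# which plain shell arithmetic cannot.
shorter_of() {
  awk -v a="$1" -v b="$2" 'BEGIN { if (a + 0 <= b + 0) print a; else print b }'
}

# Usage (assumes both files exist):
#   v=$(duration_of video.mp4)
#   a=$(duration_of audio.m4a)
#   echo "video: ${v}s, audio: ${a}s, output with -shortest: about $(shorter_of "$v" "$a")s"
```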

Replacing an existing audio track

If your video already has a soundtrack and you want a different one, the command is identical. You do not strip the old audio first; you simply never map it:

bash
ffmpeg -i video.mp4 -i newaudio.m4a -c:v copy -c:a aac -map 0:v:0 -map 1:a:0 -shortest out.mp4

-map 0:v:0 takes the video from input 0 and -map 1:a:0 takes the audio from input 1. The original audio that lives inside video.mp4 (its 0:a:0 stream) is never mapped into the output, so it is dropped. There is no separate "remove the old audio" step: the mapping decides what survives, and anything you do not map is left behind.
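A quick way to confirm the old soundtrack really was left behind is to count the audio streams in the result. A minimal sketch, with count_audio_streams as a hypothetical helper name:

```shell
# Hypothetical helper: print how many audio streams a file contains.
# tr strips the padding some wc implementations add around the number.
count_audio_streams() {
  ffprobe -v error -select_streams a \
    -show_entries stream=index -of csv=p=0 "$1" | wc -l | tr -d ' '
}

# Usage:
#   count_audio_streams out.mp4
```

For the replace command above the count should be 1, since only 1:a:0 was mapped. A higher number would suggest the output was produced by a different command than the one shown.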

If you only want to mute the clip rather than replace the sound, that is a one-input job covered in remove audio from a video with ffmpeg. And if you want to pull the existing soundtrack out to a file before swapping in a new one, see extract audio from a video with ffmpeg.

Fixing audio that is out of sync

Sometimes the audio and video are recorded separately and the sound lands early or late against the picture. The fix is -itsoffset, which shifts the timestamps of the next input by a time duration. The rule to remember: -itsoffset is an input option, so it goes before the -i of the stream you want to move.

To delay the audio by half a second (the audio currently starts too early):

bash
ffmpeg -i video.mp4 -itsoffset 0.5 -i audio.m4a -c:v copy -c:a aac -map 0:v:0 -map 1:a:0 -shortest out.mp4

The 0.5 shifts the audio 500 milliseconds later relative to the video. A positive offset delays the input it precedes. To pull the audio earlier instead, offset the video the same way, since making the picture later is the same as making the sound earlier:

bash
ffmpeg -itsoffset 0.5 -i video.mp4 -i audio.m4a -c:v copy -c:a aac -map 0:v:0 -map 1:a:0 -shortest out.mp4

You can pass the offset as plain seconds (0.5, 1.2) or as a timestamp (00:00:00.500). Finding the right number is trial and error: nudge it, watch the result, adjust. Start in the tens-of-milliseconds range and refine from there. The good news is that with -c:v copy each attempt is near-instant, so iterating is cheap.
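The trial-and-error loop can be scripted so that one run produces several candidates to compare side by side. This is a rough sketch: ms_to_seconds and try_audio_delays are hypothetical helper names, and the sync_<N>ms.mp4 output naming is just a convention invented for the example.

```shell
# Hypothetical helpers; names and output file naming are illustrative.

# Convert whole milliseconds to the seconds value -itsoffset expects.
ms_to_seconds() {
  awk -v ms="$1" 'BEGIN { printf "%.3f\n", ms / 1000 }'
}

# Write one output per candidate audio delay, e.g. sync_40ms.mp4.
# The body runs in a subshell so its variables do not leak into your session.
try_audio_delays() (
  video=$1
  audio=$2
  shift 2
  for ms in "$@"; do
    ffmpeg -y -i "$video" -itsoffset "$(ms_to_seconds "$ms")" -i "$audio" \
      -c:v copy -c:a aac -map 0:v:0 -map 1:a:0 -shortest "sync_${ms}ms.mp4"
  done
)

# Usage: try delays of 40, 80, and 120 milliseconds.
#   try_audio_delays video.mp4 audio.m4a 40 80 120
```

Play the results back to back, pick the closest one, then rerun with a narrower range around it.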

One caveat when you copy a stream rather than re-encode it: -itsoffset shifts the input's start timestamp, and on a stream copy that shift lands in the container's edit list rather than in the packets themselves. Most players honor it, but some ignore the edit list and play the offset stream from zero, so the shift appears to do nothing. The first command (the audio delay) sidesteps this, because -c:a aac re-encodes the audio and bakes the offset in. If you offset the video with -c:v copy (the second command) and the result looks unshifted in a stubborn player, re-encode that stream instead (-c:v libx264) so the offset is applied during encoding.
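Written out in full, the re-encode fallback looks like the sketch below, together with a way to read back each stream's start time. It assumes an ffmpeg build with libx264; delay_video_reencoded and show_start_times are hypothetical helper names. Keep in mind that ffprobe reports what ffmpeg's own demuxer sees, so a player that ignores edit lists may still behave differently from what it prints.

```shell
# Hypothetical helpers; names are illustrative.

# Delay the picture by $1 seconds and re-encode the video instead of copying
# it. Slower than -c:v copy, and it costs a generation of video quality.
delay_video_reencoded() {
  ffmpeg -itsoffset "$1" -i video.mp4 -i audio.m4a \
    -c:v libx264 -c:a aac -map 0:v:0 -map 1:a:0 -shortest out.mp4
}

# Print each stream's type and start time as ffprobe reads them back.
show_start_times() {
  ffprobe -v error -show_entries stream=codec_type,start_time \
    -of csv=p=0 "$1"
}

# Usage:
#   delay_video_reencoded 0.5
#   show_start_times out.mp4
```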

This works identically on Linux, macOS, and Windows: ffmpeg takes the same flags on every platform it runs on.
