To combine a silent (or to-be-replaced) video with a separate audio file, point ffmpeg at both inputs, copy the video as-is, and explicitly map one stream from each:
ffmpeg -i video.mp4 -i audio.m4a -c:v copy -c:a aac -map 0:v:0 -map 1:a:0 -shortest out.mp4
That is the whole job for the common case. out.mp4 has the picture from video.mp4 and the sound from audio.m4a. The video is copied byte-for-byte (no quality loss, no slow re-encode), and the audio is encoded to AAC so it sits cleanly in an MP4 container. The rest of this page is what each flag is doing and how to handle the cases that bite people: replacing an existing soundtrack, and audio that lands a fraction of a second out of sync.
I reach for this constantly: a screen recording that captured no microphone, a slideshow render that needs a music bed, a clip whose original audio I want gone and replaced. They are all the same operation, muxing one video stream and one audio stream into a single file.
What each flag does
Four flags carry the command. Skip any one of them and you get a subtly wrong result, so it is worth knowing why each is there.
- -c:v copy keeps the video stream exactly as it is. ffmpeg moves the encoded video packets straight into the new container without decoding and re-encoding them, so it is fast and lossless. Re-encoding video here would be slow and would throw away quality for no reason, because you are not changing the picture.
- -c:a aac encodes the audio to AAC, the audio codec that MP4 players support most reliably. If your source audio is already AAC (an .m4a very often is), you can use -c:a copy instead and skip the re-encode entirely. If the audio is MP3, Opus, WAV, or anything else that MP4 players handle poorly or the container does not accept, let ffmpeg re-encode it to AAC as shown. When in doubt, -c:a aac is the safe choice.
- -map 0:v:0 picks the first video stream from input 0 (the video file). The syntax reads input-index colon stream-type colon stream-index, so 0:v:0 is "input 0, video stream 0".
- -map 1:a:0 picks the first audio stream from input 1 (the audio file).
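The copy-or-encode choice for the audio can be sketched as a small shell helper. This is a minimal sketch, not part of ffmpeg: pick_audio_codec and probe_audio_codec are hypothetical names, and the rule "copy only when the source is already AAC" is the simplification described above.

```shell
# Hypothetical helper: map a source codec name to the value for -c:a.
pick_audio_codec() {
  case "$1" in
    aac) echo "copy" ;;  # already AAC: stream copy, no re-encode
    *)   echo "aac"  ;;  # anything else: re-encode to AAC
  esac
}

# Hypothetical wrapper: ask ffprobe for the codec of the first audio stream.
# Needs ffprobe and a real file, so it is defined here but not called.
probe_audio_codec() {
  ffprobe -v error -select_streams a:0 \
    -show_entries stream=codec_name \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

pick_audio_codec aac   # prints copy
pick_audio_codec mp3   # prints aac

# With real files (assumed names), the two combine like this:
# codec=$(pick_audio_codec "$(probe_audio_codec audio.m4a)")
# ffmpeg -i video.mp4 -i audio.m4a -c:v copy -c:a "$codec" \
#   -map 0:v:0 -map 1:a:0 -shortest out.mp4
```

For a one-off job it is quicker to just use -c:a aac; the probe only pays off when you are scripting many files.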
Without the -map flags, ffmpeg falls back to its automatic stream selection, which picks one stream of each type across all inputs (by default the highest-resolution video and the audio stream with the most channels) and can grab the wrong one when an input already has audio. Mapping explicitly removes the guesswork: you are telling ffmpeg precisely which streams go into the output and in what order. Once you have more than one input, get in the habit of mapping by hand.
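To make the input:type:index reading of a -map value concrete, here is a small illustrative sketch. describe_map is a hypothetical helper that only handles the three-part form used on this page; ffmpeg accepts more specifier forms than this.

```shell
# Hypothetical helper: spell out a three-part stream specifier in words.
# The body runs in a subshell, so the IFS change does not leak out.
describe_map() (
  spec=$1
  IFS=:
  set -- $spec            # split "0:v:0" into $1=0 $2=v $3=0
  case "$2" in
    v) kind=video ;;
    a) kind=audio ;;
    s) kind=subtitle ;;
    *) kind="type $2" ;;
  esac
  echo "input $1, $kind stream $3"
)

describe_map 0:v:0   # prints: input 0, video stream 0
describe_map 1:a:0   # prints: input 1, audio stream 0
```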
Why -shortest matters
-shortest tells ffmpeg to stop writing the output as soon as the shorter of the mapped streams ends. Audio and video files are rarely the exact same length. If your video runs 40 seconds and the audio track is 60 seconds, without -shortest ffmpeg holds the output open for the full 60 seconds, and the final 20 seconds is a frozen last frame sitting under the leftover audio. The reverse is just as ugly: a long video with short audio leaves a silent tail.
-shortest trims to the shorter input so there is no frozen frame and no dead air at the end. If you genuinely want the output to run the full length of the longer stream (say, music continuing over a held frame), leave -shortest off, but that is the exception. For the everyday "make these line up" case, keep it on.
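If you want to know in advance roughly how long the output will be with -shortest, it is the smaller of the two input durations. A minimal sketch, assuming hypothetical helper names shorter_duration and probe_duration; the ffprobe wrapper needs real files, so it is defined but not called here.

```shell
# Hypothetical helper: print the smaller of two durations given in seconds.
shorter_duration() {
  LC_ALL=C awk -v a="$1" -v b="$2" \
    'BEGIN { m = (a + 0 < b + 0) ? a + 0 : b + 0; print m }'
}

# Hypothetical wrapper: container duration of a file in seconds, via ffprobe.
probe_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# The 40-second video and 60-second audio from the example above:
shorter_duration 40 60   # prints 40

# With real files (assumed names):
# shorter_duration "$(probe_duration video.mp4)" "$(probe_duration audio.m4a)"
```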
Replacing an existing audio track
If your video already has a soundtrack and you want a different one, the command is identical. You do not strip the old audio first; you simply never map it:
ffmpeg -i video.mp4 -i newaudio.m4a -c:v copy -c:a aac -map 0:v:0 -map 1:a:0 -shortest out.mp4
-map 0:v:0 takes the video from input 0 and -map 1:a:0 takes the audio from input 1. The original audio that lives inside video.mp4 (its 0:a:0 stream) is never mapped into the output, so it is dropped. There is no separate "remove the old audio" step: the mapping decides what survives, and anything you do not map is left behind.
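A quick way to confirm the old soundtrack really was left behind is to count the audio streams in the output. This is a sketch under assumptions: count_audio_streams is a hypothetical helper that reads one stream type per line, which is what the commented ffprobe call prints.

```shell
# Hypothetical helper: count lines that are exactly "audio" on stdin.
# "|| true" keeps the exit status at 0 when the count is zero.
count_audio_streams() {
  grep -c '^audio$' || true
}

# Simulated ffprobe output for a file with one video and one audio stream:
printf 'video\naudio\n' | count_audio_streams   # prints 1

# With a real output file (assumed name out.mp4):
# ffprobe -v error -show_entries stream=codec_type \
#   -of default=noprint_wrappers=1:nokey=1 out.mp4 | count_audio_streams
```

A result of 1 means only the new audio made it into the file; 2 would suggest the original track was mapped as well.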
If you only want to mute the clip rather than replace the sound, that is a one-input job covered in remove audio from a video with ffmpeg. And if you want to pull the existing soundtrack out to a file before swapping in a new one, see extract audio from a video with ffmpeg.
Fixing audio that is out of sync
Sometimes the audio and video are recorded separately and the sound lands early or late against the picture. The fix is -itsoffset, which shifts the timestamps of the next input by a time duration. The rule to remember: -itsoffset is an input option, so it goes before the -i of the stream you want to move.
To delay the audio by half a second (the audio currently starts too early):
ffmpeg -i video.mp4 -itsoffset 0.5 -i audio.m4a -c:v copy -c:a aac -map 0:v:0 -map 1:a:0 -shortest out.mp4
The 0.5 shifts the audio 500 milliseconds later relative to the video. A positive offset delays the input it precedes. To pull the audio earlier instead, offset the video the same way, since making the picture later is the same as making the sound earlier:
ffmpeg -itsoffset 0.5 -i video.mp4 -i audio.m4a -c:v copy -c:a aac -map 0:v:0 -map 1:a:0 -shortest out.mp4
You can pass the offset as plain seconds (0.5, 1.2) or as a timestamp (00:00:00.500). Finding the right number is trial and error: nudge it, watch the result, adjust. Start in the tens-of-milliseconds range and refine from there. The good news is that with -c:v copy each attempt is near-instant, so iterating is cheap.
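Since the nudging is repetitive, the two placements of -itsoffset can be folded into one helper that takes a signed delay in milliseconds. This is a minimal sketch: sync_inputs is a hypothetical name, positive means "delay the audio", negative means "delay the video instead", and it only prints the input half of the command so you can inspect it before running anything. It assumes file names without spaces.

```shell
# Hypothetical helper: print the input arguments for a signed audio delay.
#   $1 = delay in milliseconds (negative pulls the audio earlier)
#   $2 = video file, $3 = audio file
sync_inputs() (
  ms=$1
  abs=${ms#-}   # strip a leading minus sign, if any
  secs=$(LC_ALL=C awk -v ms="$abs" 'BEGIN { printf "%.3f", ms / 1000 }')
  if [ "$ms" = "$abs" ]; then
    # Positive delay: -itsoffset goes before the audio input.
    echo "-i $2 -itsoffset $secs -i $3"
  else
    # Negative delay: offset the video by the same amount instead.
    echo "-itsoffset $secs -i $2 -i $3"
  fi
)

sync_inputs 500 video.mp4 audio.m4a
# prints: -i video.mp4 -itsoffset 0.500 -i audio.m4a
sync_inputs -250 video.mp4 audio.m4a
# prints: -itsoffset 0.250 -i video.mp4 -i audio.m4a
```

Paste the printed arguments in front of the usual -c:v copy -c:a aac -map 0:v:0 -map 1:a:0 -shortest out.mp4 tail to get the full command.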
One caveat when you copy a stream rather than re-encode it: -itsoffset shifts the input's start timestamp, and on a stream copy that shift lands in the container's edit list rather than in the packets themselves. Most players honor it, but some ignore the edit list and play the offset stream from zero, so the shift appears to do nothing. The command above sidesteps this for the audio side, because -c:a aac re-encodes the audio and bakes the offset in. If you offset the video with -c:v copy (the second command) and the result looks unshifted in a stubborn player, re-encode that stream instead (-c:v libx264) so the offset is applied during encoding.
This works identically on any platform ffmpeg runs on (Linux, macOS, and Windows), since it is the same tool with the same flags everywhere.
See also
- ffmpeg command cheat sheet: convert, crop, trim, mux, and compress from the CLI, with the canonical command for each task.
- Remove audio from a video with ffmpeg: mute a clip while keeping the video stream untouched.
- Extract audio from a video with ffmpeg: pull the soundtrack out to MP3, AAC, or WAV.