TechEarl

AWS S3 cp and sync Cheat Sheet: Copy, Move, and Sync Files with the CLI

A scannable AWS S3 CLI reference: aws s3 cp, sync, mv, rm, ls; recursive uploads and downloads; --exclude / --include filters; storage classes (STANDARD_IA, GLACIER, INTELLIGENT_TIERING); SSE encryption (AES256, aws:kms); --dryrun safety; the trailing-slash gotcha; concurrency tuning via max_concurrent_requests and multipart_chunksize; cross-account profiles.

Ishan Karunaratne · 19 min read
The aws s3 high-level commands are the daily-driver interface to S3: faster than the SDK for one-offs, safer than s3api for routine operations because they handle multipart uploads, retries, and parallelism automatically. Five subcommands cover ninety percent of what you ever need (cp, sync, mv, rm, ls) and a handful of flags handle the rest (--recursive, --exclude, --include, --storage-class, --sse, --dryrun). This page is the reference for those subcommands and flags, the trailing-slash gotcha that breaks more cp invocations than every other mistake combined, and the concurrency knobs that turn a 40-minute upload into a 4-minute one.

How do I use the AWS S3 CLI?

aws s3 cp copies a file or prefix in any direction: local-to-S3, S3-to-local, or S3-to-S3. Add --recursive for whole directories. aws s3 sync is cp --recursive plus delta detection (only new or changed files are transferred). Add --delete to make sync mirror the source exactly. aws s3 mv is copy-then-delete. aws s3 rm deletes objects (add --recursive for prefixes). aws s3 ls lists buckets, prefixes, or objects. Filter with --exclude '*.tmp' --include '*.json' (rules apply left to right, and later rules win). Set the storage class with --storage-class STANDARD_IA | INTELLIGENT_TIERING | GLACIER | DEEP_ARCHIVE. Encrypt with --sse AES256 (S3-managed keys) or --sse aws:kms --sse-kms-key-id <key> (a KMS key you choose). Always preview first with --dryrun. The single most common bug: s3://bucket/path and s3://bucket/path/ behave differently. The trailing slash means "into this prefix"; without it, the path is treated as an exact key. Concurrency defaults are conservative; on fast links, raising max_concurrent_requests to 20 to 50 and multipart_chunksize from 8 MiB to 16 to 64 MiB gives noticeable throughput gains.

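In the examples below, :bucket, :prefix, and :region are placeholders for your own values. The examples are written on the assumption that :prefix ends with a slash, which is why keys appear as :prefixreport.csv.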

Setup and credentials

The CLI reads credentials in this order: command-line flags (--profile), environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN), the shared credentials file (~/.aws/credentials), and the instance metadata service (on EC2 with an IAM role attached).

Configure a named profile:

bash
aws configure --profile myprofile

Use the profile per-command:

bash
aws s3 ls s3://:bucket/ --profile myprofile --region :region

CLI v2 (generally available since February 2020) is the default on new installs and adds structured --output yaml, SSO-based auth, and a better paginator. The high-level aws s3 subcommands behave the same in v1 and v2 for the operations covered here; the s3api low-level commands have minor differences. Unless noted otherwise, everything in this cheat sheet works on both.
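
Scripts that have to run on both versions can branch on the major version. The sketch below assumes the version banner starts with aws-cli/MAJOR.MINOR.PATCH; the function name aws_cli_major and the sample banner are illustrative, not part of the CLI.

```shell
# Extract the major version from a banner such as
# "aws-cli/2.15.30 Python/3.11.8 Linux/6.1" (example string, not real output).
# Assumption: the banner begins with "aws-cli/<major>.<minor>.<patch>".
aws_cli_major() {
  banner=$1
  version=${banner#aws-cli/}        # drop the "aws-cli/" prefix
  printf '%s\n' "${version%%.*}"    # keep everything before the first dot
}

# Usage (not executed here):
#   aws_cli_major "$(aws --version 2>&1)"
```

The 2>&1 in the usage line is there because some older v1 builds print the banner on stderr.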

Copying files: aws s3 cp

Local file to S3:

bash
aws s3 cp ./report.csv s3://:bucket/:prefix

S3 to local:

bash
aws s3 cp s3://:bucket/:prefixreport.csv ./report.csv

S3 to S3 (same region):

bash
aws s3 cp s3://:bucket/:prefixreport.csv s3://other-bucket/backup/report.csv

S3 to S3 (cross-region):

bash
aws s3 cp s3://:bucket/:prefixreport.csv s3://other-bucket/backup/report.csv \
  --source-region :region --region eu-west-1

The cross-region copy is a server-side operation (the bytes never come back to your machine), which is dramatically faster than download-then-upload for anything bigger than a few MB.

Recursive copy of a whole directory:

bash
aws s3 cp ./local-dir s3://:bucket/:prefix --recursive

Download a whole prefix:

bash
aws s3 cp s3://:bucket/:prefix ./local-dir --recursive

Stream from stdin to S3 (handy for pg_dump, mysqldump, tar):

bash
pg_dump mydb | gzip | aws s3 cp - s3://:bucket/:prefixbackup.sql.gz

Stream from S3 to stdout:

bash
aws s3 cp s3://:bucket/:prefixbackup.sql.gz - | gunzip | psql mydb
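
For very large streams the CLI cannot know the size up front. The cp command accepts --expected-size (in bytes), which its help text says is needed when the stream is larger than 50 GB. A minimal wrapper sketch follows; stream_to_s3 is a hypothetical helper name.

```shell
# stream_to_s3 DEST_URI [EXPECTED_BYTES]
# Uploads stdin to DEST_URI. When EXPECTED_BYTES is given it is passed as
# --expected-size so the CLI can choose a suitable part size for the stream.
stream_to_s3() {
  if [ -n "${2:-}" ]; then
    aws s3 cp - "$1" --expected-size "$2"
  else
    aws s3 cp - "$1"
  fi
}

# Usage (not executed here):
#   pg_dump mydb | gzip | stream_to_s3 s3://:bucket/:prefixbackup.sql.gz
```

In bash, running set -o pipefail before the pipeline makes it fail when pg_dump fails, instead of quietly uploading a truncated dump.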

Syncing directories: aws s3 sync

sync is the workhorse for "make S3 look like this directory" and "make this directory look like S3." It copies only files that differ in size or modification time, which is what makes incremental uploads fast.
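
The comparison can be pictured with a small function. This is an approximation of the documented upload rule (transfer when the size differs or the source is newer), not the CLI's actual code; sync_decision and its argument order are made up for illustration.

```shell
# sync_decision SRC_SIZE SRC_MTIME DST_SIZE DST_MTIME [size-only]
# Sizes in bytes, mtimes in epoch seconds. Prints "upload" or "skip".
sync_decision() {
  if [ "$1" -ne "$3" ]; then
    echo upload                      # sizes differ
  elif [ "${5:-}" != "size-only" ] && [ "$2" -gt "$4" ]; then
    echo upload                      # same size, but the source is newer
  else
    echo skip
  fi
}

sync_decision 100 20 100 10            # same size, newer source
sync_decision 100 20 100 10 size-only  # timestamps ignored
```

The second call mirrors what the --size-only flag does: a touched but otherwise identical file is skipped.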

Upload changes:

bash
aws s3 sync ./local-dir s3://:bucket/:prefix

Download changes:

bash
aws s3 sync s3://:bucket/:prefix ./local-dir

Mirror exactly (delete files at the destination that no longer exist at the source):

bash
aws s3 sync ./local-dir s3://:bucket/:prefix --delete

The --delete flag is destructive and asks for no confirmation; always pair the first --delete run with --dryrun to see what would be removed.

Sync between two S3 buckets:

bash
aws s3 sync s3://:bucket/:prefix s3://other-bucket/:prefix

Compare by size only (skip the modification-time comparison):

bash
aws s3 sync ./local-dir s3://:bucket/:prefix --size-only

Useful when the local clock is off, or when working with files that have been touched without changing.

Moving and removing: mv and rm

mv is copy-then-delete. The source is removed only if the destination write succeeds.

Move a single file:

bash
aws s3 mv ./report.csv s3://:bucket/:prefix

Move within S3 (rename a key):

bash
aws s3 mv s3://:bucket/:prefixold.csv s3://:bucket/:prefixnew.csv

Move a whole prefix into archive storage:

bash
aws s3 mv s3://:bucket/:prefix s3://archive-bucket/:prefix \
  --recursive --storage-class GLACIER

Delete a single object:

bash
aws s3 rm s3://:bucket/:prefixreport.csv

Recursively delete a prefix (irreversible if versioning is off):

bash
aws s3 rm s3://:bucket/:prefix --recursive

The first time I ran aws s3 rm --recursive without --dryrun on the wrong prefix I lost a few hundred files I had to restore from a snapshot. Always preview first.
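
One cheap guard against that kind of accident is to refuse a recursive delete when the URI has no key prefix, for example because a variable expanded to an empty string. A sketch, with hypothetical helper names has_key_prefix and safe_rm_prefix:

```shell
# has_key_prefix S3_URI
# Prints "yes" when the URI points below the bucket root, "no" when it is
# only s3://bucket or s3://bucket/ (rm --recursive there empties the bucket).
has_key_prefix() {
  rest=${1#s3://}          # strip the scheme
  case $rest in
    */?*) echo yes ;;      # at least one character after the first slash
    *)    echo no ;;
  esac
}

# safe_rm_prefix S3_URI
# Runs only the preview; re-run by hand without --dryrun after reading it.
safe_rm_prefix() {
  if [ "$(has_key_prefix "$1")" = no ]; then
    echo "refusing to delete bucket root: $1" >&2
    return 1
  fi
  aws s3 rm "$1" --recursive --dryrun
}
```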

Listing: aws s3 ls

List all buckets in the account:

bash
aws s3 ls

List a bucket's top-level contents:

bash
aws s3 ls s3://:bucket/

List a specific prefix:

bash
aws s3 ls s3://:bucket/:prefix

Recursive listing with human-readable sizes and totals:

bash
aws s3 ls s3://:bucket/:prefix --recursive --human-readable --summarize

The --summarize flag adds a total object count and total size at the end of the output, useful when sizing a future migration.
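
To use those totals in a script, run the listing without --human-readable so the size is in plain bytes, then pick out the two summary lines. A sketch assuming the summary lines read "Total Objects: N" and "Total Size: N"; summarize_totals is an illustrative name.

```shell
# summarize_totals: reads `aws s3 ls --recursive --summarize` output on stdin
# and prints "<object-count> <total-bytes>".
summarize_totals() {
  awk '/^ *Total Objects:/ { n = $3 }
       /^ *Total Size:/    { s = $3 }
       END                 { print n, s }'
}

# Usage (not executed here):
#   aws s3 ls s3://:bucket/:prefix --recursive --summarize | summarize_totals
```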

Filters: exclude and include patterns

Glob-style patterns: * matches any sequence of characters, ? matches a single character. Rules apply in the order specified; later rules override earlier rules.

Upload only JSON files:

bash
aws s3 cp ./local-dir s3://:bucket/:prefix --recursive --exclude '*' --include '*.json'

Exclude .tmp and .bak, include everything else:

bash
aws s3 sync ./local-dir s3://:bucket/:prefix \
  --exclude '*.tmp' --exclude '*.bak'

Exclude everything in any node_modules directory (top-level or nested):

bash
aws s3 sync ./project s3://:bucket/:prefix --exclude 'node_modules/*' --exclude '*/node_modules/*'

The order matters. --exclude '*' --include '*.log' uploads only .log files. --include '*.log' --exclude '*' uploads nothing (the final exclude wins).
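
The "last matching rule wins" behavior can be modeled in a few lines of shell. This is an approximation for reasoning about rule order, not the CLI's implementation; filter_decision and the exclude:/include: rule syntax are invented for the sketch.

```shell
# filter_decision PATH RULE...
# Each RULE is "exclude:GLOB" or "include:GLOB". Every path starts out
# included, and the last rule whose glob matches decides the outcome.
filter_decision() (
  path=$1
  shift
  verdict=include
  for rule in "$@"; do
    kind=${rule%%:*}            # text before the first colon
    glob=${rule#*:}             # text after the first colon
    case $path in
      $glob) verdict=$kind ;;   # unquoted on purpose so the glob is active
    esac
  done
  echo "$verdict"
)

filter_decision app.log 'exclude:*' 'include:*.log'   # prints: include
filter_decision app.log 'include:*.log' 'exclude:*'   # prints: exclude
```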

Storage classes

Set the storage class at write time. There is no in-place "change class" flag; to re-class an existing object, copy it onto itself with a different class.

bash
aws s3 cp ./big.parquet s3://:bucket/:prefix \
  --storage-class STANDARD_IA

Available classes and what they're for:

| Class | Use case | Minimum duration | Retrieval cost |
| --- | --- | --- | --- |
| STANDARD | Default; frequently accessed | None | None |
| STANDARD_IA | Infrequent access, immediate retrieval | 30 days | Per-GB |
| ONEZONE_IA | IA without multi-AZ durability | 30 days | Per-GB |
| INTELLIGENT_TIERING | Unknown access patterns; auto-tiered | None | None |
| GLACIER_IR | Glacier Instant Retrieval; millisecond access | 90 days | Per-GB |
| GLACIER | Flexible Retrieval; minutes to hours | 90 days | Per-GB |
| DEEP_ARCHIVE | Lowest cost; 12-hour retrieval | 180 days | Per-GB |

For data with unknown access patterns, INTELLIGENT_TIERING (launched 2018) auto-moves objects between frequent and infrequent tiers based on the last access time. The small monitoring fee per object is almost always cheaper than getting the manual tier choice wrong.
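
The minimum durations matter because deleting or overwriting an object early is still billed for the full minimum. A small sketch of that arithmetic, using the durations listed above; billable_days is a hypothetical helper, and real invoices depend on current AWS pricing.

```shell
# billable_days CLASS DAYS_STORED
# Prints how many days of storage are billed when an object is removed
# after DAYS_STORED days in the given storage class.
billable_days() {
  case $1 in
    STANDARD_IA|ONEZONE_IA) min=30 ;;
    GLACIER_IR|GLACIER)     min=90 ;;
    DEEP_ARCHIVE)           min=180 ;;
    *)                      min=0 ;;    # STANDARD, INTELLIGENT_TIERING
  esac
  if [ "$2" -lt "$min" ]; then echo "$min"; else echo "$2"; fi
}

billable_days STANDARD_IA 10    # prints: 30
billable_days DEEP_ARCHIVE 200  # prints: 200
```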

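Re-class an existing object by copying it onto itself:

bash
aws s3 cp s3://:bucket/:prefixreport.csv s3://:bucket/:prefixreport.csv \
  --storage-class GLACIER

Changing the storage class is enough for S3 to accept the self-copy; a self-copy that changes nothing (storage class, metadata, or encryption settings) is rejected. Adding --metadata-directive REPLACE is not required here, and it replaces the object's existing metadata with whatever the command supplies.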

Server-side encryption

--sse AES256 uses S3-managed keys (SSE-S3, free, no key configuration needed):

bash
aws s3 cp ./secret.csv s3://:bucket/:prefix --sse AES256

--sse aws:kms uses a KMS key (formerly called a customer master key, CMK). With no key ID specified, S3 uses the AWS managed key for S3 (aws/s3) in that account and region:

bash
aws s3 cp ./secret.csv s3://:bucket/:prefix --sse aws:kms

With a specific KMS key:

bash
aws s3 cp ./secret.csv s3://:bucket/:prefix \
  --sse aws:kms --sse-kms-key-id alias/my-bucket-key

Many production buckets enforce SSE-KMS via a bucket policy. If your cp returns AccessDenied and you did not request encryption, the policy is probably rejecting the unencrypted write. Add --sse aws:kms and retry. For sync operations against a bucket that requires SSE-KMS, pair it with --sse-kms-key-id so every object lands with the right key.
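
To confirm how an uploaded object actually ended up encrypted, the low-level head-object call reports it. A minimal sketch; object_encryption is an illustrative wrapper name, and the bucket and key arguments are yours to supply.

```shell
# object_encryption BUCKET KEY
# Prints the object's ServerSideEncryption value, e.g. AES256 or aws:kms.
object_encryption() {
  aws s3api head-object --bucket "$1" --key "$2" \
    --query ServerSideEncryption --output text
}

# Usage (not executed here):
#   object_encryption my-bucket path/secret.csv
```

Swapping the --query value for SSEKMSKeyId shows which KMS key was used.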

ACLs and bucket-owner-full-control

The classic cross-account upload bug: account A uploads to a bucket owned by account B, account B can't read the resulting object because the object owner is the uploader, not the bucket owner. The fix is the bucket-owner-full-control ACL:

bash
aws s3 cp ./report.csv s3://:bucket/:prefix \
  --acl bucket-owner-full-control

Other ACL values: private (default), public-read, public-read-write (almost never the right choice), authenticated-read. Most production buckets now block public ACLs at the bucket level; the cross-account bucket-owner-full-control case is the main reason to pass --acl at all.

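If a bucket has Object Ownership set to BucketOwnerEnforced (the recommended default for new buckets), ACLs are disabled entirely. --acl bucket-owner-full-control is still accepted but has no effect; any other ACL value makes the request fail with an AccessControlListNotSupported error. The bucket owner automatically owns every object.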

Dry-run before destruction

--dryrun shows what would happen without actually doing it. Required hygiene for any sync --delete, rm --recursive, or mv against shared buckets.

bash
aws s3 sync ./local-dir s3://:bucket/:prefix --delete --dryrun

Output looks like the real run, prefixed with (dryrun). Eyeball the list, confirm it's what you want, then re-run without --dryrun. The cost of a five-second preview is much smaller than the cost of restoring deleted objects from versioning.
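
In automation, the preview can be checked by a script instead of by eye. The sketch below counts the "(dryrun) delete:" lines and aborts above a threshold; count_dryrun_deletes, guard_deletes, and the threshold are all hypothetical choices.

```shell
# count_dryrun_deletes: reads sync --dryrun output on stdin and prints how
# many objects the real run would delete.
count_dryrun_deletes() {
  grep -c '^(dryrun) delete:' || true   # grep exits 1 on zero matches
}

# guard_deletes MAX: reads the same output and fails when the preview
# contains more than MAX deletions.
guard_deletes() {
  max=$1
  n=$(count_dryrun_deletes)
  if [ "$n" -gt "$max" ]; then
    echo "preview would delete $n objects (limit $max)" >&2
    return 1
  fi
  echo "ok: $n deletions"
}

# Usage (not executed here):
#   aws s3 sync ./local-dir s3://:bucket/:prefix --delete --dryrun | guard_deletes 20
```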

Concurrency tuning

The CLI's default concurrency is conservative: 10 concurrent requests, 8 MiB multipart chunks. On a fast network (anything over 1 Gbps) those defaults leave throughput on the table.

Set them per-profile in ~/.aws/config:

ini
[profile fast]
s3 =
    max_concurrent_requests = 50
    multipart_chunksize = 64MB
    multipart_threshold = 64MB
    max_queue_size = 10000

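The s3 transfer settings are read from the config file, not from environment variables. To change one from the command line, write it with aws configure set (the value persists in ~/.aws/config):

bash
aws configure set default.s3.max_concurrent_requests 50
aws s3 sync ./big-dir s3://my-bucket/prefix/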

Rough tuning rules:

| Network | max_concurrent_requests | multipart_chunksize |
| --- | --- | --- |
| Home / slow VPN | 5-10 (default) | 8 MiB (default) |
| 1 Gbps office | 20-50 | 16-32 MiB |
| 10 Gbps datacenter / EC2 | 50-100 | 32-64 MiB |
| Single very large file on fast link | 20+ | 64-128 MiB |

Going too aggressive (say 200 concurrent requests on a 100 Mbps link) saturates the connection and triggers retries; throughput drops. Test with time aws s3 cp on a representative file before committing to a tuning.
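
Chunk size also determines how many parts, and therefore how many requests, a large file turns into. S3 caps a multipart upload at 10,000 parts, so very large files need larger chunks anyway. A quick arithmetic sketch; part_count is an illustrative helper.

```shell
# part_count FILE_BYTES CHUNK_MIB
# Prints the number of multipart parts needed, rounding up.
part_count() {
  chunk_bytes=$(( $2 * 1024 * 1024 ))
  echo $(( ($1 + chunk_bytes - 1) / chunk_bytes ))
}

part_count 5368709120 8    # 5 GiB at 8 MiB chunks  -> 640
part_count 5368709120 64   # 5 GiB at 64 MiB chunks -> 80
```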

For huge, geographically distant transfers, S3 Transfer Acceleration (since 2016) routes uploads through CloudFront edge locations. Enable it on the bucket and add use_accelerate_endpoint = true under the profile's s3 section. It trades a small per-GB fee for materially faster upload times to a single bucket from far-away clients.

Cross-account and assumed roles

For routine cross-account work, define a profile in ~/.aws/config that assumes a role:

ini
[profile crossacct]
role_arn = arn:aws:iam::222222222222:role/S3Operator
source_profile = default
region = us-east-1

Then:

bash
aws s3 cp ./report.csv s3://other-account-bucket/ --profile crossacct

The CLI calls STS to get temporary credentials for the role and caches them under ~/.aws/cli/cache/. The session length defaults to 1 hour, configurable via duration_seconds.

For one-off cross-account uploads against a bucket policy that grants access from your IAM principal, just attach --acl bucket-owner-full-control and skip the role assumption.

The trailing-slash gotcha

This single bug is responsible for more wrong-S3-key bug reports than every other CLI mistake combined.

aws s3 cp ./file s3://bucket/foo uploads file to a key literally named foo.

aws s3 cp ./file s3://bucket/foo/ uploads file to a key named foo/file.

The trailing slash on the destination is the difference between "use this exact key" and "place inside this prefix." Same applies in reverse for downloads:

bash
aws s3 cp s3://:bucket/data ./

That downloads the single key data to ./data.

bash
aws s3 cp s3://:bucket/data/ ./ --recursive

That downloads everything under the data/ prefix into the current directory. Without --recursive the second form errors out because data/ is a prefix, not a key.

Rule of thumb: a trailing slash means "this is a directory/prefix"; no slash means "this is a single file/key." The rule applies to both the source and the destination. Mixing the two forms is where the surprises live.
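
The destination rule for a single-file upload can be written down as a tiny function, handy for sanity-checking a script before it touches a bucket. This models the behavior described above rather than calling the CLI; resulting_key is a made-up name.

```shell
# resulting_key LOCAL_FILE DEST_URI
# Prints the S3 URI that `aws s3 cp LOCAL_FILE DEST_URI` would write to.
resulting_key() {
  case $2 in
    */)       echo "$2$(basename "$1")" ;;   # trailing slash: place inside the prefix
    s3://*/*) echo "$2" ;;                   # no slash: DEST_URI is the exact key
    *)        echo "$2/$(basename "$1")" ;;  # bare bucket: key is the file name
  esac
}

resulting_key ./file s3://bucket/foo    # prints: s3://bucket/foo
resulting_key ./file s3://bucket/foo/   # prints: s3://bucket/foo/file
```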

cp vs sync: when to use which

| | cp | sync |
| --- | --- | --- |
| Single file | Yes | No (source must be a directory or prefix) |
| Recursive (--recursive) | Required for directories | Implicit |
| Delta detection | No (always copies) | Yes (size + mtime, or --size-only) |
| --delete to mirror | No | Yes |
| --include / --exclude | Yes (requires --recursive) | Yes |
| Use for backup uploads | Only for one-shot copies | Yes; idempotent |
| Use for initial bulk load | Yes (with --recursive) | Also fine; slightly more overhead per file |
| Use for "make destination match source" | No | Yes (with --delete) |
| Use for piped streams | Yes (cp - s3://...) | No |

Default to sync for anything you're going to run more than once. Default to cp for one-shot, single-file, or piped operations.

Useful flag matrix

| Flag | Subcommands | Purpose |
| --- | --- | --- |
| --recursive | cp, mv, rm, ls | Apply to all objects under the source |
| --exclude PATTERN | cp, mv, rm, sync | Skip files matching glob |
| --include PATTERN | cp, mv, rm, sync | Re-include after --exclude |
| --storage-class | cp, mv, sync | STANDARD, STANDARD_IA, ONEZONE_IA, INTELLIGENT_TIERING, GLACIER_IR, GLACIER, DEEP_ARCHIVE |
| --sse | cp, mv, sync | AES256 (SSE-S3) or aws:kms |
| --sse-kms-key-id | cp, mv, sync | KMS key ARN or alias |
| --acl | cp, mv, sync | private, public-read, bucket-owner-full-control, etc. |
| --metadata-directive | cp, mv, sync | COPY (default) or REPLACE |
| --content-type | cp, mv, sync | Override the auto-detected MIME type |
| --cache-control | cp, mv, sync | Set the Cache-Control header |
| --expires | cp, mv, sync | Set the Expires header |
| --dryrun | cp, mv, rm, sync | Preview without executing |
| --quiet | cp, mv, rm, sync | Suppress all output from the command |
| --delete | sync | Mirror by removing extra destination objects |
| --size-only | sync | Skip mtime comparison |
| --exact-timestamps | sync (downloads) | Re-download if local mtime differs at all |
| --no-progress | cp, mv, rm, sync | Hide the progress bar |
| --source-region | cp, mv, sync | Source region for cross-region S3-to-S3 |
| --profile | all | Named profile from ~/.aws/config or ~/.aws/credentials |

Common pitfalls

1. Trailing slash on the destination. Covered in detail above. The single biggest source of "where did my file go?" bugs.

2. sync --delete removes destination objects you didn't intend to delete. The flag is one-directional: it deletes from whichever side is the destination. If you sync ./local-dir to s3://bucket/prefix/, any object under that prefix with no matching local file is deleted; reverse the arguments and local files are deleted instead. Always preview with --dryrun first.

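3. cp without --recursive against a directory uploads nothing. The CLI treats the path as a single file and the transfer fails, typically with an "Is a directory" error that is easy to miss in script output. Use --recursive, or use sync instead.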

4. Cross-account uploads leave the bucket owner unable to read the object. The uploader is the object owner by default. Pass --acl bucket-owner-full-control, or have the bucket configured with Object Ownership = BucketOwnerEnforced to make this irrelevant.

5. KMS encryption is silently slower. Every PUT against an SSE-KMS bucket calls kms:GenerateDataKey. For high-volume writes, the KMS API quota (per-account, per-region) becomes the bottleneck before S3 ever does. Plan accordingly; request a quota increase if a sync of millions of objects starts throttling on KMS.

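6. --storage-class GLACIER on tiny objects wastes money. GLACIER and DEEP_ARCHIVE add roughly 40 KB of billed metadata overhead per object, GLACIER_IR (like the IA classes) bills a 128 KB minimum object size, and all three have minimum storage durations (90 / 90 / 180 days). Lots of small objects in Glacier can cost more than the same objects in STANDARD. For mixed-size data, INTELLIGENT_TIERING is a safer default, though objects under 128 KB are never auto-tiered and stay in the frequent-access tier.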

7. sync re-uploading files that haven't changed. Sync compares size and mtime. If the local file was rewritten without changing its content (a build script touching a file every run), sync uploads it every time. Use --size-only to compare only on size, or content-addressable storage patterns for build artifacts.

8. Stale credentials from an old profile. The CLI caches assumed-role STS credentials under ~/.aws/cli/cache/. After rotating a role's permissions, delete the cache or wait for it to expire. Clear it with rm ~/.aws/cli/cache/*.json.

9. aws s3 cp exit 0 on partial multipart failure. Rare but possible. For critical uploads, verify with aws s3 ls plus a checksum compare. CLI v2 added --checksum-algorithm SHA256 for end-to-end verification on cp / sync; use it for archive workflows.

10. PowerShell quoting of JSON arguments. Single quotes are literal in PowerShell, but Windows PowerShell (and PowerShell before 7.3) strips the inner double quotes when it passes '{"Value": "x"}' to a native executable such as aws. Escape the inner quotes with backslashes inside the single-quoted string: '{\"Value\": \"x\"}'.

Ishan Karunaratne

Tech Architect · Software Engineer · AI/DevOps

Tech architect and software engineer with 20+ years across software, Linux systems, DevOps, and infrastructure — and a more recent focus on AI. Currently Chief Technology Officer at a tech startup in the healthcare space.
