TechEarl

How to Add Git to an Existing Project Without Committing Junk

Bring an existing or legacy codebase under version control cleanly. Init the repo, write .gitignore first so you skip the junk, make the initial commit, and push to a new remote.

Ishan Karunaratne · 8 min read

To put an existing project under Git, run git init in the project folder, write a .gitignore before you stage anything, then git add -A, git commit, and push to a remote. The one mistake to avoid: do not run git add before the ignore file exists, or you commit node_modules, build output, and secrets that are then baked into history.

bash
cd my-existing-project
git init
# write .gitignore first (see below), then:
git add -A
git commit -m "Initial commit"

[Figure: GitHub's quick setup screen, which gives you the exact commands to push an existing repository.]

That is the whole shape of it. The rest of this article is the careful version: deciding what to ignore so your first commit is clean, and wiring up the remote without tripping over the common errors. (Starting from an empty folder with no code in it yet? That case is covered in set up Git for a new project; this guide is for a folder that already has files, and the junk that tends to come with them.)

Step 1: Initialize the repository

cd into the root of the project (the top folder that holds everything you want tracked) and run git init:

bash
cd my-existing-project
git init
text
Initialized empty Git repository in /home/techearl/my-existing-project/.git

This creates a hidden .git/ directory. That folder is the repository: all your history, branches, and config live there. Your files are untouched and nothing is tracked yet; Git only starts recording files once you stage and commit them. If you run this in the wrong place and want to undo it, just delete the .git/ folder and start again.
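
Before running git init in a legacy directory tree, it can be worth checking that the folder is not already inside another repository, since nesting one repo inside another is easy to do by accident. A minimal sketch, using a throwaway directory from mktemp as a stand-in for your project (the `project` variable is a placeholder, not something Git requires):

```shell
# Throwaway directory standing in for your project root (placeholder).
project=$(mktemp -d)
cd "$project"

# rev-parse exits non-zero when no enclosing repository is found.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  echo "Already inside a repository rooted at: $(git rev-parse --show-toplevel)"
else
  git init -q
  echo "Initialized a new repository"
fi

# The repository itself is just the .git/ directory.
ls -d .git
```

In a real project you would cd into the project root instead of creating a temporary directory; the check itself is the same.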

On Git 2.28 and later you can name the default branch as you init, which saves a rename:

bash
git init -b main

Older Git versions create master. If you are on one of those and want main, rename it after the first commit with git branch -m main. If you are setting Git up from scratch and want to know the wider picture of config, identity, and the first repo, I cover that in setting up Git for a new project and the broader Git for beginners guide.
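
If your Git predates 2.28 and you would rather not wait for the first commit, one option is to repoint HEAD at the branch name you want while the repository is still empty. This is a sketch in a throwaway directory, and only one way to do it; git symbolic-ref is a plumbing command, so treat it as an alternative to the git branch -m rename described above.

```shell
demo=$(mktemp -d)
cd "$demo"
git init -q

# Point the not-yet-born branch at "main". Works before any commit exists.
git symbolic-ref HEAD refs/heads/main

# Show which branch the first commit will land on.
git symbolic-ref --short HEAD   # prints: main
```

On 2.28 and later, git config --global init.defaultBranch main makes main the starting branch for every future git init, so neither the -b flag nor a rename is needed.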

Step 2: Write .gitignore before you stage anything

This is the step people skip, and it is the one that matters most. An existing project almost always has files you do not want in version control: dependency folders, build artifacts, editor settings, logs, and (the dangerous one) secrets. If you git add -A before excluding them, they go into the initial commit, and removing them later is a history-rewrite chore, not a one-line fix.

Create a .gitignore file in the project root. Here is a sensible starting point for a Node project:

text
# Dependencies
node_modules/

# Build output
dist/
build/
.next/

# Logs
*.log
npm-debug.log*

# Environment and secrets
.env
.env.local
*.pem

# OS and editor cruft
.DS_Store
Thumbs.db
.vscode/
.idea/

The exact entries depend on your stack: a Python project ignores __pycache__/, *.pyc, and venv/; a PHP project ignores vendor/. GitHub maintains a collection of starter templates per language; grabbing the right one is faster than writing your own. For the full set of patterns, negation rules, and how matching actually works, see my guide to .gitignore with examples.
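
You can test an ignore file before staging anything with git check-ignore, which reports whether a path is ignored and which pattern is responsible. A small sketch in a throwaway directory; the file names (including the left-pad folder) are dummies made up for the demonstration:

```shell
demo=$(mktemp -d)
cd "$demo"
git init -q

# A cut-down ignore file and some files it should and should not catch.
printf '%s\n' 'node_modules/' '.env' '*.log' > .gitignore
mkdir -p node_modules/left-pad src
touch node_modules/left-pad/index.js .env debug.log src/index.js

# -v prints the .gitignore line responsible for each match.
git check-ignore -v node_modules/left-pad/index.js .env debug.log

# Exit status 1 means "not ignored", so src/index.js will be tracked.
git check-ignore -q src/index.js || echo "src/index.js is not ignored"
```

Run the same command against the real paths you are worried about (your .env, your build folder) to confirm the patterns catch them.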

Before you commit, sanity-check what Git is about to track:

bash
git add -A
git status
text
Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   .gitignore
        new file:   README.md
        new file:   package.json
        new file:   src/index.js

If node_modules/ or .env show up in that list, stop. Your ignore file is not catching them. Unstage everything with git reset, fix the patterns, and stage again. What git add actually does to the index is worth a read on its own; see the Git staging area explained.
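
If you would rather preview the result without touching the index at all, git add has a dry-run flag. The sketch below uses dummy files in a throwaway directory, and the grep pattern is only an illustration of the kind of safety net you could script; adjust it to the junk your own stack produces.

```shell
demo=$(mktemp -d)
cd "$demo"
git init -q
printf '%s\n' 'node_modules/' '.env' > .gitignore
mkdir -p node_modules/pkg src
touch node_modules/pkg/index.js .env README.md src/index.js

# -n (--dry-run) lists what would be added but stages nothing.
git add -A -n

# A blunt safety net: complain if a known-bad path would be staged.
if git add -A -n | grep -E 'node_modules/|\.env'; then
  echo "Junk would be staged; fix .gitignore first" >&2
else
  echo "Nothing suspicious; safe to run git add -A"
fi
```

Because nothing is staged by the dry run, there is nothing to undo if the list looks wrong.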

Already staged or committed something you should not have?

If a secret or a junk folder slipped into the staging area but you have not committed yet, git rm --cached removes it from tracking while leaving the file on disk:

bash
git rm -r --cached node_modules
echo "node_modules/" >> .gitignore

That is the right tool for "stop tracking this but keep my local copy," covered in full in removing a file from Git without deleting it. If the bad file already made it into a commit, that is a history problem: removing a leaked credential properly means rewriting history, which I walk through in removing a secret from Git history. Doing the ignore file first is how you avoid ever needing either.
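
The same pattern works for a single staged file such as .env. Here is a sketch with a dummy secret in a throwaway directory, showing that the file leaves the index but stays on disk:

```shell
demo=$(mktemp -d)
cd "$demo"
git init -q
echo 'API_KEY=dummy-value' > .env   # dummy secret for illustration
echo 'console.log("hi")' > index.js
git add -A                          # oops: .env is now staged

git rm --cached -q .env             # drop it from the index only
echo '.env' >> .gitignore
git add -A                          # re-stage; .env is ignored this time

git ls-files                        # lists .gitignore and index.js
ls -a                               # .env is still on disk
```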

Step 3: Make the initial commit

With a clean staging area, commit. Git needs to know who you are first. If you have never set your identity, Git will refuse with a Please tell me who you are message; the fix is two config lines:

bash
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
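
If you would rather not change the global setting, for example on a shared machine or when one project needs a different address, the same keys can be set for a single repository by leaving out --global. A sketch with placeholder values in a throwaway directory:

```shell
demo=$(mktemp -d)
cd "$demo"
git init -q

# Without --global, the values are written to this repo's .git/config only.
git config user.name "Your Name"
git config user.email "you@example.com"

# Read back what this repository will use for commits.
git config user.name
git config user.email
```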

Then commit:

bash
git commit -m "Initial commit"
text
[main (root-commit) 9c1f2ab] Initial commit
 4 files changed, 312 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 README.md
 create mode 100644 package.json
 create mode 100644 src/index.js

(root-commit) confirms this is the very first commit in the repository. The whole existing codebase is now a single snapshot. Some people prefer to break the import into a couple of logical commits instead of one giant one; either is fine for an import. Keep the message short and meaningful (a useful habit once a team is reading the log, which I cover in commit message best practices).
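
If you do want to split the import, staging by path instead of with -A is one way to do it. The sketch below uses dummy files in a throwaway directory; the `commit` helper and the two-commit split are illustrative choices, and the identity is passed inline only so the example runs on a machine with no Git config.

```shell
demo=$(mktemp -d)
cd "$demo"
git init -q
mkdir src
echo 'node_modules/' > .gitignore
echo '# My project' > README.md
echo 'console.log("hi")' > src/index.js

# Hypothetical helper: commit with placeholder identity passed inline.
commit() { git -c user.name="Your Name" -c user.email="you@example.com" commit -q -m "$1"; }

# Stage by path instead of -A to control what goes in each commit.
git add .gitignore README.md
commit "Add ignore rules and README"

git add src
commit "Import existing source"

git log --oneline
```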

Step 4: Create the remote and push

The repository is real and local. To back it up and share it, create an empty repository on a host (GitHub, GitLab, Bitbucket) and connect it as a remote. Create the remote repo empty, with no README, no license, and no .gitignore: your local repo already has its own files and its own first commit, and an auto-generated commit on the remote causes the unrelated histories headache described below.
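
One way to confirm a remote really is empty before you push is git ls-remote, which lists the refs a remote advertises. The sketch below uses a local bare repository as a stand-in for the hosted one so it runs offline; with a real host you would pass the URL it gives you instead.

```shell
demo=$(mktemp -d)
cd "$demo"

# Stand-in for a freshly created, empty repository on a host.
git init -q --bare remote.git

# An empty remote advertises no refs, so ls-remote prints nothing.
refs=$(git ls-remote ./remote.git)
if [ -z "$refs" ]; then
  echo "Remote is empty; safe to push an existing project"
else
  echo "Remote already has commits:"
  echo "$refs"
fi
```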

Copy the remote URL the host shows you and add it:

bash
git remote add origin git@github.com:yourname/my-existing-project.git
git push -u origin main
text
Enumerating objects: 6, done.
Counting objects: 100% (6/6), done.
Writing objects: 100% (6/6), 1.21 KiB | 1.21 MiB/s, done.
Total 6 (delta 0), reused 0 (delta 0)
To github.com:yourname/my-existing-project.git
 * [new branch]      main -> main
branch 'main' set up to track 'origin/main'.

The -u flag sets origin/main as the upstream for your local main, so future git push and git pull need no arguments. Skip it and the next bare git push stops with a "has no upstream branch" error.
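
To see the upstream link for yourself, you can rehearse the whole push against a local bare repository, which behaves like a hosted remote but needs no network or account. This is a sketch under that assumption; the directory names and the inline identity are placeholders.

```shell
demo=$(mktemp -d)
cd "$demo"
git init -q --bare remote.git          # stand-in for the hosted repo
git init -q project
cd project
git symbolic-ref HEAD refs/heads/main  # same result as init -b main, on any Git version
echo '# My project' > README.md
git add -A
git -c user.name="Your Name" -c user.email="you@example.com" commit -q -m "Initial commit"

git remote add origin "$demo/remote.git"
git push -q -u origin main

# Ask Git which remote branch local main now tracks.
git rev-parse --abbrev-ref --symbolic-full-name '@{u}'   # prints: origin/main
```

git branch -vv shows the same information in a more readable form once you have several branches.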

The URL above is SSH. You can also use an HTTPS URL like https://github.com/yourname/my-existing-project.git. They behave the same for pushing; the difference is how you authenticate. SSH uses a key you add to GitHub once; HTTPS uses a personal access token (GitHub removed password authentication for Git operations in August 2021). For the full comparison and how to pick, see Git SSH vs HTTPS remotes.
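
If you add the remote with one style of URL and later change your mind, you do not need to remove and re-add it. A short sketch in a throwaway directory, using the placeholder URLs from above (substitute your own account and repository name):

```shell
demo=$(mktemp -d)
cd "$demo"
git init -q

# Start with the SSH form of the placeholder URL.
git remote add origin git@github.com:yourname/my-existing-project.git
git remote get-url origin

# Switch the same remote to HTTPS in place.
git remote set-url origin https://github.com/yourname/my-existing-project.git
git remote get-url origin
```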

Common errors when pushing an existing project

A few errors cluster around this exact moment. Quick map:

  • remote origin already exists means you ran git remote add origin twice (or the host's instructions did). Use git remote set-url origin <url> to point it at the right place. Full fix in remote origin already exists.
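  • src refspec main does not match any almost always means you have not committed yet, so there is no main branch to push. Make the initial commit first. The other common cause is that your local branch is still named master, in which case rename it with git branch -m main. See src refspec does not match any.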
  • refusing to merge unrelated histories happens when the remote was not created empty (it has its own initial commit from an auto-added README) and yours has another. That is exactly why I said create it empty; the recovery is documented here.
  • failed to push some refs usually means the remote has commits you do not have locally. Pull first to bring those commits down, then push, as covered in failed to push some refs.
  • Permission denied (publickey) is an SSH key problem, not a Git problem. Here is how to fix it.
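
The first two errors in that list can be headed off with a small preflight check before you push. The sketch below is one possible shape for it, run in a throwaway directory; `set_origin` is a hypothetical helper name and the URL is a placeholder.

```shell
demo=$(mktemp -d)
cd "$demo"
git init -q
url="git@github.com:yourname/my-existing-project.git"   # placeholder URL

# Avoid "remote origin already exists": update the URL if origin is defined.
set_origin() {
  if git config --get remote.origin.url >/dev/null; then
    git remote set-url origin "$1"
  else
    git remote add origin "$1"
  fi
}
set_origin "$url"
set_origin "$url"   # running it twice is harmless

# Avoid "src refspec main does not match any": confirm a commit exists first.
if git rev-parse -q --verify HEAD >/dev/null; then
  echo "Ready to push $(git symbolic-ref --short HEAD)"
else
  echo "No commits yet; make the initial commit before pushing"
fi
```

For the unrelated histories case, Git provides the --allow-unrelated-histories option on git pull to merge the two roots, but creating the remote empty remains the simpler path.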

What about a legacy project with no Git history?

This whole process creates fresh history starting today. You are not recovering a past that was never recorded; before git init there were no commits to find. That is expected and fine: the initial commit is your "this is where tracking began" line, and everything from here gets a real history. From this point on you can branch safely (covered in Git branching for beginners), and if you ever lose committed work after the repo exists, the reflog can usually recover it. (If you actually landed here meaning to grab code that already lives on a host, that is a different task: cloning an existing repo someone else owns, not adding Git to a folder of your own.)

If the project is going onto a team's repo rather than your personal one, set up branch protection and a workflow before others start pushing: see protecting your main branch and Git workflows for teams.



Keep reading

Related posts

How to SSH into a Google Cloud VM Without gcloud

Connect to a GCP VM using plain OpenSSH, no gcloud required. Add a public key to instance metadata, fetch the external IP, and ssh in like any normal Linux box. Plus OS Login, IAP, and a Windows PuTTY path.