First, an incomplete list of terms:
repository (repo): a directory with git superpowers. In addition to the files and directories, it also contains a hidden .git/ directory that holds information about versions, branches, etc. You should never modify files in .git/ directly.
remote: a copy of your repository that is located somewhere else. Typically, you will have one local copy on your computer and one or more remotes hosted on sites like gitlab, but there is nothing special about local vs remote. In principle, I could set up the repo that is your local as a remote from my computer. The repos hosted on gitlab are just directories with git superpowers and a webpage.
commit: a saved version of the repo. You can think of each commit as a diff (changes) from the previous commit, not a whole new copy of the file(s).
branch: a particular chain of commits. Typically, there’s a primary branch (named main or master), and then zero or more development branches that are kept separate to avoid conflicts. Once development on a branch is complete, it can be merged into the main branch.
Now, the steps for working with an existing git repo. Note that I will include the terminal commands, but GUI applications like Sourcetree can do almost everything you’ll need, and use similar terminology.
The first step in working with a git repository is to “clone” it. Cloning makes a copy of the directory along with all of the git magic (stored in .git/).
By default, the place that you cloned from (in this case, the gitlab repo) will be set as the default remote and named origin.
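Here is a minimal sketch of cloning that you can run without touching a real project. It assumes a stand-in "hosted" repo created on local disk (the names hosted and myproject are made up for illustration); with a real project you would pass the repo URL shown on gitlab to git clone instead.

```shell
set -e
work=$(mktemp -d)   # scratch area; everything below lives here

# Stand-in for the hosted repo (hypothetical): a local repo with one commit on main.
git init -q "$work/hosted"
git -C "$work/hosted" symbolic-ref HEAD refs/heads/main
echo "hello" > "$work/hosted/README.md"
git -C "$work/hosted" add README.md
git -C "$work/hosted" -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "Initial commit"

# Clone it: this copies the files and the .git/ directory.
git clone -q "$work/hosted" "$work/myproject"
cd "$work/myproject"

ls -a            # working files plus the hidden .git/ directory
git remote -v    # the place you cloned from is listed as "origin"
```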
If you are working by yourself, you can get away with only ever using the default branch. So far, I’ve primarily been working on the main branch for this repo. When collaborating though, it’s a good idea to only make changes on personal branches, then merge them into main only by agreement.
You do this by running
$ git branch mybranchname # creates a new branch
$ git checkout mybranchname # makes it your working branch
(or alternatively, do both together with git checkout -b mybranchname).
If you ever need to get back to the main branch, just do git checkout main (but be sure to save / commit any changes first).
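To see the branch commands in action, here is a small sketch in a throwaway repo. The file name notes.txt and the identity settings are placeholders, and the repo is created fresh here only so the example runs on its own; in practice you would be inside your clone.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # make sure the first branch is called main
git config user.name "Example"             # placeholder identity for this scratch repo
git config user.email "example@example.com"

echo "draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"               # a branch needs at least one commit to start from

git checkout -q -b mybranchname            # create the branch and switch to it
on_branch=$(git rev-parse --abbrev-ref HEAD)
echo "working on: $on_branch"

git checkout -q main                       # back to the main branch
back_on=$(git rev-parse --abbrev-ref HEAD)
echo "back on: $back_on"
```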
You can edit and save files in a git repo as normal. But saving them does not save a version from git’s perspective. To tell git that you’ve got some changes you want to save a version of, first you “stage” the files using git add. If you do this with a newly created file, this also has the effect of causing git to “track” it, so future changes will be noted.
You can stage just one file you’ve modified, all of them, or even just certain modifications within a file (git calls these “hunks”; git add -p lets you pick them interactively).
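As a rough illustration of staging, the sketch below stages one file and then everything, checking git status --short after each step. The file names are invented for the example, and it runs in a scratch repo so nothing real is affected.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "a" > analysis.txt
echo "b" > figure.txt

git add analysis.txt            # stage just one file (this also starts tracking it)
one=$(git status --short)
echo "$one"                     # analysis.txt is staged ("A "), figure.txt is untracked ("??")

git add .                       # stage everything under the current directory
all=$(git status --short)
echo "$all"                     # both files are now staged

# git add -p <file> would instead walk through individual hunks interactively.
```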
Once you’re happy with the changes, it’s time to save them with git commit. Typically, you will also include a commit message. Try to make these informative, since they can be quite helpful later on.
Note: You can stage and commit at the same time by doing git commit -a, which will stage and commit changes to any previously tracked files (though not brand new files that have never been git added).
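The difference between a plain commit and git commit -a is easier to see in a run-through. This is only a sketch in a scratch repo, with made-up file names and a placeholder identity:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"             # placeholder identity for this scratch repo
git config user.email "example@example.com"

echo "x,y" > data.csv
git add data.csv                           # stage (and start tracking) a new file
git commit -q -m "Add raw data file"       # -m supplies the commit message inline

echo "1,2" >> data.csv                     # modify a tracked file
echo "scratch" > untracked.txt             # brand-new file, never git added
git commit -q -a -m "Add first data row"   # stages and commits tracked files only

commits=$(git rev-list --count HEAD)
leftover=$(git status --short)
echo "commits so far: $commits"
echo "$leftover"                           # untracked.txt was left out of the commit
```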
At this point, your changes have been recorded on your local machine, but aren’t yet reflected in the remote repo. To sync them, use git push. The first time you do this for a new branch, you need to tell git where to push, so typically you’ll do git push --set-upstream origin mybranchname.
This will create a new branch at the remote, and push your changes to it. Future pushes can be accomplished with just git push.
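Putting the first push and a later push together looks roughly like this. The "hosted" repo is again a local stand-in for the gitlab remote (an assumption made so the sketch runs offline), and results.txt is a made-up file.

```shell
set -e
work=$(mktemp -d)

# Stand-in for the hosted repo (hypothetical): a local repo with one commit on main.
git init -q "$work/hosted"
git -C "$work/hosted" symbolic-ref HEAD refs/heads/main
git -C "$work/hosted" -c user.name="Example" -c user.email="example@example.com" \
    commit -q --allow-empty -m "Initial commit"

git clone -q "$work/hosted" "$work/myproject"
cd "$work/myproject"
git config user.name "Example"             # placeholder identity
git config user.email "example@example.com"

git checkout -q -b mybranchname
echo "result" > results.txt
git add results.txt
git commit -q -m "Add first results"
git push -q --set-upstream origin mybranchname   # first push: name the remote and branch

echo "more" >> results.txt
git commit -q -a -m "Extend results"
git push -q                                      # later pushes need no arguments

remote_tip=$(git -C "$work/hosted" rev-parse mybranchname)
local_tip=$(git rev-parse HEAD)
echo "remote: $remote_tip"
echo "local:  $local_tip"
```

If the two hashes printed at the end match, the remote branch has caught up with your local one.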
Typically, you’ll do many stage-commit-push cycles on your branch before it gets merged. You can also just stage-commit, stage-commit a bunch of times, then push once at the end of the day.
I’ll typically do one commit per “unit” of work I complete, whether that’s getting a figure generated, getting some column of data wrangled, or whatever. But this isn’t a hard and fast rule. Try not to go too long between commits, but you certainly don’t need one per line of code written.
Now that your branch is on the remote, you can create a “merge request” (“MR” - on github this is called a “pull request”/“PR”). This isn’t really a git thing - it’s one of the social features of the code-hosting platforms, but it’s quite handy. Basically, it says “I have a bunch of changes that should be merged into the main branch”.
But you don’t need to wait until you’re done. Gitlab has a bunch of nice features for collaborating, so I can leave comments / suggestions on your in-progress edits. I suggest that you open an MR right from the start, and just call it “Draft: my analysis” or something.
You can create a merge request by just navigating to the repo on gitlab after the first time you push. There will be a banner asking if you want to start one. Just click the button and follow the prompt.
Once we both agree that a chunk of work is complete, we merge the MR into the main branch. You can do this locally as well, but this is a case where using the extra features of gitlab / github is really nice. I often use MRs/PRs for my personal projects to keep track, even though I could do everything locally.
If anything else has changed in the meantime, there may be merge conflicts that need to be dealt with, but we can jump off that bridge when we come to it.
Now that the chunk of work is complete, it’s important to make sure your local repo reflects that. First, git checkout main to switch to the main branch, then git pull to sync everything, including any changes from other MRs that have been merged in the meantime.
Once that’s done, you can delete the old branch with git branch -D mybranchname, then check out a new one and get started again. (The capital -D deletes the branch even if git thinks it has unmerged commits; lowercase -d is the cautious version that refuses in that case.)
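The whole sync-and-clean-up step can be sketched end to end as below. Two things here are assumptions for the sake of a runnable example: the "hosted" repo is a local stand-in for gitlab, and merging the MR is imitated by running git merge on the hosted side. The sketch uses the cautious -d, which works here because the branch really has been merged.

```shell
set -e
work=$(mktemp -d)

# Stand-in for the hosted repo (hypothetical): a local repo with one commit on main.
git init -q "$work/hosted"
git -C "$work/hosted" symbolic-ref HEAD refs/heads/main
git -C "$work/hosted" -c user.name="Example" -c user.email="example@example.com" \
    commit -q --allow-empty -m "Initial commit"

# Your side: clone, do some work on a branch, push it.
git clone -q "$work/hosted" "$work/myproject"
cd "$work/myproject"
git config user.name "Example"             # placeholder identity
git config user.email "example@example.com"
git checkout -q -b mybranchname
echo "result" > results.txt
git add results.txt
git commit -q -m "Add results"
git push -q --set-upstream origin mybranchname

# Stand-in for clicking "merge" on the MR (happens on the hosted side).
git -C "$work/hosted" merge -q mybranchname

# Sync your local main, then remove the finished branch.
git checkout -q main
git pull -q
git branch -d mybranchname

remaining=$(git branch --list mybranchname)
local_main=$(git rev-parse main)
hosted_main=$(git -C "$work/hosted" rev-parse main)
```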