How to use the JupyterLab Git extension

Version control is indispensable to collaboration in Jupyter Notebooks. You can use version control tools like Git and GitHub with Jupyter Notebooks to:

  • Track changes to your notebooks.
  • Share notebooks with your team.
  • Review notebooks.
  • Build collaboratively on your work.

If you haven’t yet used GitHub with Jupyter, check out our basic tutorial, which walks through a command-line workflow for using Git and GitHub with notebooks.

In this post, we’ll talk about the JupyterLab Git extension, which offers a UI-based Git workflow for notebooks (clone, pull, push, diff, merge) right from within JupyterLab.

Set up

If you are using a recent version of JupyterLab (version 3.0 or later), you can install the Git extension the same way you install any other Python package, using pip or conda:

pip install --upgrade jupyterlab-git

or

conda install -c conda-forge jupyterlab-git

For older versions of JupyterLab, refer to instructions on the jupyterlab-git extension GitHub page.
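
To confirm the installation worked, you can check from a terminal. The snippet below is a minimal sketch, assuming pip and jupyter are on your PATH; the status variable is just an illustrative name. If the package is found, it asks JupyterLab to list its frontend and server extensions, where the Git extension should show up as enabled.

```shell
# Check whether the Python package is installed (assumes pip is on PATH).
if pip show jupyterlab-git >/dev/null 2>&1; then
  status="installed"
else
  status="missing"
fi
echo "jupyterlab-git package: $status"

# If the package is there, list the extensions JupyterLab and the Jupyter server know about.
if [ "$status" = "installed" ]; then
  jupyter labextension list || echo "could not run jupyter labextension list"
  jupyter server extension list || echo "could not run jupyter server extension list"
fi
```

If the package is reported as missing, re-run the pip or conda command above in the same environment that JupyterLab runs in.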

Git workflow

Once you have installed the extension, start a JupyterLab server by running:

jupyter-lab

JupyterLab will automatically open in your web browser.

You’ll see a new Git menu item at the top and a Git icon on the left-hand panel (shown in red):

jupyterlab-git

Click on the icon to set up the Git extension:

jupyterlab-git-setup

There are three buttons listed here:

  • Navigate to a folder that is already in a Git repository.
  • Create a new Git repository.
  • Clone an existing Git repository.
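
If you know Git from the command line, these three options map roughly onto familiar commands. The sketch below runs them in a throwaway directory; demo-repo and demo-clone are hypothetical names, and the clone uses a local path instead of a GitHub URL so that it works offline.

```shell
# Work in a throwaway directory so nothing else on your machine is touched.
workdir="$(mktemp -d)"
cd "$workdir"

# "Create a new Git repository" roughly corresponds to git init.
git init -q demo-repo

# "Navigate to a folder that is already in a Git repository":
# inside such a folder, Git reports that it is in a work tree.
cd demo-repo
inside="$(git rev-parse --is-inside-work-tree)"
echo "inside a repository: $inside"

# "Clone an existing Git repository" corresponds to git clone.
# Git warns that the source repository is empty, which is expected here.
cd "$workdir"
git clone -q demo-repo demo-clone
```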

Git workflow: clone

Let’s clone an existing Git repository. We’ll clone the repository of example notebooks provided by the naas.ai project.

Let’s fork the repository so we can make some changes. Next, navigate to the forked repository, click on the green Code button, and copy the link to clone the repository:

You can follow these instructions to learn more about setting up the appropriate authentication protocol to connect to a private repository.

jupyterlab-git-repo

Click on the Clone a Repository button on the JupyterLab Git extension panel and paste the repository link we copied above:

jupyterlab-git-clone-repo

Now the repository is cloned to your local machine, and you can see the awesome-notebooks folder in the navigation panel.

jupyterlab-git-awesome-notebooks

Navigate to the awesome-notebooks folder and click on the Git icon on the left-hand panel:

jupyterlab-git-awesome

We are now in a Git repository, so the Git panel displays various Git information: which branch you are in, uncommitted changes you have made, and a panel to commit your changes.

Git workflow: branch

Before we edit any notebooks, let’s create a separate branch to track all of our changes.

Create a new branch by navigating back to the Git panel. Click on the “Current Branch” panel, then the “New Branch” button.

jupyterlab-git-new-branch

We’ll call our branch text-left:

jupyterlab-git-text-left
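
For reference, here is a rough command-line equivalent of creating the branch. It is a sketch that uses a throwaway repository with a single empty commit as a stand-in for the cloned awesome-notebooks folder; the Demo identity is a placeholder.

```shell
# Placeholder identity so the demo commit works on any machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository with one commit, standing in for the cloned folder.
cd "$(mktemp -d)" && git init -q
git commit -q --allow-empty -m "Initial commit"

# Create the text-left branch and switch to it in one step.
git checkout -q -b text-left

# Confirm which branch is checked out, as the "Current Branch" panel does.
current="$(git rev-parse --abbrev-ref HEAD)"
echo "current branch: $current"
```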

Git workflow: diff

Now let’s edit a notebook called NASA_Sea_level.ipynb. Our change will move the “data source” text further left on the plot. Once you have made the change and saved the notebook, you’ll see the notebook listed under “Changed”.

jupyterlab-git-new-branch

If you hover over the name of the changed file, you’ll see icons to open the file, view the diff between the current branch and your change, revert the change, and stage the change.

Let’s check out the rich notebook diff that the JupyterLab Git extension shows:

jupyterlab-git-diff
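
On the command line, the closest counterpart is git diff, which compares your working files with the last commit line by line. The sketch below uses a small text file, cell.txt, as a stand-in for a notebook cell; the plotting line and its coordinates are hypothetical, not the actual code in NASA_Sea_level.ipynb. For real notebooks, the nbdime package provides an nbdiff command that understands the notebook structure.

```shell
# Placeholder identity so the demo commit works on any machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q

# Stand-in for a notebook cell; the plotting line is hypothetical.
echo 'ax.text(0.90, 0.05, "data source")' > cell.txt
git add cell.txt && git commit -q -m "Add plot cell"

# Edit the file: move the text further left.
echo 'ax.text(0.60, 0.05, "data source")' > cell.txt

# Show the unstaged change relative to the last commit.
git --no-pager diff

# List only the names of the changed files, like the "Changed" section.
changed="$(git diff --name-only)"
echo "changed files: $changed"
```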

Git workflow: commit

In the diff, we can see the edit we made to the plotting line. If the change looks satisfactory, we can stage it by hovering over the name of the changed file under “Changed” and clicking the + icon. Then enter a commit message and click on “COMMIT”. This creates a local commit, which acts as a checkpoint.

jupyterlab-git-commit
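
The + icon and the COMMIT button correspond roughly to git add and git commit. A minimal sketch in a throwaway repository, again with the hypothetical cell.txt standing in for the notebook and a placeholder identity:

```shell
# Placeholder identity so the demo commit works on any machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q

# Stand-in for the edited notebook; the content is hypothetical.
echo 'ax.text(0.60, 0.05, "data source")' > cell.txt

# Clicking the + icon stages the file; git add is the command-line counterpart.
git add cell.txt
git status --short

# Entering a message and clicking COMMIT corresponds to git commit.
git commit -q -m "Move data source text left"

# Read back the subject line of the latest commit.
subject="$(git log -1 --format=%s)"
echo "latest commit: $subject"
```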

Git workflow: push

Now we push our local commit to the remote GitHub repository. Have a look at the screenshot below:

jupyterlab-git-text-left

The cloud on the left pulls the latest changes from the remote repository into your local branch. Conversely, the cloud on the right pushes local changes to the remote repository.

The right-hand cloud now has a red dot over it indicating un-pushed changes. Let’s click on the right-hand cloud to push our changes to GitHub. This will push our newly created branch text-left to the remote GitHub repository. The branch contains the changes we made to the NASA_Sea_level.ipynb notebook.
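
In command-line terms, the two clouds correspond roughly to git pull and git push. The sketch below is an offline stand-in: a local bare repository (remote.git, a hypothetical name) plays the role of the GitHub remote, so no network access or credentials are needed.

```shell
# Placeholder identity so the demo commits work on any machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work="$(mktemp -d)"

# A bare repository standing in for the remote on GitHub.
git init -q --bare "$work/remote.git"

# A local repository that uses it as "origin".
git init -q "$work/local"
cd "$work/local"
git remote add origin "$work/remote.git"
git commit -q --allow-empty -m "Initial commit"

# Our branch with one local commit that has not been pushed yet.
git checkout -q -b text-left
git commit -q --allow-empty -m "Move data source text left"

# Right-hand cloud: push the new branch and remember its upstream.
git push -q -u origin text-left

# Left-hand cloud: pull the latest changes for the current branch.
git pull -q

# Ask the remote whether it now has the branch.
pushed="$(git ls-remote --heads origin text-left)"
echo "$pushed"
```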

Git workflow: merge conflict

Perhaps there was some miscommunication in your team. Before you had a chance to merge your changes into the master branch, one of your colleagues changed the plot text and issued a pull request ahead of you.

You disagree with her choice of plot end date, so you still want to make a pull request with your change. Now when you make a pull request from your branch, there is a conflict:

jupyterlab-git-branch-conflict

If you switch back to the master branch in JupyterLab now, a pop-up will appear to let you know that the remote version of the file has changed:

jupyterlab-git-remote change

Let’s go ahead and pull the latest changes to master made by your colleague.

Then switch back to the text-left branch, so we can merge the master updates into our branch. Click on the Git menu at the top and choose “Merge Branch”, then select the master branch to merge into your text-left branch:

jupyterlab-git-merge-master

This action will fail and you will see the NASA_Sea_level.ipynb notebook appear in the “Conflicted” section of the left-hand Git panel. Double-click to view the conflict. You will see the three versions: the changes we made in our current branch, the original master, and the new master (updated by your colleague):

jupyterlab-git-conflict-compare

Change the bottom cell to the date you want (say, 2016-01-01), then click on “Mark as resolved”. You will see the notebook file move from the “Conflicted” section to the “Staged” section. It’s ready to commit. Use the bottom commit box to commit, push the commit, and you’re done!

You’ve resolved the notebook merge conflict with the help of the JupyterLab Git extension. Now if you go back to GitHub, you’ll see the conflict is gone and our pull request is ready for review:

jupyterlab-git-conflict-resolve
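
The merge-and-resolve sequence in this section can be reproduced on the command line. The following is a sketch under stated assumptions: it runs in a throwaway repository, cell.txt stands in for the notebook, and the end_date values are made up for illustration. It omits the push step, since there is no remote.

```shell
# Placeholder identity so the demo commits work on any machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
# Name the initial branch master regardless of the local Git default.
git symbolic-ref HEAD refs/heads/master

# Original master: a hypothetical cell with an end date.
echo 'end_date = "2015-01-01"' > cell.txt
git add cell.txt && git commit -q -m "Add plot cell"

# Our change on the text-left branch.
git checkout -q -b text-left
echo 'end_date = "2016-01-01"' > cell.txt
git commit -q -am "Our end date"

# The colleague's change lands on master.
git checkout -q master
echo 'end_date = "2017-01-01"' > cell.txt
git commit -q -am "Colleague end date"

# Merge master into text-left; both sides edited the same line, so this stops.
git checkout -q text-left
git merge master || echo "merge stopped on a conflict"

# Unmerged files are what the "Conflicted" section lists.
conflicted="$(git status --porcelain)"
echo "$conflicted"

# Resolve by writing the content we want, then stage and commit.
# Staging plays the role of "Mark as resolved".
echo 'end_date = "2016-01-01"' > cell.txt
git add cell.txt
git commit -q -m "Merge master into text-left"
remaining="$(git status --porcelain)"
```

For real notebooks, resolving conflicts by hand in the raw file is error-prone, which is why a cell-aware view like the one in the extension helps.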

Code Review for Jupyter Notebooks

If you open up the “Files changed” section of your notebook’s pull request on GitHub, you will see the diff of your commit in raw JSON format.

github-notebooks-in-json-format

As you can imagine, a textual JSON diff like the one above makes it hard to review Jupyter Notebooks on GitHub.

You can use tools like ReviewNB to review your notebook changes in a rich diff format. You can see which notebook cells have changed and what the changes are, and write a comment on any notebook cell.

ReviewNB is a GitHub Marketplace app and integrates seamlessly with GitHub and Bitbucket repositories.

reviewnb-rich-diff-notebooks

End note

The JupyterLab Git extension gives us a visual way to keep track of our branches and commits, all from inside the notebook. What a pleasure!

With built-in diffs and support for merge conflicts, this extension makes your day-to-day branching, committing, and pushing as simple as a few clicks. We all need a bit of a push to version control our notebooks more consistently, and this could be the push your team needs!

To explore other JupyterLab Git extensions, you can check out the JupyterLab GitPlus extension by ReviewNB on GitHub, or read about it on our blog. It helps push commits and create pull requests from JupyterLab.