Tips and tricks for visualizing data with Matplotlib

Matplotlib is a powerful, highly configurable visualization library. You can use it for quick exploratory plots or for sophisticated figures with custom styling and many components. To get a sense of what is possible, check out the Matplotlib example gallery.

Unfortunately, this library’s flexibility and power come at a cost: Matplotlib can be confusing to use.

One of the most important things to know about using Matplotlib is that there are two distinct interfaces:

  • Functional interface (pyplot), and
  • Object-oriented interface.

The functional interface is simpler to use but it doesn’t give you as much flexibility. Let’s have a look at some common Matplotlib questions, keeping these two interfaces in mind.

How do I install and update Matplotlib?

First things first: Matplotlib can be installed with pip:

pip install matplotlib

If you already have Matplotlib installed and want to update to a newer version, you can force pip to update the package using:

pip install -U matplotlib

or

pip install --upgrade matplotlib

This will upgrade you to the latest available version of Matplotlib. If there is a specific version you know you need, you can install that specific version. For example, if you want to install version 3.5.3:

pip install matplotlib==3.5.3
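
To confirm which version pip actually installed, one option is to print the package's version string from Python. This is only a quick check; the version you see depends on your environment:

```python
import matplotlib

# Print the installed Matplotlib version string
print(matplotlib.__version__)
```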

Then you can import Matplotlib to use in your code. The convention is to import Matplotlib as follows:

import matplotlib.pyplot as plt

Occasionally, you may need to import a different component of the Matplotlib package, but most of the time matplotlib.pyplot will be all you need.

What is the simplest way to create a Matplotlib plot?

The simplest way to create a Matplotlib plot is using the functional pyplot approach:

Simple pyplot plot
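
A minimal sketch of the functional approach might look like the following. The data values are made up for illustration, and the show call is left commented out so the snippet also runs in a non-interactive session:

```python
import matplotlib.pyplot as plt

# Hypothetical sample data, chosen only for illustration
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

# pyplot draws onto an implicit "current" figure and axis
lines = plt.plot(x, y)

# plt.show()  # uncomment to display the figure when running as a script
```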

How do I use Matplotlib subplots?

If you would like to create a figure containing multiple subplots, you can use the subplots function. The subplots function creates two objects: a figure, which acts as the canvas holding all of the plots, and a grid of axes, which you can plot your data onto:

Subplots
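
As a rough sketch, assuming the same made-up data as a simple quadratic curve, a 2-by-2 grid with a line on a single axis could be created like this:

```python
import matplotlib.pyplot as plt

# Hypothetical sample data
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

# fig is the canvas; ax is a 2x2 NumPy array of axis objects
fig, ax = plt.subplots(2, 2)

# Plot only on the axis in the first row, second column
ax[0, 1].plot(x, y)
```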

Here we created a figure containing a 2-by-2 grid of axes and plotted a line on just one of the axes. When the grid has more than one row and more than one column, the axis variable returned by the subplots function is a two-dimensional NumPy array with the same shape as the grid. In the example above, we created a plot on the axis ax[0,1], which is the axis in the first row and second column (using Python’s zero-based indexing).

This method of creating a Matplotlib figure sets us up to use the object-oriented interface, because each axis is an object with a suite of attributes and methods that we can then use to modify it.

How do I change axis values in Matplotlib?

We’ve created a plot. Now how do we configure its axis labels and limits?

If you are creating a single simple plot, then you can use the pyplot approach to adjust labels and limits:

Pyplot plot with axis label
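
One way to do this with pyplot, sketched with an invented label and limits, is to call the module-level ylabel and xlim functions after plotting:

```python
import matplotlib.pyplot as plt

# Hypothetical sample data
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

plt.plot(x, y)
plt.ylabel("y value")  # label text is an illustrative choice
plt.xlim(0, 3)         # show only x values between 0 and 3
```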

If you have used the subplots function to create multiple axes, then you can change the axis values using axis object methods:

Configured subplots
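
The object-oriented equivalent, again with an illustrative label and limits, calls set_ylabel and set_xlim on the specific axis you want to change:

```python
import matplotlib.pyplot as plt

# Hypothetical sample data
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

fig, ax = plt.subplots(2, 2)
ax[0, 1].plot(x, y)

# Configure only the axis we plotted on; the other three are untouched
ax[0, 1].set_ylabel("y value")
ax[0, 1].set_xlim(0, 3)
```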

Our axis object ax[0,1] has the associated methods set_ylabel and set_xlim, which you can use to configure the plot. Axes have many other configurable properties in addition to labels and limits. You can find a comprehensive list in the Matplotlib Axes API documentation.

How do I add a legend in Matplotlib?

There are two steps needed to add a legend. First, you need to label each of the plot components with the label you want to see in the legend. Second, you need to create the legend.

In the pyplot method, we add the label to the plot and then call the legend function to create the legend:

Pyplot with legend
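
A possible pyplot version of those two steps is sketched below. The two curves and their label strings are invented for the example:

```python
import matplotlib.pyplot as plt

# Hypothetical sample data: two curves to tell apart in the legend
x = [0, 1, 2, 3, 4]
squares = [0, 1, 4, 9, 16]
cubes = [0, 1, 8, 27, 64]

# Step 1: label each plot component
plt.plot(x, squares, label="squares")
plt.plot(x, cubes, label="cubes")

# Step 2: create the legend from those labels
legend = plt.legend()
```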

The object-oriented subplots approach is similar, but in this case, the legend is created on the axis object:

Subplots with legend
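
With the same invented curves and labels, the object-oriented version might look like this, with the legend attached to one axis of the grid:

```python
import matplotlib.pyplot as plt

# Hypothetical sample data
x = [0, 1, 2, 3, 4]
squares = [0, 1, 4, 9, 16]
cubes = [0, 1, 8, 27, 64]

fig, ax = plt.subplots(2, 2)
ax[0, 1].plot(x, squares, label="squares")
ax[0, 1].plot(x, cubes, label="cubes")

# The legend belongs to the axis, not to the figure or to pyplot
legend = ax[0, 1].legend()
```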

How do I change the color of a bar chart in Matplotlib?

Changing the color of all of the bars in a bar chart is as simple as setting the color parameter:

Colored bar chart
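
For instance, assuming some made-up categories and heights, passing a single named color to the bar method colors every bar the same way:

```python
import matplotlib.pyplot as plt

# Hypothetical categories and bar heights
labels = ["a", "b", "c", "d"]
heights = [3, 7, 5, 2]

# With no grid arguments, subplots returns a single axis object
fig, ax = plt.subplots()
bars = ax.bar(labels, heights, color="tomato")
```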

We chose a tomato shade of red. The full list of named colors is available in the Matplotlib documentation.

Notice how we used the subplots command to create a figure with a single axis. In that case, subplots returns the axis object directly, not a NumPy array of axis objects.

Remember we said at the start of this post that Matplotlib was highly configurable, so surely we can do more with color than just changing our plot to a single, solid color? Indeed, we can!

How about making one of our bars a different color? We can do this by setting the color of each bar explicitly:

Bar chart with two colors
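
A small sketch of this idea follows. Which bar gets highlighted is an arbitrary choice here, and the data is invented:

```python
import matplotlib.pyplot as plt

# Hypothetical categories and bar heights
labels = ["a", "b", "c", "d"]
heights = [3, 7, 5, 2]

# Start with mid-gray for every bar ("0.5" is a grayscale string),
# then highlight one bar (index 1 is an arbitrary choice)
colors = ["0.5"] * len(heights)
colors[1] = "green"

fig, ax = plt.subplots()
bars = ax.bar(labels, heights, color=colors)
```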

Here we supply the bar chart with a list of colors, one for each bar. We set one bar to be green and the rest to a grayscale value of 0.5.

Another option is to use a Matplotlib colormap to create a gradient of color:

Bar chart with color gradient
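
One way to build such a gradient is to sample a colormap at evenly spaced positions and pass the resulting colors to the bar method. The sampling range of 0.9 down to 0.3 is an assumption made for this sketch, picked so the bars go from dark to light:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical categories and bar heights
labels = ["a", "b", "c", "d"]
heights = [3, 7, 5, 2]

# Calling a colormap with values in [0, 1] returns RGBA colors;
# for Blues, higher values are darker, so 0.9 -> 0.3 gets lighter
shades = plt.cm.Blues(np.linspace(0.9, 0.3, len(heights)))

fig, ax = plt.subplots()
bars = ax.bar(labels, heights, color=shades)
```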

Here we created an array of increasingly lighter shades of blue, using the Blues colormap. The Matplotlib documentation has more information about which colormaps are available and how to use them.

How do I save my Matplotlib plot as a PNG?

We can save our plots either using the pyplot approach:

Save as a png
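
As a sketch, with a file name invented for the example, pyplot's savefig function writes the current figure to disk and infers the PNG format from the file extension:

```python
from pathlib import Path

import matplotlib.pyplot as plt

# Hypothetical sample data and output file name
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]
output_path = Path("my_plot.png")

plt.plot(x, y)

# Saves the current figure; the .png suffix selects the format
plt.savefig(output_path)
```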

Or using the object-oriented approach:

Save as a png
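
The object-oriented version could look like the following, again with a made-up file name:

```python
from pathlib import Path

import matplotlib.pyplot as plt

# Hypothetical sample data and output file name
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]
output_path = Path("my_subplots.png")

fig, ax = plt.subplots(2, 2)
ax[0, 1].plot(x, y)

# savefig is called on the figure, so all four axes are written out
fig.savefig(output_path)
```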

Note that in the object-oriented approach, savefig is a method on the figure object, not the axis object. It is the full canvas that gets saved, not just an individual axis.

How can I use markers instead of lines in a Matplotlib plot?

The simplest way to change to markers rather than lines in a Matplotlib plot is to take advantage of the third format argument to the plot function. This is a shorthand method to set basic marker, line, and color options. For example, we can change lines to markers in our plot:

Plotting with markers
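
A minimal sketch, using invented data, passes the format string 'o' as the third argument so that each point is drawn as a circle with no connecting line:

```python
import matplotlib.pyplot as plt

# Hypothetical sample data
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

# A format string with only a marker code turns the line off
lines = plt.plot(x, y, "o")
```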

We can use the format option to use markers as well as lines ('o-'), to change the line style (e.g. '--' for dashes) or the marker style (e.g. 's' for squares), or to specify the color (e.g. 'm' for magenta):

Various marker plot formats

We can also explicitly set each option with its own argument:

Setting plot format explicitly
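
Spelled out with keyword arguments, the same kind of styling might be written as below. The particular combination of squares, dashes, and magenta is an arbitrary example:

```python
import matplotlib.pyplot as plt

# Hypothetical sample data
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

fig, ax = plt.subplots()

# Each style option gets its own keyword argument
lines = ax.plot(x, y, marker="s", linestyle="--", color="m")
```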

If we are using explicit arguments, we can turn off the line and just use markers by setting the line style to the string 'None':

Setting plot format explicitly
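
A short sketch of a markers-only plot with explicit arguments, using the same invented data:

```python
import matplotlib.pyplot as plt

# Hypothetical sample data
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

fig, ax = plt.subplots()

# The string "None" (not the Python value None) disables the line
lines = ax.plot(x, y, marker="o", linestyle="None")
```

Note that the Python value None means "use the default style", which would draw a solid line, so the string form is the one that removes the line.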

How can I collaborate on Matplotlib figures?

Engineers and data scientists often use version-control tools like GitHub or Bitbucket to collaborate. Collaboration workflows usually include a code review step in which colleagues review each other’s work and provide feedback. For entirely code-based work, the review tooling provided by version control systems does a great job of enabling these feedback conversations.

But if your work includes visual elements like Matplotlib plots, the existing version-control tooling is inadequate. GitHub does render figures that are part of Jupyter notebooks, and it recently launched rich diffs for notebooks, which include figure diffs. But you can’t comment on notebooks, so reviewing visual elements is still awkward.

If you want to effectively collaborate on Matplotlib figures, the best option is still ReviewNB. With ReviewNB you can comment on Jupyter notebooks inline, allowing you and your colleagues to discuss all your plots in a single view.