
How To Plot Histogram In Python Using Matplotlib

In this article, we will take a look at how we can plot a histogram in Python using the Matplotlib library. This is something quite different from the basics of line plotting we have seen so far, so you may need some time to get through it. That is why I will go through it in an easy-to-understand way. Alright?

So relax and grab a cup of coffee if you want to, because we will now look at how to plot a histogram in Python using Matplotlib!

Plot Histogram In Python Using Matplotlib – The Basics

To get started, let us first learn a bit about what a histogram is. Then we will look at the other questions, like where it is used and how to draw it using Matplotlib. Okay? Great! So here we go!

What is a Histogram?

A histogram is a way to display how often values occur in a set of data, that is, their frequencies. So what does it look like? In simple words, it is drawn using bars.

Oh wait a second! So does that mean it is a kind of bar graph? Yeah, you are right. Kind of! The difference is that each bar in a histogram stands for a range of numeric values rather than a separate category.

So what happens is that the data you want to show in a histogram is grouped together. The grouping is not random, though. Values that fall within the same range go into the same group. Does that make sense? When you plot, each bar shows how many values landed in one of these groups.

Now there is one other thing. In Matplotlib, we call these groups of data bins.

What are Histogram bins?

So a histogram bin is nothing but a range of values, together with the data points that fall inside that range. That is all it is. There is nothing really confusing about it!

Alright. Now that we know what a histogram bin is, it is time for us to see an example. So how do we go about plotting a histogram in Python?
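To make the idea of a bin concrete, here is a minimal sketch that counts values per bin without drawing anything. It assumes NumPy is installed. The scores and the bin edges are made up purely for illustration, and np.histogram is just one convenient way to do the counting.

```python
import numpy as np

# Hypothetical sample: eight made-up exam scores.
scores = np.array([52, 55, 61, 67, 68, 74, 79, 95])

# Bin edges chosen by hand for this illustration.
# They define three bins: 50-65, 65-80 and 80-100.
edges = [50, 65, 80, 100]

# np.histogram counts how many values fall into each bin.
counts, bin_edges = np.histogram(scores, bins=edges)

for left, right, count in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(f"{left} to {right}: {count} value(s)")
```

Each line of the output corresponds to one bar of a histogram: the range is the width of the bar and the count is its height.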

Plot Histogram In Python Using Matplotlib – Example

We all know that to plot something, we need data. Right? So how do we get this data? Since a histogram is meant for showing a lot of data, typing it in by hand is not practical. So what do we do then? We take the help of a library. Of course!

And what better library than NumPy to get a set of random numbers? So that is what we will do. We will use NumPy to generate a bunch of random numbers.

But how many random numbers shall we use? 10, 50 or 100? Naah! We can surely go higher than that. Right? So how about using 1000 random numbers? 😉

Here is the piece of code we will use to generate 1000 random numbers using NumPy!

import numpy as np
y = np.random.randn(1000)

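That is it! That is all the code we need to create 1000 random numbers using NumPy. The call np.random.randn(1000) gives us 1000 numbers drawn from the standard normal distribution, which has a mean of 0 and a standard deviation of 1. So easy. Right?

Now that we have our data ready, let us see how we can plot it as a histogram using Python’s Matplotlib.

The code to plot a histogram using Matplotlib looks like this: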
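Because the numbers are random, you will get different values every time you run the code. If you want the same numbers on every run, one option is to fix the seed first. Here is a small sketch of that; the seed value 42 is an arbitrary choice of mine and not something the example above requires.

```python
import numpy as np

# Fix the seed so every run produces the same numbers.
# The value 42 is arbitrary; any integer works.
np.random.seed(42)

y = np.random.randn(1000)

print(y.shape)   # how many numbers we generated
print(y.mean())  # should be close to 0
print(y.std())   # should be close to 1
```

The mean and standard deviation will not be exactly 0 and 1, since we only have a sample of 1000 numbers, but they should land close to those values.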

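import matplotlib.pyplot as plt
import numpy as np

# Generate 1000 random numbers from the standard normal distribution
y = np.random.randn(1000)

# Draw the histogram with Matplotlib's default settings
plt.hist(y)

# Display the plot window
plt.show()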

That’s it! We just import the pyplot module and call its hist() function with our data. The Matplotlib library does the rest. It will go ahead and plot a histogram in Python for us!

This is very easy, right? And that is the beauty of the Matplotlib library. A single function call is enough to create a histogram plot in Python!

So what does the final histogram plot look like? Well, see it for yourself!

Plot of Histogram Drawn In Python Using Matplotlib

Matplotlib Histogram Bins

Woah! What happened here? We gave it 1000 data points, right? What happened to all of them? Well, let me explain. Here is what Matplotlib has done.

It has taken our 1000 data points and grouped them into 10 bins. Then it drew one bar per bin to create the above histogram!

So why 10 bins? Why not 12 or 15 or any other number? That is a valid question to ask. So let me tell you why the number is 10.

It is because 10 is the default number of bins Matplotlib creates, no matter how many data points you give it. Okay? Does that make sense?

So in simple terms: Matplotlib split the range of our 1000 values into 10 equal-width bins and counted how many values fell into each one. It then went on to create the above histogram plot!
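You do not have to take my word for the 10 bins. The hist() function returns the counts and the bin edges it used, so we can check them ourselves. Here is a minimal sketch. The Agg backend line is my own assumption so that the code also runs on a machine without a display; you can remove it when working interactively.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

y = np.random.randn(1000)

# hist() returns the bar heights, the bin edges and the drawn bar objects.
counts, bin_edges, patches = plt.hist(y)

print(len(counts))        # number of bins
print(len(bin_edges))     # always one more edge than there are bins
print(int(counts.sum()))  # total number of values across all bins
```

The bin edges run from the smallest value in y to the largest, so every one of our data points ends up in exactly one bin.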

So that is all there is to it! But what if we want more than 10 bins? Well, we will come to that soon, but not now, because it needs its own article, which I will write next!

So see you in the next article!