Data is easier to understand, and easier to present to others, when it is visualized.
Matplotlib is a plotting library for Python that provides an environment for plotting data, representing it graphically, and making it presentable.
It is one of the most widely used data visualization libraries for Python.
Matplotlib lets a user visualize different types of data with different types of graphs. This makes it easier to explain the data being studied to a non-technical audience.
In this blog, we will highlight the different aspects of Matplotlib through several examples.
Line Plots With plt.plot
In the example below, we import matplotlib.pyplot as plt so that we can use the short name plt throughout the code. We create the x values with NumPy’s linspace function: 30 evenly spaced values between 0 and 10. We then compute the y values as the sine of each x value and use pyplot’s plot() function to draw y against x as a line.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 10, 30)
y = np.sin(x)
plt.plot(x, y)
plt.show()
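To make the sampling explicit, the sketch below rebuilds the same x and y arrays and inspects them before plotting. The 'o-' format string is an addition of this sketch and not part of the example above; it draws a dot at each of the 30 sample points so the spacing that linspace produces becomes visible.

```python
import numpy as np
import matplotlib.pyplot as plt

# 30 evenly spaced samples; linspace includes both endpoints.
x = np.linspace(0, 10, 30)
y = np.sin(x)
step = x[1] - x[0]  # spacing between neighbouring samples: 10 / 29

plt.figure()
# 'o-' (an assumption of this sketch) marks every sample on the line.
lines = plt.plot(x, y, 'o-')

print(len(x), x[0], x[-1], round(step, 4))
```

Calling plt.show() afterwards displays the figure in an interactive session.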
We take one more example to demonstrate the use of pyplot’s plot() function. Here we plot a normal probability density function.
Using NumPy’s arange function, we create x values from -5 up to (but not including) 5 in increments of 0.01. The y values are norm.pdf(x), the probability density of a standard normal distribution evaluated at each x. The norm object comes from the scipy.stats package.
We pass these values to pyplot’s plot() function and then display the plot with plt.show().
from scipy.stats import norm
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(-5, 5, 0.01)
plt.plot(x, norm.pdf(x))
plt.show()
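As a cross-check on what norm.pdf(x) computes, the following sketch evaluates the standard normal density directly from its formula, exp(-x^2 / 2) / sqrt(2 * pi), using only NumPy. The helper name standard_normal_pdf is hypothetical and introduced here for illustration; it is not part of SciPy.

```python
import numpy as np
import matplotlib.pyplot as plt


def standard_normal_pdf(x):
    """Density of a normal distribution with mean 0 and standard deviation 1."""
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)


x = np.arange(-5, 5, 0.01)
density = standard_normal_pdf(x)

# Riemann-sum approximation of the area under the curve; it should be close to 1.
area = density.sum() * 0.01

plt.figure()
plt.plot(x, density)

print(round(area, 3), round(density.max(), 4))
```

The area close to 1 is what makes the curve a probability density, and the peak sits at x = 0, the mean of the distribution.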
Plotting Multiple Plots On The Graph
This example demonstrates plotting more than one curve on the same graph.
We create a variable t filled with 400 evenly spaced values between 0 and 2*pi using the linspace function. Two more variables, a and b, hold the cosine and the sine of t respectively, and a third variable c holds their difference, a - b.
Each call to plt.plot() adds one curve to the graph, and matplotlib automatically chooses a different color for each curve in the output.
import numpy as np
import matplotlib.pyplot as plt
t = np.linspace(0, 2 * np.pi, 400)
a = np.cos(t)
b = np.sin(t)
c = a - b
plt.plot(t, a)  # cosine of t
plt.plot(t, b)  # sine of t
plt.plot(t, c)  # difference of the two
plt.show()
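The automatic color choice can be checked in code. The sketch below plots the same three curves and then reads back the color matplotlib assigned to each line. The variable name curve_colors is an assumption of this sketch.

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 2 * np.pi, 400)

plt.figure()
# Each plot() call adds one line to the current axes and takes the next color in the cycle.
for values in (np.cos(t), np.sin(t), np.cos(t) - np.sin(t)):
    plt.plot(t, values)

# Read back the color matplotlib picked for each of the three lines.
curve_colors = [line.get_color() for line in plt.gca().get_lines()]
print(curve_colors)
```

With the default settings, the three entries differ from one another, which is why the curves are easy to tell apart without any extra arguments.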
Saving The Graphs As Images
We call plt.savefig() with the path where we want to save the file and the format we want it in. It should be called before plt.show(), because the figure may already be empty once the plot window has been closed.
Here, the graph with multiple plots from the previous example is saved in PNG format in the satya_000 user directory.
plt.savefig('C:\\Users\\satya_000\\multiple_plot.png',format = 'png')
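A minimal sketch of the full save sequence follows. It writes to a temporary directory instead of a fixed user path, which is an assumption made here so that the code runs on any machine; the variable name out_path is likewise hypothetical.

```python
import os
import tempfile

import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 2 * np.pi, 400)

plt.figure()
plt.plot(t, np.cos(t))
plt.plot(t, np.sin(t))

# Hypothetical output location: a temporary directory instead of a fixed user path.
out_path = os.path.join(tempfile.mkdtemp(), 'multiple_plot.png')

# Save while the figure still holds the curves, that is, before any plt.show() call.
plt.savefig(out_path, format='png')

print(out_path, os.path.getsize(out_path))
```

If the format argument is omitted, savefig() infers the format from the file extension.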
Adjusting The Axes
In the code below, we explicitly set the tick marks on the x-axis and the y-axis using plt.xticks and plt.yticks.
For plt.xticks, the start value is min(x), which is 0, and the stop value is max(x) + 1, which is 16, with a spacing of 1 between values. Because np.arange excludes the stop value, the ticks run from 0 to 15.
Similarly, for plt.yticks the ticks start at 1 and stop before 15 with a spacing of 3 between values, which gives 1, 4, 7, 10 and 13.
import numpy as np
import matplotlib.pyplot as plt
x = [0, 4, 8, 10, 15]
y = [0, 1, 2, 3, 4]
plt.plot(x, y)
plt.xticks(np.arange(min(x), max(x) + 1, 1.0))
plt.yticks(np.arange(1, 15, 3))
plt.show()
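Since the tick positions come straight from np.arange, it can help to print them before handing them to matplotlib. The sketch below does that for the same data; the names x_ticks and y_ticks are introduced here for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = [0, 4, 8, 10, 15]
y = [0, 1, 2, 3, 4]

# np.arange(start, stop, step) excludes the stop value.
x_ticks = np.arange(min(x), max(x) + 1, 1.0)  # 0.0, 1.0, ..., 15.0
y_ticks = np.arange(1, 15, 3)                 # 1, 4, 7, 10, 13

plt.figure()
plt.plot(x, y)
plt.xticks(x_ticks)
plt.yticks(y_ticks)

print(x_ticks.tolist())
print(y_ticks.tolist())
```

Adding 1 to max(x) is what keeps the last data point, 15, among the x ticks.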
Adding A Grid In The Graphs
In the code below, we add grid lines to the previous graph by calling the plt.grid() function.
import numpy as np
import matplotlib.pyplot as plt
x = [0, 4, 8, 10, 15]
y = [0, 1, 2, 3, 4]
plt.plot(x, y)
plt.xticks(np.arange(min(x), max(x) + 1, 1.0))
plt.yticks(np.arange(1, 15, 3))
plt.grid()
plt.show()
Changing The Colour And Style
In the code below, we pass an extra format-string argument at the end of each plot() call.
We pass "b-" to indicate a solid blue line and "r:" to indicate a dotted red line.
from scipy.stats import norm
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(-7, 7, 0.01)
plt.plot(x, norm.pdf(x), 'b-')            # standard normal, solid blue
plt.plot(x, norm.pdf(x, 2.0, 0.5), 'r:')  # mean 2.0, standard deviation 0.5, dotted red
plt.show()
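A format string is shorthand for the color and linestyle keyword arguments of plot(). The sketch below draws one line each way and compares the resulting properties. It uses sine and cosine data as a stand-in for the normal curves, which is an assumption made to keep the example free of SciPy.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

x = np.linspace(0, 10, 200)

plt.figure()
# Same styling expressed two ways: a format string and explicit keyword arguments.
short_form = plt.plot(x, np.sin(x), 'r:')[0]
long_form = plt.plot(x, np.cos(x), color='r', linestyle=':')[0]

same_color = to_rgba(short_form.get_color()) == to_rgba(long_form.get_color())
same_style = short_form.get_linestyle() == long_form.get_linestyle()
print(same_color, same_style)
```

The keyword form is longer but easier to read when a plot needs a named color or several style options at once.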
Labeling Axes, Adding A Title And Legend
We use the plt.xlabel() and plt.ylabel() functions to put labels on our axes, labeling the x-axis "Parameters for x" and the y-axis "Probability".
To add the title, we use the plt.title() function and give the title "A Sine Curve".
We also add a legend using the function plt.legend().
The legend takes a list of names for the curves, in the order in which they were plotted. In the output, the legend entries appear as Sample Curve1 and Sample Curve2.
The argument loc=4 positions the legend in the lower right corner of the graph.
from scipy.stats import norm
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(-7, 7, 0.01)
plt.plot(x, norm.pdf(x), 'b-')
plt.plot(x, norm.pdf(x, 2.0, 0.5), 'r:')
plt.title("A Sine Curve")
plt.xlabel('Parameters for x')
plt.ylabel('Probability')
plt.legend(['Sample Curve1', 'Sample Curve2'], loc=4)
plt.show()
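An alternative way to name the curves is to attach a label to each plot() call and let legend() collect them, which keeps each name next to the curve it describes. The sketch below shows this variant with sine and cosine data as a stand-in (an assumption of this sketch), and uses the string 'lower right', which is the named form of loc=4.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 200)

plt.figure()
# The label given to each curve becomes its legend entry.
plt.plot(x, np.sin(x), 'b-', label='Sample Curve1')
plt.plot(x, np.cos(x), 'r:', label='Sample Curve2')

# 'lower right' is the string form of loc=4.
legend = plt.legend(loc='lower right')
legend_names = [text.get_text() for text in legend.get_texts()]
print(legend_names)
```

With labels attached at plot time, reordering or removing a plot() call cannot leave the legend names out of step with the curves.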
Pie Charts
To draw a pie chart with plt.pie, we define lists of values, colors, and labels, plus an explode list that says whether each segment is pulled out of the pie and, if so, by how much.
We create a pie chart with the values 15, 50, 4, 30 and 12 and assign an explicit color and label to each of those values.
The Mumbai and Capetown segments of the pie are exploded by 0.2, meaning they are offset from the center by 20% of the radius, and the plot is given the title "Location of the Matches".
import matplotlib.pyplot as plt
values = [15, 50, 4, 30, 12]
colors = ['g', 'r', 'c', 'b', 'm']
explode = [0.2, 0, 0.2, 0, 0]
labels = ['Mumbai', 'Sydney', 'Capetown', 'London', 'Others']
plt.pie(values, colors=colors, labels=labels, explode=explode)
plt.title('Location of the Matches')
plt.show()
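The values passed to plt.pie do not need to add up to 100, because each segment is sized by its share of the total. The sketch below computes those shares by hand and also passes autopct, an option not used in the example above, so that the percentages are printed on the chart. The names total and shares are introduced here for illustration.

```python
import matplotlib.pyplot as plt

values = [15, 50, 4, 30, 12]
labels = ['Mumbai', 'Sydney', 'Capetown', 'London', 'Others']

# Each segment spans value / sum(values) of the full circle.
total = sum(values)
shares = [round(100 * value / total, 1) for value in values]

plt.figure()
# autopct prints the percentage of each segment inside the pie.
plt.pie(values, labels=labels, autopct='%1.1f%%')
plt.title('Location of the Matches')

print(dict(zip(labels, shares)))
```

Here the values sum to 111, so Sydney's 50 corresponds to roughly 45% of the pie rather than 50%.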
Bar Charts
We define a list of values and a list of colors and plot the data using plt.bar.
The code below places the bars at the x positions 0 to 4, takes the bar heights from the values list, and colors each bar with the matching entry of the colors list.
import matplotlib.pyplot as plt
values = [15, 50, 4, 30, 12]
colors = ['g', 'r', 'c', 'b', 'm']
plt.bar(range(0, 5), values, color=colors)
plt.show()
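By default the bars are labeled with their numeric positions. The sketch below replaces those positions with names through plt.xticks. Reusing the location names from the pie chart is an assumption of this sketch, since the bar chart example itself does not say what the bars represent.

```python
import matplotlib.pyplot as plt

values = [15, 50, 4, 30, 12]
colors = ['g', 'r', 'c', 'b', 'm']
# Assumption: the bars stand for the same locations as in the pie chart.
labels = ['Mumbai', 'Sydney', 'Capetown', 'London', 'Others']

plt.figure()
positions = list(range(len(values)))
bars = plt.bar(positions, values, color=colors)
# Replace the numeric positions 0..4 on the x-axis with readable names.
plt.xticks(positions, labels)

bar_heights = [bar.get_height() for bar in bars]
print(bar_heights)
```

The object returned by plt.bar can be iterated to reach each individual bar, which is how the heights are read back here.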
Scatter Plots With plt.scatter
With a scatter plot, we can plot two different attributes of the same person or thing against each other.
For example, we may plot a person’s expenditure against their age.
In the example below, we draw two sets of random values, X and Y, and plot them against each other using plt.scatter().
import numpy as np
import matplotlib.pyplot as plt
# 300 samples each from a standard normal distribution
X = np.random.randn(300)
Y = np.random.randn(300)
plt.scatter(X, Y)
plt.show()
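Because the data is random, the scatter plot looks different on every run. The sketch below uses a seeded NumPy generator so that the same cloud of points is produced each time. The seed value 42 is an arbitrary choice made for this sketch.

```python
import numpy as np
import matplotlib.pyplot as plt

# A fixed seed (42 is arbitrary) makes the random cloud identical on every run.
rng = np.random.default_rng(42)
X = rng.standard_normal(300)
Y = rng.standard_normal(300)

plt.figure()
points = plt.scatter(X, Y)

# scatter() keeps the plotted points as an (n, 2) array of x, y pairs.
offsets = points.get_offsets()
print(offsets.shape)
```

Reproducible data is useful when a figure has to be regenerated later and should match the earlier version.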
Histograms
In the example below, we draw 10,000 data points from a normal distribution centered around 30,000 with a standard deviation of 15,000.
Then, using hist() from the pyplot library, we specify the input data and the number of buckets into which the values are grouped, 50 in this case. Finally, we display the graph using the plt.show() function.
import numpy as np
import matplotlib.pyplot as plt
expenditure = np.random.normal(30000, 15000, 10000)
plt.hist(expenditure, 50)
plt.show()
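The hist() function also returns the numbers behind the picture. The sketch below captures them to show how the 10,000 points are distributed over the 50 buckets. The seeded generator and the names counts, edges and bars are assumptions of this sketch.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # arbitrary seed, used only for reproducibility
expenditure = rng.normal(30000, 15000, 10000)

plt.figure()
# hist() returns the count per bucket, the bucket edges, and the drawn bars.
counts, edges, bars = plt.hist(expenditure, bins=50)

print(len(counts), len(edges), int(counts.sum()))
```

There is one more edge than there are buckets, because each bucket is bounded by an edge on both sides, and the counts add up to the number of data points.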
This sums up our discussion of the Matplotlib library.
Our upcoming blog articles will introduce you to NumPy and Pandas.