Zipf Distribution in Python
Submitted by moazkhan on Thursday, July 9, 2020 - 10:49.
In this tutorial you will learn:
- What is the Zipf Distribution?
- Zipf Distribution Implementation in Python
- Visualization of Zipf Distribution
Zipf Distribution
The Zipf distribution, also known as the zeta (or Riemann zeta) distribution, is a discrete counterpart of the Pareto distribution. It is defined on the positive integers by the probability mass function p(k) = k^(-a) / ζ(a), where a > 1 is the distribution parameter and ζ is the Riemann zeta function. The distribution is named after Zipf's law, the observation that many types of data studied in the physical and social sciences can be approximated by such a distribution: the frequency of an item is roughly inversely proportional to a power of its rank. The Zipf distribution belongs to the family of discrete power-law probability distributions, which are commonly used in linguistics, insurance and the modelling of rare events. When plotted on a double-logarithmic (log-log) diagram, the Zipf distribution follows a straight line.
Zipf Distribution in Python
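As a small illustration of the straight-line pattern described above, the sketch below evaluates the probability mass function numerically and fits a line to it on log-log axes. The variable names and the truncation of the support to the first 1000 integers are assumptions made only for this example; the fitted slope should come out close to -a.

```python
import numpy as np

a = 2.0                     # distribution parameter (must be greater than 1)
k = np.arange(1, 1001)      # values 1..1000; truncating the support is an assumption

# Unnormalized Zipf weights k**(-a), normalized over the truncated range
weights = k.astype(float) ** (-a)
pmf = weights / weights.sum()

# On log-log axes the PMF is a straight line whose slope is -a
slope, intercept = np.polyfit(np.log(k), np.log(pmf), 1)
print('fitted slope:', slope)
```

Because the normalizing constant only shifts the line up or down, the slope does not depend on where the support is truncated.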
To sample from the Zipf distribution, the random module of Python's NumPy library provides a built-in function, zipf(). It takes two parameters. The distribution parameter 'a' is mandatory; it must be a float or int greater than 1. The 'size' parameter sets the shape of the output array, which can be 1D, 2D or n-dimensional as required by the programmer; if it is omitted, a single value is returned. To observe the results, let's take an example. Here we generate a 1D array of size 4 from a Zipf distribution with distribution parameter 1.5. In the code below, the second line imports the random module and the fourth line calls zipf() with an output size of 4 and distribution parameter 'a' equal to 1.5.
- #importing the random module
- from numpy import random
- #applying the Zipf function
- res_arr = random.zipf(size=4, a=1.5)
- #printing the results
- print('1D array of size 4 having Zipf distribution with distribution parameter 1.5 :\n')
- print(res_arr)
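The parameter rules can also be checked directly. The sketch below is not part of the original example: it uses NumPy's newer Generator interface (np.random.default_rng) with an arbitrarily chosen seed so that runs are repeatable, and the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # seed chosen arbitrarily for repeatability

# With size omitted, a single integer is returned rather than an array
single = rng.zipf(a=1.5)

# With an integer size, a 1D array of that length is returned
arr = rng.zipf(a=1.5, size=4)
print(single, arr, arr.shape)

# Values of 'a' that are not greater than 1 are rejected with a ValueError
rejected = False
try:
    rng.zipf(a=1.0, size=4)
except ValueError as err:
    rejected = True
    print('rejected:', err)
```

Every sampled value is a positive integer, since the distribution is defined on 1, 2, 3, and so on.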
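In the next example we pass a tuple as the size to generate a 2D array of size 5 x 2 with distribution parameter 2.5.
- #importing the random module
- from numpy import random
- #here we are using Zipf function to generate Zipf distribution of size 5 x 2 with distribution parameter 2.5
- res_arr = random.zipf(size=(5, 2), a=2.5)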
- print('2D Zipf Distribution as output from Zipf() function:\n')
- #printing the result
- print(res_arr)
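In the same way, a 3D array of size 2 x 3 x 4 can be generated, here with distribution parameter 1.1.
- #importing the random module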
- from numpy import random
- #here we are using Zipf function to generate Zipf distribution of size 2 x 3 x 4
- res = random.zipf(size=(2,3,4), a=1.1)
- print('3D Zipf Distribution as output from Zipf() function:\n')
- #printing the result
- print(res)
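The three examples use different values of 'a', and the choice has a visible effect on the output: the closer 'a' is to 1, the heavier the tail and the more often very large values appear. The following is a rough sketch of that effect, assuming the Generator interface with an arbitrary seed and a sample size of 100,000 picked only for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # arbitrary seed for repeatability

shares = {}
for a in (1.1, 1.5, 2.5):
    sample = rng.zipf(a=a, size=100_000)
    # Fraction of draws equal to 1, the most probable value
    shares[a] = np.mean(sample == 1)
    print(f'a={a}: share of 1s = {shares[a]:.3f}, largest value = {sample.max()}')
```

The share of 1s grows as 'a' increases, which is why the array generated with a=2.5 tends to contain mostly small numbers while the one generated with a=1.1 can contain very large ones.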
Visualization of Zipf Distribution
In this example we will visualize the Zipf distribution with distribution parameter 2. Here we use the kdeplot function of the seaborn library, which replaces the deprecated distplot function with hist=False, to plot a smoothed density estimate of a one-dimensional sample drawn from the discrete Zipf distribution.
- #importing all the required modules and packages
- from numpy import random
- import matplotlib.pyplot as mpl
- import seaborn as sb
- #here we are using Zipf function to generate a sample of size 3000 with distribution parameter 2
- sb.kdeplot(random.zipf(size=3000, a=2), label='Zipf Distribution')
- #plotting the graph
- mpl.show()
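Since the Zipf distribution is discrete and is expected to follow a straight line on a double-logarithmic diagram, an alternative view is to plot the relative frequency of each sampled value on log-log axes. The sketch below is one possible way to do this with plain matplotlib; the seed, variable names and marker style are assumptions made for the example.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)        # fixed seed, chosen arbitrarily
samples = rng.zipf(a=2, size=3000)

# Count how often each distinct value occurs in the sample
values, counts = np.unique(samples, return_counts=True)
freqs = counts / counts.sum()

# Plot relative frequency against value on log-log axes
plt.loglog(values, freqs, marker='o', linestyle='none', label='empirical frequency')
plt.xlabel('value k')
plt.ylabel('relative frequency')
plt.legend()
plt.show()
```

For small values the points should fall roughly along a straight line; the rare large values are noisy because each of them occurs only once or twice in a sample of this size.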