Data has become essential to decision-making, and it has to be analysed with statistical tools before any conclusion can be drawn from it. Such analysis gives decisions a firm foundation. Research scholars and business people alike need data to reach meaningful conclusions, so a basic knowledge of statistics is necessary in a competitive world. Data scientists use statistical skills to acquire the data they need and then process it with statistical tools to arrive at a decision. Among those tools, the probability mass function is an important one.
What is a probability mass function?
A probability mass function (PMF), sometimes also called a frequency function, is a vital part of statistics. It gives, for each value a discrete random variable can take, the probability that the variable is exactly equal to that value. In other words, the PMF provides all the probabilities for a given discrete random variable. Here, discrete means that the variable has a countable set of possible outcomes. For example, a roll of a die can only come up 1, 2, 3, 4, 5 or 6, so the outcome of a die roll is a discrete random variable with a finite set of possible values.
The properties of a probability mass function set it apart from a probability density function. A PMF is one kind of probability distribution function, meaning a function used to describe a probability distribution. For a discrete random variable X, the probability mass function assigns to each possible value x the probability that X is equal to x, so a discrete random variable is fully characterized by its PMF. Each of these probabilities lies between 0 and 1, and together they sum to 1. The PMF is often denoted by P(x) or f(x). Let us get back to the example of a six-sided die. For a fair die, the probability of rolling a 4 is f(4) = 1/6.
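The die example can be written out as a small Python sketch. It assumes a fair die, and the names die_pmf and pmf are illustrative choices rather than anything standard. Exact fractions are used so that the probabilities add up to exactly 1.

```python
from fractions import Fraction

# PMF of a fair six-sided die (assumed fair): each face maps to its probability.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def pmf(x):
    """Return P(X = x); values outside the six faces have probability 0."""
    return die_pmf.get(x, Fraction(0))

print(pmf(4))                  # 1/6
print(pmf(7))                  # 0
print(sum(die_pmf.values()))   # 1
```

Storing the PMF as a dictionary works here because the set of outcomes is small and finite.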
Understanding continuous random variables as opposed to discrete random variables
A variable that can take any value on a continuum is a continuous random variable, and it is described by a probability density function (PDF) rather than a PMF. An example is a variable that can take any real number as its value. Unlike with a probability mass function, we cannot assign a positive probability to X being exactly equal to any single value, because that probability is zero. This is where a probability density function differs from a probability mass function: with a PDF we assign probability to X falling within an interval around a value.
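To make the "interval around a value" idea concrete, here is a minimal Python sketch. It uses the standard normal distribution purely as an assumed example of a continuous variable, and the helper names are hypothetical. It shows that the probability of an interval shrinks toward zero as the interval narrows, while the probability divided by the interval width approaches the density.

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def normal_cdf(x):
    """P(X <= x) for a standard normal variable."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def interval_probability(a, b):
    """P(a <= X <= b), computed as a difference of CDF values."""
    return normal_cdf(b) - normal_cdf(a)

x = 1.0
for width in (0.5, 0.1, 0.01):
    p = interval_probability(x - width / 2, x + width / 2)
    # As the interval shrinks, p goes to 0 while p / width approaches the density.
    print(width, p, p / width, normal_pdf(x))
```

An interval of zero width, such as interval_probability(1.0, 1.0), has probability zero, which is why a single point cannot carry probability in the continuous case.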
A random variable might be a number drawn out of a hat, the number shown by a rolled die, and so on. Its value changes from trial to trial because of random variation. You can often think of a random variable as the outcome of a random experiment, such as flipping a coin for heads or tails or rolling a die.
The probability mass function for discrete random variables and the probability density function for continuous random variables play similar roles. The difference is that we use sums in the former and integrals in the latter. For a probability density function ρ, the probability that X falls in a set A is Pr(X ∈ A) = ∫_A ρ(x) dx.
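The contrast between summing and integrating can be sketched in Python as follows. The fair die and the uniform density on [0, 6] are assumptions chosen for illustration, and the integral is approximated with a simple midpoint rule rather than computed exactly.

```python
from fractions import Fraction

# Discrete case: P(X in A) is a sum of PMF values over A.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def discrete_probability(event):
    """Sum the PMF over the outcomes in the event."""
    return sum(die_pmf.get(x, Fraction(0)) for x in event)

# Continuous case: P(X in [a, b]) is the integral of the density over [a, b].
def uniform_density(x, low=0.0, high=6.0):
    """Density of a uniform distribution on [low, high] (illustrative choice)."""
    return 1.0 / (high - low) if low <= x <= high else 0.0

def continuous_probability(density, a, b, steps=10_000):
    """Approximate the integral of density over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return sum(density(a + (i + 0.5) * h) for i in range(steps)) * h

print(discrete_probability({2, 4, 6}))                    # 1/2
print(continuous_probability(uniform_density, 1.0, 3.0))  # close to 1/3
```

The two functions answer the same question, the probability that X lands in a set, using the operation that fits each kind of variable.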
The equation for the PMF is f(x) = P(X = x). This formula gives the probability that X takes on the value x. A PMF is commonly plotted on a graph, typically as a bar chart, for easy interpretation.
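When the true PMF is not known, f(x) can be estimated from data as the relative frequency of each value. The sketch below assumes simulated rolls of a fair die with a fixed seed, and the function name empirical_pmf is hypothetical. It prints a crude text bar chart in place of a real plot.

```python
import random
from collections import Counter

def empirical_pmf(samples):
    """Estimate f(x) = P(X = x) as the relative frequency of each value."""
    counts = Counter(samples)
    total = len(samples)
    return {value: counts[value] / total for value in sorted(counts)}

rng = random.Random(0)  # fixed seed, chosen only for repeatability
rolls = [rng.randint(1, 6) for _ in range(6000)]
estimate = empirical_pmf(rolls)

# Crude text plot: one '#' per percentage point of estimated probability.
for face, p in estimate.items():
    print(face, f"{p:.3f}", "#" * round(p * 100))
```

With enough rolls, each bar should come out close to 1/6, which is what the PMF of a fair die predicts.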
Why do we need probability distribution?
We need a probability distribution to understand how likely each scenario is, so that we can prepare for outcomes in advance. The observed value will always fall between the minimum and the maximum values the variable can take. Where the probability sits within that range, and hence the shape of the plotted distribution, depends on the factors that drive the variable.
For a data scientist, knowledge of the probability mass function is essential to being competent. Now that you know statistical skills are used to acquire the required data, which is then processed to arrive at a decision by using different statistical methods like