The median of a set of data is the midway point wherein exactly half of the data values are less than or equal to the median. In a similar way, we can think about the median of a continuous probability distribution, but rather than finding the middle value in a set of data, we find the middle of the distribution in a different way.
The total area under a probability density function is 1, representing 100%, and as a result, half of this can be represented by one-half or 50 percent. One of the big ideas of mathematical statistics is that probability is represented by the area under the curve of the density function, which is calculated by an integral, and thus the median of a continuous distribution is the point on the real number line where exactly half of the area lies to the left.
This can be stated more succinctly with the following improper integral. The median of the continuous random variable X with density function f(x) is the value M such that:
$$0.5 = \int_{-\infty}^{M} f(x)\,dx$$
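The definition can also be explored numerically. The following is a minimal sketch, not a production routine: it assumes the density is zero outside a finite interval [lo, hi], so the improper integral reduces to an ordinary one, and it approximates the area with a trapezoid rule before bisecting on M. The function name `median_of_density` and its parameters are chosen here for illustration.

```python
import math


def median_of_density(f, lo, hi, tol=1e-9, steps=2000):
    """Approximate the M for which the area under f from lo to M is 0.5.

    Assumes f is a density that vanishes outside [lo, hi].
    """

    def area(a, b):
        # Composite trapezoid rule on [a, b] with `steps` subintervals.
        h = (b - a) / steps
        total = 0.5 * (f(a) + f(b))
        for i in range(1, steps):
            total += f(a + i * h)
        return total * h

    left, right = lo, hi
    # Bisection: the area to the left of M grows as M moves right.
    while right - left > tol:
        mid = 0.5 * (left + right)
        if area(lo, mid) < 0.5:
            left = mid
        else:
            right = mid
    return 0.5 * (left + right)


# Example: the density f(x) = 2x on [0, 1] has median sqrt(1/2).
approx = median_of_density(lambda x: 2.0 * x, 0.0, 1.0)
print(approx, math.sqrt(0.5))
```

For densities with unbounded support, such as the exponential distribution discussed next, the upper limit would have to be cut off at a point where the remaining tail area is negligible.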
Median for Exponential Distribution
We now calculate the median for the exponential distribution Exp(A). A random variable with this distribution has density function $f(x) = \frac{1}{A} e^{-x/A}$ for any nonnegative real number x. The function also contains the mathematical constant e, approximately equal to 2.71828.
Since the probability density function is zero for any negative value of x, all that we must do is integrate the following and solve for M:
$$0.5 = \int_0^M f(x)\,dx$$
Since the integral $\int \frac{1}{A} e^{-x/A}\,dx = -e^{-x/A}$, the result is that
$$0.5 = -e^{-M/A} + 1$$
This means that $0.5 = e^{-M/A}$, and after taking the natural logarithm of both sides of the equation, we have:
$$\ln(1/2) = -M/A$$
Since $1/2 = 2^{-1}$, by properties of logarithms we write:
$$-\ln 2 = -M/A$$
Multiplying both sides by $-A$ gives us the result that the median is $M = A \ln 2$.
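As a quick sanity check on this result, the closed form can be substituted back into the cumulative distribution function, which from the integral above is 1 - e^(-x/A) for nonnegative x. The helper names `exponential_cdf` and `exponential_median` below are illustrative, not taken from any library.

```python
import math


def exponential_cdf(x, A):
    """Area under the Exp(A) density to the left of x (A is the mean)."""
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-x / A)


def exponential_median(A):
    """Closed-form median derived above: M = A ln 2."""
    return A * math.log(2)


# The CDF evaluated at the median should be one-half for any A > 0.
for A in (0.5, 1.0, 20.0):
    M = exponential_median(A)
    print(f"A = {A}: median = {M:.4f}, area to the left = {exponential_cdf(M, A):.4f}")
```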
Median-Mean Inequality in Statistics
One consequence of this result should be mentioned: the mean of the exponential distribution Exp(A) is A, and since $\ln 2$ (approximately 0.693) is less than 1, it follows that the product $A \ln 2$ is less than A. This means that the median of the exponential distribution is less than the mean.
This makes sense if we think about the graph of the probability density function. Due to the long tail, this distribution is skewed to the right. Many times when a distribution is skewed to the right, the mean is to the right of the median.
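A small simulation can illustrate the gap between the two measures. This is a sketch under stated assumptions: the value A = 2, the sample size, and the seed are arbitrary choices made here, and `simulate_exponential` is a hypothetical helper. Note that Python's `random.expovariate` takes the rate, which is 1 divided by the mean.

```python
import math
import random
import statistics


def simulate_exponential(A, n, seed=0):
    """Draw n samples from Exp(A), where A is the mean of the distribution."""
    rng = random.Random(seed)
    # expovariate expects the rate parameter, i.e. 1 / mean.
    return [rng.expovariate(1.0 / A) for _ in range(n)]


A = 2.0
samples = simulate_exponential(A, 100_000)
sample_mean = statistics.mean(samples)
sample_median = statistics.median(samples)

print(f"sample mean   = {sample_mean:.3f} (theory: {A:.3f})")
print(f"sample median = {sample_median:.3f} (theory: {A * math.log(2):.3f})")
```

With a sample this large, the sample median should land close to A ln 2 and sit clearly below the sample mean, in line with the right skew of the density.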
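What this means in terms of statistical analysis is that when data are skewed to the right, we can often expect the mean to exceed the median, so the two should not be treated as interchangeable measures of center. The gap between them is limited, however: for any distribution with finite variance, the mean and the median differ by at most one standard deviation, a bound that can be derived from the one-sided form of Chebyshev's inequality. For Exp(A), the standard deviation is A, while the gap is only $A(1 - \ln 2)$, or about 0.31A.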
As an example, suppose a person receives a total of 30 visitors in 10 hours, so that the mean wait time between visitors is 20 minutes. If the wait times follow an exponential distribution, the median wait time is $20 \ln 2$, or about 13.9 minutes, so more than half of the waits are shorter than the mean.
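The arithmetic for a scenario like this can be sketched as follows. The key assumption, made here for illustration, is that the times between visitors follow an exponential distribution; the function name `wait_time_summary` is hypothetical.

```python
import math


def wait_time_summary(visitors, hours):
    """Mean and median wait (in minutes), assuming exponential waits."""
    mean_wait = hours * 60.0 / visitors
    # Median of Exp(A) is A ln 2, with A equal to the mean wait.
    median_wait = mean_wait * math.log(2)
    # Share of waits shorter than the mean: CDF at x = A is 1 - e^(-1).
    share_below_mean = 1.0 - math.exp(-1.0)
    return mean_wait, median_wait, share_below_mean


mean_wait, median_wait, share = wait_time_summary(visitors=30, hours=10)
print(f"mean wait   = {mean_wait:.1f} minutes")
print(f"median wait = {median_wait:.1f} minutes")
print(f"share of waits below the mean = {share:.3f}")
```

Under this assumption roughly 63 percent of waits are shorter than the mean, which is another way of seeing why the median falls below it.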