RANDOM VARIABLE AND PMF #
Discrete random variable = can take on only a countable number of possible values.
Continuous random variable = can take on an uncountable number of possible values.
Probability Mass Function = PMF = gives the probability of each outcome of a discrete random variable.
Cumulative Distribution Function = CDF = Probability that a random variable will take on a value less than or equal to x.
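A minimal sketch (assuming Python with NumPy, which the notes do not specify) of the PMF and CDF of a hypothetical discrete random variable, a fair six-sided die:

```python
import numpy as np

# Hypothetical discrete random variable: a fair six-sided die.
outcomes = np.arange(1, 7)        # possible values 1..6 (countable)
pmf = np.full(6, 1 / 6)           # PMF: P(X = x) = 1/6 for each outcome
cdf = np.cumsum(pmf)              # CDF: P(X <= x), running sum of the PMF

print(dict(zip(outcomes, pmf)))   # probability of each outcome
print(dict(zip(outcomes, cdf)))   # cumulative probabilities, ending at 1.0
```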
EXPECTATIONS #
The expected value is the weighted average of the possible outcomes of a random variable, where the weights are the probabilities that the outcomes will occur. The mathematical representation for the expected value of random variable X is:
E(X) = ∑P(xᵢ)xᵢ = P(x₁)x₁ + P(x₂)x₂ + … + P(xₙ)xₙ
Properties of expectation include:
- If c is any constant, then: E(cX) = cE(X)
- If X and Y are any random variables, then: E(X + Y) = E(X) + E(Y)
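A minimal numerical check (assuming Python with NumPy) of the weighted-average definition and the constant-multiple property above, using a made-up three-outcome variable:

```python
import numpy as np

# Hypothetical outcomes and their probabilities (must sum to 1).
x = np.array([1.0, 2.0, 3.0])
p = np.array([0.2, 0.5, 0.3])

e_x = np.sum(p * x)               # E(X) = P(x1)x1 + P(x2)x2 + P(x3)x3
print(e_x)                        # 2.1

c = 4.0
print(np.isclose(np.sum(p * (c * x)), c * e_x))   # E(cX) = cE(X) -> True
```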
FOUR POPULATION MOMENTS #
1st Moment is Mean = Expected value
The other three (variance, skewness, and kurtosis) are known as central moments because they are measured relative to the mean.
2nd Moment is Variance
Var(X) = E[(X − μ)²]
The square root of the variance is called the standard deviation. The variance and standard deviation provide a measure of the extent of the dispersion in the values of the random variable around the mean.
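A short sketch (same assumed Python/NumPy setup as above) computing the variance as the second central moment and the standard deviation as its square root:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])     # hypothetical outcomes
p = np.array([0.2, 0.5, 0.3])     # probabilities

mu = np.sum(p * x)                # mean (1st moment)
var = np.sum(p * (x - mu) ** 2)   # variance = E[(X - mu)^2]
sd = np.sqrt(var)                 # standard deviation

print(mu, var, sd)                # 2.1, 0.49, 0.7
```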
3rd Moment is Skewness
Skewness = E[(X − μ)³] / σ³
Skewness, or skew, refers to the extent to which a distribution is not symmetrical. Non-symmetrical distributions may be either positively or negatively skewed and result from the occurrence of outliers in the data set.
A positively skewed distribution is characterized by many outliers in the upper region, or right tail. A positively skewed distribution is said to be skewed right because of its relatively long upper (right) tail.
A negatively skewed distribution has a disproportionately large amount of outliers that fall within its lower (left) tail. A negatively skewed distribution is said to be skewed left because of its long lower tail.
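A small simulation-based sketch (assuming Python with NumPy and SciPy) of the standardized third central moment on a right-skewed sample; the exponential distribution used here is just an illustrative choice:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.exponential(scale=1.0, size=100_000)    # right-skewed (positive skew)

mu, sd = sample.mean(), sample.std()
skew_manual = np.mean((sample - mu) ** 3) / sd ** 3  # E[(X - mu)^3] / sd^3

print(skew_manual)          # positive, roughly 2 for an exponential distribution
print(stats.skew(sample))   # SciPy's estimate, close to the manual value
```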
4th Moment is Kurtosis
The kurtosis statistic is the standardized fourth central moment of the distribution. Kurtosis refers to the degree of peakedness or clustering in the data distribution and is calculated as:
Kurtosis = E[(X − μ)⁴] / σ⁴
Kurtosis for the normal distribution equals 3. Therefore, the excess kurtosis for any distribution equals:
Excess kurtosis = kurtosis − 3
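A similar sketch (Python with NumPy and SciPy assumed) for the fourth standardized moment, checking that a normal sample has kurtosis near 3 and excess kurtosis near 0:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(size=100_000)                    # normal data

mu, sd = sample.mean(), sample.std()
kurt = np.mean((sample - mu) ** 4) / sd ** 4         # E[(X - mu)^4] / sd^4

print(kurt)                                          # approximately 3
print(kurt - 3)                                      # excess kurtosis, near 0
print(stats.kurtosis(sample, fisher=True))           # SciPy reports excess kurtosis
```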
PROBABILITY DENSITY FUNCTION #
The PDF allows us to calculate the probability of an outcome falling between two values when working with continuous variables. Always remember: the probability of any single value is zero under a PDF.
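A minimal sketch (assuming Python with SciPy) using a standard normal as the continuous variable: the probability between two values is the difference of CDF values, and any single point has probability zero:

```python
from scipy import stats

dist = stats.norm(loc=0, scale=1)     # standard normal, as an example

# P(-1 <= X <= 1): area under the PDF, computed as CDF(1) - CDF(-1)
print(dist.cdf(1) - dist.cdf(-1))     # ~0.6827

# Probability of a single value is zero for a continuous variable
print(dist.cdf(0) - dist.cdf(0))      # 0.0
```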
QUANTILE FUNCTION #
The quantile function is the inverse of the CDF: given a probability, it returns the value that the random variable will be less than or equal to with that probability.
The 50% quantile is the median.
The IQR, or interquartile range, is the difference between the 75% and 25% quantiles: IQR = Q(75%) − Q(25%).
The IQR is also a measure of dispersion, similar to the standard deviation.
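A short sketch (Python with SciPy assumed) of the quantile function as the inverse CDF, the median, and the IQR for a hypothetical normal distribution:

```python
from scipy import stats

dist = stats.norm(loc=100, scale=15)      # hypothetical distribution

median = dist.ppf(0.50)                   # 50% quantile = median
q25, q75 = dist.ppf(0.25), dist.ppf(0.75)
iqr = q75 - q25                           # interquartile range

print(median, q25, q75, iqr)
print(dist.cdf(dist.ppf(0.75)))           # quantile then CDF recovers 0.75
```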
LINEAR TRANSFORMATION #
A linear transformation of X takes the form Y = a + bX, where a and b are constants. Its effects on the four moments are listed below.
- Mean: E(Y) = a + bE(X). Both the location (a) and the scale (b) affect the mean.
- Variance: Var(Y) = b²Var(X). The shift a changes only the location, so dispersion depends only on the scale b.
- Standard deviation: SD(Y) = |b|SD(X)
- With b > 0, skewness is unaffected: Skew(Y) = Skew(X)
- With b < 0, the magnitude of the skew is unchanged but its sign flips: Skew(Y) = −Skew(X)
- A linear transformation does not affect kurtosis: Kurt(Y) = Kurt(X)
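A minimal numerical sketch (assuming Python with NumPy and SciPy) of these effects of Y = a + bX, using a skewed sample and a negative b so the sign flip in skewness is visible:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=100_000)   # right-skewed sample
a, b = 5.0, -2.0                               # constants; b < 0 flips the skew's sign
y = a + b * x

print(y.mean(), a + b * x.mean())              # E(Y) = a + bE(X)
print(y.var(), b ** 2 * x.var())               # Var(Y) = b^2 Var(X)
print(y.std(), abs(b) * x.std())               # SD(Y) = |b| SD(X)
print(stats.skew(y), -stats.skew(x))           # Skew(Y) = -Skew(X) when b < 0
print(stats.kurtosis(y), stats.kurtosis(x))    # kurtosis unchanged
```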