Introduction to Statistics - Part 2

Parmita Biswas
Published in Analytics Vidhya
4 min read · Sep 12, 2020


Photo by Corinne Kutz on Unsplash

In the previous article we focused on some basic statistics. In this article we will learn about various curves, using plenty of examples. It is said that if we can visualize something, it is easier to remember.

Let us start with a few examples:

1. “The heights of adult women in India follow, at least approximately, a bell-shaped curve.” Let’s read this sentence twice and try to understand what it means.

2. What does it mean to say that a man’s weight is in the 90th percentile for all adult males?

3. Male heights have a mean of 60 inches and a standard deviation of 4 inches. Female heights have a mean of 55 inches and a standard deviation of 2 inches. Thus, a man who is 64 inches tall has a standardized score of 1. What is the standardized score corresponding to your own height?

Let’s jump to some definitions and then we can come back to these sentences.

1. Population: A population is every member of the group being studied. The best example is a census, which collects data from every member of a country’s population.

2. Frequency curve: We are all familiar with histograms. If we join the tops of the bars and smooth out the resulting line, we get a frequency curve.
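To make the idea concrete, here is a minimal Python sketch. It assumes a made-up sample of 1,000 heights drawn from a bell-shaped distribution (the mean of 53 inches and standard deviation of 2 inches are illustrative values, not real data). It counts the values in 1-inch bins, as a histogram would, and then lists one point per bar; joining those points is what gives the frequency curve.

```python
import math
import random
from collections import Counter

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical sample: 1,000 heights (inches) from a bell-shaped distribution.
heights = [random.gauss(53, 2) for _ in range(1000)]

# Histogram step: count how many heights fall in each 1-inch-wide bin.
bin_counts = Counter(math.floor(h) for h in heights)

# Frequency-curve step: one point per bar, at the bar's midpoint and height.
curve_points = [(b + 0.5, bin_counts[b]) for b in sorted(bin_counts)]

# Crude text plot: one '#' per 10 observations in the bin.
for midpoint, count in curve_points:
    print(f"{midpoint:5.1f} {'#' * (count // 10)}")
```

The printed bars rise towards the middle bins and fall away on both sides, which is the bell shape discussed next.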

Let’s say the heights of females in India follow the curve below.

Normal distribution

This type of curve is called a normal distribution, a bell-shaped curve, or a Gaussian curve.

But a curve does not always have to be bell-shaped. If we plot salaries, for example, the curve may look like this:

Skewed graph

This means that more people earn low salaries than high salaries, so the curve has a long tail on the right. This type of curve is called right skewed.

Let’s dive deeper into the term “skewness”.

Skewness — explained via curves

In a distribution where the mean is greater than the mode, the curve is right skewed (positively skewed). In a distribution where the mean is less than the mode, the curve is left skewed (negatively skewed). Skewness indicates the direction and relative magnitude of a distribution’s deviation from the symmetry of the normal distribution.
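The mean-versus-mode comparison can be checked with a few lines of Python. This is only a sketch: the salary figures are invented for illustration, and the skewness value is the common moment coefficient (average cubed deviation divided by the cubed standard deviation), which is one of several ways to measure skewness.

```python
import statistics

# Hypothetical monthly salaries (in thousands): many low values, a few high ones.
salaries = [20, 20, 20, 25, 25, 30, 35, 40, 60, 120]

mean = statistics.mean(salaries)      # pulled upward by the few large salaries
median = statistics.median(salaries)
mode = statistics.mode(salaries)      # the most common value

# Moment coefficient of skewness: average cubed deviation divided by sd cubed.
sd = statistics.pstdev(salaries)
skewness = sum((x - mean) ** 3 for x in salaries) / len(salaries) / sd ** 3

print(mean, median, mode)             # 39.5 27.5 20
print("right skewed" if skewness > 0 else "left skewed or symmetric")
```

Here the mean (39.5) sits well above the mode (20) and the skewness coefficient is positive, so this sample is right skewed.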

The next important term is: Proportion

The proportion of a population whose measurements fall in a certain range equals the area under the curve over that range. In the earlier example, the mean height of females is 53 inches. Because a bell-shaped curve is symmetric about its mean, 50% of the data lies on either side of it. In other words, 50% of the females are taller than 53 inches and 50% are shorter.
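As a rough illustration of "proportion = area under the curve", the sketch below uses Python's built-in statistics.NormalDist. The mean of 53 inches comes from the example; the standard deviation of 2 inches is an assumption made only so the calculation can run.

```python
from statistics import NormalDist

# Mean of 53 inches is from the example; the standard deviation of 2 is assumed.
heights = NormalDist(mu=53, sigma=2)

# Area to the right of the mean: proportion of females taller than 53 inches.
taller_than_mean = 1 - heights.cdf(53)

# Area between 51 and 55 inches: proportion within one standard deviation.
between_51_and_55 = heights.cdf(55) - heights.cdf(51)

print(f"{taller_than_mean:.2f}")     # 0.50
print(f"{between_51_and_55:.4f}")    # 0.6827
```

The cdf method returns the area under the curve to the left of a value, so subtracting two cdf values gives the area, and therefore the proportion, between them.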

Percentiles and Standardized Scores

Percentile = the percentage of the population that falls below you.

Finding percentiles for normal curves requires:

• Your own value.

• The mean for the population of values.

• The standard deviation for the population.

Then any bell curve can be standardized so one table can be used to find percentiles.

Standardized score (z-score) = (observed value - mean) / standard deviation

Example :

For this example, assume IQ scores have a normal distribution with a mean of 50 and a standard deviation of 10.

• Suppose your IQ score was 66.

• Standardized score = (66–50)/10 = +1.6

• Your IQ is 1.6 standard deviations above the mean.

• Suppose your IQ score was 35.

• Standardized score = (35–50)/10 = –1.5

• Your IQ is 1.5 standard deviations below the mean.
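The two calculations above, and the percentile lookup that a printed table would normally provide, can be sketched in Python. The function names z_score and percentile are made up for this illustration; NormalDist().cdf from the standard library stands in for the table.

```python
from statistics import NormalDist

def z_score(value, mean, sd):
    """Standardized score: how many standard deviations value lies from the mean."""
    return (value - mean) / sd

def percentile(z):
    """Percentage of a normal population that falls below a given z-score."""
    return 100 * NormalDist().cdf(z)   # standard normal: mean 0, sd 1

for iq in (66, 35):
    z = z_score(iq, mean=50, sd=10)
    print(f"IQ {iq}: z = {z:+.1f}, percentile = {percentile(z):.1f}")
```

Under these assumptions, a score of 66 lands at roughly the 94.5th percentile and a score of 35 at roughly the 6.7th. The same function answers the height question from the start of the article: z_score(64, mean=60, sd=4) gives 1.0.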

Congratulations, you did it.

For now, thank you all for making it this far. We covered various types of curves, skewness, percentiles, and standardized scores. In the next article we will dive deeper into a few more terms that will help us in the field of data science and data analysis.

Please go through the previous article in case you missed it.

And as always, if you have any questions, remarks, or comments, feel free to contact me!

References:

Statistics How To

https://alevelmaths.co.uk/statistics

https://alevelmaths.co.uk/statistics/skewness/


Parmita Biswas
Analytics Vidhya

I am an enthusiastic data scientist as well as a Python developer. I have ten years of overall industry experience.