Inferential statistics (also called inductive statistics or statistical inference) is the process of inferring something about a population from what is measured in a sample. It is used to determine whether the data observed in a sample (i.e., the data we collect) differ from what one would expect by chance alone.

Statistics - Introduction & Terminology

Some argue that statisticians are less interested in generalizing from a sample to a specified population than in generalizing to an idealized superpopulation spanning space and time.

best course on statistics: https://bolt.mph.ufl.edu/6050-6052/

Introduction & Terminology

The field of statistics exists because it is usually impossible to collect data from all individuals of interest (population). Our only solution is to collect data from a subset (sample) of the individuals of interest, but our real desire is to know the “truth” about the population. Quantities such as means, standard deviations and proportions are all important values and are called “parameters” when we are talking about a population. Since we usually cannot get data from the whole population, we cannot know the values of the parameters for that population. We can, however, calculate estimates of these quantities for our sample. When they are calculated from sample data, these quantities are called “statistics.” A statistic estimates a parameter.

population distribution - consists of all units of interest

empirical distribution - consists of the observed units collected from the population

population parameter (𝜽)
- sometimes just called a parameter
- is any summary quantity computed from the population distribution (e.g. mean, variance, etc.)
- usually has an unknown value

sample statistic (𝜽̂)
- sometimes just called a statistic
- is a function that takes the sample as input
- is any summary quantity computed from the sample (e.g. sample mean, sample variance, etc.)
- is an estimate of the corresponding population parameter 𝜽
- is a random variable, because it is computed from a random sample, which is a subset of the population. Thus, the statistic has a sampling distribution
- see methods for estimating a sample statistic
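The difference between a parameter and a statistic, and the idea that a statistic has its own sampling distribution, can be sketched with a small simulation. This is only an illustration under stated assumptions: the population is simulated from a Normal(50, 10) distribution, and the names `population`, `theta`, and `sample_mean` are hypothetical, not part of these notes.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 simulated values (assumption for illustration).
population = [rng.gauss(50, 10) for _ in range(100_000)]

# Population parameter: computable here only because we simulated the population.
theta = statistics.fmean(population)

def sample_mean(n):
    """Draw a simple random sample of size n and return its mean (a statistic)."""
    return statistics.fmean(rng.sample(population, n))

# Each new sample gives a different value of the statistic. Repeating the
# sampling many times approximates the statistic's sampling distribution.
n = 100
estimates = [sample_mean(n) for _ in range(2_000)]

print(f"parameter theta        = {theta:.3f}")
print(f"one estimate theta_hat = {estimates[0]:.3f}")
print(f"mean of estimates      = {statistics.fmean(estimates):.3f}")
print(f"spread of estimates    = {statistics.stdev(estimates):.3f}")
```

The spread of the estimates should come out close to the population standard deviation divided by the square root of `n`, which is the usual standard error of the sample mean.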

Error

Random Process - Random Variables - Stochastic Model - Probability Distribution - Statistical Inference - Statistical Model - Exploratory Data Analysis - Estimator - Probability Model

Many observable phenomena are random in nature. We call such a phenomenon a Random Process (Random Experiment). The random process has outcomes, and subsets of these outcomes are called Events. We map these outcomes to numeric form using Random Variables.

We study and capture our knowledge about this random process by creating a Stochastic Model. The stochastic model describes the outcome of the process by providing:
- the possible values of a random variable
- the probability of each of those values

These two elements are summarized as a Probability Distribution.
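As a small illustration of these terms, consider rolling two fair dice. The sketch below is an assumed example (the dice, the random variable `X`, and the chosen event are not from these notes): it enumerates the outcomes, maps each outcome to a number, and tabulates the resulting probability distribution.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Random process: rolling two fair dice. All 36 outcomes are equally likely.
outcomes = list(product(range(1, 7), repeat=2))

def X(outcome):
    """Random variable: maps an outcome (a pair of faces) to a number, their sum."""
    return sum(outcome)

# Probability distribution: each possible value of X with its probability.
counts = Counter(X(o) for o in outcomes)
distribution = {
    value: Fraction(count, len(outcomes))
    for value, count in sorted(counts.items())
}

# An event is a subset of outcomes, e.g. "the sum is at least 10".
p_event = sum(p for value, p in distribution.items() if value >= 10)

for value, p in distribution.items():
    print(f"P(X = {value:2d}) = {p}")
print(f"P(X >= 10) = {p_event}")
```

Exact fractions are used instead of floats so that the probabilities sum to exactly 1.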

This distribution has parameters (such as the mean and standard deviation) that are inferred from the observable phenomena using Statistical Inference.

Before inference, the distribution had unknown (not inferred yet) parameters. It was, hence, a family of distributions, since each value of the parameter is a different distribution. This family is called a Statistical Model.

Usually, a statistical model is first guessed (exponential, binomial, normal, uniform, Bernoulli, etc.) using Exploratory Data Analysis. Its parameters are then inferred (estimated) by applying statistical inference, for example with algorithms that minimize a loss function. The rule that computes the estimate from the data is called an Estimator, and the result is a stochastic model (a statistical model with known parameters) that captures our knowledge about the random process.
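A minimal sketch of this workflow, assuming the guessed statistical model is the exponential family and the loss is the negative log-likelihood. The data are simulated, and `neg_log_likelihood` and `minimize_scalar` are hypothetical helpers written for this example, not functions from any library.

```python
import math
import random

rng = random.Random(1)

# Hypothetical random process: waiting times with true rate 2.0
# (the true rate would be unknown in practice).
data = [rng.expovariate(2.0) for _ in range(5_000)]

def neg_log_likelihood(rate, xs):
    """Loss for the exponential model: minus the sum of log(rate * exp(-rate * x))."""
    return -(len(xs) * math.log(rate) - rate * sum(xs))

def minimize_scalar(f, lo, hi, tol=1e-8):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return (a + b) / 2

# The statistical model is the whole family {Exponential(rate) : rate > 0}.
# Minimizing the loss picks one member of it: the fitted stochastic model.
rate_hat = minimize_scalar(lambda r: neg_log_likelihood(r, data), 1e-6, 50.0)

# For this family the minimizer is also known in closed form: 1 / sample mean.
closed_form = len(data) / sum(data)

print(f"numerical estimate   = {rate_hat:.4f}")
print(f"closed-form estimate = {closed_form:.4f}")
```

The numerical search is used only to show the loss-minimization view. When a closed form exists, as it does here, it is the simpler choice.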

The term ‘Probability Model’ (probabilistic model) is usually an alias for stochastic models.


Inferential Statistics - Paradigms


Inferential Statistics - Forms/Methods

Each form/method represents a different way of using the information obtained in the sample to draw conclusions about the population

Estimation
- Point Estimation - given sample data, estimate the value of the unknown population parameter with a single number
- Interval Estimation - given sample data, estimate the value of an unknown population parameter using an interval of values that is likely to contain the true value of that parameter (and state how confident we are that this interval indeed captures the true value of the parameter)
Hypothesis Testing - begin with a claim about the population (called the null hypothesis) and check whether or not the given sample data provide evidence against this claim
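For a population mean, the three forms can be written with the standard large-sample formulas. This is a sketch under assumptions not stated in these notes: a simple random sample of size n with sample mean x̄ and sample standard deviation s, a sample large enough for the normal approximation, a significance level α, and a hypothesized value μ₀.

```latex
\begin{align*}
\text{Point estimate:}\quad
  & \hat{\mu} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \\
\text{Interval estimate:}\quad
  & \bar{x} \pm z_{1-\alpha/2}\,\frac{s}{\sqrt{n}} \\
\text{Test statistic for } H_0\colon \mu = \mu_0\text{:}\quad
  & z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}
\end{align*}
```

All three use the same ingredients (x̄ and the standard error s/√n). They differ only in the question asked of the sample.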

Inferential statistics uses probabilistic, approximate inference procedures to draw conclusions about the whole population

Inferential Statistics - Process

A random sample is taken from the population
In order to estimate a population parameter, a sample statistic is calculated from the sample (e.g. sample mean, sample proportion, etc.)
- This is where point estimation is used
We then learn about the sample statistic’s sampling distribution
- This is where interval estimation is used
Using this sampling distribution we can make inferences about our population parameter based on our sample statistic
- This is where hypothesis testing is used
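The steps above can be sketched end to end for a population mean. This is an illustration only: the population is simulated, the hypothesized value `mu_0` is arbitrary, and the interval and test rely on the large-sample normal approximation rather than on any method prescribed by these notes.

```python
import random
from statistics import NormalDist, fmean, stdev

rng = random.Random(42)

# Step 1: take a random sample from a (hypothetical, simulated) population.
population = [rng.gauss(170, 8) for _ in range(50_000)]
sample = rng.sample(population, 200)
n = len(sample)

# Step 2: point estimation. The sample mean estimates the population mean.
x_bar = fmean(sample)

# Step 3: the sampling distribution of the mean is approximately normal with
# standard error s / sqrt(n); interval estimation uses it to build a 95% CI.
se = stdev(sample) / n ** 0.5
z_crit = NormalDist().inv_cdf(0.975)
ci = (x_bar - z_crit * se, x_bar + z_crit * se)

# Step 4: hypothesis testing. H0: the population mean equals mu_0 (assumed value).
mu_0 = 168
z = (x_bar - mu_0) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value

print(f"point estimate = {x_bar:.2f}")
print(f"95% interval   = ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value would count as evidence against the null hypothesis. The threshold for "small" is a separate choice that this sketch does not make.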

Inferential Statistics - Goals

Parameter Estimation / Interval Estimation / Hypothesis Testing
- the parameters/properties of a population distribution are called population parameters and are usually unknown constants. They need to be estimated in such a way that the resulting distribution model best explains the observed data
- e.g. the parameters of a normal distribution are its mean and standard deviation. So, if you know that the data resembles the model of a normal distribution, parameter estimation would amount to trying to learn the true values of its mean and standard deviation
Structure Estimation - Distribution Model Comparison
- the distribution of a population is often unknown
- we propose a set of possible distribution models, have each model parameter estimated, and then use model comparison to select the model that best explains the observed data
Data Prediction
- for this goal, you usually already have a distribution model produced by the first two goals. You then use that model to predict future data.
- e.g. after measuring the heights of females in a sample, you can estimate the mean and standard deviation of the distribution for all adult females. Then you can use these values to predict the probability of a randomly chosen female having a height within a certain range of values
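The three goals can be sketched together on the height example. Everything specific here is an assumption made for illustration: the heights are simulated rather than measured, the two candidate models (normal and exponential) are arbitrary choices, and AIC is just one common criterion for model comparison.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev

rng = random.Random(7)

# Hypothetical sample of adult female heights in cm (simulated, not real data).
heights = [rng.gauss(162, 7) for _ in range(500)]
n = len(heights)

# Goal 1: parameter estimation (maximum likelihood for each candidate model).
mu_hat = fmean(heights)
sigma_hat = pstdev(heights)      # the MLE of sigma divides by n, not n - 1
normal_model = NormalDist(mu_hat, sigma_hat)
rate_hat = 1 / mu_hat            # MLE of the exponential rate

# Goal 2: structure estimation. Compare the fitted models with
# AIC = 2k - 2 * log-likelihood, where k is the number of parameters.
loglik_normal = sum(math.log(normal_model.pdf(x)) for x in heights)
loglik_expon = n * math.log(rate_hat) - rate_hat * sum(heights)
aic_normal = 2 * 2 - 2 * loglik_normal
aic_expon = 2 * 1 - 2 * loglik_expon
best = "normal" if aic_normal < aic_expon else "exponential"

# Goal 3: data prediction. Probability that a randomly chosen height
# falls between 155 cm and 170 cm under the fitted normal model.
p_range = normal_model.cdf(170) - normal_model.cdf(155)

print(f"mu_hat = {mu_hat:.2f}, sigma_hat = {sigma_hat:.2f}")
print(f"AIC normal = {aic_normal:.1f}, AIC exponential = {aic_expon:.1f}")
print(f"selected model: {best}")
print(f"P(155 <= height <= 170) = {p_range:.3f}")
```

A lower AIC means a better trade-off between fit and number of parameters, so the model with the smaller value is selected.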

／var／log marcus chiu
