STATISTICS
- the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making more effective decisions. (Mason, Lind, and Marchal)
Statistics is used in a wide range of fields and situations whenever you need to analyze data, make predictions, or draw conclusions based on evidence. Here are some common scenarios where statistics is utilized:
Scientific Research: Statistics is crucial for designing experiments, analyzing data, and drawing conclusions in fields like biology, chemistry, physics, and social sciences.
Business and Economics: Businesses use statistics for market research, forecasting sales, analyzing financial data, and making strategic decisions. Economists use statistics to study economic trends, evaluate policies, and forecast economic indicators.
Healthcare: Statistics is vital in medical research for clinical trials, epidemiological studies, analyzing patient data, and evaluating treatment effectiveness.
Quality Control: Industries use statistics to monitor product quality, control manufacturing processes, and ensure compliance with standards.
Finance: Statistics plays a key role in risk management, portfolio analysis, pricing financial instruments, and modeling financial markets.
Social Sciences: Statistics is used in sociology, psychology, political science, and other social sciences to analyze survey data, study social phenomena, and test hypotheses.
Education: Statistics is important in educational research for evaluating teaching methods, assessing student performance, and conducting educational assessments.
Sports Analytics: Statistics is widely used in sports to analyze player performance, optimize strategies, and make decisions in areas like player recruitment and game tactics.
Environmental Studies: Statistics is used to analyze environmental data, study climate change, assess environmental impacts, and model ecological systems.
Market Research: Statistics is employed to analyze consumer preferences, conduct surveys, segment markets, and predict market trends.
- Provides researchers the means to scientifically measure the conditions that may be involved in a given problem and evaluate how these conditions are related.
- Shows the laws underlying facts and events that cannot be determined by individual observation.
- Allows researchers to observe trends and behavior in related conditions that otherwise may remain unclear.
TYPES OF STATISTICS
- referring to, constituting, or grounded in matters of observation or experience.
- concerned with the gathering, classification, and representation of data and the collection of summarizing values to describe group characteristics of the data.
- aims to give information about the large groups of data without dealing with each
hypotheses using t-test, z-test, correlation, analysis of variance, chi-square test, regression analysis, and time series analysis. The basis for inferential statistics is the ability to make decisions about parameters without having a complete census of the population.
(source: Masteral notes)
SOME DEFINITIONS OF TERMS USED IN STATISTICS:
- Quantitative Variable – when the variable studied can be reported numerically.
- Sample – A portion or part of the population of interest.
- Population – A collection of all possible individuals, objects, or measurements of interest.
- Qualitative Variable – when the characteristic or variable being studied is non-numeric.
- Discrete variable – a quantitative variable that can assume only certain values, such as the number of bedrooms in a house or the number of cars arriving at a tollbooth.
- Continuous variable – a quantitative variable that can assume any value within a specific range, such as the air pressure in a tire, the weight of a shipment of grain, or the flight time from the USA to Manila.
Collection of Data:
- 1. Primary data – refers to information that is gathered directly from an original source, or that is based on direct or first-hand experience.
- 2. Secondary data – refers to information that is taken from published or unpublished data previously gathered by other individuals or agencies.
- 1. Direct or Interview method – this is a method of person-to-person exchange between the interviewer and the interviewee.
prepared questions.
observations in each mutually exclusive category.
- Find the range (the difference between the highest score and the lowest score).
- FIND THE CLASS INTERVAL.
4. Write the CI starting with the lowest score limit as determined by your choice ( as
5. Determine the class frequencies for each class interval by referring to the tally column.
6. Compute the class mark by adding the lower and upper limits of the class interval and dividing the sum by 2. The class mark is the representative value of the corresponding interval.
Class Boundaries (CB) – more precise expressions of the class limits, usually obtained by extending each limit by 0.5 for whole-number scores. A class boundary is situated midway between the upper limit of one interval and the lower limit of the next interval.
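The steps above can be sketched in a few lines of Python. This is only an illustration: the scores are made up, the class size is passed in directly (the rule for choosing it is not shown in these notes), the first class is assumed to start at the lowest score, and the 0.5 adjustment for the boundaries assumes whole-number scores.

```python
def frequency_distribution(scores, class_size):
    """Build a simple frequency distribution for whole-number scores."""
    lowest, highest = min(scores), max(scores)
    score_range = highest - lowest          # range = highest - lowest

    rows = []
    lower = lowest                          # assumption: start at the lowest score
    while lower <= highest:
        upper = lower + class_size - 1
        # Class frequency: how many scores fall inside this class interval
        frequency = sum(1 for s in scores if lower <= s <= upper)
        rows.append({
            "limits": (lower, upper),
            "frequency": frequency,
            "class_mark": (lower + upper) / 2,           # (LL + UL) / 2
            "boundaries": (lower - 0.5, upper + 0.5),    # whole-number data
        })
        lower = upper + 1
    return score_range, rows


# Hypothetical scores, used only for illustration
scores = [21, 34, 25, 40, 28, 31, 37, 22, 29, 33, 26, 38, 30, 24, 35]
score_range, rows = frequency_distribution(scores, class_size=5)

print("Range:", score_range)
for row in rows:
    print(row["limits"], row["frequency"], row["class_mark"], row["boundaries"])
```

Each row carries the class limits, the frequency, the class mark, and the class boundaries, which is the same information a hand-made tally table would hold.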
- Find the range
- Make the class intervals (use a class size of 5)
- Show the less-than cumulative frequency (<cf)
- Show the greater-than cumulative frequency (>cf)
- Solve for relative frequency
- Solve for the Percentage frequency
- Solve for the Class Mark
- Solve for the Class Boundaries
- Solve for the sector sizes (in degrees) for the pie/circle graph
- Draw the circle/pie graph/chart
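A minimal Python sketch of the computations in this list, assuming the class frequencies have already been tallied. The frequencies below are hypothetical, "<" and ">" are read as the less-than and greater-than cumulative frequencies, and the pie-chart sector for each class is taken as its relative frequency times 360 degrees.

```python
from itertools import accumulate

# Hypothetical class frequencies, listed from the lowest class to the highest
frequencies = [4, 4, 4, 3]
total = sum(frequencies)

# Less-than cumulative frequency: running total starting from the lowest class
less_than_cf = list(accumulate(frequencies))

# Greater-than cumulative frequency: running total starting from the highest class
greater_than_cf = list(accumulate(reversed(frequencies)))[::-1]

# Relative frequency, percentage frequency, and pie-chart sector in degrees
relative = [f / total for f in frequencies]
percentage = [100 * r for r in relative]
degrees = [360 * r for r in relative]

for row in zip(frequencies, less_than_cf, greater_than_cf,
               relative, percentage, degrees):
    print("f=%d  <cf=%d  >cf=%d  rf=%.3f  %%=%.1f  deg=%.1f" % row)
```

The relative frequencies should add up to 1, the percentages to 100, and the sectors to 360 degrees, which is a quick check on the arithmetic.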
- Popularly known as average.
- They are descriptive statistics because a single number describes a central value of a group of observations or individuals, and this central value represents all the figures in the group of which it is a part.
- It is a shorthand description of a group of quantitative data obtained from a sample.
- It is more economical, easier, and meaningful to let one figure stand for a group than to remember all particular numbers in a group.
- It is descriptive of a sample obtained in a particular group of observations at a particular time in a particular way.
- It also describes indirectly, but with some accuracy, the population from which the sample is drawn.
- The arithmetic mean is a frequently used measure of central tendency because it is subject to less error.
- It lends itself to algebraic manipulation.
- Its standard error is less than that of the median.
- The sum of the deviations of the cases about the mean is zero.
- It is entirely independent of the extreme measures.
- Its position is not stable.
- It is not contributed by all items in a series.
- It is not always well-defined or possible to locate properly.
- The set of observations can be unimodal (one mode), bimodal ( two modes), trimodal (three modes), or polymodal.
- Most reliable, most stable, and with the least probable error.
- Most generally recognized measure of central tendency.
- the best measure for irregular or skewed distribution.
- It may be located in an open-end distribution or when the data are incomplete.
- It is always a real value since it does not fall on zero.
- Simple to approximate by observation especially when the number of cases is small.
- It does not lend itself to algebraic manipulation.
- Does not require the arrangement of values.
- Does not supply information about the homogeneity of the group.
- The more heterogeneous the set of observations or group of individuals is, the less satisfactory the mean is as a measure of central tendency.
- Requires the arranging of items according to size before it can be computed.
- Has a larger probable error than the mean.
- It does not lend itself to algebraic treatment.
- Erratic when the data do not cluster at the center of distribution.
- Inapplicable to a small number of cases when the values may not be repeated.
- It is rigidly defined and is inapplicable to irregular distribution.
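Some of the properties listed above can be checked quickly with Python's standard `statistics` module. The data below are made up for illustration; the sketch shows that the deviations about the mean sum to zero, that an extreme value pulls the mean but not the median, and how a set with two modes is reported.

```python
import statistics

# Hypothetical scores, used only for illustration
data = [3, 5, 5, 6, 7, 8, 8, 8, 10]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# The deviations about the mean sum to zero (up to floating-point rounding)
sum_of_deviations = sum(x - mean for x in data)

# Replace the highest score with an extreme value and compare
with_extreme = data[:-1] + [100]
mean_extreme = statistics.mean(with_extreme)
median_extreme = statistics.median(with_extreme)

# A bimodal set: multimode returns every value tied for the highest frequency
modes = statistics.multimode([1, 1, 2, 2, 3])

print("mean:", mean, "median:", median, "mode:", mode)
print("sum of deviations about the mean:", sum_of_deviations)
print("with an extreme value -> mean:", mean_extreme, "median:", median_extreme)
print("modes of a bimodal set:", modes)
```

The mean moves noticeably when one score is replaced by 100, while the median stays where it was, which is why the median is preferred for skewed distributions.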
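Formula (quartiles, k = 1, 2, 3):
Qk = LB + [(kN/4 - <cf) / f] i
Formula (deciles, k = 1, 2, ..., 9):
Dk = LB + [(kN/10 - <cf) / f] i
Formula (percentiles, k = 1, 2, ..., 99):
Pk = LB + [(kN/100 - <cf) / f] i
where LB is the lower class boundary of the class containing the value, N is the total number of observations, <cf is the less-than cumulative frequency of the class just before that class, f is the frequency of that class, and i is the class size.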
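The three formulas differ only in the divisor (4, 10, or 100), so one function can cover all of them. The sketch below is an illustration under stated assumptions: the frequency table is hypothetical, the scores are whole numbers (so each lower boundary is the lower limit minus 0.5), and the cumulative frequency used is the one accumulated up to the class just before the class that contains the value.

```python
def grouped_quantile(classes, k, parts):
    """Quantile for grouped data.

    classes: list of (lower_limit, upper_limit, frequency), lowest class first
    k:       which quantile (e.g. 1 for Q1, 5 for D5, 90 for P90)
    parts:   4 for quartiles, 10 for deciles, 100 for percentiles
    """
    n = sum(f for _, _, f in classes)
    target = k * n / parts                   # kN/4, kN/10, or kN/100

    cf_before = 0                            # cumulative frequency before the class
    for lower, upper, f in classes:
        if f > 0 and cf_before + f >= target:
            lb = lower - 0.5                 # lower class boundary (whole-number data)
            size = upper - lower + 1         # class size i
            return lb + ((target - cf_before) / f) * size
        cf_before += f
    raise ValueError("k is out of range for the given number of parts")


# Hypothetical frequency table, used only for illustration
table = [(10, 14, 3), (15, 19, 7), (20, 24, 12), (25, 29, 10), (30, 34, 8)]

q1 = grouped_quantile(table, 1, 4)
q2 = grouped_quantile(table, 2, 4)
q3 = grouped_quantile(table, 3, 4)
d5 = grouped_quantile(table, 5, 10)
p50 = grouped_quantile(table, 50, 100)

print("Q1:", q1, "Q2:", q2, "Q3:", q3)
print("D5:", d5, "P50:", p50)
```

Q2, D5, and P50 all land on the same value, as expected, since each of them is the median.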
- It tells us the spread of the data.
- Measures of variability give information on how the data are scattered or spread and describe the mass of data. They give the total picture and characteristics of the set of data on how they are dispersed.
- Absolute Variability:
- Range - the simplest and easiest measure of variability, classified into absolute range, total range, and Kelly range. The absolute range is simply the difference between the highest and lowest scores. The total range is obtained by subtracting the lowest score from the highest score and adding 1. The Kelly range is the difference between the 90th and 10th percentiles (P90 - P10).
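A short Python sketch of the three ranges, using made-up scores. The total range is taken here as highest minus lowest plus 1, and the percentiles come from `statistics.quantiles` with the "inclusive" method; both are assumptions of this sketch, and other percentile conventions can give slightly different Kelly ranges.

```python
import statistics

# Hypothetical scores, used only for illustration
scores = [12, 15, 17, 20, 22, 25, 28, 30, 33, 35, 40]

absolute_range = max(scores) - min(scores)        # highest - lowest
total_range = max(scores) - min(scores) + 1       # assumed: highest - lowest + 1

# Nine cut points: index 0 is P10, index 8 is P90
deciles = statistics.quantiles(scores, n=10, method="inclusive")
kelly_range = deciles[8] - deciles[0]             # P90 - P10

print("Absolute range:", absolute_range)
print("Total range:", total_range)
print("Kelly range:", kelly_range)
```

Because the Kelly range ignores the lowest and highest 10% of the scores, it is less sensitive to a single unusual score than the absolute range.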
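Relative standard deviation (RSD), or coefficient of variation (CV)
Formula:
CV = (s / mean) × 100%, where s is the standard deviation and mean is the arithmetic mean of the data.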
Interquartile range (IQR)
To talk about the interquartile range, we need to first talk about the percentile. The pth percentile of the data set is a measurement such that after the data are ordered from smallest to largest, at most p% of the data are below this value and at most (100-p)% above it. Thus, the median is the 50th percentile.
Also, Q1 = lower quartile = 25th percentile and Q3 = upper quartile = 75th percentile.
The interquartile range is the difference between the upper and lower quartiles and is denoted as IQR.
IQR = Q3 - Q1 = upper quartile - lower quartile = 75th percentile - 25th percentile.
Note: IQR is not affected by extreme values. It is thus a resistant measure of variability.
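The resistance of the IQR to extreme values is easy to see with a small Python sketch. The data are made up, and the quartiles are computed with `statistics.quantiles` using the "inclusive" method, which is only one of several common conventions for locating Q1 and Q3.

```python
import statistics


def iqr(values):
    """Interquartile range: Q3 - Q1."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Hypothetical data, used only for illustration
data = [2, 4, 4, 5, 6, 7, 8, 9, 10]
with_extreme = [2, 4, 4, 5, 6, 7, 8, 9, 100]    # highest value replaced by 100

iqr_data = iqr(data)
iqr_extreme = iqr(with_extreme)
range_data = max(data) - min(data)
range_extreme = max(with_extreme) - min(with_extreme)

print("IQR:", iqr_data, "-> with extreme value:", iqr_extreme)
print("Range:", range_data, "-> with extreme value:", range_extreme)
```

The range jumps when the extreme value is introduced, while the IQR does not change, because Q1 and Q3 depend only on the middle half of the ordered data.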
Formula:
Weight(kg) | Length(m)
0.43 | 0.52
0.54 | 0.62
0.41 | 0.51
0.63 | 0.68
0.55 | 0.63
0.42 | 0.57
0.58 | 0.62
0.57 | 0.61
0.48 | 0.54
0.62 | 0.68
0.60 | 0.65
0.59 | 0.62
0.65 | 0.72
0.59 | 0.63
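Because weight and length are measured in different units, their spreads can be compared with the coefficient of variation, which is the standard deviation expressed as a percentage of the mean. The sketch below is an illustration only: it assumes the values in the table alternate weight, length, weight, length, and it uses the sample standard deviation (`statistics.stdev`); the notes do not say whether the sample or the population formula is intended.

```python
import statistics

# Values from the table, assuming they alternate weight (kg), length (m)
weight = [0.43, 0.54, 0.41, 0.63, 0.55, 0.42, 0.58,
          0.57, 0.48, 0.62, 0.60, 0.59, 0.65, 0.59]
length = [0.52, 0.62, 0.51, 0.68, 0.63, 0.57, 0.62,
          0.61, 0.54, 0.68, 0.65, 0.62, 0.72, 0.63]


def coefficient_of_variation(values):
    """CV = (standard deviation / mean) * 100, using the sample standard deviation."""
    return statistics.stdev(values) / statistics.mean(values) * 100


cv_weight = coefficient_of_variation(weight)
cv_length = coefficient_of_variation(length)

print("CV of weight: %.2f%%" % cv_weight)
print("CV of length: %.2f%%" % cv_length)
```

The variable with the larger CV is the more variable one relative to its own mean, regardless of the unit it is measured in.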