Saturday, September 21, 2024

Quantitative Research Fundamentals

REF BOOK: Statistics for Management by Richard I Levin, David S Rubin, Sanjay Rastogi and Masood Husain Siddiqui 

Chapter 2

DEFINITIONS

Continuous Data : Data that may progress from one class to the next without a break and may be expressed by either whole numbers or fractions.

Cumulative frequency distribution : A tabular display of data showing how many observations lie above, or below, certain values.

Data : A collection of any number of related observations on one or more variables.

Data Array : The arrangement of raw data by observations in either ascending or descending order.

Data Point : A single observation from a data set.

Data Set:  A collection of data.

Discrete Classes : Data that do not progress from one class to the next without a break; that is, where classes represent distinct categories or counts and may be represented by whole numbers.

Frequency Curve: A frequency polygon smoothed by adding classes and data points to a data set.

Frequency Distribution: An organized display of data that shows the number of observations from the data set that fall into each of a set of mutually exclusive and collectively exhaustive classes.

Frequency Polygon : A line graph connecting the midpoints of each class in a data set, plotted at a height corresponding to the frequency of the class.

Histogram : A graph of a data set, composed of a series of rectangles, each proportional in width to the range of values in a class and proportional in height to the number of items falling in the class, or the fraction of items in the class.

Ogive: A graph of a cumulative frequency distribution.

Open-Ended Class : A class that allows either the upper or lower end of a quantitative classification scheme to be limitless.

Population : A collection of all the elements we are studying and about which we are trying to draw conclusions.

Raw Data: Information before it is arranged or analyzed by statistical methods.

Relative Frequency Distribution : The display of a data set that shows the fraction or percentage of the total data set that falls into each of a set of mutually exclusive and collectively exhaustive classes.
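
To illustrate the frequency, relative frequency, and cumulative frequency distributions defined above, here is a minimal Python sketch. The data values, the starting point 12, and the class width 6 are made up for the example and do not come from the book.

```python
from collections import Counter

# Hypothetical raw data (not from the reference book).
data = [12, 15, 17, 21, 22, 25, 28, 30, 33, 41]

# Assumed classes: five equal classes of width 6 starting at 12,
# i.e. 12-17, 18-23, 24-29, 30-35, 36-41 for whole-number data.
start, width, num_classes = 12, 6, 5

# Frequency distribution: count of observations in each class.
counts = Counter((x - start) // width for x in data)
frequencies = [counts.get(i, 0) for i in range(num_classes)]

# Relative frequency distribution: each count as a fraction of the total.
relative = [f / len(data) for f in frequencies]

# Cumulative ("less than") frequency distribution: running total of counts.
cumulative = []
running = 0
for f in frequencies:
    running += f
    cumulative.append(running)

for i in range(num_classes):
    low = start + i * width
    high = low + width - 1
    print(f"{low}-{high}: {frequencies[i]}  {relative[i]:.2f}  {cumulative[i]}")
```

The classes are mutually exclusive and collectively exhaustive, so the relative frequencies add up to 1 and the last cumulative value equals the number of observations.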

Representative Sample : A sample that contains the relevant characteristics of the population in the same proportions as they are included in that population.

Sample : A collection of some, but not all, of the elements of the population under study, used to describe the population.

Width of class intervals = (Next unit value after largest value in data - Smallest value in data) / Total number of class intervals
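
A small worked example of the class width formula, as a Python sketch. The data and the choice of 5 classes are hypothetical, and the `unit` parameter is an assumption: it stands for the measurement precision of the data (1 for whole numbers), so that the "next unit value after largest value" is the maximum plus one unit.

```python
def class_width(values, num_classes, unit=1):
    """Width of class intervals following the formula above.

    `unit` is the assumed measurement precision of the data, so the
    next unit value after the largest value is max(values) + unit.
    """
    next_after_largest = max(values) + unit
    return (next_after_largest - min(values)) / num_classes

# Hypothetical whole-number data (not from the reference book).
data = [12, 15, 17, 21, 22, 25, 28, 30, 33, 41]

width = class_width(data, num_classes=5)
print(width)  # (42 - 12) / 5 = 6.0
```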

Chapter 3

Bimodal Distribution: A distribution of data points in which two values occur more frequently than the rest of the values in the data set.

Boxplot : A graphical EDA technique used to highlight the center and extremes of a data set.

Chebyshev's Theorem : No matter what the shape of a distribution, 
- at least 75 percent of the values in the population will fall within 2 standard deviations of the mean and
- at least 89 percent will fall within 3 standard deviations.
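
The 75 and 89 percent figures are the k = 2 and k = 3 cases of the general Chebyshev bound, at least 1 - 1/k² of the values within k standard deviations of the mean. A minimal Python sketch, with a made-up, deliberately lopsided data set to show that the bound holds regardless of shape:

```python
import statistics

def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2

def fraction_within(values, k):
    """Observed fraction of values within k population standard deviations."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return sum(abs(x - mu) <= k * sigma for x in values) / len(values)

# Hypothetical, strongly skewed population (not from the reference book).
skewed = [1, 1, 1, 2, 2, 3, 4, 10, 50]

for k in (2, 3):
    # Prints k, the guaranteed minimum fraction, and the observed fraction.
    print(k, chebyshev_lower_bound(k), fraction_within(skewed, k))
```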

Coding: A method of calculating the mean for grouped data by recoding values of class midpoints to simpler values.

Coefficient of Variation: A relative measure of dispersion, comparable across distributions, that expresses the standard deviation as a percentage of the mean.
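
A minimal Python sketch of the coefficient of variation, assuming the data are treated as whole populations (hence `statistics.pstdev`). The two data sets are hypothetical; they have the same standard deviation but very different relative dispersion.

```python
import statistics

def coefficient_of_variation(values):
    """Standard deviation expressed as a percentage of the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation (assumption)
    return sd / mean * 100

# Hypothetical data sets (not from the reference book).
a = [10, 20, 30]
b = [1000, 1010, 1020]

cv_a = coefficient_of_variation(a)
cv_b = coefficient_of_variation(b)
print(round(cv_a, 2), round(cv_b, 2))
```

Both sets are spread by the same absolute amount, but relative to its mean the first set is far more variable, which is what the coefficient of variation captures.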

Deciles:  Fractiles that divide the data into 10 equal parts.

Dispersion: The spread or variability in a set of data.

Distance Measure: A measure of dispersion in terms of the difference between two values in the data set.

Exploratory Data Analysis (EDA): Methods for analyzing data that require very few prior assumptions.

Fractile: In a frequency distribution, the location of a value at or above a given fraction of the data. 

Geometric Mean: A measure of central tendency used to measure the average rate of change or growth for some quantity, computed by taking the nth root of the product of n values representing change.
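
A minimal Python sketch of the geometric mean of growth factors. The two factors are hypothetical: a 25 percent rise followed by a 20 percent fall, which returns the quantity to its starting level.

```python
import math

def geometric_mean(factors):
    """nth root of the product of n growth factors."""
    return math.prod(factors) ** (1 / len(factors))

# Hypothetical growth factors: +25% in one period, -20% in the next.
growth_factors = [1.25, 0.80]

gm = geometric_mean(growth_factors)
am = sum(growth_factors) / len(growth_factors)
print(gm, am)
```

The geometric mean is approximately 1, correctly reporting no net growth, while the arithmetic mean of the same factors is about 1.025 and would wrongly suggest average growth of 2.5 percent per period.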

Interfractile Range : A measure of the spread between two fractiles in a distribution, that is, the difference between the values of two fractiles.

Interquartile Range : The difference between the values of the first and the third quartiles; this difference indicates the range of the middle half of the data set. 
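
A minimal Python sketch of the interquartile range using the standard library. Quartile conventions differ between textbooks and software, so the default method of `statistics.quantiles` is an assumption here and may not match the book's procedure exactly; the data are hypothetical.

```python
import statistics

# Hypothetical data (not from the reference book).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 splits the data into four equal parts and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1
print(q1, q3, iqr)  # 3.0 9.0 6.0
```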

Kurtosis: The degree of peakedness of a distribution of points.

Mean : A central tendency measure representing the arithmetic average of a set of observations.

Measure of Central Tendency : A measure indicating the value to be expected of a typical or middle data point.

Measure of Dispersion : A measure describing how the observations in a data set are scattered or spread out.

Median : The middle point of a data set, a measure of location that divides the data set into halves. 

Median Class : The class in a frequency distribution that contains the median value for a data set.

Mode : The value most often repeated in the data set. It is represented by the highest point in the distribution curve of a data set.

Parameters : Numerical values that describe the characteristics of a whole population, commonly represented by Greek letters.

Percentiles : Fractiles that divide the data into 100 equal parts. 

Quartiles : Fractiles that divide the data into four equal parts.

Range : The distance between the highest and lowest values in a data set.

Skewness : The extent to which a distribution of data points is concentrated at one end or the other; the lack of symmetry.

Standard Deviation : The positive square root of the variance; a measure of dispersion in the same units as the original data, rather than in the squared units of the variance.

Standard Score : Expressing an observation in terms of standard deviation units above or below the mean; that is, the transformation of an observation by subtracting the mean and dividing by the standard deviation.
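
A minimal Python sketch of the standard score. The mean of 50, standard deviation of 10, and observation of 65 are hypothetical values chosen for the example.

```python
def standard_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical population with mean 50 and standard deviation 10.
z = standard_score(65, mean=50, sd=10)
print(z)  # 1.5, i.e. 1.5 standard deviations above the mean
```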

Statistics :  Numerical measures describing the characteristics of a sample. Represented by Roman letters. 

Stem and Leaf Display: A histogram-like display used in EDA to group data, while still displaying all the original values.

Summary Statistics: Single numbers that describe certain characteristics of a data set.

Symmetrical: A characteristic of a distribution in which each half is the mirror image of the other half.

Variance : A measure of the average squared distance between the mean and each item in the population.
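
A minimal Python sketch that computes the population variance step by step and takes its positive square root to get the standard deviation. The data are hypothetical.

```python
import math

# Hypothetical population (not from the reference book).
population = [2, 4, 4, 4, 5, 5, 7, 9]

# Population mean.
mu = sum(population) / len(population)

# Variance: average of the squared distances between each item and the mean.
variance = sum((x - mu) ** 2 for x in population) / len(population)

# Standard deviation: positive square root of the variance,
# back in the same units as the original data.
sd = math.sqrt(variance)

print(mu, variance, sd)  # 5.0 4.0 2.0
```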

Weighted Mean: An average calculated to take into account the importance of each value to the overall total, that is, an average in which each observation value is weighted by some index of its importance.
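
A minimal Python sketch of the weighted mean. The scores and weights are hypothetical; the weights stand for whatever index of importance applies (for example, credit hours or quantities sold).

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical scores and their importance weights.
scores = [80, 90, 70]
weights = [2, 3, 5]

wm = weighted_mean(scores, weights)
plain = sum(scores) / len(scores)
print(wm, plain)  # 78.0 80.0
```

The weighted mean is lower than the plain mean here because the lowest score carries the largest weight.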
