East Africa Techometer: Qualitative Research Fundamentals

Book Ref : Research Methods and Statistics for the Social Sciences: A Brief Introduction by Amber DeBono (author) (ISBN: 9781516537389)

Variable - anything that can be measured or changed in a research study is called a variable.

Independent Versus Dependent Variable

The independent variable is the factor that the researcher believes will affect an outcome. Variables that cannot be experimentally manipulated (for example, a participant's age) are not true independent variables.

The dependent variable is the variable that the researcher believes will be affected by the independent variable.

Categorical Versus Continuous Variables

A categorical variable is a variable in which participants belong to different groups or categories.

A continuous variable is a variable in which participants can fall anywhere on a spectrum of scores.

It is important for researchers to know whether their variables are categorical or continuous, because this affects what type of statistical analysis they will use to analyze their data.

Hypothesis

The null hypothesis always states that your independent variable will have no effect on the dependent variable. We test the null hypothesis with statistical analyses in order to find support for our research hypothesis. We call this null hypothesis testing.

Example

- Research Question : Is [Group1] more likely than [Group2].

- Research Hypothesis : If [condition], then [Group1] [consequence] more likely than [Group2 consequence].

- Null Hypothesis : If [condition], then [Group1] [consequence] will be equally likely to [Group2 consequence].

Directional Versus Non-Directional Research Hypothesis

Directional research hypotheses always predict that one group will be higher than the other on the dependent variable.

Nondirectional research hypotheses are written in a way that differences are predicted between the groups, but the researcher isn’t sure which group will be higher than the other.

Example.

- Directional Research Hypothesis: If [condition], then [Group1] [consequence] more likely than [Group2 consequence].

- Nondirectional Research Hypothesis: If [condition], then [Group1] [consequence] more or less likely than [Group2] [consequence].

Many common statistical tests assume the data are approximately "normal" (normally distributed). If there is a strong skew, the data may need to be transformed toward a normal distribution before such tests are used.

Probability and Null Hypothesis Testing: p < .05

The most important probability threshold is p = .05, which as a percentage is 5%. Typically, social scientists test the null hypothesis, and they do not want it to be true. The p-value from a statistical test is the probability of obtaining results at least as extreme as those in the dataset if the null hypothesis were true; it is not the probability that the null hypothesis itself is true. Researchers want this probability to be less than 5% (p < .05). That’s a pretty low probability. By demonstrating how unlikely their data would be under the null hypothesis, the researchers have evidence to support their research hypothesis. If p > .05, consider improving the study design, for example by increasing the sample size.

Descriptive Versus Inferential Statistics

Descriptive statistics describe the data, e.g. demographics. Inferential statistics infer the relationship between variables, e.g. height vs weight: a taller person may weigh more, though of course this may not be true in all cases.

Raw scores are the actual data before we transform or standardize them. When we count how often a raw score occurs, that count is called its frequency.
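Counting frequencies can be sketched in a few lines of Python. The scores below are made-up illustration data, not taken from the book; `Counter` comes from the standard library.

```python
from collections import Counter

# Hypothetical raw scores from six participants (illustration only).
raw_scores = [3, 5, 3, 4, 5, 3]

# Counter maps each distinct raw score to how often it occurs (its frequency).
frequencies = Counter(raw_scores)

for score, count in sorted(frequencies.items()):
    print(f"score {score}: frequency {count}")
```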

Statistical Tests

You are always testing the null hypothesis (which assumes no relation between variables). Your aim is to find p < 0.05, which lets you reject the null hypothesis.

--------------------------------------

Central tendency: Values that represent the central point within a group of scores

Mode: The most frequently occurring score or value in a set of scores or values

Median: The midpoint in the set of scores in which 50% of the scores are above this midpoint and 50% are below

Mean: The sum of the scores in a dataset divided by the number of scores (also known as the average)

Variance: A single number that describes how spread out the scores in a dataset are

Standard deviation: A single number derived from the variance that states how spread out the scores in a dataset are. The standard deviation is typically reported in manuscripts to describe the variance
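As a rough illustration of the three measures of central tendency defined above, here is a small Python sketch using the standard-library `statistics` module. The scores are invented for the example; one extreme score is included to show that the mean moves toward it while the median and mode do not.

```python
import statistics

# Hypothetical scores for nine participants; 21 is a deliberately extreme score.
scores = [2, 3, 3, 4, 5, 5, 5, 6, 21]

mode = statistics.mode(scores)      # most frequently occurring score
median = statistics.median(scores)  # midpoint: half the scores fall on each side
mean = statistics.mean(scores)      # sum of the scores divided by the number of scores

print("mode:", mode)
print("median:", median)
print("mean:", mean)
```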

Central Tendency (Mean, Median, Mode) vs Spread (Variance)

(N - 1) is called the degrees of freedom: the number of values in a distribution of scores that are free to vary.

Conceptually, the formula for variance is a simple fraction. That means that the bigger the numerator is, and the smaller the denominator is, the larger your variance. When we look at the formula, this means that the more your scores differ from the mean and the fewer participants you have, the larger your variance will be. Likewise, the closer your scores are to the mean and the more participants you have, the smaller your variance will be. Typically, we want our variance to be small. This is one reason we aim for a larger sample size.

The standard deviation tells us how much our scores vary as a whole. That is, smaller standard deviations tell us that the scores aren’t very different from each other, whereas larger standard deviations tell us that the scores are very different from each other.
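The notes do not reproduce the variance formula itself, so the sketch below assumes the usual sample formula: the sum of squared differences from the mean (numerator) divided by N - 1 (denominator), with the standard deviation as its square root. The scores are hypothetical.

```python
import math
import statistics

# Hypothetical raw scores (illustration only).
scores = [4, 6, 8, 10, 12]
n = len(scores)
mean = sum(scores) / n

# Numerator: how far the scores are from the mean, squared and summed.
sum_of_squares = sum((x - mean) ** 2 for x in scores)

# Denominator: degrees of freedom (N - 1), assuming the sample variance formula.
variance = sum_of_squares / (n - 1)

# The standard deviation is the square root of the variance.
standard_deviation = math.sqrt(variance)

print("variance:", variance)
print("standard deviation:", round(standard_deviation, 3))
```

The standard library's `statistics.variance` and `statistics.stdev` use the same N - 1 denominator, so they can be used to check a hand calculation.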

----------------------------------------

Z-Scores (Standardized Scores)

Raw scores are the actual scores that you get from your participants.

Standardized scores are very useful because they tell you how many standard deviations your raw score is from the mean. The sign of the z-score indicates whether the participant is below or above the mean, and the magnitude tells us how far from the mean the score is (a very large magnitude suggests an outlier). They are also important for calculating correlations.
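A minimal sketch of standardizing raw scores, assuming the z-score is computed as (raw score - mean) / standard deviation with the sample (N - 1) standard deviation. The helper name `z_scores` and the data are made up for illustration.

```python
import statistics

def z_scores(raw_scores):
    """Convert raw scores to z-scores: (score - mean) / standard deviation."""
    mean = statistics.mean(raw_scores)
    sd = statistics.stdev(raw_scores)  # sample standard deviation (N - 1)
    return [(score - mean) / sd for score in raw_scores]

# Hypothetical raw scores (illustration only).
raw = [4, 6, 8, 10, 12]
z = z_scores(raw)

for score, z_value in zip(raw, z):
    position = "above" if z_value > 0 else "below" if z_value < 0 else "at"
    print(f"raw {score}: z = {z_value:+.2f} ({position} the mean)")
```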

One-Tailed Versus Two-Tailed Test

Recall that a nondirectional research hypothesis will have no specific prediction for the relationship between the research variables, whereas a directional research hypothesis will predict a very specific relationship between the research variables.

When we have a directional research hypothesis, we should conduct a one-tailed test.

Ex: if our hypothesis is that loneliness increases depression, we want our statistic to be in the top 5% of the distribution, so we test the top tail.

But if our hypothesis is that loneliness affects depression (we don't know in which direction), we test the lowest 2.5% and the highest 2.5%.

Journals may still want researchers to conduct a two-tailed test even for a directional hypothesis, because they don't want to take your word about preconceived directions!

Z-Test: Your First Statistical Test

This test tells us whether a z-score is substantially different from the mean. A substantial difference is called statistical significance: a z-score this extreme would occur less than 5% of the time (p < 0.05) if the null hypothesis were true.

So a two-tailed z-test would be conducted when publishing in a journal, or if you aren't sure about the direction. :-) Two-tailed z-test: see if the z-score is above +1.96 or below -1.96 [critical values].

In a bell curve, top 2.5% means z-score of +1.96. bottom 2.5% means z-score of -1.96

This means that if you get a z-score below -1.96 or above +1.96, it is very unlikely that the Null Hypothesis is true. You win!!

<<Whenever a researcher calculates a z-score below -1.96 or above +1.96, the researcher can reject the null hypothesis and claim the score demonstrates evidence to support the research hypothesis.>>

One-Tailed Z-test:

For a one-tailed z-test (when you have a directional research hypothesis and you’re not going to publish your results), you would need to see if your z-score is above +1.64, if your research hypothesis predicts that your score is higher than the mean, or below -1.64, if your research hypothesis predicts that your score is below the mean. Statistical significance is a good thing.

When we have statistical significance, this means we have evidence to support our research hypothesis.
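The critical values quoted above (1.96 two-tailed, 1.64 one-tailed) can be recovered from the standard normal distribution. The sketch below uses `NormalDist` from Python's standard library (3.8+); the function names are made up for illustration, and the one-tailed case assumes the z-score already points in the predicted direction.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # bell curve with mean 0 and standard deviation 1

def critical_value(alpha=0.05, two_tailed=True):
    """z cutoff that leaves alpha in the tail(s) of the standard normal curve."""
    tail_area = alpha / 2 if two_tailed else alpha
    return STANDARD_NORMAL.inv_cdf(1 - tail_area)

def p_value(z, two_tailed=True):
    """Probability of a z-score at least this extreme if the null hypothesis is true.

    For the one-tailed case this assumes z is in the predicted direction.
    """
    upper_tail = 1 - STANDARD_NORMAL.cdf(abs(z))
    return 2 * upper_tail if two_tailed else upper_tail

def is_significant(z, alpha=0.05, two_tailed=True):
    """True when the null hypothesis can be rejected at the given alpha."""
    return p_value(z, two_tailed) < alpha

print(round(critical_value(two_tailed=True), 2))
print(round(critical_value(two_tailed=False), 2))
print(is_significant(1.80, two_tailed=True))
print(is_significant(1.80, two_tailed=False))
```

A z-score of 1.80 illustrates why the choice of test matters: it falls between the one-tailed and two-tailed cutoffs, so it is significant only under a one-tailed test.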

Correlational Research Design

This is a type of research study that examines how two or more variables are related to each other.

A positive correlation is when two variables are related in a way that as one variable increases, the other also increases. This must also logically mean that as scores on one variable decrease, the other one also decreases.

A negative correlation is when, as one variable increases, the other variable decreases.

Positive and negative, in terms of correlations, simply explain whether we expect the variables to correspond in the same (positive) or opposite (negative) direction.

This design is best suited to cases where you can't change a variable, or don't have the resources (time etc.) to do so, such as testing the correlation between students' self-esteem and exam scores. You can't change self-esteem in an experiment.

Correlational designs are also an excellent way to replicate the effects from another study. Indeed, if the same effect can be found using multiple research methodologies, this can be some very powerful evidence for your hypotheses!

CORRELATION IS NOT CAUSATION!

Calculating Correlations: Pearson’s r

A scatterplot is a graph that shows the relationship between two variables. As a researcher, you are hoping that your data plots (the dots in the scatterplot) cluster closely to a diagonal line and in the direction that is consistent with your research hypothesis.

If the data plots fall closely along a diagonal line, that line is called the regression line. It is the best-fitting line, the one that lies closest to the most points in the scatterplot.

The more closely the data points cluster around the regression line, the stronger the correlation; that is, the closer your correlation will be to -1.00 for negative correlations or +1.00 for positive correlations.

The most commonly used correlation is Pearson’s r (Pearson, 1920). Formula: r = Σ(z_X × z_Y) / (N − 1)

Basically, multiply the z-scores for variable X by the z-scores for variable Y. Then you add them up and divide that number by your total number of participants minus 1.
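The z-score recipe just described can be sketched directly in Python. The function name `pearson_r` and the self-esteem and exam data are hypothetical; the z-scores use the sample (N - 1) standard deviation, which is what makes dividing by N - 1 give a value between -1 and +1.

```python
import statistics

def pearson_r(x, y):
    """Pearson's r: sum of the products of paired z-scores, divided by N - 1."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be the same length, with at least 2 scores")
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sd_x, sd_y = statistics.stdev(x), statistics.stdev(y)  # sample SDs (N - 1)
    z_x = [(value - mean_x) / sd_x for value in x]
    z_y = [(value - mean_y) / sd_y for value in y]
    return sum(zx * zy for zx, zy in zip(z_x, z_y)) / (len(x) - 1)

# Hypothetical data: five students' self-esteem ratings and exam scores.
self_esteem = [1, 2, 3, 4, 5]
exam_score = [2, 4, 5, 4, 5]

r = pearson_r(self_esteem, exam_score)
print(round(r, 2))
```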

Summary

Z-scores: Also known as standardized scores, these scores tell us how far each score is from the mean (in terms of standard deviations)

Z-test: A statistical test to find out if a single score is significantly different from the mean

Correlational research: A type of research study that examines how two or more variables are related to each other but does not determine cause and effect

Scatterplot: A graph that includes plots for participants’ data on two variables

r (Pearson): The most frequently reported correlation. It is calculated by summing the products of the z-scores on two variables and dividing that sum by the number of participants minus 1 (N − 1).

Regression line: The best-fitting line in a scatterplot that is closest to the most data points

Positive correlation: A correlation in which scores on one variable increase as scores on the other variable increase

Negative correlation: A correlation in which scores on one variable increase as scores on the other variable decrease

Strong correlation: A type of correlation that is good at predicting how one person will score on one variable, knowing how they scored on another variable

Weak correlation: A type of correlation that is not very good at predicting how one person will score on one variable, knowing how they scored on another variable

------------------------------------

When researchers create a questionnaire or other type of assessment, they need to make sure that it is valid (accurate) and reliable (consistent).

There are three types of reliability that are important for researchers:

- internal consistency (Cronbach’s alpha),

- test-retest reliability, and

- inter-rater reliability.

Cronbach’s alpha, a measure of internal consistency, tells researchers how closely questionnaire items are correlated with one another.

Test-retest reliability tells researchers how much their assessment results in similar scores over time.

Inter-rater reliability tells researchers who use behavioral measures if they are consistently measuring a behavior based on people’s ratings.

To make sure that their assessment is valid, they need to find evidence for content and construct validity. Construct validity tells us if our scale is measuring what it is supposed to, whereas content validity tells us if our items are measuring what they are supposed to measure.

RELIABILITY = Consistent

Our methods (let's say our scales or established questionnaires that we administer to people to measure their "loneliness", for example) are deemed reliable when we have evidence that they are consistent. We want our questionnaire to consistently find the same results. We want the scores on a questionnaire to consistently measure the same characteristic.

There are three main ways that we measure reliability:

- internal consistency (Cronbach’s alpha),

- test-retest reliability, and

- inter-rater reliability.

Internal Consistency: Cronbach’s Alpha

We want the questionnaires that we use in the social sciences to be internally consistent. This means that all the questions seem to be measuring the same concept, the one that we claim to be measuring. If all the questions are measuring the same concept, then the responses to the questions should mostly be positively correlated with one another, rather than with questions designed to measure a different concept or behavior.

Instead of examining the multiple correlations between item responses, we look to a single number: Cronbach’s alpha. Cronbach’s alpha is like a mega-correlation; it tells us the extent to which responses to all questions correlate with each other.

Cronbach’s alphas for your methods ought to be above .80, although .70 can also be considered acceptable. Less than that, junk the questionnaire, find another.
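The notes do not give a formula for Cronbach's alpha. The sketch below assumes the standard textbook form, alpha = (k / (k - 1)) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the response data are hypothetical.

```python
import statistics

def cronbach_alpha(responses):
    """Cronbach's alpha for a table with one row per participant, one column per item.

    Assumes the standard formula: (k / (k - 1)) * (1 - sum(item variances) / total variance).
    """
    k = len(responses[0])                # number of questionnaire items
    items = list(zip(*responses))        # regroup the scores item by item
    item_variances = [statistics.variance(item) for item in items]
    total_scores = [sum(row) for row in responses]
    total_variance = statistics.variance(total_scores)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: 5 participants answering 3 items on a 1-5 scale.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 5],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

When the items rise and fall together across participants, as in this made-up table, alpha comes out well above the .80 guideline.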

Test-Retest Reliability

We also want to receive the same (or very similar) scores on the questionnaires every time we administer them to each of our participants. To check this, compute the correlation between participants' scores on two administrations; it should be greater than 0.7.

Inter-Rater Reliability

Inter-rater reliability is most often used when determinants (e.g. behaviors) are measured directly by raters, not with methods such as questionnaires. Again, the correlation between the "raters" or "observers" has to be high.

VALIDITY = measuring what we think we are measuring

The primary ways that we establish that our scales are valid are

- construct validity and

- content validity.

Construct Validity: Convergent and Discriminant Validity

Construct validity tells us if our scale is measuring the construct (e.g. the trait) that it’s supposed to. There are two main ways that researchers can establish construct validity:

- Convergent Validity and

- Discriminant Validity.

Convergent Validity: To demonstrate that your questionnaire has good convergent validity, you will need to administer your questionnaire and other questionnaires that measure very similar characteristics, then check that the scores correlate positively.

Researchers usually want their questionnaires to measure a unique characteristic or to measure that characteristic better than already existing questionnaires. Hence the correlation should not be too high (r > 0.9), which would suggest the new questionnaire is redundant with an existing one.

For discriminant validity, researchers want to ensure that the questionnaire they developed is not related to questionnaires that measure characteristics that are unrelated to the characteristic the researcher is hoping to measure. Ex: if you want to measure self-esteem, it should not correlate highly with, say, humour or death anxiety lol. You need to look for a near-zero correlation in this case!

Content Validity = Face Validity

Content validity tells us if the items in the questionnaire are assessing what they are supposed to (Haynes et al., 1995). One of the best ways to establish content validity is to establish face validity.

Face validity means that a lay person or expert in the field has reviewed the items in your questionnaire and agrees that your items seem like they would measure what you are trying to measure (Holden, 2010).

Summary

Reliability: A characteristic of a measure that demonstrates consistency

Internal consistency: A characteristic of a measure that demonstrates that all questions are measuring the same concept; this is typically reported as Cronbach’s alpha (α)

Cronbach’s alpha: A statistic that we use to measure the internal consistency of a questionnaire

Test-retest reliability: A characteristic of a measure that tells a researcher how consistent the scale is over time

Inter-rater reliability: A characteristic of a measure (usually behavioral) that demonstrates the consistency of raters

Validity: A characteristic of a measure that tells us how accurate our measure is, if the measure is measuring what it is designed to measure

Construct validity: A characteristic of a measure that tells us if our scale is measuring the construct it’s supposed to. This includes convergent and discriminant validity.

Convergent validity: The extent to which a measure correlates with similar measures

Discriminant validity: The extent to which a measure does not correlate with unrelated measures

Content validity: A characteristic of a measure that tells us if our items are measuring the content they are supposed to

Face validity: Tells the researcher if, simply reading each item, the items seem to measure the characteristic they are supposed to measure

------------------------------

Experimental Designs

So far we have discussed what makes a questionnaire good. Let's now discuss what makes an experiment good. The experimental method requires two basic components: an independent and a dependent variable.

The experimental method is considered the best scientific method because it can provide evidence for a cause-and-effect relationship between the independent and dependent variable. To make sure that you have a good experiment, the experimenter must thwart several threats to the experiment’s internal and external validity. Mook (1983) suggested that it is critical for researchers to replicate their results. By doing this, the experimenter has strong evidence that an experiment’s results are generalizable.

There are two main types of experiments: between-subjects and within-subjects.

A between-subjects design is an experiment where groups of participants receive different experiences of the independent variable. A within-subjects design is an experiment in which each participant serves as both the experimental and control condition. Often, this means a pre-/post-test design.

Internal validity describes how well an experiment demonstrates the cause-and-effect relationship between the independent and dependent variables.

Some of the most frequent threats to internal validity are :-

History : History refers to the events that occur between the measurements of the dependent variable in a within-subjects design. To protect against this threat, a good experimenter would make sure there is as little time as possible in between the pre- and post-test measures of the dependent variable.
Maturation : Maturation refers to changes in participants that occur over time during an experiment. To combat this threat, it is important for the experimenter to make their studies as short as possible.
Practice Effect : The practice effect is when the experimenter measures the dependent variable so often that the participants perform better on the dependent variable simply because of practice (and not the independent variable). To avoid this validity threat, researchers should keep measurements of the dependent variable to a minimum.
Reactive Measures : Reactive measures are measurements of the dependent variable that provoke the participants and result in imprecisely measuring the dependent variable. For example, questionnaires that measure participants’ sexual activity and drug use would be considered reactive measures. One way to combat this threat is to obtain a certificate of confidentiality.
Selection : Selection refers to choosing participants in a way so that our groups are not equal prior to the experiment. To fight this threat, researchers should always randomly assign participants to condition.
Mortality : Mortality refers to participant drop-out rates. To reduce drop-out, keep participants incentivized.
Demand Characteristics: Demand characteristics occur when the researcher leads participants to behave in a certain way in the experiment. To prevent them, experimenters should create scripts that are practiced and strictly followed during the experiment. Participants, in turn, may respond to questionnaires in a way that is misleading or false. This type of bias is called response bias. Response bias may be due to demand characteristics, or it may simply be due to the participants’ desire to present themselves favorably.

Important Steps to Protect Against These Threats

To ensure that your experiment is internally valid, it is critical for you to randomly assign your participants (in a between-group design). Also, be sure to keep the length of your study short and keep measurements of the dependent variable to a minimum. A good researcher will also practice running the experiment multiple times and follow a script so that all participants are treated similarly (except for the experience of the independent variable).
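Random assignment for a between-group design can be sketched as a shuffle followed by dealing participants out to conditions in turn. The function name, participant IDs, and condition labels below are hypothetical; the optional seed is only there so an assignment can be reproduced.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them out to conditions in turn.

    Group sizes end up equal, or differ by at most one participant.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, participant in enumerate(shuffled):
        condition = conditions[index % len(conditions)]
        groups[condition].append(participant)
    return groups

# Hypothetical participant IDs and condition labels.
participants = [f"P{i}" for i in range(1, 9)]
groups = randomly_assign(participants, ["experimental", "control"], seed=42)

for condition, members in groups.items():
    print(condition, members)
```

Because chance alone decides who lands in which group, pre-existing differences between participants should not line up systematically with the conditions, which is how random assignment guards against the selection threat.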

External Validity: Generalizing Your Findings

Researchers want to be able to generalize their findings from the sample they recruited to the general population, across time and place. In order to do that, researchers must keep three types of generalization in mind:

- population generalizability,

- environmental generalizability, and

- temporal generalizability.

Population Generalizability : The aim is to demonstrate that the results of studies conducted on only one population can generalize to people of other races, genders, ages, and socioeconomic statuses. Social scientists should, to the extent possible, recruit from a broad swath of the population to ensure good population generalizability.

Environmental Generalizability

The ability to find the same (or very similar) results from an experiment to a situation or environment that differs from that of the original experiment is called environmental generalizability.

Temporal Generalizability

To have good temporal generalizability, you need to conduct your experiment for years and find very similar results every year.

The Statistics of Assessing Generalizability

To determine generalizability, we will need to find a similar pattern of findings across studies. The best way to assess these patterns is with a meta-analysis.

Some of the most frequent threats to external validity are :-

Artificial Conditions - controlled labs. white rats. Convenience Sampling.

Importance of Replication.

Summary:

Experimental method: A research design that includes a manipulated independent variable (assigning participants to different experiences) and a measured dependent variable
Between-subjects design: An experiment in which groups of participants receive different experiences of the independent variable
Within-subjects design: An experiment in which the participants serve as both the experimental and control conditions
Internal validity: The extent to which an experimenter can demonstrate that the independent variable causes changes to the dependent variable
History: The events that occur between the measurements of the dependent variable in a within-subjects design
Maturation: The changes in participants that occur over time during an experiment
Practice effect: Participants perform better on the dependent variable due to multiple measurements of the dependent variable
Reactive measures: Measurements of the dependent variable that provoke the participants
Selection: Choosing participants in a way so that groups are not equal prior to the experiment
Mortality: Participants’ dropout rates that are particularly problematic if dropout rates differ between experimental conditions
Demand characteristics: The researcher leads participants to behave in a certain way in the experiment
Response bias: Participants in a research study respond in a way that presents themselves more favorably
External validity: The extent to which experimental results apply to different populations and situations
Population generalizability: The ability to apply the results of an experiment to a group of participants that is different and more encompassing than those used in the original experiment
Environmental generalizability: The ability to find the same (or very similar) results from an experiment to a situation or environment that differs from the original experiment
Temporal generalizability: The ability to find the same (or very similar) results from an experiment over time
Meta-analysis: A statistical analysis that examines the combined findings of multiple studies (published and unpublished)
Artificial conditions: A research environment, such as a laboratory, that does not look or feel like the participants’ natural environment
Convenience sampling: Recruiting participants who are convenient or easy to find and participate in research
Replication: Repeating an experiment with a new set of participants.

----------------------------

Having learned the experimental designs for data collection, we now learn how to analyze our data.

Recall that the two most common types of experimental designs are between and within subjects.

Between-subjects design refers to experiments in which people are assigned to different experimental groups and the researcher determines if there is a difference between these groups. Each group experiences a completely different aspect of the independent variable (in the book's example, social exclusion). This type of research design is often referred to as a classic experimental design because it is so frequently used.

Within-subjects design : In this type of research design, the subjects experience all or some of the aspects of the independent variable. Participants serve as both the control and experimental condition. A pre-/post-test design is one example: in the book's example, the dependent variable (aggression) was measured before and after the experimental intervention of making participants feel left out.

Pros and cons of using a between-subjects design:

- A between-subjects design tends to be shorter because the dependent variable is only measured once, whereas within-subjects designs are repeated-measures designs.

- Order effects (~ carryover effects) are irrelevant in a between-subjects design because the order of the independent variable and dependent variable is the same for all participants. For a within-subjects design, however, this would be a major concern.