Blog (est 17Feb2003). Archived a few times since. As a Current Affairist, I comment on varied topics of interest. I welcome comments and a good debate. You can also find me on twitter/@prasunchat. Care is taken that anything sourced from the internet is referenced by the URL link where it was found. All credits to where they are due.
Tuesday, September 24, 2024
Qualitative Research Fundamentals - Social Sciences
Book Ref : Research Methods and Statistics for the Social Sciences: A Brief Introduction by Amber DeBono (author) (ISBN: 9781516537389)
Variable - anything that can be measured or changed in a research study is called a variable.
Independent Versus Dependent Variable
The independent variable is the factor that the researcher believes will affect an outcome. Variables that cannot be experimentally manipulated (constants) are not true independent variables.
The dependent variable is the variable that the researcher believes will be affected by the independent variable.
Categorical Versus Continuous Variables
A categorical variable is a variable in which participants belong to different groups or categories.
A continuous variable is a variable in which participants can fall anywhere on a spectrum of scores.
It is important to know whether variables are categorical or continuous, because this affects which type of statistical analysis is used to analyze the data.
Hypothesis
The null hypothesis always states that your independent variable will have no effect on the dependent variable. We test the null hypothesis with statistical analyses in order to find support for our research hypothesis. We call this null hypothesis testing.
Example
- Research Question : Is [Group1] more likely than [Group2] to [consequence]?
- Research Hypothesis : If [condition], then [Group1] [consequence] more likely than [Group2] [consequence].
- Null Hypothesis : If [condition], then [Group1] [consequence] just as likely as [Group2] [consequence].
Directional Versus Non-Directional Research Hypothesis
Directional research hypotheses always predict that one group will be higher than the other on the dependent variable.
Nondirectional research hypotheses are written in a way that differences are predicted between the groups, but the researcher isn’t sure which group will be higher than the other.
Example.
- Directional Research Hypothesis: If [condition], then [Group1] [consequence] more likely than [Group2] [consequence].
- Nondirectional Research Hypothesis: If [condition], then [Group1] [consequence] more or less likely than [Group2] [consequence].
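Many common statistical tests assume that the data are approximately "normal" (bell-shaped). If the distribution is strongly skewed, the data may need to be transformed to bring them closer to a normal distribution before such tests are used.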
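To make the idea of "transforming a skew" concrete, here is a minimal sketch. It assumes a log transform, which is only one common option for right-skewed data and is not something the book reference above prescribes. The scores and the `skewness` helper are made up for illustration.

```python
import math
import statistics

def skewness(values):
    """Population skewness: the mean of the cubed z-scores (0 for symmetric data)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Hypothetical right-skewed raw scores (e.g. reaction times in ms).
raw = [210, 220, 230, 240, 250, 260, 300, 400, 800, 1600]

# Assumption: a natural-log transform; it only works for positive scores.
logged = [math.log(v) for v in raw]

print(f"skew before transform: {skewness(raw):.2f}")
print(f"skew after log transform: {skewness(logged):.2f}")
```

In this toy dataset the skew shrinks after the transform but does not vanish, so a transform is a way to get closer to normal, not a guarantee.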
Probability and Null Hypothesis Testing: p < .05
The most important probability threshold is p = .05. Converted into a percentage, this is a probability of 5%. Typically, social scientists test the null hypothesis, and they do not want it to be true. When they run their statistical tests, they want the probability of getting results like theirs, if the null hypothesis were true, to be less than 5% (p < .05). That’s a pretty low probability. By demonstrating how unlikely the data would be under the null hypothesis, the researchers have evidence to support their research hypothesis. If p > .05, consider improving the study design, for example by increasing the sample size.
Descriptive Versus Inferential Statistics
Descriptive statistics describe the data, e.g. demographics. Inferential statistics infer the relationship between variables, e.g. height vs. weight: a taller person may weigh more, though of course this may not be true in all cases.
Raw scores are the actual data before we transform or standardize them. When we count the occurrences of a raw score, the count is called its frequency.
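A quick sketch of counting frequencies from raw scores. The scores are hypothetical, and `Counter` from Python's standard library is just one convenient way to do the counting.

```python
from collections import Counter

# Hypothetical raw scores from ten participants (e.g. answers on a 1-5 scale).
raw_scores = [3, 5, 3, 4, 5, 5, 2, 3, 5, 4]

# Frequency: how many times each raw score occurs.
frequency = Counter(raw_scores)

for score in sorted(frequency):
    print(f"score {score}: frequency {frequency[score]}")
```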
Statistical Tests
You are always testing the null hypothesis (which assumes no relation between variables). Your aim is to show that p < 0.05.
--------------------------------------
Central Tendency (Mean, Median, Mode) vs Spread (Variance)
Conceptually, the formula for variance is a simple fraction. That means that the bigger the numerator is, and the smaller the denominator is, the larger your variance. When we look at the formula, this means that the more your scores are different from the mean and the fewer participants you have, the larger your variance will be. Likewise, the closer your scores are to the mean and the more participants you have, the smaller your variance will be. Typically, we want our variance to be small. This is the reason we aim for a higher sample size.
The standard deviation tells us how much our scores vary as a whole. That is, smaller standard deviations tell us that the scores aren’t very different from each other, whereas larger standard deviations tell us that the scores are very different from each other.
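Here is a small sketch of central tendency and spread using Python's built-in `statistics` module. The scores are hypothetical. I am assuming the sample formulas (dividing by n - 1); the book may present the version that divides by n.

```python
import statistics

# Hypothetical scores from eight participants.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Central tendency
mean = statistics.mean(scores)      # 5
median = statistics.median(scores)  # 4.5
mode = statistics.mode(scores)      # 4

# Spread: the variance is a fraction.
# Numerator: sum of squared differences from the mean.
# Denominator (assumed here): number of participants minus one.
numerator = sum((x - mean) ** 2 for x in scores)
denominator = len(scores) - 1
variance_by_hand = numerator / denominator

variance = statistics.variance(scores)  # same sample formula
sd = statistics.stdev(scores)           # square root of the variance

print(mean, median, mode)
print(round(variance_by_hand, 3), round(variance, 3), round(sd, 3))
```

Computing the fraction by hand next to the library call makes it easy to see which part of the formula grows when scores move away from the mean.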
----------------------------------------
Z-Scores (Standardized Scores)
Raw scores are the actual scores that you get from your participants.
Standardized scores are very useful because they tell you how many standard deviations your raw score is from the mean. The sign of the z-score indicates whether the participant is below or above the mean, and the magnitude tells us how far from the mean the score is (a very large magnitude suggests an outlier). They are also important for calculating correlations.
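A minimal sketch of turning raw scores into z-scores: subtract the mean, then divide by the standard deviation. The scores and the `z_scores` helper are hypothetical, and I am assuming the sample standard deviation (n - 1).

```python
import statistics

def z_scores(raw_scores):
    """Return (raw score - mean) / standard deviation for every raw score."""
    mean = statistics.fmean(raw_scores)
    sd = statistics.stdev(raw_scores)  # assumption: sample SD (n - 1)
    return [(x - mean) / sd for x in raw_scores]

# Hypothetical raw scores.
raw = [2, 4, 4, 4, 5, 5, 7, 9]
standardized = z_scores(raw)

for x, z in zip(raw, standardized):
    side = "above" if z > 0 else "below" if z < 0 else "at"
    print(f"raw {x}: z = {z:+.2f} ({side} the mean)")
```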
Recall that a nondirectional research hypothesis will have no specific prediction for the relationship between the research variables, whereas a directional research hypothesis will predict a very specific relationship between the research variables.
When we have a directional research hypothesis, we should conduct a one-tailed test.
Ex. if our hypothesis is that loneliness increases depression, we want our statistic to be in the top 5% of the distribution, so we test the top tail.
But if our hypothesis is that loneliness affects depression (we don't know in which direction), we test the lowest 2.5% and the highest 2.5%.
Journals may still want researchers to conduct a 2-tailed test even for a directional hypothesis, because they don't want to take your word about pre-conceived directions!
Z-Test: Your First Statistical Test
This test tells us whether a z-score is substantially different from the mean, which is called statistical significance. It means that a z-score this extreme would occur less than 5% of the time (p < 0.05) if the null hypothesis were true.
So, a 2-tailed z-test would be conducted when publishing in a journal, or if you aren't sure about the direction. :-) Two-tailed z-test: see if the z-score is above +1.96 or below -1.96 [critical values].
In a bell curve, the top 2.5% starts at a z-score of +1.96, and the bottom 2.5% ends at a z-score of -1.96.
This means that if you get a z-score below -1.96 or above +1.96, such a result would be very unlikely if the Null Hypothesis were true. You win!!
<<Whenever a researcher calculates a z-score below -1.96 or above +1.96, the researcher can reject the null hypothesis and claim the score demonstrates evidence to support the research hypothesis.>>
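The two-tailed decision rule can be sketched in a few lines with `NormalDist` from Python's standard library (3.8+). The function names are my own, and the z-scores in the loop are made-up examples.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Chance of a z-score at least this far from 0, in either tail,
    assuming the null hypothesis is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def reject_null_two_tailed(z, alpha=0.05):
    """True when the two-tailed p-value falls below alpha (p < .05 by default)."""
    return two_tailed_p(z) < alpha

# Hypothetical z-scores to try out.
for z in (-2.10, 1.50, 2.00):
    verdict = "reject null" if reject_null_two_tailed(z) else "keep null"
    print(f"z = {z:+.2f}, p = {two_tailed_p(z):.3f} -> {verdict}")
```

Checking p < .05 here gives the same answer as checking whether the z-score lies beyond -1.96 or +1.96.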
One-Tailed Z-test:
For a one-tailed z-test (when you have a directional research hypothesis and you’re not going to publish your results), you would need to see if your z-score is above +1.64, if your research hypothesis predicts that your score is higher than the mean, or below -1.64, if your research hypothesis predicts that your score is below the mean. Statistical significance is a good thing.
When we have statistical significance, this means we have evidence to support our research hypothesis.
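Where do 1.96 and 1.64 come from? A short sketch that recovers the critical values from the standard normal curve, again with `NormalDist`. The variable names and the example z-score are my own. Note that the one-tailed value is 1.645 to three decimals, so 1.64 is the rounded figure.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1
alpha = 0.05

# Two-tailed: split the 5% across both tails (2.5% each).
two_tailed_cutoff = standard_normal.inv_cdf(1 - alpha / 2)  # about 1.96

# One-tailed: put the whole 5% in a single tail.
one_tailed_cutoff = standard_normal.inv_cdf(1 - alpha)  # about 1.645

print(f"two-tailed critical values: +/-{two_tailed_cutoff:.3f}")
print(f"one-tailed critical value:  {one_tailed_cutoff:.3f}")

# Hypothetical z-score that passes the one-tailed test but not the two-tailed one.
z = 1.80
print(z > one_tailed_cutoff, abs(z) > two_tailed_cutoff)
```

This also shows why a one-tailed test is easier to pass, and why journals may insist on two tails.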
Correlational Research Design
This is a type of research study that examines how two or more variables are related to each other.
A positive correlation is when two variables are related in a way that as one variable increases, the other also increases. This must also logically mean that as scores on one variable decrease, the other one also decreases.
A negative correlation is when, as one variable increases, the other variable decreases.
Positive and negative, in terms of correlations, simply explain whether we expect the variables to correspond in the same (positive) or opposite (negative) direction.
This design is best suited to cases where you can't change a variable, or don't have the resources (time etc.) to do so, such as testing the correlation between self-esteem and exam scores of students. You can't change self-esteem in an experiment.
Correlational designs are also an excellent way to replicate the effects from another study. Indeed, if the same effect can be found using multiple research methodologies, this can be some very powerful evidence for your hypotheses!
CORRELATION IS NOT CAUSATION!
Calculating Correlations: Pearson’s r
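A minimal sketch of Pearson's r built from z-scores: standardize both variables, multiply the paired z-scores, add them up, and divide by n - 1. This is the standard sample formula, but I am assuming it matches the version in the book. The data and variable names are made up, borrowing the self-esteem and exam score example.

```python
import statistics

def pearson_r(x, y):
    """Pearson's r as the sum of paired z-score products divided by n - 1."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length, with at least 2 pairs")
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sd_x, sd_y = statistics.stdev(x), statistics.stdev(y)  # sample SDs (n - 1)
    z_products = [
        ((a - mean_x) / sd_x) * ((b - mean_y) / sd_y)
        for a, b in zip(x, y)
    ]
    return sum(z_products) / (len(x) - 1)

# Hypothetical scores for five students.
self_esteem = [1, 2, 3, 4, 5]
exam_scores = [2, 4, 5, 4, 5]

r = pearson_r(self_esteem, exam_scores)
print(f"r = {r:.3f}")  # positive: the two variables tend to rise together
```

r always lands between -1 and +1. The sign gives the direction of the relationship and the size gives its strength.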
------------------------------------
The experimental method is considered the best scientific method because it can provide evidence for a cause-and-effect relationship between the independent and dependent variable. To make sure that you have a good experiment, the experimenter must thwart several threats to the experiment’s internal and external validity. Mook (1983) suggested that it is critical for researchers to replicate their results. By doing this, the experimenter has strong evidence that an experiment’s results are generalizable.
- History : History refers to the events that occur between the measurements of the dependent variable in a within-subjects design. To protect against this threat, a good experimenter would make sure there is as little time as possible in between the pre- and post-test measures of the dependent variable.
- Maturation : Maturation refers to changes in participants that occur over time during an experiment. To combat this threat, it is important for the experimenter to make their studies as short as possible.
- Practice Effect : The practice effect is when the experimenter measures the dependent variable so often that the participants perform better on the dependent variable simply because of practice (and not the independent variable). To avoid this validity threat, researchers should keep measurements of the dependent variable to a minimum.
- Reactive Measures : Reactive measures are measurements of the dependent variable that provoke the participants and result in imprecisely measuring the dependent variable. For example, questionnaires that measure participants’ sexual activity and drug use would be considered reactive measures. One way to combat this threat is to obtain a certificate of confidentiality.
- Selection : Selection refers to choosing participants in a way so that our groups are not equal prior to the experiment. To fight this threat, researchers should always randomly assign participants to conditions.
- Mortality : Mortality refers to participants’ dropout rates, which are particularly problematic if they differ between experimental conditions. To reduce dropout, keep participants incentivized.
- Demand Characteristics : Demand characteristics occur when the researcher leads participants to behave in a certain way in the experiment. To prevent them, experimenters should create scripts that are practiced and strictly followed during the experiment. Participants, in turn, may respond to questionnaires in a way that is misleading or false. This type of bias is called response bias. Response bias may be due to demand characteristics, but this bias may simply be due to the participants’ desire to present themselves favorably.
To ensure that your experiment is internally valid, it is critical for you to randomly assign your participants (in a between-group design). Also, be sure to keep the length of your study short and keep measurements of the dependent variable to a minimum. A good researcher will also practice running the experiment multiple times and follow a script so that all participants are treated similarly (except for the experience of the independent variable).
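Random assignment in a between-group design can be sketched like this: shuffle the participants, then deal them out to the conditions in turn. The function name, the condition labels, and the participant IDs are all hypothetical.

```python
import random

def randomly_assign(participants, conditions=("experimental", "control"), seed=None):
    """Shuffle participants, then deal them out to the conditions in turn."""
    rng = random.Random(seed)  # a seed only makes the example repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, participant in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(participant)
    return groups

# Hypothetical participant IDs 1 to 20.
groups = randomly_assign(range(1, 21), seed=42)
for condition, members in groups.items():
    print(condition, sorted(members))
```

Dealing out after the shuffle keeps the group sizes equal (or within one of each other), while chance alone decides who lands in which condition.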
- Experimental method: A research design that includes a manipulated independent variable (assigning participants to different experiences) and a measured dependent variable
- Between-subjects design: An experiment in which groups of participants receive different experiences of the independent variable
- Within-subjects design: An experiment in which the participants serve as both the experimental and control conditions
- Internal validity: The extent to which an experimenter can demonstrate that the independent variable causes changes to the dependent variable
- History: The events that occur between the measurements of the dependent variable in a within-subjects design
- Maturation: The changes in participants that occur over time during an experiment
- Practice effect: Participants perform better on the dependent variable due to multiple measurements of the dependent variable
- Reactive measures: Measurements of the dependent variable that provoke the participants
- Selection: Choosing participants in a way so that groups are not equal prior to the experiment
- Mortality: Participants’ dropout rates that are particularly problematic if dropout rates differ between experimental conditions
- Demand characteristics: The researcher leads participants to behave in a certain way in the experiment
- Response bias: Participants in a research study respond in a way that presents themselves more favorably
- External validity: The extent to which experimental results apply to different populations and situations
- Population generalizability: The ability to apply the results of an experiment to a group of participants that is different and more encompassing than those used in the original experiment
- Environmental generalizability: The ability to find the same (or very similar) results from an experiment to a situation or environment that differs from the original experiment
- Temporal generalizability: The ability to find the same (or very similar) results from an experiment over time
- Meta-analysis: A statistical analysis that examines the combined findings of multiple studies (published and unpublished)
- Artificial conditions: A research environment, such as a laboratory, that does not look or feel like the participants’ natural environment
- Convenience sampling: Recruiting participants who are convenient or easy to find and who are available to participate in research
- Replication: Repeating an experiment with a new set of participants.