  2. Cox's proportional hazards model
     - The most common approach to modelling survival or time-to-event data.
     - It's analogous to a multiple regression model, and tests the difference between the survival times of particular groups while allowing for other factors.
     - In this model, the dependent variable is the 'hazard': the risk of dying (or experiencing the event in question) at a given moment, given that the patient has survived up to that point in time. Strictly, it is an instantaneous rate rather than a probability.
     - No assumption is made about the probability distribution of the hazard (the baseline hazard is left unspecified).
     - However, it is assumed that if the risk of dying at a particular point in time in one group is, say, twice that in the other group, then at any other time it will still be twice that in the other group. In other words, the hazard ratio does not depend on time: the hazard of failure in one group is a constant multiple (over time) of the hazard of failure in the other group.

     Log-rank test
     - Does not assume proportional hazards per se.
     - Used to compare two survival curves, and tests whether there's a difference between the survival times of different groups. However, it does not allow other explanatory variables to be taken into account.
     - It's used to test the null hypothesis that "there is no difference between the population survival curves" (i.e. the probability of an event occurring at any time point is the same for each population).
     - It is most powerful when the alternative hypothesis is one in which the hazards are proportional.
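To make the log-rank idea concrete, here is a minimal sketch in plain Python (not taken from the comment above). At every event time it compares the deaths observed in group A with the number expected if both groups shared the same survival curve. The function name `logrank_test` and the follow-up times are hypothetical, chosen only for illustration; for real analyses a tested statistics package is the safer choice.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. events are 1 (event seen) or 0 (censored).

    Returns (chi2, p_value), with chi2 on 1 degree of freedom.
    """
    # Pool the data, tagging each subject with its group (0 = A, 1 = B).
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    observed_a = expected_a = variance = 0.0
    # Only time points where at least one event occurred contribute.
    for t in sorted({ti for ti, e, _ in data if e}):
        at_risk_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        at_risk_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        deaths_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        deaths_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n = at_risk_a + at_risk_b
        d = deaths_a + deaths_b

        observed_a += deaths_a
        # Under the null, deaths are shared out in proportion to numbers at risk.
        expected_a += d * at_risk_a / n
        if n > 1:
            variance += d * (at_risk_a / n) * (at_risk_b / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("not enough information to compare the groups")
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical follow-up times (months); every subject had the event.
chi2, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Note that the sketch compares only the two curves; adjusting for other explanatory variables is what the Cox model adds.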
  3. My shortcut to remember the subtypes of criterion/construct validity:
     CONstruct validity: CONvergent & divergent
     Criterion validity: Concurrent & predictive
  4. I remember them this way and I think it's easier to understand.
     Sensitivity (e.g. of a screening test): Among those with the disease, how many will be correctly screened as positive?
     Specificity: Among those without the disease, how many will be correctly screened as negative?
     Positive Predictive Value: Among those who were screened positive, how many actually have the disease?
     Negative Predictive Value: Among those who were screened negative, how many actually don't have the disease?
     Hope this helps
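The four questions above map directly onto the cells of a 2x2 screening table. A minimal sketch, assuming hypothetical counts (the function name `screening_metrics` and the numbers are made up for illustration):

```python
def screening_metrics(tp, fp, fn, tn):
    """Return the four measures from the counts of a 2x2 screening table.

    tp: diseased and screened positive    fp: healthy but screened positive
    fn: diseased but screened negative    tn: healthy and screened negative
    """
    return {
        # Among those with the disease, the fraction screened positive.
        "sensitivity": tp / (tp + fn),
        # Among those without the disease, the fraction screened negative.
        "specificity": tn / (tn + fp),
        # Among those screened positive, the fraction who have the disease.
        "ppv": tp / (tp + fp),
        # Among those screened negative, the fraction who do not have it.
        "npv": tn / (tn + fn),
    }

# Hypothetical screening results for 1000 people, 100 of whom have the disease.
metrics = screening_metrics(tp=90, fp=30, fn=10, tn=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity divide by the people who truly have or don't have the disease, while PPV and NPV divide by the people who screened positive or negative.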
  5. Performance bias
     - Happens when one group of subjects in an experiment (e.g. the control group or the treatment group) gets more attention from investigators than another group.
     - Can also refer to participants changing their responses because they know which group they have been allocated to (a form of the Hawthorne effect).

     Observer bias (also called experimenter bias or research bias)
     - The tendency to see what we expect to see, or what we want to see. A researcher usually comes to an experiment with prior knowledge and subjective feelings about the group being studied.
  7. Hi hi, I know I'm 6 years late, but I just want to try answering the question.
     Chi-square is really just a special case of logistic regression, and this is analogous to the relationship between ANOVA and regression.

     Chi-square contingency analysis:
     - Independent variable is dichotomous (in the simplest, 2x2 case)
     - Dependent variable is dichotomous
     - Purpose: Used to determine whether there's a significant difference between the expected and observed frequencies in one or more categories.

     Logistic regression is a more general analysis, because:
     - The independent variable is not necessarily dichotomous, and there can be more than one independent variable.
     - Dependent variable (outcome) is dichotomous
     - Purpose: Predicts the probability of a dichotomous dependent variable (outcome) from one or more independent variables and a constant.
     - E.g. How does the probability of getting lung cancer (yes vs. no) change for every additional pound a person is overweight and for every pack of cigarettes smoked per day? Here, the outcome is dichotomous (gets lung cancer vs. does not get lung cancer), and there is more than one independent variable (weight and packs of cigarettes smoked per day).
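To see how the two analyses relate on the simplest data, here is a minimal sketch in plain Python with a hypothetical 2x2 table (smoker vs. non-smoker against lung cancer yes vs. no; the counts and function names are made up for illustration). It computes the chi-square statistic from observed vs. expected frequencies, plus the log odds ratio, which is the slope a logistic regression with this single yes/no predictor would estimate.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    table = [[a, b], [c, d]]; rows are exposure groups, columns are outcomes.
    Returns (chi2, p_value) on 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Frequency expected if exposure and outcome were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

def log_odds_ratio(table):
    """Log odds ratio of the table: the logistic slope for one binary predictor."""
    (a, b), (c, d) = table
    return math.log((a * d) / (b * c))

# Hypothetical counts: rows = smokers, non-smokers; columns = cancer, no cancer.
table = [[30, 70], [10, 90]]
chi2, p = chi_square_2x2(table)
print(f"chi2 = {chi2:.2f}, p = {p:.5f}")
print(f"log odds ratio = {log_odds_ratio(table):.3f}")
```

Once a continuous predictor such as weight or packs per day enters the picture, there is no contingency table to build, which is where logistic regression becomes the more general tool.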
  8. General rule:
     Positive Predictive Value (PPV) increases with increasing prevalence.
     Negative Predictive Value (NPV) decreases with increasing prevalence.

     In this case, the urban population had the higher prevalence. Thus, urban should have higher PPV and lower NPV. Rural should have lower PPV (option A is correct) and higher NPV.

     Explanation:
     PPV = Out of those who tested positive, how many actually had the disease? The positives are a mix of true positives and false positives, and the more common the disease, the larger the share that are true positives. Thus, prevalence affects the PPV calculation.
     Sensitivity = Among those with the disease, how many will test positive? So it doesn't actually matter how many people have the disease (prevalence is not important). The important point is how many of those with the disease test positive.
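The general rule can be checked with a few lines of arithmetic. A minimal sketch, assuming the same test (hypothetical sensitivity and specificity of 90%) is used in two populations with made-up prevalences; none of these numbers come from the original question.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (ppv, npv) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Hypothetical prevalences: urban 20%, rural 5%; same test in both settings.
for setting, prevalence in [("urban", 0.20), ("rural", 0.05)]:
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"{setting}: PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Sensitivity and specificity are inputs here and stay fixed; only PPV and NPV move when the prevalence changes.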
