Sunday, February 7, 2016

Counterfactual reasoning: A logical fallacy?

You must have heard the counterfactual argument a thousand times: if the program had not occurred, the effect would not have occurred either.  Let us dissect the logical form of the argument.

In symbolic logic, the counterfactual argument has this structure:
If Program X is implemented, then Y happens (X ⊃ Y)
Program X is not implemented (~X)
Therefore, Y does not happen. (∴ ~Y)

This reasoning is fallacious.  It even has a name: denying the antecedent.  Michael Scriven calls this overdetermination in his debate with Tom Cook.  It is a logical fallacy because even if X is false (~X), Y can still be true.  As can be seen in the truth table below, a false antecedent (~X) always makes the implication (X ⊃ Y) true regardless of the value of the consequent (Y), so nothing about Y follows from ~X.

Truth Table

X   Y   X ⊃ Y
T   T   T
T   F   F
F   T   T
F   F   T

Why is this so? Because in the real world, it is not only Program X that can produce Y.  Other programs or causes can produce a similar result.  Suppose a donor claimed that, because of its financial assistance to farmers, the household income of program beneficiaries increased.  This reasoning fails to consider that household income could have increased even without the financial assistance because of other contributing factors such as an improving economy, better weather conditions, other donors, and so on.
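If you want to see this in a few lines of code, here is a minimal sketch in R (the language I use in another post below) that simply enumerates the truth table above; the object names are my own:

X <- c(TRUE, TRUE, FALSE, FALSE)   # the four possible combinations of X and Y
Y <- c(TRUE, FALSE, TRUE, FALSE)
implication <- !X | Y              # material implication: X ⊃ Y is false only when X is true and Y is false
data.frame(X, Y, implication)
# In both rows where X is FALSE, the implication is TRUE whether Y is TRUE or FALSE,
# so from (X ⊃ Y) and ~X we cannot conclude ~Y.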

Saturday, January 30, 2016

Why bother searching for unintended effects in social programs?

I had a funny encounter with a researcher from a prestigious institution that advocates the use of the experimental approach in evaluating social programs.  Using their argument that social programs are like medicines whose impacts have to be evaluated using the best possible research design, I asked about the place of unintended effects within the framework of experimentation.   Yup, it is not only drugs that have intended and unintended effects but also social programs.   For example, research studies on pay-for-performance policies have consistently found that these policies result in perverse, unexpected behaviors such as narrowly focusing on what is being measured and gaming the performance metrics.

The answer given was to specify that unintended outcome beforehand and measure it before and after the experiment.  This answer fits Thomas Kuhn's description of 'normal science' perfectly.  Within a prevailing paradigm, the typical response to something outside that paradigm is to fit it within its boundaries; to determine the program's unintended effects, you anticipate them as intended effects.  But you don't find unintended effects by making conjectures beforehand; that is what intended effects are for.  You find them by discovery and keen perception.

Benefits of learning unintended effects
It is important that every monitoring and evaluation plan include a strategy for capturing the program's unintended effects, positive or negative. In medicine, many of the drugs that we use were discovered because of their unintended effects.  For example, did you know that the original objective of Viagra was to treat heart problems?  And skin whitening is actually a side effect of glutathione.  How about the interesting story of Thalidomide, which was intended to be an antibiotic? This drug was shelved due to its harmful side effects on pregnant women, only to be revived decades later because of its anti-cancer properties.

In program evaluation, it is possible that even though a program was successful in achieving its goals and targets, it also triggered undesirable behaviors that offset what otherwise would have been exemplary program performance.  Conversely, there might be unintended effects that are desirable and worthwhile to replicate through the introduction of a similar social program.  Also, learning more about a program's unintended effects allows one to understand the causal mechanisms working within the context.  These causal mechanisms sometimes work for or against the program goals.  I remember an example from an African classmate: a day care center constructed on top of a burial site regarded as sacred by community members.   Because of the cultural belief about the dead (the causal mechanism), nobody wanted to go inside the building and the day care didn't produce the intended result. Rather, its construction ignited animosity between the local community and the central government.

In search for a method
How do you evaluate unintended effects? The leading evaluation theorist Michael Scriven proposes the use of Goal-Free Evaluation.  As to what that is, it is interesting material for another blog post.

Friday, January 22, 2016

Counterfactual, clone, match and other similar terms

A couple of weeks ago, I was invited by the ASEAN Training Center on Preventive Drug Education to train them on Impact Evaluation, as I was informed that the trend in this area is the use of the randomized controlled trial (RCT) as the gold standard for evaluation design.  I went beyond that, as I found myself also talking about various quasi-experimental methods, since these suffice in the event that an RCT is not a viable or feasible design.

I started my talk with the concept of the counterfactual, as this is an indispensable element of impact estimation.  Because the audience was relatively new to monitoring and evaluation, I used layman's terms so that the notion of the counterfactual would be understood.  Among the words close to the sense of counterfactual are the following:

Control Group
Non-beneficiary
What if situation
Status Quo
Clone
Twin
Match
Equivalent
Nearest Neighbor
Comparison Group
Similar
Identical
Look-alike

Can you think of other words for counterfactual?

Friday, January 15, 2016

Stata package for Interrupted Time Series Impact Evaluation Design

I heard about this design in my 25-unit course on impact evaluation and immediately got interested in it. The idea is simple.  Basically, in the absence of an intervention (which is the interruption in the time series data set), the outcome variable would just follow its existing trend.  Sort of like history repeating itself.  With an intervention, however, there would be a marked interruption in that trend.

How do you implement an interrupted time series design?  I recently found that Stata has a user-developed command package called itsa (Interrupted Time Series Analysis).

According to the package description, "itsa estimates the effect of an intervention when the outcome variable is ordered as a time series, and a number of observations are available in both pre- and post-intervention periods. The study design is generally referred to as an interrupted time series because the intervention is expected to interrupt the level and/or trend subsequent to its introduction (Campbell & Stanley, 1966; Glass, Willson & Gottman, 1975; Shadish, Cook & Campbell, 2002)."

Let us try this lovely Stata command.  Type the following in your Command window:

ssc install itsa // installs the user-written itsa package from SSC, if you haven't already
sysuse cigsales_single, clear // this opens the example data set
tsset year // in Stata, it is important to declare the data set as a time series
itsa cigsale, single trperiod(1989) lag(1) fig posttrend // this implements the ITSA: single specifies a single-group analysis, trperiod(1989) marks the period when the intervention was introduced, fig creates the graph, and posttrend computes the post-intervention linear trend
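For readers who prefer R (which I use in another post below), here is a minimal sketch of the same idea on simulated data.  This is not the itsa command itself but the standard segmented-regression form of the design, with a level-change and a trend-change term; the variable names, numbers, and break year are my own invention:

set.seed(123)
year <- 1970:2000
post <- as.numeric(year >= 1989)        # 1 from the intervention year onward
time_since <- pmax(0, year - 1989)      # years elapsed since the intervention
outcome <- 100 - 1.5 * (year - 1970) - 8 * post - 2 * time_since + rnorm(length(year), 0, 3)
its_model <- lm(outcome ~ I(year - 1970) + post + time_since)
summary(its_model)
# The coefficient on 'post' estimates the immediate change in level at the interruption,
# and the coefficient on 'time_since' estimates the change in slope after it.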




Saturday, January 2, 2016

Text Mining in R for Newbies (like me!)

Lately, I got attracted to learning R (an open-source statistical program available at https://www.r-project.org/) because of text mining.  Stata also has a textual analysis package called txttool, but I haven't studied it yet.

So over the holidays, I undertook a project to analyze textual documents.  Data mining of textual documents is on the rise because it provides insights for your research if you want to surface dominant themes.  I did some content analysis before using NVivo at the University of Melbourne, and even with NVivo I found it quite a challenge to analyze qualitative data.  So, let's begin with our small project.

Step 1: Install R (go to the link above)

Step 2: Install RStudio (a user interface for R that is very helpful for newbies)
This is how RStudio looks.



Step 3: Install packages
In the top-left panel, type the following:

install.packages('tm', dependencies=TRUE)
install.packages('wordcloud', dependencies=TRUE)

The tm (text mining) and wordcloud packages allow you to mine the text in any document and then turn the most frequent terms into a word cloud.

Step 4: Save your documents as text files using Notepad.  I suggest that you create a folder for the purpose of this project.  In my case, I saved the text files in C:/Users/grace/Desktop/txtmining
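If you don't have text documents at hand, here is a small optional sketch for creating the folder and a couple of sample files from within R; the path is the one from my machine and the file names and sentences are made-up examples, so adjust them to your own setup:

dir.create("C:/Users/grace/Desktop/txtmining", showWarnings = FALSE)
## This creates the project folder if it does not already exist

writeLines("Monitoring and evaluation helps programs learn and improve.", "C:/Users/grace/Desktop/txtmining/doc1.txt")
writeLines("Impact evaluation compares outcomes against the counterfactual.", "C:/Users/grace/Desktop/txtmining/doc2.txt")
## These write two small sample documents into the folder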

Step 5: Type the codes
These are not the best codes ever written for text mining.  They were borrowed from other R blog sites, and I only selected those that provide a simple solution to my problem of mining text documents.

library(wordcloud)
library(tm)
## You have to load the packages with library() so that you can use them.

setwd("C:/Users/grace/Desktop/txtmining")
## This sets the working director

txtdata <- Corpus (DirSource("C://Users/grace/Desktop/txtmining"))
inspect(txtdata)

txtdata <- tm_map(txtdata, stripWhitespace)
## This collapses extra white space

txtdata <- tm_map(txtdata, content_transformer(tolower))
## This transforms uppercase to lowercase (e.g. 'DEPED' to 'deped')

txtdata <- tm_map(txtdata, removeWords, stopwords('en'))
## This removes words that are not necessary to your analysis (e.g. is, are, shall, in, the, etc)

wordcloud(txtdata)
## shows the word cloud of your texts
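As a small add-on that is not part of the codes I borrowed (the object names tdm and term_freq are mine), you can also check which terms are most frequent before plotting:

tdm <- TermDocumentMatrix(txtdata)
## This builds a term-document matrix, a table of how often each term appears in each document

term_freq <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)
head(term_freq, 10)
## This lists the ten most frequent terms across your documents

wordcloud(names(term_freq), term_freq, max.words = 50)
## This draws the word cloud from the frequency counts, limited to the 50 most common terms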