Yes, you read that correctly, and no, Quandl (http://www.quandl.com/) did not pay me anything. Quandl is a new database management tool which seeks to become the place to find datasets. They boast of having over 5×10^6 datasets available t…
Continue reading Quandl Package – 5,000,000 free datasets at the tip of your fingers!
Newcomers usually find it hard to analyze data that are not randomly collected, or that are truncated or censored. But in the real world, these kinds of problems always exist. This post digs a little deeper into this area by presenting the types of limited dependent variables and the relevant Stata [...]
Continue reading Crack Limited Dependent Variable’s Regression Using Stata
This topic is a bit hard. I have been reading a book on it for several weeks (a short stretch each time), and I know the theory behind it is complex… However, we can solve the complex problem with just a few commands in Stata, such powerful software, ha… First, you should have the Stata software [...]
Continue reading Propensity Score Match
I have to say that DID, which stands for difference in differences, one of the most commonly used econometric methods in program evaluation, is such an easy method. I have been deceived by it for such a long time… Here is the command for applying DID in Stata: regress EARNS Tdyear2 TREAT dyear2. This is an OLS regression with some dummy [...]
Continue reading Unveil the truth of DID
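To see why a regression on dummies like the one above gives the DID estimate, here is a minimal Python sketch (not Stata) on invented numbers. It assumes, as the variable names suggest but the excerpt does not confirm, that TREAT is a treatment-group dummy, dyear2 is a second-period dummy, and Tdyear2 is their product. With a constant and those three dummies, the OLS coefficient on the interaction equals the difference between the two groups' before/after changes in mean earnings, which the code computes directly from the four cell means.

```python
from statistics import mean

# Hypothetical observations: (earnings, treat, year2). All numbers are invented.
rows = [
    (10.0, 0, 0), (12.0, 0, 0),   # control group, period 1
    (13.0, 0, 1), (15.0, 0, 1),   # control group, period 2
    (11.0, 1, 0), (13.0, 1, 0),   # treated group, period 1
    (18.0, 1, 1), (20.0, 1, 1),   # treated group, period 2
]

def cell_mean(treat, year2):
    """Mean earnings in one of the four group-by-period cells."""
    return mean(e for e, t, y in rows if t == treat and y == year2)

# Change over time within each group.
change_treated = cell_mean(1, 1) - cell_mean(1, 0)
change_control = cell_mean(0, 1) - cell_mean(0, 0)

# DID estimate: how much more the treated group changed than the control group.
did = change_treated - change_control
print(did)
```

In this toy data the control group gains 3 and the treated group gains 7, so the DID estimate is 4. That is the number the interaction coefficient would report under the assumptions stated above.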
I recently went to an interesting seminar by Matias Cattaneo from the University of Michigan. He was presenting some of his work on non-parametric regression discontinuity design, which I found interesting. What he was working on and the concl…
Continue reading Non-Parametric Regression Discontinuity
Stata do file. Path analysis is an interesting statistical method that can be used to identify complex relationships between variables and an outcome variable. As with all statistical methods, the modelling framework is essential to derive reasonable r…
Continue reading Path Analysis
William Gould on Stata’s blog (previously mentioned here) has two great posts (here) on the intuition behind matrices and regression coefficients. The section on near-singular matrices is characteristically nice:
Singular matrices are an extreme case of nearly singular matrices, which are the bane of my existence here at StataCorp. Here is what it means for a matrix to be nearly singular: [see figure]
Nearly singular matrices result in spaces that are heavily but not fully compressed. In nearly singular matrices, the mapping from x to y is still one-to-one, but x’s that are far away from each other can end up having nearly equal y values. Nearly singular matrices cause finite-precision computers difficulty. Calculating y = Ax is easy enough, but to calculate the reverse transform x = A^(-1)y means taking small differences and blowing them back up, which can be a numeric disaster in the making.
Both posts are great and I recommend them for anyone struggling with the intuition behind what exactly you’re doing when you type in reg y x.
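The compression and blow-up described in the quote are easy to reproduce by hand. The following is a minimal Python sketch (not Stata, and not taken from Gould’s posts): the 2×2 matrix, the two x vectors, and the size of the nudge are all values I picked for illustration, and the inverse uses the textbook closed-form formula for a 2×2 matrix.

```python
# Hypothetical nearly singular matrix: the second row is almost a copy of the first.
A = [[1.0, 1.0],
     [1.0, 1.0001]]

def apply(M, v):
    """Compute the matrix-vector product M v for a 2x2 matrix."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def inverse(M):
    """Invert a 2x2 matrix with the closed-form adjugate formula."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

# Two x's that are far apart ...
x1 = [1.0, 1.0]
x2 = [101.0, -99.0]

# ... end up with nearly equal y's.
y1 = apply(A, x1)
y2 = apply(A, x2)
gap_in_y = max(abs(a - b) for a, b in zip(y1, y2))

# Going back: nudge y1 slightly and see how far the recovered x moves.
A_inv = inverse(A)
y1_nudged = [y1[0], y1[1] + 0.001]
x_back = apply(A_inv, y1)
x_nudged = apply(A_inv, y1_nudged)
shift_in_x = max(abs(a - b) for a, b in zip(x_back, x_nudged))

print(gap_in_y, shift_in_x)
```

Here the two x’s differ by 100 in each coordinate, yet their images differ by only about 0.01. In the other direction, a nudge of 0.001 in y moves the recovered x by roughly 10. That amplification is what makes rounding error in y so costly.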
As an added bonus, earlier this week I stumbled across Kenneth Simon’s excellent PDF cheat sheet of Stata commands for intermediate / advanced econometrics, here. I was trying to figure out a way to do something cute with distributed lag models and post-estimation tests, but the sheet covers everything from the simple but important (e.g., the difference between gen old = age >= 18 and gen old = age >= 18 if age<.) to the arcane but potentially important (e.g., nonlinear hypothesis testing). If you’re in applied work and use Stata, I highly recommend flipping through it. I’ve already found several useful techniques I wasn’t even aware existed.
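The gen old example deserves a word of explanation. Stata orders numeric missing values (.) above every real number, so age >= 18 is true when age is missing and the first command codes people of unknown age as old. Adding if age<. leaves old missing for those rows instead. Here is a small Python sketch (not Stata) that mimics this ordering by using infinity as a stand-in for Stata’s missing value; the ages and the helper names are made up for illustration.

```python
import math

MISSING = math.inf  # stand-in for Stata's ".", which sorts above every number

ages = [12, 18, 45, MISSING]  # hypothetical data; the last age is unknown

def gen_old_naive(age):
    """Mimics: gen old = age >= 18"""
    return int(age >= 18)

def gen_old_guarded(age):
    """Mimics: gen old = age >= 18 if age<."""
    if age < MISSING:
        return int(age >= 18)
    return MISSING  # rows that fail the if-condition are left missing

naive = [gen_old_naive(a) for a in ages]
guarded = [gen_old_guarded(a) for a in ages]

print(naive)
print(guarded)
```

The naive version marks the person of unknown age as old (1), while the guarded version keeps that row missing. The guarded result is usually what you want before running a regression.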
Continue reading Stata blog post on understanding matrices (with bonus Stata cheat sheet)