Useful commands for Stata



Maximum Likelihood (write your own commands)

Moving Averages & Data Cleaning

Data Imputation
Unobserved Effect Models

Instrumental Variables
 Many Forms of Instrumental Variables
 The Weak Instrument Problem
 Testing the Rank Condition of IV Estimators
 2SQreg IVqreg Cfqreg – zombies

Censored Regression
Dependent variable is bottom coded 
Asymmetric Error with Right and Left Censoring
Tobit Normality Assumption Fail – Tobit Still Works
Cragg’s Double hurdle model used to explain censoring

Quantile Regression
Program your own quantile regression v1 – Maximum Likelihood 
Quantile Regression Fail
Oh wait! Quantile regression wins!
 2SQreg IVqreg Cfqreg – zombies
 Quantile Regression (qreg) is invariant to non-decreasing transformations

Random Coefficients
 Estimating Random Coefficients on X (using xtmixed)
Estimating Random Coefficients on X (using NormalReg -preferred)

Binary Response Models
My Own Tools (No Theory Backing Them)
Breaking down R2 by variable
Sticky Probit – clustered bootstrapped standard errors
 Non-Parametric PDF Fit Test
 Test of PDF Fit – Does not work
Power Analysis
Power Analysis with Non-Linear Least Squares: A Simulation Approach

Miscellaneous Topics

Special Topics

Value-added modelling iPad example


External Resources

Courses with Public Material


Wooldridge: Econometric Analysis of Cross Section and Panel Data (2nd Edition)
Greene: Econometric Analysis (7th Edition)

Germán Rodríguez – Stata Tutorial – Princeton
Stata Programming Essential – University of Wisconsin

Data Sources

Inter-University Consortium for Political and Social Research – This organization has a large collection of data that has been formatted and contributed by many researchers.  There are thousands of data files accessible to researchers.  However, you must be affiliated with a member organization to access the resources.

– This site provides a tremendous resource (5,000,000 datasets) that organizes data from publicly available sites.  It provides packages in R and Stata that let you access and download data files with a single command.

IPUMS – This organization assembles sampling of census years for the US and other countries.  It is a vital resource for many research projects.

Economic Blog Aggregator at the federal reserve is a blog aggregator that “looks through blog posts for a link to some research indexed on a RePEc service, currently EconPapers, IDEAS and NEP. IDEAS then also links back from the abstract page to the blog posts.”

Economics Papers

RePEc (Research Papers in Economics) is a collaborative effort of hundreds of volunteers in 75 countries to enhance the dissemination of research in Economics and related sciences.

Effective Resume Writing

Much of the world calls it a curriculum vitae; in North America, it’s a résumé. This is definitely one of the most important tools that any jobseeker has at their disposal. You may be THE best candidate for a particular job by a long way; however, if you don’t make it to the interview stage, the company will never know.

Many companies (especially the larger corporations) will use computer software to “read” all the résumés and reject any that don’t fit a particular template. This may seem unfair, but it’s cost effective.

So, for some jobs you have to beat the computer and still read well enough for someone who may not have any knowledge of the position you are applying for. It is definitely worthwhile to adapt your resume for the position that is advertised. There may well be some of the “buzz” words the “filter” is looking for mentioned in the job description.

It is very important that you can substantiate all the claims you make, preferably with physical examples or letters. This will be essential in any interview situation.

There is now a wealth of information available online, in books, at local employment offices and from professional writing agencies. You can also access other people’s résumés that are posted online, which will give you some great ideas for style and content.

Professional writers may seem like the answer, but all the research I have done leans away from them. I have never used one, and I feel it gives a good impression if you have written the résumé yourself (this displays literacy). Apparently, professionally written résumés are easy to spot; however, they may be worth the expense if you are stuck. You can always “customise” what has been written to make it your own work.

In my case, I had been in the military since I left school and had never written a resume or had an interview. I spent a lot of time writing, copying other people’s styles and changing things. I didn’t realise how difficult it is to catch up on 16 years – I’ll never allow mine to go out of date again! I found the hardest part was to actually start writing. The best advice I was given was to just write anything that you can think of and it will soon start to flow. With modern word processors it’s relatively quick and easy to cut and paste so you can keep on changing it until you are happy. More detailed information can be found at my website

Good Luck!!!!!


Economics professor Mehmet Caner arrived at Ohio State in August as part of a group of newly hired faculty affiliated with Translational Data Analytics @ Ohio State, formed to create and apply data-based solutions to global challenges. He holds a courtesy appointment in the Department of Statistics.

An econometrician, Caner focuses his research on econometric theory related to big data problems and international finance. His recent research has emphasized high-dimensional econometrics, which specializes in estimation and testing for big data.

Prior to coming to Ohio State, Caner held the position of Thurman-Raytheon Distinguished Professor of Economics at North Carolina State University.

In 2010, Caner, along with Thomas Grennes (North Carolina State University) and Friederike Köhler-Geib (The World Bank), published a paper, Finding the Tipping Point – When Sovereign Debt Turns Bad, in the book Sovereign Debt and Financial Crisis, examining the debt-growth relationship for a dataset of 99 developing and developed economies from 1980 to 2008.

Caner and his colleagues addressed the questions of whether a tipping point in public debt exists and if so, how severe would the impact of public debt be on growth beyond this threshold? What happens if debt stays above this threshold for an extended period of time?

Their findings: The economy loses 0.017 percent in growth for every percent of the public debt-to-GDP ratio above 77 percent. The effect is even more pronounced in emerging markets, where the threshold is a 64 percent debt-to-GDP ratio. In these countries, the loss in annual real growth with each additional percentage point in public debt amounts to 0.02 percentage points. The cumulative effect on real GDP could be substantial.
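As a rough illustration of the arithmetic behind these findings, the sketch below applies the reported per-point loss to a debt ratio above the threshold. The function name and the example debt ratio of 100 percent are assumptions made for illustration, and the sketch reads both reported figures as percentage points of annual growth lost per percentage point of debt above the threshold; it is not the paper's estimation procedure.

```python
def growth_loss(debt_to_gdp, threshold=77.0, loss_per_point=0.017):
    """Annual growth loss (in percentage points) under a linear penalty above the threshold.

    Hypothetical helper: the defaults are the full-sample figures quoted above.
    """
    excess = max(0.0, debt_to_gdp - threshold)  # no penalty below the threshold
    return excess * loss_per_point


# Illustrative debt ratio of 100 percent (an assumption, not a figure from the paper).
full_sample_loss = growth_loss(100.0)                                       # 23 points above 77
emerging_loss = growth_loss(100.0, threshold=64.0, loss_per_point=0.02)    # 36 points above 64

print(round(full_sample_loss, 3), round(emerging_loss, 3))
```

Under this reading, a country at 100 percent debt-to-GDP would give up roughly 0.39 percentage points of growth per year in the full sample and 0.72 in the emerging-market case, which is why the cumulative effect can become large.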

Their work was cited by The Economist in response to earlier published findings on debt-to-GDP ratio by Carmen Reinhart and Kenneth Rogoff.

So what do we need to know about big data and economics?

“Data sets are incredibly larger now than 10 years ago,” said Caner. “Whereas years ago we worked with two to three variables . . . now we are faced with 70. The tools we’ve been using to build models don’t suffice anymore; we need a new technique.”

By way of example, Caner points out that the method of least squares — the standard approach in regression analysis — presupposes that the number of variables is very limited.

“Least squares may not be optimal in non-linear problems or in forecasting,” explained Caner.

The LASSO (Least Absolute Shrinkage and Selection Operator) regression method — a relatively new tool — may revolutionize forecasting and that is something Caner is very excited about.
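To make the idea concrete, here is a minimal sketch of LASSO fitted by coordinate descent in plain Python. It assumes the common objective (1/(2n)) * sum of squared residuals + penalty * sum of |beta_j|, omits the intercept for brevity, and uses a tiny made-up dataset; the function names are hypothetical, and this is not the method used in Caner's own work.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; values inside [-penalty, penalty] become 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + penalty * sum(|b_j|), one coefficient at a time."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of column j with the residual that excludes column j
            z = 0.0    # mean square of column j
            for i in range(n):
                partial_resid = y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * partial_resid
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, penalty) / (z / n)
    return beta


# Made-up data: y is driven by the first column; the second column is nearly irrelevant.
X = [[-2.0, 1.0], [-1.0, -2.0], [0.0, 2.0], [1.0, -2.0], [2.0, 1.0]]
y = [-6.0, -3.0, 0.5, 3.0, 6.0]

ols_like = lasso_coordinate_descent(X, y, penalty=0.0)   # no penalty: least squares
lasso_fit = lasso_coordinate_descent(X, y, penalty=0.5)  # penalty shrinks and selects
print(ols_like, lasso_fit)
```

With no penalty, the second coefficient is small but nonzero; with the penalty, it is set exactly to zero while the first coefficient is shrunk. That combination of shrinkage and variable selection is what makes the method attractive when there are many candidate regressors.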

“Big data is high dimensional and may have tremendous implications for big policy.”

Currently, Caner is contemplating a big data project: forecasting the euro to dollar exchange rate. In the meantime, he has several papers in process and is teaching two PhD courses – one in high dimensional econometrics.

Caner is associate editor of the Journal of Econometrics where he is also a fellow. He is associate editor of the Journal of Business and Economics Statistics; Econometric Reviews; and Studies in Nonlinear Dynamics and Econometrics. He has published more than 30 articles, in journals such as Econometrica; Journal of Econometrics; Econometric Theory; Journal of Business and Economic Statistics; Journal of International Money and Finance; and World Economy.

Caner earned an MA and a PhD in economics at Brown University. He received his BS in business administration at Middle East Technical University (METU), Ankara, Turkey.

Machine Learning for Economists: An Introduction



A crash course for economists who would like to learn machine learning.

Why should economists bother at all? Machine learning (ML) generally outperforms econometrics in predictions. That is why ML is becoming more popular in operations, where econometrics’ advantage in tractability is less valuable. So it’s worth knowing both and choosing the approach that best suits your goals.

An Introduction

These articles have been written by economists for economists. Other readers may not appreciate constant references to economic analysis and should start from the next section.

  1. Athey, Susan, and Guido Imbens. “NBER Lectures on Machine Learning,” 2015. A shortcut from econometrics to machine learning. Key principles and algorithms. Comparative performance of ML.
  2. Varian, “Big Data: New Tricks for Econometrics.” Some ML algorithms and new sources of data.
  3. Einav and Levin, “The Data Revolution and Economic Analysis.” Mostly about new data.


Practical applications get little publicity, especially if they are successful. But these materials do give an impression of what the field is about.


  1. Bloomberg and Flowers, “NYC Analytics.” NYC Mayor’s Office of Data Analysis describes their data management system and improvements in operations.
  2. UK Government, Tax Agent Segmentation.
  3., Applications. Some are ML-based.
  4. StackExchange, Applications.

Governments use ML sparingly. Developers emphasize open data more than algorithms.


  1. Kaggle, Data Science Use cases. An outline of business applications. Few companies have the data to implement these things.
  2. Kaggle, Competitions. (Make sure you choose “All Competitions” and then “Completed”.) Each competition has a leaderboard. When users publish their solutions on GitHub, you can find links to these solutions on the leaderboard.

Industrial solutions are more powerful and complex than these examples, but they are not publicly available. Data-driven companies post some details about this work in their blogs.

Emerging applications

Various prediction and classification problems. For ML research, see the last section.

  1. Stanford’s CS229 Course, Student projects. See “Recent years’ projects.” Hundreds of short papers.
  2. CMU ML Department, Student projects. More advanced problems, compared to CS229.


A tree of ML algorithms:


Econometricians may check the math behind the algorithms and find it familiar. Mathematical background:

  1. Hastie, Tibshirani, and Friedman, The Elements of Statistical Learning. Standard reference. More formal approach. [free copy]
  2. James et al., An Introduction to Statistical Learning. Another standard reference by the same authors. More practical approach with coding. [free copy]
  3. Kaggle, Metrics. ML problems are all about minimizing prediction errors. These are various definitions of errors.
  4. (optional) Mitchell, Machine Learning. Close to Hastie, Tibshirani, and Friedman.
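As a small illustration of what "definitions of errors" means in practice, the sketch below computes three widely used metrics in plain Python: mean absolute error, root mean squared error, and binary log loss. The function names and the toy numbers are assumptions for illustration and are not taken from the Kaggle page.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared gap; penalizes large misses more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def log_loss(labels, probabilities, eps=1e-15):
    """Binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for label, prob in zip(labels, probabilities):
        prob = min(max(prob, eps), 1.0 - eps)
        total += label * math.log(prob) + (1 - label) * math.log(1.0 - prob)
    return -total / len(labels)


# Toy regression and classification examples (made-up numbers).
actual = [3.0, 5.0, 2.0]
predicted = [2.0, 5.0, 4.0]
mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
coin_flip_loss = log_loss([1, 0], [0.5, 0.5])
print(mae, rmse, coin_flip_loss)
```

The choice of metric matters: the same set of predictions can rank differently under absolute and squared error, because squared error weights the single large miss more heavily.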

For what makes ML different from econometrics, see chapters “Model Assessment and Selection” and “Model Inference and Averaging” in The Elements.
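One idea central to model assessment is judging a model by its error on data it was not fitted to. A minimal sketch of k-fold cross-validation for a one-regressor linear model follows; the function names, the fold assignment by index, and the data are illustrative assumptions rather than anything prescribed in The Elements.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


def k_fold_mse(x, y, k):
    """Mean squared prediction error, with each observation predicted out of sample."""
    n = len(x)
    squared_errors = []
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        intercept, slope = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        squared_errors.extend((y[i] - intercept - slope * x[i]) ** 2 for i in test_idx)
    return sum(squared_errors) / n


# Made-up data: roughly y = 1 + 2x with small deviations.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [1.2, 2.9, 5.1, 6.8, 9.3, 10.9, 13.2, 14.8]

intercept, slope = fit_line(x, y)
in_sample_mse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)) / len(x)
leave_one_out_mse = k_fold_mse(x, y, k=len(x))
print(in_sample_mse, leave_one_out_mse)
```

The in-sample error is computed on the same data used for fitting, so it tends to understate how the model performs on new observations; the cross-validated figure is the more honest estimate of predictive accuracy.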

Handy cheat sheets by KDnuggets, Microsoft, and Emanuel Ferm. Also this guideline:


Software and Hardware

Stata does not support many ML algorithms. Its counterpart in the ML community is R. R is a language, so you’ll need more tools to make it work:

  1. RStudio. A standard coding environment. Similar to Stata.
  2. CRAN packages for ML.
  3. James et al., An Introduction to Statistical Learning. This text introduces readers to R. Again, it is available for free.

Python is the closest alternative to R. Packages “scikit-learn” and “statsmodels” do ML in Python.

If your datasets and computations get heavier, you can run code on virtual servers by Google and Amazon. They have ML-ready instances that execute code faster. It takes a few minutes to set up one.


I limited this survey to economic applications. Other applications of ML include computer vision, speech recognition, and artificial intelligence.

The advantage of ML approaches (like neural networks and random forest) over econometrics (linear and logistic regressions) is substantial in these non-economic applications.

Economic systems often have linear properties, so ML is less impressive here. Nonetheless, it does predict things better, and more practical solutions are being built the ML way.

Research in Machine Learning

  1. arXiv, Machine Learning. Drafts of important papers appear here first. They are later published in journals.
  2. CS journals. Applied ML research also appears in engineering journals.
  3. CS departments. For example: CMU ML Department, PhD dissertations.

Econometrics and “Big Data”



In this age of “big data” there’s a whole new language that econometricians need to learn. Its origins are somewhat diverse – the fields of statistics, data-mining, machine learning, and that nebulous area called “data science”.
What do you know about such things as:
  • Decision trees
  • Support vector machines
  • Neural nets
  • Deep learning
  • Classification and regression trees
  • Random forests
  • Penalized regression (e.g., the lasso, lars, and elastic nets)
  • Boosting
  • Bagging
  • Spike and slab regression?
Probably not enough!
If you want some motivation to rectify things, a recent paper by Hal Varian will do the trick. It’s titled, “Big Data: New Tricks for Econometrics”, and you can download it from here. Hal provides an extremely readable introduction to several of these topics.
He also offers a valuable piece of advice:

“I believe that these methods have a lot to offer and should be more widely known and used by economists. In fact, my standard advice to graduate students these days is ‘go to the computer science department and take a class in machine learning’.”

Interestingly, my son (a computer science grad.) “audited” my classes on Bayesian econometrics when he was taking machine learning courses. He assured me that this was worthwhile – and I think he meant it! Apparently there’s the potential for synergies in both directions.

Big Data Driving New Approaches in Econometrics

Data is finance’s new currency, healthcare’s latest wonder drug, and the energy sector’s new oil.

Another day, another Big Data analogy.

All of the hype doesn’t change the fact that businesses across nearly every industry are gaining competitive advantage by extracting value from large datasets.

Econometrics is an area that has been cautious about Big Data. The field is built on a strong foundation of theory and methodology, and relies on a variety of approaches that differ significantly from those of Big Data analytics. For example, econometrics typically starts with a theory and then uses data analysis to prove or disprove it, while Big Data and machine learning work in reverse. Econometricians have also expressed concerns regarding the context, reliability and representativeness of such vast datasets.

However, it’s becoming clear that Big Data has the potential to be disruptive to traditional econometrics. Data collection over social sources has produced unprecedentedly large and complex datasets about human behavior and interaction, and this unstructured data has proven itself to be a goldmine of economic information.

Econometricians are certainly not strangers to data analysis; however the growing volume of economic data from diverse sources is driving the need to adopt new computational approaches and develop better data manipulation tools.

Econometricians entering the field today also face a bit of a learning curve, and find they require a combination of skills in both economics and computer science to deal with the increasing volume, variety, and velocity of data. Hal Varian, Chief Economist at Google offers this word of advice to current students of econometrics: “Go to the computer science department and take a class in machine learning.”

While econometricians might still be working out the “kinks” in their Big Data approaches, the analysis of large datasets is already driving a number of advancements across the field:

  • Over a two-year period, researchers analyzed millions of transactions among nearly 128,000 consumers of a packaged goods chain to determine whether characteristics of the first purchasers of a new product had any impact on that product’s long-term success. Customer characteristics and purchasing behavior were processed to reveal a small subset of customers they referred to as “harbingers of failure,” who had a propensity to buy new products that were likely to flop. The data also helped researchers quickly reveal hard-to-find patterns among consumer groups and challenge traditional early indicators of product success.
  • Researchers have used Big Data to analyze investor behavior and its eventual effect on stock market performance. By collecting internet usage data, researchers could pinpoint investors’ attempts to gather information online before executing a trading decision. The resulting data allowed researchers to trace the consequences of specific investor behavior, and offered predictions regarding stock market performance and new insight into the early information-gathering stages of decision-making.
  • MIT’s Billion Prices Project (BPP) aggregates daily price fluctuations of approximately five million items sold by 300 online retailers in more than 70 countries to provide real-time predictions on inflation. While traditionally econometricians have been forced to rely on historical data to generate future predictions, data sources are now available in real-time to help identify economic trends as they are occurring.

Machine learning by its very definition has the potential to rapidly alter the field of econometrics. The ability of computers to develop pattern recognition, and then learn from and make predictions based on data is a familiar task for econometricians, who on a daily basis analyze tremendously large volumes of economic data in order to form theories.

As Big Data continues to penetrate the methods of econometrics, the field will need to adopt new computational tools and approaches in order to extract insight from these increasingly large and complex economic datasets. The granularity offered by Big Data will enable econometricians to adopt new data-driven styles of analysis and investigation to help them resolve their biggest economic questions.

Stockholm School of Economics

The Stockholm School of Economics (SSE) is the leading business school in Northern Europe. For more than a century, SSE has educated talented women and men for leading positions within the business community and the public sector. SSE offers bachelors and masters degree programs along with highly regarded PhD, MBA and executive education programs.

SSE has earned a reputation for excellence both in Sweden and around the world. The School is accredited by EQUIS (European Quality Improvement System) certifying that all of its main activities, teaching as well as research, are of the highest international standards. SSE is also the Swedish member institution of CEMS (The Global Alliance in Management Education) and PIM (Partnership in International Management).


Harvard College was established in 1636 by vote of the Great and General Court of Massachusetts Bay Colony, and was named for its first benefactor, John Harvard of Charlestown. Harvard is America’s oldest institution of higher learning, founded 140 years before the Declaration of Independence was signed. The University has grown from nine students with a single master to an enrollment of more than 18,000 degree candidates, including undergraduates and students in 10 principal academic units. An additional 13,000 students are enrolled in one or more courses in the Harvard Extension School. Over 14,000 people work at Harvard, including more than 2,000 faculty. There are also 7,000 faculty appointments in affiliated teaching hospitals. Our mission, to advance new ideas and promote enduring knowledge, has kept the University young. We strive to create an academic environment in which outstanding students and scholars from around the world are continually challenged and inspired to do their best possible work. It is Harvard’s collective efforts that make this university such a vibrant place to live, to learn, to work, and to explore.

Econ PhD programs around the world.

When you decide to pursue an Econ PhD, you should know the following facts that are hidden behind the process.

Nowadays, you can use the internet to discover the world’s top universities for economics and econometrics. The rankings highlight the world’s top universities in 36 individual subjects, based on academic reputation, employer reputation and research impact. Use the interactive table to sort the results by location or performance indicator, and to access more details about the universities you’re interested in.

London School of Economics and Political Science (LSE) is a leading public university in the UK capital, renowned worldwide for its leadership in the social sciences. Known for its excellence in both research and teaching, the university and its graduates make a significant contribution to global policy and debate.

Ranked 35th overall in the QS World University Rankings® 2015/16, LSE is recognized among the world’s very best across a range of academic disciplines. In the 2016 edition of the QS World University Rankings by Subject, LSE ranks within the global top 10 for social policy, development studies, politics, communication and media studies, anthropology, accounting and finance, geography, history, philosophy, law, economics, and business and management studies – with positions in the top 50 for psychology and statistics.

The university’s central London campus brings together staff and students from all over the world, offering a truly international environment. Students enjoy close proximity to world-class facilities such as the British Library of Political and Economic Science, alongside LSE’s Language Centre, and a vibrant Students’ Union. Graduates can look forward to excellent career opportunities, joining a worldwide network of prestigious alumni.

LSE makes over £15.5 million of need- and merit-based financial aid available to its students each year. Awards range from a contribution to tuition fees to full coverage of all expenses, with grants, loans and scholarships awarded based on merit and/or household income.

1) Am I ready for a PhD program?

2) Do you really want to pursue an academic life?

3) Financial support and family.