ECONOMETRICS
ECON/DECS B360-001 Fall 2019
Course Location: 1st floor computer lab, Miller Hall
Class hours: 2:00 – 3:15 Tuesdays and Thursdays
Instructor: John Levendis
Office: Miller 315
Office phone: 864-7941
Email: [email protected]
Office Hours: 3:30-4:30 pm Tuesdays and Thursdays and by appointment.
Course Prerequisites: Principles of Microeconomics (ECON B100), Principles of
Macroeconomics (ECON B101), Business Statistics (DECS 205), and Junior standing.
Terms of Use: A student's continued enrollment in this course signifies acknowledgment of, and
agreement with, the statements, disclaimers, policies, and procedures outlined within this
syllabus and elsewhere in the Blackboard environment. This Syllabus is a dynamic document.
Elements of the course structure (e.g., dates and topics covered, but not policies) may be
changed at the discretion of the professor.
College of Business Mission Statement: In the Ignatian tradition, the mission of the College of
Business is to provide a superior value-laden education that motivates and enables students to
become effective and socially responsible business leaders. We strive to contribute quality
research, serve local and intellectual communities, and graduate students who possess critical
thinking skills and courage to act justly in a global business environment.
Course Description: Econometrics is an intermediate level statistics course. After a brief
overview of statistics, the course covers least squares estimation, statistical inference, diagnostic
methods, selection and evaluation of functional form, and simultaneous equations estimation.
The course focuses more on applied work than on its theoretical underpinnings. Students are
actively involved with computer exercises in this course, using the STATA software program.
Students will complete a comprehensive statistical research project.
By the time you are finished, you will have learned a skill that most employers value, are willing
to pay for, and are utterly mystified by!
Expected outcomes: Students completing this course should be able to:
Articulate a statistically testable claim
Choose the right statistical method to test this claim
Use tools of statistical inference in order to evaluate claims based on sample data
Correct your method for violations of the standard assumptions
Use modern software packages to estimate your model
Produce a cogent report explaining the results of quantitative analysis in terms that are
both precise, and yet comprehensible to those who are unfamiliar with quantitative
research.
What you might do with these skills:
Economists – might use regression analysis to investigate how quickly increases in the rate of
money growth are reflected in price indices.
Social Scientists – might investigate how much a murder in a neighborhood decreases the value
of nearby homes.
Accountants – separate a bundled price into its constituent parts, or estimate costs.
Managers – forecast sales, and increase efficiency by identifying processes that are the biggest
contributors to waste and lost time.
Marketers – investigate how various product characteristics influence the decision to buy.
Financiers – test for empirical regularities between different financial assets.
Required Texts:
McCloskey, D. (1999). Economical Writing, 2nd ed. Waveland Press, Inc.
Pedace, Roberto (2013). Econometrics for Dummies. Wiley Press. Hoboken, NJ.
Some other useful textbooks:
Achen, Christopher H. Interpreting and Using Regression. Quantitative Applications in the
Social Sciences, Vol. 29. Sage Publications: London.
Berry, William Dale and Stanley A. Feldman. Multiple Regression in Practice. Quantitative
Applications in the Social Sciences, v. 50. Sage Publications: London.
Lewis-Beck, Michael S. (1980). Applied Regression: An Introduction. Quantitative Applications
in the Social Sciences. Sage Publications: London.
Schmidt, Stephen J. (2005). Econometrics. McGraw-Hill: NY, NY.
Software:
We will be using a program called Stata. The College of Business has installed Stata on the
computers in the 1st floor lab. I really recommend getting your own copy of Stata. (It’ll save you
trips to the computer lab, and you can do your work when you feel it’s most productive.) “Small
Stata” only allows for datasets with fewer than 1000 observations, so if you’re planning on
working with microeconomic datasets (like labor market or census data) then you’ll need a more
hefty version of Stata like Stata/IC.
You can get your own copies of the student version for cheap(er). During the semester in which
you are enrolled, you may order Stata at
http://www.Stata.com/coursegp.html
Specify JL360 for the GRADPLAN ID, and choose which version of Stata you would like to
order. I recommend a six-month or one-year license for Stata/IC.
Alternatively, Stata 14.2 is on our Virtual Lab, which can be accessed from any computer on
campus outside of the Miller Trader Lab and off campus, at: www.loyno.edu/vlab
Digital storage:
While not required, you might find it useful to keep your work (datasets, articles, etc.) on a
thumb (flash) drive, or on “the cloud” via DropBox or Google Drive.
Homework:
Learning by doing is extremely important in this class. There will be hw assignments from
almost every chapter that we cover. These will consist of a series of problems to be solved using
Stata. Each homework assignment will be worth 10 points. You can drop one of the hws.
Homework will be collected at the beginning of class. For every day that your assignment is late,
10% will be deducted from your grade. Solutions will be posted on the morning of the next class
day. After this point, late assignments will not be accepted.
While student cooperation and discussion is encouraged, homework assignments must be the
work of the individual student. If your hw is a copy, or near copy, of a classmate’s hw, both
students will receive zeros for the assignment.
Project:
You will be responsible for completing a 15-20 page research project from start to finish. To
make sure you stay on track, you will have to turn in portions of the project periodically. These
smaller portions will eventually add up to the four major sections of your research paper:
introduction and literature review, description of the data, analysis, and conclusion.
Grading:
Homeworks 60 pts
Prelim report 1 (topic) 10
Prelim report 2 (lit review) 30
Prelim report 3 (data) 30
Prelim report 4 (draft) 50
Referee report 20
Final paper 100
---------------------------------- ----------
Total 300 pts
Grade Percent
A 93-100
A- 90-92
B+ 88-89
B 83-87
B- 80-82
C+ 78-79
C 73-77
C- 70-72
D+ 68-69
D 60-67
F 59 and below
I reserve the right to raise or lower your final grade by an increment of a letter grade (from a C to
a C+, or a B to a B-, etc…)
Tentative calendar:
Topics will be covered in the following order. The pace of the class varies each year, depending
upon the idiosyncrasies of the students and of the academic calendar; therefore, the due dates
cannot be given with precision. Additional materials may be added as appropriate. This schedule
is subject to change.
8/20 Introduction, syllabus
Basic concepts
Conducting a research project
Sample paper handout
Textbook (Schmidt) chapters 1 and 2
8/22 Stata example from Kohler and Kreuter
Anscombe’s Quartet
Distribute HW-1
8/27 Do-files and log-files
The grammar of Stata commands
Paper discussion
Class time for HW-1
8/29 HW-1 due
OLS lecture (Ch6 in Schmidt)
9/3 Return graded HW-1
Discuss HW-1
Distribute HW-2
In class Stata exercise on OLS
9/5 Discuss various student projects
9/10 Preliminary report #1 (topic) is due
Ch4: Estimation
Lecture on scaling
Lecture on converting between nominal and real values
Ch5: Hypothesis testing
Central Limit Theorem
9/12 HW-2 due
Ch5 continued
Distribute HW-3
Do the in-class portion of HW-3
9/17 Return graded HW-2, discuss
Discuss sample paper
9/19 Ch7: Properties of OLS
9/24 HW-3 due
Ch7, continued
9/26 Return graded HW-3, discuss
10/1 Ch8: Multivariate regression
The Frisch-Waugh Theorem
Multicollinearity
R²
10/3 Ch9: Functional form
Assign HW-4
10/8 Preliminary report 2 (lit review) due
Ch9 continued
10/10 Students work on HW-4 in class
10/15 Fall break. No class.
10/17 HW-4 (ch.9) due
Lecture on data management, appending, and merging.
10/22 Return graded HW-4, discuss
10/24 Assign HW-5 (data management)
Chapter 10: Determining the specification of the model
Which variables to include?
10/29 Chapter 10 continued
In-class HW time
10/31 HW-5 (data management) due
11/5 Return graded HW-5, discuss.
Using logarithms
11/7 Chapter 11: Dummy variables
Distribute HW-6 (chapter 11)
11/12 Preliminary report 3 (data section) due
Ch 11: Dummy variables continued
In-class portion of HW-6
11/14 HW-6 (ch.11) due
11/19 Return graded HW-6, discuss
11/21 Ch14, 2sls and endogenous variables
Distribute HW-7 (on ch14, 2sls)
In-class time for HW-7
11/26 HW-7 due
11/28 Thanksgiving holidays. No class.
12/03 Rough draft due, bring three hard copies.
Swap drafts with other students
12/5 Referee report due
Return my referee reports
Workshop student papers
12/12 Your final paper is due on Thursday, December 12th at 11:30am.
Academic Integrity Statement:
It is very important that you do your work yourself. It is OK to work with others, but the written
product that you turn in must be your own work. Violations will result in academic penalties,
including receiving an “F” for the assignment, and possibly for the course. More serious
violations will be taken through the appropriate administrative channels.
According to the Loyola University Student Handbook
(http://www.loyno.edu/studentaffairs/conduct/handbook/academic_policies.html):
“All academic work will be done by the student to whom it is assigned without
unauthorized data or help of any kind. A student who supplies another with such data or
help is considered deserving of the same sanctions as the recipient. Specifically, cheating,
plagiarism, and misrepresentation are prohibited. Plagiarism is defined by Alexander
Lindley as “the false assumption of authorship: the wrongful act of taking the product of
another person’s mind, and presenting it as one’s own” (Plagiarism and Originality).
“Plagiarism may take the form of repeating another’s sentences as your own, adopting a
particularly apt phrase as your own, paraphrasing someone else’s argument as your own,
or even presenting someone else’s line of thinking in the development of a thesis as
though it were your own” (MLA Handbook, 1985). A student who is found to have
cheated on any examination may be given a failing grade in the course. In case of a
second violation, the student may be excluded for one or two terms or dismissed from the
University.
“A student who engages in cheating, plagiarism, or misrepresentation on term papers,
seminar papers, quizzes, laboratory reports, and such may receive a sanction of a failing
grade in the course. A second offense may be cause for exclusion or dismissal from the
University. Faculty members are required to report immediately to the dean of the
student’s college any case of cheating, plagiarism, or misrepresentation which they have
encountered and, later, the manner in which it was resolved.”
Attendance Policy:
None. You are adults. But there is a high correlation between attendance and performance. This
correlation is statistically and economically significant.
Cell Phones and Computers:
Please keep cell phones turned off during class. Do not check messages or send messages during
the lecture.
You will be tempted to browse on the internet during class. Do not do this. It is rude, I take
offense to it, and it’ll cost you. All applications that are not related to classwork should be
closed, not just minimized.
Additional Expectations:
Free discussion, inquiry, and expression are encouraged in this class. Classroom behavior that
interferes with either (a) the instructor’s ability to conduct the class or (b) the ability of students
to benefit from the instruction is not acceptable. Examples may include routinely entering class
late or departing early; use of beepers, cellular telephones, or other electronic devices; talking
while others are speaking; or arguing in a way that is perceived as “crossing the line of civility.”
The Term Paper
Your assignment is to construct an econometric model of a functional relation that is
interesting to you, collect the relevant data, and estimate the model dealing with possible
statistical problems that might arise.
You must clear your idea with your professor before getting started on your project. You can
find ideas for possible projects from examples in our text, from questions that are raised in your
other classes, or from journals that publish applied statistical research, such as The Review of
Economics and Statistics, Applied Economics, The Journal of Applied Econometrics, and various
other journals from almost any business field.
I will impose very few restrictions on the nature of this project; the application does not
have to be an economic relation. However, there are some projects that are likely to be less
suitable than others, because of unavailable data or lack of interesting testable hypotheses.
Therefore, to guarantee that your project remains on the right track, you are required to regularly
submit portions of your paper.
Once we have agreed on a project, you should collect the data. If you were to propose a model of
wage determination, would you observe wages of individuals at a point in time, or would you
model average wages in the US over time, or possibly average wages of states observed across
states? Clear thinking about this issue is vital to developing a reasonable econometric model.
The next step is to begin estimation. You will probably want to try several alternative
specifications of your model, and you will undoubtedly encounter various statistical problems.
An important part of the project is the testing and treatment of these various econometric
problems, using procedures presented in the course. You should document your use of these
procedures.
Grading is neither by pound of output or written material, nor by complexity of project, but
rather by:
1. Economic theory behind the design of your econometric model;
2. Appropriateness of the procedures selected; and
3. Completeness and clarity in communicating the results and the policy implications.
The Progression of the Paper
Preliminary Report 1: The Problem Statement, its significance.
1. Select your topic in consultation with your instructor.
2. Justify the selection of your topic: Why should we care?
3. Make sure that the necessary data are, at least in principle, available.
4. Fill out and turn in the “Preliminary Report 1: Paper Topic” sheet. You will find it a couple of pages after this one.
Preliminary Report 2: Literature Review:
1. Type up a proper introduction to your paper. You should establish what your research question is, and why it is important. (See your Preliminary Report 1 if you’ve forgotten.)
2. Survey the literature
   a. What type of analysis did previous researchers do?
   b. What data did they use? Where did they get it?
   c. What were their results?
3. Include a proper bibliography.
4. Very briefly, what type of economic model and data do you intend to use?
Preliminary Report 3: A fixed-up version of Report 2, plus a description of your data:
1. Define your variables.
2. Describe the data. Are they categorical, ordinal, continuous…
3. Describe your data sources.
4. Summarize the data using simple summary-stats and graphs.
5. Write your hypotheses and speculate on what you expect to find and why (i.e. what are the expected effects of each variable?)
Rough Draft: A fully written paper, submitted for peer review
1. Treat this as though it were your final submission.
2. Perform the data analysis.
3. Explain to the reader why you did what you did.
4. What are your estimation results? Conclusions?
5. Your paper will be reviewed, anonymously, by your peers for constructive feedback, so only put your name on a cover sheet.
Referee’s Report
1. Summarize in your own words a classmate’s paper
2. Offer criticisms and suggestions for improvement
3. Grade the paper
Final Report: Final Submission
This final submission will be a finished, polished research paper.
1. You should stress your conclusions,
2. Explain how they compare with previous studies’ results, and
3. Discuss how your results have any implications for theory or policy.
Preliminary Report 1: Paper Topics
You can research whatever topic you want, as long as you research it statistically, using the tools
presented in this class. Coming up with a paper topic that is doable is harder than it seems. It is
easy to come up with interesting research questions; it is much more difficult to find the data you
would need to answer those questions.
Here are some preapproved topics:
Micro
1. The correlation between sexual orientation and wages (Are LGBT discriminated
against?)
2. What is the relationship between alcohol consumption and personal income?
Macro
3. The correlation between economic freedom and economic growth/income? (Are freer
countries richer? Do they grow quicker?)
4. What is the correlation between corruption and economic growth/income?
5. What is the relationship between gender equality and economic growth/incomes? (Is it
bad for the economy to discriminate against women?)
6. What is the relationship between income inequality and economic growth/income?
7. What is the relationship between pollution and economic growth/income?
Finance
8. What is the relationship between economic freedom and stock market returns/volatility?
9. What is the relationship between corruption and stock market returns/volatility?
Some suggestions:
Stay away from sports topics. Baseball has tons of data, but it is not provided in a format
that is amenable to regression analysis.
Don’t do country-specific macroeconomic studies. If you’re going to do macro, stick with
cross-country research questions. Ex: do countries with high inflation rates have lower
rates of long-term growth? Rather than, did the US experience higher growth when it had
higher inflation rates?
Don’t do a study on drugs. It’s a fun topic to think about, but the data are much too hard
to find.
Flip through your old textbooks, and remind yourself what questions you had. For
example, when in an environmental econ class, did you ever wonder how GDP varies
with gasoline prices? Or what the correlation is between unleaded and diesel prices?
Keep it simple: what is the relationship between X and Y. (We’ll have ample opportunity
to complicate things later.)
Meet with me to discuss your ideas. You should come prepared to our meeting with ten
research questions.
After we have discussed your paper ideas, you should pick one. Think about what your X
and Y variables will be, as well as what these factors (in “holding other factors constant”)
might be. Fill out and turn in the sheet on the next page.
Preliminary Report 1: Paper Topic
Name:______________________________ Date: ______________________________
My paper will investigate the following hypothesis: ____________________________________
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________
It is an interesting and relevant question because: ____________________________________
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________
My paper will investigate the effect of X on Y, holding other things constant.
Here is what my variables will likely consist of:
Y (my one dependent variable) = _____________________________________
X1 (my main independent variable) = ___________________________________
X2 (other variables that might affect Y) = ___________________________________
X3 (other variables that might affect Y) = ___________________________________
X4 (other variables that might affect Y) = ___________________________________
X5 (other variables that might affect Y) = ___________________________________
X6 (other variables that might affect Y) = ___________________________________
X7 (other variables that might affect Y) = ___________________________________
X8 (other variables that might affect Y) = ___________________________________
X9 (other variables that might affect Y) = ___________________________________
X10 (other variables that might affect Y) = ___________________________________
You don’t need to have all X2-X10 filled out, but you should have some idea of what these
“other factors” might be that affect Y.
Preliminary Report 2: The Literature Review
Not to be confused with a book review, a literature review surveys scholarly articles (and
sometimes books) relevant to a particular area of research and provides a description, summary,
and thematic grouping of each work. The purpose is to offer an overview of significant literature
published on a topic.
You can find relevant articles using:
1. EconLit – a database of all economics articles, which you can access through the Monroe
Library website.
2. Google Scholar – provides a broader search, beyond strictly economics articles.
3. The bibliographies from papers that are already in your lit review
4. Web of Science’s “cited reference search”. This allows you to do a bibliography going
forward in time; that is, it gives you a list of all the papers that have cited a particular
paper. Think of it as a reverse bibliography.
You will notice that each academic article follows a specified format. Usually, they motivate the
research in the first couple of paragraphs. That is, they explain why the topic is of interest. Then
they dive right into a literature review. This is their attempt to bring the reader up to speed on the
literature. (It also points out areas where the current research is lacking, and where their research
will fill those gaps.)
These papers’ literature reviews will be a second source of relevant articles. If they mention an
article, you should probably read that article, too. So the bibliography provides a third place to
find articles to cite.
How many articles should you cite? It’s hard to say. You just have to be thorough. Some topics
have been studied at length, so a proper literature review will survey 15-30 articles. Other
topics have received little attention, so you can only cite 10 or so articles.
Components
Literature reviews should comprise the following elements:
An overview of the subject, issue or theory under consideration, along with the objectives
of the literature review
Division of works under review into categories (e.g. those in support of a particular
position, those against, and those offering alternative theses entirely)
Explanation of how each work is similar to and how it varies from the others
Conclusions as to which pieces are best considered in their argument, are most
convincing of their opinions, and make the greatest contribution to the understanding and
development of their area of research
The purpose of a literature review is to:
Place each work in the context of its contribution to the understanding of the subject
under review
Describe the relationship of each work to the others under consideration
Identify areas of prior scholarship to prevent duplication of effort
Point the way forward for further research
Again, please use the articles’ own literature reviews as templates for your own reviews. Don’t
copy, but just follow their style of writing.
Speaking of style of writing: you must write in a professional, academic tone. There is no room
for slang, or even “I will” or “I think”, etc… Follow the writing style of the articles you’ve read.
You’ll find it useful to outline your lit review before you begin writing. This way, you’ll have a
well-defined structure that organizes the papers thematically.
Your first paragraph should briefly introduce the research question, and motivate the reader that
the question is important and worth answering. Make sure you have a thesis statement that
indicates what your research will investigate.
Do not just have one paragraph-per-paper: A said X, B said Y, C agreed with X, and D agreed
with X. I could just read each paper’s abstract for that. Rather, what I need from you, is to help
me relate the papers to each other. Group the papers together in terms of their results, or
approaches, or statistical techniques, or some other such theme that helps your reader understand
the overall outlines of the debate.
If you quote, you must cite the page number. If you have a quote that runs more
than four lines, you must use a “block” quote format. This is where you indent a
half-inch from both the left and the right margins.
Try to conform to the citation style described on the course Blackboard site, in the
Literature Review folder under Course Materials.
Google Scholar now has a feature which actually generates properly formatted references in all
the major styles: MLA, APA, and Chicago. Use it!
In your References section, each entry should use a “hanging indent” such as:
Lee, S. H., Levendis, J., & Gutierrez, L. (2012). Telecommunications and economic growth: An
empirical analysis of sub-Saharan Africa. Applied Economics, 44(4), 461-469.
Google how to do this in your specific word processor. The lit review in that paper is pretty
good, btw.
You may also want to read the following:
Denney, Andrew S. and Tewksbury, Richard (2013). How to write a literature review. Journal of
Criminal Justice Education, 24(2): 218-234.
You should consult the sample paper for formatting, and general writing style. D.
McCloskey’s Economical Writing is a great guide on how to write for an audience of
academic economists. And don’t be afraid to go to the WAC lab for an extra pair of
proofreading eyes.
GRADING RUBRIC FOR LITERATURE REVIEW
Paper authors:
Columns: Criteria and qualities, Poor (1), Good (2), Excellent (3), Point Value, Scaling factor

Introduction: the Problem statement (Scaling factor: 5%)
  Poor (1): Vague reference is made to the topic to be examined.
  Good (2): Readers are aware of the overall problem or topic to be examined.
  Excellent (3): The topic is introduced, and its relevance is explained.
  Point Value: ____

Articles: appropriateness of sources (Scaling factor: 20%)
  Poor (1): Information is gathered from limited, dubious, sources.
  Good (2): Information is gathered from a few sources, some of dubious quality.
  Excellent (3): Information is gathered from multiple, research-based sources.
  Point Value: ____

Articles: appropriateness of content (Scaling factor: 10%)
  Poor (1): Major sections of pertinent content have been neglected or greatly run-on. The topic is of little significance to the field.
  Good (2): Some discussion of broader scholarly literature.
  Excellent (3): The appropriate content is covered in depth without being redundant. Sources are cited when specific statements are made. Relevance and significance are unquestionable.
  Point Value: ____

Articles: Balanced viewpoint (Scaling factor: 10%)
  Poor (1): Presents only one answer to the research question.
  Good (2): Some discussion of alternative viewpoints, but heavily favors one side.
  Excellent (3): Objective, balanced viewpoint from various perspectives.
  Point Value: ____

Conclusion: Synthesis of literature's findings and research question (Scaling factor: 10%)
  Poor (1): There is no indication the author tried to synthesize the information or make a conclusion based on the literature under review. No hypothesis or research question is provided.
  Good (2): The author provides concluding remarks that show an analysis and synthesis of ideas occurred. Some of the conclusions, however, were not supported in the body of the report. The hypothesis or research question is stated.
  Excellent (3): The author was able to make succinct and precise conclusions based on the review. Insights into the problem are appropriate. Conclusions and the hypothesis or research question are strongly supported in the report.
  Point Value: ____

Presentation

Grammar (Scaling factor: 20%)
  Poor (1): Four or more spelling and/or grammatical mistakes per page.
  Good (2): Two or three spelling and/or grammatical mistakes per page.
  Excellent (3): One or fewer spelling and/or grammatical mistakes per page.
  Point Value: ____

Clarity of writing and writing technique (Scaling factor: 10%)
  Poor (1): It is hard to know what the writer is trying to express. Writing is convoluted.
  Good (2): Writing is generally clear, but unnecessary words are occasionally used. Meaning is sometimes hidden. Paragraph or sentence structure is too repetitive.
  Excellent (3): Writing is crisp, clear, and succinct. The use of pronouns, modifiers, and parallel construction is appropriate.
  Point Value: ____

Coherence and structure (Scaling factor: 10%)
  Poor (1): Poorly conceptualized, haphazard.
  Good (2): Some coherent structure.
  Excellent (3): Well developed, coherent.
  Point Value: ____

Citation format (Scaling factor: 5%)
  Poor (1): Citations for statements included in the report were not present, or references which were included were not found in the text.
  Good (2): Citations within the body of the report and a corresponding reference list were presented. Some formatting problems exist, or components were missing.
  Excellent (3): All needed citations were included in the report. References matched the citations, and all were encoded in the appropriate format.
  Point Value: ____

Lagniappe

Timeliness (10% reduction per day late)
  Poor (1): Material was submitted more than one class late.
  Good (2): Material was submitted up to one class late.
  Excellent (3): Material is submitted on time.
  Point Value: ____

Additional Comments
Final Score:
As a percent:
The Data Section
Please do not put off working on the data too late. You should consider availability of data
from the beginning of the project.
If your paper is microeconomic in nature, some useful sources of data are:
National Longitudinal Study of Youth
Panel Study of Income Dynamics
Current Population Survey
General Social Survey
Macro data are easier to find. Check out:
World Bank. The Stata command, wbopendata, downloads these data directly into Stata.
International Monetary Fund
FRED (The Federal Reserve’s database). The Stata command, freduse, downloads these
data directly into Stata.
Penn World Tables. This dataset is very good for cross-country comparisons. You can
also access these data from freduse.
Several macro databases, in Stata format no less, can be accessed through
http://graduateinstitute.ch/md4Stata/datasets.html. This includes the Penn World Tables.
If you download from here, or from wbopendata or freduse, you will not have to
convert your data to Stata format; your life will be easier.
A useful portal to find data is: http://www.oswego.edu/~economic/data.htm. It includes dozens
of links to data sources.
In getting your dataset in order, try to do all your work in one do-file. This do-file should include
everything from loading your original datasets, and the commands for making any alterations to
them.
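As a rough sketch of what such a do-file might look like, the skeleton below opens a log, loads a dataset, makes one alteration, and saves the result. The file names (project_data.log, project_clean.dta) are made up for illustration, and Stata's built-in auto dataset stands in for your own original data.

```stata
* project_data.do: hypothetical skeleton; all file names are placeholders
capture log close                       // close any log left open by an earlier run
log using project_data.log, replace text

* Load the original data (the built-in auto dataset stands in for yours)
sysuse auto, clear

* Record each alteration as a command so the whole process can be rerun
generate lnprice = ln(price)
label variable lnprice "Natural log of price"

* Save the altered data under a new name, leaving the original untouched
save project_clean.dta, replace

log close
```

Rerunning this one file from top to bottom should reproduce the working dataset from the original data.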
Some useful commands are:
destring (this command converts a text variable to numeric. This is especially useful
if your data originally came in Excel format, and you copy-and-pasted into Stata.)
xpose (transpose. It rotates your dataset so that columns become rows and rows become
columns)
merge (to splice together two different datasets.)
collapse (helps you calculate averages and sums. For example, you might have data
for 100 countries, each with 10 years. This lets you calculate the 10-year average for each
country.)
reshape (useful if you have panel data and you want to switch between long and wide
format)
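The commands listed above can be tried out on a tiny example. The sketch below uses made-up numbers for two hypothetical countries ("Alpha" and "Beta"); the variable names are assumptions chosen for illustration, and the block is meant to be run as a do-file.

```stata
* Made-up country-year data; gdp_txt is stored as text on purpose
clear
input str10 country int year str8 gdp_txt
"Alpha" 2001 "10.5"
"Alpha" 2002 "11.0"
"Beta"  2001 "20.0"
"Beta"  2002 "22.0"
end

* destring: convert the text variable to a numeric one
destring gdp_txt, generate(gdp)
drop gdp_txt

* reshape: long to wide (one row per country), then back to long
reshape wide gdp, i(country) j(year)
reshape long gdp, i(country) j(year)

* Keep a copy of the country-year panel in a temporary file
tempfile panel
save `panel'

* collapse: one row per country holding the average over years
collapse (mean) mean_gdp = gdp, by(country)

* merge: splice the country averages back onto the country-year panel
merge 1:m country using `panel'
tabulate _merge
list country year gdp mean_gdp
```

The xpose command is left out of the sketch because transposing discards string variables such as country names; see its help file before using it.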
Also, Dr. Mehmet Dicle and I wrote a Stata command to simplify some of the tedium of
renaming countries so that they share the same standard names across datasets. You can install
this command by typing into the command line:
.net from http://dlacademics.com/Stata
or
.net from http://www.loyno.edu/~mfdicle/Stata
then clicking on “countrynames,” and then clicking on “(click here to install).”
The data section of the paper describes the sources of the data that you will be using. Your
readers should be able to find the same data you used pretty easily after reading this section.
It is common to provide a table of summary statistics, too. (Remember the summarize
command in Stata.) If you do this, remember that readability is important for your reader: keep
the number of decimal places to the necessary minimum.
Variable            Obs         Mean    Std. Dev.       Min          Max
lnPRICE           33808        11.97         0.65      4.25        15.15
PRICE             33808    194134.40    152564.30     70.00   3800000.00
elevation (ft)    25786         9.08        26.60    -13.80       354.70
dist_mi           25786        17.17        14.40      0.18        78.83
bdrms             33808         3.28         0.77      0.00        10.00
hbaths            33808         0.28         0.48      0.00         5.00
fbaths            33804         2.01         0.66      0.00         9.00
age               33158         2.72         2.56      0.00        20.00
condition         33807         3.32         1.23      0.00         5.00
LotArea           28909       186.72      1000.76      3.39     75020.46
LivingArea        33739        19.16         8.18      0.01       131.76
FoundRaised       33808         0.21         0.41      0.00         1.00
percent white     25786         0.71         0.28      0.00         0.99
PostK             33808         0.33         0.47      0.00         1.00
lnMHI             25775         3.74         0.37      1.76         4.98
MHI               25775        45.08        16.61      5.82       146.16
Table 2: Summary Statistics
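One possible way to produce a table like this with a fixed number of decimal places is Stata's tabstat command. The sketch below uses the auto dataset that ships with Stata as a stand-in; the variables are not the ones in the table above.

```stata
* Summary statistics with two decimal places, using Stata's example data
sysuse auto, clear
tabstat price mpg weight, statistics(n mean sd min max) ///
    columns(statistics) format(%9.2f)
```

The output still needs a title, a number, and descriptive variable names before it goes into your paper.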
If data are categorical, or ordinal, explain this. For example, property rights are often reported
on a 1-5 scale. Be sure to mention this, and also explain whether a score of 5 means that there are
good property rights or poor ones. Age is probably a continuous variable. But sometimes it is
lumpy, so that age is reported in decades. If so, let your reader know.
Keep the number of decimal places to a minimum. Three decimal places are usually more than enough.
It is often useful to provide a scatterplot of the main variables that you will be focusing on. For
example, your paper might focus on the effects of property rights on per capita GDP. You will,
of course, have to control for other variables, such as education, political rights, or whatever. But
presuming the main variables you are interested in are property rights and pcGDP, it is not a bad
idea to provide a scatterplot, and even maybe a simple regression line overlaid on the scatterplot.
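A scatterplot with a regression line overlaid might be drawn as follows. This is only a sketch using Stata's built-in auto dataset; price and weight stand in for whatever your two main variables are.

```stata
* Scatterplot of two main variables with a fitted regression line
sysuse auto, clear
twoway (scatter price weight) (lfit price weight), ///
    ytitle("Price (USD)") xtitle("Weight (lbs)") legend(off)
```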
A similar and powerful way of making the case that X affects Y is to provide two summary
tables that compare two groups. For example, suppose that X is TaxRates and Y is GDP. Then it
might be presumed that higher tax rates correlate to lower GDP. So, split your dataset into two
groups, one with low tax rates and another with higher ones, and report summary statistics for
each. You should be able to tell at a glance whether GDP is higher in the low-tax countries. You
will also be able to tell how high- and low-tax countries differ regarding all the other X variables,
too.
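The two-group comparison could be sketched as below. The example splits Stata's built-in auto dataset at the median of weight; in your paper the splitting variable would be your own X (tax rates, say), and the group labels here are made up for illustration.

```stata
* Split the sample at the median of one variable and compare the groups
sysuse auto, clear
summarize weight, detail
generate byte heavy = weight > r(p50) if !missing(weight)
label define heavylbl 0 "Below-median weight" 1 "Above-median weight"
label values heavy heavylbl

* Summary statistics for the other variables, reported separately by group
tabstat price mpg length, by(heavy) statistics(n mean sd) format(%9.2f)
```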
As always, use as templates the articles you’ve read in your literature review.
And remember, this is an important part of your paper. Your tables should be formatted to
look like ones in the academic journals you are reading. You’ll have to cut and paste from
Stata into Excel, and then format the tables before they are presentable. Tables should
have titles and be numbered (Table 1: Description of the Data; Table 2: Data Sources,
etc…).
You don’t have to report regression results at this stage. That comes later.
This new portion of your paper should be approximately 2-4 pages long. Not much writing
is involved, but be sure to include a revised intro and literature review. And don’t be afraid
to go to the WAC lab for an extra pair of proofreading eyes.
GRADING RUBRIC FOR DATA SECTION
Paper author:
Criteria and qualities are rated Poor (1), Good (2), or Excellent (3). Each criterion lists its scaling factor; the point value is filled in by the grader.
Introduction & Literature Review (scaling factor: 10%)
  Poor (1): The introduction and literature review were barely revised.
  Good (2): Some revisions were made.
  Excellent (3): Revisions were thorough and appropriate.
Appropriateness of data sources (scaling factor: 20%)
  Poor (1): Most data are gathered from dubious sources.
  Good (2): Some data are gathered from dubious sources.
  Excellent (3): All data are gathered from reputable sources.
Appropriateness of dataset (scaling factor: 5%)
  Poor (1): No explanation of the strengths and limitations of your dataset.
  Good (2): Some discussion of the strengths and limitations of your dataset.
  Excellent (3): You explained the strengths and limitations of your dataset.
Appropriateness of dataset: number of obs. (scaling factor: 10%)
  Poor (1): You had too few observations (less than 20 per variable).
  Good (2): You had between 20 and 30 observations per variable.
  Excellent (3): You had more than 30 observations per variable.
Appropriateness of dataset: variable relevance (scaling factor: 20%)
  Poor (1): Your dataset contains many variables with limited relevance to the research question.
  Good (2): Your dataset contains some variables of limited relevance.
  Excellent (3): Your variables are appropriate to the research question.
Explanation of the variables (definition of terms, nominal vs real, units used, categorical vs ordinal, etc…) (scaling factor: 10%)
  Poor (1): It is not clear you understand what your data are about.
  Good (2): You have some understanding of your data, and are able to impart some of that to your reader.
  Excellent (3): You have a clear understanding of your data. Your explanation of the variables is also clear.
Graphics and Scatterplots (scaling factor: 10%)
  Poor (1): No extra graphs were provided.
  Good (2): Scatterplots of the main variables of interest were provided, but many were extraneous.
  Excellent (3): Scatterplots were provided, and were appropriate to the question at hand.
Table of Summary Statistics: Thoroughness (scaling factor: 10%)
  Poor (1): Your table includes few summary statistics.
  Good (2): Your table includes summary statistics.
  Excellent (3): Table includes the appropriate summary statistics, and comparisons across groups.
Table of Summary Statistics: Presentation (scaling factor: 5%)
  Poor (1): Your table was more like a cut-and-paste from Stata.
  Good (2): Your table was formatted, somewhat, but could stand improvement.
  Excellent (3): Your table was appropriately titled; you use descriptive variable names; your table was readable (without too much meaningless detail).
Lagniappe
Lateness (10% reduction per day late)
  Poor (1): Material was submitted more than one class late.
  Good (2): Material was submitted up to one class late.
  Excellent (3): Material is submitted on time.
Additional Comments:
Final Score:
As a percent:
Drafts
Please treat your draft as though it were your final submission. That’ll allow me to make
more substantive comments on your paper. It would be a waste to have me tell you to check your
punctuation and grammar; to tell you to format, number and title your tables; etc…
Be kind to your readers and use descriptive variable names. It is ok to have ED2LJJK as a
variable name in your do-file, but it’s not very informative for your readers. Rename your
variables so that your readers can easily tell what the variables mean.
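Renaming and labeling take one line each in Stata. In this sketch, rep78 from the built-in auto dataset plays the role of a cryptic variable name; the new name and label text are only illustrative.

```stata
* Give a cryptic variable a descriptive name and label
sysuse auto, clear
rename rep78 repair_record
label variable repair_record "Repair record in 1978 (1-5 scale)"
describe repair_record
```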
Be kind to your readers and summarize regression results in tables. Don’t provide scores and
scores of regression results. Rather, you should focus on your main findings and can summarize
your results in one large table. If you used a testing-down procedure, for example, you could
show the sequence of regressions in a table as follows:
Table 3: Regressions on ln(wages): Testing Down
                  (1)        (2)        (3)
Age             32***      23***       30**
              (0.003)    (0.005)     (0.04)
Education         456        555       644*
               (0.30)     (0.20)     (0.03)
Female            100
               (0.35)
Hair color        42*         50
               (0.08)     (0.25)
Constant          100        120         90
               (0.20)     (0.25)     (0.15)
N                 100        100        100
Adj-R2           0.65       0.70       0.80
Note: *** indicates the coefficient is significant at the 0.01 level; ** at
the 0.05 level; * at the 0.10 level. P-values are in parentheses.
If you did a testing-up procedure, then you would have tons and tons of regressions. Don’t show
these. Rather, put your “final” regression as column (1). Then, in columns (2) and (3), show what
happens when you throw in an additional variable. In the body of the paper, discuss the results
from (1). Columns (2) and (3) will show the reader how robust your results are to including some
other variables. One or two sentences in this regard should suffice.
Format your tables so that they can be understood without referring to the body of the paper.
Don’t report too many decimal places. If you require many decimal places, perhaps you can
redefine your variable (income in thousands rather than in dollars).
Some students find Stata’s outreg command simplifies the construction of tables like the one
above.
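If you would rather not install anything, Stata's built-in estimates store and estimates table commands give a rough version of the same side-by-side layout. The sketch below uses the auto dataset and an arbitrary sequence of three regressions purely for illustration; it is not a real testing-down exercise.

```stata
* Store three regressions and show them side by side
sysuse auto, clear
regress price weight mpg foreign
estimates store m1
regress price weight mpg
estimates store m2
regress price weight
estimates store m3

* Coefficients with significance stars, plus N and adjusted R-squared
estimates table m1 m2 m3, b(%9.2f) star(.10 .05 .01) stats(N r2_a)
```

The star() option here matches the 0.10, 0.05 and 0.01 levels used in the note to Table 3. You would still need to move the output into Excel or Word to format it properly.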
The conclusion is usually brief. One or two paragraphs. Restate your research question, give the
high points of the literature review, and restate your conclusion and why it is important.
Consider going to the WAC lab or having a friend proofread your paper.
You should consult the sample paper for formatting, and general writing style. As always,
consult D. McCloskey’s Economical Writing! And don’t be afraid to go to the WAC lab for an
extra pair of proofreading eyes. Did I mention that you should proofread your work?!
Please turn in three copies. Your name should appear only on a detachable cover page; this
page should only include your name and the paper’s title. Please include the paper’s title on the
first page of the body of the paper as well.
GRADING RUBRIC FOR DRAFT/FINAL PAPER
Paper title:
Paper author:
Referee:
Criteria and qualities are rated Poor (1), Good (2), or Excellent (3). Each criterion lists its scaling factor; the point value is filled in by the grader.
Introduction (scaling factor: 5%)
  Poor (1): Does not convince the reader that the issue is important. The intro either wanders or is too short.
  Good (2): Decent introduction. Could use some improvements.
  Excellent (3): Convinces the reader that the issue is worth investigating.
Literature review (scaling factor: 10%)
  Poor (1): Scattered treatment of disconnected sources. Little connection to your data analysis.
  Good (2): Sources/papers are listed one after the other. The papers raise issues that are not addressed by you in the analysis section.
  Excellent (3): The sources are integrated. They relate to the question at hand, and with your statistical treatment of the research question.
Data section: variables (scaling factor: 10%)
  Poor (1): Your dataset contains many variables with limited relevance to the research question. Most data are gathered from dubious sources.
  Good (2): Some variables are included which are unjustified; other relevant ones are excluded. Some data are gathered from dubious sources.
  Excellent (3): All, and only, relevant variables are included. All data are gathered from reputable sources.
Data section: text (scaling factor: 5%)
  Poor (1): Variables are unexplained. Strengths and limitations of your dataset are not discussed.
  Good (2): Variable definitions, sources, strengths and weaknesses are somewhat explained.
  Excellent (3): You explain clearly what each variable (and its abbreviation) means. You explained the strengths and limitations of your dataset.
Data section: graphs and tables (scaling factor: 10%)
  Poor (1): No extra graphs or summary tables were provided. Or, there were way too many or way too few of them. Tables and graphs not formatted.
  Good (2): Scatterplots of the main variables of interest were provided, but many were extraneous. Tables and graphs could benefit from formatting.
  Excellent (3): Targeted use of graphs and tables. They are appropriate to the question at hand. They are properly formatted.
Model specification: non-linear effects (squared variables, polynomials, etc.) (scaling factor: 5%)
  Poor (1): What’re polynomials?
  Good (2): Used a polynomial, but without justification, or without seeming to know why.
  Excellent (3): You don’t need to use polynomials, but you should at least consider their use. Explain why you did or did not use them.
Consideration given to model specification: interaction effects (scaling factor: 10%)
  Poor (1): What’re interaction effects?
  Good (2): You interacted some stuff, but without much justification.
  Excellent (3): You used interactions appropriately and interpreted the coefficients properly.
Heteroskedasticity/autocorrelation tested and accounted for (scaling factor: 5%)
  Poor (1): What are heteroskedasticity or autocorrelation?
  Good (2): You have an incomplete grasp of the problem of heteroskedasticity and/or autocorrelation.
  Excellent (3): You tested for both and, where appropriate, used robust standard errors.
Model specification: log or un-logged variables, as appropriate (scaling factor: 10%)
  Poor (1): Linear model with no justification for the choice.
  Good (2): Discussion on a priori grounds for log v no-log.
  Excellent (3): Full explanation and statistical justification for log v no-log.
Appropriate use of dummy variables (scaling factor: 10%)
  Poor (1): What are dummy variables?
  Good (2): Dummies used when continuous variables would have been better.
  Excellent (3): Dummies are used where appropriate.
Table of regression output (scaling factor: 5%)
  Poor (1): Cut and paste from Stata.
  Good (2): Could use some more formatting.
  Excellent (3): Are properly titled, formatted, and professional looking. No extraneous decimals. Units are appropriate.
Interpretation of coefficients (scaling factor: 10%)
  Poor (1): Very little comment on the coefficients.
  Good (2): Focus only on the signs of the coefficients.
  Excellent (3): Coefficients are properly interpreted. Explain not only the signs of the coefficients, but the numbers as well.
Conclusions (scaling factor: 5%)
  Poor (1): Poorly written.
  Good (2): Could use improvements.
  Excellent (3): Restate the problem, why it is interesting, and how your paper addressed the outstanding questions of the literature.
Bonus:
Model specification: RESET test (10% bonus)
  Ramsey’s RESET test was used appropriately. The procedure was explained. The final model passes the test.
Instrumental variables (10% bonus)
  Instrumental variables were used when appropriate. The procedure was explained.
Deductions:
Grammar (2% reduction per grammatical error)
  By this stage in the game, your paper should have been read and reread, so that there are no grammatical mistakes whatsoever.
Lateness (10% reduction per day late)
  Grades are due to the registrar’s office two days after the end of exams.
Additional comments:
Final score: _____________________________
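As a reminder of what the diagnostic items in this rubric look like in practice, here is a minimal sketch using Stata's built-in auto dataset. The regression itself is arbitrary and only for illustration. Autocorrelation tests are not shown, because they require time-series data declared with tsset.

```stata
* Fit a model, then run the post-estimation diagnostics
sysuse auto, clear
regress price weight mpg foreign

* Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
estat hettest

* Ramsey RESET test for omitted nonlinear terms
estat ovtest

* Re-estimate with heteroskedasticity-robust standard errors
regress price weight mpg foreign, vce(robust)
```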
Referee Reports
Write a 2-3 page referee report summarizing the contribution of the paper, its key weaknesses
and how these problems might be addressed, portions of the paper that might be strengthened,
expanded, shortened, etc. Be clear about the exact revisions required.
It is important that the referee provide the author with useful feedback on his manuscript.
The most important part of the referee report is your critical analysis of the paper. Exactly what
you do will depend on the paper and its content. There is no single checklist, but below are some
questions that you might ask of the paper. The list is not exhaustive. In addition, the paper you
are refereeing might not require answers to all of the questions.
Most important is the answer to the question: Does the paper accomplish what it set out to do?
Other possible questions to ask are the following:
If the paper contains theory (explicit or implicit), does it hold up upon closer scrutiny? Is
there an alternative theory that is better suited that the author has ignored?
Is the paper convincing? If not, why not?
Are the appropriate econometric tools being used? Is an inappropriate model being used?
(For example, you might think endogeneity is a problem but the author estimated a 1-equation
model. Or, you might think that a logit model would have been more appropriate.)
Are the coefficients of interest properly identified?
Should the author subject the empirical work to robustness tests?
Would the reader be better served by some simple tabulations or simple means and
standard deviations or by some graphical presentation?
Are the tables and figures self-contained and easy to comprehend, and could you, if
needed, replicate them?
Is there a relevant and important literature that the author does not cite and/or use when it
should be cited and/or used?
What do you as the referee suggest that the author do to correct the errors of commission
or omission that you have found?
In writing your report you can consult additional literature. A quick lit review using JSTOR or
EconLit is a good idea. This way you can verify whether the author has undertaken a thorough
literature review and has understood the topic before trying to make the new contribution.
Also, include your assessment of the paper, using the draft/final-paper rubric, which you
can find above. For each item, multiply the given points by their weight (say 2pts * 0.10 = 0.2
pts) and then add up all these weighted points. This is the score of the paper, on a scale of 0-3. If
you want, divide by 3 to get the percentage score. Turn in two of these (one for me, and one for
the paper’s author.)
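The weighted-score arithmetic can be done in Stata itself. In the sketch below the point values (1, 2 or 3) are hypothetical, and the weights are the thirteen scaling factors from the draft/final-paper rubric, in rubric order. Bonuses and deductions are left out.

```stata
* Hypothetical points times the rubric's scaling factors, in rubric order:
* intro, lit review, data variables, data text, graphs and tables,
* non-linear effects, interactions, heteroskedasticity, logs, dummies,
* regression table, interpretation, conclusions
scalar score = 2*0.05 + 3*0.10 + 2*0.10 + 3*0.05 + 2*0.10 ///
             + 1*0.05 + 2*0.10 + 3*0.05 + 2*0.10 + 3*0.10 ///
             + 2*0.05 + 2*0.10 + 3*0.05

display "Weighted score (out of 3): " %4.2f score
display "Percentage score: " %5.1f 100*score/3
```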
The first page of your referee’s report should be a detachable cover sheet which includes
your name and the title of the paper you refereed. Please turn in two copies for each paper
you have refereed. Use a stapler.
Structure of a Referee Report
A referee report is usually organized as:
Referee Report
Title of the Paper You are Reviewing
A. Summary:
When you summarize the paper, without evaluation, write neutrally as you might
if you were recording information for yourself or for a member of a research team
that you were working in. The key is that this part of your report is like notes that
you would put in your files to answer the question: "what did the author of this
paper view himself as doing?"
This should be about ½ to 1 page.
B. Evaluation
1. Larger issues:
Did the author accomplish what he set out to? Is the survey of the
literature complete? Does it help conceptualize the literature, or is it
simply bullet points? Are there weaknesses with the data? With the
argument? With the statistics? Are the tables properly formatted, and even
more importantly, are they readable? Are the results (the coefficients)
properly interpreted? Are the results explained in an accessible way?
Etc…
2. Smaller Issues (by page):
Point out areas with spelling and grammatical problems, and areas where the
writing was unclear.