Stata Project 1

Econ 330: Econometrics
Professor Lemke
Fall 2008

Due: Thursday, September 18

Directions: Everyone must write their own Stata program that produces answers to the following two questions. You may talk to your classmates, but everyone's work must be his or her own. You may find it useful or necessary to learn new Stata commands. The lab write-up is due at the start of class on Sept. 18. At that time, turn in a hard copy of your program and a separate hard copy of your answers to the specific questions. Write your answers extremely clearly, without relying on Stata commands or including unnecessary Stata output.

  1. For the first question, use welfare_edited.dta.

    1. Create a cross-tab of whether the mother is working against the number of children she has. Consider three groups for the number of children: 1, 2 or 3, and 4 or more. Be sure to label any variables you create and label the values of variables that make sense to label so that Stata's output clearly labels everything. Do not include cell, row, or column percentages.
    2. Create a cross-tab of whether the mother is working against the age of the mother's youngest child. Consider three groups for the age of the youngest child: 0 to 2, 3 to 5, and 6 or older. Again, be sure to label any variables you create and the values of those variables so that Stata's output clearly labels everything. Include in the table the percent of mothers who work (and who do not work) for each of the three age groups.
    3. What percent of mothers are working in each standard metropolitan statistical area?
    4. Where are the 10th, 20th, 30th, ... , 90th percentiles in the distribution of the percent of group day care providers that are NAEYC accredited?
    5. By education level, what are the average wage and average weekly hours worked of those who are working?

  2. For the second question, use cps_may_2006_workers.dta, which we created together in class while working through the Introduction to the Data Ferrett laboratory exercise.

    1. Create variable labels for each of the variables. Describe and summarize the variables.
    2. What does it mean that the average value for Sex is 0.48? Convince someone who doesn't know the data set that your interpretation is correct.
    3. How many observations in the data set are associated with earning between $5 and $15 per hour?
    4. Using a single Stata command, what is the average hourly wage earned by individuals in the different education classes?
    5. Using the scatter command, plot wages (y-axis) against sex (x-axis). Include a copy of the graph in your results write-up. Explain why your graph is fairly meaningless.
    6. Suppose what is really called for is a bar chart that reports the average wage by sex. Find a graphing command in Stata that produces this graph. Include the graph in your results write-up. According to your graph, what is the average hourly wage for women? For men?
    7. Suppose that, to really impress your boss, you decide to produce graphs of the entire distribution of hourly wages by sex. (You may recall that such a graph is typically called a histogram.) Find a graphing command in Stata that produces this graph. Include the graph in your results write-up. What stands out in your graph(s) when considering the distribution of wages across sexes?

    In parts 5-7, you are asked to produce graphs. You will receive more points the better your graphs look. In particular, graphs should have titles, appropriately labeled axes and legends, and appropriately numbered axes. Even more detail would be good, such as choosing a good bin size for a histogram, among other things. Extra credit goes to whoever has the best graphs.
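As a rough illustration of the labeling and graph-formatting requests above, and not a solution to either question, the sketch below uses auto.dta, the example data set that ships with Stata, rather than the assignment data. The variable name mpg_group, its cutoffs, and all titles are made up for the example; the options shown are only one of several ways to get a labeled table and a readable graph.

```stata
* Illustration only: auto.dta is Stata's shipped example data, not the assignment data.
sysuse auto, clear

* Collapse a numeric variable into three labeled groups.
* mpg_group and its cutoffs are hypothetical, chosen for this example.
recode mpg (min/19 = 1 "Under 20 mpg")   ///
           (20/24  = 2 "20 to 24 mpg")   ///
           (25/max = 3 "25 mpg or more"), generate(mpg_group)
label variable mpg_group "Fuel economy group"

* Cross-tab with counts only, then the same table with column percentages.
tabulate foreign mpg_group
tabulate foreign mpg_group, column

* Bar chart of a mean by group, with a title, a y-axis title, and bar labels.
graph bar (mean) price, over(foreign)         ///
    title("Average price by car origin")      ///
    ytitle("Mean price (US dollars)")         ///
    blabel(bar, format(%9.0fc))

* Histograms by group with an explicit bin width and an overall title.
histogram price, width(1000) percent                            ///
    by(foreign, title("Distribution of price by car origin"))   ///
    xtitle("Price (US dollars)")
```

The bin width of 1000 is arbitrary here; in practice it is worth trying a few widths and keeping the one that shows the shape of the distribution most clearly.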