Stata project: choose one topic and create an analysis


Question Description

You must have Stata software. This is a large project (about 20-25 pages, including text, tables, charts, and graphs).

Choose one topic you are interested in and begin.

The full outline is provided in the document and PDF.

Data Sources

Stata Data or Direct Access
• Federal Reserve Economic Data (see help freduse in Stata)
• World Bank (see help wbopendata in Stata; see help worldstat in Stata)
• Worldwide Governance Indicators
• Australian Bureau of Statistics (RADL; registration required)
• Haver Analytics (subscription required; see help haver in Stata)

Other Sources
• US Government’s Open Data
• World Economic Forum
• US Department of Commerce Bureau of Economic Analysis
• EconData.Net
• Federal Reserve Archival System for Economic Research
• Inter-university Consortium for Political and Social Research (ICPSR)
• Correlates of War Project
• Development Research Institute
• International Monetary Fund
• Penn World Table
• United Nations
• United Nations Conference on Trade and Development
• United States Government Printing Office
• United States Census

Medical Sources
• World Health Organization
• Centers for Medicare & Medicaid Services
• Centers for Disease Control and Prevention
• The HCUP family of administrative longitudinal databases

Your First Stata Project

This guide will take you through the steps of starting a Stata .do file, importing data, and doing some analysis. Along the way, several useful functions will be introduced.

1. Create a folder on your computer that will contain the data, program, log, and other documents related to your project.
2. Download the file called “ACC Basketball Players.csv” from Blackboard. From your browser, right-click on the link, choose the “Save Link As…” option from the context menu, and save it to the folder you created in step 1.
3. Open Stata.
4. From the menu, go to Window > Do-file Editor > New Do-file. Or, hit Ctrl-8.
   a. This is a Do-file, in which you will type commands for Stata to execute.
5. I suggest starting all Do-files with a description of what the file does. Anything on a line that begins with an “*” character will not be read and executed by Stata. Thus, the asterisk can be called a comment character.
   You may want to begin your Do-file with something like:
       * This file is my first Do-file, in which I load data and run a few analyses.
6. Next, I suggest you type the clear command, which will empty Stata’s memory every time you run the Do-file from the beginning. You will also want to change the working directory so that it points to where the data and other project files are stored (use only the cd line that matches your system):
       clear
       cd "C:\users\jerry\desktop\project"   // PC
       cd "~/desktop/project"                // Mac
7. Now, find the file path that points to where you saved “ACC Basketball Players.csv” on your hard drive. It might be something like “C:\Users\dbs9\Desktop\ACC Basketball Players.csv”.
8. You load this into Stata with the command:
       insheet using "C:\Users\dbs9\Desktop\ACC Basketball Players.csv", comma
9. Here, insheet is the command for loading data, and comma tells Stata that the file is a .csv, or comma-separated value, file.
10. At this point, you’ll want to run your Do-file. You can do this in several ways:
   a. On the Do-file editor menu, go to Tools > Do.
   b. Select all of the code you’ve written (Ctrl-A), and then hit the keyboard shortcut for “do,” Ctrl-D.
   c. You can also run one line at a time. If your cursor is on a line of code, or you have selected part of a line or several lines of code, you can “do” that selection alone.
   d. For now, you’ll want to run everything you’ve typed, through the insheet command.
11. Now, add a line with the command list to your Do-file and run that line alone. In the output window of Stata, you’ll see a table with five columns and 75 observations (you may have to click “more” at the bottom of the output window, or hit space to advance to the end of the table).
   a. As you can see, the data lists a subset of ACC men’s basketball players from the 2010-11 season, along with their team, height, weight, and class.
12. Now, add and Do the describe command in your Do-file.
13. The output here tells you the nature of the variables in memory.
   In this case, there are four string variables (that is, variables containing letters, words, phrases, or other text), and one integer variable, wt.
14. Now, we’ll convert our ht variable, in a “ft-in” string format, to an integer variable indicating a height in inches.
15. There are several ways of doing this, but we’re going to use a command called split, which will create new variables from the ht variable:
       split ht, parse("-") gen(height)
16. Here, split is the command, ht is the variable to split, “-” is the character at which the ht string will be split, and height1 and height2 are the new variables created in the process (gen(height) supplies the stub “height” from which these names are built).
17. Add, and Do, a line in your Do-file that says:
       list ht height1 height2
18. This will show the original height variable, plus the new variables you just generated with the original number of feet (height1), and the original number of inches (height2).
19. These new variables are still strings, as you can see in your Variables window, or with the describe command.
20. We need to convert them to a numeric variable type, so we can add them together. One way to do this is with the destring command. You can either use two lines in your Do-file that say:
       destring height1, replace
       destring height2, replace
   Here, replace tells Stata that you want to replace the variables you’re changing, rather than generate new variables with new names. Alternatively, you can use a single line that says:
       destring height*, replace
   Here, the asterisk serves as a wild-card character, telling Stata to run the destring command on any variable whose name begins with “height.”
21. Now, we’ll generate a new variable, which you will be doing a lot in Stata. This variable will convert our feet and inches variables to a total number of inches.
       gen heightinches = height1 * 12 + height2
22. Here gen (or generate) is our command, and heightinches is our new variable, which we define as equal to 12 times our measure of feet, plus extra inches.
   Here, of course, the asterisk is a multiplication operator.
23. Now we can do some analysis. First, let’s calculate some summary statistics for heights.
       sum heightinches
24. The summarize command (abbreviated sum) gives you summary statistics for any variables you list. Here, we are told that the mean height is 77.68 inches, or just under six-and-a-half feet.
25. Now, let’s make a variable that indicates whether a player is taller than the average height.
       gen overmean = 0
       replace overmean = 1 if heightinches > 77.68
26. First, we generated a new variable called overmean that consists entirely of zeroes.
27. Then, we replaced the zero with a one if and only if the heightinches variable is greater than the mean we saw in the summary.
28. Now, we can explore this new variable:
       sum overmean
       table overmean
29. The mean we see in the summary is 0.52, which we can interpret to say that 52% of players in this file are taller than the mean height. This is reinforced in the table of values shown, which indicates that 39 players are taller than the mean, and 36 are not.
30. How does this variable break down by class? We can show a cross-tab with the command:
       table overmean yr
31. The output here shows the breakdown of those over the average height (and not) by class. In all cases, except for Juniors, there are more players above the average height than not.
32. Now, what is the distribution of weights? First, we can summarize player weights with the now-familiar sum command. Then, we can have Stata draw a histogram of weights, which gives us a more complete understanding of the distribution:
       sum wt
       hist wt
33. As the histogram shows, the modal weight is around 200 pounds, and there is a longer tail to the upper end of the weight range.
34. Now for some science. Let’s say you have a daring new hypothesis that taller players will weigh more, due to their larger mass. Let’s explore this hypothesis in several ways.
35. First, have Stata show the average weight of players taller than the mean height, and not taller than the mean height:
       mean wt, over(overmean)
36. As you can see, the mean weight of the taller group of players is greater than the mean weight of the shorter group (and the output suggests that the 95% confidence intervals of the mean estimates do not overlap). Now, can we observe a linear relationship between height and weight? First, let’s examine the data visually:
       twoway scatter wt heightinches
37. The scatter plot puts height in inches on the x-axis as the independent variable, and weight on the y-axis as the response variable, because our hypothesis posits that height influences weight. There looks to be a clear and positive relationship between the two. How well do they correlate?
       corr wt heightinches
38. Stata tells us that the correlation is positive and large, at about 0.80. In a similar vein, we can run a regression predicting weight from height:
       regress wt heightinches
39. With the regress command, the response variable is always listed first, followed by any independent/explanatory variables. The output here tells us that our model explains about 64% of the variance in weight, and that every additional inch of height is associated with, on average, an increase in weight of 6.27 pounds.
40. There is much more that can be done in Stata; this tutorial has only scratched the surface. For further information, run the command “help [command]” where “[command]” is replaced by the command you want to know about, as in “help regress.” This will bring up Stata’s on-line help, which details syntax, use, and examples.
41. I can also recommend several online quick references on Stata:
42. Of course, if you run into a problem, Google almost certainly has an answer, if you are willing to look and make analogies from situations faced by others in the past.
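
The value 77.68 typed by hand in step 25 only holds for this exact file. As a minimal sketch of one alternative (not part of the original tutorial), the mean can be read from the results that summarize stores in r(mean). The five rows entered below are made-up illustration data, not the ACC file, so the block runs on its own:

```stata
* Sketch only: flag above-average heights without typing the mean by hand.
* The rows below are invented for illustration; they are NOT the ACC data.
clear
input str4 ht wt
"6-3"  190
"6-8"  230
"5-11" 175
"6-10" 245
"6-5"  210
end

* Same conversion as in the tutorial: "ft-in" string -> total inches
split ht, parse("-") gen(height)
destring height*, replace
gen heightinches = height1 * 12 + height2

* summarize stores the mean in r(mean); use it instead of a typed constant
quietly summarize heightinches
gen overmean = heightinches > r(mean) if !missing(heightinches)

* Count how many observations fall on each side of the mean
tabulate overmean
```

The if !missing(heightinches) qualifier is there because Stata treats missing numeric values as larger than any number, so without it a player with no recorded height would be flagged as taller than average.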
Here is what your Do-file might look like at the end of this tutorial:

    * This file is my first Do-file, in which I load data and run a few analyses.
    clear
    * Keep only the cd line that matches your system
    cd "C:\users\jerry\desktop\project"   // PC
    cd "~/desktop/project"                // Mac
    insheet using "C:\Users\dbs9\Desktop\ACC Basketball Players.csv", comma
    list
    describe
    split ht, parse("-") gen(height)
    destring height*, replace
    gen heightinches = height1 * 12 + height2
    sum heightinches
    gen overmean = 0
    replace overmean = 1 if heightinches > 77.68
    sum overmean
    table overmean
    table overmean yr
    sum wt
    hist wt
    mean wt, over(overmean)
    twoway scatter wt heightinches
    corr wt heightinches
    regress wt heightinches

Project Report Format

The format for the project report should generally follow:
1) Title page
2) Summary page
   a. Question addressed
   b. Results
   c. Implications
3) Introduction
   a. Why the topic is of interest
   b. How the results will be used
4) Literature review (if done)
5) Description of data used, including sources
6) Description of the model(s) used
7) Results
8) Topics/ideas for further research
9) Conclusion
10) Bibliography (if needed)
11) Appendix
   a. Model output
   b. Other supporting analyses

Project Guidelines

1. Introduction
Explain the issue you are examining and why it is significant.
• Describe the general topic to be studied.
• Explain why this is important to study (e.g., benefits, potential cost savings, increased profits, understanding of markets, etc.).

2. Background/Review of the Literature
A description of what is already known about this area and a short discussion of why further study is needed.
• Summarize what is already known about the topic. Include a summary of the basic background information on the topic gleaned from your literature review (you can include information from class, but the bulk should be outside sources).
• Discuss any studies that have already been done in this area (you can use a source such as to find the latest research).
• Point out why these background studies are insufficient. In other words, what question(s) do they leave unresolved that you would like to study?
• Choose (at least) one of these questions you might like to pursue yourself.

3. Rationale
A description of the questions you are examining and an exploration of the claims.
• List the specific question that you are exploring.
   o Explain how these research questions are related to the larger issues raised in the introduction.
   o Describe what specific claim, hypothesis, and/or model you will evaluate with these questions.
• Explain what it will show about the topic if your hypothesis is confirmed.
• Explain what it will suggest about the topic if your hypothesis is disconfirmed.

4. Data, Method and Analysis
A description of how you would collect data and test the question you are examining. You are not required to come up with a new or original method (though you can try!). Journal articles from the literature review can help you in this area.
• Data: From what source will you get the data needed?
   o Describe the sample you would test and explain why you have chosen this sample.
• Method: Describe the general methodology and/or model you choose for your study in order to test your hypothesis.
   o Explain why this method is the best for your purposes.
   o Controls: What kinds of factors would you need to control for in your study?
   o Describe what types of effects would be likely to occur which would make your results appear to confirm, or to disconfirm, your hypothesis.
• Analysis: How will you analyze and present the results?
   o What kind of results would confirm your hypothesis (you hope to see)?
   o What kind of results would disconfirm your hypothesis (you hope not to see)?

5. Significance and Conclusion
Discuss, in general, how your proposed project would lead to a significant improvement in understanding your topic of interest and how it would benefit users. (In other words, why should someone care? If you were applying for money to do this, why would someone fund you? If you wanted to publish your results, why would they be interesting?)

6. References
Include any references. ...