Stata - Tips PDF
General Tips
Options
One of the strengths of Stata is its system of options, typed after a comma. If you list your
data, you see the value labels of the variables. To see the underlying value codes rather than
the labels, add the nolabel option after a comma:
Note that you can list the variables in any order that you define.
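As a minimal sketch of the difference (the auto dataset shipped with Stata is an assumption here; the original command is not shown):

```stata
* Assumption: the auto example dataset, in which foreign carries value labels
sysuse auto, clear
* value labels (Domestic / Foreign) are shown by default
list make foreign in 50/55
* nolabel after the comma shows the underlying codes; variables are listed in a different order
list foreign make in 50/55, nolabel
```

The second list command also shows that variables are displayed in whatever order you type them.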
BTW, the comma is a toggle. If used a second time it turns off the options. We could have written
the above command as:
To see the value codes rather than labels, add the nolabel option after a comma:
When you want to leave the Editor, Stata checks that you want to preserve the changes you
made.
https://www.surveydesign.com.au/tips.html 1/114
17/6/2019 Stata | Tips
Do Editor
The Do-file Editor is very handy, invoked from a menu button or by typing
doedit.
• You can enter several lines or insert another file, such as one of your earlier *.do files.
• You can select a line in the Do Editor and Do only that line.
• Or Do from that line to the end of the set of instructions. Or select several lines and Do
them.
• At the end you can save the contents of the Do Editor as a *.do file.
Folders
Type adopath or sysdir to see the location of various Stata folders for your main files,
updates, STB files, personal ado files, etc.
Statalist
There is a Stata list server with useful advice about Stata, including new programs to help with
special problems.
See http://www.stata.com/support/statalist/faq
• The first issue of The Stata Journal was released at the end of 2001.
• The last issue of the Stata Technical Bulletin (#61) was in May 2001.
From within Stata you can see what STB procedures are available and download the ones that
interest you. From the Help menu, select STB and User-written Programs. After that,
choose the hypertext links (clickable blue words in the help window) for the Stata site, then
click on stb
Useful Resources
• The Stata resources web page includes links to free downloadable tutorials:
http://www.stata.com/links/resources1.html
• UCLA graphics page using Stata may be of interest:
http://www.ats.ucla.edu/stat/stata/Library/GraphExamples/default.htm
• Setting up docked windows (Great for learning to set up Stata 9 windows):
http://www.ats.ucla.edu/stat/stata/faq/stata9gui/dockfloatpin.html
• Princeton tutorial for learning Stata:
http://data.princeton.edu/stata/
• Princeton help notes for Stata:
http://dss.princeton.edu/online_help/stats_packages/stata/
• Working with Hilda data - PanelWhiz:
http://www.panelwhiz.eu
• Stata Programming:
http://www.stata.com/meeting/11uk/baum.pdf
Monthly Tips
The contract command reduces the Stata dataset to the variables that you specify
and their frequencies. This may be what you wish to achieve. However, if you only wish to see
these in table form, then using the Stata editor will achieve this with just a few additional
commands.
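A minimal sketch of this idea, assuming the auto dataset and the variables foreign and mpg (the original commands are not shown; the listing further down is consistent with these variables):

```stata
* Assumption: auto dataset, contracting on foreign and mpg
sysuse auto, clear
preserve                      // keep a copy of the full dataset
contract foreign mpg          // one row per foreign/mpg combination, plus _freq
list, sepby(foreign) noobs    // or type edit to view the table in the Data Editor
restore                       // bring the original data back
```

Wrapping contract in preserve and restore means the frequencies can be inspected without losing the data in memory.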
Example 1
The result:
Domestic 29 1
Domestic 30 1
Domestic 34 1
Foreign 14 1
Foreign 17 2
Foreign 18 2
Foreign 21 2
Foreign 23 3
Foreign 24 1
Foreign 25 4
Foreign 26 1
Foreign 28 1
Foreign 30 1
Foreign 31 1
Foreign 35 2
Foreign 41 1
Example 2
The result:
help contract
help bysort
help edit
cond(x,a,b,c)
cond(x,a,b)
Example 1: You wish to fill in a variable with 5, except where the variable a is greater than 5,
in which case it takes the value of a
//generate data
clear
set obs 15
generate a=_n
generate b=cond(a>5,a,5)
list
//(1)
generate c=5
replace c=a if a>5
list
//(2)
generate c1=(a>5)*a
replace c1=5 if a<6
list
//(3)
generate c2=5
replace c2=a if inrange(a,5,.)
list
//(4)
// This option, while doing this in one line, is not as easy to understand as
// that using the cond() function
gen c3=(a>5)*a + (a<=5)*5
list
//(5)
// This can also be done with the max() function eg.
generate c4=max(a,5)
list
Example 3
However where the variable a contains missing values the results from cond() are different from
max():
// generate data
clear
set obs 11
generate a=_n if _n<10
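generate b=cond(a>5,a,5) // missing a counts as greater than 5, so b is missing
generate c=max(a,5) // max() ignores missing values, so c is 5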
list
exit
Example 4
input str10 a
"thsi "
"that "
"the"
"tree"
"these"
end
list
gen b=cond(trim(a)=="thsi","this",trim(a))
list
exit
Example 5
input a
-2
-1
0
1
2
end
list
generate c= ///
cond(a>0, 1, /// greater than zero
cond(a==0, 0, /// equal to zero
cond(a<0, -1, . /// less than zero
)))
list
Example 6
Example 7
Example 8
From the Stata Press book "An Introduction to Stata Programming" by Christopher F. Baum,
Section 3.3.2
generate netmarr2x=cond(marr/divr>2.0, 1, 2)
Example 9
clear
set obs 10
generate x=_n if _n<8
list
generate z=cond(x>5,1,0,.)
list
// the above sets z to 1 where x is missing, as a missing value is always greater than
// any number in the variable. However
// you may wish to have missing where a value in x is missing.
// http://www.stata.com/statalist/archive/2008-02/msg01204.html
// This puts missing values in where they occur in the x variable
generate z1 = cond(missing(x), ., x > 5)
list
// An alternative to the above is to use the 2nd syntax of the cond() function. Note the 3rd term
// in the function does nothing here, so instead of "." any number would have been OK.
generate z2=cond(x,x>5,.,.)
list
Example 10
The Stata Press book "Data Analysis Using Stata" by Ulrich Kohler and Frauke Kreuter (p. 460) shows
how the cond() function can be used to provide a default title for a graph
local title = cond( `"`title'"' == `""', `"`varlist' by `by'"', `"`title'"')
help cond()
Stata Journal Vol 5 No 3
Combinations of variables
From time to time I'm asked how to write a program that uses all combinations of a subset of
variables in an estimation command.
This approach:
program combin
syntax , NUMTot(integer) NUMPick(integer)
quiet {
clear
numlist "1/`numtot'"
set obs `numtot'
gen `bbb'=""
gen `q'=.
forvalues i=1/`=_N'{
local a "`=`con'[`i']'"
drop if !missing(`con1')
duplicates drop `con', force
labels2 //program
} //quiet
end
program labels1
args max_vars
describe, replace
keep name
keep in 1/`max_vars'
encode name, gen(name1)
label save name1 using filename , replace
end
program labels2
do filename
label value a* name1
foreach i of varlist a* {
decode `i', gen(z`i')
}
egen levels=concat(z*), punct(" ")
keep levels
end
input
numtot(#): this is the number of variables from which the "numpick" are taken. The order of
the variables is important as numtot starts at the first variable in the current order
forvalues i = 1/6 {
di "`i'"
regress turn `=levels[`i']'
estimates store a`i'
}
estimates table a* , stats(r2)
exit
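For a small, fixed number of variables the same idea can be sketched with nested loops instead of the combin program. This is a simplified alternative, not the program above; the auto dataset, the four candidate regressors and the model names m1 to m6 are assumptions for illustration:

```stata
* Assumption: auto dataset; all pairs (4 choose 2 = 6) of four candidate regressors
sysuse auto, clear
local vars mpg weight length displacement
local n : word count `vars'
local k = 0
forvalues i = 1/`=`n'-1' {
    forvalues j = `=`i'+1'/`n' {
        local v1 : word `i' of `vars'
        local v2 : word `j' of `vars'
        local ++k
        quietly regress turn `v1' `v2'   // one model per pair of regressors
        estimates store m`k'
    }
}
estimates table m*, stats(r2)
```

Nested loops only scale to picking two or three variables; for arbitrary numpick a generator such as the program above is needed.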
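A quarter circle of radius 1 has area pi/4 and the 1*1 square around it has area 1, so the share of
random points that fall inside the quadrant estimates pi/4. Solving for pi yields:
pi = 4 * (scatter points in quadrant area)/(scatter points in square area)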
clear
set obs 1000
gen x=runiform()
gen y=runiform()
gen height=sqrt(1 - (x)^2)
args n //2
set obs `n'
local x=runiform()
local y=runiform()
local dis=sqrt(`x'*`x'+`y'*`y')
if `dis'<=1 local ++z //5
local ++zz
} //end loop
Notes:
1 - Program called by the simulate command. We called the program "a". Note that it is class r, i.e. it
returns r() values
2 - The args command. When we call the "a" program we also pass to the program the number
of observations required
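The pieces described in the notes can be put together as the following sketch. It is not the original program: the program name pi_mc, the returned scalar pi_hat, the seed and the number of replications are all assumptions, and the points are counted with count rather than in a loop:

```stata
* Hypothetical names: pi_mc, pi_hat
capture program drop pi_mc
program pi_mc, rclass
    args n                                 // number of random points to draw
    drop _all
    quietly set obs `n'
    generate double x = runiform()
    generate double y = runiform()
    quietly count if x^2 + y^2 <= 1        // points inside the quarter circle
    return scalar pi_hat = 4 * r(N) / `n'
end

set seed 12345
simulate pi_hat = r(pi_hat), reps(100) nodots: pi_mc 1000
summarize pi_hat
```

The mean of pi_hat over the replications should be close to 3.14, and it gets closer as the number of points or replications increases.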
help return
help simulate
help macro
help runiform()
help summarize
The number of commands could have been reduced by using return results from preceding
commands or you can even turn this into a program.
matrix a1=r(table) // 6
matrix list a1 // 7
mata // 8
b=xl()
b.create_book("Results","Sheet1") // 9
b.put_string(1,1,"Variable") // 10
b.put_string(1,2,"Coefficient-foreign=1")
b.put_string(1,3,"Coefficient-foreign=0")
b.put_string(2,1,"weight")
b.put_string(3,1,"length")
b.put_number(2,2,st_matrix("a1")[1,1]) // 11
b.put_number(3,2,st_matrix("a1")[1,2])
b.put_number(2,3,st_matrix("a0")[1,1])
b.put_number(3,3,st_matrix("a0")[1,2])
end // 12
Notes:
2 - Erasing any previous excel file saved with the same name in the folder.
4 - Not required to run the above but lets us see what return results Stata produces for this
command.
5 - Not required to run the above but lets us see the results that Stata saves in the matrix.
6 - Save the Stata matrix called r(table) into a Stata matrix we will call a1.
7 - Not required to run the above but lets us see the contents of the matrix we have just saved.
8 - Starting Mata.
9 - Mata command to name the file in which we wish to save the results. We call this file
"Results".
11 - Mata command to put a result; obtained from the Stata matrix "a1" into a cell. The cell being
row 2 and column 2 [2,2].
12 - End Mata.
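The steps before note 6 are not shown above. A sketch of how the two matrices a1 and a0 might be created, assuming the auto dataset and mpg as the dependent variable (both are assumptions; only weight, length and foreign appear in the code above):

```stata
* Assumption: auto dataset, mpg regressed on weight and length within each level of foreign
sysuse auto, clear
regress mpg weight length if foreign == 1
return list                  // optional: see what the command leaves in r()
matrix a1 = r(table)         // coefficients are in row 1 of r(table)
regress mpg weight length if foreign == 0
matrix a0 = r(table)
matrix list a1
```

Row 1 of r(table) holds the coefficients, which is why the Mata code reads st_matrix("a1")[1,1] and st_matrix("a1")[1,2].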
help return
help matrix
help mata
help m5_intro
help mf_xl
Customize
Click on New button
However, the group rename may have been overlooked by some users. Detailed examples and
syntax can be found at:
This extends the functionality of the rename command to include renaming groups of variable
names.
Example 1
Say that you have imported your data set and all the variable names come capitalised; you prefer
lower case.
foreach i of varlist * {
rename `i' `=lower("`i'")' // evaluate lower() to build the new name
}
rename * , lower
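A short sketch of group rename in action, assuming the auto dataset; the new names mpg_a and weight_a are made up for illustration:

```stata
* Assumption: auto dataset; mpg_a and weight_a are hypothetical new names
sysuse auto, clear
rename *, upper                         // mimic an import with capitalised names
describe, simple
rename *, lower                         // group rename: every name to lower case
rename (mpg weight) (mpg_a weight_a)    // rename several variables in one command
describe, simple
```

The single rename *, lower line replaces the whole foreach loop.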
clear
forvalues i=1/15 {
gen var`=`i'^2' = . // placeholder variables var1, var4, ..., var225
}
Example 2
Add suffix to the variable names var1 to var100 (based on the current variable name order)
use data, clear
rename (var1-var100) =A
Example 3
Example 4
Adding the string "four" to the variable name whenever the name contains the number "4"
use data, clear
Example 5
Swapping the prefix and suffix where they both contain a character.
use data, clear
rename * *A
Example 6
help rename
help rename group
Installation of new fonts may vary between operating systems. These instructions are for
Windows XP
Copy the font file to the disk file (temp directory suggested)
The new font "Female and male symbols" has now been installed
label define sexlab 1 `"{fontface "Female and male symbols":M }"' 0 ///
`"{fontface "Female and male symbols":F }"'
label values s sexlab
Getting to the function quickly
Stata has lots of functions. So many that at times you may need to look up the syntax. However,
accessing the online help for this is a bit tedious eg:
help functions
To speed up access a keyboard function key can be assigned to bring up this page.
To do this:
global F4 "help functions;"
This global macro statement is put into your profile.do
************profile.do**************
***********************************
help profile
or see
Previous tip on profile.do
Example
The above is fine if you know what the number means. This could be looked up with:
help label
In this example we are using the label command with the save option.
clear
set more off
input a b
1 1
2 2
3 3
end
label values a a
label values b b
list
//finished setting up the data set
preserve // (4)
keep t1 // (8)
type c:/try.do
restore // (10)
type c:/try.do
do c:/try // (13)
label list
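Since parts of the example above are missing, here is a self-contained sketch of label save. The label name alab and the file name mylabels.do are assumptions, and the file is written to the current working directory:

```stata
* Hypothetical names: value label alab, file mylabels.do
clear
input a
1
2
3
end
label define alab 1 "one" 2 "two" 3 "three"
label values a alab
label save alab using mylabels.do, replace   // writes label define commands to a do-file
type mylabels.do
label drop alab          // remove the definition from memory
do mylabels.do           // rerunning the saved file recreates it
label list alab
```

The saved do-file is plain text, so it can be edited before it is run again, for example to change the label wording.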
help encode
Breaking up dates
Sometimes a long time span may need to be broken up into years or months etc., because say a
particular year in the data is of interest or the dataset is otherwise too big (wide) for Stata.
clear
set more off
//start date
generate y`i'_s1=date_in2a if flag & year(date_in2a)==`i' // (9)
replace y`i'_s1=td("1Jan`i'") if flag & missing(y`i'_s1) // (10)
//end date
generate y`i'_f1=date_out2a if flag & year(date_out2a)==`i'
replace y`i'_f1=td("31Dec`i'") if flag & missing(y`i'_f1)
}
format y* %td
list
exit
(3) Formatting the just-created elapsed date to make it easier to read for checking if the conversion
went correctly.
(4) Using the summarize command to get the earliest date. This is stored in the return scalar
r(min). All the return values from the summarize command can be seen by typing return list.
(5) Storing the min value in a local macro.
(6) Generating a flag variable to indicate if a particular year is within the date span of the
observation.
(7) Looping through all the years in the data.
(8) Making flag equal to 1 if the year is in range.
(9) If the date span for the observation contains the year in the looping index and this is the starting
date then put the starting date in the variable.
(10) If the starting date is earlier than the start of the year in the looping index put in the 1 Jan for
that year.
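Because only part of the code is shown, the following is a compact sketch of the same idea rather than the original do-file. The two example spans are invented, and max() and min() with mdy() are used in place of the generate/replace pairs with td():

```stata
* Invented example data: two date spans
clear
input str9 date_in str9 date_out
"15mar2010" "20jun2012"
"01feb2011" "30nov2011"
end
generate date_in2a  = date(date_in, "DMY")
generate date_out2a = date(date_out, "DMY")
format date_in2a date_out2a %td

summarize date_in2a, meanonly
local first = year(r(min))             // earliest year in the data
summarize date_out2a, meanonly
local last = year(r(max))              // latest year in the data

forvalues i = `first'/`last' {
    * flag = 1 if year `i' lies within the observation's span
    generate byte flag = inrange(`i', year(date_in2a), year(date_out2a))
    generate y`i'_s1 = max(date_in2a,  mdy(1, 1, `i'))   if flag   // start within year `i'
    generate y`i'_f1 = min(date_out2a, mdy(12, 31, `i')) if flag   // end within year `i'
    drop flag
}
format y* %td
list
```

Each year gets a start and finish variable that is missing when the span does not touch that year.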
The separate command is useful for splitting up a variable into variables based on the levels of
another variable. These new variables can then be used to produce a graph.
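A minimal sketch, assuming the auto dataset with mpg split by foreign:

```stata
* Assumption: auto dataset
sysuse auto, clear
separate mpg, by(foreign)          // creates mpg0 (foreign==0) and mpg1 (foreign==1)
scatter mpg0 mpg1 weight, legend(order(1 "Domestic" 2 "Foreign"))
```

Because mpg0 and mpg1 are separate variables, each gets its own marker style in the graph.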
Sometimes there are Web pages that you would like to access every now and then. For example
Stata's forthcoming web page (you can sign up for email updates but writing a program is more
fun) or Stata blogs etc.
Below is a way that this might be done. You could turn this into an ado file or just put it into your
profile.do
clear mata
mata: //2
if(!fileexists("mymatrix")){ //3
v=st_global("c(current_date)") //4
X=date(v,"DMY") //5
st_local("date",strofreal(X))
exit
(1) Extended macro saving the path of Stata's PERSONAL location in the local macro a.
PERSONAL is on Stata's adopath
(2) Using a Stata Mata matrix to store the date that a Web page was last accessed. You could store
the information in other forms but a Mata matrix seemed a handy way of doing this
(3) The first time that this is run there is no Mata file; so just checking if one needs to be created. If
not, jump into the loop
(4) Save the current date in a scalar matrix called v
(5) Convert current date to Stata elapsed time using the date() function
(6) Saving the Mata matrix to a file
(7) Reading the saved Mata file
(8) Save the interval (days) that you wish to display the web page. In this case every day
(9) If the time since the web page was last accessed is greater than the specified interval and it
has not been accessed today then jump into loop
(10) The web page that you wish to see
(11) Comment indicating that program is working but is not required to access Web page
Sometimes official Stata commands or downloaded user-written commands may not supply all
the information that you require or the information may not be in the form that you require. One
way of addressing this is to write your own command that includes the additional feature.
For example the official Stata command levelsof only returns the levels of the variable specified.
Often the number of levels is also required. This is easily included in your own command, as
follows:
(1) program command with a new name for the command. Never overwrite existing commands;
create a new name. A common prefix will allow you to easily identify the new command. The rclass
option is used where the program is required to return some values.
(2) The levelsof command with `0'. The macro `0' contains what was passed to the new command
eg. mpg,local(z)
(3) Using the macro extended function: word count , to count the number of levels
(4) The return value as a scalar
(5) end of the program
(6) Loading the Stata data set
(7) Calling the new program: s_levelsof
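The code that these notes refer to is not shown, so the following is a reconstruction from the notes, not the original. The name of the returned scalar, n_levels, is an assumption:

```stata
* Reconstructed from notes (1) to (7); r(n_levels) is a hypothetical name
capture program drop s_levelsof
program s_levelsof, rclass
    levelsof `0'                           // pass everything typed after s_levelsof to levelsof
    local n : word count `r(levels)'       // count the levels that levelsof returned
    return scalar n_levels = `n'           // return the count as a scalar
end

sysuse auto, clear
s_levelsof foreign
return list
```

After the call, r(n_levels) holds the number of distinct values, here 2 for foreign.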
help program
help extended_fcn
Stata 12's do editor has many great features including selecting columns. The following is an
example where this can be used.
Say you wish to clean up the following do file by putting all the /// into a column. You could adjust each
triple forward slash individually but if there are many this would take time and be boring. Using
Stata's column select this is easier:
( /// is used for continuing a command onto the next line )
Highlight the column of ///. Adjust the end /// so that it is the closest to the rhs. At the top lhs of the
column place the cursor. Then on the keyboard simultaneously press the Ctrl and Alt keys,
then with the mouse select the column required.
At the top of the column drag the column to the left hand side. Then using the do editor pull down
menu: Edit>Find>Replace, tick the regular expression box, type in the regular expression shown
and execute.
Then select the column of ///. (on the keyboard simultaneously press the Ctrl and Alt keys, then
with the mouse select the column required ) At the top of the column drag right to the required
position.
adoupdate
(January 2013)
The Stata command adoupdate is used to update user written packages obtained from ssc.
However, it may be the case that not all your user written packages have been obtained from ssc,
and hence will not be updated with adoupdate.
For example:
You have just read an article in the Stata Journal titled "Error–correction–based cointegration tests
for panel data". It sounds interesting so you download the program using:
findit xtwest
You click on the hyperlink and it downloads (confirmed by ado dir or using the pull down menu:
Help>SJ and user written programs and then previously installed packages>list).
To get the version of this program you type which on the Stata command line.
. which xtwest
c:\ado\plus\x\xtwest.ado
*! xtwest 1.1 1Apr2008
*! Damiaan Persyn, LICOS centre for Development and Economic Performance www.econ.kuleuven.be/licos
*! Copyright Damiaan Persyn 2007-2008.
adoupdate does not update this because it was not loaded from ssc eg.
. adoupdate xtwest, update
(note: adoupdate updates user-written files; type -update- to check for updates to official Stata)
(no packages match "xtwest")
. which xtwest
c:\ado\plus\x\xtwest.ado
*! xtwest 1.5 1Jul2010
*! Damiaan Persyn, LICOS centre for Development and Economic Performance www.econ.kuleuven.be/licos
*! Copyright Damiaan Persyn 2007-2008.
Now the version number is 1.5 (previously 1.1). The later version includes some bug fixes.
Therefore care should be taken to see that the version of a user package is the one that you
require.
Other cases of adoupdate not updating user packages are where the author of a user written
package has this on their personal web page; not ssc. eg.
Spost
http://www.indiana.edu/~jslsoc/web_spost/sp_install.htm
and others.
In these cases you need to go to the author's site and download the package as per the author's
instructions.
Compress
(December 2012)
Often handy after importing data from a spreadsheet where footnotes or comments in the variable
column can create a string length that is far in excess of that required.
An example
clear
input str200 country
"Note: the details for this are.."
1
2
end
edit
notes : TS country[1]
drop in 1
sleep 3000
compress
notes
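A non-interactive sketch of the same effect, without the edit and sleep steps; describe is added here to show the storage type before and after:

```stata
* The long first row forces a wide string; after dropping it, compress shrinks the variable
clear
input str200 country
"Note: the details for this are.."
"1"
"2"
end
describe country        // storage type str200
drop in 1               // remove the footnote row
compress                // shrink to the smallest type that holds the data
describe country        // storage type is now much shorter
```

compress never changes the data values, only the storage type.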
For small numbers of variables, reshaping long and attaching variable labels afterwards can be
done by hand, but with more than say 10 variable stubs this becomes boring, error prone and
time consuming; so it is advisable to automate this.
//the reshape
use data, clear
reshape long "`stub'", i(id) j(Year)
}
drop varlab var
describe
list
exit
The lines of code under this title get the variable stubs and their associated labels.
1 Using the replace and clear options of the describe command the variable names and labels
replace the existing data in Stata.
5 As only one of each stub is required the duplicates command is used to remove duplicates
6 merge the variable name and label with the data set
7 Check to see if variable name in the dataset is the same as merged data variable name
8 If the same the variable label "varlab" is given the label for this variable
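A simplified sketch of the goal, using a different route from the describe/merge approach in the notes: the labels are stored in local macros before the reshape and reattached afterwards. The data, stubs (inc, cost) and labels are invented for illustration:

```stata
* Invented data: two stubs (inc, cost) observed in 2010 and 2011
clear
input id inc2010 inc2011 cost2010 cost2011
1 10 11 5 6
2 20 21 7 8
end
label variable inc2010  "Income"
label variable inc2011  "Income"
label variable cost2010 "Cost"
label variable cost2011 "Cost"

local stubs inc cost
foreach s of local stubs {
    local lab_`s' : variable label `s'2010     // remember one label per stub
}
reshape long `stubs', i(id) j(Year)
foreach s of local stubs {
    label variable `s' "`lab_`s''"             // reattach the label to the long variable
}
describe
```

This assumes every stub has a 2010 variable to take the label from; the describe-based approach in the notes avoids that assumption.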
Graph marker labels will frequently overlap on graphs, making them difficult to read. One solution
to prevent/minimise this is to write an algorithm that reduces this to a minimum. However, this is a
significant amount of work. An alternative is to separate them based on a qreg and then angle
the labels based on the distance from the adjacent markers.
The following is a graph where we start with a specific observation, in this case observation one
of the sort order that Stata supplies, then determine the closest adjacent observation, then this
observation is used to determine the next closest observations etc.
Then a qreg is used to split the data; the quantile option is filled in with a value determined by the
user.
From there we select the number of angular rotations of the label, the starting angle and
the angle of rotation.
The code
clear all
generate order=.
generate index= _n
//scale
summarize mpg
local a1_max=r(max)
local a1_min=r(min)
local a1_diff=`a1_max'-`a1_min'
summarize weight
local a2_max=r(max)
local a2_min=r(min)
local a2_diff=`a2_max'-`a2_min'
generate dis_hor=.
generate dis_vert=.
generate hyp=.
generate kk=.
local z=1
forvalues i=1/74 {
if `z'==1 {
replace order=1 in 1
local z=0
}
else {
gsort -order
replace dis_hor=(mpg[1]-mpg)/`a1_diff'
replace dis_vert=(weight[1]-weight)/`a2_diff'
replace hyp=sqrt(dis_hor^2 +dis_vert^2)
sort hyp
replace kk=sum(missing(order))
replace kk=. if !missing(order)
replace order=`i' if kk==1
}
}
predict a
generate up_down= a < mpg
tab up_down
//values to change
local angle =10
local s_ang=55
local no_ang= 3
local z=0
forvalues i=0/`=`no_ang'' {
if `z'==0 {
local aa2 =`"(scatter mpg weight if aa==`i' & up_down==0 , mlabcolor(blue) "' + ///
`" mlab(make) mlabpos(3) mlabangle(`=-`s_ang'-`angle'*`i'') ) "' + ///
`"(scatter mpg weight if aa==`i' & up_down==1 , mlab(make) mlabcolor(red) "' + ///
`" mlabpos(3) mlabangle(`=`s_ang'+(`angle'*`i')') ) "'
local z=1
}
else {
local aa1= `" (scatter mpg weight if aa==`i' & up_down==0, mlabcolor(blue) "' + ///
`" mlab(make) mlabpos(3) mlabangle(`=-`s_ang'-`angle'*`i'') ) "' + ///
`"(scatter mpg weight if aa==`i' & up_down==1, mlab(make) mlabcolor(red) "' + ///
`"mlabpos(3) mlabangle(`=`s_ang'+`angle'*`i'') )"'
}
local aa2 `aa2' `aa1'
}
twoway ///
`aa2' , ///
yline(22) xline(2930) name(kk2) legend(off)
exit
Stata's reshape command requires the prefixes of variables to be stated. If there are many
variables to be reshaped, then rather than type in their prefixes , let Stata do the work.
First generate a pretend data set. In reality there will be many more variables
clear
list
preserve //2
local a //6
forvalues i=1/`=_N' { //7
local a `a' `=prefix[`i']' //8
}
restore //9
list
exit
(1) The Stata command input is useful for inputting a small data set.
(2) The Stata command preserve is used here because the existing data will be cleared and a
copy of the data is required. preserve copies the current data to the computer hard drive.
(3) The Stata command describe ;one of the options of describe is replace ie replacing the
existing data with the contents of the describe command. The command is used to get a list of
variable names.
(4) The Stata functions regexs and regexm are regular expression functions and are used to
generate a new variable containing the variable prefixes. For more information see our
previous tip on regular expression at: November 2010
(5) The Stata command contract contracts a data set to a set of values and frequencies; similar to
a oneway frequency table. This command is used to remove duplicates from the data. An
alternative to this command is duplicates but in this case contract is easier to use.
(6) local initializes the value of the local macro, which we have called a. This will be used later to
accumulate variable prefixes.
(7) The Stata command forvalues is a loop command which goes through all the variable prefixes
in the data.
(8) Using the local macro a prefixes are accumulated.
(9) The Stata command restore replaces the data currently in Stata with the data set previously
saved by the preserve command.
(10) The Stata command reshape is used to change the data format from wide to long. The
reshape command uses the prefixes stored in macro a.
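Because several of the commands are not shown above, the following sketch fills in the steps described in notes (1) to (10). The pretend data and the regular expression are assumptions; the pattern treats everything before a trailing run of digits as the prefix:

```stata
* Pretend data (invented); prefix = letters/underscores before trailing digits
clear
input id inc2010 inc2011 cost2010 cost2011
1 10 11 5 6
2 20 21 7 8
end
list
preserve
describe, replace clear                                    // data become one row per variable
keep if regexm(name, "^([a-zA-Z_]+)[0-9]+$")               // drops id, which has no numeric suffix
generate prefix = regexs(1) if regexm(name, "^([a-zA-Z_]+)[0-9]+$")
contract prefix                                            // one row per distinct prefix
local a
forvalues i = 1/`=_N' {
    local a `a' `=prefix[`i']'                             // accumulate the prefixes
}
restore
reshape long `a', i(id) j(year)
list
```

Local macros survive preserve and restore, which is what lets the prefixes collected from the describe output be used on the original data.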
Referring to last month's fuzzy merge tip, the Stata commands tsset and tsfill were used to get all
possible times. Sometimes the number of observations can exceed Stata's limit. When this
happens an alternative method is to append the 2 datasets and then determine which events are
to be treated as the same event.
cd c:/ //(1)
clear
clear
input time //(5)
8
19
30
end
sort time
list //(11)
list
list
exit
(1) The Stata command cd changes the working directory to that specified eg. c:/
(2) The Stata command input is useful for inputting a small data set. In merge jargon the data set
created here is called the "using" data set; because it is called into Stata by the merge
command.
(3) Creates a variable to indicate the "using" data set.
(4) Creates a variable containing random values which range from 0 to 1. The function
runiform() does this. This is done to create some "play" data.
(5) Inputting the second data set. This is the data set that is in Stata's memory when we merge.
In merge jargon this is called the "master".
(6) Create a variable that identifies this data set.
(7) Joins the 2 data sets vertically together; matching variable names.
(8) Using Stata's duplicates command and the tag option generates a new variable called same.
This indicates where both data sets match on the specified variable(s).
(9) Where the times match perfectly only one observation is required, therefore the duplicate
observation is dropped.
(10) Merge the data based on a one to one (1:1) relationship (1:1 meaning only 1 observation from
the using data set and one observation from the master data set allowed) between the key
variable (time) in the using and master data sets
(11) list the data to see if it looks as expected.
(12) drops the observation if both data1 and data2 are missing
(13) generates an indicator variable that tells us what is the closest observation in terms of time. A
negative sign indicates if this is before the merge and a positive sign indicates after the merge.
To do this the cond() function is used. The first term in the brackets ie time[_n+1]-time>time-
time[_n-1] evaluates to either true or false. If true, the second term is used ie -1*(time-
time[_n-1]); if false, the third term is used ie time[_n+1]-time. The 2nd or 3rd terms are only
generated on the observations where date2 equals 1 ie date2==1
Terms like time[_n+1] make Stata work at the observation level. Observations to be used are
defined by the contents of the square brackets. The variable outside of the square brackets is
the variable name whose observations we are using. Inside the square brackets _n is the
current observation hence the term _n+1 is the current observation plus 1 (one).
(14) Replaces the contents of stuff with stuff[_n+sign(min)*1] only if the expression
!missing(min) & abs(min)<=2 is true. In other words, if the time between the merge
observation and the next closest time is less than or equal to 2 the merge observation is filled
in with this value.
(15) Lastly the observation that comes closest to the merge is deleted. Because this can be above
or below the merge observation 2 conditions are specified separated by an OR symbol ie |
(16) Creates an indicator variable total_matches using the egen command that indicates where
the data now matches
(17) Changes the order of the variables to make for easier reading/checking
Merging 2 data sets in Stata on a key variable requires the key variable to match exactly.
However if the key variable is time, small discrepancies (milli seconds) will result in a non-match
even if the 2 observations relate to the same event. To merge data sets like this, a range of time
can be used for a match. In this example we will specify a time range + and - for an acceptable
match to occur.
cd c:/ //(1)
clear
clear
input time //(7)
8
19
30
end
generate date2=1 //(8)
sort time
list //(10)
list
list
exit
(1) The Stata command cd changes the working directory to that specified eg. c:/
(2) The Stata command input is useful for inputting a small data set. In merge jargon the data set
created here is called the "using" data set; because it is called into Stata by the merge
command.
(3) Creates a variable to indicate the "using" data set.
(4) Creates a variable containing random values from 0 to 1. The function runiform() does this.
This is done to create some "play" data.
(5) tsset is Stata's command that sets the data for time series. In this case we set the variable
"time" as the variable that contains time for time series. The only reason we tsset the data is
so that we can use the next command ie tsfill
(6) tsfill is Stata's time series command that fills gaps in the time variable
(7) Inputting the second data set. This is the data set that is in Stata's memory when we merge.
In merge jargon this is called the "master".
(8) Create a variable that identifies this data set.
(9) merge the data based on a one to one (1:1) relationship between the key variable (time) in the
using and master data sets
(10) list the data to see if it looks as expected
(11) drops the observation if both data1 and data2 are missing
(12) generates an indicator variable that tells us what is the closest observation in terms of time. A
negative sign indicates if this is before the merge and a positive sign indicates after the merge.
To do this the cond() function is used. The first term in the brackets ie time[_n+1]-time>time-
time[_n-1] evaluates to either true or false. If true, the second term is used ie -1*(time-
time[_n-1]); if false, the third term is used ie time[_n+1]-time. The 2nd or 3rd terms are only
generated on the observations where date2 equals 1 ie date2==1
Terms like time[_n+1] make Stata work at the observation level. Observations to be used are
defined by the contents of the square brackets. The variable outside of the square brackets is
the variable name whose observations we are using. Inside the square brackets _n is the
current observation hence the term _n+1 is the current observation plus 1 (one).
(13) Replaces the contents of stuff with stuff[_n+sign(min)*1] only if the expression
!missing(min) & abs(min)<=2 is true. In other words, if the time between the merge
observation and the next closest time is less than or equal to 2 the merge observation is filled
in with this value.
(14) Lastly the observation that comes closest to the merge is deleted. Because this can be above
or below the merge observation 2 conditions are specified separated by an OR symbol ie |
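As most of the commands are not shown above, here is a condensed sketch of matching within a time tolerance. It is not the original do-file: it appends and sorts the two data sets instead of using tsset and tsfill, and the "using" times, the values of stuff and the names gap_prev, gap_next and matched are all invented. The tolerance of 2 follows the notes:

```stata
* Invented "using" data: times that are close to, but not equal to, the master times
clear
input time stuff
7.6  100
18.2 200
33   300
end
generate byte data1 = 1
tempfile usingdata
save `usingdata'

* "master" data
clear
input time
8
19
30
end
generate byte data2 = 1
append using `usingdata'
sort time

* distance from each master observation to its neighbouring "using" observations
generate double gap_prev = time - time[_n-1] if data2 == 1 & data1[_n-1] == 1
generate double gap_next = time[_n+1] - time if data2 == 1 & data1[_n+1] == 1

* take stuff from the closer neighbour, but only if it is within +/- 2
generate double matched = .
replace matched = stuff[_n-1] if gap_prev <= 2 & gap_prev <= gap_next
replace matched = stuff[_n+1] if gap_next <= 2 & gap_next < gap_prev & missing(matched)

keep if data2 == 1
list time matched
```

Times 8 and 19 pick up the values recorded at 7.6 and 18.2, while 30 stays missing because its nearest "using" time, 33, is more than 2 away.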
With the release of Stata 12 the loading of Excel spreadsheets became even easier (see
previous tip - here ).
However, the loading of a spreadsheet may not always go as planned/hoped. The following
problems can occur:
(1) Stata will not load the variable names in the first row because some of these do not
correspond to Stata's variable name convention
Example:
It is good practice to do all the fix-up work in a Stata do file, rather than doing repairs on the
spreadsheet.
Addressing problem 1
if _rc==0{
rename `i' Y`=`i'[1]'
}
else rename `i' `=`i'[1]'
}
drop in 1
The above imports the spreadsheet and renames the problem variables. The command confirm
number checks whether the first row of a variable holds a number or a string. Putting capture in front of the
confirm command suppresses the error and stores the return code in _rc: 7 if the value is a string,
0 if it is a number. The return code is then used to decide how the variable is renamed.
Having a closer look at: `=`i'[1]'
`i' is the macro substitution of the looping index i; in this case, the variable name.
[1] when put adjacent to a variable name (without a space), the bracketed number indicates the
observation number. The first time around the loop it refers to the first observation of
variable A. This is known as explicit subscripting.
`= ' the symbols around `i'[1] tell Stata to evaluate the expression; see Stata 12 User's Guide
18.3.8, page 201.
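A self-contained sketch of the renaming loop is given below. It assumes the spreadsheet has already been read in with column letters as variable names and the headings sitting in the first row; the three columns are typed in with input here (hypothetical data) so that the code runs without an Excel file.

```stata
* Hypothetical stand-in for an imported sheet: headings are in observation 1
clear
input str10 A str10 B str10 C
"id" "2001" "2002"
"a"  "5"    "7"
"b"  ""     "9"
end

foreach i of varlist _all {
    * is the heading in row 1 a number?
    capture confirm number `=`i'[1]'
    if _rc==0 {
        * numeric heading: not a legal variable name, so prefix it with Y
        rename `i' Y`=`i'[1]'
    }
    else rename `i' `=`i'[1]'
}
drop in 1
describe
```

After the loop the variables are named id, Y2001 and Y2002, and the heading row has been dropped.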
foreach i of varlist Y* {
capture confirm numeric variable `i'
if _rc!=0 {
replace `i'="." if `i'==""
destring `i', replace
}
}
describe
The above once again uses the confirm command, but this time in its numeric variable form. Each
variable with a Y prefix (the character that was added in the previous step) is put through a loop where
firstly empty cells are filled in with the Stata missing value and then the string variable is changed
to a numeric variable with the destring command.
If the output of the describe command indicates that a variable is still a string then this may be due
to non-numeric characters in the variable. One way of looking for the observations that contain
them is:
forvalues i=1/`=_N' {
capture confirm number `=Y2001[`i']'
if _rc!=0 {
display "Potential problem observation: "`i'
}
}
The output to the above indicates a problem with observations 1 and 2 of the Y2001 variable.
Top
(April 2012)
(2) handling a text variable where the text exceeds 244 characters and only a portion of the data
is required to be imported.
An example of the text file; assume that the text is longer than 244 characters (Stata's max.
string length). The data can be created in a text editor, eg. Stata's do editor, and saved with a
.txt extension.
"c:/file_try.txt"
one:two:three:four
one1:two1
For this example assume that we require only the last 2 words in a new file (the words are
delimited with a ":")
An example of a Stata program that will create a new file with only the last 2 words of each line
is:
file write `hdl' %st10 ("`No1'") %st10 (":") %st10 ("`No3'") _new
file read `fh' line
}
file close _all
end
(1) program: the program starts with program and the name of the program, ltype, and
finishes with end.
(2) syntax: passes information to the program via local macros, ie. current is the local macro
containing the file name of the initial file (raw data), new is the local macro that
contains the file name for the output of the program (last 2 words of each line) and p is the
local macro containing the delimiter.
(3) Create temporary names.
(4) Open the file with the initial information.
(5) Open a file for the required data to be written to.
(6) Read the first line of the file and store it in the local macro line.
(7) While loop; continues until the end of the file is reached.
(8) Reverse the order of the contents of the line macro so the last word becomes the first.
(9) tokenize the line based on the parse character ":"; this breaks the string into local macros named
1, 2 etc.
(10) Make the contents of the local macros No1 and No3 the reverse of the last and 2nd last
words (":" is treated as a word).
(11) Write the macro contents to the new file, separating them with ":".
type "c:/file_try.txt"
ltype ,current("c:/file_try.txt") new( "c:/file_try1.txt") p(":")
type "c:/file_try1.txt"
infile str10 a using "c:/file_try1.txt", clear //loading file into Stata
list
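For readers who want something complete to run, the following is a sketch of a program with the same purpose. It is not the ltype program itself: the name last2words is hypothetical, it counts words with extended macro functions rather than reversing the line, and it assumes that the words themselves contain no spaces.

```stata
capture program drop last2words
program define last2words
    * current(): input file, new(): output file, p(): delimiter
    syntax , current(string) new(string) p(string)
    tempname fin fout
    file open `fin' using `"`current'"', read text
    file open `fout' using `"`new'"', write text replace
    file read `fin' line
    while r(eof)==0 {
        * turn the delimiter into spaces so that the words can be counted
        local words : subinstr local line "`p'" " ", all
        local n : word count `words'
        if `n' >= 2 {
            local w1 : word `=`n'-1' of `words'
            local w2 : word `n' of `words'
            file write `fout' "`w1'`p'`w2'" _n
        }
        file read `fin' line
    }
    file close `fin'
    file close `fout'
end

* build a small raw file in a temporary location, then process it
tempfile raw out
tempname fh
file open `fh' using "`raw'", write text replace
file write `fh' "one:two:three:four" _n "one1:two1" _n
file close `fh'
last2words, current("`raw'") new("`out'") p(":")
type "`out'"
```

Because the line is held in a local macro and never in a string variable, the length limit on string variables does not come into play while the line is being cut down.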
Top
There are 2 ways that data can be cleaned in Stata: manually or using a rule-based system.
Below is one way that messy data can be cleaned with the assistance of Stata's soundex()
function and some manual cleaning.
clear
input str20 w1
"Microsoft"
"MicroSoft"
"Micro Soft"
"Micro-Soft"
"Microsoft Inc."
"Microsoft Inc"
"MicrosoftInc"
"MicrosoftAA"
"Microaa"
"MSFT"
"MS"
"M$"
"STATA"
"StataCorp"
"StataCorp LP"
"staCorp"
"Linux"
"linux"
end
list, clean noobs
save c:/a, replace
//Linux
foreach i in L520 { // <--2
replace New_w1="Linux" if kk=="`i'" // <--3
}
//Microsoft
foreach i in M262 M000 M200 M213 {
replace New_w1="Microsoft" if kk=="`i'"
}
//Stata
foreach i in S330 S326 S332 {
replace New_w1="StataCorp" if kk=="`i'"
}
sort kk
list, sepby(New_w1)
exit
(1) soundex() The soundex code consists of a letter followed by three numbers: the letter is the
first letter of the name and the numbers encode the remaining consonants. Similar sounding
consonants are encoded by the same number.
(3) replace: replaces the existing contents with the name that maps to the soundex code.
+-----------------------------------+
| w1 kk New_w1 |
|-----------------------------------|
1. | Linux L520 Linux |
2. | linux L520 Linux |
|-----------------------------------|
3. | M$ M000 Microsoft |
4. | MS M200 Microsoft |
5. | MSFT M213 Microsoft |
|-----------------------------------|
6. | Microaa M260 Micro AA |
|-----------------------------------|
7. | Micro Soft M262 Microsoft |
8. | Micro-Soft M262 Microsoft |
9. | Microsoft Inc. M262 Microsoft |
10. | MicroSoft M262 Microsoft |
11. | MicrosoftInc M262 Microsoft |
12. | MicrosoftAA M262 Microsoft |
13. | Microsoft M262 Microsoft |
14. | Microsoft Inc M262 Microsoft |
|-----------------------------------|
15. | staCorp S326 StataCorp |
16. | STATA S330 StataCorp |
17. | StataCorp LP S332 StataCorp |
18. | StataCorp S332 StataCorp |
+-----------------------------------+
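The soundex step in note (1) can be sketched on a few of the names as follows. The variable names kk and New_w1 follow the listing above; the codes L520 and M262 are the ones shown in that listing.

```stata
clear
input str20 w1
"Linux"
"linux"
"Microsoft"
"Micro Soft"
end

* one soundex code per name: similar sounding names share a code
generate kk = soundex(w1)

* map each code to the name that should be used from now on
generate New_w1 = ""
replace New_w1 = "Linux" if kk == "L520"
replace New_w1 = "Microsoft" if kk == "M262"
list, sepby(New_w1)
```

Names whose code is not listed keep an empty New_w1 and can then be dealt with manually.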
For help on specific commands type: help
and then the specific command eg.
help soundex()
help generate
help replace
help use
Top
There are 2 ways that data can be cleaned in Stata: manually or using a rule-based system.
Below is one way that messy data can be cleaned manually so that names are consistent.
clear
input str20 w1
"Microsoft"
"MicroSoft"
"Micro Soft"
"Micro-Soft"
"Microsoft Inc."
"Microsoft Inc"
"MicrosoftInc"
"MSFT"
"MS"
"M$"
"STATA"
"StataCorp"
"StataCorp LP"
"staCorp"
"Linux"
"linux"
end
list, clean noobs
save c:/a, replace
contract w1 //-->1
edit //-->2
contract w1 if x2==""
edit
"StataCorp LP"
"staCorp"
end
generate x3="1"
//merging
use c:/a, clear
merge 1:m w1 using c:/a1, nogenerate
replace x2="Stata" if x3=="1"
drop x3
(2) Open the Stata editor so that the various names can be copied. If the list is large and the required
names are spread throughout it, a new variable can be created and a 1 (one) entered adjacent
to the required names. The data can then be sorted on the variable with the 1s and the variations
on the required name copied into the do file.
(3) Generate a new variable, to be used after the merge command, that flags the names to be
changed.
(4) Load the original dataset.
(5) Merge the original dataset with the dataset holding the list of names.
(6) Wherever the previously generated variable (3) equals "1", replace the name with the official
name (in the code above, "Stata").
A problem that comes up from time to time is where you have clustered dates, eg. going to the
doctor for treatment and subsequent follow-up(s). You may wish to group each issue (initial
treatment and follow-ups) into a group. Without detailed records as to the ailment treated on each
date, you can attempt to do this by making an assumption as to how long the treatment is likely to
last. For example you may have the following data:
clear
set more off
input ///
str6 id str20 date
01003 07Nov2008
01003 07Nov2008
01003 11Nov2008
01007 22Dec2008
01007 05Dec2008
01007 13Nov2007
01007 14Nov2007
01007 22Jul2006
01007 22Jul2006
01007 22Jul2006
01007 11Sep2006
01009 13Oct2005
01009 17May2006
01009 17May2006
01009 13Jan2010
01009 06Jun2010
01008 08Nov2007
01008 08Nov2007
01008 08Nov2007
01008 15Jul2009
01008 15Jul2009
01008 15Jul2009
01008 27May2010
01008 28May2010
01008 28May2010
01008 28May2010
end
l
exit
(1) Generate a new variable (date1) that takes the date in string format from date and converts it
into elapsed time (a numeric value).
(2) Generate a new variable called cluster, with all values equal to missing (.).
(4) Using the bysort prefix, the temporary variable max is generated and filled with the value of _N
for every level of id. Note the macro substitution single quotes around the temporary
variable name. _N stands for the total number of observations. In this case, because of bysort, it is
the number of observations within each level of id.
(5) The summarize command is used to obtain the largest value of max across all the levels
of id. The summarize command has a handy returned result that stores this, ie. r(max). To see the
other values returned by this command type: return list (after the summarize command).
(6) Loop over the code in the curly brackets using a forvalues loop.
(7) Using the bysort prefix, replace the value of cluster with the looping index value (this will be
the group number) if the qualifier is true, ie. the observation is the start of the next cluster or is
within 30 days of the start of that cluster.
cluster!=. is a logical statement that is either true or false, ie. if cluster does not equal (!=) a missing
value (.) the statement is true and evaluates to 1 (one).
sum(cluster!=.) gives the running sum of the results of cluster!=.
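The same grouping can also be reached without a loop. The sketch below is an alternative to the steps described above, not a reproduction of them; it uses a subset of the data, the 30-day window is the assumed treatment length, and the variable start is a hypothetical helper.

```stata
clear
input str6 id str20 date
01007 22Jul2006
01007 22Jul2006
01007 11Sep2006
01007 13Nov2007
01007 14Nov2007
01003 07Nov2008
01003 11Nov2008
end

generate date1 = date(date, "DMY")
format date1 %td
sort id date1

* start holds the first date of the cluster each visit belongs to;
* replace works down the data, so start[_n-1] is already final when it is used
by id: generate start = date1 if _n == 1
by id: replace start = cond(date1 - start[_n-1] <= 30, start[_n-1], date1) if _n > 1

* running sum of "a new cluster starts here" gives the cluster number within id
by id: generate cluster = sum(start != start[_n-1])
list, sepby(id)
```

For id 01007 the visit on 11Sep2006 is more than 30 days after 22Jul2006 and so starts cluster 2, while 13Nov2007 and 14Nov2007 together form cluster 3.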
Top
A problem that comes up from time to time is where, say hospital wards (could also be hospital
beds, cars, hotel rooms,
machines etc.) are used for a patient of minutes/hours/days and management wishes to know at
the end
An example:
clear
input ///
str30 date_in str30 date_out ward
"7/22/2011 22:59" "7/27/2011 10:12" 1
"8/27/2011 12:05" "8/27/2011 21:07" 2
"8/27/2011 10:46" "8/28/2011 19:45" 1
"8/28/2011 15:34" "8/28/2011 16:43" 2
"8/28/2011 23:24" "8/29/2011 13:43" 1
"8/27/2011 14:32" "8/28/2011 15:15" 2
"8/28/2011 09:43" "8/28/2011 17:49" 1
"8/28/2011 01:33" "8/28/2011 02:32" 2
"8/28/2011 04:43" "8/29/2011 05:53" 1
"8/31/2011 07:30" "8/31/2011 08:11" 2
end
list
set more off
Top
A problem that comes up from time to time is where, say hotel rooms (could also be hospital
beds, cars,
machines etc.) are booked for a number of days by the one person and management wishes to
know at the end
of the month how many rooms for each day were occupied.
An example:
clear
set more off
input ///
str30 date_in str30 date_out
"7/22/2011" "8/27/2011"
"8/27/2011" "8/27/2011"
"8/27/2011" "8/28/2011"
"8/28/2011" "8/28/2011"
"8/28/2011" "8/29/2011"
"8/27/2011" "8/28/2011"
"8/28/2011" "8/28/2011"
"8/28/2011" "8/28/2011"
"8/28/2011" "8/29/2011"
"8/31/2011" "8/31/2011"
"8/31/2011" "8/31/2011"
"8/31/2011" "9/4/2011"
"8/23/2011" "8/23/2011"
"8/23/2011" "8/24/2011"
"8/24/2011" "9/15/2011"
"8/4/2011" "8/4/2011"
"8/4/2011" "8/8/2011"
"8/10/2011" "8/10/2011"
"8/10/2011" "8/17/2011"
end
list
(1) Dates, input as strings, are converted to elapsed time (number of days from a datum).
(2) Dates are formatted.
(3) The minimum and maximum dates are obtained with the summarize command
and saved in local macros.
(4) The range is calculated and saved in a local macro.
(5) Using the forvalues command the days of the range are looped through.
(6) The date, in elapsed days, is calculated.
(7) A new variable for each day is generated, with 1 entered in the observations where
the loop date is in the range indicated by the inrange() function.
(8) The newly created variable is given a label, which is the loop date.
(9) Generate a unique id value to be used by the reshape command.
(10) Reshape the data from wide to long data format.
(11) Collapse the data to give the required results.
(12) Use the destring command to convert a string variable to a numeric variable.
(13) Rename variable.
(14) Include a variable label.
(15) Finally, list the results.
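A shorter route to the same kind of table is sketched below. It is a different technique from the steps listed above: each booking is expanded into one observation per occupied day and the days are then counted with contract. The three bookings and the variable names din, dout, booking, day and rooms are made up for illustration.

```stata
clear
input str30 date_in str30 date_out
"7/22/2011" "7/23/2011"
"7/23/2011" "7/23/2011"
"7/23/2011" "7/25/2011"
end

generate din  = date(date_in,  "MDY")
generate dout = date(date_out, "MDY")

* one observation per booking per occupied day (both end dates count)
generate booking = _n
expand dout - din + 1
bysort booking: generate day = din + _n - 1
format day %td

* number of rooms occupied on each day
contract day, freq(rooms)
list, noobs
```

With these bookings, 23 July has three rooms occupied and each of the other days has one.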
help rename
help label
help list
Top
levelsof is a useful Stata command for doing something by levels of a variable, for example
producing a histogram of mpg for each level of the variable foreign:
clear
sysuse auto, clear
levelsof for, local(level)
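A minimal sketch of how the stored levels might then be used is given below; the graph names hist0 and hist1 are arbitrary.

```stata
sysuse auto, clear
levelsof foreign, local(levels)
* loop over the levels held in the local macro (0 and 1 for foreign)
foreach l of local levels {
    histogram mpg if foreign == `l', name(hist`l', replace)
}
```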
However, levelsof fails when there are many levels, as can be seen from this snippet of code:
clear
set more off
set obs 100000
gen a=_n
levelsof a, local(aa)
The levelsof help file states that this command is best used if the number of levels is modest.
Method 1
This method contracts the variable that the levels are required for and then merges it with the
dataset, hence the levels are contained in the Stata dataset:
Example:
preserve
contract mpg
rename mpg levels
save c:/kk, replace
restore
merge 1:1 _n using c:/kk //attach the levels by observation number
drop _freq
drop _merge
sum levels
forvalues i=1/`=r(N)' {
scatter price weight if mpg==levels[`i'], name(a`i')
}
exit
Method 2
mata:
a=uniqrows(st_data(.,"mpg"))
a
for(i=1;i<=rows(a);++i){
st_local("i1",strofreal(a[i]))
stata("scatter price weight if mpg=="+st_local("i1")+", name(a" + st_local("i1")+")")
}
end
For help on specific commands type: help
and then the specific command eg.
help levelsof
help contract
help mata
help mata st_local()
help mata stata
help mata uniqrows()
Top
Stata is fast but it can be sped up by taking a close look at the way your Stata commands have
been coded. In this tip we will look at the if qualifier.
The if qualifier is computationally intensive and adds considerable time to the running of
a command that includes it. However, there are certain circumstances where it can be
replaced and hence Stata's running time reduced.
Example 1
This shows how you would normally run a number of regressions, ie. just adding the qualifier
after the regress command.
//Example 1
//running regressions
timer on 1
use c:/exp1
regress a b c if c<.5
regress a b c if c<.5
regress a b c if c<.5
timer off 1
timer list
Example 2
The above example's commands have been modified to bring in only the required observations
(the ones that satisfy the qualifier). To do this we use the 2nd syntax of the use command.
clear
timer on 2
use if c<.5 & b<.5 using c:/exp1
regress a b c
regress a b c
regress a b c
//use c:/exp1
timer off 2
timer list
//the timer gives the following results:
. timer list
1: 20.25 / 1 = 20.2500
2: 9.75 / 1 = 9.7500
Example 3
Use the mark command to flag the observations that are to be included in the regression and then
use the marker variable as the if qualifier.
clear
timer on 3
use c:/exp1
mark a1 if c<.5 & b<.5
regress a b c if a1
regress a b c if a1
regress a b c if a1
timer off 3
timer list
For help on specific commands type: help
and then the specific command eg.
help use
help mark
Top
(August 2011)
With Stata 12 there are some new commands that make getting tables into an Excel spreadsheet
easier.
Stata 12 returns a matrix of the regression table in r(table). To see this, run a regression and type:
matrix list r(table)
Stata 12 has a command for exporting data to an Excel file: export excel. This command
can be accessed via the GUI, ie. File>Export>Excel spreadsheet, or via the command line. To see the
syntax type:
help import_excel
clear all
local z=1
matrix a1=r(table)
matrix a2=a1[1..6,1..2]'
matrix list a2
clear
svmat a2, names(matcol)
generate name="`i'" in 1
replace name="_cons" in 2
if `z'==1 {
export excel using "c:\stuff.xls", sheetmodify cell(a`z') firstrow(variables)
}
else {
export excel using "c:\stuff.xls", sheetmodify cell(a`=((`z'-1)*2)+2')
}
local ++z
} //loop
For help on specific commands type: help
and then the specific command eg.
help import
Top
In Stata 12 log files are still output as either SMCL or text. However, in Stata 12 these log files
can be converted into PDF files. This can be easily done with the Stata translate command, for
example:
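A minimal sketch of the conversion is given below. The file names mylog.smcl and mylog.pdf are hypothetical, and it is assumed that the PDF translator is available on your platform.

```stata
sysuse auto, clear
capture log close
log using mylog.smcl, replace
summarize mpg weight
log close

* translate picks the translator from the two file extensions
translate mylog.smcl mylog.pdf, replace
```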
Also, in Stata 12 you can produce a PDF of a graph from within Stata. Example
sysuse auto, clear
scatter mpg weight //, name(g1)
graph export c:/graph.pdf //name(windowname)
For help on specific commands type: help
and then the specific command eg.
help translate
help graph export
Top
It is sometimes more convenient to use value labels rather than the graph relabel options to
change graph bar labels. In the example below, using value labels also allows the legend to be
spread over the width of the graph.
An example:
clear all
sysuse auto
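A minimal sketch of the idea is given below. The label name origin_lbl and the label text are hypothetical; asyvars and legend(rows(1)) are one way of getting a legend that runs across the width of the graph.

```stata
sysuse auto, clear
* foreign is coded 0/1 in auto.dta; attach new value labels to it
label define origin_lbl 0 "Built in the USA" 1 "Imported"
label values foreign origin_lbl

* the bars and the legend pick up the value labels automatically
graph bar (mean) mpg, over(foreign) asyvars legend(rows(1))
```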
For help on specific commands type: help and then the specific command eg. help label
Top
If you are running a large model and wish to know how Stata is progressing, would like a log
file emailed to you or others when Stata has finished a do file, or would like Stata to send out
emails based on a program that you write, then the following can be used.
Option 1
Getting Stata to automatically send an email to indicate progress in the running of a do file:
forvalues i=1/2000 {
if mod(`i',100)==0 { //<--1
tempname fh
file open `fh' using kk2.txt, write //<--2
file write `fh' "smtpserver = mail.whatever.com.au" _n //<--3
file write `fh' "from = myeamail@whatever.com.au" _n //<--4
file write `fh' "to = reciever@whatever.com.au" _n //<--5
file write `fh' "subject = Test Message" _n //<--6
file write `fh' "body = `i' Test Message" _n //<--7
file close `fh'
!CommandLineEmailer /p:kk2.txt //<--8
erase kk2.txt //<--9
}
log close
exit
2. Using Stata's file command you create a text file that contains the instructions to run
CommandLineEmailer. The text file created is called kk2.txt.
3. The address "mail.whatever.com.au" must be changed to the address of your outgoing mail
server. To find this out with Windows Live:
Open Windows Live
Using pulldown menu: Tools>Accounts
Click on: Mail
Click on: Properties
Click on: the "Servers" tab
Find the address at: Outgoing Mail [SMTP]
7. Change the message in the body of the email to that required. In the above we have included `i'
to indicate the number of loops that have been completed. Other data can be included.
Option 2
If you require the log file to be emailed to you (or others) when the analysis has been
completed, the following can be done:
forvalues i=1/2000 {
display "Looping index: `i'"
}
log close
file write `fh' "body = log sent: `c(current_date)' `c(current_time)'" _n
!CommandLineEmailer /p:kk2.txt
exit
Generating a dataset
(April 2011)
Sometimes researchers expect a large dataset at some time in the future and wish to make sure
that their version of Stata can handle the dataset (within the limits of their version of Stata).
Also, they may wish to check that their version does the analysis in a timely manner and that their
computer is set up to handle the data; otherwise there may be a need to upgrade the computer
and/or upgrade the flavour of Stata, eg. to Stata/MP.
To see the limits of your existing flavour of Stata type: help limits
(1) Generates a data set with the specified number of continuous variables and
observations.
clear all
set memory 300m //<-- allocates 300 megabytes of memory to Stata
set obs 1000 //<--No. of observations
gen y=uniform()*10
forvalues i=1/100 { //<--No of variables
gen a`i'=uniform()*100 //<--cont. variables
}
summarize
clear all
set memory 300m
set obs 1000 //<--No of observations
generate y=uniform()<.5 //<--binary variables
forvalues i=1/100 { //<--No. of cont. variables
generate a`i'=uniform() //<--cont. variables
}
tabulate y
clear all
set memory 300m
set more off
set obs 1000 //<--No of observations
generate y=mod(_n,4)+1 //<--cat. variable
forvalues i=1/10 { //<--No of cont. variables
generate a`i'=uniform() //<--cont. variables
}
tabulate y
For help on specific commands type: help and then the specific command eg. help obs
Top
Recently a few people have inquired about the printing of log files. People have had problems
with truncation of the right-hand side of the log file.
Stata has a few settings that allow control over the way a log is printed.
Option 1
Stata has various system settings. These can be seen by typing: query
To set the width of the text across the page use:
set linesize #
Example:
set linesize 85
The above example sets the linesize of the Results window, and hence the log, to 85 characters.
(Note: not all commands are affected by the linesize setting; see the Stata 11 manual for more
details.)
Note: The linesize setting must be made before the log is started.
Option 2
When printing you can also control the print font size. To change this load the log file into a
Viewer window and :
Option 3
Print using the print command and include overrides eg.
print c:/experiment.smcl, header( off) fontsize( 6) logo(off) lmargin(3)
The overrides for the translators can be found by typing translator query and then the name
of the translator. For example:
For further help on the above code, type the following on the Stata command line:
help log
query
help viewer
help translator
Top
Stata has a considerable collection of time and date functions. These can be found by typing:
help date()
Often you wish to limit a command to before, after or between particular dates. This is easily
done using the td() date pseudofunction or, if the dataset has been set for time series, the tin()
function.
clear
input str20 starts_d a
"20jan1980" 1
"20jan1981" 2
"20jan1982" 3
"20jan1983" 4
"20jan1984" 5
"20jan1985" 6
"20jan1986" 7
"20jan1987" 8
"20jan1988" 9
end
list
Note 1: Generates a new variable (date1) which is the elapsed time in days from a date datum (1
Jan 1960). This variable is numeric.
Note 2: Summarizes a subset of the data, the subset being determined by the pseudofunction
td(). The number of observations in the subset is shown under Obs.
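Notes 1 and 2 can be sketched as follows, on a shortened version of the data. The cut-off date 20jan1981 is chosen only for illustration.

```stata
clear
input str20 starts_d a
"20jan1980" 1
"20jan1981" 2
"20jan1982" 3
"20jan1983" 4
end

generate date1 = date(starts_d, "DMY")    // Note 1: elapsed days since 01jan1960
format date1 %td

* Note 2: td() turns a typed date into the same elapsed-day number
summarize a if date1 > td(20jan1981)
```

Only the 1982 and 1983 observations are later than the cut-off, so summarize reports 2 observations.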
clear
input str20 starts_d a
"20jan1980" 1
"20jan1981" 2
"20jan1982" 3
"20jan1983" 4
"20jan1984" 5
"20jan1985" 6
"20jan1986" 7
"20jan1987" 8
"20jan1988" 9
end
list
generate date1=date(starts_d,"DMY")
format date1 %td
tsset date1 //<-Note 3
Note 3: tsset is the command that sets the data for time series.
Note 4: tin() determines the subset of the data. This function allows a lower and upper limit to be
specified, the lower limit being on the left and the upper on the right. If the left-hand limit is omitted
Stata assumes that the subset starts at the beginning of the data, and conversely if
the right-hand limit is omitted Stata assumes the end of the dataset.
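The use of tin() in Note 4 can be sketched as follows, again on a shortened version of the data and with limits chosen only for illustration.

```stata
clear
input str20 starts_d a
"20jan1980" 1
"20jan1981" 2
"20jan1982" 3
"20jan1983" 4
"20jan1984" 5
end
generate date1 = date(starts_d, "DMY")
format date1 %td
tsset date1

* both limits given (inclusive): 1981, 1982 and 1983
summarize a if tin(20jan1981, 20jan1983)

* left-hand limit omitted: from the start of the data up to 20jan1981
summarize a if tin(, 20jan1981)
```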
For further help on the above code type the following on the Stata command line:
help date()
help tsset
Top
Multiple graphs can be produced in Stata 11 with loops. If all the numeric variables are
to be graphed as histograms the following can be used:
The Stata "confirm" command checks if the variable is a numeric variable. With the "capture"
prefix in front, the return code _rc is set to 0 if it is, and to some other value if not. The return
code _rc is then checked with the "if" command: if true the histogram is drawn, if false the loop
moves on to the next variable in the "foreach" loop.
If you do not wish to run all the variables in the dataset the following can be used:
sysuse auto, clear
graph drop _all // drop existing graphs
local a "mpg turn"
foreach i of local a {
capture confirm numeric variable `i'
if _rc==0 {
histogram `i', name("`i'")
}
}
exit
ds , has(type int)
return list
exit
Or
ds make , not
return list
histogram `i', name("`i'")
}
exit
This time the ds command is still used, but the variables that you do not wish to graph are
excluded with the not option.
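A self-contained sketch of a ds-driven loop is given below. It uses the has(type int) form; the local macro name intvars is hypothetical.

```stata
sysuse auto, clear
graph drop _all
ds, has(type int)                 // list only the variables stored as int
local intvars `r(varlist)'        // keep the list before other commands overwrite r()
foreach i of local intvars {
    histogram `i', name(`i', replace)
}
```

The same loop works with ds make, not in place of the has() form; only the contents of r(varlist) change.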
The default display for multiple graphs is to show each graph in a separate graphics window. To
show all the graphs in the one window (tabbed graphs) the Stata setting autotabgraphs can be set to
on, eg.
set autotabgraphs on
Also, when displaying graph in the one graphics window the display can be altered by pulling the
tab into the desired part of the window. An example:
For further help on the above code, type the following on the Stata command line:
help capture
help ds
help forvalues
Top
Stata 11 PDF
(December 2010)
Stata 11 includes the manuals as PDF files; all 8000+ pages! The manuals include detailed examples
of Stata commands, technical details, references and the maths behind the commands. While Stata's
online help is handy for those who are already familiar with a command, the manuals are very
useful for learning about new commands. There are various ways of accessing the PDF manuals.
These are:
1. To access the entire set of PDF manuals you can use the Pull down menu: Help>PDF
Documentation
2. For a specific entry, open a Stata online help page (eg. help regress ) and then click on the
hyperlink
3. Creating a hyperlink on the Results Windows. This is particularly helpful for Stata courses or
emailing a reference to a fellow Stata user.
OR
*******pdf.ado************
program pdf
display in smcl "{manpage GSM 141} {hline 2} starting MAC"
display in smcl "{manlink R regress} {hline 2} Linear regression"
end
**************************
Save the above file as pdf.ado and put it on the adopath (suggest c:/ado/personal);
then, to bring up the hyperlinks, type pdf on the Stata command line.
For further help on the above code, type the following on the Stata command line:
help adopath
help display
help smcl
Top
Regular Expressions
(November 2010)
Stata has regular expressions that allow you to work with simple or complex text.
One application of regular expressions is working with address data. The following shows how
to (in most cases) separate the postcode and state from an address.
clear
input ///
str100 address
"1234 West St Blackburn 3000 Vic"
"West St 1234 Blackburn 3000 vic"
"West St 1234 Blackburn Vic 3000"
"West St 1234 Blackburn sa 3000"
"12 West St Backburner 2001 nsw"
end
list
//getting postcode
generate postcode1=regexs(2) if regexm(address,"(^.*)([0-9][0-9][0-9][0-9])") //comment: reg1
//get state
generate state=regexs(0) if regexm(address,"([Vv][Ii][Cc]|[Nn][Ss][Ww]|[Ss][Aa])") //comment: reg2
//You could verify that the first number of the postcode matches the state
generate check=1 if lower(state)=="vic" & regexm(postcode1,"[0-9]") & regexs(0)=="3" //comment: reg3
list
Notes:
reg1: (^.*) means get any text "." zero or more times "*"; the brackets around this indicate a
subsection of the string, in this case subsection 1.
Subsection 1 continues until the last 4-digit number, as indicated by: ([0-9][0-9][0-9][0-9])
reg3: looks for a match of state: lower(state)=="vic"; the lower() function makes sure that we are
comparing the states in lower case. regexs(0)=="3" checks that the match from the preceding
regexm() is the number 3, the correct start of a Vic postcode.
Assuming that the postcode has been incorrectly coded with the inclusion of alpha characters
and needs to be cleaned up. The following is one way of doing this.
clear
input ///
str100 address
" 3a00c1 West St Blackburn 3a00c0 Vic"
"West St 123 Blackburn 3Re00c1 vic"
"West St 123 Blackburn Vic 3f000"
"West St 123 Blackburn sa 30jj00"
"12 West St Backburner 2001 nsw"
end
list
tempvar a1 a2 a3
gen `a1'=""
gen `a2'=""
gen `a3'=""
local aa "[A-Za-z]"
//assume that the post code is in the second half of the string
replace `a1'=regexs(0) if regexm( substr(address,strlen(address)/2,.)," ([3])(`aa'|[0-9])*") //comment: reg4
replace `a2'=regexs(3) if regexm(`a1', " ([3])(`aa'*)([0-9]*)")
replace `a3'=regexs(5) if regexm(`a1', " ([3])(`aa'*)([0-9]*)(`aa'*)([0-9]*)")
list
Notes:
reg4: substr(address,strlen(address)/2,.) limits the search to the second half of the string. The
space in " ([3]) between the " and ( indicates that a space is required and ([3]) indicates that the
match must start with the number 3. The second subsection, (`aa'|[0-9])*, looks for a lower or uppercase
character OR, as indicated by the OR symbol "|", a number. The "*" at the end of the second
subsection indicates zero or more times.
The following is a problem that requires separating the days, months and years into separate
variables.
clear
input ///
str40 dpr
"2 yrs 5months 26 days"
"3 yrs 2 months"
"1yr 9 months"
"1 yr 8 months"
"1 yr 11 months 28 days"
"1 yr 12 days"
"3 yrs 3 months12 days"
"3yrs 4 months 26 days"
"1 yr 9mnths 8 days"
end
list
list
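One possible way of pulling out the three numbers is sketched below, on a few of the strings above. The patterns are assumptions about how the units are spelled in this data ("yr"/"yrs", "months"/"mnths", "days") and would need adjusting for other spellings.

```stata
clear
input str40 dpr
"2 yrs 5months 26 days"
"1yr 9 months"
"1 yr 12 days"
"3 yrs 3 months12 days"
"1 yr 9mnths 8 days"
end

* ([0-9]+) captures the number; " *" allows zero or more spaces before the unit
generate years  = real(regexs(1)) if regexm(dpr, "([0-9]+) *yrs?")
generate months = real(regexs(1)) if regexm(dpr, "([0-9]+) *m[a-z]*ths")
generate days   = real(regexs(1)) if regexm(dpr, "([0-9]+) *days?")
list
```

Where a unit is absent from the string, the corresponding variable is left missing.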
For further help on the above code, type the following on the Stata command line:
A useful addition to your Stata setup is a profile.do file. This is a do file that Stata looks for and
runs when starting a Stata session.
To create a profile.do file, click on the "New do-file editor" or type doedit on the Stata command line
and then type in commands that you wish to have executed when Stata starts up. Then save this
file where Stata can find it ie. on the adopath.
global F4 "summarize;"
Pressing F4 now executes the summarize command.
The profile.do file can also be used to load dialogue boxes into the USER pulldown menu. For an
example see:
http://www.stata-journal.com/sjpdf.html?articlenum=pr0012
(this shows how to include meta-analysis dialogue boxes)
When including a profile.do make sure that it is on the adopath; so Stata can find it. To see the
adopath type: adopath
Stata's system variables _n and _N can be used to do a large number of otherwise difficult tasks.
In this tip we will illustrate some of the things that these can be used for.
Definition:
_n : Current observation
_N : Total number of observations
**Example 1
Generating observations that are a sequence of numbers equal to the Stata observation number.
The resulting variable: number
Generating observations equal to the last observation number. The resulting variable: number_T
clear all
set obs 10
generate number=_n
generate number_T=_N
**Example 2
Reversing the data so that the _N (last) observation becomes the first. This is done for a particular
variable.
clear
set obs 10
generate number=_n
generate rev_number=number[_N-_n+1]
list
**Example 3
Using _N with the bysort prefix to generate a variable that holds the total number of children in
each family.
clear
input ///
famid child
1 1
2 1
2 2
2 3
3 1
3 2
3 3
3 4
end
list, sepby(famid)
**Example 4
_n and _N can also be used in a qualifier. In this example we mark, for each family, the child who
has the greatest income. The income variable is in brackets, which tells Stata to sort the observations
by income within each family. When sorted, the last observation (_N) in each family is the one with
the greatest income for that family.
clear
input ///
famid child income
1 1 100
2 1 150
2 2 200
2 3 250
3 1 10
3 2 100
3 3 500
3 4 250
end
l, sepby(famid)
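Examples 3 and 4 can be sketched together on the data above. The variable names n_children and top_income are hypothetical.

```stata
clear
input famid child income
1 1 100
2 1 150
2 2 200
2 3 250
3 1 10
3 2 100
3 3 500
3 4 250
end

* Example 3: within each family, _N is the number of children
bysort famid: generate n_children = _N

* Example 4: sort by income within family; the last row (_n==_N) has the top income
bysort famid (income): generate top_income = (_n == _N)
list, sepby(famid)
```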
**Example 5
input ///
time sales
1 100
2 150
3 200
4 250
5 10
6 100
7 500
8 250
end
generate lead=sales[_n+1]
generate lag=sales[_n-1]
generate lags=(sales[_n-1]+sales[_n-2])/2
list
     +----------------------------------+
     | time   sales   lead   lag   lags |
     |----------------------------------|
  1. |    1     100    150     .      . |
  2. |    2     150    200   100      . |
  3. |    3     200    250   150    125 |
  4. |    4     250     10   200    175 |
  5. |    5      10    100   250    225 |
     |----------------------------------|
  6. |    6     100    500    10    130 |
  7. |    7     500    250   100     55 |
  8. |    8     250      .   500    300 |
     +----------------------------------+
help bysort
Top
Stata's log file reproduces what you see in the Results window. Often there is a lot of material
that is not needed for a final report, and this material needs to be edited out before the report is
presented to others. The log file can be edited from the do file as it is written: write the do file as
you normally would and then decide what needs to be included.
The example below has 2 ways of controlling the final log file output:
1. Turning the log on and off so that only the material that you wish to see is added
To do this, define a few local macros at the start of the do file and include these where required
between Stata commands.
//set macros
local new "capture log using out1, text replace"
local on "capture log using out1,text append"
local off "capture log close"
*this is a comment
`off' //off
regress mpg weight
`on' //on
display "`e(rss)'"
`off' //off
generate gpm=1/mpg
`on' //on
*this is GPM
summarize gpm
`off' //off
help macro
help filefilter
help type
Top
When you're working with a data management or statistical command in Stata that you have not
previously used, you may not be confident that you are using it correctly. So rather than work
with the complete data set, it is often useful to make up a small data set that contains the critical
cases and run the command on this to see if it does what you anticipated. Once satisfied, you can run it
on the complete data set. For example, if I wished to identify the observations that included the
current date and up to 4 days in advance, the following could be used:
clear
input ///
str15 dates
"12/7/2010"
"13/7/2010"
"14/7/2010"
"15/7/2010"
"16/7/2010"
"17/7/2010"
"18/7/2010"
"19/7/2010"
end
list
. list
     +-------------------+
     |     dates   date1 |
     |-------------------|
  1. | 12/7/2010       . |
  2. | 13/7/2010       . |
  3. | 14/7/2010       1 |
  4. | 15/7/2010       1 |
  5. | 16/7/2010       1 |
     |-------------------|
  6. | 17/7/2010       1 |
  7. | 18/7/2010       1 |
  8. | 19/7/2010       . |
     +-------------------+
. exit
Errors in logic can now be spotted more easily, and you have saved time by not running the
complete data set. Once this has run satisfactorily it can be included in the main do file.
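The command that created date1 is not shown in the listing. A sketch that reproduces the output above, assuming the "current date" was 14 July 2010 (inferred from the flagged rows) and using a helper variable called day (an illustrative name):

```stata
clear
input str15 dates
"12/7/2010"
"13/7/2010"
"14/7/2010"
"15/7/2010"
"16/7/2010"
"17/7/2010"
"18/7/2010"
"19/7/2010"
end
generate day = date(dates, "DMY")      // convert the string to a Stata date
format day %td
local today = mdy(7, 14, 2010)         // assumed current date; use today's date in real work
generate date1 = 1 if inrange(day, `today', `today' + 4)
list dates date1
```

Fixing the date in a local macro keeps the test reproducible; in the real do file it could be replaced by the current date.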
help input
Top
In Stata 11 the Do-file Editor can be split, making some types of work easier. To do this there
must be at least two tabs open in the Do-file Editor. Drag one of the tabs to the middle of the editor. When a
selection box appears, select one of the options and 2 tabbed Do-file Editor windows appear.
Top
Stata 11's factor variables can be combined with lincom to quickly produce tables.
In this example we look at the table on p. 226 of "Statistical Modeling for Biomedical Researchers:
A Simple Introduction to the Analysis of Complex Data, 2nd Edition" by William D. Dupont (see
our bookshop to order).
local a`i'`j'=r(estimate)
}
}
local a11=1
contract a
keep a
matrix aa=( `a11', `a12', `a13' \ `a21', `a22' ,`a23' \ `a31', `a32' ,`a33'\ `a41', `a42' ,`a43')
svmat aa
rename aa1 Tobacco_0_9
rename aa2 Tobacco_10_29
rename aa3 Tobacco_30
list
exit
In the above, the forvalues loops step through the different levels of alcohol and smoke. These are then applied to the factor variables
in the lincom command. The values returned by lincom are stored one at a time and then assembled into a Stata matrix. After going
through all the combinations of alcohol and smoke, the matrix is put into the Stata data set and some labels are applied.
For more information on the specific commands type help and then the command, eg. help lincom
Top
Stata Graphs
(April 2010)
From time to time Stata is used to produce non-standard/interesting graphs. I have compiled some
of these graphs. These have mainly been presented on the Statalist. To see these graphs click
here. This page will be updated from time to time.
To see some of the User written graph commands click here. (from a previous tip)
Top
Tabdisp
(March 2010)
tabdisp is a Stata command that allows you to display Stata tables. This command gives you a lot of
control over the way that the elements are displayed.
summarize _freq
generate percentage=(_freq/r(sum))*100
gen freq=string(percentage, "%5.2f")
If the above is all that is required, then the user written programs tab2way or tab3way
could be used instead.
There are many other ways to display your data, eg. including the words max and min in the table
cells
sort _freq
For help on the individual commands type help and then the command name.
To download the user written command tab2way or tab3way , type: ssc install tab2way or ssc
install tab3way
Top
Tables to spreadsheet
(February 2010)
The tabulate command allows the values of the table to be saved as matrices, eg. the relevant options of the
tabulate command are matcell(), matrow() and matcol(). These matrices can be put into a
spreadsheet. The table command, however, does not have these matrix options. Even so, there
are workarounds that make it easy to put the results that the table command would have given
into a spreadsheet. This tip explores a number of ways that this can be done:
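For comparison, a minimal sketch of the tabulate route mentioned above. The matrix names freq, rows and cols and the output file name table_freq.csv are illustrative assumptions, not part of the original tip:

```stata
sysuse auto, clear
* save the cell frequencies and the row/column values as matrices
tabulate foreign rep78, matcell(freq) matrow(rows) matcol(cols)
matrix list freq
clear                         // drops the data but keeps the matrices
svmat freq                    // turn the frequency matrix into variables freq1-freq5
outsheet using table_freq.csv, comma replace   // hypothetical file name, written to the working directory
```

The matrices hold only numbers, so row and column labels have to be added separately if they are needed in the spreadsheet.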
This is the command that we wish to use, with the resulting table then taken out of Stata and put into a
spreadsheet
The following gives us what we want but does not allow the output to be put into a spreadsheet
This time the table is put into a Stata data set so that it can be exported to a spreadsheet
This method has the advantage that the column and row labels are also included
drop if rep78==.
reshape wide price, i(foreign) j(rep78) //because the data is in long form it can be reshaped
// into the required table
list
outsheet using c:/table, replace //outputting the table to a form that can be read with a spreadsheet
sysuse auto, clear
collapse (mean) price, by(foreign rep78)
fillin foreign rep78
drop if rep78==.
sort for rep78
list
mata: //start of Mata
a=st_data(.,.)
a
s=J(2,6,.)
s
for(i=1; i<=10; i++) {
r=a[i,2]
c=a[i,1]
s[r+1,c]=a[i,3]
}
names = st_varname((1..3))
names
b2=st_varvaluelabel(names[1,1])
b2
if(b2!="") {
zy2=uniqrows(a[.,1])
b3=st_vlmap(b2, zy2)
b3
} else {
b3=strofreal(uniqrows(a[.,1]))
b3
}
b2a=st_varvaluelabel(names[1,2])
b2a
if(b2a!="") {
zy2a=uniqrows(a[.,2])
b3a=st_vlmap(b2a, zy2a)
b3a
} else {
b3a=strofreal(uniqrows(a[.,2]))
b3a
}
table=(""\b3a) ,(b3',"."\strofreal(s))
table
mm_outsheet( "c:/table1" ,table, mode="r") //user written program output to a Excel readable file
end
As you can see there are a number of different ways of getting table information out of Stata.
For help on the individual commands type help and then the command name. To download the
user written command mm_outsheet, type: ssc install moremata
Top
Tables to spreadsheet
(January 2010)
When a large number of tables have to be put into a spreadsheet and no user written
program is available to do this easily, the following method can be used:
Write a program for the particular table (or any output) that you require. If there are a number of
different tables then write a program for each type of table.
The program starts a log file and then runs the table command. It then closes the log file. The log
file is then put through a file filter to remove any unwanted text.
The partially cleaned up log file is then imported into Stata using the insheet command and
cleaned up further: any remaining unwanted text is removed and the columns of the table are split into
Stata variables. The extent of the clean up depends on the desired output.
Once the cleaning up is finished, the result is either saved or appended to an existing file, depending on the
program option chosen.
filefilter a.log a1.log , from("-") to("") replace //deleting unwanted text in the log file
filefilter a1.log a2.log , from("|") to("") replace
filefilter a2.log a3.log , from("+") to("") replace
insheet using a3.log, clear //brings the modified log file into Stata
drop in -4/-1 //get rid of other material
drop if strpos(v1,"log")
drop if strpos(v1,"pause")
drop if strpos(v1,"resumed")
drop if strpos(v1,"unnamed")
drop if strpos(v1,":")
capture drop v2
split v1
drop v1
if "`append'"!="append" {
save `gen', replace
}
if "`append'"=="append" {
append using `gen'
save `gen' , replace //saves file to hard disc
}
end //end of program
*********************************************
**the commands that calls the above program
*********************************************
sysuse auto, clear
outsheet using c:/aa.csv, comma nonames replace //saving to disk; this can be opened in a spreadsheet
exit
Top
Examples:
OR
OR
This then displays the point estimate. The last method is useful when a number of estimates
need to be made.
Top
Stata's quietly prefix allows commands to be run without sending output to the Results window. This
is useful if you only require the returned results (eg. r(mean) etc.; see help return list ) and not the
actual output.
Example:
sysuse auto, clear
quietly summarize mpg, detail
or
If you wish to see specific output in a quiet block you can add noisily to this
Example:
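A minimal sketch of a quiet block with one noisy line, assuming the auto data set that ships with Stata; the choice of commands inside the block is illustrative only:

```stata
sysuse auto, clear
quietly {
    summarize mpg, detail                       // output suppressed
    noisily display "Median mpg = " r(p50)      // this line is shown
    local n = r(N)                              // returned results are still available
}
display "Observations used: `n'"
```

Everything inside the braces runs silently except the line prefixed with noisily.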
Top
Graphing functions
(October 2009)
The graph histogram command allows a normal distribution option to be included in this graph.
The twoway graph however does not have this option. However, this can also be easily done by
adding a function graph, as shown in the following example:
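A sketch of how this might look, assuming the auto data set; the local macro names m and s are illustrative:

```stata
sysuse auto, clear
quietly summarize mpg
local m = r(mean)
local s = r(sd)
* overlay a normal density with the sample mean and sd on a twoway histogram
twoway (histogram mpg) ///
       (function y = normalden(x, `m', `s'), range(mpg))
```

The range(mpg) option draws the curve over the observed range of mpg, so the two plots share the same x axis.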
Top
The various flavours of Stata have limits on various commands, label lengths, macro lengths etc.
One of the limits is the maximum number of variables that can be loaded into Stata.
To see the limits of the various flavours of Stata see: help limits
If your data set contains more variables than your flavour of Stata allows (eg. more than 2047) and you do not need all of them in Stata, then
the second syntax of Stata's use command can be used to load a subset of the data set into Stata
help use
use [varlist] [if] [in] using filename [, clear nolabel]
example:
use mpg using "c:/program files/stata11/auto", clear
example:
describe using "c:/program files/stata11/auto", varlist
return list
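The same idea can be tried without hard-coding an installation path. This sketch assumes auto.dta can be located on the adopath with findfile; the macro name autofile is illustrative:

```stata
findfile auto.dta                        // locate the example data set
local autofile "`r(fn)'"
describe using "`autofile'", short       // inspect the file without loading it
use make mpg price using "`autofile'", clear   // load only three variables
describe
```

Only the listed variables are read from disk, so the full data set never has to fit within the variable limit.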
Also see:
help memory
Top
Capture
(July 2009)
Stata commands that result in an error issue a non-zero return code (_rc). In Stata 10 and Stata
11 the return codes can be seen in the Review window (you may need to widen the Review
window to see the _rc column)
If a command in a do file produces an error the do file will stop. This can be prevented by prefixing
the command with capture eg.
In the above example 1, a do file/program would stop running if there was no log file open. Stata
requires a log file to be open before it can be closed and no other log file open before it can open a
log file.
In example 2, a do file/program would continue to run even if there was no log file open. The
capture command allows errors to be ignored.
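The two cases described above might look like the following sketch. It is an illustration under the assumption that the examples concerned log close, not the original examples:

```stata
* Example 1 style: without capture this line would stop a do file when no log is open
* log close
* Example 2 style: with capture the error is ignored and the do file carries on
capture log close
capture log close          // no log can be open now, so this one fails quietly
display _rc                // non-zero: the return code of the failed command
```

The return code left in _rc always belongs to the most recent captured command.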
Apart from preventing a do file/program from stopping, the capture command can also capture a
command's return code in _rc. The return code (_rc) can then be used to make a decision in your
do file/program.
Example
sysuse auto, clear
tostring mpg, replace //for the purposes of the example convert mpg to a string variable
describe
foreach v of varlist price-foreign {
capture confirm numeric variable `v'
display _rc //allow you to see the return code
if _rc { //if _rc is not 0 (zero) the statement is true and Stata goes into the loop
destring `v',replace
describe `v'
}
}
Also see:
http://www.stata.com/statalist/archive/2009-06/msg00623.html (An example of how to use a
return code to set up the default directory in Stata.)
help confirm
help capture
Top
Transparent Graphs
(June 2009)
Stata graphs can be made transparent in MS Word and other software. For example the
following graph was produced in Stata and then made transparent in Word.
twoway ///
(histogram mpg if rep78==3, fcolor(green)) ///
(histogram mpg if rep78==4, fcolor(blue))
graph export c:/hist.wmf, replace
Click on the graph
Edit picture
Right click on a bar that you wish to make transparent
Format AutoShape>Color and lines tab>Fill section, then move the transparency slider to 50%
and press OK
Continue to edit all the bars this way. The legend can also be modified in the same way
Save
Also see:
http://www.stata.com/statalist/archive/2009-04/msg00574.html
http://www.stata.com/statalist/archive/2009-04/msg00612.html
Top
Stata has a great graph editor. However, after you have modified your graph the editor will not
produce the normal Stata code for this graph. It is still possible to retrieve the editing
commands if they have been recorded using the Graph Editor recorder: add gr_edit at
the start of each recorded line and then add these lines to the initial graph code. Now you have the code
to reproduce the graph.
Example:
Assume that you have run the following
sysuse auto, clear
histogram mpg
Then you click on the Start Graph Editor icon and press the Start recording icon, and alter
the color of the histogram bins. You then stop the recorder and save the recording on the hard disk
with a suitable name and path, and open the recording (just saved) in Stata's do editor.
The line:
plotregion1.plot1.style.editstyle area(shadestyle(color(gs7))) editcopy
is retrieved and gr_edit is added to the front of it.
When this is run after the original commands it will reproduce the original graph complete with the edit.
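Put together, the resulting do file might look like this sketch, which simply combines the original commands with the recorded line quoted above:

```stata
sysuse auto, clear
histogram mpg
* recorded Graph Editor line with gr_edit added at the front
gr_edit plotregion1.plot1.style.editstyle area(shadestyle(color(gs7))) editcopy
```

Each further recorded edit would be added as its own gr_edit line after the graph command.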
Also see:
http://www.stata.com/statalist/archive/2008-07/msg00932.html
help graph play
Top
It is possible to put results from Stata into a Word document by first obtaining your results in Stata
and then using mail merge to get them into Word.
For example, if you wish to automate your report writing and require the max. and min. mpg in a
Word report (using the auto.dta data set that comes with Stata), this can be done with the
following do file. The user written package moremata is used, so this must first be installed. To install
it, type ssc install moremata on the Stata command line.
********************weaving do file*********************************
sysuse auto, clear
*determine max and min mpg
quietly: sum mpg
local max_mpg =r(max)
local min_mpg =r(min)
di `max_mpg' //only if required to see results in Stata
di `min_mpg' //only if required to see results in Stata
mata
a="max_mpg"\st_local("max_mpg")
a1="min_mpg"\st_local("min_mpg")
a2=a,a1
a2
mm_outsheet("c:/tips.txt", a2, mode="r")
end
********************weaving do file*********************************
After running the above the text file tips.txt is produced (in C:/ drive)
Open the data source in Word and then run Mail Merge
Using this method you can include tables, graphs etc. into your Word document.
References:
http://ideas.repec.org/p/boc/asug05/14.html
When running a do file you may wish to inspect the data at various points. Stata has a number of
ways of doing this. For example:
Option 1:
Using the edit command. This opens the Data Editor and allows you to inspect the data. When the
editor is closed the do file continues to run. (Instead of edit you could use browse to open
the Data Browser)
sysuse auto, clear
regress mpg weight
edit //stops Stata and opens the data edit window
summarize
exit
exit
Option 3:
sleep stops Stata for a specified number of milliseconds
sysuse auto, clear
regress mpg weight
sleep 1000 //sleep specifies the number of milliseconds to wait
beep //used to wake you up if the sleep is too long
summarize
exit
Option 4:
exit stops a do file. To run more of the do file, move the exit command further down the do file and run
it again.
sysuse auto, clear
regress mpg weight
local a 1
exit //the do file stops at this point; move exit to a later line and run again
display `a'
help edit
help browse
help more
help exit
Top
Greek symbols (or other symbols) can be added to Stata graphs. To add these you must first set
up your computer for this eg.
In Windows XP
then click Apply and then OK (the computer will then be required to be restarted )
then in Stata:
using the pull down menu:
Edit>Preferences>Graph Preferences
Then font select Arial Greek
To see the numbers used in the extended code you can use the Nick Cox written graph:
asciiplot
(this is a user written command and must first be downloaded)
To download asciiplot type the following on the Stata command line
ssc install asciiplot
scatter weight mpg, title( Example of Greek characters in a Graph `=char(238)' `=char(243)' `=char(236)' )
Top
bysort is a Stata prefix command that allows you to execute commands by levels/groups of the
variable(s) that you specify
Example:
If you wanted to generate a new variable with a 1 at the first occurrence of each level of mpg the
following can be used (using the auto data set that comes with Stata):
sysuse auto, clear //load the auto data set into Stata
bysort mpg: gen first=1 if _n==1
If you wanted to generate a new variable with a 1 at the last occurrence of a level of mpg the
following can be used:
sysuse auto, clear //load the auto data set into Stata
bysort mpg: gen last=1 if _n==_N
Sorting within the group: eg. if you wanted to mark the car with the smallest weight within each level of
mpg the following can be used:
Note that the brackets around the weight variable name indicate to Stata that weight is not to be
used as a level/group criterion; instead the observations are sorted by weight within each level of mpg
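The command itself is not shown above. A sketch of what it might look like, assuming the marker variable is called lightest (an illustrative name):

```stata
sysuse auto, clear
* sort by weight within each level of mpg, then mark the first (lightest) car
bysort mpg (weight): generate lightest = 1 if _n == 1
list make mpg weight if lightest == 1
```

Using _n==_N instead of _n==1 would mark the heaviest car in each group.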
Top
A tutorial showing different options for the automatic production of tables can be obtained by the
following commands:
help tabletutorial
Top
Stata generally stores all of the dataset that it is working with in the computer's memory.
Therefore, the computer should have sufficient RAM to load all of the data. Storing data in
memory allows fast access to the data. If the computer has insufficient memory and the
operating system allows it, the data is stored on the computer's hard disk, ie. Stata uses virtual
memory; however, this can be very slow.
Stata assigns an amount of memory to itself so that it can store the data in RAM, so whatever
this is set to must be sufficient to store the entire data set. The memory settings in Stata can be
changed to allow sufficient memory for the data set.
A quick way to determine the average width of the variables (in bytes) is as follows
(type the following on the command line or into a do file):
describe
display r(width)/r(k)
Then put this number (average variable width) into the online calculator. The result from the online
calculator is the minimum memory required, so allow 30-50% more than this for additional variables
etc.
Then set the memory using the set memory command eg. set memory 50m
References
http://www.stata.com/statalist/archive/2005-07/msg00348.html
Top
Stata Comment
(October 2008)
Stata has a number of ways of adding comments to Stata code. Some of these are:
*
The star at the start of a line tells Stata to ignore what follows eg.
*this is ignored
/* */
The /* */ are used to add comments between code eg.
regress mpg /* weight is the independent variable */weight
or /* */ can be used to continue a command over two lines eg.
twoway (scatter mpg weight) /*
*/ (lfitci mpg weight)
///
Stata ignores what is after /// and continues on the next line eg.
regress mpg /// dependent variable
weight
The #delim ; command is useful in a number of ways. One use is to comment out blocks of
code/text eg.
#delimit ;
*
display "this is a comment"
display "this is a comment"
display "this is a comment"
*;
#delimit cr
Top
this command also has lots of other options. To download this type the following on the Stata
commandline:
ssc install tab2way
Stata users have written many commands for tables. To see a list of some of them type the
following on Stata command line (when online):
findit tab table
Then to download click on the hyperlink and follow the instructions
Top
Sending Command(s) to the Stata Do Editor from the Stata Review Window (July 2008)
While running Stata interactively, either with dialogue boxes or from the command line, the
command(s) that you issue to Stata are recorded in the Review window. These commands can
be put directly into the Do Editor for rerunning a session of Stata, for modifying the commands
and rerunning them, or as a record of the analysis.
Putting all or some of the contents of the Review window into the Do Editor can be
done as follows:
In the Review window, select the command(s) that you wish to send to the Do Editor.
Then: Right click the mouse button and select send to do-file editor
The Do Editor will then open with the highlighted command(s) in it. To run these, use the Do Editor
pulldown menu and select Tools>Do, or use the icon (in Stata 10 this is the icon on the far right), or
save the file and run it from the command line, eg. save it as c:/dofile and run it by typing do
c:/dofile on the Stata command line.
For more details on the Stata command type the following on the Stata command line:
help do
Top
odbc
Stat/transfer
Stata with the append command
In this example the Excel file is called book2 and is in c:/ drive. The file has two work sheets: kk1
and kk2
odbc
clear
tempfile kka
odbc load, dsn("Excel Files;DBQ=c:\book2.xls") table("kk1$")
save `kka'
list
clear
odbc load, dsn("Excel Files;DBQ=c:\book2.xls") table("kk2$")
list
append using `kka'
list
exit
Also see:
http://www.ats.ucla.edu:80/stat/stata/faq/odbc.htm
Using Stat/Transfer 9
With Stat/Transfer this would be done as follows:
Open tab: option 3
And then tick "concatenate worksheet pages"
Save each Excel worksheet as a csv in Excel. In this example c:/book2_kk1.csv and
c:/book2_kk2.csv are the two files created
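The commands for this third approach are not shown. A sketch of how the two csv files named above might be combined with insheet and append (the tempfile name kk1 is illustrative, and the two csv files must already exist):

```stata
clear
tempfile kk1
insheet using "c:/book2_kk1.csv", clear    // first worksheet, saved as csv
save `kk1'
insheet using "c:/book2_kk2.csv", clear    // second worksheet, saved as csv
append using `kk1'                         // stack the two worksheets
list
```

This assumes both worksheets have the same column headings, so that append lines the variables up.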
For more details on the Stata commands type the following on the Stata command line:
help append
help insheet
Top
Nick Cox has written a useful graph command (catplot) that graphs categorical variables. This
user written program can be downloaded for free.
To download this:
findit catplot
then click on the hyperlink
catplot from http://fmwww.bc.edu/RePEc/bocode/c
and then follow instructions
If the catplot command didn't exist and you wanted to produce a bar plot of the frequencies of the
categories of rep78 then you would have to do something like:
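One way this might be done with official commands only is sketched below, assuming the auto data set. It is an illustration, not the original example:

```stata
sysuse auto, clear
* reduce the data to one observation per category with its frequency in _freq
contract rep78 if !missing(rep78)
graph bar (asis) _freq, over(rep78) ytitle("Frequency")
```

Note that contract replaces the data in memory, so the original data set has to be reloaded afterwards.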
For more details on catplot see the online help help catplot (once installed)
For other graphs that Nick Cox has written see:
http://www.ats.ucla.edu/stat/Stata/faq/graph/njcplot.htm
Top
Material documenting the Stata Users' group meetings is worth looking through. It contains
articles on a large number of topics.
http://www.stata.com/meeting/proceedings.html
For example if you're not sure what regular expressions are, then have a look at:
http://ideas.repec.org/s/boc/wsug07.html
http://repec.org/wcsug2007/cameronwcsug.pdf
Top
One way of creating a binary variable is to generate a new variable containing 0 and then replace
the contents of the variable with 1 based on a qualifier eg.
generate dummy1=0
replace dummy1=1 if mpg <=25
A shorter way is to generate the variable directly from the expression mpg <=25.
This works because mpg <=25 is either true or false, and a logical expression in Stata evaluates to 1 if true and 0 if
false.
If the variable that is part of the expression contains missing values then include the if condition
!=missing() eg.
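A sketch of both points, assuming the auto data set; the variable names dummy2 and dummy3 and the cut-off for rep78 are illustrative:

```stata
sysuse auto, clear
* the expression itself returns 1 (true) or 0 (false)
generate dummy2 = mpg <= 25
* rep78 has missing values; without the qualifier they would count as >= 4
generate dummy3 = rep78 >= 4 if rep78 != .
tabulate dummy3, missing
```

Stata treats missing as larger than any number, which is why the qualifier matters for comparisons such as >=.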
Other ways of creating dummy variables can be found at: Stata FAQ
Also see: What is true and false in Stata?
Top
Many user written commands are stored in the SSC (Statistical Software Components) archive.
In the latest ado update for Stata 10 a new subcommand has been added to ssc:
ssc whatshot
Examples:
whatshot
whatshot, author(cox)
To get these commands you need to update Stata. To do this with the pull down menu:
Help>Official Updates and then click on www.stata.com. Then follow instructions.
For more information on SSC type help ssc on the Stata command line
Top
Undocumented commands
(October 2007)
In addition to the commands found in the Stata manual there are also undocumented commands
that you may find useful. To see these type help undocumented .
A command that you may find useful is: twoway__histogram_gen
This command generates the coordinates of the bars in a histogram. An example of how it works is:
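A sketch of one possible call, assuming the auto data set. The new variable names h and x are illustrative, and the options should be checked against help twoway__histogram_gen:

```stata
sysuse auto, clear
* h receives the bar heights and x the bar centres
twoway__histogram_gen mpg, gen(h x)
list h x if !missing(h)
```

The generated variables can then be plotted with twoway bar or used in further calculations.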
This command generates 3 matrices in mata, one for each of: value label name, value and the
label
Top
When learning new commands in Stata it is often useful to have examples of how the syntax is
applied. Stata's documentation includes many examples and allows you to download the data sets
for these (File/Example datasets), thus allowing you to reproduce the results. Stata's online
help also includes many examples. Another useful source of examples is Nick Cox's user
written program examples
An example of some of what you get by typing examples egen
Setup
To see a description of examples type the following on the Stata command line when online
ssc describe examples
To install examples, type the following on the Stata command line when online
ssc install examples
Top
Various features of Stata can be set to individual preferences or changed to meet the
requirements for a particular analysis.
To see what can be set type query on the Stata command line
Amongst the things that can be set (in Stata 10) is whether or not you would like graphs tabbed on
the graph window or each open graph in a separate graph window.
The syntax for this command is:
set autotabgraphs on , permanently
Stata 10 has a copy feature that allows you to copy highlighted parts of the Results window to
Word, Excel and other packages as a picture. To use this, highlight what you want copied in the
Results window, right click the mouse button and click on "Copy as Picture". Then paste into
the other package, where the picture can usually be cropped and edited in the normal way.
Top
Estout is a useful user written command for outputting regression results in various forms. For more
information see the estout web site (go here)
Top
Adoupdate
(May 2007)
The commands under update are useful for keeping Stata's executable file and the official Stata
ado files up to date (see help update). However, these do not keep the user written ado files up to
date. (user written programs that you have downloaded). To ensure that you are working with the
latest version of a user written ado file type adoupdate on the Stata command line or using the
pulldown menu help/SJ and user written programs and then click on update. (You must be online to
use this command)
Nested Do file
(April 2007)
Stata allows you to break up your analysis into logical sections, each part being a separate do
file, with all the parts of the analysis called from one do file, eg.
**master****the do files below are called from a do file that you have named master.do (it can be called any other name)
.
.
.
do projA_data
do projA_error_checking
run projA_data_man
if M1==2 {
do projA_A1 // projA_A1 exits when its analysis finishes
}
do projA_results
exit
**master*************
In fact Stata allows nesting up to a depth of 64, eg. a do file calls another do file which calls
another do file, up to 64 levels deep.
Advantages:
Being able to reuse do files (that you have previously used and know to have no bugs) for
other projects
Stata doesn't have a "goto line X" command. However, if you break down your analysis into
do files the same thing can be achieved.
Allows a quick overall view of the analysis
Easier to debug smaller do files than large do files
Some do files can be run with run (no output to the screen) and others with do (output to the
screen). This is easier than using Stata's quietly command
Disadvantages:
To learn more see: Stata 9 User's Guide [U] 16.2 and [U] 16.6.2
Max. depth of nested do files, in Stata type help limits
Top
Stata comes with help files for its commands. However, you may wish to compile a list of
frequently used but hard to remember commands in your own personal help file
Your own help file is saved as a file with the hlp extension, eg. me.hlp, on the adopath
{smcl}
{* 03may2005}{...}
{cmd:help Joe Blow } {right:updated 1 March 2007}
{hline}
To learn more about smcl see the Stata Users Guide or look at a Stata help file (.hlp extension) in
the do file editor.
Top
spmap is a user written command that can be down loaded for free.
To download:
Make sure that you are online.
Type findit spmap
Then click on the hyperlink.
Once installed type help spmap to see the help file. At the bottom of the help file there are
examples of what can be done. Click the hyperlink to see the graphs.
Top
The user written command stcmd can be used within Stata to change the data format of data
sets stored on disk. stcmd uses Stat/Transfer to do this.
To use this command you must first have Stat/Transfer and stcmd installed
To get Stat/Transfer contact Survey Design and Analysis Services (details below).
To get stcmd type findit stcmd in the Stata command Window and follow instruction to install
the program
Examples
For more information see help stcmd (stcmd must first be installed)
Also see fdasave for another way of changing the Stata data format to SAS
Top
encode
(December 2006)
encode is a useful command for converting strings to numbers. encode does this in alphabetical
order eg.
var1
a
b
c
Var1 Var1a
a 1
b 2
c 3
(Note: when Stata encodes it produces a value label: to see this type label list )
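The default behaviour described above can be reproduced with the following sketch, using the same three-row data set as the rest of this tip:

```stata
clear
input str1 var1
a
b
c
end
encode var1, generate(var1a)   // codes assigned in alphabetical order: a=1, b=2, c=3
label list                     // shows the value label that encode created
list, nolabel                  // shows the underlying numeric codes
```

By default the value label that encode creates takes the name of the new variable, here var1a.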
If this is not the encoding that you require a way around this is to define a value label first and then
use the label options for encode.
Eg.
If you have:
var1
a
b
c
Resulting in:
Var1 Var1a
a 3
b 1
c 2
clear
input str1 var1
a
b
c
end
label define preference1 3 a 1 b 2 c
encode var1, label(preference1) gen(var1a)
label list
list, nolab
Top
kdensity
(November 2006)
One of the problems with combining a number of histograms is that, generally, where there are
more than 3 the graph becomes unreadable. kdensity may be a solution to this problem.
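A sketch of overlaid density estimates, assuming the auto data set; the choice of rep78 groups is illustrative:

```stata
sysuse auto, clear
* one kernel density curve per group, drawn on the same axes
twoway (kdensity mpg if rep78 == 3) ///
       (kdensity mpg if rep78 == 4) ///
       (kdensity mpg if rep78 == 5), ///
       legend(order(1 "rep78 = 3" 2 "rep78 = 4" 3 "rep78 = 5"))
```

Lines overlap far more legibly than filled bars, so several groups can share one graph.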
Top
Graphs cannot always be combined, even with the addplot option. However, you can still get
combined graphs by using the pci, twoway scatteri and twoway pcarrow commands. For
example, if you wished to add a box plot to a scatter plot, this could be achieved with the aid of the
pci command and a twoway scatter.
sysuse auto, clear
qui sum mpg, detail
local a= r(p25)
local b= r(p75)
local c=r(p50)
local uav=`b'- 1.5*(`a'-`b') //upper limit: p75 + 1.5*IQR
local lav=`a'+ 1.5*(`a'-`b') //lower limit: p25 - 1.5*IQR
twoway (scatter mpg weight) ///
(pci `a' 3000 `b' 3000, lcolor(red)) ///
(pci `a' 3400 `b' 3400, lcolor(red)) ///
(pci `a' 3000 `a' 3400, lcolor(red)) ///
https://www.surveydesign.com.au/tips.html 99/114
17/6/2019 Stata | Tips
Top
MATA
(September 2006)
If you haven't had a look at Mata yet, then here are some examples of what you can use it for:
Example 1
Sorting rows in alphabetical order (statalist-digest V4 #2451)
(the user written program moremata must first be installed)
clear
list
tempfile foo
mata
C = J(3,1,"") // creates a 3 x 1 string column vector of empty strings
A = st_sdata(.,.)' // a transposed copy of the string data in Stata
C=C[.,(2::cols(A)+1)]'
mm_outsheet("`foo'",C, mode="r")
end
insheet using `foo', clear tab
list
Example 2
xpose using mata (statalist-digest V4 #2328)
(the user written program moremata must first be installed)
clear
tempfile tmp1
list
mata
A = st_sdata(.,.)'
mm_outsheet("`tmp1'",A, mode="r")
end
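For comparison, a minimal sketch of sorting each row alphabetically using only built-in Mata functions (st_sdata(), sort() and st_sstore()), so that moremata and a temporary file are not needed. The two-row dataset and the variable names a, b and c are assumptions for illustration; this is not the Statalist solution referred to above.

```stata
clear
input str1 a str1 b str1 c
c a b
z x y
end
mata:
A = st_sdata(., .)                  // copy of all (string) variables
for (i = 1; i <= rows(A); i++) {
    A[i, .] = sort(A[i, .]', 1)'    // sort the values of row i alphabetically
}
st_sstore(., ., A)                  // write the sorted rows back to the dataset
end
list
```

The approach relies on all variables being strings, because st_sdata() only reads string variables.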
Translating Fortran
SJ 5(3), 3rd quarter 2005, 421-441
Interactive use
SJ 6(3), 3rd quarter 2006, 387-396
Top
numlabel
(August 2006)
Without numlabel
For more information see help numlabel and the Stata 9 Data Management Manual
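A short sketch of what numlabel does, assuming the auto dataset shipped with Stata, where the variable foreign carries the value label origin. After numlabel ..., add each label text is prefixed with its numeric code.

```stata
sysuse auto, clear
tabulate foreign            // without numlabel: Domestic, Foreign
numlabel origin, add        // prefix each label with its numeric value
tabulate foreign            // with numlabel: 0. Domestic, 1. Foreign
```

Typing numlabel origin, remove takes the numeric prefixes off again.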
Top
viewsource
(July 2006)
viewsource is a command that allows a file located on the adopath to be viewed in the Stata
viewer.
Example: To view the code for the t test type viewsource ttest.ado
For more information see help viewsource and the Stata 9 Programming Manual
Top
If you have updated Stata 9 to the latest update (17 May 2006) you will find that a new Stata
command has been added: datasignature. (To find out what has been added with the update,
use the pull-down menu Help > What's new or type whatsnew on the Stata command line.)
4. The order in which the variables occur in the dataset if varlist is not specified, or in varlist if it
is.
2. checking if you are working with the same dataset as your colleagues.
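A minimal sketch of using the signature to detect a change in the data, assuming the auto dataset. It relies on datasignature leaving the signature in r(datasignature); the local macro name before is just an illustrative choice.

```stata
sysuse auto, clear
datasignature
local before "`r(datasignature)'"    // remember the signature of the original data
replace mpg = mpg + 1 in 1           // change a single value
datasignature
* the signature is expected to differ once the data have changed
assert "`r(datasignature)'" != "`before'"
```

Two people can compare the displayed signatures to check that they are working with the same data.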
Top
tmap is a user-written Stata program that allows you to map your data.
FAQ
I mapped the Victorian electoral boundaries with the following do-file, shading each division by its actual
population (the variable actual). Other maps can be generated by adding your own data and then mapping them.
*-----------------------start do file------------------------------------
clear
cd "C:\ASTATA INFO\learning\tmap" //where the data has been downloaded to
set matsize 3000
mif2dta VIC20030129_elb, genid(id)
use VIC20030129_elb-database
describe
tmap choropleth actual, id(id) map("VIC20030129_elb-Coordinates.dta") palette(Reds)
exit
data downloaded from
http://www.aec.gov.au/_content/Who/profiles/gis/gis_datadownload.htm
*-----------------------end do file------------------------------------
Top
For more information on the separate command see the Stata 9 Data Management manual or
online by typing help separate .
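As an illustration of what separate does, the sketch below (assuming the auto dataset) splits mpg into one variable per category of foreign, which makes it easy to plot the groups with different markers.

```stata
sysuse auto, clear
* creates mpg0 (foreign == 0) and mpg1 (foreign == 1)
separate mpg, by(foreign)
twoway (scatter mpg0 weight) (scatter mpg1 weight)
```

Each new variable is missing for observations that belong to the other group.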
Top
Macro Expressions
(March 2006)
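A macro expression has the form `=exp' and is replaced by the value of exp before the command is run. A small sketch, assuming the auto dataset:

```stata
sysuse auto, clear
* `=_N' is replaced by the number of observations before display runs
display "The dataset has `=_N' observations"
* macro expressions can be used wherever a number is expected
list make mpg in `=_N'
list make mpg in `=ceil(_N/2)'
```

This saves creating a local macro first when the value is needed only once.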
For more information on macro expressions see the Stata 9 User's Guide, [U] 18.3.8.
Top
Stata has many functions that make using Stata easier. E.g.
Instead of
count if mpg>=23 & mpg<=34
you can type
count if inrange(mpg,23,34)
These functions can be used after the if qualifier with commands such as generate, list,
summarize etc., or after assert.
Examples:
assert inlist(mpg,22,25,34,45)
generate mpg1=mpg if inlist(mpg,22,25,34,45)
list mpg if inlist(mpg,22,25,34,45)
* use 2 inlist functions when the list exceeds the maximum allowed for 1 function
list mpg if inlist(mpg,22,25,34,45) | inlist(mpg,15,26,35,55)
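The sketch below, assuming the auto dataset, checks that inrange() selects the same observations as the explicit pair of inequalities and shows that inlist() also works with string values. The macro name n_long is only illustrative.

```stata
sysuse auto, clear
count if mpg>=23 & mpg<=34
local n_long = r(N)
count if inrange(mpg,23,34)
assert r(N) == `n_long'          // both conditions select the same observations
* inlist() also accepts string arguments
list make mpg if inlist(make, "AMC Concord", "Volvo 260")
```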
Top
set trace on
(January 2006)
local all `"`all' `"`=`v'[`i']'"'"'
set trace off
With our data, a section of the trace will look like this:
The first line is the line being executed. It has a - in front of it to indicate it is being executed. The
second line is after macro substitution has occurred. It has a = in front of it to indicate that
substitution has occurred.
For more information on trace see the Stata 9 programming manual. Also see the Stata
command pause.
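A small sketch for trying trace out. The program showfirst and its arguments are hypothetical names invented for this illustration; it is wrapped in a program on the assumption that trace reports the lines executed inside programs. The traced lines prefixed with - and = should appear between set trace on and set trace off.

```stata
capture program drop showfirst
program define showfirst
    args v i
    * `=`v'[`i']' is replaced by the value of variable v in observation i
    local first `"`=`v'[`i']'"'
    display `"`first'"'
end

sysuse auto, clear
set trace on
showfirst make 1
set trace off
```

Comparing the - line with the = line directly below it shows what each macro expanded to.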
Top
Do-file Editor
(December 2005)
When typing commands in the Do-file Editor, you can run an individual command or a selection of commands
by highlighting the lines that you would like to run and then pressing the Do icon.
This allows you to try out your file section by section.
Top
Regular Expressions
(November 2005)
Regular expressions allow the matching of complex text patterns. Regular expression functions
have been included in Stata 9: regexm(), regexr() and regexs().
For example
Suppose you wish to have the day as a separate variable in the following data
set:
clear
input ///
str25 date
"12jan2003"
"1april1995"
"17may1977"
"02september2000"
end
list
. list
     +-----------------------+
     |            date   day |
     |-----------------------|
  1. |       12jan2003    12 |
  2. |      1april1995     1 |
  3. |       17may1977    17 |
  4. | 02september2000    02 |
     +-----------------------+
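One way the day variable shown above could be created is sketched below. The pattern and the str2 storage type are assumptions, not necessarily the original command: regexm() tests whether the date starts with digits and regexs(1) returns the text matched by the bracketed part of the pattern.

```stata
clear
input str25 date
"12jan2003"
"1april1995"
"17may1977"
"02september2000"
end
* ^ anchors the match at the start; ([0-9]+) captures the leading digits
generate str2 day = regexs(1) if regexm(date, "^([0-9]+)")
list
```

The result is a string variable, which is why the leading zero in 02 is kept; destring could be used if a numeric day is needed.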
Another example:
We have some text that includes citations. We wish to create a new variable that contains the
text of the last citation. In this case the last citation is not at the end of the text, so it is useful to
reverse the text and then look for the desired pattern.
clear
input ///
id str200 cit_1
1 "EP696218-A -- WO9215370-A SUND _SUND-Individual_"
2 "WO9425112-A -- GB298635-A"
3 "EP578126-A -- CH180906-A AGE_OK"
4 "EP562128-A -- DE1684639-A"
5 "WO9318277-A -- DK137935-B"
6 "US4434855-A SEC OF NAVY _USNA_"
end
list
list
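A sketch of the reversing approach for the citation data. The variable names rev and last_cit and the pattern are assumptions: a citation is taken to be two capital letters, digits, a hyphen and one capital letter, so in the reversed text it reads letter, hyphen, digits, two letters. regexm() finds the first match in the reversed text, which is the last citation in the original.

```stata
clear
input id str200 cit_1
1 "EP696218-A -- WO9215370-A SUND _SUND-Individual_"
2 "WO9425112-A -- GB298635-A"
3 "EP578126-A -- CH180906-A AGE_OK"
4 "EP562128-A -- DE1684639-A"
5 "WO9318277-A -- DK137935-B"
6 "US4434855-A SEC OF NAVY _USNA_"
end
generate str200 rev = reverse(cit_1)
* match in the reversed text, then reverse the match back
generate str20 last_cit = reverse(regexs(0)) if regexm(rev, "[A-Z]-[0-9]+[A-Z][A-Z]")
list id last_cit
```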
Text Editors
(October 2005)
The text editor that comes with Stata is fine for small programs. However, as the size of the
program increases other text editors are often used to make programming easier. For a
discussion on various text editors go here
Top
sum(x) returns the running sum of x. A basic use of sum() would be:
generate running_tot = sum(1)
Another example of the use of sum(): given the data below, you need to create a new
variable that starts at zero, returns to zero whenever id changes, and increases by 1 whenever
var2 changes.
id var2
1 7
1 7
1 7
1 7
1 7
1 7
1 8
1 8
2 8
2 8
2 1
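One possible solution is sketched below; the names order and counter are illustrative. Within each id, the expression var2 != var2[_n-1] is 1 at the first observation and at every change in var2, so its running sum minus 1 starts at zero and increases by 1 at each change.

```stata
clear
input id var2
1 7
1 7
1 7
1 7
1 7
1 7
1 8
1 8
2 8
2 8
2 1
end
generate long order = _n          // keep the original order within id
bysort id (order): generate counter = sum(var2 != var2[_n-1]) - 1
list id var2 counter, sepby(id)
```

The order variable is included because sorting by id alone is not guaranteed to keep the rows within each id in their original order.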
Further information can be found by typing help sum() on the Stata command line.
Top
e.g. clonevar MPG=mpg
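A short sketch, assuming the auto dataset, showing that clonevar copies not only the values but also the storage type, display format, variable label and value label. The new name origin2 is illustrative.

```stata
sysuse auto, clear
clonevar origin2 = foreign
* both variables should show the same type, format, value label and variable label
describe foreign origin2
```

A plain generate origin2 = foreign would copy the values only.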
Top
Getting the path and file name onto the Stata command line
(March 2005)
Stata 8 has a handy way of getting file names complete with the path onto the command line.
Rather than typing folders, subfolders and the file name, use the pull-down menu File/Filename,
click on the required file, and the path and file name will be shown on the command line, enclosed
in quotation marks. This is particularly handy when the path consists of many subdirectories with
long names. You can then add commands such as cd or use to the command line.
Tabout - a user-written command
(February 2005)
-tabout- produces publication-quality tables from Stata, with the output exported to a text file. It
can be exported as tab-delimited text, HTML code or LaTeX/TeX code. -tabout- provides extensive user
control over the formatting of data and labels and generates table headers automatically.
To make learning the syntax easy, an example file which can be used as a tutorial is available
here
Top
To have your current file name displayed on the Stata window you can add the following to your
do file:
See your programming manual for further details on the window command
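One way this might look is sketched below. It assumes the window manage maintitle command and the system value c(filename), which holds the name of the dataset last used or saved; it is intended for windowed versions of Stata and may not apply to console or batch sessions.

```stata
sysuse auto, clear
* put the name of the dataset in memory into the title of the main Stata window
window manage maintitle "`c(filename)'"
* to restore the default title, type: window manage maintitle reset
```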
Top
(January 2004)
ds lists the variable names of the dataset currently in memory in a compact form. The command
is useful if you require a list of variables that satisfy certain criteria. The resulting list is saved
in r(varlist), which can be used in other commands, e.g.
(Using the auto dataset supplied with Stata)
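A minimal sketch of the idea, using the has(type ...) option of ds to select variables by storage type and then passing r(varlist) to another command:

```stata
sysuse auto, clear
* list only the string variables; the names are left in r(varlist)
ds, has(type string)
display "`r(varlist)'"
* select the numeric variables and summarize them
ds, has(type numeric)
summarize `r(varlist)'
```

Because r(varlist) is overwritten by the next r-class command, copy it to a local macro if it is needed later.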
Top
WORKING IN ROWS
(December 2004)
The egen command has a number of functions that make it easier to work with data in rows.
Rather than using xpose or reshape to convert the data to columns, these functions can often
be used instead.
rfirst(varlist)
may not be combined with by. It gives the first nonmissing value in
varlist for each observation (row). If all values in varlist are
missing for an observation, newvar is set to missing.
rlast(varlist)
may not be combined with by. It gives the last nonmissing value in
varlist for each observation (row). If all values in varlist are
missing for an observation, newvar is set to missing.
rmax(varlist)
may not be combined with by. It gives the maximum value (ignoring
missing values) in varlist for each observation (row). If all values in
varlist are missing for an observation, newvar is set to missing.
rmean(varlist)
may not be combined with by. It creates the (row) means of the
variables in varlist, ignoring missing values; for example, if three
variables are specified and, in some observations, one of the variables
is missing, in those observations newvar will contain the mean of the
two variables that do exist. Other observations will contain the mean
of all three variables. Where none of the variables exist, newvar is
set to missing.
rmin(varlist)
may not be combined with by. It gives the minimum value in varlist for
each observation (row). If all values in varlist are missing for an
observation, newvar is set to missing.
rmiss(varlist)
may not be combined with by. It gives the number of missing values in
varlist for each observation (row).
robs(varlist) [, strok]
may not be combined with by. It gives the number of nonmissing values
in varlist for each observation (row) -- this is the value used by
rmean() for the denominator in the mean calculation.
String variables may not be specified unless option strok is also
specified. If strok is specified, string variables will be counted as
containing missing values when they contain ""; numeric variables will
be counted as containing missing values when their value is ">=.".
rsd(varlist)
may not be combined with by. It creates the (row) standard deviations
of the variables in varlist, ignoring missing values. Also see rmean().
rsum(varlist)
may not be combined with by. It creates the (row) sum of the variables
in varlist, treating missing as 0.
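A small worked sketch of the row functions. Note that the names listed above are the Stata 8 spellings; from Stata 9 onwards the same functions are called rowmean(), rowmiss() and so on, with rsum() becoming rowtotal() and robs() becoming rownonmiss(). The example uses the newer names, and the three-row dataset and variable names are made up for illustration.

```stata
clear
input x1 x2 x3
1 2 3
4 . 6
. . .
end
egen avg   = rowmean(x1 x2 x3)      // mean of the nonmissing values in each row
egen total = rowtotal(x1 x2 x3)     // row sum, treating missing as 0
egen nmiss = rowmiss(x1 x2 x3)      // number of missing values in each row
egen nobs  = rownonmiss(x1 x2 x3)   // number of nonmissing values in each row
list
```

In the second row the mean is 5 (the average of 4 and 6), and in the third row the mean is missing while the total is 0.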
Top
Would you like to be reminded to start a log each time that you start Stata?
One way of doing this is to include the command below in your profile.do file:
db log
For information on profile.do see the Getting Started manual, "More on starting and stopping
Stata".
Top
Version Control
(October 2004)
PROBLEM: Stata is continually being improved, meaning programs and do-files written for older
versions might stop working.
SOLUTION: Specify the version of Stata you are using at the top of programs and do-files that
you write:
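A minimal sketch of such a do-file header, assuming Stata 9 (substitute the version you actually use):

```stata
version 9
* the commands below are interpreted as Stata 9 would interpret them,
* even when the do-file is run in a later release
sysuse auto, clear
summarize mpg
```

The version setting applies only to the do-file or program in which it appears.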
Top
Tips - Data Management
Below are examples of Stata table commands.
http://repec.org/bocode/e/estout/
http://www.ianwatson.com.au/stata.html
Table - out
Table - rtf
Top
*Survey Design and Analysis Services Pty Ltd is the Australian, Indonesian and New Zealand distributor of Stata, Stata Journal, Stata Press produced by
StataCorp LLC; Stat/Transfer produced by Circle Systems Inc; and QDA Miner, WordStat, ProSuite and SimStat produced by Provalis Research. *Stata is a
registered trademark of StataCorp LLC, College Station, TX, USA, and the Stata logo is used with the permission of StataCorp. We are the Australian and New
Zealand distributor for Arbutus Software.