STATA Data Summaries

This document discusses how to summarize data in Stata. It covers generating descriptive statistics for continuous variables using the summarize command. For categorical variables, it discusses one-way, two-way, and multi-way frequency tables created using the tab, tab1, tab2 commands. It also covers options for these commands like chi-squared tests for two-way tables and generating statistics within categories using tabstat and bysort.

Uploaded by

jminyoso

DATA SUMMARIES IN STATA

Sandra Jumba
CONTENTS

• Summarize data
• One-way table
• Two-way table
• Multi-way tables
• Conclusion
SUMMARIZING DATA

• Patterns in a continuous (quantitative) variable are most easily understood by generating descriptive statistics.
• The command is:

• summarize bp_before bp_after

• For each variable, this command reports the number of observations, mean, standard deviation, minimum and maximum.
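The results of summarize can also be reused after the command has run. The sketch below assumes the variables come from Stata's shipped example dataset bpwide (loaded with sysuse), which contains variables with these names; the slides do not say which dataset was used.

```stata
* Assumption: bp_before and bp_after are taken from the shipped bpwide dataset
sysuse bpwide, clear

summarize bp_before bp_after

* summarize leaves its results in r(); with several variables,
* r() refers to the last variable in the list (here bp_after)
return list
display "mean of bp_after = " %6.2f r(mean)
display "sd of bp_after   = " %6.2f r(sd)
```

Run summarize on a single variable when the stored results for that variable are needed.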
Summary cont.
• More detailed summary statistics (percentiles, variance, skewness and kurtosis) can be obtained by adding the detail option:
• summarize bp_before, detail
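With the detail option, the percentiles are also stored in r(), so quantities such as the median and the interquartile range can be computed afterwards. A minimal sketch, again assuming the shipped bpwide dataset:

```stata
* Assumption: bp_before is taken from the shipped bpwide dataset
sysuse bpwide, clear

summarize bp_before, detail

* r(p25), r(p50) and r(p75) are the quartiles stored by the detail option
display "median = " r(p50)
display "IQR    = " r(p75) - r(p25)
```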
TABLES

• There are three related commands that produce frequency tables for discrete (categorical) variables:

• tab, tab1 and tab2

• tab agegrp
• Produces a one-way frequency table with percentages

• tab sex agegrp

• Produces a two-way frequency table (without percentages); Stata allows no more than two variables with tab
Tables cont.
• tab1 agegrp sex
• Produces a one-way frequency table for each variable in the variable list

• tab2 agegrp sex

• Produces all possible two-way tables from the variables in the list (unlike the tab command, which accepts at most two variables)
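The difference between tab and tab2 only shows with three or more variables. The sketch below assumes the shipped bpwide dataset and creates a hypothetical indicator, highbp, purely for illustration; the cut-off of 160 is arbitrary and not from the slides.

```stata
* Assumption: sex and agegrp are taken from the shipped bpwide dataset
sysuse bpwide, clear

* Hypothetical third categorical variable, for illustration only
generate byte highbp = bp_before >= 160

* One-way table for each of the three variables
tab1 agegrp sex highbp

* All pairwise two-way tables: agegrp x sex, agegrp x highbp, sex x highbp
tab2 agegrp sex highbp

* tab itself rejects three variables; capture traps the error so a
* do-file keeps running, and _rc holds the nonzero return code
capture noisily tab agegrp sex highbp
display "return code from tab with three variables: " _rc
```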
Options for two-way tables
• In one-way tables, Stata gives the count, the percentage, and the cumulative percentage.

• In two-way tables, Stata gives the count only, unless you ask for other statistics.

• The options col, row and cell are often used, e.g.

• tab sex agegrp, cell
• tab sex agegrp, row
• tab sex agegrp, col
Two-way options cont.

• Combining the three options asks Stata to report column, row and cell percentages together in the same table

• tab sex agegrp, col row cell
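When only the percentages are of interest, the counts can be dropped from the two-way table with the nofreq option. A small sketch, assuming the shipped bpwide dataset:

```stata
* Assumption: sex and agegrp are taken from the shipped bpwide dataset
sysuse bpwide, clear

* Counts plus row percentages
tab sex agegrp, row

* Row percentages only: nofreq suppresses the counts
tab sex agegrp, row nofreq
```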


Tests of association in two-way tables

• Adding a test-of-association option to a two-way table prints the test result just after the table.

• For example:
• tab agegrp sex, chi2
• This provides Pearson's chi-squared test for the two-way table

• Alternatively, the option all reports all the available measures of association (Pearson and likelihood-ratio chi-squared, Cramér's V, gamma and Kendall's tau-b)
• tab agegrp sex, all
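The test results are also stored in r() and can be picked up for reporting. The sketch below assumes the shipped bpwide dataset; the exact option (Fisher's exact test) is not mentioned on the slides and is shown here only as a common alternative when cell counts are small.

```stata
* Assumption: agegrp and sex are taken from the shipped bpwide dataset
sysuse bpwide, clear

* Pearson chi-squared test; statistic and p-value are stored in r()
tab agegrp sex, chi2
display "chi2 = " %6.3f r(chi2) ",  p = " %6.4f r(p)

* Fisher's exact test, often preferred when expected cell counts are small
tab agegrp sex, exact
display "Fisher's exact p = " %6.4f r(p_exact)
```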
Using the bysort prefix

• This is not an independent command but rather a "prefix" that goes before another command and asks Stata to repeat the command for each value of a variable.

• The general syntax is:

• bysort varlist: command
• where "varlist" is one or more variables (usually just one) and "command" is the Stata command to be repeated, e.g.

• bysort sex: tab agegrp
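bysort is shorthand for sorting the data and then using the by prefix. A brief sketch of both forms, assuming the shipped bpwide dataset:

```stata
* Assumption: sex, agegrp and bp_before are taken from the shipped bpwide dataset
sysuse bpwide, clear

* One frequency table of agegrp for each value of sex
bysort sex: tab agegrp

* The same idea in two steps: by requires the data to be sorted first
sort sex
by sex: summarize bp_before
```

The by prefix on its own stops with an error if the data are not already sorted by the grouping variable, which is why bysort is usually the more convenient form.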


Tables with descriptive stats
• tabstat bp_before bp_after, by(sex)
• tabstat bp_before bp_after, s(mean median sd variance count range min max) by(sex)
• These commands generate descriptive statistics of bp_before and bp_after, tabulated by the levels of the categorical variable sex; without s(), only the mean is reported

• tab agegrp sex, sum(bp_before)

• This command reports the mean, standard deviation and frequency of bp_before in each cell of the two-way table

• tab agegrp sex, sum(bp_before) nofreq

• Removes the frequency from the list of descriptives
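The layout of tabstat output can be adjusted as well. The sketch below assumes the shipped bpwide dataset; columns(statistics) and format() are tabstat options not shown on the slides, used here to put the statistics in columns and fix the number of decimals.

```stata
* Assumption: bp_before, bp_after and sex are taken from the shipped bpwide dataset
sysuse bpwide, clear

* Statistics across the columns, variables down the rows, two decimals
tabstat bp_before bp_after, statistics(mean median sd min max) ///
    by(sex) columns(statistics) format(%9.2f)

* Same table without the overall "Total" block
tabstat bp_before bp_after, statistics(mean median sd min max) ///
    by(sex) columns(statistics) format(%9.2f) nototal
```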
Two-way table & bysort cmd

• bysort varx: tab var1 var2, col row

• This command generates a cross-tabulation of var1 and var2 for each category level of varx.

• Three-way cross-tabulation with summary statistics:

• bysort varx: tab var1 var2, sum(var3)
• where varx, var1 and var2 are categorical but var3 is quantitative
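To make the varx, var1, var2 and var3 template concrete, the sketch below assumes the shipped bpwide dataset and creates a hypothetical grouping variable, highbp, to play the role of varx; the cut-off of 160 is arbitrary and not from the slides.

```stata
* Assumption: agegrp, sex, bp_before and bp_after come from the shipped bpwide dataset
sysuse bpwide, clear

* Hypothetical grouping variable standing in for varx
generate byte highbp = bp_before >= 160

* Cross-tabulation of agegrp by sex, repeated for each level of highbp
bysort highbp: tab agegrp sex, col row

* Mean, sd and frequency of bp_after in each agegrp-by-sex cell,
* repeated for each level of highbp
bysort highbp: tab agegrp sex, summarize(bp_after)
```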
Multi-way table

• Using the menu bar for nested tables is easy, and you can conveniently decide which categorical variables act as the super column and the super row
• Statistics > Summaries, tables, and tests > Tables > Table of summary statistics, which opens the dialog window

• You can also set your own formatting of the table, for example by adding the options:
• format(%9.2f) center
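The dialog builds a table command. A rough command-line equivalent is sketched below, assuming the shipped bpwide dataset. The format() and center options belong to the table syntax of Stata 16 and earlier; table was redesigned in Stata 17, where statistic() and nformat() are used instead, so the sketch branches on the running version.

```stata
* Assumption: agegrp, sex and bp_before are taken from the shipped bpwide dataset
sysuse bpwide, clear

if c(stata_version) >= 17 {
    * Stata 17 and later: redesigned table command
    table (agegrp) (sex), statistic(mean bp_before) nformat(%9.2f)
}
else {
    * Stata 16 and earlier: contents() with the format() and center options
    table agegrp sex, contents(mean bp_before) format(%9.2f) center
}
```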
THE END
THANKS FOR LISTENING

IRES - Nairobi
jminyoso@gmail.com
+254 -713 -239 670
