Eliminating Duplicate Cases in Stata 1

* "h:\000\STATA_doc\DelDupCases.do"
* This is a sample program that eliminates duplicate cases from a dataset.
* The sample data mimic data from the CPS.
* People can be interviewed for up to three years.
* This researcher wants to keep every case that was interviewed in 2008,
* plus any case interviewed in 2007 or 2009 that was not interviewed
* in 2008, and she wants only one case per person even if that person
* was interviewed in multiple years.
** In the original handout the text is color coded: my comments are in green,
** my commands to Stata are in blue, and the things Stata has to tell me are
** in black. In plain text, comments start with an asterisk.

** Get the data set.

use "h:\000\STATA_doc\test.dta", clear

** Show you the data as they are now.

list
     +----------------+
     | year   id   v1 |
     |----------------|
  1. | 2007    1    1 |
  2. | 2008    1    2 |
  3. | 2009    1    3 |
  4. | 2007    2    4 |
  5. | 2008    2    5 |
     |----------------|
  6. | 2009    2    6 |
  7. | 2007    3    7 |
  8. | 2009    4    8 |
     +----------------+

** Create a variable named tyear. If year == 2008, its value is 0;
** otherwise it is 1. This variable will be used to sort
** so that we keep the 2008 case for an id if one exists.
gen tyear = 1
replace tyear = 0 if year == 2008
(2 real changes made)
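
** An optional shortcut, not part of the original program: a logical
** expression in Stata evaluates to 1 when true and 0 when false, so the same
** flag can be built in one command. The name tyear2 is hypothetical, and the
** sketch assumes the data and tyear from the step above are in memory.

```stata
* 1 for every case not interviewed in 2008, 0 for the 2008 cases.
gen tyear2 = (year != 2008)

* assert stops with an error if the shortcut ever disagrees with tyear.
assert tyear2 == tyear

* Remove the check variable so the data match the original program.
drop tyear2
```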

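** Sort by id, with the 2008 case (tyear = 0) first within each id,
** so that we can identify duplicate cases later.
** Stata orders tied cases (same id and same tyear) arbitrarily.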

sort id tyear
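
** An optional variation, not part of the original program: adding year as a
** last sort key makes the order of tied cases reproducible. Preferring 2007
** over 2009 when there is no 2008 case is an assumption made here; the
** original does not say which one to prefer. The listings in this handout
** come from the original sort command.

```stata
* Within each id: the 2008 case first (tyear = 0), then the others by year.
sort id tyear year
```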

** Show you the data as they are now.

list
     +------------------------+
     | year   id   v1   tyear |
     |------------------------|
  1. | 2008    1    2       0 |
  2. | 2009    1    3       1 |
  3. | 2007    1    1       1 |
  4. | 2008    2    5       0 |
  5. | 2007    2    4       1 |
     |------------------------|
  6. | 2009    2    6       1 |
  7. | 2007    3    7       1 |
  8. | 2009    4    8       1 |
     +------------------------+

1 Prepared by Patty Glynn, University of Washington, November 1, 2010. Thanks to Sara Vera for testing this.

** The following command creates a variable named ppdup that holds the
** value of id from the previous case.
** [_n-1] asks Stata to look at the previous case
** (similar to "lag" in SAS and SPSS).

gen ppdup = id[_n-1]

(1 missing value generated)

** Show you the data as they are now.

list
     +--------------------------------+
     | year   id   v1   tyear   ppdup |
     |--------------------------------|
  1. | 2008    1    2       0       . |
  2. | 2007    1    1       1       1 |
  3. | 2009    1    3       1       1 |
  4. | 2008    2    5       0       1 |
  5. | 2009    2    6       1       2 |
     |--------------------------------|
  6. | 2007    2    4       1       2 |
  7. | 2007    3    7       1       2 |
  8. | 2009    4    8       1       3 |
     +--------------------------------+

** Drop cases whose id is the same as the id in the previous case,
** so that only the first case for each id is kept.
drop if ppdup == id
(4 observations deleted)
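
** An optional check, not part of the original program: isid exits with an
** error if id does not uniquely identify the remaining cases, so a run with
** no error suggests the duplicates are gone. This sketch assumes the data
** from the step above are still in memory.

```stata
* Stops with an error if any id appears more than once.
isid id

* Optional summary of how many copies of each id remain.
duplicates report id
```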

** Show you the data as they are now.

list
     +--------------------------------+
     | year   id   v1   tyear   ppdup |
     |--------------------------------|
  1. | 2008    1    2       0       . |
  2. | 2008    2    5       0       1 |
  3. | 2007    3    7       1       2 |
  4. | 2009    4    8       1       3 |
     +--------------------------------+

** All of the key commands, without annotations or "list" commands.

use "h:\000\STATA_doc\test.dta", clear

gen tyear = 1
replace tyear = 0 if year == 2008
sort id tyear
gen ppdup = id[_n-1]
drop if ppdup == id
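
** A self-contained sketch of an alternative, not the original author's
** method: it types in the eight sample cases with input so it runs without
** test.dta, then uses bysort to keep the first case per id instead of the
** ppdup variable. Breaking ties by year (2007 before 2009) is an assumption.

```stata
* Rebuild the sample data shown in the first listing.
clear
input year id v1
2007 1 1
2008 1 2
2009 1 3
2007 2 4
2008 2 5
2009 2 6
2007 3 7
2009 4 8
end

* 0 for 2008 cases, 1 for all others.
gen tyear = (year != 2008)

* Sort within id by tyear then year, and keep the first case of each id.
bysort id (tyear year): keep if _n == 1

* Confirm one case per person and the expected number of people.
isid id
assert _N == 4

list
```

** On the sample data this keeps the same four cases as the program above:
** the 2008 case for ids 1 and 2, the 2007 case for id 3, and the 2009 case
** for id 4.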
