

Donated on 4/30/1996

Predict whether an individual's annual income exceeds $50K/yr based on census data. Also known as the "Census Income" dataset.

Dataset Characteristics


Subject Area

Social Science

Associated Tasks


Feature Type

Categorical, Integer

# Instances


# Features


Dataset Information

Additional Information

Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the following conditions: ((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0)). The prediction task is to determine whether a person's income is over $50,000 a year.
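The extraction conditions above can be read as a row filter. The following is a minimal sketch in Python, assuming each census record is available as a dict keyed by the raw variable names used in the expression (AAGE, AGI, AFNLWGT, HRSWK). The record layout, the function name `is_clean_record`, and the sample values are hypothetical and are not part of the original extraction code.

```python
def is_clean_record(record):
    """Return True when a record meets all four extraction conditions."""
    return (
        record["AAGE"] > 16
        and record["AGI"] > 100
        and record["AFNLWGT"] > 1
        and record["HRSWK"] > 0
    )


# Illustrative records only; these values are made up for the example.
records = [
    {"AAGE": 39, "AGI": 30000, "AFNLWGT": 77516, "HRSWK": 40},
    {"AAGE": 16, "AGI": 5000, "AFNLWGT": 1200, "HRSWK": 20},  # fails AAGE > 16
    {"AAGE": 45, "AGI": 20000, "AFNLWGT": 5000, "HRSWK": 0},  # fails HRSWK > 0
]

# Keep only the records that pass every condition.
clean = [r for r in records if is_clean_record(r)]
```

All four comparisons are strict, so a record with AAGE equal to 16 or HRSWK equal to 0 is excluded.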

Has Missing Values?


Variables Table

| Variable Name | Role | Type | Demographic | Description | Units | Missing Values |
|---|---|---|---|---|---|---|
| workclass | Feature | Categorical | Income | Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked. | | yes |
| education | Feature | Categorical | Education Level | Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool. | | no |
| education-num | Feature | Integer | Education Level | | | no |
| marital-status | Feature | Categorical | Other | Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse. | | no |
| occupation | Feature | Categorical | Other | Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces. | | yes |
| relationship | Feature | Categorical | Other | Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried. | | no |
| race | Feature | Categorical | Race | White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black. | | no |
| sex | Feature | Binary | Sex | Female, Male. | | no |


Additional Variable Information

Listing of attributes:

Class labels: >50K, <=50K.
age: continuous.
workclass: Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked.
fnlwgt: continuous.
education: Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool.
education-num: continuous.
marital-status: Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse.
occupation: Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces.
relationship: Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried.
race: White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black.
sex: Female, Male.
capital-gain: continuous.
capital-loss: continuous.
hours-per-week: continuous.
native-country: United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands.
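The attribute listing can be turned into a small parser for the data rows. The sketch below uses only the Python standard library and rests on several assumptions that the page does not state: rows are comma-separated with a space after each comma, the class label is the last field, missing values are written as "?", and the continuous attributes hold integers. The column name `income`, the function name `parse_rows`, and the two sample rows are illustrative and are not taken from the dataset files.

```python
import csv
import io

# Attribute names in the order given in the listing; the class label is
# assumed to be the last field of each row ("income" is a hypothetical name).
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]

# Attributes the listing marks as continuous; assumed to hold integers.
CONTINUOUS = {
    "age", "fnlwgt", "education-num",
    "capital-gain", "capital-loss", "hours-per-week",
}

MISSING = "?"  # assumed missing-value marker


def parse_rows(text):
    """Parse comma-separated rows into dicts keyed by attribute name."""
    rows = []
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for fields in reader:
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        row = {}
        for name, value in zip(COLUMNS, fields):
            if value == MISSING:
                row[name] = None
            elif name in CONTINUOUS:
                row[name] = int(value)
            else:
                row[name] = value
        rows.append(row)
    return rows


# Illustrative rows only; not copied from the dataset files.
sample = (
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
    "52, ?, 120000, HS-grad, 9, Divorced, ?, "
    "Unmarried, Black, Female, 0, 0, 35, United-States, >50K\n"
)
rows = parse_rows(sample)
```

Mapping "?" to `None` keeps missing entries in workclass and occupation distinct from real category values, so later steps can decide whether to drop or impute them.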


Dataset Files

adult.data (3.8 MB)
adult.test (1.9 MB)
adult.names (5.1 KB)
old.adult.names (4.2 KB)
Index (140 Bytes)





Barry Becker

Silicon Graphics

Ronny Kohavi



