The HAR-Brazilian Dataset is designed for research on real-world, out-of-home Human Activity Recognition (HAR). The dataset contains time data (Day, Initial Time, End Time, and Total Duration), GPS data (Distance from Previous Point, Pace, and Point Of Interest Category), and user profile data (Profession and Has car). Only preprocessed data are released, because the raw data could allow users to be identified.
Our project was submitted to the university ethics committee and, once it was approved, we sent invitations to university students and workers to participate in our experiment. Each volunteer signed an agreement and installed a mobile app that we developed for Android smartphones. The Android platform was chosen based on a profile analysis of the volunteers in a previous study. The mobile app collects raw GPS data (Longitude and Latitude) together with the system date and time every three minutes.
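As an illustration of how GPS features such as Distance from Previous Point and Pace could be derived from consecutive raw fixes, here is a minimal sketch. It is not the preprocessing used to build this dataset: the haversine formula, the units (metres and minutes per kilometre), and the names `haversine_m` and `distance_and_pace` are all assumptions made for the example.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed constant)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def distance_and_pace(prev_fix, curr_fix):
    """Hypothetical helper: each fix is a (latitude, longitude, datetime) tuple.

    Returns (distance in metres, pace in minutes per kilometre).
    Pace is None when the device did not move between the two fixes.
    """
    dist_m = haversine_m(prev_fix[0], prev_fix[1], curr_fix[0], curr_fix[1])
    elapsed_min = (curr_fix[2] - prev_fix[2]).total_seconds() / 60
    pace = elapsed_min / (dist_m / 1000) if dist_m > 0 else None
    return dist_m, pace


# Two made-up fixes taken three minutes apart, matching the app's sampling interval.
t0 = datetime(2018, 3, 1, 8, 0, 0)
fix_a = (-21.2450, -45.0000, t0)
fix_b = (-21.2470, -45.0030, t0 + timedelta(minutes=3))
dist_m, pace = distance_and_pace(fix_a, fix_b)
print(f"distance: {dist_m:.1f} m, pace: {pace:.2f} min/km")
```

The coordinates above are invented for the example and are not taken from the dataset.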
We collected data from 22 subjects for 20 days during March and April 2018. Subjects were of both genders, aged from 18 to 56, and lived in Lavras, Ituiutaba, or Uberlândia, all cities in the state of Minas Gerais, Brazil. The volunteers completed an initial questionnaire that allowed us to create the initial individual user profiles. This questionnaire includes only four questions, regarding age, gender, profession, and car ownership. Taking into account the population of volunteers, we configured the experiment to recognize 13 activities. Only out-of-home activities were used. They were divided into two categories: stationary and moving activities. The activities and their instance counts are listed below.
Stationary Activities: Dining (1,939 instances), Having lunch (5,119 instances), In bank (903 instances), Praying (986 instances), Recreation (954 instances), Shopping (1,629 instances), Studying (3,104 instances), Taking coffee (1,152 instances), Waiting transport (804 instances), and Working (8,456 instances).
Moving Activities: Going by bus (6,774 instances), Going by car (14,110 instances), and Walking (1,378 instances).
The dataset has a total of 47,308 instances.
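The instance counts listed above can be cross-checked with a short script. This is only a sketch: the dictionary names are hypothetical, and the labels are typed in from the text above rather than read from the dataset files, whose exact label strings may differ.

```python
# Instance counts per activity, copied from the description above.
STATIONARY = {
    "Dining": 1939,
    "Having lunch": 5119,
    "In bank": 903,
    "Praying": 986,
    "Recreation": 954,
    "Shopping": 1629,
    "Studying": 3104,
    "Taking coffee": 1152,
    "Waiting transport": 804,
    "Working": 8456,
}
MOVING = {
    "Going by bus": 6774,
    "Going by car": 14110,
    "Walking": 1378,
}

counts = {**STATIONARY, **MOVING}
total = sum(counts.values())

print(f"activities: {len(counts)}")
print(f"stationary instances: {sum(STATIONARY.values())}")
print(f"moving instances: {sum(MOVING.values())}")
print(f"total instances: {total}")  # 47308, matching the stated total

# Share of each activity, largest first, to show how unevenly the classes are distributed.
for name, n in sorted(counts.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<18} {n:>6}  {100 * n / total:5.1f}%")
```

The classes are far from balanced (Going by car alone accounts for roughly 30% of the instances), so per-class metrics are likely to be more informative than overall accuracy when evaluating a classifier on this data.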
This dataset is released solely for research purposes. Please cite the following paper if you use this dataset in your research.
Igor da Penha Natal, Luís Correia, Ana Cristina Garcia, Leandro Fernandes, "Efficient out-of-home activity recognition by complementing GPS data with semantic information," First Monday, V. 24, N. 11 (2019).
For more information, please refer to the paper available at https://firstmonday.org/ojs/index.php/fm/article/view/9971 or contact Igor Natal by email at igorpnatal@gmail.com.