Estimating Occupancy in an Office Setting
Manar Amayri, Stéphane Ploix, Sanghamitra Bandyopadhyay
To cite this version:
Manar Amayri, Stéphane Ploix, Sanghamitra Bandyopadhyay. Estimating Occupancy in an Office
Setting. The First International Symposium on Sustainable Human-Building Ecosystems (ISSHBE),
Oct 2015, Pittsburgh, United States. 10.1061/9780784479681.008. hal-01246159
HAL Id: hal-01246159
http://hal.univ-grenoble-alpes.fr/hal-01246159
Submitted on 12 Sep 2018
Estimating Occupancy In An Office Setting
Manar Amayri1, Stéphane Ploix2, Sanghamitra Bandyopadhyay3
ASCE Conference 2015
G-SCOP laboratory / Grenoble Institute of Technology,
1, 46 Avenue Felix Viallet, 38031 Grenoble, France
email: Manar.Amayri@grenoble-inp.fr
ABSTRACT:
A general approach is proposed to estimate the number of occupants in a zone
using different kinds of measurements such as motion detection, power consumption
or CO2 concentration. The proposed approach is inspired from machine learning. It
starts by determining among different measurements those that are the most useful
by calculating the information gains. Then, an estimation algorithm is proposed. It
relies on a C4.5 learning algorithm that yields human readable decision trees using
measurements to estimate the number of occupants. It has been applied to an office
setting.
INTRODUCTION
Recently, building research has turned its focus to occupant behavior. Most of this work deals with the design stage: the aim is to represent the diversity of occupant behaviors in order to guarantee minimal measured performance. Most approaches use statistics about human behavior (Roulet et al., 1991; Page et al., 2007; Robinson and Haldi, 2009). Kashif et al. (2013) emphasized that inhabitants’ detailed reactive and deliberative behaviors must also be taken into account and proposed a co-simulation methodology to find the impact of certain actions on energy consumption.
Nevertheless, human behavior matters not only during the design step but also during operation. It is useful for diagnostic analyses, to discriminate human misbehavior from building system performance, and for energy management, where strategies depend on human activities and, in particular, on the number of occupants in a zone. Unfortunately, the number of occupants is not easy to measure. This paper tackles this issue. It proposes an occupancy estimator that combines different measurements such as CO2 concentration, motion detection and power consumption, because a single measurement proved not to be reliable enough to estimate the number of occupants. For instance, CO2 concentration may be useful, but in some configurations, such as when a window is open, estimations become unreliable. Motion detection and power consumption depend on occupant activities. Combined, however, these measurements yield a more reliable estimator.
STATE OF THE ART
Occupancy estimation has already been tackled and various methods have been investigated. They range from basic single-feature classifiers that distinguish between two classes, presence and absence, to multi-sensor, multi-feature models. A primary approach, prevalent in many commercial buildings, is to use passive infrared (PIR) sensors for occupancy detection. However, motion detectors fail to detect presence when occupants remain relatively still, which is quite common during activities such as working on a computer or regular desk work. Furthermore, drifts of warm or cold air on objects can be interpreted as motion, leading to false positive detections. This makes the use of PIRs alone less attractive for occupancy counting. Combining PIRs with other sensors can be useful, as discussed by Agarwal et al. (2010), who use motion sensors and magnetic reed switches for occupancy detection to increase the efficiency of HVAC systems in smart buildings, an approach that is quite simple and non-intrusive. Apart from motion, acoustic sensors (Padmanabh et al., 2009) may be used. However, audio from the environment can easily fool such sensors and, with no support from other sensors, they can report many false positive detections. In the same way, other sensors have been used: video cameras (Erickson et al., 2011; Milenkovic and Amft, 2013b), which exploit the huge advances in computer vision and ever-increasing computational capabilities; RFID tags (Philipose et al., 2004) installed on ID cards; and sonar sensors (Milenkovic and Amft, 2013a) plugged into monitors to identify the presence of a person at the computer. These have proved to be much better at solving the occupancy count problem, yet cannot be employed in most office buildings for reasons such as privacy and cost. Pressure sensors and PIRs are discussed in (Nguyen and Aiello, 2012) to determine presence/absence in single-desk offices; the authors further tag activities based on this knowledge.
A more recent approach to occupancy recognition relies on the relationship between carbon dioxide concentration, indoor air quality (IAQ) and the number of occupants. Physical CO2 models built on sensor networks (Aglan, 2003) have been used extensively in smart office projects to improve occupant comfort and building energy use. In this paper, a physical CO2 model is studied to assess its value for occupancy estimation. For various applications such as activity recognition or context analysis within a larger office space, information about the presence or absence of people is not sufficient, and an estimation of the number of people occupying the space is essential. Lam et al. (2009) investigate this problem in open offices, estimating occupancy and human activities using a multitude of ambient information, and compare the performance of HMMs, SVMs and artificial neural networks. However, none of these methods generates human-understandable rules, which may be very helpful to building managers.
In general, an occupancy count algorithm that fully exploits information available from low-cost, non-intrusive environmental sensors and provides meaningful information is an important yet little-explored problem in office buildings.
PROCESS USED FOR ESTIMATION
Experiment setup
The testbed is an office in Grenoble Institute of Technology, which accommodates a professor and 3 PhD students. The office has frequent visitors with a lot
of meetings and presentations all through the week. The setup for the sensor network
includes:
• 2 video cameras for recording real occupancy numbers and activities.
• an ambience sensing network, which measures illuminance, temperature, relative humidity (RH) and motion with a sampling period of 30 seconds.
• a centralized database with a web-application for retrieving data from different
sources continuously.
All data that may be used for estimating occupancy are called features, as in machine learning.
Generating and Selecting features
The underlying approach for the experiments is to formulate the classification problem as a map from a feature vector to one of several classes. The success of such an approach therefore depends heavily on how good the selected features are, i.e. on how well they separate the classes. In this case, features are attributes from multiple sensors accumulated over an interval. The choice of interval duration is highly context dependent and has to be made according to the granularity required. However, some features do not allow this duration to be arbitrarily small. For example, it has been observed that CO2 levels do not rise immediately, and one of the factors affecting this delay is the ventilation of the space being observed. For the results presented in this paper, an interval of Ts = 30 minutes (referred to here as 1 quantum) has been considered.
Before any features are calculated for the training data, some basic preprocessing is required: interpolation of missing data and application of an outlier removal algorithm. Interpolation is necessary to fill in missing values in the sensor data. Gaps are frequent with event-triggered devices, i.e. devices that report no data point when the measured quantity does not change. The previous data point is therefore copied into the voids.
One quantitative measure of the usefulness of a feature is information gain. Before detailing it, the concept of entropy must be introduced. Entropy is an attribute of a random variable that quantifies its disorder: the higher the entropy, the higher the disorder associated with the variable, i.e. the less it can be predicted. Mathematically, entropy is defined by $H(y) = \sum_{i=0}^{n-1} -p(y_i) \log_2 p(y_i)$, where y is a random variable whose value domain is $dom(y) = \{y_0, \dots, y_{n-1}\}$, H(y) is the entropy of y and $p(y_i)$ is the probability for y to be equal to the value $y_i$. Information gain can now be defined between two random variables x and y as $IG(x, y) = H(y) - H(y|x)$, where y is a target random variable, H(y) is the entropy of y and H(y|x) is the conditional entropy of y given x. The greater the reduction of disorder obtained by fixing feature x, the more information is gained for determining y, making x a good feature to use for classifying y.
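The two definitions above can be illustrated with a minimal Python sketch for discrete features. The function names and the toy data (occupancy levels, a discretized motion feature and a weekday feature) are hypothetical and serve only to show that a feature that determines the target yields a higher information gain than one that barely relates to it.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy H(y), in bits, of a sequence of discrete labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(feature, labels):
    """IG(x, y) = H(y) - H(y|x) for a discrete feature x and a target y."""
    total = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    # H(y|x): entropy of y inside each group of equal x, weighted by group size.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: one value per time quantum.
occupancy = [0, 0, 1, 1, 2, 2, 0, 1]
motion = ["none", "none", "some", "some", "high", "high", "none", "some"]
weekday = ["mon", "tue", "mon", "tue", "mon", "tue", "mon", "tue"]

ig_motion = information_gain(motion, occupancy)
ig_weekday = information_gain(weekday, occupancy)
print(ig_motion > ig_weekday)
```

In this toy case the motion feature fully determines the occupancy level, so its information gain equals H(y), while the weekday feature leaves almost all of the disorder in place.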
Learning process
Among the large set of features presented in (Abhay Arora and Bandyopadhyay, 2015), some are not worth considering for occupancy classification: these are features which, when added to the classification algorithm, make no difference to the overall output. Based on the information gain calculation discussed in (Abhay Arora and Bandyopadhyay, 2015), and letting $T_k = \{t_i : t_i \in [kT_s, (k+1)T_s]\}$ be the time samples related to time quantum k, the most relevant features are:
1. fluctuation count: the PIR sensor in use is a binary sensor that reports a value of 1 whenever it senses motion. The number of times a motion is detected within the duration of 1 quantum is computed as $\sum_{t_i \in T_k} data(t_i)$.
2. occupancy from power consumption, which is estimated for 4 sensors by $\hat{l}_{consumption} = \sum_i \pi_i$ with $\pi_i \in \{0, 1\}$. It satisfies: if $power_i < threshold$ then $\pi_i = 0$ else $\pi_i = 1$, where $power_i$ stands for the average power consumption of laptop i during the time quantum and threshold = 15 W.
3. average: $\frac{1}{|T_k|} \sum_{t_i \in T_k} data(t_i)$, where 0 ≤ k ≤ 47 for one day, since there are 48 half-hours in a day. This feature is calculated for carbon dioxide.
4. time slot generated from the calendar: one of NIGHT, PRELUNCH, LUNCH, POSTLUNCH. These correspond to the time intervals [20-8), [8-12), [12-14), [14-20) respectively.
5. first order derivative: gives the trend of the data. The data points are fitted with a first-order linear equation, and the derivative of the resulting line is recorded. This feature is useful to quantify the rate of increase or decrease of occupancy relative to the previous time interval. This feature is calculated for carbon dioxide.
6. contact state: this feature is extracted for the door contact sensors. Possible values are 0 (door open), 1 (door closed), or a real number $f_{state} \in [0, 1]$, which denotes a state change.
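The per-quantum features listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the first-order derivative is taken as the slope of a least-squares line (the fitting method is not specified in the text), and the sample values are invented.

```python
def fluctuation_count(pir_samples):
    """Number of samples in the quantum where the binary PIR sensor reported 1."""
    return sum(pir_samples)


def occupancy_from_power(avg_powers_w, threshold_w=15.0):
    """Count laptops whose average power over the quantum reaches the threshold."""
    return sum(1 for p in avg_powers_w if p >= threshold_w)


def average(samples):
    """Mean of the samples that fall in the quantum."""
    return sum(samples) / len(samples)


def time_slot(hour):
    """Calendar slot for an hour of day in [0, 24)."""
    if 8 <= hour < 12:
        return "PRELUNCH"
    if 12 <= hour < 14:
        return "LUNCH"
    if 14 <= hour < 20:
        return "POSTLUNCH"
    return "NIGHT"


def first_order_derivative(times, values):
    """Slope of the least-squares line through the (time, value) points.

    Assumption: the linear fit is a least-squares fit; times must not all be equal.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


# Invented samples: CO2 in ppm every 30 seconds, rising by 3 ppm per sample.
times_s = [0, 30, 60, 90]
co2_ppm = [400, 403, 406, 409]
print(average(co2_ppm), first_order_derivative(times_s, co2_ppm))
```

Each function takes only the samples belonging to one quantum $T_k$, so building a feature vector amounts to slicing the raw series by quantum and calling these functions on each slice.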
In this paper, new features are added to the previous ones: audio microphone detection and occupancy from a CO2 physical model. The correlation of these features with occupancy is discussed below.
Classification algorithm
A supervised learning approach has been used, which requires occupancy to be measured before a classification algorithm can be trained. Occupancy counts were manually annotated using the video feed from two cameras strategically positioned in the office. The decision tree classification technique has been selected because it both provides very good results and yields results that are easy to read, analyze and adapt. The decision tree algorithm selects a class by descending a tree of decision nodes. Each internal node represents a comparison of a single feature value with a learnt threshold. The decision tree algorithm aims to select the features that are most useful for classification. One quantitative measure of the usefulness of a feature is the information gain discussed in (Abhay Arora and Bandyopadhyay, 2015). As the information gain approaches zero, the difference between the initial disorder (entropy) of the target variable and the disorder after adding knowledge from the test feature x becomes negligible. Hence, that feature is unlikely to help much in the decision-making process. The decision tree algorithm provides several advantages. As per (Quinlan, 1986), the features with higher information gain are placed higher up the tree, which makes feature selection intrinsic to the classifier. Since the path to a leaf may consist of many internal nodes, each of which may check a different feature value, such paths exploit the correlation among the various features. The decision tree approach also generates rules: the path towards a leaf node is quite informative and clearly points out the direct causes for the selection of a particular class. Unlike methods that use decision boundaries (SVMs, regression techniques), decision tree analyses are independent of the scale of the input data, so no conditioning of the data is necessary.
Using this raw training data, the previously mentioned features were extracted. A vector of features and target $\langle f_1, f_2, f_3, \dots, f_N; y \rangle$ has been generated for each time quantum, where $f_i$ stands for the i-th feature and y for the level of occupancy.
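How an internal node may learn its threshold can be sketched as follows. This is a simplified illustration, not the C4.5 implementation used for the results: it selects, for one numeric feature, the binary split with the highest information gain, whereas C4.5 additionally normalizes the gain (gain ratio) and prunes the tree. The function name `best_threshold`, the use of midpoints between consecutive values as candidate thresholds, and the data are assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of discrete labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain) of the split `value <= threshold` with maximal gain.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, 0.0) when no split improves on the unsplit node.
    """
    base = entropy(labels)
    pairs = sorted(zip(values, labels))
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        split_entropy = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - split_entropy
        if gain > best[1]:
            best = ((low + high) / 2, gain)
    return best


# Invented data: motion counts per quantum and presence labels.
motion_counts = [0, 1, 2, 10, 12, 15]
presence = [0, 0, 0, 1, 1, 1]
threshold, gain = best_threshold(motion_counts, presence)
print(threshold, gain)
```

Repeating this search over all features and recursing on the two resulting subsets gives the basic tree-growing loop; each root-to-leaf path then reads as a rule of the form "feature ≤ threshold and ...".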
Occupancy from Acoustic Sensor
Acoustic features are a very important part of occupancy classification when other non-intrusive sensors offer low class separation. A single omnidirectional microphone, which can pick up sound from virtually any direction, can be an important tool for classifying occupancy. Such microphones are considerably cheaper than multiple unidirectional microphones and prove advantageous in places where multiple sources must be tracked or listened to, such as meetings and discussions (Abhay Arora and Bandyopadhyay, 2015). In this paper, the signal recorded in the office is mostly background environmental noise with a few human voices and some door opening and tapping events. From the recorded signal, the RMS amplitude feature is defined as the root mean square of the amplitude of the sound, which is related to its volume: $V_{RMS} = \sqrt{\frac{\sum_{i=1}^{n} S_i^2}{n}}$, where n is the number of samples taken and $S_i$ is the i-th sample. High and low RMS values give an indication of the level of occupancy inside the office. This relationship is easy to visualize in figure 2 (left side), which represents both the RMS amplitude in dB over 4 days and the actual occupancy profile with respect to time (quantum time is 30 minutes).
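A minimal Python sketch of the RMS amplitude feature is given below. The conversion to dB assumes a reference amplitude of 1.0 (full scale), since the reference level used for figure 2 is not stated; the function names and sample values are illustrative only.

```python
from math import log10, sqrt


def rms(samples):
    """Root mean square of the audio samples of one quantum."""
    return sqrt(sum(s * s for s in samples) / len(samples))


def rms_db(samples, reference=1.0):
    """RMS amplitude in dB relative to `reference` (assumed, not from the paper).

    Raises ValueError for an all-zero signal, whose level in dB is undefined.
    """
    return 20 * log10(rms(samples) / reference)


quiet = [0.01, -0.01, 0.01, -0.01]  # invented: empty office
busy = [0.1, -0.1, 0.1, -0.1]       # invented: conversation
print(rms_db(quiet) < rms_db(busy))
```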
Occupancy from CO2 physical model
An alternative approach to occupancy estimation uses a physical CO2 model. According to ASHRAE (1985), the model given by (1) represents the relation between carbon dioxide generation, the volumetric flow rate of fresh air entering the office, the volumetric air flow rate leaving the office, and occupancy (Aglan, 2003). The proposed approach relies on data coming from CO2 concentration sensors, door contact, window contact, occupancy labels extracted from video cameras for tuning air flows, and constant parameters associated with the office.
$$V \frac{dC_{in}(t)}{dt} = -\left(Q_{out}(t) + Q_{cor}(t)\right) C_{in}(t) + Q_{out}(t) C_{out} + Q_{cor}(t) C_{cor}(t) + n(t) S \qquad (1)$$
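One way to read the step from model (1) to a discrete-time relation is sketched below. It assumes, as an interpretation not stated explicitly in the text, that the air flows, the corridor concentration and the number of occupants are constant over each time quantum of length $T_s$, that the outdoor flow equals its constant value $Q_0^{out}$, and that the corridor flow is $D_k Q_D + Q_0^{cor}$. The shorthand $Q_k$ for the total flow is introduced here for readability only.

```latex
\begin{align*}
Q_k &= D_k Q_D + Q_0^{out} + Q_0^{cor}, \\
V \frac{dC_{in}(t)}{dt} &= -Q_k C_{in}(t) + Q_0^{out} C_{out}
    + (D_k Q_D + Q_0^{cor}) C_{cor,k} + n_k S, \\
C_{in,k+1} &= \alpha_k C_{in,k} + \beta_k \left( Q_0^{out} C_{out}
    + (D_k Q_D + Q_0^{cor}) C_{cor,k} + n_k S \right), \\
\alpha_k &= e^{-Q_k T_s / V}, \qquad \beta_k = \frac{1 - \alpha_k}{Q_k}.
\end{align*}
```

The third line is the exact solution of the linear first-order equation over one quantum; solving it for $n_k$ gives an expression for the number of occupants in terms of two successive indoor concentrations.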
It yields the following estimator:
$$n_k = \frac{C_{in,k+1} - \alpha_k C_{in,k}}{S \beta_k} - \frac{Q_0^{out} C_{out} + (D_k Q_D + Q_0^{cor}) C_{cor,k}}{S} \qquad (2)$$
with $\alpha_k = e^{-\frac{(D_k Q_D + Q_0^{out} + Q_0^{cor}) T_s}{V}}$ and $\beta_k = \frac{1 - \alpha_k}{D_k Q_D + Q_0^{out} + Q_0^{cor}}$,
where:
• time quantum $T_s$ = 1800 seconds
• indoor CO2 concentration: $C_{in}(t)$
• corridor CO2 concentration: $C_{cor}(t)$
• average opening of the door during a time quantum k: $D_k \in [0, 1]$
• CO2 production for 1 average person: S
• number of persons: $n_k$
• air flow exchange with corridor: $Q_{cor,k} = D_k Q_D + Q_0^{cor}$, where $Q_0^{cor}$ stands for the leak air flow with the corridor and the window air flow is assumed to be proportional to the door opening

parameter     initial value     adjusted value
S             7 ppm.m3/s        19.6 ppm.m3/s
Cout          395 ppm           420 ppm
Q0out         0.004 m3/s        0.076 m3/s
Q0cor         0.004 m3/s        0 m3/s
QD            0.04 m3/s         0.1 m3/s
Table 1: Adjusted parameter values for physical CO2 model

Number of levels    Discretizations
L = 2               {[= 0], [> 0]}
L = 3               {[= 0], [> 0, ≤ 3], [> 3]}
L = 4               {[= 0], [> 0, ≤ 2], [> 2, ≤ 4], [> 4]}
L = 5               {[= 0], [> 0, ≤ 1], [> 1, ≤ 2.2], [> 2.2, ≤ 3.2], [> 3.2]}
L = 6               {[= 0], [> 0, ≤ 1], [> 1, ≤ 2], [> 2, ≤ 3], [> 3, ≤ 4], [> 4]}
Table 2: Levels of occupancy considered with ranges
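The occupancy estimator from the physical CO2 model can be sketched in Python as follows, using the adjusted parameter values of Table 1. The room volume `VOLUME` is not given in the text and is set to an arbitrary placeholder; the function names and the concentrations in the example are likewise hypothetical. The sketch checks itself by simulating one quantum forward with a known number of occupants and recovering that number.

```python
from math import exp

# Adjusted values from Table 1 (flows in m3/s, S in ppm.m3/s, concentrations in ppm).
S = 19.6
C_OUT = 420.0
Q0_OUT = 0.076
Q0_COR = 0.0
Q_D = 0.1
TS = 1800.0    # time quantum in seconds
VOLUME = 50.0  # assumption: room volume in m3, not given in the paper


def coefficients(door_opening):
    """alpha_k and beta_k for an average door opening D_k in [0, 1]."""
    q_total = door_opening * Q_D + Q0_OUT + Q0_COR
    alpha = exp(-q_total * TS / VOLUME)
    beta = (1.0 - alpha) / q_total
    return alpha, beta


def predict_next(c_in, c_cor, door_opening, occupants):
    """Indoor concentration one quantum later under constant conditions."""
    alpha, beta = coefficients(door_opening)
    source = Q0_OUT * C_OUT + (door_opening * Q_D + Q0_COR) * c_cor + occupants * S
    return alpha * c_in + beta * source


def estimate_occupants(c_in, c_in_next, c_cor, door_opening):
    """Number of occupants n_k from two successive indoor concentrations."""
    alpha, beta = coefficients(door_opening)
    inflow = Q0_OUT * C_OUT + (door_opening * Q_D + Q0_COR) * c_cor
    return (c_in_next - alpha * c_in) / (S * beta) - inflow / S


# Round trip with invented concentrations: simulate 2 occupants, then recover them.
c_next = predict_next(c_in=600.0, c_cor=450.0, door_opening=0.5, occupants=2)
n_hat = estimate_occupants(600.0, c_next, 450.0, 0.5)
print(round(n_hat, 6))
```

On measured data the estimate is generally not an integer, which is consistent with its use as a numeric input feature for the classifier.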
The first step is to find the best values for the invariant parameters S, $C_{out}$, $Q_0^{out}$, $Q_0^{cor}$ and $Q_D$ using an iterative nonlinear optimization approach, taking into account the positions of the door and the window, as shown in table 1. The objective function minimizes the difference between the actual and estimated numbers of occupants in the room. Optimization covers a long period of time, but fewer, representative observations could conceivably be sufficient.
The next step is to use these adjusted parameters to calculate the number of occupants over a time quantum lasting 30 minutes. The occupancy estimate is obtained from equation (2). The last step is to use this estimate as one feature in the classification model.
Deciding the number of occupancy levels
This section discusses how to choose the number of levels (L) of occupancy for classification. This number is not fixed and can be changed according to the required average error (the average distance between actual occupancy numbers and the midpoints of estimated levels). To determine the number of levels and the related non-overlapping ranges of occupancy, training data are partitioned into L clusters with 2 ≤ L ≤ N, where N is the maximum possible number of occupants. At L = 2, the problem amounts to classifying the presence or absence of people. Table 2 shows the different discretizations considered (N = 4).
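The discretizations of Table 2 can be applied with a small lookup, sketched below in Python. The dictionary `LEVEL_BOUNDS` and the function `occupancy_level` are hypothetical names; the sketch assumes a non-negative occupancy value, possibly fractional when averaged over a quantum, and returns the index of the range it falls into.

```python
from bisect import bisect_left

# Upper bounds of the bounded ranges of Table 2; the last level is open-ended.
LEVEL_BOUNDS = {
    2: [0],
    3: [0, 3],
    4: [0, 2, 4],
    5: [0, 1, 2.2, 3.2],
    6: [0, 1, 2, 3, 4],
}


def occupancy_level(occupancy, n_levels):
    """Index (0-based) of the Table 2 range containing a non-negative occupancy.

    Level 0 is [= 0]; level i > 0 is (bound[i-1], bound[i]]; the last is (bound[-1], inf).
    """
    return bisect_left(LEVEL_BOUNDS[n_levels], occupancy)


print([occupancy_level(x, 5) for x in (0, 0.5, 2.2, 2.5, 4)])
```

Because each range is open on the left and closed on the right, `bisect_left` on the list of upper bounds returns exactly the range index, including level 0 for an empty room.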
Basic set of features
1. motion detector counting
2. occupancy estimation from power consumption
3. CO2 average value
4. time slot
5. CO2 derivative
6. door position
Table 3: Basic Set Of Features
Figure 1: (left) Occupancy estimation considering basic features (right) Occupancy
estimation considering all the features
RESULTING OCCUPANCY ESTIMATORS
The C4.5 decision tree algorithm has been used to perform recognition using aggregated features and the labels extracted from video cameras. Training data cover 11 days, from 4 May 2015 to 14 May 2015, while testing data were collected over 4 days, from 17 May 2015 to 20 May 2015. Over the training period, 120000 data points were collected.
Figure 1, left side, shows the result obtained from the learnt decision tree with the basic set of features (table 3) as input to the detection model. The plot shows both the actual occupancy profile and the estimated profile as the number of occupants with respect to time (quantum time is 30 minutes). The average error is 0.32 occupant.
Figure 1, right side, shows the result obtained from the decision tree after adding two features, audio microphone detection and occupancy from the CO2 physical model, to the basic set. The CO2 average value and CO2 derivative are removed from the initial set of features and replaced by the estimation of occupancy from the CO2 physical model, equation (2). With these features, occupancy estimation improves, with an average error of 0.24 occupant.
Both acoustic pressure (figure 2, left side) and occupancy from the CO2 physical model (figure 2, right side) are observed to be among the most important features for occupancy classification, according to the final decision tree, which ranks the features in ascending order according to their information gain (figure 3, right side). Acoustic pressure improves the estimation at high occupancy levels, while occupancy from the CO2 physical model decreases the overall average error of the classification.
Finally, figure 3 (left side) shows the average error corresponding to each number of levels. Accordingly, 5 levels of occupancy is the best option for the occupancy classification.
Figure 2: (left) Correlation between acoustic pressure and occupation (right) estimation of CO2 using a physical model
Figure 3: (left) Resulting estimation error function of number of occupancy levels (right) Normalized Information Gain from final DT
CONCLUSIONS
A supervised learning approach has been proposed in this paper to estimate the number of occupants in a room. In the presented application, motion fluctuation counters using PIR sensors, power consumption sensors, CO2 mean and derivative, and door position provide the most useful information. The estimation of the number of occupants using a physical CO2 model is also very promising. Classification has been done using the C4.5 classification algorithm, which yields decision trees. Application to an office leads to an average estimation error of 0.24 occupant over a 4-day period, which is quite good.
Supervised learning relied on 2 video cameras, but this approach is limited by privacy issues. Another option has been envisaged: using discrete feedback from the occupants themselves, for example through a keyboard or any other means. In addition, because decision trees are human readable, they can be adjusted using expert knowledge, for instance by adjusting thresholds or by removing some nodes when a piece of information is not available, depending on the considered living areas. The two extensions can be combined to avoid the use of video cameras. This will be investigated in future work.
REFERENCES
Abhay Arora, Manar Amayri, V. R. B. S. P. and Bandyopadhyay, S. 2015. Estimating
occupancy in an office setting. 1BS-2015 Secretariat, Hyderabad, India.
Agarwal, Y., Balaji, B., Gupta, R., Lyles, J., Wei, M., and Weng, T. 2010. Occupancy-driven energy management for smart building automation. In Proceedings of the 2nd ACM Workshop on Embedded Sensing Systems for Energy-Efficiency in Building, pages 1–6. ACM.
Aglan, H. 2003. Predictive model for CO2 generation and decay in building envelopes. Journal of Applied Physics, 93(2).
ASHRAE. 1985. Fundamentals. American Society of Heating, Refrigerating and Air-Conditioning Engineers, Atlanta, GA.
Erickson, V. L., Carreira-Perpiñán, M. Á., and Cerpa, A. E. 2011. Observe: Occupancy-based system for efficient reduction of HVAC energy. In Information Processing in Sensor Networks (IPSN), 2011 10th International Conference on, pages 258–269. IEEE.
Kashif, A., Dugdale, J., and Ploix, S. 2013. Simulating occupants’ behaviour for
energy waste reduction in dwellings: A multi agent methodology. Advances in
Complex Systems, 16:37.
Lam, K. P., Höynck, M., Dong, B., Andrews, B., shang Chiou, Y., Benitez, D., and
Choi, J. 2009. Occupancy detection through an extensive environmental sensor network in an open-plan office building. In Proc. of Building Simulation 09, an IBPSA
Conference.
Milenkovic, M. and Amft, O. 2013a. An opportunistic activity-sensing approach to
save energy in office buildings. In Proceedings of the fourth international conference
on Future energy systems, pages 247–258. ACM.
Milenkovic, M. and Amft, O. 2013b. Recognizing energy-related activities using sensors commonly installed in office buildings. Procedia Computer Science, 19:669–677.
Nguyen, T. A. and Aiello, M. 2012. Beyond indoor presence monitoring with simple
sensors. In PECCS, pages 5–14.
Padmanabh, K., Malikarjuna V, A., Sen, S., Katru, S. P., Kumar, A., Vuppala, S. K.,
Paul, S., et al. 2009. isense: a wireless sensor network based conference room
management system. In Proceedings of the First ACM Workshop on Embedded
Sensing Systems for Energy-Efficiency in Buildings, pages 37–42. ACM.
Page, J., Robinson, D., and Scartezzini, J. 2007. Stochastic simulation of occupant
presence and behaviour in buildings. Proc. Tenth Int. IBPSA Conf : Building Simulation, pages 757–764.
Philipose, M., Fishkin, K. P., Perkowitz, M., Patterson, D. J., Fox, D., Kautz, H.,
and Hahnel, D. 2004. Inferring activities from interactions with objects. Pervasive
Computing, IEEE, 3(4):50–57.
Quinlan, J. R. 1986. Induction of decision trees. Machine learning, 1(1):81–106.
Robinson, D. and Haldi, F. 2009. Interactions with window openings by office occupants. Energy and Buildings, 44:2378–2395.
Roulet, C., Fritsch, R., Scartezzini, J., and Cretton, P. 1991. Stochastic model of
inhabitant behavior with regard to ventilation. Technical report.