Advancements in Civil Engineering & Technology

A Succinct Summary of Vision-based Building Occupancy Estimation

Kailai Sun*

Center for Intelligent and Networked Systems, Department of Automation, BNRist, Tsinghua University, Beijing, China

*Corresponding author: Kailai Sun, Center for Intelligent and Networked Systems, Department of Automation, BNRist, Tsinghua University, Beijing, China

Submission: January 27, 2023; Published: February 07, 2023

DOI: 10.31031/ACET.2023.05.000615

ISSN: 2639-0574
Volume 5 Issue 3


Buildings currently represent 36% of global energy consumption and 37% of global greenhouse gas emissions, according to U.N. data [1]. The human factor plays a crucial role in reducing building carbon emissions to reach zero emissions by 2050. Many studies have shown that approximately 10% to 40% of building energy consumption can be saved with occupancy information [2,3], so accurate occupancy estimation has become an increasingly active research topic. Meanwhile, with the rapid development of artificial intelligence (AI) and computer vision, image and video analysis technologies have been widely applied in buildings. This mini review presents the state of vision-based building occupancy estimation and then discusses challenges and future trends.

Mini Review

In terms of the visual information captured, vision-based building occupancy estimation methods can be divided mainly into scene-based counting and line-based counting. Depending on where the cameras are installed, there are two visual situations: the room interior and the entrance.

(1) For scene-based counting methods (SCMs), cameras are usually installed inside rooms, and the captured videos are analyzed with AI and computer vision technologies. Most studies apply people detection algorithms (e.g., YOLO [4]) to estimate indoor occupancy. These algorithms fall mainly into body detection and head detection. Body detection methods extract hand-crafted or deep features for body recognition [5]. In complex indoor scenes, head detection has gradually become the mainstream SCM because heads are more visible [6]. Most studies apply general object detectors, while others propose specific detectors that exploit prior knowledge about occupants (e.g., head size [7] and head motion information [8]). However, SCMs have two weaknesses: in complex environments many other objects are recognized as occupants (i.e., false positives), and it is hard to deploy cameras that cover the entire room without occlusion. In addition, estimating occupancy with single-frame detectors is unstable, so many studies have adopted multi-frame methods to enhance features and remove irregular estimation results [9].
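The multi-frame idea can be illustrated with a minimal sketch, assuming per-frame head counts are already available from some detector. The sliding median filter below is only an illustrative choice and is not the method of [9]; the function name `smooth_counts` and the window size are hypothetical.

```python
from statistics import median_low


def smooth_counts(counts, window=5):
    """Apply a sliding-window median to per-frame occupant counts.

    counts: list of integer counts, one per frame (assumed detector output).
    window: number of neighbouring frames considered around each frame.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    half = window // 2
    smoothed = []
    for i in range(len(counts)):
        lo = max(0, i - half)
        hi = min(len(counts), i + half + 1)
        # median_low always returns a value that occurs in the window,
        # so the result stays an integer count.
        smoothed.append(median_low(counts[lo:hi]))
    return smoothed


# Hypothetical detector output: one false-positive burst (7) and one miss (2).
per_frame = [3, 3, 7, 3, 3, 2, 3, 3]
print(smooth_counts(per_frame))
```

A median is used here rather than a mean so that a single outlier frame does not shift the estimate; a larger window suppresses more noise but reacts more slowly when occupancy really changes.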

(2) For line-based counting methods (LCMs), cameras are usually installed at room entrances. Most studies detect and track occupants at entrances and estimate building occupancy by counting passing events. They segment the foreground area with background subtraction, track moving occupants with trackers (Kalman filter, DeepSORT, etc.), and then determine from the moving direction whether occupants are entering or leaving through the door [10]. Cameras at room entrances are usually installed in one of two views: side or overhead. In the side view, LCMs detect and track occupants' bodies [11]; in the overhead view, LCMs recognize the head and shoulders [12]. However, errors may occur when many occupants pass through a room entrance simultaneously [13]. Once an occupant is miscounted, the error accumulates until it is manually cleared.
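As a minimal sketch of the counting step only, assume a tracker has already produced a centroid trajectory for each occupant and that a horizontal virtual line at `line_y` marks the doorway in an overhead view. The names `count_crossings` and `line_y`, and the convention that increasing y means entering, are assumptions for illustration and are not taken from the cited studies.

```python
def count_crossings(tracks, line_y):
    """Count entering and leaving events from centroid trajectories.

    tracks: dict mapping a track id to a list of (x, y) centroids in frame order
            (assumed to come from an upstream detector and tracker).
    line_y: y coordinate of the virtual counting line (hypothetical parameter).
    Returns a tuple (entered, left).
    """
    entered = 0
    left = 0
    for points in tracks.values():
        # Compare each pair of consecutive centroids of the same track.
        for (_, y_prev), (_, y_cur) in zip(points, points[1:]):
            if y_prev < line_y <= y_cur:
                entered += 1  # moved across the line in the +y direction
            elif y_cur < line_y <= y_prev:
                left += 1     # moved across the line in the -y direction
    return entered, left


tracks = {
    1: [(50, 10), (52, 40), (55, 70)],  # crosses the line in +y: enters
    2: [(80, 90), (78, 60), (75, 20)],  # crosses the line in -y: leaves
    3: [(20, 10), (22, 30)],            # never reaches the line
}
entered, left = count_crossings(tracks, line_y=50)
occupancy_change = entered - left
print(entered, left, occupancy_change)
```

Because the running occupancy is the difference of two event counters, a single missed or spurious crossing shifts every later estimate, which is the cumulative error noted above.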

(3) To mitigate the above limitations and improve estimation performance, many studies have developed fusion methods [13-15]. These methods combine the heterogeneous visual information from LCMs and SCMs to eliminate cumulative errors and irregular estimation results. Using scene knowledge and the number of occupants indoors, they adjust or automatically switch between LCMs and SCMs at the people level to obtain more fine-grained estimates.
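One simple way to picture such a fusion is sketched below. It is a toy rule under stated assumptions, not the algorithm of [13-15]: the entrance camera supplies a net change per time step, the in-room camera supplies a count per time step, and the running estimate is reset to the in-room count whenever that count has been identical for `stable_frames` consecutive steps. The function name `fuse` and all parameters are hypothetical.

```python
def fuse(lcm_deltas, scm_counts, stable_frames=3):
    """Fuse entrance events (LCM) with in-room counts (SCM).

    lcm_deltas: net occupancy change per time step from line-based counting.
    scm_counts: occupant count per time step from scene-based counting.
    stable_frames: how many identical consecutive SCM counts are trusted
                   enough to overwrite the running estimate (assumed rule).
    """
    if len(lcm_deltas) != len(scm_counts):
        raise ValueError("both inputs need one value per time step")
    estimate = 0
    fused = []
    for t, delta in enumerate(lcm_deltas):
        # Integrate entrance events; occupancy can never be negative.
        estimate = max(0, estimate + delta)
        recent = scm_counts[max(0, t - stable_frames + 1): t + 1]
        # A stable scene count clears any error accumulated by the LCM.
        if len(recent) == stable_frames and len(set(recent)) == 1:
            estimate = recent[0]
        fused.append(estimate)
    return fused


# Hypothetical case: three people enter, but the entrance camera misses one.
lcm_deltas = [1, 1, 0, 0, 0, 0]
scm_counts = [1, 2, 3, 3, 3, 3]
print(fuse(lcm_deltas, scm_counts))
```

In this example the line-based estimate stays at 2 until the scene-based count has read 3 for three consecutive steps, after which the estimate is corrected to 3 without manual clearing.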

Challenges and Future Trends

Although existing methods have made remarkable progress toward accurate occupancy estimation for building energy saving, they suffer from inherent limitations:
A. Complex indoor scenes, occlusion, and illumination severely affect SCMs. Occupants are often occluded by other objects (e.g., tables, chairs, computers). Moving occupants cause significant variations in scale, pose, texture, and illumination.
B. Datasets are critical for AI, yet public visual occupancy datasets for buildings are lacking.
C. Vision-based methods provide fine-grained information but raise the risk of privacy leakage.
D. Many studies apply neural networks and achieve state-of-the-art (SOTA) performance. However, making AI more reliable when detectors and trackers are deployed in buildings remains a big challenge.
E. Cumulative errors are hard to clear with LCMs alone.

To address the above challenges, future research can focus on the following aspects:
a) Developing advanced sensor fusion technologies based on machine-learning algorithms for occupancy estimation.
b) Collecting and establishing multi-modal occupancy datasets in buildings.
c) Verifying and testing deep neural networks, including their robustness to adversarial attacks and the corresponding defenses, before practical deployment.
d) Applying neural network compression technologies on AI and IoT edge computing devices to enable smart buildings and reduce the communication delay.
e) Considering federated learning to meet user privacy protection and data security requirements. In particular, each edge device collects data and trains a local machine learning model, then uploads only the model parameters to the server, which greatly reduces the risk of privacy leakage.
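The server-side step in item e) can be sketched with federated averaging, a common aggregation rule in which the server combines uploaded parameters weighted by each client's sample count. The sketch below assumes each edge device uploads a flat list of parameters and its local sample count; the function name `federated_average` and the example values are hypothetical, and local training is omitted.

```python
def federated_average(client_weights, client_sizes):
    """Aggregate client parameters by a sample-size-weighted average.

    client_weights: list of parameter vectors, one list of floats per client.
    client_sizes: number of local training samples held by each client.
    Only parameters and sizes reach the server; raw images stay on the edge.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for j, w in enumerate(weights):
            averaged[j] += w * n / total
    return averaged


# Two hypothetical edge devices with different amounts of local data.
global_params = federated_average(
    client_weights=[[1.0, 2.0], [3.0, 4.0]],
    client_sizes=[1, 3],
)
print(global_params)
```

Weighting by sample count lets rooms with more observations influence the shared model more, while the video itself never leaves the device that recorded it.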


Occupancy information is important for building HVAC system control and energy saving. This paper reviews recent vision-based occupancy estimation methods, including their technical details and limitations. It also presents challenges and future trends, including datasets, edge computing, and federated learning.


  2. Jiaqing Xie, Haoyang Li, Chuting Li, Jingsi Zhang, Maohui Luo (2020) Review on occupant-centric thermal comfort sensing, predicting and controlling. Energy and Buildings 226: 110392.
  3. Zhihong Pang, Yan Chen, Jian Zhang, Zheng O'Neill, Hwakong Cheng, et al. (2020) Nationwide HVAC energy-saving potential quantification for office buildings with occupant-centric controls in various climates. Applied Energy 279: 115727.
  4. Ivan Mutis, Abhijeet Ambekar, Virat Joshi (2020) Realtime space occupancy sensing and human motion analysis using deep learning for indoor air quality control. Automation in Construction 116: 103237.
  5. Haneul Choi, Chai Yoon Um, Kyungmo Kang, Hyungkeun Kim, Taeyeon Kim (2021) Application of vision-based occupancy counting method using deep learning and performance analysis. Energy and Buildings 252: 111389.
  6. Yue Bo Meng, Tong Yue Li, Guang Hui Liu, Sheng Jun Xu, Tuo Ji (2020) Real-time dynamic estimation of occupancy load and an air-conditioning predictive control method based on image information fusion. Building and Environment 173: 106741.
  7. Russell Stewart, Mykhaylo Andriluka, Andrew Y Ng (2016) End-to-end people detection in crowded scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2325-2333.
  8. Kailai Sun, Xiaoteng Ma, Peng Liu, Qianchuan Zhao (2022) MPSN: Motion-aware pseudo-siamese network for indoor video head detection in buildings. Building and Environment 222: 109354.
  9. Jianhong Zou, Qianchuan Zhao, Wen Yang, Fulin Wang (2017) Occupancy detection in the office by analyzing surveillance videos and its application to building energy conservation. Energy and Buildings 152: 385-398.
  10. Haneul Choi, Chai Yoon Um, Kyungmo Kang, Hyungkeun Kim, Taeyeon Kim (2021) Review of vision-based occupant information sensing systems for occupant-centric control. Building and Environment 203: 108064.
  11. Muhammad Aftab, Chien Chen, Chi-Kin Chau, Talal Rahwan (2017) Automatic HVAC control with real-time occupancy recognition and simulation-guided model predictive control in low-cost embedded system. Energy and Buildings 154: 141-156.
  12. Junjing Yang, Alexandros Pantazaras, Karn Ashokkumar Chaturvedi, Arun Kumar Chandran, Mat Santamouris, et al. (2018) Comparison of different occupancy counting methods for single system-single zone applications. Energy and Buildings 172: 221-234.
  13. Ipek Gursel Dino, Esat Kalfaoglu, Orcun Koral Iseri, Bilge Erdogan, Sinan Kalkan, et al. (2022) Vision-based estimation of the number of occupants using video cameras. Advanced Engineering Informatics 53: 101662.
  14. Kailai Sun, Peng Liu, Tian Xing, Qianchuan Zhao, Xinwei Wang (2022) A fusion framework for vision-based indoor occupancy estimation. Building and Environment 225: 109631.
  15. Kailai Sun, Qianchuan Zhao, Ziyou Zhang, Xinyuan Hu (2022) Indoor occupancy measurement by the fusion of motion detection and static estimation. Energy and Buildings 254: 111593.

© 2023 Kailai Sun. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and build upon your work non-commercially.