Haohui Chen

    Purpose: This paper aims to demonstrate how skills taxonomies can be used in combination with machine learning to integrate diverse online datasets and reveal skills gaps. The purpose of this study is then to show how the skills gaps revealed by the integrated datasets can be used to achieve better labour market alignment, keep educational offerings up to date and assist graduates to communicate the value of their qualifications. Design/methodology/approach: Using the ESCO taxonomy and natural language processing, this study captures skills data from three types of online data (job ads, course descriptions and resumes), allowing us to compare demand for skills and supply of skills for three different occupations. Findings: This study illustrates three practical applications for the integrated data, showing how they can be used to help workers who are disrupted by technology to identify alternative career pathways, assist educators to identify gaps in their course offerings and support stude...
    Timely and accurate statistics on the labour market enable policymakers to rapidly respond to changing economic conditions. Estimates of job vacancies by national statistical agencies are highly accurate but reported infrequently and with time lags. In contrast, online job postings provide a high-frequency indicator of vacancies with less accuracy. In this study we develop a robust signal averaging algorithm to measure job vacancies using online job postings data. We apply the algorithm using data on Australian job postings and show that it accurately predicts changes in job vacancies over a 4.5-year period. We also show that the algorithm is significantly more accurate than using raw counts of job postings to predict vacancies. The algorithm therefore offers a promising approach to the timely and reliable measurement of changes in vacancies.
    Postgres database of followees (backup created by pg_dump in pgAdmin). Partitioned into 9 smaller files with the 'split' command. To assemble, use "cat followees[1-9] > combined_followees.backup".
    Postgres database of tweets (backup created by pg_dump in pgAdmin). Partitioned into 5 smaller files with the 'split' command. To assemble, use "cat rawdata[1-5] > combined_rawdata.backup".
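    Where a shell with `cat` is not available, the reassembly step described in the dataset entries can be approximated in Python. The following is a minimal sketch, assuming the parts sit in the current directory and that sorting their names lexicographically gives the original order (true for single-digit suffixes such as `followees1` to `followees9`); the function name `assemble` is hypothetical.

```python
import glob
import shutil


def assemble(pattern: str, output: str) -> list[str]:
    """Concatenate all files matching `pattern`, in sorted name order, into `output`.

    Mirrors `cat followees[1-9] > combined_followees.backup`.
    """
    parts = sorted(glob.glob(pattern))  # resolved before the output file is created
    if not parts:
        raise FileNotFoundError(f"no files match {pattern!r}")
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # stream bytes without loading the whole part
    return parts
```

    For example, `assemble("followees[1-9]", "combined_followees.backup")` or `assemble("rawdata[1-5]", "combined_rawdata.backup")`.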
    Information flow during catastrophic events is a critical aspect of disaster management. Modern communication platforms, in particular online social networks, provide an opportunity to study such flow and derive early-warning sensors, thus improving emergency preparedness and response. Performance of the social network sensor method, based on topological and behavioral properties derived from the “friendship paradox”, is studied here for over 50 million Twitter messages posted before, during, and after Hurricane Sandy. We find that differences in users’ network centrality effectively translate into a moderate awareness advantage (up to 26 hours); and that geo-location of users within or outside of the hurricane-affected area plays a significant role in determining the scale of such an advantage. Emotional response appears to be universal regardless of the position in the network topology, and displays characteristic, easily detectable patterns, opening a possibility to implement a simple “sentiment sensing” technique that can detect and locate disasters
    Supporting materials for Bandit Strategies in Social Search: the case of the DARPA Red Balloon Challenge. (pdf)
    China is the world's second largest economy. After four decades of economic miracles, China's economy is transitioning into an advanced, knowledge-based economy. Yet, we still lack a detailed understanding of the skills that underlie the Chinese labor force, and the development and spatial distribution of these skills. For example, the US standardized skill taxonomy O*NET played an important role in understanding the dynamics of manufacturing and knowledge-based work, as well as potential risks from automation and outsourcing. Here, we use Machine Learning techniques to bridge this gap, creating China's first workforce skill taxonomy, and map it to O*NET. This enables us to reveal workforce skill polarization into social-cognitive skills and sensory-physical skills, and to explore China's regional inequality in light of workforce skills, and compare it to traditional metrics such as education. We build an online tool for the public and policy makers to explore the...
    When facing threats from automation, a worker residing in a large Chinese city might not be as lucky as a worker in a large U.S. city, depending on the type of large city in which one resides. Empirical studies found that large U.S. cities exhibit resilience to automation impacts because of the increased occupational and skill specialization. However, in this study, we observe polarized responses in large Chinese cities to automation impacts. The polarization might be attributed to the elaborate master planning of the central government, through which cities are assigned different industrial goals to achieve globally optimal economic success and, thus, a fast-growing economy. By dividing Chinese cities into two groups based on their administrative levels and premium resources allocated by the central government, we find that Chinese cities follow two distinct industrial development trajectories: one trajectory, backed by government support, leads to a diversified industrial structur...
    Immersive virtual- and augmented-reality headsets can overlay a flat image against any surface or hang virtual objects in the space around the user. The technology is rapidly improving and may, in the long term, replace traditional flat panel displays in many situations. When displays are no longer intrinsically flat, how should we use the space around the user for abstract data visualisation? In this paper, we ask this question with respect to origin-destination flow data in a global geographic context. We report on the findings of three studies exploring different spatial encodings for flow maps. The first experiment focuses on different 2D and 3D encodings for flows on flat maps. We find that participants are significantly more accurate with raised flow paths whose height is proportional to flow distance but fastest with traditional straight line 2D flows. In our second and third experiments, we compared flat maps, 3D globes and a novel interactive design we call MapsLink, involvi...
    Collective search for people and information has tremendously benefited from emerging communication technologies that leverage the wisdom of the crowds, and has been increasingly influential in solving time-critical tasks such as the DARPA Network Challenge (DNC, also known as the Red Balloon Challenge). However, while collective search often invests significant resources in encouraging the crowd to contribute new information, the effort invested in verifying this information is comparable, yet often neglected in crowdsourcing models. This paper studies how the exploration-verification trade-off displayed by the teams modulated their success in the DNC, as teams had limited human resources that they had to divide between recruitment (exploration) and verification (exploitation). Our analysis suggests that team performance in the DNC can be modelled as a modified multi-armed bandit (MAB) problem, where information arrives to the team originating from sources of different levels of ve...
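    The exploration-exploitation trade-off behind the multi-armed bandit framing in the abstract above can be illustrated with a textbook epsilon-greedy strategy. This is a generic sketch, not the modified MAB model developed in the paper: the arms stand in for information sources, and their reliabilities, the `epsilon` value and the function name are all hypothetical.

```python
import random


def epsilon_greedy(true_rates, steps=1000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy agent on Bernoulli arms.

    true_rates: hypothetical probability that each source yields a correct report.
    Returns (pulls, wins): how often each arm was chosen and how often it paid off.
    """
    rng = random.Random(seed)
    n = len(true_rates)
    pulls = [0] * n
    wins = [0] * n
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n)  # explore: try a random source
        else:
            # exploit: pick the source with the best observed success rate so far
            arm = max(range(n), key=lambda i: wins[i] / pulls[i] if pulls[i] else 0.0)
        reward = 1 if rng.random() < true_rates[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
    return pulls, wins


if __name__ == "__main__":
    pulls, wins = epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:", pulls)
    print("wins per arm:", wins)
```

    A larger `epsilon` spends more effort on exploring new sources; a smaller one concentrates effort on the sources that have looked most reliable so far.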
    of Bandit strategies in social search: the case of the DARPA red balloon challenge
    Mapping new and old buildings is of great significance for understanding socio-economic development in rural areas. In recent years, deep neural networks have achieved remarkable building segmentation results in high-resolution remote sensing images. However, the scarce training data and the varying geographical environments have posed challenges for scalable building segmentation. This study proposes a novel framework based on Mask R-CNN, named HTMask R-CNN, to extract new and old rural buildings even when the label is scarce. The framework adopts the result of single-object instance segmentation from the orthodox Mask R-CNN. Further, it classifies the rural buildings into new and old ones based on a dynamic grayscale threshold inferred from the result of a two-object instance segmentation task where training data is scarce. We found that the framework can extract more buildings and achieve a much higher mean Average Precision (mAP) than the orthodox Mask R-CNN model. We tested th...
    China, the world’s second largest economy, is transitioning into an advanced, knowledge-based economy after four decades of rapid economic development. However, China still lacks a detailed understanding of the skills that underlie the Chinese labor force, and the development and spatial distribution of these skills. Similar data has proven essential in other contexts; for example, the US standardized skill taxonomy, Occupational Information Network (O*NET), played an important role in understanding the dynamics of manufacturing and knowledge-based work, and the potential risks from automation and outsourcing. Here, we use Machine Learning techniques to bridge this gap, creating China’s first workforce skill taxonomy, and map it to O*NET. This enables us to reveal workforce skill polarization into social-cognitive skills and sensory-physical skills, and to explore China’s regional inequality in light of workforce skills, and compare it to traditional metrics such as education. We bui...
    Location-based social network data offers the promise of collecting data from a large base of users over a longer span of time at negligible cost. While several studies have applied social network data to activity and mobility analysis, a comparison with travel diaries and general statistics has been lacking. In this paper, we analysed geo-referenced Twitter activities from a large number of users in Singapore and neighbouring countries. By combining this data, population statistics and travel diaries and applying clustering techniques, we addressed detection of activity locations, as well as spatial separation and transitions between these locations. Kernel density estimation performs best to detect activity locations due to the scattered nature of the Twitter data; more activity locations are detected per user than reported in the travel survey. The descriptive analysis shows that determining home locations is more difficult than detecting work locations for most planning zone...
    Mapping new and old buildings is of great significance for understanding socio-economic development in rural areas. In recent years, deep neural networks have achieved remarkable building segmentation results in high-resolution remote sensing images. However, the scarce training data and the varying geographical environments have posed challenges for scalable building segmentation. This study proposes a novel framework based on Mask R-CNN, named Histogram Thresholding Mask Region-Based Convolutional Neural Network (HTMask R-CNN), to extract new and old rural buildings even when the label is scarce. The framework adopts the result of single-object instance segmentation from the orthodox Mask R-CNN. Further, it classifies the rural buildings into new and old ones based on a dynamic grayscale threshold inferred from the result of a two-object instance segmentation task where training data is scarce. We found that the framework can extract more buildings and achieve a much higher mean ...
    Multiple driving forces shape cities. These forces include the costs of transporting goods and people, the types of predominant local industries, and the policies that govern urban planning. Here, we examine how agglomeration and dispersion change with increasing population and population density. We study the patterns in the evolution of urban forms and analyze the differences between developed and developing countries. We analyze agglomeration across 233 European and 258 Chinese cities using nighttime luminosity data. We find a universal inverted U-shape curve for the agglomeration metric (Lasym index). Cities attain their maximum agglomeration level at an intermediate density, above which dispersion increases. Our findings may guide strategic urban planning for the timely adoption of appropriate development policies.
    Could social media data aid in disaster response and damage assessment? Countries face both an increasing frequency and intensity of natural disasters due to climate change. And during such events, citizens are turning to social media platforms for disaster-related communication and information. Social media improves situational awareness, facilitates dissemination of emergency information, enables early warning systems, and helps coordinate relief efforts. Additionally, spatiotemporal distribution of disaster-related messages helps with real-time monitoring and assessment of the disaster itself. Here we present a multiscale analysis of Twitter activity before, during, and after Hurricane Sandy. We examine the online response of 50 metropolitan areas of the United States and find a strong relationship between proximity to Sandy's path and hurricane-related social media activity. We show that real and perceived threats -- together with the physical disaster effects -- are directl...
    This study focuses on investigating the changing export patterns, evolution characteristics, and influencing trade mechanisms of countries on a global scale. Based on comprehensive customs data, our study found that core location and export types, including machinery and chemical products, both play positive roles in promoting countries’ economic development. Developed countries are more likely to be at the core of the product space and to export machinery and chemical products. Countries’ R&D investment can affect the export location and types regardless of their economy, while high education matters in developed countries, and FDI (Foreign Direct Investment) is critical in developing countries. It indicates that technological benefits created by human capital can promote the export economy. Nevertheless, developing countries are not able to release strong knowledge spillover effects through their education systems, and they are relying more on the introduction of foreign investmen...
    This study examines publicly available online search data in China to investigate the spread of public awareness of the 2019 novel coronavirus (SARS-CoV-2) outbreak. We found that cities that had previously suffered from SARS (in 2003–04) and have greater migration ties to Wuhan had earlier, stronger and more durable public awareness of the outbreak. Our data indicate that 48 such cities developed awareness up to 19 days earlier than 255 comparable cities, giving them an opportunity to better prepare. This study suggests that it is important to consider memory of prior catastrophic events as they will influence the public response to emerging threats.
    This study examines publicly available online search data in China to investigate the spread of public awareness of the 2019 novel coronavirus (COVID-19) outbreak. We found that cities that suffered from SARS and have greater migration ties to the epicentre, Wuhan, had earlier, stronger and more durable public awareness of the outbreak. Our data indicate that forty-eight such cities developed awareness up to 19 days earlier than 255 comparable cities, giving them an opportunity to better prepare. This study suggests that it is important to consider memory of prior catastrophic events as they will influence the public response to emerging threats.
    The compact city, as a sustainable concept, is intended to augment the efficiency of urban function. However, previous studies have concentrated more on morphology than on structure. The present study focuses on urban structural elements, i.e. urban hotspots consisting of high-density and high-intensity socioeconomic zones, and explores the economic performance associated with their spatial structure. We use night-time luminosity data and the Loubar method to identify and extract the hotspots and ultimately draw two conclusions. First, as population increases, the number of hotspots scales sublinearly with an exponent of approximately 0.50–0.55, regardless of the location in China, the EU or the USA, while the intercept values differ markedly, mainly because of different levels of economic development. Secondly, we demonstrate that the compactness of hotspots imposes an inverted U-shaped influence on economic growth, which implies that an optimal compactness coefficient does ...
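    A sublinear scaling relation of the kind reported in the abstract above is commonly written as N = a · P^β with β < 1, where N is the number of hotspots and P the population, and is estimated by ordinary least squares on log-transformed values. The sketch below assumes that functional form; the function name and the synthetic data are illustrative only and are not taken from the study.

```python
import math


def fit_power_law(populations, hotspot_counts):
    """Fit N = a * P**beta by least squares on log(N) versus log(P).

    Returns (a, beta): the prefactor (intercept on the log scale) and the scaling exponent.
    """
    xs = [math.log(p) for p in populations]
    ys = [math.log(n) for n in hotspot_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                       # slope on the log-log plot
    a = math.exp(mean_y - beta * mean_x)   # back-transform the log-scale intercept
    return a, beta


if __name__ == "__main__":
    # Synthetic cities generated with a = 2 and beta = 0.5 (made-up values).
    pops = [1e4, 1e5, 1e6, 1e7]
    counts = [2.0 * p ** 0.5 for p in pops]
    print(fit_power_law(pops, counts))
```

    Under this form, regions can share a similar exponent β while differing in the prefactor a, which corresponds to different intercepts on a log-log plot.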
    Increasing economic integration between countries has spurred the rapid growth of border areas. However, whether city-level boundary areas can benefit from regional integration within one regime is unknown. Along with growing numbers of integrated regions in developing economies, especially in China, understanding the extent to which geographic growth and location conditions play roles in the process is of great importance. Therefore, this study used night-time light data between 2013 and 2018 to investigate the growth patterns of different parts of the highly integrated Pearl River Delta (PRD) of China. The results showed that border areas, especially those with emerging economies, grew faster than city centers during the study period. Moreover, we built ordinary least squares (OLS) regression and spatial econometrics models to understand how location conditions across the two cities affected the growth process. The models showed that the urbanization levels across the two cities h...
    Could social media data aid in disaster response and damage assessment? Countries face both an increasing frequency and an increasing intensity of natural disasters resulting from climate change. During such events, citizens turn to social media platforms for disaster-related communication and information. Social media improves situational awareness, facilitates dissemination of emergency information, enables early warning systems, and helps coordinate relief efforts. In addition, the spatiotemporal distribution of disaster-related messages helps with the real-time monitoring and assessment of the disaster itself. We present a multiscale analysis of Twitter activity before, during, and after Hurricane Sandy. We examine the online response of 50 metropolitan areas of the United States and find a strong relationship between proximity to Sandy's path and hurricane-related social media activity. We show that real and perceived threats, together with physical disaster effects, are di...
    This article describes the integration of a smartphone, a world viewer and a geodatabase into a collaborative virtual environment (CVE) as a knowledge management platform for use in land management. A spatial interoperability mechanism was designed for integration of these various technologies distributed in different system layers and written in different programming languages. As users may vary in their education backgrounds and understanding of advanced information technologies, the proposed platform employs existing popular spatial technologies to facilitate usage. The platform includes an iPhone™ application, a web portal based on Google Earth™ viewer and a data server, all of which may be deployed in different and distant places, allowing remote collaboration. To evaluate the usability of the platform, a case study was implemented involving a scientist, a farmer and an agricultural consultant working collaboratively, but remotely, within the system to support their farming pra...
    Organizations gain competitiveness through knowledge management. However, knowledge management in a distributed environment faces two main issues: geographical distance and cognitive distance. This research adopted the concepts of Web 2.0 and designed a knowledge management system, iFarming, based on collaborative virtual environment technology, to reduce these two distances in the context of Australian agriculture. A case study involving a real farmer, scientist and agricultural consultant was carried out to assess the value of iFarming, through which a new paradigm for distributed communications was achieved.