Private information can either take the form of key phrases that are explicitly contained in the text or be implicit. For example, demographic information about the author of a text can be predicted with above-chance accuracy from linguistic cues in the text itself. Whether explicit or implicit, some of the private information correlates with the output labels and can therefore be learned by a neural network. In such a case, there is a tradeoff between the utility of the representation (measured by the accuracy of the classification network) and its privacy. The problem is inherently multi-objective because these two objectives may conflict. Thus, we explicitly cast it as multi-objective optimization (MOO) with the overall objective of finding a Pareto stationary solution. We therefore propose a multiple-gradient descent algorithm (MGDA) that enables the efficient application of the Frank-Wolfe algorithm [10] using line search. Experimental results on sentiment analysis and part-of-speech (POS) tagging show that MGDA produces higher-performing models than most recent proxy-objective approaches, and performs as well as single-objective baselines.
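For the two-objective case described in the abstract above (utility versus privacy), the Frank-Wolfe line search has a well-known closed form: the weight that minimizes the norm of the convex combination of the two gradients. The sketch below illustrates only that generic step. The function names `min_norm_weight` and `common_direction` are hypothetical, and this is not the authors' implementation.

```python
from typing import List, Sequence


def min_norm_weight(g1: Sequence[float], g2: Sequence[float]) -> float:
    """Return alpha in [0, 1] minimising ||alpha*g1 + (1 - alpha)*g2||^2."""
    diff = [a - b for a, b in zip(g1, g2)]  # g1 - g2
    denom = sum(d * d for d in diff)        # ||g1 - g2||^2
    if denom == 0.0:
        # Identical gradients: every weight yields the same direction.
        return 0.5
    # Unconstrained minimiser: alpha = (g2 - g1) . g2 / ||g1 - g2||^2
    numer = sum(-d * b for d, b in zip(diff, g2))
    # Clip to [0, 1] so the result stays a convex combination.
    return min(1.0, max(0.0, numer / denom))


def common_direction(g1: Sequence[float], g2: Sequence[float]) -> List[float]:
    """Min-norm convex combination of two gradients (hypothetical helper)."""
    alpha = min_norm_weight(g1, g2)
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(g1, g2)]
```

A zero vector returned by `common_direction` means the two gradients directly oppose each other, which is the Pareto stationarity condition for two objectives.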
2021 IEEE International Conference on Big Knowledge (ICBK), 2021
Summarization of long sequences into a concise statement is a core problem in natural language processing, which requires a non-trivial understanding of weakly structured text. Integrating multiple crowdsourced user comments into a concise summary is even harder because (1) it requires transferring the weakly structured comments to structured knowledge, and (2) users' comments are informal and noisy. To capture the long-distance relationships in staggered long sentences, we propose a neural multi-comment summarization (MCS) system that incorporates sentence relationships via graph heuristics that utilize relation knowledge graphs, i.e., sentence relation graphs (SRG) and approximate discourse graphs (ADG). Motivated by the promising results of gated graph neural networks (GG-NNs) on highly structured data, we develop a GG-NN with a sequence encoder that incorporates SRG or ADG in order to capture the sentence relationships. Specifically, we employ the GG-NNs on both relation knowledge graphs, with the sentence embeddings as the input node features and the graph heuristics as the edge weights. Through multiple layer-wise propagations, the GG-NNs generate the salience for each sentence from high-level hidden sentence features. We then use a greedy heuristic to extract salient user comments while avoiding the noise in comments. The experimental results show that the proposed MCS improves summarization performance both quantitatively and qualitatively.
Concentrating active Pt atoms in the outer layers of electrocatalysts is a very effective approach to greatly reduce the Pt loading without compromising the electrocatalytic performance and the total electrochemically active surface area (ECSA) for the oxygen reduction reaction (ORR) in hydrogen-based proton-exchange membrane fuel cells. Accordingly, a facile, low-cost, and hydrogen-assisted two-step method is developed in this work for the large-scale preparation of carbon-supported uniform, small-sized, and surfactant-free Pd nanoparticles (NPs) with ultrathin ∼3-atomic-layer Pt shells (Pd@Pt3L NPs/C). Comprehensive physicochemical characterizations, electrochemical analyses, fuel cell tests, and density functional theory calculations reveal that, benefiting from the ultrathin Pt-shell nanostructure as well as the resulting ligand and geometric effects, Pd@Pt3L NPs/C exhibits not only significantly enhanced ECSA, electrocatalytic activity, and noble-metal (NM) utilization compared to commercial Pt/C, showing 81.24 m²/gPt, 0.710 mA/cm², and 352/577 mA/mgNM/Pt in ECSA, area-, and NM-/Pt-mass-specific activity, respectively, but also a much better electrochemical stability during the 10,000-cycle accelerated degradation test. More importantly, the corresponding 25-cm² H₂-air/O₂ fuel cell with the low cathodic Pt loading of ∼0.152 mgPt/cm²geo achieves the high power density of 0.962/1.261 W/cm²geo at the current density of only 1,600 mA/cm²geo, which is much higher than that for the commercial Pt/C. This work not only develops a high-performance and practical Pt-based ORR electrocatalyst, but also provides a scalable preparation method for fabricating the ultrathin Pt-shell nanostructure, which can be further expanded to other metal shells for other energy-conversion applications.
The level of urinary retinol-binding protein (RBP) can serve as a significant index of renal tubular injury. In this work, we used Ag@BSA microspheres as a sensing interface to cross-link RBP monoclonal antibody (RBP mAb) via glutaraldehyde for sensitive detection of RBP. The Ag@BSA microspheres covering a Au electrode could provide a larger surface area and a multifunctional substrate for the effective immobilization of RBP mAb, and the outer BSA layer acted as a biocompatible support to maintain the bioactivity and stability of the immobilized immunogen. Electrochemical measurements, including electrochemical impedance spectroscopy (EIS) and differential pulse voltammetry (DPV), were employed to evaluate the analytical performance of the fabricated immunosensor. A higher detection sensitivity was obtained by DPV, attributed to the excellent electrical conductivity of Ag@BSA, which could enhance the peak current response. This immunosensor had a best detection limit (DL) of 18 ng mL⁻¹ and a linear response range between 50 and 4500 ng mL⁻¹. The proposed approach showed high specificity for RBP detection, acceptable reproducibility with an RSD of 5.6%, and good precision with RSDs of 4.5% and 6.3% at RBP concentrations of 500 and 1500 ng mL⁻¹, respectively. Compared with the ELISA method in the analysis of real urine samples from a patient, this immunosensor showed acceptable accuracy with a relative deviation lower than 6.5%, indicating a potential alternative method for RBP detection in clinical diagnosis.
2021 IEEE Symposium Series on Computational Intelligence (SSCI), Dec 5, 2021
With inputs from human crowds, usually through the Internet, crowdsourcing has become a promising methodology in AI and machine learning for applications that require human knowledge. Researchers have recently proposed interval-valued labels (IVLs), instead of commonly used binary-valued ones, to manage uncertainty in crowdsourcing [19]. However, that work has not yet taken the crowd worker's reliability into consideration. Crowd workers usually come from various social and economic backgrounds and have different levels of reliability. To further improve the overall quality of crowdsourcing with IVLs, this work presents practical methods that quantitatively estimate a worker's reliability in terms of his/her correctness, confidence, stability, and predictability from his/her IVLs. Using the worker's reliability, this paper proposes two learning schemes: weighted interval majority voting (WIMV) and weighted preferred matching probability (WPMP). Computational experiments on sample datasets demonstrate that both WIMV and WPMP can significantly improve learning results, achieving higher precision, accuracy, and F1-score than other methods.
We present the techniques we have used to bound the range of the arcsine, arccosine, arctangent, arccotangent, and hyperbolic sine functions in our portable FORTRAN‐77 library INTLIB. The design of this library is based on a balance of simplicity and efficiency, subject to rigor and portability.
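To illustrate what bounding the range of a function means, here is a minimal sketch for the arctangent only. Because arctangent is increasing, its range over an interval is determined by the endpoint values. The function `atan_enclosure` is hypothetical, and widening each bound by one floating-point step assumes the platform's `atan` is accurate to within one unit in the last place. This is not the technique used in INTLIB, which the abstract does not detail.

```python
import math
from typing import Tuple


def atan_enclosure(lo: float, hi: float) -> Tuple[float, float]:
    """Enclose the range of atan over [lo, hi] (hypothetical helper).

    Assumption: math.atan is accurate to within one ulp, so stepping one
    float outward on each side is taken to be enough to contain the range.
    """
    if lo > hi:
        raise ValueError("lower endpoint exceeds upper endpoint")
    # atan is increasing, so the endpoints give the extreme values.
    lower = math.nextafter(math.atan(lo), -math.inf)  # round outward (down)
    upper = math.nextafter(math.atan(hi), math.inf)   # round outward (up)
    return lower, upper
```

`math.nextafter` requires Python 3.9 or later.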
2017 IEEE International Conference on Big Knowledge (ICBK), 2017
This paper proposes an approach, KeyRank, to extract high-quality keyphrases from a document in English. It first searches for all keyphrase candidates in the document, and then ranks them to select the top-N candidates as the final keyphrases. Based on the common-sense observation that words do not repeatedly appear in an effective keyphrase in English, a novel keyphrase candidate search algorithm applying sequential pattern mining with gap constraints (called KCSP) is proposed to search keyphrase candidates for KeyRank. An effectiveness evaluation measure, pattern frequency with entropy (called PF-H), is then proposed to rank these keyphrase candidates for KeyRank. Our experimental results show that KeyRank performs better than existing popular approaches such as TextRank and KeyEx. In addition, KCSP is much more efficient than a closely related approach, SPMW, and PF-H can be applied to improve the performance of TextRank.
Information Processing and Management of Uncertainty in Knowledge-Based Systems, 2020
Applying interval-valued data and methods, researchers have made solid accomplishments in information processing and uncertainty management. Although interval-valued statistics and probability are available for interval-valued data, current inferential decision-making schemes mostly rely on point-valued statistical and probabilistic measures. To enable direct application of these point-valued schemes to interval-valued datasets, we present point-valued variational statistics, probability, and entropy for interval-valued datasets. Related algorithms are reported with illustrative examples.
... However, the decision as to what basis to use would require a fore-knowledge of the pattern one wants to identify ... the problem of generalization can be alleviated using neural network implementations of the Sammon mapping; de Ridder and Duin (1997) present a comparison. ...
Papers by Chenyi Hu