Hi! My name is Ekta Gujral and I come from Punjab, India, the land of five rivers (from the Persian words PANJ, "five", and AB, "waters"). I received my Electronics
Product attribute extraction is a growing field in the e-commerce business, with several applications including product ranking, product recommendation, future assortment planning, and improving the online shopping customer experience. Understanding customer needs is a critical part of online business, particularly for fashion products. Retailers use assortment planning to determine the mix of products to offer in each store and channel, stay responsive to market dynamics, and manage inventory and catalogs. The goal is to offer the right styles, in the right sizes and colors, through the right channels. When shoppers find products that meet their needs and desires, they are more likely to return for future purchases, fostering customer loyalty. Product attributes are a key factor in assortment planning. In this paper we present PAE, a product attribute extraction algorithm for future trend reports consisting of text and images in PDF format. Most existing methods focus on attribute extraction from titles or product descriptions, or utilize visual information from existing product images. In contrast to prior work, ours focuses on attribute extraction from PDF files in which upcoming fashion trends are described. This work proposes a more comprehensive framework that fully utilizes the different modalities for attribute extraction and helps retailers plan their assortment in advance.
Our contributions are three-fold: (a) we develop PAE, an efficient framework to extract attributes from unstructured data (text and images); (b) we provide a catalog matching methodology based on BERT representations to discover existing attributes that match upcoming attribute values; (c) we conduct extensive experiments against several baselines and show that PAE is an effective, flexible framework that is on par with or superior to (92.5% average F1-score) the existing state of the art for the attribute value extraction task.
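The catalog matching step above can be illustrated with a minimal sketch: embed each attribute value, then map an upcoming value to the nearest existing catalog value by cosine similarity. PAE uses BERT representations for the embeddings; the character-trigram `embed` below is a toy stand-in so the example stays self-contained, and the catalog values are hypothetical.

```python
from collections import Counter
from math import sqrt

def embed(text):
    """Toy stand-in for a BERT sentence embedding: character-trigram counts.
    Only the matching step is illustrated; PAE's actual embeddings come from BERT."""
    t = f"  {text.lower()}  "
    return Counter(t[i:i + 3] for i in range(len(t) - 2))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def match_to_catalog(new_value, catalog_values):
    """Map an upcoming attribute value to its closest existing catalog attribute."""
    scored = [(cosine(embed(new_value), embed(c)), c) for c in catalog_values]
    return max(scored)[1]

catalog = ["crew neck", "v-neck", "turtleneck", "halter neck"]
print(match_to_catalog("crewneck sweater", catalog))  # → crew neck
```

With a real sentence encoder the same nearest-neighbor logic applies; only `embed` changes.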
2020 54th Asilomar Conference on Signals, Systems, and Computers
Given data from a variety of sources that share a number of dimensions, how can we effectively decompose them jointly into interpretable latent factors? The coupled tensor decomposition framework captures this idea by jointly supporting the decomposition of several CP tensors. However, coupling tends to suffer when the data are irregular, i.e., when one of the dimensions of the tensor is uneven, as in the case of PARAFAC2. In this work, we provide a scalable method for decomposing coupled CP and PARAFAC2 tensor datasets through non-negativity-constrained least squares optimization on a variety of objective functions. Comprehensive experiments on large data confirm that C3APTION is up to 5× faster and 70–80% more accurate than several baselines.
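The core numerical building block named above, non-negativity-constrained least squares, can be sketched with a tiny projected-gradient solver for min ‖Ax − b‖² subject to x ≥ 0. This is a minimal illustration under simplified assumptions; C3APTION's actual solver and its coupled CP/PARAFAC2 objectives are considerably more elaborate.

```python
def nnls_pg(A, b, lr=0.01, iters=5000):
    """Projected gradient descent for min ||Ax - b||^2 with x >= 0.
    Each step follows the negative gradient, then projects onto the
    non-negative orthant by clipping at zero."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]  # residual Ax - b
        g = [2 * sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]     # gradient of ||r||^2
        x = [max(0.0, x[j] - lr * g[j]) for j in range(n)]                    # project onto x >= 0
    return x

# The unconstrained least-squares solution of this system is (1.5, -0.5);
# the projection keeps the second factor pinned at zero instead.
A = [[1.0, 1.0], [1.0, 2.0], [0.0, 1.0]]
b = [1.0, 0.5, -0.5]
x = nnls_pg(A, b)
print([round(v, 3) for v in x])  # → [0.75, 0.0]
```

In a factorization setting, a solver like this is applied alternately to each factor matrix while the others are held fixed.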
Data collected at very frequent intervals is usually extremely sparse and has no structure that is exploitable by modern tensor decomposition algorithms. Thus, the utility of such tensors is low in terms of the amount of interpretable and exploitable structure that one can extract from them. In this paper, we introduce the problem of finding a tensor of adaptive aggregated granularity that can be decomposed to reveal meaningful latent concepts (structures) from datasets that, in their original form, are not amenable to tensor analysis. Such datasets fall under the broad category of sparse point processes that evolve over space and/or time. To the best of our knowledge, this is the first work that explores adaptive granularity aggregation in tensors. Furthermore, we formally define the problem, discuss different definitions of "good structure" that are used in practice, and show that the optimal solution is of prohibitive combinatorial complexity. Subsequently, we propose an efficient ...
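The idea of adaptive aggregated granularity can be sketched in one dimension: coarsen a sparse event-count series until it carries enough exploitable structure. In this toy stand-in, "good structure" is reduced to the fraction of non-empty bins, a deliberately crude proxy; the paper aggregates tensor modes and scores candidate granularities by the quality of the resulting decomposition.

```python
def aggregate(counts, window):
    """Sum consecutive `window`-sized bins, yielding a coarser granularity."""
    return [sum(counts[i:i + window]) for i in range(0, len(counts), window)]

def density(counts):
    """Fraction of non-empty bins (toy proxy for 'good structure')."""
    return sum(1 for c in counts if c) / len(counts)

def adaptive_granularity(counts, target_density=0.5):
    """Double the aggregation window until the series is dense enough."""
    window = 1
    series = counts
    while density(series) < target_density and len(series) > 1:
        window *= 2
        series = aggregate(counts, window)
    return window, series

# sparse events at the finest granularity: density 4/16, below target
events = [0, 0, 1, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 3]
w, s = adaptive_granularity(events)
print(w, s)  # → 2 [0, 1, 0, 2, 1, 0, 0, 3]
```

The combinatorial hardness the abstract mentions arises because, unlike this greedy doubling, the optimal solution may aggregate different regions at different, uneven granularities.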
2019 IEEE 29th International Workshop on Machine Learning for Signal Processing (MLSP), 2019
Communities (also referred to as clusters) are essential building blocks of all networks. Hierarchical clustering methods are common graph-based approaches for graph clustering. Traditional hierarchical clustering algorithms proceed in a bottom-up or top-down fashion to encode global information in the graph and cluster according to the graph's global modularity. In this paper, we propose an efficient Hierarchical Agglomerative Community Detection (HACD) algorithm that combines the local information in a graph with membership propagation to solve the problem, achieving a 10-25% quality improvement over all baselines. The first contribution of this paper is to present fundamental limitations of the general modularity-optimization-based approach. We show that, based only on modularity information, such methods do not provide high-quality clusters. Furthermore, even with modularity optimization, we experimentally show that the final-level partitioning of such methods cannot successfully cluster data that contain highly mixed structures at different levels and densities. Based on these findings, the second contribution of this paper is a novel method to propagate knowledge throughout the graph, splitting or merging communities in order to evaluate the consistency of individual clusters. Our approach is bottom-up graph-based clustering, is scale-free, and can determine clusters at all scales. We extensively evaluate HACD's performance against state-of-the-art approaches across six real-world and seven synthetic benchmark datasets, and demonstrate that HACD, by combining a graph's local information with membership propagation, outperforms the baselines in terms of finding well-integrated communities.
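The global quantity that traditional agglomerative methods greedily optimize, and whose limitations the abstract argues against, is Newman modularity: the fraction of intra-community edges minus its expectation under a random (configuration-model) rewiring. A minimal sketch of computing it for a given partition:

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected graph.
    Q = (intra-community edge fraction) - (expected fraction under the
    configuration model). This is only the classical score; HACD adds
    local membership propagation on top of it."""
    m = len(edges)
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    q = 0.0
    for u, v in edges:                       # observed intra-community edges
        if community[u] == community[v]:
            q += 1.0
    q /= m
    for u in deg:                            # expected intra-community edges
        for v in deg:
            if community[u] == community[v]:
                q -= deg[u] * deg[v] / (4.0 * m * m)
    return q

# two triangles joined by a single bridge edge, split at the bridge
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(round(modularity(edges, part), 3))  # → 0.357
```

A greedy bottom-up method merges whichever pair of communities most increases Q; the paper's point is that Q alone cannot resolve mixed structures at different scales.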
2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2018
Graph representations have grown increasingly popular in recent years. Existing embedding approaches explicitly encode network structure. Despite their good performance in downstream processes (e.g., node classification), there is still room for improvement in different aspects, such as effectiveness. In this paper, we propose t-PNE, a method that addresses this limitation. Contrary to baseline methods, which generally learn explicit node representations using only an adjacency matrix, t-PNE leverages a multi-view information graph (the adjacency matrix represents the first view, and a nearest-neighbor adjacency, computed over the node features, is the second view) in order to learn explicit and implicit node representations using the Canonical Polyadic (a.k.a. CP) decomposition. We argue that the implicit and explicit mapping from a higher-dimensional to a lower-dimensional vector space is the key to learning more useful and highly predictive representations. Extensive experiments show that t-PNE drastically outperforms baseline methods, by up to 158.6% with respect to Micro-F1, on several multi-label classification problems.
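The second view described above can be sketched directly: connect each node to its k nearest neighbors in feature space. This minimal version assumes Euclidean distance and dense feature tuples; in t-PNE the resulting matrix is stacked with the graph adjacency into a two-slice tensor that a CP decomposition then factorizes.

```python
def knn_adjacency(features, k=1):
    """Build a k-nearest-neighbor adjacency matrix over node features:
    adj[i][j] = 1 iff j is among i's k closest nodes (squared Euclidean
    distance; sorting by (distance, index) breaks ties deterministically)."""
    n = len(features)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(features[i], features[j])), j)
            for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            adj[i][j] = 1
    return adj

# two well-separated feature clusters: each node links within its cluster
feats = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
print(knn_adjacency(feats, k=1))
# → [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
```

Note the matrix is directed (i → its neighbors); symmetrizing it is a common follow-up step.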
Explainable machine learning methods have attracted increased interest in recent years. In this work, we pose and study the niche detection problem, which imposes an explainable lens on the classical problem of co-clustering interactions across two modes. In the niche detection problem, our goal is to identify niches, or co-clusters with node-attribute-oriented explanations. Niche detection is applicable to many social content consumption scenarios, where an end goal is to describe and distill high-level insights about user-content associations: not only that certain users like certain types of content, but rather the types of users and content, explained via node attributes. Some examples are an e-commerce platform with who-buys-what interactions and user and product attributes, or a mobile call platform with who-calls-whom interactions and user attributes. Discovering and characterizing niches has powerful implications for user behavior understanding, as well as marketing and target...
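The explanation side of a niche can be sketched with a toy stand-in: given the members of one co-cluster and their node attributes, report the dominant value of each attribute together with its coverage. The attribute names and users below are hypothetical, and discovering the co-clusters themselves is the harder part of the niche detection problem that this sketch takes as given.

```python
from collections import Counter

def describe_niche(users_in_niche, user_attrs):
    """Attach a node-attribute explanation to a co-cluster: for each attribute,
    the most common value among members and the fraction of members sharing it."""
    description = {}
    for attr in user_attrs[users_in_niche[0]]:
        values = Counter(user_attrs[u][attr] for u in users_in_niche)
        value, count = values.most_common(1)[0]
        description[attr] = (value, count / len(users_in_niche))
    return description

# hypothetical user attributes on an e-commerce platform
user_attrs = {
    "u1": {"age_group": "18-25", "region": "west"},
    "u2": {"age_group": "18-25", "region": "west"},
    "u3": {"age_group": "18-25", "region": "east"},
    "u4": {"age_group": "36-45", "region": "east"},
}
print(describe_niche(["u1", "u2", "u3"], user_attrs))
```

An explanation like {age_group: ("18-25", 1.0)} is what makes a co-cluster a niche: the who, not just the interaction pattern.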
Papers by Ekta Gujral