2018 IEEE International Conference on Data Mining (ICDM), 2018
Severe environmental events have extreme effects on all segments of society, including criminal activity. Extreme weather events, such as tropical storms, fires, and floods, create instability in communities and can be exploited by criminal organizations. Here we investigate the potential impact of catastrophic storms on the criminal activity of human trafficking. We propose three theories of how these catastrophic storms might impact trafficking and provide evidence for each. Researching human trafficking is made difficult by its illicit nature and the scarcity of high-quality data. Here, we analyze online advertisements for services, which can be collected at scale and provide insights into traffickers' behavior. To successfully combine relevant heterogeneous sources of information, as well as spatial and temporal structure, we propose a collective, probabilistic approach. We implement this approach with Probabilistic Soft Logic, a probabilistic programming framework which can flexibly model relational structure and for which inference of future locations is highly efficient. Furthermore, this framework can be used to model hidden structure, such as latent links between locations. Our proposed approach can model and predict how traffickers move. In addition, we propose a model which learns connections between locations. This model is then adapted to incorporate knowledge of environmental events, and we demonstrate that incorporating this knowledge can improve prediction of future locations. While we have validated our models on the impact of severe weather on human trafficking, we believe they can be generalized to a variety of other settings in which environmental events impact human behavior.
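The movement-prediction idea in the abstract above can be illustrated with a much simpler stand-in for the paper's Probabilistic Soft Logic model: estimate location-to-location transition frequencies from observed ad sequences, then predict the most likely next location. The city names and sequences below are invented for illustration, not data from the paper.

```python
from collections import Counter, defaultdict

def learn_transitions(sequences):
    """Count observed moves between consecutive locations in each sequence."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            counts[src][dst] += 1
    return counts

def predict_next(counts, location):
    """Return the most frequently observed successor of a location, if any."""
    if not counts[location]:
        return None
    return counts[location].most_common(1)[0][0]

# Hypothetical sequences of cities where ads for the same entity appeared.
sequences = [
    ["Houston", "Dallas", "Austin"],
    ["Houston", "Dallas", "Houston"],
    ["Austin", "Dallas", "Austin"],
]
model = learn_transitions(sequences)
print(predict_next(model, "Dallas"))  # "Austin" (2 of 3 observed moves)
```

A relational model like PSL goes well beyond such raw counts, jointly reasoning over spatial, temporal, and latent-link evidence; this sketch only shows the basic prediction task.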
Inductive learning algorithms typically use a set of labeled examples to learn class descriptions for a set of user-specified concepts of interest. In practice, labeling the training examples is a tedious, time-consuming, error-prone process. Furthermore, in some applications, labeling each example may also be extremely expensive (e.g., it may require running costly laboratory tests). In order to reduce the number of labeled examples required for learning the concepts of interest, researchers have proposed a variety of methods, such as active learning, semi-supervised learning, and meta-learning.
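One of the label-efficient methods the abstract mentions, active learning, is often implemented as uncertainty sampling: ask a human to label the example the current model is least sure about. The sketch below is illustrative; the probabilities are synthetic and the query rule is one common choice, not the paper's specific method.

```python
# Pool-based active learning via uncertainty sampling (illustrative sketch).

def uncertainty_sampling(probs):
    """Pick the index of the unlabeled example whose predicted positive-class
    probability is closest to 0.5, i.e., where the model is least certain."""
    return min(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))

# Hypothetical model confidences for five unlabeled examples.
pool_probs = [0.95, 0.52, 0.10, 0.80, 0.30]
print(uncertainty_sampling(pool_probs))  # 1: prob 0.52 is nearest 0.5
```

Labeling that single boundary example typically improves the classifier more than labeling an example the model already classifies confidently.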
Wrapper induction algorithms, which use labeled examples to learn extraction rules, are a crucial component of information agents that integrate semi-structured information sources. Multi-view wrapper induction algorithms reduce the amount of training data by exploiting several types of rules (i.e., views), each of which is sufficient to extract the relevant data. All multi-view algorithms rely on the assumption that the views are sufficiently compatible for multi-view learning (i.e., most examples are labeled identically in all views). In practice, it is unclear whether or not two views are sufficiently compatible for solving a new, unseen learning task. In order to cope with this problem, we introduce a view validation algorithm: given a learning task, the algorithm predicts whether or not the views are sufficiently compatible for solving that particular task. We use information acquired while solving several exemplar learning tasks to train a classifier that discriminat...
The Web is based on a browsing paradigm that makes it difficult to retrieve and integrate data from multiple sites. Today, the only way to achieve this integration is by building specialized applications, which are time-consuming to develop and difficult to maintain. We are addressing this problem by creating the technology and tools for rapidly constructing information mediators that extract, query, and integrate data from web sources. The resulting system, called Ariadne, makes it feasible to rapidly build information mediators that access existing web sources.
Applying constraint-based problem solving methods in a new domain often requires considerable work. In this talk I will examine the state of the art in constraint-based problem solving techniques and the difficulties involved in selecting and tuning an algorithm to solve a problem. Most constraint-based solvers have many algorithmic variations, and it can make a very significant difference exactly which algorithm is used and how the problem is encoded. I will describe promising new approaches in which generic algorithms are automatically configured for specific applications. Using the "right" heuristic algorithm can make a tremendous difference in the efficiency of solving a constraint-satisfaction problem (CSP). Without a good algorithm, solving even a moderate-sized CSP (or any combinatorial problem) may be extremely time consuming. A great variety of heuristic algorithms have been described in the literature, each purported to perform well on some example problems. Unfortunately, it ...
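A minimal backtracking solver makes the talk's point about algorithmic variations tangible: a single flag switches between naive static variable ordering and the minimum-remaining-values (MRV) heuristic, which can dramatically change the amount of search. This is a generic sketch of the standard technique, not the algorithm-configuration system described in the talk; the toy map-coloring problem is invented.

```python
# Backtracking CSP search with an optional variable-ordering heuristic.

def backtrack(assignment, domains, constraints, use_mrv):
    unassigned = [v for v in domains if v not in assignment]
    if not unassigned:
        return assignment
    if use_mrv:  # choose the most constrained variable (smallest domain) first
        var = min(unassigned, key=lambda v: len(domains[v]))
    else:        # naive static ordering
        var = unassigned[0]
    for value in domains[var]:
        assignment[var] = value
        if all(check(assignment) for check in constraints):
            result = backtrack(assignment, domains, constraints, use_mrv)
            if result is not None:
                return result
        del assignment[var]
    return None

# Toy map-coloring instance: adjacent regions must receive different colors.
domains = {"A": ["r", "g"], "B": ["r"], "C": ["r", "g", "b"]}
adjacent = [("A", "B"), ("B", "C")]
constraints = [
    (lambda a, x=x, y=y: a.get(x) is None or a.get(y) is None or a[x] != a[y])
    for x, y in adjacent
]
print(backtrack({}, domains, constraints, use_mrv=True))
```

With MRV the solver assigns B (domain size 1) first, immediately constraining its neighbors; on larger instances such ordering choices are exactly where the "tremendous difference in efficiency" arises.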
The Journal of Artificial Intelligence Research (JAIR) was one of the first scientific journals distributed over the web. It has now completed over five years of successful publication. Electronic publishing is reshaping the way academic work is disseminated, and JAIR is leading the way toward a future where scientific articles are freely and easily accessible to all. This report describes how the journal has evolved, its grassroots philosophy, and prospects for the future.
BACKGROUND Rapid access to evidence is crucial in times of an evolving clinical crisis. To that end, we propose a novel approach to answer clinical queries, termed rapid meta-analysis (RMA). Unlike traditional meta-analysis, RMA balances a quick time to production with reasonable data quality assurances, leveraging artificial intelligence (AI) to strike this balance. OBJECTIVE We aimed to evaluate whether RMA can generate meaningful clinical insights, but crucially, in a much faster processing time than traditional meta-analysis, using a relevant, real-world example. METHODS The development of our RMA approach was motivated by a currently relevant clinical question: is ocular toxicity and vision compromise a side effect of hydroxychloroquine therapy? At the time of designing this study, hydroxychloroquine was a leading candidate in the treatment of coronavirus disease (COVID-19). We then leveraged AI to pull and screen articles, automatically extract their results, review the studie...
Rapid access to evidence is crucial in times of evolving clinical crisis. To that end, we propose a novel mechanism to answer clinical queries: Rapid Meta-Analysis (RMA). Unlike traditional meta-analysis, RMA balances quick time-to-production with reasonable data quality assurances, leveraging Artificial Intelligence to strike this balance. This article presents an example RMA for a currently relevant clinical question: Is ocular toxicity and vision compromise a side effect of hydroxychloroquine therapy? As of this writing, hydroxychloroquine is a leading candidate in the treatment of COVID-19. By combining AI with human analysis, our RMA identified 11 studies looking at ocular toxicity as a side effect and estimated the incidence to be 3.4% (95% CI: 1.11-9.96%). The heterogeneity across the individual study findings was high, and interpretation of the result should take this into account. Importantly, this RMA, from search to screen to analysis, took less than 30 minutes to produce.
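The kind of aggregation behind a pooled incidence estimate with a confidence interval can be sketched as inverse-variance pooling of logit-transformed proportions, one standard approach. The study counts below are hypothetical, not the 11 studies from the article, and this fixed-effect sketch omits the between-study heterogeneity modeling a real RMA would need given the high heterogeneity the abstract reports.

```python
import math

def pooled_proportion(studies):
    """studies: list of (events, total) pairs. Returns (estimate, lo, hi),
    a pooled proportion with a 95% CI, combining logit-scale estimates by
    inverse variance."""
    logits, weights = [], []
    for events, total in studies:
        # 0.5 continuity correction guards against zero cells
        p = (events + 0.5) / (total + 1.0)
        var = 1.0 / (events + 0.5) + 1.0 / (total - events + 0.5)
        logits.append(math.log(p / (1.0 - p)))
        weights.append(1.0 / var)
    pooled = sum(w * l for w, l in zip(weights, logits)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    expit = lambda x: 1.0 / (1.0 + math.exp(-x))
    return expit(pooled), expit(pooled - 1.96 * se), expit(pooled + 1.96 * se)

# Hypothetical (events, total) counts from three invented studies.
est, lo, hi = pooled_proportion([(2, 60), (1, 45), (3, 80)])
print(f"{est:.3f} (95% CI: {lo:.3f}-{hi:.3f})")
```

The logit transform keeps the interval inside (0, 1), which matters for rare events like the low incidence estimated here.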
For many years, the intuitions underlying partial-order planning were largely taken for granted. Only in the past few years has there been renewed interest in the fundamental principles underlying this paradigm. In this paper, we present a rigorous comparative analysis of partial-order and total-order planning by focusing on two specific planners that can be directly compared. We show that there are some subtle assumptions that underlie the widespread intuitions regarding the supposed efficiency of partial-order planning. For instance, the superiority of partial-order planning can depend critically upon the search strategy and the structure of the search space. Understanding the underlying assumptions is crucial for constructing efficient planners.
Papers by Steven Minton