Philip Stark


    Many widely used models amount to an elaborate means of making up numbers—but once a number has been produced, it tends to be taken seriously and its source (the model) is rarely examined carefully. Many widely used models have little connection to the real-world phenomena they purport to explain. Common steps in modeling to support policy decisions, such as putting disparate things on the same scale, may conflict with reality. Not all costs and benefits can be put on the same scale, not all uncertainties can be expressed as probabilities, and not all model parameters measure what they purport to measure. These ideas are illustrated with examples from seismology, wind-turbine bird deaths, soccer penalty cards, gender bias in academia, and climate policy.
    We explain why the Australian Electoral Commission should perform an audit of the paper Senate ballots against the published preference data files. We suggest four different post-election audit methods appropriate for Australian Senate elections. We have developed prototype code for all of them and tested it on preference data from the 2016 election.
    The pseudo-random number generators (PRNGs), sampling algorithms, and algorithms for generating random integers in some common statistical packages and programming languages are unnecessarily inaccurate, by an amount that may matter for statistical inference. Most use PRNGs with state spaces that are too small for contemporary sampling problems and methods such as the bootstrap and permutation tests. The random sampling algorithms in many packages rely on the false assumption that PRNGs produce IID U[0, 1) outputs. The discreteness of PRNG outputs and the limited state space of common PRNGs cause those algorithms to perform poorly in practice. Statistics packages and scientific programming languages should use cryptographically secure PRNGs by default (not for their security properties, but for their statistical ones), and offer weaker PRNGs only as an option. Software should not use methods that assume PRNG outputs are IID U[0,1) random variables, such as generating a random sample...
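    The state-space argument above can be made concrete with a few lines of Python (an illustrative sketch of mine, not code from the paper): a generator with a k-bit state can visit at most 2^k states, so the Mersenne Twister's 19937-bit state cannot index all permutations of even a few thousand items, and random.SystemRandom draws from an OS-level cryptographically secure source instead.

```python
# Illustration of the state-space argument in the abstract (my own sketch):
# a PRNG with a k-bit state has at most 2^k internal states, so it cannot
# produce every permutation of n items once log2(n!) exceeds k.
import math
from random import SystemRandom  # OS-backed, cryptographically secure source

MT_STATE_BITS = 19937  # Mersenne Twister state size (period 2^19937 - 1)

def log2_factorial(n: int) -> float:
    """log2(n!) via lgamma, avoiding huge integers."""
    return math.lgamma(n + 1) / math.log(2)

# Smallest n whose permutations outnumber the Mersenne Twister's states.
n = 1
while log2_factorial(n) <= MT_STATE_BITS:
    n += 1
print(f"MT cannot reach all permutations of {n} or more items "
      f"(log2({n}!) = {log2_factorial(n):.0f} bits > {MT_STATE_BITS})")

# Sampling with a cryptographically secure generator instead, as the abstract
# recommends (for its statistical, not security, properties):
rng = SystemRandom()
population = list(range(1_000_000))
print("CSPRNG-based sample:", rng.sample(population, k=10))
```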
    Statistical tests of earthquake predictions require a null hypothesis to model occasional chance successes. To define and quantify `chance success' is knotty. Some null hypotheses ascribe chance to the Earth: Seismicity is modeled as random. The null distribution of the number of successful predictions -- or any other test statistic -- is taken to be its distribution when the fixed set of predictions is applied to random seismicity. Such tests tacitly assume that the predictions do not depend on the observed seismicity. Conditioning on the predictions in this way sets a low hurdle for statistical significance. Consider this scheme: When an earthquake of magnitude 5.5 or greater occurs anywhere in the world, predict that an earthquake at least as large will occur within 21 days and within an epicentral distance of 50 km. We apply this rule to the Harvard centroid-moment-tensor (CMT) catalog for 2000--2004 to generate a set of predictions. The null hypothesis is that earthquake ti...
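    The prediction rule in the abstract is mechanical enough to sketch in code. The snippet below is my own illustration, with a hypothetical catalog format (the CMT catalog itself would need parsing); it simply counts how many triggered predictions succeed.

```python
# Sketch of the automatic prediction rule described above (magnitude >= 5.5
# triggers a prediction of an event at least as large within 21 days and 50 km).
# The catalog format here is hypothetical, not the CMT file format.
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import radians, sin, cos, asin, sqrt

@dataclass
class Quake:
    time: datetime
    lat: float
    lon: float
    mag: float

def km_between(a: Quake, b: Quake) -> float:
    """Great-circle distance (haversine), in kilometres."""
    la1, lo1, la2, lo2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((la2 - la1) / 2) ** 2 + cos(la1) * cos(la2) * sin((lo2 - lo1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))

def predictions_and_successes(catalog: list[Quake], mag_min: float = 5.5,
                              window_days: int = 21, radius_km: float = 50.0):
    """Each qualifying event issues one prediction; count how many succeed."""
    triggers = [q for q in catalog if q.mag >= mag_min]
    successes = 0
    for t in triggers:
        successes += any(
            q is not t
            and t.time < q.time <= t.time + timedelta(days=window_days)
            and q.mag >= t.mag
            and km_between(t, q) <= radius_km
            for q in catalog
        )
    return len(triggers), successes
```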
    Consider approximating a "black box" function f by an emulator f̂ based on n noiseless observations of f. Let w be a point in the domain of f. How big might the error |f̂(w) - f(w)| be? If f could be arbitrarily rough, this error could be arbitrarily large: we need some constraint on f besides the data. Suppose f is Lipschitz with known constant. We find a lower bound on the number of observations required to ensure that for the best emulator f̂ based on the n data, |f̂(w) - f(w)| <ϵ. But in general, we will not know whether f is Lipschitz, much less know its Lipschitz constant. Assume optimistically that f is Lipschitz-continuous with the smallest constant consistent with the n data. We find the maximum (over such regular f) of |f̂(w) - f(w)| for the best possible emulator f̂; we call this the "mini-minimax uncertainty" at w. In reality, f might not be Lipschitz or---if it is---it might not attain its Lipschitz constant on the data. Hence, the mini-minimax un...
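    The construction can be illustrated with a short sketch (my reading of the idea, not necessarily the paper's exact estimator): take the smallest Lipschitz constant consistent with the data, form the upper and lower envelopes at w, and report half the gap as the worst-case error of the best emulator.

```python
# Sketch of the "mini-minimax" idea as I read it from the abstract: use the
# smallest Lipschitz constant consistent with the n noiseless observations,
# then bound |f_hat(w) - f(w)| for the best possible emulator at w.
import numpy as np

def mini_minimax_halfwidth(X: np.ndarray, y: np.ndarray, w: np.ndarray) -> float:
    """X: (n, d) design points, y: (n,) noiseless values of f, w: (d,) query point."""
    # Smallest Lipschitz constant consistent with the data.
    diffs = np.abs(y[:, None] - y[None, :])
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    mask = dists > 0
    L_hat = np.max(diffs[mask] / dists[mask])

    # Envelope at w of all functions with constant L_hat passing through the data.
    dw = np.linalg.norm(X - w, axis=1)
    upper = np.min(y + L_hat * dw)
    lower = np.max(y - L_hat * dw)
    # The best emulator at w is the midpoint; half the gap is its worst-case error.
    return (upper - lower) / 2

# Toy usage with a stand-in "black box".
rng = np.random.default_rng(0)
X = rng.uniform(size=(20, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1]
print(mini_minimax_halfwidth(X, y, np.array([0.5, 0.5])))
```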
    The uncertainties associated with mathematical models that assess the costs and benefits of climate change policy options are unknowable. Such models can be valuable guides to scientific inquiry, but they should not be used to guide climate policy decisions. In the polarized climate change debate, cost-benefit analyses of policy options are taking on an increasingly influential role. These analyses have been presented by authoritative scholars as a useful contribution to the debate. But models of climate—and especially models of the impact of climate policy—are theorists' tools, not policy justification tools. The methods used to appraise model uncertainties give optimistic lower bounds on the true uncertainty, at best. Even in the finest modeling exercises, uncertainty in model structure is presented as known and manageable, when it is likely neither. Numbers arising from these modeling exercises should therefore not be presented as "facts" providing support to policy decisions. Building more complex models of climate will not necessarily reduce the uncertainties. Indeed, if previous experience is a guide, such models will reveal that current uncertainty estimates are unrealistically small. The fate of the evidence: Climate change is the quintessential "wicked problem": a knot in the uncomfortable area where uncertainty and disagreement about values affect the very framing of what the problem is. The issue of climate change has become so resonant and fraught that it speaks directly to our individual political and cultural identities. Scientists and other scholars often use non-scientific and value-laden rhetoric to emphasize to non-expert audiences what they believe to be the implications of their knowledge. For example, in Modelling the Climate System: An Overview, Gabriele Gramelsberger and Johann Feichter—after a sober discussion of statistical methods applicable to climate models—observe that "if mankind is unable to decide how to frame an appropriate response to climate change, nature will decide for both—environmental and economic calamities—as the economy is inextricably interconnected with the climate." Historians Naomi Oreskes and Erik M. Conway, in their recent book The Collapse of Western Civilization (2014), paint an apocalyptic picture of the next 80 years, beginning with the "year of perpetual summer" in 2023, and mass-imprisonment of "alarmist" scientists in 2025. Estimates of the impact of climate change turn out to be far too cautious: global temperatures increase dramatically and the sea level rises by eight meters, resulting in plagues of devastating diseases and insects, mass-extinction, the overthrow of governments, and the annihilation of the human populations of Africa and Australia. In the aftermath, survivors take the names of climate scientists as their middle names in recognition of their heroic attempts to warn the world. That the Earth's climate is changing, partly or largely because of anthropogenic emissions of CO2 and other
    Author(s): Benaloh, Josh; Rivest, Ronald; Ryan, Peter YA; Stark, Philip; Teague, Vanessa; Vora, Poorvi | Abstract: This pamphlet describes end-to-end election verifiability (E2E-V) for a nontechnical audience: election officials, public policymakers, and anyone else interested in secure, transparent, evidence-based electronic elections. This work is part of the Overseas Vote Foundation's End-to-End Verifiable Internet Voting: Specification and Feasibility Assessment Study (E2E VIV Project), funded by the Democracy Fund.
    Significance: Foraged leafy greens are consumed around the globe, including in urban areas, and may play a larger role when food is scarce or expensive. It is thus important to assess the safety and nutritional value of wild greens foraged in urban environments. Methods: Field observations, soil tests, and nutritional and toxicology tests on plant tissue were conducted for three sites, each roughly 9 square blocks, in disadvantaged neighborhoods in the East San Francisco Bay Area in 2014--2015. The sites included mixed-use areas and areas with high vehicle traffic. Results: Edible wild greens were abundant, even during record droughts. Soil at some survey sites had elevated concentrations of lead and cadmium, but tissue tests suggest that rinsed greens of the tested species are safe to eat. Daily consumption of standard servings comprise less than the EPA reference doses of lead, cadmium, and other heavy metals. Pesticides, glyphosate, and PCBs were below detection limits. The nutri...
    Many voter-verifiable, coercion-resistant schemes have been proposed, but even the most carefully designed systems necessarily leak information via the announced result. In corner cases, this may be problematic. For example, if all the votes go to one candidate then all vote privacy evaporates. The mere possibility of candidates getting no or few votes could have implications for security in practice: if a coercer demands that a voter cast a vote for such an unpopular candidate, then the voter may feel obliged to obey, even if she is confident that the voting system satisfies the standard coercion resistance definitions. With complex ballots, there may also be a danger of "Italian" style (aka "signature") attacks: the coercer demands the voter cast a ballot with a specific, identifying pattern. Here we propose an approach to tallying end-to-end verifiable schemes that avoids revealing all the votes but still achieves whatever confidence level in the announced res...
    There are many sources of error in counting votes: the apparent winner might not be the rightful winner. Hand tallies of the votes in a random sample of precincts can be used to test the hypothesis that a full manual recount would find a different outcome. This paper develops a conservative sequential test based on the vote-counting errors found in a hand tally of a simple or stratified random sample of precincts. The procedure includes a natural escalation: If the hypothesis that the apparent outcome is incorrect is not rejected at stage s, more precincts are audited. Eventually, either the hypothesis is rejected--and the apparent outcome is confirmed--or all precincts have been audited and the true outcome is known. The test uses a priori bounds on the overstatement of the margin that could result from error in each precinct. Such bounds can be derived from the reported counts in each precinct and upper bounds on the number of votes cast in each precinct. The test allows errors in...
    Student ratings of teaching have been used, studied, and debated for almost a century. This article examines student ratings of teaching from a statistical perspective. The common practice of relying on averages of student teaching evaluation scores as the primary measure of teaching effectiveness for promotion and tenure decisions should be abandoned for substantive and statistical reasons: There is strong evidence that student responses to questions of "effectiveness" do not measure teaching effectiveness. Response rates and response variability matter. And comparing averages of categorical responses, even if the categories are represented by numbers, makes little sense. Student ratings of teaching are valuable when they ask the right questions, report response rates and score distributions, and are balanced by a variety of other sources and methods to evaluate teaching. Since 1975, course evaluations at University of California, Berkeley have asked: Considering both t...
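    A toy example (hypothetical numbers) of why averaging categorical ratings obscures more than it reveals: two instructors with identical mean scores but entirely different rating distributions.

```python
# Hypothetical illustration of the point about averaging categorical ratings:
# two instructors with identical average scores but very different
# distributions of 1-7 ratings.
import numpy as np

scale = np.arange(1, 8)
instructor_a = np.array([0, 0, 0, 40, 0, 0, 0])    # everyone rates "4"
instructor_b = np.array([20, 0, 0, 0, 0, 0, 20])   # polarized: half 1s, half 7s

for name, counts in [("A", instructor_a), ("B", instructor_b)]:
    mean = np.average(scale, weights=counts)
    print(f"Instructor {name}: mean = {mean:.2f}, distribution = {counts.tolist()}")
# Both means are 4.00, yet the ratings tell entirely different stories --
# which is why the abstract argues for reporting distributions, not averages.
```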
    The City and County of San Francisco, CA, has used Instant Runoff Voting (IRV) for some elections since 2004. This report describes the first ever process pilot of Risk Limiting Audits for IRV, for the San Francisco District Attorney's race in November, 2019. We found that the vote-by-mail outcome could be efficiently audited to well under the 0.05 risk limit given a sample of only 200 ballots. All the software we developed for the pilot is open source.
    An evaluation of course evaluations: Student ratings of teaching have been used, studied, and debated for almost a century. This article examines student ratings of teaching from a statistical perspective. The common practice of relying on averages of student teaching evaluation scores as the primary measure of teaching effectiveness for promotion and tenure decisions should be abandoned for substantive and statistical reasons: There is strong evidence that student responses to questions of “effectiveness” do not measure teaching effectiveness. Response rates and response variability matter. And comparing averages of categorical responses, even if the categories are represented by numbers, makes little sense. Student ratings of teaching are valuable when they ask the right questions, report response rates and score distributions, and are balanced by a variety of other sources and methods to evaluate teaching.
    Abstract. What mathematicians, scientists, engineers, and statisticians mean by "inverse problem" differs. For a statistician, an inverse problem is an inference or estimation problem. The data are finite in number and contain errors, as they do in classical estimation or inference problems, and the unknown typically is infinite-dimensional, as it is in nonparametric regression. The additional complication in an inverse problem is that the data are only indirectly related to the unknown. Canonical abstract formulations of statistical estimation problems subsume this complication by allowing probability distributions to be indexed in more-or-less arbitrary ways by parameters, which can be infinite-dimensional. Standard statistical concepts, questions, and considerations such as bias, variance, mean-squared error, identifiability, consistency, efficiency, and various forms of optimality, apply to inverse problems. This article discusses inverse problems as statistical estimation and i...
    The U.S. Census tries to enumerate all residents of the U.S., block by block, every ten years. (A block is the smallest unit of census geography; the area of blocks varies with population density: There are about 7 million blocks in the U.S.) State and sub-state counts matter for apportioning the House of Representatives, allocating Federal funds, congressional redistricting, urban planning, and so forth. Counting the population is difficult, and two kinds of error occur: gross omissions (GOs) and erroneous enumerations (EEs). A GO results from failing to count a person; an EE results from counting a person in error. Counting a person in the wrong block creates both a GO and an EE. Generally, GOs slightly exceed EEs, producing an undercount that is uneven demographically and geographically. In 1980, 1990, and 2000, the U.S. Census Bureau tried unsuccessfully to adjust census counts to reduce differential undercount using Dual Systems Estimation (DSE), a method based on CAPTURE-RECAP...
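    The capture-recapture idea behind DSE can be shown with a toy computation (made-up numbers): within a post-stratum, the Lincoln-Petersen estimate multiplies the census count by the coverage-survey count and divides by the number of matched records; the independence assumption it rests on is the one the critique targets.

```python
# Toy illustration of Dual Systems Estimation (the capture-recapture idea the
# abstract refers to), with made-up numbers. Within a post-stratum:
#   n_census  people counted by the census,
#   n_survey  people found by the independent coverage survey,
#   n_both    people found by both (matched records).
# The Lincoln-Petersen estimate of the true population is
#   N_hat = n_census * n_survey / n_both,
# which assumes (among other things) that the two "captures" are independent.

def dual_system_estimate(n_census: int, n_survey: int, n_both: int) -> float:
    if n_both == 0:
        raise ValueError("no matched records; estimator undefined")
    return n_census * n_survey / n_both

# Hypothetical post-stratum: census counted 9,500, the survey found 1,000,
# and 900 of the survey people matched census records.
print(dual_system_estimate(9_500, 1_000, 900))   # ~10,556 -> implies an undercount
```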
    U.S. elections rely heavily on computers which introduce digital threats to election outcomes. Risk-limiting audits (RLAs) mitigate threats to some of these systems by manually inspecting random samples of ballot cards. RLAs have a large chance of correcting wrong outcomes (by conducting a full manual tabulation of a trustworthy record of the votes), but can save labor when reported outcomes are correct. This efficiency is eroded when sampling cannot be targeted to ballot cards that contain the contest(s) under audit. States that conduct RLAs of contests on multi-card ballots or of small contests can dramatically reduce sample sizes by using information about which ballot cards contain which contests---by keeping track of card-style data (CSD). For instance, CSD reduces the expected number of draws needed to audit a single countywide contest on a 4-card ballot by 75%. Similarly, CSD reduces the expected number of draws by 95% or more for an audit of two contests with the same margin...
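    The 75% figure is simple arithmetic: if the audited contest appears on only one of a ballot's four cards, uniform sampling over all cards hits a relevant card a quarter of the time, so roughly four times as many draws are needed without card-style data. A small sketch with illustrative numbers:

```python
# Back-of-envelope arithmetic behind the 75% figure in the abstract: if the
# audited contest appears on only 1 of the 4 cards of each ballot, a uniform
# draw over all cards hits a relevant card 1/4 of the time, so the expected
# number of draws needed to see k relevant cards is 4k; targeting draws with
# card-style data (CSD) needs only k.
def expected_draws(cards_needed: int, cards_per_ballot: int, contest_cards: int = 1) -> float:
    """Expected uniform draws to obtain `cards_needed` cards showing the contest."""
    hit_rate = contest_cards / cards_per_ballot
    return cards_needed / hit_rate

k = 100  # cards the audit needs to inspect for this contest (illustrative)
without_csd = expected_draws(k, cards_per_ballot=4)
with_csd = expected_draws(k, cards_per_ballot=1)
print(f"without CSD: {without_csd:.0f} draws, with CSD: {with_csd:.0f} draws, "
      f"saving {1 - with_csd / without_csd:.0%}")   # -> 75%
```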
    A collection of races in a single election can be audited as a group by auditing a random sample of batches of ballots and combining observed discrepancies in the races represented in those batches in a particular way: the maximum across-race relative overstatement of pairwise margins (MARROP). A risk-limiting audit for the entire collection of races can be built on this ballot-based auditing using a variety of probability sampling schemes. The audit controls the familywise error rate (the chance that one or more incorrect outcomes fails to be corrected by a full hand count) at a cost that can be lower than that of controlling the per-comparison error rate with independent audits. The approach is particularly efficient if batches are drawn with probability proportional to a bound on the MARROP (PPEB sampling).
    A revised plan for the 2000 Decennial Census was announced in a 24 February 1999 Bureau of the Census publication [99] and a press statement by K. Prewitt, Director of the Bureau of the Census [39]. Census 2000 will include counts and "adjusted" counts. The adjustments involve complicated procedures and calculations on data from a sample of blocks, extrapolated throughout the country to demographic groups called "post-strata." The 2000 adjustment plan is called Accuracy and Coverage Evaluation (ACE). ACE is quite similar to the 1990 adjustment plan, called the Post-Enumeration Survey (PES). The 1990 PES fails some plausibility checks [4, 12, 44] and probably would have reduced the accuracy of counts and state shares [3, 4]. ACE and PES differ in sample size, data capture, timing, record matching, post-stratification, methods to compensate for missing data, the treatment of movers, and details of the data analysis. ACE improves on PES in a number of ways, including using a larger sample, using a simpler model t...
    A revised plan for the 2000 Decennial Census was announced in a 24 February 1999 Bureau of the Census publication and a press statement by K. Prewitt, Director of the Bureau of the Census. Census 2000 will include counts and "adjusted" counts. The adjustments involve complicated procedures and calculations on data from a sample of blocks, extrapolated throughout the country to demographic groups called "post-strata." The 2000 adjustment plan is called Accuracy and Coverage Evaluation (ACE). ACE is quite similar to the 1990 adjustment plan, called the Post-Enumeration Survey (PES). The 1990 PES fails some plausibility checks and might well have reduced the accuracy of counts and state shares. ACE and PES differ in sample size, data capture, timing, record matching, post-stratification, methods to compensate for missing data, the treatment of movers, and details of the data analysis. ACE improves on PES in a number of ways, including using a larger sample, using a s...
    Direct recording electronic (DRE) voting systems have been shown time and time again to be vulnerable to hacking and malfunctioning. Despite mounting evidence that DREs are unfit for use, some states in the U.S. continue to use them for local, state, and federal elections. Georgia uses DREs exclusively, among many practices that have made its elections unfair and insecure. We give a brief history of election security and integrity in Georgia from the early 2000s to the 2018 election. Nonparametric permutation tests give strong evidence that something caused DREs not to record a substantial number of votes in this election. The undervote rate in the Lieutenant Governor’s race was far higher for voters who used DREs than for voters who used paper ballots. Undervote rates were strongly associated with ethnicity, with higher undervote rates in precincts where the percentage of Black voters was higher. There is specific evidence of DRE malfunction, too: one of the seven DREs in the Winte...
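    A minimal sketch of the kind of two-sample permutation test the abstract invokes, run on synthetic precinct undervote rates rather than the Georgia data:

```python
# Minimal sketch of a two-sample permutation test of the kind described above,
# using synthetic precinct undervote rates (not the Georgia data): is the mean
# undervote rate on DREs higher than on paper ballots by more than chance?
import numpy as np

rng = np.random.default_rng(1)
dre_rates = rng.normal(0.06, 0.01, size=40)      # hypothetical DRE precincts
paper_rates = rng.normal(0.02, 0.01, size=25)    # hypothetical paper precincts

observed = dre_rates.mean() - paper_rates.mean()
pooled = np.concatenate([dre_rates, paper_rates])
n_dre = len(dre_rates)

reps, hits = 10_000, 0
for _ in range(reps):
    rng.shuffle(pooled)                           # random relabelling of precincts
    hits += pooled[:n_dre].mean() - pooled[n_dre:].mean() >= observed
p_value = (hits + 1) / (reps + 1)                 # conservative: counts the observed labelling
print(f"observed difference = {observed:.4f}, permutation p-value <= {p_value:.4f}")
```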
    For references, see J. Shaffer (1995) Multiple Hypothesis Testing, Ann. Rev. Psychol., 46, 561-584; J. Hsu (1996) Multiple Comparisons: Theory and Methods, Chapman and Hall, London. It is often the case that one wishes to test not just one, but several or many hypotheses. For example, one might be evaluating a collection of drugs, and want to test the family of null hypotheses that each is not effective. Suppose one tests each of these null hypotheses at level α. This level is called the “per-comparison error rate” (PCER). Clearly, the chance of making at least one Type I error is at least α, and is typically larger. Let {H_j}, j = 1, …, m (m for multiplicity), be the family of null hypotheses to be tested, and let H = ∩_j H_j be the “grand null hypothesis.” If H is true, the expected number of rejections is αm. The “familywise error rate” (FWER) is the probability of one or more incorrect rejections:
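    A quick simulation (my own sketch, not from the cited references) makes the PCER-versus-FWER gap concrete: with m independent true nulls each tested at level α, the chance of at least one false rejection is about 1 − (1 − α)^m, and the Bonferroni adjustment α/m restores familywise control.

```python
# Quick simulation of the PCER-vs-FWER point above: with m independent true
# nulls each tested at level alpha, the chance of at least one false rejection
# is about 1 - (1 - alpha)^m; Bonferroni (alpha/m) brings the FWER back below alpha.
import numpy as np

rng = np.random.default_rng(0)
m, alpha, reps = 20, 0.05, 20_000

# p-values under true nulls are Uniform(0, 1); simulate reps x m of them.
p = rng.uniform(size=(reps, m))
fwer_uncorrected = np.mean((p < alpha).any(axis=1))
fwer_bonferroni = np.mean((p < alpha / m).any(axis=1))

print(f"theory: 1-(1-alpha)^m = {1 - (1 - alpha) ** m:.3f}")
print(f"uncorrected FWER ~ {fwer_uncorrected:.3f}")
print(f"Bonferroni FWER  ~ {fwer_bonferroni:.3f}  (<= alpha = {alpha})")
```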
    We propose a family of novel social choice functions. Our goal is to explore social choice functions for which ease of auditing is a primary design goal, instead of being ignored or left as a puzzle to solve later.
    We provide Risk Limiting Audits for proportional representation election systems such as D’Hondt and Sainte-Laguë. These techniques could be used to produce evidence of correct (electronic) election outcomes in Denmark, Luxembourg, Estonia, Norway, and many other countries.
    Post-election audits can provide convincing evidence that election outcomes are correct—that the reported winner(s) really won—by manually inspecting ballots selected at random from a trustworthy paper trail of votes. Risk-limiting audits (RLAs) control the probability that, if the reported outcome is wrong, it is not corrected before the outcome becomes official. RLAs keep this probability below the specified “risk limit.” Bayesian audits (BAs) control the probability that the reported outcome is wrong, the “upset probability.” The upset probability does not exist unless one posits a prior probability distribution for cast votes. RLAs ensure that if this election’s reported outcome is wrong, the procedure has a large chance of correcting it. BAs control a weighted average probability of correcting wrong outcomes over a hypothetical collection of elections; the weights come from the prior. There are priors for which the upset probability is equal to the risk, but in general, BAs do ...
    References: Daubechies, I. 1992. Ten lectures on wavelets , SIAM, Philadelphia, PA. Donoho, D.L., I.M. Johnstone, G. Kerkyacharian, and D. Picard, 1993. Density estimation by wavelet thresholding. http://www-stat.stanford.edu/ donoho/Reports/1993/dens.pdf Evans, S.N. and P.B. Stark, 2002. Inverse problems as statistics, Inverse Problems , 18 , R55–R97. Hengartner, N.W. and P.B. Stark, 1995. Finite-sample confidence envelopes for shape-restricted densities. Ann. Stat., 23 , pp. 525–550. Silverman, B.W., 1990. Density Estimation for Statistics and Data Analysis , Chapman and Hall, London.
    This note presents three ways of constructing simultaneous confidence intervals for linear estimates of linear functionals in inverse problems, including "Backus-Gilbert" estimates. Simultaneous confidence intervals are needed to compare estimates, for example, to find spatial variations in a distributed parameter. The notion of simultaneous confidence intervals is introduced using coin tossing as an example before moving to linear inverse problems. The first method for constructing simultaneous confidence intervals is based on the Bonferroni inequality, and applies generally to confidence intervals for any set of parameters, from dependent or independent observations. The second method for constructing simultaneous confidence intervals in inverse problems is based on a "global" measure of fit to the data, which allows one to compute simultaneous confidence intervals for any number of linear functionals of the model that are linear combinations of the data mappings. This leads to confidence...
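    The first (Bonferroni) construction is easy to check numerically: to cover k functionals simultaneously at level 1 − α, build each interval at level 1 − α/k. A toy coverage simulation with independent Gaussian estimates (my own sketch):

```python
# Sketch of the note's first (Bonferroni) construction: to cover k functionals
# simultaneously at level 1 - alpha, widen each interval to level 1 - alpha/k.
# Toy coverage check with k independent Gaussian estimates (illustrative only).
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)
k, alpha, reps = 10, 0.05, 50_000
truth = np.linspace(-1.0, 1.0, k)      # hypothetical true values of the functionals
sigma = 1.0                            # known standard error of each estimate

z_single = NormalDist().inv_cdf(1 - alpha / 2)        # unadjusted interval
z_bonf = NormalDist().inv_cdf(1 - alpha / (2 * k))    # Bonferroni-adjusted interval

est = truth + rng.normal(scale=sigma, size=(reps, k))

def simultaneous_coverage(z: float) -> float:
    """Fraction of replications in which all k intervals cover their targets."""
    return float(np.mean((np.abs(est - truth) <= z * sigma).all(axis=1)))

print(f"unadjusted:  {simultaneous_coverage(z_single):.3f}")
print(f"Bonferroni:  {simultaneous_coverage(z_bonf):.3f}  (target >= {1 - alpha})")
```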
    Student evaluations of teaching (SET) are widely used in academic personnel decisions as a measure of teaching effectiveness. We show: SET are biased against female instructors by an amount that is large and statistically significant; the bias affects how students rate even putatively objective aspects of teaching, such as how promptly assignments are graded; the bias varies by discipline and by student gender, among other things; it is not possible to adjust for the bias, because it depends on so many factors; SET are more sensitive to students' gender bias and grade expectations than they are to teaching effectiveness; and gender biases can be large enough to cause more effective instructors to get lower SET than less effective instructors. These findings are based on nonparametric statistical tests applied to two datasets: 23,001 SET of 379 instructors by 4,423 students in six mandatory first-year courses in a five-year natural experiment at a French university, and 43 SET for four sec...
    The complexity of U.S. elections usually requires computers to count ballots, but computers can be hacked, so election integrity requires a voting system in which paper ballots can be recounted by hand. However, paper ballots provide no assurance unless they accurately record the votes as expressed by the voters. Voters can express their intent by indelibly hand-marking ballots or using computers called ballot-marking devices (BMDs). Voters can make mistakes in expressing their intent in either technology, but only BMDs are also subject to hacking, bugs, and misconfiguration of the software that prints the marked ballots. Most voters do not review BMD-printed ballots, and those who do often fail to notice when the printed vote is not what they expressed on the touchscreen. Furthermore, there is no action a voter can take to demonstrate to election officials that a BMD altered their expressed votes, nor is there a corrective action that election officials can take if notified by voters: there is no way to deter, contain, or correct computer hacking in BMDs. These are the essential security flaws of BMDs. Risk-limiting audits can ensure that the votes recorded on paper ballots are tabulated correctly, but no audit can ensure that the votes on paper are the ones expressed by the voter on a touchscreen: Elections conducted on current BMDs cannot be confirmed by audits. We identify two properties of voting systems, contestability and defensibility, necessary for audits to confirm election outcomes. No available BMD certified by the Election Assistance Commission is contestable or defensible.
    Computers, including all modern voting systems, can be hacked and misprogrammed. The scale and complexity of U.S. elections may require the use of computers to count ballots, but election integrity requires a paper-ballot voting system in which, regardless of how they are initially counted, ballots can be recounted by hand to check whether election outcomes have been altered by buggy or hacked software. Furthermore, secure voting systems must be able to recover from any errors that might have occurred. However, paper ballots provide no assurance unless they accurately record the vote as the voter expresses it. Voters can express their intent by hand-marking a ballot with a pen, or using a computer called a ballot-marking device (BMD), which generally has a touchscreen and assistive interfaces. Voters can make mistakes in expressing their intent in either technology, but only the BMD is also sub
    Science nowadays lies at the centre of several storms. The best known is the non-reproducibility of many scientific results, which stretches from the medical field (clinical and preclinical trials) to behavioural research (priming studies). Although the misuse of statistics is reported to be a patent cause of the reproducibility crisis, its deeper roots are to be sought elsewhere: in the passage from a regime of little science, regulated by small communities of researchers, to the current big science, characterized by a hypertrophic production of millions of research papers and by the imperative "publish or perish," in a setting dominated by the market. While spirited debates (on vaccines, climate change, GMOs) unfold in society, scientific articles that are bought or withdrawn signal a deep crisis not only of science, but also of expert thought. Against this background, statistics is the main defendant, charged with using methods which expert...
    The scientific community is increasingly concerned with cases of published "discoveries" that are not replicated in further studies. The field of mouse behavioral phenotyping was one of the first to raise this concern, and to relate it to other complicated methodological issues: the complex interaction between genotype and environment; the definitions of behavioral constructs; and the use of the mouse as a model animal for human health and disease mechanisms. In January 2015, researchers from various disciplines including genetics, behavior genetics, neuroscience, ethology, statistics and bioinformatics gathered at Tel Aviv University to discuss these issues. The general consensus presented here was that the issue is prevalent and of concern, and should be addressed at the statistical, methodological and policy levels, but is not so severe as to call into question the validity and the usefulness of model organisms as a whole. Well-organized community efforts, coupled with im...
    ... Those data are available only for the 1906 earthquake on the San Andreas Fault and the 1868 earthquake on the southern segment of the Hayward Fault (USGS, 1999, p. 17), so the time-predictable model could not be used for many Bay ...
    An estimator or confidence set is statistically consistent if, in a well-defined sense, it converges in probability to the truth as the number of data grows. We give sufficient conditions for it to be impossible to find consistent estimators or confidence sets in some linear inverse problems. Several common approaches to statistical inference in geophysical inverse problems use the set
    Drawing a random sample of ballots to conduct a risk-limiting audit generally requires knowing how the ballots cast in an election are organized into groups, for instance, how many containers of ballots there are in all and how many ballots are in each container. A list of the ballot group identifiers along with number of ballots in each group is called a ballot manifest. What if the ballot manifest is not accurate? Surprisingly, even if ballots are known to be missing from the manifest, it is not necessary to make worst-case assumptions about those ballots--for instance, to adjust the margin by the number of missing ballots--to ensure that the audit remains conservative. Rather, it suffices to make worst-case assumptions about the individual randomly selected ballots that the audit cannot find. This observation provides a simple modification to some risk-limiting audit procedures that makes them automatically become more conservative if the ballot manifest has errors. The modification--phantoms to evil zombies (~2EZ)--requires only an upper bound on the total number of ballots cast. ~2EZ makes the audit P-value stochastically larger than it would be had the manifest been accurate, automatically requiring more than enough ballots to be audited to offset the manifest errors. This ensures that the true risk limit remains smaller than the nominal risk limit. On the other hand, if the manifest is in fact accurate and the upper bound on the total number of ballots equals the total according to the manifest, ~2EZ has no effect at all on the number of ballots audited nor on the true risk limit.
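    A hedged sketch of the bookkeeping described above (not the paper's code): pad the manifest with phantom entries up to the upper bound on ballots cast, and treat any drawn ballot that cannot be produced, or any phantom drawn, as worst case for the audit. The batch names and the retrieve hook below are hypothetical.

```python
# Hedged sketch of the ~2EZ bookkeeping described above (not the paper's code):
# pad the manifest with "phantom" entries up to the upper bound on ballots cast,
# and treat any drawn ballot that cannot be produced as worst case for the audit.
import secrets

def augment_manifest(manifest: dict[str, int], upper_bound: int) -> list[tuple[str, int]]:
    """manifest: {batch_id: ballot_count}. Returns (batch, position) entries,
    padded with phantoms so the list has exactly `upper_bound` entries."""
    entries = [(batch, i) for batch, n in manifest.items() for i in range(n)]
    entries += [("phantom", i) for i in range(upper_bound - len(entries))]
    return entries

def draw_ballot(entries, retrieve):
    """Pick one entry uniformly; if it is a phantom or cannot be found,
    return a worst-case record ("evil zombie") instead of a real ballot."""
    batch, pos = entries[secrets.randbelow(len(entries))]
    if batch == "phantom":
        return "WORST_CASE"
    ballot = retrieve(batch, pos)          # audit-specific lookup (hypothetical hook)
    return ballot if ballot is not None else "WORST_CASE"

# Hypothetical usage: the manifest lists 980 ballots, but up to 1,000 were cast.
entries = augment_manifest({"box-1": 500, "box-2": 480}, upper_bound=1_000)
print(len(entries), "entries, of which", sum(b == "phantom" for b, _ in entries), "phantoms")
```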
    Frequently physical scientists seek a confidence set for a parameter whose precise value is unknown, but con- strained by theory or previous experiments. The confidence set should exclude parameter values that violate those constraints, but further improvements are possible: ...
    In her 2011 EVT/WOTE keynote, Travis County, Texas County Clerk Dana DeBeauvoir described the qualities she wanted in her ideal election system to replace their existing DREs. In response, in April of 2012, the authors, working with DeBeauvoir and her staff, jointly architected STAR-Vote, a voting system with a DRE-style human interface and a "belt and suspenders" approach to verifiability. It provides both a paper trail and end-to-end cryptography using COTS hardware. It is designed to support both ballot-level risk-limiting audits, and auditing by individual voters and observers. The human interface and process flow are based on modern usability research. This paper describes the STAR-Vote architecture, which could well be the next-generation voting system for Travis County and perhaps elsewhere.
    SOBA is an approach to election verification that provides observers with justifiably high confidence that the reported results of an election are consistent with an audit trail ("ballots"), which can be paper or electronic. SOBA combines three ideas: (1) publishing cast vote records (CVRs) separately for each contest, so that anyone can verify that each reported contest outcome is correct, if the CVRs reflect voters' intentions with sufficient accuracy; (2) shrouding a mapping between ballots and the CVRs for those ballots to prevent the loss of privacy that could occur otherwise; (3) assessing the accuracy with which the CVRs reflect voters' intentions for a collection of contests while simultaneously assessing the integrity of the shrouded mapping between ballots and CVRs by comparing randomly selected ballots to the CVRs that purport to represent them. Step (1) is related to work by the Humboldt County Election Transparency Project, but publishing CVRs separately for individual contests rather than images of entire ballots preserves privacy. Step (2) requires a cryptographic commitment from elections officials. Observers participate in step (3), which relies on the "super-simple simultaneous single-ballot risk-limiting audit." Step (3) is designed to reveal relatively few ballots if the shrouded mapping is proper and the CVRs accurately reflect voter intent. But if the reported outcomes of the contests differ from the outcomes that a full hand count would show, step (3) is guaranteed to have a large chance of requiring all the ballots to be counted by hand, thereby limiting the risk that an incorrect outcome will become official and final.
    In the words of D D Jackson, the data of real-world inverse problems tend to be inaccurate, insufficient and inconsistent (1972 Geophys. J. R. Astron. Soc. 28 97-110). In view of these features, the characterization of solution uncertainty is an essential aspect of the study of inverse problems. The development of computational technology, in particular of multiscale and adaptive methods
    Estimating cosmological parameters using measurements of the Cosmic Microwave Background (CMB) is scientifically important and computationally and statistically challenging. Bayesian methods and blends of Bayesian and frequentist ideas are common in cosmology. Constructing purely frequentist confidence intervals raises questions about the probability that the intervals falsely contain incorrect values. A computable bound on this false coverage probability can help find
    We develop new simultaneous confidence intervals for the components of a multivariate mean. The intervals determine the signs of the parameters more frequently than standard intervals do: the set of data values for which each interval includes parameter ...
    We compare solar p-mode parameters, such as central frequency, width, and amplitude, derived from GONG and SOHO-SOI/MDI Medium-l Program time series obtained during the same time period. With the excellent data available now from GONG and SOHO-SOI/MDI, there exist data sets long enough to make such a comparison useful. For this study, we have chosen time series of three ell values (ell = 30, 65, and 100) corresponding to GONG month 16 (Oct 28 -- Dec 2, 1996). For each time series, we calculated multitaper power spectra using generalized sine tapers to reduce the influence of the gap structure, which is different for the two data sets. Then, we applied the GONG peakfitting algorithm to the spectra to derive mode parameters and selected `good' fits common to both MDI and GONG spectra, according to three selection criteria. Preliminary results show that mode frequencies determined from MDI spectra are essentially the same as the frequencies from GONG spectra and that the differenc...
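    For readers unfamiliar with multitaper estimation, the basic sine-taper version is short to write down; the generalized sine tapers used for the gapped GONG and MDI series are more involved, so this is only an illustrative sketch.

```python
# Sketch of a multitaper power-spectrum estimate with ordinary sine tapers
# (Riedel & Sidorenko); the "generalized" sine tapers used for gap-filled GONG
# and MDI series are more involved, so this only illustrates the basic idea.
import numpy as np

def sine_taper_spectrum(x: np.ndarray, n_tapers: int = 5) -> np.ndarray:
    """Average of periodograms of the series tapered by the first K sine tapers."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    t = np.arange(1, n + 1)
    spectra = []
    for k in range(1, n_tapers + 1):
        taper = np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * k * t / (n + 1))
        spectra.append(np.abs(np.fft.rfft(taper * x)) ** 2)
    return np.mean(spectra, axis=0)

# Toy usage: a noisy sinusoid shows up as a broadened peak in the estimate.
rng = np.random.default_rng(0)
n = 4096
series = np.sin(2 * np.pi * 0.1 * np.arange(n)) + rng.normal(size=n)
spec = sine_taper_spectrum(series)
print("peak at frequency bin", int(np.argmax(spec)), "of", len(spec))
```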
    Affine minimax confidence intervals for a bounded normal mean, Philip B. Stark (1992). ... be computed analytically, and the literature contains significant work on nonlinear minimax MSE estimators (Bickel, 1981; Casella and Strawderman ...
    The Global Oscillation Network Group (GONG) project estimates the frequencies, amplitudes, and linewidths of more than 250,000 acoustic resonances of the sun from data sets lasting 36 days. The frequency resolution of a single data set is 0.321 microhertz. For frequencies averaged over the azimuthal order m, the median formal error is 0.044 microhertz, and the associated median fractional error is 1.6 x 10(-5). For a 3-year data set, the fractional error is expected to be 3 x 10(-6). The GONG m-averaged frequency measurements differ from other helioseismic data sets by 0.03 to 0.08 microhertz. The differences arise from a combination of systematic errors, random errors, and possible changes in solar structure.
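    The quoted 0.321-microhertz resolution is just the reciprocal of the record length, as a two-line check shows:

```python
# The frequency resolution of a 36-day record is the reciprocal of its length.
seconds = 36 * 86_400                 # 36 days in seconds
resolution_hz = 1 / seconds
print(f"{resolution_hz * 1e6:.4f} microhertz")   # -> 0.3215, i.e. the ~0.321 quoted above
```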
    We look at conventional methods for removing endogeneity bias in regression models, including the linear model and the probit model. It is known that the usual Heckman two-step procedure should not be used in the probit model: from a theoretical perspective, it is unsatisfactory, and likelihood methods are superior. However, serious numerical problems occur when standard software packages try to maximize the biprobit likelihood function, even if the number of covariates is small. We draw conclusions for statistical practice. Finally, we prove the conditions under which parameters in the model are identifiable. The conditions for identification are delicate; we believe these results are new.
    We present two new families of two-sided nonequivariant confidence intervals for the mean θ of a continuous, unimodal, symmetric random variable. Compared with the conventional symmetric equivariant confidence interval, they are shorter when the observation is small, and restrict the sign of θ for smaller observations. One of the families, a modification of Pratt's construction of intervals with minimal expected
    Strict Bounds on Seismic Velocity in the Spherical Earth. Philip B. Stark, Institute of Geophysics and Planetary Physics, University of California, San Diego, La Jolla. Robert L. Parker. ...
    Velocity Bounds from Statistical Estimates of τ(p) and X(p). Philip B. Stark, Institute of Geophysics and Planetary Physics, Scripps Institution of Oceanography, University of California, San Diego. Robert L. Parker. ...
    The salt hypothesis is that higher levels of salt in the diet lead to higher levels of blood pressure, increasing the risk of cardiovascular disease. Intersalt, a cross-sectional study of salt levels and blood pressures in 52 populations,... more
    The salt hypothesis is that higher levels of salt in the diet lead to higher levels of blood pressure, increasing the risk of cardiovascular disease. Intersalt, a cross-sectional study of salt levels and blood pressures in 52 populations, is often cited to support the salt hypothesis, but the data are somewhat contradictory. Four of the populations (Kenya, Papua, and 2 Indian tribes in Brazil) do have low levels of salt and blood pressure. Across the other 48 populations, however, blood pressures go down as salt levels go up, contradicting the hypothesis. Experimental evidence suggests that the effect of a large reduction in salt intake on blood pressure is modest, and health consequences remain to be determined. Funding agencies and medical journals have taken a stronger position favoring the salt hypothesis than is warranted, raising questions about the interaction between the policy process and science.
    Epidemiologic methods were developed to prove general causation: identifying exposures that increase the risk of particular diseases. Courts often are more interested in specific causation: On balance of probabilities, was the... more
    Epidemiologic methods were developed to prove general causation: identifying exposures that increase the risk of particular diseases. Courts often are more interested in specific causation: On balance of probabilities, was the plaintiff's disease caused by exposure to the agent in question? Some authorities have suggested that a relative risk greater than 2.0 meets the standard of proof for specific causation. Such a definite criterion is appealing, but there are difficulties. Bias and confounding are familiar problems; individual differences must be considered too. The issues are explored in the context of the swine flu vaccine and Guillain-Barré syndrome. The conclusion: There is a considerable gap between relative risks and proof of specific causation.
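    The arithmetic behind the 2.0 threshold is worth making explicit, under the simplest (and, as the abstract stresses, often unrealistic) assumptions of no bias, no confounding, and homogeneous individual risk: if the disease rates among exposed and unexposed are \(P_1\) and \(P_0\), the probability that exposure caused a given exposed case is
\[
\mathrm{PC} \;=\; \frac{P_1 - P_0}{P_1} \;=\; 1 - \frac{1}{\mathrm{RR}},
\]
    which exceeds one half ("more likely than not") exactly when \(\mathrm{RR} > 2\). Bias, confounding, and individual differences can push the true probability of causation above or below this textbook value.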
    Regressions can be weighted by propensity scores in order to reduce bias. However, weighting is likely to increase random error in the estimates, and to bias the estimated standard errors downward, even when selection mechanisms are well... more
    Regressions can be weighted by propensity scores in order to reduce bias. However, weighting is likely to increase random error in the estimates, and to bias the estimated standard errors downward, even when selection mechanisms are well understood. Moreover, in some cases, weighting will increase the bias in estimated causal parameters. If investigators have a good causal model, it seems better just to fit the model without weights. If the causal model is improperly specified, there can be significant problems in retrieving the situation by weighting, although weighting may help under some circumstances.
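    A minimal sketch of the weighting scheme under discussion (illustrative only; the variable names and inputs are assumptions, not the authors' analysis):

```python
import numpy as np

def ipw_ols(y, X, t, p):
    # y: outcome, X: design matrix, t: 0/1 treatment indicator,
    # p: estimated propensity scores P(t = 1 | covariates).
    w = t / p + (1 - t) / (1 - p)                  # inverse-propensity weights
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta                                    # weighted least-squares coefficients
```

    Per the abstract, naive standard errors from such a fit tend to be biased downward, and the weights themselves add random error to the estimated coefficients.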
    Stark, P.B. and Tenorio, L. (2010) A Primer of Frequentist and Bayesian Inference in Inverse Problems, in Large-Scale Inverse Problems and Quantification of Uncertainty (eds L. Biegler, G. Biros, O. Ghattas, M. Heinkenschloss, D. Keyes, B. Mallick, Y. Marzouk, L. ...).
    ABSTRACT Many studies draw inferences about multiple endpoints but ignore the statistical implications of multiplicity. Effects inferred to be positive when there is no adjustment for multiplicity can lose their statistical significance... more
    ABSTRACT Many studies draw inferences about multiple endpoints but ignore the statistical implications of multiplicity. Effects inferred to be positive when there is no adjustment for multiplicity can lose their statistical significance when multiplicity is taken into account, perhaps explaining why such adjustments are so often omitted. We develop new simultaneous confidence intervals that mitigate this problem; these are uniformly more likely to determine signs than are standard simultaneous confidence intervals. When one or more of the parameter estimates are small, the new intervals sacrifice some length to avoid crossing zero; but when all the parameter estimates are large, the new intervals coincide with standard simultaneous confidence intervals, so there is no loss of precision. When only a small fraction of the estimates are small, the procedure can determine signs essentially as well as one-sided tests with prespecified directions, incurring only a modest penalty in maximum length. The intervals are constructed by inverting level-α tests to form a 1−α confidence set, and then projecting that set onto the coordinate axes to get confidence intervals. The tests have hyper-rectangular acceptance regions that minimize the maximum amount by which the acceptance region protrudes from the orthant that contains the hypothesized parameter value, subject to a constraint on the maximum side-length of the hyper-rectangle.
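    As a point of comparison, one common choice of standard simultaneous intervals can be written down in the simplest setting of d independent estimates, each N(θ_i, 1). The sketch below is a Šidák-style box under that independence assumption — our simplification for illustration, not necessarily the exact baseline used in the paper.

```python
import numpy as np
from scipy.stats import norm

def standard_simultaneous_ci(est, alpha=0.05):
    # est: vector of independent estimates, each assumed N(theta_i, 1).
    # Per-coordinate coverage (1 - alpha)**(1/d) gives joint coverage 1 - alpha
    # when the coordinates are independent.
    est = np.asarray(est, dtype=float)
    per_coord = (1.0 - alpha) ** (1.0 / est.size)
    z = norm.ppf((1.0 + per_coord) / 2.0)          # two-sided critical value
    return np.column_stack([est - z, est + z])     # rows are [lower, upper]
```

    The paper's construction replaces this fixed box with hyper-rectangular acceptance regions that protrude as little as possible from the orthant containing the hypothesized parameter, then inverts and projects them, so the resulting intervals can exclude zero for estimates that the standard box would not.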
    The Fourier transform infrared (FT-IR) spectrum of a rock contains information about its constituent minerals. Using the wavelet transform, we roughly separate the mineralogical information in the FT-IR spectrum from the noise, using an... more
    The Fourier transform infrared (FT-IR) spectrum of a rock contains information about its constituent minerals. Using the wavelet transform, we roughly separate the mineralogical information in the FT-IR spectrum from the noise, using an extensive set of training data for which the true mineralogy is known. We ignore wavelet coefficients that vary too much among repeated measurements on rocks with the same mineralogy, since these are likely to reflect analytical noise. We also ignore those that vary too little across the entire training set, since they do not help to discriminate among minerals. We use the remaining wavelet coefficients as the data for the problem of estimating mineralogy from FT-IR data. For each mineral of interest, we construct an affine estimator x̂ of the mass fraction x of the mineral of the form x̂ = a·w + b, where a is a vector, w is the vector of retained wavelet coefficients, and b is a scalar. We find ...
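    The screening step lends itself to a short sketch. The array shapes, the grouping variable, and the quantile thresholds below are assumptions made for illustration, not the authors' code.

```python
import numpy as np

def screen_coefficients(W, groups, noise_quantile=0.90, signal_quantile=0.10):
    # W: wavelet coefficients, one row per training spectrum, one column per coefficient.
    # groups: labels identifying replicate spectra of the same known mineralogy.
    W = np.asarray(W, dtype=float)
    groups = np.asarray(groups)
    within = np.zeros(W.shape[1])
    for g in np.unique(groups):
        within += W[groups == g].var(axis=0)   # variability among replicates ~ analytical noise
    across = W.var(axis=0)                     # variability across the whole training set
    keep = (within < np.quantile(within, noise_quantile)) & \
           (across > np.quantile(across, signal_quantile))
    return keep                                # boolean mask of retained coefficients
```

    The retained coefficients w are then the inputs to the affine estimator x̂ = a·w + b described above.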
    ABSTRACT UQ studies all sources of error and uncertainty, including: systematic and stochastic measurement error; ignorance; limitations of theoretical models; limitations of numerical representations of those models; limitations on the... more
    ABSTRACT Uncertainty quantification (UQ) studies all sources of error and uncertainty, including: systematic and stochastic measurement error; ignorance; limitations of theoretical models; limitations of numerical representations of those models; limitations on the accuracy and reliability of computations, approximations, and algorithms; and human error. A more precise definition for UQ is suggested below.
    Data volumes from multiple sky surveys have grown from gigabytes into terabytes during the past decade, and will grow from terabytes into tens (or hundreds) of petabytes in the next decade. This exponential growth of new data both enables... more
    Data volumes from multiple sky surveys have grown from gigabytes into terabytes during the past decade, and will grow from terabytes into tens (or hundreds) of petabytes in the next decade. This exponential growth of new data both enables and challenges effective ...