This is a review of the literature of field experimental studies of markets. The main results covered by the review are as follows: (1) Generally speaking, markets organize the efficient exchange of commodities; (2) There are some behavioral anomalies that impede efficient exchange; (3) Many behavioral anomalies disappear when traders are experienced.
What was once broadly viewed as an impossibility - learning from experimental data in economics - has now become commonplace. Governmental bodies, think tanks, and corporations around the world employ teams of experimental researchers to answer their most pressing questions. For their part, in the past two decades academics have begun to more actively partner with organizations to generate data via field experimentation. While this revolution in evidence-based approaches has served to deepen the economic science, recently a credibility crisis has caused even the most ardent experimental proponents to pause. This study takes a step back from the burgeoning experimental literature and introduces 12 actions that might help to alleviate this credibility crisis and raise experimental economics to an even higher level. In this way, we view our "12 action wish list" as discussion points to enrich the field.
The sciences are in an era of an alleged "credibility crisis". In this study, we discuss the reproducibility of empirical results, focusing on economics research. By combining theory and empirical evidence, we discuss the import of replication studies, and whether they improve our confidence in novel findings. The theory sheds light on the importance of replications, even when replications are subject to bias. We then present a pilot meta-study of replication in experimental economics, a subfield serving as a positive benchmark for investigating the credibility of economics. Our meta-study highlights certain difficulties when applying meta-research (Ioannidis et al., 2015) and systematizing the economics literature.
In the face of worryingly low performance on standardized tests, offering students financial incentives linked to academic performance has been proposed as a potentially cost-effective way to support improvement. However, a large literature across disciplines finds that extrinsic incentives, once removed, may crowd out intrinsic motivation on subsequent, similar tasks. We conduct a field experiment where students, parents, and tutors are offered incentives designed to encourage student preparation for a high-stakes state test. The incentives reward performance on a separate low-stakes assessment designed to measure the same skills as the high-stakes test. Performance on the high-stakes test, however, is not incentivized. We find substantial treatment effects on the incented tests but no effect on the non-incented test; if anything, the incentives result in worse performance on the non-incented test. We also find evidence supporting the conclusion that the incentives crowd out intrinsic motivation to perform well on the non-incented test, but this effect is only temporary. One year later, students who had been in the incentives treatments perform better than those in the control on the same non-incented test.
Increasing evidence indicates the importance of management in determining firms' productivity. Yet, causal evidence regarding the effectiveness of management practices is scarce, especially for high-skilled workers in the developed world. In an eight-month field experiment measuring the productivity of captains in the commercial aviation sector, we test four distinct management practices: (i) performance monitoring; (ii) performance feedback; (iii) target setting; and (iv) prosocial incentives. We find that these management practices, particularly performance monitoring and target setting, significantly increase captains' productivity with respect to the targeted fuel-saving dimensions. We identify positive spillovers of the tested management practices on job satisfaction and carbon dioxide emissions, and captains overwhelmingly express a desire for deeper managerial engagement. Both the implementation and the results of the study reveal an uncharted opportunity for management researchers to delve into the black box of firms and rigorously examine the determinants of productivity amongst skilled labor.
We propose to change the default P-value threshold for statistical significance for claims of new discoveries from 0.05 to 0.005.
What are the determinants of health and well-being? Income and wealth are clearly part of the story, but does access to health care have a large independent effect, as the advocates of more investment in health care, such as the World Health Organization's Commission on Macroeconomics and Health (2001), have argued? This paper reports on a recent survey in a poor rural area of the state of Rajasthan, India, intended to shed some light on this issue. The study attempted to use a set of interlocking surveys to collect data on health and economic status, as well as on the public and private provision of health care.
Empiricism in the sciences allows us to test theories, formulate optimal policies, and learn how the world works. In this manner, it is critical that our empirical work provides accurate conclusions about underlying data patterns. False positives represent an especially important problem, as vast public and private resources can be misguided if we base decisions on false discovery. This study explores one especially pernicious influence on false positives: multiple hypothesis testing (MHT). While MHT potentially affects all types of empirical work, we consider three common scenarios where MHT influences inference within experimental economics: jointly identifying treatment effects for a set of outcomes, estimating heterogeneous treatment effects through subgroup analysis, and conducting hypothesis testing for multiple treatment conditions. Building upon the work of Romano and Wolf (2010), we present a correction procedure that incorporates the three scenarios, and illustrate the improvement in power by comparing our results with those obtained by the classic studies due to Bonferroni (1935) and Holm (1979). Importantly, under weak assumptions, our testing procedure asymptotically controls the familywise error rate (the probability of at least one false rejection) and is asymptotically balanced. We showcase our approach by revisiting the data reported in Karlan and List (2007), to deepen our understanding of why people give to charitable causes.
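To make the comparison concrete, the two classic benchmarks named above, Bonferroni and Holm, can be sketched in a few lines of Python. This is only an illustration of those two textbook corrections, not the Romano-Wolf-based procedure the abstract proposes; the function names, the significance level of 0.05, and the four p-values are hypothetical and chosen purely for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject hypothesis i if p_i <= alpha / m, where m is the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Holm step-down: compare the sorted p-values with alpha / (m - rank)
    and stop at the first one that fails the comparison."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # every remaining (larger) p-value is also not rejected
    return reject


# Hypothetical p-values for four outcomes (illustrative only, not from any study).
p_values = [0.001, 0.012, 0.015, 0.20]
print("Bonferroni:", bonferroni(p_values))
print("Holm:      ", holm(p_values))
```

Both corrections control the familywise error rate, but Holm never rejects fewer hypotheses than Bonferroni. With these illustrative p-values, Holm rejects the third hypothesis while Bonferroni does not.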
Behavioral economics and field experiments within the social sciences have advanced well beyond academic curiosum. Governments around the globe as well as the most powerful firms in modern economies employ staffs of behavioralists and experimentalists to advance and test best practices. In this study, we combine behavioral economics with field experiments to reimagine a new model of early childhood education. Our approach has three distinct features. First, by focusing public policy dollars on prevention rather than remediation, we call for much earlier educational programs than currently conceived. Second, our approach has parents at the center of the education production function rather than at its periphery. Third, we advocate attacking the macro education problem using a public health methodology, rather than focusing on piecemeal advances.
Research on behavioral economics has established the importance of factors such as reference-dependent preferences, hyperbolic preferences, and the value placed on non-financial rewards. To date, these insights have had little impact on the way the educational system operates. Through a series of field experiments involving thousands of primary and secondary school students, we demonstrate the power of behavioral economics to influence educational performance. Several insights emerge. First, we find that incentives framed as losses have more robust effects than comparable incentives framed as gains. Second, we find that non-financial incentives are considerably more cost-effective than financial incentives for younger students, but are not effective with older students. Finally, and perhaps most importantly, consistent with hyperbolic discounting, all motivating power of the incentives vanishes when rewards are handed out with a delay. Since the rewards to educational investment virtually always come with a delay, our results suggest that the current set of incentives may lead to under-investment. For policymakers, our findings imply that in the absence of immediate incentives, many students put forth low effort on standardized tests, which may create biases in measures of student ability, teacher value added, school quality, and achievement gaps.
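As a rough illustration of why delayed rewards may lose their motivating power under hyperbolic discounting, the sketch below compares a simple one-parameter hyperbolic form, V = A / (1 + kD), with standard exponential discounting. The functional forms are textbook ones, but the reward size, the discount parameters k and r, and the delays are hypothetical values chosen for illustration; none of them are taken from the study.

```python
import math


def hyperbolic_value(amount, delay, k=1.0):
    """Present value under simple hyperbolic discounting: A / (1 + k * D).
    The parameter k is a hypothetical impatience parameter."""
    return amount / (1.0 + k * delay)


def exponential_value(amount, delay, r=0.1):
    """Present value under exponential discounting: A * exp(-r * D).
    The parameter r is a hypothetical discount rate."""
    return amount * math.exp(-r * delay)


# Hypothetical reward of 20 units, paid immediately or after 1 or 4 periods.
reward = 20.0
for delay in (0, 1, 4):
    print(
        f"delay={delay}: "
        f"hyperbolic={hyperbolic_value(reward, delay):.2f}, "
        f"exponential={exponential_value(reward, delay):.2f}"
    )
```

Under these assumed parameters, the hyperbolic value falls steeply over the first period of delay, which is the qualitative pattern the abstract appeals to. The exact magnitudes depend entirely on the illustrative choices of k and r.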