2019-02-18 RRG Notes

Prospecting For Gold

  • Gold, in this metaphor, is a proxy for whatever we truly value
  • We can notice that some people manage to accomplish a lot more of what we altruistically value
  • What is it that gives some people better opportunities than others?
  • How can we go and find opportunities like that?
  • Techniques for finding gold
    • Why are we using "gold" as a metaphor?
    • Put the focus on means, rather than ends
    • The problem is that the ends you pursue will dictate your means, to a large extent
    • Replace big, complex values with a very simple thing that we can maximize
    • And to me, that's the problem with utilitarianism and EA – it replaces a very complex set of human values with a simple stand-in which can be maximized
  • Gold is unevenly spread
    • Value, like literal gold, is unevenly spread
      • The original form of this sentence was, "Gold, like literal gold is unevenly spread," which is quite possibly the most confusing sentence I've read this week
    • We should work to find the seams where gold is rich
  • Heavy-tailed distributions
    • If I want to figure out the average height, I can sample 5 random people and get a pretty good idea of what the global average height is
      • His sampling procedure for finding heights is pretty dubious
      • If you sample a random 5 people, I'm not sure that your estimate of the mean height will actually be that good
      • One of the things that Kahneman brings up in Thinking Fast and Slow is that people, even trained statisticians, always underestimate how large a sample they need in order to achieve a given level of statistical power
    • However, if I want an estimate of how much gold there is in the world, sampling 5 random places isn't going to give me a good idea
      • Quite possibly none of the places I'll sample have gold, and then I'll think there's no gold in the world
      • Possibly one of the places I'll sample will have a lot of gold, and then I'll think there's a lot of gold in the world
      • That's not how statistics works!
      • You know gold (real or metaphorical) is rare
      • You know that n = 5 is a pathetically small sample size
      • So if you sample 5 random locations, why would you expect that to tell you anything about how much gold there is in the world? C'mon, this is basic Bayes
    • Gold follows a heavy-tailed distribution
      • Most places have no gold
      • Some places have a massive amount of gold
      • There's a long tail, where the probabilities aren't dying off very fast, and even vast amounts of gold have a non-negligible probability (see the simulation sketch at the end of this section)
    • In the case of a normal distribution, value is spread evenly in most places, and the important thing is to get to as many places as possible
    • In the case of a heavy-tailed distribution, most places have a small amount of value, and a few locations have value that dwarfs all the other locations combined
    • The important thing is to get to the right places
    • We know that literal gold follows a heavy-tailed distribution
    • Does the same apply to opportunities to do good?
      • When we look at the world, we see heavy tailed distributions in a lot of places
      • Heavy-tailed distributions arise naturally in complex systems with lots of interactions
      • Given that the world is a complex place with lots of interactions, we should expect many of the distributions we encounter to be heavy-tailed
      • We can also directly look at opportunities to do good
        • Let's say that we care about solving hunger
        • We could give to famine relief and try to stop hunger today
        • But, it's probably more effective to focus on figuring out what to do if agriculture collapses
          • Okay, this is EA in a nutshell
          • Want to keep people from starving? You could give people food or subsidize better agricultural practices or whatever. But, actually, it's far more effective to focus on <speculative scenario where all agriculture collapses>
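    • A quick simulation of the sampling point above (my own sketch, not from the talk – the Pareto distribution stands in for any heavy-tailed quantity):

      import numpy as np

      rng = np.random.default_rng(0)

      # Height is roughly normal: a sample of 5 lands near the true mean
      heights = rng.normal(170, 10, size=1_000_000)
      print(heights.mean())                                     # ~170
      print([rng.choice(heights, 5).mean() for _ in range(3)])  # all roughly 165-175

      # "Gold" is heavy-tailed: Pareto with shape 1.1, true mean ~11
      gold = rng.pareto(1.1, size=1_000_000) + 1
      print(gold.mean())                                        # ~11, but unstable
      print([rng.choice(gold, 5).mean() for _ in range(3)])
      # Most 5-point samples miss the rare huge deposits and come out near 1-2,
      # while an occasional sample contains one and comes out enormous
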
  • Heavy-tailed property in opportunities for good
    • We can look at data on developing world health interventions
    • We see a distribution where the most effective intervention is roughly 10,000 times more effective than the least effective intervention
    • The interesting thing about knowing that distributions are heavy-tailed is that it gives us some counterintuitive notions about the value of interventions
    • Before we knew anything about distributions, if we'd been told that an intervention was at the 90th percentile, we'd think it's pretty good
    • But if we know it's a heavy tailed distribution, where most of the value is at the 99th percentile, then that knowledge can make the 90th percentile intervention look worse by comparison
    • Another thing that comes out of heavy-tailed distributions is that naive empiricism doesn't work
    • You can't just try a bunch of different things and see which ones work best, because you don't have the time or resources to try enough things to find that 99th percentile intervention that's 10,000 times as good as the lowest performing intervention
  • To maximize gold, want…
    • If we want to extract the most gold, we need
      • A place where there is a lot of gold
      • The right tools for extracting that gold
      • The right people using those tools
    • This analogy applies to altruism
      • Measure effectiveness of cause area (find a place where we can have outsize impact)
      • Measure effectiveness of intervention (find an intervention that will realize the outsize impact)
      • Measure the ability of the team or organization to implement that intervention (find the right people to put the intervention in place)
  • Value is roughly multiplicative
    • The value from an intervention is the product: effectiveness of cause area × effectiveness of intervention × ability of the team (toy example below)
    • If we find a good team working in an ineffective area, it might make sense to not support them
    • Similarly, if we find an ineffective team working in a highly impactful area, it might make sense to not support them and encourage another team to start working in that area
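    • A toy illustration of the multiplicative claim (made-up scores on a 1-10 scale, my own example):

      def multiplicative_value(cause, intervention, team):
          # Total value is the product of the three factors
          return cause * intervention * team

      # Stellar team and intervention in a nearly worthless cause area
      print(multiplicative_value(1, 10, 10))  # 100
      # Middling team and intervention in a strong cause area
      print(multiplicative_value(5, 5, 5))    # 125
      # An additive model would rank these the other way (21 vs. 15), which is
      # why it can make sense not to support a good team in an ineffective area
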
  • Recognizing gold
    • A nice property of real gold is that when you dig it up, it's pretty easy to determine that it's real gold
    • Altruistic value isn't the same – often have to infer the presence of value by using other tools
  • Running out of easy gold
    • Real gold mining runs into the problem of diminishing returns
    • As more gold is extracted from an area, it requires more and more effort to get the remnants
    • We see this in EA interventions
      • Now that the Gates Foundation is funding mass vaccinations, adding additional funding to mass vaccination isn't going to be as cost-effective
      • The 101st book on Superintelligence isn't going to be as impactful as the first book
        • Actually, is this true?
        • If the 101st book contains the solution to the AI safety problem, it's entirely possible that it would be as impactful as the first, which laid out the problem
  • How do we find the right cause areas?
    • Scale: all else being equal, we want to go to places where there is a lot of good that can be done, as opposed to only a little bit
    • Tractability: we want to go to places where we can make more progress per unit of work
    • Uncrowdedness (neglectedness): We want to go to an area where there is still low-hanging fruit, if possible
    • Ideally, we'd want to be in a place that was all three – large scale, easily tractable, completely neglected
    • However, that combination never occurs in the real world
    • So how can we trade off among the three?
    • The value of extra work can be expressed by the following equation: \[ \frac{dU}{dW} = \frac{dU}{\%dS} \times \frac{\%dS}{\%dW} \times \frac{\%dW}{dW} \]
      • \(\dfrac{dU}{dW}\) represents the value of the next unit of marginal effort
      • The first term on the right represents the value of a little bit of the solution
      • The second term represents the elasticity of progress with work – how much closer to a solution does additional work get you
      • The final term is a measure of uncrowdedness – it works out to \(\frac{\%dW}{dW} = \frac{1}{W}\), one over the total amount of work already being done, and the intermediate terms cancel so the product really is \(\frac{dU}{dW}\)
    • This equation is a more precise version of the scale, uncrowdedness and tractability framework that people have been talking about for years
      • But is it really?
      • This is one thing that economics gets very wrong: just because you write an equation and typeset it in LaTeX doesn't mean your thinking has become more clear
      • Really any equation that trades off scale, tractability and crowdedness would do – the real question is how do you measure scale, tractability and crowdedness?
    • Applying this framework:
      • Helping a bee: fails the scale test – ultimately an individual bee isn't that important and no matter how much you help it, you've only helped one bee
      • Perpetual motion: would be fantastic to have, but it's not a tractable problem – we'd need to significantly revise physics to make it happen
      • Climate change: massive scale, and it's tractable (i.e. it doesn't require any major scientific breakthroughs), but it's a huge cause area that gets attention from millions of people
        • Not clear that the next dollar will do anything special
  • Absolute and marginal priority
    • Given two areas which both satisfy the scale/tractability/uncrowdedness framework, we have to decide where the next dollar of spending or next hour of labor must go
    • We need to track both absolute spending and marginal spending
    • As individuals or small groups, we should think in terms of marginal spending and marginal impact – how much work will my dollar or hour of labor do?
    • As societies, we should think in terms of absolute impact – how much spending should there be in total on a cause area
    • Okay, but you realize that a society is composed of individuals and small groups, right?
    • If individuals and small groups are thinking of marginal impact, while society "as a whole" is thinking of total resource allocation, how do those competing priorities get adjudicated?
    • More explicitly, how does society get individuals and small groups to work on a project that has low marginal impact (like climate change) but which requires a large amount of resources for progress to occur?
  • Long-term gold
    • Oftentimes there are technologies that unlock a lot of value in the short run, but destroy some value in the process
    • There are other technologies which operate more slowly but which are more efficient and allow you to extract more value in the long run
    • Many philosophers like Nick Bostrom argue that we should improve our decision making skills as a society before developing technologies that might threaten the long-run viability of civilization
    • This discussion highlights another problem with the gold analogy
    • Gold is finite – there's only so much of it in the earth
    • If you find a way to destroy some of it or render it unusable, then it's gone
    • I'm not sure that the value he's talking about is like that
  • Working together
    • EA is fortunate in that most people who are in the EA movement have pretty similar values
    • Widespread agreement on what the most important goals are
    • We need to make sure we're getting people to go to where they can do the most good
  • Comparative advantage
    • Can we please have a rule against using Harry Potter examples
    • Don't just focus on where you're absolutely the best, focus on where you have a comparative advantage
    • Maybe the most effective thing for you to be doing is the thing you're second best at, because there's someone else who's also pretty good at doing the thing you're best at, but no one else who can do the thing you're second best at
  • Comparative advantage at multiple levels
    • Comparative advantage applies at the group level as well as at the individual level
    • Different organizations or groups may be better placed to take advantage of different opportunities
    • Another thing that we need to consider is comparative advantage might vary with time – we might be better positioned to do something now than people in the past or future
    • Can we influence which problems people in the future work on, compounding our impact?
  • Building a map together
    • All of us have small parts of the model that tells us where real value is
    • We need mechanisms like peer review or Wikipedia's review process to help us aggregate and filter everyone's intuitions on where the most value is
    • As the EA movement grows, this aggregation and filtering will become more important
    • As we get more resources, it becomes more important that those resources get used wisely
  • Good local norms
    • We need to have good norms to ensure the spread of good ideas
    • Pay attention to why we believe things
      • Do you believe things because it's what you've been told or because you've worked out the reasoning for yourself?
      • Not that you working something out yourself is necessarily a strong reason for you to believe it: it's entirely possible that you've made a mistake
      • But you should know why you believe something and be able to communicate that why to others
        • This is why citations are important, and it makes me sad that the community devalues them
    • Shortening the chain
      • Go back to original sources
      • When people tell you that they read a claim on a website, go to the website and check it out
      • Going back and verifying that the original sources for a claim are correct can make you more robustly confident in the claim
      • Hence citations
    • Disagreement is an opportunity to learn
      • When you find yourself talking to someone who has a point of view that's unlikely to be correct, try to figure out how they came to that point of view
      • Not only is it polite, it also helps you build a deeper picture of the evidence that you do have
      • This runs into diminishing returns quickly
      • It's fascinating to meet the first young-earth creationist, global warming denier, or person who thinks that 9/11 was an inside job
      • By the time you've met the tenth, there really isn't much more you can learn
  • Retrospective: What I believe and Why
    • Why should we believe Owen?
    • Heavy-tailed distributions
      • The fact that many distributions are heavy tailed is a fairly well established property
      • Heavy-tailed isn't a binary property – there's a whole continuum of distributions from the standard Gaussian to heavy-tailed
    • Digression: Altruistic market efficiency
      • Side-note: never ever put digressions in the conclusion of your talk
      • One thing that comes up in financial markets is that people start out by exploring a lot of different ways to make money
      • Most of those ways kind of suck, but a few work really well
      • Then everyone rushes in to those few ways and they stop working as well
      • In effect, efficient allocation of resources makes the distribution less heavy-tailed
      • Is this a factor for EA?
        • While EA has heavy-tailed distributions, the EA market isn't all that efficient
        • We don't have the feedback loops or ways of calculating effect that would allow market-like mechanisms to operate
          • Part of the EA project is working out ways of calculating effect to allow charities to get feedback from their interventions
    • Factoring cost effectiveness
      • This is a simple point – not really space for it to be wrong
      • I'm not sure that it is that obvious – he seems to be taking it as a given that the value from a given cause area is multiplicative based upon the effectiveness of the cause area, the effectiveness of the intervention and the ability of the team
      • What if it's not? What if the effectiveness of the team is only an additive factor?
      • There might be more variation among some of the dimensions (effectiveness of cause area, effectiveness of intervention, and effectiveness of team) than others
    • Diminishing returns
      • Some areas have diminishing returns, but other areas might actually have increasing returns to scale
      • Returns to scale probably apply more at the organization scale than at the domain scale
    • Scale, tractability, uncrowdedness
      • It's obvious that all three of these matter
      • It's obviously correct that this is the right factorization
        • I don't think it's obvious at all
      • Does the factorization break things up into things that are easier to measure?
      • It does match up with an informal framework that people have been using for years, so it's probably good
        • I don't know about that either – one of the nice properties of informal frameworks is that people can choose to stop using them when they don't work any more
        • You turn Scale Tractability Uncrowdedness into a mathematical framework, slap a nice three-letter abbreviation on it (STU, or better STuC), and then people are going to find that they have to justify everything in terms of scale, tractability, and uncrowdedness, even when those aren't necessarily the correct metrics to be using
    • Absolute and marginal priorities
      • This is also a fairly trivial point
      • It's easy to understand that some things that require more spending overall won't necessarily benefit much from my additional dollar
    • Differential progress
      • The argument checks out and it's appeared in a few academic papers
      • However, it is counterintuitive, and we should give it more scrutiny
      • I'm amused that out of all the counterintuitive notions in the presentation, he chooses to highlight as counterintuitive the only concept which I didn't find counterintuitive
      • That said, I do agree that it should be subject to more scrutiny – I didn't like the analogy
        • I'm still not clear what "dynamite" maps to in the analogy
        • Concrete examples of fast technology that destroyed long term value vs. slow technology that preserved long term value would be good to have
    • Comparative advantage
      • Comparative advantage is a standard idea from economics
      • The new thing here is adding a time component to the comparative advantage calculation
    • Aggregating knowledge
      • We all want better ways of aggregating knowledge
      • The question is can we actually build those better ways
    • Stating reasons for beliefs
      • This is another common-sense thing
      • There are, of course, costs to stating why you believe something
        • Slows down communication
        • Makes the community more off-putting to newcomers
        • I think these are all costs worth bearing
        • If EA is serious about its mission of finding the most effective interventions and allocating resources towards them, it makes sense to be absolutely rigorous in making sure that the interventions that are found are actually those which are most worthy
        • Otherwise why should I believe GiveWell over my own intuition?
  • Conclusion
    • We need to be careful about aiming at the right things
    • We need to spread broadly the knowledge of how to find the right things to aim at
    • It's important that we think about these things now, when the community is still in its early days, so we can get these norms established before it becomes difficult to do so

How to Compare Different Global Problems in Terms of Impact

  • How do you figure out which area is most effective to focus on?
  • What problem you choose to focus on is the biggest determinant of the social impact you have with your career
  • Framework:
    • Scale
    • Neglectedness
    • Solvability
    • Personal fit
  • Introducing how we define the factors
    • Ultimately, what we want to know is the expected good that will result from the next unit of resources invested in a problem
    • This is hard to estimate, so we break it down into components that we can estimate individually
      • I literally facepalmed at this: "Here is a hard thing that we don't know how to estimate. By breaking it down into three smaller things, which we also don't know how to estimate, we have made the problem more tractable."
      • Scale: (good done/% of problem solved)
      • Solvability: (% of problem solved / % increase in resources)
      • Neglectedness: (% increase in resources / extra person or $)
        • Credit here for explaining the equation from the previous post better
        • I can see how the neglectedness thing makes sense – it's literally "How much of an increase does the next person or dollar represent?"
      • The nice thing about breaking it down this way is that if you multiply Scale, Solvability and Neglectedness, the intermediate terms cancel and you get (good done) / (extra person or $) – see the identity written out below
    • Finally, add a bonus factor for suitability, when attempting to decide which problems you should work on
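    • Written out (my notation, same quantities as the definitions above), scale × solvability × neglectedness telescopes: \[ \frac{\text{good done}}{\%\text{ solved}} \times \frac{\%\text{ solved}}{\%\text{ more resources}} \times \frac{\%\text{ more resources}}{\text{extra person or \$}} = \frac{\text{good done}}{\text{extra person or \$}} \]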
  • Defining a problem carefully
    • Make sure you have a clear definition of the scope of the problems you're comparing
    • Example: "global health"
      • Which diseases
      • Which countries
    • Note that narrowly described problems tend to look better than broad problems
    • Problems can be made to look more or less pressing by altering their definitions
  • Creating a (logarithmic) scale
    • There are often huge differences between cause areas on the metrics listed above
    • Using a logarithmic scale allows us to take the logarithm of each metric and then add them together instead of multiplying them all
    • When comparing the cost-effectiveness of various problems, you can look at the differences of their log scores
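    • The identity being relied on here (assuming base-10 logs): \[ \log_{10}(S \times T \times N) = \log_{10} S + \log_{10} T + \log_{10} N \] A difference of \(d\) between two problems' summed scores then corresponds to a factor of \(10^d\) in estimated cost-effectiveness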
  • How to assess scale
    • Definition of scale: if we solved this problem, how much would the world improve?
      • Measure scale in terms of its effect on well-being (in terms of QALYs)
      • Scale can be increased by
        • Affecting more people
        • Having a greater impact
      • If you have different values, you can plug that in to your definition of "scale"
    • Measuring scale
      • Measuring scale is difficult, especially when considering the long-term and indirect effects of solving a problem
      • Example: what was the impact of Einstein's discovery of relativity?
        • It would have been difficult to assess the impact of the theory of relativity in 1916, but that doesn't mean that breakthroughs in physics don't matter
      • To make wide-ranging comparisons between problems, you need to turn to "yardsticks" for scale
        • One commonly used yardstick in economics is GDP (although GDP certainly has problems of its own)
        • Another yardstick proposed by Bostrom is whether an action increases or reduces existential risk
    • The process of measuring scale is most difficult when you're comparing across yardsticks
      • How much does a particular health intervention lower existential risk?
      • These tradeoffs are also most susceptible to worldview and value judgements
      • There are big disagreements over how much to value the future, how much to value animals, etc.
    • I don't disagree with the methodology, but I do find their examples illustrative
    • Turning 10,000 people vegan ranks as a 2. Saving 3 lives ranks as a 0
    • So turning 10,000 people vegan ranks as being literally 100x more beneficial than saving 3 human lives (a difference of 2 on the log scale is a factor of \(10^2 = 100\)) – and really, that's the most stereotypical EA calculation I've seen
  • How to assess neglectedness
    • How many people or dollars are currently being allocated to the problem?
    • Why is it important?
      • Often, after a large amount of resources have been devoted to a problem, you'll hit diminishing returns
      • For example, mass vaccination is a very effective intervention, but governments have already poured massive amounts of money into vaccination programs
      • Neglectedness also allows us to determine which problems are "most pressing"
      • If there's a new problem that no one has worked on yet, it might turn out to be more solvable than previously thought
    • How to assess it
      • A challenge - direct vs. indirect effort
        • Often there's a lot of money being spent on efforts that are indirectly working on the problem or working on adjacent problems
        • For example: there's not a lot of money being spent on anti-aging research directly, but there is a lot of money being spent on biomedical research more broadly
        • Even though the money spent on the indirect effort may not be as well-targeted as the money spent on the direct effort, there might be so much more money spent on the indirect effort that the indirect spending is responsible for most of the progress in the cause area
        • To go back to the example, the indirect spending on medical research is probably responsible for most of the progress on anti-aging
        • Indirect efforts are often difficult to measure and score – for this reason 80,000 Hours only scores direct efforts on a problem
        • This isn't as much of a problem as it appears because this is adjusted for when we measure solvability (tractability)
      • More tips on how to assess
        • Rather than trying to assess neglectedness directly, you can think about questions like:
          • Why hasn't this already been addressed by markets and/or governments?
          • Is this a new field, or a field at the intersection of two disciplines (for research)?
          • If you don't work on this problem, how likely is it that someone else will step in to work on it?
          • If you work on this problem, will you learn more about how pressing it is in comparison to other problems?
      • It's important to assess scale and neglectedness together
      • We care about the ratio of scale to neglectedness – we want the biggest problem that also has the least amount of resources devoted to it
      • If several kinds of input are being dedicated to a problem, assess neglectedness by the lowest value among the different kinds of input
  • How to assess how solvable a problem is
    • Definition: if we doubled the amount of direct effort on this problem, what fraction of the remaining problem would we expect to solve?
    • Why is it important
      • Even if a problem is hugely important and highly neglected, there might not be very much we can do about it
      • Example: aging
        • Huge in scale – roughly 2/3 of global ill-health is some form of aging
        • Highly neglected – there's very little direct research on aging
        • However, direct research on aging is neglected largely because researchers believe that it's very hard to solve
    • How to assess it
      • Are there cost-effective interventions for making progress on this problem with rigorous evidence behind them
      • Are there promising but unproven interventions which can be cheaply tested?
      • Are there theoretical arguments that progress should be possible (such as a good track record in a related area)?
      • Are there interventions that could make a huge contribution to solving the problem, even if they're unlikely to work?
    • Look for the best interventions for making progress on the problem, and then evaluate them on:
      • Potential upside
      • Likelihood of upside
    • Take a Bayesian approach to evaluating both factors
    • Prior is that any given intervention isn't very effective
    • Challenges in assessment
      • Solvability is the hardest of the three areas to assess because it requires anticipating the future
      • In some cases we can use the cost-effectiveness of existing techniques
      • In other cases, we have to use judgment calls
      • Use an "expected value" approach to scoring – this allows us to judge incremental approaches and radical approaches using the same yardstick
      • Problems for which most of the work is being performed indirectly will likely be solved more slowly through an increase in direct work – many promising approaches have been tried by other groups and found wanting
  • What do the summed scores mean
    • We can sanity-check our scores by adding them up and converting them back into a measure of actual impact from one additional person working on the problem
    • Don't put weight on the figures specifically, instead use the scores to make relative comparisons
  • How to assess personal fit
    • Within a field, top performers have 10 to 100 times as much impact as the median performer
    • I mean, this might be true for research, but I'm not sure how applicable this is for other domains
    • It's important to choose a field that you'll like and be good at
    • Definition
      • Given your skills, resources, knowledge, connections and passions, how likely are you to excel in this area?
    • How can it be assessed?
      • What's your most valuable career capital? Is it especially relevant to one problem and not others?
      • How motivated do you expect to be if you worked on this problem?
      • What specific roles could you take in this problem and do you expect you'd excel at them?
    • Personal fit matters more for some kinds of altruism than others
      • If you're planning to contribute directly, it matters a lot
      • If you're planning on donating money, it matters less
  • Other factors for assessing career opportunities
    • Also need to consider the other factors in the career framework
      • How influential a role can you get?
      • How much career capital can you get?
      • The value of information from working on this option
  • How should we interpret the results?
    • Using this framework, we can add together the scores for scale, neglectedness and solvability to get a rough idea of which problems are most important
    • These scores are imprecise and adding them together only increases the uncertainty because each of the scores has its own error
    • If the difference in scores is 4 or larger, one problem is clearly more important than the other
    • If the difference is 3 or smaller, it's a close call
  • How does this compare with ordinary cost-effectiveness analysis
    • An alternative approach is to compare the cost-effectiveness of past interventions against a problem
    • When comparing problems in two different domains, convert their cost-effectiveness with a conversion factor (which adds uncertainty)
    • The problem with cost-effectiveness analysis is that it's very difficult to carry out in many circumstances
      • Political advocacy – circumstances are constantly shifting
      • Original research – no one knows how long it will take to make a new discovery
      • Any field in which interventions are unknown or poorly studied
  • Advantages and disadvantages of quantitative problem prioritization
    • Benefits of going through process from above
      • Explicitly quantifying outcomes can help you notice large, robust differences in effectiveness that might be difficult to notice qualitatively
      • Helps avoid scope neglect
      • Going through the process tests your understanding of a problem by forcing you to be explicit about your assumptions
      • Can help others understand and critique your reasoning
    • Disadvantages
      • High levels of uncertainty
      • Different assumptions can greatly alter the outcomes of the analysis
      • Danger of being misled by an incomplete model where it would have been better to go with qualitative analysis or common sense
    • Don't use this model alone, combine it with other forms of evidence
  • Conclusion
    • Difficult to measure effectiveness precisely, but the large differences between problems means that even inaccurate measurements can be a useful guide

Four Focus Areas of Effective Altruism

  • EAs tend to:
    1. Be globally altruistic – care about people equally regardless of location
    2. Value consequences – value causes according to their consequences, whether those consequences are happiness, health, justice, etc.
    3. Try to do as much good as possible – don't want to do some good, want to do as much good as possible
    4. Think scientifically and quantitatively – use numbers to figure out what is the most good
    5. Be willing to make significant life changes in order to be significantly more altruistic
      • Change which charities they support financially
      • Change careers
      • Spend significant chunks of time investigating which causes are most cost-effective
      • Make other significant life changes
  • Despite this, EAs tend to be fairly diverse and focus on a variety of causes
  • These causes tend to cluster in 4 groups
    1. Poverty reduction
      • Economic benefit, better health, better education
      • Major organizations
        • GiveWell – most rigorous research on charitable causes, especially with regards to poverty reduction and global health
        • GoodVentures – works closely with GiveWell
        • The Life You Can Save – encourages people to pledge a fraction of their income to effective charities
        • Giving What We Can – does some charity evaluation and encourages people to donate 10% of their income to effective charities
        • In addition, some major foundations, such as the Bill and Melinda Gates Foundation, fund many of the most cost-effective interventions in the developing world
      • In the future, EAs might focus on economic, political or research infrastructure changes that might achieve poverty reduction more directly
      • GiveWell Labs and The Vannevar Group are beginning to evaluate the likely cost-effectiveness of these measures
    2. Meta-effective altruism
      • Raising awareness of EA
      • Helping EAs reach their potential
      • Doing research to decide which areas EAs should focus on
      • Major organizations
        1. 80,000 hours – highlights the importance of helping the world through one's career
        2. Center for Applied Rationality (CFAR) – trains people in rationality skills, but are especially focused on the application of rational thought to altruism
        3. Leverage Research – focuses on growing and empowering the EA movement
          • Hosts the EA summit
          • Organizes the THINK student group network
          • Searches for mind hacks which can make EAs more effective
      • Most EA organizations spend some time on growing the EA movement, even if it's not their primary focus
    3. The Far Future
      • Many EAs value future people as much as currently living people
      • Therefore, the vast majority of value is found in the astronomical numbers of people who could live in the far future
      • Focus on efforts to capture some of these benefits by reducing existential risk
      • Major organizations
        1. Future of Humanity Institute at Oxford University – main hub for research on existential risk mitigation
        2. Machine Intelligence Research Institute – focuses on doing the research necessary to build Friendly AI, which could make the future far better off
      • Other groups also study existential risks
        • NASA searches for asteroids that could be an existential threat
        • Many organizations, such as GCRI, study worst-case scenarios for climate change or nuclear warfare
    4. Animal suffering
      • Reducing animal suffering in cost-effective ways
      • Animals vastly outnumber humans
      • Growing numbers of scientists believe that animals consciously experience pleasure and suffering
      • The primary organization in this field is Effective Animal Activism
      • Major thinkers in this area include Peter Singer, David Pearce and Brian Tomasik
  • Other focus areas
    • Effective environmental altruism
      • Environmental movement is large and well known
      • However not many EAs take environmentalism as the most important thing for them to be working on
  • EAs should go out of their way to cooperate and learn from each other, even when they're working in different focus areas

Why We Can't Take Expected Value Estimates Literally Even When They're Unbiased

  • There are some organizations which criticize GiveWell on its preference for strong evidence over high "expected value"
  • Critique is based on the role of non-formalized intuitions in GiveWell's decision-making
  • The problem with this critique is that expected value is often based on a formula whose inputs are guesses or very rough estimates
  • Any estimate made along these lines needs to be adjusted with a "Bayesian prior"
  • This adjustment can rarely be made with an explicit formal calculation
  • Most formal attempts to do so, even when they're making significant negative adjustments, are not making nearly as much of an adjustment as they ought to be making in order to be consistent with the proper Bayesian approach
  • This is why, even though recommendations are grounded in relevant facts, calculations and quantifications, they still have a strong dose of intuition
  • Generally, GiveWell prefers to recommend areas where there is strong evidence that donations can do some good rather than weak evidence that donations can do a lot of good
  • This preference is inconsistent with expected-value approaches which don't include Bayesian adjustments
  • The approach we oppose: "explicit expected value" (EEV) decisionmaking
    • The EEV approach generally involves an argument of the form:
      • Each dollar spent on program P has an estimated value V
      • This estimate is extremely rough and unreliable, but it's unbiased (as likely to be too pessimistic as too optimistic)
      • Therefore V represents the per-dollar expected value of P
      • I don't know how good charity C is at implementing program P, but even if it wastes 75% of its money, its per-dollar expected value is 25% of V, which is still excellent
    • Examples of EEV decision-making
      • Deworm the World
        • Spends 74% of its funding on technical assistance and scaling up deworming programs
        • Even if we assess the charity on that 74%, it would still do well in QALYs/DALYs saved
      • Back of the Envelope Guide to Philanthropy
        • Donating to political advocacy for foreign aid is between 8x and 22x as good as a donation to VillageReach
      • X-risk charities must be the best ones to support, because the value of saving the human race is so high that "any imaginable probability of success" would lead to a higher expected value than the others
        • As one of my friends said – these people look at the logic behind Pascal's Mugging and say, "One person's modus tollens is another person's modus ponens"
      • Pascal's Mugging is the reductio ad absurdum of this sort of reasoning
    • The general problem with the EEV approach is that it doesn't incorporate a preference for better-grounded evidence over rougher estimates
    • Ranks charities/actions solely based on their expected value, ignoring differences in the robustness of the expected value calculations
  • Informal objections to EEV decisionmaking
    • Nothing in EEV penalizes ignorance or poorly grounded estimates
    • Because of this, a world in which people acted on EEV would be problematic in a number of ways
      • Nearly all altruists would put their resources toward people they knew little about, rather than helping themselves, their families, or their communities
        • Peter Singer would say that's a feature, not a bug, since the communities most effective altruists live in are in rich countries, and thus don't need help
      • In such a world, once an action is decided to have high EEV, there is little or no incentive to engage in costly skeptical inquiry into the actual value of the action
    • Giving based on EEV seems to create bad incentives
      • Doesn't allow rewarding charities based on transparency
      • Charities would have every incentive to announce that they were focusing on the highest expected value programs without disclosing the details on how they were focusing on these programs
      • Or, worse, disclosing everything and then accompanying it with rationalizations around how e.g. "our marketing budget has high expected value, because it might recruit the researcher who makes the next breakthrough, so it's totally worth it"
    • Basing your decisions on EEV analysis leaves you vulnerable to Pascal's Mugging – a tiny probability of a huge positive or negative outcome can dominate your decisionmaking, in ways that violate common sense
  • Simple example of a Bayesian approach vs. an EEV approach
    • Beer Advocate ranks beers using a bayesian approach
    • A new beer added to the site has a score of 3.66 (the average score of all beers on the site)
    • As it accumulates reviews, the score is updated using a bayesian approach
    • As the number of reviews grows, the formula's "confidence" in the quality of the beer grows and the beer's score asymptotically approaches its "true" score
    • However, there are a few problems with this approach
      • Judgment call in which prior to use – is it really reasonable to assume that a beer is average until proven otherwise?
      • BA also has a minimum number of reviews required before a beer can be scored – the choice of this value is a judgment call
      • The basic approach is much more straightforward than estimating how much good a charity does
      • For charities, it's often not clear what the reference class should be or what your priors should be set to
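    • A minimal sketch of this kind of formula (the standard "Bayesian average" used by many rating sites – I'm assuming Beer Advocate's is of this form, and the constants are made up):

      def bayesian_score(review_mean, n_reviews, prior_mean=3.66, prior_weight=10):
          # prior_weight acts like a number of "phantom" reviews at the
          # site-wide mean; with few real reviews the prior dominates, and
          # as n_reviews grows the score approaches the raw average
          return (prior_weight * prior_mean + n_reviews * review_mean) / (
              prior_weight + n_reviews
          )

      print(bayesian_score(4.8, 2))    # ~3.85: two glowing reviews barely move it
      print(bayesian_score(4.8, 500))  # ~4.78: enough reviews to approach the raw mean
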
  • Applying Bayesian adjustments to cost-effectiveness estimates of donations, actions, etc.
    • Giving What We Can and Back of the Envelope Guide to Philanthropy both use forms of EEV in arguing for their recommendations
    • Propose a model in which estimate error is log-normally distributed around the "cost-effectiveness" estimate, with a mean of no error
    • The prior distribution for cost-effectiveness is normally (or log-normally) distributed as well
    • The more confident one is in one's estimate, the smaller the variance of the "estimate error"
    • Effects:
      • A reliable estimate causes the Bayesian adjusted conclusion to end up very close to the estimated value
      • When the estimate is relatively unreliable (large confidence intervals), the Bayesian adjustment causes the estimate to have virtually no effect on the final view
    • The takeaway is that having the mid-point of a cost-effectiveness estimate is not enough; you need to understand the sources of estimate error and the degree of estimate error relative to the degree of variation in the estimated cost-effectiveness of various interventions
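    • A sketch of that adjustment with made-up numbers (normal-normal updating on log cost-effectiveness – the post describes its model only qualitatively, so treat this as an illustration):

      def posterior_mean(prior_mean, prior_var, estimate, estimate_var):
          # Precision-weighted average of prior and estimate (normal-normal)
          w = prior_var / (prior_var + estimate_var)  # weight on the estimate
          return prior_mean + w * (estimate - prior_mean)

      # Prior on log10 cost-effectiveness: N(0, 1), i.e. centered on baseline
      # A reliable estimate claiming 10x (log10 = 1) with small error:
      print(posterior_mean(0, 1, 1, 0.1))  # ~0.91 – conclusion close to the estimate
      # A wild estimate claiming 10,000x (log10 = 4) with huge error:
      print(posterior_mean(0, 1, 4, 25))   # ~0.15 – the estimate barely moves the view

    • The same mechanism drives the Pascal's Mugging section that follows: if the estimate error grows along with the claimed expected value, the posterior stays bounded no matter how large the claim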
  • Pascal's Mugging
    • Non-Bayesian responses to Pascal's Mugging ask: even if the analysis is probably wrong, are you certain that it's 99.99…% likely to be wrong?
    • However in many of these cases, the lion's share of variance in estimated expected value is coming from estimate error
    • A Bayesian adjustment would divide the expected value of the action by the estimate of the error in the expected value
    • The larger the expected value, the larger the estimated error, so extremely large EEV actions should end up affecting your choices the least
  • Generalizing the Bayesian approach
    • One needs to quantify both the appropriate prior for cost-effectiveness and the strength/confidence of an effectiveness estimate in order to quantify estimated cost-effectiveness
    • However, when it comes to giving, reasonable quantifications of these things usually aren't possible
    • To have a prior, you need to have a reference class and reference classes are debatable
    • Our brains process a huge amount of information to come up with priors from intuition
    • Attempting to formalize this reduces the amount of information you process, degrading the quality of your priors
    • When formulas are too rough, the loss of information outweighs the gains in transparency
    • Incorrect approaches to Bayesian estimates
      • "I have a weak or uninformative prior, so I can take rough estimates literally"
        • You have more information than you think you do
        • Even a sense of the consequences to actions in your own life gives you an "outside view" and a starting probability distribution for estimating the consequences to actions
      • Making "downward adjustments" to an EEV estimate
        • How do you tell whether the downward adjustment has the correct relationship to the weakness of the estimate, the strength of the prior, and the distance of the estimate from the prior?
        • As an extreme example, in the Pascal's Mugging case, applying a 99.99% downward adjustment seems reasonable, but in fact the correct Bayesian adjustment is much larger
    • Heuristics used to judge whether prior-based adjustments are correct
      • The more action is asked of me, the more evidence I require
        • Significant actions require more evidence than trivial actions
      • Pay attention to how much of the variation in estimates is likely to be driven by true variation rather than estimate error
        • When an estimate is so rough that estimation error accounts for the bulk of the observed variation, a proper Bayesian approach involves applying a massive discount to the estimate
      • Put more weight on conclusions which seem to be supported by multiple lines of analysis, preferably unrelated to one another
        • The less correlated the estimates, the greater the decline in the variance of the estimate error
        • Diversified reasons for believing something lead to more robust beliefs
      • Be hesitant to embrace arguments which have anti-common-sense implications (unless the evidence behind these claims is strong)
        • Priors that are too weak can lead to many seemingly absurd beliefs
        • They also remove the incentive for investigating strong claims
      • The prior for charity should be generally skeptical
        • Giving well is conceptually pretty difficult
        • The more we dig on cost-effectiveness estimates the more unwarranted optimism we discover
        • Optimistic priors incentivize giving to opaque charities, which violates common sense
        • Look for charities with strong evidence of effectiveness and reasonably high cost-effectiveness over charities with weaker evidence and very-high cost-effectiveness
  • Conclusion
    • Any giving approach that relies solely on estimated expected value is flawed
    • Thus when aiming to maximize positive impact, it's not advisable to make giving decisions based solely on explicit formulas
    • Proper Bayesian adjustments are important and difficult to formalize

Author: Rohit Patnaik

Created: 2019-02-18 Mon 14:19
