The DAG is shown in Figure 5. Conditions AC 1 and AC 2 are clearly satisfied. In words, the relevant counterfactual says that if neither Billy nor Suzy had thrown their rock, the window would not have shattered. Thus condition AC 3a is satisfied. Here is how AC works in this example. S influences W along two different paths. These two paths interact in such a way that they cancel each other out, and the value of S makes no net difference to the value of W.
However, by holding B fixed at its actual value of 0, we eliminate the influence of S on W along that path. The result is that we isolate the contribution that S made to W along the direct path.
AC defines actual causation as a particular kind of path-specific effect. To treat Overdetermination, let B, S, and W keep the same meanings. Our setting and equation will be: Conditions AC 1 and AC 2 are obviously satisfied. The key idea here is that S is a member of a minimal set of variables that need to be changed in order to change the value of W. Despite these successes, none of the analyses of actual causation developed so far perfectly captures our pre-theoretic intuitions in every case.
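As a minimal sketch, both cases can be checked directly. The specific equations below (B = 1 − S for the preemption case, W = max(S, B) for the window) are assumptions about how the examples in the text are modeled, not formulas from the text itself:

```python
# Hypothetical structural equations for the rock-throwing examples.
# Preemption: Billy throws (B = 1) only if Suzy does not (S = 0);
# the window shatters (W = 1) if either rock hits it.
def billy(s):
    return 1 - s

def window(s, b):
    return max(s, b)

# S makes no net difference to W: the two paths cancel.
assert window(1, billy(1)) == window(0, billy(0)) == 1

# Holding B fixed at its actual value (0, since Suzy threw) isolates
# the direct path, and W now depends counterfactually on S.
assert window(1, 0) == 1 and window(0, 0) == 0

# Overdetermination: both throw. Changing S alone leaves W unchanged,
# but changing the minimal set {S, B} changes W.
assert window(0, 1) == 1   # S alone makes no difference
assert window(0, 0) == 0   # changing both changes W
print("all checks pass")
```

The assertions simply replay the counterfactuals discussed in the text under these assumed equations.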
One strategy that has been pursued by a number of authors is to incorporate some distinction between default and deviant values of variables, or between normal and abnormal conditions.
Probabilistic Causal Models

In this section, we will discuss causal models that incorporate probability in some way. Probability may be used to represent our uncertainty about the value of unobserved variables in a particular case, or the distribution of variable values in a population. Often we are interested in when some feature of the causal structure of a system can be identified from the probability distribution over values of variables, perhaps in conjunction with background assumptions and other observations.
In realistic scientific cases, we never directly observe the true probability distribution P over a set of variables. Rather, we observe finite data that approximate the true probability when sample sizes are large enough and observation protocols are well-designed. We will not address these important practical concerns here.
Rather, our focus will be on what it is possible to infer from probabilities, in principle if not in practice.
We will also consider the application of probabilistic causal models to decision theory and counterfactuals. The assumption that each endogenous variable has exactly one error variable is innocuous. Moreover, the error variables need not be distinct or independent of one another. Let us take some time to explain each of these formulations. MC (d-separation) introduces the graphical notion of d-separation.
Note that MC provides sufficient conditions for variables to be probabilistically independent, conditional on others, but no necessary conditions. Here are some illustrations: W also screens T off from all of the other variables, which is most easily seen from MC (d-separation). T does not necessarily screen Y off from Z; indeed, it does not necessarily screen anything off from anything.
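Screening off can be made concrete with a small simulation. The chain X → W → T below, with linear Gaussian equations and the coefficient 0.8, is an illustrative assumption rather than the graph discussed in the text:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical chain X -> W -> T with linear Gaussian equations.
x = rng.normal(size=n)
w = 0.8 * x + rng.normal(size=n)
t = 0.8 * w + rng.normal(size=n)

# X and T are unconditionally dependent...
r_xt = np.corrcoef(x, t)[0, 1]

# ...but conditioning on W (here via linear residuals) renders them
# independent: W screens X off from T.
res_x = x - (np.cov(x, w)[0, 1] / np.var(w)) * w
res_t = t - (np.cov(t, w)[0, 1] / np.var(w)) * w
r_xt_given_w = np.corrcoef(res_x, res_t)[0, 1]

print(abs(r_xt) > 0.3, abs(r_xt_given_w) < 0.05)   # True True
```

Regressing out W is a linear stand-in for conditioning; it suffices here because all the equations are linear.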
Figure 7

In Figure 7, MC entails that X and Z will be unconditionally independent, but not that they will be independent conditional on Y. This is most easily seen from MC (d-separation). We will discuss this kind of case in more detail in Section 4. MC will typically fail in the following kinds of cases. For example, if X represents a spin measurement on one particle, and Y a spin measurement in the same direction on the other, then X and Y are perfectly anti-correlated.
One particle will be spin-up just in case the other is spin-down. The measurements can be conducted sufficiently far away from each other that it is impossible for one outcome to causally influence the other. However, it can be shown that there is no local common cause Z that screens off the two measurement outcomes.
For example, suppose that X, Y, and Z are variables that are probabilistically independent and causally unrelated. Then U and W will be probabilistically dependent, even though there is no causal relation between them. MC may fail if the variables are too coarsely grained. Suppose X, Y, and Z are quantitative variables. Z is a common cause of X and Y, and neither X nor Y causes the other.
The statements are in fact quite different from one another. Pearl interprets this result in the following way: Macroscopic systems, he believes, are deterministic. In practice, however, we never have access to all of the causally relevant variables affecting a macroscopic system. But if we include enough variables in our model so that the excluded variables are probabilistically independent of one another, then our model will satisfy the MC, and we will have a powerful set of analytic tools for studying the system.
Thus MC characterizes a point at which we have constructed a useful approximation of the complete system. Proponents defend this assumption in at least two ways. Empirically, it seems that a great many systems do in fact satisfy MC.
Many of the methods that are in fact used to detect causal relationships tacitly presuppose the MC. In particular, the use of randomized trials presupposes a special case of the MC.
The effect of randomization is to eliminate all of the parents of D, so MC tells us that if R is not a descendant of D, then R and D should be probabilistically independent. If we do not make this assumption, how can we infer from the experiment that D is a cause of R?
Hausman and Woodward attempt to defend MC for indeterministic systems. As such, the MC by itself can never entail that two variables are conditionally or unconditionally dependent. The Minimality and Faithfulness Conditions are two conditions that give necessary conditions for probabilistic independence. This follows the terminology of Spirtes et al.
This graph would satisfy the MC with respect to P. But it would violate the Minimality Condition with respect to P, since the subgraph that omits the arrow from X to Y would also satisfy the MC.
The Minimality Condition implies that if there is an arrow from X to Y, then X makes a probabilistic difference for Y, conditional on the other parents of Y. Suppose now that X and Z are unconditionally independent of one another, but dependent conditional upon Y.
The other two variable pairs are dependent, both conditionally and unconditionally. The graph shown in Figure 8 does not satisfy FC with respect to this distribution; colloquially, the graph is not faithful to the distribution. MC, when applied to the graph of Figure 8, does not imply the independence of X and Z.
This can be seen by noting that X and Z are d-connected by the empty set. By contrast, the graph shown in Figure 7 above is faithful to the described distribution. Note that Figure 8 does satisfy the Minimality Condition with respect to the distribution; no subgraph satisfies MC with respect to the described distribution. In fact, FC is strictly stronger than the Minimality Condition.

Figure 8

Here are some other examples: FC can fail if the probabilistic parameters in a causal model are just so.
In Figure 8, for example, X influences Z along two different directed paths. If the effect of one path is to exactly undo the influence along the other path, then X and Z will be probabilistically independent.
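The cancellation can be simulated with linear parameters chosen "just so". The particular coefficients below are illustrative assumptions, tuned so the direct path X → Z exactly cancels the path through Y:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000

# X -> Y -> Z and X -> Z, with coefficients tuned so that the two
# paths exactly cancel: a violation of Faithfulness.
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)            # X -> Y, coefficient +2
z = 1.0 * y - 2.0 * x + rng.normal(size=n)  # Y -> Z (+1) and X -> Z (-2)

r_xz = np.corrcoef(x, z)[0, 1]
print(abs(r_xz) < 0.01)   # True: X and Z look independent
```

Substituting the first equation into the second gives z = (noise terms only), so X drops out of Z entirely even though X is a cause of Z along two paths.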
Causal Models (Stanford Encyclopedia of Philosophy)
If the underlying SEM is linear, Spirtes et al. show that the parameter values giving rise to violations of FC have Lebesgue measure zero. Nonetheless, parameter values leading to violations of FC are possible, so FC does not seem plausible as a metaphysical or conceptual constraint upon the connection between causation and probabilities. It is, rather, a methodological principle: given the independence of X and Z, we should not infer the structure of Figure 8. This is not because Figure 8 is conclusively ruled out by the distribution, but rather because it is preferable to postulate a causal structure that implies the independence of X and Z rather than one that is merely consistent with it.
See Zhang and Spirtes for comprehensive discussion of the role of FC. Violations of FC are often detectable in principle. For example, suppose that the true causal structure is that shown in Figure 7, and that the probability distribution over X, Y, and Z exhibits all of the conditional independence relations required by MC.
Suppose, moreover, that X and Z are independent, conditional upon Y. This conditional independence relation is not entailed by MC, so it constitutes a violation of FC. It turns out that there is no DAG that is faithful to this probability distribution. This tips us off that there is a violation of FC.
While we will not be able to infer the correct causal structure, we will at least avoid inferring an incorrect one in this case. Researchers have explored the consequences of adopting a variety of assumptions that are weaker than FC; see, for example, Ramsey et al. This epistemological question is closely related to the metaphysical question of whether it is possible to reduce causation to probability.
Chapter 3 proves the following theorem: It is relatively easy to see why this holds. Suppose our probability distribution has the following properties:
- X and Y are dependent, both unconditionally and conditional on Z;
- Y and Z are dependent, both unconditionally and conditional on X;
- X and Z are dependent unconditionally, but independent conditional on Y.
Then the Markov equivalence class contains three DAGs: X → Y → Z, X ← Y ← Z, and X ← Y → Z. On the other hand, suppose the probability distribution is as follows:
- X and Y are dependent, both unconditionally and conditional on Z;
- Y and Z are dependent, both unconditionally and conditional on X;
- X and Z are independent unconditionally, but dependent conditional on Y.
Then the Markov equivalence class contains just one DAG, the collider X → Y ← Z. In other words, our probabilistic SEM will generate a unique causal Bayes net.
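The two dependence patterns can be exhibited in a quick simulation. The linear Gaussian equations and unit coefficients are illustrative assumptions; conditioning is approximated by linearly regressing out the third variable:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300_000

def partial_corr(a, b, c):
    """Correlation of a and b after linearly regressing out c."""
    ra = a - (np.cov(a, c)[0, 1] / np.var(c)) * c
    rb = b - (np.cov(b, c)[0, 1] / np.var(c)) * c
    return np.corrcoef(ra, rb)[0, 1]

# Chain X -> Y -> Z: X, Z dependent, but independent conditional on Y.
x = rng.normal(size=n)
y = x + rng.normal(size=n)
z = y + rng.normal(size=n)
print(abs(np.corrcoef(x, z)[0, 1]) > 0.3,    # dependent
      abs(partial_corr(x, z, y)) < 0.05)     # screened off by Y

# Collider X -> Y <- Z: X, Z independent, but dependent conditional on Y.
x = rng.normal(size=n)
z = rng.normal(size=n)
y = x + z + rng.normal(size=n)
print(abs(np.corrcoef(x, z)[0, 1]) < 0.05,   # independent
      abs(partial_corr(x, z, y)) > 0.3)      # dependence induced by Y
```

The chain would produce the same independence pattern if its arrows were reversed, or replaced by a common cause; only the collider pattern pins down a unique graph.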
These methods can do no better than identifying the Markov equivalence class. Can we do better by making use of additional information about the probability distribution P, beyond relations of dependence and independence? There is good news and there is bad news. First the bad news.
Now for the good news. There are fairly general assumptions that allow us to infer a good deal more. Here are some fairly simple cases:
- LiNGAM (Shimizu et al.): linear models with non-Gaussian error terms;
- non-linear additive noise models (Hoyer et al.).
While there are specific assumptions behind these results, they are nonetheless remarkable. They entail, for example, that, given the assumptions of the theorems, knowing only the probability distribution on two variables X and Y, we can infer whether X causes Y or Y causes X. As we noted in Section 2, graphs can be modified to represent latent common causes.
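Returning to the LiNGAM idea for a moment, here is a rough sketch of how non-Gaussian noise reveals causal direction. The uniform noise, the coefficient 2.0, and the crude cube-correlation check of independence are all illustrative assumptions, not Shimizu et al.'s actual algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400_000

# Linear model with non-Gaussian (uniform) noise; true direction X -> Y.
x = rng.uniform(-1, 1, size=n)
y = 2.0 * x + rng.uniform(-1, 1, size=n)

def residual_dependence(cause, effect):
    """Regress effect on cause, then crudely score how dependent the
    residual is on the cause via correlation with its cube."""
    slope = np.cov(cause, effect)[0, 1] / np.var(cause)
    resid = effect - slope * cause
    return abs(np.corrcoef(resid, cause ** 3)[0, 1])

forward = residual_dependence(x, y)   # residual is independent of X
backward = residual_dependence(y, x)  # residual remains dependent on Y
print(forward < backward)             # True: the data favor X -> Y
```

In the true direction the regression residual is exactly the noise term, independent of the regressor; in the wrong direction it is a mixture of both noise terms, and with non-Gaussian noise that dependence is detectable. With Gaussian noise both directions would look equally good, which is why the non-Gaussianity assumption matters.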
For example, the acyclic directed mixed graph in Figure 9 represents a latent common cause of X and Z. More generally, we can use an ADMG like Figure 9 to represent that the error variables for X and Z are not probabilistically independent.
However, we would expect X and Z to be correlated, even when we condition on Y, due to the latent common cause.
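This expectation can be checked in a simulation. The model below, a latent variable L confounding X and Z alongside an observed chain X → Y → Z, is a hypothetical stand-in for the structure of Figure 9, with all coefficients assumed:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300_000

# Latent common cause L of X and Z (the bidirected edge X <-> Z),
# alongside an observed chain X -> Y -> Z.
latent = rng.normal(size=n)
x = latent + rng.normal(size=n)
y = x + rng.normal(size=n)
z = y + latent + rng.normal(size=n)

# Conditioning on Y (via linear residuals) does NOT screen X off
# from Z: the latent common cause keeps them correlated.
res_x = x - (np.cov(x, y)[0, 1] / np.var(y)) * y
res_z = z - (np.cov(z, y)[0, 1] / np.var(y)) * y
r = np.corrcoef(res_x, res_z)[0, 1]
print(r > 0.2)   # True: residual correlation persists
```

Without the latent variable, regressing out Y would leave X and Z uncorrelated, as in the plain chain.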
The problem is that the graph is missing a relevant parent of Z, namely the omitted common cause. If we allow that the correct causal graph may be an ADMG, we can still apply MC (d-separation), and ask which graphs imply the same sets of conditional independence relations. The Markov equivalence class will be larger than it was when we did not allow for latent variables. Suppose again that:
- X and Y are dependent, both unconditionally and conditional on Z;
- Y and Z are dependent, both unconditionally and conditional on X;
- X and Z are independent unconditionally, but dependent conditional on Y.
We saw in Section 4 that, when latent variables are excluded, this distribution determines a unique DAG.
Latent variables present a further complication: a causal model with latent variables can entail constraints on the observed distribution that go beyond conditional independence relations. This means that we may be able to rule out some of the ADMGs in the Markov equivalence class using different kinds of probabilistic constraints.
Often, however, we are interested in predicting the value of Y that will result if we intervene to set the value of X equal to some particular value x.
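The contrast between observing a value and setting it can be sketched with a hypothetical confounded model, U a common cause of X and Y; the linear equations and coefficients are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000

# U -> X, U -> Y, and X -> Y, with assumed linear equations.
u = rng.normal(size=n)
x = u + rng.normal(size=n)
y = x + 2.0 * u + rng.normal(size=n)

# Observing X near 1: X also carries information about U, so the
# conditional expectation of Y is inflated (about 2 in this model).
seen = np.abs(x - 1.0) < 0.1
e_y_obs = y[seen].mean()

# Intervening to set X := 1 severs U's influence on X, so the
# post-intervention mean of Y reflects only X's effect (about 1).
y_do = np.ones(n) + 2.0 * u + rng.normal(size=n)
e_y_do = y_do.mean()

print(e_y_obs > 1.5, abs(e_y_do - 1.0) < 0.05)   # True True
```

The observational and interventional predictions come apart precisely because conditioning on X = 1 also shifts our estimate of the common cause U, while the intervention does not.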
What is the difference between observation and intervention?

But proximate cause is still met if a thrown baseball misses the target and knocks a heavy object off a shelf behind them, which causes a blunt-force injury.
This is also known as the "extraordinary in hindsight" rule. The main thrust of direct causation is that there are no intervening causes between an act and the resulting harm. An intervening cause has several requirements. Direct causation is the only theory that addresses only causation and does not take into account the culpability of the original actor.
If the action were repeated, the likelihood of the harm would correspondingly increase. This is also called foreseeable risk.

Harm within the risk

The harm within the risk (HWR) test determines whether the victim was among the class of persons who could foreseeably be harmed, and whether the harm was foreseeable within the class of risks.
It is the strictest test of causation, made famous by Benjamin Cardozo in Palsgraf v. Long Island Railroad Co. For example, a pedestrian, as an expected user of sidewalks, is among the class of people put at risk by driving on a sidewalk, whereas a driver who is distracted by another driver driving on the sidewalk, and consequently crashes into a utility pole, is not.
When it is used, it is used to consider the class of people injured, not the type of harm. The main criticism of this test is that it is preeminently concerned with culpability, rather than actual causation. Two examples will illustrate this principle. The classic example is that of a father who gives his child a loaded gun, which she carelessly drops upon the plaintiff's foot, causing injury.
The plaintiff argues that it is negligent to give a child a loaded gun and that such negligence caused the injury, but this argument fails, for the injury did not result from the risk that made the conduct negligent. The risk that made the conduct negligent was the risk of the child accidentally firing the gun; the harm suffered could just as easily have resulted from handing the child an unloaded gun.
The story is that during the lunch rush, the can explodes, severely injuring the chef who is preparing food in the kitchen.
The chef sues the owner for negligence. The chef may not recover.