Responses to severe weather warnings and affective decision-making

When public agencies provide information to help people make better decisions, they often face a choice between economy and completeness. For weather services warning people of high-impact weather events, this choice is between offering standard warnings (SWs) that describe only the weather event itself, such as wind speed, or also describing the likely impacts (so-called impact-based warnings, IBWs). Previous studies have shown IBWs to lead to a greater behavioral response. These studies, however, have relied on surveys describing hypothetical weather events; given that participants did not feel threatened, they may have been more likely to process the warning slowly and analytically, which could bias the results towards finding a greater response to the IBWs. In this study, we conducted a field experiment involving actual and potentially threatening weather events, for which the time interval between the warning and the forecasted event varied and for which we randomly assigned participants to receive SWs or IBWs. We observe that shorter time intervals led to a greater behavioral response, suggesting that fear of an imminent threat is an important factor motivating behavior. We observe that IBWs did not lead to greater rates of behavioral change than SWs, suggesting that when fear is a driving factor, the additional information in IBWs may be of little importance. We note that our findings are highly contextualized, but we call into question the prevailing belief that IBWs are necessarily more helpful than SWs.


Introduction
To the extent that people make decisions based on information, it would seem right that the more information they receive about a situation demanding potential action, and the earlier they receive it, the better they can adjust their behavior. However, there is evidence that people often make decisions based on their emotional response to information (Slovic et al., 2004, 2007). In such cases, more information is not necessarily better. Moreover, which decision-making pathway people utilize may depend on the context. However, the two pathways are not necessarily mutually exclusive, and decisions could involve the interaction of information-based reasoning and emotions (Kahneman, 2011). Here, we investigate the effectiveness of different kinds of information, as well as its timing, used to warn people about impending high-impact weather events. Our primary focus is on the difference between standard warnings (SWs), which describe the weather event itself, and impact-based warnings (IBWs), which, in addition, describe the impacts that result from the weather.
Research in social sciences has broadly accepted two ideas about human nature. The analytical or cognitive idea suggests that people make rational decisions based on formal logic, risk assessment and statistical probabilities, for instance on the impacts and likelihood of a hazard (Loewenstein et al., 2001; Slovic et al., 2004). This system is rather slow as it requires mental work which is effortful and orderly (Kahneman, 2011; Slovic et al., 2004). Affective decision-making relates to the importance of emotions and feelings in making decisions (Slovic et al., 2004). It operates automatically and fast, with neither effort nor sense of voluntary control, although it is often influenced by beliefs or mental models about how the world works (Morgan et al., 2002; Slovic et al., 2004).
Research that has investigated whether feelings, information-based action or both influence people's behavior related to risks has primarily relied on laboratory studies. For example, scholars have used different messages to manipulate affect by increasing or decreasing the perceived benefits and risks of different technologies (Finucane et al., 2000). In two experiments, these researchers demonstrated that affect influenced judgments directly and was not simply a response to a prior deliberate evaluation. In only two studies, which we describe below, have researchers evaluated behavior under varying conditions of actual fear, something which cannot be simulated in a laboratory.
A real-world situation in which the emotional decision-making pathway could dominate is the response to warnings of potentially life-threatening weather events, such as tornadoes or severe storms. Research that is based on information-based decision-making has suggested that message content and style are important factors in determining whether people engage in self-protective behavioral responses to the extent that rational analysis would deem appropriate (Mileti and Sorensen, 1990). In order to be effective at inducing such behavior, a message should contain five information elements (hazard, location, time, guidance and source), which should each be addressed by five stylistic dimensions (specificity, consistency, accuracy, certainty and clarity) (Mileti and Sorensen, 1990). A warning with these characteristics is easy to understand, to believe and to personalize for the recipient, which are identified as prerequisites for triggering behavioral change (Mileti and Peek, 2000). Thus, an IBW which provides more specific and clear information on the impacts of the hazard should help people to better understand the message compared to an SW. IBWs should also increase the personalization of risk and make people feel more concerned for their safety, resulting in stronger behavioral responses compared to SWs. For example, some people have difficulties in translating a "heavy" rainfall warning, indicating 100 mm of rain, into actual impacts. In this case, communicating specific impacts, for instance, on road and rail transport and possibilities of delays, ought to improve warning effectiveness. Interviews with forecasters, emergency managers and broadcast meteorologists (Harrison et al., 2014; Losego et al., 2013), as well as with officials from the public and private sectors (Weyrich et al., 2018), all reveal a widespread belief within the expert community that providing impact information creates an added value in the specific case of high-impact weather warnings.
Recent studies offer empirical support for this belief, although the results are somewhat mixed (Kox et al., 2018). For example, scholars showed that IBWs, compared to SWs, positively influenced the recipient's sense of threat and concern associated with a hypothetical event, as well as their understanding of the potential impacts (Morss et al., 2018; Potter et al., 2018; Weyrich et al., 2018). More importantly, the IBW of the hypothetical event resulted in a greater likelihood of people planning to take self-protective action should such an event occur (Casteel, 2016; Morss et al., 2018; Weyrich et al., 2018). There have also been contradictory findings. One study detected no effect of IBWs on perception of warning credibility or on intended behavioral response (Perreault et al., 2014), while another study identified a threshold beyond which increasing the projected impact of a storm no longer significantly increased the probability of taking protective action (Ripberger et al., 2014). All of these empirical studies, however, share a common research design: they used hypothetical scenarios and relied on people's anticipated and intended reactions to study the effects of IBWs. For example, in one study of tornado warnings, the effectiveness of IBWs was examined with respondents being in the hypothetical role of a factory operator having to decide whether to order workers to take shelter in response to SWs and IBWs (Casteel, 2016). In another study, participants had to imagine that they were hiking in the Swiss mountains when they received a thunderstorm warning, and they then had to decide upon several intended actions; those receiving an IBW were more likely to modify their plans than those receiving an SW (Weyrich et al., 2018).
If indeed it is feelings that dominate behavioral decision-making in real-life situations, then it may be that these studies on the effectiveness of IBWs are poor predictors of actual behavior, as it is unlikely that the respondents experienced real feelings of fear since they were not actually at risk. Two studies that have looked at actual self-protective behavior during a crisis suggest this to be the case. Researchers in Indonesia investigated evacuation behavior and intentions during tsunamis and observed that feelings, not rational evaluation, drive decision-making (McCaughey et al., 2020). Their findings suggest that under an imminent threat to life, information-based action may be absent or far less influential than feelings. Scholars from the Netherlands analyzed the behavioral effects of mobile fire warning messages (Gutteling et al., 2017). They found that emotions and the social environment were the main predictors for adaptive behavior. Even though perceived message quality was significant, other factors, such as perceived threat, were insignificant. These results confirm the importance of affective reactions as a driver for behavior.
If affective decision-making is the dominant pathway in real-world crises, then SWs may provide all the information that is needed to trigger the feelings of fear, while IBWs add no additional trigger. We speculate that hazard severity and warning lead time could also influence the response to weather warnings in different ways depending on the model of decision-making. If information-based action dominates, then more severe events and greater lead times should generate a greater behavioral response; longer lead times would translate into greater ease of preparation and actually taking self-protective behavioral responses. If affective decision-making dominates, however, more severe events and shorter lead times should increase response since the fear would be heightened at the time of the information reception.
There have been two studies examining the effect of warning lead time and one related to event magnitude. One of these examined tornado responses and showed that an increase in lead time of up to about 15 min reduces fatalities, while lead times longer than 15 min increase fatalities compared to no warning (Simmons and Sutter, 2008). The second study showed more generally that people have lead time preferences that do not always match with what the warning system offers and that they engage in different protective behavior depending on the lead time (Hoekstra et al., 2011). The one study examining event magnitude using a hypothetical survey design showed that the greater the severity, the more likely people were to take protective action (Kox and Thieken, 2017). Perceived severity of the hazard is also used in many decision-making theories. For instance, in protection motivation theory, it is one of four core perceptions that form the basis for decisions about how to respond to a threat (Maddux and Rogers, 1983).
In this paper, we report on results from a randomized controlled trial in which we disseminated wind warnings through an existing smartphone application of a Swiss weather provider (Wetter-Alarm) and collected real-time data on people's responses. The information that people received varied randomly in terms of being an SW or IBW and, given that there were a number of events for which the warnings were issued, in terms of both the warning lead time and the events' anticipated severity.

Materials and methods
The method used here was a large field experiment conducted in Switzerland which tested for effects of warning type, severity level and lead time on warning response. SWs and IBWs for wind were disseminated to users via the smartphone weather application (app) Wetter-Alarm. The application resulted from cooperation between the GVB (House Insurance Bern) Services AG (the joint-stock company responsible for the app) and SRF (Swiss Radio and Television) Meteo, which provides the weather warnings (for frost, thunderstorm, slipperiness, rain, snow and wind, among others). The users could receive warnings for three severity levels: moderate (slight risk of damage), severe (increased risk of damage) and very severe (great risk of damage or even risk of death). The standard warnings disseminated in the Wetter-Alarm app included information about the type of hazard, its severity, the timing and location, as well as some general behavioral recommendations (e.g., secure loose items or avoid forests). Figure 1 shows a standard wind warning of medium severity. In Table 1, we list the general behavioral recommendations that were provided in both standard and impact-based warnings. It is important to note that most European Meteorological Services do not include generic behavioral recommendations in their standard warning (Kaltenberger et al., 2020). The impact-based warning included the same information as the SW, with additional information on the impacts of the weather, which is shown in Table 2. We developed these messages based on publicly available information on impacts of wind in Switzerland and in close collaboration with the staff of Wetter-Alarm. A link was provided at the end of the warning message which directed participants to a short survey. The survey was available from the moment when the warning message was disseminated until the end of the event.
We focused on severe wind due to its frequency, the time of the year (winter season) and the possibility to investigate different lead times. We collected data for two wind severity levels: moderate and severe. As this research involves human participants, appropriate ethical procedures were followed, which were approved by the ethics commission of ETH Zurich. Participants took part voluntarily once they had been informed about the research project and signed a declaration of consent. They received no incentive to complete the survey.
A total of 3223 participants completed the online survey from 1 December 2018 to 10 February 2019. We excluded 611 people from the analysis as they are believed to have responded to a warning message with a different severity level than was actually the case. This can be explained by the fact that the warning message they received initially was updated in the meantime (e.g., from a moderate to a severe level) or that the participants received multiple warning messages for different locations and got confused. Thus, to avoid any possible misinterpretation of data, the analysis was conducted with data from 2615 participants that indicated the correct severity level. As respondents were randomly assigned to either an SW or IBW, the subgroups are roughly even (1364 and 1247). However, more people responded to severe warning messages (n = 1667) than to moderate messages (n = 948). No very severe wind was observed. Warning lead times also differed, and people were grouped into three groups depending on when they looked at the warning message (i.e., participated in the survey): during the wind event itself (35.6 %, n = 932), in the 6 h preceding the wind event (17.1 %, n = 448) and prior to 6 h (47.2 %, n = 1235). On average, people responded to the survey 5.14 h in advance of the wind event.
Table 1. General behavioral recommendations provided in both standard and impact-based warnings: do not make fire; avoid wind-exposed areas; be aware of falling objects; close windows; secure loose items; follow instructions of emergency services; drive slowly; avoid forests; seek protection in buildings.
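The grouping of responses into the three lead-time categories described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function name, the input (hours between the survey response and the onset of the wind event) and the handling of the exact boundaries at 0 h and 6 h are assumptions.

```python
def lead_time_group(hours_before_event: float) -> str:
    """Assign a response to one of the three lead-time groups.

    hours_before_event is a hypothetical input: event onset minus response
    time, in hours. Zero or negative values mean the event had already begun.
    The boundary handling (0 h -> "none", 6 h -> "short") is an assumption.
    """
    if hours_before_event <= 0:
        return "none"   # warning considered during the wind event itself
    if hours_before_event <= 6:
        return "short"  # within the 6 h preceding the event
    return "long"       # more than 6 h before the event


# Usage with invented values
for hours in (-1.5, 3.0, 12.0):
    print(hours, lead_time_group(hours))
```

Coding the groups as ordered categories (none, short, long) matches how lead time enters the later analyses as a three-level factor.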
Information about the basic sociodemographic characteristics of the sample is provided in Table 3. The sample matches the profile of the general Wetter-Alarm app user who is older (48.8 years) than the Swiss average (43.14 years), more often male (63.1 %) than female (Swiss average 49.5 % vs. 50.5 %) (FSO, 2017b) and slightly more educated than the Swiss population (FSO, 2017a). As the survey was conducted online based on active users of the app Wetter-Alarm, it did not reach people who did not download the application, who do not actively use the app, or who do not have internet access. People could only participate once in the survey, which was guaranteed by posing the question of whether they had already participated in a Wetter-Alarm survey recently.
In the survey, we asked questions on warning perception and subjective understanding. Perceptions that we measured using a five-point Likert scale from "totally disagree" to "totally agree" were credibility and concern. We measured three types of understanding: the warning, the threats to safety and how to respond. Then, we asked participants whether the weather described in the warning would pose a risk to them and whether it would affect them in carrying out their usual activities (e.g., commuting, working, shopping, etc.). If they answered yes, they continued with the survey. The following three questions were used to build the variable behavioral response. First, participants had to indicate whether they responded to the warning. If they answered "yes", they had to indicate whether they adapted but continued with their activities or whether they canceled their activities (or took other protective measures). If they answered "no", participants had to indicate whether they would not change their behavior or still planned to do so, i.e., adapting activities or canceling activities. Thus, we computed the variable behavioral response on a five-point scale (1 = no action planned, 2 = plan to adapt, 3 = plan to protect, 4 = did adapt, 5 = did protect). We used this scale from no response to strongest risk-minimizing behavior as we believe that it captures more variance than only the binary question on whether people responded to the warning or not. Similar to other research (Gutteling et al., 2017), we used a number of questions to ask what kind of feelings the warning triggered: relaxed, anxious, concerned, reassured and angry (five-point Likert scale from "not at all" to "very much"). These questions were used in other studies that investigated behavioral responses to emergencies (Gutteling et al., 2017; Kievik et al., 2012; Kievik and Gutteling, 2011) and thus seemed to be an appropriate measure also in this study's context.
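The construction of the five-point behavioral response variable from the survey answers can be illustrated with a short sketch. The answer labels ("none", "adapt", "protect") and the function name are hypothetical; only the mapping to the values 1 to 5 follows the scale described above.

```python
def behavioral_response(responded: bool, action: str) -> int:
    """Map survey answers to the five-point behavioral response scale.

    responded: answer to "did you respond to the warning?"
    action: hypothetical label for the follow-up answer:
        "none"    = no change planned (only valid if responded is False)
        "adapt"   = adapt activities but continue them
        "protect" = cancel activities or take other protective measures
    """
    scores = {
        (False, "none"): 1,     # no action planned
        (False, "adapt"): 2,    # plan to adapt
        (False, "protect"): 3,  # plan to protect
        (True, "adapt"): 4,     # did adapt
        (True, "protect"): 5,   # did protect
    }
    try:
        return scores[(responded, action)]
    except KeyError:
        raise ValueError(f"invalid answer combination: {responded!r}, {action!r}") from None


# Usage with invented answers
print(behavioral_response(True, "adapt"))     # 4
print(behavioral_response(False, "protect"))  # 3
```

Treating the result as an ordered scale assumes that completed actions always rank above planned ones, which is the ordering the scale above implies.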
The items "relaxed" and "reassured" were inverted, and the scale yielded good internal consistency (Cronbach's alpha = 0.68, N = 5). We also gathered data on whether people consulted further information (the variable information behavior).
For the data analysis, we used standard statistical software (IBM SPSS 25) to conduct a factorial analysis of variance (ANOVA) to study the effects of warning type, severity level and lead time on behavioral response. In addition, we conducted a multiple regression analysis to investigate the effects of other covariables (e.g., warning perception and understanding) on behavior.
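The internal-consistency check on the feelings scale can be sketched as follows, assuming item scores from 1 to 5 and the standard formula for Cronbach's alpha. This is an illustration in Python rather than the SPSS procedure used in the study, and the example responses are invented.

```python
from statistics import variance

ITEMS = ("relaxed", "anxious", "concerned", "reassured", "angry")
INVERTED = {"relaxed", "reassured"}  # items that are reverse-coded


def reverse_code(score, low=1, high=5):
    """Invert a score on a low..high scale (1 <-> 5, 2 <-> 4, ...)."""
    return low + high - score


def prepare(row):
    """Return the item scores of one respondent with inverted items recoded."""
    return [reverse_code(row[i]) if i in INVERTED else row[i] for i in ITEMS]


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents (each a list of k item scores)."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]  # variance of each item
    total_var = variance([sum(r) for r in rows])       # variance of the sum score
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented responses, for illustration only
responses = [
    {"relaxed": 4, "anxious": 2, "concerned": 2, "reassured": 4, "angry": 1},
    {"relaxed": 2, "anxious": 4, "concerned": 4, "reassured": 2, "angry": 3},
    {"relaxed": 1, "anxious": 5, "concerned": 4, "reassured": 1, "angry": 4},
    {"relaxed": 5, "anxious": 1, "concerned": 1, "reassured": 5, "angry": 1},
]
print(cronbach_alpha([prepare(r) for r in responses]))
```

Reverse-coding before computing alpha matters: without it, the two positively worded items would correlate negatively with the others and depress the coefficient.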

Results
We first describe the effects of warning type, lead time and event magnitude on participants' perception and subjective understanding. We summarize the mean values in the appendix. Compared to SWs, IBWs were perceived neither as more credible nor as better understood in terms of the warning, the threats to safety and how to respond. People were only slightly more concerned for their safety when receiving an IBW. Participants' perception and understanding did not change with different lead times. However, people indicated higher perceived concern levels for severe compared to moderate warnings. Not surprisingly, people reported stronger feelings with decreasing lead times and increasing severity levels.
To analyze the effects of warning type, severity level and lead time on behavior, we focus on those people who indicated the warning to be relevant and analyze their behavior. A total of 54 % of people (n = 1426) reported that the warning message affected their personal safety, impacted their daily routine or both. The majority of those people had already changed their behavior, either by adapting their activities (35.2 %) or by canceling them (25.7 %). Fewer people indicated that they still planned to adapt (22.7 %) or to cancel (6.9 %) their activities. Nine percent of people reported not changing their behavior even though the message was found to be relevant. We conducted a factorial ANOVA, 2 (warning type) × 3 (lead time) × 2 (warning severity level), predicting behavior, which showed no effect of warning type (p = 0.963) but effects of lead time (F(1, 1410) = 11.00, p < 0.001, partial η² = 0.02) and of severity level (F(1, 1410) = 12.21, p < 0.001, partial η² = 0.01). The Bonferroni post hoc test revealed that behavioral response was significantly lower for long lead times compared to short (p = 0.007) or no lead times (p < 0.001). None of the interaction effects between the three variables (type, severity and time) on behavior were significant (p values between 0.360 and 0.546). Figure 2 underlines that IBWs did not result in a greater behavioral response compared to SWs. However, as Fig. 3 highlights, lead time and warning severity significantly influenced people's decisions to change behavior; decreasing lead times and increasing severity level resulted in a greater response. We also observe that the differences in behavioral response between moderate and severe warnings are quite small for long lead times. This difference becomes larger for shorter lead times. However, the interaction is not significant (p = 0.360). In the next set of relationships, we examined what additional factors influence behavioral response.
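The effect sizes reported with the ANOVA can be related to the F statistics through the standard identity for partial eta squared, η²p = (F × df_effect) / (F × df_effect + df_error). The sketch below applies this identity to the reported severity effect; the function name is an assumption, and the calculation is an illustration rather than output from the SPSS analysis.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Severity effect as reported above: F(1, 1410) = 12.21
eta_severity = partial_eta_squared(12.21, 1, 1410)
print(round(eta_severity, 2))  # 0.01
```

For the severity effect this gives a value of about 0.009, which rounds to the reported 0.01 and indicates a small effect despite the very low p value.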
Specifically, we analyze the relationship between feelings, warning perception/understanding and behavioral action. Table 4 shows that, irrespective of the warning type received, feelings (a unit increase in feelings leads to a 0.25 unit increase in changing behavior), perception of credibility (β = 0.134) and concern (β = 0.098), as well as understanding the threats (β = 0.193) and how to respond to the message (β = 0.154), significantly influence taking protective action. Moreover, age (β = 0.081) and information behavior (β = 0.100) showed significant positive effects. Thus, the more people felt in danger, the better they perceived or understood the message, the older they were and the more they looked for information, the more likely they were to undertake strong risk-minimizing behavior. The linear regression analysis again confirms the importance of lead time (p < 0.001) and warning severity (p < 0.01) for the behavior variable. With decreasing lead times, people are more likely to take protective action (a unit increase in lead time predicted a 0.154 unit decrease in changing behavior). For severe warnings, people were also more likely to change their behavior (by 0.073 unit) compared to people who received moderate severity warnings.

Discussion
This research investigates the effectiveness of impact information, as well as its timing, used in warning people about an imminent threat. Our results show that while IBWs result in no greater behavioral response, decreasing lead times and stronger severity level do increase responses. Taken together, these results suggest that affective decision-making appears to be the dominant mode of decision-making in real-world situations.
IBWs do not significantly impact warning perception and subjective understanding nor do they result in a greater behavioral response compared to SWs. This result contradicts the majority of previous studies that used hypothetical situations to collect their data (Casteel, 2016; Morss et al., 2018; Weyrich et al., 2018). We speculate that this difference in research findings can be explained by the different levels of fear experienced in a hypothetical and a real crisis. Unlike in an imagined situation when information-based action is the dominant factor, our findings suggest that in a crisis situation, real feelings of fear arise and dominate decision-making. We assume that SWs provide all the information that is needed to trigger the feeling of fear. Indeed, IBWs may leave less to the imagination of the recipient, which could, in some cases, dampen the fear response.
Table 4.
Variable                                           B        SE      β
Feelings                                           0.250    0.062   0.122***
Information behavior¹ (no = 0; yes = 1)            0.690    0.179   0.100***
Lead time³ (none = 0; short = 1; long = 2)         −0.224   0.038   −0.154***
Warning severity level¹ (moderate = 0; severe = 1) 0.212    0.077   0.072**
* p < 0.05. ** p < 0.01. *** p < 0.001. Significant results are in bold. B indicates the unstandardized coefficients, SE the standard error and β the standardized coefficients. Note that factors marked with a 1 are binary, factors marked with a 2 were measured on a 1-5 scale and factors marked with a 3 on a 1-3 scale.
Figure 2. Mean self-reported behavior in response to two warning types (standard and impact-based) and the three lead times (no, short and long) for the two severity levels (moderate and severe). Behavioral response was measured on a five-point scale from no response to strongest risk-minimizing behavior. For lead times, "no" indicates that respondents considered the warning during the event, "short" refers to 0-6 h prior to the event, and "long" refers to more than 6 h. Error bars indicate ±1 standard error (N = 1426).
Figure 3. Mean self-reported behavior for all three lead times (no, short and long) and two severity levels (moderate and severe). Behavioral response was measured on a five-point scale from no response to strongest risk-minimizing behavior. For lead times, "no" indicates that respondents considered the warning during the event, "short" refers to 0-6 h prior to the event, and "long" refers to more than 6 h. Error bars indicate ±1 standard error (N = 1426).
Our results on the effects of lead time and hazard severity are also consistent with affective reactions. We observe lead time and self-protective behavior to be inversely related: increasing lead times decrease the likelihood of engaging in a greater behavioral response, and the greatest response occurs when the event is already under way. These results complement other research on different lead times for tornado warnings (Hoekstra et al., 2011). We also show that stronger events generate a greater response than weaker events, which is in line with previous research (Kox and Thieken, 2017). Moreover, we observe that longer lead times do not generate a greater additional response to stronger rather than weaker events. This interaction (even though not significant) is in line with an affective reaction: with long lead times, the additional fear associated with the stronger event may dissipate, meaning that the stronger events would generate little more of a response than weaker events.
These findings support scholars who reached a similar conclusion when investigating evacuation behavior following a strong earthquake (McCaughey et al., 2020). Nonetheless, cognitive factors, such as warning perception and understanding, can also influence decision-making. In our study, four of these information-based attributes correlate with changing behavior and thus seem to be obvious prerequisites for behavior (Gutteling et al., 2017). Indeed, the two decision-making pathways should not be seen as independent systems; they can interact and influence each other, as the rational process can modify, to some extent, the way we make intuitive and affective decisions by changing the normally automatic functions of attention and memory (Kahneman, 2011). The research also shows that the two systems are not always mutually exclusive; for instance, when people are asked to judge risk, they first consider how they feel about the risk and then collect further information, usually to support their feelings (Slovic et al., 2004). Therefore, further empirical studies of real-world crises are needed to understand if and how feelings and information-based action interact to influence people's behavior in response to risks.

Conclusions
We conclude that practitioners cannot assume that additional impact-based information necessarily results in a greater behavioral response in real-world crises. Appropriate lead times and communication that addresses the decision-makers' feelings (e.g., by relying on images) may be more beneficial and result in a stronger behavioral response. Ultimately, the results show that people may respond differently in the field than in a scenario-based experiment, relying on more affective decision-making in the former and more rational decision-making in the latter. This has serious implications for future research, emphasizing that we should examine responses to risks using research designs that capture realistic conditions and be cautious in interpreting results from hypothetical research designs, as these could be a poor predictor of actual behavior.
The research has some limitations. One shortcoming of this study is the absence of a very severe wind event in the winter season 2018/19 in Switzerland; additional data should be collected for such events. Indeed, most of the research on IBWs used hypothetical warning messages of the most severe category as people are least familiar with these messages, and, thus, the added information could help them in decision-making. Consequently, the difference between our results on the effectiveness of IBWs and those of previous studies could also be due, to some extent, to the differences in event severity levels. Moreover, participants were self-selected as they had downloaded the weather app and decided whether or not to participate in the survey. This may indicate higher levels of weather awareness and knowledge, which could be another explanation for the lack of effect of warning type. There is a dearth of literature on the effects of such self-selection in social science research, though ideally researchers would design field experiments in which self-selection is not present. Another limitation is that, even though we collected data on actual behavior in response to real-life warnings, these data were still self-reported. Finally, additional research could analyze whether these results are also valid for other natural hazards, as well as for different time periods of the year.
Also, we should be cautious in generalizing the results as these are somewhat context dependent. The provision of rather little additional information in the warning message might be another reason why, in the field experiment, IBWs did not result in a greater behavioral response compared to SWs. SWs without behavioral recommendations and IBWs with stronger language and richer impact descriptions could have resulted in different findings.
Appendix A
Table A1. Descriptive statistics. M is mean, and SE is standard error. Variables were measured on a five-point scale from 1 (totally disagree) to 5 (totally agree).
Data availability. In the research design that we originally submitted to our ethics commission (equivalent to an institutional review board), we had stated that all data would be deleted from ETH computers after the end of the project, but they would be stored on servers at our partner (Wetter-Alarm) and would be potentially used to improve the design of their mobile application. Thus, interested researchers should contact us, and we should be able to work with Wetter-Alarm to provide the data requested.