R Data Analysis Report: The Impact of US Weather Events on Casualties and Economic Losses (with code and data)

Synopsis

This analysis focuses on two questions: 1) across the United States, which types of events are most harmful to population health; and 2) across the United States, which types of events have the greatest economic consequences? To address them, we analyzed data collected by the US National Weather Service in all US states from 1950 to 2011. The analysis is organized along two main dimensions: time and geography (at the state level). Using these two dimensions as pivots, three measures are aggregated to gauge the impact of events by type: a) casualties; b) property damage; and c) crop damage. The resulting insights may help local officials take preventive measures to reduce the impact of the weather events prevalent in their geographic areas.

Data Processing

# --- Setup --- #
library(dplyr)
library(ggplot2)
library(lubridate)
library(knitr)

# --- Constants Definition --- #
RECENCY_SPAN_IN_YEARS <- 10 # Last X years Top Events by frequency, by geographic area
C_NOT_DEFINED_STR <- "NOT DEFINED"
C_NOT_DEFINED_INT <- -1

# --- Stage 1: Loading source data --- #
setwd("/Users/prosales/Documents/Capacitaciones/Certificaciones/Coursera DS Certificate - Course 5 - Reproducible Research/Final Project/")
natural_events_df <- read.csv("repdata%2Fdata%2FStormData.csv.bz2")
state_geocodes_df <- read.csv("state-geocodes-v2015.csv")

# --- Stage 2: Data preparation: enrichment and restructuring --- #
# The source table of the three pipelines below was lost in extraction;
# state_geocodes_df is assumed because it is the only geocode table loaded.
regions_df <- state_geocodes_df %>%
  filter(division == 0 & state_fips == 0) %>%
  select(region, name)
colnames(regions_df) <- c("region_id", "region_name")

divisions_df <- state_geocodes_df %>%
  filter(division != 0 & state_fips == 0) %>%
  select(division, name)
colnames(divisions_df) <- c("division_id", "division_name")

states_df <- state_geocodes_df %>%
  filter(state_fips != 0) %>%
  select(region, division, state_fips, name)
colnames(states_df) <- c("region_id", "division_id", "state_id", "state_name")

complete_geography_df <- merge(states_df, regions_df, by = "region_id")
complete_geography_df <- merge(complete_geography_df, divisions_df, by = "division_id")
complete_geography_df <- complete_geography_df %>%
  select(region_id, region_name, division_id, division_name, state_id, state_name)

# "STATE__" is assumed for the numeric state code column (the source shows "STATE_").
geography_structured_events_df <- merge(natural_events_df, complete_geography_df,
                                        by.x = "STATE__", by.y = "state_id", all.x = TRUE)

# Convert names to character, fill unmatched geography with placeholders, parse dates.
geography_structured_events_df <- geography_structured_events_df %>%
  mutate(region_name = as.character(region_name)) %>%
  mutate(division_name = as.character(division_name)) %>%
  mutate(state_name = as.character(state_name)) %>%
  mutate(region_name = replace(region_name, is.na(region_name), C_NOT_DEFINED_STR)) %>%
  mutate(division_name = replace(division_name, is.na(division_name), C_NOT_DEFINED_STR)) %>%
  mutate(state_name = replace(state_name, is.na(state_name), C_NOT_DEFINED_STR)) %>%
  mutate(region_id = replace(region_id, is.na(region_id), C_NOT_DEFINED_INT)) %>%
  mutate(division_id = replace(division_id, is.na(division_id), C_NOT_DEFINED_INT)) %>%
  mutate(BGN_DATE = as.Date(BGN_DATE, format = "%m/%d/%Y"))

# --- Stage 3: Top events by frequency, all-time, by geographic area --- #
events_frequency_by_geography_df <- geography_structured_events_df %>%
  count(region_name, state_name, EVTYPE)

# The comparison operator was lost in extraction; "<= 3" (top three) is assumed here and below.
top_events_by_geography_df <- events_frequency_by_geography_df %>%
  group_by(region_name, state_name) %>%
  mutate(my_rank = rank(desc(n))) %>%
  filter(my_rank <= 3)
top_events_by_geography_df <- top_events_by_geography_df[
  with(top_events_by_geography_df, order(region_name, state_name, my_rank)), ]

max_dates_by_geography_df <- geography_structured_events_df %>%
  filter(!is.na(BGN_DATE)) %>%
  filter(is.Date(BGN_DATE)) %>%
  group_by(region_name, state_name) %>%
  summarise(max_date = max(BGN_DATE)) %>%
  mutate(event_date_lower_bound = max_date - years(RECENCY_SPAN_IN_YEARS))

# The source expression and the comparison operator were lost in extraction; a merge of the
# events with the per-state bounds, ">" and the event_month column (needed in Stage 7) are assumed.
last_X_years_events_by_geography_df <- merge(geography_structured_events_df,
                                             max_dates_by_geography_df,
                                             by = c("region_name", "state_name")) %>%
  filter(BGN_DATE > event_date_lower_bound) %>%
  mutate(event_month = month(BGN_DATE)) %>%
  select(region_name, state_name, EVTYPE, BGN_DATE, event_date_lower_bound, event_month)

last_X_years_events_frequency_by_geography_df <- last_X_years_events_by_geography_df %>%
  count(region_name, state_name, EVTYPE)

top_events_in_last_X_years_events_frequency_by_geography_df <- last_X_years_events_frequency_by_geography_df %>%
  group_by(region_name, state_name) %>%
  mutate(my_rank = rank(desc(n))) %>%
  filter(my_rank <= 3)
top_events_in_last_X_years_events_frequency_by_geography_df <- top_events_in_last_X_years_events_frequency_by_geography_df[
  with(top_events_in_last_X_years_events_frequency_by_geography_df, order(region_name, state_name, my_rank)), ]

# --- Stage 4: Deadliest events by geographic area --- #
fatalities_by_event_type_by_geography_df <- geography_structured_events_df %>%
  filter(!is.na(FATALITIES)) %>%
  group_by(region_name, state_name, EVTYPE) %>%
  summarise(sum(FATALITIES))
colnames(fatalities_by_event_type_by_geography_df) <- c("region_name", "state_name", "EVTYPE", "fatalities_sum")

# summarise() drops only the last grouping level, so ranks are computed per region and state.
top_deadliest_events_types_by_geography_df <- fatalities_by_event_type_by_geography_df %>%
  mutate(my_rank = rank(desc(fatalities_sum))) %>%
  filter(my_rank <= 3)
top_deadliest_events_types_by_geography_df <- top_deadliest_events_types_by_geography_df[
  with(top_deadliest_events_types_by_geography_df, order(region_name, state_name, my_rank)), ]

# --- Stage 5: Events types that cause most property losses by geographic area --- #
property_losses_by_event_type_by_geography_df <- geography_structured_events_df %>%
  filter(!is.na(PROPDMG)) %>%
  group_by(region_name, state_name, EVTYPE) %>%
  summarise(sum(PROPDMG))
colnames(property_losses_by_event_type_by_geography_df) <- c("region_name", "state_name", "EVTYPE", "property_losses_sum")

top_property_costly_events_types_by_geography_df <- property_losses_by_event_type_by_geography_df %>%
  mutate(my_rank = rank(desc(property_losses_sum))) %>%
  filter(my_rank <= 3)
top_property_costly_events_types_by_geography_df <- top_property_costly_events_types_by_geography_df[
  with(top_property_costly_events_types_by_geography_df, order(region_name, state_name, my_rank)), ]
top_property_costly_events_types_by_geography_df <- top_property_costly_events_types_by_geography_df %>%
  mutate(property_losses_sum = property_losses_sum / 1000)

# --- Stage 6: Events types that cause most crop losses by geographic area --- #
crop_losses_by_event_type_by_geography_df <- geography_structured_events_df %>%
  filter(!is.na(CROPDMG)) %>%
  group_by(region_name, state_name, EVTYPE) %>%
  summarise(sum(CROPDMG))
colnames(crop_losses_by_event_type_by_geography_df) <- c("region_name", "state_name", "EVTYPE", "crop_losses_sum")

top_crop_costly_events_types_by_geography_df <- crop_losses_by_event_type_by_geography_df %>%
  mutate(my_rank = rank(desc(crop_losses_sum))) %>%
  filter(my_rank <= 3)
top_crop_costly_events_types_by_geography_df <- top_crop_costly_events_types_by_geography_df[
  with(top_crop_costly_events_types_by_geography_df, order(region_name, state_name, my_rank)), ]
top_crop_costly_events_types_by_geography_df <- top_crop_costly_events_types_by_geography_df %>%
  mutate(crop_losses_sum = crop_losses_sum / 1000)

# --- Stage 7: Events occurrence by geographic area by month, during the last X years recorded --- #
last_X_years_events_frequency_by_month_by_geography_df <- last_X_years_events_by_geography_df %>%
  count(region_name, state_name, EVTYPE, event_month)

top_events_in_last_X_years_by_month_by_geography_df <- merge(
  last_X_years_events_frequency_by_month_by_geography_df,
  top_events_in_last_X_years_events_frequency_by_geography_df,
  by.x = c("state_name", "EVTYPE"), by.y = c("state_name", "EVTYPE"))
top_events_in_last_X_years_by_month_by_geography_df <- top_events_in_last_X_years_by_month_by_geography_df %>%
  select(region_name.y, state_name, EVTYPE, event_month, n.x)
colnames(top_events_in_last_X_years_by_month_by_geography_df) <- c("region_name", "state_name", "EVTYPE", "event_month", "n")

# --- Stage 8: Deadliest events by geographic area by month --- #
fatalities_by_geography_by_event_type_by_month_df <- geography_structured_events_df %>%
  mutate(event_month = month(BGN_DATE)) %>%
  filter(!is.na(FATALITIES)) %>%
  group_by(region_name, state_name, EVTYPE, event_month) %>%
  summarise(sum(FATALITIES))
colnames(fatalities_by_geography_by_event_type_by_month_df) <- c("region_name", "state_name", "EVTYPE", "event_month", "fatalities_sum")

top_fatalities_by_geography_by_event_type_by_month_df <- merge(
  fatalities_by_geography_by_event_type_by_month_df,
  top_deadliest_events_types_by_geography_df,
  by.x = c("state_name", "EVTYPE"), by.y = c("state_name", "EVTYPE"))
top_fatalities_by_geography_by_event_type_by_month_df <- top_fatalities_by_geography_by_event_type_by_month_df %>%
  select(region_name.x, state_name, EVTYPE, event_month, fatalities_sum.x)
colnames(top_fatalities_by_geography_by_event_type_by_month_df) <- c("region_name", "state_name", "EVTYPE", "event_month", "fatalities_sum")

# --- Stage 9: Events types that cause most PROPERTY losses by geographic area, by month --- #
property_losses_by_geography_by_event_type_by_month_df <- geography_structured_events_df %>%
  mutate(event_month = month(BGN_DATE)) %>%
  filter(!is.na(PROPDMG)) %>%
  group_by(region_name, state_name, EVTYPE, event_month) %>%
  summarise(sum(PROPDMG))
# The source text ends partway through the next statement:
# colnames(property_losses_by_geography_by_event_type_by_month_df) <- c("region_name
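Stages 3 to 6 all share one pattern: aggregate per state and event type, rank within each state, then keep the top three. The sketch below reproduces that pattern on a small invented table (`toy_events_df` and its values are hypothetical and not part of the storm data) so the tie behaviour of `rank()` is easy to see.

```r
library(dplyr)

# Hypothetical toy data standing in for the storm events table.
toy_events_df <- data.frame(
  state_name = c(rep("Alpha", 7), rep("Beta", 5)),
  EVTYPE = c("HAIL", "HAIL", "HAIL", "FLOOD", "FLOOD", "TORNADO", "HEAT",
             "WIND", "WIND", "FLOOD", "HAIL", "HEAT"),
  stringsAsFactors = FALSE
)

# Count events per state and type, rank counts within each state
# (rank 1 = most frequent), keep ranks up to 3, and sort the result.
top_by_state <- toy_events_df %>%
  count(state_name, EVTYPE) %>%
  group_by(state_name) %>%
  mutate(my_rank = rank(desc(n))) %>%
  filter(my_rank <= 3) %>%
  ungroup() %>%
  arrange(state_name, my_rank)

print(top_by_state)
```

Because `rank()` averages tied positions, a state can end up with fewer or more than three rows: here the two event types tied for third place in "Alpha" both receive rank 3.5 and are dropped, while the three types tied in "Beta" all receive rank 3 and are kept. If exactly-at-most-three semantics with ties included is preferred, `dplyr::min_rank()` is one alternative worth considering.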

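The recency window used for the "last X years" stages can be sketched in isolation as follows. This is an illustration on invented data (`toy_events_df` is hypothetical), and it swaps two details relative to the report's code: it uses `inner_join()` to attach the per-state bound, and lubridate's `%m-%` operator instead of plain `-`, on the assumption that a rollback to the last valid day is preferable when the latest date falls on 29 February (plain subtraction of a `years()` period is understood to yield `NA` in that case).

```r
library(dplyr)
library(lubridate)

RECENCY_SPAN_IN_YEARS <- 10

# Hypothetical toy data: a few dated events in two made-up states.
toy_events_df <- data.frame(
  state_name = c("Alpha", "Alpha", "Alpha", "Beta", "Beta"),
  EVTYPE = c("HAIL", "FLOOD", "HAIL", "WIND", "WIND"),
  BGN_DATE = as.Date(c("1995-06-01", "2005-07-15", "2011-03-10",
                       "1980-01-20", "1992-05-05")),
  stringsAsFactors = FALSE
)

# Latest recorded date per state, and the start of that state's recency window.
bounds_df <- toy_events_df %>%
  group_by(state_name) %>%
  summarise(max_date = max(BGN_DATE)) %>%
  mutate(event_date_lower_bound = max_date %m-% years(RECENCY_SPAN_IN_YEARS))

# Keep only events that fall inside each state's own window.
recent_events_df <- toy_events_df %>%
  inner_join(bounds_df, by = "state_name") %>%
  filter(BGN_DATE > event_date_lower_bound) %>%
  select(state_name, EVTYPE, BGN_DATE, event_date_lower_bound)

print(recent_events_df)
```

Note that the window is relative to each state's own latest event, not to a single global cutoff, so states whose records stop early are still compared over their own final ten years.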