Python Charts with pygal: A Beginner's Guide


The examples come from the book Python Crash Course by Eric Matthes (Chinese edition: 《Python编程：从入门到实践》).

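pygal is an SVG charting library. SVG stands for Scalable Vector Graphics, a vector image format. When an SVG file is opened in a browser, the chart is easy to interact with.

All of the code below was run in Jupyter Notebook.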

Simulating dice rolls

Let's start with a simple example that simulates rolling a die.

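import random

class Die:
    '''
    A single die with a configurable number of sides.
    '''
    def __init__(self, num_sides=6):
        self.num_sides = num_sides

    def roll(self):
        # Return a random integer from 1 to num_sides, inclusive
        return random.randint(1, self.num_sides)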

Simulating the rolls and visualizing the results

import pygal

die = Die()
result_list = []
# Roll the die 1000 times
for roll_num in range(1000):
    result = die.roll()
    result_list.append(result)

frequencies = []
# For each face from 1 to 6, count how many times it came up
for value in range(1, die.num_sides + 1):
    frequency = result_list.count(value)
    frequencies.append(frequency)

# Bar chart
hist = pygal.Bar()
hist.title = 'Results of rolling one D6 1000 times'
# Labels along the x axis
hist.x_labels = [1, 2, 3, 4, 5, 6]
# Titles for the x and y axes
hist.x_title = 'Result'
hist.y_title = 'Frequency of Result'
# Add the data; the first argument is the name of the data series
hist.add('D6', frequencies)
# Save to a local file; the format must be SVG
hist.render_to_file('die_visual.svg')
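The counting loop above calls list.count once per face, which rescans the whole result list each time. As a minimal sketch of an alternative, the tally can be done in one pass with collections.Counter from the standard library. The fixed seed and the variable names here are assumptions made for this illustration, not part of the original example, and no chart is drawn.

```python
import random
from collections import Counter

random.seed(42)  # fixed seed so the run is repeatable (an assumption for this sketch)

num_sides = 6
num_rolls = 1000

# Roll the die and tally every face in a single pass.
rolls = [random.randint(1, num_sides) for _ in range(num_rolls)]
counts = Counter(rolls)

# Build the list in face order; a face that never came up counts as 0.
frequencies = [counts[face] for face in range(1, num_sides + 1)]

expected = num_rolls / num_sides
for face, frequency in enumerate(frequencies, start=1):
    print(f'{face}: {frequency} (expected about {expected:.1f})')
```

The resulting frequencies list has the same shape as the one passed to hist.add, so it could be used in its place.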

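Open this file in a browser and point the mouse at a bar: a tooltip shows the series title "D6" together with the x-axis and y-axis values.

The six faces come up with roughly the same frequency. In theory each has probability 1/6, and the more rolls you simulate, the more clearly this shows.

Rolling two dice at once

Only a small change to the code is needed: instantiate a second die.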

# Die and pygal are already defined/imported in the cells above
die_1 = Die()
die_2 = Die()

result_list = []
for roll_num in range(5000):
    # Sum of the two dice
    result = die_1.roll() + die_2.roll()
    result_list.append(result)

frequencies = []
# The largest sum that can be rolled
max_result = die_1.num_sides + die_2.num_sides
for value in range(2, max_result + 1):
    frequency = result_list.count(value)
    frequencies.append(frequency)

# Visualize the results
hist = pygal.Bar()
hist.title = 'Results of rolling two D6 dice 5000 times'
hist.x_labels = [x for x in range(2, max_result + 1)]
hist.x_title = 'Result'
hist.y_title = 'Frequency of Result'
# Add the data
hist.add('two D6', frequencies)
# The format must be SVG
hist.render_to_file('2_die_visual.svg')

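The chart shows that a sum of 7 comes up most often, and a sum of 2 is among the rarest (in theory 12 is just as unlikely).

Only one combination gives 2, namely (1,1), whereas six combinations give 7: (1,6), (2,5), (3,4), (4,3), (5,2) and (6,1). Every other sum has fewer combinations than 7, so 7 is the most likely result.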
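The counting argument above can be checked by enumerating all 36 ordered outcomes of two six-sided dice. This is a small sketch using only the standard library; the names ways and total are chosen for illustration.

```python
from collections import Counter
from fractions import Fraction

num_sides = 6

# Enumerate every ordered pair (first die, second die) and tally the sums.
ways = Counter(a + b
               for a in range(1, num_sides + 1)
               for b in range(1, num_sides + 1))

total = num_sides ** 2  # 36 equally likely outcomes
for value in range(2, 2 * num_sides + 1):
    # Columns: sum, number of combinations, exact probability
    print(value, ways[value], Fraction(ways[value], total))
```

The row for 7 shows 6 combinations (probability 1/6), while 2 and 12 each have a single combination (probability 1/36).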

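Processing JSON data: a world population map

This example needs population data in a file named population.json; the data comes from okfn.org.

Open the file and take a look. It is one very long list holding population figures for many countries from 1960 to 2015.

The first record is shown below. Every later record has the same keys as the first.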

[
    {
        'Country Name': 'Arab World',
        'Country Code': 'ARB',
        'Year': '1960',
        'Value': '92496099'
    },
    ...
]

Each record has only four keys. Country Code is the country code, which in this file is three letters long. Value is the population.
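To make the record layout concrete, here is a minimal sketch that parses two such records from an inline string instead of the real file. It assumes the keys contain spaces ('Country Name', 'Country Code'), which the flattened listing cannot confirm. Both sample records reuse figures that appear in this article.

```python
import json

# Two records in the shape described above, inlined for illustration.
raw = '''
[
  {"Country Name": "Arab World", "Country Code": "ARB", "Year": "1960", "Value": "92496099"},
  {"Country Name": "Arab World", "Country Code": "ARB", "Year": "2015", "Value": "392168030"}
]
'''

records = json.loads(raw)          # a list of dicts
print(type(records).__name__, len(records))

# Every value is a string, including Year and Value.
by_year = {rec['Year']: int(rec['Value']) for rec in records}
print(by_year['2015'])
```

Because Year is stored as a string, a comparison has to use '2015' rather than the integer 2015.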

import json

filename = r'F:\JupyterNotebook\matplotlib_pygal_csv_json\population.json'
with open(filename) as f:
    # json.load() converts the JSON file into Python objects: here a list of dicts
    pop_data = json.load(f)

cc_populations = {}
for pop_dict in pop_data:
    if pop_dict['Year'] == '2015':
        country_name = pop_dict['Country Name']
        # Some values are decimals, so convert to float first and then to int
        population = int(float(pop_dict['Value']))
        # population is an int, so format it rather than concatenating with +
        print(f'{country_name}: {population}')

The program above prints the 2015 population of every country. To analyze 2014 instead, just change the year in the code.

Arab World: 392168030
Caribbean small states: 7116360
Central Europe and the Baltics: 103256779
Early-demographic dividend: 3122757473.68203
East Asia & Pacific: 2279146555
...

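Note that some of the raw population values are decimals (surprisingly), such as the 3122757473.68203 listed above for Early-demographic dividend.

The population values are strings (str). A string such as '35435.12432' cannot be converted straight to int; it must first be converted to float and then to int, which drops the fractional part.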
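The difference between the two conversions can be seen in a short sketch. The helper name to_int is hypothetical and exists only for this illustration.

```python
def to_int(value):
    """Convert a numeric string such as '35435.12432' to an int, truncating the fraction."""
    return int(float(value))

try:
    int('35435.12432')      # int() rejects a string with a decimal point
except ValueError as error:
    print('direct conversion failed:', error)

print(to_int('35435.12432'))   # 35435
print(to_int('392168030'))     # 392168030
```

Note that int() truncates toward zero rather than rounding, so '35435.9' would also become 35435.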

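Getting the two-letter country code

The country codes in our data are three letters long, but pygal's map tool uses two-letter codes.

Drawing a world map with pygal requires an extra package, which can be installed with:

    pip install pygal_maps_world

The country codes live in the i18n module and can be imported with:

    from pygal_maps_world.i18n import COUNTRIES

COUNTRIES is a dict whose keys are two-letter country codes and whose values are country names.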

key -> value
af     Afghanistan
al     Albania
dz     Algeria
ad     Andorra
ao     Angola

Next, write a function that takes a country name and returns the two-letter country code pygal uses for it.

def get_country_code(country_name):
    '''
    Return the two-letter country code for the given country name.
    '''
    for code, name in COUNTRIES.items():
        if name == country_name:
            return code
    # No matching country name was found
    return None
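The function above scans the whole dict on every call. As a sketch of one alternative, the mapping can be inverted once so that each lookup becomes a single dict access. SAMPLE_COUNTRIES below is a five-entry stand-in built from the entries listed earlier, not the real COUNTRIES dict, and lookup_country_code is a hypothetical name.

```python
# A five-entry stand-in for pygal_maps_world.i18n.COUNTRIES (sample only).
SAMPLE_COUNTRIES = {
    'af': 'Afghanistan',
    'al': 'Albania',
    'dz': 'Algeria',
    'ad': 'Andorra',
    'ao': 'Angola',
}

# Invert the mapping once: country name -> two-letter code.
CODE_BY_NAME = {name: code for code, name in SAMPLE_COUNTRIES.items()}

def lookup_country_code(country_name):
    """Return the two-letter code for country_name, or None if it is unknown."""
    return CODE_BY_NAME.get(country_name)

print(lookup_country_code('Algeria'))      # dz
print(lookup_country_code('Arab World'))   # None
```

Names that do not match exactly, such as regional aggregates like Arab World, return None under either approach.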

Drawing the world population map

Here is the complete code first. It uses the World class.

import json

from pygal_maps_world.i18n import COUNTRIES
from pygal_maps_world.maps import World
# Styles that control the map colors
from pygal.style import RotateStyle
from pygal.style import LightColorizedStyle

def get_country_code(country_name):
    '''
    Return the two-letter country code for the given country name.
    '''
    for code, name in COUNTRIES.items():
        if name == country_name:
            return code
    return None

filename = r'F:\JupyterNotebook\matplotlib_pygal_csv_json\population.json'
with open(filename) as f:
    pop_data = json.load(f)

cc_populations = {}
for pop_dict in pop_data:
    if pop_dict['Year'] == '2015':
        country_name = pop_dict['Country Name']
        # Some values are decimals, so convert to float first and then to int
        population = int(float(pop_dict['Value']))
        code = get_country_code(country_name)
        # Keep only entries whose name maps to a two-letter code
        if code:
            cc_populations[code] = population

# Split the countries into three tiers so the color bands are easier to tell apart
cc_populations_1, cc_populations_2, cc_populations_3 = {}, {}, {}
for cc, population in cc_populations.items():
    if population < 10000000:
        cc_populations_1[cc] = population
    elif population < 1000000000:
        cc_populations_2[cc] = population
    else:
        cc_populations_3[cc] = population

wm_style = RotateStyle('#336699', base_style=LightColorizedStyle)
world = World(style=wm_style)
world.title = 'World Populations in 2015, By Country'
world.add('0-10m', cc_populations_1)
world.add('10m-1bn', cc_populations_2)
world.add('>1bn', cc_populations_3)
world.render_to_file('world_population_2015.svg')
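The three-way split in the code above can also be expressed as a small helper, which makes the thresholds explicit and easy to test. This is only a sketch: the function name split_by_population and its parameters are hypothetical, and the sample dict uses placeholder codes and round numbers rather than real data.

```python
def split_by_population(cc_populations, small=10_000_000, large=1_000_000_000):
    """Split a {code: population} dict into three dicts by population size."""
    low, mid, high = {}, {}, {}
    for code, population in cc_populations.items():
        if population < small:
            low[code] = population
        elif population < large:
            mid[code] = population
        else:
            high[code] = population
    return low, mid, high

# Placeholder codes and round numbers, invented only to exercise each branch.
sample = {'aa': 5_000_000, 'bb': 50_000_000, 'cc': 1_200_000_000}
low, mid, high = split_by_population(sample)
print(low, mid, high)
```

Each returned dict could then be passed to world.add with its own label, as the full program does with its three dicts.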

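A few variables are worth noting.

cc_populations is a dict that maps each two-letter country code to its population.

cc_populations_1, cc_populations_2 and cc_populations_3 are three dicts that split the countries into tiers by population: those under ten million go in cc_populations_1, those from ten million up to one billion go in cc_population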
