《python数据分析(第2版)-阿曼多.凡丹戈》第1章读书笔记

时间:2020-04-01
本文章向大家介绍《python数据分析(第2版)-阿曼多.凡丹戈》第1章读书笔记,主要包括《python数据分析(第2版)-阿曼多.凡丹戈》第1章读书笔记使用实例、应用技巧、基本知识点总结和需要注意事项,具有一定的参考价值,需要的朋友可以参考一下。

python数据分析(第2版)-阿曼多.凡丹戈》第1章读书笔记

python学习笔记-目录索引

第1章“Python程序库入门”手把手地指导读者正确安装配置Python和基础的Python数值分析软件库。同时,本章还会展示如何通过NumPy创建一个小程序以及如何利用Matplotlib来绘制简单的图形。

首先:需要了解的是Python生态系统为数据分析师和数据科学家提供的常用程序库

☆☆☆☆☆NumPy:这是一个通用程序库,不仅支持常用的数值数组,同时提供了用于高效处理这些数组的函数。

☆☆☆☆☆SciPy:这是Python的科学计算库,对NumPy的功能进行了大量扩充,同时也有部分功能是重合的。Numpy和SciPy曾经共享基础代码,后来分道扬镳了。

☆☆☆☆☆Pandas:这是一个用于数据处理的程序库,不仅提供了丰富的数据结构,同时为处理数据表和时间序列提供了相应的函数。

☆☆☆☆Matplotlib:这是一个2D绘图库,在绘制图形和图像方面提供了良好的支持。当前,Matplotlib已经并入SciPy中并支持NumPy。

☆☆☆☆IPython:这个库为Python提供了强大的交互式Shell,也为Jupyter提供了内核,同时还支持交互式数据可视化功能。我们将在本章稍后介绍IPython shell。

☆☆☆☆Jupyter Notebook:它提供了一个基于Web的交互式shell,可以创建和共享支持可实时代码和可视化的文档。Jupyter Notebook通过IPython提供的内核支持多个版本的Python。

常见官方地址

NumPy和SciPy的主要文档网站是http://docs.scipy.org/doc/。通过该网站,您可以浏览NumPy和SciPy程序库的用户指南和参考指南,以及一些相关教程

Pandas http://pandas.pydata.org/pandas-docs/stable/

Matplotlib http://matplotlib.org/contents.html

Ipython http://ipython.readthedocs.io/en/stable/

Jupyter Notebook http://jupyter-notebook.readthedocs.io/en/latest/

1、安装python略。

2、安装jupyter。

Jupyter Notebook(此前被称为 IPython notebook)是一个交互式笔记本,支持运行 40 多种编程语言。用途包括:数据清理和转换,数值模拟,统计建模,机器学习等等。

安装常见的有两个途径。

1)安装好Anaconda3后,通过Jupyter Notebook (Anaconda3)快捷方式访问即可。

2)在eclipse中通过pip安装。

pip install jupyter

安装界面:

/* 
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting jupyter
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/83/df/0f5dd132200728a86190397e1ea87cd76244e42d39ec5e88efd25b2abd7e/jupyter-1.0.0-py2.py3-none-any.whl
Collecting notebook
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f5/69/d2ffaf7efc20ce47469187e3a41e6e03e17b45de5a6559f4e7ab3eace5e1/notebook-6.0.2-py3-none-any.whl (9.7MB)
Collecting ipykernel
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/e1/92/8fec943b5b81078399f969f00557804d884c96fcd0bc296e81a2ed4fd270/ipykernel-5.1.3-py3-none-any.whl
Collecting jupyter-console
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/cb/ee/6374ae8c21b7d0847f9c3722dcdfac986b8e54fa9ad9ea66e1eb6320d2b8/jupyter_console-6.0.0-py2.py3-none-any.whl
Collecting qtconsole
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/7c/57/3528b84ffa753e2089908bbf74bb5ae60653eb7a63797b6234e88b847d67/qtconsole-4.6.0-py2.py3-none-any.whl
Collecting ipywidgets
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/56/a0/dbcf5881bb2f51e8db678211907f16ea0a182b232c591a6d6f276985ca95/ipywidgets-7.5.1-py2.py3-none-any.whl (121kB)
Collecting nbconvert
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/79/6c/05a569e9f703d18aacb89b7ad6075b404e8a4afde2c26b73ca77bb644b14/nbconvert-5.6.1-py2.py3-none-any.whl (455kB)
Collecting prometheus-client
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/b3/23/41a5a24b502d35a4ad50a5bb7202a5e1d9a0364d0c12f56db3dbf7aca76d/prometheus_client-0.7.1.tar.gz
Collecting traitlets>=4.2.1
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ca/ab/872a23e29cec3cf2594af7e857f18b687ad21039c1f9b922fac5b9b142d5/traitlets-4.3.3-py2.py3-none-any.whl (75kB)
Collecting nbformat
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/da/27/9a654d2b6cc1eaa517d1c5a4405166c7f6d72f04f6e7eea41855fe808a46/nbformat-4.4.0-py2.py3-none-any.whl (155kB)
Collecting ipython-genutils
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/fa/bc/9bd3b5c2b4774d5f33b2d544f1460be9df7df2fe42f352135381c347c69a/ipython_genutils-0.2.0-py2.py3-none-any.whl
Collecting pyzmq>=17
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e8/be/9cbcdf37890942a9f8f09102903dd69d275258752a530b87fe7273fa26ba/pyzmq-18.1.1-cp37-cp37m-win_amd64.whl (1.0MB)
Collecting jupyter-core>=4.6.0
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/fb/82/86437f661875e30682e99d04c13ba6c216f86f5f6ca6ef212d3ee8b6ca11/jupyter_core-4.6.1-py2.py3-none-any.whl (82kB)
Collecting terminado>=0.8.1
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ff/96/1d9a2c23990aea8f8e0b5c3b6627d03196a73771a17a2d9860bbe9823ab6/terminado-0.8.3-py2.py3-none-any.whl
Requirement already satisfied: tornado>=5.0 in d:\tools\python37\lib\site-packages (from notebook->jupyter) (6.0.3)
Collecting Send2Trash
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/49/46/c3dc27481d1cc57b9385aff41c474ceb7714f7935b1247194adae45db714/Send2Trash-1.5.0-py3-none-any.whl
Requirement already satisfied: jinja2 in d:\tools\python37\lib\site-packages (from notebook->jupyter) (2.10.3)
Collecting jupyter-client>=5.3.4
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/13/81/fe0eee1bcf949851a120254b1f530ae1e01bdde2d3ab9710c6ff81525061/jupyter_client-5.3.4-py2.py3-none-any.whl
Collecting ipython>=5.0.0
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/05/d7/77b7a1988c99227f52402f93fb0f7e88c97239960516f53907ebbc44149c/ipython-7.11.0-py3-none-any.whl (777kB)
Collecting pygments
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/be/39/32da3184734730c0e4d3fa3b2b5872104668ad6dc1b5a73d8e477e5fe967/Pygments-2.5.2-py2.py3-none-any.whl (896kB)
Collecting prompt-toolkit<2.1.0,>=2.0.0
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/87/61/2dfea88583d5454e3a64f9308a686071d58d59a55db638268a6413e1eb6d/prompt_toolkit-2.0.10-py3-none-any.whl (340kB)
Collecting widgetsnbextension~=3.5.0
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/6c/7b/7ac231c20d2d33c445eaacf8a433f4e22c60677eb9776c7c5262d7ddee2d/widgetsnbextension-3.5.1-py2.py3-none-any.whl (2.2MB)
Collecting entrypoints>=0.2.2
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ac/c6/44694103f8c221443ee6b0041f69e2740d89a25641e62fb4f2ee568f2f9c/entrypoints-0.3-py2.py3-none-any.whl
Collecting bleach
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ab/05/27e1466475e816d3001efb6e0a85a819be17411420494a1e602c36f8299d/bleach-3.1.0-py2.py3-none-any.whl (157kB)
Collecting testpath
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1b/9e/1a170feaa54f22aeb5a5d16c9015e82234275a3c8ab630b552493f9cb8a9/testpath-0.4.4-py2.py3-none-any.whl (163kB)
Collecting pandocfilters>=1.4.1
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/4c/ea/236e2584af67bb6df960832731a6e5325fd4441de001767da328c33368ce/pandocfilters-1.4.2.tar.gz
Collecting mistune<2,>=0.8.1
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/09/ec/4b43dae793655b7d8a25f76119624350b4d65eb663459eb9603d7f1f0345/mistune-0.8.4-py2.py3-none-any.whl
Collecting defusedxml
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/06/74/9b387472866358ebc08732de3da6dc48e44b0aacd2ddaa5cb85ab7e986a2/defusedxml-0.6.0-py2.py3-none-any.whl
Requirement already satisfied: decorator in d:\tools\python37\lib\site-packages (from traitlets>=4.2.1->notebook->jupyter) (4.4.1)
Requirement already satisfied: six in d:\tools\python37\lib\site-packages (from traitlets>=4.2.1->notebook->jupyter) (1.12.0)
Collecting jsonschema!=2.5.0,>=2.4
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c5/8f/51e89ce52a085483359217bc72cdbf6e75ee595d5b1d4b5ade40c7e018b8/jsonschema-3.2.0-py2.py3-none-any.whl (56kB)
Collecting pywin32>=1.0; sys_platform == "win32"
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/bb/23/00fe4fbf9963f3bcb34a443eba0d0283fc51e5887d4045552c87490394e4/pywin32-227-cp37-cp37m-win_amd64.whl (9.1MB)
Collecting pywinpty>=0.5; os_name == "nt"
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7b/de/c69772738f10140d531b46b7462fc1dccb4175987daaa851a8cda2326251/pywinpty-0.5.7-cp37-cp37m-win_amd64.whl (1.3MB)
Requirement already satisfied: MarkupSafe>=0.23 in d:\tools\python37\lib\site-packages (from jinja2->notebook->jupyter) (1.1.1)
Requirement already satisfied: python-dateutil>=2.1 in d:\tools\python37\lib\site-packages (from jupyter-client>=5.3.4->notebook->jupyter) (2.8.1)
Requirement already satisfied: setuptools>=18.5 in d:\tools\python37\lib\site-packages (from ipython>=5.0.0->ipykernel->jupyter) (41.2.0)
Collecting pickleshare
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9a/41/220f49aaea88bc6fa6cba8d05ecf24676326156c23b991e80b3f2fc24c77/pickleshare-0.7.5-py2.py3-none-any.whl
Collecting colorama; sys_platform == "win32"
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c9/dc/45cdef1b4d119eb96316b3117e6d5708a08029992b2fee2c143c7a0a5cc5/colorama-0.4.3-py2.py3-none-any.whl
Collecting jedi>=0.10
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e9/97/55e575a5b49e5c3df9eb3c116c61021d7badf556c816be13bbd7baf55234/jedi-0.15.2-py2.py3-none-any.whl (1.1MB)
Collecting backcall
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/84/71/c8ca4f5bb1e08401b916c68003acf0a0655df935d74d93bf3f3364b310e0/backcall-0.1.0.tar.gz
Collecting wcwidth
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7e/9f/526a6947247599b084ee5232e4f9190a38f398d7300d866af3ab571a5bfe/wcwidth-0.1.7-py2.py3-none-any.whl
Requirement already satisfied: webencodings in d:\tools\python37\lib\site-packages (from bleach->nbconvert->jupyter) (0.5.1)
Collecting attrs>=17.4.0
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/a2/db/4313ab3be961f7a763066401fb77f7748373b6094076ae2bda2806988af6/attrs-19.3.0-py2.py3-none-any.whl
Collecting pyrsistent>=0.14.0
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/6c/6f/c1a2e8da80a0029f6b618d7e20e1a6f2a61dd04e2e54225309c2cc4268f7/pyrsistent-0.15.6.tar.gz (107kB)
Collecting importlib-metadata; python_version < "3.8"
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e9/71/1a1e0ed0981bb6a67bce55a210f168126b7ebd2065958673797ea66489ca/importlib_metadata-1.3.0-py2.py3-none-any.whl
Collecting parso>=0.5.2
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9b/b0/90353a5ece0987279837835224dead0c424833a224195683e188d384e06b/parso-0.5.2-py2.py3-none-any.whl (99kB)
Collecting zipp>=0.5
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/74/3d/1ee25a26411ba0401b43c6376d2316a71addcc72ef8690b101b4ea56d76a/zipp-0.6.0-py2.py3-none-any.whl
Collecting more-itertools
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/68/03/0604cec1ea13c9f063dd50f900d1a36160334dd3cfb01fd0e638f61b46ba/more_itertools-8.0.2-py3-none-any.whl (40kB)
Building wheels for collected packages: prometheus-client, pandocfilters, backcall, pyrsistent
  Building wheel for prometheus-client (setup.py): started
  Building wheel for prometheus-client (setup.py): finished with status 'done'
  Created wheel for prometheus-client: filename=prometheus_client-0.7.1-cp37-none-any.whl size=41407 sha256=c20f43706024995078fe8c037005d905c614e3cf5d6c3919ef7db82bdd9a4435
  Stored in directory: C:\Users\tony zhang\AppData\Local\pip\Cache\wheels\9d\21\d1\2b2a9a083573001599e830a30085eed8e18abb66255fd9ca31
  Building wheel for pandocfilters (setup.py): started
  Building wheel for pandocfilters (setup.py): finished with status 'done'
  Created wheel for pandocfilters: filename=pandocfilters-1.4.2-cp37-none-any.whl size=7862 sha256=c34d2510ee20c461b08609ca3145b5b81d06aaf57371e96d83c3e9221465dcd2
  Stored in directory: C:\Users\tony zhang\AppData\Local\pip\Cache\wheels\2b\37\58\486b9403bb31231ad05667e3f7f738e1a9bb9cfc03b50a01c6
  Building wheel for backcall (setup.py): started
  Building wheel for backcall (setup.py): finished with status 'done'
  Created wheel for backcall: filename=backcall-0.1.0-cp37-none-any.whl size=10418 sha256=682309dc5afe04018f3d6c7272d7ce4e794b5026d8407d26c381c5928adda585
  Stored in directory: C:\Users\tony zhang\AppData\Local\pip\Cache\wheels\80\42\9d\a372415f5bfc53fa21d72e1d5925595cd5808e9bc1fd0e31a4
  Building wheel for pyrsistent (setup.py): started
  Building wheel for pyrsistent (setup.py): finished with status 'done'
  Created wheel for pyrsistent: filename=pyrsistent-0.15.6-cp37-cp37m-win_amd64.whl size=56465 sha256=8859424fe353ae5b7b0e16500abf63a54e7743f3f66394e41eaf7a0af083e18d
  Stored in directory: C:\Users\tony zhang\AppData\Local\pip\Cache\wheels\4f\d6\5d\18980adb7b24443cb907e7015b112e47e5b32a199690a196ee
Successfully built prometheus-client pandocfilters backcall pyrsistent
Installing collected packages: prometheus-client, ipython-genutils, traitlets, pygments, pickleshare, colorama, parso, jedi, backcall, wcwidth, prompt-toolkit, ipython, pywin32, jupyter-core, pyzmq, jupyter-client, ipykernel, attrs, pyrsistent, more-itertools, zipp, importlib-metadata, jsonschema, nbformat, pywinpty, terminado, Send2Trash, entrypoints, bleach, testpath, pandocfilters, mistune, defusedxml, nbconvert, notebook, jupyter-console, qtconsole, widgetsnbextension, ipywidgets, jupyter
Successfully installed Send2Trash-1.5.0 attrs-19.3.0 backcall-0.1.0 bleach-3.1.0 colorama-0.4.3 defusedxml-0.6.0 entrypoints-0.3 importlib-metadata-1.3.0 ipykernel-5.1.3 ipython-7.11.0 ipython-genutils-0.2.0 ipywidgets-7.5.1 jedi-0.15.2 jsonschema-3.2.0 jupyter-1.0.0 jupyter-client-5.3.4 jupyter-console-6.0.0 jupyter-core-4.6.1 mistune-0.8.4 more-itertools-8.0.2 nbconvert-5.6.1 nbformat-4.4.0 notebook-6.0.2 pandocfilters-1.4.2 parso-0.5.2 pickleshare-0.7.5 prometheus-client-0.7.1 prompt-toolkit-2.0.10 pygments-2.5.2 pyrsistent-0.15.6 pywin32-227 pywinpty-0.5.7 pyzmq-18.1.1 qtconsole-4.6.0 terminado-0.8.3 testpath-0.4.4 traitlets-4.3.3 wcwidth-0.1.7 widgetsnbextension-3.5.1 zipp-0.6.0
FINISHED */
View Code
安装完成后,用管理员身份打开powershell或cmd,输入你所在的jupyter文件路径,即会自动打开一个浏览器http://localhost:8888/tree进程。比如以下命令:
d:
cd D:\Java2018\PythonDataAnalysis2\Chapter02
jupyter notebook
 
打开对应的ipynb文件即可进入jupyter运行界面。
这个界面相当友好,赞一个。
 

3、常见jupyter技巧:

1)在cell首行加上%%writefile filename.py 便会在当前工作目录下创建一个名为filename.py文件,如:

1 %%writefile filename.py
2 print("hello jupyter")

2)安装代码自动补全

#安装插件
pip install jupyter_contrib_nbextensions


#检查插件配置
jupyter contrib nbextension install --user --skip-running-check

在启动后的界面中Files并列的tab页“Nbextensions”中勾选"Hinterland"即可。但这个代码补全与eclipse的代码提示有云泥之别,应付简单代码还行。

3)其他常见Tips可以参照这里:https://www.jianshu.com/p/bb0eab1b2535

4、Numpy与python数组运算的一个简单对比

NumPy在进行数组运算时,速度是相当快的。可是到底有多快呢?下面的程序代码将为我们展示numpysum()和pythonsum()这两个函数的实耗时间,这里以μs(微秒)为单位。同时,它还会显示向量sum最后面的两个元素值。下面来看使用Python和NumPy能否得到相同的答案。

 1 import sys
 2 from datetime import datetime
 3 import numpy as np
 4 
 5 def pythonsum(n):
 6    a = list(range(n))
 7    b = list(range(n))
 8    c = []
 9    for i in range(len(a)):
10        a[i] = i ** 2
11        b[i] = i ** 3
12        c.append(a[i] + b[i])
13    return c
14 
15 def numpysum(n):
16    a = np.arange(n) ** 2
17    b = np.arange(n) ** 3
18    c = a + b
19    return c
20 
21 size=10000
22 
23 start = datetime.now()
24 c = pythonsum(size)
25 delta = datetime.now() - start
26 print("The last 2 elements of the sum", c[-2:])
27 print("PythonSum elapsed time in microseconds", delta.microseconds)
28 
29 start = datetime.now()
30 c = numpysum(size)
31 delta = datetime.now() - start
32 print("The last 2 elements of the sum", c[-2:])
33 print("NumPySum elapsed time in microseconds", delta.microseconds)
/*
The last 2 elements of the sum [999500079996, 999800010000]
PythonSum elapsed time in microseconds 14463
The last 2 elements of the sum [-1227299972  -927369968]
NumPySum elapsed time in microseconds 998
*/

不同的机器运行效果不同,但是差距是显著的。

5、一个简单的matplotlib图。

 1 from sklearn.datasets import load_iris
 2 from sklearn.datasets import load_boston
 3 from matplotlib import pyplot as plt
 4 
 5 # 加载iris数据集,显示数据集的相关描述,同时将第1列(萼片长度)作为x坐标值,将第2列(萼片宽度)作为y坐标值。
 6 iris = load_iris()
 7 print(iris.DESCR)
 8 
 9 data=iris.data
10 plt.plot(data[:,0],data[:,1],".")
11 plt.show()
12 
13 # 加载波士顿数据集,显示数据集的相关描述,同时将第3列(非零售业务的比例)作为x坐标值,将第5列(一氧化氮浓度)做为y坐标值,图上的每个点用“+”号表示。
14 boston = load_boston()
15 print(boston.DESCR)
16 
17 data=boston.data
18 plt.plot(data[:,2],data[:,4],"+")
19 plt.show()

小结:

本章安装了以后要用到的NumPy、SciPy、Pandas、Matplotlib、IPython和Jupyter Notebook等程序库,并通过一个向量加法程序,体验了NumPy带来的卓越性能。此外,我们还探讨了有关的文档和在线资源。同时,我们还尝试通过运行代码来查找库中的模块,并加载了一些样本数据集,还使用Matplotlib绘制一些简单的图形。

第2章将继续与NumPy有关的内容,以探索数组和数据类型等基本概念。

邀月的体会是:相比上一本实战书,这个要简单的多,但重要的是基础概念的理解,权当作上一阶段的巩固和迭代。

第1章完。

 python学习笔记-目录索引

随书源码官方下载:
https://www.ptpress.com.cn/shopping/buy?bookId=bae24ecb-a1a1-41c7-be7c-d913b163c111

需要登录后免费下载。

原文地址:https://www.cnblogs.com/downmoon/p/12598135.html