How to Connect to Hive and Impala with the Python Impyla Client
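1. Purpose of This Document
The previous chapter covered installing Anaconda on a CDH cluster and setting up a private Python package repository. This chapter explains how to use the Python Impyla client to connect to HiveServer2 and the Impala Daemon in a CDH cluster and run SQL statements against them.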
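- Overview
1. Installing the dependencies
2. Writing the code
3. Testing the code
- Test environment
1. CM and CDH version 5.11.2
2. RedHat 7.2
- Prerequisites
1. The CDH cluster is running normally
2. Anaconda is installed and its environment variables are configured
3. pip can install Python packages normally
4. Python 2.6+ or 3.3+
5. The cluster is not security-enabled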
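The Python version prerequisite can be checked before installing anything. The following is a minimal sketch; the function name python_version_supported is hypothetical and only encodes the "2.6+ or 3.3+" rule stated above.

```python
import sys


def python_version_supported(version_info=None):
    """Return True if the interpreter is Python 2.6+ or Python 3.3+."""
    # Fall back to the running interpreter when no version is passed in.
    major, minor = (version_info or sys.version_info)[:2]
    if major == 2:
        return minor >= 6
    if major == 3:
        return minor >= 3
    return major > 3


print("Python %d.%d supported: %s" % (
    sys.version_info[0], sys.version_info[1], python_version_supported()))
```

Run it with the Anaconda interpreter that pip will install into, since that is the interpreter the Impyla scripts will use.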
2. Installing the Impyla Dependencies
Python packages that Impyla depends on
- six
- bit_array
- thrift (on Python 2.x) or thriftpy (on Python 3.x)
- thrift_sasl
- sasl
1. First, install the Python packages that Impyla depends on
[root@ip-172-31-22-86 ~]# pip install bit_array
[root@ip-172-31-22-86 ~]# pip install thrift==0.9.3
[root@ip-172-31-22-86 ~]# pip install six
[root@ip-172-31-22-86 ~]# pip install thrift_sasl
[root@ip-172-31-22-86 ~]# pip install sasl
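Note: thrift must be version 0.9.3. pip installs 0.10.0 by default, so if that version is present, uninstall it with pip uninstall thrift and then install 0.9.3.
2. Install the Impyla package
pip installs impyla 0.14.0 by default. Uninstall it and install version 0.13.8 instead.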
[root@ip-172-31-22-86 ec2-user]# pip install impyla==0.13.8
Collecting impyla
Downloading impyla-0.14.0.tar.gz (151kB)
100% |████████████████████████████████| 153kB 1.0MB/s
Requirement already satisfied: six in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Requirement already satisfied: bitarray in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Requirement already satisfied: thrift in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Building wheels for collected packages: impyla
Running setup.py bdist_wheel for impyla ... done
Stored in directory: /root/.cache/pip/wheels/96/fa/d8/40e676f3cead7ec45f20ac43eb373edc471348ac5cb485d6f5
Successfully built impyla
Installing collected packages: impyla
Successfully installed impyla-0.14.0
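Because both thrift and impyla have to be pinned to specific versions, it can help to verify what actually ended up installed. The sketch below is an illustration only: the helper names pin_mismatches and installed_version are hypothetical, and the lookup relies on importlib.metadata, which exists only on Python 3.8+ (on the Python 2.7 interpreter used in this article, pip show thrift gives the same information).

```python
def pin_mismatches(installed, required):
    """Return {package: (installed_version, required_version)} for unmet pins.

    installed: dict of package name -> installed version string (None if absent)
    required:  dict of package name -> exact version string required
    """
    return {
        name: (installed.get(name), wanted)
        for name, wanted in required.items()
        if installed.get(name) != wanted
    }


def installed_version(name):
    """Look up an installed package version, or None if it cannot be found."""
    try:
        from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
    except ImportError:
        return None  # older interpreters: use `pip show <name>` instead
    try:
        return version(name)
    except PackageNotFoundError:
        return None


# Version pins taken from this article.
REQUIRED = {"thrift": "0.9.3", "impyla": "0.13.8"}
current = {name: installed_version(name) for name in REQUIRED}
for name, (have, want) in sorted(pin_mismatches(current, REQUIRED).items()):
    print("%s: installed %s, article requires %s" % (name, have, want))
```

No output means both pins are satisfied.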
3. Writing the Python Code
Connecting to Hive from Python (HiveTest.py)
from impala.dbapi import connect

# Connect to HiveServer2 on port 10000 using PLAIN authentication
conn = connect(host='ip-172-31-21-45.ap-southeast-1.compute.internal', port=10000, database='default', auth_mechanism='PLAIN')
print(conn)
cursor = conn.cursor()
cursor.execute('show databases')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
cursor.execute('SELECT * FROM test limit 10')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
Connecting to Impala from Python (ImpalaTest.py)
from impala.dbapi import connect

# Connect to the Impala Daemon on port 21050
conn = connect(host='ip-172-31-26-80.ap-southeast-1.compute.internal', port=21050)
print(conn)
cursor = conn.cursor()
cursor.execute('show databases')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
cursor.execute('SELECT * FROM test limit 10')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
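Both scripts follow the same DB-API pattern (connect, open a cursor, execute, fetch) and neither closes its connection. A possible refactoring is sketched below. The helper run_query is hypothetical, not part of Impyla; it accepts any DB-API connect callable, and the demo uses sqlite3 as a stand-in so it runs without a cluster.

```python
import sqlite3
from contextlib import closing


def run_query(connect_fn, sql, **conn_kwargs):
    """Open a connection, run one statement, return (column_names, rows).

    The connection and cursor are closed even if the statement fails.
    """
    with closing(connect_fn(**conn_kwargs)) as conn:
        with closing(conn.cursor()) as cursor:
            cursor.execute(sql)
            columns = [col[0] for col in cursor.description]
            return columns, cursor.fetchall()


# Stand-in demo: an in-memory SQLite database instead of Hive or Impala.
columns, rows = run_query(lambda: sqlite3.connect(":memory:"),
                          "SELECT 'name1' AS s1, 'age1' AS s2")
print(columns, rows)

# Against the cluster, assuming impyla is installed:
#   from impala.dbapi import connect
#   run_query(connect, 'show databases',
#             host='ip-172-31-26-80.ap-southeast-1.compute.internal', port=21050)
```

Passing the connect function in as an argument keeps the helper identical for HiveServer2 (port 10000, auth_mechanism='PLAIN') and the Impala Daemon (port 21050); only the keyword arguments differ.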
4. Testing the Code
Run the Python scripts from the shell command line
1. Test the Hive connection
[root@ip-172-31-22-86 ec2-user]# python HiveTest.py
<impala.hiveserver2.HiveServer2Connection object at 0x7f66eee00250>
[('database_name', 'STRING', None, None, None, None, None)]
[('default',)]
[('test.s1', 'STRING', None, None, None, None, None), ('test.s2', 'STRING', None, None, None, None, None)]
[('name1', 'age1'), ('name2', 'age2'), ('name3', 'age3'), ('name4', 'age4'), ('name5', 'age5'), ('name6', 'age6'), ('name7', 'age7'), ('name8', 'age8'), ('name9', 'age9'), ('name10', 'age10')]
[root@ip-172-31-22-86 ec2-user]#
2. Test the Impala connection
[root@ip-172-31-22-86 ec2-user]# python ImpalaTest.py
<impala.hiveserver2.HiveServer2Connection object at 0x7f7e1f2cfad0>
[('name', 'STRING', None, None, None, None, None), ('comment', 'STRING', None, None, None, None, None)]
[('_impala_builtins', 'System database for Impala builtin functions'), ('default', 'Default Hive database')]
[('s1', 'STRING', None, None, None, None, None), ('s2', 'STRING', None, None, None, None, None)]
[('name1', 'age1'), ('name2', 'age2'), ('name3', 'age3'), ('name4', 'age4'), ('name5', 'age5'), ('name6', 'age6'), ('name7', 'age7'), ('name8', 'age8'), ('name9', 'age9'), ('name10', 'age10')]
[root@ip-172-31-22-86 ec2-user]#
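In the output above, cursor.description is a list of 7-item tuples whose first item is the column name, and Hive reports the columns as test.s1 and test.s2 while Impala reports s1 and s2. A small sketch of turning the fetched rows into dictionaries keyed by the bare column name follows; rows_to_dicts is a hypothetical helper and the sample data is copied from the Hive test output.

```python
def rows_to_dicts(description, rows):
    """Pair each row with column names taken from a DB-API cursor.description."""
    # Hive prefixes column names with the table name ('test.s1');
    # keep only the part after the last dot so Hive and Impala agree.
    names = [col[0].split('.')[-1] for col in description]
    return [dict(zip(names, row)) for row in rows]


hive_description = [('test.s1', 'STRING', None, None, None, None, None),
                    ('test.s2', 'STRING', None, None, None, None, None)]
rows = [('name1', 'age1'), ('name2', 'age2')]
records = rows_to_dicts(hive_description, rows)
print(records[0])
```

Splitting on the dot assumes that column names themselves contain no dots, which holds for the test table used here.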
5. Common Problems
1. Error 1
building 'sasl.saslwrapper' extension
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/sasl
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Isasl -I/opt/cloudera/parcels/Anaconda/include/python2.7 -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-2.7/sasl/saslwrapper.o
unable to execute 'gcc': No such file or directory
error: command 'gcc' failed with exit status 1
----------------------------------------
Command "/opt/cloudera/parcels/Anaconda/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-kD6tvP/sasl/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-WJFNeG-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-kD6tvP/sasl/
Solution: the sasl package builds a C++ extension, so install the gcc and g++ compilers.
[root@ip-172-31-22-86 ec2-user]# yum -y install gcc
[root@ip-172-31-22-86 ec2-user]# yum install gcc-c++
2. Error 2
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Isasl -I/opt/cloudera/parcels/Anaconda/include/python2.7 -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-2.7/sasl/saslwrapper.o
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]
In file included from sasl/saslwrapper.cpp:254:0:
sasl/saslwrapper.h:22:23: fatal error: sasl/sasl.h: No such file or directory
#include <sasl/sasl.h>
^
compilation terminated.
error: command 'gcc' failed with exit status 1
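Solution: the build cannot find the sasl/sasl.h header, so install the Python and Cyrus SASL development packages.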
[root@ip-172-31-22-86 ec2-user]# yum -y install python-devel.x86_64 cyrus-sasl-devel.x86_64
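Both errors come from missing build requirements for the sasl package, so they can be detected before running pip install sasl. The sketch below is a rough pre-flight check under stated assumptions: missing_build_requirements is a hypothetical helper, /usr/include/sasl/sasl.h is assumed to be where cyrus-sasl-devel places the header on RedHat, and shutil.which requires Python 3.3+.

```python
import os
import shutil


def missing_build_requirements(tools=("gcc", "g++"),
                               headers=("/usr/include/sasl/sasl.h",)):
    """Return the compilers and header files that cannot be found."""
    # A tool is missing when it is not on PATH (Error 1).
    missing = [tool for tool in tools if shutil.which(tool) is None]
    # A header is missing when the file does not exist (Error 2).
    missing += [path for path in headers if not os.path.isfile(path)]
    return missing


problems = missing_build_requirements()
if problems:
    print("Missing before `pip install sasl`: " + ", ".join(problems))
else:
    print("Compilers and SASL header found")
```

A missing gcc or g++ corresponds to Error 1, and a missing sasl.h corresponds to Error 2.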
Original article. Reposting is welcome; please credit the WeChat public account Hadoop实操 as the source.