Installing and Configuring Pig
mkdir -p /opt/hadoop/pig
cd /opt/hadoop/pig
wget http://mirror.bit.edu.cn/apache/pig/pig-0.16.0/pig-0.16.0.tar.gz
tar -zxvf pig-0.16.0.tar.gz
# Configure environment variables
Edit /etc/profile and add the following lines:
#Pig
export PIG_HOME=/opt/hadoop/pig/pig-0.16.0
export PATH=$PATH:$PIG_HOME/bin
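If this step gets scripted, it helps to make it safe to run more than once. The sketch below appends the two exports only when no PIG_HOME line exists yet. `add_pig_env` is a hypothetical helper name, and the profile path is passed as an argument so the function can be tried on a scratch file before touching /etc/profile (which needs root).

```shell
# add_pig_env: hypothetical helper, not part of Pig.
# $1 = profile file to edit, $2 = Pig install directory.
add_pig_env() {
    profile_file="$1"
    pig_home="$2"
    # Skip if a PIG_HOME export is already present, so repeated runs do not duplicate lines.
    if grep -q '^export PIG_HOME=' "$profile_file" 2>/dev/null; then
        return 0
    fi
    {
        echo '#Pig'
        echo "export PIG_HOME=$pig_home"
        # Single quotes keep $PATH and $PIG_HOME unexpanded in the profile file.
        echo 'export PATH=$PATH:$PIG_HOME/bin'
    } >> "$profile_file"
}

# Example (as root), followed by reloading the profile in the current shell:
#   add_pig_env /etc/profile /opt/hadoop/pig/pig-0.16.0
#   source /etc/profile
```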
# Reload the profile, then verify the installation
source /etc/profile
pig -help
# In the help output, the -x option sets the execution mode; there are three modes
-x, -exectype - Set execution mode: local|mapreduce|tez, default is mapreduce.
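To make the mode choice explicit in scripts, a small wrapper can validate the value before handing it to pig. This is only a sketch: `run_pig` is a hypothetical helper name, not something shipped with Pig, and the accepted values are simply the three listed in the help line above.

```shell
# run_pig: hypothetical helper, not shipped with Pig.
# The first argument is the execution mode; anything after it is passed to pig unchanged.
run_pig() {
    mode="${1:-mapreduce}"      # same default as pig itself
    if [ "$#" -gt 0 ]; then
        shift
    fi
    case "$mode" in
        local|mapreduce|tez) ;;
        *)
            echo "unknown execution mode: $mode" >&2
            return 2
            ;;
    esac
    pig -x "$mode" "$@"
}

# Usage:
#   run_pig local        # Grunt shell against the local file system
#   run_pig              # Grunt shell in mapreduce mode (Hadoop must be running)
```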
Running in Local mode
[hadoop@localhost pig-0.16.0]$ pig -x local
16/06/13 18:36:59 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
16/06/13 18:36:59 INFO pig.ExecTypeProvider: Picked LOCAL as the ExecType
2016-06-13 18:36:59,884 [main] INFO org.apache.pig.Main - Apache Pig version 0.16.0 (r1746530) compiled Jun 01 2016, 23:10:49
2016-06-13 18:36:59,884 [main] INFO org.apache.pig.Main - Logging error messages to: /opt/hadoop/pig/pig-0.16.0/pig_1465814219882.log
2016-06-13 18:36:59,974 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/hadoop/.pigbootup not found
2016-06-13 18:37:00,438 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2016-06-13 18:37:00,439 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2016-06-13 18:37:00,442 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2016-06-13 18:37:00,694 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2016-06-13 18:37:00,737 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-1accdf7e-8ebf-43e3-9082-9ff626aababc
2016-06-13 18:37:00,737 [main] WARN org.apache.pig.PigServer - ATS is disabled since yarn.timeline-service.enabled set to false
# Change directory
grunt> cd /home/hadoop/
# List the directory contents
grunt> ls
file:/home/hadoop/.mozilla <dir>
file:/home/hadoop/.bash_logout<r 1> 18
file:/home/hadoop/.bash_profile<r 1> 193
file:/home/hadoop/.bashrc<r 1> 231
file:/home/hadoop/.cache <dir>
file:/home/hadoop/.config <dir>
file:/home/hadoop/.bash_history<r 1> 17564
file:/home/hadoop/.ssh <dir>
file:/home/hadoop/share.tar.gz<r 1> 210532639
file:/home/hadoop/hadoop_tmp <dir>
file:/home/hadoop/wc.input<r 1> 1587
file:/home/hadoop/.beeline <dir>
file:/home/hadoop/.hivehistory<r 1> 2764
file:/home/hadoop/test_acid.txt<r 1> 144
file:/home/hadoop/.viminfo<r 1> 9932
file:/home/hadoop/.mysql_history<r 1> 325
file:/home/hadoop/.vim <dir>
file:/home/hadoop/.pig_history<r 1> 48
# Quit the Grunt shell
grunt> quit
2016-06-13 18:41:13,423 [main] INFO org.apache.pig.Main - Pig script completed in 4 minutes, 14 seconds and 495 milliseconds (254495 ms)
Running in MapReduce mode
Prerequisite: Hadoop is already running and its environment variables are configured.
# pig runs in mapreduce mode by default
[hadoop@localhost pig-0.16.0]$ pig
16/06/13 18:42:24 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
16/06/13 18:42:24 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
# Show the current path
grunt> pwd
hdfs://172.16.22.251:9005/user/hadoop
# Create a directory
grunt> mkdir pig
grunt> ls
hdfs://172.16.22.251:9005/user/hadoop/pig <dir>
# Upload a local file to the pig directory
grunt> copyFromLocal /home/hadoop/text.input pig/
grunt> ls
hdfs://172.16.22.251:9005/user/hadoop/pig <dir>
grunt> cd pig/
grunt> ls
hdfs://172.16.22.251:9005/user/hadoop/pig/text.input<r 1> 1587
# Show the file contents
grunt> cat text.input
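The Grunt commands in this session have close counterparts in the HDFS command-line client, which can be handy when the same staging has to happen from an ordinary shell script. A rough sketch, assuming `hdfs` is on the PATH and the cluster is running; `stage_input` is a hypothetical helper name. Relative HDFS paths resolve against the user's home directory, the same /user/hadoop that pwd printed above.

```shell
# stage_input: hypothetical helper mirroring the Grunt steps mkdir, copyFromLocal, ls, cat.
# $1 = local file, $2 = HDFS directory (relative paths resolve under the user's HDFS home).
stage_input() {
    src="$1"
    dest_dir="$2"
    # -p avoids an error when the directory already exists.
    hdfs dfs -mkdir -p "$dest_dir" &&
    hdfs dfs -copyFromLocal "$src" "$dest_dir/" &&
    hdfs dfs -ls "$dest_dir" &&
    hdfs dfs -cat "$dest_dir/$(basename "$src")"
}

# Usage, with the same file and directory as in the session above:
#   stage_input /home/hadoop/text.input pig
```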
Original link: Hadoop Study Notes (13) -- Installing Pig. Please credit the source when reposting!