A Detailed Example of a Custom OutputFormat in Java
Date: 2019-03-31
This article walks through a working example of a custom Hadoop MapReduce OutputFormat written in Java, followed by notes on how OutputFormat and RecordWriter fit together.
Example code:
package com.ccse.hadoop.outputformat;

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.OutputCommitter;
import org.apache.hadoop.mapreduce.OutputFormat;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter;

public class MySelfOutputFormatApp {

    public final static String INPUT_PATH = "hdfs://chaoren1:9000/mapinput";
    public final static String OUTPUT_PATH = "hdfs://chaoren1:9000/mapoutput";
    public final static String OUTPUT_FILENAME = "/abc";

    public static void main(String[] args) throws IOException, URISyntaxException,
            ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();

        // Delete any previous output so the job can be rerun.
        FileSystem fileSystem = FileSystem.get(new URI(OUTPUT_PATH), conf);
        fileSystem.delete(new Path(OUTPUT_PATH), true);

        // Deprecated in newer Hadoop releases, which offer Job.getInstance(conf, name).
        Job job = new Job(conf, MySelfOutputFormatApp.class.getSimpleName());
        job.setJarByClass(MySelfOutputFormatApp.class);

        FileInputFormat.setInputPaths(job, new Path(INPUT_PATH));

        job.setMapperClass(MyMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);

        job.setReducerClass(MyReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        // Use the custom output format defined below.
        job.setOutputFormatClass(MyselfOutputFormat.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

    // Word-count mapper: emits (word, 1) for every token in the line.
    public static class MyMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private Text word = new Text();
        private LongWritable writable = new LongWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            if (value != null) {
                String line = value.toString();
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    word.set(tokenizer.nextToken());
                    context.write(word, writable);
                }
            }
        }
    }

    // Word-count reducer: sums the counts for each word.
    public static class MyReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                throws IOException, InterruptedException {
            long sum = 0;
            for (LongWritable value : values) {
                sum += value.get();
            }
            context.write(key, new LongWritable(sum));
        }
    }

    public static class MyselfOutputFormat extends OutputFormat<Text, LongWritable> {

        @Override
        public RecordWriter<Text, LongWritable> getRecordWriter(TaskAttemptContext context)
                throws IOException, InterruptedException {
            try {
                FileSystem fileSystem = FileSystem.get(
                        new URI(MySelfOutputFormatApp.OUTPUT_PATH), context.getConfiguration());
                // Specify the output file path. Every task writes to this one file and
                // overwrite is false, so this only works with a single reducer and no retries.
                final Path path = new Path(
                        MySelfOutputFormatApp.OUTPUT_PATH + MySelfOutputFormatApp.OUTPUT_FILENAME);
                return new MySelfRecordWriter(fileSystem.create(path, false));
            } catch (URISyntaxException e) {
                // Fail the task instead of returning a writer with a null stream.
                throw new IOException(e);
            }
        }

        @Override
        public void checkOutputSpecs(JobContext context)
                throws IOException, InterruptedException {
            // No checks: main() already deletes the output directory.
        }

        @Override
        public OutputCommitter getOutputCommitter(TaskAttemptContext context)
                throws IOException, InterruptedException {
            return new FileOutputCommitter(new Path(MySelfOutputFormatApp.OUTPUT_PATH), context);
        }
    }

    public static class MySelfRecordWriter extends RecordWriter<Text, LongWritable> {
        private FSDataOutputStream outputStream = null;

        public MySelfRecordWriter(FSDataOutputStream outputStream) {
            this.outputStream = outputStream;
        }

        @Override
        public void write(Text key, LongWritable value)
                throws IOException, InterruptedException {
            // Text.getBytes() returns the backing array, valid only up to getLength().
            this.outputStream.write(key.getBytes(), 0, key.getLength());
            this.outputStream.writeBytes("\t");
            // Write the count as decimal text; writeLong would emit 8 raw bytes.
            this.outputStream.writeBytes(Long.toString(value.get()));
            this.outputStream.writeBytes("\n");
        }

        @Override
        public void close(TaskAttemptContext context)
                throws IOException, InterruptedException {
            this.outputStream.close();
        }
    }
}
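One detail worth checking in a RecordWriter like this is how the count is serialized. FSDataOutputStream is a subclass of java.io.DataOutputStream, so writeLong emits eight raw bytes, while writing the decimal string produces a line that text tools can read. The following is a minimal sketch of the difference using only java.io; the class CountEncodingDemo and its two helper methods are hypothetical names introduced for illustration and are not part of the job above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: compares two ways of serializing a
// (word, count) record to a DataOutputStream.
class CountEncodingDemo {

    // Binary form: key bytes, a tab, then the count as 8 big-endian bytes.
    static byte[] writeBinary(String key, long count) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.write(key.getBytes(StandardCharsets.UTF_8));
            out.writeBytes("\t");
            out.writeLong(count);
        }
        return buffer.toByteArray();
    }

    // Text form: key, a tab, the count in decimal, and a newline.
    static byte[] writeText(String key, long count) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.write(key.getBytes(StandardCharsets.UTF_8));
            out.writeBytes("\t");
            out.writeBytes(Long.toString(count));
            out.writeBytes("\n");
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] binary = writeBinary("hello", 3L);
        byte[] text = writeText("hello", 3L);
        // 14 bytes: 5 for the key, 1 for the tab, 8 for the long.
        System.out.println("binary record length: " + binary.length);
        System.out.print("text record: " + new String(text, StandardCharsets.UTF_8));
    }
}
```

The binary form is compact but has no record separator and cannot be inspected with tools such as hadoop fs -cat; the text form is what a word-count output usually wants.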
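2. OutputFormat is responsible for handling the various output destinations of a job.
2.1 The key-value pairs that OutputFormat writes out come from the Reducer: each pair the Reducer emits with context.write(...) is passed on to the RecordWriter.
2.2 The write(...) method of RecordWriter receives only k and v. So where do they get written? The destination is supplied separately, by passing an OutputStream into the RecordWriter; write(...) simply writes k and v to that stream.
2.3 The RecordWriter is obtained from the OutputFormat. A custom output format must therefore extend the OutputFormat class, and the stream object must be obtained inside its getRecordWriter(...) method.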
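The relationship described in 2.1 to 2.3 can be sketched without Hadoop on the classpath. The example below is a simplified stand-in, not Hadoop's API: TabSeparatedFormat, TabSeparatedWriter and OutputFormatSketch are hypothetical names, a Supplier<OutputStream> plays the role of FileSystem.create(...), and the run method plays the role of the framework that feeds reducer output to the writer.

```java
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;

// Hypothetical stand-in for a custom OutputFormat: it knows how to open
// the destination and creates the writer that will use it.
class TabSeparatedFormat {
    private final Supplier<OutputStream> opener; // stands in for FileSystem.create(path)

    TabSeparatedFormat(Supplier<OutputStream> opener) {
        this.opener = opener;
    }

    // Mirrors getRecordWriter(...): the stream is obtained here and handed to the writer.
    TabSeparatedWriter getRecordWriter() {
        return new TabSeparatedWriter(opener.get());
    }
}

// Hypothetical stand-in for a RecordWriter.
class TabSeparatedWriter implements Closeable {
    private final OutputStream out;

    TabSeparatedWriter(OutputStream out) {
        this.out = out;
    }

    // Mirrors write(k, v): it sees only the pair; the destination was injected.
    void write(String key, long value) throws IOException {
        out.write((key + "\t" + value + "\n").getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}

class OutputFormatSketch {
    // Plays the framework's role: get a writer, pass it each reducer pair, close it.
    static String run(Map<String, Long> reducerOutput) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        TabSeparatedFormat format = new TabSeparatedFormat(() -> sink);
        try (TabSeparatedWriter writer = format.getRecordWriter()) {
            for (Map.Entry<String, Long> entry : reducerOutput.entrySet()) {
                writer.write(entry.getKey(), entry.getValue());
            }
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Map<String, Long> counts = new TreeMap<>();
        counts.put("hadoop", 2L);
        counts.put("hello", 3L);
        System.out.print(run(counts));
    }
}
```

Keeping the stream out of write(...) is what lets the same writer logic target a file, a socket, or an in-memory buffer; only the code that opens the stream changes.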