Tech Share | Kafka Use Cases and Ecosystem

Date: 2022-05-06

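Kafka Use Cases

This article gives an overview of some popular use cases for Apache Kafka.

Messaging

Kafka works well as a replacement for a traditional messaging system. Messaging systems are used for a variety of reasons (decoupling processing from data producers, buffering unprocessed messages, and so on). Compared with most messaging systems, Kafka has better throughput, built-in partitioning, replication, and failover, which makes it a good fit for large-scale message processing. In our experience, messaging workloads often have comparatively low throughput, but they may require low end-to-end latency and strong durability guarantees.

In this domain Kafka is comparable to traditional messaging systems such as ActiveMQ or RabbitMQ.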
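
To make the "built-in partitioning" point concrete, here is a minimal sketch of key-based partitioning in plain Python. It does not use a Kafka client. The partition count, the function names, and the use of CRC32 as the hash are assumptions made for illustration, and Kafka's own clients use a different hash function. The idea it shows is that messages with the same key land in the same partition, so their relative order is preserved.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for this sketch


def choose_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition with a stable hash (CRC32 here, an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def publish(log, key, value):
    """Append (key, value) to the partition chosen for the key and return its index."""
    partition = choose_partition(key)
    log[partition].append((key, value))
    return partition


# In-memory stand-in for a partitioned topic: partition index -> list of messages.
partitions = defaultdict(list)
for user, action in [("alice", "login"), ("bob", "login"), ("alice", "logout")]:
    publish(partitions, user, action)
```

Both "alice" messages end up in the same partition, in the order they were published, regardless of where the "bob" message goes.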

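Website Activity Tracking

Kafka's original use case was user activity tracking: site activity (page views, searches, or other user actions) is published to separate central topics. These messages can be processed and monitored in real time, or loaded into Hadoop or an offline data warehouse for offline processing. The volume is usually very high, because every user page view generates activity messages.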
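
As a rough illustration of publishing activity to separate topics, the sketch below routes events to an in-memory topic chosen by activity type. The "activity." topic prefix, the event fields, and the function names are hypothetical, and a dict of lists stands in for real Kafka topics.

```python
from collections import defaultdict


def topic_for(event: dict) -> str:
    """Pick a topic name from the activity type (the naming scheme is an assumption)."""
    return "activity." + event["type"]


def publish_all(events):
    """Group events into in-memory topics keyed by topic name."""
    topics = defaultdict(list)
    for event in events:
        topics[topic_for(event)].append(event)
    return dict(topics)


events = [
    {"type": "page_view", "user": "u1", "url": "/home"},
    {"type": "search", "user": "u1", "query": "kafka"},
    {"type": "page_view", "user": "u2", "url": "/docs"},
]
topics = publish_all(events)
```

A real-time monitor could then subscribe only to the search topic, while an offline loader reads every topic in bulk.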

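Metrics

Kafka is also often used for monitoring data: statistics produced by distributed applications are aggregated into centralized feeds.

Log Aggregation

Kafka can be used as a replacement for a log aggregation solution.

Stream Processing

Processing of Kafka messages often consists of multiple stages, where raw input data is consumed from Kafka topics and then aggregated, enriched, or otherwise transformed into new topics.

For example, a pipeline for recommending news articles might read article content from an "articles" topic, process that content further to produce new, cleaned-up content, and finally recommend it to users. Such pipelines form real-time data flows built on the individual topics. Starting with 0.10.0.0, Kafka ships a lightweight but powerful stream processing library, Kafka Streams, for this kind of data processing. Besides Kafka Streams, Apache Storm and Apache Samza are alternatives.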
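
The news article pipeline can be sketched as a chain of small stages. This is plain Python, not the Kafka Streams API: lists stand in for topics, and the concrete stages (trimming titles, dropping duplicates, filtering by a tag) are assumptions chosen only to show how one stage feeds the next.

```python
def normalize(articles):
    """Stage 1: strip surrounding whitespace from each title."""
    for article in articles:
        yield {**article, "title": article["title"].strip()}


def deduplicate(articles):
    """Stage 2: drop articles whose title was already seen (case-insensitive)."""
    seen = set()
    for article in articles:
        key = article["title"].lower()
        if key not in seen:
            seen.add(key)
            yield article


def recommend(articles, interests):
    """Stage 3: return titles whose tag matches one of the reader's interests."""
    return [a["title"] for a in articles if a["tag"] in interests]


# Stand-in for the raw "articles" topic.
articles_topic = [
    {"title": " Kafka 0.10 released ", "tag": "kafka"},
    {"title": "kafka 0.10 released", "tag": "kafka"},
    {"title": "Gardening tips", "tag": "garden"},
]

# Stand-in for the new topic holding the processed content.
cleaned_topic = list(deduplicate(normalize(articles_topic)))
picks = recommend(cleaned_topic, {"kafka"})
```

Each stage only reads from the previous one, which is what lets the stages run independently when the lists are replaced by real topics.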

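Event Sourcing

Event sourcing is a style of application design in which state changes are recorded as a time-ordered sequence of records. Kafka's support for very large stored logs makes it a good fit for this scenario.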
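
A small sketch may help show what "state changes recorded in time order" means in practice. The account example, the event fields, and the function names are all hypothetical. The current state is never stored directly; it is rebuilt by replaying the event log in order.

```python
def apply_event(balance: int, event: dict) -> int:
    """Return the new balance after applying a single state-change event."""
    if event["type"] == "deposited":
        return balance + event["amount"]
    if event["type"] == "withdrawn":
        return balance - event["amount"]
    raise ValueError(f"unknown event type: {event['type']}")


def replay(events, initial: int = 0) -> int:
    """Rebuild the state by applying events in sequence-number order."""
    state = initial
    for event in sorted(events, key=lambda e: e["seq"]):
        state = apply_event(state, event)
    return state


# Stand-in for a topic that stores the time-ordered event log.
event_log = [
    {"seq": 1, "type": "deposited", "amount": 100},
    {"seq": 2, "type": "withdrawn", "amount": 30},
    {"seq": 3, "type": "deposited", "amount": 5},
]
balance = replay(event_log)
```

Replaying only a prefix of the log, for example `replay(event_log[:2])`, gives the state as it was at that earlier point.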

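Commit Log

Kafka can serve as an external commit log for a distributed system. The log helps replicate data between nodes and acts as a re-syncing mechanism for failed nodes to restore their data. Kafka's log compaction feature supports this usage well. In this usage Kafka is similar to the Apache BookKeeper project.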
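
The idea behind log compaction can be sketched in a few lines: keep only the most recent record for each key, so a recovering node can restore the latest state without reading the full history. This is a simplified in-memory model with hypothetical names. It leaves out details of the real feature, such as delete markers and background cleaning of log segments.

```python
def compact(log):
    """Keep only the latest record per key, returned in offset order."""
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)  # a later offset overwrites an earlier one
    return sorted((off, key, val) for key, (off, val) in latest.items())


def restore(compacted):
    """Rebuild a key -> value state from compacted (offset, key, value) records."""
    return {key: value for _, key, value in compacted}


# Stand-in for a commit log of (key, value) updates.
log = [("k1", "a"), ("k2", "b"), ("k1", "c"), ("k3", "d"), ("k2", "e")]
compacted = compact(log)
state = restore(compacted)
```

Here the five updates compact down to three records, one per key, and restoring from them gives the same final state as replaying the whole log.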

The Kafka Ecosystem

Many external tools integrate with Kafka, including stream processing systems, Hadoop integration, and monitoring and deployment tools.

Ecosystem
Created by Jay Kreps, last modified by Thomas Weise on Sep 29, 2016
Here is a list of tools we have been told about that integrate with Kafka outside the main distribution.
We haven't tried them all, so they may not work!
Clients, of course, are listed separately here.
Clients
Created by Jun Rao, last modified by Matthew Howlett on Mar 07, 2017
How The Kafka Project Handles Clients
C/C++
Python
Go (AKA golang)
Erlang
.NET
Clojure
Ruby
Node.js
Proxy (HTTP REST, etc)
Perl
stdin/stdout
PHP
Rust
Alternative Java
Storm
Scala DSL
Clojure
Client Libraries Previously Supported
How The Kafka Project Handles Clients
Starting with the 0.8 release we are maintaining all but the jvm client external to the main code base.
The reason for this is that it allows a small group of implementers who know the language of that client
to quickly iterate on their code base on their own release cycle. Having these maintained centrally was
becoming a bottleneck as the main committers can't hope to know every possible programming language to be
able to perform meaningful code review and testing. This led to a scenario where the committers were
attempting to review and test code they didn't understand.
We are instead moving to the redis/memcached model which seems to work better at supporting a rich
ecosystem of high quality clients.
We haven't tried all these clients and can't vouch for them. The normal rules of open source apply.
If you are aware of other clients not listed here (or are the author of such a client), please add it
here. Please also feel free to help fill out information on the features the client supports, level of
activity of the project, level of documentation, etc.
C/C++
-------------
Robust high performance C/C++ library with full protocol support
A bunch of other language bindings have been built on top of it, including Haskell, Node.js, OCaml, PHP,
Python, Ruby, C# / .NET.
https://github.com/edenhill/librdkafka
Kafka Version: 0.7.x, 0.8.x, 0.9.x, 0.10.x
Maintainers: Magnus Edenhill
License: 2-clause BSD
-------------
Native C++ library with protocol support for Metadata, Produce, Fetch, and Offset.
Kafka Version: 0.8.x
Maintainer: David Tompkins
License: Apache 2.0
https://github.com/adobe-research/libkafka
-------------
C++ Header-only Kafka Client Library using Boost Asio
Kafka Version: 0.8.x
Maintainer: Daniel Joos
License: MIT
https://github.com/danieljoos/libkafka-asio
-------------
A C++11 asynchronous producer/consumer library for Apache Kafka based on boost asio
Kafka Version: 0.8.x
Maintainer: Svante Karlsson
License: Boost Software License - Version 1.0
https://github.com/bitbouncer/csi-kafka
-------------
libasynckafkaclient - C++ based single threaded asynchronous library
Kafka Version: 0.8.x
Maintainer: Vijay Jadhav
License: 2-clause BSD
https://github.com/GSLabDev/libasynckafkaclient
-------------
https://github.com/quipo/kafka-cpp
Kafka Version: 0.7.x
-------------
Python
-------------
High performance Python client based on Librdkafka with full protocol support.
https://github.com/confluentinc/confluent-kafka-python
Docs: http://docs.confluent.io/current/clients/index.html
Kafka Version: 0.8.x, 0.9.x, 0.10.x
Maintainers: Confluent
License: Apache 2.0
-------------
Pure Python implementation with full protocol support. Consumer and Producer implementations included,
GZIP, LZ4, and Snappy compression supported.
http://github.com/dpkp/kafka-python
Kafka Version: 0.8.x, 0.9.x, 0.10.x
Maintainer: Dana Powers
License: Apache 2.0
-------------
Python driver with full protocol support and balanced consumer implementation. GZIP and Snappy
compression supported.
https://github.com/Parsely/pykafka
Kafka Version: 0.8.x
Maintainer: Parse.ly
License: Apache v2.0
-------------
Protocol support for Kafka 0.7 in Python. GZip and Snappy compression supported
Kafka Version: 0.7.x
Maintainer: David Arthur
License: Apache v.2.0
https://github.com/mumrah/kafka-python/tree/0.7
-------------
Also:
https://github.com/dsully/pykafka
Kafka Version: 0.7.x
Maintainer: Dan Sulley, LinkedIn
License: Apache 2.0
-------------
Python client from Disqus:
https://github.com/getsamsa/samsa
Kafka Version: 0.7.x
Maintainer: Keith Bourgoin, Parse.ly
License: Apache 2.0
-------------
Python client from Urban Airship: https://github.com/urbanairship/pykafkap
Kafka Version: 0.7.x
-------------
Python client from Datadog: https://github.com/datadog/brod
(Producer, Simple Consumer, ZK-Consumer)
Kafka Version: 0.7.x
-------------
Go (AKA golang)
-------------
Pure Go implementation with full protocol support. Consumer and Producer implementations included, GZIP
and Snappy compression supported.
Kafka Version: 0.8.x
Maintainer: Shopify
License: MIT
https://github.com/Shopify/sarama
-------------
Enhanced Go Kafka consumer and producer implementations.
Kafka Version: 0.8.x
Maintainer: Big Data Open Source Security
License: Apache v2.0
https://github.com/stealthly/go_kafka_client
-------------
A pure Go implementation of the low level Kafka API.
Kafka Version: 0.8.x
Maintainer: Big Data Open Source Security
License: Apache v2.0
https://github.com/stealthly/siesta
-------------
A fast pure Go Kafka implementation with a clean API.
Kafka Version: 0.8.x
Maintainer: OptioPay
License: MIT
https://github.com/optiopay/kafka
-------------
https://github.com/nuance/kafka
https://github.com/jdamick/kafka.go
Kafka Version: 0.7.x
-------------
confluent-kafka-go: Confluent's Kafka client for Golang wraps the librdkafka C library, providing full
Kafka protocol support with great performance and reliability.
The Golang bindings provide a high-level Producer and Consumer with support for the balanced consumer
groups of Apache Kafka 0.9 and above.
Kafka Version: 0.8.x+
Maintainer: Confluent
License: Apache v2.0
https://github.com/confluentinc/confluent-kafka-go
Docs: http://docs.confluent.io/current/clients/index.html
-------------
Erlang
-------------
Kafka client library in Erlang. Full support for 0.9+ consumer protocol, very efficient producer
implementation.
https://github.com/klarna/brod
Kafka Version: 0.9.x, 0.10.x
Maintainers: Ivan Dyachkov (Klarna AB), Shi Zaiming (Klarna AB)
License: Apache 2.0
-------------
A minimal, high-performance Kafka client in Erlang
https://github.com/helpshift/ekaf
Kafka Version: 0.8.x
Maintainer: Helpshift
License: Apache v2
-------------
erlkafka is a kafka client written in erlang
https://github.com/milindparikh/erlkafka.git
Kafka Version: 0.7.x
Maintainer: Milind Parikh
License: BSD, LGPL
Also:
https://github.com/wooga/kafka-erlang
.NET
-------------
A fully featured .NET client for Apache Kafka based on librdkafka (a fork of rdkafka-dotnet).
Kafka Version: 0.8.x - 0.10.x
Maintainer: Confluent Inc. (original author Andreas Heider)
License: Apache 2.0
https://github.com/confluentinc/confluent-kafka-dotnet
-------------
Pure C# client with full protocol support. Includes consumer, producer,
lower level components and gzip support (no snappy)
Kafka Version: 0.8.x
Maintainer: James Roland
License: Apache 2.0
https://github.com/Jroland/kafka-net
-------------
This is a .NET implementation of a client for Kafka using C# for Kafka 0.8. It provides for an
implementation that covers most basic functionalities to include a simple Producer and Consumer.
Kafka Version: 0.8.x
Maintainer: ExactTarget
License: Apache 2.0
https://github.com/ExactTargetDev/kafka-net
-------------
.Net implementation of the Apache Kafka Protocol that provides basic functionality through
Producer/Consumer classes. The project also offers balanced consumer implementation. The project is a
fork from ExactTarget's Kafka-net Client.
Kafka Version: 0.8.x, 0.9.x
Maintainer: Microsoft
License: Apache 2.0
https://github.com/Microsoft/Kafkanet
-------------
C# client, asynchronous, all 3 compressions supported (read and write), tracks leader partition changes
transparently, long time in production.
Kafka Version: 0.8.x
Maintainer: Vadim Chekan
License: Apache-2.0
https://github.com/ntent-ad/kafka4net
-------------
kafka-sharp - "High Performance" .NET Kafka Driver
Kafka Version: 0.8.x
Maintainer: Criteo
License: Apache 2.0
https://github.com/criteo/kafka-sharp
-------------
Clojure
-------------
Fast kafka api for JVM languages implemented in clojure.
Kafka Version: 0.8.x
Maintainer: https://github.com/gerritjvv
License: Apache 2.0
https://github.com/gerritjvv/kafka-fast
-------------
Wrapper to the Java API for interacting with Kafka
Kafka Version: 0.8.x
Maintainer: https://github.com/pingles
License: Apache 2.0
https://github.com/pingles/clj-kafka/
-------------
Kafka clojure client library
Kafka Version: 0.8.x
Maintainer: Pierre-Yves Ritschard
License: MIT
Code, Documentation
-------------
Ruby
-------------
[Unmaintained] Pure Ruby, Consumer and Producer implementations included, GZIP and Snappy compression
supported. Ruby 1.9.3 and up (CI runs MRI 2.0, JRuby and Rubinius).
Kafka Version: 0.8.x
Maintainer: Bob Potter
License: MIT
https://github.com/bpot/poseidon
-------------
Karafka - Framework used to simplify Apache Kafka based Ruby applications development.
Kafka Version: 0.9.x
Maintainer: Maciej Mensfeld
License: MIT
https://github.com/karafka/karafka
-------------
ruby-kafka - A Ruby client library for the Kafka distributed log system
Kafka Version: 0.8.x
Maintainer: Zendesk / Daniel Schierbeck
License: Apache 2.0
https://github.com/zendesk/ruby-kafka
-------------
JRuby wrapper for producers and consumers of the existing API
Kafka Version: 0.8.x
Maintainer: Joseph Lawson
License: Apache 2.0
https://github.com/joekiller/jruby-kafka
-------------
https://github.com/acrosa/kafka-rb
Kafka Version: 0.7.x
Maintainer: Alejandro Crosa
License: Apache 2.0
-------------
Event machine client:
https://github.com/groupme/em-kafka
Kafka Version: 0.7.x
-------------
JRuby Event stream processor
https://github.com/wooga/kafkaesque
Kafka Version: 0.7.x
-------------
Node.js
-------------
The node-rdkafka library is a high-performance NodeJS client for Apache Kafka that wraps the native
librdkafka library. All the complexity of balancing writes across partitions and managing (possibly ever-changing) brokers should be encapsulated in the library.
https://github.com/Blizzard/node-rdkafka
Kafka Version: 0.9, 0.10
Node.js >= 4
Maintainer: Blizzard.com
License: MIT
-------------
Kafka-Node is a NodeJS client with Zookeeper integration
Kafka Version: 0.8.x
Maintainer: sohu.com
License: MIT
https://github.com/SOHU-Co/kafka-node/
-------------
Kafka-node is a pure JavaScript implementation for NodeJS Server with Vagrant and Docker support.
Kafka Version: 0.8.x
Maintainer: wurstmeister
License: Apache 2.0
https://github.com/wurstmeister/node-kafka-0.8-plus
-------------
Node-kafka is a node.js wrapper for the C library librdkafka
Kafka Version: 0.8.x
Maintainer: Sutoiku
License: MIT
https://github.com/sutoiku/node-kafka
-------------
kafka-java-bridge is a Node.js wrapper for the Java high-level Kafka 0.8 consumer API
Kafka Version: 0.8.x
Maintainer: LivePersonInc
License: MIT
https://www.npmjs.com/package/kafka-java-bridge
https://github.com/LivePersonInc/kafka-java-bridge
-------------
Low-level protocol support in node.js.
https://github.com/cainus/Prozess
https://npmjs.org/package/prozess
Kafka Version: 0.7.x
Maintainers: Gregg Caines, Eric lee
License: MIT
-------------
Alternate node client from Tagged
https://github.com/marcuswestin/node-kafka
Also:
https://github.com/dannycoates/franz-kafka
Kafka Version: 0.7.x
-------------
Proxy (HTTP REST, etc)
-------------
Dory is a producer daemon that supports clients in various programming languages. Clients send messages
to Dory using local interprocess communication. Dory then takes full responsibility for reliable message
delivery. Monitoring infrastructure can query Dory's web interface for JSON-based status and data
quality reports.
https://github.com/dspeterson/dory
Kafka Version: 0.8.x
Maintainer: Dave Peterson
License: Apache v.2.0
-------------
Apache Kafka HTTP Endpoint for producing and consuming messages from topics
Kafka Version: 0.8.x
Maintainer: Big Data Open Source Security LLC
License: Apache 2.0
https://github.com/stealthly/dropwizard-kafka-http
-------------
(Deprecated) Kafka high level Producer and Consumer APIs are very hard to implement right. Rest endpoint
gives access to native Scala high level consumer and producer APIs.
Maintainer: Sasha Klizhentas
License: Apache 2.0
https://github.com/mailgun/kafka-http
-------------
The Kafka REST Proxy provides a RESTful interface to a Kafka cluster.
Kafka Version: 0.8.x, 0.9.x, 0.10.x
Maintainer: Confluent
License: Apache 2.0
Docs: http://confluent.io/docs/current/kafka-rest/docs/intro.html
Code: https://github.com/confluentinc/kafka-rest
-------------
Kafka-Pixy is a local aggregating HTTP proxy to Kafka messaging cluster.
Kafka Version: 0.8.x
Maintainer: Mailgun
License: Apache 2.0
https://github.com/mailgun/kafka-pixy
-------------
Efficient Kafka REST Proxy for producers
https://github.com/klarna/kastle
Kafka Version: 0.8.x
Maintainers: Ivan Dyachkov (Klarna AB), Shi Zaiming (Klarna AB)
License: Apache 2.0
Perl
-------------
Pure Perl, Consumer and Producer implementations included. Zookeeper
integration. GZIP and Snappy compression supported.
Kafka Version: 0.8.x
Maintainer: Sergey Gladkov
License: Artistic License
https://github.com/TrackingSoft/Kafka
http://search.cpan.org/~sgladkov/Kafka/lib/Kafka.pm
stdin/stdout
-------------
Generic producer and consumer for stdin and stdout.
https://github.com/edenhill/kafkacat
Kafka Version: 0.8.x
Maintainer: Magnus Edenhill
License: 2-clause BSD
PHP
-------------
PHP extension built on librdkafka
Kafka Version: 0.8.x
Maintainer: Elias Van Ootegem
License: MIT
https://github.com/EVODelavega/phpkafka
-------------
PHP client based on librdkafka
Kafka Version: 0.8.x
Maintainer: Arnaud Le Blanc, for Mention.com
License: MIT
https://github.com/arnaud-lb/php-rdkafka
-------------
PHP library with Consumer (simple and Zookeeper-based), Producer and compression support (release notes).
https://github.com/quipo/kafka-php
Kafka Version: 0.7.x
Maintainer: Lorenzo Alberton
License: Apache v.2.0
Also:
https://github.com/michal-harish/kafka-php
Log4PHP Appender
https://github.com/dastra/log4php-kafka
-------------
Rust
-------------
Pure Rust implementation with support for Metadata, Produce, Fetch, and Offset requests. Supports Gzip
and Snappy compression
Kafka Version: 0.8.x
Maintainer: Yousuf Fauzan
License: MIT
Code
Documentation
Alternative Java
-------------
Of course the main project maintains a set of jvm-based clients. But here are alternate clients.
Krackle is an optimized Kafka client built by Blackberry.
Kafka Version: 0.8.x
Maintainer: Blackberry
License: Apache 2.0
https://github.com/blackberry/Krackle/blob/dev/LICENSE
Storm
-------------
Port of Apache storm-kafka to >= 0.8.0
Maintainer: wurstmeister
License: Apache 2.0
https://github.com/wurstmeister/storm-kafka-0.8-plus
Scala DSL
-------------
A DSL for Scala developers to produce and consume messages with Kafka.
Kafka Version: 0.8.x
Maintainer: Big Data Open Source Security LLC
License: Apache 2.0
https://github.com/stealthly/scala-kafka
Clojure
-------------
https://github.com/pingles/clj-kafka
https://github.com/miniway/kafka-clj
Kafka Version: 0.7.x
Client Libraries Previously Supported
https://svn.apache.org/repos/asf/incubator/kafka/branches/legacy_client_libraries