The Ins and Outs of Apache Kafka

Modern technologies require modern systems to handle them. With trillions of requests and petabytes of data transmitted from devices every day, it is hard to imagine a world in which everyone does not carry a smartphone, tablet, or laptop. How do enterprise-level applications handle such massive demands on their infrastructure? How does a company serve millions of customers in real time, and in record time? Enter Apache Kafka, an open-source distributed stream processing platform originally developed at LinkedIn and donated to the Apache Software Foundation in 2011. Kafka breathes new life into enterprise applications by giving them highly optimized network, memory, and disk access and transmission protocols that lay the foundation for large-scale real-time applications. In this blog post we are going to learn about the ins and outs of Kafka and the key role it can play in many of today's applications.

Kafka is software that runs across multiple data centers, which is what makes it a distributed streaming platform. It runs as a cluster that stores records in the form of streams. Each record contains three parts: a key, a value, and a timestamp. Each of these parts is paramount in keeping data in contiguous storage, which is at the heart of Kafka's optimizations. The Kafka documentation refers to these streams of records as 'topics'; topics are the abstraction through which Kafka organizes data. To fully understand Kafka we would need to break down how these topics are stored, arranged, and manipulated, and how the cluster is distributed across multiple servers. That would be beyond the scope of a single blog article, so we will focus mostly on the higher-level details and their implications for today's applications. Let's begin by talking about distribution!
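
To make the record structure concrete, here is a minimal sketch in Java using the standard Kafka client. The topic name, key, and value are made-up placeholders, not anything from the original post.

```java
import org.apache.kafka.clients.producer.ProducerRecord;

public class RecordPartsExample {
    public static void main(String[] args) {
        // A Kafka record carries three parts: a key, a value, and a timestamp.
        // "page-views" and the key/value below are hypothetical placeholders.
        ProducerRecord<String, String> record = new ProducerRecord<>(
                "page-views",                 // topic
                null,                         // partition (null = let Kafka's partitioner decide)
                System.currentTimeMillis(),   // timestamp
                "user-42",                    // key
                "{\"page\": \"/home\"}");     // value

        System.out.printf("key=%s value=%s timestamp=%d%n",
                record.key(), record.value(), record.timestamp());
    }
}
```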

Kafka runs as a cluster across several servers, which can sit in the same data center or be spread across multiple data centers. In Kafka, a topic, a stream of records, and an ordered commit log are all essentially synonymous. Each topic is split into partitions; in the example sketched below, the topic is divided into three partitions.
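
As an illustration (not from the original post), here is a minimal Java AdminClient sketch that creates a topic with three partitions, each replicated to two brokers. The broker address and topic name are assumed placeholders.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.Collections;
import java.util.Properties;

public class CreateTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker

        try (AdminClient admin = AdminClient.create(props)) {
            // Three partitions spread across the cluster, each copied to two brokers.
            NewTopic topic = new NewTopic("page-views", 3, (short) 2);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```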

The partitions of a topic are not kept together: they can be spread across several different servers in the cluster, and those servers could be hundreds if not thousands of miles apart. Each partition is also duplicated on other servers as a fault-tolerance measure, ensuring no data loss or downtime for the system. The interesting part of this whole setup is that a single server acts as the 'leader' of the partition it is handling requests for. If it fails to handle a request or fails to serve reads for the partition, it is removed as the leader and another server takes over and handles the incoming requests for that partition. This leads to a highly effective and hugely scalable system that is able to handle trillions of requests, because it is self-balancing within the cluster. That's great and all, but how does it do it?
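
Continuing the hypothetical 'page-views' topic from above, here is a hedged sketch of how the AdminClient can report which broker currently leads each partition and which replicas are in sync.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartitionInfo;

import java.util.Collections;
import java.util.Properties;

public class DescribeTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker

        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription description = admin
                    .describeTopics(Collections.singleton("page-views"))
                    .allTopicNames()   // on older client versions this method is all()
                    .get()
                    .get("page-views");

            for (TopicPartitionInfo partition : description.partitions()) {
                // One broker leads each partition; the others hold in-sync copies.
                System.out.printf("partition %d: leader=%s, in-sync replicas=%s%n",
                        partition.partition(), partition.leader(), partition.isr());
            }
        }
    }
}
```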

Kafka's architecture is built around four main APIs: producers, consumers, connectors, and streams. We will take a brief look at each of these.

Producers: give an application a way to publish streams of records to Kafka topics. These records are then stored in the partitions we spoke about earlier. A producer may publish to one or more topics.
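
For illustration only, here is a minimal producer sketch in Java; the broker address, topic name, key, and value are all assumed placeholders.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class ProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");                // assumed broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Records with the same key are routed to the same partition.
            producer.send(new ProducerRecord<>("page-views", "user-42", "{\"page\": \"/home\"}"));
        }
    }
}
```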

Consumers: read records from topics by 'subscribing' to the topics assigned to them.
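
A matching consumer sketch, again with the broker, consumer group, and topic names assumed purely for illustration:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class ConsumerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");                    // assumed broker
        props.put("group.id", "page-view-readers");                          // assumed group
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("page-views"));
            while (true) {
                // Poll the subscribed topic and print each record as it arrives.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d key=%s value=%s%n",
                            record.offset(), record.key(), record.value());
                }
            }
        }
    }
}
```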

Connectors: allow external systems such as databases to be connected to the Kafka cluster, so existing data sources can feed topics or be fed from them.

Streams: allow stream processors to transform the data coming in from producers and output the results to different topics.
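
As an illustrative sketch of the Streams API, the following Java snippet reads from one hypothetical topic, transforms each value, and writes the results to another topic; all names and the transformation are assumptions made for the example.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class StreamsExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "page-view-enricher");   // assumed app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");    // assumed broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> views = builder.stream("page-views");
        views.mapValues(value -> value.toUpperCase())   // transform each record's value
             .to("page-views-uppercased");              // write the result to a different topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
    }
}
```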

As applications grow larger and more complex, and more users and devices need to be serviced by them, it was inevitable that a technology would emerge to handle that load. At some point in the future the capabilities of this software will no doubt be exceeded, and we may have to find another to replace it. As it stands, Kafka has gained rapid adoption from the tech community, and it bodes well for technology enthusiasts to become more and more familiar with distributed stream processing platforms.
