Hi everybody,
I am trying to understand manual offset management in Alpakka Kafka, but I am having trouble with some of the concepts…
Reading the existing API, I have the feeling that the manual offset methods in Source.scala are mainly designed for external offset management, not for letting business logic decide when to commit an offset that Kafka itself manages.
I have used Kafka’s manual offset management in many projects (without Akka and Alpakka Kafka) for long-running processes: when the business logic succeeds, it signals a commit of the Kafka offset to mark the message as successfully processed.
I could implement the same logic with a plain Akka/Kafka combination, without Alpakka: write a Kafka consumer, send each message to Akka with an ask, pass the ‘offset’ along as payload, and return the ‘offset’ in the response to the ask when the business logic succeeds. However, my main motivation for using Alpakka is to take advantage of its backpressure mechanisms.
But when I look at the methods in Source.scala, ‘plainPartitionedManualOffsetSource’ and ‘committablePartitionedManualOffsetSource’, they give me the impression that they exist for external offset management and not really for committing the offset depending on the result of the business case.
To be more concrete, this is an Alpakka stream configuration that works for me at the moment:
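For reference, the pattern I mean looks roughly like the sketch below. It is only an illustration under assumptions: the broker address, the group id, String keys and values, Scala 2.13 converters, and the `businessLogicSucceeded` function are all placeholders I made up for this post, not code from my projects.

```scala
import java.time.Duration
import java.util.{Collections, Properties}

import org.apache.kafka.clients.consumer.{ConsumerConfig, KafkaConsumer, OffsetAndMetadata}
import org.apache.kafka.common.serialization.StringDeserializer

import scala.jdk.CollectionConverters._

object ManualCommitSketch {
  // Hypothetical stand-in for the long-running business logic.
  def businessLogicSucceeded(value: String): Boolean = value.nonEmpty

  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092") // assumed broker address
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group")                // assumed group id
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false")         // commits are manual
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, classOf[StringDeserializer].getName)
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, classOf[StringDeserializer].getName)

    val consumer = new KafkaConsumer[String, String](props)
    try {
      consumer.subscribe(Collections.singletonList("myTopic"))
      while (true) {
        val records = consumer.poll(Duration.ofMillis(500))
        for (tp <- records.partitions().asScala) {
          val batch = records.records(tp).asScala
          // Process in offset order and stop at the first business failure.
          val succeeded = batch.takeWhile(r => businessLogicSucceeded(r.value()))
          // Commit the position after the last successfully processed record.
          succeeded.lastOption.foreach { last =>
            consumer.commitSync(
              Collections.singletonMap(tp, new OffsetAndMetadata(last.offset() + 1))
            )
          }
          // Rewind to the failed record so the next poll delivers it again.
          if (succeeded.size < batch.size) {
            consumer.seek(tp, batch(succeeded.size).offset())
          }
        }
      }
    } finally {
      consumer.close()
    }
  }
}
```

The point is that the commit only happens after the business logic reports success, and a failed record is read again instead of being skipped.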
val control: Consumer.DrainingControl[Done] =
  Consumer
    .sourceWithOffsetContext(consumerSettings, Subscriptions.topics("myTopic"))
    .mapAsync(streamConfigProperties.getAkkaStreamParallelism) { consumerRecord =>
      val myAvro: MyAvro = consumerRecord.value().asInstanceOf[MyAvro]
      askUpdate(myAvro)
    }
    .via(Committer.flowWithOffsetContext(CommitterSettings(AkkaSystem.system.toClassic)))
    .toMat(Sink.ignore)(Consumer.DrainingControl.apply)
    .run()
This works, but as I mentioned, I am trying to convert it to:
val control: Consumer.DrainingControl[Done] =
  Consumer
    .committablePartitionedManualOffsetSource(
      consumerSettings,
      Subscriptions.topics("myTopic"),
      partitions => getOffsetsOnAssign(partitions, consumerSettings),
      partitions => Set[TopicPartition]()
    )
    .map { source =>
      // source is a (TopicPartition, Source) pair; run one sub-stream per assigned partition
      source._2
        .mapAsyncUnordered(streamConfigProperties.getAkkaStreamParallelism) { message =>
          val myAvro: MyAvro = message.record.value().asInstanceOf[MyAvro]
          askUpdate(myAvro, message.committableOffset)
            .map {
              case _: MyActor.ProcessCompleteResponse =>
                message.committableOffset
              case _ =>
                AkkaSystem.mySystem.log.info("Business Case says we can't commit")
                // mapAsyncUnordered drops an element whose Future completes with null
                null
            }
        }
        .runWith(Committer.sink(CommitterSettings(AkkaSystem.mySystem.toClassic)))
    }
    .toMat(Sink.ignore)(Consumer.DrainingControl.apply)
    .run()
def getOffsetsOnAssign(
    partitions: Set[TopicPartition],
    consumerSettings: ConsumerSettings[String, SpecificRecord]
): Future[Map[TopicPartition, Long]] =
  Future {
    val kafkaConsumer: org.apache.kafka.clients.consumer.Consumer[String, SpecificRecord] =
      consumerSettings.createKafkaConsumer()
    try {
      // committed() maps a partition to null when the group has no committed offset for it
      kafkaConsumer
        .committed(partitions.asJava)
        .asScala
        .map { case (partition, committed) =>
          partition -> Option(committed).map(_.offset()).getOrElse(0L)
        }
        .toMap
    } finally {
      // Close the short-lived consumer so it is not leaked on every assignment
      kafkaConsumer.close()
    }
  }
According to my tests this works too, but I am not sure it is the correct way to do this, and maybe more compact code could be written for it.
Also, I am not sure what is expected from us when the ‘onRevoke’ callback of ‘committablePartitionedManualOffsetSource’ is invoked.
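One thing I want to double-check in the second variant: as far as I understand, a Kafka commit is positional per partition (committing position N marks everything before N as consumed), so dropping the offset of a failed message does not keep that message uncommitted once a later offset from the same partition is committed. The sketch below models this for a single partition using only the Scala standard library; `Outcome`, `safeCommitPosition`, and `commitPositionWhenDroppingFailures` are hypothetical names for illustration and are not part of Alpakka.

```scala
object PositionalCommitSketch {
  // Hypothetical model: the result of the business logic for one offset.
  final case class Outcome(offset: Long, success: Boolean)

  // Position to commit if we never want to move past a failed message:
  // one past the last offset of the contiguous successful prefix.
  def safeCommitPosition(outcomes: Seq[Outcome]): Option[Long] =
    outcomes.sortBy(_.offset).takeWhile(_.success).lastOption.map(_.offset + 1)

  // Position that effectively gets committed when failures are simply
  // dropped and every successful offset is committed.
  def commitPositionWhenDroppingFailures(outcomes: Seq[Outcome]): Option[Long] =
    outcomes.filter(_.success).map(_.offset + 1).reduceOption(_ max _)

  def main(args: Array[String]): Unit = {
    val outcomes = Seq(Outcome(10, true), Outcome(11, false), Outcome(12, true))
    println(safeCommitPosition(outcomes))                 // Some(11)
    println(commitPositionWhenDroppingFailures(outcomes)) // Some(13)
  }
}
```

In this model the failed offset 11 ends up behind the committed position 13, so after a restart it would not be delivered again. If that reading is right, the "can't commit" branch needs something more than skipping the commit.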
Any comments or suggestions?