Akka Cluster: after the seed node goes down and recovers, member nodes cannot connect to it again

My example is modified from the official sample akka-sample-cluster-java.
The configuration:

akka {
  loglevel = debug
  actor {
    provider = cluster

    serialization-bindings {
      "sample.cluster.CborSerializable" = jackson-cbor
    }
  }
  remote {
    artery {
      canonical.hostname = "127.0.0.1"
      canonical.port = 0
    }
  }
  cluster {
    seed-nodes = [
      "akka://ClusterSystem@127.0.0.1:25260",
      "akka://ClusterSystem@127.0.0.1:25261"]
    downing-provider-class = "akka.cluster.sbr.SplitBrainResolverProvider"
  }
}

Server (seed node):

mvn exec:java -Dexec.mainClass="sample.cluster.stats.DcServer" -Dexec.args="25260"
[INFO] --- exec-maven-plugin:3.0.0:java (default-cli) @ akka-sample-cluster-java ---
SLF4J: A number (4) of logging calls during the initialization phase have been intercepted and are
SLF4J: now being replayed. These are subject to the filtering rules of the underlying logging system.
SLF4J: See also http://www.slf4j.org/codes.html#replay
[2021-02-03 10:49:17,518] [INFO] [akka.event.slf4j.Slf4jLogger] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Slf4jLogger started
[2021-02-03 10:49:17,731] [INFO] [akka.remote.artery.tcp.ArteryTcpTransport] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Remoting started with transport [Artery tcp]; listening on address [akka://ClusterSystem@127.0.0.1:25260] with UID [-8821946585502473497]
[2021-02-03 10:49:17,745] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:25260] - Starting up, Akka version [2.6.10] ...
[2021-02-03 10:49:17,825] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:25260] - Registered cluster JMX MBean [akka:type=Cluster]
[2021-02-03 10:49:17,825] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:25260] - Started up successfully
[2021-02-03 10:49:17,853] [INFO] [akka.cluster.sbr.SplitBrainResolver] [] [ClusterSystem-akka.actor.default-dispatcher-5] - SBR started. Config: strategy [KeepMajority], stable-after [20 seconds], down-all-when-unstable [15 seconds], selfUniqueAddress [akka://ClusterSystem@127.0.0.1:25260#-8821946585502473497], selfDc [default].
[2021-02-03 10:49:18,238] [WARN] [akka.stream.Materializer] [] [ClusterSystem-akka.actor.default-dispatcher-18] - [outbound connection to [akka://ClusterSystem@127.0.0.1:25261], control stream] Upstream failed, cause: StreamTcpException: Tcp command [Connect(127.0.0.1/<unresolved>:25261,None,List(),Some(5000 milliseconds),true)] failed because of java.net.ConnectException: Connection refused
[2021-02-03 10:49:18,238] [WARN] [akka.stream.Materializer] [] [ClusterSystem-akka.actor.default-dispatcher-18] - [outbound connection to [akka://ClusterSystem@127.0.0.1:25261], message stream] Upstream failed, cause: StreamTcpException: Tcp command [Connect(127.0.0.1/<unresolved>:25261,None,List(),Some(5000 milliseconds),true)] failed because of java.net.ConnectException: Connection refused
[2021-02-03 10:49:20,209] [INFO] [akka.actor.ActorCell] [] [ClusterSystem-akka.actor.default-dispatcher-18] - Sending process request - DcIpccTaskHandle
[2021-02-03 10:49:20,213] [INFO] [akka.actor.LocalActorRef] [akkaDeadLetter] [ClusterSystem-akka.actor.default-dispatcher-3] - Message [sample.cluster.stats.dc.TaskMessage] to Actor[akka://ClusterSystem/user/DcIpccTaskHandle#-904996812] was dropped. No routees in group router for [ServiceKey[sample.cluster.stats.dc.TaskMessage](DcIpccTaskHandle)]. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2021-02-03 10:49:22,217] [INFO] [akka.actor.ActorCell] [] [ClusterSystem-akka.actor.default-dispatcher-18] - Sending process request - DcIpccTaskHandle
[2021-02-03 10:49:22,218] [INFO] [akka.actor.LocalActorRef] [akkaDeadLetter] [ClusterSystem-akka.actor.default-dispatcher-18] - Message [sample.cluster.stats.dc.TaskMessage] to Actor[akka://ClusterSystem/user/DcIpccTaskHandle#-904996812] was dropped. No routees in group router for [ServiceKey[sample.cluster.stats.dc.TaskMessage](DcIpccTaskHandle)]. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2021-02-03 10:49:22,953] [INFO] [akka.cluster.Cluster] [akkaMemberChanged] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:25260] - Node [akka://ClusterSystem@127.0.0.1:25260] is JOINING itself (with roles [DcServer, dc-default], version [0.0.0]) and forming new cluster
[2021-02-03 10:49:22,954] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:25260] - is the new leader among reachable nodes (more leaders may exist)
[2021-02-03 10:49:22,958] [INFO] [akka.cluster.Cluster] [akkaMemberChanged] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:25260] - Leader is moving node [akka://ClusterSystem@127.0.0.1:25260] to [Up]
[2021-02-03 10:49:22,964] [INFO] [akka.cluster.sbr.SplitBrainResolver] [] [ClusterSystem-akka.actor.default-dispatcher-18] - This node is now the leader responsible for taking SBR decisions among the reachable nodes (more leaders may exist).

Client (member node):

mvn exec:java -Dexec.mainClass="sample.cluster.stats.DcWorkerPrv3"                
[INFO] --- exec-maven-plugin:3.0.0:java (default-cli) @ akka-sample-cluster-java ---
SLF4J: A number (4) of logging calls during the initialization phase have been intercepted and are
SLF4J: now being replayed. These are subject to the filtering rules of the underlying logging system.
SLF4J: See also http://www.slf4j.org/codes.html#replay
[2021-02-03 10:49:34,416] [INFO] [akka.event.slf4j.Slf4jLogger] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Slf4jLogger started
[2021-02-03 10:49:34,632] [INFO] [akka.remote.artery.tcp.ArteryTcpTransport] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Remoting started with transport [Artery tcp]; listening on address [akka://ClusterSystem@127.0.0.1:51639] with UID [-1111624712757022496]
[2021-02-03 10:49:34,647] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - Starting up, Akka version [2.6.10] ...
[2021-02-03 10:49:34,731] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - Registered cluster JMX MBean [akka:type=Cluster]
[2021-02-03 10:49:34,731] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - Started up successfully
[2021-02-03 10:49:34,758] [INFO] [akka.cluster.sbr.SplitBrainResolver] [] [ClusterSystem-akka.actor.default-dispatcher-5] - SBR started. Config: strategy [KeepMajority], stable-after [20 seconds], down-all-when-unstable [15 seconds], selfUniqueAddress [akka://ClusterSystem@127.0.0.1:51639#-1111624712757022496], selfDc [default].
[2021-02-03 10:49:35,163] [WARN] [akka.stream.Materializer] [] [ClusterSystem-akka.actor.default-dispatcher-3] - [outbound connection to [akka://ClusterSystem@127.0.0.1:25261], message stream] Upstream failed, cause: StreamTcpException: Tcp command [Connect(127.0.0.1/<unresolved>:25261,None,List(),Some(5000 milliseconds),true)] failed because of java.net.ConnectException: Connection refused
[2021-02-03 10:49:35,163] [WARN] [akka.stream.Materializer] [] [ClusterSystem-akka.actor.default-dispatcher-3] - [outbound connection to [akka://ClusterSystem@127.0.0.1:25261], control stream] Upstream failed, cause: StreamTcpException: Tcp command [Connect(127.0.0.1/<unresolved>:25261,None,List(),Some(5000 milliseconds),true)] failed because of java.net.ConnectException: Connection refused
[2021-02-03 10:49:35,279] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-5] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - Received InitJoinAck message from [Actor[akka://ClusterSystem@127.0.0.1:25260/system/cluster/core/daemon#1349184795]] to [akka://ClusterSystem@127.0.0.1:51639]
[2021-02-03 10:49:35,335] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-5] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - Welcome from [akka://ClusterSystem@127.0.0.1:25260]

When the seed node is restarted, the member node logs the following and cannot reconnect to the cluster:

[2021-02-03 10:49:46,390] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - Exiting confirmed [akka://ClusterSystem@127.0.0.1:25260]
[2021-02-03 10:49:46,390] [INFO] [akka.cluster.sbr.SplitBrainResolver] [] [ClusterSystem-akka.actor.default-dispatcher-5] - This node is now the leader responsible for taking SBR decisions among the reachable nodes (more leaders may exist).
[2021-02-03 10:49:46,978] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-5] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - is the new leader among reachable nodes (more leaders may exist)
[2021-02-03 10:49:46,988] [INFO] [akka.cluster.Cluster] [akkaMemberChanged] [ClusterSystem-akka.actor.default-dispatcher-5] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - Leader is removing confirmed Exiting node [akka://ClusterSystem@127.0.0.1:25260]
[2021-02-03 10:49:47,420] [INFO] [akka.remote.artery.Association] [] [ClusterSystem-akka.actor.default-dispatcher-5] - Association to [akka://ClusterSystem@127.0.0.1:25260] having UID [-8821946585502473497] has been stopped. All messages to this UID will be delivered to dead letters. Reason: ActorSystem terminated

What can I do so that the cluster recovers from this on its own?
