My example is modified from an official example, akka-sample-cluster-java.
The configuration:
akka {
  loglevel = debug
  actor {
    provider = cluster
    serialization-bindings {
      "sample.cluster.CborSerializable" = jackson-cbor
    }
  }
  remote {
    artery {
      canonical.hostname = "127.0.0.1"
      canonical.port = 0
    }
  }
  cluster {
    seed-nodes = [
      "akka://ClusterSystem@127.0.0.1:25260",
      "akka://ClusterSystem@127.0.0.1:25261"]
    downing-provider-class = "akka.cluster.sbr.SplitBrainResolverProvider"
  }
}
Server (seed node):
mvn exec:java -Dexec.mainClass="sample.cluster.stats.DcServer" -Dexec.args="25260"
[INFO] --- exec-maven-plugin:3.0.0:java (default-cli) @ akka-sample-cluster-java ---
SLF4J: A number (4) of logging calls during the initialization phase have been intercepted and are
SLF4J: now being replayed. These are subject to the filtering rules of the underlying logging system.
[2021-02-03 10:49:17,518] [INFO] [akka.event.slf4j.Slf4jLogger] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Slf4jLogger started
SLF4J: See also http://www.slf4j.org/codes.html#replay
[2021-02-03 10:49:17,731] [INFO] [akka.remote.artery.tcp.ArteryTcpTransport] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Remoting started with transport [Artery tcp]; listening on address [akka://ClusterSystem@127.0.0.1:25260] with UID [-8821946585502473497]
[2021-02-03 10:49:17,745] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:25260] - Starting up, Akka version [2.6.10] ...
[2021-02-03 10:49:17,825] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:25260] - Registered cluster JMX MBean [akka:type=Cluster]
[2021-02-03 10:49:17,825] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:25260] - Started up successfully
[2021-02-03 10:49:17,853] [INFO] [akka.cluster.sbr.SplitBrainResolver] [] [ClusterSystem-akka.actor.default-dispatcher-5] - SBR started. Config: strategy [KeepMajority], stable-after [20 seconds], down-all-when-unstable [15 seconds], selfUniqueAddress [akka://ClusterSystem@127.0.0.1:25260#-8821946585502473497], selfDc [default].
[2021-02-03 10:49:18,238] [WARN] [akka.stream.Materializer] [] [ClusterSystem-akka.actor.default-dispatcher-18] - [outbound connection to [akka://ClusterSystem@127.0.0.1:25261], control stream] Upstream failed, cause: StreamTcpException: Tcp command [Connect(127.0.0.1/<unresolved>:25261,None,List(),Some(5000 milliseconds),true)] failed because of java.net.ConnectException: Connection refused
[2021-02-03 10:49:18,238] [WARN] [akka.stream.Materializer] [] [ClusterSystem-akka.actor.default-dispatcher-18] - [outbound connection to [akka://ClusterSystem@127.0.0.1:25261], message stream] Upstream failed, cause: StreamTcpException: Tcp command [Connect(127.0.0.1/<unresolved>:25261,None,List(),Some(5000 milliseconds),true)] failed because of java.net.ConnectException: Connection refused
[2021-02-03 10:49:20,209] [INFO] [akka.actor.ActorCell] [] [ClusterSystem-akka.actor.default-dispatcher-18] - Sending process request - DcIpccTaskHandle
[2021-02-03 10:49:20,213] [INFO] [akka.actor.LocalActorRef] [akkaDeadLetter] [ClusterSystem-akka.actor.default-dispatcher-3] - Message [sample.cluster.stats.dc.TaskMessage] to Actor[akka://ClusterSystem/user/DcIpccTaskHandle#-904996812] was dropped. No routees in group router for [ServiceKey[sample.cluster.stats.dc.TaskMessage](DcIpccTaskHandle)]. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2021-02-03 10:49:22,217] [INFO] [akka.actor.ActorCell] [] [ClusterSystem-akka.actor.default-dispatcher-18] - Sending process request - DcIpccTaskHandle
[2021-02-03 10:49:22,218] [INFO] [akka.actor.LocalActorRef] [akkaDeadLetter] [ClusterSystem-akka.actor.default-dispatcher-18] - Message [sample.cluster.stats.dc.TaskMessage] to Actor[akka://ClusterSystem/user/DcIpccTaskHandle#-904996812] was dropped. No routees in group router for [ServiceKey[sample.cluster.stats.dc.TaskMessage](DcIpccTaskHandle)]. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2021-02-03 10:49:22,953] [INFO] [akka.cluster.Cluster] [akkaMemberChanged] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:25260] - Node [akka://ClusterSystem@127.0.0.1:25260] is JOINING itself (with roles [DcServer, dc-default], version [0.0.0]) and forming new cluster
[2021-02-03 10:49:22,954] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:25260] - is the new leader among reachable nodes (more leaders may exist)
[2021-02-03 10:49:22,958] [INFO] [akka.cluster.Cluster] [akkaMemberChanged] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:25260] - Leader is moving node [akka://ClusterSystem@127.0.0.1:25260] to [Up]
[2021-02-03 10:49:22,964] [INFO] [akka.cluster.sbr.SplitBrainResolver] [] [ClusterSystem-akka.actor.default-dispatcher-18] - This node is now the leader responsible for taking SBR decisions among the reachable nodes (more leaders may exist).
Client (member node):
mvn exec:java -Dexec.mainClass="sample.cluster.stats.DcWorkerPrv3"
[INFO] --- exec-maven-plugin:3.0.0:java (default-cli) @ akka-sample-cluster-java ---
SLF4J: A number (4) of logging calls during the initialization phase have been intercepted and are
SLF4J: now being replayed. These are subject to the filtering rules of the underlying logging system.
[2021-02-03 10:49:34,416] [INFO] [akka.event.slf4j.Slf4jLogger] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Slf4jLogger started
SLF4J: See also http://www.slf4j.org/codes.html#replay
[2021-02-03 10:49:34,632] [INFO] [akka.remote.artery.tcp.ArteryTcpTransport] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Remoting started with transport [Artery tcp]; listening on address [akka://ClusterSystem@127.0.0.1:51639] with UID [-1111624712757022496]
[2021-02-03 10:49:34,647] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - Starting up, Akka version [2.6.10] ...
[2021-02-03 10:49:34,731] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - Registered cluster JMX MBean [akka:type=Cluster]
[2021-02-03 10:49:34,731] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - Started up successfully
[2021-02-03 10:49:34,758] [INFO] [akka.cluster.sbr.SplitBrainResolver] [] [ClusterSystem-akka.actor.default-dispatcher-5] - SBR started. Config: strategy [KeepMajority], stable-after [20 seconds], down-all-when-unstable [15 seconds], selfUniqueAddress [akka://ClusterSystem@127.0.0.1:51639#-1111624712757022496], selfDc [default].
[2021-02-03 10:49:35,163] [WARN] [akka.stream.Materializer] [] [ClusterSystem-akka.actor.default-dispatcher-3] - [outbound connection to [akka://ClusterSystem@127.0.0.1:25261], message stream] Upstream failed, cause: StreamTcpException: Tcp command [Connect(127.0.0.1/<unresolved>:25261,None,List(),Some(5000 milliseconds),true)] failed because of java.net.ConnectException: Connection refused
[2021-02-03 10:49:35,163] [WARN] [akka.stream.Materializer] [] [ClusterSystem-akka.actor.default-dispatcher-3] - [outbound connection to [akka://ClusterSystem@127.0.0.1:25261], control stream] Upstream failed, cause: StreamTcpException: Tcp command [Connect(127.0.0.1/<unresolved>:25261,None,List(),Some(5000 milliseconds),true)] failed because of java.net.ConnectException: Connection refused
[2021-02-03 10:49:35,279] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-5] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - Received InitJoinAck message from [Actor[akka://ClusterSystem@127.0.0.1:25260/system/cluster/core/daemon#1349184795]] to [akka://ClusterSystem@127.0.0.1:51639]
[2021-02-03 10:49:35,335] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-5] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - Welcome from [akka://ClusterSystem@127.0.0.1:25260]
When all the seed nodes are restarted, the member node logs the following, and it can no longer reconnect to the cluster.
[2021-02-03 10:49:46,390] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-3] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - Exiting confirmed [akka://ClusterSystem@127.0.0.1:25260]
[2021-02-03 10:49:46,390] [INFO] [akka.cluster.sbr.SplitBrainResolver] [] [ClusterSystem-akka.actor.default-dispatcher-5] - This node is now the leader responsible for taking SBR decisions among the reachable nodes (more leaders may exist).
[2021-02-03 10:49:46,978] [INFO] [akka.cluster.Cluster] [] [ClusterSystem-akka.actor.default-dispatcher-5] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - is the new leader among reachable nodes (more leaders may exist)
[2021-02-03 10:49:46,988] [INFO] [akka.cluster.Cluster] [akkaMemberChanged] [ClusterSystem-akka.actor.default-dispatcher-5] - Cluster Node [akka://ClusterSystem@127.0.0.1:51639] - Leader is removing confirmed Exiting node [akka://ClusterSystem@127.0.0.1:25260]
[2021-02-03 10:49:47,420] [INFO] [akka.remote.artery.Association] [] [ClusterSystem-akka.actor.default-dispatcher-5] - Association to [akka://ClusterSystem@127.0.0.1:25260] having UID [-8821946585502473497] has been stopped. All messages to this UID will be delivered to dead letters. Reason: ActorSystem terminated
What can I do to make the member node recover and rejoin the cluster on its own?