Quantcast
Channel: Akka Libraries - Discussion Forum for Akka technologies
Viewing all articles
Browse latest Browse all 1362

LocalActorRef memory leak and Full GC Allocation Failure causing gc to take too long

$
0
0

Hi

We use akka cluster with persistence version 2.6.6_2.13 and our application runs on raspberry pi 3’s (compute module).

When having -Xmx set to 128m for the jvm, it takes about 26 hours until the heap has almost no space left. There is no OutOfMemoryError but instead the max GC pause becomes 2 sec 339 ms and Full GC happens every minute (analyzed using GCeasy).
This is not acceptable because we have setup timeouts in actors of 1 second which means they always time out if the application is paused for more than 1 second.
Goal: The max GC pause should not exceed 700 millis

Increasing the heap size from 128m to 256m makes it even worse, as expected (max GC pause then is 16 sec 48 ms !).

Heapdump analysis with eclipse’s memory analyzer tool “mat“ suspects a memory problem with an instance of the class akka.actor.LocalActorRef.
Somehow, there is one of our actors in the system, called “propertyHostChannel”, which holds an ActorRef instance which occupies 42.22% of the heap (this is getting worse the longer the application runs). It is a LocalActorRef.

Additionally, we can see per-minute bursts of ~400 (!) debug logs like:

[DEBUG] [a.a.L.Deserialization] [akka.actor.LocalActorRefProvider.Deserialization] [SELServer-akka.actor.default-dispatcher-28] - Resolve (deserialization) of path [user/SELServe
r/PropertyHostChannelRouter/VirtualPropertyHost/$Q7#65017950] doesn't match an active actor. It has probably been stopped, using deadLetters.

Eventually relevant code excerpt of the ChannelRouterActor (the actor behind the “propertyHostChannel” instance):

private PSet<ActorSelection> clusterRoutersOfSameType = HashTreePSet.empty();

@Override
public Receive createReceive() {
        return receiveBuilder()
                .match(AddIdWithPropsToRegistry.class, this::onAddIdWithPropsToRegistry)
                .match(RemoveIdWithPropsFromRegistry.class, this::onRemoveIdWithPropsFromRegistry)
                .match(SendTo.class, this::onSendTo)
                .match(SendToCluster.class, this::onSendToCluster)
                // cluster events
                .match(ClusterEvent.CurrentClusterState.class, this::onCurrentClusterState)
                .match(ClusterEvent.MemberUp.class, mUp ->
                        addToClusterRouters(mUp.member())
                )
                .match(ClusterEvent.ReachableMember.class, reachableMember ->
                        addToClusterRouters(reachableMember.member())
                )
                .match(ClusterEvent.UnreachableMember.class, unreachableMember ->
                        removeFromClusterRouters(unreachableMember.member())
                )
                .match(ClusterEvent.MemberLeft.class, memberLeft ->
                        removeFromClusterRouters(memberLeft.member())
                )
                .match(ClusterEvent.MemberDowned.class, memberDowned ->
                        removeFromClusterRouters(memberDowned.member())
                )
                .build();
    }

private void addToClusterRouters(Member member) {
        final ActorSelection cousinRouter = getCousinRouter(member);
        if (cousinRouter.anchor().path().address().host().isDefined()) {
            clusterRoutersOfSameType = clusterRoutersOfSameType.plus(
                    cousinRouter
            );
            log.debug("Added cousinRouter: {} to clusterRoutersOfSameType: {}", cousinRouter, clusterRoutersOfSameType);
        }
    }

private void removeFromClusterRouters(Member member) {
        final ActorSelection cousinRouter = getCousinRouter(member);
        clusterRoutersOfSameType = clusterRoutersOfSameType.minus(
                cousinRouter
        );
        log.debug("Removed cousinRouter: {} from clusterRoutersOfSameType: {}", cousinRouter, clusterRoutersOfSameType);
    }

private ActorSelection getCousinRouter(Member member) {
        final String name = getContext().getSelf().path().name();

        // each node has a ChannelRouterActor at "/user/SELServer/" + name, so select that path
        return getContext().actorSelection(member.address() + "/user/SELServer/" + name);
    }

Questions

  1. Why do we have bursts of the mentioned DEBUG logs?
  2. Why is there one instance of LocalActorRef which occupies so much heap?
  3. How can we get rid of the DEBUG log bursts and how can we fix the memory issue so gc does not take more than 700 millis?

Please let us know if you need more information.

Thanks

4 posts - 2 participants

Read full topic


Viewing all articles
Browse latest Browse all 1362

Trending Articles