Flink 1.15 was released recently, and while testing it I hit an error as soon as I enabled the HA settings.

HA is handled through ZooKeeper, and the relevant setting is "high-availability.zookeeper.quorum".

The log shows that the failure comes from ZooKeeper and Curator classes. To give the conclusion first: it was a ZooKeeper version problem. Why did this happen?

2022-05-12 13:11:45,213 INFO  org.apache.flink.shaded.curator5.org.apache.curator.framework.state.ConnectionStateManager [] - State change: CONNECTED
2022-05-12 13:11:45,453 INFO  org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn [] - Unable to read additional data from server sessionid 0x57b75c3b354af2a, likely server has closed socket, closing socket connection and attempting reconnect
2022-05-12 13:11:45,453 ERROR org.apache.flink.shaded.curator5.org.apache.curator.framework.imps.CuratorFrameworkImpl [] - Ensure path threw exception
org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException$UnimplementedException: KeeperErrorCode = Unimplemented for /flink/flink-dev-115
        at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException.create(KeeperException.java:106) ~[flink-shaded-zookeeper-3.5.9.jar:3.5.9-15.0]
        at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException.create(KeeperException.java:54) ~[flink-shaded-zookeeper-3.5.9.jar:3.5.9-15.0]
        at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:1538) ~[flink-shaded-zookeeper-3.5.9.jar:3.5.9-15.0]
        at org.apache.flink.shaded.curator5.org.apache.curator.utils.ZKPaths.mkdirs(ZKPaths.java:351) ~[flink-shaded-zookeeper-3.5.9.jar:3.5.9-15.0]
        at org.apache.flink.shaded.curator5.org.apache.curator.framework.imps.NamespaceImpl$1.call(NamespaceImpl.java:90) ~[flink-shaded-zookeeper-3.5.9.jar:3.5.9-15.0]
        at org.apache.flink.shaded.curator5.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93) ~[flink-shaded-zookeeper-3.5.9.jar:3.5.9-15.0]
        at org.apache.flink.shaded.curator5.org.apache.curator.framework.imps.NamespaceImpl.fixForNamespace(NamespaceImpl.java:83) ~[flink-shaded-zookeeper-3.5.9.jar:3.5.9-15.0]
        at org.apache.flink.shaded.curator5.org.apache.curator.framework.imps.NamespaceImpl.newNamespaceAwareEnsurePath(NamespaceImpl.java:109) ~[flink-shaded-zookeeper-3.5.9.jar:3.5.9-15.0]
        at org.apache.flink.shaded.curator5.org.apache.curator.framework.imps.CuratorFrameworkImpl.newNamespaceAwareEnsurePath(CuratorFrameworkImpl.java:618) ~[flink-shaded-zookeeper-3.5.9.jar:3.5.9-15.0]
        at org.apache.flink.runtime.util.ZooKeeperUtils.useNamespaceAndEnsurePath(ZooKeeperUtils.java:729) ~[flink-dist-1.15.0.jar:1.15.0]
        at org.apache.flink.runtime.highavailability.zookeeper.ZooKeeperMultipleComponentLeaderElectionHaServices.<init>(ZooKeeperMultipleComponentLeaderElectionHaServices.java:85) ~[flink-dist-1.15.0.jar:1.15.0]
        at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createZooKeeperHaServices(HighAvailabilityServicesUtils.java:96) ~[flink-dist-1.15.0.jar:1.15.0]
        at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:140) ~[flink-dist-1.15.0.jar:1.15.0]
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:427) ~[flink-dist-1.15.0.jar:1.15.0]
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:376) ~[flink-dist-1.15.0.jar:1.15.0]
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:277) ~[flink-dist-1.15.0.jar:1.15.0]
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$1(ClusterEntrypoint.java:227) ~[flink-dist-1.15.0.jar:1.15.0]
        at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_312]
        at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_312]
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762) [hadoop-client-api-3.2.3.jar:?]
        at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) [flink-dist-1.15.0.jar:1.15.0]
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:224) [flink-dist-1.15.0.jar:1.15.0]
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:711) [flink-dist-1.15.0.jar:1.15.0]
        at org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint.main(StandaloneSessionClusterEntrypoint.java:59) [flink-dist-1.15.0.jar:1.15.0]
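
Each frame in the trace above ends with the jar it was loaded from, which is a quick way to see which ZooKeeper client version Flink bundles. Below is a minimal sketch that pulls those jar names and versions out of a pasted stack trace. The function name jar_versions and the regular expression are my own, written against the log format shown here, so other logging layouts may need a different pattern.

```python
import re

# Matches the bracketed jar annotation at the end of a stack frame,
# e.g. "~[flink-shaded-zookeeper-3.5.9.jar:3.5.9-15.0]".
# Group 1 is the jar name, group 2 the version embedded in the file name.
JAR_PATTERN = re.compile(r"\[([A-Za-z0-9_.\-]+?)-(\d+(?:\.\d+)+)\.jar:[^\]]*\]")


def jar_versions(stack_trace: str) -> dict:
    """Map each jar name found in the trace to the version in its file name."""
    versions = {}
    for name, version in JAR_PATTERN.findall(stack_trace):
        versions[name] = version
    return versions


# Two frames copied from the trace above.
sample = """\
        at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:1538) ~[flink-shaded-zookeeper-3.5.9.jar:3.5.9-15.0]
        at org.apache.flink.runtime.util.ZooKeeperUtils.useNamespaceAndEnsurePath(ZooKeeperUtils.java:729) ~[flink-dist-1.15.0.jar:1.15.0]
"""

print(jar_versions(sample))
```

For this trace it reports flink-shaded-zookeeper at 3.5.9, so the client side is already a 3.5 client regardless of what the server runs.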

Cause and fix

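The ZooKeeper that ships with Hadoop distributions is commonly a 3.4.x release, and up to Flink 1.14 the two worked together without trouble. With Flink 1.15, however, the default supported ZooKeeper version moved to 3.5 and support for 3.4 was dropped, and that turned out to be the cause. The stack trace is consistent with this: the client bundled with Flink is flink-shaded-zookeeper 3.5.9, and the server rejects its create request with Unimplemented. See the issue linked below for details.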

 

[FLINK-25146] Drop support for Zookeeper 3.4 - ASF JIRA (issues.apache.org): Upgrade default ZK version to 3.5.

To fix this, either upgrade ZooKeeper or downgrade Flink.

Since Flink versions move quickly, upgrading ZooKeeper looks like the only realistic answer.
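
Before upgrading, it helps to confirm which version the ZooKeeper servers are actually running. The sketch below is one way to do that, assuming the server answers the "srvr" four-letter command on its client port and prints a line starting with "Zookeeper version:". On 3.5 and later that command can be disabled through the four-letter-word whitelist, in which case this will not work. The function names, the default port 2181, and the 3.5.0 threshold are my own assumptions based on the issue title, not something taken from Flink.

```python
import re
import socket


def parse_server_version(srvr_output: str) -> tuple:
    """Extract (major, minor, patch) from the output of the srvr command."""
    match = re.search(r"zookeeper version:\s*(\d+)\.(\d+)\.(\d+)",
                      srvr_output, re.IGNORECASE)
    if match is None:
        raise ValueError("no version line found in srvr output")
    return tuple(int(part) for part in match.groups())


def query_srvr(host: str, port: int = 2181, timeout: float = 3.0) -> str:
    """Send the srvr four-letter command and return the raw reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the socket after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def meets_flink_115_minimum(version: tuple) -> bool:
    """Assumed threshold: Flink 1.15 expects a 3.5 or newer server."""
    return version >= (3, 5, 0)


# Illustrative reply text, not captured from a real server.
example_reply = "Zookeeper version: 3.4.14-abc, built on 03/06/2019 16:18 GMT"
version = parse_server_version(example_reply)
print(version, meets_flink_115_minimum(version))
```

Against a live ensemble, call parse_server_version(query_srvr(host)) for every host listed in high-availability.zookeeper.quorum, since a partially upgraded ensemble can still contain 3.4.x nodes.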

 

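As a bonus, the 1.15 release notes also say that Java 8 support is now deprecated (it still runs, but removal is planned for a later release) and that moving to Java 11 is recommended. Check them out if you are interested.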

https://nightlies.apache.org/flink/flink-docs-release-1.15/release-notes/flink-1.15/

 
