Inconsistent or no data observed for entities in the hybrid UI or through the Management API

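Symptom

Users intermittently see inconsistent data, or no data at all, for entities such as API products, apps, developers, key value maps (KVMs), and caches in the Apigee hybrid user interface (UI) and through the Management API.

Error messages

No error messages are known to be displayed in this scenario.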

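Possible causes

Cause	Description
Cassandra pods not connected to the ring	Cassandra pods from all data centers might not be connected to the common Cassandra ring.
nodetool repair has not been run	The `nodetool repair` command might not have been run periodically.
Network connectivity issues	There might be network connectivity issues between Cassandra pods in different data centers.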

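Common diagnosis steps

Use the Management API to fetch information about one or more of the entities affected by this issue, such as API products or apps, and check whether repeated calls return different results.

On the command line, use the following examples to get your gcloud authentication credentials, set environment variables, and run the API commands.

Get API products: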

TOKEN=$(gcloud auth print-access-token)
ORG=ORGANIZATION_NAME

curl -i -H "Authorization: Bearer $TOKEN" \
"https://apigee.googleapis.com/v1/organizations/$ORG/apiproducts"

Get apps:

TOKEN=$(gcloud auth print-access-token)
ORG=ORGANIZATION_NAME

curl -i -H "Authorization: Bearer $TOKEN" \
"https://apigee.googleapis.com/v1/organizations/$ORG/apps"

Get developers:

TOKEN=$(gcloud auth print-access-token)
ORG=ORGANIZATION_NAME

curl -i -H "Authorization: Bearer $TOKEN" \
"https://apigee.googleapis.com/v1/organizations/$ORG/developers"

Get key value maps (KVMs):

TOKEN=$(gcloud auth print-access-token)
ORG=ORGANIZATION_NAME

curl -i -H "Authorization: Bearer $TOKEN" \
"https://apigee.googleapis.com/v1/organizations/$ORG/keyvaluemaps"

Get caches:

TOKEN=$(gcloud auth print-access-token)
ORG=ORGANIZATION_NAME
ENV=ENVIRONMENT_NAME

curl -i -H "Authorization: Bearer $TOKEN" \
"https://apigee.googleapis.com/v1/organizations/$ORG/environments/$ENV/caches"

If the Management API requests above return no data, or return different data from one call to the next, you are seeing the same issue that is observed in the UI.
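The repeated-call check can be made less manual by fingerprinting each response and counting how many distinct results appear. The following is a minimal sketch, not an Apigee tool: the function name `count_distinct_outputs` is hypothetical, and it assumes a consistent backend returns byte-identical response bodies. If the API returns the same entities in a different order, the count would overstate the problem.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and print how many
# distinct outputs it produced (1 means every run returned the same bytes).
count_distinct_outputs() {
  runs="$1"
  shift
  i=0
  while [ "$i" -lt "$runs" ]; do
    # cksum reduces each response to a short fingerprint line
    "$@" | cksum
    i=$((i + 1))
  done | sort -u | wc -l | tr -d ' '
}

# Example (assumes TOKEN and ORG are set as in the commands above):
# count_distinct_outputs 10 curl -s -H "Authorization: Bearer $TOKEN" \
#   "https://apigee.googleapis.com/v1/organizations/$ORG/apiproducts"
```

A result greater than 1 suggests that repeated calls are not returning the same data, which matches the symptom described here.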

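Cause: Cassandra pods not connected to the Cassandra pods of all data centers

In a multi-region Apigee hybrid deployment, if not all Cassandra pods are connected to the same Cassandra ring, data might not be replicated across all of them. As a result, the management plane might not consistently receive the same data set for the same query. Perform the following steps to analyze this scenario.

Diagnosis

List the Cassandra pods: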

# list cassandra pods
kubectl -n apigee get pods -l app=apigee-cassandra

Run the following command to check the status of all Cassandra pods in each data center.

Apigee hybrid versions earlier than 1.4.0:

# check cassandra cluster status
kubectl -n apigee get pods \
-l app=apigee-cassandra \
--field-selector=status.phase=Running \
-o custom-columns=name:metadata.name --no-headers \
| xargs -I{} sh -c "echo {}; kubectl -n apigee exec {} -- nodetool status"

Apigee hybrid versions 1.4.0 and later:

# check cassandra cluster status
kubectl -n apigee get pods \
-l app=apigee-cassandra \
--field-selector=status.phase=Running \
-o custom-columns=name:metadata.name --no-headers \
| xargs -I{} sh -c "echo {}; kubectl -n apigee exec {} -- nodetool -u jmxuser -pw JMXUSER_PASSWORD status"

Check the output of the command and verify that every Cassandra pod in every data center is connected to the Cassandra ring and is in Up and Normal (UN) status.

Example output from a healthy Cassandra ring:

kubectl -n apigee get pods \
-l app=apigee-cassandra \
--field-selector=status.phase=Running \
-o custom-columns=name:metadata.name --no-headers \
| xargs -I{} sh -c "echo {}; kubectl -n apigee exec {} -- nodetool -u jmxuser -pw iloveapis123 status"

apigee-cassandra-default-0
Datacenter: dc-1
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.0.2.18  1.32 MiB   256          100.0%            2e6051fe-e3ed-4858-aed0-ac9be5270e97  ra-1
UN  10.0.4.10  1.49 MiB   256          100.0%            2396e17f-94fd-4d7d-b55e-35f491a5c1cc  ra-1
UN  10.0.3.14  1.38 MiB   256          100.0%            579cf76e-7d6d-46c8-8319-b7cd74ee87c8  ra-1
Datacenter: dc-2
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.8.1.12  1.31 MiB   256          100.0%            3e9f24bf-2c10-4cfd-8217-5be6245c2b9c  ra-1
UN  10.8.2.19  1.24 MiB   256          100.0%            1d2e803d-aa31-487b-9503-1e18297efc04  ra-1
UN  10.8.4.4   1.28 MiB   256          100.0%            d15ffeef-7929-42c2-a3b1-a3feb85a857b  ra-1

apigee-cassandra-default-1
Datacenter: dc-1
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.0.2.18  1.32 MiB   256          100.0%            2e6051fe-e3ed-4858-aed0-ac9be5270e97  ra-1
UN  10.0.4.10  1.49 MiB   256          100.0%            2396e17f-94fd-4d7d-b55e-35f491a5c1cc  ra-1
UN  10.0.3.14  1.38 MiB   256          100.0%            579cf76e-7d6d-46c8-8319-b7cd74ee87c8  ra-1
Datacenter: dc-2
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.8.1.12  1.31 MiB   256          100.0%            3e9f24bf-2c10-4cfd-8217-5be6245c2b9c  ra-1
UN  10.8.2.19  1.24 MiB   256          100.0%            1d2e803d-aa31-487b-9503-1e18297efc04  ra-1
UN  10.8.4.4   1.28 MiB   256          100.0%            d15ffeef-7929-42c2-a3b1-a3feb85a857b  ra-1

apigee-cassandra-default-2
Datacenter: dc-1
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.0.2.18  1.32 MiB   256          100.0%            2e6051fe-e3ed-4858-aed0-ac9be5270e97  ra-1
UN  10.0.4.10  1.49 MiB   256          100.0%            2396e17f-94fd-4d7d-b55e-35f491a5c1cc  ra-1
UN  10.0.3.14  1.38 MiB   256          100.0%            579cf76e-7d6d-46c8-8319-b7cd74ee87c8  ra-1
Datacenter: dc-2
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.8.1.12  1.31 MiB   256          100.0%            3e9f24bf-2c10-4cfd-8217-5be6245c2b9c  ra-1
UN  10.8.2.19  1.24 MiB   256          100.0%            1d2e803d-aa31-487b-9503-1e18297efc04  ra-1
UN  10.8.4.4   1.28 MiB   256          100.0%            d15ffeef-7929-42c2-a3b1-a3feb85a857b  ra-1

Example output from an unhealthy Cassandra ring:

kubectl -n apigee get pods \
-l app=apigee-cassandra \
--field-selector=status.phase=Running \
-o custom-columns=name:metadata.name --no-headers \
| xargs -I{} sh -c "echo {}; kubectl -n apigee exec {} -- nodetool -u jmxuser -pw iloveapis123 status"

apigee-cassandra-default-0
Datacenter: dc-1
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.0.2.18  1.32 MiB   256          100.0%            2e6051fe-e3ed-4858-aed0-ac9be5270e97  ra-1
DL  10.0.4.10  1.49 MiB   256          100.0%            2396e17f-94fd-4d7d-b55e-35f491a5c1cc  ra-1
DL  10.0.3.14  1.38 MiB   256          100.0%            579cf76e-7d6d-46c8-8319-b7cd74ee87c8  ra-1
Datacenter: dc-2
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.8.1.12  1.31 MiB   256          100.0%            3e9f24bf-2c10-4cfd-8217-5be6245c2b9c  ra-1
UN  10.8.2.19  1.24 MiB   256          100.0%            1d2e803d-aa31-487b-9503-1e18297efc04  ra-1
DL  10.8.4.4   1.28 MiB   256          100.0%            d15ffeef-7929-42c2-a3b1-a3feb85a857b  ra-1

apigee-cassandra-default-1
Datacenter: dc-1
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.0.2.18  1.32 MiB   256          100.0%            2e6051fe-e3ed-4858-aed0-ac9be5270e97  ra-1
UN  10.0.4.10  1.49 MiB   256          100.0%            2396e17f-94fd-4d7d-b55e-35f491a5c1cc  ra-1
UN  10.0.3.14  1.38 MiB   256          100.0%            579cf76e-7d6d-46c8-8319-b7cd74ee87c8  ra-1
Datacenter: dc-2
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.8.1.12  1.31 MiB   256          100.0%            3e9f24bf-2c10-4cfd-8217-5be6245c2b9c  ra-1
UN  10.8.2.19  1.24 MiB   256          100.0%            1d2e803d-aa31-487b-9503-1e18297efc04  ra-1
UN  10.8.4.4   1.28 MiB   256          100.0%            d15ffeef-7929-42c2-a3b1-a3feb85a857b  ra-1

apigee-cassandra-default-2
Datacenter: dc-1
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.0.2.18  1.32 MiB   256          100.0%            2e6051fe-e3ed-4858-aed0-ac9be5270e97  ra-1
UN  10.0.4.10  1.49 MiB   256          100.0%            2396e17f-94fd-4d7d-b55e-35f491a5c1cc  ra-1
UN  10.0.3.14  1.38 MiB   256          100.0%            579cf76e-7d6d-46c8-8319-b7cd74ee87c8  ra-1
Datacenter: dc-2
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.8.1.12  1.31 MiB   256          100.0%            3e9f24bf-2c10-4cfd-8217-5be6245c2b9c  ra-1
UN  10.8.2.19  1.24 MiB   256          100.0%            1d2e803d-aa31-487b-9503-1e18297efc04  ra-1
UN  10.8.4.4   1.28 MiB   256          100.0%            d15ffeef-7929-42c2-a3b1-a3feb85a857b  ra-1

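Note that some of the Cassandra pods in the output above are in Down and Leaving (DL) status. For more information, see nodetool status.

If any Cassandra pods are in DL status, as in the example output above, that might be the cause of this issue.
When a request to fetch entity information is made through the hybrid UI or the Management API and the request reaches a Cassandra pod that is down, no data is returned.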
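With several pods and data centers, scanning the status output by eye is error-prone. The sketch below filters `nodetool status` output down to the rows that are not UN. The function name `list_non_un_nodes` is hypothetical, and the filter assumes the row layout shown in the example output above (a two-letter state code followed by the node address).

```shell
#!/bin/sh
# Hypothetical helper: read `nodetool status` output on stdin and print the
# state code and address of every node that is not Up/Normal (UN).
list_non_un_nodes() {
  # Node rows start with a two-letter code: U/D (status) + N/L/J/M (state).
  awk '/^[UD][NLJM][ \t]/ && $1 != "UN" { print $1, $2 }'
}

# Example (JMXUSER_PASSWORD is the same placeholder used in the commands above):
# kubectl -n apigee exec apigee-cassandra-default-0 -- \
#   nodetool -u jmxuser -pw JMXUSER_PASSWORD status | list_non_un_nodes
```

Empty output means that pod sees every node as UN. Because each pod reports its own view of the ring, the check is worth running against every pod.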

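Resolution

Perform the steps in the following section and make sure that the Cassandra pods in the problematic data center are connected to the original data center, as described in Multi-region deployment on GKE and GKE on-prem | Apigee.

Cause: nodetool repair has not been run

If the nodetool repair command has not been run periodically as a maintenance task, data might be inconsistent between Cassandra pods. Perform the following steps to analyze this scenario.

Diagnosis

Create a Cassandra client container pod, apigee-hybrid-cassandra-client, for debugging.

List all Cassandra pods: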

# list cassandra pods
kubectl -n=apigee get pods -l app=apigee-cassandra

Connect to one of the Cassandra pods using CQLSH:

cqlsh apigee-cassandra-default-0.apigee-cassandra-default.apigee.svc.cluster.local -u ddl_user --ssl

List the keyspaces:

SELECT keyspace_name from system_schema.keyspaces;

Example output:

ddl_user@cqlsh> SELECT keyspace_name from system_schema.keyspaces;

 keyspace_name
-----------------------------
                 system_auth
 cache_PROJECT_ID_hybrid
               system_schema
   kms_PROJECT_ID_hybrid
   kvm_PROJECT_ID_hybrid
   rtc_PROJECT_ID_hybrid
          system_distributed
                      system
                      perses
               system_traces
 quota_PROJECT_ID_hybrid

(11 rows)

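Identify the keyspaces from the result above, then use CQLSH to list and query all entities in each data center. In the queries below, replace KMS_KEYSPACE, KVM_KEYSPACE, and CACHE_KEYSPACE with the matching keyspace names from that result (for example, kms_PROJECT_ID_hybrid).
If the inconsistent entity is an API product: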
```
select * from KMS_KEYSPACE.api_product;
```
If the inconsistent entity is an application (app):
```
select * from KMS_KEYSPACE.app;
```
If the inconsistent entity is a developer:
```
select * from KMS_KEYSPACE.developer;
```
If the inconsistent entity is a key value map:
```
select * from KVM_KEYSPACE.kvm_map_entry;
```
If the inconsistent entity is a cache:
```
select * from CACHE_KEYSPACE.cache_map_entry;
```
Note the record count from the output of each query above.
Repeat the steps above for each Cassandra pod in every data center.
Compare the record counts obtained from all Cassandra pods.
Identify the Cassandra pods that have inconsistent data.
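The comparison step can be sketched as a small filter over the counts you noted. This is only an illustration under stated assumptions: the function name `find_mismatched_pods` is hypothetical, the input format (one "LABEL RECORD_COUNT" pair per line) is something you assemble by hand, and treating the most common count as the reference value is an assumption that does not hold if most pods are out of date or if counts are tied.

```shell
#!/bin/sh
# Hypothetical helper: read "LABEL RECORD_COUNT" lines on stdin and print
# the entries whose count differs from the most common count.
find_mismatched_pods() {
  awk '
    { pod[NR] = $1; count[NR] = $2; freq[$2]++ }
    END {
      # Pick the count reported by the largest number of pods.
      for (c in freq) if (freq[c] > top) { top = freq[c]; majority = c }
      for (i = 1; i <= NR; i++)
        if (count[i] != majority) print pod[i], count[i]
    }'
}

# Example with hand-collected counts for one table:
# printf '%s\n' 'dc-1/apigee-cassandra-default-0 12' \
#   'dc-1/apigee-cassandra-default-1 12' \
#   'dc-2/apigee-cassandra-default-0 9' | find_mismatched_pods
```

Because pod names repeat across data centers, prefixing each label with the data center name keeps the output unambiguous.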

Resolution

List the Cassandra pods and connect to a specific Cassandra pod that has inconsistent data:

# list cassandra pods
kubectl -n=apigee get pods -l app=apigee-cassandra

# connect to one cassandra pod
kubectl -n=apigee exec -it apigee-cassandra-default-0 -- bash

Run the nodetool repair command on each Cassandra pod in each data center.
Apigee hybrid versions earlier than 1.4.0:
```
nodetool repair
```
Apigee hybrid versions 1.4.0 and later:
```
nodetool -u JMX_USERNAME -pw JMX_PASSWORD repair
```
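Follow the diagnosis section again and verify that the data is now replicated consistently across all Cassandra pods.
Repeat the steps above for every Cassandra pod that had inconsistent data.

Cause: Network connectivity issues

If there are network connectivity issues between data centers, Cassandra data might not be replicated consistently to all Cassandra pods in the Cassandra ring. Perform the following steps to analyze this scenario.

Diagnosis

List all Cassandra pods: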

# list cassandra pods
kubectl -n=apigee get pods -l app=apigee-cassandra

Run the following curl command to open a telnet connection on port 7001 from the first Cassandra pod in the first data center (dc-1) to the first Cassandra pod in the second data center (dc-2):
```
kubectl -n apigee exec -it apigee-cassandra-default-0 -- curl -v telnet://DC_2_APIGEE_CASSANDRA_DEFAULT_0_POD_IP:7001
```

If the telnet connection succeeds, you see output similar to the following:

* Rebuilt URL to: telnet://10.0.4.10:7001/
*   Trying 10.0.4.10...
* TCP_NODELAY set
* Connected to 10.0.4.10 (10.0.4.10) port 7001 (#0)

Otherwise, you see an error similar to the following:

* Rebuilt URL to: telnet://10.0.4.10:7001/
*   Trying 10.0.4.10...
* TCP_NODELAY set
* connect to 10.0.4.10 port 7001 failed: Connection refused
* Failed to connect to 10.0.4.10 port 7001: Connection refused
* Closing connection 0
curl: (7) Failed to connect to 10.0.4.10 port 7001: Connection refused

A connection failure from a Cassandra pod in one data center to a Cassandra pod in another data center indicates a firewall restriction or some other network connectivity problem.
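When the connectivity check has to be repeated for many pod pairs, the verbose curl output can be reduced to a one-word verdict. The sketch below is an assumption-laden illustration: `classify_curl_output` is a hypothetical name, and the match relies on the "Connected to" line that curl prints in the successful example above.

```shell
#!/bin/sh
# Hypothetical helper: read `curl -v` output on stdin and print "reachable"
# if curl reported an established TCP connection, otherwise "unreachable".
classify_curl_output() {
  if grep -q 'Connected to '; then
    echo reachable
  else
    echo unreachable
  fi
}

# Example: curl writes its verbose log to stderr, so redirect it with 2>&1.
# --max-time 5 is added here so the check does not hang on an open connection.
# kubectl -n apigee exec apigee-cassandra-default-0 -- \
#   curl -v --max-time 5 telnet://DC_2_APIGEE_CASSANDRA_DEFAULT_0_POD_IP:7001 2>&1 \
#   | classify_curl_output
```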

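Resolution

If this Apigee hybrid deployment is on GKE, see VPC firewall rules overview, check whether any firewall rules are set that block traffic from one data center to another, and analyze the network connectivity issue.
If this Apigee hybrid deployment is on GKE on-prem, work with the relevant networking team to analyze the network connectivity issue.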

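Must gather diagnostic information

If the problem persists after you follow the instructions above, gather the following diagnostic information and then contact Google Cloud Customer Care.

The Google Cloud project ID
The Apigee hybrid organization
The overrides.yaml file, with all sensitive information masked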

Kubernetes pod status in all namespaces:

kubectl get pods -A > kubectl-pod-status`date +%Y.%m.%d_%H.%M.%S`.txt

Kubernetes cluster-info dump:

# generate kubernetes cluster-info dump
kubectl cluster-info dump -A --output-directory=/tmp/kubectl-cluster-info-dump

# zip kubernetes cluster-info dump
zip -r kubectl-cluster-info-dump`date +%Y.%m.%d_%H.%M.%S`.zip /tmp/kubectl-cluster-info-dump/*
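
For the overrides.yaml item in the list above, a rough first pass at masking can be scripted. This is a sketch under a narrow assumption: it only masks single-line values of keys whose names contain "password". The function name `mask_passwords` and the output file name are hypothetical, and other sensitive values (certificates, keys, tokens) still need a manual review before the file is shared.

```shell
#!/bin/sh
# Hypothetical helper: mask the value of any YAML key whose name contains
# "password" or "Password". Reads stdin, writes the masked copy to stdout.
mask_passwords() {
  sed -E 's/^([[:space:]]*[A-Za-z0-9_.-]*[Pp]assword[A-Za-z0-9_.-]*:[[:space:]]*).+$/\1"***MASKED***"/'
}

# Example (overrides-masked.yaml is an illustrative output name):
# mask_passwords < overrides.yaml > overrides-masked.yaml
```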