Kubernetes Service Mesh: Managing Microservices with Istio and Linkerd
As microservice architectures scale, managing service-to-service communication grows increasingly complex. A Service Mesh abstracts this complexity away from application code, solving cross-cutting concerns such as traffic management, security, observability, and resilience in a centralized way.
In this article we take a detailed look at Istio and Linkerd, the two leading Service Mesh solutions in the Kubernetes ecosystem. With practical examples you can use in production, installation guides, traffic management strategies, mTLS security, and observability implementations, you will get a comprehensive introduction to the Service Mesh world.
Table of Contents
- What Is a Service Mesh?
- Problems a Service Mesh Solves
- Istio vs Linkerd: Which Should I Choose?
- Istio Architecture and Components
- Istio Installation (Production-Ready)
- Traffic Management with Istio
- Canary Deployments with Istio
- Circuit Breaking and Fault Injection with Istio
- mTLS and Security with Istio
- Linkerd Architecture and Advantages
- Linkerd Installation (Lightweight Approach)
- Traffic Splitting with Linkerd
- Multi-Cluster Service Mesh with Linkerd
- Service Mesh Observability: Kiali, Grafana, Jaeger
- Production Best Practices
- Performance and Resource Optimization
- Migration Strategies
- Troubleshooting and Debugging
What Is a Service Mesh? {#nedir}
A Service Mesh is an infrastructure layer that manages service-to-service communication in a microservice architecture. It deploys a sidecar proxy (typically Envoy) next to each service to intercept network traffic, and provides the following:
Core Features
- Traffic Management: Load balancing, routing, retries, timeouts
- Security: Mutual TLS (mTLS), authentication, authorization
- Observability: Metrics, distributed tracing, access logs
- Resilience: Circuit breaking, fault injection, rate limiting
Sidecar Pattern
# Application + sidecar proxy in the same Pod
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app:v1
    ports:
    - containerPort: 8080
  - name: istio-proxy  # Sidecar
    image: istio/proxyv2:1.20.0
    # Envoy proxy - all traffic passes through here
Advantage: you gain all of these capabilities without changing application code.
Problems a Service Mesh Solves {#problemler}
1. **Distributed Tracing Complexity**
Before (manual implementation in every service):
# Code repeated in every service
from opentelemetry import trace
tracer = trace.get_tracer(__name__)

@app.route('/api/users')
def get_users():
    with tracer.start_as_current_span("get_users"):
        # Add the trace context to the outbound headers
        headers = inject_trace_context()
        response = requests.get('http://database-service', headers=headers)
        return response.json()
With a Service Mesh:
# The code gets simpler - tracing happens automatically
@app.route('/api/users')
def get_users():
    response = requests.get('http://database-service')
    return response.json()
Tracing spans are created automatically by the sidecar proxy.
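One caveat the simplified handler hides: the sidecar creates spans, but it can only stitch spans from different services into a single trace if the application forwards the incoming trace headers on its outbound calls. A minimal sketch of that forwarding (the header list is Istio's standard B3 set; the function name is our own):

```python
# Istio's sidecar creates spans, but it cannot correlate them across
# services unless the app forwards the trace headers it received.
TRACE_HEADERS = [
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
]

def propagate_trace_headers(incoming_headers: dict) -> dict:
    """Copy the B3 trace headers from the incoming request so they can
    be attached to outbound calls, e.g. requests.get(..., headers=...)."""
    return {
        name: incoming_headers[name]
        for name in TRACE_HEADERS
        if name in incoming_headers
    }

# Only trace-related headers survive the copy:
incoming = {"x-b3-traceid": "80f1a2", "accept": "*/*"}
outbound = propagate_trace_headers(incoming)
```

So "tracing is automatic" means the instrumentation is automatic; header propagation remains the application's one remaining responsibility.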
2. **mTLS and Encryption**
Before: TLS certificate management, rotation, and renewal in every service...
With a Service Mesh: automatic mTLS between sidecars, with certificates rotated automatically every 24 hours.
3. **Canary Deployments and Traffic Splitting**
Before: multiple deployments, manual DNS/LB configuration, risk...
With a Service Mesh: declarative traffic routing:
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
  - my-app
  http:
  - match:
    - headers:
        user-agent:
          regex: ".*Mobile.*"
    route:
    - destination:
        host: my-app
        subset: v2
  - route:
    - destination:
        host: my-app
        subset: v1
      weight: 90
    - destination:
        host: my-app
        subset: v2
      weight: 10  # 10% of traffic to v2
4. **Circuit Breaking and Resilience**
Before: custom retry/timeout logic in every service.
With a Service Mesh: declarative policies:
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: database-circuit-breaker
spec:
  host: database-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 50
        maxRequestsPerConnection: 2
    outlierDetection:
      consecutiveErrors: 5
      interval: 30s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
If the database returns 5 consecutive errors, the failing endpoints are ejected from the load-balancing pool for 30 seconds.
Istio vs Linkerd: Which Should I Choose? {#karsilastirma}
Istio
✅ Advantages:
- Rich feature set (traffic management, security, telemetry)
- Large ecosystem (Kiali, Jaeger, Grafana integrations)
- Multi-cluster, multi-mesh support
- Envoy proxy (battle-tested, performant)
- Powerful CRDs such as VirtualService and Gateway
❌ Disadvantages:
- Complex architecture (control-plane heavy)
- High resource consumption
- Steep learning curve
- Painful upgrade path
Ideal for:
- Large enterprise environments
- Complex traffic routing needs
- Multi-cluster / multi-tenancy
- Rich observability requirements
Linkerd
✅ Advantages:
- Ultra-lightweight (Rust-based data plane)
- Simple installation and operation
- Low resource footprint
- Out-of-the-box mTLS
- Fast startup time
- Production-first approach
❌ Disadvantages:
- Fewer features (a deliberately focused approach)
- Smaller ecosystem
- Limited traffic routing capabilities
- Custom proxy (linkerd2-proxy)
Ideal for:
- Startups and mid-sized environments
- Simple use cases (mTLS + observability)
- Resource-constrained clusters
- Teams that want fast adoption
Comparison Table
| Feature | Istio | Linkerd |
|---|---|---|
| **Ease of Installation** | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| **Resource Usage** | ~500MB (control plane) | ~100MB (control plane) |
| **Proxy** | Envoy (C++) | linkerd2-proxy (Rust) |
| **mTLS** | ✅ (manual enable) | ✅ (default enabled) |
| **Traffic Management** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| **Observability** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| **Multi-cluster** | ✅ Native | ✅ Extension |
| **Maturity** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| **Community** | Very Large | Large |
| **CNCF Status** | Graduated | Graduated |
Istio Architecture and Components {#istio-mimari}
Control Plane: Istiod
Since Istio 1.5 the control plane is a single component, istiod (older releases shipped Pilot, Citadel, and Galley separately).
┌─────────────────────────────────────────────┐
│ Istiod (Control Plane) │
│ ┌────────┐ ┌────────┐ ┌────────┐ │
│ │ Pilot │ │Citadel │ │Galley │ │
│ │Config │ │ Cert │ │ Config │ │
│ │ Mgmt │ │ Mgmt │ │Validate│ │
│ └────────┘ └────────┘ └────────┘ │
└─────────────────────────────────────────────┘
↓ (xDS API)
┌─────────────────────────────────────────────┐
│ Data Plane (Envoy Proxies) │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ Pod A │ │ Pod B │ │
│ │ App | Envoy│ ←──→ │ Envoy | App │ │
│ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────┘
Responsibilities of istiod:
- Configuration management (VirtualService, DestinationRule)
- Certificate authority (mTLS certificates)
- Sidecar injection
- Service discovery
- Proxy configuration (config pushed to Envoy over the xDS API)
Data Plane: Envoy Sidecar
The Envoy proxy injected into every Pod handles:
- L7 traffic management
- Load balancing
- TLS termination/origination
- Metrics collection
- Access logging
Istio Installation (Production-Ready) {#istio-kurulum}
Prerequisites
# Kubernetes cluster (1.27+)
kubectl version
# Install the istioctl CLI
curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.20.0 sh -
cd istio-1.20.0
export PATH=$PWD/bin:$PATH
istioctl version
1. Installation with a Production-Grade Profile
# istioctl ships no dedicated "production" profile; start from the default
# profile (the production-oriented baseline) and raise the HA settings
istioctl install --set profile=default --set values.pilot.autoscaleMin=2 -y
# Verify installation
kubectl get pods -n istio-system
# NAME READY STATUS
# istiod-7d6b8d8f4c-abcde 1/1 Running
# istiod-7d6b8d8f4c-fghij 1/1 Running # HA
# istio-ingressgateway-5c8f9d8b7c-klmno 1/1 Running
What this production-grade setup provides:
- 2x istiod replicas (HA)
- Resource requests/limits
- PodDisruptionBudget
- HorizontalPodAutoscaler
2. Sidecar Auto-Injection
# Label the namespace
kubectl label namespace default istio-injection=enabled
# Verify
kubectl get namespace -L istio-injection
From now on, every Pod deployed into this namespace gets a sidecar injected automatically.
3. Deploying a Demo Application
# bookinfo-app.yaml
apiVersion: v1
kind: Service
metadata:
  name: productpage
spec:
  ports:
  - port: 9080
  selector:
    app: productpage
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: productpage-v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: productpage
      version: v1
  template:
    metadata:
      labels:
        app: productpage
        version: v1
    spec:
      containers:
      - name: productpage
        image: docker.io/istio/examples-bookinfo-productpage-v1:1.18.0
        ports:
        - containerPort: 9080
kubectl apply -f bookinfo-app.yaml
# Check that the Pod has a sidecar
kubectl get pod -l app=productpage -o jsonpath='{.items[0].spec.containers[*].name}'
# Output: productpage istio-proxy ✅
4. Gateway and VirtualService
Ingress Gateway (access from outside the cluster):
# gateway.yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: bookinfo-gateway
spec:
  selector:
    istio: ingressgateway  # Istio ingress gateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "bookinfo.tektik.tr"
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: bookinfo
spec:
  hosts:
  - "bookinfo.tektik.tr"
  gateways:
  - bookinfo-gateway
  http:
  - match:
    - uri:
        prefix: /productpage
    route:
    - destination:
        host: productpage
        port:
          number: 9080
kubectl apply -f gateway.yaml
# Get the ingress IP
export INGRESS_HOST=$(kubectl get svc istio-ingressgateway -n istio-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "http://$INGRESS_HOST/productpage"
Traffic Management with Istio {#istio-traffic}
1. **Version-Based Routing**
Routing traffic to different versions:
# destination-rule.yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: productpage
spec:
  host: productpage
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: productpage-route
spec:
  hosts:
  - productpage
  http:
  - match:
    - headers:
        user-type:
          exact: "premium"
    route:
    - destination:
        host: productpage
        subset: v2  # Premium users → v2
  - route:
    - destination:
        host: productpage
        subset: v1  # Default → v1
Test:
# Normal user → v1
curl http://productpage:9080/productpage
# Premium user → v2
curl -H "user-type: premium" http://productpage:9080/productpage
2. **Weighted Routing (Canary)**
Weighted traffic distribution:
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: productpage-canary
spec:
  hosts:
  - productpage
  http:
  - route:
    - destination:
        host: productpage
        subset: v1
      weight: 80
    - destination:
        host: productpage
        subset: v2
      weight: 20  # 20% of traffic to the new version
Progressive rollout:
# Step 1: 10% (weights must sum to 100, so adjust both routes together)
kubectl patch virtualservice productpage-canary --type=json -p='[{"op": "replace", "path": "/spec/http/0/route/0/weight", "value": 90}, {"op": "replace", "path": "/spec/http/0/route/1/weight", "value": 10}]'
# Step 2: watch the metrics
kubectl exec -it prometheus-xxx -n istio-system -- promtool query instant http://localhost:9090 'istio_request_duration_milliseconds_bucket{destination_service="productpage"}'
# Step 3: if healthy, 50%
kubectl patch virtualservice productpage-canary --type=json -p='[{"op": "replace", "path": "/spec/http/0/route/0/weight", "value": 50}, {"op": "replace", "path": "/spec/http/0/route/1/weight", "value": 50}]'
# Step 4: full cutover, 100%
kubectl patch virtualservice productpage-canary --type=json -p='[{"op": "replace", "path": "/spec/http/0/route/0/weight", "value": 0}, {"op": "replace", "path": "/spec/http/0/route/1/weight", "value": 100}]'
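What a weight like 80/20 means in practice: Envoy makes an independent weighted choice for each request, so the split only holds in aggregate, never for any individual request. A toy sketch of that behavior (the function name is our own, not an Istio API):

```python
import random

def pick_subset(weights: dict, rng: random.Random) -> str:
    """Weighted per-request routing, the way Envoy applies
    VirtualService route weights: one independent draw per request."""
    subsets = list(weights)
    return rng.choices(subsets, weights=[weights[s] for s in subsets], k=1)[0]

rng = random.Random(42)  # fixed seed so the demo is reproducible
counts = {"v1": 0, "v2": 0}
for _ in range(10_000):
    counts[pick_subset({"v1": 80, "v2": 20}, rng)] += 1
# counts approaches an 80/20 split only over many requests
```

This is also why a canary needs real traffic volume before its error-rate metrics mean anything: at low request rates the observed split can deviate noticeably from the configured weights.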
3. **Timeout and Retry**
Resilience policies:
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: ratings-timeout
spec:
  hosts:
  - ratings
  http:
  - route:
    - destination:
        host: ratings
        subset: v1
    timeout: 10s  # 10-second overall timeout
    retries:
      attempts: 3
      perTryTimeout: 2s
      retryOn: 5xx,reset,connect-failure,refused-stream
Test:
# Test with fault injection
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: ratings-delay
spec:
  hosts:
  - ratings
  http:
  - fault:
      delay:
        percentage:
          value: 100
        fixedDelay: 5s  # 5s delay
    route:
    - destination:
        host: ratings
EOF
# The retry mechanism kicks in
curl -w "@curl-format.txt" http://productpage:9080/productpage
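It helps to reason about the worst case these two settings combine into. Assuming, as in Envoy, that `attempts` counts retries after the initial try, a fully failing upstream is abandoned after (1 + attempts) × perTryTimeout, capped by the overall route timeout - a small sketch (our own helper, not an Istio API):

```python
def worst_case_latency(attempts: int, per_try_timeout: float,
                       overall_timeout: float) -> float:
    """Upper bound on the time Envoy spends before giving up on a
    request: the initial try plus `attempts` retries, each burning up
    to perTryTimeout, all capped by the VirtualService-level timeout."""
    return min((1 + attempts) * per_try_timeout, overall_timeout)

# With the policy above (attempts: 3, perTryTimeout: 2s, timeout: 10s)
# a fully failing upstream costs at most 8 seconds, not 10.
budget = worst_case_latency(3, 2.0, 10.0)
```

Keeping this budget below your caller's own timeout avoids retry storms where every hop in the chain retries independently.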
Canary Deployments with Istio {#istio-canary}
Full Canary Deployment Pipeline
Scenario: migrating an e-commerce checkout service to v2 via a canary release.
1. Two Deployments:
# checkout-v1.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: checkout
      version: v1
  template:
    metadata:
      labels:
        app: checkout
        version: v1
    spec:
      containers:
      - name: checkout
        image: mycompany/checkout:v1.5.2
        env:
        - name: VERSION
          value: "v1"
---
# checkout-v2.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-v2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: checkout
      version: v2
  template:
    metadata:
      labels:
        app: checkout
        version: v2
    spec:
      containers:
      - name: checkout
        image: mycompany/checkout:v2.0.0
        env:
        - name: VERSION
          value: "v2"
        - name: NEW_PAYMENT_GATEWAY
          value: "enabled"
2. DestinationRule:
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: checkout
spec:
  host: checkout
  trafficPolicy:
    loadBalancer:
      simple: LEAST_REQUEST  # Route to the least-loaded pod
    connectionPool:
      http:
        http1MaxPendingRequests: 100
        maxRequestsPerConnection: 2
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
    trafficPolicy:
      outlierDetection:
        consecutiveErrors: 3
        interval: 30s
        baseEjectionTime: 60s
3. Progressive Rollout Script:
#!/bin/bash
# canary-rollout.sh
WEIGHTS=(5 10 25 50 75 100)
ERROR_THRESHOLD=0.05 # %5 error rate threshold
for WEIGHT in "${WEIGHTS[@]}"; do
echo "🚀 Rolling out v2 with $WEIGHT% traffic..."
# Update VirtualService
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: checkout
spec:
hosts:
- checkout
http:
- route:
- destination:
host: checkout
subset: v1
weight: $((100 - WEIGHT))
- destination:
host: checkout
subset: v2
weight: $WEIGHT
EOF
# Wait for traffic shift
sleep 10
# Check error rate
ERROR_RATE=$(kubectl exec -n istio-system deploy/prometheus -c prometheus -- \
promtool query instant 'sum(rate(istio_requests_total{destination_service="checkout",destination_version="v2",response_code=~"5.."}[1m])) / sum(rate(istio_requests_total{destination_service="checkout",destination_version="v2"}[1m]))' \
| jq -r '.data.result[0].value[1]')
if (( $(echo "$ERROR_RATE > $ERROR_THRESHOLD" | bc -l) )); then
echo "❌ Error rate too high ($ERROR_RATE). Rolling back!"
kubectl apply -f virtualservice-v1-100.yaml
exit 1
fi
echo "✅ Error rate OK ($ERROR_RATE). Waiting 5 minutes..."
sleep 300
done
echo "🎉 Canary deployment completed successfully!"
4. Automated Rollback with Prometheus Alert
# prometheus-alert.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-alerts
  namespace: istio-system
data:
  alerts.yml: |
    groups:
    - name: canary
      interval: 30s
      rules:
      - alert: CanaryHighErrorRate
        expr: |
          sum(rate(istio_requests_total{destination_service="checkout",destination_version="v2",response_code=~"5.."}[1m]))
          /
          sum(rate(istio_requests_total{destination_service="checkout",destination_version="v2"}[1m]))
          > 0.05
        for: 2m
        annotations:
          summary: "Canary v2 error rate above 5%"
          description: "Rolling back to v1"
        labels:
          severity: critical
          action: rollback
Rollback webhook:
# rollback-webhook.py
from flask import Flask, request
import subprocess

app = Flask(__name__)

@app.route('/webhook', methods=['POST'])
def rollback():
    payload = request.json
    # Alertmanager wraps the firing alerts in an "alerts" list
    for alert in payload.get('alerts', []):
        if alert['labels'].get('action') == 'rollback':
            subprocess.run(['kubectl', 'apply', '-f', 'virtualservice-v1-100.yaml'])
            print("🔄 Rolled back to v1")
            return "OK", 200
    return "Ignored", 200

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080)
Circuit Breaking and Fault Injection with Istio {#istio-resilience}
Circuit Breaker Configuration
Scenario: trip the circuit when the database service comes under excessive load.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: database-circuit-breaker
spec:
  host: postgres-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 50          # Max 50 TCP connections
      http:
        http1MaxPendingRequests: 10
        http2MaxRequests: 100
        maxRequestsPerConnection: 3
    outlierDetection:
      consecutiveGatewayErrors: 3   # 3 consecutive 502/503/504
      consecutive5xxErrors: 5       # or 5 consecutive 5xx
      interval: 10s                 # check every 10s
      baseEjectionTime: 30s         # eject for 30s
      maxEjectionPercent: 50        # eject at most 50% of the pods
      minHealthPercent: 25          # keep at least 25% of the pods healthy
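The outlier-detection mechanics are easy to misread, so here is a toy model of the consecutive-5xx path. This sketch is our own, not Envoy's code: real Envoy additionally multiplies the ejection time by how often a host has already been ejected, enforces maxEjectionPercent and minHealthPercent across the pool, and more.

```python
class OutlierDetector:
    """Toy model of Envoy's consecutive-5xx outlier detection: after N
    consecutive errors a host is ejected for base_ejection_time seconds."""

    def __init__(self, consecutive_5xx: int = 5, base_ejection_time: float = 30.0):
        self.threshold = consecutive_5xx
        self.base_ejection = base_ejection_time
        self.errors = 0
        self.ejected_until = 0.0

    def record(self, status: int, now: float) -> None:
        if status >= 500:
            self.errors += 1
            if self.errors >= self.threshold:
                self.ejected_until = now + self.base_ejection
                self.errors = 0
        else:
            self.errors = 0  # any success resets the streak

    def is_ejected(self, now: float) -> bool:
        return now < self.ejected_until

d = OutlierDetector()
for t in range(5):            # five consecutive 503s at t = 0..4
    d.record(503, now=float(t))
# the host is now ejected until t = 4 + 30 = 34
```

Note how a single successful response resets the streak: sporadic errors never trip the breaker, only uninterrupted runs do.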
Test:
# Load generator
kubectl run -it --rm load-generator --image=busybox -- /bin/sh -c \
  "while true; do wget -q -O- http://postgres-service:5432; done"
# The circuit breaker kicks in
# Logs:
kubectl logs <pod-name> -c istio-proxy | grep "upstream_rq_pending_overflow"
Fault Injection (Chaos Engineering)
Delay injection (latency testing):
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payment-delay-test
spec:
  hosts:
  - payment-service
  http:
  - fault:
      delay:
        percentage:
          value: 50        # on 50% of requests
        fixedDelay: 3s     # 3s delay
    match:
    - headers:
        x-test-user:
          exact: "qa-team" # only for the QA team
    route:
    - destination:
        host: payment-service
Abort injection (error testing):
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payment-abort-test
spec:
  hosts:
  - payment-service
  http:
  - fault:
      abort:
        percentage:
          value: 10        # on 10% of requests
        httpStatus: 503    # return 503 Service Unavailable
    route:
    - destination:
        host: payment-service
Test:
# Normal user - unaffected
curl http://payment-service/checkout
# QA user - sees the 3s delay
curl -H "x-test-user: qa-team" http://payment-service/checkout
mTLS and Security with Istio {#istio-security}
Automatic mTLS (PeerAuthentication)
Enforce mTLS cluster-wide:
# Require mTLS across the entire mesh
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT  # mTLS required between all services
Namespace-specific:
# Only in the production namespace
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: production-mtls
  namespace: production
spec:
  mtls:
    mode: STRICT
Service-specific (permissive mode):
# Legacy service - accept both mTLS and plaintext
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: legacy-service
  namespace: default
spec:
  selector:
    matchLabels:
      app: legacy-app
  mtls:
    mode: PERMISSIVE  # both mTLS and plain HTTP
Authorization Policies
Service-to-service authorization:
# Only the frontend may reach the backend
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: backend-access
  namespace: default
spec:
  selector:
    matchLabels:
      app: backend
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/default/sa/frontend"]
    to:
    - operation:
        methods: ["GET", "POST"]
        paths: ["/api/*"]
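The `principals` entry is the SPIFFE-style identity Istio derives from the workload's service account: trust domain, namespace, then service account. A small helper (our own naming, not an Istio API) that builds it:

```python
def spiffe_principal(namespace: str, service_account: str,
                     trust_domain: str = "cluster.local") -> str:
    """Build the principal string Istio derives from a workload's
    service account, as matched by AuthorizationPolicy `principals`."""
    return f"{trust_domain}/ns/{namespace}/sa/{service_account}"

# The policy above allows exactly this identity:
frontend = spiffe_principal("default", "frontend")
```

This is why AuthorizationPolicy rules keep working across pod restarts and rescheduling: the identity is tied to the service account, not to pod names or IPs.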
JWT-based authentication:
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: jwt-auth
  namespace: default
spec:
  selector:
    matchLabels:
      app: api-gateway
  jwtRules:
  - issuer: "https://accounts.google.com"
    jwksUri: "https://www.googleapis.com/oauth2/v3/certs"
    audiences:
    - "my-app-client-id"
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: require-jwt
  namespace: default
spec:
  selector:
    matchLabels:
      app: api-gateway
  action: ALLOW
  rules:
  - from:
    - source:
        requestPrincipals: ["*"]  # require a valid JWT
    when:
    - key: request.auth.claims[role]
      values: ["admin", "user"]
Test:
# Without a JWT - 403 Forbidden
curl http://api-gateway/api/users
# With a JWT
TOKEN=$(gcloud auth print-identity-token)
curl -H "Authorization: Bearer $TOKEN" http://api-gateway/api/users
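When a `request.auth.claims[...]` rule rejects traffic, it helps to inspect the token's claims. A stdlib-only sketch that decodes a JWT payload without verifying it (Istio performs the actual signature verification against jwksUri); the demo token here is a hypothetical unsigned one, not a real Google token:

```python
import base64
import json

def jwt_claims(token: str) -> dict:
    """Decode the (unverified!) claims of a JWT - handy for checking
    why a request.auth.claims[role] rule rejects a token."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Hypothetical unsigned demo token with {"role": "admin"} as its payload:
demo = ".".join([
    base64.urlsafe_b64encode(b'{"alg":"none"}').decode().rstrip("="),
    base64.urlsafe_b64encode(b'{"role":"admin"}').decode().rstrip("="),
    "",
])
claims = jwt_claims(demo)
```

With the policy above, a token whose decoded `role` claim is neither "admin" nor "user" is denied even though its signature is valid.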
Linkerd Architecture and Advantages {#linkerd-mimari}
Linkerd Architecture
┌──────────────────────────────────────┐
│ Control Plane (linkerd namespace) │
│ ┌──────────┐ ┌──────────┐ │
│ │linkerd- │ │linkerd- │ │
│ │destination│ │identity │ │
│ └──────────┘ └──────────┘ │
└──────────────────────────────────────┘
↓
┌──────────────────────────────────────┐
│ Data Plane (linkerd2-proxy) │
│ ┌────────────┐ ┌────────────┐ │
│ │ Pod │ │ Pod │ │
│ │ App│Proxy │ ←─→ │Proxy│ App │ │
│ └────────────┘ └────────────┘ │
└──────────────────────────────────────┘
Linkerd2-proxy (Rust):
- Ultra-lightweight: ~10MB memory
- Sub-millisecond p99 latency
- Zero-config mTLS (default enabled)
- Service profiles for per-route metrics
Why Linkerd?
1. Simplicity:
# Installation is two commands
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -
# Istio needs roughly 15 configuration steps
2. Performance:
| Metric | Linkerd | Istio |
|---|---|---|
| P50 latency overhead | +0.3ms | +1.2ms |
| P99 latency overhead | +0.8ms | +4.5ms |
| Memory (proxy) | 10MB | 50MB |
| Memory (control plane) | 100MB | 500MB |
3. Zero-config mTLS:
In Linkerd, mTLS is enabled by default and certificate rotation is automatic. In Istio you have to enforce it yourself.
Linkerd Installation (Lightweight Approach) {#linkerd-kurulum}
1. Installing the CLI
# Linkerd CLI
curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/install | sh
export PATH=$PATH:$HOME/.linkerd2/bin
# Pre-check
linkerd check --pre
# ✔ Kubernetes API
# ✔ Kubernetes version >= 1.21
# ✔ kubectl configured
2. Installing the Control Plane
# CRDs
linkerd install --crds | kubectl apply -f -
# Control plane
linkerd install | kubectl apply -f -
# Verify
linkerd check
# ✔ control plane is up-to-date
# ✔ control plane and cli versions match
Resource usage:
kubectl top pod -n linkerd
# NAME CPU MEMORY
# linkerd-destination-xxx 2m 30Mi
# linkerd-identity-xxx 1m 25Mi
# linkerd-proxy-injector-xxx 1m 20Mi
For comparison, Istio:
kubectl top pod -n istio-system
# NAME CPU MEMORY
# istiod-xxx 50m 250Mi
# istio-ingressgateway-xxx 10m 100Mi
3. Sidecar Injection
# Namespace annotation
kubectl annotate namespace default linkerd.io/inject=enabled
# Deploy app
kubectl apply -f app.yaml
# Verify
kubectl get pod -o jsonpath='{.items[0].spec.containers[*].name}'
# Output: app linkerd-proxy ✅
4. Viz Extension (Dashboard)
# Grafana + Prometheus + Dashboard
linkerd viz install | kubectl apply -f -
# Open dashboard
linkerd viz dashboard &
# Grafana: http://localhost:50750
Traffic Splitting with Linkerd {#linkerd-traffic}
Linkerd uses the SMI TrafficSplit spec:
# trafficsplit.yaml
apiVersion: split.smi-spec.io/v1alpha4
kind: TrafficSplit
metadata:
  name: checkout-split
spec:
  service: checkout  # Root (apex) service
  backends:
  - service: checkout-v1
    weight: 80
  - service: checkout-v2
    weight: 20  # 20% of traffic to v2
---
# Services
apiVersion: v1
kind: Service
metadata:
  name: checkout
spec:
  selector:
    app: checkout  # all versions
  ports:
  - port: 80
---
apiVersion: v1
kind: Service
metadata:
  name: checkout-v1
spec:
  selector:
    app: checkout
    version: v1
  ports:
  - port: 80
---
apiVersion: v1
kind: Service
metadata:
  name: checkout-v2
spec:
  selector:
    app: checkout
    version: v2
  ports:
  - port: 80
Live metrics:
linkerd viz stat trafficsplit/checkout-split
# NAME SERVICE SUCCESS RPS P50_LATENCY P99_LATENCY
# checkout-split checkout-v1 100.00% 8.0 10ms 25ms
# checkout-v2 100.00% 2.0 12ms 30ms
Automated canaries with Flagger:
# Flagger CRD
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: checkout
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout
  service:
    port: 80
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
    - name: request-success-rate
      thresholdRange:
        min: 99
    - name: request-duration
      thresholdRange:
        max: 500
    webhooks:
    - name: load-test
      url: http://flagger-loadtester/
Flagger automatically shifts traffic in steps (10%, 20%, 30%, ...) and promotes or rolls back the rollout based on the metrics.
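The analysis settings translate into a simple weight schedule. A sketch (our own helper, mirroring the stepWeight/maxWeight semantics) of the steps Flagger walks through before promotion:

```python
def canary_schedule(step_weight: int, max_weight: int) -> list:
    """Weight steps a Flagger analysis walks through: increments of
    stepWeight up to maxWeight, after which the canary is promoted."""
    return list(range(step_weight, max_weight + 1, step_weight))

# With the Canary above (stepWeight: 10, maxWeight: 50):
steps = canary_schedule(step_weight=10, max_weight=50)
```

With `interval: 1m` and `threshold: 5`, each step is held for a minute, and five consecutive failed metric checks at any step trigger a rollback instead of the next increment.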
Multi-Cluster Service Mesh with Linkerd {#linkerd-multicluster}
Scenario: a service mesh spanning US and EU clusters.
1. Multi-cluster Extension
# Cluster 1 (US)
linkerd multicluster install | kubectl apply -f -
# Cluster 2 (EU)
linkerd multicluster install | kubectl apply -f -
2. Link Clusters
# Link the EU cluster to the US cluster
linkerd --context=us multicluster link --cluster-name us \
  | kubectl --context=eu apply -f -
# Verify
linkerd --context=eu multicluster gateways
# CLUSTER ALIVE NUM_SVC LATENCY
# us True 3 45ms
3. Export Service
# Export the service from the US cluster
kubectl --context=us label svc checkout mirror.linkerd.io/exported=true
# Access it from the EU cluster
kubectl --context=eu get svc
# NAME          TYPE       CLUSTER-IP   EXTERNAL-IP  PORT(S)
# checkout-us   ClusterIP  10.100.1.50  <none>       80/TCP
# Call the US service from an EU pod
curl http://checkout-us/api
Traffic routing:
# Traffic split in the EU cluster
apiVersion: split.smi-spec.io/v1alpha4
kind: TrafficSplit
metadata:
  name: checkout-multi-region
spec:
  service: checkout
  backends:
  - service: checkout     # EU local
    weight: 80
  - service: checkout-us  # US remote
    weight: 20            # failover or load balancing
Service Mesh Observability: Kiali, Grafana, Jaeger {#observability}
1. Kiali (Istio Service Graph)
# Install Kiali
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/kiali.yaml
# Dashboard
istioctl dashboard kiali
Kiali features:
- Real-time service topology graph
- Traffic flow visualization
- Configuration validation
- Health checks
Example view:
┌─────────────┐ ┌─────────────┐
│ Frontend │ ──90%─→ │ Backend │
│ v1: 3/3 │ │ v1: 2/3 │
│ ✔ Healthy │ ←───── │ v2: 1/3 │
└─────────────┘ └─────────────┘
│ │
│ 100% │ 50%
↓ ↓
┌─────────────┐ ┌─────────────┐
│ Redis │ │ Postgres │
│ ✔ 1/1 │ │ ⚠ 2/3 │
└─────────────┘ └─────────────┘
2. Grafana Dashboards
# Grafana + Prometheus
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/prometheus.yaml
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/grafana.yaml
istioctl dashboard grafana
Pre-built dashboards:
- Istio Mesh Dashboard (global metrics)
- Istio Service Dashboard (per-service)
- Istio Workload Dashboard (per-pod)
- Istio Performance Dashboard (control plane)
Custom queries:
# Request rate per service
sum(rate(istio_requests_total[5m])) by (destination_service)
# Error rate
sum(rate(istio_requests_total{response_code=~"5.."}[5m]))
/
sum(rate(istio_requests_total[5m]))
# P99 latency
histogram_quantile(0.99,
sum(rate(istio_request_duration_milliseconds_bucket[5m]))
by (le, destination_service)
)
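For intuition on what `histogram_quantile()` actually computes from `istio_request_duration_milliseconds_bucket`: it finds the first cumulative bucket whose count reaches the requested rank and linearly interpolates inside it. A self-contained sketch (simplified - Prometheus additionally handles the `+Inf` bucket and edge cases):

```python
def histogram_quantile(q: float, buckets: list) -> float:
    """Linear-interpolation quantile over Prometheus-style cumulative
    buckets [(upper_bound_ms, cumulative_count), ...] - the same idea
    histogram_quantile() applies to Istio's duration histograms."""
    total = buckets[-1][1]
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            # interpolate within the bucket that contains the rank
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return buckets[-1][0]

# 100 requests: 90 under 10ms, 9 more under 25ms, 1 under 100ms
p99 = histogram_quantile(0.99, [(10, 90), (25, 99), (100, 100)])
```

This is also why the quantile can only ever be as precise as the bucket boundaries: a p99 landing in a wide bucket is an interpolated guess, not a measured value.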
3. Jaeger (Distributed Tracing)
# Install Jaeger
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/jaeger.yaml
istioctl dashboard jaeger
Example trace flow:
[Frontend] ──(50ms)──→ [API Gateway] ──(20ms)──→ [Auth Service]
│
└──(150ms)──→ [Backend Service]
│
├──(80ms)──→ [Database]
└──(70ms)──→ [Cache]
Total: 370ms (150ms backend bottleneck 🔴)
Trace context propagation (automatic):
Istio sidecars automatically inject the trace headers:
x-request-id, x-b3-traceid, x-b3-spanid, x-b3-parentspanid, x-b3-sampled
Production Best Practices {#best-practices}
1. **Resource Limits (IMPORTANT!)**
Sidecar proxy limits:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    sidecar.istio.io/proxyCPU: "100m"
    sidecar.istio.io/proxyCPULimit: "200m"
    sidecar.istio.io/proxyMemory: "128Mi"
    sidecar.istio.io/proxyMemoryLimit: "256Mi"
spec:
  containers:
  - name: app
    image: my-app
Control plane limits:
istioctl install --set values.pilot.resources.requests.cpu=500m \
--set values.pilot.resources.requests.memory=2Gi \
--set values.pilot.resources.limits.cpu=2 \
--set values.pilot.resources.limits.memory=4Gi
2. **High Availability**
# Multi-replica control plane (note: istioctl has no "production"
# profile; build HA on top of the default profile)
istioctl install --set profile=default \
  --set values.pilot.autoscaleEnabled=true \
  --set values.pilot.autoscaleMin=2 \
  --set values.pilot.autoscaleMax=5
3. **Gradual Rollout**
Namespace-by-namespace adoption:
# Phase 1: Dev namespace
kubectl label namespace dev istio-injection=enabled
# Phase 2: Staging
kubectl label namespace staging istio-injection=enabled
# Phase 3: Production (non-critical services)
kubectl label namespace production-backend istio-injection=enabled
# Phase 4: Production (critical services)
kubectl label namespace production-frontend istio-injection=enabled
4. **Monitoring and Alerting**
Critical alerts:
# Prometheus AlertManager
groups:
- name: istio
  rules:
  - alert: IstioPilotDown
    expr: up{job="pilot"} == 0
    for: 5m
  - alert: HighProxyResourceUsage
    expr: container_memory_usage_bytes{container="istio-proxy"} / container_spec_memory_limit_bytes{container="istio-proxy"} > 0.9
    for: 10m
  - alert: HighErrorRate
    expr: sum(rate(istio_requests_total{response_code=~"5.."}[5m])) / sum(rate(istio_requests_total[5m])) > 0.05
    for: 5m
5. **Security Hardening**
# Control egress traffic
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-apis
spec:
  hosts:
  - "api.stripe.com"
  - "api.github.com"
  ports:
  - number: 443
    name: https
    protocol: HTTPS
  location: MESH_EXTERNAL
  resolution: DNS
---
# Block all other egress traffic
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default
  namespace: istio-system
spec:
  outboundTrafficPolicy:
    mode: REGISTRY_ONLY  # only allow destinations with a ServiceEntry
Performance and Resource Optimization {#performance}
1. **Sidecar Scoping (Network Optimization)**
Problem: the configuration of every service is pushed to every pod, driving memory usage up.
Solution: limit the scope with a Sidecar resource:
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: frontend-sidecar
  namespace: production
spec:
  workloadSelector:
    labels:
      app: frontend
  egress:
  - hosts:
    - "./backend.production.svc.cluster.local"  # only needs to reach the backend
    - "istio-system/*"                          # Istio system services
Result: the frontend sidecar receives only the backend's configuration - roughly an 80% memory saving in this example.
2. **Telemetry Filtering**
Problem: full metrics for every request lead to high Prometheus cardinality.
Solution: filter with the Telemetry API:
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: low-priority-metrics
  namespace: default
spec:
  selector:
    matchLabels:
      app: low-priority-service
  metrics:
  - providers:
    - name: prometheus
    overrides:
    - match:
        metric: REQUEST_COUNT
      disabled: true  # disable the request-count metric
3. **Access Log Sampling**
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: access-log-sampling
spec:
  accessLogging:
  - providers:
    - name: envoy
    filter:
      expression: "response.code >= 400"  # only log errors
    # Or sampling:
    # match:
    #   mode: SERVER
    # sampling: 10  # log 10% of requests
4. **Protocol Detection Optimization**
Problem: Istio performs automatic protocol detection, which adds latency overhead.
Solution: declare the protocol explicitly:
apiVersion: v1
kind: Service
metadata:
  name: database
spec:
  ports:
  - port: 5432
    name: tcp-postgres        # "tcp-" prefix → no auto-detection
  - port: 6379
    name: tcp-redis
  - port: 9200
    name: http-elasticsearch  # "http-" prefix
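The convention is `name: <protocol>[-<suffix>]`. A small helper (our own, with an abridged protocol list) showing how a port name maps to a protocol:

```python
# Protocols Istio recognizes in port-name prefixes (abridged list)
KNOWN_PROTOCOLS = {"http", "http2", "https", "grpc", "tcp", "tls",
                   "mongo", "mysql", "redis"}

def protocol_from_port_name(name: str) -> str:
    """Apply the `<protocol>[-<suffix>]` port-naming convention: a known
    prefix pins the protocol; anything else falls back to detection."""
    prefix = name.split("-", 1)[0]
    return prefix if prefix in KNOWN_PROTOCOLS else "auto-detect"
```

So `tcp-postgres` pins the port to raw TCP, while an unprefixed name such as `postgres` leaves the proxy guessing.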
Migration Strategies {#migration}
Migrating from Legacy to a Service Mesh
Scenario: migrating an existing estate of 50 microservices to a Service Mesh.
1. Assessment Phase (1-2 weeks)
# Analyze the existing cluster
kubectl get pods --all-namespaces -o json | jq '.items | length'
# 347 pods
# Deployment topology
kubectl get deploy --all-namespaces -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,REPLICAS:.spec.replicas --no-headers | awk '{print $1}' | sort | uniq -c
Risk assessment:
- High-traffic services (>1000 RPS) → migrate last
- Low-traffic services (test, staging) → migrate first
- Stateful services (databases, caches) → skip sidecar injection
2. Pilot Phase (2-3 weeks)
# Dev namespace
kubectl create namespace dev-mesh
kubectl label namespace dev-mesh istio-injection=enabled
# Migrate 3 low-risk services (a resource's namespace is immutable,
# so export the manifests, rewrite the namespace, and re-apply)
kubectl get deploy -n dev -l risk=low -o yaml \
  | sed 's/namespace: dev$/namespace: dev-mesh/' \
  | kubectl apply -f -
# Monitor for a week
linkerd viz stat deploy -n dev-mesh --from deploy/load-generator
Success criteria:
- ✅ P99 latency overhead < 5ms
- ✅ Zero 5xx errors
- ✅ Resource usage < 110% of the previous baseline
3. Progressive Rollout (4-8 hafta)
# Week 1-2: Low-risk services
kubectl label namespace staging istio-injection=enabled
kubectl rollout restart deployment -n staging
# Week 3-4: Medium-risk services
kubectl label namespace production-backend istio-injection=enabled
kubectl rollout restart deployment -n production-backend --cascade=background
# Week 5-6: High-traffic services (gradual)
# Canary pattern: 1 pod → 25% → 50% → 100%
kubectl scale deployment/api-gateway --replicas=4
kubectl patch deployment/api-gateway -p '{"spec":{"template":{"metadata":{"labels":{"version":"meshed"}}}}}' # 1 pod meshed
# Traffic split
kubectl apply -f virtualservice-25percent-meshed.yaml
# ... wait 3 days, monitor
kubectl apply -f virtualservice-50percent-meshed.yaml
# ... wait 3 days
kubectl apply -f virtualservice-100percent-meshed.yaml
# Week 7-8: Critical services (database proxies, auth)
# Manual, controlled rollout
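The virtualservice-*percent-meshed.yaml files above are placeholders; a hedged sketch of what the 25% step might contain, assuming subsets named `legacy` and `meshed` keyed on a `version` label:

```yaml
# Sketch of a 25%-meshed traffic split (subset names assumed)
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: api-gateway
spec:
  host: api-gateway
  subsets:
  - name: legacy
    labels:
      version: legacy
  - name: meshed
    labels:
      version: meshed
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api-gateway
spec:
  hosts:
  - api-gateway
  http:
  - route:
    - destination:
        host: api-gateway
        subset: legacy
      weight: 75
    - destination:
        host: api-gateway
        subset: meshed
      weight: 25
```

The 50% and 100% steps would only change the two `weight` values.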
Troubleshooting and Debugging {#troubleshooting}
1. **Sidecar Injection Not Working**
# Check webhook
kubectl get mutatingwebhookconfiguration istio-sidecar-injector -o yaml
# Check namespace label
kubectl get namespace default --show-labels | grep istio-injection
# Manual injection test
istioctl kube-inject -f deployment.yaml | kubectl apply -f -
# Injection logs
kubectl logs -n istio-system deploy/istiod | grep injection
2. **503 Service Unavailable**
Troubleshooting steps:
# 1. Check the Envoy cluster config
istioctl proxy-config cluster <pod-name> -n <namespace>
# 2. Endpoint discovery
istioctl proxy-config endpoint <pod-name> -n <namespace> --cluster <service-name>
# 3. Listener check
istioctl proxy-config listener <pod-name> -n <namespace>
# 4. Route check
istioctl proxy-config route <pod-name> -n <namespace>
# 5. Envoy logs
kubectl logs <pod-name> -c istio-proxy | grep "upstream connect error"
Common causes:
- DestinationRule subset mismatch
- PeerAuthentication mTLS mismatch (STRICT vs PERMISSIVE)
- Incorrect Service selector
- Network policies blocking traffic
- Network policies blocking traffic
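To illustrate the mTLS-mismatch case: if the server side enforces STRICT mTLS but a client-side DestinationRule disables TLS, every request fails with 503. A minimal sketch of an aligned pair (service and namespace names assumed):

```yaml
# STRICT mTLS on the server side...
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: backend
  namespace: production
spec:
  selector:
    matchLabels:
      app: backend
  mtls:
    mode: STRICT
---
# ...must be matched by ISTIO_MUTUAL (or no explicit tls block) on the client side
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: backend
  namespace: production
spec:
  host: backend.production.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
```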
3. **mTLS Certificate Errors**
# Inspect the certificate chain
istioctl proxy-config secret <pod-name> -n <namespace>
# Certificate expiration (pipe through x509 to read notBefore/notAfter)
kubectl exec <pod-name> -c istio-proxy -- openssl s_client -connect backend:443 -showcerts 2>/dev/null | openssl x509 -noout -dates
# Force certificate rotation
kubectl delete pod <pod-name> # Yeni pod yeni cert alır
4. **High Latency (P99 > 100ms overhead)**
# Telemetry overhead check
kubectl top pod -l app=my-app --containers
# NAME CPU MEMORY
# my-app 50m 100Mi
# istio-proxy 20m 50Mi # 40% CPU overhead → too high!
# Disable access logs (reduces latency); this is a mesh-wide setting
istioctl install --set meshConfig.accessLogFile=""
# Reduce the stats Envoy collects
istioctl install --set meshConfig.defaultConfig.proxyStatsMatcher.inclusionPrefixes[0]=cluster.outbound
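The same stats reduction can also be scoped to a single workload via the `proxy.istio.io/config` pod annotation instead of a mesh-wide setting; a sketch (prefix list assumed):

```yaml
# Sketch: limit Envoy stats to outbound clusters for one workload only
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        proxy.istio.io/config: |
          proxyStatsMatcher:
            inclusionPrefixes:
            - "cluster.outbound"
    spec:
      containers:
      - name: app
        image: my-app:v1
```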
5. **Control Plane Unstable**
# Istiod crashlooping
kubectl logs -n istio-system deploy/istiod --previous
# Pilot overload (too many services/endpoints)? Check istiod resource usage
kubectl top pod -n istio-system -l app=istiod
# Increase pilot resources:
istioctl install --set values.pilot.resources.requests.cpu=2 \
--set values.pilot.resources.requests.memory=4Gi
# Config sync status
istioctl proxy-status
# SYNC STATUS will show NOT SENT if config too large
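When proxy-status reports NOT SENT because the pushed config is too large, a namespace-scoped Sidecar resource can shrink what istiod sends to each proxy. A minimal sketch limiting egress visibility to the same namespace plus istio-system:

```yaml
# Sketch: default Sidecar scope for one namespace (reduces pushed config)
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default
  namespace: production
spec:
  egress:
  - hosts:
    - "./*"              # services in the same namespace
    - "istio-system/*"   # control plane / telemetry
```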
Conclusion
A Service Mesh simplifies operations in microservice architectures by abstracting operational complexity away from application code. Istio offers a feature-rich, enterprise-grade solution, while Linkerd takes a lightweight, simpler approach.
When Should You Use a Service Mesh?
✅ Use a Service Mesh when:
- You run 10+ microservices
- mTLS is a hard requirement
- You need canary/blue-green deployments
- You need distributed tracing
- You follow a zero-trust network model
❌ Skip the Service Mesh when:
- You run a monolith or fewer than 5 services
- You expose a simple CRUD API
- Your cluster is resource-constrained (<10 nodes)
- You haven't yet mastered Kubernetes itself
First Steps
Start small:
# Linkerd (about 1 hour; since 2.12 the CRDs are installed first)
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -
kubectl annotate namespace dev linkerd.io/inject=enabled
kubectl rollout restart deployment -n dev
# Metrics kontrol
linkerd viz stat deploy -n dev
Then Istio (for production-ready features):
istioctl install --set profile=default  # use "default", not "demo", in production
kubectl label namespace production istio-injection=enabled
Resources
- Istio Docs: https://istio.io/latest/docs/
- Linkerd Docs: https://linkerd.io/docs/
- Service Mesh Comparison: https://servicemesh.es/
- CNCF Service Mesh Landscape: https://landscape.cncf.io/
At TekTık Yazılım, we deliver production-ready Service Mesh implementations on our customers' Kubernetes clusters. Get in touch for Istio/Linkerd installation, migration strategy, and 24/7 support.
See you in the next post! 🚀