Kubernetes Service Mesh: Managing Microservices with Istio and Linkerd
As microservice architectures scale, managing service-to-service communication grows increasingly complex. A Service Mesh abstracts this complexity away from application code, solving cross-cutting concerns such as traffic management, security, observability, and resilience in a centralized way.
In this article we take a detailed look at Istio and Linkerd, the two leading Service Mesh solutions in the Kubernetes ecosystem. With practical examples you can use in production, installation guides, traffic management strategies, mTLS security, and observability implementations, you will get a comprehensive introduction to the Service Mesh world.
Table of Contents
- What Is a Service Mesh?
- Problems a Service Mesh Solves
- Istio vs Linkerd: Which Should I Choose?
- Istio Architecture and Components
- Istio Installation (Production-Ready)
- Traffic Management with Istio
- Canary Deployments with Istio
- Circuit Breaking and Fault Injection with Istio
- mTLS and Security with Istio
- Linkerd Architecture and Advantages
- Linkerd Installation (Lightweight Approach)
- Traffic Splitting with Linkerd
- Multi-Cluster Service Mesh with Linkerd
- Service Mesh Observability: Kiali, Grafana, Jaeger
- Production Best Practices
- Performance and Resource Optimization
- Migration Strategies
- Troubleshooting and Debugging
What Is a Service Mesh? {#nedir}
A Service Mesh is an infrastructure layer that manages service-to-service communication in a microservice architecture. It deploys a sidecar proxy (typically Envoy) next to each service to intercept network traffic, and provides the following:
Core Features
- Traffic Management: Load balancing, routing, retries, timeouts
- Security: Mutual TLS (mTLS), authentication, authorization
- Observability: Metrics, distributed tracing, access logs
- Resilience: Circuit breaking, fault injection, rate limiting
Sidecar Pattern
# Application + sidecar proxy in the same Pod
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app:v1
    ports:
    - containerPort: 8080
  - name: istio-proxy  # Sidecar
    image: istio/proxyv2:1.20.0
    # Envoy proxy - all traffic passes through here
Advantage: you gain all of these capabilities without changing application code.
Problems a Service Mesh Solves {#problemler}
1. **Distributed Tracing Complexity**
Before (manual implementation in every service):
# Code repeated in every service
from opentelemetry import trace
tracer = trace.get_tracer(__name__)

@app.route('/api/users')
def get_users():
    with tracer.start_as_current_span("get_users"):
        # Add the trace context to the outbound headers
        headers = inject_trace_context()
        response = requests.get('http://database-service', headers=headers)
        return response.json()
With a Service Mesh:
# The code gets simpler - tracing happens automatically
@app.route('/api/users')
def get_users():
    response = requests.get('http://database-service')
    return response.json()
Tracing spans are created automatically by the sidecar proxy.
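One caveat the simplified handler hides: the sidecar creates spans, but it can only stitch spans from different services into a single trace if the application forwards the incoming trace headers on its outbound calls. A minimal sketch of that forwarding (the header list is Istio's standard B3 set; the function name is our own):

```python
# Istio's sidecar creates spans, but it cannot correlate them across
# services unless the app forwards the trace headers it received.
TRACE_HEADERS = [
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
]

def propagate_trace_headers(incoming_headers: dict) -> dict:
    """Copy the B3 trace headers from the incoming request so they can
    be attached to outbound calls, e.g. requests.get(..., headers=...)."""
    return {
        name: incoming_headers[name]
        for name in TRACE_HEADERS
        if name in incoming_headers
    }

# Only trace-related headers survive the copy:
incoming = {"x-b3-traceid": "80f1a2", "accept": "*/*"}
outbound = propagate_trace_headers(incoming)
```

So "tracing is automatic" means the instrumentation is automatic; header propagation remains the application's one remaining responsibility.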
2. **mTLS and Encryption**
Before: TLS certificate management, rotation, and renewal in every service...
With a Service Mesh: automatic mTLS between sidecars, with certificates rotated automatically every 24 hours.
3. **Canary Deployments and Traffic Splitting**
Before: multiple deployments, manual DNS/LB configuration, risk...
With a Service Mesh: declarative traffic routing:
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
  - my-app
  http:
  - match:
    - headers:
        user-agent:
          regex: ".*Mobile.*"
    route:
    - destination:
        host: my-app
        subset: v2
  - route:
    - destination:
        host: my-app
        subset: v1
      weight: 90
    - destination:
        host: my-app
        subset: v2
      weight: 10  # 10% of traffic to v2
4. **Circuit Breaking and Resilience**
Before: custom retry/timeout logic in every service.
With a Service Mesh: declarative policies:
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: database-circuit-breaker
spec:
  host: database-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 50
        maxRequestsPerConnection: 2
    outlierDetection:
      consecutiveErrors: 5
      interval: 30s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
If the database returns 5 consecutive errors, the failing endpoints are ejected from the load-balancing pool for 30 seconds.
Istio vs Linkerd: Which Should I Choose? {#karsilastirma}
Istio
✅ Advantages:
- Rich feature set (traffic management, security, telemetry)
- Large ecosystem (Kiali, Jaeger, Grafana integrations)
- Multi-cluster, multi-mesh support
- Envoy proxy (battle-tested, performant)
- Powerful CRDs such as VirtualService and Gateway
❌ Disadvantages:
- Complex architecture (control-plane heavy)
- High resource consumption
- Steep learning curve
- Painful upgrade path
Ideal for:
- Large enterprise environments
- Complex traffic routing needs
- Multi-cluster / multi-tenancy
- Rich observability requirements
Linkerd
✅ Advantages:
- Ultra-lightweight (Rust-based data plane)
- Simple installation and operation
- Low resource footprint
- Out-of-the-box mTLS
- Fast startup time
- Production-first approach
❌ Disadvantages:
- Fewer features (a deliberately focused approach)
- Smaller ecosystem
- Limited traffic routing capabilities
- Custom proxy (linkerd2-proxy)
Ideal for:
- Startups and mid-sized environments
- Simple use cases (mTLS + observability)
- Resource-constrained clusters
- Teams that want fast adoption
Comparison Table
| Feature | Istio | Linkerd |
|---|---|---|
| **Ease of Installation** | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| **Resource Usage** | ~500MB (control plane) | ~100MB (control plane) |
| **Proxy** | Envoy (C++) | linkerd2-proxy (Rust) |
| **mTLS** | ✅ (manual enable) | ✅ (default enabled) |
| **Traffic Management** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| **Observability** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| **Multi-cluster** | ✅ Native | ✅ Extension |
| **Maturity** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| **Community** | Very Large | Large |
| **CNCF Status** | Graduated | Graduated |
Istio Architecture and Components {#istio-mimari}
Control Plane: Istiod
Since Istio 1.5 the control plane is a single component, istiod (older releases shipped Pilot, Citadel, and Galley separately).
┌─────────────────────────────────────────────┐
│ Istiod (Control Plane) │
│ ┌────────┐ ┌────────┐ ┌────────┐ │
│ │ Pilot │ │Citadel │ │Galley │ │
│ │Config │ │ Cert │ │ Config │ │
│ │ Mgmt │ │ Mgmt │ │Validate│ │
│ └────────┘ └────────┘ └────────┘ │
└─────────────────────────────────────────────┘
↓ (xDS API)
┌─────────────────────────────────────────────┐
│ Data Plane (Envoy Proxies) │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ Pod A │ │ Pod B │ │
│ │ App | Envoy│ ←──→ │ Envoy | App │ │
│ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────┘
Responsibilities of istiod:
- Configuration management (VirtualService, DestinationRule)
- Certificate authority (mTLS certificates)
- Sidecar injection
- Service discovery
- Proxy configuration (config pushed to Envoy over the xDS API)
Data Plane: Envoy Sidecar
The Envoy proxy injected into every Pod handles:
- L7 traffic management
- Load balancing
- TLS termination/origination
- Metrics collection
- Access logging
Istio Installation (Production-Ready) {#istio-kurulum}
Prerequisites
# Kubernetes cluster (1.27+)
kubectl version
# Install the istioctl CLI
curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.20.0 sh -
cd istio-1.20.0
export PATH=$PWD/bin:$PATH
istioctl version
1. Installation with a Production-Grade Profile
# istioctl ships no dedicated "production" profile; start from the default
# profile (the production-oriented baseline) and raise the HA settings
istioctl install --set profile=default --set values.pilot.autoscaleMin=2 -y
# Verify installation
kubectl get pods -n istio-system
# NAME READY STATUS
# istiod-7d6b8d8f4c-abcde 1/1 Running
# istiod-7d6b8d8f4c-fghij 1/1 Running # HA
# istio-ingressgateway-5c8f9d8b7c-klmno 1/1 Running
What this production-grade setup provides:
- 2x istiod replicas (HA)
- Resource requests/limits
- PodDisruptionBudget
- HorizontalPodAutoscaler
2. Sidecar Auto-Injection
# Label the namespace
kubectl label namespace default istio-injection=enabled
# Verify
kubectl get namespace -L istio-injection
From now on, every Pod deployed into this namespace gets a sidecar injected automatically.
3. Deploying a Demo Application
# bookinfo-app.yaml
apiVersion: v1
kind: Service
metadata:
  name: productpage
spec:
  ports:
  - port: 9080
  selector:
    app: productpage
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: productpage-v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: productpage
      version: v1
  template:
    metadata:
      labels:
        app: productpage
        version: v1
    spec:
      containers:
      - name: productpage
        image: docker.io/istio/examples-bookinfo-productpage-v1:1.18.0
        ports:
        - containerPort: 9080
kubectl apply -f bookinfo-app.yaml
# Check that the Pod has a sidecar
kubectl get pod -l app=productpage -o jsonpath='{.items[0].spec.containers[*].name}'
# Output: productpage istio-proxy ✅
4. Gateway and VirtualService
Ingress Gateway (access from outside the cluster):
# gateway.yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: bookinfo-gateway
spec:
  selector:
    istio: ingressgateway  # Istio ingress gateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "bookinfo.tektik.tr"
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: bookinfo
spec:
  hosts:
  - "bookinfo.tektik.tr"
  gateways:
  - bookinfo-gateway
  http:
  - match:
    - uri:
        prefix: /productpage
    route:
    - destination:
        host: productpage
        port:
          number: 9080
kubectl apply -f gateway.yaml
# Get the ingress IP
export INGRESS_HOST=$(kubectl get svc istio-ingressgateway -n istio-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "http://$INGRESS_HOST/productpage"
Traffic Management with Istio {#istio-traffic}
1. **Version-Based Routing**
Routing traffic to different versions:
# destination-rule.yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: productpage
spec:
  host: productpage
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: productpage-route
spec:
  hosts:
  - productpage
  http:
  - match:
    - headers:
        user-type:
          exact: "premium"
    route:
    - destination:
        host: productpage
        subset: v2  # Premium users → v2
  - route:
    - destination:
        host: productpage
        subset: v1  # Default → v1
Test:
# Normal user → v1
curl http://productpage:9080/productpage
# Premium user → v2
curl -H "user-type: premium" http://productpage:9080/productpage
2. **Weighted Routing (Canary)**
Weighted traffic distribution:
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: productpage-canary
spec:
  hosts:
  - productpage
  http:
  - route:
    - destination:
        host: productpage
        subset: v1
      weight: 80
    - destination:
        host: productpage
        subset: v2
      weight: 20  # 20% of traffic to the new version
Progressive rollout:
# Step 1: 10% (weights must sum to 100, so adjust both routes together)
kubectl patch virtualservice productpage-canary --type=json -p='[{"op": "replace", "path": "/spec/http/0/route/0/weight", "value": 90}, {"op": "replace", "path": "/spec/http/0/route/1/weight", "value": 10}]'
# Step 2: watch the metrics
kubectl exec -it prometheus-xxx -n istio-system -- promtool query instant http://localhost:9090 'istio_request_duration_milliseconds_bucket{destination_service="productpage"}'
# Step 3: if healthy, 50%
kubectl patch virtualservice productpage-canary --type=json -p='[{"op": "replace", "path": "/spec/http/0/route/0/weight", "value": 50}, {"op": "replace", "path": "/spec/http/0/route/1/weight", "value": 50}]'
# Step 4: full cutover, 100%
kubectl patch virtualservice productpage-canary --type=json -p='[{"op": "replace", "path": "/spec/http/0/route/0/weight", "value": 0}, {"op": "replace", "path": "/spec/http/0/route/1/weight", "value": 100}]'
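What a weight like 80/20 means in practice: Envoy makes an independent weighted choice for each request, so the split only holds in aggregate, never for any individual request. A toy sketch of that behavior (the function name is our own, not an Istio API):

```python
import random

def pick_subset(weights: dict, rng: random.Random) -> str:
    """Weighted per-request routing, the way Envoy applies
    VirtualService route weights: one independent draw per request."""
    subsets = list(weights)
    return rng.choices(subsets, weights=[weights[s] for s in subsets], k=1)[0]

rng = random.Random(42)  # fixed seed so the demo is reproducible
counts = {"v1": 0, "v2": 0}
for _ in range(10_000):
    counts[pick_subset({"v1": 80, "v2": 20}, rng)] += 1
# counts approaches an 80/20 split only over many requests
```

This is also why a canary needs real traffic volume before its error-rate metrics mean anything: at low request rates the observed split can deviate noticeably from the configured weights.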
3. **Timeout and Retry**
Resilience policies:
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: ratings-timeout
spec:
  hosts:
  - ratings
  http:
  - route:
    - destination:
        host: ratings
        subset: v1
    timeout: 10s  # 10-second overall timeout
    retries:
      attempts: 3
      perTryTimeout: 2s
      retryOn: 5xx,reset,connect-failure,refused-stream
Test:
# Test with fault injection
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: ratings-delay
spec:
  hosts:
  - ratings
  http:
  - fault:
      delay:
        percentage:
          value: 100
        fixedDelay: 5s  # 5s delay
    route:
    - destination:
        host: ratings
EOF
# The retry mechanism kicks in
curl -w "@curl-format.txt" http://productpage:9080/productpage
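It helps to reason about the worst case these two settings combine into. Assuming, as in Envoy, that `attempts` counts retries after the initial try, a fully failing upstream is abandoned after (1 + attempts) × perTryTimeout, capped by the overall route timeout - a small sketch (our own helper, not an Istio API):

```python
def worst_case_latency(attempts: int, per_try_timeout: float,
                       overall_timeout: float) -> float:
    """Upper bound on the time Envoy spends before giving up on a
    request: the initial try plus `attempts` retries, each burning up
    to perTryTimeout, all capped by the VirtualService-level timeout."""
    return min((1 + attempts) * per_try_timeout, overall_timeout)

# With the policy above (attempts: 3, perTryTimeout: 2s, timeout: 10s)
# a fully failing upstream costs at most 8 seconds, not 10.
budget = worst_case_latency(3, 2.0, 10.0)
```

Keeping this budget below your caller's own timeout avoids retry storms where every hop in the chain retries independently.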
Canary Deployments with Istio {#istio-canary}
Full Canary Deployment Pipeline
Scenario: migrating an e-commerce checkout service to v2 via a canary release.
1. Two Deployments:
# checkout-v1.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: checkout
      version: v1
  template:
    metadata:
      labels:
        app: checkout
        version: v1
    spec:
      containers:
      - name: checkout
        image: mycompany/checkout:v1.5.2
        env:
        - name: VERSION
          value: "v1"
---
# checkout-v2.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-v2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: checkout
      version: v2
  template:
    metadata:
      labels:
        app: checkout
        version: v2
    spec:
      containers:
      - name: checkout
        image: mycompany/checkout:v2.0.0
        env:
        - name: VERSION
          value: "v2"
        - name: NEW_PAYMENT_GATEWAY
          value: "enabled"
2. DestinationRule:
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: checkout
spec:
  host: checkout
  trafficPolicy:
    loadBalancer:
      simple: LEAST_REQUEST  # Route to the least-loaded pod
    connectionPool:
      http:
        http1MaxPendingRequests: 100
        maxRequestsPerConnection: 2
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
    trafficPolicy:
      outlierDetection:
        consecutiveErrors: 3
        interval: 30s
        baseEjectionTime: 60s
3. Progressive Rollout Script:
#!/bin/bash
# canary-rollout.sh
WEIGHTS=(5 10 25 50 75 100)
ERROR_THRESHOLD=0.05 # %5 error rate threshold
for WEIGHT in "${WEIGHTS[@]}"; do
echo "🚀 Rolling out v2 with $WEIGHT% traffic..."
# Update VirtualService
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: checkout
spec:
hosts:
- checkout
http:
- route:
- destination:
host: checkout
subset: v1
weight: $((100 - WEIGHT))
- destination:
host: checkout
subset: v2
weight: $WEIGHT
EOF
# Wait for traffic shift
sleep 10
# Check error rate
ERROR_RATE=$(kubectl exec -n istio-system deploy/prometheus -c prometheus -- \
promtool query instant 'sum(rate(istio_requests_total{destination_service="checkout",destination_version="v2",response_code=~"5.."}[1m])) / sum(rate(istio_requests_total{destination_service="checkout",destination_version="v2"}[1m]))' \
| jq -r '.data.result[0].value[1]')
if (( $(echo "$ERROR_RATE > $ERROR_THRESHOLD" | bc -l) )); then
echo "❌ Error rate too high ($ERROR_RATE). Rolling back!"
kubectl apply -f virtualservice-v1-100.yaml
exit 1
fi
echo "✅ Error rate OK ($ERROR_RATE). Waiting 5 minutes..."
sleep 300
done
echo "🎉 Canary deployment completed successfully!"
4. Automated Rollback with Prometheus Alert
# prometheus-alert.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-alerts
  namespace: istio-system
data:
  alerts.yml: |
    groups:
    - name: canary
      interval: 30s
      rules:
      - alert: CanaryHighErrorRate
        expr: |
          sum(rate(istio_requests_total{destination_service="checkout",destination_version="v2",response_code=~"5.."}[1m]))
          /
          sum(rate(istio_requests_total{destination_service="checkout",destination_version="v2"}[1m]))
          > 0.05
        for: 2m
        annotations:
          summary: "Canary v2 error rate above 5%"
          description: "Rolling back to v1"
        labels:
          severity: critical
          action: rollback
Rollback webhook:
# rollback-webhook.py
from flask import Flask, request
import subprocess

app = Flask(__name__)

@app.route('/webhook', methods=['POST'])
def rollback():
    payload = request.json
    # Alertmanager wraps the firing alerts in an "alerts" list
    for alert in payload.get('alerts', []):
        if alert['labels'].get('action') == 'rollback':
            subprocess.run(['kubectl', 'apply', '-f', 'virtualservice-v1-100.yaml'])
            print("🔄 Rolled back to v1")
            return "OK", 200
    return "Ignored", 200

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080)
Circuit Breaking and Fault Injection with Istio {#istio-resilience}
Circuit Breaker Configuration
Scenario: trip the circuit when the database service comes under excessive load.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: database-circuit-breaker
spec:
  host: postgres-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 50          # Max 50 TCP connections
      http:
        http1MaxPendingRequests: 10
        http2MaxRequests: 100
        maxRequestsPerConnection: 3
    outlierDetection:
      consecutiveGatewayErrors: 3   # 3 consecutive 502/503/504
      consecutive5xxErrors: 5       # or 5 consecutive 5xx
      interval: 10s                 # check every 10s
      baseEjectionTime: 30s         # eject for 30s
      maxEjectionPercent: 50        # eject at most 50% of the pods
      minHealthPercent: 25          # keep at least 25% of the pods healthy
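The outlier-detection mechanics are easy to misread, so here is a toy model of the consecutive-5xx path. This sketch is our own, not Envoy's code: real Envoy additionally multiplies the ejection time by how often a host has already been ejected, enforces maxEjectionPercent and minHealthPercent across the pool, and more.

```python
class OutlierDetector:
    """Toy model of Envoy's consecutive-5xx outlier detection: after N
    consecutive errors a host is ejected for base_ejection_time seconds."""

    def __init__(self, consecutive_5xx: int = 5, base_ejection_time: float = 30.0):
        self.threshold = consecutive_5xx
        self.base_ejection = base_ejection_time
        self.errors = 0
        self.ejected_until = 0.0

    def record(self, status: int, now: float) -> None:
        if status >= 500:
            self.errors += 1
            if self.errors >= self.threshold:
                self.ejected_until = now + self.base_ejection
                self.errors = 0
        else:
            self.errors = 0  # any success resets the streak

    def is_ejected(self, now: float) -> bool:
        return now < self.ejected_until

d = OutlierDetector()
for t in range(5):            # five consecutive 503s at t = 0..4
    d.record(503, now=float(t))
# the host is now ejected until t = 4 + 30 = 34
```

Note how a single successful response resets the streak: sporadic errors never trip the breaker, only uninterrupted runs do.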
Test:
# Load generator
kubectl run -it --rm load-generator --image=busybox -- /bin/sh -c \
  "while true; do wget -q -O- http://postgres-service:5432; done"
# The circuit breaker kicks in
# Logs:
kubectl logs <pod-name> -c istio-proxy | grep "upstream_rq_pending_overflow"
Fault Injection (Chaos Engineering)
Delay injection (latency testing):
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payment-delay-test
spec:
  hosts:
  - payment-service
  http:
  - fault:
      delay:
        percentage:
          value: 50        # on 50% of requests
        fixedDelay: 3s     # 3s delay
    match:
    - headers:
        x-test-user:
          exact: "qa-team" # only for the QA team
    route:
    - destination:
        host: payment-service
Abort injection (error testing):
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payment-abort-test
spec:
  hosts:
  - payment-service
  http:
  - fault:
      abort:
        percentage:
          value: 10        # on 10% of requests
        httpStatus: 503    # return 503 Service Unavailable
    route:
    - destination:
        host: payment-service
Test:
# Normal user - unaffected
curl http://payment-service/checkout
# QA user - sees the 3s delay
curl -H "x-test-user: qa-team" http://payment-service/checkout
mTLS and Security with Istio {#istio-security}
Automatic mTLS (PeerAuthentication)
Enforce mTLS cluster-wide:
# Require mTLS across the entire mesh
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT  # mTLS required between all services
Namespace-specific:
# Only in the production namespace
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: production-mtls
  namespace: production
spec:
  mtls:
    mode: STRICT
Service-specific (permissive mode):
# Legacy service - accept both mTLS and plaintext
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: legacy-service
  namespace: default
spec:
  selector:
    matchLabels:
      app: legacy-app
  mtls:
    mode: PERMISSIVE  # both mTLS and plain HTTP
Authorization Policies
Service-to-service authorization:
# Only the frontend may reach the backend
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: backend-access
  namespace: default
spec:
  selector:
    matchLabels:
      app: backend
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/default/sa/frontend"]
    to:
    - operation:
        methods: ["GET", "POST"]
        paths: ["/api/*"]
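The `principals` entry is the SPIFFE-style identity Istio derives from the workload's service account: trust domain, namespace, then service account. A small helper (our own naming, not an Istio API) that builds it:

```python
def spiffe_principal(namespace: str, service_account: str,
                     trust_domain: str = "cluster.local") -> str:
    """Build the principal string Istio derives from a workload's
    service account, as matched by AuthorizationPolicy `principals`."""
    return f"{trust_domain}/ns/{namespace}/sa/{service_account}"

# The policy above allows exactly this identity:
frontend = spiffe_principal("default", "frontend")
```

This is why AuthorizationPolicy rules keep working across pod restarts and rescheduling: the identity is tied to the service account, not to pod names or IPs.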
JWT-based authentication:
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: jwt-auth
  namespace: default
spec:
  selector:
    matchLabels:
      app: api-gateway
  jwtRules:
  - issuer: "https://accounts.google.com"
    jwksUri: "https://www.googleapis.com/oauth2/v3/certs"
    audiences:
    - "my-app-client-id"
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: require-jwt
  namespace: default
spec:
  selector:
    matchLabels:
      app: api-gateway
  action: ALLOW
  rules:
  - from:
    - source:
        requestPrincipals: ["*"]  # require a valid JWT
    when:
    - key: request.auth.claims[role]
      values: ["admin", "user"]
Test:
# Without a JWT - 403 Forbidden
curl http://api-gateway/api/users
# With a JWT
TOKEN=$(gcloud auth print-identity-token)
curl -H "Authorization: Bearer $TOKEN" http://api-gateway/api/users
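When a `request.auth.claims[...]` rule rejects traffic, it helps to inspect the token's claims. A stdlib-only sketch that decodes a JWT payload without verifying it (Istio performs the actual signature verification against jwksUri); the demo token here is a hypothetical unsigned one, not a real Google token:

```python
import base64
import json

def jwt_claims(token: str) -> dict:
    """Decode the (unverified!) claims of a JWT - handy for checking
    why a request.auth.claims[role] rule rejects a token."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Hypothetical unsigned demo token with {"role": "admin"} as its payload:
demo = ".".join([
    base64.urlsafe_b64encode(b'{"alg":"none"}').decode().rstrip("="),
    base64.urlsafe_b64encode(b'{"role":"admin"}').decode().rstrip("="),
    "",
])
claims = jwt_claims(demo)
```

With the policy above, a token whose decoded `role` claim is neither "admin" nor "user" is denied even though its signature is valid.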
Linkerd Architecture and Advantages {#linkerd-mimari}
Linkerd Architecture
┌──────────────────────────────────────┐
│ Control Plane (linkerd namespace) │
│ ┌──────────┐ ┌──────────┐ │
│ │linkerd- │ │linkerd- │ │
│ │destination│ │identity │ │
│ └──────────┘ └──────────┘ │
└──────────────────────────────────────┘
↓
┌──────────────────────────────────────┐
│ Data Plane (linkerd2-proxy) │
│ ┌────────────┐ ┌────────────┐ │
│ │ Pod │ │ Pod │ │
│ │ App│Proxy │ ←─→ │Proxy│ App │ │
│ └────────────┘ └────────────┘ │
└──────────────────────────────────────┘
Linkerd2-proxy (Rust):
- Ultra-lightweight: ~10MB memory
- Sub-millisecond p99 latency
- Zero-config mTLS (default enabled)
- Service profiles for per-route metrics
Why Linkerd?
1. Simplicity:
# Installation is two commands
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -
# Istio needs roughly 15 configuration steps
2. Performance:
| Metric | Linkerd | Istio |
|---|---|---|
| P50 latency overhead | +0.3ms | +1.2ms |
| P99 latency overhead | +0.8ms | +4.5ms |
| Memory (proxy) | 10MB | 50MB |
| Memory (control plane) | 100MB | 500MB |
3. Zero-config mTLS:
In Linkerd, mTLS is enabled by default and certificate rotation is automatic. In Istio you have to enforce it yourself.
Linkerd Installation (Lightweight Approach) {#linkerd-kurulum}
1. Installing the CLI
# Linkerd CLI
curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/install | sh
export PATH=$PATH:$HOME/.linkerd2/bin
# Pre-check
linkerd check --pre
# ✔ Kubernetes API
# ✔ Kubernetes version >= 1.21
# ✔ kubectl configured
2. Installing the Control Plane
# CRDs
linkerd install --crds | kubectl apply -f -
# Control plane
linkerd install | kubectl apply -f -
# Verify
linkerd check
# ✔ control plane is up-to-date
# ✔ control plane and cli versions match
Resource usage:
kubectl top pod -n linkerd
# NAME CPU MEMORY
# linkerd-destination-xxx 2m 30Mi
# linkerd-identity-xxx 1m 25Mi
# linkerd-proxy-injector-xxx 1m 20Mi
For comparison, Istio:
kubectl top pod -n istio-system
# NAME CPU MEMORY
# istiod-xxx 50m 250Mi
# istio-ingressgateway-xxx 10m 100Mi
3. Sidecar Injection
# Namespace annotation
kubectl annotate namespace default linkerd.io/inject=enabled
# Deploy app
kubectl apply -f app.yaml
# Verify
kubectl get pod -o jsonpath='{.items[0].spec.containers[*].name}'
# Output: app linkerd-proxy ✅
4. Viz Extension (Dashboard)
# Grafana + Prometheus + Dashboard
linkerd viz install | kubectl apply -f -
# Open dashboard
linkerd viz dashboard &
# Grafana: http://localhost:50750
Traffic Splitting with Linkerd {#linkerd-traffic}
Linkerd uses the SMI TrafficSplit spec:
# trafficsplit.yaml
apiVersion: split.smi-spec.io/v1alpha4
kind: TrafficSplit
metadata:
  name: checkout-split
spec:
  service: checkout  # Root (apex) service
  backends:
  - service: checkout-v1
    weight: 80
  - service: checkout-v2
    weight: 20  # 20% of traffic to v2
---
# Services
apiVersion: v1
kind: Service
metadata:
  name: checkout
spec:
  selector:
    app: checkout  # all versions
  ports:
  - port: 80
---
apiVersion: v1
kind: Service
metadata:
  name: checkout-v1
spec:
  selector:
    app: checkout
    version: v1
  ports:
  - port: 80
---
apiVersion: v1
kind: Service
metadata:
  name: checkout-v2
spec:
  selector:
    app: checkout
    version: v2
  ports:
  - port: 80
Live metrics:
linkerd viz stat trafficsplit/checkout-split
# NAME SERVICE SUCCESS RPS P50_LATENCY P99_LATENCY
# checkout-split checkout-v1 100.00% 8.0 10ms 25ms
# checkout-v2 100.00% 2.0 12ms 30ms
Automated canaries with Flagger:
# Flagger CRD
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: checkout
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout
  service:
    port: 80
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
    - name: request-success-rate
      thresholdRange:
        min: 99
    - name: request-duration
      thresholdRange:
        max: 500
    webhooks:
    - name: load-test
      url: http://flagger-loadtester/
Flagger automatically shifts traffic in steps (10%, 20%, 30%, ...) and promotes or rolls back the rollout based on the metrics.
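The analysis settings translate into a simple weight schedule. A sketch (our own helper, mirroring the stepWeight/maxWeight semantics) of the steps Flagger walks through before promotion:

```python
def canary_schedule(step_weight: int, max_weight: int) -> list:
    """Weight steps a Flagger analysis walks through: increments of
    stepWeight up to maxWeight, after which the canary is promoted."""
    return list(range(step_weight, max_weight + 1, step_weight))

# With the Canary above (stepWeight: 10, maxWeight: 50):
steps = canary_schedule(step_weight=10, max_weight=50)
```

With `interval: 1m` and `threshold: 5`, each step is held for a minute, and five consecutive failed metric checks at any step trigger a rollback instead of the next increment.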
Multi-Cluster Service Mesh with Linkerd {#linkerd-multicluster}
Scenario: a service mesh spanning US and EU clusters.
1. Multi-cluster Extension
# Cluster 1 (US)
linkerd multicluster install | kubectl apply -f -
# Cluster 2 (EU)
linkerd multicluster install | kubectl apply -f -
2. Link Clusters
# Link the EU cluster to the US cluster
linkerd --context=us multicluster link --cluster-name us \
  | kubectl --context=eu apply -f -
# Verify
linkerd --context=eu multicluster gateways
# CLUSTER ALIVE NUM_SVC LATENCY
# us True 3 45ms
3. Export Service
# Export the service from the US cluster
kubectl --context=us label svc checkout mirror.linkerd.io/exported=true
# Access it from the EU cluster
kubectl --context=eu get svc
# NAME          TYPE       CLUSTER-IP   EXTERNAL-IP  PORT(S)
# checkout-us   ClusterIP  10.100.1.50  <none>       80/TCP
# Call the US service from an EU pod
curl http://checkout-us/api
Traffic routing:
# Traffic split in the EU cluster
apiVersion: split.smi-spec.io/v1alpha4
kind: TrafficSplit
metadata:
  name: checkout-multi-region
spec:
  service: checkout
  backends:
  - service: checkout     # EU local
    weight: 80
  - service: checkout-us  # US remote
    weight: 20            # failover or load balancing
Service Mesh Observability: Kiali, Grafana, Jaeger {#observability}
1. Kiali (Istio Service Graph)
# Install Kiali
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/kiali.yaml
# Dashboard
istioctl dashboard kiali
Kiali features:
- Real-time service topology graph
- Traffic flow visualization
- Configuration validation
- Health checks
Example view:
┌─────────────┐ ┌─────────────┐
│ Frontend │ ──90%─→ │ Backend │
│ v1: 3/3 │ │ v1: 2/3 │
│ ✔ Healthy │ ←───── │ v2: 1/3 │
└─────────────┘ └─────────────┘
│ │
│ 100% │ 50%
↓ ↓
┌─────────────┐ ┌─────────────┐
│ Redis │ │ Postgres │
│ ✔ 1/1 │ │ ⚠ 2/3 │
└─────────────┘ └─────────────┘
2. Grafana Dashboards
# Grafana + Prometheus
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/prometheus.yaml
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/grafana.yaml
istioctl dashboard grafana
Pre-built dashboards:
- Istio Mesh Dashboard (global metrics)
- Istio Service Dashboard (per-service)
- Istio Workload Dashboard (per-pod)
- Istio Performance Dashboard (control plane)
Custom queries:
# Request rate per service
sum(rate(istio_requests_total[5m])) by (destination_service)
# Error rate
sum(rate(istio_requests_total{response_code=~"5.."}[5m]))
/
sum(rate(istio_requests_total[5m]))
# P99 latency
histogram_quantile(0.99,
sum(rate(istio_request_duration_milliseconds_bucket[5m]))
by (le, destination_service)
)
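For intuition on what `histogram_quantile()` actually computes from `istio_request_duration_milliseconds_bucket`: it finds the first cumulative bucket whose count reaches the requested rank and linearly interpolates inside it. A self-contained sketch (simplified - Prometheus additionally handles the `+Inf` bucket and edge cases):

```python
def histogram_quantile(q: float, buckets: list) -> float:
    """Linear-interpolation quantile over Prometheus-style cumulative
    buckets [(upper_bound_ms, cumulative_count), ...] - the same idea
    histogram_quantile() applies to Istio's duration histograms."""
    total = buckets[-1][1]
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            # interpolate within the bucket that contains the rank
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return buckets[-1][0]

# 100 requests: 90 under 10ms, 9 more under 25ms, 1 under 100ms
p99 = histogram_quantile(0.99, [(10, 90), (25, 99), (100, 100)])
```

This is also why the quantile can only ever be as precise as the bucket boundaries: a p99 landing in a wide bucket is an interpolated guess, not a measured value.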
3. Jaeger (Distributed Tracing)
# Install Jaeger
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/jaeger.yaml
istioctl dashboard jaeger
Example trace flow:
[Frontend] ──(50ms)──→ [API Gateway] ──(20ms)──→ [Auth Service]
│
└──(150ms)──→ [Backend Service]
│
├──(80ms)──→ [Database]
└──(70ms)──→ [Cache]
Total: 370ms (150ms backend bottleneck 🔴)
Trace context propagation (automatic):
Istio sidecars automatically inject the trace headers:
x-request-id, x-b3-traceid, x-b3-spanid, x-b3-parentspanid, x-b3-sampled
Production Best Practices {#best-practices}
1. **Resource Limits (IMPORTANT!)**
Sidecar proxy limits:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    sidecar.istio.io/proxyCPU: "100m"
    sidecar.istio.io/proxyCPULimit: "200m"
    sidecar.istio.io/proxyMemory: "128Mi"
    sidecar.istio.io/proxyMemoryLimit: "256Mi"
spec:
  containers:
  - name: app
    image: my-app
Control plane limits:
istioctl install --set values.pilot.resources.requests.cpu=500m \
--set values.pilot.resources.requests.memory=2Gi \
--set values.pilot.resources.limits.cpu=2 \
--set values.pilot.resources.limits.memory=4Gi
2. **High Availability**
# Multi-replica control plane (note: istioctl has no "production"
# profile; build HA on top of the default profile)
istioctl install --set profile=default \
  --set values.pilot.autoscaleEnabled=true \
  --set values.pilot.autoscaleMin=2 \
  --set values.pilot.autoscaleMax=5
3. **Gradual Rollout**
Namespace-by-namespace adoption:
# Phase 1: Dev namespace
kubectl label namespace dev istio-injection=enabled
# Phase 2: Staging
kubectl label namespace staging istio-injection=enabled
# Phase 3: Production (non-critical services)
kubectl label namespace production-backend istio-injection=enabled
# Phase 4: Production (critical services)
kubectl label namespace production-frontend istio-injection=enabled
4. **Monitoring and Alerting**
Critical alerts:
# Prometheus AlertManager
groups:
- name: istio
  rules:
  - alert: IstioPilotDown
    expr: up{job="pilot"} == 0
    for: 5m
  - alert: HighProxyResourceUsage
    expr: container_memory_usage_bytes{container="istio-proxy"} / container_spec_memory_limit_bytes{container="istio-proxy"} > 0.9
    for: 10m
  - alert: HighErrorRate
    expr: sum(rate(istio_requests_total{response_code=~"5.."}[5m])) / sum(rate(istio_requests_total[5m])) > 0.05
    for: 5m
5. **Security Hardening**
# Control egress traffic
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-apis
spec:
  hosts:
  - "api.stripe.com"
  - "api.github.com"
  ports:
  - number: 443
    name: https
    protocol: HTTPS
  location: MESH_EXTERNAL
  resolution: DNS
---
# Block all other egress traffic
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default
  namespace: istio-system
spec:
  outboundTrafficPolicy:
    mode: REGISTRY_ONLY  # only allow destinations with a ServiceEntry
Performance and Resource Optimization {#performance}
1. **Sidecar Scoping (Network Optimization)**
Problem: the configuration of every service is pushed to every pod, driving memory usage up.
Solution: limit the scope with a Sidecar resource:
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: frontend-sidecar
  namespace: production
spec:
  workloadSelector:
    labels:
      app: frontend
  egress:
  - hosts:
    - "./backend.production.svc.cluster.local"  # only needs to reach the backend
    - "istio-system/*"                          # Istio system services
Result: the frontend sidecar receives only the backend's configuration - roughly an 80% memory saving in this example.
2. **Telemetry Filtering**
Problem: full metrics for every request lead to high Prometheus cardinality.
Solution: filter with the Telemetry API:
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: low-priority-metrics
  namespace: default
spec:
  selector:
    matchLabels:
      app: low-priority-service
  metrics:
  - providers:
    - name: prometheus
    overrides:
    - match:
        metric: REQUEST_COUNT
      disabled: true  # disable the request-count metric
3. **Access Log Sampling**
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: access-log-sampling
spec:
  accessLogging:
  - providers:
    - name: envoy
    filter:
      expression: "response.code >= 400"  # only log errors
    # Or sampling:
    # match:
    #   mode: SERVER
    # sampling: 10  # log 10% of requests
4. **Protocol Detection Optimization**
Problem: Istio performs automatic protocol detection, which adds latency overhead.
Solution: declare the protocol explicitly:
apiVersion: v1
kind: Service
metadata:
  name: database
spec:
  ports:
  - port: 5432
    name: tcp-postgres        # "tcp-" prefix → no auto-detection
  - port: 6379
    name: tcp-redis
  - port: 9200
    name: http-elasticsearch  # "http-" prefix
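The convention is `name: <protocol>[-<suffix>]`. A small helper (our own, with an abridged protocol list) showing how a port name maps to a protocol:

```python
# Protocols Istio recognizes in port-name prefixes (abridged list)
KNOWN_PROTOCOLS = {"http", "http2", "https", "grpc", "tcp", "tls",
                   "mongo", "mysql", "redis"}

def protocol_from_port_name(name: str) -> str:
    """Apply the `<protocol>[-<suffix>]` port-naming convention: a known
    prefix pins the protocol; anything else falls back to detection."""
    prefix = name.split("-", 1)[0]
    return prefix if prefix in KNOWN_PROTOCOLS else "auto-detect"
```

So `tcp-postgres` pins the port to raw TCP, while an unprefixed name such as `postgres` leaves the proxy guessing.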
Migration Strategies {#migration}
Migrating from Legacy to a Service Mesh
Scenario: migrating an existing estate of 50 microservices to a Service Mesh.
1. Assessment Phase (1-2 weeks)
# Analyze the existing cluster
kubectl get pods --all-namespaces -o json | jq '.items | length'
# 347 pods
# Deployment topology
kubectl get deploy --all-namespaces -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,REPLICAS:.spec.replicas --no-headers | awk '{print $1}' | sort | uniq -c
Risk assessment:
- High-traffic services (>1000 RPS) → migrate last
- Low-traffic services (test, staging) → migrate first
- Stateful services (databases, caches) → skip sidecar injection
2. Pilot Phase (2-3 weeks)
# Dev namespace
kubectl create namespace dev-mesh
kubectl label namespace dev-mesh istio-injection=enabled
# Migrate 3 low-risk services (a resource's namespace is immutable,
# so export the manifests, rewrite the namespace, and re-apply)
kubectl get deploy -n dev -l risk=low -o yaml \
  | sed 's/namespace: dev$/namespace: dev-mesh/' \
  | kubectl apply -f -
# Monitor for a week
linkerd viz stat deploy -n dev-mesh --from deploy/load-generator
Success criteria:
- ✅ P99 latency overhead < 5ms
- ✅ Zero 5xx errors
- ✅ Resource usage < 110% of the previous baseline
3. Progressive Rollout (4-8 hafta)
# Week 1-2: Low-risk services
kubectl label namespace staging istio-injection=enabled
kubectl rollout restart deployment -n staging
# Week 3-4: Medium-risk services
kubectl label namespace production-backend istio-injection=enabled
kubectl rollout restart deployment -n production-backend --cascade=background
# Week 5-6: High-traffic services (gradual)
# Canary pattern: 1 pod → 25% → 50% → 100%
kubectl scale deployment/api-gateway --replicas=4
kubectl patch deployment/api-gateway -p '{"spec":{"template":{"metadata":{"labels":{"version":"meshed"}}}}}' # 1 pod meshed
# Traffic split
kubectl apply -f virtualservice-25percent-meshed.yaml
# ... wait 3 days, monitor
kubectl apply -f virtualservice-50percent-meshed.yaml
# ... wait 3 days
kubectl apply -f virtualservice-100percent-meshed.yaml
# Week 7-8: Critical services (database proxies, auth)
# Manual, controlled rollout
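The virtualservice-*percent-meshed.yaml files above are placeholders; a hedged sketch of what the 25% step might contain, assuming subsets named `legacy` and `meshed` keyed on a `version` label:

```yaml
# Sketch of a 25%-meshed traffic split (subset names assumed)
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: api-gateway
spec:
  host: api-gateway
  subsets:
  - name: legacy
    labels:
      version: legacy
  - name: meshed
    labels:
      version: meshed
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api-gateway
spec:
  hosts:
  - api-gateway
  http:
  - route:
    - destination:
        host: api-gateway
        subset: legacy
      weight: 75
    - destination:
        host: api-gateway
        subset: meshed
      weight: 25
```

The 50% and 100% steps would only change the two `weight` values.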
Troubleshooting and Debugging {#troubleshooting}
1. **Sidecar Injection Not Working**
# Check webhook
kubectl get mutatingwebhookconfiguration istio-sidecar-injector -o yaml
# Check namespace label
kubectl get namespace default --show-labels | grep istio-injection
# Manual injection test
istioctl kube-inject -f deployment.yaml | kubectl apply -f -
# Injection logs
kubectl logs -n istio-system deploy/istiod | grep injection
2. **503 Service Unavailable**
Troubleshooting steps:
# 1. Check the Envoy cluster config
istioctl proxy-config cluster <pod-name> -n <namespace>
# 2. Endpoint discovery
istioctl proxy-config endpoint <pod-name> -n <namespace> --cluster <service-name>
# 3. Listener check
istioctl proxy-config listener <pod-name> -n <namespace>
# 4. Route check
istioctl proxy-config route <pod-name> -n <namespace>
# 5. Envoy logs
kubectl logs <pod-name> -c istio-proxy | grep "upstream connect error"
Common causes:
- DestinationRule subset mismatch
- PeerAuthentication mTLS mismatch (STRICT vs PERMISSIVE)
- Incorrect Service selector
- Network policies blocking traffic
- Network policies blocking traffic
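To illustrate the mTLS-mismatch case: if the server side enforces STRICT mTLS but a client-side DestinationRule disables TLS, every request fails with 503. A minimal sketch of an aligned pair (service and namespace names assumed):

```yaml
# STRICT mTLS on the server side...
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: backend
  namespace: production
spec:
  selector:
    matchLabels:
      app: backend
  mtls:
    mode: STRICT
---
# ...must be matched by ISTIO_MUTUAL (or no explicit tls block) on the client side
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: backend
  namespace: production
spec:
  host: backend.production.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
```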
3. **mTLS Certificate Errors**
# Inspect the certificate chain
istioctl proxy-config secret <pod-name> -n <namespace>
# Certificate expiration (pipe through x509 to read notBefore/notAfter)
kubectl exec <pod-name> -c istio-proxy -- openssl s_client -connect backend:443 -showcerts 2>/dev/null | openssl x509 -noout -dates
# Force certificate rotation
kubectl delete pod <pod-name> # Yeni pod yeni cert alır
4. **High Latency (P99 > 100ms overhead)**
# Telemetry overhead check
kubectl top pod -l app=my-app --containers
# NAME CPU MEMORY
# my-app 50m 100Mi
# istio-proxy 20m 50Mi # 40% CPU overhead → too high!
# Disable access logs (reduces latency); this is a mesh-wide setting
istioctl install --set meshConfig.accessLogFile=""
# Reduce the stats Envoy collects
istioctl install --set meshConfig.defaultConfig.proxyStatsMatcher.inclusionPrefixes[0]=cluster.outbound
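The same stats reduction can also be scoped to a single workload via the `proxy.istio.io/config` pod annotation instead of a mesh-wide setting; a sketch (prefix list assumed):

```yaml
# Sketch: limit Envoy stats to outbound clusters for one workload only
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        proxy.istio.io/config: |
          proxyStatsMatcher:
            inclusionPrefixes:
            - "cluster.outbound"
    spec:
      containers:
      - name: app
        image: my-app:v1
```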
5. **Control Plane Unstable**
# Istiod crashlooping
kubectl logs -n istio-system deploy/istiod --previous
# Pilot overload (too many services/endpoints)? Check istiod resource usage
kubectl top pod -n istio-system -l app=istiod
# Increase pilot resources:
istioctl install --set values.pilot.resources.requests.cpu=2 \
--set values.pilot.resources.requests.memory=4Gi
# Config sync status
istioctl proxy-status
# SYNC STATUS will show NOT SENT if config too large
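When proxy-status reports NOT SENT because the pushed config is too large, a namespace-scoped Sidecar resource can shrink what istiod sends to each proxy. A minimal sketch limiting egress visibility to the same namespace plus istio-system:

```yaml
# Sketch: default Sidecar scope for one namespace (reduces pushed config)
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default
  namespace: production
spec:
  egress:
  - hosts:
    - "./*"              # services in the same namespace
    - "istio-system/*"   # control plane / telemetry
```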
Conclusion
A Service Mesh simplifies operations in microservice architectures by abstracting operational complexity away from application code. Istio offers a feature-rich, enterprise-grade solution, while Linkerd takes a lightweight, simpler approach.
When Should You Use a Service Mesh?
✅ Use a Service Mesh when:
- You run 10+ microservices
- mTLS is a hard requirement
- You need canary/blue-green deployments
- You need distributed tracing
- You follow a zero-trust network model
❌ Skip the Service Mesh when:
- You run a monolith or fewer than 5 services
- You expose a simple CRUD API
- Your cluster is resource-constrained (<10 nodes)
- You haven't yet mastered Kubernetes itself
First Steps
Start small:
# Linkerd (about 1 hour; since 2.12 the CRDs are installed first)
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -
kubectl annotate namespace dev linkerd.io/inject=enabled
kubectl rollout restart deployment -n dev
# Metrics kontrol
linkerd viz stat deploy -n dev
Then Istio (for production-ready features):
istioctl install --set profile=default  # use "default", not "demo", in production
kubectl label namespace production istio-injection=enabled
Resources
- Istio Docs: https://istio.io/latest/docs/
- Linkerd Docs: https://linkerd.io/docs/
- Service Mesh Comparison: https://servicemesh.es/
- CNCF Service Mesh Landscape: https://landscape.cncf.io/
At TekTık Yazılım, we deliver production-ready Service Mesh implementations on our customers' Kubernetes clusters. Get in touch for Istio/Linkerd installation, migration strategy, and 24/7 support.
See you in the next post! 🚀