Exposing Kubernetes Services Externally

Introduction to Kubernetes

Kubernetes is a cluster management system led by Google. It can currently use Docker as its underlying runtime to orchestrate applications as Docker containers. There are plenty of introductions to Kubernetes elsewhere; this article simply walks through installation and usage. For more details, see the official Kubernetes website.

Installing Kubernetes

Kubernetes can be installed on virtual machines or on servers running Linux. This article uses Ubuntu Server as the example; for details see the Ubuntu installation guide on the official site.

First download the Kubernetes source. The latest release at the time of writing is 1.4.1:

root@node3:/usr/src# git clone https://github.com/kubernetes/kubernetes.git
root@node3:/usr/src# cd /usr/src/kubernetes
root@node3:/usr/src/kubernetes# git checkout v1.4.1

This setup uses two machines, node3 (192.168.200.13) and node4 (192.168.200.14): node3 serves as both the master and a node, while node4 is a node only. Edit kubernetes/cluster/ubuntu/config-default.sh accordingly:

export nodes=${nodes:-"root@192.168.200.13 root@192.168.200.14"}
roles=${roles:-"ai i"}
export NUM_NODES=${NUM_NODES:-2}
export SERVICE_CLUSTER_IP_RANGE=${SERVICE_CLUSTER_IP_RANGE:-100.0.0.0/16}  # formerly PORTAL_NET
export FLANNEL_NET=${FLANNEL_NET:-172.16.0.0/16}
DNS_SERVER_IP=${DNS_SERVER_IP:-"100.0.0.2"}

These are the only changes to the configuration file; keep each line in its original position. In the roles line, a marks a master and i marks a node, so "ai i" makes node3 both master and node while node4 is a node only. Then run the installation:

$ cd kubernetes/cluster
$ KUBERNETES_PROVIDER=ubuntu ./kube-up.sh
...........
Validate output:
NAME                 STATUS    MESSAGE              ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health": "true"}
Cluster validation succeeded
Done, listing cluster services:

Kubernetes master is running at http://192.168.200.13:8080

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

If nothing goes wrong, the script reports that installation is complete. Now put the Kubernetes binaries on the PATH:

root@node3:/usr/src# export PATH=/usr/src/kubernetes/cluster/ubuntu/binaries/:$PATH
root@node3:/usr/src# which kubectl
/usr/src/kubernetes/cluster/ubuntu/binaries//kubectl

Then install the dashboard and DNS add-ons:

$ cd cluster/ubuntu
$ KUBERNETES_PROVIDER=ubuntu ./deployAddons.sh

Potential issues:

  • If you need to reinstall, run KUBERNETES_PROVIDER=ubuntu ./kube-down.sh to stop the related services, and remember to restore the /etc/default/docker configuration file.
  • Kubernetes pulls some images from Google's image registry (gcr.io), which is blocked in mainland China. One workaround is to pick an HTTP proxy server and configure docker to use it on every host that needs to run these images, by adding the following at the top of /etc/default/docker:
export HTTP_PROXY=...

Then restart docker:

service docker restart

Once the images are downloaded, remove the proxy setting and restart docker again; see this article for details.

Make sure every node that needs to run these images has them locally! One approach is to download all the images through the proxy on a single node, push them to a private registry, and then pull them from there on the other nodes.
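The mirroring can be done with a plain pull/tag/push cycle. A sketch, assuming the private registry is the appstore:5000 host that appears in the rc.yaml later in this article, with kubedns as the example image:

```shell
# On the node that can reach gcr.io through the proxy: pull, then re-tag
# for the private registry and push.
docker pull gcr.io/google_containers/kubedns-amd64:1.8
docker tag  gcr.io/google_containers/kubedns-amd64:1.8 appstore:5000/google_containers/kubedns-amd64:1.8
docker push appstore:5000/google_containers/kubedns-amd64:1.8

# On every other node: pull from the private registry and restore the
# original gcr.io tag so pod manifests find the image locally.
docker pull appstore:5000/google_containers/kubedns-amd64:1.8
docker tag  appstore:5000/google_containers/kubedns-amd64:1.8 gcr.io/google_containers/kubedns-amd64:1.8
```

Repeat for each gcr.io image the cluster needs (kube-dnsmasq, exechealthz, and so on).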

  • When a pod uses a Google image, Kubernetes will still try to fetch it from the Google registry even if the image already exists locally, whenever the manifest contains
    imagePullPolicy: Always
    One fix is to change this to:
imagePullPolicy: IfNotPresent

Another fix is to move the image into a private registry.
  • If imagePullPolicy is not specified in the manifest, older versions looked for the image locally first and started it directly when found; version 1.4.1, however, appears to pull first. In my tests the manifest needed an explicit imagePullPolicy: IfNotPresent before running the installation or other commands.

  • Kubernetes modifies /etc/default/docker; take care that your original settings are not overwritten, otherwise docker pull from a private registry may fail. My configuration is:
DOCKER_OPTS=" --registry-mirror=http://2687282c.m.daocloud.io -H tcp://127.0.0.1:4243 -H unix:///var/run/docker.sock --bip=172.16.69.1/24 --mtu=1450"

Some Kubernetes Concepts

Pod

Each Pod is an instance, or a group of instances, running on some node; it maps to a Docker container or a set of Docker containers.
The running pods can be listed with:

root@node3:~# kubectl get pods
NAME                   READY     STATUS    RESTARTS   AGE
django-default-yxm7u   1/1       Running   0          15m
django-x-q9twt         1/1       Running   0          15m
django-y-wgy0c         1/1       Running   0          15m
nginx-ingress-e049x    1/1       Running   0          14m

You can also list the system pods:

root@node3:/usr/src# kubectl get pods --namespace=kube-system
NAME                                READY     STATUS    RESTARTS   AGE
kube-dns-v20-h35xt                  3/3       Running   3          1h
kubernetes-dashboard-v1.4.0-5g12f   1/1       Running   6          13h

You can also see which node each pod is running on, which is useful when troubleshooting:

root@node3:/usr/src# kubectl get pods -o wide
NAME                   READY     STATUS    RESTARTS   AGE       IP            NODE
django-default-yxm7u   1/1       Running   0          20m       172.16.77.4   192.168.200.13
django-x-q9twt         1/1       Running   0          20m       172.16.9.2    192.168.200.14
django-y-wgy0c         1/1       Running   0          20m       172.16.9.3    192.168.200.14
nginx-ingress-e049x    1/1       Running   0          19m       172.16.77.5   192.168.200.13

Service

A Service makes a set of pods visible to consumers. The following configuration file, service.yaml, creates three services:

# 3 Services for the 3 endpoints of the Ingress
apiVersion: v1
kind: Service
metadata:
  name: django-x
  labels:
    app: django-x
spec:
  type: NodePort
  ports:
  - port: 18111
    #nodePort: 30301
    targetPort: 8111
    protocol: TCP
    name: http
  selector:
    app: django-x

---
apiVersion: v1
kind: Service
metadata:
  name: django-default
  labels:
    app: django-default
spec:
  type: NodePort
  ports:
  - port: 18111
    #nodePort: 30302
    targetPort: 8111
    protocol: TCP
    name: http
  selector:
    app: django-default
---
apiVersion: v1
kind: Service
metadata:
  name: django-y
  labels:
    app: django-y
spec:
  type: NodePort
  ports:
  - port: 18111
    #nodePort: 30284
    targetPort: 8111
    protocol: TCP
    name: http
  selector:
    app: django-y

Use kubectl to view the services (svc is short for services):

root@node3:/usr/src# kubectl create -f service.yaml
root@node3:/usr/src# kubectl get svc
NAME             CLUSTER-IP     EXTERNAL-IP   PORT(S)     AGE
basic            100.0.53.240    <nodes>       18112/TCP   8s
django-default   100.0.53.222   <nodes>       18111/TCP   21m
django-x         100.0.34.47    <nodes>       18111/TCP   21m
django-y         100.0.95.86    <nodes>       18111/TCP   21m
kubernetes       100.0.0.1      <none>        443/TCP     13h

Normally, from inside the cluster, a service can be reached at CLUSTER_IP:PORT; for example, running curl http://100.0.34.47:18111 on node3 or node4 reaches the django-x service.

The CLUSTER here is the cluster network maintained internally by Kubernetes, so CLUSTER-IP and PORT are the internal IP and port the service exposes to the cluster; they are not reachable from the Internet. The kubernetes and basic services above are examples of this. To reach such services from the Internet, one option, covered below, is Ingress: a proxy forwards external requests to CLUSTER-IP:PORT.

If every physical node is directly reachable from the Internet, a service of type NodePort also works; the three django services above are of this type, which is why their external IP shows as nodes. The kube-proxy service on every node then opens a port P, and external clients can reach the service through port P on any node.
How do you find P? Query the service:

root@node3:~/k8s-test# kubectl describe svc django-x
Name:           django-x
Namespace:      default
Labels:         app=django-x
Selector:       app=django-x
IP:         100.0.34.47
Port:           http    18111/TCP
NodePort:       http    32400/TCP
Endpoints:      172.16.76.3:8111
Session Affinity:   None
No events.

Now the service can be reached with curl http://192.168.200.13:32400.
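Rather than reading the NodePort out of kubectl describe by hand, it can also be extracted with a jsonpath output template. A sketch using the service and node addresses from this article:

```shell
# Grab the first port's nodePort from the django-x service:
NODE_PORT=$(kubectl get svc django-x -o 'jsonpath={.spec.ports[0].nodePort}')
# Any node's address works; here node3 is used:
curl "http://192.168.200.13:${NODE_PORT}/"
```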

Replication Controller

A Replication Controller (RC) controls the number and placement of pods. A key difference between Kubernetes and plain Docker is that Kubernetes monitors resources and deploys elastically: if a pod dies, the RC starts a replacement; if the cluster needs to scale out, the RC can quickly add pods.

The following configuration file, rc.yaml, creates three RCs:


# A single RC matching all Services
apiVersion: v1
kind: ReplicationController
metadata:
  name: django-x
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: django-x
    spec:
      containers:
      - name: django-x
        image: appstore:5000/liuwenmao/django-hello
        ports:
        - containerPort: 8111
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: django-default
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: django-default
    spec:
      containers:
      - name: django-default
        image: appstore:5000/liuwenmao/django-hello
        ports:
        - containerPort: 8111
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: django-y
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: django-y
    spec:
      containers:
      - name: django-y
        image: appstore:5000/liuwenmao/django-hello
        ports:
        - containerPort: 8111

Create the RCs and list them:

root@node3:/usr/src# kubectl create -f rc.yaml
root@node3:/usr/src# kubectl get rc
NAME             DESIRED   CURRENT   AGE
django-default   1         1         2h
django-x         1         1         2h
django-y         1         1         2h
nginx-ingress    1         1         2h
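The elasticity described above can be exercised directly on these controllers. A sketch (the pod name is an example; actual names are generated):

```shell
# Scale django-x from 1 replica to 3; the RC starts two more pods:
kubectl scale rc django-x --replicas=3
kubectl get pods -l app=django-x

# Delete one pod; the RC notices and starts a replacement:
kubectl delete pod django-x-q9twt
kubectl get pods -l app=django-x
```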

Potential issues
  • To inspect a pod, service, or RC in detail, use describe. For example, if a pod is failing, you can drill into the specifics:

root@node3:/usr/src/kubernetes/cluster/ubuntu# kubectl get pods --namespace=kube-system
NAME                                READY     STATUS         RESTARTS   AGE
kube-dns-v20-0gnu3                  0/3       ErrImagePull   0          9m
kubernetes-dashboard-v1.4.0-5g12f   1/1       Running        4          11h

kube-dns looks unhealthy here, so inspect it further:

root@node3:/usr/src/kubernetes/cluster/ubuntu# kubectl describe pods kube-dns-v20-0gnu3 --namespace=kube-system
Name:           kube-dns-v20-0gnu3
Namespace:      kube-system
Node:           192.168.200.14/192.168.200.14
Start Time:     Thu, 13 Oct 2016 09:56:24 +0800
Labels:         k8s-app=kube-dns
                version=v20
Status:         Pending
IP:             172.16.9.2
Controllers:    ReplicationController/kube-dns-v20
Containers:
  kubedns:
    Container ID:
    Image:              gcr.io/google_containers/kubedns-amd64:1.8
    Image ID:
    Ports:              10053/UDP, 10053/TCP
    Args:
      --domain=cluster.local.
      --dns-port=10053
    Limits:
      memory:   170Mi
    Requests:
      cpu:                      100m
      memory:                   70Mi
    State:                      Waiting
      Reason:                   ErrImagePull
    Ready:                      False
    Restart Count:              0
    Liveness:                   http-get http://:8080/healthz-kubedns delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness:                  http-get http://:8081/readiness delay=3s timeout=5s period=10s #success=1 #failure=3
    Environment Variables:      <none>
  dnsmasq:
    Container ID:
    Image:              gcr.io/google_containers/kube-dnsmasq-amd64:1.4
    Image ID:
    .......(略)
Events:
  FirstSeen     LastSeen        Count   From                    SubobjectPath                   Type            Reason          Message
  ---------     --------        -----   ----                    -------------                   --------        ------          -------
  6m            6m              1       {default-scheduler }                                    Normal          Scheduled       Successfully assigned kube-dns-v20-0gnu3 to 192.168.200.14
  5m            5m              1       {kubelet 192.168.200.14} spec.containers{kubedns}        Warning         Failed          Failed to pull image "gcr.io/google_containers/kubedns-amd64:1.8": image pull failed for gcr.io/google_containers/kubedns-amd64:1.8, this may be because there are no credentials on this request.  details: (Error response from daemon: {"message":"Get https://gcr.io/v1/_ping: dial tcp 64.233.188.82:443: i/o timeout"})
  4m            4m              1       {kubelet 192.168.200.14} spec.containers{dnsmasq}        Warning         Failed          Failed to pull image "gcr.io/google_containers/kube-dnsmasq-amd64:1.4": image pull failed for gcr.io/google_containers/kube-dnsmasq-amd64:1.4, this may be because there are no credentials on this request.  details: (Error response from daemon: {"message":"Get https://gcr.io/v1/_ping: dial tcp 64.233.188.82:443: i/o timeout"})
  3m            3m              1       {kubelet 192.168.200.14}                                 Warning         FailedSync      Error syncing pod, skipping: [failed to "StartContainer" for "kubedns" with ErrImagePull: "image pull failed for gcr.io/google_containers/kubedns-amd64:1.8, this may be because there are no credentials on this request.  details: (Error response from daemon: {\"message\":\"Get https://gcr.io/v1/_ping: dial tcp 64.233.188.82:443: i/o timeout\"})"
, failed to "StartContainer" for "dnsmasq" with ErrImagePull: "image pull failed for gcr.io/google_containers/kube-dnsmasq-amd64:1.4, this may be because there are no credentials on this request.  details: (Error response from daemon: {\"message\":\"Get https://gcr.io/v1/_ping: dial tcp 64.233.188.82:443: i/o timeout\"})"
, failed to "StartContainer" for "healthz" with ErrImagePull: "image pull failed for gcr.io/google_containers/exechealthz-amd64:1.2, this may be because there are no credentials on this request.  details: (Error response from daemon: {\"message\":\"Get https://gcr.io/v1/_ping: dial tcp 64.233.188.82:443: i/o timeout\"})"
]
  3m    3m      1       {kubelet 192.168.200.14} spec.containers{healthz}        Warning Failed          Failed to pull image "gcr.io/google_containers/exechealthz-amd64:1.2": image pull failed for gcr.io/google_containers/exechealthz-amd64:1.2, this may be because there are no credentials on this request.  details: (Error response from daemon: {"message":"Get https://gcr.io/v1/_ping: dial tcp 64.233.188.82:443: i/o timeout"})
  1m    1m      1       {kubelet 192.168.200.14} spec.containers{dnsmasq}        Warning Failed          Failed to pull image "gcr.io/google_containers/kube-dnsmasq-amd64:1.4": image pull failed for gcr.io/google_containers/kube-dnsmasq-amd64:1.4, this may be because there are no credentials on this request.  details: (Error response from daemon: {"message":"Get https://gcr.io/v1/_ping: dial tcp 64.233.189.82:443: i/o timeout"})
  4m    1m      2       {kubelet 192.168.200.14} spec.containers{healthz}        Normal  Pulling         pulling image "gcr.io/google_containers/exechealthz-amd64:1.2"
  59s   59s     1       {kubelet 192.168.19.14} spec.containers{healthz}        Warning Failed          Failed to pull image "gcr.io/google_containers/exechealthz-amd64:1.2": image pull failed for gcr.io/google_containers/exechealthz-amd64:1.2, this may be because there are no credentials on this request.  details: (Error response from daemon: {"message":"Get https://gcr.io/v1/_ping: dial tcp 64.233.189.82:443: i/o timeout"})
  59s   59s     1       {kubelet 192.168.200.14}                                 Warning FailedSync      Error syncing pod, skipping: [failed to "StartContainer" for "kubedns" with ErrImagePull: "image pull failed for gcr.io/google_containers/kubedns-amd64:1.8, this may be because there are no credentials on this request.  details: (Error response from daemon: {\"message\":\"Get https://gcr.io/v1/_ping: dial tcp 64.233.189.82:443: i/o timeout\"})"
, failed to "StartContainer" for "dnsmasq" with ErrImagePull: "image pull failed for gcr.io/google_containers/kube-dnsmasq-amd64:1.4, this may be because there are no credentials on this request.  details: (Error response from daemon: {\"message\":\"Get https://gcr.io/v1/_ping: dial tcp 64.233.189.82:443: i/o timeout\"})"
, failed to "StartContainer" for "healthz" with ErrImagePull: "image pull failed for gcr.io/google_containers/exechealthz-amd64:1.2, this may be because there are no credentials on this request.  details: (Error response from daemon: {\"message\":\"Get https://gcr.io/v1/_ping: dial tcp 64.233.189.82:443: i/o timeout\"})"

The events show that the failure is caused by timeouts pulling from Google's registry; the proxy workaround described above solves this.

For example, running the django web service used in this article takes nothing more than the RC and Service configuration files shown above.

Adding an Internet Entry Point for Services

Load Balancer

A service can be published by exposing a deployment with --type="LoadBalancer", but this currently only works on public clouds such as Google Container Engine and apparently cannot be used in a private data center. See the "Allow external traffic" section of the official Hello World tutorial.

Ingress

To expose services deployed on an internal network, you can use an Ingress. The following configuration maps the services above onto port 80, implementing virtual hosting:

# An Ingress with 2 hosts and 3 endpoints
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: echomap
spec:
  rules:
  - host: foo.bar.com
    http:
      paths:
      - path: /foo
        backend:
          serviceName: django-x
          servicePort: 18111
  - host: bar.baz.com
    http:
      paths:
      - path: /bar
        backend:
          serviceName: django-y
          servicePort: 18111
      - path: /foo
        backend:
          serviceName: django-x
          servicePort: 18111

Then create it and inspect the Ingress:

root@node3:/usr/src/gohome/src/k8s.io/contrib/ingress/controllers/nginx-alpha# kubectl create -f rc.yaml
replicationcontroller "nginx-ingress" created
root@node3:~$ kubectl get ing
NAME      HOSTS                     ADDRESS   PORTS     AGE
echomap   foo.bar.com,bar.baz.com             80        3d
root@node3:~$ kubectl describe ing echomap
Name:           echomap
Namespace:      default
Address:
Default backend:    default-http-backend:80 (<none>)
Rules:
  Host      Path    Backends
  ----      ----    --------
  foo.bar.com
            /foo    django-x:18111 (<none>)
  bar.baz.com
            /bar    django-y:18111 (<none>)
            /foo    django-x:18111 (<none>)
Annotations:
No events.

Ingress is the Kubernetes mechanism for describing how services should be reached, and the mappings can be read through the API. Actually implementing the behavior, such as the virtual hosting in the example above, is the job of an ingress controller from the Kubernetes contrib repository; this article uses the nginx-based one.

contrib is a collection of code that has not made it into the Kubernetes core; it lives at https://github.com/kubernetes/contrib. Install it following the project's README (it must sit inside $GOPATH; you cannot just clone it anywhere). Assuming $GOPATH=/usr/src/gohome, Kubernetes Contrib ends up in /usr/src/gohome/src/k8s.io/contrib/, and the nginx Ingress controller we are discussing in /usr/src/gohome/src/k8s.io/contrib/ingress/controllers/nginx-alpha.

Looking at rc.yaml in the nginx-alpha directory, the Ingress controller runs the gcr.io/google_containers/nginx-ingress image. That image was broken when I tested it, so in the experiment I rebuilt it from the Dockerfile in the same directory. The Dockerfile shows the image is based on nginx: it runs a controller program that translates the Ingress mappings obtained from the Kubernetes API into an nginx configuration file, thereby reverse-proxying requests to the Services running the different sites.

After creating rc.yaml, we can find the corresponding docker container and inspect the generated mapping rules:

root@node3:/usr/src# kubectl get pods -o wide|grep nginx
nginx-ingress-g518r    1/1       Running   2          3d        172.16.66.3   192.168.200.13
root@node3:/usr/src# docker ps |grep nginx
4374a4965333        gcr.io/google_containers/nginx-ingress:0.1                                    "/controller"            46 hours ago        Up 46 hours                                                                                                                                                            k8s_nginx.a9cb3eb9_nginx-ingress-g518r_default_71b457b9-914e-11e6-821d-c81f66f3c543_f9c7501f
0051bb8806d1        gcr.io/google_containers/pause-amd64:3.0                                      "/pause"                 46 hours ago        Up 46 hours         0.0.0.0:80->80/tcp                                                                                                                                 k8s_POD.6cfd0339_nginx-ingress-g518r_default_71b457b9-914e-11e6-821d-c81f66f3c543_e01f441e
root@node3:/usr/src# docker exec -it 4374a4965333 /bin/bash
[ root@nginx-ingress-g518r:/etc/nginx ]$ ls
certs/   fastcgi.conf    koi-utf  mime.types  proxy_params  sites-available/  snippets/     win-utf
conf.d/  fastcgi_params  koi-win  nginx.conf  scgi_params   sites-enabled/    uwsgi_params
[ root@nginx-ingress-g518r:/etc/nginx ]$ cat nginx.conf

events {
  worker_connections 1024;
}
http {
  # http://nginx.org/en/docs/http/ngx_http_core_module.html
  types_hash_max_size 2048;
  server_names_hash_max_size 512;
  server_names_hash_bucket_size 64;



  server {
    listen 80;
    server_name foo.bar.com;

    location /foo {
      proxy_set_header Host $host;
      proxy_pass http://django-x.default.svc.cluster.local:18111;
    }
  }
  server {
    listen 80;
    server_name bar.baz.com;

    location /bar {
      proxy_set_header Host $host;
      proxy_pass http://django-y.default.svc.cluster.local:18111;
    }
    location /foo {
      proxy_set_header Host $host;
      proxy_pass http://django-x.default.svc.cluster.local:18111;
    }
  }
[ root@nginx-ingress-g518r:/etc/nginx ]$ ping django-x.default.svc.cluster.local
PING django-x.default.svc.cluster.local (100.0.87.12) 56(84) bytes of data.
64 bytes from django-x.default.svc.cluster.local (100.0.87.12): icmp_seq=1 ttl=47 time=265 ms
64 bytes from django-x.default.svc.cluster.local (100.0.87.12): icmp_seq=2 ttl=47 time=253 ms

At this point the underlying mechanism is clear, so let's look at it in action. Since all three services run django, you can check django's access logs to see the real web traffic (if the app is run under supervisor, enter the container and look at the log files under /var/log/supervisor).

Now a request to http://bar.baz.com/bar is forwarded to the django-y service (http://django-y.default.svc.cluster.local:18111); a request to http://bar.baz.com/foo or http://foo.bar.com/foo goes to django-x; anything else gets nginx's 404 page.
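Since foo.bar.com and bar.baz.com have no real DNS records in this setup, the routing rules can be checked from any host that can reach the ingress node by setting the Host header explicitly (192.168.200.13 is the node where the nginx-ingress pod publishes port 80):

```shell
curl -H "Host: bar.baz.com" http://192.168.200.13/bar   # forwarded to django-y
curl -H "Host: foo.bar.com" http://192.168.200.13/foo   # forwarded to django-x
curl -H "Host: foo.bar.com" http://192.168.200.13/bar   # no rule: nginx 404
```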

This approach is not limited to web services; it can also be used to expose things like external ssh access.
