OpenShift metrics with dynamic storage provisioning

OpenShift metrics supports persistent storage (the USE_PERSISTENT_STORAGE=true option) and dynamic storage provisioning (DYNAMICALLY_PROVISION_STORAGE=true) using storage classes. To learn more about Kubernetes storage classes, read the Kubernetes documentation.

In order to use these parameters, it is necessary to:

  • have the OCP platform configured for a particular storage backend; check the OCP documentation for details on how to configure OCP to work with that backend. OCP supports many storage backends, so pick the one closest to your use case.
  • configure a storage class which will provide the persistent storage used by OpenShift metrics (a sketch follows below).
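
As a rough sketch of the second point, a storage class named dynamic (matching the annotation in the template below) could be created like this. It assumes an AWS EBS backend purely for illustration; the provisioner and parameters must match whatever backend your OCP platform is configured for, and depending on the OpenShift version the apiVersion may need to be storage.k8s.io/v1:

# cat <<EOF | oc create -f -
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: dynamic
provisioner: kubernetes.io/aws-ebs   # assumption: AWS EBS backend, adjust to your backend
parameters:
  type: gp2                          # backend-specific parameter, here the EBS volume type
EOF
# oc get storageclass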

Among the metrics templates there is also a dynamic PV template, in which the PersistentVolumeClaim is defined as

- kind: PersistentVolumeClaim
  apiVersion: v1
  metadata:
    name: "${PV_PREFIX}-${NODE}"
    labels:
      metrics-infra: hawkular-cassandra
    annotations:
      volume.beta.kubernetes.io/storage-class: "dynamic"
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: "${PV_SIZE}"

Until the storage class name is exposed as a parameter upstream, it is either necessary to name the storage class dynamic, or, if a storage class with a different name is to be used, to rebuild the metrics images. Rebuilding the metrics images is quite easy and can be done following the steps below (I assume the images already have the volume.beta.kubernetes.io/storage-class annotation built in).

# git clone https://github.com/openshift/origin-metrics
# do necessary changes in template related to storageclass 
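
The "necessary changes" step could, for example, be a simple search and replace of the storage class name in the cloned templates. This is only a sketch: my-storage-class is a made-up name and the exact template file layout in origin-metrics may differ, so check what grep reports first.

# cd origin-metrics
# grep -rl 'volume.beta.kubernetes.io/storage-class' .
# grep -rl 'volume.beta.kubernetes.io/storage-class' . | \
    xargs sed -i 's|storage-class: "dynamic"|storage-class: "my-storage-class"|'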

# cd origin-metrics/hack 
# ./build-images.sh --prefix=new_image --version=1.0

Once this finishes, the name of the desired storage class will be part of the deployer pod, and configuring metrics with dynamic storage provisioning will work as expected.
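
With the rebuilt images in place, deploying metrics could then look roughly like the following. USE_PERSISTENT_STORAGE and DYNAMICALLY_PROVISION_STORAGE are the parameters discussed above; the other parameter names (HAWKULAR_METRICS_HOSTNAME, IMAGE_PREFIX, IMAGE_VERSION), the hostname and the openshift-infra project are assumptions based on the usual metrics deployer template, so double-check them against your metrics-deployer.yaml:

# oc project openshift-infra
# oc new-app -f metrics-deployer.yaml \
    -p HAWKULAR_METRICS_HOSTNAME=hawkular-metrics.example.com \
    -p USE_PERSISTENT_STORAGE=true \
    -p DYNAMICALLY_PROVISION_STORAGE=true \
    -p IMAGE_PREFIX=new_image \
    -p IMAGE_VERSION=1.0
# oc get pvc --watch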

#dynamic-storage-provisioning, #kubernetes-dynamic-storage, #ocp, #openshift-container-platform, #openshift-metrics

Remove CNS configuration from OCP cluster

WARNING: The steps below are destructive; follow them at your own risk.

CNS – Container Native Storage
OCP – OpenShift Container Platform

A CNS cluster can be part of an OCP cluster, which means running the CNS pods inside the OCP cluster and having everything managed by OCP. Read more about OCP here and about CNS here.

This post is not about how to set up CNS / OCP; if you want to learn how to set up CNS, follow the documentation links. It is about how to remove the CNS configuration from an OCP cluster. Why would anybody want to do this? The two most obvious reasons I see are

  • stopping using CNS as a storage backend and freeing up resources for other projects
  • testing various configurations and setups before settling on the final one; during such testing it is necessary to clean up the configuration and start over.

Deleting the CNS pods and storage configuration will result in data loss, but assuming you know what you are doing and why, it is safe to play with this.
So, how to delete / recreate the CNS configuration of an OCP cluster? The steps are below.

CNS itself provides the cns-deploy tool:

# cns-deploy --abort 

If run in the namespace where the CNS pods were created, it will report the output below:

# cns-deploy --abort
Multiple CLI options detected. Please select a deployment option.
[O]penShift, [K]ubernetes? [O/o/K/k]: O
Using OpenShift CLI.
NAME      STATUS    AGE
zelko     Active    12h
Using namespace "zelko".
Do you wish to abort the deployment?
[Y]es, [N]o? [Default: N]: Y
No resources found
deploymentconfig "heketi" deleted
service "heketi" deleted
route "heketi" deleted
service "heketi-storage-endpoints" deleted
serviceaccount "heketi-service-account" deleted
template "deploy-heketi" deleted
template "heketi" deleted

From this it is visible that the deploymentconfig, services, serviceaccounts, and templates were deleted, but not the labels on the nodes; I opened a BZ for that.

This is the first step; however, the CNS pods are still present. That is by CNS design: CNS pods should not be deleted so easily, as deleting them will destroy data, but in this specific case and for the reasons listed above we want to delete them.

# oc get pods -n zelko 
NAME              READY     STATUS    RESTARTS   AGE
glusterfs-72bps   1/1       Running   0          12h
glusterfs-fg3k5   1/1       Running   0          12h
glusterfs-gb9h4   1/1       Running   0          12h
glusterfs-hn0gk   1/1       Running   0          12h
glusterfs-jsrn8   1/1       Running   0          12h

Deleting the CNS project will delete the CNS pods:

# oc delete project zelko

Running cns-deploy --abort followed by oc delete project zelko can be reduced to just deleting the project, which gives the same result.

This will not clean up everything on the CNS nodes: the PVs/VGs/LVs created by the CNS configuration will still be there, and if you log in to an OCP node which earlier hosted CNS pods, you will see them.

# vgs
  VG                                  #PV #LV #SN Attr   VSize   VFree
  vg_18c5f1249a65b678dd5a904fd70a9cd8   1   0   0 wz--n- 199.87g 199.87g
  vg_7137ba2d9b189997110f53a6e7b1a5e4   1   2   0 wz--n- 199.87g 197.85g
# pvs
 PV         VG                                  Fmt  Attr PSize   PFree  
  /dev/xvdc  vg_7137ba2d9b189997110f53a6e7b1a5e4 lvm2 a--  199.87g 197.85g
  /dev/xvdd  vg_18c5f1249a65b678dd5a904fd70a9cd8 lvm2 a--  199.87g 199.87g
# lvs                    
  brick_29ec6b1183398e9513676e586db1adb5 vg_7137ba2d9b189997110f53a6e7b1a5e4 Vwi-a-tz--  2.00g tp_29ec6b1183398e9513676e586db1adb5        0.66                                   
  tp_29ec6b1183398e9513676e586db1adb5    vg_7137ba2d9b189997110f53a6e7b1a5e4 twi-aotz--  2.00g                                            0.66   0.33                  

Recreating CNS will not work as long as these leftovers from the previous configuration are present. For removing the volume groups, the fastest approach I found was vgchange -an volumegroup; vgremove volumegroup --force.
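
A small, very destructive sketch of that cleanup: the vg_ name pattern matches the heketi-created volume groups from the output above and the device names are the ones shown by pvs, so adjust both to your environment before running anything.

# vgs --noheadings -o vg_name | grep vg_
# for vg in $(vgs --noheadings -o vg_name | grep vg_); do vgchange -an "$vg"; vgremove --force "$vg"; done
# pvremove /dev/xvdc /dev/xvdd   # device names taken from the pvs output above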

It is also necessary to clean up the contents of /var/lib/glusterd/, /etc/glusterfs/ and /var/lib/heketi/; below is the list of files which need to be removed before running cns-deploy again.

# ls -l /var/lib/glusterd/
total 8
drwxr-xr-x. 3 root root 17 Feb 27 14:14 bitd
drwxr-xr-x. 2 root root 34 Feb 27 14:12 geo-replication
-rw-------. 1 root root 66 Feb 27 14:14 glusterd.info
drwxr-xr-x. 3 root root 19 Feb 27 14:12 glusterfind
drwxr-xr-x. 3 root root 46 Feb 27 14:14 glustershd
drwxr-xr-x. 2 root root 40 Feb 27 14:12 groups
drwxr-xr-x. 3 root root 15 Feb 27 14:12 hooks
drwxr-xr-x. 3 root root 39 Feb 27 14:14 nfs
-rw-------. 1 root root 47 Feb 27 14:14 options
drwxr-xr-x. 2 root root 94 Feb 27 14:14 peers
drwxr-xr-x. 3 root root 17 Feb 27 14:14 quotad
drwxr-xr-x. 3 root root 17 Feb 27 14:14 scrub
drwxr-xr-x. 2 root root 31 Feb 27 14:14 snaps
drwxr-xr-x. 2 root root  6 Feb 27 14:12 ss_brick
drwxr-xr-x. 3 root root 29 Feb 27 14:14 vols


# ls -l /etc/glusterfs/
total 32
-rw-r--r--. 1 root root  400 Feb 27 14:12 glusterd.vol
-rw-r--r--. 1 root root 1001 Feb 27 14:12 glusterfs-georep-logrotate
-rw-r--r--. 1 root root  626 Feb 27 14:12 glusterfs-logrotate
-rw-r--r--. 1 root root 1822 Feb 27 14:12 gluster-rsyslog-5.8.conf
-rw-r--r--. 1 root root 2564 Feb 27 14:12 gluster-rsyslog-7.2.conf
-rw-r--r--. 1 root root  197 Feb 27 14:12 group-metadata-cache
-rw-r--r--. 1 root root  276 Feb 27 14:12 group-virt.example
-rw-r--r--. 1 root root  338 Feb 27 14:12 logger.conf.example

# ls -l /var/lib/heketi/
total 4
-rw-r--r--. 1 root root 219 Feb 27 14:14 fstab
drwxr-xr-x. 3 root root  49 Feb 27 14:14 mounts

Do this on every machine which was previously part of the CNS cluster, and when deleting VGs/PVs/LVs pay attention that you point to the correct names/devices!
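
On my nodes the cleanup itself boiled down to something like the commands below. Again, this is destructive and only a sketch: if gluster is also installed and used directly on the host (outside of CNS), wiping /etc/glusterfs/ and /var/lib/glusterd/ wholesale is not what you want.

# umount /var/lib/heketi/mounts/*/* 2>/dev/null   # unmount any leftover brick mounts
# rm -rf /var/lib/heketi/* /var/lib/glusterd/* /etc/glusterfs/*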

Remove the storagenode=glusterfs labels from the nodes which were previously used in the CNS deployment; check the BZ for why this is necessary. To remove the node labels, either use the approach described here or run oc edit node and delete the storagenode=glusterfs label.
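
A quick way to do that from the command line (node names are of course environment specific):

# oc get nodes --show-labels | grep storagenode
# oc label node <node-name> storagenode-   # the trailing dash removes the label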

After all of this is done, running cns-deploy should work as expected.

#cns, #container-native-storage, #gluster, #ipv6, #lvm, #openshift-container-platform

firewalld custom rules for OpenShift Container Platform

If you end up with the message below while trying to restart iptables, then firewalld is the service you should be looking at.

# systemctl restart iptables 
Failed to restart iptables.service: Unit is masked.

The firewalld service comes with a set of commands, the most notable one being firewall-cmd, and if run in help mode it will present itself in its whole messy glory … try it and run!

# firewall-cmd -h

will give you everything necessary to start playing with firewalld rules.

Useful ones are

# systemctl status firewalld
# firewall-cmd --get-zones
# firewall-cmd --list-all-zones
# firewall-cmd --get-default-zone
# firewall-cmd --get-active-zones
# firewall-cmd --info-zone=public

and hundreds of others; man firewall-cmd is the man page to read.

If for some reason we have to change firewalld rules, that can be a different experience from what most Linux users are used to.

In a recent OpenShift installation you will notice many firewalld rules created by the installer. An example of the input chain is

Chain IN_public_allow (1 references)
  pkts bytes target     prot opt in     out     source               destination         

  598 31928 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:22 ctstate NEW
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:2379 ctstate NEW
   24  1052 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:443 ctstate NEW
   34  1556 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:80 ctstate NEW
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8053 ctstate NEW
    0     0 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0            udp dpt:10255 ctstate NEW
 2669  160K ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8443 ctstate NEW
    0     0 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0            udp dpt:4789 ctstate NEW
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:10250 ctstate NEW
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:10255 ctstate NEW
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8444 ctstate NEW
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:2380 ctstate NEW
13862 1488K ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0            udp dpt:8053 ctstate NEW

Trying to add an additional rule to IN_public_allow with classical iptables will not work; firewalld has a different approach.

E.g. to add the CNS (Container Native Storage) ports (which are not open by default, and will stay that way as long as CNS is not part of the default OpenShift Ansible installer), we need to run

# firewall-cmd --direct --add-rule ipv4 filter IN_public_allow 1 -m tcp -p tcp -m conntrack --ctstate NEW --dport 24007 -j ACCEPT
# firewall-cmd --direct --add-rule ipv4 filter IN_public_allow 1 -m tcp -p tcp -m conntrack --ctstate NEW --dport 24008 -j ACCEPT
# firewall-cmd --direct --add-rule ipv4 filter IN_public_allow 1 -m tcp -p tcp -m conntrack --ctstate NEW --dport 2222 -j ACCEPT
# firewall-cmd --direct --add-rule ipv4 filter IN_public_allow 1 -m tcp -p tcp -m conntrack --ctstate NEW -m multiport --dports 49152:49664 -j ACCEPT

The keyword is --direct; as its name says, it will interact with the firewall rules direct-ly. More about this here and here.
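
The direct rules added this way can be listed back, which is handy for checking that they really landed in the IN_public_allow chain:

# firewall-cmd --direct --get-rules ipv4 filter IN_public_allow
# firewall-cmd --direct --get-all-rules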

After adding the rules, if they are not saved with

# firewall-cmd --runtime-to-permanent

the next restart of firewalld.service will wipe them, so it is necessary to save the rules. They will be written to /etc/firewalld/direct.xml.

#fedora-2, #firewalld, #iptables, #openshift, #redhat

OCP metrics error message “Error from server: No API token found for service account "metrics-deployer"”

I wanted to recreate the OCP (OpenShift Container Platform) metrics and followed the same upstream process as many times before, but it kept failing with

Error from server: No API token found for service account "metrics-deployer", retry after the token is automatically created and added to the service account 

Huh, new trouble; luckily, restarting the master services helped in this case:

# systemctl restart atomic-openshift-master-controllers; systemctl restart atomic-openshift-master-api

This was a multi-master configuration, so it was necessary to restart the master services on all masters.
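
To check whether the token showed up after the restart, something like the following can be used (assuming metrics live in the usual openshift-infra project):

# oc describe serviceaccount metrics-deployer -n openshift-infra
# oc get secrets -n openshift-infra | grep metrics-deployer-token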

Just writing it down in the hope that Google will pick up the tags and help someone with the same issue.

Happy hacking!

#atomic-openshift-master-api, #atomic-openshift-master-controlers, #kubernetes, #metrics, #ocp, #openshift-container-platform, #openshift-metrics

OpenShift : Error from server: User “$user” cannot list all nodes/pods in the cluster

The OpenShift error messages below can be quite annoying, and they appear if the current login is not system:admin.
Example error messages:

# oc get pods 
No resources found.
Error from server: User "system:anonymous" cannot list pods in project "default"
root@dhcp8-176: ~ # oc get nodes 
No resources found.
Error from server: User "system:anonymous" cannot list all nodes in the cluster

Trying to log in with user admin will not help:

# oc login -u admin 
Authentication required for https://dhcp8-144.example.net:8443 (openshift)
Username: admin
Password: 
Login successful.
You don't have any projects. You can try to create a new project, by running
    oc new-project 

root@dhcp8-176: ~ # oc get pods
No resources found.
Error from server: User "admin" cannot list pods in project "default"
root@dhcp8-176: ~ # oc get nodes 
No resources found.
Error from server: User "admin" cannot list all nodes in the cluster

To get rid of it, log in as system:admin:

# oc login -u system:admin

What it does and which certificates it reads in order to succeed can be seen if the last command is run with --loglevel=10:

# oc login -u system:admin --loglevel=10
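
If the admin user from the example above should be able to list nodes and pods on its own instead of switching to system:admin every time, an alternative is to grant it the cluster-admin role while logged in as system:admin. A sketch, and obviously only for users who really should have cluster-wide rights:

# oc login -u system:admin
# oadm policy add-cluster-role-to-user cluster-admin admin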

#kubernetes, #openshift

Recreate OpenShift v3.3 router

If for any reason it is necessary to recreate the OpenShift router, do the following.
Delete the router pod, service, serviceaccount, and dc:

# oc delete pods $(oc get pods | grep router | awk '{print $1}') 
# oc delete svc router 
# oc delete serviceaccount router 
# oc delete dc router 

Re-create the router:

# oadm router router --service-account=router
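
After the router is recreated it is worth checking that the deploymentconfig, service and pod are back, for example:

# oc get dc router
# oc get svc router
# oc get pods -o wide | grep router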

Reference: OpenShift v3.3 documentation

#kubernets, #oadm, #openshift-v3-3, #rhel, #router

etcd error message “etcd failed to send out heartbeat on time”

… etcd is a distributed key-value store that provides a reliable way to store data across a cluster of machines (per 1 and 2). etcd is very sensitive to network delays, and not only network delays: all kinds of overall sluggishness of the etcd cluster nodes can lead to functionality problems for the whole Kubernetes cluster.

By the time an OpenShift/Kubernetes cluster starts reporting error messages as shown below, the cluster will already be misbehaving: pod scheduling / deletion will not work as expected and the problems will be more than visible.

Sep 27 00:04:01 dhcp7-237 etcd: failed to send out heartbeat on time (deadline exceeded for 1.766957688s)
Sep 27 00:04:01 dhcp7-237 etcd: server is likely overloaded
Sep 27 00:04:01 dhcp7-237 etcd: failed to send out heartbeat on time (deadline exceeded for 1.766976918s)
Sep 27 00:04:01 dhcp7-237 etcd: server is likely overloaded

The systemctl status etcd output:

# systemctl status etcd
● etcd.service - Etcd Server
   Loaded: loaded (/usr/lib/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2016-10-01 09:18:37 EDT; 5h 20min ago
 Main PID: 11970 (etcd)
   Memory: 1.0G
   CGroup: /system.slice/etcd.service
           └─11970 /usr/bin/etcd --name=dhcp6-138.example.net --data-dir=/var/lib/etcd/ --listen-client-urls=https://172.16.6.138:2379

Oct 01 14:38:55 dhcp6-138.example.net etcd[11970]: server is likely overloaded
Oct 01 14:38:56 dhcp6-138.example.net etcd[11970]: failed to send out heartbeat on time (deadline exceeded for 377.70994ms)
Oct 01 14:38:56 dhcp6-138.example.net etcd[11970]: server is likely overloaded
Oct 01 14:38:56 dhcp6-138.example.net etcd[11970]: failed to send out heartbeat on time (deadline exceeded for 377.933298ms)
Oct 01 14:38:56 dhcp6-138.example.net etcd[11970]: server is likely overloaded
Oct 01 14:38:58 dhcp6-138.example.net etcd[11970]: failed to send out heartbeat on time (deadline exceeded for 1.226630142s)
Oct 01 14:38:58 dhcp6-138.example.net etcd[11970]: server is likely overloaded
Oct 01 14:38:58 dhcp6-138.example.net etcd[11970]: failed to send out heartbeat on time (deadline exceeded for 1.226803192s)
Oct 01 14:38:58 dhcp6-138.example.net etcd[11970]: server is likely overloaded
Oct 01 14:39:07 dhcp6-138.example.net etcd[11970]: the clock difference against peer f801f8148b694198 is too high [1.078081179s > 1s]

Running systemctl status etcd -l will also show similar messages; check these too.

The etcd configuration file is located at /etc/etcd/etcd.conf and has content similar to the one below; this one is from RHEL, other OSes may have it slightly different.

ETCD_NAME=dhcp7-237.example.net
ETCD_LISTEN_PEER_URLS=https://172.16.7.237:2380
ETCD_DATA_DIR=/var/lib/etcd/
ETCD_HEARTBEAT_INTERVAL=6000
ETCD_ELECTION_TIMEOUT=30000
ETCD_LISTEN_CLIENT_URLS=https://172.16.7.237:2379

ETCD_INITIAL_ADVERTISE_PEER_URLS=https://172.16.7.237:2380
ETCD_INITIAL_CLUSTER=dhcp7-241.example.net=https://172.16.7.241:2380,dhcp7-237.example.net=https://172.16.7.237:2380,dhcp7-239.example.net=https://172.16.7.239:2380
ETCD_INITIAL_CLUSTER_STATE=new
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster-1
ETCD_ADVERTISE_CLIENT_URLS=https://172.16.7.237:2379


ETCD_CA_FILE=/etc/etcd/ca.crt
ETCD_CERT_FILE=/etc/etcd/server.crt
ETCD_KEY_FILE=/etc/etcd/server.key
ETCD_PEER_CA_FILE=/etc/etcd/ca.crt
ETCD_PEER_CERT_FILE=/etc/etcd/peer.crt
ETCD_PEER_KEY_FILE=/etc/etcd/peer.key

The parameters in the above configuration file we want to change are ETCD_HEARTBEAT_INTERVAL and ETCD_ELECTION_TIMEOUT. There is no single value that fits all; it is necessary to play with different values and find out what works best. For most cases the default values (500/2500) will be fine.

After changing /etc/etcd/etcd.conf, do not forget to restart the etcd service:

# systemctl restart etcd
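
After the restart it makes sense to verify that the cluster members are healthy again. A sketch using the v2 etcdctl; the certificate paths and the endpoint come from the configuration file above, and the exact flags may differ between etcdctl versions:

# etcdctl --ca-file /etc/etcd/ca.crt \
          --cert-file /etc/etcd/peer.crt \
          --key-file /etc/etcd/peer.key \
          --endpoints https://172.16.7.237:2379 cluster-health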

The following issues affecting etcd nodes can lead to the problem described in this post:

  • network latency
  • storage latency
  • combination of network latency and storage latency

If network latency is low, then check the storage used by the Kubernetes/OpenShift etcd servers. What is described in this post is a workaround: once the root cause is discovered, the changes stated here only mitigate the issue when no other option is possible. The first and better solution would be to solve the issue at its roots by fixing the problematic subsystem(s).
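
One rough way to check whether storage is the culprit is to measure fdatasync latency in the etcd data directory with fio. This is only a sketch of the commonly suggested etcd-like write pattern, and the job parameters are approximations rather than anything official:

# yum install -y fio
# fio --rw=write --ioengine=sync --fdatasync=1 \
      --directory=/var/lib/etcd --size=22m --bs=2300 \
      --name=etcd-disk-latency-check

The interesting part of the output is the fsync/fdatasync latency percentiles; if the high percentiles get anywhere near the configured heartbeat interval, storage latency is the problem.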

In my particular case the storage subsystem was slow, and it was not possible to change that without a bunch of $$$.

References: etcd documentation

#etcd, #k8s, #kubernetes, #linux, #openshift, #redhat, #storage