Friday, June 24, 2016

TripleO QuickStart HA Setup && Keeping undercloud persistent between cold reboots

================
UPDATE 09/03/2016
================
The undercloud VM is now created with autostart enabled at boot.
So after a reboot just fix the permissions and allow 5 to 7 minutes
for the services to start on the undercloud.

Once the deployment has completed:
[stack@ServerTQS72 ~]$ virsh dominfo undercloud | grep -i autostart
Autostart:      enable
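
The same check can be scripted. This is only a minimal sketch: the function name autostart_enabled is hypothetical, and it assumes `virsh dominfo` prints the "Autostart:" line exactly as shown above.

```shell
#!/bin/bash
# Sketch: succeed only if the domain reports "Autostart: enable".
# autostart_enabled is a hypothetical helper name, not part of virsh.
autostart_enabled() {
    local domain="$1"
    # Split each line on ":" plus optional spaces, find the Autostart field.
    virsh dominfo "$domain" | awk -F': *' '
        tolower($1) == "autostart" { found = ($2 == "enable") }
        END { exit !found }
    '
}
```

Usage: `autostart_enabled undercloud || virsh autostart undercloud` turns autostart on when it is missing.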

================
UPDATE 08/18/2016
================
Make the following updates:

[root@ServerTQS72 ~]# cat /etc/rc.d/rc.local
#!/bin/bash
# Please note that you must run 'chmod +x /etc/rc.d/rc.local' to ensure
# that this script will be executed during boot.
mkdir -p /run/user/1001
chown -R stack /run/user/1001
chgrp -R stack /run/user/1001

touch /var/lock/subsys/local
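
The rc.local above hardcodes UID 1001 for stack. A hedged variant that looks the UID up instead might look like the sketch below. The function name ensure_runtime_dir and its optional base-directory argument are assumptions made for illustration, and mode 700 is the usual mode for a per-user runtime directory rather than something taken from the file above.

```shell
#!/bin/bash
# Sketch: create <base>/<uid> for a user without hardcoding the UID.
# ensure_runtime_dir is a hypothetical helper; base defaults to /run/user.
ensure_runtime_dir() {
    local user="$1"
    local base="${2:-/run/user}"
    local uid gid
    uid="$(id -u "$user")" || return 1
    gid="$(id -g "$user")" || return 1
    mkdir -p "$base/$uid" || return 1
    chown "$uid:$gid" "$base/$uid" || return 1
    chmod 700 "$base/$uid" || return 1
    # Print the resulting path so callers can reuse it.
    printf '%s\n' "$base/$uid"
}
```

In rc.local (which runs as root) this would be called as `ensure_runtime_dir stack`.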

========================
In stack's .bashrc
========================

[stack@ServerTQS72 ~]$ cat .bashrc
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# User specific aliases and functions
# BEGIN ANSIBLE MANAGED BLOCK
# Make sure XDG_RUNTIME_DIR is set (used by libvirt
# for creating config and sockets for qemu:///session
# connections)
: ${XDG_RUNTIME_DIR:=/run/user/$(id -u)}
export XDG_RUNTIME_DIR
export DISPLAY=:0.0
export NO_AT_BRIDGE=1

# END ANSIBLE MANAGED BLOCK
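
For reference, the `: ${XDG_RUNTIME_DIR:=...}` line in the managed block assigns a default only when the variable is unset or empty. A small sketch of that behaviour, wrapped in a hypothetical function (set_runtime_dir_default) so it can be tried in isolation:

```shell
#!/bin/bash
# Sketch: ${VAR:=default} keeps an existing value and fills in a default
# only when VAR is unset or empty. The ":" builtin just evaluates its argument.
set_runtime_dir_default() {
    : "${XDG_RUNTIME_DIR:=/run/user/$(id -u)}"
    export XDG_RUNTIME_DIR
}
```

Calling it with XDG_RUNTIME_DIR already set leaves the value untouched.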

===========================
Reboot VIRTHOST
===========================

$ sudo su -
# xhost +
# su - stack

[stack@ServerTQS72 ~]$ virt-manager --connect qemu:///session

Start the undercloud VM.



=============
END UPDATE
=============


This post follows up http://lxer.com/module/newswire/view/230814/index.html
and might work as a time saver, unless the status of undercloud.qcow2 per
http://artifacts.ci.centos.org/artifacts/rdo/images/mitaka/delorean/stable/
requires a fresh installation to be done from scratch.
The goal is to survive a VIRTHOST cold reboot (downtime) and keep the previous version of the undercloud VM, so that it can be brought up again without a rebuild via quickstart.sh. The procedure then restarts from logging into the undercloud and immediately running the overcloud deployment. Proceed as follows :-

1. System shutdown
    Cleanly delete the overcloud stack :-
    [stack@undercloud ~]$ openstack stack delete overcloud
2. Log into VIRTHOST as stack and gracefully shut down the undercloud
    [stack@ServerCentOS72 ~]$ virsh shutdown undercloud
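
Note that `virsh shutdown` only requests the shutdown and returns right away, so it may be worth waiting until the domain is really off before powering down VIRTHOST. A minimal sketch, assuming `virsh domstate` reports "shut off" for a stopped domain; the helper name wait_for_domain_state and its timeout and interval defaults are hypothetical:

```shell
#!/bin/bash
# Sketch: poll "virsh domstate" until the domain reaches the wanted state.
# wait_for_domain_state is a hypothetical helper; times are in seconds.
wait_for_domain_state() {
    local domain="$1" wanted="$2" timeout="${3:-300}" interval="${4:-5}"
    local waited=0 state
    while [ "$waited" -le "$timeout" ]; do
        # Command substitution strips the trailing blank line virsh prints.
        state="$(virsh domstate "$domain" 2>/dev/null)"
        if [ "$state" = "$wanted" ]; then
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 1
}
```

Usage: `virsh shutdown undercloud && wait_for_domain_state undercloud "shut off"`.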

  

**************************************
 Shut down and bring up VIRTHOST
**************************************

 Log in as root on VIRTHOST :-

[boris@ServerCentOS72 ~]$ sudo su -
[sudo] password for boris:
Last login: Fri Jun 24 16:47:25 MSK 2016 on pts/0

********************************************************************************
This is the core step: do not create /run/user/1001/libvirt as root and then
fix its permissions; just set the correct ownership on /run/user.
This allows "stack" to issue `virsh list --all` and create
/run/user/1001/libvirt himself. The rest works fine for me.
********************************************************************************

[root@ServerCentOS72 ~]# chown -R stack /run/user
[root@ServerCentOS72 ~]# chgrp -R stack /run/user

[root@ServerCentOS72 ~]# ls -ld  /run/user
drwxr-xr-x. 3 stack stack 60 Jun 24 20:01 /run/user

[root@ServerCentOS72 ~]# su - stack
Last login: Fri Jun 24 16:48:09 MSK 2016 on pts/0

[stack@ServerCentOS72 ~]$ virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     compute_0                      shut off
 -     compute_1                      shut off
 -     control_0                      shut off
 -     control_1                      shut off
 -     control_2                      shut off
 -     undercloud                     shut off

**********************
Make sure :-
**********************

[stack@ServerCentOS72 ~]$ ls -ld /run/user/1001/libvirt
drwx------. 6 stack stack 160 Jun 24 21:38 /run/user/1001/libvirt
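
The "make sure" step can also be expressed as a small check run as stack. This is a sketch only: session_dir_ready is a hypothetical name, the base directory argument exists just to make the function easy to try elsewhere, and `stat -c` assumes GNU coreutils.

```shell
#!/bin/bash
# Sketch: succeed if <base>/<my uid>/libvirt exists and belongs to the caller.
# session_dir_ready is a hypothetical helper; base defaults to /run/user.
session_dir_ready() {
    local base="${1:-/run/user}"
    local dir
    dir="$base/$(id -u)/libvirt"
    [ -d "$dir" ] && [ "$(stat -c %u "$dir")" = "$(id -u)" ]
}
```

Usage: `virsh list --all >/dev/null; session_dir_ready && virsh start undercloud`.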


[stack@ServerCentOS72 ~]$ virsh start undercloud
Domain undercloud started

[stack@ServerCentOS72 ~]$ virsh list --all
 Id    Name                           State
----------------------------------------------------
 2     undercloud                     running
 -     compute_0                      shut off
 -     compute_1                      shut off
 -     control_0                      shut off
 -     control_1                      shut off
 -     control_2                      shut off

Wait about 5 minutes, then access the undercloud from the workstation :-

[boris@fedora22wks tripleo-quickstart]$ ssh -F /home/boris/.quickstart/ssh.config.ansible undercloud
Warning: Permanently added '192.168.1.75' (ECDSA) to the list of known hosts.
Warning: Permanently added 'undercloud' (ECDSA) to the list of known hosts.
Last login: Fri Jun 24 15:34:40 2016 from gateway
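
Instead of waiting a fixed 5 minutes, the login can be retried until it succeeds. A minimal sketch of a generic retry loop; retry_until is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary values, not taken from QuickStart:

```shell
#!/bin/bash
# Sketch: run a command up to <attempts> times, sleeping <delay> seconds
# between failures. retry_until is a hypothetical helper name.
retry_until() {
    local attempts="$1" delay="$2"
    shift 2
    local i
    for ((i = 1; i <= attempts; i++)); do
        if "$@"; then
            return 0
        fi
        # No need to sleep after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

Usage: `retry_until 30 10 ssh -F /home/boris/.quickstart/ssh.config.ansible -o ConnectTimeout=5 undercloud true`.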

[stack@undercloud ~]$ ls -l
total 1640244
-rw-rw-r--. 1 stack stack   13287936 Jun 24 13:10 cirros.img
-rw-rw-r--. 1 stack stack    3740163 Jun 24 13:10 cirros.initramfs
-rw-rw-r--. 1 stack stack    4979632 Jun 24 13:10 cirros.kernel
-rw-rw-r--. 1  1001  1001      21769 Jun 24 11:56 instackenv.json
-rw-r--r--. 1 root  root   385824684 Jun 24 03:28 ironic-python-agent.initramfs
-rwxr-xr-x. 1 root  root     5158704 Jun 24 03:28 ironic-python-agent.kernel
-rwxr-xr-x. 1 stack stack        487 Jun 24 12:17 network-environment.yaml
-rwxr-xr-x. 1 stack stack        792 Jun 24 12:17 overcloud-deploy-post.sh
-rwxr-xr-x. 1 stack stack       2284 Jun 24 12:17 overcloud-deploy.sh
-rw-rw-r--. 1 stack stack       4324 Jun 24 13:50 overcloud-env.json
-rw-r--r--. 1 root  root    36478203 Jun 24 03:28 overcloud-full.initrd
-rw-r--r--. 1 root  root  1224070144 Jun 24 03:29 overcloud-full.qcow2
-rwxr-xr-x. 1 root  root     5158704 Jun 24 03:29 overcloud-full.vmlinuz
-rw-rw-r--. 1 stack stack        389 Jun 24 14:28 overcloudrc
-rwxr-xr-x. 1 stack stack       3374 Jun 24 12:17 overcloud-validate.sh
-rwxr-xr-x. 1 stack stack        284 Jun 24 12:17 run-tempest.sh
-rw-r--r--. 1 stack stack        161 Jun 24 12:17 skipfile
-rw-------. 1 stack stack        287 Jun 24 12:16 stackrc
-rw-rw-r--. 1 stack stack        232 Jun 24 14:28 tempest-deployer-input.conf
drwxrwxr-x. 9 stack stack       4096 Jun 24 15:23 tripleo-ci
-rw-rw-r--. 1 stack stack       1123 Jun 24 14:28 tripleo-overcloud-passwords
-rw-------. 1 stack stack       6559 Jun 24 11:59 undercloud.conf
-rw-rw-r--. 1 stack stack     782405 Jun 24 12:16 undercloud_install.log
-rwxr-xr-x. 1 stack stack         83 Jun 24 12:00 undercloud-install.sh
-rw-rw-r--. 1 stack stack       1579 Jun 24 12:00 undercloud-passwords.conf
-rw-rw-r--. 1 stack stack       7699 Jun 24 12:17 undercloud_post_install.log
-rwxr-xr-x. 1 stack stack       2780 Jun 24 12:00 undercloud-post-install.sh

[stack@undercloud ~]$ ./overcloud-deploy.sh

  
  

  Fourth redeployment based on the same undercloud VM. The starting address
  taken from the ctlplane DHCP pool obviously moves up with each redeployment



   Libvirt's pool && volumes configuration as built by QuickStart


[stack@ServerCentOS72 ~]$  virsh pool-dumpxml oooq_pool
<pool type='dir'>
  <name>oooq_pool</name>
  <uuid>dcf7f52b-e7f7-46aa-aa67-591afe598804</uuid>
  <capacity unit='bytes'>257572208640</capacity>
  <allocation unit='bytes'>85467271168</allocation>
  <available unit='bytes'>172104937472</available>
  <source>
  </source>
  <target>
    <path>/home/stack/.quickstart/pool</path>
    <permissions>
      <mode>0775</mode>
      <owner>1001</owner>
      <group>1001</group>
      <label>unconfined_u:object_r:user_home_t:s0</label>
    </permissions>
  </target>
</pool>
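
The capacity and allocation figures in this dump are enough to see how full the pool is. A rough sketch, assuming the XML layout shown above (unit='bytes', one element per line); pool_usage_percent is a hypothetical helper and the result is an integer percentage, rounded down:

```shell
#!/bin/bash
# Sketch: print allocation as a whole percentage of capacity for a pool.
# pool_usage_percent is a hypothetical helper built on "virsh pool-dumpxml".
pool_usage_percent() {
    local pool="$1" xml capacity allocation
    xml="$(virsh pool-dumpxml "$pool")" || return 1
    capacity="$(printf '%s\n' "$xml" |
        sed -n "s/.*<capacity unit='bytes'>\([0-9]*\)<\/capacity>.*/\1/p")"
    allocation="$(printf '%s\n' "$xml" |
        sed -n "s/.*<allocation unit='bytes'>\([0-9]*\)<\/allocation>.*/\1/p")"
    [ -n "$capacity" ] && [ "$capacity" -gt 0 ] || return 1
    [ -n "$allocation" ] || return 1
    echo $(( allocation * 100 / capacity ))
}
```

With the numbers above, 85467271168 of 257572208640 bytes works out to 33 percent.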
 
***************************************************************************
A slightly different way to manage it: log in as stack and invoke virt-manager
via `virt-manager --connect qemu:///session` once /run/user has
the correct permissions.
***************************************************************************
$ sudo su -
# chown -R stack /run/user
# chgrp -R stack /run/user
^D
[stack@ServerCentOS72 ~]$ virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     compute_0                      shut off
 -     compute_1                      shut off
 -     control_0                      shut off
 -     control_1                      shut off
 -     control_2                      shut off
 -     undercloud                     shut off

[stack@ServerCentOS72 ~]$ virt-manager --connect qemu:///session
[stack@ServerCentOS72 ~]$ virsh list --all
 Id    Name                           State
----------------------------------------------------
 2     undercloud                     running
 -     compute_0                      shut off
 -     compute_1                      shut off
 -     control_0                      shut off
 -     control_1                      shut off
 -     control_2                      shut off

   To start virt-manager without warning :-

  


From the workstation, connect to the undercloud
[boris@fedora22wks tripleo-quickstart]$ ssh -F /home/boris/.quickstart/ssh.config.ansible undercloud
[stack@undercloud ~]$ ./overcloud-deploy.sh
In several minutes you will see
  
  

[stack@undercloud ~]$ nova list
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                    | Status | Task State | Power State | Networks            |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| 40754e8a-461e-4328-b0c4-6740c71e9a0d | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.27 |
| df272524-a0bd-4ed7-b95c-92ac779c0b96 | overcloud-controller-1  | ACTIVE | -          | Running     | ctlplane=192.0.2.26 |
| 22802ff4-c472-4500-94d7-415c429073ab | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.0.2.29 |
| e79a8967-5c81-4ce1-9037-4e07b298d779 | overcloud-novacompute-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.25 |
| 27a7c6ac-a480-4945-b4d5-72e32b3c1886 | overcloud-novacompute-1 | ACTIVE | -          | Running     | ctlplane=192.0.2.28 |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
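
To pick the ctlplane address of a node without reading the table by eye, the output can be filtered. A minimal sketch, assuming the column layout of the `nova list` table above (Name in the second column, Networks in the last); ctlplane_ips is a hypothetical helper name:

```shell
#!/bin/bash
# Sketch: read "nova list" output on stdin, print "<name> <ctlplane ip>".
# ctlplane_ips is a hypothetical helper; it relies on the "|" column layout.
ctlplane_ips() {
    awk -F'|' '/ctlplane=/ {
        name = $3; net = $7
        # Trim the padding around both fields.
        gsub(/^[ \t]+|[ \t]+$/, "", name)
        gsub(/^[ \t]+|[ \t]+$/, "", net)
        sub(/^ctlplane=/, "", net)
        print name, net
    }'
}
```

Usage: `nova list | ctlplane_ips` prints one line per node, such as `overcloud-controller-0 192.0.2.27`.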

[stack@undercloud ~]$ ssh heat-admin@192.0.2.27
Last login: Sat Jun 25 09:35:35 2016 from 192.0.2.1
[heat-admin@overcloud-controller-0 ~]$ sudo su -
Last login: Sat Jun 25 09:54:06 UTC 2016 on pts/0
[root@overcloud-controller-0 ~]# .  keystonerc_admin
[root@overcloud-controller-0 ~(keystone_admin)]# pcs status
Cluster name: tripleo_cluster
Last updated: Sat Jun 25 10:04:32 2016        Last change: Sat Jun 25 09:21:21 2016 by root via cibadmin on overcloud-controller-0
Stack: corosync
Current DC: overcloud-controller-2 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
3 nodes and 127 resources configured

Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

Full list of resources:

 ip-172.16.2.5    (ocf::heartbeat:IPaddr2):    Started overcloud-controller-0
 ip-172.16.3.4    (ocf::heartbeat:IPaddr2):    Started overcloud-controller-1
 Clone Set: haproxy-clone [haproxy]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Master/Slave Set: galera-master [galera]
     Masters: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: memcached-clone [memcached]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 ip-192.0.2.24    (ocf::heartbeat:IPaddr2):    Started overcloud-controller-2
 ip-10.0.0.4    (ocf::heartbeat:IPaddr2):    Started overcloud-controller-0
 ip-172.16.2.4    (ocf::heartbeat:IPaddr2):    Started overcloud-controller-1
 ip-172.16.1.4    (ocf::heartbeat:IPaddr2):    Started overcloud-controller-2
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-core-clone [openstack-core]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Master/Slave Set: redis-master [redis]
     Masters: [ overcloud-controller-1 ]
     Slaves: [ overcloud-controller-0 overcloud-controller-2 ]
 Clone Set: mongod-clone [mongod]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-aodh-evaluator-clone [openstack-aodh-evaluator]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-l3-agent-clone [neutron-l3-agent]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-netns-cleanup-clone [neutron-netns-cleanup]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 openstack-cinder-volume    (systemd:openstack-cinder-volume):    Started overcloud-controller-0
 Clone Set: openstack-heat-engine-clone [openstack-heat-engine]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-api-clone [openstack-ceilometer-api]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-aodh-listener-clone [openstack-aodh-listener]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-gnocchi-metricd-clone [openstack-gnocchi-metricd]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-aodh-notifier-clone [openstack-aodh-notifier]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-clone [openstack-heat-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-collector-clone [openstack-ceilometer-collector]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-glance-api-clone [openstack-glance-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-cinder-scheduler-clone [openstack-cinder-scheduler]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-api-clone [openstack-nova-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-consoleauth-clone [openstack-nova-consoleauth]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-sahara-api-clone [openstack-sahara-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-cloudwatch-clone [openstack-heat-api-cloudwatch]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-sahara-engine-clone [openstack-sahara-engine]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-glance-registry-clone [openstack-glance-registry]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-gnocchi-statsd-clone [openstack-gnocchi-statsd]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-notification-clone [openstack-ceilometer-notification]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-cinder-api-clone [openstack-cinder-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-openvswitch-agent-clone [neutron-openvswitch-agent]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-novncproxy-clone [openstack-nova-novncproxy]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: delay-clone [delay]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-server-clone [neutron-server]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-central-clone [openstack-ceilometer-central]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: httpd-clone [httpd]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

Failed Actions:
* openstack-aodh-evaluator_monitor_60000 on overcloud-controller-1 'not running' (7): call=92, status=complete, exitreason='none',
    last-rc-change='Sat Jun 25 09:16:45 2016', queued=0ms, exec=0ms
* openstack-gnocchi-metricd_monitor_60000 on overcloud-controller-1 'not running' (7): call=355, status=complete, exitreason='none',
    last-rc-change='Sat Jun 25 10:00:10 2016', queued=0ms, exec=0ms
* openstack-gnocchi-statsd_start_0 on overcloud-controller-1 'not running' (7): call=313, status=complete, exitreason='none',
    last-rc-change='Sat Jun 25 09:20:51 2016', queued=0ms, exec=2101ms
* openstack-ceilometer-central_start_0 on overcloud-controller-1 'not running' (7): call=328, status=complete, exitreason='none',
    last-rc-change='Sat Jun 25 09:23:05 2016', queued=0ms, exec=2121ms
* openstack-aodh-evaluator_monitor_60000 on overcloud-controller-0 'not running' (7): call=97, status=complete, exitreason='none',
    last-rc-change='Sat Jun 25 09:16:43 2016', queued=0ms, exec=0ms
* openstack-gnocchi-metricd_monitor_60000 on overcloud-controller-0 'not running' (7): call=365, status=complete, exitreason='none',
    last-rc-change='Sat Jun 25 10:00:12 2016', queued=0ms, exec=0ms
* openstack-gnocchi-statsd_start_0 on overcloud-controller-0 'not running' (7): call=324, status=complete, exitreason='none',
    last-rc-change='Sat Jun 25 09:22:32 2016', queued=0ms, exec=2237ms
* openstack-ceilometer-central_start_0 on overcloud-controller-0 'not running' (7): call=342, status=complete, exitreason='none',
    last-rc-change='Sat Jun 25 09:23:32 2016', queued=0ms, exec=2200ms
* openstack-aodh-evaluator_monitor_60000 on overcloud-controller-2 'not running' (7): call=94, status=complete, exitreason='none',
    last-rc-change='Sat Jun 25 09:16:47 2016', queued=0ms, exec=0ms
* openstack-gnocchi-metricd_monitor_60000 on overcloud-controller-2 'not running' (7): call=353, status=complete, exitreason='none',
    last-rc-change='Sat Jun 25 10:00:08 2016', queued=0ms, exec=0ms
* openstack-gnocchi-statsd_start_0 on overcloud-controller-2 'not running' (7): call=318, status=complete, exitreason='none',
    last-rc-change='Sat Jun 25 09:22:39 2016', queued=0ms, exec=2113ms
* openstack-ceilometer-central_start_0 on overcloud-controller-2 'not running' (7): call=322, status=complete, exitreason='none',
    last-rc-change='Sat Jun 25 09:22:48 2016', queued=0ms, exec=2123ms



PCSD Status:
  overcloud-controller-0: Online
  overcloud-controller-1: Online
  overcloud-controller-2: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
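
With 127 resources the stopped ones are easy to miss in the listing above. A hedged sketch that filters them out, assuming the "Clone Set:" / "Master/Slave Set:" layout of this pcs version; stopped_clone_sets is a hypothetical helper name:

```shell
#!/bin/bash
# Sketch: read "pcs status" output on stdin and print the names of
# clone or master/slave sets that have a "Stopped:" line.
# stopped_clone_sets is a hypothetical helper name.
stopped_clone_sets() {
    awk '
        $2 == "Set:" { name = $3; next }                  # remember current set
        $1 == "Stopped:" && name != "" { print name }     # set has stopped nodes
        $1 !~ /^(Started|Stopped|Masters|Slaves):$/ { name = "" }  # left the set
    '
}
```

Usage: `pcs status | stopped_clone_sets` lists, for the output above, entries such as openstack-aodh-evaluator-clone and delay-clone.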