Tuesday 20 March 2012

Unable to run Instance, Ends with status Error in OpenStack

Hello all,
I came across this problem when I started testing the OpenStack Diablo release. Let me explain what usually happens when you begin working with OpenStack.
I often reinstall OpenStack for development and testing purposes (i.e. by re-running ./stack.sh when using devstack). Everything goes smoothly and the installation reports success. When you open the Dashboard (http://cloud server), nothing looks wrong, but when you launch an instance it ends with the "error" status, or sometimes stays in the "build" status forever.

Reason:

This happens only when you launched instances during the previous installation and then closed, restarted, or shut down your cloud server without terminating the running instances.
When you launch an instance (in the case of KVM), a new domain is created by defining an XML file that holds the instance configuration (hardware settings such as CPU, RAM, disk, etc.), and the domain is then started.
If you restart or shut down the cloud server, or reinstall OpenStack, without terminating the instances, the status of each running instance changes to "shut off" and its definition remains in the hypervisor (i.e. the XML file is not destroyed). So when you try to launch a new instance from the reinstalled OpenStack, everything appears to go well but the instance cannot be launched. This is because a fresh installation sends the create request with the name "instance-00000001" (i.e. numbering always begins at 1 and increments with each successive attempt). The request fails because an instance with the same name already exists in the hypervisor (KVM). If you check the "n-cpu" log you will find:
-----------------------------------------------------------------------------------------------
Error: operation failed: domain 'instance-00000001' already exists with uuid xxxxxxxx-xxxx-yyyy-cccc-xxxxxxxxxxx
-------------------------------------------------------------------------------------------
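
To make the name clash concrete, here is a minimal shell sketch. It assumes Nova's default instance name template of instance-%08x (a zero-padded hexadecimal ID) and that the output of "virsh list --all" is piped in. The function names instance_name and name_is_defined are hypothetical helpers for illustration, not part of OpenStack or libvirt.

```shell
#!/bin/sh
# Build the libvirt domain name for a given numeric instance ID.
# Assumption: the default name template 'instance-%08x' is in use.
instance_name() {
    printf 'instance-%08x\n' "$1"
}

# Read `virsh list --all` output on stdin and succeed (exit 0) if a
# domain named $1 is already defined. The name is the second column.
name_is_defined() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Usage (requires libvirt on the host):
#   virsh list --all | name_is_defined "$(instance_name 1)" && echo "name clash"
```

If the check succeeds for ID 1 right after a fresh installation, the first launch will hit the "already exists" error shown above.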

Solutions
There are two solutions to this problem.

1. The simplest one is to keep launching instances until the number of attempts exceeds the total number of instances launched during the previous installation. Each attempt increments the instance number, so later attempts eventually use names that no longer clash. This is not a complete solution. I found this workaround during my initial experience with OpenStack and then explored further to find the solution that comes next.

2. Run this command:

----------------------
$ virsh list --all
---------------------


You will get something like the following:
------------------------------------------------------------------
 Id    Name                 State
----------------------------------
 -     instance-00000001    shut off
 -     instance-00000002    shut off
 -     instance-00000003    shut off

-------------------------------------------------------------------------------

These are the definitions that remain in the hypervisor for the previously launched instances, so you have to undefine each of them with this command:
-------------------------------------------------------------

$ virsh undefine instance-00000001

Domain instance-00000001 has been undefined
-------------------------------------------------------------

This has to be done for each instance in the list.
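
Undefining the domains one by one gets tedious when many are left over. The following is a minimal sketch of a loop, assuming every stale domain is named instance-* and is reported as "shut off" by "virsh list --all". The names stale_domains and undefine_stale_domains are hypothetical helpers, not virsh features.

```shell
#!/bin/sh
# Read `virsh list --all` output on stdin and print the names of domains
# that are named instance-* and are in the "shut off" state.
# Columns are Id, Name, State; the state "shut off" spans two fields.
stale_domains() {
    awk '$2 ~ /^instance-/ && $3 == "shut" && $4 == "off" { print $2 }'
}

# Undefine every stale domain reported by the hypervisor.
# Running domains are skipped because their state is not "shut off".
undefine_stale_domains() {
    virsh list --all | stale_domains | while read -r name; do
        virsh undefine "$name"
    done
}
```

To preview what would be removed, pipe "virsh list --all" into stale_domains first; then call undefine_stale_domains as a user with access to libvirt.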

Now you are good to go. The problem is solved.





Note: since these instances are not running, you cannot destroy them using
----------------------------------------------
$ virsh destroy instance-00000001

----------------------------------------------
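
The difference between the two commands can be sketched as a small helper that picks the cleanup steps from a domain's state. This is only an illustration: removal_commands is a hypothetical function, the state string is assumed to come from "virsh domstate <name>", and only the states listed in the code are handled.

```shell
#!/bin/sh
# Print the virsh commands needed to remove a domain, given its name ($1)
# and its state ($2) as reported by `virsh domstate <name>`.
removal_commands() {
    name=$1
    state=$2
    case "$state" in
        "shut off")
            # Not running: only the definition has to be removed.
            echo "virsh undefine $name" ;;
        running|paused)
            # Active domain: stop it first, then remove the definition.
            echo "virsh destroy $name"
            echo "virsh undefine $name" ;;
        *)
            echo "unknown state: $state" >&2
            return 1 ;;
    esac
}

# Usage (requires libvirt on the host):
#   removal_commands instance-00000001 "$(virsh domstate instance-00000001)"
```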

You can find more details about the "virsh" command in the following links:

Ref: