Just an overview of vCloud Director load balancing to keep in mind.
vCNS load balancer:
NetScaler load balancer:
Just a little reminder about the configuration of vRA
Tenant creation (System administrator privilege required for this operation)
Tenant configuration (Tenant administrator privilege required for this action)
Create endpoint (Tenant administrator privilege required for this action)
Make sure that no error appears in the IaaS logs:
|Tip: If cluster storage does not appear, suspect an MSDTC issue between the SQL and IaaS servers.|
Blueprint creation (IaaS Architect privilege required for this action)
Catalog creation (Tenant administrator privilege required for this action)
Test and enjoy 🙂
Let’s continue with the VMware product SSL certificate replacement series.
I assume here that you have already prepared your PKI. If you have not done it yet, check KB2112009.
Today we’ll be talking about NSX certificates.
Installing a trusted certificate is quite easy in NSX. The process relies on four steps:
From the NSX Manager administration interface (reachable via https://NSXManager-FQDN with the admin account and “default” as the password if you did not change it):
From your MS PKI web page (https://PKI-FQDN/certsrv) :
At this step you will need to use the Root-CA certificate of your MS PKI.
Before building the SSL chain, verify that the NSX certificate has been correctly created
The new chain should look like this:
Go back to the NSX Manager administration interface :
Thank you guys for reading and as usual, feel free to comment, share and give me support 🙂
It’s been a long time since my last post.
Here I come with a new issue in vCloud Director Service Provider (8.10, but I think it is applicable to both earlier and more recent versions).
Our platform consists of vCenter 6.0U2 (external PSC, tiny model), NSX 6.2.5 and vCloud Director SP 8.10.1.
A few days ago I faced a weird issue while I was trying to import a vApp template in my organization catalog :
“[xxxxx] Folder xxxx does not exist in our inventory, but vCenter Server claims that it does.”
Meanwhile, I was able to upload media files in the same catalog.
I observed the same results after running the same tests on all our catalogs and organization.
How was the problem solved?
I first thought it was due to a vCloud inventory issue. I forced a synchronization with vCenter without any improvement. I even cleared the INV tables in the vCloud database. Same result, still unable to upload vApp or OVF in the catalogs.
VMware support pointed out the issue without even requiring the support bundles.
This was actually a vCenter issue and not a vCloud one. Our vCenter was suffering from low memory.
vCenter did what vCloud asked but took too much time to report back and update the cell. This is the explanation given by VMware. A simple reboot should solve the issue.
To confirm the RAM problem, I simply ran the command “free -m” on our vCenter appliance. The output showed that the swap partition was heavily used, almost entirely, with more than 20 GB consumed. I do not dwell on the RAM itself on purpose, because it is almost always fully consumed, around 8 GB.
In this case, swapping very likely means that the vCenter has memory leaks…
A simple reboot could have freed the memory and flushed the swap partition. However, I decided to add some more RAM and adjust the VM to the small model, because our platform also backs vRA and a lot of other components that interact with vCenter.
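As a quick illustration of the check above, here is a minimal Python sketch that parses `free -m` output and computes the swap usage; the numbers in the sample are made up for the example, not our real values:

```python
# Minimal sketch: parse `free -m` output and flag heavy swap usage.
# The sample output below is illustrative only.
sample = """\
             total       used       free     shared    buffers     cached
Mem:         16384      16100        284          0        120       2048
Swap:        25587      20470       5117
"""

def swap_usage_percent(free_output: str) -> float:
    """Return swap usage as a percentage from `free -m` output."""
    for line in free_output.splitlines():
        if line.startswith("Swap:"):
            _, total, used, _free = line.split()
            return 100.0 * int(used) / int(total)
    raise ValueError("no Swap: line found")

pct = swap_usage_percent(sample)
print(f"swap used: {pct:.0f}%")  # heavy swapping usually means memory pressure
```

A swap partition that is 80% full while RAM sits at its ceiling is exactly the pattern we observed.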
After the reboot my upload issue was solved !
Good to know !
Notice that as of version 6 of the VCSA, it is no longer required to manually adjust the RAM dedicated to the JVM. The JVM memory is dynamically adjusted.
The famous William Lam (virtuallyGhetto blog) talks about that in this post.
Disk sizing upgrade
Moreover, there is no need to manually resize all the file systems. A script checks the disks and volumes and resizes them automatically at appliance boot. If the resizing has to occur while the appliance is already running, a simple command line does the job. W. Lam explains that too here.
If you want to get more information about the VCSA partitioning you can check this KB.
Here, a reminder for the different VCSA sizing models.
I could also have named this post “Why you should never purge the content of the staging folder of your cells” 🙂
As I was troubleshooting a problem for a customer, I faced an annoying issue:
I was suddenly unable to download specific vApps and always received the following error from vCloud: “Invalid response from server”.
A very interesting and crystal clear message, isn’t it?!
My best option to figure out what was going on was to browse vCloud logs.
The vcloud-container-debug.log file gave precious information that helped me understand my problem.
Look at this :
|Resource file: descriptor.ovf(2b448314-daf6-46dc-b7f1-84bb205f35c6). Download failed. Unable to locate resource file | requestId=da57dd71-aded-47e4-8d9d-93a64a8cab95,request=GET https://CellIPAddress/transfer/2a3ee224-60a8-4f64-b99b-84afade8f3e9/descriptor.ovf|
The OVF descriptor of the vApp I wanted to download could not be found. vCloud was unable to know what to transfer to its client.
|What is an OVF descriptor?
In a nutshell, the OVF descriptor is an XML file that contains all the necessary information about an OVF package (also used with OVA): its content (the VMs that make up the vApp) and how to download it. You can find more details about the OVF format here.
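To make this concrete, here is a hedged Python sketch that parses a deliberately stripped-down descriptor (a real one carries many more sections: networks, hardware, product info) and lists the files a client would have to fetch:

```python
import xml.etree.ElementTree as ET

# A stripped-down OVF descriptor for illustration only; real descriptors
# are much richer and use namespaced (ovf:) attributes.
descriptor = """<?xml version="1.0"?>
<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1">
  <References>
    <File href="myvm-disk1.vmdk" id="file1" size="1024"/>
  </References>
  <VirtualSystem id="myvm">
    <Name>myvm</Name>
  </VirtualSystem>
</Envelope>"""

ns = {"ovf": "http://schemas.dmtf.org/ovf/envelope/1"}
root = ET.fromstring(descriptor)
# The References section tells the client which payload files to download.
files = [f.get("href") for f in root.findall("ovf:References/ovf:File", ns)]
print(files)  # ['myvm-disk1.vmdk']
```

Without this file, vCloud has no way to know which vmdk files make up the vApp, which is exactly the failure described above.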
I can guess a question rising in your head: why is vCloud unable to find the files, since we are simply trying to download an existing vApp template? vCloud should already know the different templates it stores in its catalogs! You’re right, but …
Getting more and more confused, I started thinking about the last events and I remembered that some days before, I had manually purged the staging folder !
All the files were quite old (more than 2 weeks) and were supposed to have been automatically removed. I thought – and I was wrong – that there was an issue with the cell.
What a big mistake!! By manually deleting the content of the staging folder (/opt/vmware/vcloud-director/data/transfer), I accidentally broke the link between vCloud and its vApp download sessions.
At that point, I realized that I was missing something in my understanding of the download process and decided to delve into this particular topic.
This post will expose what I knew and learnt, and also how the VMware support team helped me solve the problem. It is organized as follows:
When one downloads a vApp template, two main steps take place:
1 – After clicking on “Download…” vCloud enables the vApp template for download.
Actually, enabling a vApp for download is not only changing a property from “False” to “True”, it is also copying the vApp content into the staging folder. That is why, if you pay attention to the operation, you may notice that it can be very long (depending on the size of the vApp).
2 – Once the enablement action is completed, the download from the client starts.
If you look at the picture below you will see that :
1. The vApp is being enabled for download in vCloud Director
2. At the same time, vCenter is exporting the OVF template (the target folder is the staging folder!).
3. Nothing is happening in the browser transfer window. This is totally normal: the transfer will start only once the OVF export has completed.
But behind this, several things are achieved: checks, DB updates, etc.
Let me give you some details about what really runs in background when a cell manages a download request.
I made the diagram below according to my understanding of the process. So feel free to tell me if something’s wrong.
Depending on when you cancel a download – during or after the enablement of the vApp – different expected results are observed.
I drew another diagram to present them.
|*About the transfer session timeout value:
This value defines the period during which any interrupted transfer session can be resumed (without re-enabling the vApp).
Once the limit is reached, the data is deleted from the staging folder.
One can consider this value as the link between vCloud and a vApp download session (the famous link I should not have broken).
You can find this value in the system settings of vCloud Director :
Under normal conditions, an automatic cleanup of the DB should have been performed, but in my case, VMware support pointed out a time sync issue between the cells and the DB server. The cells were running 3 minutes behind the DB server, so the “CleanTransferSession” trigger conditions could never be met.
VMware and I decided to first clean the DB and only afterwards, on my side, solve the time sync* issue.
|To see the planned “CleanTransferSession” triggers, run the query :
select * from QRTZ_TRIGGERS where TRIGGER_NAME='GLOBAL_com.vmware.vcloud.transfer.server_cleanTransferSessionsTrigger'
To read the time value, use an epoch converter website such as https://www.epochconverter.com/
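If you prefer staying in a terminal, the conversion is trivial in Python, assuming the trigger time columns hold Quartz’s usual milliseconds-since-epoch values (the sample value below is illustrative):

```python
from datetime import datetime, timezone

def quartz_time_to_utc(ms: int) -> datetime:
    """Quartz stores trigger times as milliseconds since the Unix epoch."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

# Illustrative NEXT_FIRE_TIME value from the QRTZ_TRIGGERS table.
print(quartz_time_to_utc(1496275200000))  # 2017-06-01 00:00:00+00:00
```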
This manual DB cleanup relies on clearing the records related to the download tasks, plus – I think this part is optional – the usual queries to clear the QUARTZ and INV tables.
* This post will not show how to solve the time sync issue (ntpdate service misconfiguration here) but you can have a look at time keeping KBs :
For linux, timekeeping best practices are available here.
For Windows, the same there.
Clearing vApp download tasks
Of course, at this step, run a backup of the DB before proceeding with any record deletion!
Also notice that the procedure below is not supported by VMware and must not be followed without their support !
You first have to identify the records to delete. To do so, launch the query:
select * from transfer_session
You will get something like this :
Then, for each transfer session (vApp download task), you must delete the relevant files.
To confirm the content of a specific vApp:
select * from dbo.resource_file where spool_dir = '/opt/vmware/vcloud-director/data/transfer/2a3ee224-60a8-4f64-b99b-84afade8f3e9'
Then delete the corresponding session:
delete from transfer_session where transfer_session_id = 0x2A3EE22460A84F64B99B84AFADE8F3E9
delete from dbo.resource_file and delete from transfer_session can of course be run too, to clear everything at once, but must be used with caution
You will have understood that dbo.resource_file.spool_dir = transfer_session.base_dir
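You can verify this mapping with a few lines of Python: the UUID at the end of spool_dir and the hex transfer_session_id are the same bytes (the path below is the one from the queries above):

```python
import uuid

# The spool_dir path ends with the session UUID, while transfer_session_id
# stores the same UUID as raw hex bytes; this sketch shows the mapping.
spool_dir = "/opt/vmware/vcloud-director/data/transfer/2a3ee224-60a8-4f64-b99b-84afade8f3e9"

session_uuid = uuid.UUID(spool_dir.rsplit("/", 1)[-1])
transfer_session_id = "0x" + session_uuid.hex.upper()
print(transfer_session_id)  # 0x2A3EE22460A84F64B99B84AFADE8F3E9
```

Handy when you have a list of leftover staging folders and need to find their matching DB records.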
Once all transfer sessions have been cleared, we can reset QUARTZ and INV tables.
|As info or reminder, QUARTZ tables store information about vCloud processes and tasks and INV tables store data about vCenter inventory.
Among several reasons, we may have to clear them when vCloud object statuses are not synced with their real statuses in vSphere.
Clearing QUARTZ and INV tables
Before going further, we have to stop the cells accordingly. I deal with stopping vCloud in this post, under the “Implement vCloud certificates” section.
|DELETE FROM QRTZ_SCHEDULER_STATE;
DELETE FROM QRTZ_FIRED_TRIGGERS;
DELETE FROM QRTZ_PAUSED_TRIGGER_GRPS;
DELETE FROM QRTZ_CALENDARS;
DELETE FROM QRTZ_TRIGGER_LISTENERS;
DELETE FROM QRTZ_BLOB_TRIGGERS;
DELETE FROM QRTZ_CRON_TRIGGERS;
DELETE FROM QRTZ_SIMPLE_TRIGGERS;
DELETE FROM QRTZ_TRIGGERS;
DELETE FROM QRTZ_JOB_LISTENERS;
DELETE FROM QRTZ_JOB_DETAILS;
|DELETE FROM compute_resource_inv;
DELETE FROM custom_field_manager_inv;
DELETE FROM cluster_compute_resource_inv;
DELETE FROM datacenter_inv;
DELETE FROM datacenter_network_inv;
DELETE FROM datastore_inv;
DELETE FROM datastore_profile_inv;
DELETE FROM dv_portgroup_inv;
DELETE FROM dv_switch_inv;
DELETE FROM folder_inv;
DELETE FROM managed_server_inv;
DELETE FROM managed_server_datastore_inv;
DELETE FROM managed_server_network_inv;
DELETE FROM network_inv;
DELETE FROM resource_pool_inv;
DELETE FROM storage_pod_inv;
DELETE FROM storage_profile_inv;
DELETE FROM task_inv;
DELETE FROM vm_inv;
DELETE FROM property_map;
What we can keep in mind:
1 – Make sure that every component of your environment is time synced.
2 – Do not delete any file in the staging folder if the session transfer time-out value has not been reached.
3 – In case of remaining folders, double check the transfer session timeout value and the transfer_session table of the vCloud database before deleting anything in the staging folder.
At the time I am writing this post, I am observing something weird: for some cancelled downloads, and although there is no transfer session record anymore, transfer session folders still exist in the staging folder. For each of them, a partial vmdk file (20 MB) is visible and an error is logged in the vCloud logs. I have opened a case with VMware for this. Meanwhile, I am going to check the vCenter logs too.
I’ll let you know asap.
I did not track it as I did for a vApp yet, but I assume that the mechanism I described here is the same for any other catalog item download. I will update this later.
I will try to make another post for the upload process soon.
Fresh update regarding the remaining files after a download cancellation: VMware support confirmed a bug and will try to fix it in version 9 of vCloud, expected by July 2017.
This post describes how to replace the machine certificate for vCenter installed in External mode.
Replacing the machine certificate only allows vCenter websites to be trusted by user browsers.
To complete this operation you will need to make sure that your PKI has a VMware certificate template (check KB2112009) and that the root certificate of your PKI is installed on the computer you will run the checks from.
We’ll follow the steps below :
Replacing PSC machine certificate
From your PSC node :
Go to the /usr/lib/vmware-vmca/bin folder
As usual, be cautious and take a snapshot of your PSC and vCenter nodes and of course, run a backup of your vCenter DB (if external).
Once all has been checked and done, let’s go through the replacement steps.
Start VMware Certificate Management tool
Select option 1 “Replace Machine SSL certificate…”
Provide the SSO administrator account and its password.
If, like me, you did not install your PSC node with the default SSO domain, don’t forget to modify the default account.
Then, select option 1 to generate the certificate signing request.
Provide the folder where the CSR and the related keys will be created.
Answer all questions
Notice that Name and Hostname are the FQDN of the PSC node.
VMCA Name is the last value to provide. Afterwards, the script will execute certool commands to create the private key and the CSR of the PSC node.
You can leave the wizard at this step and provide the CSR to the security team so they can issue the certificate.
Once received, you can check different properties such as:
The common name must be the FQDN of your PSC
It must display the FQDN and the IP address at least (depending on the information you provided when you prepared the CSR)
It must match the key usage of the vSphere certificate template (remember the KB, “Non-repudiation”)
Once you’ve copied the certificate in a folder of your PSC, restart the certificate management tool and select option 1 to proceed with the replacement.
Then select option 2 to import the certificate.
Provide the full path of the certificate, its key and of the root certificate (of the Certificate Authority).
Accept to continue the operation
The replacement process is starting and will check all PSC services
9 services have been updated. PSC services are restarted.
Next action : Restart the services of each vCenter node(s).
Before going further, I usually check the certificate update and the SSO service by going to the PSC web portal and trying a connection on the certificate management portal of the PSC: https://FQDN/psc.
No security warning displayed and SSO service running as expected
Now, as required, restart the vCenter node services with service-control --stop --all and service-control --start --all
Of course, validate that everything is running correctly by opening a session on the vSphere web client.
We can now proceed with the replacement of the self-signed certificate of the vCenter.
From your vCenter node :
As we did for the PSC, let’s go to /usr/lib/vmware-vmca/bin folder
One more time, prevention is better than cure: take a snapshot of your PSC and vCenter nodes and, of course, run a new backup of your vCenter DB (if external).
Once done, let’s start the Certificate Manager tool.
Select the first option to replace the machine SSL certificate
Provide the SSO administrator account and password
Select option 1 to generate the CSR
If you have an external PSC, its IP address will be required
Give the certificate details
Remember that VMCA Name is the last property you will set, so do not validate it too fast (because the script will run right after, without any confirmation)
The script has just created the CSR and the private key.
Leave the tool (actually, it’s up to you 🙂 )
Once the signed certificate has been received, have a quick look at some of its properties.
When you have copied the certificate on your vCenter node, restart the certificate manager tool.
Again, select the first option
Like the last time, provide the SSO administrator account credentials
Select option 2 to import the signed certificate
Give the path of the certificate, its private key and the root certificate of the Certificate Authority
Confirm to complete the replacement
The message indicates that 21 services have been updated and the services of the vCenter have been rebooted.
To check that vCenter functions properly, just open a vSphere web client session.
As you can see, the certificate of the vCenter is now trusted.
As you have just seen, replacing SSL machine certificates is pretty easy and can be done quickly.
In a previous lab I also changed the solution user certificates directly with the certool commands, but the problems I experienced afterwards with NSX (I had to cheat and apply the old fingerprint to the new PSC certificate to reconnect to the lookup services; VMware referenced this issue here) and vROps (I lost the connection to vCenter) led me to give up the idea of changing all vCenter certificates. Too painful for administration teams.
Depending on the customer security policies you might have to replace them all, then be cautious and don’t forget to double check the connection of your other components to vCenter nodes.
As I hate losing, I will try to update you soon with a new post for the “FULL” replacement 🙂
vShield is a very nice piece of software but one can admit that, sometimes, it can be really painful to modify some item’s configuration.
In this post, we will see how to modify the MTU of a dvSwitch that is being used in a Network prepared cluster.
The procedure consists of three steps :
Several tools exist for this but when I first worked on this topic, I used the Firefox Rest client. You can find it here.
Once downloaded, we have to deploy it.
More than easy :
The plugin is now available :
If you get the message below after selecting the XPI file :
– Edit the Mozilla configuration file and replace the line « lockPref(“xpinstall.enabled”, false) » with « lockPref(“xpinstall.enabled”, true) ».
– Save the file.
– Replay the Add-on installation step.
Load your RestClient
You should get this page :
Now, configure the header of your requests by setting an authentication profile :
In my case, I chose the vShield local admin account
You can now see the authentication information in the header panel
Once done, you can run GET requests.
Let’s retrieve the ID of the dvSwitch we want to reconfigure.
|As we will need to send an XML request right after, I highly suggest copying the content of the vdsContext tag.
Doing so, you should get :
To reconfigure a setting we have to use another REST method, PUT*.
To do so, we first have to add another piece of information to the request header: the format we’ll use to send the data.
|* In REST, several methods are supported depending on the URI you are working with. The main four are : GET, PUT, POST, DELETE
GET retrieves information
PUT modifies an existing entry
POST adds a new entry
DELETE removes an existing entry.
The headers should now have changed
The header is now ready, let’s prepare the update request.
In the REQUEST panel :
|Your XML should look like this :
Press “SEND” to execute the query and verify the status code in the “Response Headers” tab; it must be “200 OK”.
|Another way to check the new MTU :
Via the CLI, on each ESXi host attached to the dvSwitch: esxcli network nic list
Check the MTU of the NIC attached to the dvUplink participating in VXLAN.
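For the PUT body itself, the idea is simply to resend the vdsContext you copied with the mtu value changed. A minimal Python sketch of that edit; the element layout below is illustrative, so adapt the tag names to your own GET output:

```python
import xml.etree.ElementTree as ET

# Illustrative vdsContext payload; the exact elements returned by your
# vShield GET request may differ, adapt them to what you copied.
vds_context = """<vdsContext>
  <switch>
    <objectId>dvs-21</objectId>
    <name>dvSwitch-VXLAN</name>
  </switch>
  <mtu>1500</mtu>
  <teaming>FAILOVER_ORDER</teaming>
</vdsContext>"""

root = ET.fromstring(vds_context)
root.find("mtu").text = "1600"            # bump MTU for VXLAN encapsulation
body = ET.tostring(root, encoding="unicode")  # this becomes the PUT body
print("<mtu>1600</mtu>" in body)
```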
If you are wondering if the procedure described is reusable with NSX, the answer is NO.
The PUT request does not work; actually, it is not even supported.
BUT, you can easily change the MTU thanks to your vSphere Web Client.
From the Network view, just edit the settings of your dvSwitch and change your MTU.
As I was deploying my first NSX controller cluster, I faced a really annoying issue: the second node deployment was stuck in the “Deploying” status.
I had already tried, without any improvement, to:
Unable to cancel, remove or kill (or whatever you want 🙂) the task, I decided to browse the NSX API guide.
This post will show you the different steps I followed to remove the task from the task queue and redeploy my second node.
|My lab consisted of :
vSphere 5.1 => vCloud director 5.1 => vApp => Nested ESXi 6.0.u1 => vcsa 6.0.u1 (external install) + NSX manager 6.1.4 + NSX controller
After installing the Firefox REST Client (this post talks about the installation of the Firefox RESTClient):
I was able to confirm the status of the second node
I was then able to redeploy my controller 🙂
|PS : Notice that if you have to delete the last controller, you will have to force the removal by using the command https://NSXManagerIP/api/2.0/vdn/controller/controller-ID?forceRemoval=True.|
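For reference, a small Python sketch of that removal call; the manager address, credentials and controller ID are placeholders you must adapt to your own environment:

```python
# Rough sketch only; NSXManagerIP, admin/password and controller-2 are
# placeholders, not values from a real setup.
def controller_delete_url(manager: str, controller_id: str, force: bool = False) -> str:
    """Build the NSX API URL used to delete a controller node."""
    url = f"https://{manager}/api/2.0/vdn/controller/{controller_id}"
    return (url + "?forceRemoval=True") if force else url

print(controller_delete_url("NSXManagerIP", "controller-2", force=True))

# To actually send it (untested sketch, requires the `requests` package):
# import requests
# requests.delete(controller_delete_url("NSXManagerIP", "controller-2", force=True),
#                 auth=("admin", "password"), verify=False)
```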