Greenplum 4.1 Community Edition VM – Getting it to work

I have recently been playing with the Greenplum 4.1 Community Edition VM available from Greenplum.  EMC has an internal initiative to ship all of its demos as VMs, and I generally agree with this.  I would add that, besides making the VMs available, EMC should also make the software itself available so people can build their own environments.  I am not sure whether that will be the case.

Even though the VM is designed to work on VMware Player or Fusion, I got it working under Parallels with a few tweaks.

The VM itself fires up fine, but you may encounter a few problems with it.  I have tried to catalog them here, along with how to get around them.  I will add to this article as I find new things.

When the system boots up, you are presented with a desktop of icons.  One of the first things you will likely do is click the Start Greenplum DB icon.

[Screenshot: the VM desktop and its icons]

The output should show that all has gone well, and at the end it directs you to fire up a browser to view the Performance Monitor User Interface:

Note: you can now use the GP monitor if you want monitor query and system performance
Connect to the GUI by opening this link in a browser (outside of the VM):
https://gp-single-host:28080/
Login using the user/pass: gpmon/password

The output directs you to connect from “outside of the VM”, that is, from the laptop hosting the VM, by browsing to https://gp-single-host:28080/.  For that to work you will need to add an entry to your laptop's hosts file with the name gp-single-host and the IP address of the VM.  You can get the IP address of the VM by opening a terminal window in the VM and running ifconfig eth0.  A good first test is to connect to the Performance Monitor UI from within the VM itself.  At this point that will fail; first you must re-run the installer:

After you run this, you should be able to connect locally from within the VM.  You will notice, however, that you still cannot connect from outside the VM.  This is due to the firewall rules that are in effect on the CentOS VM:

[root@gp-single-host gpadmin]# /sbin/iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
RH-Firewall-1-INPUT all -- anywhere anywhere

Chain FORWARD (policy ACCEPT)
target prot opt source destination
RH-Firewall-1-INPUT all -- anywhere anywhere

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

Chain RH-Firewall-1-INPUT (2 references)
target prot opt source destination
ACCEPT all -- anywhere anywhere
ACCEPT icmp -- anywhere anywhere icmp any
ACCEPT esp -- anywhere anywhere
ACCEPT ah -- anywhere anywhere
ACCEPT udp -- anywhere udp dpt:mdns
ACCEPT udp -- anywhere anywhere udp dpt:ipp
ACCEPT tcp -- anywhere anywhere tcp dpt:ipp
ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED
ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:ssh
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited

The easiest thing to do here is to change the security level so that the firewall is disabled.  If you are knowledgeable in iptables, you can instead modify the rules to suit your needs.  To change the security level, execute:

[gpadmin@gp-single-host ~]$ su
[root@gp-single-host gpadmin]# system-config-securitylevel

Set the Security Level to “disabled”; you can leave the SELinux setting at “Enforcing”.  Now your web browser should be able to connect to the Performance Monitor UI from outside the VM.
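
If you would rather keep the firewall on, a narrower alternative is to open only the Performance Monitor port.  The following is a minimal sketch, not something taken from the VM's documentation: it assumes the RH-Firewall-1-INPUT chain shown in the listing above and port 28080 from the startup message, and the run and open_port helpers are hypothetical names.  It only prints the commands unless DRY_RUN=0 is set, and it needs root to actually apply them.

```shell
#!/bin/sh
# Sketch: allow inbound TCP connections to one port instead of disabling the firewall.
# Assumes the CentOS chain name RH-Firewall-1-INPUT seen in the iptables -L output.

# Print the command when DRY_RUN is 1 (the default); execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Insert an ACCEPT rule at the top of the chain, ahead of the final REJECT rule.
open_port() {
    port=$1
    run /sbin/iptables -I RH-Firewall-1-INPUT -p tcp -m tcp --dport "$port" -m state --state NEW -j ACCEPT
}

open_port 28080
# Persist the running rules across reboots (CentOS iptables init script).
run /sbin/service iptables save
```

Running it as root with DRY_RUN=0 applies the rule; afterwards the browser on the host should be able to reach https://gp-single-host:28080/ with the firewall still enabled.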

The next issue appears when you click the “Run Queries Demo” icon on the Desktop.  You will see the following error:

Running Query demo
This demo will create and load data for 8 tables, then will run 22 queries

Press enter key to continue…
Executing command: ./
Running command "psql -d gpadmin -c 'drop database if exists gpdemo'" …

Error running command psql -d gpadmin -c 'drop database if exists gpdemo'
Output is in file /home/gpadmin/gpquery/sysout
There was an error running the command.

Press enter key to continue…

The issue is that the gpadmin database does not exist.

The script in the demo tries to run:

runCmd "psql -d gpadmin -c 'drop database if exists $PGDATABASE'"

Yet no gpadmin database is installed by default in the Greenplum CE VM, so the script fails.

We must create it:

[gpadmin@gp-single-host ~]$ psql -d template1
psql (8.2.15)
Type "help" for help.

template1=# CREATE DATABASE gpadmin;
template1=# \q

Now you can re-run the “Run Queries Demo” script and it should succeed with no errors.
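
The same fix can be scripted so that it is safe to run more than once.  This is only a sketch, assuming it runs as gpadmin inside the VM with psql on the PATH; db_exists_sql and ensure_db are hypothetical helper names, not part of the demo scripts.

```shell
#!/bin/sh
# Sketch: create a database only if it does not exist yet.

# Build the catalog query that returns one row when the database exists.
db_exists_sql() {
    printf "SELECT 1 FROM pg_database WHERE datname = '%s';" "$1"
}

# Create the named database unless pg_database already lists it.
ensure_db() {
    db=$1
    # -A -t: unaligned, tuples-only output, so the result is "1" or empty.
    if [ "$(psql -d template1 -A -t -c "$(db_exists_sql "$db")")" = "1" ]; then
        echo "database $db already exists"
    else
        psql -d template1 -c "CREATE DATABASE $db;"
    fi
}

# Only touch the database when psql is available (that is, inside the VM).
if command -v psql >/dev/null 2>&1; then
    ensure_db gpadmin
else
    echo "psql not found; run this inside the VM as gpadmin"
fi
```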

As a note, you should definitely read the documentation provided on the Desktop of the VM.  The Installation Guide and Administration Guide have a lot of useful information in them.  For example, if you want to connect to the database externally, you will need to add an entry for the connecting user to the pg_hba.conf file.  The correct pg_hba.conf file lives in ${MASTER_DATA_DIRECTORY}.  I just added a wildcard to allow all connections like so:

host     all         gpadmin      0.0.0.0/0      trust
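
For a repeatable version of this step, the sketch below appends an entry only when that exact line is not already in the file.  Note that a host entry in pg_hba.conf normally carries a client address between the user and the method; the 0.0.0.0/0 (any IPv4 address) in the usage comment is an assumption about what the wildcard looks like, and add_hba_entry is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: append a pg_hba.conf entry only if that exact line is not already present.

add_hba_entry() {
    hba_file=$1
    entry=$2
    # -F fixed string, -x match the whole line, -q no output.
    if grep -Fxq -- "$entry" "$hba_file"; then
        echo "already present: $entry"
    else
        printf '%s\n' "$entry" >> "$hba_file"
        echo "added: $entry"
    fi
}

# Usage inside the VM as gpadmin (the address field is an assumption, see above):
#   add_hba_entry "$MASTER_DATA_DIRECTORY/pg_hba.conf" "host all gpadmin 0.0.0.0/0 trust"
#   gpstop -u    # ask Greenplum to reload pg_hba.conf without a restart
```

The change only takes effect once the configuration is reloaded, which is what the gpstop -u line in the usage comment is for.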

As I run into any other caveats with the Greenplum 4.1 Community Edition VM I will update this article.
