About this blog

This blog is mostly about my pursuits in Data Science. Previous blog entries also dealt with storage, compute, virtualization and professional services. Currently the focus is on Data Science, including Big Data, Hadoop, Business Intelligence, Data Warehouse, Data Integration and Visualization. From time to time I will blog about other things of interest. The opinions expressed in this blog are entirely my own and should not be taken as the opinion of my employer.
Category Archives: Greenplum
The Greenplum 4.1 Community Edition comes with a MapReduce demo that has two parts. Part 1 uses Perl to parse multiple Apache access_log files. Part 2 uses Python and does a word count in the …
I have recently been playing with the Greenplum 4.1 Community Edition VM available from Greenplum. EMC has an internal initiative to make all of its “demos” available as VMs, and I generally agree with this. I would say that make the …
Greenplum released Community Edition 4.1, which is a great free VM appliance you can run to get your feet wet with Greenplum and gain an understanding of what it can offer. Unfortunately, it was only released to work on VMware …