
Monday, February 3, 2014

Noob’s install guide to Apache CloudStack 4.2 management server with XenServer 6.2 using Xen-In-Xen

Apache CloudStack is one of the many Cloud Computing “stacks” (others include OpenStack and Eucalyptus) available to end users. It provides a computing framework for creating a Private Cloud (or Public Cloud), provisioning and managing massive numbers of Virtual Machines (VMs) in the manner of an Infrastructure-as-a-Service (IaaS) provider. On top of that, it works with many types of VMM (Virtual Machine Monitor), aka hypervisor; for example KVM, XenServer, and VMware.
Just some historical trivia: Apache CloudStack 4.2 is the open source cousin of Citrix CloudPlatform. Both stem from the same code base, but there are some differences, especially for Cloud Computing at the enterprise level. One particular specimen is the CloudPortal running on CloudPlatform, which acts as a billing gateway and a utility and resource management dashboard. End users who can live without commercial technical support will be happy with the open source variant.
My earlier post gives a generic overview of CloudStack 4.2 with XenServer 6.2 using Xen-In-Xen. Xen-In-Xen is a unique feature of XenServer 6.2 where a hypervisor exists inside the hypervisor that runs on the physical hardware. One important point to note: the CloudStack 4.2 management server (CSMS) cannot co-exist with the resource pool inside the same Xen-In-Xen hypervisor. In other words, the CSMS should live in a Xen-In-Xen hypervisor separate from the one hosting the VMs.
This install guide assumes the following:
  1. CloudStack 4.2 management server on CentOS
  2. XenServer 6.2 Xen-In-Xen; alternatively, a standalone CloudStack 4.2 management server with a separate XenServer 6.2 host works too.
Several resource constraints need to be addressed for the CSMS prior to setting up:
  1. NFS is the default file system the CSMS uses to keep the images and snapshots of the VMs; alternatives such as AWS S3, Cisco UCS, and SolidFire are supported too. In the default case, where the NFS server co-exists with the CSMS, the HDD capacity of the CSMS needs to be large enough to hold the variety of VM images.
  2. The MySQL database also co-exists with the CSMS in the default case, so its HDD usage needs to be taken into consideration as well.
  3. If more than one VLAN is used for the CSMS and the VMs, the switch and router need to be configured to allow routing between the VLANs.
  4. The CSMS uses ACLs, so the IP addresses used with the CSMS need to be pre-determined.
  5. Hostnames need to be configured if used.
  6. The default file editor is vi; otherwise install a tool such as nano.
Step1: Install CentOS VM on Xen-In-Xen
WARNING: failure to provision sufficient HDD capacity will leave the CSMS non-functional; procedures to mitigate a lack of HDD capacity post-installation can be found here.
Step2: Perform preliminary configuration on the newly created CentOS VM
This step performs the preliminary configuration for the CSMS on CentOS; besides SSH, it is possible to use XenCenter to do the same.
#update
yum update
#install dns utils
yum install bind-utils
#disable media in cd
yum --disablerepo=c6-media check-update

#install ntp
yum install ntp

#add the CloudStack repository
touch /etc/yum.repos.d/cloudstack.repo
#edit /etc/yum.repos.d/cloudstack.repo to contain the following:
[cloudstack]
name=cloudstack
baseurl=http://cloudstack.apt-get.eu/rhel/4.2/
enabled=1
gpgcheck=0
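The touch-and-edit above can also be done in one step with a heredoc. A minimal sketch: it writes the stanza to the current directory as a dry run, so copy the resulting file to /etc/yum.repos.d/ to activate it.

```shell
# Write the repo stanza from above in one step; this writes locally as a
# dry run -- copy the file to /etc/yum.repos.d/cloudstack.repo to activate it.
cat > cloudstack.repo <<'EOF'
[cloudstack]
name=cloudstack
baseurl=http://cloudstack.apt-get.eu/rhel/4.2/
enabled=1
gpgcheck=0
EOF
```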

Step3: install & config MySQL
# install mysql server
yum install mysql-server

#edit mysql
nano /etc/my.cnf

[mysqld] 
innodb_rollback_on_timeout=1
innodb_lock_wait_timeout=600
max_connections=350
log-bin=mysql-bin
binlog-format = 'ROW'

#start mysql daemon
service mysqld start

#stop mysql daemon
service mysqld stop

#secure mysql install on centos
mysql_secure_installation
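Once mysqld is back up, it is worth confirming the my.cnf settings actually took effect; this simply queries the running server (the password is whatever was set during mysql_secure_installation).

```shell
# Check that the tuned values are live on the running server;
# expect max_connections = 350 and binlog_format = ROW if my.cnf was loaded.
service mysqld start
mysql -u root -p -e "SHOW VARIABLES LIKE 'max_connections'; SHOW VARIABLES LIKE 'binlog_format';"
```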

#check selinux
rpm -qa | grep selinux

#set permissive selinux
nano /etc/selinux/config
SELINUX=permissive
#start permissive selinux without reboot
setenforce permissive
Step4: Install & config NFS
#install the nfs utils
sudo yum install nfs-utils

#create folders for primary and secondary storage, following CloudStack terminology
mkdir -p /export/primary
mkdir -p /export/secondary

#edit /etc/exports to use the new dirs as NFS exports
nano /etc/exports
/export  *(rw,async,no_root_squash,no_subtree_check)

#export the dirs for primary and secondary storage
exportfs -a

#edit the nfs config for the ports to be used by NFS
nano /etc/sysconfig/nfs
#uncomment the following lines
LOCKD_TCPPORT=32803
LOCKD_UDPPORT=32769
MOUNTD_PORT=892
RQUOTAD_PORT=875
STATD_PORT=662
STATD_OUTGOING_PORT=2020
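Uncommenting six lines by hand is error-prone; a single sed can cover them all. The sketch below works on a local demo copy by default (seeded with sample lines so it is self-contained); point NFS_CONF at /etc/sysconfig/nfs to apply it for real.

```shell
# Demo copy: seed the commented port lines so the sketch is self-contained.
# Set NFS_CONF=/etc/sysconfig/nfs to edit the real file instead.
NFS_CONF=${NFS_CONF:-./nfs.sysconfig}
[ -f "$NFS_CONF" ] || printf '%s\n' \
  '#LOCKD_TCPPORT=32803' '#LOCKD_UDPPORT=32769' '#MOUNTD_PORT=892' \
  '#RQUOTAD_PORT=875' '#STATD_PORT=662' '#STATD_OUTGOING_PORT=2020' \
  > "$NFS_CONF"

# Strip the leading '#' from exactly those six variables.
sed -i -E 's/^#(LOCKD_TCPPORT|LOCKD_UDPPORT|MOUNTD_PORT|RQUOTAD_PORT|STATD_PORT|STATD_OUTGOING_PORT)=/\1=/' "$NFS_CONF"
```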

Step5: Setup Firewall Rules for NFS
#edit acl in iptables
nano /etc/sysconfig/iptables

#add rules for NFS at the beginning of the INPUT chain.
#the network uses 172.16.89.0/24, the DHCP range of the Xen-In-Xen network
-A INPUT -s 172.16.89.0/24 -m state --state NEW -p udp --dport 111 -j ACCEPT
-A INPUT -s 172.16.89.0/24 -m state --state NEW -p tcp --dport 111 -j ACCEPT
-A INPUT -s 172.16.89.0/24 -m state --state NEW -p tcp --dport 2049 -j ACCEPT
-A INPUT -s 172.16.89.0/24 -m state --state NEW -p tcp --dport 32803 -j ACCEPT
-A INPUT -s 172.16.89.0/24 -m state --state NEW -p udp --dport 32769 -j ACCEPT
-A INPUT -s 172.16.89.0/24 -m state --state NEW -p tcp --dport 892 -j ACCEPT
-A INPUT -s 172.16.89.0/24 -m state --state NEW -p udp --dport 892 -j ACCEPT
-A INPUT -s 172.16.89.0/24 -m state --state NEW -p tcp --dport 875 -j ACCEPT
-A INPUT -s 172.16.89.0/24 -m state --state NEW -p udp --dport 875 -j ACCEPT
-A INPUT -s 172.16.89.0/24 -m state --state NEW -p tcp --dport 662 -j ACCEPT
-A INPUT -s 172.16.89.0/24 -m state --state NEW -p udp --dport 662 -j ACCEPT   
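The eleven ACCEPT rules above differ only in protocol and port, so they can be generated rather than typed. A sketch (NET matches the subnet used above); paste the output into /etc/sysconfig/iptables:

```shell
# Generate the NFS ACCEPT rules for the subnet above instead of typing
# eleven near-identical lines by hand.
NET=172.16.89.0/24
for rule in udp:111 tcp:111 tcp:2049 tcp:32803 udp:32769 \
            tcp:892 udp:892 tcp:875 udp:875 tcp:662 udp:662; do
  proto=${rule%%:*}   # part before the colon
  port=${rule##*:}    # part after the colon
  echo "-A INPUT -s $NET -m state --state NEW -p $proto --dport $port -j ACCEPT"
done > nfs-iptables.rules
cat nfs-iptables.rules
```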

#restart iptables to activate the rules, then persist them
service iptables restart
service iptables save
Step6: Install & config the CloudStack 4.2 management server
#install cloudstack management
yum install cloudstack-management

#download vhd-util for XenServer only
#loc is /root/downloadFolder/vhd-util
curl -o vhd-util "http://download.cloud.com.s3.amazonaws.com/tools/vhd-util"
#copy to
cp vhd-util /usr/share/cloudstack-common/scripts/vm/hypervisor/xenserver

#set up the cloud database user (cloud:password), deploying as MySQL root (root:password)
cloudstack-setup-databases cloud:password@localhost --deploy-as=root:password 

#check the db config
/etc/cloudstack/management/db.properties

#start the mgt server
cloudstack-setup-management

#stop the mgt server, only if required:
#service cloudstack-management stop

#start mgt server agent
service cloudstack-agent start
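Before moving on, it is worth confirming the management server actually came up; the log path and UI port below are the CloudStack 4.x package defaults, so adjust them if your installation differs.

```shell
# Service status plus the listening UI port (8080 by default).
service cloudstack-management status
netstat -tlnp | grep ':8080'
# Watch the startup log if the UI does not answer yet.
tail -n 50 /var/log/cloudstack/management/management-server.log
```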
Step7: Check that CloudStack is up via the login portal
Log in with the configured username and password, but do not provision any VMs yet.
Step8: Config & mount the NFS shares on a separate XenServer (Xen-In-Xen) host
#make a dir to mount the NFS shares on the client, e.g. the hypervisor host
mkdir /primarymount
mount -t nfs 172.16.89.211:/export/primary /primarymount

#un-mount only if needed
umount /primarymount

#check the portmapper for the NFS ports, e.g. 662
/usr/sbin/rpcinfo -p

#restart nfs on NFS server only if needed
service nfs restart

#restart portmapper only if needed
service portmap restart

#restart nfs on NFS server(mgtserver) for client(hypervisor)
##when no mount shown from hypervisor
service nfs restart

#on the client hypervisor, make sure rpcbind is running before checking mounts
/etc/init.d/rpcbind start

#on the client (e.g. the hypervisor), check the RPC services on the server
rpcinfo -p 172.16.89.211


[Screenshots: NFS mounts as seen locally and on a separate XenServer, e.g. CTXXX02. Note the difference in HDD capacity between the two.]
Step9: Download and prepare the system VM template
##check that mysqld and cloudstack-management are started at every reboot.
#prepare the system VM template
#note: this path differs from the original docs
#the template script is stored here: /usr/share/cloudstack-common/scripts/storage/secondary/

#/export/secondary is the NFS share mount point on the client
#this step downloads about 5 GB and takes around 30 minutes

/usr/share/cloudstack-common/scripts/storage/secondary/cloud-install-sys-tmplt -m /export/secondary -u http://d21ifhcun6b1t2.cloudfront.net/templates/4.2/systemvmtemplate-2013-07-12-master-xen.vhd.bz2 -h xenserver -F
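Once cloud-install-sys-tmplt finishes, the template should be visible under the secondary storage mount. A rough check; the numeric subdirectories are CloudStack-internal IDs, so the exact path may differ on your system:

```shell
# Rough check that the system VM template landed on secondary storage.
ls -lR /export/secondary/template | head
df -h /export/secondary   # the download consumes roughly 5 GB
```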



Congratulations! CloudStack 4.2 with XenServer 6.2 is now installed and ready for subsequent configuration to provision VMs.



 

Sunday, February 2, 2014

cheat’s Q&D Hadoop 0.23.6 install guide

Hadoop is one of the most popular open source “Cloud Computing” platforms, used to crunch massive amounts of data on generic hardware (computer hardware that is non-proprietary and does not have to be identical). It is not exactly “Cloud Computing” per se; it is a computing architecture meant for processing massively large amounts of data in parallel. Taxonomically, Parallel Computing (the predecessor to cloud computing) would be the closer terminology. Hadoop comes with several features, most notably HDFS (the Hadoop Distributed File System) and MapReduce. I will attempt to describe each in a one-liner. HDFS: an open source cousin of GFS (the Google File System) that provides a framework for managing data redundancy and, most importantly, makes scaling as simple as adding more generic hardware. MapReduce: a programming model for processing very large amounts of data that leverages the classic divide-and-conquer approach through a Map stage followed by a Reduce stage; on top of that, it performs sorting intrinsically via the programming model. Oh wait… I busted my one-liner quota for MapReduce.
Back in late 2012 I followed the textbook examples and played with Hadoop 0.20.0. Setting up and installing it was a breeze, thanks to the many user guides and tutorials made available by the community. In early 2013, Hadoop 0.23.6 came along and I assumed the installation would be identical to the earlier version, but I was wrong. As a matter of fact, I resorted to the nonstandard approach of using the tree command to find where the necessary configuration files had moved. If the version documentation had rocked at the time, it would have saved me some of my hair.

Hadoop 0.23.6 is an interesting release. In this version, several major changes/overhauls were made. Most notably, the org.apache.hadoop.mapred API is deprecated and superseded by org.apache.hadoop.mapreduce, aka MRv2. Resource management of a Hadoop cluster was relegated to a dedicated service named YARN. Several advanced data structures meant for programming MapReduce were added; some were deprecated (I will go into the implementation details in future posts).
For a complete genealogy of Hadoop versions, check this out.
This install guide assumes:
  1. Ubuntu Server 11.x on a VM; I used 40GB for a start but ran out very quickly.
  2. Hadoop 0.23.6 in releases
  3. Java 6 openJDK
  4. Hadoop cluster lives as a single node
Several things to take note of prior to running Hadoop: locate the directory of configuration files; differentiate between the datanode and the namenode; dedicate a “Hadoop user”; set the necessary file permissions on the directories; note that HDFS is not your regular file system and requires separate software to access; Hadoop starts with daemons.

Step1: Download Hadoop and extract it to a directory.
The name of the directory holding the extracted files shall be used in all of the following config. E.g. I created the folder “/usr/local/hadoop” and extracted the files into it.
Step2: Locate the configuration templates, and directory to place the configurations
#template
/usr/local/hadoop/share/hadoop/common/templates/conf
#path to place configuration files
/usr/local/hadoop/etc/hadoop
Step3: create directories for temporary files, logs, the namenode, and the datanode
/usr/local/hadoop/data/hdfs/datanode
/usr/local/hadoop/data/hdfs/namenode
/usr/local/hadoop/data/hdfs
/home/user/hadoop/tmp
#output for hadoop logs
/user/user
Step4: copy the example configuration templates to the config directory, then edit the configuration files.
The configuration files needed are yarn-site.xml, core-site.xml, hdfs-site.xml, and mapred-site.xml. Put the required parameters into the configuration files mentioned above. A sample of the configured files is available to download here.
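The exact values depend on the directories created in Step 3 and on your host. As one hedged illustration (not the author's exact files), a minimal single-node core-site.xml might look like this, where hdfs://localhost:9000 is an assumed namenode address:

```xml
<!-- core-site.xml: minimal single-node sketch (illustrative values only).
     hdfs://localhost:9000 and the tmp dir are assumptions; match them to
     your own Step 3 directories and hostname. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/user/hadoop/tmp</value>
  </property>
</configuration>
```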
Step5: add the necessary paths and verify them
#paths to add to ~/.bashrc
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk
export HADOOP_HOME=/usr/local/hadoop
#reload the paths
source ~/.bashrc
#verify the paths
#output should be similar to the following
share/doc/hadoop/api/org/apache/hadoop/examples
/usr/local/hadoop/hadoop/hadoop-0.23.6/share/doc/hadoop/api/org/apache/hadoop/examples
/usr/local/hadoop/share/hadoop/hadoop-0.23.6/share/doc/hadoop/api/org/apache/hadoop/examples
/usr/local/hadoop/share/doc/hadoop/api/org/apache/hadoop/examples
/usr/local/hadoop/share/hadoop/hadoop-0.23.6/share/doc/hadoop/api/org/apache/hadoop/lib
/usr/local/hadoop/share/hadoop/hadoop-0.23.6/share/hadoop/mapreduce/hadoop-mapreduce-client-core-0.23.6.jar
/usr/local/hadoop/share/hadoop/hadoop-0.23.6/share/hadoop/mapreduce/hadoop-mapreduce-client-common-0.23.6.jar
Step 6: Format the namenode
Warning: this step only needs to be done ONCE for each newly set-up cluster. Executing this command on an existing cluster risks data loss.
#do once only, at the initial setup of hadoop
bin/hadoop namenode -format
Step 7: Start the daemon for Hadoop
sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemon.sh start datanode
sbin/yarn-daemon.sh start resourcemanager
sbin/yarn-daemon.sh start nodemanager
sbin/mr-jobhistory-daemon.sh start historyserver
Step8: verify the Hadoop cluster with jps
Assuming that the setup and configuration went fine, output similar to the following appears after typing the command “jps”.

Step9: verify the Hadoop cluster with the web-based consoles
Note: 192.168.253.130 is the IP address of my Ubuntu server.
#namenode console to verify output, e.g. http://192.168.253.130:50070 (assuming the default ports)
#for the ResourceManager, e.g. http://192.168.253.130:8088
#for the Job History Server, e.g. http://192.168.253.130:19888
Step10: Verify Hadoop & MapReduce in action
Run the example word count:
#copy text files from “/home/user/upload” to the HDFS directory “/user/user/txt”
bin/hadoop dfs -copyFromLocal /home/user/upload /user/user/txt
bin/hadoop dfs -ls /user/user/txt
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-0.23.6.jar wordcount /user/user/txt /user/user/txt-output
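Assuming the job completed, the word counts land in the output directory as part-r-* files (part-r-00000 is the conventional name of the first reducer's output); a quick way to inspect them:

```shell
# List the job output and peek at the first few word counts.
bin/hadoop dfs -ls /user/user/txt-output
bin/hadoop dfs -cat /user/user/txt-output/part-r-00000 | head
```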
Calculate pi:
#run the pi estimation example
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-0.23.6.jar pi -Dmapreduce.clientfactory.class.name=org.apache.hadoop.mapred.YarnClientFactory -libjars share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-0.23.6.jar 16 10000
Compile a custom word count in Java with MapReduce on Hadoop 0.23.6
#to compile
javac -classpath /usr/local/hadoop/share/hadoop/hadoop-0.23.6/share/hadoop/common/hadoop-common-0.23.6.jar:/usr/local/hadoop/share/hadoop/hadoop-0.23.6/share/hadoop/common/lib/commons-cli-1.2.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-common-0.23.6.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-0.23.6.jar -d classes WordCount.java && jar -cvf wordcount.jar -C classes/ .
#to execute
/usr/local/hadoop/bin/hadoop jar wordcount.jar org.myorg.WordCount /user/user/txt /user/user/bigram-output

Verify the output with:
/usr/local/hadoop/bin/hdfs dfs -ls /user/user