Installing Hadoop and HBase on OS X 10.8.5
These are the steps I used to install Hadoop, HBase, and Pig on OS X 10.8.5. The versions of software I am using are:
* Hadoop 2.2.0
* HBase 0.96.0
* Pig 0.12.0
Credits
First, I'd like to credit two sources that were invaluable to me while performing this install and configuration:
* Andy C's Installing Hadoop 2 on a Mac blog post.
* Freddy's Hadoop & Hbase on OSX 10.8 Mountain Lion blog post.
Get Software
Download the Hadoop 2.2.0 tarball from a mirror near you.
Download the HBase 0.96.0 tarball from a mirror near you.
Download the Pig 0.12.0 tarball from a mirror near you.
Install Software
Extract each of the above-mentioned tarballs into a directory of your choice. I exploded them in /Users/mkeating/Dev.
Configure Environment
I prefer to use the .bash_profile file to maintain my settings between terminal sessions. I added the following to my .bash_profile.
export JAVA_HOME="$(/usr/libexec/java_home -v 1.6)"
export HADOOP_INSTALL="/Users/mkeating/Dev/hadoop-2.2.0"
export HBASE_HOME="/Users/mkeating/Dev/hbase-0.96.0-hadoop2"
export PIG_HOME="/Users/mkeating/Dev/pig-0.12.0"
export PATH=$PATH:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin:$HBASE_HOME/bin:$PIG_HOME/bin
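After saving the changes, reload the profile so the new settings take effect in your current terminal session:
source ~/.bash_profile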
Create HDFS and MapReduce Directories
Create directories on your local file system to store HDFS and MapReduce data. I created the following two directories:
* /Users/mkeating/Dev/fs_root/hadoop-hdfs
* /Users/mkeating/Dev/fs_root/hadoop-mapreduce
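For example, assuming the same paths:
mkdir -p /Users/mkeating/Dev/fs_root/hadoop-hdfs
mkdir -p /Users/mkeating/Dev/fs_root/hadoop-mapreduce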
Configure Hadoop
Add the following to core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
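Note: fs.default.name is the old name for this property; Hadoop 2 prefers fs.defaultFS. The deprecated name still works, but you'll see a warning in the logs.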
Add the following to the bottom of hadoop-env.sh:
export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true -Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
Note: I'm no longer sure this is necessary or has any effect on the installation. At one point, I believed the addition of the above suppressed "Unable to load realm info from SCDynamicStore" errors.
Add the following to hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/Users/mkeating/Dev/fs_root/hadoop-hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/Users/mkeating/Dev/fs_root/hadoop-hdfs/datanode</value>
</property>
</configuration>
Add the following to the bottom of yarn-env.sh:
YARN_OPTS="$YARN_OPTS -Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
Note: as with the HADOOP_OPTS setting above, I'm no longer sure this is necessary; at one point, I believed it suppressed the same "Unable to load realm info from SCDynamicStore" errors.
Add the following to yarn-site.xml:
<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:8032</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
Start Hadoop
First, format the namenode by executing the following command at the command prompt (in Hadoop 2 this form is deprecated in favor of hdfs namenode -format, but it still works):
hadoop namenode -format
Next, start HDFS and YARN by executing the following commands at the command prompt:
start-dfs.sh
start-yarn.sh
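Once the daemons are up, you can also sanity-check them through the web interfaces: the NameNode UI at http://localhost:50070 and the ResourceManager UI at http://localhost:8088 (the default ports in Hadoop 2.2.0).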
Test Hadoop
Test your Hadoop installation/configuration by executing the following at the command prompt:
jps
You should see output that looks something like this:
5053 Jps
2598 SecondaryNameNode
2416 NameNode
2704 ResourceManager
2498 DataNode
2789 NodeManager
Assuming no errors, you can run two more simple tests to check the installation. First, cd to the $HADOOP_INSTALL directory. Then copy a file to HDFS by executing the following commands.
hadoop fs -mkdir /user
hadoop fs -mkdir /user/<username> (where <username> is your logon ID)
hadoop fs -put LICENSE.txt LICENSE.txt
hadoop fs -ls
Finally, try to run a MapReduce job by executing the following:
cd share/hadoop/mapreduce
hadoop jar ./hadoop-mapreduce-examples-2.2.0.jar wordcount LICENSE.txt out
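If the job completes successfully, the word counts are written to the out directory under your HDFS home. You can inspect them with the following command (part-r-00000 is the default name MapReduce gives reducer output):
hadoop fs -cat out/part-r-00000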
Configure HBase
Add the following to hbase-env.sh:
export HBASE_OPTS="-Djava.net.preferIPv4Stack=true -Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
Add the following to hbase-site.xml:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
</configuration>
Update the Hadoop JARs that come with HBase
HBase 0.96.0 bundles Hadoop 2.1.0-beta JAR files. To prevent version-mismatch errors, replace the 2.1.0-beta JARs in hbase-0.96.0-hadoop2/lib with the 2.2.0 JARs found under the hadoop-2.2.0/share/hadoop directory.
Note: I could not find a replacement for hadoop-client-2.1.0-beta.jar, so I left it in the hbase-0.96.0-hadoop2/lib directory. So far, I have not run into any problems.
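Here is one way to do the swap, as a sketch that assumes the environment variables set earlier; adjust the paths and patterns if your layout differs:
cd $HBASE_HOME/lib
# set the 2.1.0-beta JARs aside rather than deleting them
mkdir hadoop-2.1.0-beta-backup
mv hadoop-*-2.1.0-beta.jar hadoop-2.1.0-beta-backup/
# copy in the matching 2.2.0 JARs from the Hadoop install
find $HADOOP_INSTALL/share/hadoop -name 'hadoop-*-2.2.0.jar' -exec cp {} . \;
# hadoop-client has no 2.2.0 counterpart in the Hadoop tarball, so restore the beta JAR
cp hadoop-2.1.0-beta-backup/hadoop-client-2.1.0-beta.jar .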
Start HBase
Start HBase by executing the following command at the command prompt:
start-hbase.sh
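If HBase starts cleanly, running jps again should now show an HMaster process alongside the Hadoop daemons.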
Test HBase
Test your HBase installation/configuration by launching the HBase shell and creating a table. First, type the following command at the command prompt to open the HBase shell:
hbase shell
Next, create a new table and insert a value by executing the following commands within the HBase shell:
create 'my_table', 'col_fam'
put 'my_table', 'row1', 'col_fam:a', 'value1'
put 'my_table', 'row1', 'col_fam:b', 'value2'
put 'my_table', 'row2', 'col_fam:a', 'value3'
scan 'my_table'
get 'my_table', 'row1'
disable 'my_table'
drop 'my_table'
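When you are finished experimenting, exit the HBase shell and shut everything down in the reverse order it was started:
exit
stop-hbase.sh
stop-yarn.sh
stop-dfs.sh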