Installing Hue on OS X 10.8.5
For me, installing Hue on OS X was a pain in the ass. Unfortunately, I didn't meticulously record each step I executed to install Hue, so I'm writing this post from memory. Hopefully folks will find it useful.
Credits
First, I'd like to point out three sources that were valuable to me while performing the Hue install and configuration:
* Hue's GitHub page.
* The Hue section of the CDH4 Installation Guide.
* A Stack Overflow post about Configuring Hue With CDH4.3.
Preparing for the Installation
Hue relies on several external software libraries. Even before I downloaded Hue, I prepared my environment by:
1. Installing JDK 1.7 (which is required to compile Hue)
3. Installing easy_install
Once the software listed above (and its dependencies) was installed, I used MacPorts to obtain the following libraries:
liblxml
libxml2
libxslt
mysql55
sqlite3
I used easy_install to obtain the simplejson library.
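Since I didn't record the exact commands, the following is only a sketch of what the installs would look like. The port names are copied from the list above and may not match current MacPorts names (check each with port search), and print_install_commands is a hypothetical helper that prints the commands for review rather than running them.

```shell
# Port names as listed in this post; confirm each with `port search <name>`.
PORTS="liblxml libxml2 libxslt mysql55 sqlite3"

# Print the install commands instead of running them, so they can be reviewed first.
print_install_commands() {
  echo "sudo port install $PORTS"
  echo "sudo easy_install simplejson"
}

print_install_commands
```

Once the names check out, the printed lines can be pasted at the command prompt.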
Obtaining the Source Code
I downloaded the Hue source by issuing the following command at the command prompt:
git clone http://github.com/cloudera/hue.git
Modifying the Source
Depending on your configuration, this step may not be necessary. In my case, I did not establish an 'hdfs' user for installing Hadoop and owning the HDFS directory and files. If the owner of the HDFS root directory is someone other than 'hdfs', you must change the value of DEFAULT_HDFS_SUPERUSER in the hue/desktop/libs/hadoop/src/hadoop/fs/webhdfs.py file to match that owner.
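The edit can be made by hand, but here is a rough sketch of scripting it. It assumes the constant is assigned on a single line that starts with DEFAULT_HDFS_SUPERUSER (I have not verified this against every Hue version), and set_hdfs_superuser is a hypothetical helper name.

```shell
# Rewrite the DEFAULT_HDFS_SUPERUSER assignment in place, keeping a .bak backup.
# Usage: set_hdfs_superuser <path-to-webhdfs.py> <owner>
set_hdfs_superuser() {
  file=$1
  owner=$2
  # -i.bak is accepted by both BSD sed (OS X) and GNU sed.
  # Assumption: the assignment sits on one line starting at column 0.
  sed -i.bak "s/^DEFAULT_HDFS_SUPERUSER *=.*/DEFAULT_HDFS_SUPERUSER = '$owner'/" "$file"
}

# Example (not run here):
#   set_hdfs_superuser hue/desktop/libs/hadoop/src/hadoop/fs/webhdfs.py "$USER"
```

The original file is left next to the edited one as webhdfs.py.bak, so the change is easy to undo.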
Building Hue from Source
Note: As mentioned above, Hue can only be built using JDK 1.7. If needed, be sure to update your JAVA_HOME to point to the appropriate directory. In my case, I executed the following command before building:
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home
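To confirm that JAVA_HOME really points at a 1.7 JDK before building, a small check like this may help. It is only a sketch: is_java_17 is a hypothetical helper, and it assumes the banner printed by java -version contains the text version "1.7.

```shell
# Succeed when the given `java -version` banner reports a 1.7 JDK.
# Usage: is_java_17 "<banner text>"
is_java_17() {
  case "$1" in
    *'version "1.7.'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (not run here); java -version writes its banner to stderr:
#   is_java_17 "$("$JAVA_HOME/bin/java" -version 2>&1)" \
#     || echo "JAVA_HOME does not point at a 1.7 JDK" >&2
```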
I built the Hue source by issuing the following commands at the command prompt:
Note: I seem to recall I had a problem with the mysql55 library obtained from MacPorts. If I remember correctly, the make command was expecting a MacPorts library (named mysql5_devel) that has since been made obsolete by mysql55. Unfortunately, I do not recall how I worked around this issue.
Configuring Hadoop
Add the following to hdfs-site.xml:
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
Add the following to core-site.xml:
<property>
<name>hadoop.proxyuser.hue.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hue.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.httpfs.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.httpfs.groups</name>
<value>*</value>
</property>
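After restarting the Hadoop daemons so that the new properties take effect, WebHDFS and the proxy-user settings can be exercised directly with curl. The sketch below only builds the request URL. webhdfs_url and the WEBHDFS_BASE variable are hypothetical names, the default address assumes the NameNode's web port is 50070 on localhost, and user.name and doas are the standard WebHDFS query parameters for the calling user and the impersonated user.

```shell
# Build a WebHDFS REST URL.
# Usage: webhdfs_url <hdfs-path> <operation> <user> [impersonated-user]
webhdfs_url() {
  base=${WEBHDFS_BASE:-http://localhost:50070/webhdfs/v1}
  url="$base$1?op=$2&user.name=$3"
  # doas asks the NameNode to run the call on behalf of another user,
  # which is what the hadoop.proxyuser.hue.* properties permit.
  if [ -n "${4:-}" ]; then
    url="$url&doas=$4"
  fi
  printf '%s\n' "$url"
}

# Example (not run here):
#   curl -s "$(webhdfs_url / LISTSTATUS hue "$USER")"
```

A JSON directory listing in the response suggests that both WebHDFS and the hue proxy user are configured; an error response usually names the property that is missing.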
Configuring Hue
Add the following to the [[hdfs_clusters]] section of hue/desktop/conf/pseudo-distributed.ini:
webhdfs_url=http://localhost:50070/webhdfs/v1/
hadoop_hdfs_home=/Users/mkeating/Dev/fs_root/hadoop-hdfs
hadoop_bin=/Users/mkeating/Dev/hadoop-2.2.0/bin/hadoop
hadoop_conf_dir=/Users/mkeating/Dev/hadoop-2.2.0/etc/hadoop
Add the following to the [[yarn_clusters]] section of hue/desktop/conf/pseudo-distributed.ini:
hadoop_mapred_home=/Users/mkeating/Dev/fs_root/hadoop-mapreduce
hadoop_bin=/Users/mkeating/Dev/hadoop-2.2.0/bin/hadoop
hadoop_conf_dir=/Users/mkeating/Dev/hadoop-2.2.0/etc/hadoop
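The paths above are specific to my machine, so a typo or a stale directory is easy to miss. Here is a rough sketch of a sanity check. check_hue_paths is a hypothetical helper, and it assumes every hadoop_* key holds a filesystem path with no spaces in it.

```shell
# Report hadoop_* paths in a Hue ini file that do not exist on disk.
# Usage: check_hue_paths <ini-file>
check_hue_paths() {
  status=0
  # Strip "hadoop_<key>=" from matching lines, leaving only the path.
  # Commented-out lines are skipped because they do not start with "hadoop_".
  for path in $(sed -n 's/^[[:space:]]*hadoop_[a-z_]*[[:space:]]*=[[:space:]]*//p' "$1"); do
    if [ ! -e "$path" ]; then
      echo "missing: $path" >&2
      status=1
    fi
  done
  return $status
}

# Example (not run here):
#   check_hue_paths hue/desktop/conf/pseudo-distributed.ini
```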
Starting Hue
Start Hue by executing the following command at the command prompt:
hue/build/env/bin/hue runserver
Note: When you log in to Hue, you will see a number of error messages indicating problems with the current configuration. Most of the messages are likely legitimate. One message, however, can be misleading. If you get a message that looks like the one below, you should first test Hue's access to HDFS (through the File Browser) before troubleshooting it.
Current value: http://localhost:50070/webhdfs/v1/
Filesystem root '/' should be owned by 'hdfs'
According to this post on Google Groups, this error message will always appear given our configuration (even if Hue is working properly).
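To see who actually owns the filesystem root, the WebHDFS GETFILESTATUS operation can be queried directly. This is a minimal sketch: webhdfs_owner is a hypothetical helper that pulls the owner field out of the JSON with sed, which assumes the field appears once in the response and avoids needing a JSON parser.

```shell
# Extract the "owner" field from a WebHDFS GETFILESTATUS response on stdin.
webhdfs_owner() {
  sed -n 's/.*"owner" *: *"\([^"]*\)".*/\1/p'
}

# Example (not run here):
#   curl -s "http://localhost:50070/webhdfs/v1/?op=GETFILESTATUS&user.name=$USER" | webhdfs_owner
```

If the name printed matches the value set for DEFAULT_HDFS_SUPERUSER and the File Browser works, the warning is probably the harmless one described above.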