Setting up HBase on Windows

Yes, there is an “official” guide to HBase installation for Windows, but it seems to have been written for older versions of HBase. Some of its steps are no longer necessary, while other crucial steps (such as the ZooKeeper configuration) are not mentioned at all.

This tutorial walks you through a Cygwin-based HBase installation, in a way that is similar to the official guide. I have tested it on 64-bit Windows 7.

Downloading Cygwin

  1. download the Cygwin setup.exe and run it
  2. choose an appropriate mirror
    I will assume that Cygwin will be installed into C:\Programs\Cygwin. Do not install Cygwin into a folder whose path contains a space (such as C:\Program Files); doing so leads to many random and unexpected problems.
  3. from packages, choose the following:
    • OpenSSH,
    • tcp_wrappers,
    • diffutils [this should be pre-selected],
    • zlib
  4. proceed with installation until it is finished.
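
The no-spaces rule applies to the Cygwin folder here and to the Java and HBase folders later on, so it can be worth checking candidate paths up front. The following is a minimal sketch in POSIX shell; the function names path_has_space and check_install_path and the sample paths are illustrative assumptions, not part of Cygwin.

```shell
# Hypothetical helper: succeeds (exit status 0) when the given path contains a space.
path_has_space() {
    case "$1" in
        *" "*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Hypothetical helper: report whether a candidate install folder is safe to use.
check_install_path() {
    if path_has_space "$1"; then
        printf 'unsafe: %s\n' "$1"
    else
        printf 'ok: %s\n' "$1"
    fi
}

check_install_path 'C:\Programs\Cygwin'
check_install_path 'C:\Program Files\Cygwin'
```

The paths are passed in single quotes so that the shell leaves the backslashes alone.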

Configuring Cygwin

  1. run Cygwin Bash Shell with Administrator privileges (C:\Programs\Cygwin\Cygwin.bat)
  2. from this Bash shell run ssh-host-config
    • say “yes” to privilege separation
    • say “yes” to create the sshd account
    • say “yes” to install sshd as a service
    • press Enter to leave the value of CYGWIN for the daemon empty
    • Cygwin now needs to create a new account that will be used as a “proxy”/setuid origin account. Say “no” when asked whether to use a different name; this keeps the default name (cyg_server).
    • say “yes” to create a new privileged account cyg_server.
    • create a password for this new privileged account and confirm it
  3. synchronize Windows user accounts with Cygwin user accounts:
    mkpasswd -cl > /etc/passwd
    mkgroup --local > /etc/group
    
  4. start SSH server with net start sshd

  5. test connection with ssh localhost from Cygwin Bash Shell.
    • say “yes” to check and store server fingerprint
    • enter your Windows account password to authenticate
    • issue a few test commands in the remote session
    • close session with exit.
  6. alternatively: test your SSH server with PuTTY.

Configuring HBase

  1. I assume that you have the Java JDK installed (if not, it’s time to do that now). I also assume that Java is installed into a folder without spaces in its path (again, no C:\Program Files\Java). If your existing Java installation lives in a path that contains a space, reinstall it now.
  2. Download HBase from the Apache site and unpack it into an appropriate folder. I will assume C:\java\hbase.
  3. Open ./conf/hbase-env.sh in the HBase directory
    • uncomment the JAVA_HOME line and modify it so it reads:
      export JAVA_HOME=/cygdrive/c/java/jdk7
      
    • uncomment the HBASE_CLASSPATH line and modify it so it reads:
      export HBASE_CLASSPATH=/cygdrive/c/java/hbase/lib/zookeeper-3.4.3.jar
      
      
  4. Copy ./src/main/resources/hbase-default.xml to ./conf

  5. Open ./conf/hbase-default.xml in the HBase directory
    • Change hbase.rootdir to /tmp
      This will resolve into C:\tmp on Windows. We will create it later.
    • Change hbase.tmp.dir to C:/programs/cygwin/root/tmp/hbase/tmp
      This also assumes that Cygwin is installed into C:\programs\cygwin.
    • If your computer has no domain name, determine its hostname, either by running hostname from the shell or from the Computer Name tab of System Properties. For example, my PC has the hostname rn-PC.

    • Change hbase.zookeeper.quorum to rn-PC instead of localhost
      Windows 64-bit seems to have trouble resolving localhost to 127.0.0.1.
    • Change hbase.defaults.for.version.skip to true instead of false
      This disables the odd version warnings. We are actually running HBase from an “uncompiled” source tree, so some config files remain unprocessed. Although HBase is built with Maven, the build depends heavily on Linux tools and would require a lot of hacking on Windows. Fortunately, building is not necessary.
  6. Create the appropriate directories. Execute this from Cygwin Bash Shell:
    mkdir -pv /root/tmp/hbase/data
    mkdir -pv /cygdrive/c/tmp
    
  7. Grant the appropriate rights
    chmod 777 /root/tmp/hbase/data
    chmod 777 /cygdrive/c/tmp
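
For reference, each setting changed in step 5 lives in a <property> element with a <name> and a <value> child. The following is a minimal sketch that prints how the four edited entries should look afterwards; the helper name hbase_property is a hypothetical of mine, the values are the ones from step 5 (rn-PC is my example hostname), and the real file also carries <description> elements that are omitted here.

```shell
# Hypothetical helper: print one <property> entry in the layout used by hbase-default.xml.
# It does no XML escaping, which is fine for the plain values used below.
hbase_property() {
    printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# Values taken from step 5; replace rn-PC with your own hostname.
hbase_property hbase.rootdir /tmp
hbase_property hbase.tmp.dir C:/programs/cygwin/root/tmp/hbase/tmp
hbase_property hbase.zookeeper.quorum rn-PC
hbase_property hbase.defaults.for.version.skip true
```

The output is meant for comparing against the file by eye, not for pasting over it wholesale.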
    

Running HBase

  1. In Bash, change to the HBase directory:
    cd /cygdrive/c/java/hbase
    
  2. Run
    ./bin/start-hbase.sh
    
  3. Enter your password twice and HBase should start. On the first run, you may be prompted about the SSH host fingerprint; in that case, just confirm with “yes”. Ideally, the console should show:
    $ ./bin/start-hbase.sh
    rn@127.0.0.1's password:
    127.0.0.1: starting zookeeper, logging to /cygdrive/c/java/hbase/bin/../logs/hbase-rn-zookeeper-rn-PC.out
    starting master, logging to /cygdrive/c/java/hbase/bin/../logs/hbase-rn-master-rn-PC.out
    rn@localhost's password:
    localhost: starting regionserver, logging to /cygdrive/c/java/hbase/bin/../logs/hbase-rn-regionserver-rn-PC.out
    
  4. In case of failure, check the log files in C:\java\hbase\logs.

  5. HBase can be stopped with
    ./bin/stop-hbase.sh
    

    Note that you should wait until the server has fully stopped (this may take a long time); otherwise you risk data corruption.
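
When something fails, reading every log file by hand gets tedious. The following is a minimal sketch that prints only the lines containing ERROR or FATAL from the *.log files in a directory. The function name scan_hbase_logs is a hypothetical of mine, and I am assuming the usual log4j level names appear in the log lines; adjust the pattern if your logs look different.

```shell
# Hypothetical helper: print ERROR and FATAL lines from every *.log file in a directory.
scan_hbase_logs() {
    log_dir=$1
    for f in "$log_dir"/*.log; do
        [ -f "$f" ] || continue           # skip when the glob matches nothing
        grep -H -E 'ERROR|FATAL' "$f"     # -H prefixes each hit with the file name
    done
    return 0
}

# Log directory as shown in the console output above.
scan_hbase_logs /cygdrive/c/java/hbase/logs
```

If the directory does not exist or contains no matching lines, the function prints nothing.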

Using HBase

  1. Start Bash and start the HBase Shell:
    ./bin/hbase shell
    
  2. Create a simple table:
    create 'test', 'data'
    
  3. Verify that the table has been created
    list
    
  4. Insert some data:
    put 'test', 'row1', 'data:1', 'value1'
    
  5. List all rows in the table
    scan 'test'
    
  6. Optionally, drop table
    disable 'test'
    drop 'test'
    
  7. You can leave the HBase shell with exit.
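
The steps above can also be run unattended by saving them to a file and passing that file to the HBase shell. The following is a minimal sketch; the function name write_smoke_test and the file name smoke-test.rb are hypothetical, and I have not verified the script-file invocation on every HBase version, so treat it as an assumption.

```shell
# Hypothetical helper: write the commands from steps 2-7 to the given script file.
write_smoke_test() {
    cat > "$1" <<'EOF'
create 'test', 'data'
list
put 'test', 'row1', 'data:1', 'value1'
scan 'test'
disable 'test'
drop 'test'
exit
EOF
}

# Usage, from the HBase directory with HBase already running:
#   write_smoke_test smoke-test.rb
#   ./bin/hbase shell smoke-test.rb
```

A relative file name is used in the usage lines so that the path means the same thing to Cygwin and to the Windows Java process.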