Mass export of HBase tables

To quote the HBase docs:

Export is a utility that will dump the contents of table to HDFS in a sequence file. Invoke via…

How about mass export of all tables?

Have a tiny shiny script.

We will iterate over all directories in the HBase tables folder, skipping those which start with a dot (e.g. .logs or .META.) and those which start with a dash (-), like -ROOT-.

For each table folder, we create a timestamp, construct a target folder and call the Export job.

#!/bin/sh
HBASE_HOME=/data/hbase
HBASE_TABLES_DIR=$HBASE_HOME/tables
HBASE_BACKUP_DIR=/data/hbase-backups

# -mindepth 1 keeps the tables directory itself out of the list
for TABLE in $(find "$HBASE_TABLES_DIR" -mindepth 1 -maxdepth 1 -type d -name '[^.-]*' -printf '%f\n')
do
        if [ -n "$TABLE" ]
        then
                TS=$(date '+%Y-%m-%d')
                "$HBASE_HOME/bin/hbase" org.apache.hadoop.hbase.mapreduce.Export "$TABLE" "$HBASE_BACKUP_DIR/$TS/$TABLE"
        fi
done
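
Before letting the loop launch real export jobs, it can help to preview what it would do. Here is a minimal sketch of such a dry run. It is not part of the script above: list_tables and plan_exports are hypothetical helper names, and the two directories are passed as arguments rather than hard-coded. It applies the same dot and dash filter and only prints the Export command for each table.

```shell
#!/bin/sh
# Dry run: print the Export commands instead of running them.
# list_tables and plan_exports are hypothetical helper names.

# Print the names of the immediate subdirectories of $1 that do not
# start with a dot or a dash, one per line, sorted.
list_tables() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -name '[!.-]*' | sed 's|.*/||' | sort
}

# Print one Export command per table found in $1,
# with $2/<date>/<table> as the target folder.
plan_exports() {
    ts=$(date '+%Y-%m-%d')
    list_tables "$1" | while read -r table
    do
        echo "hbase org.apache.hadoop.hbase.mapreduce.Export $table $2/$ts/$table"
    done
}
```

For example, plan_exports /data/hbase/tables /data/hbase-backups prints one line per table, which can be checked against the directory listing before the real script is scheduled.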

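Then we can chown the script to the hbase user, chmod it with sgid and symlink it into the cron daily directory. Bear in mind that Linux ignores the setuid and setgid bits on interpreted scripts, so the bit alone does not change who the job runs as.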
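
The installation steps can be sketched as commands. This is only an illustration: print_install_steps is a hypothetical helper, and both paths passed to it are assumptions that depend on where you saved the script and where your distribution keeps its daily cron jobs. It prints the commands rather than running them, so they can be reviewed and then executed as root.

```shell
#!/bin/sh
# print_install_steps is a hypothetical helper. It only prints the
# commands; nothing is changed on the system.
#   $1  path of the export script (assumption)
#   $2  symlink to create in the cron daily directory (assumption)
print_install_steps() {
    script=$1
    link=$2
    echo "chown hbase $script"      # hand the script over to the hbase user
    echo "chmod 2755 $script"       # setgid bit plus rwxr-xr-x
    echo "ln -s $script $link"      # let the daily cron run pick it up
}
```

For example, print_install_steps /usr/local/bin/hbase-export-all.sh /etc/cron.daily/hbase-export-all prints the three commands for those two made-up paths. On Debian-style systems, run-parts skips file names that contain a dot, so the symlink name is best kept free of extensions.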
