Bash Magic: List Hive Table Sizes in GB

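To list the size of each Hive table stored in HDFS, in whole GB (the byte count reported by hadoop fs -du is divided by 1024^3 and truncated to an integer):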

sudo -u hdfs hadoop fs -du /user/hive/warehouse/ | awk '/^[0-9]+/ { print int($1/(1024^3)) " [GB]\t" $2 }'

Result:

448 [GB]	hdfs://aewb-analytics-staging-name.example.com:8020/user/hive/warehouse/mybigtable
8 [GB]	hdfs://aewb-analytics-staging-name.example.com:8020/user/hive/warehouse/anotherone
0 [GB]	hdfs://aewb-analytics-staging-name.example.com:8020/user/hive/warehouse/tinyone
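
Because int() truncates, any table under 1 GB shows up as 0, and the rows come out in whatever order HDFS lists them. Here is a minimal sketch of a variation that prints one decimal place and sorts the largest tables first. The function name du_to_gb is made up for this example. It reads the du output from stdin and takes the path from the last field ($NF) instead of $2, on the assumption that newer Hadoop releases print an extra column (the space consumed across all replicas) between the size and the path. Paths containing spaces are not handled.

```shell
# Hypothetical helper: turn `hadoop fs -du` output on stdin
# (byte count first, path last) into "<size> [GB]<TAB><path>", largest first.
du_to_gb() {
  # Only lines starting with a digit are data rows; headers such as
  # "Found N items" are skipped. $NF is the last field, i.e. the path.
  awk '/^[0-9]+/ { printf "%.1f [GB]\t%s\n", $1 / (1024 ^ 3), $NF }' |
    sort -rn   # numeric sort on the leading size, descending
}
```

It can be used the same way as the one-liner above: sudo -u hdfs hadoop fs -du /user/hive/warehouse/ | du_to_gb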

Published by Jakub Holý