To list the sizes of Hive tables in Hadoop in GBs:

Result:

448 [GB] hdfs://aewb-analytics-staging-name.example.com:8020/user/hive/warehouse/mybigtable
8 [GB] hdfs://aewb-analytics-staging-name.example.com:8020/user/hive/warehouse/anotherone
0 [GB] hdfs://aewb-analytics-staging-name.example.com:8020/user/hive/warehouse/tinyone
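The command itself did not survive in this excerpt. As a rough sketch of one way to get output in this shape (not necessarily the original command), the byte counts printed by `hadoop fs -du` can be piped through `awk`. The helper name `to_gb` is made up for illustration, and the warehouse path in the usage comment is assumed from the result above.

```shell
# Hypothetical helper: turn "<bytes> ... <path>" lines, as printed by
# `hadoop fs -du`, into "<whole GBs> [GB]<TAB><path>".
# Lines that do not start with a number (e.g. a "Found N items" header) are skipped.
# $NF is used for the path because newer Hadoop versions print an extra
# size column before it; paths containing spaces are not handled.
to_gb() {
  awk '/^[0-9]+/ { printf "%d [GB]\t%s\n", int($1 / (1024 * 1024 * 1024)), $NF }'
}

# Usage on a cluster (assumes the Hive warehouse is at /user/hive/warehouse):
#   hadoop fs -du /user/hive/warehouse/ | to_gb
```

Sizes are truncated to whole gigabytes, which is why a small table shows up as 0 [GB].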
Tag Archives: hadoop
Enabling JMX Monitoring for Hadoop And Hive
Hadoop’s NameNode and JobTracker expose interesting metrics and statistics over JMX. Hive does not seem to expose anything interesting, but it can still be useful to monitor its JVM or to do simple profiling/sampling on it. Let’s see how to enable JMX and how to access it securely over SSH.
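A minimal sketch of what such a setup might look like for the NameNode, assuming the options go into `hadoop-env.sh`. The port number and the host name in the comments are placeholders, and the full post may do this differently. The `-Dcom.sun.management.jmxremote.*` flags are standard JVM system properties.

```shell
# Sketch for hadoop-env.sh: enable JMX on the NameNode JVM for use through an SSH tunnel.
# JMX_PORT is an arbitrary example value.
JMX_PORT=8004

JMX_OPTS="-Dcom.sun.management.jmxremote"
JMX_OPTS="$JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT"
# Pin the RMI port to the same value so one tunnel is enough (needs Java 7u4 or later).
JMX_OPTS="$JMX_OPTS -Dcom.sun.management.jmxremote.rmi.port=$JMX_PORT"
# Authentication and SSL are switched off here; this assumes the port is
# firewalled so that it can only be reached from the machine itself.
JMX_OPTS="$JMX_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
JMX_OPTS="$JMX_OPTS -Dcom.sun.management.jmxremote.ssl=false"
# Make the RMI stubs point at localhost so they resolve through the tunnel.
JMX_OPTS="$JMX_OPTS -Djava.rmi.server.hostname=localhost"

# Prepend to whatever NameNode options are already set.
export HADOOP_NAMENODE_OPTS="$JMX_OPTS ${HADOOP_NAMENODE_OPTS:-}"

# From a workstation (namenode.example.com is a placeholder host):
#   ssh -L 8004:localhost:8004 namenode.example.com
#   jconsole localhost:8004
```

The same options could be added to the variables for other daemons or to the JVM that runs Hive, each with its own port.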
How to Add MapRed-Only Node to Hadoop
I was surprised not to be able to google an answer to this, so I want to record my findings here. To add (a.k.a. commission) a node to a Hadoop cluster that should be used only for map-reduce tasks and not for storing data, you have multiple options: Do not start the datanode service on the …