Scaling the Namenode

General guidelines for scaling the HDFS Namenode, based on lessons learnt in the field.

This repo accompanies my ApacheCon 2021 presentation, "Scaling the Namenode - Lessons Learnt".

The slide deck from the talk is available here.

Commonly reported problems

  • Performance
      • RPC Processing Time
      • GC pauses
      • Read/Write performance
      • Too long to start NN
  • Stability
      • Frequent Failover
      • Frequent Crash

Various causes

  • Small files
  • Suboptimal heap settings
  • Missing RPC improvements
  • Bad Applications / Mistuned Components
  • Degraded AD
  • Too frequent/delayed checkpointing
  • Heavy Services co-located / Disk throughput
  • Too much logging
  • Degraded JN / communication between NN/JN/ZK
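
To illustrate why small files and heap sizing appear together in the list above, here is a rough sketch that counts the namespace objects (files plus blocks) the Namenode has to hold in memory for the same amount of data stored as small files versus block-sized files. The 128 MiB block size is the HDFS default; the figure of about 150 bytes of heap per object is an assumed rule of thumb, not a measured value, and the function names are made up for this example.

```python
BLOCK_SIZE = 128 * 1024**2   # HDFS default dfs.blocksize (128 MiB)
BYTES_PER_OBJECT = 150       # ASSUMPTION: rough rule of thumb per file/block object


def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def namespace_objects(total_bytes, file_size):
    """Count file and block objects for a dataset of equally sized files."""
    files = ceil_div(total_bytes, file_size)
    blocks_per_file = max(1, ceil_div(file_size, BLOCK_SIZE))
    return files * (1 + blocks_per_file)  # one file object plus its blocks


def heap_estimate_bytes(total_bytes, file_size):
    """Very rough heap needed for the namespace objects alone."""
    return namespace_objects(total_bytes, file_size) * BYTES_PER_OBJECT


ONE_TIB = 1024**4
small = namespace_objects(ONE_TIB, 1 * 1024**2)    # 1 TiB as 1 MiB files
large = namespace_objects(ONE_TIB, 128 * 1024**2)  # 1 TiB as block-sized files
print(f"1 MiB files:   {small} objects")
print(f"128 MiB files: {large} objects")
print(f"ratio: {small // large}x")
```

Under these assumptions, storing the same 1 TiB as 1 MiB files creates 128 times as many objects as storing it as block-sized files. Directories, snapshots, and replication bookkeeping are ignored here, so treat the numbers as an order-of-magnitude guide only.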

Tips for Scaling