Basic Spark cluster resource allocation
3 Nodes, [16 Cores, 48GB RAM] <----- per node
Subtract 1 core and 1GB RAM per node for the operating system
3 Nodes, [15 Cores, 47GB RAM] <----- per node
Cluster resources: Total cores = 15 cores x 3 nodes = 45 cores - 1 (reserved for the YARN ApplicationMaster) => 44 cores
Total memory = 47GB x 3 nodes = 141GB - 1GB (for the ApplicationMaster) => 140GB RAM
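The per-node and cluster-level arithmetic above can be sketched in a few lines. This is only a sketch using the example values from this post (3 nodes of 16 cores / 48GB), not a general-purpose sizing tool:

```python
# Example cluster from the post: 3 nodes, 16 cores and 48GB RAM each.
nodes = 3
cores_per_node = 16 - 1   # reserve 1 core per node for the OS
ram_per_node_gb = 48 - 1  # reserve 1GB per node for the OS

total_cores = cores_per_node * nodes - 1    # 1 core held back for the ApplicationMaster
total_ram_gb = ram_per_node_gb * nodes - 1  # 1GB held back for the ApplicationMaster

print(total_cores, total_ram_gb)  # 44 140
```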
[Rule of thumb: the optimal number of cores per executor is between 3 and 5]
Number of executors = total cores / cores per executor = 44 / 4 = 11
Memory per executor = total memory / number of executors = 140 / 11 = ~12GB
Formula for memory overhead = max(384MB, 10% of memory per executor)
Memory overhead = max(384MB, 10% of ~12GB) = ~1.2GB
Actual memory per executor = ~12GB - ~1.2GB = ~11GB
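The executor count, per-executor memory, and overhead steps above can be worked through as one calculation. A minimal sketch, assuming the 44-core / 140GB totals derived earlier and the default 384MB overhead floor:

```python
# Inputs derived earlier in the post.
total_cores, total_ram_gb = 44, 140
cores_per_executor = 4  # rule of thumb: 3-5 cores per executor

num_executors = total_cores // cores_per_executor           # 44 // 4 = 11
mem_per_executor_gb = total_ram_gb / num_executors          # 140 / 11 = ~12GB
overhead_gb = max(384 / 1024, 0.10 * mem_per_executor_gb)   # max(384MB, 10%)
executor_memory_gb = mem_per_executor_gb - overhead_gb      # ~11GB

print(num_executors, round(executor_memory_gb))  # 11 11
```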
spark-submit: --num-executors 11
              --executor-cores 4
              --executor-memory 11G
Note: approximate RAM available per task (one task per core) = 11GB / 4 cores = ~2.75GB
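As a quick check on the per-task figure, executor memory split evenly across its cores (one task per core) gives:

```python
# Per-task memory = executor memory / executor cores (values from the post).
executor_memory_gb, executor_cores = 11, 4
print(executor_memory_gb / executor_cores)  # 2.75
```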