apache spark - Are cached RDDs resilient to graceful worker shutdown?


I have a (very) small Spark cluster used as a 'sandpit' environment by several people. Occasionally, I need to restart worker nodes in the course of maintaining the cluster.

If a running job is working off an RDD that has been `.cache()`'d, and a worker is stopped gracefully (by running `./stop-slave.sh` on that node), what happens to the portion of the cached RDD held by that worker?

The two scenarios I can think of (assuming the RDD's storage level is MEMORY_ONLY, with no replication) are that:

  1. the worker redistributes its portion of the RDD across the other workers; or
  2. the portion of the RDD held by that worker is lost, and must be recomputed.

The documentation suggests that lost partitions are recomputed, but it's unclear whether that covers a 'graceful' worker shutdown.
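For reference, here is a minimal Scala sketch of the situation described above (local mode and data are illustrative assumptions, not from the original post). It shows that `.cache()` is shorthand for `persist(StorageLevel.MEMORY_ONLY)`, which keeps exactly one in-memory copy of each partition, and also the replicated `MEMORY_ONLY_2` level, which would avoid recomputation if one copy is lost:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object CacheResilienceSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode, for illustration only.
    val spark = SparkSession.builder()
      .appName("cache-resilience-sketch")
      .master("local[2]")
      .getOrCreate()
    val sc = spark.sparkContext

    // .cache() == persist(StorageLevel.MEMORY_ONLY): each partition is
    // held in memory on exactly one executor, with no replica. If that
    // executor's worker goes away, its partitions are gone from cache.
    val rdd = sc.parallelize(1 to 1000000).map(_ * 2)
    rdd.cache()
    rdd.count() // materialise the cache

    // If recomputation after a worker restart is a concern, a replicated
    // storage level keeps a second copy on a different executor:
    val replicated = sc.parallelize(1 to 1000000)
      .persist(StorageLevel.MEMORY_ONLY_2)
    replicated.count()

    spark.stop()
  }
}
```

Either way, RDD lineage means a lost MEMORY_ONLY partition is transparently recomputed from its source the next time it is needed, rather than the job failing outright; replication just trades memory for avoiding that recomputation.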

