Objective: practice secure access patterns and Slurm job engineering inside the Docker cluster. Deliver a short lab notebook (Markdown) with command transcripts, screenshots, and timing tables. Keep code under codes/lab02/ unless otherwise noted.
Exercise 1: Hardened SSH workflow
1. Generate two SSH key pairs: one Ed25519 for personal logins, one RSA limited to Git interactions. Store them under ~/.ssh/keys/lecture02/ with descriptive names.
2. Create a multi-host SSH config:
   - sspa-controller (localhost:2222) using the Ed25519 key.
   - sspa-worker1 reachable via ProxyJump sspa-controller.
   - A fake campus cluster entry that demonstrates ProxyJump chaining plus a local port forward example.
3. Set up local and remote SSH agents (ssh-agent + ssh-add). Document how you forwarded the agent through the controller to worker1 (ssh -A).
4. Provide a troubleshooting section: how to rotate host keys, what to do when permissions are wrong, and how to test tunnels with nc or curl.
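For the "permissions are wrong" part of the troubleshooting section, a small audit script can make the check repeatable. The following is a minimal sketch, not part of the provided lab material: the function name audit_ssh_permissions and its rule set are assumptions. The rules are deliberately conservative (private files need no group/other access at all; directories and .pub files must not be group/other writable), so OpenSSH itself may accept some layouts that this helper flags.

```python
import os
import stat
from pathlib import Path
from typing import List


def audit_ssh_permissions(ssh_dir: Path) -> List[str]:
    """Return a list of permission problems under ssh_dir (empty list means OK).

    Hypothetical helper; the thresholds below are a conservative convention,
    not a reimplementation of OpenSSH's own checks.
    """
    problems = []
    for path in [ssh_dir, *sorted(ssh_dir.rglob("*"))]:
        if path.is_symlink():
            continue  # symlink modes are not meaningful; skip them
        mode = stat.S_IMODE(path.lstat().st_mode)
        if path.is_dir() or path.suffix == ".pub":
            # Directories and public keys: readable is fine, but nobody
            # except the owner may write to them.
            if mode & 0o022:
                problems.append(
                    f"{path}: mode {mode:04o} is group/other writable"
                )
        else:
            # Everything else is treated as private (keys, config).
            if mode & 0o077:
                problems.append(
                    f"{path}: mode {mode:04o} allows group/other access"
                )
    return problems


if __name__ == "__main__":
    # Directory name taken from the exercise text.
    key_dir = Path(os.path.expanduser("~/.ssh/keys/lecture02"))
    if key_dir.is_dir():
        for problem in audit_ssh_permissions(key_dir):
            print(problem)
    else:
        print(f"{key_dir} not found; nothing to audit")
```

A typical fix for flagged entries is chmod 700 on directories and chmod 600 on private keys, followed by a second run of the audit to confirm the list is empty.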
Exercise 2: Job arrays & dependencies
1. Extend the provided vec_add_array.sbatch to cover ntasks {1,2,4} and capture results into $WORK/results/vec_add_ntasks-<n>.txt.
2. Modify montecarlo_array.sbatch so that each array task writes metadata into $SCRATCH/montecarlo/meta/<jobid>_<taskid>.json alongside the .npz file.
3. Write a new cleanup job cleanup.sh that deletes scratch artifacts older than 2 days (use find -mtime +2). Chain it with --dependency=afterany:<aggregate_jobid>.
4. Keep an experiment log summarizing queue wait time, run time, and resource requests. Include at least one scontrol show job output annotated to explain key fields.
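The per-task metadata file from the Monte Carlo step could be written by a small Python helper called from the array task. The sketch below is one possible shape under stated assumptions: the function name write_task_metadata and the recorded fields are hypothetical, and <jobid> is interpreted as SLURM_ARRAY_JOB_ID (the parent ID shared by all array tasks), falling back to SLURM_JOB_ID outside an array. If the course intends the per-task SLURM_JOB_ID instead, swap the lookup order.

```python
import json
import os
import socket
from datetime import datetime, timezone
from pathlib import Path


def write_task_metadata(extra=None, env=None):
    """Write <jobid>_<taskid>.json under $SCRATCH/montecarlo/meta and return its path.

    Hypothetical helper: field names and the choice of job ID are assumptions.
    """
    env = os.environ if env is None else env
    # Assumption: <jobid> is the array's parent job ID.
    job_id = env.get("SLURM_ARRAY_JOB_ID") or env.get("SLURM_JOB_ID", "nojob")
    task_id = env.get("SLURM_ARRAY_TASK_ID", "0")

    meta_dir = Path(env["SCRATCH"]) / "montecarlo" / "meta"
    meta_dir.mkdir(parents=True, exist_ok=True)

    record = {
        "job_id": job_id,
        "task_id": task_id,
        "hostname": socket.gethostname(),
        "written_at": datetime.now(timezone.utc).isoformat(),
    }
    record.update(extra or {})

    out_path = meta_dir / f"{job_id}_{task_id}.json"
    # Write to a temporary name first so a killed task never leaves a
    # half-written JSON file behind.
    tmp_path = out_path.with_name(out_path.name + ".tmp")
    tmp_path.write_text(json.dumps(record, indent=2))
    tmp_path.replace(out_path)
    return out_path


if __name__ == "__main__":
    if "SCRATCH" in os.environ:
        print(write_task_metadata({"note": "manual test run"}))
    else:
        print("SCRATCH is not set; skipping metadata write")
```

Passing run parameters through extra (for example the sample count or seed) keeps the JSON next to the .npz file self-describing, which helps when filling in the experiment log later.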
Exercise 3: Interactive debugging workflow
1. Request an interactive allocation (salloc --nodes=1 --time=00:20:00) and within it:
   - Launch htop or top to observe CPU usage.
   - Run srun --pty bash on worker1 and verify $SLURM_JOB_ID remains the same.
2. Demonstrate live code editing: modify montecarlo.py to print progress every 5 seconds, run it interactively, then revert the change.
3. Capture the commands required to forward a Jupyter notebook from worker1 back to your laptop using SSH tunnels (ssh -L). Explain how you would adapt the ports when bridging through the controller.
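The live-editing step asks for progress output every 5 seconds. Since the real montecarlo.py is not shown here, the following is only a sketch of the pattern on a stand-in pi estimator: estimate_pi, its parameters, and the check_interval throttle are all hypothetical names. The idea is to compare a monotonic clock against a deadline every few thousand iterations, so the progress check adds little overhead to the hot loop.

```python
import random
import time


def estimate_pi(n_samples, report_every=5.0, check_interval=1000,
                clock=time.monotonic, rng=None):
    """Estimate pi by sampling the unit square, printing progress periodically.

    Stand-in for the course's montecarlo.py (assumption); only the
    progress-reporting pattern is the point of this sketch.
    """
    rng = rng or random.Random(0)
    inside = 0
    next_report = clock() + report_every
    for i in range(1, n_samples + 1):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
        # Only look at the clock every check_interval samples.
        if i % check_interval == 0:
            now = clock()
            if now >= next_report:
                print(f"[progress] {i}/{n_samples} samples, "
                      f"pi ~ {4.0 * inside / i:.5f}", flush=True)
                next_report = now + report_every
    return 4.0 * inside / n_samples


if __name__ == "__main__":
    print(f"pi ~ {estimate_pi(1_000_000):.5f}")
```

The flush=True argument matters in an srun --pty session or when output is redirected to a log, because buffered output would otherwise appear in bursts rather than every 5 seconds. Raising n_samples makes the run long enough to see several progress lines.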
Submission checklist
Provide Slurm job IDs, command transcripts, and relevant log snippets.
Include any helper scripts you wrote (cleanup, metadata writer, etc.).
Note open issues or unanswered questions you encountered; these feed future lectures.