Multi-Node Remote Machine Setup
This skill automates the setup of rented GPU machines using a single setup script.
name: multinode-setup
description: Automates setup of remote GPU rental machines (single or multi-node). Triggered when user provides a mapping of public IPs to private IPs for nodes to set up, or asks to configure remote machines for distributed training.
Quick Start
When the user provides a node mapping, run:
~/.claude/skills/multinode-setup/scripts/setup_cluster.sh PUBLIC1:PRIVATE1 PUBLIC2:PRIVATE2 ...
Example:
~/.claude/skills/multinode-setup/scripts/setup_cluster.sh 147.185.41.18:10.15.22.9 147.185.41.19:10.15.22.17
The first node is always the head node (rental0).
What the Script Does
- Updates ~/.ssh/config: creates rental0, rental1, ... aliases (overwrites existing)
- Copies setup script: SCPs multinode_claude.sh to all nodes in parallel
- Runs setup on all nodes: executes with correct NID, IHN, and RAY_HEAD_IP vars
- Verifies setup: checks Docker and Ray cluster status
- Prints final report: node table and quick access commands
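As a rough illustration of the first step, the sketch below prints the kind of ~/.ssh/config entries that would give each node a rentalN alias. It is not the actual code in setup_cluster.sh: the function name ssh_config_block is hypothetical, it only writes to stdout rather than editing ~/.ssh/config, and the real script may set further options that this document does not describe.

```shell
#!/usr/bin/env bash
# Hypothetical helper: emit one "Host rentalN" block per PUBLIC:PRIVATE argument.
# Only the public IP is used here, since that is the address reachable over SSH.
ssh_config_block() {
  local nid=0 pair
  for pair in "$@"; do
    # ${pair%%:*} keeps everything before the first colon (the public IP).
    printf 'Host rental%d\n    HostName %s\n' "$nid" "${pair%%:*}"
    nid=$((nid + 1))
  done
}

# Example with the addresses from Quick Start.
ssh_config_block 147.185.41.18:10.15.22.9 147.185.41.19:10.15.22.17
```

The first argument always becomes rental0, which matches the rule that the first node is the head node.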
Input Format
The user may provide the mapping in various formats. Parse it into PUBLIC:PRIVATE pairs:
Public IP Private IP
147.185.40.110 10.15.17.105
147.185.40.111 10.15.17.106
Becomes:
./setup_cluster.sh 147.185.40.110:10.15.17.105 147.185.40.111:10.15.17.106
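One way to do this conversion for a whitespace-separated table is sketched below. The helper name table_to_pairs is an assumption made for illustration and is not part of the skill. It handles only the two-column layout shown above and skips any row whose fields are not dotted IPv4 addresses, such as the header.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "PUBLIC PRIVATE" rows on stdin and print them
# as space-separated PUBLIC:PRIVATE pairs on a single line.
table_to_pairs() {
  awk '
    # Keep only rows where both fields look like dotted IPv4 addresses.
    $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && $2 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
      printf "%s%s:%s", sep, $1, $2
      sep = " "
    }
    END { print "" }
  '
}

# Example: feed the table from this section to the helper.
table_to_pairs <<'EOF'
Public IP       Private IP
147.185.40.110  10.15.17.105
147.185.40.111  10.15.17.106
EOF
```

The printed line can then be passed as the argument list to setup_cluster.sh.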
Script Variables
The underlying multinode_claude.sh uses:
- RAY_HEAD_IP: Private IP of the head node (auto-set to first node's private IP)
- NID: Node ID (0 for head, 1, 2, ... for workers)
- IHN: "Is Head Node", true only for node 0
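To make the assignment rules concrete, the following sketch prints the values each node would receive for a given argument list. The function name node_env is hypothetical, and the value false for IHN on worker nodes is an assumption: this document only states that IHN is true for node 0.

```shell
#!/usr/bin/env bash
# Hypothetical helper: show the NID, IHN, and RAY_HEAD_IP values per node.
node_env() {
  # ${1#*:} drops everything up to the first colon, leaving the head node's private IP.
  local head_private="${1#*:}"
  local nid=0 pair ihn
  for pair in "$@"; do
    # Assumption: workers get IHN=false; the source only says node 0 gets true.
    if [ "$nid" -eq 0 ]; then ihn=true; else ihn=false; fi
    echo "rental${nid} ${pair%%:*} NID=${nid} IHN=${ihn} RAY_HEAD_IP=${head_private}"
    nid=$((nid + 1))
  done
}

# Example with the addresses from Quick Start.
node_env 147.185.41.18:10.15.22.9 147.185.41.19:10.15.22.17
```

Every node receives the same RAY_HEAD_IP, because workers need the head node's private address to join the Ray cluster.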
After Setup
Quick access commands:
ssh rental0 # Head node
ssh rental1 # Worker node
ssh rental0 "ray status" # Check Ray cluster
Troubleshooting
If setup fails, check logs:
cat /tmp/setup_rental0.log # Local log for head node
cat /tmp/setup_rental1.log # Local log for worker node
ssh rental0 "cat /workspace/onstart.log" # Remote log
The setup script is idempotent, so it is safe to re-run.
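With more than a couple of nodes, checking each local log by hand gets tedious. The sketch below prints the tail of every local setup log in one pass. The function name tail_setup_logs and its arguments are hypothetical. The /tmp/setup_rentalN.log naming is taken from the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last lines of each local setup log.
# Usage: tail_setup_logs [directory] [line_count]
tail_setup_logs() {
  local dir="${1:-/tmp}" lines="${2:-20}" log
  for log in "$dir"/setup_rental*.log; do
    # If the glob matched nothing, the literal pattern is returned; skip it.
    [ -e "$log" ] || continue
    echo "== $log =="
    tail -n "$lines" "$log"
  done
}

# Example: show the last 20 lines of every log in /tmp.
tail_setup_logs /tmp 20
```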