CSIL as your Compute Cluster
We will use the CSIL Linux machines for this course. The same workflow carries over to running Spark on a personal laptop, a cloud VM (such as AWS EC2), or Google Colab, so you can move from local development to distributed computing without changing how you work.
Connecting Remotely
The goal here is to connect to a CSIL Linux machine via SSH.
From On-Campus
If you are on the campus network or using SFU VPN, you can SSH directly into a CSIL machine using port 24.
To connect, pick a machine from the available CSIL hosts (see below for naming conventions):
[yourcomputer]$ ssh -p24 <USERID>@asb9840-a01.csil.sfu.ca
Each machine follows a structured naming scheme:
- asb9840-{a-e}{01-08}.csil.sfu.ca (Room 9840, letters a-e, numbers 01-08)
- asb9804-{a-d}{01-08}.csil.sfu.ca (Room 9804, letters a-d, numbers 01-08)
- asb9838n-{a-e}{01-16}.csil.sfu.ca (Room 9838n, letters a-e, numbers 01-16)
If you want to automatically select a random machine, you can use the following function in Python:
import random

def random_csil_machine():
    """Return the hostname of a randomly chosen CSIL machine."""
    machines = (
        [f"asb9840-{l}{n:02d}.csil.sfu.ca" for l in "abcde" for n in range(1, 9)]
        + [f"asb9804-{l}{n:02d}.csil.sfu.ca" for l in "abcd" for n in range(1, 9)]
        + [f"asb9838n-{l}{n:02d}.csil.sfu.ca" for l in "abcde" for n in range(1, 17)]
    )
    return random.choice(machines)
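A randomly chosen machine may be powered off or rebooting. The sketch below is one way to skip such hosts. It assumes that a successful TCP connection to port 24 is a good enough sign that SSH is available. The helper names `is_reachable` and `pick_reachable` are hypothetical and not part of any course tooling.

```python
import random
import socket

def is_reachable(host, port=24, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False

def pick_reachable(hosts, port=24, timeout=3.0):
    """Try hosts in random order; return the first reachable one, or None."""
    candidates = list(hosts)
    random.shuffle(candidates)
    for host in candidates:
        if is_reachable(host, port, timeout):
            return host
    return None

# Example (requires the campus network or SFU VPN):
# host = pick_reachable(["asb9840-a01.csil.sfu.ca", "asb9804-b03.csil.sfu.ca"])
```

Trying hosts in random order keeps the load spread across machines while still returning quickly once one of them answers.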
From On-Campus With SSH Keys
To avoid entering your password every time, you can set up SSH keys:
1. Generate an SSH key if you don’t already have one:
[bash]
ssh-keygen -t ed25519
2. Copy your public key to a CSIL machine:
[bash]
ssh-copy-id -p24 <USERID>@asb9840-a01.csil.sfu.ca
3. Add an entry to your `~/.ssh/config` file for easier access:
[text]
Host csil
User <USERID>
Port 24
HostName asb9840-a01.csil.sfu.ca
ServerAliveInterval 120
With this setup, you can simply use:
ssh csil
to connect.
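If you want aliases for several machines, writing each entry by hand gets tedious. The following is a minimal sketch that renders an entry with the same fields as the one above. The function name `ssh_config_entry` and the alias scheme in the usage comment are assumptions made for illustration.

```python
def ssh_config_entry(alias, user, host, port=24, keepalive=120):
    """Render one Host block for ~/.ssh/config as a string."""
    lines = [
        f"Host {alias}",
        f"    User {user}",
        f"    Port {port}",
        f"    HostName {host}",
        f"    ServerAliveInterval {keepalive}",
    ]
    return "\n".join(lines) + "\n"

# Example: print entries for two machines and paste them into ~/.ssh/config.
# for alias, host in [("csil1", "asb9840-a01.csil.sfu.ca"),
#                     ("csil2", "asb9804-b03.csil.sfu.ca")]:
#     print(ssh_config_entry(alias, "<USERID>", host))
```

The sketch only produces text. It does not modify `~/.ssh/config`, so you can review the output before adding it.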
Copying Files
To transfer files to a CSIL machine, use SCP:
scp -P24 code.py <USERID>@asb9840-a01.csil.sfu.ca:~
Or use your preferred SCP/SFTP method.
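Note that `scp` takes the port as an uppercase `-P`, while `ssh` uses a lowercase `-p`. If you script your uploads, a small helper can keep that detail in one place. This is a sketch only: `scp_command` is a hypothetical name, and it assumes `scp` is installed on your computer.

```python
import subprocess

def scp_command(local_path, user, host, remote_path="~", port=24):
    """Build the argument list for copying local_path to user@host:remote_path."""
    # scp expects the port after an uppercase -P (lowercase -p preserves file times).
    return ["scp", "-P", str(port), local_path, f"{user}@{host}:{remote_path}"]

# Example (runs scp and raises CalledProcessError if the copy fails):
# subprocess.run(scp_command("code.py", "<USERID>", "asb9840-a01.csil.sfu.ca"), check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.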
From Off-Campus
To connect from off-campus, you must first connect to the SFU VPN. Follow the SFU IT instructions (SFU VPN Setup) to configure it.
Once the VPN is active, you can SSH normally:
ssh -p24 <USERID>@asb9840-a01.csil.sfu.ca
Running Applications
Once connected to a CSIL machine, you can execute Python or other applications as needed:
python3 code.py
For long-running jobs, use `nohup` or `screen` so that the job keeps running after you disconnect:
nohup python3 myscript.py &
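Or start a `screen` session and run the job inside it. Detach with Ctrl-A then D, and reattach later with `screen -r mysession`: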
screen -S mysession
python3 myscript.py
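If you would rather start a long-running job from a Python driver script, the sketch below has roughly the same effect as `nohup ... &`. It starts the job in its own session (POSIX only), so it is not tied to your terminal, and appends its output to a log file. The function name `launch_detached` and the log file name are hypothetical.

```python
import subprocess
import sys

def launch_detached(argv, log_path):
    """Start argv in its own session, appending stdout and stderr to log_path."""
    with open(log_path, "ab") as log:
        proc = subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,   # the job must not wait for terminal input
            stdout=log,
            stderr=subprocess.STDOUT,   # keep errors in the same log file
            start_new_session=True,     # calls setsid() in the child (POSIX only)
        )
    return proc

# Example:
# job = launch_detached([sys.executable, "myscript.py"], "myscript.log")
# print("started with PID", job.pid)
```

Keep a note of the PID so you can check on the job or terminate it later.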
Cleaning Up
Please remove unnecessary files and terminate any running background processes:
rm -rf output_folder
ps -u <USERID>
kill <PROCESS_ID>
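To find leftover jobs quickly, you can filter the `ps` output for a keyword. The sketch below assumes the standard `ps -u <user> -o pid= -o args=` form, which prints one process per line without a header. The names `parse_ps` and `my_processes` are hypothetical.

```python
import subprocess

def parse_ps(text):
    """Parse header-less `ps -o pid= -o args=` output into (pid, command) pairs."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        pid = int(parts[0])
        command = parts[1] if len(parts) > 1 else ""
        rows.append((pid, command))
    return rows

def my_processes(user, needle="python"):
    """Return (pid, command) pairs for user's processes whose command contains needle."""
    result = subprocess.run(
        ["ps", "-u", user, "-o", "pid=", "-o", "args="],
        check=True, capture_output=True, text=True,
    )
    return [(pid, cmd) for pid, cmd in parse_ps(result.stdout) if needle in cmd]

# Example:
# for pid, cmd in my_processes("<USERID>"):
#     print(pid, cmd)
```

The sketch only lists processes. Terminating them is left as a deliberate manual step with `kill`.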
Use `htop` or `top` for resource monitoring:
htop