I recently read Joe Feeney's amazing guide on how to get Jupyter set up in the
cloud. Having suffered through trying to optimize models on my laptop, I was
really excited to do the same thing, but automated, of course.
I would recommend two small additions on top of that post:
1. Use the Amazon Linux machine learning AMIs, so that most deep learning
frameworks (Keras + TensorFlow, Theano, NumPy) and low-level libraries (like
CUDA) come preinstalled, with no need to waste precious time installing
Anaconda. I haven't investigated this thoroughly, but the machine learning
AMIs appear to come with 30 GB of free storage, much more than the 8 GB limit
that comes with the Ubuntu AMIs.
2. Actually secure the server. Fortunately, this is really easy to do with
Ansible roles.
If you are new to Ansible and Terraform, this might not be the best post to
start with, as I will only cover the broad strokes.
Provision the server
The relevant parts here are opening an incoming port on the server so that the
Jupyter notebook server can listen on it, in addition to the default SSH port
that needs to be exposed for Ansible. I had already set up an AWS key pair and
a security group enabling outbound access and opening the SSH port. I also use
Cloudflare to provision an A record so that we can set up SSL.
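As a rough sketch, the Terraform side might look something like the following.
The resource names, port, and variable are hypothetical placeholders (Jupyter
listens on 8888 by default), and the cloudflare_record arguments vary a bit
between provider versions:

resource "aws_security_group_rule" "jupyter" {
  # Let the Jupyter notebook server accept incoming connections.
  type              = "ingress"
  from_port         = 8888
  to_port           = 8888
  protocol          = "tcp"
  cidr_blocks       = ["0.0.0.0/0"]
  security_group_id = aws_security_group.notebook.id
}

resource "cloudflare_record" "notebook" {
  # A record pointing the SSL hostname at the instance's public IP.
  zone_id = var.cloudflare_zone_id
  name    = "notebook"
  type    = "A"
  value   = aws_instance.notebook.public_ip
}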
Note that I also modify a local file that is configured to be my Ansible hosts
file. You can make an ansible.cfg file to do this.
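A minimal ansible.cfg for that only needs to point the inventory at the local
file (the ./hosts path is just an assumption; use whatever file you write):

[defaults]
inventory = ./hosts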
Let's Encrypt support for Amazon Linux AMIs is still in development, so I had
to essentially copy over certbot_create_command and add the --debug flag.
certbot_create_standalone_stop_services had to be set to [] in my case, since
the role assumes nginx is running by default, and the script fails when nginx
is not running.
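For reference, here is a sketch of the relevant playbook variables. The copied
certbot_create_command is based on the role's default at the time (cert_item
and certbot_script are internals of the geerlingguy.certbot role, so the exact
template may differ by role version), with the debug flag appended:

- hosts: notebook
  vars:
    certbot_create_if_missing: true
    certbot_create_method: standalone
    certbot_admin_email: you@example.com   # placeholder
    certbot_certs:
      - domains:
          - notebook.example.com           # placeholder domain
    # The role stops nginx around the standalone run by default;
    # empty this out since nginx isn't running on this box.
    certbot_create_standalone_stop_services: []
    # Copy of the role's create command with --debug added for
    # Amazon Linux support.
    certbot_create_command: >-
      {{ certbot_script }} certonly --standalone --debug
      --noninteractive --agree-tos
      --email {{ cert_item.email | default(certbot_admin_email) }}
      -d {{ cert_item.domains | join(',') }}
  roles:
    - geerlingguy.certbot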
You might need to install the geerlingguy.certbot role if you haven't already:
ansible-galaxy install geerlingguy.certbot
The rest is straightforward, and can be extended to set more options in the
config file!
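For example, pointing the notebook at the Let's Encrypt certificates comes down
to a few lines in ~/.jupyter/jupyter_notebook_config.py (the domain is a
placeholder; the live/ paths are certbot's defaults, and the notebook process
needs permission to read them):

# Listen on all interfaces over HTTPS instead of just localhost.
c.NotebookApp.ip = '0.0.0.0'
c.NotebookApp.port = 8888
c.NotebookApp.open_browser = False
c.NotebookApp.certfile = '/etc/letsencrypt/live/notebook.example.com/fullchain.pem'
c.NotebookApp.keyfile = '/etc/letsencrypt/live/notebook.example.com/privkey.pem'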
With that done, all that is left is to SSH into the server, source the right
environment, and run the notebook (with a command like jupyter notebook). I
guess this could be daemonized, but I like to stay SSHed in so I have
confirmation that the notebook is still alive. I ran into an issue trying to
debug this on a t2.nano instance, where the notebook would continually crash,
and it was good to see some output.
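The whole loop looks roughly like this (the key path and conda environment
name are placeholders; the deep learning AMIs list the available environments
in their login banner):

ssh -i ~/.ssh/your-key.pem ec2-user@notebook.example.com
source activate tensorflow_p36   # pick the framework environment you need
jupyter notebook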
I had to stop going down the rabbit hole, but it would be trivial to run
fail2ban on the server for good measure. Right now we also still need to copy
the token from stdout when the server starts, but the config file could be
modified to avoid that.
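For instance, the classic notebook ships a small helper to hash a password
that can then be pinned in the config, skipping the token dance entirely (a
sketch; run the hash step once on your own machine):

python -c 'from notebook.auth import passwd; print(passwd())'

Then paste the resulting hash into c.NotebookApp.password in the config file.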