Jupyter in the Cloud
I recently read Joe Feeney’s amazing guide on how to get Jupyter set up in the cloud. Having suffered through trying to optimize models on my laptop, I was really excited about the ability to do this, but in an automated way, of course.
I would recommend two small additions on top of that post:
- Use the Amazon Linux Deep Learning AMIs, so that most deep learning frameworks (Keras + TensorFlow, Theano, numpy) and low-level libraries (like CUDA) come pre-installed, with no need to waste precious time installing Anaconda. I haven’t investigated this thoroughly, but it appears that these AMIs come with 30 GB of free storage, much more than the 8 GB limit that comes with the Ubuntu AMIs.
- Actually secure the server. Fortunately, this is really easy to do with Ansible roles.
If you are new to Ansible and Terraform, this might not be the best post to start with, as I will only cover the broad strokes.
Provision the server
The relevant parts here are opening an incoming port on the server so that the Jupyter notebook server can listen on it, in addition to the default ssh port that needs to be exposed for Ansible. I had already set up an AWS key pair and a security group enabling outbound access and opening the ssh port. As you can see below, I also use Cloudflare to provision an A record so that we can set up SSL.
Note that I also write out a local file that is configured to be my Ansible hosts file. You can make an ansible.cfg file to do this.
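For reference, a minimal ansible.cfg sketch that points Ansible at the generated hosts file (the path assumes the local_file resource in the config below):

# ansible.cfg
[defaults]
inventory = ./ansible/hosts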
# config.tf
provider "aws" {
  access_key = "${var.aws_access_key}"
  secret_key = "${var.aws_secret_key}"
  region     = "${var.region}"
}

provider "cloudflare" {
  email = "${var.cloudflare_email}"
  token = "${var.cloudflare_api_key}"
}

resource "aws_security_group" "notebook_access" {
  name        = "jupyter_access"
  description = "Allow access on Jupyter default port"

  ingress {
    from_port   = 8888
    to_port     = 8888
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags {
    Name = "allow_notebook_access"
  }
}

data "aws_security_group" "default_security_group" {
  id = "${var.aws_default_security_group_id}"
}

resource "aws_instance" "chestnut" {
  ami           = "${lookup(var.deep_learning_amis, var.region)}"
  instance_type = "p2.xlarge"
  key_name      = "deployer-key" # already exists through other configuration

  security_groups = ["${data.aws_security_group.default_security_group.name}", "${aws_security_group.notebook_access.name}"]

  count = "${var.count}"
}

resource "cloudflare_record" "chestnut" {
  domain = "${var.cloudflare_domain}"
  name   = "chestnut"
  value  = "${aws_instance.chestnut.public_ip}"
  type   = "A"
}

resource "local_file" "ansible_hosts" {
  filename = "${path.module}/ansible/hosts"

  content = <<EOF
[web]
${cloudflare_record.chestnut.hostname}
EOF
}
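The config above looks up the AMI id from a map variable. A sketch of what var.deep_learning_amis might look like (the AMI id here is a placeholder, not a real id; look up the current Deep Learning AMI for your region):

# variables.tf (sketch)
variable "deep_learning_amis" {
  type = "map"

  default = {
    us-east-1 = "ami-00000000" # placeholder
  }
}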
Configure notebook
Using a playbook, we can handle SSL certificate signing and updating the notebook config in one fell swoop.
---
- hosts: web
  gather_facts: no
  remote_user: ec2-user
  vars:
    domain: "mydomain.com"
    notebook_config_path: "~/.jupyter/jupyter_notebook_config.py"
    certbot_install_from_source: yes
    certbot_auto_renew: yes
    certbot_auto_renew_user: "{{ ansible_user }}"
    certbot_auto_renew_minute: 20
    certbot_auto_renew_hour: 5
    certbot_admin_email: "{{ email }}"
    certbot_create_if_missing: yes
    certbot_create_standalone_stop_services: []
    certbot_create_command: "{{ certbot_script }} certonly --standalone --noninteractive --agree-tos --email {{ cert_item.email | default(certbot_admin_email) }} -d {{ cert_item.domains | join(',') }} --debug"
    certbot_certs:
      - domains:
          - "{{ domain }}"
  roles:
    - role: geerlingguy.certbot
      become: yes
  tasks:
    - name: Enable daily security updates
      become: yes
      package:
        name: yum-cron-security.noarch
        state: present
    - name: Ensure that cert keys can be read
      become: yes
      file:
        path: /etc/letsencrypt/live
        mode: a+rx
        recurse: yes
    - name: Ensure that archive is readable too
      become: yes
      file:
        path: /etc/letsencrypt/archive
        mode: a+rx
        recurse: yes
    - name: Update certfile
      replace:
        path: "{{ notebook_config_path }}"
        regexp: '.*c.NotebookApp\.certfile.*'
        replace: "c.NotebookApp.certfile = '/etc/letsencrypt/live/{{ domain }}/fullchain.pem'"
    - name: Update keyfile
      replace:
        path: "{{ notebook_config_path }}"
        regexp: '.*c.NotebookApp\.keyfile.*'
        replace: "c.NotebookApp.keyfile = '/etc/letsencrypt/live/{{ domain }}/privkey.pem'"
    - name: Configure notebook to bind to all ips
      replace:
        path: "{{ notebook_config_path }}"
        regexp: '.*c.NotebookApp\.ip.*'
        replace: "c.NotebookApp.ip = '*'"
    - name: Don't open browser by default
      replace:
        path: "{{ notebook_config_path }}"
        regexp: '.*c.NotebookApp\.open_browser.*'
        replace: "c.NotebookApp.open_browser = False"
Some interesting things to point out here:
- Let’s Encrypt support for Amazon Linux AMIs is in development, so I had to essentially copy over certbot_create_command and add the ‘--debug’ flag.
- certbot_create_standalone_stop_services has to be set to [] for me, since the role assumes nginx is running by default, and the script fails when nginx is not running.
- You might need to install the geerlingguy.certbot role if you haven’t already:
ansible-galaxy install geerlingguy.certbot
The rest is straightforward, and the playbook can be extended to set more options on the config file!
With that done, all that is left is to ssh into the server, source the right environment, and run the notebook (with a command like jupyter notebook). I guess this could be daemonized, but I like to stay sshed in to have confirmation that the notebook is still alive. I ran into an issue trying to debug this on a t2.nano instance, where the notebook would continually crash, and it was good to see some output.

I had to stop going down the rabbit hole, but it would be trivial to run fail2ban on the server for good measure. Right now we also still need to copy the token from stdout when the server starts, but the config file could be modified to avoid that, as sketched below.
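One way to do that, assuming the classic notebook server of that era: bake a hashed password into jupyter_notebook_config.py so that the token prompt goes away. A minimal sketch (the hash is a placeholder):

# Run once in a Python shell on the server to generate a password hash:
#   from notebook.auth import passwd
#   passwd()  # prompts for a password and prints a hash like 'sha1:...'

# Then in ~/.jupyter/jupyter_notebook_config.py, set:
c.NotebookApp.password = u'sha1:<generated-hash>'  # placeholder, paste your hash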
-
Working With EBS
For the web project I am currently working on, I started needing more disk space on my AWS instance than the free 8 GB that comes with t2 instances. Eventually I should probably move to S3 to host static assets like these, but for now I took the opportunity to learn how to attach EBS volumes to my EC2 instances.
I was surprised at how much patching needed to be done on top of Terraform to properly mount EBS volumes. The relevant resources are aws_ebs_volume, aws_volume_attachment, and aws_instance.
Basically, aws_ebs_volume is what represents an EBS volume, and aws_volume_attachment is what associates that volume with an aws_instance. This in and of itself is not hard to grasp, but there are several gotchas. When defining the aws_instance, it is important to specify not only the region but the specific availability zone:

resource "aws_instance" "instance" {
  availability_zone = "us-east-1a"
  ...
}

resource "aws_ebs_volume" "my_volume" {
  availability_zone = "us-east-1a"
  size              = 2
  type              = "gp2"
}
This guarantees that the instance lands in the same availability zone as the EBS volume, since a volume can only be attached to an instance in the same zone. Now, if you don’t care about what’s on the EBS volume and can blow it away each time the EC2 instance changes, then this is not an issue, and you can simply compute the availability zone each time:
resource "aws_instance" "instance" { ... } resource "aws_ebs_volume" "my_volume" { availability_zone = "${aws_instance.instance.availability_zone}" ... }
This is a long way of saying that if you need an EBS volume that persists across EC2 instance restarts, then you must specify the availability zone explicitly.
Another issue is that even after attaching the EBS volume to an instance, the volume must actually be mounted to be used. It turns out that you must modify /etc/fstab to mount the drive, but you have to do it after the volume attachment has finished. A remote provisioner can be used here:
resource "aws_volume_attachment" "attachment" { device_name = "/dev/sdh" skip_destroy = true volume_id = "${aws_ebs_volume.my_volume.id}" instance_id = "${aws_instance.instance.id}" provisioner "remote-exec" { script = "mount_drives.sh" connection { user = "deploy_user" private_key = "${file("~/.ssh/id_rsa")}" host = "${aws_instance.instance.public_ip}" } } }
I picked the name /dev/sdh for my volume, but Ubuntu/Debian maps this name to /dev/xvdh, and the mapping, though always consistent within a Linux distro, will differ between distros. AWS Linux AMIs apparently create symbolic links so that the name you chose for the volume is preserved. In any case, mount_drives.sh looks roughly like this:
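#!/usr/bin/env bash
# mount_drives.sh - a sketch, not necessarily the original script.
# Assumptions: Ubuntu maps the attachment's /dev/sdh to /dev/xvdh, and the
# volume should be formatted as ext4 the first time it is used.

DEVICE=/dev/xvdh
MNTPOINT=/images

# Create a filesystem only if the device doesn't already have one, so that
# re-running this script doesn't wipe existing data.
if ! sudo file -s "$DEVICE" | grep -q filesystem; then
  sudo mkfs -t ext4 "$DEVICE"
fi

sudo mkdir -p "$MNTPOINT"

# Persist the mount across reboots via fstab, then mount it now.
if ! grep -q "$DEVICE" /etc/fstab; then
  echo "$DEVICE $MNTPOINT ext4 defaults,nofail 0 2" | sudo tee -a /etc/fstab
fi

if ! mountpoint -q "$MNTPOINT"; then
  sudo mount "$MNTPOINT"
fi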
Here we take the device corresponding to the name we gave in the volume attachment above and mount it at MNTPOINT, which in my case is /images. This ensures that after this provisioner is run, we will have usable space at /images.

Which brings us to the last gotcha: since Terraform doesn’t know anything about mounting, we also have to unmount the volume when we destroy the instance, which means one more configuration on the instance. If the EC2 instance gets destroyed, we make sure to unmount the volume first:
resource "aws_instance" "instance" { ... provisioner "remote-exec" { when = "destroy" inline = ["sudo umount -d /dev/xvdh"] # see aws_volume_attachment.attachment.device_name. This gets mapped to /dev/xvdh connection { user = "deploy_user" private_key = "${file("~/.ssh/id_rsa")}" } } }
I wish that Terraform supported these kinds of use cases out of the box, but fortunately it is flexible enough that the workarounds can be implemented fairly easily.
-
Setting up logwatch
Part of managing Linux instances is understanding the state of the machine so that it can be troubleshot. I really needed to get logs out of my machine, so I set out to learn a very well-known tool for log summarization called logwatch.
There are three parts to any logwatch ‘service’. I put this term in quotes because I haven’t defined it yet, but also because it is different from the Unix concept of a service. Generally, it encompasses the type of logs you wish to summarize.
- A logfile configuration (located in <logwatch_root_dir>/logfiles/mylog.conf)
- A service configuration (located in <logwatch_root_dir>/services/mylog.conf)
- A filter script (located in <logwatch_root_dir>/scripts/services/mylog)
These three pieces together are what form a service. What really helped me understand was to go through an example of one of these. Let’s take, for example, the http-error service.
Before we continue, a note about <logwatch_root_dir>. Logwatch internally looks for these configuration files in several directories. For Ubuntu, the lookup order is /usr/share/logwatch/default.conf > /usr/share/logwatch/dist.conf > /etc/logwatch/, the idea being that each successive directory overrides parameters from the previous location. This is covered really well here.

Logfile Configuration
 1  # /usr/share/logwatch/default.conf/logfiles/http-error.conf
 2  ########################################################
 3  # Define log file group for httpd
 4  ########################################################
 5
 6  # What actual file? Defaults to LogPath if not absolute path....
 7  LogFile = httpd/*error_log
 8  LogFile = apache/*error.log.1
 9  [ ... truncated ]
10
11  # If the archives are searched, here is one or more line
12  # (optionally containing wildcards) that tell where they are...
13  # If you use a "-" in naming add that as well -mgt
14  Archive = archiv/httpd/*error_log.*
15  Archive = httpd/*error_log.*
16  Archive = apache/*error.log.*.gz
17  [ ... truncated ]
18
19
20  # Expand the repeats (actually just removes them now)
21  *ExpandRepeats
22
23  # Keep only the lines in the proper date range...
24  *ApplyhttpDate
25
Lines 7-8 are essentially file filters controlling which files from the log root logwatch will feed into your service. This is a pretty great idea, because you could potentially generate a custom log based on many different kinds of logs. For example, your custom log could incorporate the number of HTTP access errors your server encountered in a given time period. If absolute paths are not given, paths are relative to the default log root, /var/log/.
Lines 14-15 show that you can also search archive files for the same log information.
Line 21 seems to be some leftover unused code; it was meant to expand the logs when standard syslog files have the message “Last message repeated n times”. As the comment indicates, it now just removes the repeats.
Line 24 is interesting. The * tells the logwatch Perl script to apply a filter function to all lines of this file. Looking at <logwatch_root_dir>/scripts/shared/applyhttpdate, we can see that this filters the log lines by date, assuming a certain header format for each line in the file. Logwatch provides a couple of standard filters with intuitive names like onlycontains, remove, etc.

Service Configuration
So now we know how logwatch finds the logs that might be of interest to us. What does it do with these files? For that, we have to look at the service configuration file:
1  # /usr/share/logwatch/default.conf/services/http-error.conf
2  Title = http errors
3
4  # Which logfile group...
5  LogFile = http-error
6
7  Detail = High
The directive on Line 2 is straightforward - what should this log be named? When the log output is generated, this is what goes in the headers.
Line 5, confusingly, tells logwatch which logfile “group” it’s interested in. This is simply the logfile configuration we looked at earlier, minus the .conf extension. However, just as the logfile configuration can filter logs with different extensions and names, the service configuration can incorporate multiple logfile groups.
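For instance, a hypothetical service conf pulling from two logfile groups (the group names are made up for illustration) could look like this:

# hypothetical: combine two logfile groups into one report
Title = combined http report
LogFile = http-error
LogFile = http-access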
Filter Script
Finally, logwatch runs the output of all of the logs gathered by the configurations through a script with the same name as the service configuration, located in <logwatch_root_dir>/scripts/services/<servicename>. Most bundled scripts are Perl scripts, but the great thing is that you can use pretty much any scripting language.

I won’t actually go through /usr/share/logwatch/scripts/services/http-error here: for one, it’s pretty long, and two, I don’t understand Perl and can’t explain it very well :) However, the gist of it is that it takes the output of all the logs, summarizes it, and prints the result to stdout.
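In other words, the contract is simple: logwatch feeds the selected log lines to the script on stdin, and whatever the script writes to stdout ends up in the report. A minimal bash sketch of a filter (hypothetical, not one of the bundled scripts):

#!/usr/bin/env bash
# Hypothetical filter: logwatch pipes the selected log lines in on stdin;
# whatever we print to stdout appears in the report.
count=$(grep -ci "error")
echo "Found $count lines containing 'error'"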
My custom logwatch service doesn’t actually watch any logs, but I still need to write these three files. This was my final setup.
Logfile conf
# /etc/logwatch/conf/logfiles/customlogger.conf
# This is actually a hack - I ask for log files, but I never actually use them.
LogFile = *.log
*ExpandRepeats
Service conf
# /etc/logwatch/conf/services/customlogger.conf
Title = customlogger
LogFile = customlogger
Script
#!/usr/bin/env bash
# /etc/logwatch/scripts/services/customlogger
# I just need to know what the memory usage is like at this point in time.
top -o %MEM -n 1 -b | head -n 20
free -m
To test, I just ran:
$ logwatch --service customlogger

 --------------------- customlogger Begin ------------------------

 top - 21:36:06 up 4:16, 1 user, load average: 0.12, 0.05, 0.01
 Tasks: 144 total,   1 running, 143 sleeping,   0 stopped,   0 zombie
 %Cpu(s):  2.4 us,  0.6 sy,  0.0 ni, 96.8 id,  0.2 wa,  0.0 hi,  0.0 si,  0.0 st
 KiB Mem :   497664 total,    43716 free,   194456 used,   259492 buff/cache
 KiB Swap:        0 total,        0 free,        0 used.   280112 avail Mem

   PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 13373 root      20   0  202988  44448  12796 S  0.0  8.9   0:00.76 uwsgi
 11868 root      20   0  762376  41940   8192 S  0.0  8.4   1:04.90 dockerd
 13389 root      20   0  202988  35224   3568 S  0.0  7.1   0:00.00 uwsgi
 13390 root      20   0  202988  35220   3564 S  0.0  7.1   0:00.00 uwsgi
 13305 root      20   0   47780  15720   6432 S  0.0  3.2   0:02.16 supervisord
  9024 root      20   0  290696  13876   3268 S  0.0  2.8   0:02.55 fail2ban-se+
 31127 ubuntu    20   0   36412  11060   4316 S  0.0  2.2   0:00.07 logwatch
 11872 root      20   0  229844   9460   2536 S  0.0  1.9   0:00.28 containerd
  1162 root      20   0  266524   7372    488 S  0.0  1.5   0:00.02 snapd
 27837 root      20   0  101808   6844   5868 S  0.0  1.4   0:00.00 sshd
   417 root      20   0   62048   6356   3732 S  0.0  1.3   0:01.52 systemd-jou+
     1 root      20   0   55208   5796   3936 S  0.0  1.2   0:04.99 systemd
 13430 root      20   0   67824   5636   4896 S  0.0  1.1   0:00.02 sshd

               total        used        free      shared  buff/cache   available
 Mem:            486         190          42           7         253         273
 Swap:             0           0           0

 ---------------------- customlogger End -------------------------

 ###################### Logwatch End #########################
So if you need this to run every other hour, the last thing to do is to set up a cron job to do it. Pretty nifty, I think.
0 */2 * * * /usr/sbin/logwatch --service customlogger --output mail --mailto <your-email>