Blog

  • Deploying Selenium

    I had the misfortune of trying to use Selenium in one of my upcoming projects. Actually, Selenium is a pretty amazing tool for automating website testing, but the dependencies can be tricky to nail.

    Installing on OSX is pretty straightforward:

    pip install selenium
    brew install chromedriver
    

    But this became a huge nightmare for me when installing remotely. Fortunately, the Selenium project releases a Docker image that one can run with this one-liner:

    docker run -d -p 4444:4444 --name selenium --shm-size=2g selenium/standalone-chrome:3.8.1-bohrium
    

    This is what you need to do in python:

    from selenium import webdriver
    
    # my_docker_host is usually localhost, but in Docker Toolbox is the ip of the
    # virtual machine
    selenium_server_url = 'http://my_docker_host:4444/wd/hub'
    
    options = webdriver.ChromeOptions()
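    # add_argument works across Selenium versions; set_headless() was removed in later releases
    options.add_argument('--headless')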
    capabilities = options.to_capabilities()
    
    driver = webdriver.Remote(desired_capabilities=capabilities,
                              command_executor=selenium_server_url)
    
    # then just use the driver as you would normally
    driver.get(some_url)
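
    One thing the snippet above glosses over is that the container takes a few seconds to start, so a script that connects right away may fail. Below is a minimal sketch of a wait loop using only the standard library. The names hub_url, hub_is_ready and wait_for_hub are made up for this example, and it assumes the standalone server answers GET /wd/hub/status with a JSON body whose value.ready field is true once it is up.

```python
import json
import time
import urllib.error
import urllib.request


def hub_url(host, port=4444):
    """Build the endpoint that webdriver.Remote expects (hypothetical helper)."""
    return 'http://{}:{}/wd/hub'.format(host, port)


def hub_is_ready(status_body):
    """Interpret the JSON body of the status endpoint (assumed shape: value.ready)."""
    try:
        payload = json.loads(status_body)
    except ValueError:
        return False
    value = payload.get('value') if isinstance(payload, dict) else None
    return isinstance(value, dict) and value.get('ready') is True


def wait_for_hub(url, timeout=30.0, interval=1.0):
    """Poll <url>/status until the server reports ready or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url + '/status', timeout=5) as resp:
                if hub_is_ready(resp.read().decode('utf-8')):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # container is probably still starting; try again
        time.sleep(interval)
    return False
```

    Calling wait_for_hub(hub_url('my_docker_host')) before webdriver.Remote(...) would then hold off until the server is up, or give up after the timeout.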
    

  • Virtualenv Workflow

    Over the past couple of days, I have gained an appreciation for the package-management tooling that has developed around the python ecosystem, from having to develop and deploy several python applications.

    I’d like to share some guides and tips for maintaining python environments.

    pyenv

    I highly recommend using pyenv to manage your python environments (and while we’re on that topic, rvm for Ruby, nenv for node). What these tools have in common is that they make sure you can maintain a separate set of dependencies for different projects.

    The canonical example of where this is useful is if you have two projects, A and B, that both depend on a library, let’s call it unicorn. While working on project A, you realize you need the absolute latest release of the unicorn library. But when you upgrade, project B breaks, because some of its code made assumptions about the previous version of unicorn. This is a problem if there is a single global installation that both projects share, which is why the “global” installation of dependencies can be dangerous.

    After installing pyenv, switching python versions becomes as easy as:

    pyenv shell anaconda-2.4.0
    

    I also highly recommend the pyenv-virtualenv plugin. It activates a virtual environment automatically based on the directory you’re in. The syntax looks like this:

    # this creates a new virtualenv managed by pyenv-virtualenv
    pyenv virtualenv <python-version> <env-name>
    
    # this creates a .python-version file in the local
    # directory, which will instruct the pyenv-virtualenv plugin to activate
    # the env whenever you switch to this directory
    pyenv local <env-name>
    

    Based on this, I recommend the following naming convention for the env name: <python-version>_<project-name>. This way BOTH the python version and the project name are captured. One of the downsides of pyenv virtualenv is that all the envs are stored next to each other, so they have to be namespaced somehow.

    So for example, I would do:

    pyenv virtualenv 3.5.1 3_5_1_myproject # assuming python 3.5.1
    pyenv local 3_5_1_myproject
    

    Now the local .python-version can be committed to source control, and if someone else needs to recreate the environment, they just need to make sure that it’s done in the context of a python 3.5.1 environment.
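
    The naming convention is easy to mechanize. Here is a small sketch that composes the env name and the two pyenv commands shown above. The helpers env_name and pyenv_commands are hypothetical names for this example, and it assumes a plain dotted version such as 3.5.1 (it would reject something like anaconda-2.4.0).

```python
import re


def env_name(python_version, project_name):
    """Compose <python-version>_<project-name>, with dots turned into underscores."""
    if not re.fullmatch(r'\d+(\.\d+)*', python_version):
        raise ValueError('expected a dotted version like 3.5.1')
    return '{}_{}'.format(python_version.replace('.', '_'), project_name)


def pyenv_commands(python_version, project_name):
    """Return the pyenv-virtualenv commands for a new project, in order."""
    name = env_name(python_version, project_name)
    return [
        'pyenv virtualenv {} {}'.format(python_version, name),
        'pyenv local {}'.format(name),
    ]


for command in pyenv_commands('3.5.1', 'myproject'):
    print(command)
```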

    pipenv

    I hadn’t heard about pipenv until several days ago, when I had to deploy a python application. I had become so used to npm’s --save-dev option, that I wondered, what was the equivalent for python?

    All virtual environments come bundled with a python package manager called pip. Pip is pretty wonderful, but it has many limitations, one of which is the lack of the concept of dev packages. For me, I wanted to install jupyter notebook for development but not for deployment, so this was a dealbreaker.

    Enter pipenv. Pipenv allows for specifying dependencies and locking them, like in most other ecosystems (package.json/package-lock.json, Podfile/Podfile.lock, etc.).

    pipenv install selenium
    pipenv install --dev jupyter
    

    gives us this Pipfile:

    [[source]]
    
    url = "https://pypi.python.org/simple"
    verify_ssl = true
    name = "pypi"
    
    
    [packages]
    
    selenium = "*"
    
    
    [dev-packages]
    
    jupyter = "*"
    

    which is really neat. Furthermore, if you have an environment set up using pyenv, pipenv will happily use it.

    Running pipenv install without the --dev flag installs only the [packages] dependencies and skips [dev-packages], which greatly simplifies deployment.


  • That thing in Hugo

    Hugo continues to make occasional splashes on the front page of Hacker News, and like many others who were a little tired of how long Jekyll took to render even small pages, I took the leap, and would like to share some of my experiences doing so.

    There is one caveat I should mention before you read ANY further. The biggest downside of Hugo, in my opinion, is that it does not come batteries-included with regard to SASS processing. All the blogs that mention how blazingly fast it is (and they are right) don’t mention this fact. IMHO, writing css in the modern day always involves a css preprocessor. Fortunately, SASS compilation is handled by many build systems, and I will share my setup at the end of this post.

    I didn’t have much luck with the hugo import jekyll command; it just ended up creating empty directories in the new project.

    Inspired by https://thatthinginswift.com, here are some basic translations that might help someone migrating from Jekyll. All entries are listed as Jekyll => Hugo.

    Helpful functions

    {{ page.title | relative_url }} => {{ .Title | relURL }}
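    {{ page.description | escape }} => {{ .Description }}
    {{ page.date | date: '%B %d, %Y' }} => {{ .Date | dateFormat "January 02, 2006" }}
    

    Page variables such as the title are passed in with the root context, so no reference to page is necessary. Hugo’s templates escape HTML by default, so Jekyll’s escape filter needs no counterpart; safeHTML does the opposite and marks a string as safe to emit unescaped.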
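
    The date line deserves a note. Hugo uses Go’s reference time (Mon Jan 2 15:04:05 MST 2006) instead of strftime directives, so each directive has to be swapped for the matching piece of that reference date. A zero-padded %d corresponds to 02, while an unpadded day is written 2 (Ruby’s %-d). Here is a rough sketch of the translation. The table covers only common directives and strftime_to_go is a hypothetical helper, not part of Jekyll or Hugo.

```python
import re

# strftime directive -> fragment of Go's reference time.
# Only common directives are listed; anything else raises an error.
GO_LAYOUT = {
    '%Y': '2006', '%y': '06',
    '%m': '01', '%B': 'January', '%b': 'Jan',
    '%d': '02', '%-d': '2', '%e': '_2',
    '%A': 'Monday', '%a': 'Mon',
    '%H': '15', '%I': '03', '%M': '04', '%S': '05',
    '%p': 'PM', '%Z': 'MST', '%z': '-0700',
}


def strftime_to_go(fmt):
    """Translate a strftime format string into a Go time layout."""
    def replace(match):
        directive = match.group(0)
        if directive not in GO_LAYOUT:
            raise ValueError('no Go layout known for ' + directive)
        return GO_LAYOUT[directive]
    return re.sub(r'%-?[A-Za-z]', replace, fmt)


print(strftime_to_go('%B %d, %Y'))    # January 02, 2006
print(strftime_to_go('%B %-d, %Y'))   # January 2, 2006
```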

    Rendering list of items

    # Jekyll - /blog.html
    {% for post in site.posts %}
      <li>
          <h2 class="post-title-home">
            {{ post.title | escape }}
          </h2>
        {{ post.content }}
      </li>
    {% endfor %}
    
    # Hugo - /layouts/blog/list.html
    {{ range .Data.Pages }}
        <h2 class="post-title-home">
          {{ .Title }}
        </h2>
      {{ .Render "li"}}
    {{ end }}
    

    A couple of things to note here:

    • The .Data.Pages variable is populated automatically depending on which section you are part of.
    • “li” is a template that lives in layouts/blog/li.html and tells a page how to render itself.

    Template inheritance

    
    # Jekyll
    
    # /_layouts/default.html
    <html>
      {% include head.html %}
      <body>
        {{ content }}
      </body>
    </html>
    
    # /_includes/head.html
    <meta>...</meta>
    <meta>...</meta>
    <meta>...</meta>
    
    # /_layouts/post.html
    <div>
      <h1> {{ page.title }} </h1>
      {{ content }}
    </div>
    
    # Hugo
    
    # /layouts can be substituted for /themes/themename, see documentation below.
    
    # /layouts/_default/baseof.html
    <html>
      {{ partial "head.html" . }}
      <body>
        {{ block "main" . }}
        {{ end }}
      </body>
    </html>
    
    # /layouts/partials/head.html
    <meta>...</meta>
    <meta>...</meta>
    <meta>...</meta>
    
    # /layouts/post/single.html
    {{ define "main" }}
    <div>
      <h1> {{ .Title }} </h1>
      {{ .Content }}
    </div>
    {{ end }}
    

    SASS compilation

    My old jekyll theme heavily used Bootstrap, so I needed a way to compile sass files. I ended up hacking an npm script to do this:

    {
      "name": "brightredchilli-website",
      "version": "0.0.1",
      "description": "Preprocessing code for a hugo site",
      "main": "index.js",
      "scripts": {
        "css:build": "node-sass --source-map true --output-style compressed './themes/brightredchilli/sass/main.scss' --glob -o ./themes/brightredchilli/static/css/",
        "css:watch": "onchange './themes/brightredchilli/sass/' -- npm run css:build",
        "build": "npm run css:build",
        "prewatch": "npm run build",
        "watch": "parallelshell 'npm run css:watch' 'hugo server --buildDrafts --verbose'",
        "start": "npm run watch",
        "deploy": "hugo --baseURL='https://www.yingquantan.com'"
      },
      "author": "Ying",
      "license": "MIT",
      "devDependencies": {
        "node-sass": "^4.7.2",
        "onchange": "^1.1.0",
        "parallelshell": "^3.0.2"
      },
      "dependencies": {}
    }
    

    The relevant part is the fact that I use a theme called brightredchilli, and put my sass files inside of the sass directory. Note that I don’t use static/sass, because that would cause hugo to copy the files over to the publish directory, which I don’t want. The ignoreFiles directive in config.yml didn’t seem to work for me.

    There are scripts that watch for changes, and node-sass compiles the changed files and puts the output into the static/css directory. Note that in the watch script I start the hugo server with the verbose flag.
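
    For anyone curious what a watcher like this boils down to, here is a minimal polling sketch. It is not how onchange is implemented, just the general idea under simple assumptions: take a snapshot of file modification times, compare it with the previous one, and call a rebuild callback when something differs. All function names are made up for this example.

```python
import os
import time


def snapshot(root):
    """Map every file under root to its last-modified time in nanoseconds."""
    state = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            state[path] = os.stat(path).st_mtime_ns
    return state


def changed_paths(before, after):
    """Paths that were added, removed, or modified between two snapshots."""
    return sorted(
        path for path in set(before) | set(after)
        if before.get(path) != after.get(path)
    )


def watch(root, on_change, interval=0.5, max_polls=None):
    """Call on_change(paths) whenever files under root change.

    max_polls limits the number of checks; None means run until interrupted.
    """
    previous = snapshot(root)
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval)
        current = snapshot(root)
        paths = changed_paths(previous, current)
        if paths:
            on_change(paths)
        previous = current
        polls += 1
```

    In my setup the callback would shell out to npm run css:build, and root would be the theme’s sass directory.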

    Another gotcha was that hugo server serves pages from memory. This means there is nothing on disk to inspect when checking the layout and output of the content, whether files were copied over successfully, and so on. I found myself periodically running the plain hugo command to generate the site into the publish directory (public/ by default).

    I found this reading extremely helpful for the migration process. It describes how to create a minimal theme, and the process really helps someone learn how Hugo renders content.

    I ended up killing a day doing this, and not everything migrated over properly, but I’m glad I did it.