Quick Guide to Vagrant on Amazon EC2

Here’s a really quick guide to using Vagrant to create virtual machines on Amazon Web Services EC2. I’ve gotten a lot of use out of Vagrant for local development, but sometimes it’s helpful to build out VMs in the cloud, particularly if your local machine isn’t very powerful.

These steps assume you already have Vagrant installed and have an Amazon Web Services account (and know how to use both).


First you’ll need to install the Vagrant AWS plugin:

vagrant plugin install vagrant-aws
vagrant box add dummy

Next, log in to your AWS console to get a few things:

  • AWS access key
  • AWS secret key
  • SSH keypair name
  • SSH private key file (.pem extension)
  • Make sure the default security group enables SSH (port 22) access from anywhere

I like to set these up as environment variables to keep them out of the Vagrantfile. On Mac or Linux systems you can add this to your ~/.profile file:

export AWS_KEY='your-key'
export AWS_SECRET='your-secret'
export AWS_KEYNAME='your-keyname'
export AWS_KEYPATH='your-keypath'
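Since the Vagrantfile reads these four variables from the environment, a quick check before running Vagrant can save a confusing failure later. The following is a minimal sketch, not part of vagrant-aws; the function name missing_aws_vars is made up for illustration, and it only assumes the four variable names used above.

```shell
#!/bin/sh
# Report which of the four variables used by the Vagrantfile are unset or empty.
# Prints one name per line; prints nothing when all four are set.
# (missing_aws_vars is a hypothetical helper name, not a Vagrant command.)
missing_aws_vars() {
  for name in AWS_KEY AWS_SECRET AWS_KEYNAME AWS_KEYPATH; do
    # Indirect variable lookup that works in plain POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
}
```

One way to use it is to run Vagrant only when the list comes back empty, for example: [ -z "$(missing_aws_vars)" ] && vagrant up --provider=aws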


Now we can configure our Vagrantfile with the specifics needed for AWS. Refer to the vagrant-aws documentation to understand all the options. In the example below we have all the AWS-related settings in the x.vm.provider :aws block:

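# Vagrantfile API version ("2" is the current configuration format)
VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.define :delta do |x|
    # Box used when running locally under VirtualBox
    x.vm.box = "hashicorp/precise64"
    x.vm.hostname = "delta"

    x.vm.provider :virtualbox do |v|
      v.name = "delta"
    end

    x.vm.provider :aws do |aws, override|
      aws.access_key_id = ENV['AWS_KEY']
      aws.secret_access_key = ENV['AWS_SECRET']
      aws.keypair_name = ENV['AWS_KEYNAME']
      aws.ami = "ami-a7fdfee2"
      aws.region = "us-west-1"
      aws.instance_type = "m3.medium"
      # On AWS the AMI supplies the image, so point at the placeholder "dummy" box
      override.vm.box = "dummy"
      override.ssh.username = "ubuntu"
      override.ssh.private_key_path = ENV['AWS_KEYPATH']
    end
  end
end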

See this GitHub gist for a longer example file.

Now you can bring up the VM by specifying the AWS plugin as the provider:

vagrant up --provider=aws

After about a minute, the VM should be up and running and available for SSH:

$ vagrant up --provider=aws
Bringing machine 'delta' up with 'aws' provider...
==> delta: Launching an instance with the following settings...
==> delta:  -- Type: m3.medium
==> delta:  -- AMI: ami-a7fdfee2
==> delta:  -- Region: us-west-1
==> delta:  -- Keypair: briancantoni
==> delta:  -- Block Device Mapping: []
==> delta:  -- Terminate On Shutdown: false
==> delta:  -- Monitoring: false
==> delta:  -- EBS optimized: false
==> delta:  -- Assigning a public IP address in a VPC: false
==> delta: Waiting for instance to become "ready"...
==> delta: Waiting for SSH to become available...
==> delta: Machine is booted and ready for use!
==> delta: Rsyncing folder: /Users/briancantoni/dev/vagrant/aws/ => /vagrant

$ vagrant ssh
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-29-generic x86_64)



  • You need to configure a specific AMI for Vagrant to use. I find the Ubuntu Amazon EC2 AMI Finder very helpful for matching the version and region I want to use.
  • A common tripping point is the default security group not allowing SSH (port 22) from any IP address. Also make sure to add any other ports depending on your application (e.g., port 80 for HTTP).
  • Once you have the basics working, make sure to read through the vagrant-aws project to understand all the options available.
  • Make sure to vagrant destroy your VMs when done, and check the AWS Console to make sure they were terminated correctly (to avoid unexpected charges).
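If you prefer the command line over the AWS Console for the security group step, the AWS CLI can add the SSH rule. This is a rough sketch that assumes the AWS CLI is installed and configured, and that your group can be addressed by name (true for the default VPC; other VPCs need --group-id instead). The open_port wrapper and the DRY_RUN switch are made up for illustration; with DRY_RUN=1 the command is only printed and nothing changes in your account.

```shell
#!/bin/sh
# Build (and optionally run) the AWS CLI call that opens one TCP port
# on a security group. open_port and DRY_RUN are hypothetical names.
open_port() {
  group="$1"               # security group name, e.g. "default"
  port="$2"                # TCP port, e.g. 22 for SSH
  cidr="${3:-0.0.0.0/0}"   # source range; defaults to "from anywhere"
  set -- aws ec2 authorize-security-group-ingress \
    --group-name "$group" --protocol tcp --port "$port" --cidr "$cidr"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"              # preview only
  else
    "$@"                   # actually call the AWS CLI
  fi
}
```

For example, DRY_RUN=1 followed by open_port default 22 prints the command for review, and open_port default 80 would cover HTTP in the same way.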

Using Vagrant for Local Cassandra Development

Ever since joining DataStax this year, I’ve spent a lot of time learning and using both Cassandra and the DataStax Enterprise version of it. To really get into it, I wanted to be able to quickly build up and tear down local clusters, without affecting my primary development system (Mac PowerBook).

Vagrant’s tagline says it well:

Create and configure lightweight, reproducible, and portable development environments.

To help those that want to learn and develop with Cassandra, I’ve created a set of sample “getting started” templates and shared them on GitHub: bcantoni/vagrant-cassandra

Take a look at the screencasts linked below, then check out the GitHub project for the detailed instructions.

Installing DataStax Enterprise on Digital Ocean

These instructions explain how to install DataStax Enterprise on a set of Digital Ocean droplets. DataStax provides instructions for Installing on cloud providers, but currently only Amazon EC2 and HP Cloud are described specifically.

The steps below can be used for Digital Ocean, or more generally for any other cloud provider. We’ll create a set of Ubuntu droplets and install DataStax Enterprise (DSE) on them to create a Cassandra cluster.

Update: Scroll to the bottom for a video demo of these install steps.

These are the relevant DataStax documentation pages if you want to learn more details behind each step:


  1. Register for DataStax Enterprise (free, allows use of DataStax Enterprise in dev/test environments)
  2. An active Digital Ocean account (referral link if you don’t have an account yet)

Creating Digital Ocean Droplets

  1. Log in to Digital Ocean and navigate to the control panel
  2. On your local system, create an SSH key and store it in the Digital Ocean control panel
  3. On the control panel click Create with settings like:
    • Hostname: node0
    • Size: 4 GB / 2 CPU
    • Region: default
    • Image: Linux Ubuntu 14.04 64-bit
    • SSH key: Select the one created previously
    • Settings: default
  4. Repeat for node1, node2, etc. (as many nodes as desired)
  5. As the nodes are coming up make note of the IP addresses

Installing DataStax Enterprise

For a faster install, see Parallel Installs below.

  1. SSH into the first node
  2. Confirm whether Java is already installed (it may be, depending on the Linux image); if not, install either OpenJDK Java or Oracle Java
  3. Add the DataStax repository using the username and password from your registration:

    echo "deb stable main" | sudo tee -a /etc/apt/sources.list.d/datastax.sources.list

  4. Add the DataStax repository key:

    curl -L --silent | sudo apt-key add -

  5. Update the local package cache:

    sudo apt-get update

    (if you see any “403 Not Authorized” errors here, stop and make sure your username and password are correct)

  6. Install DataStax Enterprise:

    sudo apt-get install dse-full

  7. Edit /etc/hosts and add an entry for the host with its public IP address (replacing the entry if it exists)

  8. Edit /etc/dse/cassandra/cassandra.yaml and change a couple of settings:

    • Set cluster_name as desired
    • In the seeds field, list the IP address of node0 (the first server will be the seed for the cluster)
    • Set listen_address to blank
    • Set num_tokens to 256
  9. Start the DSE service:

    sudo service dse start

  10. Repeat the above steps for the remaining nodes (node1, node2, etc.)

  11. SSH to any of the nodes and check the status of the DSE cluster:

    nodetool status
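The cassandra.yaml changes in step 8 are the same on every node, so they are a natural candidate for scripting. Here is a minimal sketch using sed. It assumes the file still has the stock line shapes (cluster_name: at the start of a line, an indented "- seeds:" entry, listen_address: at the start of a line, and num_tokens: possibly commented out). The function name configure_cassandra_yaml is made up, and the cluster name must not contain slashes or quotes. A .bak copy of the original file is kept.

```shell
#!/bin/sh
# Apply the four cassandra.yaml settings from step 8 in one pass.
# Usage: configure_cassandra_yaml FILE CLUSTER_NAME SEED_IP
# (hypothetical helper; assumes stock line shapes in the file)
configure_cassandra_yaml() {
  file="$1"
  cluster="$2"
  seed_ip="$3"
  sed -i.bak \
    -e "s/^cluster_name:.*/cluster_name: '$cluster'/" \
    -e "s/^\([[:space:]]*\)- seeds:.*/\1- seeds: \"$seed_ip\"/" \
    -e "s/^listen_address:.*/listen_address:/" \
    -e "s/^#* *num_tokens:.*/num_tokens: 256/" \
    "$file"
}
```

Run as root on each node, the call would look something like configure_cassandra_yaml /etc/dse/cassandra/cassandra.yaml digdemo followed by the IP address of node0. It is worth diffing the result against the .bak file the first time.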

Parallel Installs

You can run the above install commands in parallel for a much faster setup time. On the Mac I use i2cssh, which drives several iTerm2 consoles in parallel.

This technique is borrowed from Jake Luciani’s video How to set up a 4 node Cassandra cluster in under 2 minutes.


  1. Install i2cssh and iTerm2
  2. Create a file ~/.i2csshrc with the server IP addresses. For example this file defines 3 servers included in a cluster named ‘digdemo’:

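    version: 2
    iterm2: false
    clusters:
      digdemo:
        login: root
        hosts: [NODE0_IP, NODE1_IP, NODE2_IP]   # replace with your droplets' IP addresses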
  3. Launch parallel terminal sessions: i2cssh -c digdemo

  4. Enable broadcast mode in iTerm2 with Cmd-Opt-I

  5. Type commands from the install procedure above; they should be echoed on all sessions in parallel
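If you are not on a Mac, or just want something scriptable, a plain shell loop over SSH gets a similar effect for non-interactive commands. This is a different technique from i2cssh broadcast mode, sketched under the assumption that you log in as root with the SSH key created earlier. The function name run_on_all and the SSH_CMD override are made up for illustration; setting SSH_CMD=echo previews what would run without connecting anywhere.

```shell
#!/bin/sh
# Run one command on every host in the background, then wait for all to finish.
# Usage: run_on_all "COMMAND" HOST [HOST...]
# (hypothetical helper; SSH_CMD defaults to ssh)
run_on_all() {
  cmd="$1"
  shift
  for host in "$@"; do
    # Each connection is started in the background so the hosts work in parallel.
    ${SSH_CMD:-ssh} "root@$host" "$cmd" &
  done
  wait   # block until every background job has exited
}
```

A call would look like run_on_all "apt-get update" followed by the node IP addresses. Output from the hosts is interleaved, so this suits quiet commands better than chatty ones.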

Video Demo


Screencast: Blog Backup to

I recently started using Mover to back up my blog. Mover is a relatively new service which can migrate or back up between several different cloud services. I’m starting to use it as part of my backup strategy, making sure even files I have “in the cloud” are located in more than one service.

The pricing model is also quite simple: they charge $1 per GB transferred. You deposit money in your account and it draws down. In the case of this blog, the initial data transfer was about 1 GB, and the daily incremental backups are usually under 1 MB. At that rate my initial $10 will last quite a while.
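To put a rough number on "quite a while", here is the arithmetic as a small sketch. It assumes strictly linear billing at $1 per GB with no minimum charge or rounding, 1 GB = 1000 MB, and a full 1 MB every day (an upper bound, since the daily backups are usually smaller). The function name days_of_backups is made up for illustration.

```shell
#!/bin/sh
# Estimate how many daily incremental backups a prepaid balance covers.
# Amounts are tracked in mills (1/1000 of a dollar) so integer math is enough.
days_of_backups() {
  deposit_dollars="$1"   # prepaid amount, e.g. 10
  initial_gb="$2"        # size of the first full transfer in GB, e.g. 1
  daily_mb="$3"          # size of each daily incremental in MB, e.g. 1
  # $1 per GB is 1000 mills per GB, which is 1 mill per MB.
  balance=$(( deposit_dollars * 1000 - initial_gb * 1000 ))
  echo $(( balance / daily_mb ))
}

days_of_backups 10 1 1   # prints 9000
```

Under those assumptions the remaining $9 covers about 9000 days of incrementals, or a little over 24 years.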

To demonstrate the steps, here’s a short screencast in which I add a regular backup task from part of my blog to the Box cloud service: Demo – Back up blog to Box cloud service

Tech Advent Calendars – 2013

It’s that time of the year again – Advent calendars for many tech communities. As in past years (2011, 2012), I’ve gathered a few here that should be interesting:

Web Advent is taking the year off.

I have a combined RSS feed (created with Yahoo! Pipes) that picks up all of these advent calendars: (Yahoo Pipe source).