Installing DataStax Enterprise on Digital Ocean

These instructions explain how to install DataStax Enterprise on a set of Digital Ocean droplets. DataStax provides instructions for installing on cloud providers, but currently describes only Amazon EC2 and HP Cloud specifically.

The steps below can be used for Digital Ocean, or more generally for any other cloud provider. We’ll create a set of Ubuntu droplets and install DataStax Enterprise (DSE) on them to create a Cassandra cluster.

Update: Scroll to the bottom for a video demo of these install steps.

These are the relevant DataStax documentation pages if you want to learn more details behind each step:

Prerequisites

  1. Register for DataStax Enterprise (free, allows use of DataStax Enterprise in dev/test environments)
  2. An active Digital Ocean account (referral link if you don’t have an account yet)

Creating Digital Ocean Droplets

  1. Log in to Digital Ocean and navigate to the control panel
  2. On your local system, create an SSH key and store its public key in the Digital Ocean control panel (help)
  3. In the control panel, click Create and use settings like:
    • Hostname: node0
    • Size: 4 GB / 2 CPU
    • Region: default
    • Image: Linux Ubuntu 14.04 64-bit
    • SSH key: Select the one created previously
    • Settings: default
  4. Repeat for node1, node2, etc. (as many nodes as desired)
  5. As the nodes are coming up make note of the IP addresses
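Step 2 can be done from a terminal. Here is a minimal sketch, assuming OpenSSH's ssh-keygen is installed on your local system. The function name, key path and key comment are placeholders made up for this example, not anything Digital Ocean requires.

```shell
# Sketch: create a dedicated key pair for the droplets.
# make_droplet_key and the paths in the example are hypothetical names.
make_droplet_key() {
    key_file="$1"
    # -N "" creates the key without a passphrase; drop -N to be prompted for one.
    ssh-keygen -q -t rsa -b 4096 -N "" -C "dse-droplets" -f "$key_file"
}

# Example:
#   make_droplet_key ~/.ssh/do_dse
#   cat ~/.ssh/do_dse.pub                  # paste this into the control panel
#   ssh -i ~/.ssh/do_dse root@<node0-ip>   # log in once the droplet is up
```

Only the .pub file goes into the control panel; the private key stays on your local system.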

Installing DataStax Enterprise

For a faster install, see Parallel Installs below.

  1. SSH into the first node
  2. Confirm whether Java is already installed (it may be, depending on the Linux image); if not, install either OpenJDK Java or Oracle Java
  3. Add the DataStax repository using the username and password from your registration:

    echo "deb http://username:password@debian.datastax.com/enterprise stable main" | sudo tee -a /etc/apt/sources.list.d/datastax.sources.list

  4. Add the DataStax repository key:

    curl -L --silent https://debian.datastax.com/debian/repo_key | sudo apt-key add -

  5. Update the local package cache:

    sudo apt-get update

    (if you see any “403 Not Authorized” errors here, stop and make sure your username and password are correct)

  6. Install DataStax Enterprise:

    sudo apt-get install dse-full

  7. Edit /etc/hosts and add an entry for the host with its public IP address (replacing the 127.0.1.1 entry if it exists)

  8. Edit /etc/dse/cassandra/cassandra.yaml and change the following settings:

    • Set cluster_name as desired (it must be the same on every node)
    • In the seeds field, list the IP address of node0 (the first server will be the seed for the cluster)
    • Leave listen_address blank, so the node listens on the address its hostname resolves to (the /etc/hosts entry from step 7)
    • Set num_tokens to 256

  9. Start the DSE service:

    sudo service dse start

  10. Repeat the above steps for the remaining nodes (node1, node2, etc.)

  11. SSH to any of the nodes and check the status of the DSE cluster:

    nodetool status
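Steps 7 and 8 are hand edits, which gets tedious across several nodes. The sketch below shows one way to script them with sed. The function names are made up for this example, and the patterns assume the stock layout of /etc/hosts and cassandra.yaml (one setting per line, a `- seeds:` line inside seed_provider, num_tokens possibly commented out), so check the resulting files before starting DSE.

```shell
# Sketch only: function names are hypothetical, and the patterns assume the
# default file layouts. Each sed call keeps a .bak copy of the original file.

# Step 7: replace the 127.0.1.1 entry with "<public-ip> <hostname>".
set_hosts_entry() {
    hosts_file="$1"; ip="$2"; name="$3"
    sed -i.bak -e '/^127\.0\.1\.1[[:space:]]/d' "$hosts_file"
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Step 8: apply the four cassandra.yaml changes.
# The cluster name must not contain "/" or "'" because it is spliced
# into a sed expression.
configure_cassandra() {
    yaml="$1"; cluster="$2"; seed_ip="$3"
    sed -i.bak \
        -e "s/^cluster_name:.*/cluster_name: '$cluster'/" \
        -e "s/^\( *- seeds:\).*/\1 \"$seed_ip\"/" \
        -e "s/^listen_address:.*/listen_address:/" \
        -e "s/^#* *num_tokens:.*/num_tokens: 256/" \
        "$yaml"
}

# Example, run as root on node1 (IP addresses are placeholders):
#   set_hosts_entry /etc/hosts 203.0.113.11 node1
#   configure_cassandra /etc/dse/cassandra/cassandra.yaml "DSE Demo" 203.0.113.10
```

The seed address passed to configure_cassandra is node0's IP on every node, while the /etc/hosts entry uses each node's own IP.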

Parallel Installs

You can run the above install commands in parallel for a much faster setup time. On a Mac I use i2cssh, which drives several iTerm2 sessions in parallel.

This technique is borrowed from Jake Luciani’s video How to set up a 4 node Cassandra cluster in under 2 minutes.

Steps:

  1. Install i2cssh and iTerm2
  2. Create a file ~/.i2csshrc with the server IP addresses. For example, this file defines three servers in a cluster named ‘digdemo’:

    version: 2
    iterm2: false
    clusters:
      digdemo:
        login: root
        hosts:
          - 54.176.126.209
          - 54.176.91.139
          - 50.18.136.76
    
  3. Launch parallel terminal sessions: i2cssh -c digdemo

  4. Enable broadcast mode in iTerm2 with Cmd-Opt-I

  5. Type commands from the install procedure above; they should be echoed on all sessions in parallel
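If you are not on a Mac, or want something scriptable, a plain ssh loop gives a similar effect. This is a different technique from the i2cssh broadcast described above, sketched under a few assumptions: key-based root login works on every node, hosts.txt (a hypothetical file) holds one IP address per line, and run_on_all is a name made up for this example.

```shell
# Sketch: run one command on every host at the same time and keep per-host logs.
# run_on_all, hosts.txt and LOG_DIR are hypothetical names for this example.
run_on_all() {
    hosts_file="$1"; shift
    log_dir="${LOG_DIR:-.}"
    # "|| [ -n "$host" ]" also handles a last line without a trailing newline.
    while read -r host || [ -n "$host" ]; do
        [ -n "$host" ] || continue            # skip blank lines
        # -n stops ssh from reading stdin, which would swallow the host list;
        # BatchMode makes ssh fail rather than prompt for a password.
        ssh -n -o BatchMode=yes "root@$host" "$@" > "$log_dir/$host.log" 2>&1 &
    done < "$hosts_file"
    wait                                      # block until every host has finished
}

# Example:
#   run_on_all hosts.txt sudo apt-get update
#   run_on_all hosts.txt nodetool status
```

Nothing answers interactive prompts in this setup, so a command such as apt-get install needs the -y flag. Check the per-host log files afterwards for errors.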

Video Demo

Notes