Using Vagrant for Local Cassandra Development

Ever since joining DataStax this year, I’ve spent a lot of time learning and using both Cassandra and the DataStax Enterprise version of it. To really get into it, I wanted to be able to quickly build up and tear down local clusters, without affecting my primary development system (Mac PowerBook).

Vagrant’s tagline says it well:

Create and configure lightweight, reproducible, and portable development environments.

To help those who want to learn and develop with Cassandra, I’ve created a set of sample “getting started” templates and shared them on GitHub: bcantoni/vagrant-cassandra

Take a look at the screencasts linked below, then check out the GitHub project for the detailed instructions.
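
As a rough idea of the build-up and tear-down cycle, here is a minimal sketch using the standard `vagrant up`, `vagrant status`, and `vagrant destroy -f` commands. The `run` and `cluster_cycle` helpers are hypothetical names for illustration, not part of the GitHub project, and the template directory is whatever folder you pick from the repository. Setting `DRY_RUN=1` only prints the commands instead of running them.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# cluster_cycle TEMPLATE_DIR
# Bring the VMs defined in TEMPLATE_DIR up, show their state, then destroy them.
# The subshell keeps the caller's working directory unchanged.
cluster_cycle() {
  template_dir="$1"
  (
    cd "$template_dir" &&
      run vagrant up &&
      run vagrant status &&
      run vagrant destroy -f
  )
}
```

In day-to-day use you would normally work inside the VMs (for example with `vagrant ssh`) between the `up` and `destroy` steps rather than running them back to back.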

Installing DataStax Enterprise on Digital Ocean

These instructions explain how to install DataStax Enterprise on a set of Digital Ocean droplets. DataStax provides instructions for Installing on cloud providers, but currently only Amazon EC2 and HP Cloud are described specifically.

The steps below can be used for Digital Ocean, or more generally for any other cloud provider. We’ll create a set of Ubuntu droplets and install DataStax Enterprise (DSE) on them to create a Cassandra cluster.

Update: Scroll to the bottom for a video demo of these install steps.

These are the relevant DataStax documentation pages if you want to learn more details behind each step:

Prerequisites

  1. Register for DataStax Enterprise (free, allows use of DataStax Enterprise in dev/test environments)
  2. An active Digital Ocean account (referral link if you don’t have an account yet)

Creating Digital Ocean Droplets

  1. Log in to Digital Ocean and navigate to the control panel
  2. On your local system create an SSH key and store it in the Digital Ocean control panel (help)
  3. On the control panel click Create with settings like:
    • Hostname: node0
    • Size: 4 GB / 2 CPU
    • Region: default
    • Image: Linux Ubuntu 14.04 64-bit
    • SSH key: Select the one created previously
    • Settings: default
  4. Repeat for node1, node2, etc. (as many nodes as desired)
  5. As the nodes are coming up make note of the IP addresses

Installing DataStax Enterprise

For a faster install, see Parallel Installs below.

  1. SSH into the first node
  2. Confirm whether Java is already installed (it may be, depending on the Linux image); if not, install either OpenJDK Java or Oracle Java
  3. Add the DataStax repository using the username and password from your registration:

    echo "deb http://username:password@debian.datastax.com/enterprise stable main" | sudo tee -a /etc/apt/sources.list.d/datastax.sources.list

  4. Add the DataStax repository key:

    curl -L --silent https://debian.datastax.com/debian/repo_key | sudo apt-key add -

  5. Update the local package cache:

    sudo apt-get update

    (if you see any “403 Not Authorized” errors here, stop and make sure your username and password are correct)

  6. Install DataStax Enterprise:

    sudo apt-get install dse-full

  7. Edit /etc/hosts and add an entry for the host with its public IP address (replacing the 127.0.1.1 entry if it exists)

  8. Edit /etc/dse/cassandra/cassandra.yaml and change a couple of settings:

    • Set cluster_name as desired
    • In the seeds field, list the IP address of node0 (the first server will be the seed for the cluster)
    • Set listen_address to blank
    • Set num_tokens to 256
  9. Start the DSE service:

    sudo service dse start

  10. Repeat the above steps for the remaining nodes (node1, node2, etc.)

  11. SSH to any of the nodes and check the status of the DSE cluster:

    nodetool status
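
When repeating the install on several nodes, the cassandra.yaml edits from step 8 are the easiest part to get wrong by hand. Here is a minimal sketch of scripting them with `sed`. The function name `configure_cassandra_yaml` is made up for this example, and it assumes the file still has the stock layout (top-level `cluster_name`, `listen_address`, and `num_tokens` keys, plus a `- seeds: "..."` line). It also assumes the cluster name contains no `/` or `&` characters, which would confuse `sed`.

```shell
#!/bin/sh
# configure_cassandra_yaml FILE CLUSTER_NAME SEED_IP
# Applies the four step-8 edits in place and keeps a backup as FILE.bak.
configure_cassandra_yaml() {
  file="$1"
  cluster="$2"
  seed="$3"
  # 1. set cluster_name
  # 2. point the seeds entry at the seed node, preserving its indentation
  # 3. blank out listen_address
  # 4. set num_tokens to 256 (uncommenting the line if it is commented out)
  sed -i.bak \
    -e "s/^cluster_name:.*/cluster_name: '${cluster}'/" \
    -e "s/^\([[:space:]]*- seeds:\).*/\1 \"${seed}\"/" \
    -e "s/^listen_address:.*/listen_address:/" \
    -e "s/^#\{0,1\}[[:space:]]*num_tokens:.*/num_tokens: 256/" \
    "$file"
}
```

On a node you would call it as root before starting the service, for example `configure_cassandra_yaml /etc/dse/cassandra/cassandra.yaml digdemo <node0 IP>`, and then check the result against the `.bak` copy with `diff`.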

Parallel Installs

You can run the above install commands on all nodes in parallel for a much faster setup time. On the Mac I use i2cssh, which drives several iTerm2 consoles in parallel.

This technique is borrowed from Jake Luciani’s video How to set up a 4 node Cassandra cluster in under 2 minutes.

Steps:

  1. Install i2cssh and iTerm2
  2. Create a file ~/.i2csshrc with the server IP addresses. For example this file defines 3 servers included in a cluster named ‘digdemo’:

    version: 2
    iterm2: false
    clusters:
      digdemo:
        login: root
        hosts:
          - 54.176.126.209
          - 54.176.91.139
          - 50.18.136.76
    
  3. Launch parallel terminal sessions: i2cssh -c digdemo

  4. Enable broadcast mode in iTerm2 with Cmd-Opt-I

  5. Type commands from the install procedure above; they should be echoed on all sessions in parallel
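
Since droplet IP addresses change every time you rebuild the cluster, the ~/.i2csshrc file from step 2 can be generated rather than edited by hand. This is a small sketch that prints the same layout as the example above; `gen_i2csshrc` is a hypothetical helper name, not something i2cssh provides.

```shell
#!/bin/sh
# gen_i2csshrc CLUSTER LOGIN IP [IP...]
# Prints an i2cssh config with one cluster containing the given hosts.
gen_i2csshrc() {
  cluster="$1"
  login="$2"
  shift 2
  printf 'version: 2\n'
  printf 'iterm2: false\n'
  printf 'clusters:\n'
  printf '  %s:\n' "$cluster"
  printf '    login: %s\n' "$login"
  printf '    hosts:\n'
  # One list entry per remaining argument
  for ip in "$@"; do
    printf '      - %s\n' "$ip"
  done
}
```

For the example cluster above: `gen_i2csshrc digdemo root 54.176.126.209 54.176.91.139 50.18.136.76 > ~/.i2csshrc`. Note that this overwrites any existing file, so it only suits a setup with a single cluster definition.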

Video Demo

Notes

Clean Up Your Twitter App Permissions

Twitter users should periodically review their application permission settings to clean out any old applications they have authorized. Over time these can pile up and it’s good to clean them out.

The steps are simple:

  1. Log in to Twitter and navigate to the Application Settings page
  2. Review the list of applications and click Revoke Access for any you no longer need or don’t remember authorizing

Twitter App Permissions

Joined DataStax Engineering Team

In April I joined DataStax as a director of engineering on the DataStax Enterprise engineering team. I meant to post something here during my first week, but I have been kind of busy since I started (understatement!). We sell an enterprise-class version of the open-source Cassandra database, along with service, support, and training. We also support the Cassandra community and the open-source project itself (the Apache Cassandra committee chair and many committers are all DataStax employees).

My first five weeks have been both busy and exciting. Here are some observations and highlights so far:

  • It’s great to work for a smaller company once again – everyone is very motivated and focused on the mission, and it’s a very small circle of decision-makers.
  • I’ve worked for companies with remote workers before (especially Citrix), but here we take it to a whole new level. We just call it a “distributed” workforce. In particular the engineering team is spread literally around the world. Many of our job postings list the location as “Anywhere, World” which is quite appropriate.
  • We really like using SaaS-based products, and have hardly any “infrastructure” hardware/servers of our own (just a few systems for Engineering & QA). Everything else is “in the cloud”.
  • I’ve had a big learning curve on distributed NoSQL databases in general, Cassandra, and all of the DataStax products.

DataStax is really growing quickly and we’re looking for strong people in a variety of areas. Check out the DataStax Careers page for current openings and let me know if I can help make a referral for you.

In particular these are some key open positions in my group:

  • Driver & Tools Engineer
  • Java Engineer
  • Software Engineer in Test

This Jobvite link will take you to the details page for those 3 positions.

Remind101 – Text Messaging for Classroom or Sports Teams

From the Twilio Customer Stories page I learned about Remind101, a free service designed for teachers to keep students and parents up to date via text messaging or email. From their website you can learn more about the service, or about the team.

This is a very useful service. While it was primarily designed for classrooms, it could also be used for sports teams or anywhere else you need a 1:Many text messaging system.

Remind101 designed their system specifically for this type of communication. By keeping a narrow focus, they’ve got a strong set of features like:

  • Privacy – no one participating (teacher, student, parent) has anyone else’s phone numbers; this would be very important for younger kids in particular
  • Only Group Messaging – there is no support for 1:1 messages; instead everything is sent to the entire group/class
  • One-way Messaging – students and parents cannot reply to any teacher message; I would like to see the ability to reply as well, in order to make it a better communication channel for the students back to the teacher
  • Mobile Apps – for the teacher side, they have apps for both Android and iOS
  • Email – as an alternative to text messaging, students/parents can receive messages via email

As a sports coach or manager, you could set up individual teams (“classes”), and connect with each of your teams separately as needed. For example, “16U Red”, “16U White”, and so on. This app could be used for a whole season, or just for a tournament weekend. (Just delete the group when the weekend is over.)

Here’s a screenshot of the web interface for sending a message:

Using Remind101 to send text messages to a sports team