No Huddle Offense

"Individual commitment to a group effort-that is what makes a team work, a company work, a society work, a civilization work."

Running a distributed cloud-native Python app on a CoreOS cluster

September 21st, 2014 • 1 Comment

Suricate is an open source Data Science platform. As it is architected to be a cloud-native app, it is composed of multiple parts: a web frontend and per-tenant execution nodes, which communicate via a RabbitMQ broker and store their data in MongoDB.

Up till now each part was running in a SmartOS zone in my test setup or with OpenShift gears. But I wanted to give CoreOS a shot and slowly get into using things like Kubernetes. This tutorial will hence guide you through creating the Docker image needed, deploying RabbitMQ & MongoDB, and deploying the services of Suricate itself on top of a CoreOS cluster. We'll use Suricate as the example case here, but these are also general instructions for running distributed Python apps on CoreOS.

Step 0) Get a CoreOS cluster up & running

Best done using VagrantUp and this repository.
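
A minimal way of doing so – assuming the repository in question is coreos/coreos-vagrant and that Vagrant plus VirtualBox are installed locally – looks roughly like this; add a fresh etcd discovery token to user-data and set the number of instances in config.rb:

$ git clone https://github.com/coreos/coreos-vagrant.git
$ cd coreos-vagrant
$ cp user-data.sample user-data    # insert a token from https://discovery.etcd.io/new
$ cp config.rb.sample config.rb    # set $num_instances, e.g. to 3
$ vagrant up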

Step 1) Creating a Docker image with the Python app embedded

Initially we need to create a Docker image which embeds the Python application itself. Therefore we will create an image based on Ubuntu and install the necessary requirements. To get started, create a new directory and initialize a git repository within it. Once done we'll embed the Python code we want to run using a git submodule.

$ git init
$ git submodule add https://github.com/engjoy/suricate.git

Now we'll create a little directory called misc and dump the Python scripts into it which run the frontend and the execution node of Suricate. The requirements.txt file is a pip requirements file.

 
$ ls -ltr misc/
total 12
-rw-r--r-- 1 core core 20 Sep 21 11:53 requirements.txt
-rw-r--r-- 1 core core 737 Sep 21 12:21 frontend.py
-rw-r--r-- 1 core core 764 Sep 21 12:29 execnode.py 
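
The contents of these scripts are not shown here; a rough sketch of what frontend.py could look like is given below. The suricate entry point used (suricate.frontend.run) and its parameters are hypothetical – check the suricate code for the real API – but the key point is that the connection URIs are read from environment variables which we will inject via docker run in step 3:

#!/usr/bin/env python
import os

# Hypothetical entry point - the real suricate API may differ.
from suricate import frontend

if __name__ == '__main__':
    # The URIs are injected by the CoreOS units via 'docker run -e ...' (see step 3).
    mongo_uri = os.environ['MONGO_URI']
    rabbit_uri = os.environ['RABBITMQ_URI']
    # Serve the web frontend on the port exposed by the container.
    frontend.run(mongo_uri=mongo_uri, rabbit_uri=rabbit_uri, port=8888)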

Now it is down to creating a Dockerfile which will install the requirements and make sure the suricate application is deployed:

 
$ cat Dockerfile
FROM ubuntu
MAINTAINER engjoy UG (haftungsbeschraenkt)

# apt-get stuff
RUN echo "deb http://archive.ubuntu.com/ubuntu/ trusty main universe" >> /etc/apt/sources.list
RUN apt-get update
RUN apt-get install -y tar build-essential
RUN apt-get install -y python python-dev python-distribute python-pip

# deploy suricate
ADD /misc /misc
ADD /suricate /suricate

RUN pip install -r /misc/requirements.txt

RUN cd suricate && python setup.py install && cd ..

Now all there is left to do is to build the image:

 
$ docker build -t engjoy/suricate .

Now we have a Docker image we can use for both the frontend and execution nodes of Suricate. When starting the Docker container we will just make sure to start the right script.

Note: once done, publish everything on Docker Hub – that will make life easier for you in the future.
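
Publishing is then just a matter of logging in and pushing the image we built above (assuming an engjoy account or organisation on Docker Hub):

$ docker login
$ docker push engjoy/suricate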

Step 2) Getting RabbitMQ and MongoDB up & running as units

Before getting Suricate up and running we need a RabbitMQ broker and a MongoDB database. These are just dependencies for our app – your app might need a different set of services. Download the Docker images first:

 
$ docker pull tutum/rabbitmq
$ docker pull dockerfile/mongodb

Now we need to define the RabbitMQ service as a CoreOS unit in a file called rabbitmq.service:

 
$ cat rabbitmq.service
[Unit]
Description=RabbitMQ server
After=docker.service
Requires=docker.service
After=etcd.service
Requires=etcd.service

[Service]
ExecStartPre=/bin/sh -c "/usr/bin/docker rm -f rabbitmq > /dev/null ; true"
ExecStart=/usr/bin/docker run -p 5672:5672 -p 15672:15672 -e RABBITMQ_PASS=secret --name rabbitmq tutum/rabbitmq
ExecStop=/usr/bin/docker stop rabbitmq
ExecStopPost=/usr/bin/docker rm -f rabbitmq

Now in CoreOS we can use fleet to start the rabbitmq service:

 
$ fleetctl start rabbitmq.service
$ fleetctl list-units
UNIT                    MACHINE                         ACTIVE  SUB
rabbitmq.service        b9239746.../172.17.8.101        active  running

The CoreOS cluster will make sure the docker container is launched and RabbitMQ is up & running. More on fleet & scheduling can be found here.
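
If a unit does not come up as expected, fleetctl can also tail its logs on whichever machine it got scheduled to (assuming fleetctl is configured to talk to your cluster):

$ fleetctl journal -f rabbitmq.service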

This step needs to be repeated for the MongoDB service – after all it is just a change of the Exec* lines above (mind the port setup!); a possible unit for it is sketched after the listing below. Once done, MongoDB and RabbitMQ will happily run:

 
$ fleetctl list-units
UNIT                    MACHINE                         ACTIVE  SUB
mongo.service           b9239746.../172.17.8.101        active  running
rabbitmq.service        b9239746.../172.17.8.101        active  running
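
A mongo.service along the following lines should do the trick – a sketch only, assuming the dockerfile/mongodb image pulled above and MongoDB's default port 27017; adjust names and options to your setup:

$ cat mongo.service
[Unit]
Description=MongoDB server
After=docker.service
Requires=docker.service
After=etcd.service
Requires=etcd.service

[Service]
ExecStartPre=/bin/sh -c "/usr/bin/docker rm -f mongo > /dev/null ; true"
ExecStart=/usr/bin/docker run -p 27017:27017 --name mongo dockerfile/mongodb
ExecStop=/usr/bin/docker stop mongo
ExecStopPost=/usr/bin/docker rm -f mongo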

Step 3) Run frontend and execution nodes of Suricate

Now it is time to bring up the Python application. As we have defined a Docker image called engjoy/suricate in step 1, we just need to define the units for CoreOS fleet again. For the frontend we create:

 
$ cat frontend.service
[Unit]
Description=Suricate frontend server
After=docker.service
Requires=docker.service
After=etcd.service
Requires=etcd.service

[Service]
ExecStartPre=/bin/sh -c "/usr/bin/docker rm -f suricate > /dev/null ; true"
ExecStart=/usr/bin/docker run -p 8888:8888 --name suricate -e MONGO_URI=<change uri> -e RABBITMQ_URI=<change uri> engjoy/suricate python /misc/frontend.py
ExecStop=/usr/bin/docker stop suricate
ExecStopPost=/usr/bin/docker rm -f suricate

As you can see, it uses the engjoy/suricate image from above and just runs the Python command. After starting the unit with fleetctl, the frontend is up & running. The same steps need to be repeated for the execution nodes. As we run at least one execution node per tenant, we'll end up with multiple units – a possible unit for one tenant is sketched after the listing below. After bringing up multiple execution nodes and the frontend, the list of units looks like:

 
$ fleetctl list-units
UNIT                    MACHINE                         ACTIVE  SUB
exec_node_user1.service b9239746.../172.17.8.101        active  running
exec_node_user2.service b9239746.../172.17.8.101        active  running
frontend.service        b9239746.../172.17.8.101        active  running
mongo.service           b9239746.../172.17.8.101        active  running
rabbitmq.service        b9239746.../172.17.8.101        active  running
[...]
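
For completeness, a unit for one tenant's execution node could look like this – again just a sketch along the lines of frontend.service above; the unit name, the URIs and whether any port needs to be published depend on your setup:

$ cat exec_node_user1.service
[Unit]
Description=Suricate execution node for user1
After=docker.service
Requires=docker.service
After=etcd.service
Requires=etcd.service

[Service]
ExecStartPre=/bin/sh -c "/usr/bin/docker rm -f exec_node_user1 > /dev/null ; true"
ExecStart=/usr/bin/docker run --name exec_node_user1 -e MONGO_URI=<change uri> -e RABBITMQ_URI=<change uri> engjoy/suricate python /misc/execnode.py
ExecStop=/usr/bin/docker stop exec_node_user1
ExecStopPost=/usr/bin/docker rm -f exec_node_user1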

Now your distributed Python app is happily running on a CoreOS cluster.


SmartStack = SmartOS + OpenStack (Part 3)

February 28th, 2012 • Comments Off on SmartStack = SmartOS + OpenStack (Part 3)

This is the third part of the blog post series. Previous parts can be found here: 1 2

To ensure the proper startup of nova-compute we will register it using SMF – this will also increase the dependability of the services. To start we will define a properties file for nova-compute – we will be using a Glance and a RabbitMQ host which run on a different machine:

--connection_type=fake
--glance_api_servers=192.168.56.101:9292
--rabbit_host=192.168.56.101
--rabbit_password=foobar
--sql_connection=mysql://root:foobar@192.168.56.101/nova

We will store these contents in the file /data/workspace/smartos.cfg. We will also create an XML file with the manifest definition – it has some dependencies defined:

<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='openstack'>
  <service name='openstack/nova/compute' type='service' version='0'>
    <create_default_instance enabled='true'/>
    <single_instance/>
    <dependency name='fs' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/system/filesystem/local'/>
    </dependency>
    <dependency name='net' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/network/physical:default'/>
    </dependency>
    <dependency name='zones' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/system/zones:default'/>
    </dependency>
    <exec_method name='start' type='method' exec='/usr/bin/nohup /data/workspace/nova/bin/nova-compute --flagfile=/data/workspace/smartos.cfg &amp;' timeout_seconds='60'>
      <method_context>
        <method_environment>
          <envvar name="PATH" value="/ec/bin/:$PATH"/>
        </method_environment>
      </method_context>
    </exec_method>
    <exec_method name='stop' type='method' exec=':kill' timeout_seconds='60'>
      <method_context/>
    </exec_method>
   <stability value='Unstable' />
  </service>
</service_bundle>

This manifest can be imported using the svccfg command, roughly as follows (assuming it was saved as /data/workspace/nova-compute.xml – the file name is up to you):
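
# importing the manifest - create_default_instance in the manifest enables the service right away
svccfg import /data/workspace/nova-compute.xml

After doing so we can use all the known SMF commands to verify that everything is up and running: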

[root@08-00-27-e3-a9-19 /data/workspace]# svcs -p openstack/nova/compute
STATE          STIME    FMRI
online         14:04:52 svc:/openstack/nova/compute:default
               14:04:51     3142 python

Now on to integrating vmadm…

SmartStack = SmartOS + OpenStack (Part 2)

February 15th, 2012 • 2 Comments

This is the second post in this series. Part 1 can be found here.

After setting up nova on SmartOS you should be able to launch the nova-compute service – Andy actually managed to get the nova-compute instance up and running:

./bin/nova-compute --connection_type=fake --sql_connection=mysql://root:admin@192.168.0.189/nova --fake_rabbit
2012-02-15 10:26:32,205 WARNING nova.virt.libvirt.firewall [-] Libvirt module could not be loaded. NWFilterFirewall will not work correctly.
2012-02-15 10:26:32,374 AUDIT nova.service [-] Starting compute node (version 2012.1-LOCALBRANCH:LOCALREVISION)
2012-02-15 10:26:33,099 INFO nova.rpc [-] Connected to AMQP server on localhost:5672

Note that for now we will use a fake RabbitMQ connection. The connection to the MySQL server however is mandatory.

This is the first step in the plan to get OpenStack running on SmartOS. The next step will be to look into integrating vmadm as a driver in OpenStack. The overall ideas we have can be found in this slide set on Google Docs.

SmartStack = SmartOS + OpenStack (Part 1)

February 12th, 2012 • 4 Comments

[Update] – The first shot didn’t work – so I updated this post!

This is the first post of hopefully a series of posts about getting OpenStack up and running on SmartOS. There is a blueprint for this effort. For this first post we will focus on setting up SmartOS and try to get some needed packages up and running.

I used the following SmartOS ISO image which will – when booted – guide you through a small setup routine: smartos-20120210T043623Z.iso. After this you need to configure the pkg repositories as described in this tutorial. I encourage you to have two disks attached – one for the zones and one for the data. Since the setup routine created the zones pool, I will first create a data pool on the second disk:

zpool create data c0t2d0
zfs create data/ec
zfs set mountpoint=/ec data/ec
zfs mount data/ec
cd / && wget -O- -q http://svn.everycity.co.uk/public/solaris/misc/pkg5-smartos-bootstrap-20111221.tar.gz | gtar -zxf-

Major dependencies are Python, GCC and some tools and libs – and we will also need git so we can get the source code of nova later on:

/ec/bin/pkg install pkg:/database/mysql-55/client@5.5.16-0.162 \
                    pkg:/library/python/mysqldb@1.2.3-0.162
/ec/bin/pkg install python26 git setuptools-26 gcc-44 libxslt gnu-binutils

The next step is to create a workspace file system using ZFS (optional) – this will allow me to do rollbacks later on during the porting of OpenStack:

zfs create data/workspace
cd /data/workspace/
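
Such a rollback would later be done via ZFS snapshots, roughly along these lines (the snapshot name is just an example):

zfs snapshot data/workspace@before-porting
# ... port / break things ...
zfs rollback data/workspace@before-porting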

Let’s clone nova and install pip:

export PATH=/ec/bin:$PATH
easy_install pip
git clone git://github.com/tmetsch/nova.git
cd nova/tools
export CFLAGS="-D_XPG6 -std=c99"
pip install -r pip-requires

Now let's see if we can get the tests up and running:

cd ..
./run_tests.sh

This will do for the first steps – I am going to get the dependencies of nova up and running next and will then publish the next blog post.

Percent done: 1%

Coffee instead of snakes – Openshift fun (2)

February 8th, 2012 • Comments Off on Coffee instead of snakes – Openshift fun (2)

In my last post I demoed how OpenShift can be used to deploy a WSGI-based Python application. But since OpenShift also supports other languages, including Java, I wanted to give it a shot.

So I developed a very minimalistic application which emulates a RESTful interface. Of course it takes the simplistic – all-time favorite – Hello World approach for this. Clients can create resources, which when queried will return the obligatory ‘Hello <resource name>’.

To start, create a new Java application in your OpenShift dashboard. When cloning the git repository which was created, you will notice that you basically get a Maven project with some templates in it. Now you can simply use Maven to test and develop your application. When done, do a git push and your application is ready to go.

The implementation is pretty straight forward:

/**
 * Simple REST Hello World Service.
 *
 * @author tmetsch
 *
 */
public class HelloServlet extends HttpServlet {
    [...]
    @Override
    protected final void doGet(final HttpServletRequest req,
            final HttpServletResponse resp) throws ServletException,
            IOException {
            [...]
    }
    [...]

So besides extending the HttpServlet class, all you need to do is implement the doGet and doPost methods. When done, edit the ‘web.xml’ file in the folder src/main/webapp:

[...]
<servlet>
    <servlet-name>HelloWorld</servlet-name>
    <servlet-class>the.ultimate.test.pkg.HelloServlet</servlet-class>
</servlet>
<servlet-mapping>
    <servlet-name>HelloWorld</servlet-name>
    <url-pattern>/users/*</url-pattern>
</servlet-mapping>
[...]

This will make sure the application can be found under the right route. If you like you can also add an index.html file in the same folder so people can read a brief introduction when visiting the top level of your application. In my case that would be http://testus-sample.rhcloud.com – the application itself can be found under: http://testus-sample.rhcloud.com/users/.
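
Talking to the service then looks roughly like this – a sketch only, since the exact request format for creating a resource depends on how doPost is implemented:

$ curl -X POST http://testus-sample.rhcloud.com/users/foo
$ curl http://testus-sample.rhcloud.com/users/foo
Hello foo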

So I like the approach OpenShift takes here with Maven. You can simply add your dependencies like jMock, JUnit etc. and during deployment it is taken care of that everything falls into place. If you write unit tests, Maven will also ensure that your application actually works – non-functional apps will obviously not be deployed when the tests fail. It is pretty easy to write unit tests for the HttpServlet class if you use a mocking framework like jMock. You can get the source code on GitHub.