Running a distributed native-cloud python app on a CoreOS cluster
September 21st, 2014
Suricate is an open source Data Science platform. As it is architected to be a native-cloud app, it is composed of multiple parts:
- a web frontend (which can be load-balanced)
- execution nodes which actually perform the data science tasks for the user (for now each user must have at least one exec node assigned)
- a mongo database for storage (which can be clustered for HA)
- a RabbitMQ messaging system (which can be clustered for HA)
Until now each part ran in a SmartOS zone in my test setup or in OpenShift Gears. But I wanted to give CoreOS a shot and slowly get into using things like Kubernetes. Hence this tutorial walks through creating the Docker image needed, deploying RabbitMQ & MongoDB, and deploying the services of Suricate itself on top of a CoreOS cluster. We’ll use Suricate as the example case here, but the same instructions apply in general to running distributed Python apps on CoreOS.
Step 0) Get a CoreOS cluster up & running
Best done using VagrantUp and this repository.
Step 1) Creating a docker image with the python app embedded
Initially we need to create a docker image which embeds the Python application itself. Therefore we will create an image based on Ubuntu and install the necessary requirements. To get started, create a new directory and initialize a git repository inside it. Once done we’ll embed the python code we want to run using a git submodule.
$ git init
$ git submodule add https://github.com/engjoy/suricate.git
Now we’ll create a little directory called misc and dump the python scripts in it which execute the frontend and execution node of suricate. The requirements.txt file is a pip requirements file.
$ ls -ltr misc/
total 12
-rw-r--r-- 1 core core  20 Sep 21 11:53 requirements.txt
-rw-r--r-- 1 core core 737 Sep 21 12:21 frontend.py
-rw-r--r-- 1 core core 764 Sep 21 12:29 execnode.py
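The contents of the two launcher scripts are not shown in this post. As a rough sketch of the configuration part only: the unit files in Step 3 pass MONGO_URI and RABBITMQ_URI into the container as environment variables, so a launcher could pick them up along the following lines. The function name load_settings, the error handling and the example URIs are assumptions for illustration, not Suricate's actual code.

```python
import os


def load_settings(environ=None):
    """Read the database and broker URIs from environment variables.

    MONGO_URI and RABBITMQ_URI are the variable names used in the unit
    files of Step 3; everything else here is a hypothetical sketch.
    """
    environ = os.environ if environ is None else environ
    required = ("MONGO_URI", "RABBITMQ_URI")
    missing = [key for key in required if not environ.get(key)]
    if missing:
        # Failing early makes the reason visible in the container's log output.
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return {
        "mongo_uri": environ["MONGO_URI"],
        "rabbitmq_uri": environ["RABBITMQ_URI"],
    }


# Example with placeholder URIs (the hosts are made up).
settings = load_settings({
    "MONGO_URI": "mongodb://mongo.example:27017",
    "RABBITMQ_URI": "amqp://rabbit.example:5672",
})
print(sorted(settings))
```

Passing the mapping in as an argument keeps the function easy to try outside of a container; inside the container it would simply be called without arguments.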
Now it is down to creating a Dockerfile which will install the requirements and make sure the suricate application is deployed:
$ cat Dockerfile
FROM ubuntu
MAINTAINER engjoy UG (haftungsbeschraenkt)
# apt-get stuff
RUN echo "deb http://archive.ubuntu.com/ubuntu/ trusty main universe" >> /etc/apt/sources.list
RUN apt-get update
RUN apt-get install -y tar build-essential
RUN apt-get install -y python python-dev python-distribute python-pip
# deploy suricate
ADD /misc /misc
ADD /suricate /suricate
RUN pip install -r /misc/requirements.txt
RUN cd suricate && python setup.py install && cd ..
Now all there is left to do is to build the image:
$ docker build -t engjoy/suricate .
Now we have a docker image we can use for both the frontend and execution nodes of suricate. When starting the docker container we will just make sure to start the right executable.
Note: Once done, publish everything on DockerHub. That will make life easier for you in the future.
Step 2) Getting RabbitMQ and MongoDB up & running as units
Before getting suricate up and running we need a RabbitMQ broker and a MongoDB database. These are just dependencies for our app; your app might need a different set of services. Download the docker images first:
$ docker pull tutum/rabbitmq
$ docker pull dockerfile/mongodb
Now we need to define the RabbitMQ service as a CoreOS unit in a file called rabbitmq.service:
$ cat rabbitmq.service
[Unit]
Description=RabbitMQ server
After=docker.service
Requires=docker.service
After=etcd.service
Requires=etcd.service
[Service]
ExecStartPre=/bin/sh -c "/usr/bin/docker rm -f rabbitmq > /dev/null ; true"
ExecStart=/usr/bin/docker run -p 5672:5672 -p 15672:15672 -e RABBITMQ_PASS=secret --name rabbitmq tutum/rabbitmq
ExecStop=/usr/bin/docker stop rabbitmq
ExecStopPost=/usr/bin/docker rm -f rabbitmq
Now in CoreOS we can use fleet to start the rabbitmq service:
$ fleetctl start rabbitmq.service
$ fleetctl list-units
UNIT              MACHINE                   ACTIVE  SUB
rabbitmq.service  b9239746.../172.17.8.101  active  running
The CoreOS cluster will make sure the docker container is launched and RabbitMQ is up & running. More on fleet & scheduling can be found here.
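The ACTIVE and SUB columns of the listing above are what tells you whether a unit really came up. A minimal sketch of checking them from a script follows; it assumes the output is whitespace-separated with a header row, exactly as in the listing shown here (other fleet versions may print additional columns). The helper names parse_list_units and not_running are made up for this example.

```python
def parse_list_units(output):
    """Turn the text printed by `fleetctl list-units` into a list of dicts.

    Assumes a header row followed by whitespace-separated columns,
    as in the listing shown in this post.
    """
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        return []
    header = [column.lower() for column in lines[0].split()]
    return [dict(zip(header, line.split())) for line in lines[1:]]


def not_running(units):
    """Return the names of units that are not in the active/running state."""
    return [
        unit["unit"]
        for unit in units
        if (unit.get("active"), unit.get("sub")) != ("active", "running")
    ]


SAMPLE = """\
UNIT MACHINE ACTIVE SUB
rabbitmq.service b9239746.../172.17.8.101 active running
"""

units = parse_list_units(SAMPLE)
print(not_running(units))
```

In real use the text would come from running fleetctl list-units, for example via the subprocess module, instead of the SAMPLE string.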
This step needs to be repeated for the MongoDB service. After all, it is just a change of the Exec* lines above (mind the port setups!). Once done, MongoDB and RabbitMQ will happily run:
$ fleetctl list-units
UNIT              MACHINE                   ACTIVE  SUB
mongo.service     b9239746.../172.17.8.101  active  running
rabbitmq.service  b9239746.../172.17.8.101  active  running
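To make the "just a change of the Exec* lines" point concrete, here is a small sketch that renders both unit files from one template. The template mirrors rabbitmq.service from above. For MongoDB it assumes the default port 27017, a container name of mongo and the description "MongoDB server"; none of these are taken from the actual mongo.service, and the helper render_unit is hypothetical.

```python
UNIT_TEMPLATE = """\
[Unit]
Description={description}
After=docker.service
Requires=docker.service
After=etcd.service
Requires=etcd.service

[Service]
ExecStartPre=/bin/sh -c "/usr/bin/docker rm -f {name} > /dev/null ; true"
ExecStart=/usr/bin/docker run {options} --name {name} {image}
ExecStop=/usr/bin/docker stop {name}
ExecStopPost=/usr/bin/docker rm -f {name}
"""


def render_unit(name, description, image, ports, env=None):
    """Render a fleet unit file; only the container-specific parts vary."""
    # Publish each port on the same host port, as the RabbitMQ unit does.
    options = ["-p {0}:{0}".format(port) for port in ports]
    options += ["-e {}={}".format(key, value)
                for key, value in sorted((env or {}).items())]
    return UNIT_TEMPLATE.format(
        name=name,
        description=description,
        image=image,
        options=" ".join(options),
    )


rabbitmq_unit = render_unit(
    "rabbitmq", "RabbitMQ server", "tutum/rabbitmq",
    ports=[5672, 15672], env={"RABBITMQ_PASS": "secret"},
)
# Assumed values for MongoDB: default port 27017, container name "mongo".
mongo_unit = render_unit(
    "mongo", "MongoDB server", "dockerfile/mongodb", ports=[27017],
)
print(mongo_unit)
```

Writing mongo_unit to a file called mongo.service and starting it with fleetctl start works the same way as for RabbitMQ.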
Step 3) Run frontend and execution nodes of suricate.
Now it is time to bring up the python application. As we have defined a docker image called engjoy/suricate in step 1 we just need to define the units for CoreOS fleet again. For the frontend we create:
$ cat frontend.service
[Unit]
Description=Frontend server
After=docker.service
Requires=docker.service
After=etcd.service
Requires=etcd.service
[Service]
ExecStartPre=/bin/sh -c "/usr/bin/docker rm -f suricate > /dev/null ; true"
ExecStart=/usr/bin/docker run -p 8888:8888 --name suricate -e MONGO_URI=<change uri> -e RABBITMQ_URI=<change uri> engjoy/suricate python /misc/frontend.py
ExecStop=/usr/bin/docker stop suricate
ExecStopPost=/usr/bin/docker rm -f suricate
As you can see it will use the engjoy/suricate image from above and just run the python command. The frontend is now up & running. The same steps need to be repeated for the execution node. As we run at least one execution node per tenant we’ll get multiple units for now. After bringing up multiple execution nodes and the frontend the list of units looks like:
$ fleetctl list-units
UNIT                     MACHINE                   ACTIVE  SUB
exec_node_user1.service  b9239746.../172.17.8.101  active  running
exec_node_user2.service  b9239746.../172.17.8.101  active  running
frontend.service         b9239746.../172.17.8.101  active  running
mongo.service            b9239746.../172.17.8.101  active  running
rabbitmq.service         b9239746.../172.17.8.101  active  running
[...]
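Since every tenant gets its own unit, writing the exec node unit files by hand gets repetitive. The sketch below generates one file per user, following the exec_node_<user>.service naming from the listing above and the structure of frontend.service. It gives each container a distinct name, because two containers cannot share one. How an execution node learns which user it serves is not shown in this post, so that part is deliberately left out; the URIs stay as the <change uri> placeholders, and the helper name is hypothetical.

```python
import os
import tempfile
from string import Template

EXEC_NODE_UNIT = Template("""\
[Unit]
Description=Exec node server for $user
After=docker.service
Requires=docker.service
After=etcd.service
Requires=etcd.service

[Service]
ExecStartPre=/bin/sh -c "/usr/bin/docker rm -f $container > /dev/null ; true"
ExecStart=/usr/bin/docker run --name $container -e MONGO_URI=$mongo_uri -e RABBITMQ_URI=$rabbitmq_uri engjoy/suricate python /misc/execnode.py
ExecStop=/usr/bin/docker stop $container
ExecStopPost=/usr/bin/docker rm -f $container
""")


def write_exec_node_units(users, directory,
                          mongo_uri="<change uri>",
                          rabbitmq_uri="<change uri>"):
    """Write one exec node unit file per user and return the file paths."""
    paths = []
    for user in users:
        name = "exec_node_{}".format(user)
        path = os.path.join(directory, name + ".service")
        with open(path, "w") as handle:
            handle.write(EXEC_NODE_UNIT.substitute(
                user=user,
                container=name,  # unique container name per tenant
                mongo_uri=mongo_uri,
                rabbitmq_uri=rabbitmq_uri,
            ))
        paths.append(path)
    return paths


# Write into a scratch directory so nothing existing gets overwritten.
target = tempfile.mkdtemp()
unit_paths = write_exec_node_units(["user1", "user2"], target)
print([os.path.basename(path) for path in unit_paths])
```

Each generated file can then be started with fleetctl start, just like the frontend unit.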
Now your distributed Python app is happily running on a CoreOS cluster.
Some notes
- Container building can be repeated without the need to destroy: docker build -t engjoy/suricate .
- Getting the log output of a container to check why the python app crashed: docker logs <container name>
- Sometimes it is handy to test the docker run command before defining the unit files in CoreOS
- Mongo storage should be shared – do this by adding the following to the docker run command: -v <db-dir>:/data/db
- fleetctl destroy <unit> and fleetctl list-units are your friends 🙂
- The files above with simplified scheduling & authentication examples can be found here.
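For the notes on testing the docker run command first and on sharing the Mongo storage, a small helper can assemble the command line so it can be tried by hand before it goes into a unit file. This is only a sketch: the function name docker_run_command is made up, the host directory /srv/mongo-data stands in for <db-dir>, and port 27017 is MongoDB's default rather than something taken from the actual unit file.

```python
import shlex


def docker_run_command(image, name, ports=(), env=None, volumes=(), command=()):
    """Build the argument list for a `docker run` call."""
    argv = ["docker", "run", "--name", name]
    for port in ports:
        argv += ["-p", "{0}:{0}".format(port)]
    for key, value in sorted((env or {}).items()):
        argv += ["-e", "{}={}".format(key, value)]
    for host_dir, container_dir in volumes:
        # -v <db-dir>:/data/db keeps the database files outside the container.
        argv += ["-v", "{}:{}".format(host_dir, container_dir)]
    argv.append(image)
    argv += list(command)
    return argv


# /srv/mongo-data is a placeholder for <db-dir>.
mongo_argv = docker_run_command(
    "dockerfile/mongodb", "mongo",
    ports=[27017], volumes=[("/srv/mongo-data", "/data/db")],
)
mongo_line = " ".join(shlex.quote(arg) for arg in mongo_argv)
print(mongo_line)
```

The printed line can be pasted into a shell on one of the CoreOS machines; once it behaves as expected, the same options go into the ExecStart line of the unit file.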