Programming

Random Forests now available

The Random Forest algorithm is described at length elsewhere, but in short it is an ensemble learning method: it aggregates the predictions of many decision trees to solve a prediction problem. It is very easy to set up and use, and it can work well for classification. Inspired by Pascal van Kooten’s whereami package (which was itself inspired by FIND!), I implemented Random Forests as one of the machine learning algorithms available.
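To make the "aggregating several models" idea concrete, here is a minimal pure-Python sketch. It is not the FIND implementation: each "tree" is a one-split stump trained on a bootstrap sample and one randomly chosen feature, and the forest predicts by majority vote. The function names, the room labels, and the signal-strength readings are all made up for illustration.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature):
    """Fit a one-split 'tree' on one feature: keep the threshold with the fewest errors."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(lab != left_label for lab in left)
        errors += sum(lab != right_label for lab in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(n_features)  # random feature for this stump
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        forest.append(train_stump(sample_rows, sample_labels, feature))
    return forest


def predict(forest, row):
    """Aggregate the stumps' predictions by majority vote."""
    votes = Counter(
        left if row[feature] <= threshold else right
        for feature, threshold, left, right in forest
    )
    return votes.most_common(1)[0][0]


# Hypothetical fingerprints: signal strength (dBm) seen from two access points.
rows = [(-40, -80), (-45, -75), (-42, -78), (-80, -40), (-75, -45), (-78, -42)]
labels = ["kitchen", "kitchen", "kitchen", "office", "office", "office"]

forest = train_forest(rows, labels)
print(predict(forest, (-38, -85)))
print(predict(forest, (-85, -38)))
```

A real random forest grows full decision trees and considers a random subset of features at every split. The stumps here only keep the voting idea visible in a few lines.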

New Python package based on FIND

Python has come back to FIND. A previous FIND contributor, kootenpv, did a great job making a new Python package for indoor positioning: whereami. The package looks really cool, and I think I will try implementing random forests as well. For a more in-depth writeup, check out Pascal van Kooten’s blog: https://kootenpv.github.io/2016-09-19-predict-where-you-are-indoors.

Python -> Go

I’ve rewritten FIND in Go because the server is a bit faster (see below), but the real reason is that it saves me $$: I can run it on the cheapest Digital Ocean (DO) droplet. I run the FIND server on a droplet along with a half dozen other services, so I only have about 20% of the droplet’s 500MB of memory to use.

The FIND stack

There are several pieces to our code. The majority of our program is written in Python and JavaScript. The main machine-learning server is built from the following.

Flask for routing

Flask has served really nicely for fast prototyping. However, there are known problems with using Flask on its own in production.

Tornado for production

To avoid Flask’s production problems, we use Tornado as a WSGI container. It works nicely, and is async so it can support lots of connections.
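For readers unfamiliar with the term, a WSGI container is whatever calls a WSGI application with a request and collects its response. The sketch below uses only the Python standard library to show that contract. It is not the FIND server: the `app` function stands in for a Flask application (which exposes the same callable interface), `call_app` plays the role that Tornado's `tornado.wsgi.WSGIContainer` would play in production, and the `/status` route is a made-up example.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A minimal WSGI application with one hypothetical route, /status."""
    if environ.get("PATH_INFO") == "/status":
        status, body = "200 OK", b"ok\n"
    else:
        status, body = "404 Not Found", b"not found\n"
    headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]


def call_app(wsgi_app, path):
    """Act as the container: build a request environ, call the app, collect the reply."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], body


status, body = call_app(app, "/status")
print(status, body)
```

Because the application only depends on this interface, the same app can be developed under one server and handed to a different container for production without changing the routing code.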