Automating ML workflows: a report from the trenches — Jose A. Ortega Ruiz

ML services are quickly becoming a commodity, and they will be taken for granted by developers and computer users alike in the near future. The building blocks for ML as a ubiquitous service are already in place, almost always in the form of remote APIs that provide a first level of abstraction over ML problem-solving and, especially, obviate scalability and resource allocation issues. But that's not enough: those building blocks still leak implementation details inessential to the application developer who needs to provide domain-specific solutions. We need to ascend a couple of rungs on the abstraction ladder and provide domain-specific languages to describe ML solutions without nitty-gritty details unrelated to the problem at hand, offering non-experts the possibility of automating their ML solutions. In this talk, we'll discuss our experience designing and developing BigML's data wrangling and ML workflow DSLs, Flatline and WhizzML, and how they generalize to similar ML services and APIs.

Jose A. Ortega Ruiz is part of the founding team of BigML, a little startup trying to apply machine learning and other AI techniques to big data, and make them accessible to non-specialists. He was hacking for Oblong from 2008 to early 2011. Before that, he worked for Google (from July 2007). From June 2005 to May 2007, he worked on embedded software development for the scientific payload of LISA Pathfinder. He was a theoretical physicist in a previous life, and wrote a Ph.D. thesis on gravitational wave detectors. He also holds a bachelor’s degree in computer science. Between 2003 and 2005, he taught courses on programming and computer networks at the Universitat Autònoma de Barcelona, where he was part of the mobile agents research group.

Twitter: @jaotwits - LinkedIn