Contextual Bandit Learning in Real-World Applications - Marco Rossi (Microsoft Research)
Oct 25, 3:30 PM

Applications and systems are constantly faced with decisions that require picking from a set of actions based on contextual information. Reinforcement-based learning algorithms such as contextual bandits can be very effective in these settings, but applying them in practice faces fundamental challenges, and no general system exists that supports them completely. To address this problem, we created the Decision Service: the first end-to-end system for contextual learning. The Decision Service enables all aspects of contextual bandit learning using a loop of four system abstractions: explore (the decision space), log, learn, and deploy. Notably, our new explore and log abstractions ensure the system collects correct, unbiased data, enabling online learning and offline experimentation. The Decision Service has a simple user interface and has been applied with strong results in a variety of settings, such as content recommendation, landing-page revenue lift, tech support, and machine failure handling.
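The four-step loop described above can be illustrated with a toy example. This is a minimal sketch, not the Decision Service itself: the epsilon-greedy exploration rule, the `LoggedDecision` record, the two-context environment in `true_reward`, and every function name below are assumptions made for illustration. The point it shows is why logging the probability of each chosen action matters: that probability is what lets an inverse-propensity-score (IPS) estimate evaluate a different policy offline.

```python
import random
from dataclasses import dataclass


@dataclass
class LoggedDecision:
    context: str
    action: str
    probability: float  # probability with which the action was chosen
    reward: float


def explore(context, actions, scores, epsilon, rng):
    """Epsilon-greedy exploration: return (action, probability of that action)."""
    best = max(actions, key=lambda a: scores.get((context, a), 0.0))
    chosen = rng.choice(actions) if rng.random() < epsilon else best
    # The greedy action can be picked either greedily or by the uniform draw.
    prob = epsilon / len(actions) + ((1.0 - epsilon) if chosen == best else 0.0)
    return chosen, prob


def learn(log):
    """Estimate the mean reward of every (context, action) pair seen so far."""
    totals, counts = {}, {}
    for d in log:
        key = (d.context, d.action)
        totals[key] = totals.get(key, 0.0) + d.reward
        counts[key] = counts.get(key, 0) + 1
    return {key: totals[key] / counts[key] for key in totals}


def ips_value(log, policy):
    """Offline IPS estimate of the average reward a policy would have earned."""
    total = sum(d.reward / d.probability
                for d in log if policy(d.context) == d.action)
    return total / len(log)


def true_reward(context, action):
    # Hypothetical environment: exactly one action pays off in each context.
    return 1.0 if (context, action) in {("mobile", "A"), ("desktop", "B")} else 0.0


def run_loop(rounds=2000, epsilon=0.2, deploy_every=50, seed=0):
    """Run explore -> log -> learn -> deploy and return (log, deployed scores)."""
    rng = random.Random(seed)
    actions = ["A", "B"]
    scores, log = {}, []
    for t in range(rounds):
        context = rng.choice(["mobile", "desktop"])
        action, prob = explore(context, actions, scores, epsilon, rng)   # explore
        log.append(LoggedDecision(context, action, prob,
                                  true_reward(context, action)))        # log
        if (t + 1) % deploy_every == 0:
            scores = learn(log)                                         # learn + deploy
    return log, scores
```

Calling `run_loop()` and then `ips_value(log, some_policy)` evaluates any candidate policy from the same log without running it live, which is the kind of offline experimentation the abstract refers to.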

 

Marco Rossi is a Sr. Data Scientist working in online learning at Microsoft Research New York City. Previously, he was a Sr. Researcher at a computer vision start-up, where he used unsupervised learning to design image recognition algorithms. He received his undergraduate and graduate degrees in Telecommunications Engineering from Politecnico di Milano, and he obtained his PhD in Electrical Engineering from NJIT. He is passionate about using data and Machine Learning to address real-world problems.

 

DragonPaint – Bootstrapping Small Data to Color Cartoons - K. Gretchen Greene
Oct 25, 2:15 PM

Let the geometry of dragons create your training set for you.

The creation of sufficient quantities of labeled training data is one of the biggest challenges for machine learning applications, especially when the data itself must be created, not just labeled.

DragonPaint presents a generalizable strategy for minimizing the manual creation of data: use rule-based algorithms to automate the creation of a restricted subset of data, then bootstrap to the automated creation of unrestricted (rule-breaking) training and test data.

A gentle introduction to computer vision, graphics and Machine Learning, accessible to beginners.

A former ship designer, national lab mathematician and Hollywood special effects artist, Gretchen Greene is a computer vision scientist and machine learning engineer working with Cambridge startups on everything from wearables to welding. Greene has been interviewed by Forbes China, the Economist and the BBC and her plasma cut steel sculptures have appeared in Architectural Digest and sold to Dior. Greene has a CPhil and MS in math from UCLA and a JD from Yale.
 

Deep Learning and Real-Time Flight Prediction - Sam Zimmerman (Freebird)
Oct 25, 2:15 PM

Which of the over 30 million commercial flights in the US will actually get delayed or cancelled? Freebird has built a business based on using data science to answer that question. Learn how co-founder and CTO Sam Zimmerman and his team have approached this problem by building a real-time predictive analytics engine based on dynamic data sets and Deep-Learning algorithms. This talk focuses on experiments the Freebird team has done to model both pointwise and aggregate flight delay risk using various deep learning approaches and feature representation techniques in a risk management context.

Sam Zimmerman is a software developer and data scientist with extensive experience in the commercial application of Machine-Learning algorithms. Prior to Freebird, Sam worked as a quantitative risk analyst in the currency markets and as a team lead automating a large-scale data classification problem for an energy intelligence company. Sam is a Duke grad, and works on a grant with MIT’s Computational Cognitive Science lab to extend decision theory using Machine Learning and Artificial Intelligence.

Rent, Rain, and Regulations: predicting crime using ML — Jorie Koster-Hale (Dataiku)
Oct 25, 1:45 PM

Crime poses a particularly interesting data challenge — it has both geospatial and temporal dimensions, and may be affected by many different types of features — weather, city infrastructure, population demographics, public events, and government policy. Here, I show that a combination of machine learning, time series modeling, and geostatistics is more effective at predicting future crime than any of these techniques alone. Using a variety of public data sets, including police reports, the US census, Foursquare, newspapers, and the weather, I discuss how to merge, visualize, model, and deploy this type of multi-dimensional data, using PostGIS, spatial mapping, time-series analyses, dimensionality reduction, machine learning, and a public REST API. With this model, I ask where crime will occur next, what predicts it, and what we can do to prevent it in the future.

Jorie Koster-Hale is a broadly trained data scientist at Dataiku with expertise in neuroscience, healthcare data, and machine learning. Prior to joining Dataiku, she completed her Ph.D. in Cognitive Neuroscience at Massachusetts Institute of Technology and worked as a Postdoctoral Fellow at Harvard. Jorie currently resides in Paris, where she builds predictive models and eats pain au chocolat.

Learning artistic style for real-time stylization of video - Jeffrey Rainy (Element AI)
Oct 25, 1:45 PM

We present a technique for learning the artistic style of a painting and applying it to a real-time video feed. The video feed is stylized with a convolutional neural network so that it gets the style of the original painting.

Our improvements over previous style-transfer work are twofold:
- Stabilizing the stylization so that the stylized video has temporal coherence. This is done with regularization during training. It prevents style features from jumping around from frame to frame.
- Using a pipeline that performs the stylization in real time, with low latency, by keeping the data on the GPU end to end.

Jeffrey Rainy is an Applied Research Scientist at Element AI. He is interested in applying Artificial Intelligence and Machine Learning to real-world challenges. His background spans software development, games, machine learning, fintech and self-driving vehicles.

BigData to Rescue Anomaly Detection - Ken Park (Knowru)
Oct 25, 11:45 AM

Rare events often become the focus of a predictive analytics project. For example, fraudulent card transactions and web ad clicks happen seldom, but to businesses they are opportunities to learn.
Nonetheless, modeling them is a daunting task. For instance, a machine learning model trying to minimize errors might simply conclude that no event is abnormal. Techniques such as re-sampling can alleviate the issue, but if the data is too imbalanced, these can still fail.
The talk presents a novel approach to the problem. Basically, the approach is a brute-force method that finds all combinations of feature values that yield a substantial sample size with many anomaly cases. It uses BigData to shorten the calculation time. Pros, cons and use cases will be discussed during the talk.
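To make the brute-force idea concrete, here is a minimal single-machine sketch. It is an illustration under stated assumptions, not the speaker's implementation: the function `find_anomalous_segments`, its `min_size` and `min_rate` thresholds, the `max_features` cap, and the toy fraud data in `example_rows` are all hypothetical. It enumerates every combination of feature values, counts rows and anomalies per combination, and keeps the combinations that are both large enough and anomaly-rich. The per-row counting step is independent across rows, which is the sort of work a distributed engine could presumably spread out.

```python
from itertools import combinations


def find_anomalous_segments(rows, label, min_size, min_rate, max_features=2):
    """Return (segment, size, anomaly_rate) for every qualifying value combination."""
    features = sorted(k for k in rows[0] if k != label)
    stats = {}  # segment -> (row count, anomaly count)
    for row in rows:
        for r in range(1, max_features + 1):
            for combo in combinations(features, r):
                segment = tuple((f, row[f]) for f in combo)
                n, bad = stats.get(segment, (0, 0))
                stats[segment] = (n + 1, bad + int(row[label]))
    hits = [(segment, n, bad / n)
            for segment, (n, bad) in stats.items()
            if n >= min_size and bad / n >= min_rate]
    # Highest anomaly rate first, larger segments first on ties.
    return sorted(hits, key=lambda h: (-h[2], -h[1]))


def example_rows():
    """Hypothetical, heavily imbalanced data: 9 fraud cases in 100 rows."""
    rows = []
    rows += [{"country": "US", "channel": "web", "fraud": 0}] * 40
    rows += [{"country": "US", "channel": "app", "fraud": 0}] * 40
    rows += [{"country": "XX", "channel": "web", "fraud": 1}] * 1
    rows += [{"country": "XX", "channel": "web", "fraud": 0}] * 9
    rows += [{"country": "XX", "channel": "app", "fraud": 1}] * 8
    rows += [{"country": "XX", "channel": "app", "fraud": 0}] * 2
    return rows
```

For example, `find_anomalous_segments(example_rows(), "fraud", min_size=10, min_rate=0.4)` ranks the segments by anomaly rate. The number of combinations grows quickly with the number of features and distinct values, which is why the sketch caps `max_features`.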

Ken Park is the CEO of Knowru, a company providing useful tools to data scientists and developers. Before he started Knowru, he led teams of data scientists and fraud investigators to fight fraud in a leading online finance company. He studied Engineering at Northwestern University for his Bachelor's and Computer Science at the University of Chicago for his Master's. He likes to travel and complete marathon and triathlon events.

Supercharging Deep Learning with the Unity Engine - Arthur Juliani (Unity Technologies)
Oct 25, 11:45 AM

At Unity Technologies we are building tools to enable researchers, industry, and hobbyists to build Deep Learning models that interact with games and simulations created using Unity. We will walk through recent developments in Deep Reinforcement Learning, and show how training environments built with Unity can push the field even further. We will then flip the story around, and show how everything from games to simulations for robotics and self-driving cars can take advantage of the integration of our APIs to enhance those applications by providing more dynamic gameplay or simulator behavior.

Arthur Juliani is a Senior Machine Learning Engineer at Unity Technologies.

Automated Machine Learning: Mostly Unhelpful - Charles Parker (BigML)
Oct 25, 11:15 AM

A fair amount of machine learning research in recent years has focused on "automated machine learning", in which the computer itself attempts to accomplish many of the engineering and optimization steps usually left to the human programmer. Automating away tasks that are time-consuming or complex is always a worthwhile idea, but how much more useful do these automations really make machine learning? In this talk, I'll argue that automated machine learning is only a weak proxy for what we really want, and that recent methods don't get us much closer to the true goal.

Charles Parker is the Vice President of Machine Learning Algorithms at BigML. He holds a Ph.D. in computer science from Oregon State University. He was previously a research associate at the Eastman Kodak Company where he applied machine learning to image, audio, video, and document analysis. He also worked as a research analyst for Allston Holdings, a proprietary stock trading company, developing statistically-based trading strategies for U.S. and European futures markets. His current work for BigML is in the areas of Deep Learning and Bayesian Parameter Optimization.

Dumpster Fire to Lit: Time-Series Data in Amazon DynamoDB - John Bledsoe (Nexosis)
Oct 25, 11:15 AM

Amazon DynamoDB promises “single-digit millisecond latency at any scale”, and it can deliver on that promise IF you avoid certain mistakes in the design of the system using it. Learn from one engineer's experience implementing an ML platform backed by DynamoDB, so that you can avoid the pitfalls he encountered and reap the benefits of performance and scalability sooner in your development cycle. You will learn both techniques for leveraging the strengths of DynamoDB and overcoming its weaknesses, as well as indicators to tell you whether or not DynamoDB is a good choice for your system.

John Bledsoe is currently Principal Software Engineer at Nexosis, where they are building a machine learning platform for everyday developers. He has been building software primarily with .NET for 16+ years, and at various points has focused on front-end web clients, middle-tier business services, and back-end database implementations. His vocational passion centers around data structures, design patterns and elegant software solutions.

LIGHTNING TALKS
Oct 25, 10:00 AM

  • Marvin platform - From exploratory models to production - Lucas Bonatto Miguel (B2W Digital) 
  • AI-Powered Process Improvement - Christine Custis (Shenandoah University)
  • Mining Digital Breadcrumbs: ML to Understand Humanity - Sharon Xu (M.I.T.)
  • Robots need love too — Empathy Mapping for the Machine - Chris Butler (Philosophie)
  • Scaling your API Production Strategy - 5 things you can do today - Emmanuel Paraskakis (Oracle)

Beyond prediction: structural modeling as a tool - James Savage (Lendable)
Oct 25, 9:30 AM

Huge progress has been made in commodifying both predictive modeling and A/B testing. Less progress has been made in productizing so-called structural models, that is, models based on sufficiently rigorous microfoundations to be able to make informed judgements about the behavior of a system away from the training cases.

In this talk, Jim will walk through a couple of products based on such modeling, including a tool to assist in pricing large portfolios of imperfectly substitutable products, and a tool that can be used to design financial products. 

Jim Savage is an applied modeler and Data Science Lead at frontier markets lender Lendable in New York City. Previously he was at the Grattan Institute, La Trobe University, and the Australian Treasury. With Andrew Gelman, Shoshana Vasserman and David Stephan, he is currently writing a book on Bayesian Econometrics in Stan.

Bringing powerful artificial intelligence to all developers - Cedric Archambeau (Amazon)
Oct 24, 4:15 PM

In this talk, I will give an overview of Amazon AI, a suite of services that brings natural language understanding (NLU), automatic speech recognition (ASR), visual search and image recognition, text-to-speech (TTS), and machine learning (ML) technologies within the reach of every developer. Based on the same proven, highly scalable products and services built by the thousands of machine learning and AI experts across Amazon, Amazon AI services enable developers to build smart applications that rely on high-quality, scalable and cost-effective AI capabilities.

Cedric Archambeau is a Principal Applied Scientist at Amazon, Berlin. He is a technical lead in the Core Machine Learning organization and served as a scientific advisor to Sebastian Gunningham, Amazon Senior Vice President Seller Services. Recently, his team delivered the learning algorithms offered in Amazon Machine Learning, which is part of Amazon AI. Prior to joining Amazon, he was managing the Machine Learning and Mechanism Design area at Xerox Research Centre Europe, Grenoble.

Flexible and Scalable Deep Learning with MMLSpark - Mark Hamilton (Microsoft)
Oct 24, 3:45 PM

This talk will showcase deep learning at massive scale using Microsoft’s new open source library, MMLSpark. This library combines flexible deep nets in CNTK with fault-tolerant distributed computing on Spark, allowing users to easily perform large-scale network inference. MMLSpark also introduces several new models and API improvements for the SparkML ecosystem. In particular, one model leverages pre-trained CNTK networks to intelligently featurize image data without complex hyperparameter tuning. We apply this library to help the Snow Leopard Trust automatically identify snow leopards in a remote monitoring system.

Mark Hamilton is a software engineer at Microsoft's Azure Machine Learning group in Cambridge MA. Here, he works on integrating the deep learning framework CNTK with the distributed computing framework Spark. He graduated from Yale University in 2016 where he studied physics, mathematics, and automated theorem proving. His current academic research mainly focuses on deep learning, unsupervised learning, and NLP.

 

Predictive Fuel Consumption Analysis Application - B K Ramesh (IntelliPredikt Technologies)
Oct 24, 2:30 PM

Vehicle fuel consumption models are generally incapable of accurately predicting fuel consumption for on-road measurement. Most existing fuel consumption models are based on steady-state fuel mapping, and these models cannot provide satisfactory predictions for vehicles operating under transient conditions. The objective is to characterize transient engine behaviour for fuel consumption modelling and use it for transient corrections, providing accurate and scalable fuel consumption prediction from on-road driving cycle data. This makes it possible to match or predict output parameters such as fuel economy and the soot collected over the cycle.

B K Ramesh graduated in Electronics and Communication Engineering. He has more than 25 years of technology and management experience in Automotive, Robotics and Industrial Control and Communication Systems with leading companies like Bharat Electronics, Motorola, and Dearborn Electronics. Currently he is the Co-Founder and Director of IntelliPredikt Technologies and is actively involved in developing Machine Learning based predictive applications for the Automotive and Industrial domains.

Putting the P in A(P)I: Why APIs are key to make AI scale - Tatiana Mejia
Oct 24, 2:30 PM

At Adobe we have more than 500 engineers and data scientists working on features that use machine learning and AI. In fact, AI is not just the fastest growing skill set, it is also one of Adobe's four innovation drivers. With Adobe Sensei, we want to democratize AI, so that intelligence can be part of every app, every tool, and every experience. The way to make AI scale at Adobe: APIs. We share our lessons learned in weaving AI into all our technology, what it takes to build an API layer for AI, and how to market AI at a Fortune 500 scale.

Tatiana Mejia leads product marketing for Adobe Sensei Services. She has over 15 years of experience in machine learning, digital marketing, social marketing, and SaaS. Tatiana holds an MBA from the Stanford Graduate School of Business.

Predicting Remaining Useful Life using IoT - Adarsh Narasimhamurthy (MathWorks)
Oct 24, 2:00 PM

Predictive maintenance (PM) enables timely scheduling of maintenance by tracking the condition of a machine. Traditionally, you implement a PM system by manually collecting data stored locally on the machines, and analyzing it at a remote location. This requires significant time and resources. With the advent of IoT, you can create PM applications for near real-time system monitoring. Analytics running on the cloud estimate complex system characteristics to predict remaining useful life and generate alerts.
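As a rough illustration of what "predict remaining useful life and generate alerts" can mean in the simplest case, the sketch below fits a straight line to a degradation signal and extrapolates to a failure threshold. It is plain Python rather than MATLAB or ThingSpeak code, and everything in it is an assumption for illustration: the linear degradation model, the function names, and the threshold and horizon parameters. Real systems typically use richer models.

```python
def fit_line(times, values):
    """Ordinary least-squares line through (time, value) points.

    Requires at least two distinct time stamps.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def remaining_useful_life(times, values, failure_threshold):
    """Time left until the fitted trend reaches the failure threshold.

    Returns None when the signal shows no upward degradation trend.
    """
    slope, intercept = fit_line(times, values)
    if slope <= 0:
        return None
    time_of_failure = (failure_threshold - intercept) / slope
    return max(0.0, time_of_failure - times[-1])


def should_alert(rul, horizon):
    """Raise an alert when failure is predicted within the planning horizon."""
    return rul is not None and rul <= horizon
```

With hourly vibration readings, for instance, `remaining_useful_life(hours, readings, failure_threshold)` gives the estimated hours left, and `should_alert` turns that estimate into a maintenance alert when it falls inside the scheduling horizon.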

ThingSpeak is an IoT Analytics platform from MathWorks, makers of MATLAB. It enables you to collect data from your devices in real-time and rapidly prototype online PM applications. Most importantly, ThingSpeak removes the burden of standing up an IoT infrastructure and lets you focus on the algorithmic side of the problem. ThingSpeak analytics also provides MATLAB in the cloud.

Come, learn more about ThingSpeak and how you can benefit from using it for your IoT application.

 

Adarsh Narasimhamurthy is a senior engineer at MathWorks whose main area of focus is IoT analytics. He has worked in all of the key IoT domains including hardware connectivity, cloud platform for data collection & analysis, and desktop applications for exploratory analytics. He holds a PhD in Electrical Engineering from Arizona State University. He is a coauthor of the book titled “OFDM Systems for Wireless Communications” and has written several articles on analyzing IoT data.

APIs and DSLs for Building and Integrating Many Models - Harlan Harris (WayUp)
Oct 24, 2:00 PM

Enterprise businesses often build separate, customized versions of vertical-specific models for each customer, requiring tight workflows and tooling to maintain velocity and quality. This talk describes the architecture and tools we built at EAB, an ed-tech company, to integrate many models predicting student graduation into an application. I'll present guiding principles and explain how they led to the use of a commercial API provider as well as a home-built DSL, a command-line workflow, and web apps for data and model validation.

Harlan D. Harris has a PhD in Computer Science/Machine Learning from the University of Illinois at Urbana-Champaign, and worked as a Cognitive Psychology researcher before turning to industry. He has worked at Kaplan Test Prep, the Advisory Board Company, WeWork and several startups in New York and DC. Harlan also co-founded the Data Science DC Meetup and Data Community DC, Inc., and co-wrote O'Reilly's Analyzing the Analyzers, a short e-book about the variety of data scientists.

Pitch Prediction for the World Series - Dennis Oleksyuk (DataRobot)
Oct 24, 1:30 PM

MLB data is abundant and free. We built an application that calculates live pitch predictions and streams them during MLB games. It demonstrates how a business can build highly scalable predictive software using data science automation and available SaaS infrastructure. Plus, it is a cool demonstration of machine learning for the general public.

Dennis Oleksyuk has over fourteen years of experience creating software solutions to solve large-scale, real-world problems. Dennis is currently the Director at DataRobot Labs, focused on building end-to-end solutions which incorporate DataRobot into the customers' business processes. Inside of work he can often be found pushing and reviewing code. Outside of work he can often be found playing outside with kids and doing mad science experiments, occasionally simultaneously.

Model as a Service up and running in AWS - Lia Bifano (Nubank)
Oct 24, 1:30 PM

Predictive models can add significant business value, for example by matching clients to the right product faster, understanding customer behavior, and personalizing customer service. However, if they aren’t deployed in a production environment that allows quick decisions, they may become useless. It is also important to create an infrastructure that supports fast retraining and A/B tests. The goal of this presentation is to give an overview of best practices and show from scratch how a trained model can be deployed as a service in AWS.
By the end of this presentation you’ll know the steps to deploy a model in AWS, and we’ll walk through reproducible scripts that deploy the model using Docker and the AWS command-line clients. And of course, all the code is available on GitHub.

Lia is pursuing a Master's degree at EPFL, and for the last two years she worked with the Nubank team as a Data Scientist. At Nubank she developed predictive models and ETL pipelines; her previous work experience was at Itaú Bank as a Business Analyst. She has a BS (2012) in Statistics from Unicamp, where she worked with models to predict joint market volatility using copulas and with MCMC methods for Bayesian inference.

An API for categorical text classification for news articles - Thomas Boquet (Element AI)
Oct 24, 11:30 AM

We present our experience in delivering Machine Learning as a Service for categorical and multi-categorical text classification for a major Canadian news organization. We will walk you through our processes for productizing some of the most commonly used Machine Learning techniques for text classification, in a way that is easy for end-users to consume. In about 20 minutes you will get to know how we delivered a production-ready API by using state-of-the-art software and development tools and best practices to ensure the quality of the product.

 

Thomas Boquet is an Applied Research Scientist at Element AI.

Making Business More Bayesian: From Uncertainty to Action - Richard Tibbetts (Empirical Systems)
Oct 24, 11:30 AM

Businesses have spent decades trying to make better decisions through the complete understanding of data. New technologies make Bayesian inference and generative modeling more accessible to business analysts. But the ability to rapidly quantify uncertainty, simulate new data, and understand direction, magnitude, and confidence of effects creates new communications challenges. This talk presents techniques for capturing domain knowledge and making findings actionable for decision makers.

Richard Tibbetts is CEO of Empirical Systems, an AI startup. He was founder and CTO at StreamBase, a CEP company that merged with TIBCO in 2013, as well as a visiting scientist at the Probabilistic Computing Project at MIT. He holds an MEng in computer science and a BS in computer science and electrical engineering from MIT.

Real world Turing test, when AI answers phone calls - Vincent Van Steenbergen
Oct 24, 11:00 AM

In this talk, we'll see how we developed an advanced phone call handling platform using speech-to-text, text-to-speech, NLP and sentiment analysis APIs to answer calls, analyse the query and come up with a good answer (if possible) while determining the overall satisfaction level of the call. We'll illustrate this with chunks of code and a live phone call to explain the internals of the call handling platform.

Vincent Van Steenbergen is a senior (big) data engineer who's working on systems able to handle terabytes of data, usually involving Spark, Scala, Kafka, Hadoop and Cassandra. His main interest right now is applying these techniques to solve machine learning problems. Vincent was previously a technical architect at Property.Works, a real estate startup in London and before that an R&D engineer at IDAaaS.

Behind the AI Curtain: Designing for Trust in Data Science - Crystal C Yan (FiscalNote)
Oct 24, 11:00 AM

When startups first launch, they can make the news with application of cutting edge AI - but convincing users to trust the AI is often another story. There's often also no process for integrating future AI development into product roadmaps. This session covers three key principles for how design and data science teams can work together better to build greater trust among users, and a case study to illustrate those principles in practice.

Crystal Yan is a product and design leader with international experience in emerging markets, committed to transforming organizations to be more customer centric. As a Product Manager at FiscalNote, Crystal uses behavioral design and artificial intelligence to create meaningful user experiences. Crystal holds a degree in Economics from Amherst College and has worked in the US, India, China, and Cambodia.

AI disruption in e-commerce: The Present and The Promise - David Drollette (Wayfair)
Oct 24, 9:45 AM

The confluence of a data-rich environment and smarter, more powerful computing techniques has made AI disruption in e-commerce a natural consequence. Machine learning techniques have significantly changed the way we shop online: chatbot assistants guide our choices in real time, AR systems help us visualize how a dress or couch looks in our surroundings, and NLP techniques surface the most relevant products from a near-infinite shelf space given a keyword search. The list is long and continues to grow.

In this talk we will discuss some of these innovations in the form of a case study from Wayfair’s experience in investing in technology that created the visual search feature, that now continues to spawn multiple areas of product development. We will also touch upon various ways we use the promise of AI in fundamentally changing the way consumers shop for home furnishings.

 

David Drollette leads the analytics group at Wayfair, which includes the business intelligence and data science functions.  He served as a founding member of the Wayfair data science team, helping to bring machine learning to bear more broadly across the enterprise.  David has been with Wayfair for 11.5 years and held various roles in finance before transitioning to analytics. He holds a B.A. in Mathematics and Physics from Ithaca College.

Machine Learning, Technical Debt, and You - D. Sculley (Google)
Oct 24, 9:15 AM

Machine learning offers a fantastically powerful toolkit for building useful complex prediction systems quickly. In this talk, we'll argue it is dangerous to think of these quick wins as coming for free. Using the software engineering framework of technical debt, we find it is common to incur massive ongoing maintenance costs in real-world ML systems. We explore several ML-specific risk factors to account for in system design. These include boundary erosion, entanglement, hidden feedback loops, undeclared consumers, data dependencies, configuration issues, changes in the external world, and a variety of system-level anti-patterns.  We then show how to pay down ML technical debt by following a set of recommended best practices for testing and monitoring needed for real world systems.

D. Sculley is a Senior Staff Software Engineer at Google.
