KubeCon + CloudNativeCon North America 2017: Full Schedule

7:30pm CST

BoF: Machine Learning on Kubernetes - hosted by David Aronchick, Google

Speakers

David Aronchick

Head of OSS Machine Learning, Microsoft

David leads Open Source Machine Learning Strategy at Azure. This means he spends most of his time helping humans to convince machines to be smarter. He is only moderately successful at this. Previously, he led product management for Kubernetes, launched Google Kubernetes Engine and...

Wednesday December 6, 2017 7:30pm - 9:00pm CST
Meeting Room 7, Level 3

BoF (Birds-of-a-Feather) Data + Machine Learning - KubeCon

11:10am CST

All You Need to Know to Build Your GPU Machine Learning Cloud [B] - Ye Lu, Qunar

GPUs are becoming commonplace, but at the moment GPU resources are still hard to find for people who want to try them out. So how do you build your own GPU machine learning cloud?

Resource management & App templating
Even if your company or organization has purchased some GPU devices, environment and resource isolation is always a problem. Also, at the beginning the cloud is used more as a playground, so another consideration is improving the resource utilization rate. We explain how we use Kubernetes to solve these problems.

How to use a wizard to generate a machine learning app: you can choose between TensorFlow and Theano, how many GPUs you need, etc.

Replaying “customized changes” in an immutable container
Container immutability is a double-edged sword: it ensures the environment is unique and portable, but any changes made inside a running container are lost when it is recreated. How is the customized environment saved and reused?

Managing persistent storage in Kubernetes
How we expose our RBD storage as hosted S3 to save models, training data, and so on, so that data scientists can access their data both as a volume and through the standard S3 API.
Supporting online resize for running machine learning apps such as TensorFlow.

App model & permission control
We'll talk about the app center, the design of appcode, and permission control.

Speakers

Ye Lu

Cloud Computing Engineer, ByteDance

Ye Lu works as a cloud computing engineer at ByteDance, which has more than 600 million active users and hundreds of thousands of servers all over the world. She is responsible for the IaaS architecture of ByteDance’s production environment, including private cloud and edge...

Thursday December 7, 2017 11:10am - 11:45am CST
Meeting Room 9C, Level 3

Data + Machine Learning - KubeCon

Audience Beginner
Link to Session Recording https://youtu.be/IoEFLGe4EFs

11:55am CST

Building GPU-Accelerated Workflows with TensorFlow and Kubernetes [I] - Daniel Whitenack, Pachyderm

GPUs are critical to some artificial intelligence workflows. In particular, workflows that utilize TensorFlow, or other deep learning frameworks, need GPUs to efficiently train models on image data. These same workflows typically also involve multi-stage data pre-processing and post-processing. Thus, a unified framework is needed for scheduling multi-stage workflows, managing data, and offloading certain workloads to GPUs.

In this talk, we will introduce a stack of open source tooling, built around Kubernetes, that is powering these types of GPU-accelerated workflows in production. We will do a live demonstration of a GPU-enabled pipeline, illustrating how easy it is to trigger, update, and manage multi-node, accelerated machine learning at scale. The pipeline will be fully containerized, will be deployed on Kubernetes via Pachyderm, and will utilize TensorFlow for model training and inference.

Speakers

Daniel Whitenack

Lead Data Scientist and Advocate, Pachyderm

Daniel Whitenack (@dwhitena) is a Ph.D. trained data scientist working with Pachyderm (@pachydermIO). Daniel develops innovative, distributed data pipelines which include predictive models, data visualizations, statistical analyses, and more. He has spoken at conferences around the...

KubeCon Pachyderm(2) pdf

Thursday December 7, 2017 11:55am - 12:30pm CST
Meeting Room 9C, Level 3

Data + Machine Learning - KubeCon

Audience Intermediate
Link to Session Recording https://youtu.be/OZSA5hmkb0o

2:00pm CST

"Hot Dogs or Not" - At Scale with Kubernetes [I] - Vish Kannan & David Aronchick, Google

Kubernetes promises to be a multi-workload platform. This talk will explore how Kubernetes can be easily leveraged to build complete deep learning pipelines, from data ingestion/aggregation and pre-processing through ML training and serving, with the mighty Kubernetes APIs. This talk will use TensorFlow and other ML frameworks to highlight the value that Kubernetes brings to machine learning. Along the way, key infrastructure features introduced to abstract and handle the hardware accelerators that make machine learning possible will also be presented.

Speakers

David Aronchick

Head of OSS Machine Learning, Microsoft

Vishnu Kannan

Software Engineer, Google Inc

Vishnu Kannan is a Senior Software Engineer at Google. Vishnu received his Masters in ECE from Georgia Tech. He has been a systems engineer ever since he graduated. He hacked on the Linux Kernel for a couple of years at Cisco. He then worked on Borg at Google. He is currently an...

Thursday December 7, 2017 2:00pm - 2:35pm CST
Meeting Room 9C, Level 3

Data + Machine Learning - KubeCon

Audience Intermediate
Link to Session Recording https://youtu.be/R3dVF5wWz-g

2:45pm CST

eBay Geo-Distributed Database on Kubernetes [A] - Chengyuan Li & Xinglang Wang, eBay

Database as a Service is one of the most interesting and challenging domains in the cloud industry. At eBay, we implemented a cloud-native geo-distributed document service based on Kubernetes. eBay extended Kubernetes to support local disk volumes on bare-metal machines, which enables high-performance databases to be deployed on Kubernetes as Pods. On top of the Kubernetes platform, we developed a control layer to orchestrate the database pods and distribute them across multiple clusters, and expanded the WISB model to use a workflow to automatically manage the database cluster.

Speakers

Chengyuan Li

Sr MTS Software Engineer, eBay

Chengyuan Li is a member of the eBay Kubernetes team; his focus area is host runtime and storage in Kubernetes. Before joining the Kubernetes project, he worked in the computer and network area for eBay cloud.

Xinglang Wang

Principal MTS (Principal Engineer), eBay

Xinglang Wang is an architect on eBay's Data platform. He is working on eBay's next-generation geo-distributed database, and his main focus is the distribution and control layer of the database. Before that, he was the architect of eBay's real-time behaviour data pipeline, focusing on real-time stream...

Thursday December 7, 2017 2:45pm - 3:20pm CST
Meeting Room 9C, Level 3

Data + Machine Learning - KubeCon

Audience Advanced
Link to Session Recording https://youtu.be/2t_HVmih4IM

3:50pm CST

Running MySQL on Kubernetes [I] - Patrick Galbraith, Consultant

MySQL is the world's most popular open source database and there are a number of ways to run it on Kubernetes. This talk will cover each type of MySQL deployment strategy, starting from a simple MySQL pod, to an asynchronously replicated master-slave setup, a synchronous Galera cluster, and on to a Vitess clustering system, which allows for horizontal scaling of MySQL and has built-in sharding. For each, it explains how it is deployed, what features are available, and what type of application it lends itself to.

Speakers

Patrick Galbraith

Principal Platform Engineer, Oracle

Patrick Galbraith has been involved in MySQL, Linux, and other Open Source (OSS) projects since the early days of Slackware. He has worked at a broad spectrum of companies and in a wide array of roles throughout his career, including Slashdot, MySQL, Blue Gecko, Hewlett-Packard, and...

MySQL on Kubernetes pdf

Thursday December 7, 2017 3:50pm - 4:25pm CST
Meeting Room 9C, Level 3

Data + Machine Learning - KubeCon

Audience Intermediate
Link to Session Recording https://youtu.be/J7h0F34iBx0

4:35pm CST

Accelerating Humanitarian Relief with Kubernetes [I] - Erik Schlegel & Christoph Schittko, Microsoft

How can UN humanitarian aid field experts use social media to gain insight, understand trends and track key humanitarian issues? Through a collaboration between Microsoft and UN OCHA, Project Fortis was created to accelerate surveillance of humanitarian disasters and health epidemics around the world.

This talk discusses the architecture of a highly available native Spark pipeline running across multiple Kubernetes clusters to support Fortis customers.

Speakers

Christoph Schittko

Principal Software Development Engineer, Microsoft

Christoph Schittko is an engineer with Microsoft working with customers on innovative solutions in the areas of containerization and AI. He's been working with Microsoft customers on building cloud solutions since Azure was called "Red Dog". He’s recently been a contributor to...

Erik Schlegel

Senior Engineer, Microsoft

Erik is an open source engineer at Microsoft, based in the Austin area. He's one of the original contributors to the React Native Universal Windows Platform (UWP). Erik leads the engineering effort of Project Fortis, an open source data gathering / surveillance insight platform...

Kubecon 2017 Humanitarian Aid Multi Cloud pdf

Thursday December 7, 2017 4:35pm - 5:10pm CST
Meeting Room 9C, Level 3

Data + Machine Learning - KubeCon

Audience Intermediate
Link to Session Recording https://youtu.be/UywgL70FQ3s

11:10am CST

Modern Big Data Pipelines over Kubernetes [I] - Eliran Bivas, Iguazio

Big data used to be synonymous with Hadoop, but our ecosystem has evolved over time with new database, streaming and machine learning solutions which don’t necessarily benefit from the Hadoop deployment model of Map/Reduce, YARN and HDFS. These solutions require a generic cluster scheduling layer to host multiple workloads such as Kafka, Spark and TensorFlow, alongside databases such as Cassandra, Elasticsearch and cloud-based storage.

Eliran Bivas is a senior big data architect with years of hands-on experience working on both big data and cloud native solutions. Eliran will go over a common solution framework to create cloud native end-to-end analytics applications. It involves using Kubernetes as an alternative to YARN, running Spark, Presto, machine learning frameworks (TensorFlow, Python and Spark ML kits) and serverless functions coupled with local and cloud-based storage. The session will showcase customer use cases from IoT, automotive, cloud SaaS and finance. It will also include a live solution demo which demonstrates the benefits of using big data and analytics over a cloud native architecture, eliminating the existing challenges of complexity and moving towards a continuous integration and development architecture for big data.

Speakers

Eliran Bivas

Senior Big Data Architect, iguazio

Eliran Bivas is a senior big data architect at iguazio and a self-proclaimed tech junkie with a passion for innovation. Eliran is skilled with object-oriented design and development, having worked extensively on cloud native environments. He has broad experience developing with cloud...

KubeCon 2017 Bivas pdf

Friday December 8, 2017 11:10am - 11:45am CST
Meeting Room 9C, Level 3

Data + Machine Learning - KubeCon

Audience Intermediate
Link to Session Recording https://youtu.be/8tIcisyXWDU

11:55am CST

Kafka Operator: Managing and Operating Kafka Clusters in Kubernetes [A] - Nenad Bogojevic, Amadeus

In this talk we will demonstrate an approach to managing Kafka clusters in Kubernetes deployments. We will show how we can provision Kafka clusters and configure them using Kubernetes concepts and an operator process. The Kafka and ZooKeeper cluster elements will be provisioned using StatefulSets. As these applications benefit from high-performance storage, we will also show how we can use node selectors or persistent volume claims to schedule instances on the correct hardware. In order for clients to use the cluster, the necessary message topics have to be configured in it. We will show how, using an operator process based on Kubernetes custom resources or ConfigMaps, we can manage this configuration in a descriptive manner and ensure consistent configuration across different development and operations stages as well as cluster restarts. Finally, we will discuss how all this ties in with the service catalog.
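The topic-management idea in this abstract can be pictured as a small reconciliation step: read the desired topics from a ConfigMap and compare them with what the cluster already has. The sketch below is only an illustration of that pattern, not the operator presented in the talk. The `name -> "partitions,replication"` ConfigMap layout, the `Topic` type, and the function names are all assumptions made for this example, and it only plans actions instead of calling Kafka or the Kubernetes API.

```python
from typing import Dict, List, NamedTuple


class Topic(NamedTuple):
    """Hypothetical in-memory description of a Kafka topic."""
    partitions: int
    replication: int


def parse_topics(configmap_data: Dict[str, str]) -> Dict[str, Topic]:
    """Parse ConfigMap-style data of the assumed form name -> "partitions,replication"."""
    topics = {}
    for name, spec in configmap_data.items():
        partitions, replication = (int(part) for part in spec.split(","))
        topics[name] = Topic(partitions, replication)
    return topics


def plan_changes(desired: Dict[str, Topic], actual: Dict[str, Topic]) -> List[str]:
    """Return the actions needed to move the cluster towards the desired topics.

    Topics missing from the desired state are left alone, so a bad ConfigMap
    edit cannot delete data. Kafka only allows a partition count to grow,
    so a smaller desired count is ignored as well.
    """
    actions = []
    for name in sorted(desired):
        want = desired[name]
        have = actual.get(name)
        if have is None:
            actions.append(
                f"create {name} partitions={want.partitions} "
                f"replication={want.replication}"
            )
        elif want.partitions > have.partitions:
            actions.append(
                f"grow {name} partitions={have.partitions}->{want.partitions}"
            )
    return actions


if __name__ == "__main__":
    desired = parse_topics({"orders": "6,3", "payments": "3,3"})
    actual = {"orders": Topic(partitions=3, replication=3)}
    for action in plan_changes(desired, actual):
        print(action)
```

Because the plan is derived from the declared state on every run, running it again after a cluster restart produces the same actions, which is the consistency property the abstract describes.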

Speakers

Nenad Bogojevic

Software Architect, Amadeus

Nenad Bogojevic, platform solutions architect at Amadeus, has 20+ years of experience in software development. He has worked on e-commerce applications, natural language processing tools, and high-performance network middleware. In his job, Nenad is an architect who codes, a technical...

KubeCon 2017 Kafka Operator pdf

Friday December 8, 2017 11:55am - 12:30pm CST
Meeting Room 9C, Level 3

Data + Machine Learning - KubeCon

Audience Advanced
Link to Session Recording https://youtu.be/jAz8sdO1rgE

2:00pm CST

Distributed Database DevOps Dilemmas? Kubernetes to the Rescue - Denis Magda, GridGain

Distributed databases can make so many things easier for a developer... but not always for DevOps. OK, almost never for DevOps. Kubernetes has come to the rescue with easy application orchestration!

Orchestration is straightforward when a relational database serves as the data layer. However, it becomes a bit trickier when a distributed SQL database or another kind of distributed storage is used instead.

In this talk you will learn how Kubernetes can orchestrate a distributed database like Apache Ignite, in particular:

Cluster Assembling - database nodes auto-discovery in Kubernetes.
Database Resilience - automated horizontal scalability.
Database Availability - what’s the role of Kubernetes and the database.
Utilizing both RAM and disk - set up Apache Ignite to get in-memory performance with the durability of disk.

Speakers

Denis Magda

Director of Product Management, GridGain

Denis Magda is a Director of Product Management at GridGain Systems and Apache Ignite PMC Chair. He is an expert in distributed systems and platforms. Before joining GridGain and becoming a part of the Apache Ignite community, he worked for Oracle where he led the Java ME Embedded Porting...

distributed database deployment with kubernetes pptx

Friday December 8, 2017 2:00pm - 2:35pm CST
Meeting Room 9C, Level 3

Data + Machine Learning - KubeCon

Link to Session Recording https://youtu.be/k1y0Uoqepak

2:45pm CST

Democratizing Machine Learning on Kubernetes [I] - Joy Qiao & Lachlan Evenson, Microsoft

One of the largest challenges facing the machine learning community today is understanding how to build a platform to run common open-source machine learning libraries such as TensorFlow. Joy and Lachie are both passionate about making machine learning accessible to the masses using Kubernetes. In this session they'll share how to deploy a distributed TensorFlow training cluster complete with GPU scheduling on Kubernetes. We’ll also share how distributed TensorFlow training works, various options for distributed training, and when to choose what option. We’ll also share some best practices on using distributed TensorFlow on top of Kubernetes, based on our latest performance tests performed on public cloud providers. All work presented in this session will be accessible via a public GitHub repository.
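As a rough illustration of what a distributed TensorFlow training cluster needs from Kubernetes, each replica has to know the addresses of all parameter servers and workers plus its own role. TensorFlow reads this from a `TF_CONFIG` environment variable holding a JSON document with `cluster` and `task` keys. The sketch below builds that value under stated assumptions: the pods are named like StatefulSet replicas behind a headless service called `tf`, and the helper names and port 2222 are made up for this example. It is not taken from the speakers' repository.

```python
import json
from typing import Dict, List


def pod_hosts(job: str, replicas: int, service: str, port: int = 2222) -> List[str]:
    """Addresses for StatefulSet-style pods: <job>-<ordinal>.<service>:<port>.

    The naming scheme and the port are assumptions for this sketch.
    """
    return [f"{job}-{i}.{service}:{port}" for i in range(replicas)]


def tf_config(cluster: Dict[str, List[str]], task_type: str, index: int) -> str:
    """Build the JSON string a replica would receive in its TF_CONFIG variable."""
    if task_type not in cluster:
        raise ValueError(f"unknown task type: {task_type}")
    if not 0 <= index < len(cluster[task_type]):
        raise ValueError(f"no {task_type} replica with index {index}")
    config = {"cluster": cluster, "task": {"type": task_type, "index": index}}
    return json.dumps(config, sort_keys=True)


if __name__ == "__main__":
    cluster = {
        "ps": pod_hosts("ps", 1, "tf"),
        "worker": pod_hosts("worker", 2, "tf"),
    }
    # Value for the second worker; GPU requests would be set separately
    # in that pod's resource limits.
    print(tf_config(cluster, "worker", 1))
```

Generating the value from the replica counts keeps every pod's view of the cluster identical, so scaling the number of workers only requires changing one number.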

Speakers

Lachie Evenson

Principal Program Manager, Microsoft

Lachlan is a Principal Program Manager on the open source team at Azure. As a cloud native ambassador, emeritus Kubernetes steering committee member and release lead, Lachlan has deep operational knowledge of many Cloud Native projects. He spends his days building and contributing...

Joy Qiao

Senior Solution Architect - AI and Research Group, Microsoft

Joy Qiao is a senior solution architect in the AI & Research Group at Microsoft, where she is responsible for driving end-to-end AI/ML solutions on Azure among the partner ecosystem. Joy has over 15 years of IT industry experience including 11 years at Microsoft working as technical...

Democratizing Machine Learning on Kubernetes pdf

Friday December 8, 2017 2:45pm - 3:20pm CST
Meeting Room 9C, Level 3

Data + Machine Learning - KubeCon

Audience Intermediate
Link to Session Recording https://youtu.be/gvuZpRmCzTM

3:40pm CST

Kube-native Postgres [I] - Josh Berkus, Red Hat

Database systems remain the last frontier for Kubernetes, and at the Patroni Project we're working on conquering it. Having fully automated PostgreSQL clusters using Patroni, the project is now working on making Patroni more "Kubernetes native", so that SQL databases can be seen simply as a PostgreSQL resource.

In this talk, we will explain and demonstrate the current projects integrating Patroni PostgreSQL with Kubernetes, including:

* Patroni Operator, using the CoreOS Operator pattern
* Kube-native Patroni, which uses the Kubernetes controller instead of its own management

These works in progress will both acquaint attendees with tools they can use for their own high-availability database architectures, and explore some areas where Kubernetes could improve to support database systems better.

Speakers

Josh Berkus

Kubernetes Community Architect, Red Hat

Josh Berkus is the Kubernetes Community Manager for Red Hat. He contributes to Kubernetes, Etcd, Elekto, and a few other projects. Josh is a TAG Contributor Strategy co-chair, and recently retired from being a Kubernetes SIG lead. He also still dabbles in databases, despite being...

Friday December 8, 2017 3:40pm - 4:15pm CST
Meeting Room 9C, Level 3

Data + Machine Learning - KubeCon

Audience Intermediate
Link to Session Recording https://youtu.be/Zn1vd7sQ_bc

4:25pm CST

Don’t Hassle Me, I’m Stateful - Jeff Bornemann & Michael Surbey, Red Hat

Stateless, cloud-ready applications are the future for many enterprise users, but what do you do about legacy monoliths and existing vendor applications? New StatefulSet features within Kubernetes allow developers and administrators to work with these types of applications and still reap the many rewards of a containerized platform. This session will explore some of these features by deploying a full MongoDB cluster on top of OpenShift.

Speakers

Jeff Bornemann

Senior Consultant, Red Hat

Jeff has been developing software for Fortune 500 companies for many years, including contributions to multiple OSS projects. Jeff works with Red Hat's OpenShift platform, helping to bring container adoption to Red Hat customers.

Michael Surbey

Solutions Architect, Red Hat, Inc.

With a background in development, design, and management of enterprise IT-driven solutions, Mike enjoys helping U.S. public sector customers, contributors, and partners create a better citizen experience the open source way.

Don't Hassle Me, I'm Stateful pdf

Friday December 8, 2017 4:25pm - 5:00pm CST
Meeting Room 9C, Level 3

Data + Machine Learning - KubeCon

Audience Intermediate
Link to Session Recording https://youtu.be/yUfPd39-jHo