MLOps NYC19 Conference Notes

These are my notes on the talks from MLOps NYC19.
Talks I watched
- MLOps in the Newsroom
- Netflix Presents: A Human-Friendly Approach to MLOps
- The Architecture That Powers Twitter’s Feature Store
- Serverless for ML Pipelines from A to Z
- Deep Learning on Business Data at Uber
- The Growth and Future of Kubeflow for ML
- Stateless ML Pipelines: Achieve reproducibility and automation while simplifying the pipeline
There are no videos for the training sessions, but you can get a sense of their content from the Review

MLOps in the Newsroom

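Information Platforms and the Rise of the Data Scientist (2009) describes using R / Hadoop, whereas the NYTimes uses TF / GCP
The NYTimes DS Software Stack
Deployment at The New York Times
News
- In the 20th century it was church + state
- In the 21st century it is church + state + data (data influences the first two)
Modeling
- Descriptive modeling
  - Readerscope
  - Finds patterns that describe the data
  - They needed finer-grained, real-time insight into what is happening right now
  - Who is reading what? And where are they?
  - Marketers can drill down to see, for example, what is happening in LA
  - Contextual bandit
- Predictive modeling
  - Predict who will subscribe and who will leave (predicting each stage of the funnel)
  - Because of the risk involved, both performance and interpretability matter
  - When an advertiser runs an ad, which readers does it work on? Present the relevant context => e.g. "our ad will have the most impact on the Travel section and the digital generation"
  - Text-based labeling (Inspired, Happiness, Sadness, Love, etc.)
- Prescriptive modeling
  - How to make recommendations
  - Uses Thompson sampling & bandits
  - Modeling Social Data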
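The talk only names Thompson sampling and bandits, so here is a minimal sketch of the idea, assuming binary rewards (click / no click) and a Beta(1, 1) prior per arm. The arm names, click rates, and the `simulate` helper are made up for illustration and are not from the NYTimes system.

```python
import random


class BetaBernoulliBandit:
    """Thompson sampling over arms with Bernoulli rewards (click / no click)."""

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior per arm, stored as [alpha, beta].
        self.params = {arm: [1, 1] for arm in arms}

    def select(self):
        # Draw one plausible click rate per arm from its posterior
        # and recommend the arm whose draw is highest.
        samples = {
            arm: self.rng.betavariate(alpha, beta)
            for arm, (alpha, beta) in self.params.items()
        }
        return max(samples, key=samples.get)

    def update(self, arm, reward):
        # A click increments alpha, a non-click increments beta.
        if reward:
            self.params[arm][0] += 1
        else:
            self.params[arm][1] += 1


def simulate(true_rates, rounds=5000, seed=0):
    """Run the bandit against hypothetical click rates; return pulls per arm."""
    bandit = BetaBernoulliBandit(true_rates, seed=seed)
    env = random.Random(seed + 1)
    pulls = {arm: 0 for arm in true_rates}
    for _ in range(rounds):
        arm = bandit.select()
        reward = env.random() < true_rates[arm]
        bandit.update(arm, reward)
        pulls[arm] += 1
    return pulls
```

Calling `simulate({"article_a": 0.02, "article_b": 0.05, "article_c": 0.20})` should send most of the traffic to `article_c` while still occasionally trying the other two, which is the explore/exploit trade-off that makes bandits a fit for recommendations.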
IDEA
- best AI is AI + IRL
  - respect for craft
  - respect for collaborators

Netflix Presents: A Human-Friendly Approach to MLOps

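Before a launch, Netflix wants to know, every day, the estimated audience size for a show
The Life of a Model
- EDA
  - Pick up a project and start working on it in a notebook
  - Look for correlations and draw scatter plots
  - Takes about 2 weeks
- Prototyping
  - Run various experiments, add features, try different models, and so on
  - 6-8 weeks
- Productionalize
  - Ship Model to Production (v1)
  - Scale & Deploy
  - ETL / Feature Engineering / Model Training / Model Hosting / Batch Scoring / Live Scoring / Audits / Scheduling etc
Metaflow
- Built for fast prototyping
- In the code, each step specifies the following action with next
- If only one part fails during prototyping, rerun from there with the resume command
- For compute power, just ask Titus (Netflix's container platform)
- Distributed processing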
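To make the next / resume idea concrete, here is a toy runner in plain Python. It is not Metaflow's API: in Metaflow itself a flow is a `FlowSpec` subclass whose `@step` methods call `self.next(...)`, and a failed run is continued with the `resume` command. `TinyFlow`, the step functions, and the simulated failure below are all hypothetical.

```python
class TinyFlow:
    """Toy runner: each step returns the name of the next step, or None to stop."""

    def __init__(self, steps, start):
        self.steps = steps        # step name -> function(state) -> next step name
        self.start = start
        self.state = {}           # artifacts shared between steps
        self.failed_at = None     # name of the step that raised, if any

    def run(self, from_step=None):
        name = from_step or self.start
        while name is not None:
            try:
                name = self.steps[name](self.state)
            except Exception:
                self.failed_at = name   # remember where to resume from
                return False
        self.failed_at = None
        return True

    def resume(self):
        # Re-run from the failed step, reusing artifacts of completed steps.
        return self.run(from_step=self.failed_at)


calls = []                 # records which steps actually executed
flaky = {"fail": True}     # simulated bug that gets fixed before resuming


def load(state):
    calls.append("load")
    state["rows"] = [1, 2, 3]
    return "train"


def train(state):
    calls.append("train")
    if flaky["fail"]:
        raise RuntimeError("simulated failure in the train step")
    state["model"] = sum(state["rows"]) / len(state["rows"])
    return "end"


def end(state):
    calls.append("end")
    return None
```

With `flow = TinyFlow({"load": load, "train": train, "end": end}, "load")`, the first `flow.run()` stops at `train`; after setting `flaky["fail"] = False`, `flow.resume()` continues from `train` without repeating `load`, which is what makes iterating on a single broken step cheap.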
For production deployment, Meson is used: meson create
- Netflix Tech Blog: Meson: Workflow Orchestration for Netflix Recommendations
Real time Scoring
A few months later
The Life of a Model
- Maintenance
  - Maintain the model and build version 2
  - Safely copy v1
  - Add new features and continue from there
inspect & debug
Pick up & iterate
- You can also run other people's experiments
- Tagging is supported too
Metaflow at scale
- Since it was introduced, a great many projects have been created

The Architecture That Powers Twitter’s Feature Store

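When many people build models, their feature engineering overlaps
- Some teams are already far along, others are only just adopting machine learning
- A new team can either fork an existing team's work or build from scratch
- So a library was created for the whole company to use
The Feature Store is a library
Not Twitter's, but there is also Feast, built by Gojek
Share
- Feature Catalog
- Succinct, declarative Definitions
Datasets
- Accessible both online and offline
- The video resolution is too low to make out the details
- Define the features to add
Offline Integration
- Uses Scala
- joinFeatures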
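The slides were hard to read, so the following is only a rough sketch of what a shared catalog with declarative definitions and an offline join could look like. `joinFeatures` is the only name taken from the talk; `Feature`, `FeatureCatalog`, `join_features`, and the sample features are hypothetical, and Twitter's actual library is written in Scala.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Feature:
    """Declarative feature definition that other teams can discover and reuse."""
    name: str
    entity: str          # key column the feature is joined on
    description: str
    default: Any = None  # value used when an entity has no stored feature


class FeatureCatalog:
    def __init__(self):
        self._features: Dict[str, Feature] = {}
        self._values: Dict[str, Dict[Any, Any]] = {}

    def register(self, feature: Feature, values: Dict[Any, Any]) -> None:
        self._features[feature.name] = feature
        self._values[feature.name] = dict(values)

    def search(self, keyword: str) -> List[str]:
        # Catalog lookup: find features whose description mentions the keyword.
        return [
            f.name for f in self._features.values()
            if keyword.lower() in f.description.lower()
        ]

    def join_features(self, rows: List[dict], feature_names: List[str]) -> List[dict]:
        # Offline join: enrich each training row with the requested features.
        joined = []
        for row in rows:
            enriched = dict(row)  # copy so the caller's rows are not mutated
            for name in feature_names:
                feature = self._features[name]
                key = row[feature.entity]
                enriched[name] = self._values[name].get(key, feature.default)
            joined.append(enriched)
        return joined
```

The point of the design is that a team writes a feature definition once, and every other team gets it by name instead of re-implementing the same feature engineering.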
Online Integration
Feature Store Client
Strato
- The Feature Store Client seems to send data to Strato, which then caches it, writes it to a DB, or serves it to services
The video ends after only 10 minutes. Nothing particularly memorable

Serverless for ML Pipelines from A to Z

Code / Model Development is Just the First Step
Pipeline example
- Weather information is added as well
- The term Feature Store comes up here too
Nuclio is used to accelerate ETL and streaming
- Nuclio Github
Nuclio is used for serving
- Nuclio Jupyter Github
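As a rough illustration of serving from a function, here is a sketch in the shape of a Nuclio Python handler, which takes `(context, event)`. The linear "model", its weights, and `fake_context` are hypothetical stand-ins so the function can be smoke-tested locally without Nuclio; a real deployment would load a trained model instead.

```python
import json
from types import SimpleNamespace

# Hypothetical stand-in for a trained model: a fixed linear scorer.
WEIGHTS = {"temperature": 0.5, "humidity": -0.2}
BIAS = 1.0


def predict(features):
    """Score one request; missing features count as 0."""
    return BIAS + sum(w * float(features.get(k, 0.0)) for k, w in WEIGHTS.items())


def handler(context, event):
    """Entry point in the (context, event) shape used by Nuclio Python functions."""
    body = event.body
    # The body may arrive as raw bytes, a JSON string, or an already parsed dict.
    if isinstance(body, (bytes, bytearray)):
        body = body.decode("utf-8")
    if isinstance(body, str):
        body = json.loads(body)
    score = predict(body)
    context.logger.info(f"scored request: {score}")
    return json.dumps({"score": score})


def fake_context():
    """Local stub with just the logger method the handler uses (not part of Nuclio)."""
    return SimpleNamespace(logger=SimpleNamespace(info=lambda message: None))
```

Locally, `handler(fake_context(), SimpleNamespace(body=b'{"temperature": 10, "humidity": 5}'))` returns a JSON string with the score, so the same function body can be exercised before it is deployed.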
Building ML Pipelines From (Serverless) Functions
- There's a Feature Store here too!
Demo
- Seems to use KFServing

Deep Learning on Business Data at Uber

Why deep learning?
- Deep learning is outperforming the algorithms that were in use before
- Overwhelming performance in certain domains (vision)
- Combined with the existing tree models to build hybrid models
Deep learning at Uber
How do you build a system like this?
- Option A : TFX
- Option B : Apache Spark (the one they use)
  - Powerful ETL
  - Easy integration with XGBoost
  - Existing systems were already using Spark
1) Feature Store
- Unifies real time and batch
2) Model Training
- How do you bring deep learning into Apache Spark?
- Preprocessing
- SparkML Pipelines
  - Estimator, Transformer, Pipeline
- Distributed Training
- Petastorm : data access for deep learning training
  - Parquet
- End-to-end Training Architecture
  - Petastorm Blog
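The Estimator / Transformer / Pipeline split mentioned above is easier to see in code. This is a plain-Python toy of the pattern, not the Spark ML API: an Estimator learns parameters in `fit` and returns a Transformer, and a Pipeline fits its stages in order on the output of the previous stage. The class names, columns, and data are made up.

```python
from statistics import mean, pstdev


class AddRatio:
    """Transformer with nothing to learn: adds out = num / den to each row."""

    def __init__(self, num, den, out):
        self.num, self.den, self.out = num, den, out

    def transform(self, rows):
        return [{**r, self.out: r[self.num] / r[self.den]} for r in rows]


class ScalerModel:
    """Transformer produced by Scaler.fit: standardizes one column."""

    def __init__(self, column, mu, sigma):
        self.column, self.mu, self.sigma = column, mu, sigma

    def transform(self, rows):
        return [
            {**r, self.column: (r[self.column] - self.mu) / self.sigma}
            for r in rows
        ]


class Scaler:
    """Estimator: learns mean and standard deviation, returns a ScalerModel."""

    def __init__(self, column):
        self.column = column

    def fit(self, rows):
        values = [r[self.column] for r in rows]
        sigma = pstdev(values) or 1.0  # avoid dividing by zero on constant columns
        return ScalerModel(self.column, mean(values), sigma)


class PipelineModel:
    """Fitted pipeline: applies every fitted stage in order."""

    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


class Pipeline:
    """Fits estimators in order, feeding each stage the previous stage's output."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            model = stage.fit(rows) if hasattr(stage, "fit") else stage
            rows = model.transform(rows)
            fitted.append(model)
        return PipelineModel(fitted)
```

Because the fitted pipeline carries the learned parameters, the same preprocessing is applied at training and prediction time, which is the main reason to wrap a deep learning model in this interface.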
3) Prediction Service
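- Java and the deep learning framework have to run together
4) Authoring
- Data scientists love Jupyter notebooks
- Can ideation, training, evaluation, and deployment of a deep learning model all happen in a single notebook?
- Data Access
- Data Preparation
- Model Construction
- Train the Model
- Deploy the Model
5) Don’t know Deep Learning?
- Even without knowing deep learning, you can use Ludwig
- Ludwig homepage, Ludwig Github
Recap
- A company with huge datasets can build powerful models with deep learning
- Keep in mind Uber's deep learning system architecture and the notebook-friendly API built for defining E2E DL pipelines
- Uses Apache Spark, Horovod, and Petastorm
- Ludwig builds deep learning models without code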

The Growth and Future of Kubeflow for ML

Putting an ML system together is very complex (this really seems to come up at every MLOps talk)
Problems MLOps teams face
- How do you become more than 10x as productive?
Kubeflow
Vibrant Ecosystem of Kubeflow
- An ecosystem that is developing very actively
Deploy & Manage
- Composable, Scalable, Portable
Why Kubernetes is good for MLOps
Kubeflow 0.6
- Metadata
  - Can store artifacts and define schemas
- Deployment
  - Kustomize replaces ksonnet
- Multi user support
- Pipelines
  - API and UI improvements
Why Anthos is good for MLOps
- Google Developers' introduction to Anthos
- Whether Anthos pays off probably depends on your scale

Stateless ML Pipelines: Achieve reproducibility and automation while simplifying the pipeline

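Nike
Data scientists and their teams should be able to take a model from pipeline to production and should understand the model's full lifecycle
What Stateless Pipelines Changed
- Uses Airflow -> alerts on failure
- Model pipelines can now be built in minutes
Lifecycle of an ML Project
- Curious how they do CI/CD
Pipeline
- Simplifies the previous Airflow configuration
- Not sure what the next part was built with. dsp?
Example model execution pipeline
Providing Paths
- Dev / Test / Prod are stored in different folders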
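One way to read "providing paths" is that every location is derived from the run's inputs rather than from state kept inside the pipeline. The sketch below assumes a layout of root / environment / model / run id / file; that layout, the function name, and the example values are my own guesses, not Nike's.

```python
ENVIRONMENTS = ("dev", "test", "prod")


def artifact_path(root, env, model_name, run_id, filename):
    """Build an artifact location purely from its inputs (no hidden state)."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    # Same inputs always give the same path, so any run can be located again.
    return "/".join([root.rstrip("/"), env, model_name, run_id, filename])
```

For example, `artifact_path("/mnt/models", "dev", "demand", "2019-09-24", "model.pkl")` gives `/mnt/models/dev/demand/2019-09-24/model.pkl`, and switching `env` to `prod` is the only change needed to promote the same pipeline.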
Metrics
Standard CI/CD pipeline
- Has a Jenkinsfile
Result

Reference

MLOps NYC 2019 Youtube
