FROM RESEARCH TO PRODUCTION WITH PYTORCH 1.0
PETER GOLDSBOROUGH
WHAT IS PYTORCH?
DYNAMIC NEURAL NETWORKS WITH STRONG GPU ACCELERATION

A MACHINE LEARNING FRAMEWORK BORN WITH AN EMPHASIS ON
• DYNAMIC NEURAL NETWORKS
• DISTRIBUTED TRAINING
• EAGER EXECUTION
• GPU ACCELERATION
• SIMPLICITY OVER COMPLEXITY

MISSION
PyTorch enables . . .
• Research: 252 Mentions @ ICLR 2018 (87 in 2018)
• Ecosystems: ELF, Translate, Glow
• Partnerships
EAGER

    Matrix a = ...;
    Matrix b = ...;
    Matrix c = ...;
    scalar s = 7;
    d = s * a + b;
    e = matmul(c, d);
    if s > 0 {
      x = d;
    } else {
      x = e;
    }
    while s > 0 {
      x = input();
      c = matmul(c, x);
    }
    result = c;

STATIC

    Matrix<6, 9> a = ...;
    Matrix<6, 9> b = ...;
    Matrix<16, 6> c = ...;
    scalar s = 7;
    d = s * a + b;
    e = matmul(c, d);
    x = if_clause(s > 0, d, e);
    while_loop(s > 0, [x, c], lambda x,c {
      x = input();
      c = matmul(c, x);
    });
    result = evaluate(x);

[Diagram: under EAGER and STATIC, boxes labeled PROGRAM, GRAPH and RESULT]
EAGER
PRO
• No boundaries on flexibility
• Happier debugging
• No expensive compilation
CONTRA
• Harder to optimize
• Harder to deploy

STATIC
PRO
• Easier to optimize
• Easier to deploy
• Easier to post-process
CONTRA
• Harder to understand
• Harder to debug
• Less flexible
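The eager/static contrast above can be sketched in a few lines of plain Python. This is a toy illustration only: `Node`, `eager_program` and `build_static_graph` are hypothetical names, not PyTorch APIs, and scalars stand in for the matrices on the slide (with `-d` standing in for `e`). The eager function computes values as each statement runs; the static version first builds a graph of deferred operations and computes nothing until it is evaluated.

```python
# Toy illustration only: hypothetical classes, not PyTorch APIs.

def eager_program(s, a, b):
    # Eager: every statement runs immediately on concrete values,
    # so ordinary Python control flow decides what happens next.
    d = s * a + b
    if s > 0:
        x = d
    else:
        x = -d
    return x


class Node:
    """A deferred operation in a toy static graph."""

    def __init__(self, op, inputs=(), name=None):
        self.op, self.inputs, self.name = op, inputs, name

    def evaluate(self, feed):
        if self.op == "input":
            return feed[self.name]
        # Note: this toy evaluates all inputs, including both branches of if_clause.
        args = [node.evaluate(feed) for node in self.inputs]
        if self.op == "mul":
            return args[0] * args[1]
        if self.op == "add":
            return args[0] + args[1]
        if self.op == "neg":
            return -args[0]
        if self.op == "gt0":
            return args[0] > 0
        if self.op == "if_clause":  # mirrors if_clause(cond, then, else) on the slide
            cond, then, otherwise = args
            return then if cond else otherwise
        raise ValueError(f"unknown op: {self.op}")


def build_static_graph():
    # Static: the program only describes the computation; control flow
    # has to be expressed as a graph node rather than a Python `if`.
    s, a, b = (Node("input", name=n) for n in "sab")
    d = Node("add", (Node("mul", (s, a)), b))
    return Node("if_clause", (Node("gt0", (s,)), d, Node("neg", (d,))))


graph = build_static_graph()  # nothing has been computed yet
eager_result = eager_program(7, 2.0, 1.0)
static_result = graph.evaluate({"s": 7, "a": 2.0, "b": 1.0})
```

Because the static graph exists as data before it runs, it could be inspected, optimized or shipped elsewhere, which is the trade-off the PRO and CONTRA lists describe.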
PYTORCH 1.0
A SEAMLESS PATH FROM RESEARCH TO PRODUCTION WITH TORCH SCRIPT

PyTorch Eager Mode
Models are Python programs
• Simple
• Debuggable: print and pdb
• Hackable: use any Python library
• Needs Python to run
• Difficult to optimize and parallelize

PyTorch Script Mode
Models are programs written in an optimizable subset of Python
• Production deployment
• No Python dependency
• Optimizable

PYTORCH JIT
Tools to transition eager code into script mode
EAGER MODE: for prototyping, training, and experiments
SCRIPT MODE: for use at scale in production
torch.jit.trace and @torch.jit.script take a model from eager mode to script mode.

Transitioning a model with torch.jit.trace

Take an existing eager model, and provide example inputs.
The tracer runs the function, recording the tensor operations performed.
We turn the recording into a Torch Script module.
• Can reuse existing eager model code
• ⚠ Control-flow is ignored

    import torch
    import torchvision

    def foo(x, y):
        return 2*x + y

    # trace a model by providing example inputs
    traced_foo = torch.jit.trace(foo,
                                 (torch.rand(3), torch.rand(3)))
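Why tracing ignores control flow can be seen with a toy tracer written in plain Python. This is a sketch under stated assumptions, not how torch.jit.trace is implemented: `Traced`, `trace` and `branchy` are hypothetical names, and floats stand in for tensors. The tracer records arithmetic on a tape while the function runs once on the example input; a Python `if` is resolved during that run, so only the branch that was taken is recorded.

```python
# Toy tracer for illustration only; hypothetical names, floats instead of tensors.

class Traced:
    """A value that records every arithmetic operation applied to it."""

    def __init__(self, value, tape, name):
        self.value, self.tape, self.name = value, tape, name

    def _record(self, op, other):
        is_traced = isinstance(other, Traced)
        other_value = other.value if is_traced else other
        other_name = other.name if is_traced else repr(other)
        result = self.value + other_value if op == "add" else self.value * other_value
        name = f"t{len(self.tape)}"
        self.tape.append((name, op, self.name, other_name))
        return Traced(result, self.tape, name)

    def __add__(self, other):
        return self._record("add", other)

    def __mul__(self, other):
        return self._record("mul", other)

    __radd__ = __add__  # addition and multiplication are commutative here
    __rmul__ = __mul__

    def __gt__(self, other):
        # The comparison yields a plain bool, so the tape never sees the branch.
        return self.value > other


def trace(fn, *example_inputs):
    """Run fn once on example inputs and return a function that replays the tape."""
    tape = []
    args = [Traced(v, tape, f"arg{i}") for i, v in enumerate(example_inputs)]
    out = fn(*args)

    def replay(*inputs):
        env = {f"arg{i}": v for i, v in enumerate(inputs)}
        for name, op, lhs, rhs in tape:
            left = env[lhs]
            right = env[rhs] if rhs in env else float(rhs)  # constant operand
            env[name] = left + right if op == "add" else left * right
        return env[out.name]

    return replay, tape


def foo(x, y):
    return 2 * x + y  # same function as on the slide


def branchy(x):
    if x > 0:  # decided once, at trace time
        return x * 2
    return x * -1


traced_foo, foo_tape = trace(foo, 1.0, 1.0)
traced_branchy, branchy_tape = trace(branchy, 1.0)  # example input takes the x > 0 branch
```

With these definitions, `traced_foo(3.0, 4.0)` agrees with `foo(3.0, 4.0)`, while `traced_branchy(-3.0)` replays the `x * 2` branch and so disagrees with `branchy(-3.0)`. That mismatch is what the control-flow warning refers to.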
    traced_resnet = torch.jit.trace(torchvision.models.resnet18(),
                                    torch.rand(1, 3, 224, 224))

Transitioning a model with @torch.jit.script
Write model directly in a subset of Python, annotated with @torch.jit.script or @torch.jit.script_method
• Control-flow is preserved
• print statements can be used for debugging
• Remove the annotations to debug using standard Python tools.
You can mix both trace and script in a single model.

    import torch
    from torch import nn

    class RNN(torch.jit.ScriptModule):
        def __init__(self, W_h, U_h, W_y, b_h, b_y):
            super(RNN, self).__init__()
            self.W_h = nn.Parameter(W_h)
            self.U_h = nn.Parameter(U_h)
            self.W_y = nn.Parameter(W_y)
            self.b_h = nn.Parameter(b_h)
            self.b_y = nn.Parameter(b_y)

        @torch.jit.script_method
        def forward(self, x, h):
            y = []
            for t in range(x.size(0)):
                h = torch.tanh(x[t] @ self.W_h + h @ self.U_h + self.b_h)
                y += [torch.tanh(h @ self.W_y + self.b_y)]
                if t % 10 == 0:
                    print("stats: ", h.mean(), h.var())
            return torch.stack(y), h

Loading a model without Python

Torch Script models can be saved to a model archive, and loaded in a Python-free executable using a C++ API.
Our C++ Tensor API is the same as our Python API, so you can do preprocessing and postprocessing before calling the model.

    # Python: save model
    traced_resnet = torch.jit.trace(torchvision.models.resnet18(),
                                    torch.rand(1, 3, 224, 224))
    traced_resnet.save("serialized_resnet.pt")

    // C++: load and run model
    auto module = torch::jit::load("serialized_resnet.pt");
    auto example = torch::rand({1, 3, 224, 224});
    auto output = module->forward({example}).toTensor();
    std::cout << output.slice(1, 0, 5) << '\n';

What subset of PyTorch is valid Torch Script?
✓ Tensors and numeric primitives
✓ If statements
✓ Simple loops
✓ Code organization using nn.Module
✓ Tuples, Lists
✓ print and strings

✗ In-place updates to tensors or lists
✗ Direct use of standard nn.Modules like nn.Conv (trace them instead!)
✗ Calling grad() or backward() within @script

Coming in 1.0 stable
✓ Gradient propagation through script functions

For more details: https://pytorch.org/docs/master/jit.html#torch-script-language-reference

PYTORCH IN C++
FLEXIBILITY, SIMPLICITY, PERFORMANCE
ENABLING PATHWAYS ACROSS LANGUAGE BOUNDARIES
[Diagram: RESEARCH and PRODUCTION, EAGER and SCRIPT, PYTHON and C++]

ATEN

    #include ...
    at::Tensor x = ...
    auto z = x + y;
    z.mul_(2);
    z.backward();
    std::cout << x.grad();

PYTORCH C++ API
• C++ EXTENSIONS
• C++ FRONTEND

C++ EXTENSIONS
The power of C++, CUDA and ATen in imperative PyTorch models

C++ EXTENSION

    #include ...

    cv::Mat output;
    cv::warpPerspective(input, output, warp, {64, 64});

    return torch::from_blob(output.ptr...

    PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
      m.def("compute", &compute);
    }

SETUPTOOLS

    from setuptools import setup
    from torch.utils.cpp_extension \
        import BuildExtension, CppExtension

    setup(
        name='extension',
        packages=['extension'],
        ext_modules=[CppExtension(
            name='extension',
            sources=['extension.cpp'],  # sources must be a list
        )],
        cmdclass=dict(build_ext=BuildExtension))

JIT EXTENSION

    import torch.utils.cpp_extension

    # compiles and loads the extension at runtime
    module = torch.utils.cpp_extension.load(
        name='extension',
        sources=['extension.cpp'],
    )
    module.compute(...)

PYTHON INTEGRATION

    import torch
    import extension

    image = torch.randn(128, 128)
    warp = torch.randn(3, 3)
    output = extension.compute(image, warp)

C++ FRONTEND
The aesthetics of imperative PyTorch for high performance, pure C++ research environments

MISSION
The aesthetics of PyTorch in pure C++

MOTIVATION
Enable research in environments that are . . .
• LOW LATENCY
• BARE METAL
• MULTITHREADED
• ALREADY C++

VALUES

• torch::nn: NEURAL NETWORKS
• torch::optim: OPTIMIZERS
• torch::data: DATASETS & DATA LOADERS
• torch::serialize: SERIALIZATION
• torch::python: PYTHON INTER-OP
• torch::jit: TORCH SCRIPT INTER-OP

C++

    #include ...

    Net net;
    auto data_loader = torch::data::data_loader(
        torch::data::datasets::MNIST("./data"));
    torch::optim::SGD optimizer(net->parameters());
    for (size_t epoch = 1; epoch <= 10; ++epoch) {
      for (auto batch : data_loader) {
        optimizer.zero_grad();
        auto prediction = net->forward(batch.data);
        auto loss = torch::nll_loss(prediction, batch.label);
        loss.backward();
        optimizer.step();
      }
      if (epoch % 2 == 0)
        torch::save(net, "net.pt");
    }

PYTHON

    net = Net()
    data_loader = torch.utils.data.DataLoader(
        torchvision.datasets.MNIST('./data'))
    optimizer = torch.optim.SGD(net.parameters())
    for epoch in range(1, 11):
        for data, target in data_loader:
            optimizer.zero_grad()
            prediction = net.forward(data)
            loss = F.nll_loss(prediction, target)
            loss.backward()
            optimizer.step()
        if epoch % 2 == 0:
            torch.save(net, "net.pt")

pytorch.org
PETER GOLDSBOROUGH