TedNet: A Pytorch Toolkit for Tensor Decomposition Networks

Yu Pan (a), Maolin Wang (c), Zenglin Xu (a,b,∗)

(a) Harbin Institute of Technology Shenzhen, Shenzhen, China
(b) Pengcheng Lab, Shenzhen, China
(c) University of Electronic Science and Technology of China, Chengdu, China

Abstract

Tensor Decomposition Networks (TDNs) prevail for their inherently compact architectures. To make them easier to use, we present a toolkit named TedNet, built on the PyTorch framework, that gives researchers a flexible way to exploit TDNs. TedNet implements 5 kinds of tensor decomposition (i.e., CANDECOMP/PARAFAC (CP), Block-Term Tucker (BT), Tucker-2, Tensor Train (TT) and Tensor Ring (TR)) on the traditional deep neural layers, namely the convolutional layer and the fully-connected layer. With these basic layers, it is simple to construct a variety of TDNs such as TR-ResNet, TT-LSTM, etc. TedNet is available at https://github.com/tnbar/tednet.

Keywords: Tensor Decomposition Networks, Deep Neural Networks, Tensor Optimization

1. Introduction

Tensor Decomposition Networks (TDNs) are constructed by decomposing deep neural layers into tensor formats. Since the original tensor

of a layer can be recovered from its tensor decomposition cores, TDNs are often regarded as a compression method for the corresponding networks. Compared with traditional networks like Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), TDNs can be much smaller and occupy far less memory.

∗Corresponding author
Email addresses: iperryuu@.com (Yu Pan), [email protected] (Maolin Wang), [email protected] (Zenglin Xu)



Figure 1: The framework of TedNet. TedNet is based on PyTorch and adopts NumPy for numerical calculations. Tensor decomposition (TD) can be applied to convolutional layers or linear layers; we implement several variants, including CP, Tucker, Tensor Ring, Tensor Train, and Block-Term Tucker. On the right of the figure, two tensorial classical neural blocks (i.e., ResNet and RNN) built on the tensor decomposition layers are illustrated.

For example, TT-LSTM [1], BT-LSTM [2] and TR-LSTM [3] reduce the number of parameters by factors of 17554, 17414 and 34192, respectively, while achieving higher accuracy than the original models. With such light-weight architectures and good performance, TDNs are promising for resource-restricted settings such as mobile devices and microcomputers. Against this background, we designed the TedNet package to make it convenient for researchers to explore TDNs.

Related packages include T3F [4], Tensorly [5], TensorD [6], TensorNetwork [7] and tntorch [8]. T3F is explicitly designed for Tensor Train decomposition on TensorFlow [9]. Also based on TensorFlow, TensorD supports CP and Tucker decomposition. By contrast, TedNet implements 5 kinds of tensor decomposition with a PyTorch [10] backend. TensorNetwork and Tensorly incorporate abundant tensor calculation tools: TensorNetwork is built on TensorFlow, while Tensorly runs with a variety of backends such as CuPy, PyTorch and TensorFlow. Unfortunately, TensorNetwork and Tensorly both serve tensor decomposition algorithms rather

Function           Description
set_tn_type        Set the tensor decomposition type.
set_nodes          Generate tensor decomposition nodes, then edit node information.
set_params_info    Record information of parameters.
tn_contract        Contract the inputs with the tensor nodes.

Table 1: Functions of TNBase.

than TDNs, and lack an Application Programming Interface (API) for building tensorial neural networks directly. Compared with them, TedNet can set up a TDN layer quickly with a direct API call. In addition, we also provide several deep TDNs that are currently popular among researchers. Thanks to the dynamic graph mechanism of PyTorch, TedNet is also easy to debug.

2. TedNet Details

TedNet is designed with the goal of building TDNs by calling the corresponding APIs, which greatly simplifies the process of constructing TDNs. As shown in Figure 1, TedNet adopts PyTorch as the training framework because of its automatic differentiation and its convenience for building DNN models. Specifically, the fundamental module of TedNet is TNBase, an abstract class that inherits from torch.nn.Module. Thus, TedNet models can be combined seamlessly with other PyTorch models. As an abstract class, TNBase requires sub-classes to implement the 4 functions described in Table 1. In addition, for better numerical calculations, TedNet also uses NumPy [11] to assist in tensor operations.

Usually, DNNs are constructed with convolutional and linear (fully-connected) layers. The weight of a convolutional layer is a 4-mode tensor C ∈ R^{K×K×C_in×C_out}, where K is the convolutional window size, C_in denotes the number of input channels and C_out the number of output channels. A linear layer is a matrix W ∈ R^{I×O}, where I and O are the lengths of the input and output features, respectively. Similar to DNNs, TDNs consist of TD-CNNs and TD-Linears (for brevity, the prefix TD- denotes the corresponding tensor decomposition model), whose weights C and W are factorized with tensor decomposition. Following this pattern, TedNet provides 5 frequently-used tensor decompositions (i.e., CP, Tucker-2, Block-Term Tucker, Tensor Train and Tensor Ring), which satisfy most common situations.
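To make the notion of a TD-Linear concrete, the following is a minimal sketch (not TedNet's actual implementation) of a tensor-ring factorized linear layer in plain PyTorch. The class name, the choice of splitting I and O into two factors each, and the uniform TR-rank are illustrative assumptions; the sketch simply stores four TR cores and contracts them back into W ∈ R^{I×O} at forward time.

import torch
import torch.nn as nn

class TRLinearSketch(nn.Module):
    """Illustrative tensor-ring factorized linear layer (not TedNet's code).

    W in R^{I x O} is viewed as a 4-mode tensor of shape (i1, i2, o1, o2)
    and stored as four TR cores of shape (r, n_k, r); the ring is closed by
    tracing over the first and last rank indices.
    """
    def __init__(self, in_shape=(16, 16), out_shape=(16, 32), rank=5):
        super().__init__()
        modes = list(in_shape) + list(out_shape)   # (i1, i2, o1, o2)
        self.in_features = in_shape[0] * in_shape[1]
        self.out_features = out_shape[0] * out_shape[1]
        # One core per mode; all TR-ranks are set equal for simplicity.
        self.cores = nn.ParameterList(
            [nn.Parameter(0.1 * torch.randn(rank, n, rank)) for n in modes]
        )

    def full_weight(self):
        # Contract the ring of cores; the shared index r in the first and
        # last core realizes the trace that closes the ring.
        w = torch.einsum('rap,pbq,qcs,sdr->abcd', *self.cores)
        return w.reshape(self.in_features, self.out_features)

    def forward(self, x):
        return x @ self.full_weight()

With the default shapes above, a dense weight would need 256 × 512 = 131072 parameters, whereas the four cores hold only 3 · (5 · 16 · 5) + 5 · 32 · 5 = 2000; for instance, TRLinearSketch()(torch.randn(8, 256)) returns a tensor of shape (8, 512). This kind of parameter saving is the source of the compression ratios reported later.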

Figure 2: Experiments on UCF11. CR is short for Compression Ratio. Panel (a) shows the training process over 150 epochs (Top-1 accuracy vs. epoch); the legend reports accuracies of 0.8703 (LSTM), 0.8892 (BTT-LSTM), 0.8892 (CP-LSTM), 0.75 (TK2-LSTM), 0.9209 (TR-LSTM) and 0.9019 (TT-LSTM). Panel (b) shows the compression ratios of the BTT, CP, TK2, TR and TT models, ranging from roughly 146× to 177×.

                        Cifar10                      Cifar100
Model           Rank   Params   CR    Accuracy    Params   CR    Accuracy
ResNet-32        -     0.46M    1×    0.9228      0.47M    1×    0.6804
BTT-ResNet-32    4     0.08M    6×    0.8589      0.08M    6×    0.5206
CP-ResNet-32    10     0.03M   18×    0.8802      0.03M   18×    0.4445
TK2-ResNet-32   10     0.05M    9×    0.8915      0.06M    9×    0.5398
TR-ResNet-32    10     0.09M    5×    0.9076      0.09M    5×    0.653
TT-ResNet-32    10     0.09M    5×    0.9020      0.10M    5×    0.6386

Table 2: Experiments on Cifar10/100. Params denotes the number of parameters.

Notably, TedNet is the first open-source package that supports Tensor Ring decomposition. Besides, based on TD-CNNs and TD-Linears, TedNet provides several tensor decomposition based deep neural networks, e.g., TD-ResNets and TD-RNNs.
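Analogously, a TD-CNN replaces the 4-mode convolution kernel with factor tensors. The snippet below sketches a CP-factorized convolution in plain PyTorch; it is an illustration rather than TedNet's CP layer (an efficient implementation would apply the factors as a sequence of small convolutions instead of rebuilding the full kernel), and the class name, default rank and padding are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CPConv2dSketch(nn.Module):
    """Illustrative CP-factorized 2D convolution (not TedNet's code).

    The kernel W in R^{C_out x C_in x K x K} is stored as four CP factor
    matrices; the full kernel is rebuilt on the fly and fed to F.conv2d.
    """
    def __init__(self, c_in, c_out, kernel_size=3, rank=8, padding=1):
        super().__init__()
        self.padding = padding
        self.f_out = nn.Parameter(0.1 * torch.randn(c_out, rank))
        self.f_in = nn.Parameter(0.1 * torch.randn(c_in, rank))
        self.f_h = nn.Parameter(0.1 * torch.randn(kernel_size, rank))
        self.f_w = nn.Parameter(0.1 * torch.randn(kernel_size, rank))

    def forward(self, x):
        # CP reconstruction:
        # W[o, i, h, w] = sum_r f_out[o, r] * f_in[i, r] * f_h[h, r] * f_w[w, r]
        kernel = torch.einsum('or,ir,hr,wr->oihw',
                              self.f_out, self.f_in, self.f_h, self.f_w)
        return F.conv2d(x, kernel, padding=self.padding)

Conceptually, swapping the standard convolutions of a residual block for such factorized layers is what yields the TD-ResNet family evaluated in the next section.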

3. Benchmark

To date, TDNs have mostly been applied in the computer vision field. Thus, to validate the performance of TedNet, we conduct experiments on two datasets:

• UCF11 Dataset: contains 1600 video clips with a resolution of 320 × 240, divided into 11 action categories. Each category consists of 25 groups of videos, with more than 4 clips in each group.

• Cifar10/100 Dataset: both CIFAR10 and CIFAR100 consist of 50,000 training images and 10,000 test images of size 32 × 32 × 3. CIFAR10 has 10 object classes and CIFAR100 has 100 categories.

Listing 1: A usage sample of TR-LeNet-5.

import tednet.tnn.tensor_ring as tr

import torch
import torch.nn.functional as F
import torch.optim as optim

from torchvision import datasets, transforms

# Define the MNIST data loader
data_loader = torch.utils.data.DataLoader(
    datasets.MNIST('./data', train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
    batch_size=128, shuffle=True)

# Define TR-LeNet5 with 10 output classes and TR-ranks [6, 6, 6, 6]
model = tr.TRLeNet5(10, [6, 6, 6, 6])
optimizer = optim.SGD(model.parameters(), lr=2e-2)

# Train the model
model.train()
for epoch in range(20):
    for data, target in data_loader:
        optimizer.zero_grad()
        output = model(data)
        loss = F.cross_entropy(output, target)
        loss.backward()
        optimizer.step()

On UCF11, our goal is to complete a video classification task. Using the same setting as described in the literature [3], we first extract a feature of dimension 2048 from each frame of a video with Inception-V3 [12], and then feed these features as step inputs into the TD-LSTMs. The results are shown in Figure 2: almost every tensor decomposition model achieves better accuracy than the baseline, except Tucker-2. On Cifar10/100, we construct an image classification task. By applying TD-ResNet-32, we can validate the performance of all the tensor decompositions in TedNet. The results are shown in Table 2. Unlike the results on UCF11, tensor decomposition here leads to some loss of accuracy compared with the original model, although the loss is small for TR-ResNet-32 and TT-ResNet-32.
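As a rough guide to the UCF11 preprocessing described above, the sketch below extracts 2048-dimensional per-frame features with torchvision's Inception-V3. It only approximates the pipeline of [3] (the exact frame sampling and preprocessing may differ), and the random tensor standing in for decoded frames is a placeholder.

import torch
import torch.nn as nn
from torchvision import models, transforms

# Load Inception-V3 and strip the classification head so that the forward
# pass returns the 2048-dimensional pooled features used as step inputs.
backbone = models.inception_v3(pretrained=True)
backbone.fc = nn.Identity()
backbone.eval()

# Inception-V3 expects 299x299 RGB frames normalized with ImageNet
# statistics; this transform would be applied to each decoded frame.
preprocess = transforms.Compose([
    transforms.Resize((299, 299)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Random tensors stand in for a clip of 6 preprocessed frames.
frames = torch.rand(6, 3, 299, 299)

with torch.no_grad():
    features = backbone(frames)  # shape (6, 2048): step inputs of a TD-LSTM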

4. Installation and Illustrative Examples

There are two ways to install TedNet. Since the source code of TedNet is hosted on GitHub, it is feasible to install from the downloaded code with the command python setup.py install. A simpler and recommended alternative is to install TedNet through PyPI (https://pypi.org/project/tednet) with the command pip install tednet. After installation, all tensor decomposition models of TedNet can be used.

An example of Tensor Ring is shown in Listing 1. Tensor Ring decomposition can be used by importing the module tednet.tnn.tensor_ring. The usage of the other decompositions is the same, and more details can be found in the documentation (https://tednet.readthedocs.io).
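As a small usage note continuing Listing 1, evaluation follows ordinary PyTorch practice. The snippet below reuses the imports and the trained model from Listing 1 and computes the MNIST test accuracy; the test loader simply mirrors the training loader.

# Evaluate the TR-LeNet5 trained in Listing 1 on the MNIST test split.
test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('./data', train=False, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
    batch_size=128, shuffle=False)

model.eval()
correct = 0
with torch.no_grad():
    for data, target in test_loader:
        pred = model(data).argmax(dim=1)
        correct += (pred == target).sum().item()

print('Test accuracy: {:.4f}'.format(correct / len(test_loader.dataset)))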

5. Conclusion

In this paper, we present a package named TedNet that is specially designed for TDNs. TedNet is completely open-source and distributed under the MIT license. Compared with other related Python packages, TedNet covers the largest number of tensor decomposition variants. As for performance, the tensorial networks in TedNet reach accuracies as high as those reported in their original papers.

Acknowledgements

This work was partially supported by the National Key Research and Development Program of China (No. 2018AAA0100204) and a fundamental program of the Shenzhen Science and Technology Innovation Commission (No. ZX20210035).

References

[1] Y. Yang, D. Krompass, V. Tresp, Tensor-train recurrent neural networks for video classification, in: D. Precup, Y. W. Teh (Eds.), Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 August 2017, Vol. 70 of Proceedings of


Machine Learning Research, PMLR, 2017, pp. 3891–3900. URL http://proceedings.mlr.press/v70/yang17e.html

[2] J. Ye, L. Wang, G. Li, D. Chen, S. Zhe, X. Chu, Z. Xu, Learning compact recurrent neural networks with block-term tensor decomposition, in: 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18-22, 2018, IEEE Computer Society, 2018, pp. 9378–9387. doi:10.1109/CVPR.2018.00977.

[3] Y. Pan, J. Xu, M. Wang, J. Ye, F. Wang, K. Bai, Z. Xu, Compressing recurrent neural networks with tensor ring for action recognition, in: The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019, AAAI Press, 2019, pp. 4683–4690.

[4] A. Novikov, P. Izmailov, V. Khrulkov, M. Figurnov, I. V. Oseledets, Tensor train decomposition on tensorflow (T3F), J. Mach. Learn. Res. 21 (2020) 30:1–30:7. URL https://www.jmlr.org/papers/v21/18-008.html

[5] J. Kossaifi, Y. Panagakis, A. Anandkumar, M. Pantic, Tensorly: Tensor learning in python, J. Mach. Learn. Res. 20 (2019) 26:1–26:6. URL http://jmlr.org/papers/v20/18-277.html

[6] L. Hao, S. Liang, J. Ye, Z. Xu, TensorD: A tensor decomposition library in TensorFlow, Neurocomputing 318 (2018) 196–200. doi:10.1016/j.neucom.2018.08.055. URL https://doi.org/10.1016/j.neucom.2018.08.055

[7] C. Roberts, A. Milsted, M. Ganahl, A. Zalcman, B. Fontaine, Y. Zou, J. Hidary, G. Vidal, S. Leichenauer, Tensornetwork: A library for physics and machine learning, CoRR abs/1905.01330. arXiv:1905.01330. URL http://arxiv.org/abs/1905.01330

[8] R. Ballester-Ripoll, tntorch - tensor network learning with PyTorch (2018). URL https://github.com/rballester/tntorch

[9] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg, R. Monga, S. Moore, D. G. Murray, B. Steiner, P. A. Tucker, V. Vasudevan, P. Warden, M. Wicke, Y. Yu, X. Zheng, Tensorflow: A system for large-scale machine learning, in: K. Keeton, T. Roscoe (Eds.), 12th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2016, Savannah, GA, USA, November 2-4, 2016, USENIX Association, 2016, pp. 265–283.

[10] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, A. Lerer, Automatic differentiation in pytorch.

[11] S. van der Walt, S. C. Colbert, G. Varoquaux, The numpy array: A structure for efficient numerical computation, Comput. Sci. Eng. 13 (2) (2011) 22–30. doi:10.1109/MCSE.2011.37. URL https://doi.org/10.1109/MCSE.2011.37

[12] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, Rethinking the inception architecture for computer vision, in: 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, IEEE Computer Society, 2016, pp. 2818–2826. doi:10.1109/CVPR.2016.308. URL https://doi.org/10.1109/CVPR.2016.308

Required Metadata

Current code version

Nr.  Code metadata description                                        Value
C1   Current code version                                             0.1.3
C2   Permanent link to code/repository of this code version           https://github.com/tnbar/tednet/releases/tag/0.1.3
C3   Legal Code License                                               MIT License
C4   Code versioning system used                                      git
C5   Software code languages, tools, and services used                Python, PyTorch
C6   Compilation requirements, operating environments & dependencies  Python3.X, NumPy
C7   If available, link to developer documentation/manual             https://tednet.readthedocs.io/en/latest/index.html
C8   Support email for questions                                      [email protected]

Table 3: Code metadata.
