Available online at www.sciencedirect.com ScienceDirect

Procedia Computer Science 36 ( 2014 ) 234 – 239

Complex Adaptive Systems, Publication 4 Cihan H. Dagli, Editor in Chief Conference Organized by Missouri University of Science and Technology 2014 - Philadelphia, PA Nonlinear Modeling using Neural Networks for Trading the Complex

Phoebe S. Wilesa, David Enkeb*

aMissouri University of Science and Technology, 223 Engineering Management, 600 West 14th Street, Rolla, MO 65409-0370, USA bMissouri University of Science and Technology, 227 Engineering Management, 600 West 14th Street, Rolla, MO 65409-0370, USA

Abstract

Recently, there has been a spike in the prices and popularity of . On a macroeconomic level, developing countries are increasing production; while on a microeconomic level, speculative traders are becoming more involved in the market. Agricultural products have a diverse array of factors that can affect the price (i.e. political, government, population, weather, supply and demand). prices can suffer from extreme volatility in the term, changing as much as 50% in one year. This research uses the soybean crush spread as a model. The soybean complex adds an interesting component as the underlying soybean product can be crushed into soymeal and soy oil. All three products (, soymeal, and soy oil) currently have contracts on the Chicago Mercantile Exchange. The crush represents the profit margin a processor will receive from crushing the soybeans into the underlying products (soymeal and soy oil). This research adds to the literature of agricultural price forecasting models, using artificial intelligence and nonlinear modeling. The performance of different neural network architectures and inputs to discover desirable returns for both speculative trading and hedging are investigated. ©ª5IF"VUIPST1VCMJTIFECZ&MTFWJFS#7 2014 Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/). 4FMFDUJPOBOEQFFSSFWJFXVOEFSSFTQPOTJCJMJUZPGTDJFOUJGJDDPNNJUUFFPG.JTTPVSJ6OJWFSTJUZPG4DJFODFBOE5FDIOPMPHZ Peer-review under responsibility of scientific committee of Missouri University of Science and Technology Keywords: soybean comple[ neural networks nonlinear modeling

1. Introduction

The soybean complex is one of the most heavily traded agriculture commodities. It consists of three underlying contracts: soybeans, soybean meal, and soybean oil. Soybeans can be processed into soymeal and soy oil. Soymeal is used for animal feeds, while soy oil can be used for food consumption [1]. In 2013 soy oil made up 63% of vegetable oil consumption [2]. With the many uses of soymeal, soy oil, and soybeans, the supply and demand of one can affect the price of the others. In 2013, approximately 76.53 million acres of soybeans were planted and 75.86 million acres of soybeans were harvested in the United States [3]. This makes soybeans the second largest crop produced in the United States. With a yield of 43.3 bushels per acre, this equates to 3.288 billion bushels of soybeans. The average price of soybeans in 2013 was $14.1, which equates to approximately $41.8 billion in production of soybeans [3]. With soybean cash

* Corresponding author. Tel.: 573-341-4749. E-mail address: [email protected].

1877-0509 © 2014 Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/). Peer-review under responsibility of scientific committee of Missouri University of Science and Technology doi: 10.1016/j.procs.2014.09.085 Phoebe S. Wiles and David Enke / Procedia Computer Science 36 ( 2014 ) 234 – 239 235 receipts of $41.8 billion, it is easy to see that soybeans play a vital role in not only the agricultural economy, but also the U.S. economy as a whole. However, in comparison to financial derivatives, the research on forecasting commodity prices is lacking [4, 5, 6]. “There’s a compelling reason to own commodities as an inflation or source of diversification [7].” It is estimated that money managers invested approximately 7.5 billion in the commodity markets during the first half of 2014 [8]. Thus this research aims to use neural networks to forecast soybean future prices for a competitive edge in commodity trading.

2. Soybean Crush

The crush margin calculates the price difference between soybean products and the actual soybeans [9]. One of the many reasons the crush margin is important is that it shows the profit/loss a processor can make. Speculators also use the soybean crush spread to trade on the narrowing or widening of the spread [10]. As an example, to short the crush, a trader could be long 10 soybean contracts, short 11 soybean meal contracts, and short 9 soybean oil contracts. To long the crush, an individual might be short 10 soybean contracts, long 11 soybean meal contracts, and long 9 soybean oil contracts [11]. When a trader believes the crush spread is narrowing, the trader would want to sell meal and oil and buy the soybeans. When the trader believes the crush spread is widening, the trader would do the opposite [10]. Listed below are the specifics for each contract:

• Soybeans have a contract size of 50,000 bushels, are priced in cents per bushel, and the symbol is S on the trading floor [12]. • Soybean meal has a contract size of 1000 short tons, is priced in dollars and cents, and the symbol is SM on the trading floor [13]. • Soybean oil has a contract size of 60,000 pounds, is priced in cents per pound, and the symbol is BO on the trading floor [14].

Typically, 1 bushel, or 60 pounds of soybeans, creates 48 pounds of meal, 11 pounds of oil, and 1 pound of waste. The Chicago Board of Trade (CBOT) and Chicago Mercantile Exchange (CME) crush calculation is 44 pounds of meal and 11 pounds of oil. The crush margin calculates the price difference in the soybean products and the actual soybeans [9]. This research follows the CME/CBOT crush margin and is calculated as follows:

Crush spread = .022SM +.11BO - .01S (1)

3. Forecast Modeling and Methodology

Neural networks are ideal for modeling the intricate interaction of the soybean complex as neural networks can approximate any function to a certain degree of accuracy [15]. By using a neural network to estimate the soybean crush margin, both the implicit stochastic process of the back contracts and the relation of the front contract crush spread can be determined from observed data [16]. Previous work has investigated modeling the soybean oil crush spread and the corn ethanol spread [17, 18]. For the soy oil spread, researchers use the soybean futures prices and soy oil futures prices [17]. For trading the corn ethanol spread, researchers utilized a variety of variables, including indexes, moving averages, energy returns, volatility, and lags [18]. There are two major issues to deal with when rolling over contracts: 1) the rollover date and 2) price adjustment [19]. As liquidity is not a major concern, this research uses the simple rolling over on the last trading date, with no price adjustment [19]. A “widely” accepted belief is that due to the erratic volume and volatile prices, data from the last month of a can be worthless [19]. However, neural networks in their design are expected to deal with such situations. Thus, last month’s data is used in this research. The front contract is the contract that is closest to expiry; the back contract is the contract that is a month or more behind expiry. In this research, the front contract is the closest contract to expiry, while the back contract is the second closest contract to expiry. Many traders, in addition to technical and fundamental analysis, observe the prices of back contracts as an idea of where prices are heading and the current level of trader sediment. This paper will explore using back contracts to model the soybean crush margin of the front contract. 236 Phoebe S. Wiles and David Enke / Procedia Computer Science 36 ( 2014 ) 234 – 239

This research will incorporate these two neural network architectures for modeling the soybean complex: a radial basis function (RBF) network, and Levenberg-Marquardt backpropagation (LM) network with 10 hidden neurons, a hyperbolic sigmoid transfer function (tansig), and a linear activation function (purelin). The hyperbolic sigmoid transfer function was chosen as it is the most commonly used activation function for business applications [20]. Previous research has found that the popular GARCH(1,1) model is not out-performed by more sophisticated models [21, 22]. Therefore, the volatility was found using a GARCH(1,1) model. The crush margin was calculated using the CBOT margin formula (equation 1) with the settlement prices of S1, SM1, and BO1.

4. Data Selection

Data selection is crucial in forecasting commodity prices. The data set in this research includes the following:

• Front soybean contracts (S1) settlement price • Front soy oil contracts (BO1) settlement price • Front soymeal contracts (SM1) settlement price • Back soybean contract (S2) settlement price • Back soy oil contracts (SB2) settlement price • Back soymeal contracts (SM2) settlement price • Volatility S2 • Volatility SM2 • Volatility BO2 • Crush margin

Data was acquired from the CME [23, 24, 25]. Contracts can be labeled numerically in order of expiration, thus S1 is the closest contract to expiry and S2 would be the second closest contract to expiry. Making inferences from a single chart is very difficult when viewing soymeal, soy oil, and soybeans given that soy oil prices, soymeal prices, and soybean prices differ in units. As an example, below are the individual charts of each contract in the study.

Fig. 1. S1 Settlement Price Fig. 2. SM1 Settlement Prices Fig. 3. BO1 Settlement Prices

As seen in Figures 1-3, there appears to be a relationship between soybeans, soymeal and soy oil. This research aims to model this relationship for trading opportunities. The variables noted above are all priced quite differently, thus the data was normalized. In normalization, the minimum and maximum of the variables are used. To give an idea of the prices, and the crush margin, the minimum and maximum of the variables are listed below in Table 1. Daily data was collected from January 6, 2010 to April 1, 2014. This gives 1045 data points once holidays and missing data are considered.

Table 1. Maximum and Minimum Values of the Variables

Settle Settle Settle Settle Settle Settle Crush

BO2 BO1 SM1 SM2 S2 S1 Margin

Maximum 60.39 59.77 548.1 538.4 1768.25 1771 1.2505

Minimum 35.99 35.84 249.6 251.3 912.25 908 -0.0808

Phoebe S. Wiles and David Enke / Procedia Computer Science 36 ( 2014 ) 234 – 239 237

5. Results

Described below are the results of the best performing models that were developed and tested.

1.1. Front Soybean Prices

Model 1

The first model was developed to predict S1 from BO1, SM1, S2, SB2, and SM2 using a RBF neural network. This network was trained with 1035 data points. After training there were 105 neurons and a mean square error of 4.9883e-05. The last 10 days were left out for predictive trading (testing).

Shown in Figure 4 are the predicted normalized prices of S1 (shown as a solid line) versus the actual prices of S1 (shown as a dashed line). The training performance is measured with mean square error and is 6.1276e-05. This model performs very well, as the predicted normalized prices are very close to the actual normalized prices of S1. The momentum and direction is clearly visible. A trader could easily trade the front soybean contract from this model solely using BO1, SM1, S2, SB2, and SM2.

Model 2

Model 2 uses a feed-forward neural network with the Levenberg-Marquardt back propagation training. This model also uses 1035 data points for training, testing, and validation. The network uses a hyperbolic transfer sigmoid function to predict S1 from BO1, SM1, S2, SB2, and SM2. The performance is measured by mean square error. The overall mean square error of this model is 7.7051e-05.

Once again, 10 days were left out for predictive testing. Figure 5 shows in black the predicted normalized prices of S1 (shown as a solid line) versus the actual normalized prices of S1 (shown as a dashed line). The mean square error of the predictive trading testing for the last ten days is 1.8e-03. While the LVM model performance is notable, it does not perform as well as the RBF model. However, the LVM does show the momentum and direction of S1 prices, and the model trains much quicker, which is important when trading.

0.67 0.7 Predicted S1 Predicted S1 0.66 Actual S1 Actual S1 0.68 0.65

0.64 0.66

0.63 0.64 0.62 0.62 0.61

0.6 0.6

0.59 0.58 0.58

1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10

Fig. 4. Model 1 Predicted vs. Actual Prices Fig. 5. Model 2 Predicted vs. Actual Prices

1.2. Crush Margin

Model 3

Model 3 is trained with an RBF and uses S2 prices, SM2 prices, BO2 prices, S2 volatility, SM2 volatility, and BO2 volatility to predict the front crush margin. Since the GARCH(1,1) model was used for volatility, this decreases 238 Phoebe S. Wiles and David Enke / Procedia Computer Science 36 ( 2014 ) 234 – 239

the data points to 1043 with the last 10 data points being left out for the predictive testing. After training, Model 3 had 562 neurons and a mean square error of 4.994e-04.

Once again, 10 days were left out for predictive testing. Figure 6 shows the predicted normalized prices of crush margin (shown as a solid line) versus the actual normalized prices of crush margin (shown as a dashed line). The mean square error of the predictive 10 days of trading is 0.0338. This model does not perform well. There is a slight general direction movement in line with the actual crush margin. However, this model runs very slowly and does not produce optimal trading results.

Model 4

Model 4 is trained with an LV back propagation and uses S2 prices, SM2 prices, BO2 prices, S2 volatility, SM2 volatility, and BO2 volatility to predict the front crush margin. Like Model 3, 1033 data points were used for training, testing, and validation, with the last 10 days left out for predictive testing. The performance as measured with mean square of this network is 2.8e-03.

Model 4 uses only the back contracts and back volatility to predict the front crush margin. Again, the last 10 days of data were left out for predictive testing. This simple model, with only six inputs, shows the direction and movement of the crush margin very well with a mean square error of 0.0083. This model trains quicker and has a lower mean square error than the equivalent RBF network in Model 3. Trading a narrowing or widening spread could be achieved by using this model. Again, notice in Figure 7 the predicted normalized prices of crush margin (shown as a solid line) versus the normalized actual prices of crush margin (shown as a dashed line).

0.5 0.5

0 0.45

-0.5 0.4

-1 0.35

-1.5 0.3

Predicted Crush Predicted Crush Actual Crush Actual Crush -2 0.25 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10

Fig. 6. Model 3 Predicted vs Actual Prices Fig. 7. Model 4 Predicted vs Actual Prices

6. Conclusions

This research has studied four different neural networks for modeling the soy complex. After modelling and testing, it does appear that the back contract does influence the front crush margin. A trader could easily make inferences about soymeal and soy oil prices by using Models 1 and 2 to predict the prices of soybeans for trading. The Levenberg-Marquardt back propagation (LM) network performs better for trading the crush spread. Also, the Levenberg-Marquardt back propagation (LM) network runs much quicker for trading the crush spread, which is vital in the fast paced commodity trading environment. The model also satisfies the initial goal of attempting to model the widening, narrowing, and momentum of the spread for trading purposes is satisfied. This research leads to further forecasting research questions, for example, whether research on lagged predications and different variables, such as technical indicators or volume, should be considered. The soy complex is one of the most interesting agriculture futures contracts to trade, but more research must be done on the interactions of the futures contracts in the soy complex.

Phoebe S. Wiles and David Enke / Procedia Computer Science 36 ( 2014 ) 234 – 239 239

References

1."SoyStats" SoyStats. . 2."SoyStats" SoyStats. . 3."NASS - Statistics By Subject." NASS - Statistics By Subject. . 4.12 Sato, Akira, Lukáš Pichl, and Taisei Kaizoji. "Using Neural Networks for Forecasting of Commodity Time Series Trends." Databases in Networked Information Systems. Springer Berlin Heidelberg (2013): 95-102. 5. Moshiri, Saeed, and Faezeh Foroutan. "Forecasting Nonlinear Crude Oil Futures Prices." The Energy Journal 27.4 (2006): 81-95. 6. Panella, Massimo, Francesco Barcellona, and Rita L. D'ecclesia. "Forecasting Energy Commodity Prices Using Neural Networks." Advances in Decision Sciences 2012 (2012): 1-26. 7. Johnson, Nicolas."Investors Cash In With a Commodities Trading Strategy."The Wall Street Journal. Dow Jones & Company, n.d. Web. 15 July 2014. . 8. Shumsky, Tatyana. "Investors Cash In With a Commodities Trading Strategy."The Wall Street Journal. Dow Jones & Company, n.d. Web. 15 July 2014. . 9. "Grain Market Analysis FAQ." Grain Market Analysis FAQ. . 10. Mitchell, John. "Soybean Futures Crush Spread Arbitrage: Trading Strategies and Market Efficiency." Journal of Risk and Financial Management 3.1 (2010): 63-96. 11. "Soybean Crush Future Contract Specs." Soybean Crush Future Contract Specs. . 12. "Soybean Futures Contract Specs." Soybean Futures Contract Specs. . 13. "Soybean Meal Futures Contract Specs." Soybean Meal Futures Contract Specs. . 14. "Soybean Oil Futures Contract Specs." Soybean Oil Futures Contract Specs. . 15. Hornik, Kurt, Maxwell Stinchcombe, and Halbert White. "Multilayer feedforward networks are universal approximators." Neural Networks 2.5 (1989): 359-366. 16. Anders, Ulrich, Olaf Korn, and Christian Schmitt. "Improving the pricing of options: a neural network approach." Journal of Forecasting 17.56 (1998): 369-388. 17. Dunis, Christian , Jason Laws, Peter Midddleton, and Andreas Karathanasopoulos. "Trading and hedging the corn/ethanol crush spread using time-varying leverage and nonlinear models." The European Journal of Finance (2013). 18. Dunis, Christian L., Jason Laws, and Ben Evans. "Modeling and Trading the Soybean-oil crush spread with Recurrent and Higher Order Networks: A Comparative Analysis." Neural Network World 16.3 (2006): 193-213. 19. Ma, Christopher K., Jeffery M. Mercer, and Matthew A. Walker. "Rolling Over Futures Contracts: A Note." The Journal of Futures Markets, 12.2 (1992): 203-17. 20. Yao, Jingtao, Yili Li, and Chew Lim Tan. "Option price forecasting using neural networks." Omega 28.4 (2000): 455-466. 21. Hansen, Peter R., and Asger Lunde. "A Forecast Comparison Of Volatility Models: Does Anything Beat A GARCH(1,1)?." Journal of Applied Econometrics 20.7 (2005): 873-889. 22. Bollerslev, Tim. "Generalized Autoregressive Conditional Heteroscedasticity." Journal of Econometrics 31 (1986): 307-327. 23. "CME Soybean Meal Futures (SM)." Data from Quandl. . 24. "CME Soybean Oil Futures (BO)." Data from Quandl. . 25. "CME Soybeans Futures (S)." Data from Quandl. .