
ISIJ International, Vol. 61 (2021), No. 7

© 2021 ISIJ 2100

ISIJ International, Vol. 61 (2021), No. 7, pp. 2100–2110

https://doi.org/10.2355/isijinternational.ISIJINT-2020-540

* Corresponding author: E-mail: [email protected]

© 2021 The Iron and Steel Institute of Japan. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs license (https://creativecommons.org/licenses/by-nc-nd/4.0/).

End-point Temperature Preset of Molten Steel in the Final Refining Unit Based on an Integration of Deep Neural Network and Multi-process Operation Simulation

Jianping YANG,1) Jiangshan ZHANG,1) Weida GUO,2) Shan GAO2) and Qing LIU1,3)*

1) State Key Laboratory of Advanced Metallurgy, University of Science and Technology Beijing, Beijing, 100083 China. 2) Laiwu Iron and Steel Group Yinshan Section Steel Co., Ltd., Jinan, 271104 Shandong, China. 3) Engineering Research Center of MES Technology for Iron & Steel Production, Ministry of Education, Beijing, 100083 China.

(Received on September 7, 2020; accepted on April 13, 2021; J-STAGE Advance published date: May 27, 2021)

End-point temperature preset of molten steel in the final refining unit is as important as its prediction for casting temperature control. However, it has not yet received sufficient attention, and the preset models proposed in the literature usually cannot be used as practical tools because of their inherent shortcomings, e.g., oversimplifications of the real environment during modelling. In this study, a novel preset approach was developed by integrating a deep neural network (DNN) and multi-process operation simulation (MOS). By using MOS, accurate transfer times of heats between the final refining unit and the continuous caster can be obtained before their actual scheduling, which is essential for the practical availability of the DNN-based preset model. The DNN preset model was trained and tested while varying the values of its hyper-parameters, based on a large number of data points collected from a real steelmaking plant. Furthermore, preset models based on extreme learning machine (ELM) and multivariate polynomial regression (MVPR) were also established for comparison. The testing results indicate that the DNN preset model with 3 hidden layers containing 8, 4 and 2 neurons in sequence shows an advantage over the alternatives because of its evident improvement in preset accuracy and robustness. Meanwhile, a fine classification of data points considering metallurgical expertise can improve the generalization performance of the DNN preset model. The integrated approach has been applied in the studied steelmaking plant, and the ratio of qualified heats has increased by 9.5% compared with before its use.

KEY WORDS: final refining unit; end-point temperature preset; molten steel; deep neural network; multi-process operation simulation; integrated approach.

1. Introduction

Due to its characteristics of high temperature (1 500–1 700°C), multiple processes and quasi-continuity, along with various complex physical and chemical reactions among gas-liquid-solid multiphase flows, the steelmaking-continuous casting process (SCCP) is still perceived as a crucial section of the whole steel manufacturing process.1) Temperature control of molten steel, especially its casting temperature, is increasingly important for final steel quality and energy consumption in SCCP.2–5) A poor casting temperature of molten steel can result in various quality problems in continuous casting strands, e.g. center segregation and surface cracks, and even casting failure.

With increasingly stringent requirements for high-quality steel products, the temperature control of molten steel has been the focus of various modelling attempts, most of which are related to the end-point temperature prediction of molten steel in refining units. It is feasible to control the casting temperature of molten steel effectively by predicting its end-point temperature in refining units, especially in the final one before casting. As reported in the literature, common methods for the end-point temperature prediction of molten steel in refining units include theoretical models based on heat transfer analysis,6) data models based on machine learning (ML)7) and integrated models combining them.8) With the fast development of artificial intelligence (AI), ML models are increasingly favoured owing to their powerful ability to deal with complex industrial problems.9,10) Another observation from previous research is that the end-point temperature prediction of molten steel in the ladle refining furnace (LF) has gained momentum as a


result of its wide application in the steelmaking process. Nath et al.11) developed an online model to predict the end-point temperature of molten steel in the LF unit by synthetically analyzing the material and heat balance relations. Feng et al.12) proposed a combined method of case-based reasoning (CBR) and Bayesian belief network (BBN) to predict the end-point temperature of molten steel in the LF unit. Fu et al.13) established a gray-box model, integrating a heat transfer model and a BP neural network model, to achieve the end-point temperature prediction of molten steel in the LF unit. Apart from LF, end-point temperature predictions in other refining units have also been carried out, e.g., the Ruhrstahl Heraeus vacuum degasser (RH), argon oxygen decarburization (AOD) and vacuum oxygen decarburization (VOD).14–16) Nowadays, the actual casting temperature of molten steel frequently deviates from the expected one even though several prediction models have been utilized. The main reason for the deviation is that the adjustment of molten steel temperature before casting, based on its prediction in refining units, is conducted blindly without a reliable preset value as reference. Therefore, the end-point temperature preset of molten steel is as meaningful as its prediction in refining units. He et al.17) constructed a hybrid model with backward preset and forward prediction of molten steel temperature in the whole SCCP. In that study, however, the transfer times of heats between adjacent processing units, which are indispensable input variables for modelling, were set only by referring to empirical values from statistics, without considering various production scenarios. In fact, the accurate transfer times of heats between adjacent processing units are difficult to know before their actual scheduling because of the impacts of intricate crane running and ladle cycling. This inhibits the precise presetting of the end-point temperature of molten steel in refining units.

To remove the aforementioned bottleneck to using preset models in practice, a novel approach based on an integration of ML and MOS was developed to preset the end-point temperature of molten steel in the final refining unit, chosen by reason of its direct relation with the casting temperature. In this study, DNN, one of the most popular ML methods, was employed to establish an end-point temperature preset model, which was trained and tested using massive data points collected from a medium-large converter steelmaking plant in China. To realize the practical usage of the DNN preset model, the MOS model was introduced to calculate the potential transfer times of heats between the final refining unit and the continuous caster dynamically under various production scenarios. As an application of ML in the steelmaking field, the present work offers a general-purpose method for operators to control the casting temperature of molten steel.

2. Overviews of DNN and MOS

2.1. DNN

An artificial neural network (ANN) is composed of an input layer, hidden layers and an output layer, and networks with no fewer than 3 hidden layers are generally deemed DNNs. Owing to their advantage in learning representations of complex problems with high-dimensional data spaces, DNN and its derivatives such as the convolutional neural network

(CNN) have shown state-of-the-art performance in voice/pattern recognition and in many industrial applications such as material engineering.18–20) Figure 1 shows a schematic diagram of a DNN with K hidden layers and the workflow for learning as a package of mathematical formulas. The introduction of an activation function makes it possible for the neural network to learn a non-linear mapping. In particular, the rectified linear unit (ReLU), f(z) = max(0, z), has gradually become a promising activation function for DNN because it efficiently avoids the gradient vanishing problem that arises when the sigmoid function f(z) = 1/(1 + e−z) is used as the activation function.21,22)
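The contrast between the two activation functions can be illustrated numerically. The following snippet (illustrative only, not from the paper) compares their gradients at large pre-activations, which is the root of the vanishing gradient problem in deep stacks of sigmoid layers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    return (z > 0).astype(float)

# For a large |pre-activation| the sigmoid gradient is nearly zero,
# while the ReLU gradient stays at 1 for any positive input.
z = np.array([-10.0, 0.0, 10.0])
print(sigmoid_grad(z))  # ~[4.5e-05, 0.25, 4.5e-05]
print(relu_grad(z))     # [0. 0. 1.]
```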

In general, the initial W (k) and b(k) in DNNs are randomly generated, and so the DNNs must be trained iteratively to find the optimal ones within a limited iteration number or error cost. At present, the backpropagation (BP) algorithm based on gradient descent is the most prevalent learning method, and its mathematical expression for one iteration is shown as Eq. (1):

θ ← θ − η∇θ[J(θ) + λΩ(θ)] .................. (1)

Where, θ indicates the collection of W (k) and b(k). J(θ) is a loss function used to evaluate the error cost of the DNN after each iteration; the mean square error (MSE) and root mean square error (RMSE) are widely used, as displayed in Eqs. (2) and (3), where n, Yitar and Yical are the total number of data points in the training or testing datasets, and the target (actual) and calculated values of the output variable, respectively. Ω(θ) in Eq. (1) is the regularizer used to reduce the risk of DNN overfitting; the lasso23) and weight decay24) are broadly used regularization methods. η and λ represent the learning rate and regularization coefficient, respectively.

MSE = (1/n) Σi=1 to n (Yi tar − Yi cal)² ...................... (2)

RMSE = √[(1/n) Σi=1 to n (Yi tar − Yi cal)²] ................... (3)
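Equations (1)–(3) can be sketched in a few lines of Python; the temperatures below are hypothetical values, and the weight-decay regularizer Ω(θ) = ½‖θ‖² used here is only one of the options named above.

```python
import numpy as np

def mse(y_tar, y_cal):
    """Eq. (2): mean square error over n data points."""
    return np.mean((y_tar - y_cal) ** 2)

def rmse(y_tar, y_cal):
    """Eq. (3): root mean square error."""
    return np.sqrt(mse(y_tar, y_cal))

def gd_step(theta, grad_J, eta=0.01, lam=1e-4):
    """One iteration of Eq. (1), assuming weight decay
    Omega(theta) = 0.5 * ||theta||^2, so grad(Omega) = theta."""
    return theta - eta * (grad_J + lam * theta)

y_tar = np.array([1610.0, 1605.0, 1598.0])  # hypothetical end-point temperatures, degC
y_cal = np.array([1608.0, 1607.0, 1600.0])
print(rmse(y_tar, y_cal))  # 2.0
```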

In fact, a DNN with multiple hidden layers may not achieve good generalization performance when trained only by the BP algorithm, since it is prone to getting stuck in poor local solutions; thus a promising strategy combining pre-training and fine-tuning has been proposed.20) In pre-training, a hidden layer is trained only after its previous one has been trained completely, instead of all hidden layers being trained synchronously. The restricted Boltzmann machine (RBM)25) and stacked auto-encoder (SAE)26) are extensively used pre-training approaches. After pre-training, fine-tuning is carried out to search for the globally optimal weights using the BP algorithm. In addition to reaching a better generalization performance, the computation cost of pre-training plus fine-tuning is also lower than that of training only by the BP algorithm.
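A minimal numpy sketch of greedy layer-wise pre-training with tied-weight sigmoid auto-encoders is given below. The layer sizes and data are hypothetical, RBM-based pre-training and the subsequent BP fine-tuning are omitted, and this is only one simple instance of the SAE idea, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain_layer(X, n_hidden, lr=0.5, n_iter=300):
    """Train one tied-weight sigmoid auto-encoder layer on X (values in [0, 1])
    and return its weights plus the hidden representation for the next layer."""
    n, d = X.shape
    W = rng.normal(0.0, 0.1, (d, n_hidden))
    b = np.zeros(n_hidden)
    c = np.zeros(d)
    for _ in range(n_iter):
        H = sigmoid(X @ W + b)            # encode
        Xr = sigmoid(H @ W.T + c)         # decode with tied weights
        G2 = (Xr - X) * Xr * (1.0 - Xr)   # error signal at the reconstruction
        G1 = (G2 @ W) * H * (1.0 - H)     # back-propagated to the hidden layer
        W -= lr * (X.T @ G1 + G2.T @ H) / n
        b -= lr * G1.sum(axis=0) / n
        c -= lr * G2.sum(axis=0) / n
    return W, b, sigmoid(X @ W + b)

# Greedy layer-wise pre-training: each layer is trained only after the
# previous one is finished; fine-tuning by BP would follow afterwards.
X = rng.random((100, 6))
weights = []
for size in (8, 4, 2):  # hypothetical hidden layer sizes
    W, b, X = pretrain_layer(X, size)
    weights.append((W, b))
```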

Analysis of previous research on the application of DNN19,21,27,28) shows that the hidden layer number as well as the neuron number per hidden layer have great influences on the generalization performance of DNN. With respect to other hyper-parameters in DNN, e.g., learning rate and iteration number, Refs. 19,21,27,28) also suggested some


empirical values as reference.

2.2. MOS

To realize intelligent manufacturing in SCCP, the optimization of multi-process operation (mainly denoted as production scheduling) is as significant as the improvement of single-process techniques. However, the performance of most scheduling models established using traditional methods, such as mathematical programming, is unsatisfactory in a real environment because they give little consideration to crane running and ladle cycling. Owing to its great power to describe complex industrial systems, the simulation method based on the Plant Simulation software has gradually been utilized to study the multi-process operation problem in SCCP. He et al.29) applied Plant Simulation to understand the influence of workshop layout and process parameters on the operation efficiency of SCCP. Deng et al.30) uncovered the effect of operation modes of a duplex converter on equipment efficiency and steel output of SCCP using Plant Simulation.

Based on Plant Simulation, a MOS model was developed by our research group to simulate the multi-process operation, taking crane running and ladle cycling into account.31) Figure 2 shows the visualization of a simplified MOS model containing one converter (BOF), one ladle refining furnace (LF) and one continuous caster (CCM). In Fig. 2, the locations of each processing unit are marked by red lines across the "Track", which also enable sensing the real-time positions of the two cranes. The main modelling objects in the simplified MOS model are also explained in the bottom half of Fig. 2. Under the constraint of multi-heat continuous casting in CCM, a strategy of reversing the heat scheduling was employed in the MOS model.31,32) Equations (4) and (5) reveal how the heat scheduling is reversed in CCM, where the cast number is assumed to be m, and the ending cast times of the m casts are presented as t1 < t2 < ... < ti ... < tm.

ti∗ = 2tm − ti, i = 1, 2, ..., m ..................... (4)

tij∗ = ti∗ + (Ni − j)t, j = 1, 2, ..., Ni ................ (5)

Where, ti∗ and tij∗ are the starting cast times of cast i and of heat j in cast i, respectively, as represented in the simulation; t is the casting cycle of CCM, and Ni is the heat number of cast i. By using the above strategy, the simulated scheduling solved by the MOS model is an inverse duplication of the actual one. In Fig. 2, for instance, the tapping of molten steel in BOF is prior to oxygen blowing. Hence, the operations of the processing units in the simulation cannot be equivalent

Fig. 1. A schematic diagram of DNN with K hidden layers and the workflow for learning. (Online version in color.)

ISIJ International, Vol. 61 (2021), No. 7

© 2021 ISIJ2103

to their real ones.

Figure 3 presents a detailed workflow of the MOS model, including the preparation of an initial scheduling solution as input of the MOS model, the simulation considering crane running and ladle cycling, and the output of the final simulated scheduling solution. In Fig. 3, the rules for crane running and ladle cycling addressed in Step 2 are the key to keeping the simulation a close representation of the real environment. On-site surveys, not limited to the studied steelmaking plant, show that the movement of each crane in the tapping (refining) span is restricted to a specified area to avoid frequent conflicts between adjacent cranes. However, conflicts are hard to prevent under some complex scenarios. In such cases, a strategy to eliminate the conflicts is employed in terms of the priority of the missions loaded by the two cranes, which is determined by the mission scheduling. For ladle cycling, the main work is to select a ladle from multiple candidates for online cycling, and it is reasonable that the ladle with the longest waiting time in "LadleSta" has the highest priority for online cycling. Since detailed formulaic descriptions of the algorithms for crane running and ladle cycling have been reported in Ref. 31), this paper does not repeat them. After the simulation in Step 3, the heat scheduling must be reversed again referring to Eqs. (4) and (5), and then a usable scheduling solution is obtained.
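Assuming the ending cast times and heat counts are known, the reversal of Eqs. (4) and (5) can be sketched as follows; the times below are hypothetical.

```python
def reverse_schedule(end_times, heats_per_cast, cycle):
    """Mirror the casting schedule per Eqs. (4) and (5).

    end_times: ending cast times t_1 < ... < t_m of the m casts
    heats_per_cast: heat number N_i of each cast
    cycle: casting cycle t of the CCM
    Returns the simulated starting cast time of every heat j in cast i.
    """
    t_m = end_times[-1]
    starts = []
    for t_i, n_i in zip(end_times, heats_per_cast):
        t_star = 2 * t_m - t_i                       # Eq. (4)
        starts.append([t_star + (n_i - j) * cycle    # Eq. (5)
                       for j in range(1, n_i + 1)])
    return starts

# Three casts ending at minutes 100 < 160 < 220, three heats each:
print(reverse_schedule([100, 160, 220], [3, 3, 3], 40))
# -> [[420, 380, 340], [360, 320, 280], [300, 260, 220]]
```

Note that, in the mirrored schedule, the last cast receives the earliest simulated times, which matches the observation that simulated operations run in inverse order.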

Sufficient testing experiments in Ref. 31) for the same steelmaking plant studied in this paper have proved the availability of the MOS model; for almost all tested heats, the mean error in transfer time was about 10.8%, which is quite acceptable for a harsh steelmaking environment. Therefore, the MOS model based on Plant Simulation is a reliable candidate for studying the multi-process operation problem in SCCP, and furthermore it is more reasonable to adopt the simulated scheduling solution for practical use rather than the initial one.
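The ladle-cycling priority rule described above, i.e. the candidate with the longest waiting time in "LadleSta" goes first, reduces to a one-line selection; the station contents and names below are illustrative, not from the MOS model.

```python
def select_ladle(ladle_station, now):
    """Pick the ladle with the longest waiting time for online cycling.

    ladle_station: list of (ladle_id, arrival_time) tuples waiting in "LadleSta"
    now: current simulation time
    """
    return max(ladle_station, key=lambda ladle: now - ladle[1])[0]

station = [("L01", 35), ("L02", 12), ("L03", 20)]
print(select_ladle(station, now=40))  # L02 has waited longest (28 min)
```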

Fig. 2. A simplified MOS model with the explanations of main modelling objects in it. (Online version in color.)

Fig. 3. Workflow of the MOS model. (Online version in color.)


3. Integration of DNN and MOS

3.1. Determination of Input Variables

To develop a DNN model for presetting the end-point temperature of molten steel in the final refining unit, the first step is to determine the input variables. As reported in Ref. 17), the casting temperature of molten steel is influenced by the following factors: the end-point temperature of molten steel in the final refining unit, the molten steel mass before casting, the transfer time of the corresponding heat from the final refining unit to the continuous caster and the ladle thermal state. Therefore, it is reasonable to preset the end-point temperature of molten steel in the final refining unit by using the casting temperature, the molten steel mass before casting, the transfer time of the corresponding heat between the final refining unit and the continuous caster as well as the ladle thermal state. As reported in Refs. 17,33,34), the temperature and cooling time of the ladle lining before tapping and its thickness (erosion degree) have noticeable impacts on the heat transfer behavior of ladles, so the ladle thermal state can be quantified by these variables. However, the lining thickness is hard to measure directly with the hardware tools at hand, and thus the empty ladle mass is used as an alternative. Furthermore, the lining cooling time before tapping has not been recorded accurately in the studied steelmaking plant, so the lining temperature before tapping is used as its alternative, based on the close relationship between the two: a longer lining cooling time means more heat loss and hence a lower lining temperature. From the above, Table 1 lists the potential input variables of the DNN preset model. Using 2 250 data points offered by the studied steelmaking plant, the correlations between the input and output variables were validated by the Pearson correlation coefficient and its significance testing based on the Student's t-test. The calculation methods of the Pearson correlation coefficient and its significance are shown as Eqs. (6) and (7).

rp = Σi=1 to n (Xpi − X̄p)(Yi − Ȳ) / √[Σi=1 to n (Xpi − X̄p)² · Σi=1 to n (Yi − Ȳ)²] ................ (6)

t = r√(n − 2) / √(1 − r²) ................................ (7)

Where, n is the total number of data points; Xpi and Yi are the ith values of the pth input variable and of the output variable, respectively; X̄p and Ȳ are the mean values of the pth input variable and of the output variable over all data points, respectively. Herein, the data from the first heats in each cast were not collected into the set of 2 250 data points because of their specific temperature schedule, which compensates for the temperature loss of the tundish. Meanwhile, for each heat, the mean value of the casting temperatures measured at three time points was used to calculate the correlation with the output variable and to train the studied models. The results of the correlations between the input and output variables are given in Table 1.
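Equations (6) and (7) can be sketched directly; the short series below is illustrative, not plant data.

```python
import math

def pearson_r(x, y):
    """Eq. (6): sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = math.sqrt(sum((xi - mx) ** 2 for xi in x)
                    * sum((yi - my) ** 2 for yi in y))
    return num / den

def t_statistic(r, n):
    """Eq. (7): t value for testing the significance of r with n - 2 dof."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.0, 9.8]
print(round(pearson_r(x, y), 3))  # close to 1: strong positive correlation
```

The t value is then compared against the Student's t distribution with n − 2 degrees of freedom to obtain the P-values reported in Table 1.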

The significance of a correlation coefficient is usually characterized by the P-value: the lower the P-value, the more significant the correlation between the input and output variables. In particular, the correlation is extremely significant if the P-value is less than 0.01. It is found from Table 1 that most potential input variables show significant correlations with the output variable, except for the empty ladle mass. Hence, the final input variables for DNN modelling include the casting temperature, the lining temperature before tapping, the transfer time of the corresponding heat between the final refining unit and the continuous caster as well as the molten steel mass before casting.

3.2. Architecture and Workflow of the Integrated Approach

After determining the input variables, it is observed that the DNN preset model cannot be directly applied in actual production, as one of the input variables, i.e. the transfer time of the corresponding heat between the final refining unit and the continuous caster, is difficult to know accurately before the actual scheduling. In view of this, a practical approach based on the integration of DNN and MOS was developed for the studied steelmaking plant. Figure 4 describes the architecture and workflow of the integrated approach. Evidently, the DNN preset model is the main body of the integrated approach, while the MOS model calculates the transfer times of each planned heat between the final refining unit and the continuous caster and feeds them to the DNN preset model. As shown in Fig. 4, an initial scheduling for N heats is solved in advance by a scheduling model that was fully described in Ref. 35). The MOS model then produces a simulated scheduling within several minutes once it receives the initial one. In other words, the transfer times of the N heats between the final refining unit and the continuous caster can be known before their actual scheduling and fed to the DNN preset model in sequence, according to which heat is being processed. With respect to the other input variables, their values can be directly measured online except for the casting temperature. Figure 5 shows time stamps indicating when the DNN preset model receives the values of the other input variables. In most refining processes, alloy composition trimming is commonly the last charging operation, after which the mass of molten steel barely changes until the heat is processed in the next unit. Therefore, it is valid to measure the molten steel mass after alloy composition trimming in the final refining unit and feed it into the DNN preset model as its mass before casting. The casting temperature is closely related to the steel grade, and the expected casting

Table 1. The results of the correlations between the input and output variables.

Potential input variables | Correlation coefficient | P-value
Molten steel mass before casting/t | −0.231 | 0.000
Empty ladle mass/t | 0.012 | 0.202
Lining temperature before tapping/°C | 0.340 | 0.000
Transfer time of the corresponding heat between the final refining unit and continuous caster/min | −0.351 | 0.000
Casting temperature/°C | 0.907 | 0.000


temperatures for each steel grade are known in advance. However, the actual casting temperatures often deviate considerably from the expected ones. In this study, the actual casting temperatures were utilized to train and test the DNN preset model, whereas the expected ones were used for the real preset of the end-point temperature of molten steel in the final refining unit. Based on extensive production experience with Q345B steel, a major product in the studied steelmaking plant, the suitable range for its casting temperature is from 20 to 30°C (superheat degree) over the liquidus temperature. Thus, the expected casting temperature for Q345B steel is set as 25°C over the liquidus temperature, just as in the study in Ref. 36).

Once it receives the values of all input variables corresponding to heat j, the DNN preset model runs and releases a preset value to the final refining unit within a few seconds. Hereby, the operators can receive the preset value before soft blowing, as shown in Fig. 5. For the first heats in each cast, the preset values are obtained by adding an empirical compensation value of 20°C to the initial preset ones. A decision can then be made on the subsequent heating and soft-blowing schedule within a given time through a comparison between the current actual temperature of molten steel and its preset value. The given time is determined based on a compromise between the simulated and actual scheduling. It is, however, inevitable that the actual scheduling deviates from the simulated one. Hence, the initial and simulated scheduling must be adjusted in accordance with the actual starting cast times of each heat.

Fig. 4. Architecture and workflow of the integrated approach. (Online version in color.)

Fig. 5. Time stamps to indicate when the DNN preset model receives the input values and releases the preset value. (Online version in color.)

3.3. Setting of the DNN Preset Model

Indeed, the performance of the integrated approach is decided by the MOS and DNN preset models. Ref. 31) has verified the availability of the MOS model, and so establishing a high-efficiency DNN preset model is the crucial task here. In particular, the settings of the hyper-parameters are very important for the usability of the DNN preset model. As presented in Section 2.1, the numbers of hidden layers and of neurons in each hidden layer have great effects on the generalization performance of DNN. As reported in Refs. 19,21,27,28), DNN models with 2 to 4 hidden layers were suitable for regression problems with a low-to-medium dimensional parameter space. Moreover, the DNN model proposed in Ref. 27) exhibited an optimal generalization performance when φ (the ratio of the neuron number in the first hidden layer to the input variable number) and λ (the ratio of the neuron numbers in the next and previous hidden layers) were equal to 2 and 0.5, respectively. Models 2, 4 and 6 shown in Table 2 were established based on the study in Ref. 27), and the other models in Table 2 were established for comparison by setting φ as 1, 3 and 4. In addition, the settings of other main hyper-parameters for the 8 preset models are summarized in Table 3; these were utilized to find the optimal hidden layer number and neuron number per hidden layer. After that, the learning rate, regularization coefficient and iteration number were further optimized through additional training. As suggested in Ref. 37), Models 1–5 with 1 or 2 hidden layers were trained only by the BP algorithm, and Models 6–8 with more than 2 hidden layers were trained by combining pre-training based on SAE and fine-tuning.
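The φ/λ sizing rule cited from Ref. 27) can be sketched as a small helper; the function name is ours, not from the paper.

```python
def hidden_layer_sizes(n_inputs, phi, lam, n_layers):
    """Sizes per the rule in Ref. 27): first hidden layer = phi * n_inputs,
    each following layer = lam * previous (rounded, at least 1 neuron)."""
    sizes = [max(1, round(phi * n_inputs))]
    for _ in range(n_layers - 1):
        sizes.append(max(1, round(lam * sizes[-1])))
    return sizes

# With the 4 input variables of Section 3.1, phi = 2 and lam = 0.5,
# three hidden layers come out as 8, 4 and 2 neurons (the layout of Model 6).
print(hidden_layer_sizes(4, 2, 0.5, 3))  # [8, 4, 2]
```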

From the 2 250 data points offered by the studied steelmaking plant, 450 data points were randomly extracted as a testing dataset, and the remaining 1 800 data points were treated as a training dataset. Before modelling, the original values of the input and output variables were normalized to between 0 and 1 in the same dimensionless unit, to prevent low preset accuracy resulting from the disparity in order of magnitude among different variables.
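A sketch of the normalization and the random 1 800/450 split, with random numbers standing in for the plant data:

```python
import numpy as np

rng = np.random.default_rng(42)

def min_max_normalize(X):
    """Scale each column to [0, 1] so that no variable dominates
    training merely by its order of magnitude."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / (hi - lo)

def train_test_split(X, n_test):
    """Randomly re-drawn before each trial, as described in Section 4.1."""
    idx = rng.permutation(len(X))
    return X[idx[n_test:]], X[idx[:n_test]]

# Hypothetical stand-in for the 2 250 plant data points.
data = rng.random((2250, 5)) * 100.0
train, test = train_test_split(min_max_normalize(data), n_test=450)
print(train.shape, test.shape)  # (1800, 5) (450, 5)
```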

3.4. Evaluation of the DNN Preset Model

In this study, the hit ratios within errors of ±3 and ±5°C, RMSE and the goodness of fit (R²) were taken as the performance metrics of the DNN preset model. The mathematical expression of R² is displayed as Eq. (8), where Ȳtar indicates the mean actual value of the output variable over all data points in the training or testing datasets, and the meanings of the other parameters are the same as the corresponding ones in Eq. (3). It is well known that a larger R² and a smaller RMSE demonstrate better model performance. Since the preset accuracy on the testing dataset directly indicates the generalization performance of the DNN preset model in practice, the experimental results on the testing dataset were recorded for comparison.

R^{2} = 1 - \frac{\sum_{i=1}^{n} \left( Y_{\mathrm{tar}}^{i} - Y_{\mathrm{cal}}^{i} \right)^{2}}{\sum_{i=1}^{n} \left( Y_{\mathrm{tar}}^{i} - \bar{Y}_{\mathrm{tar}} \right)^{2}} ...................... (8)
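The three indicators (hit ratio within a tolerance, RMSE, and R² per Eq. (8)) can be computed as follows; this is a self-contained sketch with illustrative function names:

```python
def rmse(y_tar, y_cal):
    """Root-mean-square error between actual and preset temperatures."""
    n = len(y_tar)
    return (sum((t - c) ** 2 for t, c in zip(y_tar, y_cal)) / n) ** 0.5

def r_squared(y_tar, y_cal):
    """Eq. (8): 1 - SSE/SST, with SST taken about the mean actual value."""
    mean_tar = sum(y_tar) / len(y_tar)
    sse = sum((t - c) ** 2 for t, c in zip(y_tar, y_cal))
    sst = sum((t - mean_tar) ** 2 for t in y_tar)
    return 1.0 - sse / sst

def hit_ratio(y_tar, y_cal, tol):
    """Share of heats whose preset error lies within +/- tol (degC), in %."""
    hits = sum(abs(t - c) <= tol for t, c in zip(y_tar, y_cal))
    return 100.0 * hits / len(y_tar)
```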

4. Results and Discussion

4.1. Effect of Hyper-parameters on Model Performance

In this study, the programs for the 8 preset models were written in Python. Each preset model was trained and tested 50 times on a personal computer (Intel(R) i5-3770U CPU/8.0 GB RAM/Windows 10), and the training and testing datasets were re-drawn before each trial. Table 4 shows the testing results of the 8 preset models, each value being the mean over 50 trials. The scatter plots of the hit ratios within errors of ±3 and ±5°C for testing each preset model 50 times are drawn in Figs. 6(a) and 6(b). Furthermore, the mean computation costs for training each preset model 50 times are also listed in Table 4 to estimate their potential update rates in practice. The computation costs for testing, all below one second, are not discussed further.
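The repeated-trial protocol, re-drawing the 1 800/450 split before each of the 50 trials, can be sketched as follows (the function name and seeding scheme are illustrative):

```python
import random

def split_dataset(data, n_test, seed=None):
    """Randomly re-draw a train/test split, as done before each trial."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    test = [data[i] for i in idx[:n_test]]
    train = [data[i] for i in idx[n_test:]]
    return train, test

# e.g. 50 trials, each with 1 800 training / 450 testing points:
# for trial in range(50):
#     train, test = split_dataset(points, n_test=450, seed=trial)
#     ... train the model, record hit ratios, RMSE and R2 ...
```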

It can be seen from Table 4 that the best results for the hit ratios within errors of ±3 and ±5°C, RMSE and R², namely 76.8%, 93.6%, 3.1868 and 0.9274 respectively, all appear in the row corresponding to Model 6. Conversely, the worst results for these evaluation indicators are all found in the row corresponding to Model 1. In Figs. 6(a) and 6(b), the differences between the maximum and minimum hit ratios within errors of ±3 and ±5°C for Model 6 are 6% and 5.3%, respectively. Indeed, the hit ratios of Model 6 are less scattered than those of the other models for both error bands. That means

Table 4. Testing results of 8 preset models.

| Model | Hit ratio within error of ±3°C/% | Hit ratio within error of ±5°C/% | RMSE | R² | Time for training/s |
|---|---|---|---|---|---|
| Model 1 | 66.1 | 88.7 | 3.8009 | 0.8963 | 17.79 |
| Model 2 | 70.6 | 90.9 | 3.5425 | 0.9096 | 21.59 |
| Model 3 | 67.3 | 89.8 | 3.7316 | 0.8997 | 22.37 |
| Model 4 | 72.4 | 91.8 | 3.4261 | 0.9155 | 35.78 |
| Model 5 | 67.7 | 90.2 | 3.7253 | 0.9001 | 39.91 |
| Model 6 | 76.8 | 93.6 | 3.1868 | 0.9274 | 36.85 |
| Model 7 | 74.7 | 92.9 | 3.2506 | 0.9240 | 48.03 |
| Model 8 | 73.1 | 91.9 | 3.3847 | 0.9175 | 62.46 |

Table 3. Settings of other main hyper-parameters.

| Items | Values |
|---|---|
| Learning method | BP / (pre-training + fine-tuning) |
| Activation function in hidden layer | ReLU |
| Activation function in output layer | Linear (y = x) |
| Loss function | RMSE |
| Learning rate | 0.01 |
| Regularization coefficient | 0.001 |
| Batch size | 200 |
| Iteration number | 20 000 |

Table 2. Network structures of 8 preset models.

| Model | Hidden layer number | Neuron number per hidden layer |
|---|---|---|
| Model 1 | 1 | 4 |
| Model 2 | 1 | 8 |
| Model 3 | 2 | 4–2 |
| Model 4 | 2 | 8–4 |
| Model 5 | 2 | 12–6 |
| Model 6 | 3 | 8–4–2 |
| Model 7 | 3 | 12–6–3 |
| Model 8 | 4 | 16–8–4–2 |
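For illustration, a forward pass through Model 6's 8–4–2 architecture with ReLU hidden activations and a linear output (per Table 3) might look as follows in NumPy. The weight values here are random placeholders, not the trained model, and the helper names are hypothetical:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def model6_forward(x, weights, biases):
    """Forward pass of an 8-4-2 ReLU network with a linear output layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)               # hidden layers: ReLU
    return h @ weights[-1] + biases[-1]   # output layer: linear (y = x)

# Layer widths for Model 6: 4 inputs -> 8 -> 4 -> 2 -> 1 output
sizes = [4, 8, 4, 2, 1]
rng = np.random.default_rng(0)
weights = [rng.normal(size=(a, b)) * 0.1 for a, b in zip(sizes, sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]
```

In the paper's training scheme, such weights would be initialized by SAE pre-training and then fine-tuned by BP rather than drawn at random.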


Model 6 shows stronger robustness. Besides Model 6, Models 7 and 8 also show acceptable testing results; however, the computation costs for training them are higher than for Model 6 because they contain more neurons or hidden layers. The lower computation cost for training Model 6 implies a faster update rate in practical use as historical data points accumulate. In summary, based on a comprehensive consideration of hit ratio, robustness and update rate, Model 6 is the most promising candidate for presetting the end-point temperature of molten steel in the final refining unit, although Model 7 or 8 might perform better given sufficient iterations. This also demonstrates the advantage of the DNN preset model with φ and λ equal to 2 and 0.5, as reported in Ref. 27).

Based on the network structure of Model 6, the other hyper-parameters were further optimized in the sequence of learning rate, regularization coefficient and iteration number, i.e., the model was trained while varying one hyper-parameter with the previously optimized ones held fixed. For each tested hyper-parameter set, 10 trials were carried out with the sizes of the training and testing datasets held at 1 800 and 450. Herein, RMSE was chosen as the indicator, and its mean values over 10 trials

Fig. 6. Scatter plots of the hit ratios within error of a) ± 3 and b) ± 5°C. (Online version in color.)

were recorded to evaluate the performance of each tested hyper-parameter set. Figures 7(a), 7(b) and 7(c) show the values of RMSE as learning rate, regularization coefficient and iteration number increase, respectively. It is obvious that RMSE first decreases and then increases as either learning rate or regularization coefficient grows. As noted in Ref. 38), if the learning rate is too small, the loss function (RMSE in this paper) decreases very slowly, whereas if it is too large, the cost may increase or oscillate. Meanwhile, if the regularization coefficient is too small, the regularization term has no impact on training and overfitting may occur; by contrast, if it is too large, the minimization shrinks the parameter values regardless of the modelling error, and underfitting may occur. Thus, RMSE exhibits a minimum as the learning rate or regularization coefficient changes. Aiming at the minimum RMSE, the optimal values of learning rate, regularization coefficient and iteration number are 0.05, 0.003 and 23 000, respectively. Figure 7(d) presents the testing results for the hit ratios within errors of ±3 and ±5°C, RMSE and R² of Model 6 under the above optimal hyper-parameters. As expected, there is an improvement in the generalization performance of Model 6, even though it is not significant. The higher computation cost for training Model 6 than that shown in Table 4 results from the larger iteration number.
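The sequential tuning described above — optimizing one hyper-parameter at a time while fixing those already chosen — can be sketched generically, with an `evaluate` callback standing in for the 10-trial mean RMSE (all names here are illustrative):

```python
def sequential_search(evaluate, grids):
    """Coordinate-wise hyper-parameter search: tune each parameter in order,
    fixing it at its best value before tuning the next.

    evaluate(settings) -> mean RMSE over repeated trials (lower is better).
    grids: ordered mapping {name: [candidate values]}.
    """
    best = {name: values[0] for name, values in grids.items()}
    for name, values in grids.items():
        scores = {}
        for v in values:
            trial = dict(best, **{name: v})  # vary one parameter only
            scores[v] = evaluate(trial)
        best[name] = min(scores, key=scores.get)
    return best
```

Note that this finds a coordinate-wise optimum, not necessarily the joint optimum over all hyper-parameters, which matches the trade-off the authors accept for tractability.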

4.2. Effect of Data Attributes on Model Performance

The studied steelmaking plant has four CCMs to produce multiple grades of steel products with different process routes, in which the physical and chemical properties of molten steel differ.31,35) Meanwhile, there is also a difference in ladle thermal state between ladles that cycle around SCCP continuously and those reused after maintenance. Hence, it may not be appropriate to train and test the DNN preset model for high-quality steels with data points from the production of common steels. Thus, the 2 250 data points were classified into different datasets by steel grade and ladle type to further explore the effect of data attributes on model performance. The detailed classifications discussed in this section are listed in Table 5.

Based on the testing results in Section 4.1, Model 6 with the optimized hyper-parameters was chosen for the study in this section. To validate its advantage as a DNN model, ELM and MVPR were also applied to establish preset models for comparison. The main hyper-parameters of the ELM and MVPR preset models were determined through numerous trials, and the best ones are shown in Table 6. The meanings of the hyper-parameters are similar to the corresponding ones in the DNN preset model. For the working mechanisms of ELM and MVPR, interested readers are referred to Refs. 39) and 40).

The same evaluation indicators as in Section 4.1 were employed to compare the performances of the DNN, ELM and MVPR preset models under C-1, C-2 and C-3, respectively. Each case was run 20 times, and the mean values of each evaluation indicator were recorded as shown in Table 7. The DNN preset model (Model 6) demonstrates noticeable improvements in almost all of the evaluation indicators under C-1, C-2 and C-3 against the correspond-


Fig. 7. The values of RMSE with the increments of a) learning rate, b) regularization coefficient and c) iteration number as well as d) the values of each indicator under the optimized hyper-parameters. (Online version in color.)

Table 5. Classifications of data points based on steel grade and ladle type.

| Item | Attribute | Size of training dataset/heat | Size of testing dataset/heat |
|---|---|---|---|
| C-1 | Q345B a) | 1 000 | 250 |
| C-2 | Ladle that cycles continuously b) | 1 000 | 250 |
| C-3 | Both a) and b) c) | 800 | 200 |

a): The steel grade corresponding to each data point in C-1 is Q345B; b): the ladle type corresponding to each data point in C-2 is the one that cycles continuously; c): the steel grade and ladle type corresponding to each data point in C-3 are Q345B and the one that cycles continuously.
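Building the C-1/C-2/C-3 subsets amounts to simple filtering on steel grade and ladle type. The field names below are hypothetical, since the plant's data schema is not given in the paper:

```python
def classify(points):
    """Split heats into the C-1 / C-2 / C-3 datasets of Table 5.

    Each point is assumed to carry hypothetical 'grade' and 'ladle' fields.
    """
    c1 = [p for p in points if p["grade"] == "Q345B"]
    c2 = [p for p in points if p["ladle"] == "cycling"]
    c3 = [p for p in points if p["grade"] == "Q345B" and p["ladle"] == "cycling"]
    return c1, c2, c3
```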

Table 6. Hyper-parameters of the ELM and MVPR preset models.

| Model | Hyper-parameter | Value |
|---|---|---|
| ELM | Neuron (node) number in hidden layer | 20 |
| ELM | Regularization coefficient | 0.001 |
| ELM | Activation function | Sigmoid |
| MVPR | Dimension | 4 |
| MVPR | Regularization method | Bayesian bridge |
| MVPR | Regularization coefficient | 0.001 |

ing ones under the unclassified dataset, as shown in Fig. 7(d). Taking the hit ratios within errors of ±3 and ±5°C of Model 6 under C-3 as examples, there are increments of 3.5% and 3.4%, respectively, over those under the unclassified dataset. Furthermore, all evaluation indicators under C-3 are superior to the corresponding ones under C-1

and C-2. The same observation holds for the testing results of the ELM and MVPR preset models. Therefore, it is concluded that better performances of the preset models can be obtained when they are trained and tested on finer classifications of data points. In addition, the evaluation indicators of the DNN preset model are all better than those of the ELM and MVPR preset models under C-1, C-2 and C-3, as shown in Table 7. The computation cost of about 20 seconds for training the DNN preset model is negligible compared with technological operations lasting tens of minutes in actual production. To sum up, a training dataset of 800 to 1 800 heats may be chosen for practical use of the DNN preset model, since this range offers a good trade-off between generalization performance and update rate.

4.3. Actual Application of the Integrated Approach

As discussed in Sections 4.1 and 4.2, Model 6 with the optimized hyper-parameters shows the best performance among the candidates for presetting the end-point temperature of molten steel in the final refining unit. Thus, it


was used to develop the integrated approach with the MOS model.

Before the integrated approach was applied in the studied steelmaking plant, the decision on the end-point temperature of molten steel in the final refining unit depended on human expertise. Figure 8(a) shows the statistical results of the superheat degree of molten steel for the heats produced before applying it, and the ratio of qualified heats with a superheat degree of 20 to

Table 7. Testing results of the DNN, ELM and MVPR preset models under C-1, C-2 and C-3.

| Classification | Model | Hit ratio within error of ±3°C/% | Hit ratio within error of ±5°C/% | RMSE | R² | Time for training/s |
|---|---|---|---|---|---|---|
| C-1 | DNN (Model 6) | 76.2 | 94.4 | 3.1537 | 0.9290 | 26.82 |
| C-1 | ELM | 72.9 | 91.7 | 3.3295 | 0.9211 | 1.27 |
| C-1 | MVPR | 71.6 | 91.2 | 3.4064 | 0.9172 | 1.45 |
| C-2 | DNN (Model 6) | 77.6 | 95.3 | 3.0928 | 0.9317 | 26.73 |
| C-2 | ELM | 75.3 | 92.5 | 3.2361 | 0.9255 | 1.32 |
| C-2 | MVPR | 76.0 | 93.2 | 3.1913 | 0.9271 | 1.50 |
| C-3 | DNN (Model 6) | 80.5 | 97.0 | 2.9206 | 0.9396 | 22.28 |
| C-3 | ELM | 78.5 | 95.5 | 3.0390 | 0.9348 | 1.03 |
| C-3 | MVPR | 77.5 | 95.0 | 3.0962 | 0.9323 | 1.16 |

Fig. 8. Statistical results of superheat degree of molten steel cor-responding to the heats produced a) before and b) after applying the integrated approach. (Online version in color.)

30°C for casting was merely 87.0%. The integrated approach has been applied since September 2019, and the ratio of qualified heats reached 96.5% in the period from September 2019 to March 2020, as shown in Fig. 8(b). Table 8 shows the statistical results of three key indexes in the studied steelmaking plant before (April to July 2019) and after (September 2019 to March 2020) applying the integrated approach: the unqualified strand amount (Index 1), the number of casting-speed adjustments (Index 2) and the number of casting failures (Index 3), each averaged per month. Herein, the unqualified strands denote those resulting from poor temperature control, excluding the head and tail strands of each cast. As observed from Table 8, the three key indexes decreased by 66.6%, 58.4% and 64.0%, respectively, after applying the integrated approach. Apparently, the production status of the studied steelmaking plant has improved significantly. In conclusion, the integrated approach is highly effective in achieving fine control of the casting temperature by providing a reliable end-point temperature preset of molten steel in the final refining unit.

5. Conclusions

In this study, a novel approach has been proposed based on the integration of DNN and MOS. Its practical use for presetting the end-point temperature of molten steel in the final refining unit directly depends on the availability of the DNN preset model. The MOS model offers the DNN preset model an accurate transfer time of each heat between the final refining unit and the continuous caster as one of its input variables. Meanwhile, casting temperature, lining temperature before tapping and molten steel mass before casting are also selected as input variables. The hyper-parameters of the optimal DNN preset model in this study, i.e., the hidden layer number, the neuron number per hidden layer, the learning

Table 8. Statistical results of three key indexes before and after applying the integrated approach.

| Production index | Before application | After application |
|---|---|---|
| Index 1/ton | 352.0 | 117.6 |
| Index 2/time | 72.3 | 30.1 |
| Index 3/time | 2.5 | 0.9 |
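The quoted reductions follow directly from the before/after values in Table 8, e.g. (352.0 − 117.6)/352.0 ≈ 66.6%:

```python
def reduction_pct(before, after):
    """Relative reduction of a monthly production index, in %."""
    return 100.0 * (before - after) / before

# Index 1 (unqualified strands), Index 2 (casting-speed adjustments),
# Index 3 (casting failures) -> 66.6%, 58.4%, 64.0%
```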


rate, the regularization coefficient and the iteration number, are 3, 8–4–2, 0.05, 0.003 and 23 000, respectively. It shows a higher preset accuracy than the ELM and MVPR preset models. Owing to a fine classification of data points, the generalization performance of the DNN preset model can be further improved. After applying the integrated approach in the studied steelmaking plant, the ratio of qualified heats with a superheat degree of 20 to 30°C for casting has increased from 87.0% to 96.5%. This is beneficial for achieving a stable continuous casting procedure with constant casting speeds and for accelerating intelligent production with higher quality and output of the required steel products.

In our next work, a highly efficient integration involving the backward preset and forward prediction will be undertaken, with more emphasis on temperature control of molten steel across the whole SCCP, not limited to the casting temperature.

Acknowledgements

This study was supported by the National Natural Science Foundation of China (Nos. 50874014 and 51974023) and the Fundamental Research Funds for Central Universities (No. FRF-BR-17-029A).

REFERENCES

1) R. Y. Yin: Theory and Methods of Metallurgical Process Integration, Academic Press, Cambridge, MA, (2016), 102.

2) S. Sonoda, N. Murata, H. Hino, H. Kitada and M. Kano: ISIJ Int., 52 (2012), 1086.

3) P. D. Lee, P. E. Ramirez-Lopez, K. C. Mills and B. Santillana: Ironmaking Steelmaking, 39 (2012), 244.

4) F. Bianco, V. Dimitrijevic, M. Piazza, A. Spadaccini and R. Turco: Stahl Eisen, 136 (2016), 129.

5) J.-P. Birat: Metall. Res. Technol., 113 (2016), 201.

6) J. R. S. Zabadal, M. T. M. B. Vilhena and S. Q. Bogado Leite: Ironmaking Steelmaking, 31 (2004), 227.

7) X. J. Wang: IEEE/CAA J. Autom. Sin., 4 (2017), 770.

8) H. X. Tian, Z. Z. Mao and Y. Wang: ISIJ Int., 48 (2008), 58.

9) A. Diez-Olivan, J. Del Ser, D. Galar and B. Sierra: Inf. Fusion, 50 (2019), 92.

10) J. P. U. Cadavid, S. Lamouri, B. Grabot, R. Pellerin and A. Fortin: J. Intell. Manuf., 31 (2020), 1531.

11) N. K. Nath, K. Mandal, A. K. Singh, B. Basu, C. Bhanu, S. Kumar and A. Ghosh: Ironmaking Steelmaking, 33 (2006), 140.

12) K. Feng, D. F. He, A. J. Xu and H. B. Wang: Steel Res. Int., 87 (2016), 79.

13) G. Q. Fu, Q. Liu, Z. Wang, J. Chang, B. Wang, F. M. Xie, X. C. Lu and Q. P. Ju: J. Univ. Sci. Technol. Beijing, 35 (2013), 948 (in Chinese).

14) Y. P. Bao, X. Li and M. Wang: Ironmaking Steelmaking, 46 (2019), 343.

15) H. B. Wang, A. J. Xu, L. X. Ai, N. Y. Tian and X. Du: ISIJ Int., 52 (2012), 80.

16) J. W. Li and B. Y. Ma: Proc. 2nd Int. Symp. on Instrumentation and Measurement, Sensor Network and Automation (IMSNA), IEEE, Piscataway, NJ, (2013), 595.

17) F. He, D. F. He, A. J. Xu, H. B. Wang and N. Y. Tian: J. Iron Steel Res. Int., 21 (2014), 181.

18) J. Schmidhuber: Neural Netw., 61 (2015), 85.

19) S. Feng, H. Y. Zhou and H. B. Dong: Mater. Des., 162 (2019), 300.

20) Y. LeCun, Y. Bengio and G. Hinton: Nature, 521 (2015), 436.

21) G. W. Song, B. A. Tama, J. Park, J. Y. Hwang, J. Bang, S. J. Park and S. Lee: Steel Res. Int., 90 (2019), 1900321.

22) C. Baldassi, E. M. Malatesta and R. Zecchina: Phys. Rev. Lett., 123 (2019), 170602.

23) R. Tibshirani: J. R. Stat. Soc. Ser. B, 58 (1996), 267.

24) A. Gupta and S. M. Lam: Neural Netw., 11 (1998), 1127.

25) G. E. Hinton, S. Osindero and Y.-W. Teh: Neural Comput., 18 (2006), 1527.

26) Y. Bengio, P. Lamblin, D. Popovici and H. Larochelle: 20th Annual Conf. on Neural Information Processing Systems (NIPS), Neural Information Processing Systems Foundation, San Diego, CA, (2007), 153.

27) C. A. Myers and T. Nakagaki: ISIJ Int., 59 (2019), 687.

28) C. L. Li, P. L. Narayana, N. S. Reddy, S.-W. Choi, J.-T. Yeom, J.-K. Hong and C. H. Park: J. Mater. Sci. Technol., 35 (2019), 907.

29) L. M. He and Y. Hu: 2010 Int. Conf. on Advances in Materials and Manufacturing Processes (ICAMMP), Trans Tech Publications, Stafa-Zurich, (2011), 712.

30) S. Deng, A. J. Xu and H. B. Wang: Steel Res. Int., 90 (2019), 1800507.

31) J. P. Yang, J. S. Zhang, M. Guan, Y. J. Hong, S. Gao, W. D. Guo and Q. Liu: Metals, 9 (2019), 1078.

32) Y. Xiao: M.S. thesis, Chongqing University, (2012), https://kns.cnki.net/kcms/detail/detail.aspx?dbcode=CMFD&filename=1012049435.nh, (accessed 2020-06-25).

33) J. L. Xia and T. Ahokainen: Metall. Mater. Trans. B, 32 (2001), 733.

34) A. Tripathi, J. K. Saha, J. B. Singh and S. K. Ajmani: ISIJ Int., 52 (2012), 1591.

35) J. P. Yang, B. L. Wang, Q. Liu, M. Guan, T. K. Li, S. Gao, W. D. Guo and Q. Liu: ISIJ Int., 60 (2020), 1213.

36) X. Z. Gao, J. S. Li and S. F. Yang: J. Cent. South Univ. (Sci. Technol.), 46 (2015), 3991 (in Chinese).

37) Z. H. Zhou: Machine Learning, Tsinghua University Press, Beijing, (2016), 113 (in Chinese).

38) G. Dreyfus: Neural Networks, Methodology and Applications, Springer-Verlag, Heidelberg, (2005), 115.

39) S. F. Ding, H. Zhao, Y. N. Zhang, X. Z. Xu and R. Nie: Artif. Intell. Rev., 44 (2015), 103.

40) P. Sinha: Int. J. Sci. Eng. Res., 4 (2013), 962.