Daniele Giovanni Gioia: Mathematical Engineer

About


Hi, I am Daniele and I am a Mathematical Engineer. I work as a Decision Scientist at the German Aerospace Center (Deutsches Zentrum für Luft- und Raumfahrt), in the Institute for the Protection of Terrestrial Infrastructure, where I deal with the protection and security of critical infrastructures on Earth.
I hold a Ph.D. in Pure and Applied Mathematics from Politecnico di Torino. The application context was decision-making under uncertainty in engineering and management problems.
Practically speaking, someone pays me to generate wrong models of real problems. However, this is what people call engineering and, unexpectedly, it (may) work too.

I earned my Master's degree in Mathematical Engineering at Politecnico di Torino and Technische Universiteit Eindhoven, summa cum laude. During my Master's, I was a post-graduate research assistant working on software for the generation of polyhedral meshes in domains with randomly generated interfaces for F.E.M. (if you want to compute things, I swear it is better when they are convex). I also worked as a visiting researcher at the Technical University of Munich, in the Logistics and Supply Chain Management department, on a collaborative research project in the area of multi-echelon perishable inventory management.

Besides doing math, I play the trumpet (still trying to get through the fourth variation of Arban's Carnival of Venice with no errors) and do fitness (I once got a fitness trainer certificate, but nothing serious, I was just curious). A long time ago, I also earned a pre-academic certification in trumpet at the Superior Institute of Musical Studies of Caltanissetta.

Working_On


Most of my applications so far require making sequential decisions under uncertainty. A diverse array of problems is presented hereafter. Tools employed to provide decision aid include stochastic programming, dynamic programming, Bayesian metamodels, and simulation-based optimization. For a better understanding of the paradigms governing sequential decisions under uncertainty, I suggest the work of Warren Powell, who has devoted a large part of his Princeton career to this topic.

Since joining DLR, I have also been exploring the role of Bayesian causality and Multi-Criteria Decision Analysis in applications concerning infrastructure safety and security.

Control and design optimization of Wave Energy Converters


The energy problem related to the pursuit of alternatives to fossil fuels is an active challenge for the entire world. Unlike other energy conversion technologies, such as photovoltaic plants or wind turbines, wave energy converters (WECs) have not yet reached a sufficient level of technological maturity. Two of the most crucial open issues are the development of suitable advanced control strategies for WEC devices and their optimal design in terms of cost and extracted energy.
I tried to develop strategies that apply a Gaussian Process Regression (GPR) optimization approach to compute the parameters of a reactive control action in the article Data-driven control of a Pendulum Wave Energy Converter: A Gaussian Process Regression approach, published in Ocean Engineering. Although it is not my domain, I also lent a hand with techniques to robustify the physical design against mechanical mismatches, co-authoring the conference paper Wave Energy Converter Optimal Design Under Parameter Uncertainty, presented at the ASME 2022 41st International Conference on Ocean, Offshore and Arctic Engineering.
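To give a flavor of this kind of approach, here is a minimal, self-contained sketch of a GPR-driven search over the parameters of a reactive controller. Everything in it is illustrative: `evaluate_power` is a stand-in for a WEC simulator (not the pendulum model of the paper), the parameter ranges are invented, and a simple upper-confidence-bound rule replaces the acquisition strategy actually used in the article.

```python
"""Illustrative GPR-based search over reactive control parameters.

`evaluate_power` is a placeholder black box standing in for a WEC simulator;
it is NOT the model from the paper.
"""
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel


def evaluate_power(theta: np.ndarray) -> float:
    # Placeholder: a smooth function of (damping, stiffness) with one optimum.
    damping, stiffness = theta
    return -((damping - 0.6) ** 2 + 0.5 * (stiffness + 0.3) ** 2) + 1.0


rng = np.random.default_rng(0)
bounds = np.array([[0.0, 1.0], [-1.0, 1.0]])  # (damping, stiffness) ranges

# Initial design: a handful of random control settings and their simulated power.
X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(5, 2))
y = np.array([evaluate_power(x) for x in X])

kernel = ConstantKernel(1.0) * RBF(length_scale=[0.2, 0.2]) + WhiteKernel(1e-4)
for _ in range(20):
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

    # Upper-confidence-bound acquisition over a random candidate set.
    candidates = rng.uniform(bounds[:, 0], bounds[:, 1], size=(512, 2))
    mu, sigma = gp.predict(candidates, return_std=True)
    theta_next = candidates[np.argmax(mu + 2.0 * sigma)]

    X = np.vstack([X, theta_next])
    y = np.append(y, evaluate_power(theta_next))

print("Best control parameters found:", X[np.argmax(y)], "power:", y.max())
```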

Assemble-to-Order Problems

Assemble-to-order is a production strategy in which components are manufactured under demand uncertainty, while end items are assembled only after demand is realized. The strategy is commonly applied to hedge against significant uncertainty in end-item orders and naturally leads to two-stage and multi-stage stochastic programming formulations. I applied ad-hoc reinforcement learning strategies to reduce the multi-stage complexity and studied different multi-stage, multi-item models, showing how their behavior depends on how the available information is used.
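For readers unfamiliar with the paradigm, the toy below sketches a two-stage stochastic ATO model: components are produced before demand is known, and assembly decisions are taken per scenario afterwards. All data (bill of materials, costs, scenarios) are invented, PuLP is just a convenient modeling choice, and nothing here corresponds to the instances studied in the publications.

```python
"""Toy two-stage stochastic assemble-to-order (ATO) model, illustrative only."""
import pulp

components = ["c1", "c2", "c3"]
items = ["i1", "i2"]
# Bill of materials: units of each component needed per end item (made-up data).
bom = {("i1", "c1"): 1, ("i1", "c2"): 2, ("i1", "c3"): 0,
       ("i2", "c1"): 0, ("i2", "c2"): 1, ("i2", "c3"): 1}
cost = {"c1": 2.0, "c2": 1.0, "c3": 1.5}            # component production cost
margin = {"i1": 10.0, "i2": 8.0}                     # selling price of end items
scenarios = {"low": {"i1": 20, "i2": 30},            # demand scenarios
             "high": {"i1": 60, "i2": 50}}
prob = {"low": 0.5, "high": 0.5}

m = pulp.LpProblem("two_stage_ATO", pulp.LpMaximize)
x = pulp.LpVariable.dicts("produce", components, lowBound=0)                  # 1st stage
z = pulp.LpVariable.dicts("assemble", (list(scenarios), items), lowBound=0)   # 2nd stage

# Expected profit = expected revenue - component production cost.
m += (pulp.lpSum(prob[s] * margin[i] * z[s][i] for s in scenarios for i in items)
      - pulp.lpSum(cost[c] * x[c] for c in components))

for s in scenarios:
    for i in items:
        m += z[s][i] <= scenarios[s][i]              # cannot sell more than demand
    for c in components:
        m += pulp.lpSum(bom[(i, c)] * z[s][i] for i in items) <= x[c]  # availability

m.solve(pulp.PULP_CBC_CMD(msg=False))
print({c: x[c].value() for c in components})
```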
An article dealing with seasonality, bimodality, and correlations in the distribution of end-item demand, where approximations of terminal values and rolling-horizon simulations are applied, has been published in the International Journal of Production Research with the title Rolling horizon policies for multi-stage stochastic assemble-to-order problems. A preprint version is available on arXiv. The code is open-source and available on GitHub.
A book chapter that tackles these problems with risk-averse models in a two-stage stochastic linear programming setup, considering the introduction of a classical risk measure from finance, is available in Optimization and Decision Science: Operations Research, Inclusion and Equity, part of the AIRO Springer Series.

Inventory control of perishable items


The new Sustainable Development Goals are considered a blueprint to achieve a better and more sustainable future for all. Reducing losses and overstocking in supply chains and retail environments is a central topic, even more so when the considered items have a short life (e.g., food, blood platelets). I have tried to deal with advanced inventory control systems that employ different mathematical methods; however, parametric policies seem to be the most accepted in this domain.

A study that allows for a multi-item setting with substitution between similar goods, deterministic deterioration, delivery lead times, and seasonality was presented in Nantes at the 10th IFAC Conference on Manufacturing Modelling, Management and Control (MIM), with the title Inventory management of vertically differentiated perishable products with stock-out based substitution. This work was extended and presented more comprehensively in Computers & Operations Research, in the article Simulation-based inventory management of perishable products via linear discrete choice models. An open-source simulation-based framework has been developed, filling the lack of open-source libraries in the perishable inventory literature, and released on GitHub.
Given their dimensional complexity and the strong autocorrelation within each simulated process, the best numerical strategy to optimize the policies I investigated is still an open problem and is not solved in the article.
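As a minimal illustration of the simulate-then-evaluate logic behind the framework mentioned above (not the GitHub library itself, whose scope is much broader), the toy below simulates a single perishable item with FIFO issuing, zero lead time, and an order-up-to policy, returning average sales and waste per period. All parameters are invented.

```python
"""Toy single-item perishable inventory simulation with an order-up-to policy."""
import numpy as np


def simulate(base_stock: int, shelf_life: int = 3, horizon: int = 10_000,
             mean_demand: float = 10.0, seed: int = 0):
    rng = np.random.default_rng(seed)
    # inventory[a] = units with a residual shelf life of a+1 periods (0 = oldest).
    inventory = np.zeros(shelf_life, dtype=int)
    sold = wasted = 0
    for _ in range(horizon):
        # Order up to the base-stock level; fresh units arrive immediately.
        inventory[-1] += max(base_stock - inventory.sum(), 0)
        # FIFO issuing: satisfy demand from the oldest units first.
        demand = rng.poisson(mean_demand)
        for a in range(shelf_life):
            issued = min(inventory[a], demand)
            inventory[a] -= issued
            demand -= issued
            sold += issued
        # Units whose shelf life expires are scrapped; the rest age by one period.
        wasted += inventory[0]
        inventory = np.roll(inventory, -1)
        inventory[-1] = 0
    return sold / horizon, wasted / horizon


for S in (25, 30, 35):
    print(S, simulate(S))  # average sales and waste per period for each level
```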
Another work on perishable items, dealing with multichannel and omnichannel multi-echelon networks, is published in Transportation Research Part E: Logistics and Transportation Review with the title On the value of multi-echelon inventory management strategies for perishable items with on-/off-line channels. There, a dynamic model is proposed that jointly optimizes allocation and replenishment policies for perishable goods with stochastic demand, uncertainty in customer selection preferences, and fixed lead times on online/offline channels. In short, I tried to generalize base-stock policies over multi-echelon networks, analyzing the effect that potential correlations and imbalances in demand volumes across channels have on the heuristics. The corresponding library is available on GitHub.
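The fragment below is only meant to illustrate what "generalizing base-stock policies over multi-echelon networks" can look like in its simplest form: a warehouse serving an off-line and an on-line channel, with an echelon order-up-to level upstream, local base-stock levels downstream, and rationing by mean demand shares when warehouse stock is scarce. It is a didactic toy, not the policy parametrization optimized in the article.

```python
"""Toy one-period replenishment/allocation step with echelon base-stock levels."""
import numpy as np


def replenish_and_allocate(wh_stock, channel_stock, mean_demand,
                           S_echelon, S_channel):
    channel_stock = np.asarray(channel_stock, dtype=float)
    mean_demand = np.asarray(mean_demand, dtype=float)
    S_channel = np.asarray(S_channel, dtype=float)

    # Echelon inventory position = warehouse stock + all downstream channel stock.
    echelon_position = wh_stock + channel_stock.sum()
    wh_order = max(S_echelon - echelon_position, 0.0)   # order placed upstream

    # Channels ask to be raised to their local base-stock levels ...
    requests = np.maximum(S_channel - channel_stock, 0.0)
    # ... but shipments are capped by warehouse stock, rationed by demand share.
    if requests.sum() > wh_stock:
        requests = wh_stock * mean_demand / mean_demand.sum()
    return wh_order, requests


print(replenish_and_allocate(wh_stock=20, channel_stock=[5, 8],
                             mean_demand=[30, 20],
                             S_echelon=120, S_channel=[25, 20]))
```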

Financial portfolio optimization

The academic world is full of research aiming to optimize financial portfolios in commodity or stock markets, and the information sources and related methodologies to extract insights are countless. Moment-based methods like the well-known Markowitz model have many model parameters and, in addition to requiring considerable computational effort, raise serious questions about the reliability of the estimated values. Distributions are often fat-tailed, and pure moment-based solutions can be completely meaningless. One of the first things I worked on, back in 2020, involved optimizing long-term portfolios; in general, however, I have since moved away from such applications.

The few ideas I had were about building very general decision support systems able to exploit whatever performance and risk metric the decision-maker considers adequate (e.g., deep learning strategies), allowing for known instruments such as technical and fundamental analysis, and attempting to overcome some of the inherent complexities and limitations of the most classical strategies. An academic article titled Early portfolio pruning: a scalable approach to hybrid portfolio selection has been published in Knowledge and Information Systems by Springer Nature. There, a hybrid approach is presented that combines itemset extraction with the logic of Markowitz's model, generalizing the idea of balancing profit and risk but dealing with sets of candidate portfolios rather than single stocks, and allowing for technicals, fundamentals, and risk measures not necessarily affected by poor estimation. The paper validates the strategy through back-testing; however, back-testing might be biased, as I discuss in the technical notes of this website. The contribution is therefore more methodological than pragmatic.
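For contrast with the hybrid approach above, the snippet below is the bare moment-based step of the classical Markowitz logic: estimate means and covariances from returns (synthetic here) and let them fully determine the weights. It only makes the estimation-risk point concrete; it has nothing to do with the pruning algorithm of the paper.

```python
"""Plain moment-based mean-variance step (classical Markowitz logic).

Sample moments fully determine the weights, which is exactly where estimation
error and fat tails hurt. Returns are synthetic; not the paper's method.
"""
import numpy as np

rng = np.random.default_rng(1)
returns = rng.normal(0.0005, 0.01, size=(500, 4))   # fake daily returns, 4 assets

mu = returns.mean(axis=0)                # estimated expected returns
sigma = np.cov(returns, rowvar=False)    # estimated covariance matrix

# Unconstrained mean-variance weights, proportional to inv(sigma) @ mu,
# rescaled to sum to one (short positions allowed in this toy).
w = np.linalg.solve(sigma, mu)
w /= w.sum()
print("weights:", np.round(w, 3))
```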


(Very) Technical_Notes


Despite authors' and reviewers' efforts, most academic articles naturally contain typos and some (hopefully small) necessary posthumous corrections. As far as I know, the only articles with no errors are the ones that no one ever read again after publication.

Moreover, research is dynamic; articles should be bases for debate, not absolute truths. In fact, if no one has questions, it is probably because no one is interested.

Thanks to the filter of time, I have gathered below some additional observations and some necessary corrections to my works, hoping they may help those who want to build something on what I have already done. Unfortunately, or fortunately, back-propagation does not exist in the real world and experience does not go backward. I hope sharing mine will help you create your own faster.

Daniele Giovanni Gioia, Leonardo Kanashiro Felizardo, and Paolo Brandimarte. Simulation-based inventory management of perishable products via linear discrete choice models. Computers & Operations Research, page 106270, 2023.
doi:10.1016/j.cor.2023.106270.

  • Note/Better formalization: In the practical implementation of dynamic problems, memory cells of a vector are often reused with different meanings during a simulation. However, to make the theoretical exposition clearer, the correct notation for Eqs. (10) and (12) would require that the shelf-life index does not reach its maximum value (that slot being empty at the end of the day) and that the lead-time index iterates over semantic values rather than memory locations of the vector. In short, the ranges of the sums are best written as:

\[ \sum_{l=0}^{\mathsf{LT}_j-1} \quad \sum_{d=1}^{\mathsf{SL}_j-1} \]
  • Typo: At the end of page 5, regarding policies where one item is managed seasonally and the others are not, a -1 is missing. The correct dimension is:

\[ \mathbf{z} \in \mathbb{R}^{J-1+(K+1)} \]

Daniele Giovanni Gioia, Edoardo Pasta, Paolo Brandimarte, and Giuliana Mattiazzo. Data-driven control of a pendulum wave energy converter: A Gaussian process regression approach. Ocean Engineering, 253:111191, 2022. doi:10.1016/j.oceaneng.2022.111191.

  • Notational clarification: Consistently with its use in the rest of the article, in Table 2 and after Eq. (30), lambda represents the variance and not the standard deviation. Furthermore, Eq. (30) itself does not need a squared lambda.

  • Notational clarification: After Eq. (32), the kernel operator vector/matrix has an image with dimensionality n, not d (a small numerical check is given after these notes), i.e.,

\[ \mathbf{\mathcal{K}}(\mathbf{x}^*,\mathcal{X}): \mathbb{R}^d \times \mathbb{R}^{d\times n} \to \mathbb{R}^n. \]

  • Notational clarification: In Eq. (42), some parentheses are missing. The correlation coefficient multiplies the noise matrices as well, giving:

\[ \begin{bmatrix} \mathcal{K}_\mathsf{l}(\mathbf{X}_\mathsf{l},\mathbf{X}_\mathsf{l}) + \lambda_\mathsf{l} \mathbf{I}_{n_\mathsf{l}} & \rho (\mathcal{K}_\mathsf{l}(\mathbf{X}_\mathsf{l},\mathbf{X}_\mathsf{nl}) + \lambda_\mathsf{l} \begin{bmatrix} \mathbf{0}_{n_\mathsf{l} - n_\mathsf{nl} \times n_\mathsf{nl}} \\ \mathbf{I}_{n_\mathsf{nl}} \end{bmatrix}) \\ \begin{split} \rho (& \mathcal{K}_\mathsf{l}(\mathbf{X}_\mathsf{nl},\mathbf{X}_\mathsf{l})+ \lambda_\mathsf{l} \\& \begin{bmatrix} \mathbf{0}_{n_\mathsf{nl} \times n_\mathsf{l}- n_{\mathsf{nl}}} & \mathbf{I}_{n_\mathsf{nl}}\end{bmatrix} )\end{split} & \left[ \begin{split} \rho^2&(\mathcal{K}_\mathsf{l}(\mathbf{X}_\mathsf{nl},\mathbf{X}_\mathsf{nl})+\lambda_\mathsf{l} \mathbf{I}_{n_\mathsf{nl}}) \\ &+(\mathcal{K}_\mathsf{nl}(\mathbf{X}_\mathsf{nl},\mathbf{X}_\mathsf{nl}) +\lambda_\mathsf{nl} \mathbf{I}_{n_\mathsf{nl}}) \end{split} \right] \end{bmatrix}. \]
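A small numerical check of the dimensionality statement above, assuming for concreteness a squared-exponential kernel (the kernel choice here is my own illustration, not tied to the article):

```python
"""Dimensionality check: K(x*, X) maps into R^n, not R^d.

A squared-exponential kernel is used only for concreteness.
"""
import numpy as np

d, n = 3, 7
X = np.random.default_rng(0).normal(size=(n, d))   # n training inputs of dim d
x_star = np.zeros(d)                                # one test input

k_vec = np.exp(-0.5 * np.sum((X - x_star) ** 2, axis=1))  # k(x*, x_i) for each i
print(k_vec.shape)  # (n,) -> the image has dimensionality n, not d
```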

Daniele Giovanni Gioia and Stefan Minner. On the value of multi-echelon inventory management strategies for perishable items with on-/off-line channels. Transportation Research Part E: Logistics and Transportation Review, 180:103354, 2023.
doi:10.1016/j.tre.2023.103354.

  • Additional note: In Eq. (10) there is a slight abuse of notation. In its practical application, the reward is computed in two time frames: the part relating to sales is computed before the time shift, while the part relating to salvage values or disposal costs is computed at the time of scrapping. This means that the inventory with superscript 0 represents the inventory with superscript 1 after sales have been subtracted.

  • Bug in Tables 5 and 6: The code associated with the numerical simulations of the heuristic approaches (Section 4.2) in the article "On the value of multi-echelon inventory management strategies for perishable items with on-/off-line channels" had a bug in the estimation of the expected value, so the claimed accuracy over the 35-period sliding window employed in the evaluation of the heuristic approaches was not achieved. The stopping criterion, based on the difference between the minimum and maximum values of the statistic associated with the expected value, was cut off by a limit on the maximum number of simulated steps (1400), which was insufficient to guarantee a width of 0.02%, as claimed (a minimal sketch of this kind of stopping rule is given after these notes). Fluctuations of the expected-value statistic, and thus of the objective function itself, might affect the optimization strategy through excessive noise and biased function evaluations. We repeated the experiments with a maximum number of steps ten times larger, equal to 14000, using the same stopping criterion and optimization strategy presented in Gioia and Minner (2023). For the out-of-sample evaluation, we increased the 7000-period horizon five-fold to 35000 periods. The evaluation and optimization of the full design of experiments are presented below in an updated version of Tables 5 and 6 from Gioia and Minner (2023). Conclusions and remarks in Gioia and Minner (2023) remain valid, but some values have changed slightly. For example, the waste reduction of the BSP policy for a 5-period shelf life compared to the COP policy has decreased, while the profit values of many multi-echelon policies have improved, as they are more prone to non-convergence of the expected-value estimate due to more complex dynamics during simulation than single-echelon policies. It is also reasonable to point out that the very choice of optimization algorithm is practically a hyperparameter of the study and that, using non-surrogate techniques, different results might be obtained.

Table 5 (Precise): Average profit and waste (Profit|Waste) per period with respect to different subsets of parameters and policies. Values normalized w.r.t. COP (profit, higher = better | waste smaller = better). COP values presented raw.
\[ \begin{array}{|l|c|c|c|c|c|c|c|c|c|} \hline \text{} & \text{Subset} & \text{COP} & \text{BSP} & \text{FPL_l} & \text{FPC_l} & \text{FPL2K_l} & \text{SP_l} & \text{SC_l} & \text{SP2K_l} \\ \hline \textbf{On/Off} & 80/20 & 430 \,|\, 20.8 & -1.4 \,|\, -2.3 & 0.3 \,|\, -3.1 & 0.5 \,|\, -7.6 & 0.1 \,|\, -1.7 & 0.9 \,|\, -8.7 & 0.9 \,|\, -1.6 & 0.7 \,|\, -5.6 \\ & 50/50 & 418 \,|\, 24.7 & -2.5 \,|\, 3.9 & 0.4 \,|\, -9.5 & 1.0 \,|\, -12.2 & 0.4 \,|\, -7.7 & 1.4 \,|\, -13.4 & 1.6 \,|\, -9.8 & 1.6 \,|\, -8.0 \\ & 20/80 & 407 \,|\, 27.3 & -3.6 \,|\, -0.1 & -3.8 \,|\, -5.2 & -1.2 \,|\, -9.0 & -3.2 \,|\, -3.4 & -1.9 \,|\, 1.2 & -0.1 \,|\, -4.0 & -0.2 \,|\, 1.7 \\ \hline \textbf{$\rho$} & -0.5 & - & - & 0 \,|\, -12.8 & 1.6 \,|\, -17.3 & 0.8 \,|\, -12.1 & 1.4 \,|\, -12.4 & 1.9 \,|\, -9.6 & 1.8 \,|\, -10.1 \\ & 0 & 418 \,|\, 24.3 & -2.4 \,|\, 0.5 & -0.9 \,|\, -6.0 & 0.2 \,|\, -10.3 & -0.8 \,|\, -3.4 & 0.2 \,|\, -6.9 & 0.8 \,|\, -4.7 & 0.7 \,|\, -1.6 \\ & 0.5 & - & - & -2 \,|\, 0.3 & -1.3 \,|\, -1.8 & -2.3 \,|\, 2.0 & -0.9 \,|\, -0.8 & -0.1 \,|\, -1.9 & -0.1 \,|\, 0.1 \\ \hline \textbf{LIFO/FIFO} & 50/50 & 425 \,|\, 21.8 & -2.1 \,|\, 9.3 & -1.4 \,|\, -1.4 & -0.1 \,|\, -6.8 & -1.0 \,|\, -2.2 & -0.2 \,|\, 0.0 & 0.4 \,|\, -1.1 & 0.5 \,|\, 0.8 \\ & 90/10 & 411 \,|\, 26.7 & -2.7 \,|\, -6.3 & -0.5 \,|\, -9.7 & 0.5 \,|\, -11.9 & -0.6 \,|\, -6.0 & 0.8 \,|\, -11.9 & 1.3 \,|\, -8.6 & 1.1 \,|\, -7.3 \\ \hline \textbf{cv} & 0.6 & 440 \,|\, 18.1 & -1.1 \,|\, 0.0 & 0.1 \,|\, -11.2 & 1.0 \,|\, -15.3 & 0.3 \,|\, -9.1 & 0.4 \,|\, -7.3 & 1.0 \,|\, -5.5 & 0.8 \,|\, -2.6 \\ & 0.9 & 396 \,|\, 30.4 & -3.8 \,|\, 1.2 & -2.1 \,|\, -2.9 & -0.7 \,|\, -6.3 & -1.9 \,|\, -1.4 & 0.1 \,|\, -6.1 & 0.7 \,|\, -5.0 & 0.8 \,|\, -4.3 \\ \hline \textbf{SL} & 3 & 402 \,|\, 28.5 & -4.9 \,|\, 8.0 & -3.2 \,|\, 2.2 & -1.7 \,|\, -1.2 & -3.0 \,|\, 5.4 & -0.1 \,|\, -4.1 & 0.7 \,|\, -2.2 & 0.7 \,|\, -0.8 \\ & 5 & 434 \,|\, 20.0 & -0.1 \,|\, -9.6 & 1.2 \,|\, -17.7 & 1.9 \,|\, -21.6 & 1.3 \,|\, -18.1 & 0.5 \,|\, -9.9 & 1.0 \,|\, -9.6 & 0.9 \,|\, -7.7 \\ \hline \textbf{$\mathsf{newsR}$} & 0.75 & 662 \,|\, 41.8 & -2.3 \,|\, 1.4 & -1.3 \,|\, -5.0 & -0.4 \,|\, -9.3 & -1.1 \,|\, -3.5 & 0.0 \,|\, -5.8 & 0.4 \,|\, -4.5 & 0.6 \,|\, -3.6 \\ & 0.25 & 174 \,|\, 6.7 & -2.8 \,|\, -3.4 & 0.5 \,|\, -12.3 & 2.4 \,|\, -11.8 & 0.3 \,|\, -9.5 & 1.4 \,|\, -10.7 & 2.4 \,|\, -9.7 & 1.7 \,|\, -3.9 \\ \hline \end{array} \]
Table 6 (Precise): Percentage of relative improvement of profit and waste (Profit|Waste) per period with respect to different subsets of parameters and policies. Values normalized w.r.t. COP (profit, higher = better | waste smaller = better). COP values presented raw.
\[ \begin{array}{|l|c|c|c|c|c|c|c|c|c|} \hline \text{} & \text{Subset} & \text{COP} & \text{BSP} & \text{FPL_l} & \text{FPC_l} & \text{FPL2K_l} & \text{SP_l} & \text{SC_l} & \text{SP2K_l} \\ \hline \textbf{On/Off} & 80/20 & 430 \,|\, 20.8 & -1.8 \,|\, -4.3 & 0.4 \,|\, -7.9 & 0.8 \,|\, -9.9 & 0.2 \,|\, -5.3 & 1.5 \,|\, -8.6 & 1.5 \,|\, -1.9 & 1.2 \,|\, -2.7 \\ & 50/50 & 418 \,|\, 24.7 & -3.0 \,|\, 1.6 & 0.9 \,|\, -12.5 & 1.9 \,|\, -13.6 & 0.8 \,|\, -8.9 & 2.2 \,|\, -12.1 & 2.4 \,|\, -12.0 & 2.4 \,|\, -7.0 \\ & 20/80 & 407 \,|\, 27.3 & -4.2 \,|\, -1.8 & -3.6 \,|\, -10.2 & -0.4 \,|\, -13.8 & -3.2 \,|\, -9.8 & -1.9 \,|\, -4.0 & 0.4 \,|\, -6.1 & -0.4 \,|\, -0.9 \\ \hline \textbf{$\rho$} & -0.5 & - & - & 1 \,|\, -17.0 & 2.9 \,|\, -20.4 & 1.5 \,|\, -14.1 & 2.3 \,|\, -14.8 & 2.9 \,|\, -12.8 & 2.7 \,|\, -11.5 \\ & 0 & 418 \,|\, 24.3 & -3.0 \,|\, -1.5 & -0.8 \,|\, -9.8 & 0.8 \,|\, -14.2 & -0.8 \,|\, -6.9 & 0.6 \,|\, -8.4 & 1.2 \,|\, -6.1 & 1.0 \,|\, -3.6 \\ & 0.5 & - & - & -2 \,|\, -3.9 & -1.4 \,|\, -2.7 & -2.9 \,|\, -3.0 & -1.1 \,|\, -1.5 & 0.0 \,|\, -1.1 & -0.5 \,|\, 4.5 \\ \hline \textbf{LIFO/FIFO} & 50/50 & 425 \,|\, 21.8 & -2.9 \,|\, 5.6 & -1.4 \,|\, -7.6 & 0.3 \,|\, -11.9 & -1.1 \,|\, -8.0 & -0.1 \,|\, -3.3 & 0.8 \,|\, -4.5 & 0.6 \,|\, -0.7 \\ & 90/10 & 411 \,|\, 26.7 & -3.1 \,|\, -8.6 & -0.1 \,|\, -12.8 & 1.3 \,|\, -13.0 & -0.4 \,|\, -8.0 & 1.3 \,|\, -13.2 & 2.0 \,|\, -8.8 & 1.5 \,|\, -6.4 \\ \hline \textbf{cv} & 0.6 & 440 \,|\, 18.1 & -1.1 \,|\, -5.5 & 0.6 \,|\, -16.3 & 1.8 \,|\, -18.1 & 0.8 \,|\, -12.7 & 0.7 \,|\, -11.5 & 1.6 \,|\, -8.1 & 1.2 \,|\, -7.2 \\ & 0.9 & 396 \,|\, 30.4 & -4.9 \,|\, 2.5 & -2.1 \,|\, -4.1 & -0.3 \,|\, -6.8 & -2.2 \,|\, -3.3 & 0.5 \,|\, -5.0 & 1.2 \,|\, -5.3 & 1.0 \,|\, 0.1 \\ \hline \textbf{SL} & 3 & 402 \,|\, 28.5 & -6.3 \,|\, 6.3 & -3.8 \,|\, 0.7 & -1.8 \,|\, -0.6 & -3.7 \,|\, 4.2 & 0.1 \,|\, -4.6 & 1.2 \,|\, -2.2 & 0.8 \,|\, -1.1 \\ & 5 & 434 \,|\, 20.0 & 0.3 \,|\, -9.2 & 2.2 \,|\, -21.1 & 3.3 \,|\, -24.3 & 2.2 \,|\, -20.2 & 1.1 \,|\, -11.9 & 1.6 \,|\, -11.1 & 1.3 \,|\, -6.0 \\ \hline \textbf{$\mathsf{newsR}$} & 0.75 & 662 \,|\, 41.8 & -2.5 \,|\, 0.6 & -1.5 \,|\, -7.0 & -0.5 \,|\, -11.4 & -1.2 \,|\, -5.7 & -0.1 \,|\, -5.6 & 0.4 \,|\, -3.9 & 0.5 \,|\, -2.6 \\ & 0.25 & 174 \,|\, 6.7 & -3.5 \,|\, -3.5 & -0.1 \,|\, -13.5 & 2.0 \,|\, -13.5 & -0.2 \,|\, -10.3 & 1.3 \,|\, -10.9 & 2.4 \,|\, -9.5 & 1.6 \,|\, -4.5 \\ \hline \end{array} \]
  • Additional note: The range of values for the coefficient of variation, if directly modeled by considering an adjusted independent daily adaptation of the weekly estimated values from Broekmeulen and van Donselaar (2019), would be:

\[ \text{cv}_{\text{daily}} = \frac{ \sigma_{\text{daily}} }{ \mu_{\text{daily}} } = \frac{\sigma_{\text{weekly}}}{\mu_\text{daily}\sqrt{7}} = \frac{\mu^{0.77}_\text{weekly}0.7}{\mu_\text{daily}\sqrt{7}} = \frac{7^{0.77}\mu^{0.77}_\text{daily}0.7}{\mu_\text{daily}\sqrt{7}} = \frac{7^{0.77}0.7}{\sqrt{7}}\mu^{-0.23}_\text{daily} = 0.41 \]

However, we deal with products with high daily sales (and low shelf life), and Broekmeulen and van Donselaar (2019) report that, under these assumptions, such perishables are associated with higher daily standard deviations. Unfortunately, for confidentiality reasons, they normalize their data and provide only aggregated statistics, making more specific deductions complex. To provide meaningful experiments, we assume higher values and investigate more than one option (cv = 0.6, 0.9), focusing on the relative differences in their effects rather than on absolute behaviors in a specific case study.
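A minimal sketch of the kind of width-based stopping rule discussed in the bug note above: keep simulating until the spread of the running expected-value statistic over a sliding window falls below a relative tolerance, or a cap on the number of steps is reached. The window (35), tolerance (0.02%), and step cap (14000) mirror the numbers quoted above, but the sampled process and the implementation details are purely illustrative.

```python
"""Width-based stopping rule for a simulated expected value (illustrative)."""
from collections import deque
import numpy as np


def estimate_mean(sample_step, window=35, rel_tol=2e-4, max_steps=14_000, seed=0):
    rng = np.random.default_rng(seed)
    running = deque(maxlen=window)   # last `window` values of the running mean
    total, n = 0.0, 0
    while n < max_steps:
        total += sample_step(rng)
        n += 1
        running.append(total / n)
        # Stop once the spread of the statistic over the window is small enough.
        if n >= window and (max(running) - min(running)) <= rel_tol * abs(running[-1]):
            break
    return total / n, n


# Example: noisy per-period profit with mean 400 (made-up process).
mean, steps = estimate_mean(lambda rng: rng.normal(400.0, 40.0))
print(mean, steps)
```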

Daniele Giovanni Gioia, Jacopo Fior and Luca Cagliero. Early portfolio pruning: a scalable approach to hybrid portfolio selection. Knowledge and Information Systems 65.6 (2023): 2485-2508. doi:10.1007/s10115-023-01832-7.

  • The bias of back-testing: One of the big problems with back-testing is data acquisition. Especially in the fintech sector, appearances suggest that all the data you could want is available. In reality, this is false: especially when you want to leverage data from fundamentals (e.g., balance sheets and income statements), you will have missing values. The difficult question then is: are the stocks for which detailed data are available the ones that historically performed better? Is a survivorship bias implied in the very possibility of acquiring data? In the study at hand, we acquired as much data as possible to generate a reasonable test bed of stocks on which to test the algorithm. Nevertheless, all back-tested results inherently depend on data quality. For this reason, my article, like 99 percent of those concerning fintech, should be used only for methodological scientific advancement; its validation results should not be taken pragmatically as oracles of superior performance. Testing only shows whether the method can make sense or whether it is completely useless. Driving a car looking only at the rearview mirror is never a great idea.

Contacts


If you want to talk about decisions under uncertainty, baroque music, or deadlifts, feel free to reach out.