Abstract— A method for determining a set of informative features for forecasting athletes' state of sports readiness for competitions using data mining methods is considered. Correlation and regression analysis were used to select the informative features. Correlation coefficients between the predicted parameter and the informative parameters, as well as among the informative parameters themselves, are determined. The correlation coefficients are corrected, and the most informative parameter is identified.
Keywords— mini football, forecasting, informative parameters, predicted parameter, correlation, regression.
I. Introduction
Evaluating athletes' readiness for competition is currently a pressing task in all sports.
An athlete's readiness is built up in training, and achieving a good result requires qualified coaches and a planned training process. A planned training process lets the coach set the pace of training needed for a good result, assess the athlete's sports readiness and make decisions accordingly.
Assessing an athlete's readiness also requires a comprehensive assessment of athletic preparedness. Such an assessment makes it possible to judge the quality of training, adjust the training process, and help the coach predict the athlete's fitness for upcoming competitions.
The main criteria for assessing an athlete's sports readiness are the parameters of physical development, physical readiness and special physical preparedness.
In mini football, physical development parameters include, for example, weight, height, age and maximum oxygen consumption (VO2 max), as well as the 60 m run, 30 m run, 15 m run and fivefold jump. According to experts, the parameters of special physical preparedness include the 3×7 m shuttle run, kick for distance, 30 m run with the ball, 7×50 m run and ball juggling [1-3].
Using statistical data on these parameters, a mathematical forecasting model (predictive model) of an athlete's state of sports readiness can be constructed.
II. Definition of informative parameters
The purpose of this article is to develop a mathematical method for determining a set of informative parameters necessary to predict the state of sports readiness of futsal athletes. This method can be based on the combined use of correlation and regression analysis.
Correlation analysis assesses the degree of relationship between two or more variables. Regression analysis builds a mathematical model that describes the relationship between independent and dependent variables. Used together, the two methods make it possible to identify relationships among the parameters and to determine which parameters matter most for predicting the state of sports readiness.
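The combined use of the two methods can be sketched for a single informative parameter as follows. This is a minimal illustration in plain Python, not the procedure used in the study: the data values and the names `run_30m_s` and `readiness_score` are invented for the example.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 of the line y = b0 + b1 * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1


def r_squared(x, y, b0, b1):
    """Share of the variance of y explained by the fitted line."""
    my = mean(y)
    ss_res = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Hypothetical data, invented for illustration only:
# 30 m run time in seconds and an expert readiness score on a 0-100 scale.
run_30m_s = [4.1, 4.3, 4.4, 4.6, 4.8, 5.0, 5.1]
readiness_score = [88, 84, 80, 77, 70, 66, 61]

r = pearson_r(run_30m_s, readiness_score)            # step 1: correlation
b0, b1 = fit_line(run_30m_s, readiness_score)        # step 2: regression
r2 = r_squared(run_30m_s, readiness_score, b0, b1)   # quality of the fit

print(f"r = {r:.3f}, intercept = {b0:.2f}, slope = {b1:.2f}, R^2 = {r2:.3f}")
```

For a straight-line fit with one predictor, the coefficient of determination equals the squared correlation coefficient, so the two analyses give a consistent picture of how informative the parameter is.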
Such a method can substantially reduce the time and labor required to choose informative parameters for predicting the state of sports readiness of futsal athletes, which is a pressing task in sports science and practice.
This article describes the use of correlation and regression analysis to determine a set of informative parameters that can be used to predict an athlete's level of sports readiness. Predicting this level requires a probabilistic (statistical) relationship between it and the selected informative parameters.
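One possible way to screen candidate parameters on this basis is sketched below. It is an assumption-laden illustration, not the authors' algorithm and not their correction of the coefficients: the function `select_features`, the thresholds `min_target_r` and `max_pair_r`, and all data values are hypothetical. The sketch keeps parameters that correlate strongly with the predicted parameter and skips those that nearly duplicate a parameter already kept.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def select_features(features, target, min_target_r=0.5, max_pair_r=0.95):
    """Greedy screening of candidate parameters.

    features maps a parameter name to its list of measurements.
    Both thresholds are illustrative assumptions, not values from the study.
    """
    target_r = {name: pearson_r(values, target) for name, values in features.items()}
    # Strongest relationship with the predicted parameter first.
    ranked = sorted(features, key=lambda name: abs(target_r[name]), reverse=True)
    selected = []
    for name in ranked:
        if abs(target_r[name]) < min_target_r:
            break  # the list is sorted, so all remaining parameters are weaker
        # Skip a parameter that is almost a copy of one already selected.
        if all(abs(pearson_r(features[name], features[kept])) <= max_pair_r
               for kept in selected):
            selected.append(name)
    return selected, target_r


# Hypothetical measurements for eight athletes, invented for illustration.
readiness_score = [60, 65, 70, 72, 78, 83, 90, 95]
candidates = {
    "run_30m_s": [5.2, 5.1, 4.9, 4.9, 4.7, 4.5, 4.3, 4.2],
    "run_60m_s": [9.9, 9.7, 9.4, 9.3, 9.0, 8.7, 8.3, 8.1],
    "height_cm": [178, 171, 183, 175, 169, 180, 174, 177],
    "juggling_count": [20, 31, 28, 40, 35, 52, 49, 60],
}

selected, target_r = select_features(candidates, readiness_score)
for name in candidates:
    print(f"{name}: r with readiness = {target_r[name]:+.3f}")
print("selected:", selected)
```

In this invented data set the two sprint times carry almost the same information, so only one of them survives the screening, while height is dropped because it is only weakly related to the readiness score.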
The predicted parameter in this case is treated as a random process, so an exact functional relationship between the predicted and informative parameters is a special and rather rare case. However, if the predicted and informative parameters are correlated, a regression curve can be constructed that approximates the empirical values.
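In the simplest case the regression curve is a straight line fitted by least squares. The notation below is introduced here for illustration and does not come from the source: $x_i$ and $y_i$ are the values of one informative parameter and of the predicted parameter for athlete $i$, $\bar{x}$ and $\bar{y}$ are their sample means, $s_x$ and $s_y$ their sample standard deviations, $r_{xy}$ their correlation coefficient, and $\varepsilon_i$ the random deviation from the line.

```latex
\[
y_i = b_0 + b_1 x_i + \varepsilon_i, \qquad i = 1, \dots, n
\]
\[
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
           {\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b_0 = \bar{y} - b_1 \bar{x}
\]
\[
b_1 = r_{xy} \, \frac{s_y}{s_x}
\]
```

The last identity shows why a correlation is needed for a useful regression line: when $r_{xy}$ is close to zero the slope vanishes and the line predicts only the mean $\bar{y}$.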
Such a forecasting method was considered in [12], where correlation and regression analyses were used to determine the most informative parameters for predicting athletes' level of sports readiness. The results showed that the selected parameters correlate significantly with the predicted parameter, which allows them to be used for predicting that level.
Thus, correlation and regression analysis offer an effective way to determine informative parameters for predicting athletes' level of sports readiness, and the same approach can be applied in other areas where a parameter of interest has to be predicted.
Latent (hidden) variables may be important in predicting the predicted parameter and may therefore be included in the analysis. In the Rasch model, latent variables are used to explain relationships between observed variables; they can help refine the model when the observed variables are significantly correlated with one another but have no direct relationship with the predicted parameter.
To identify latent variables and assess their impact on the predicted parameter, various analysis methods can be used, such as factor analysis or structural equation modeling. These methods allow researchers to identify hidden factors that explain the relationship between observed variables and assess their impact on the predicted parameter.
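The idea of a hidden factor behind correlated observations can be illustrated with a small simulation. The sketch below does not perform factor analysis or structural equation modeling proper; as a simplified stand-in it extracts the first principal component of the correlation matrix by power iteration. The latent variable `speed`, the test names and all generated values are assumptions made for the example.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    size = len(columns)
    return [[pearson_r(columns[i], columns[j]) for j in range(size)]
            for i in range(size)]


def dominant_component(matrix, iterations=200):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix.

    Plain power iteration started from a vector of equal weights, which is
    adequate here because the main component has same-sign weights.
    """
    size = len(matrix)
    vector = [1.0 / math.sqrt(size)] * size
    for _ in range(iterations):
        product = [sum(row[j] * vector[j] for j in range(size)) for row in matrix]
        norm = math.sqrt(sum(v * v for v in product))
        vector = [v / norm for v in product]
    product = [sum(row[j] * vector[j] for j in range(size)) for row in matrix]
    eigenvalue = sum(v * w for v, w in zip(vector, product))  # Rayleigh quotient
    return eigenvalue, vector


# Simulated data (not measurements): one hidden factor drives three tests,
# while the fourth variable is unrelated to it.
rng = random.Random(7)  # fixed seed so the run is repeatable
n = 200
speed = [rng.gauss(0, 1) for _ in range(n)]  # latent, never observed directly
names = ["run_15m", "run_30m", "run_60m", "height"]
columns = [
    [s + rng.gauss(0, 0.3) for s in speed],
    [s + rng.gauss(0, 0.3) for s in speed],
    [s + rng.gauss(0, 0.3) for s in speed],
    [rng.gauss(0, 1) for _ in range(n)],
]

eigenvalue, vector = dominant_component(correlation_matrix(columns))
loadings = [v * math.sqrt(eigenvalue) for v in vector]

for name, loading in zip(names, loadings):
    print(f"{name}: loading {loading:+.2f}")
print(f"share of total variance: {eigenvalue / len(names):.2f}")
```

The three simulated sprint tests load heavily on the extracted component and the unrelated variable does not, which is the pattern one would expect when a single hidden factor explains the correlations among observed parameters.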
Thus, when determining a set of informative parameters for a predicted parameter, it is important to consider both observed and latent variables in order to ensure an accurate and reliable prediction.
A correlation between variables does not necessarily imply a causal relationship between them. Correlation analysis can only show that a statistical relationship exists; it does not indicate which variable is the cause and which is the effect.
Establishing a causal relationship requires experimental studies in which the independent variable is manipulated and the resulting changes in the dependent variable are measured. Only with such an experiment can one speak of a causal relationship between variables.
Therefore, when interpreting the results of correlation analysis, it must be kept in mind that finding a correlation between variables is not sufficient to establish a causal relationship between them; the causal relationship has to be confirmed experimentally.
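A short simulation shows how two variables can be strongly correlated although neither causes the other. In this sketch, which uses generated numbers rather than real measurements, a hypothetical third variable `training_load` drives both `sprint_score` and `juggling_score`. The partial correlation, which removes the linear effect of the third variable, is then close to zero. All names and values are assumptions made for the example.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 500

# The common cause: neither score below influences the other directly.
training_load = [rng.gauss(0, 1) for _ in range(n)]
sprint_score = [z + rng.gauss(0, 0.5) for z in training_load]
juggling_score = [z + rng.gauss(0, 0.5) for z in training_load]

r_raw = pearson_r(sprint_score, juggling_score)
r_partial = partial_r(sprint_score, juggling_score, training_load)

print(f"raw correlation:     {r_raw:+.2f}")
print(f"partial correlation: {r_partial:+.2f}")
```

The raw coefficient alone would suggest a strong link between the two scores; only knowledge of how the data were produced, which in practice comes from a controlled experiment, reveals that the link is due to the common cause.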
Before a correlation between parameters is quantified, it is first necessary to determine whether any relationship exists between them. Several tools can be used to detect a relationship, for example a scatterplot, a correlogram or a correlation coefficient.
The Pearson correlation coefficient is one of the most common measures of the linear relationship between two continuous variables. It takes values from -1 to 1, where 0 means no linear correlation, and -1 and 1 indicate a perfect inverse and a perfect direct linear relationship, respectively.
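For reference, the sample Pearson coefficient can be written as follows. The symbols are introduced here for illustration: $x_i$ and $y_i$ are the paired observations of the two variables, $\bar{x}$ and $\bar{y}$ their sample means, and $n$ the number of pairs.

```latex
\[
r_{xy} =
\frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
     {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
      \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r_{xy} \le 1
\]
```

The numerator is positive when the two variables tend to deviate from their means in the same direction and negative when they deviate in opposite directions; the denominator scales the result so that it does not depend on the units of measurement.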
However, as noted above, correlation does not necessarily mean that there is a causal relationship between variables. A correlation may also be spurious, arising by chance without any underlying reason, especially in small samples. Therefore, establishing a causal relationship requires additional research and experiments.
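Whether an observed coefficient could plausibly have arisen by chance can be checked, for example, with a permutation test. The sketch below is one possible check and is not described in the source; the function `permutation_p_value`, its parameters and the data are hypothetical. It shuffles one variable many times and counts how often the shuffled data give a coefficient at least as large in absolute value as the observed one.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the correlation of x and y.

    Shuffling y breaks any real pairing with x, so the shuffled coefficients
    show what chance alone produces for a sample of this size.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical measurements for eight athletes, invented for illustration.
run_30m_s = [5.2, 5.1, 4.9, 4.9, 4.7, 4.5, 4.3, 4.2]
height_cm = [178, 171, 183, 175, 169, 180, 174, 177]
readiness_score = [60, 65, 70, 72, 78, 83, 90, 95]

p_run = permutation_p_value(run_30m_s, readiness_score)
p_height = permutation_p_value(height_cm, readiness_score)

print(f"30 m run vs readiness: p = {p_run:.4f}")
print(f"height vs readiness:   p = {p_height:.4f}")
```

A small p-value indicates that chance alone rarely produces a coefficient this large; it still says nothing about causation, which remains a question for experimental study.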