Methods of Fourier spectral analysis and its properties. Spectral analysis based on fast Fourier transform. Options for individual assignments on the LP topic


Video surveillance cameras are widely used to monitor traffic conditions on highways with high traffic volumes. The information received from the cameras contains data on temporal changes in the spatial position of vehicles in the system's field of view. Processing this information with algorithms used in television measuring systems (TIS) makes it possible to determine vehicle speeds and to manage traffic flow. These factors explain the growing interest in television monitoring of transport highways.

To develop methods for filtering images of vehicles against a background of noise, it is necessary to know their basic parameters and characteristics. Previously, the authors conducted a study of the Fourier and wavelet spectra of natural and urban backgrounds. This work is devoted to the study of similar spectra of vehicles.

  • using a digital camera, a bank of source .bmp files of monochrome images of vehicles of various types was created (cars, trucks and buses; for each group there were 20-40 images taken at different angles and under different lighting conditions); the images measured 400 pixels horizontally by 300 pixels vertically, with a brightness range from 0 to 255 units;
  • since the images contained a background component in addition to the vehicle, the background was artificially suppressed to zero level so that it would not influence the result;
  • the characteristics of the vehicle images were analyzed using Fourier and wavelet analysis methods.

The program developed in the MATLAB environment calculates the average brightness (i.e. the mathematical expectation of the image brightness), the brightness variance, the Fourier spectra of individual and summed image lines, spectrograms, and wavelet spectra using various well-known wavelets (Haar, Daubechies, Symlets, etc.). The analysis results are presented as two-dimensional and 3D image spectra.

Based on the research results, the following conclusions can be drawn:

  • the average brightness characteristics (mean brightness, variance) of images of various vehicles have similar values for all types; sun glare from the glass and body surfaces has a significant effect on the brightness characteristics; depending on the intensity and direction of illumination, black cars may have brightness characteristics similar to those of light-colored cars;
  • regardless of the type of vehicle, the Fourier and wavelet spectra have a similar structure;
  • the width of the vehicle's Fourier spectrum depends only slightly on the type of vehicle; the spectrum is markedly uneven and changes with lighting and vehicle orientation; the spectrum in the horizontal plane is more uneven than in the vertical plane; the spectral characteristics of semi-trailer trucks and buses are strongly influenced by drawings and inscriptions (advertising) on their surfaces;
  • when cars turn, the image spectra change significantly in the horizontal plane, while the spectrum in the vertical plane remains fairly stable; this is especially visible in the wavelet spectra;
  • analysis of the spectra of an individual vehicle and of a vehicle against a background of interference shows that they differ in the amplitude levels of the spectral components; in the absence of a background, the vertical spectrum is significantly more uniform; for images of cars without a background, deep dips in the spectrum are more likely (higher unevenness), and the envelope of the spectrum of images with a background is more uniform than without a background;
  • because of the strong influence of a large number of factors, no stable spectral features of vehicle images can be identified, whether by Fourier analysis or by wavelet analysis; this reduces the effectiveness of spectral image filtering performed to suppress the background;
  • in automated traffic control systems, identifying cars against a background of noise requires a set of features, such as color, spectrum, geometric parameters of objects (sizes and aspect ratios) and dynamic characteristics.

BIBLIOGRAPHY

  1. Makaretsky E.A., Nguyen L.H. Study of the characteristics of images of natural and urban backgrounds // Izv. Tulsk. State Univ. Radio engineering and radio optics. - Tula, 2005. - T. 7.- P.97-104.

Bibliographic link

Makaretsky E.A. STUDY OF FOURIER AND WAVELET SPECTRA OF IMAGES OF VEHICLES // Fundamental Research. – 2006. – No. 12. – P. 80-81;
URL: http://fundamental-research.ru/ru/article/view?id=5557 (access date: 01/15/2020).

Spectral analysis

Spectral analysis is a wide class of data processing methods based on the frequency representation of the data, or spectrum. The spectrum is obtained by decomposing the original function, which depends on time (a time series) or on spatial coordinates (for example, an image), in a basis of periodic functions. Most often, spectral processing uses the Fourier spectrum, obtained with a sinusoidal basis (Fourier decomposition, Fourier transform).

The main idea of the Fourier transform is that an original non-periodic function of arbitrary form, which cannot be described analytically and is therefore difficult to process and analyze, is represented as a set of sines or cosines with different frequencies, amplitudes and initial phases.

In other words, a complex function is transformed into many simpler ones. Each sine wave (or cosine wave) with a certain frequency and amplitude obtained from the Fourier expansion is called a spectral component, or harmonic. The spectral components together form the Fourier spectrum.

Visually, the Fourier spectrum is presented as a graph on which the circular frequency, denoted by the Greek letter ω, is plotted along the horizontal axis, and the amplitude of the spectral components, usually denoted by the Latin letter A, is plotted along the vertical axis. Each spectral component can then be drawn as a vertical line whose horizontal position corresponds to its frequency and whose height corresponds to its amplitude. A harmonic with zero frequency is called the constant component (in the time representation it is a straight line).

Even a simple visual analysis of the spectrum can tell a lot about the nature of the function from which it was obtained. Intuitively, rapid changes in the initial data give rise to high-frequency components in the spectrum, and slow changes to low-frequency ones. Therefore, if the amplitude of the spectral components decreases rapidly with increasing frequency, the original function (for example, a time series) is smooth, and if the spectrum contains high-frequency components with large amplitude, the original function contains sharp fluctuations. For a time series, this may indicate a large random component, instability of the processes it describes, or the presence of noise in the data.

Spectral processing is based on spectrum manipulation. Indeed, if you reduce (suppress) the amplitude of high-frequency components, and then, based on the changed spectrum, restore the original function by performing an inverse Fourier transform, then it will become smoother due to the removal of the high-frequency component.

For a time series, for example, this means removing information on daily sales, which are highly susceptible to random factors, and leaving behind more consistent trends, such as seasonality. You can, on the contrary, suppress low-frequency components, which will remove slow changes and leave only fast ones. In the case of a time series, this will mean suppression of the seasonal component.

By using the spectrum in this way, you can achieve the desired change in the original data. The most common use is to smooth time series by removing or reducing the amplitude of high-frequency components in the spectrum.

To manipulate spectra, filters are used: algorithms that can control the shape of the spectrum by suppressing or enhancing its components. The main property of any filter is its amplitude-frequency response, the shape of which determines how the spectrum is transformed.

If a filter passes only spectral components with a frequency below a certain cutoff frequency, then it is called a low-pass filter (LPF), and it can be used to smooth the data, clear it of noise and anomalous values.

If a filter passes spectral components above a certain cutoff frequency, then it is called a high-pass filter (HPF). It can be used to suppress slow changes, such as seasonality in data series.

In addition, many other types of filters are used: bandpass filters, band-stop filters, and more complex ones that are used for signal processing in radio electronics. By choosing the filter type and the shape of its frequency response, you can achieve the desired transformation of the original data through spectral processing.

When performing frequency filtering of data for smoothing and noise removal, it is necessary to specify the low-pass filter bandwidth correctly. If the bandwidth is too wide, the degree of smoothing will be insufficient and the noise will not be completely suppressed. If it is too narrow, changes that carry useful information will be suppressed along with the noise. While technical applications have strict criteria for determining the optimal filter characteristics, analytical technologies have to rely mainly on experimental methods.
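The effect of the passband width can be tried out on a small example. The following is a minimal sketch, not a reference implementation: it uses a plain discrete Fourier transform written with the Python standard library (an FFT returns the same values faster), and the test series, the function names `dft`, `idft`, `low_pass` and the cutoff values are all assumptions made for illustration.

```python
import cmath
import math

def dft(x):
    """Plain O(N^2) discrete Fourier transform (an FFT gives the same values faster)."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT; returns complex samples."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]

def low_pass(samples, cutoff):
    """Zero every harmonic whose index exceeds `cutoff`, then transform back."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        harmonic = min(k, n - k)  # bins k and n - k carry the same frequency
        if harmonic > cutoff:
            spectrum[k] = 0
    return [value.real for value in idft(spectrum)]

# Hypothetical series: a slow "seasonal" wave plus a fast oscillation.
N = 64
slow = [math.sin(2 * math.pi * 2 * j / N) for j in range(N)]
fast = [0.5 * math.sin(2 * math.pi * 20 * j / N) for j in range(N)]
series = [s + f for s, f in zip(slow, fast)]

smoothed = low_pass(series, cutoff=5)     # keeps harmonic 2, removes harmonic 20
too_wide = low_pass(series, cutoff=25)    # passband too wide: nothing is removed
too_narrow = low_pass(series, cutoff=1)   # passband too narrow: the useful wave is lost too
```

With these assumed values, `smoothed` reproduces the slow wave, `too_wide` leaves the series unchanged, and `too_narrow` suppresses the useful component together with the fast one.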

Spectral analysis is one of the most effective and well-developed data processing methods. Frequency filtering is just one of its many applications. In addition, it is used in correlation and statistical analysis, synthesis of signals and functions, building models, etc.

Mathcad has built-in Fast Fourier Transform (FFT) tools that greatly simplify the procedure for approximate spectral analysis.

FFT is a fast algorithm for transferring information about a function specified by $2^m$ samples ($m$ is an integer) in the time domain to the frequency domain.

Fig.3 Spectral analysis using FFT

The function fft(v) implements the forward FFT: it returns the forward FFT of a $2^m$-dimensional vector v, where v is a vector whose elements store the samples of the function f(t). The result is a vector A of dimension $1 + 2^{m-1}$ with complex elements, the samples in the frequency domain. In fact, the real and imaginary parts of this vector are the Fourier coefficients $a_k$ and $b_k$, which greatly simplifies obtaining them.

The function ifft(v) implements the inverse FFT: it returns the inverse FFT of a vector v with complex elements. The vector v has $1 + 2^{m-1}$ elements.
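The link between the $2^m$ time samples, the $1 + 2^{m-1}$ frequency samples and the coefficients $a_k$, $b_k$ can be checked outside Mathcad. Below is a minimal sketch in Python with a textbook recursive radix-2 FFT; it is not Mathcad's implementation, and the scaling and sign conventions used here (unnormalized forward transform with a negative exponent) are assumptions that may differ from those of Mathcad's fft. The test signal is hypothetical.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two (2**m samples)."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

m = 7
N = 2 ** m                                   # 128 samples over one period
t = [2 * math.pi * j / N for j in range(N)]
# Hypothetical test signal with known coefficients: a_2 = 1.5, b_3 = 0.5.
samples = [1.5 * math.cos(2 * tj) + 0.5 * math.sin(3 * tj) for tj in t]

half = fft(samples)[: N // 2 + 1]            # 1 + 2**(m-1) = 65 non-redundant components
a = [2 * c.real / N for c in half]           # cosine coefficients a_k (a[0] is a_0 in the a_0/2 convention)
b = [-2 * c.imag / N for c in half]          # sine coefficients b_k
amplitude = [math.hypot(ak, bk) for ak, bk in zip(a, b)]
```

For a real input the second half of the full transform is the complex conjugate of the first, which is why only $1 + 2^{m-1}$ components need to be kept.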

Filtering analog signals

Definition. Filtering is the extraction of a useful signal from its mixture with an interfering signal (noise). The most common type of filtering is frequency filtering. If the frequency range occupied by the useful signal is known, it is enough to isolate this range and suppress the ranges occupied by noise.

Using forward FFT, the noisy signal is converted from the time domain to the frequency domain, creating a vector f of 64 frequency components.

Then a filtering transformation is performed using the Heaviside function.

Φ(x) is the Heaviside step function.

It returns 1 if x ≥ 0; otherwise 0.

The filtered signal (vector g) is passed through an inverse FFT, producing the output vector h.

A comparison of the time dependences of the source and output signals shows that the output signal almost completely repeats the input signal and is largely free of the high-frequency noise that masks the useful signal.

Fig.4. Filtering analog signals

Figure 4 illustrates the filtering technique using FFT. First, the original signal is synthesized, represented by a vector v of 128 samples. Noise is then added to this signal using a random number generator (function rnd), and a vector of 128 samples of the noisy signal is formed.
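The whole chain (synthesis, noise, forward transform, Heaviside thresholding, inverse transform) can be sketched in Python. This is an illustration under stated assumptions, not the worksheet from Fig. 4: the thresholding rule $g_k = f_k\,\Phi(|f_k| - \alpha)$, the $1/\sqrt{N}$ scaling of the transform, the sine test signal and the noise range are all assumed here; the text only states that the Heaviside function is used with a filtering parameter. A plain DFT over all 128 bins stands in for the FFT.

```python
import cmath
import math
import random

def dft(x, sign=-1):
    """Plain DFT with 1/sqrt(N) scaling (sign=-1 forward, sign=+1 inverse)."""
    n = len(x)
    scale = 1 / math.sqrt(n)
    return [scale * sum(x[j] * cmath.exp(sign * 2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]

def heaviside(x):
    """Step function: 1 if x >= 0, otherwise 0."""
    return 1.0 if x >= 0 else 0.0

def mse(p, q):
    """Mean squared difference between two equally long sequences."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q)) / len(p)

random.seed(0)
N = 128
v = [math.sin(2 * math.pi * 3 * j / N) for j in range(N)]   # useful signal (assumed shape)
s = [vj + random.uniform(-1.0, 1.0) for vj in v]            # noisy signal, noise in (-1, 1)
f = dft(s)                                                  # spectrum of the noisy signal
alpha = 2.0                                                 # filtering parameter (assumed threshold)
g = [fk * heaviside(abs(fk) - alpha) for fk in f]           # keep only strong components
h = [value.real for value in dft(g, sign=+1)]               # filtered signal

print(mse(s, v), mse(h, v))   # error before and after filtering
```

The threshold works here because the useful sine concentrates its energy in two bins, while the noise energy is spread thinly over all of them; a signal with a broad spectrum would need a different rule.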

Procedure for performing laboratory work

Task 1. Calculate the first six pairs of coefficients of the Fourier series expansion of the function f(t) on the specified segment.

Construct graphs of the 1st, 2nd and 3rd harmonics.

Perform harmonic synthesis of the function f(t) from the 1st, 2nd and 3rd harmonics. Display the synthesis results graphically.
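A numerical version of this task can be sketched in Python. The example below is only an illustration: the function $f(t) = |\sin(t/2)|$, the segment $[0, 2\pi]$ and the helper names are assumptions (the actual function and segment come from the task options), and the integrals are approximated by a rectangle rule over one period.

```python
import math

def fourier_coefficients(f, n_harmonics, period=2 * math.pi, n_points=4096):
    """Approximate a_0 and a_k, b_k (k = 1..n_harmonics) by a rectangle rule over one period."""
    omega = 2 * math.pi / period
    ts = [period * j / n_points for j in range(n_points)]
    ys = [f(t) for t in ts]
    a0 = 2 * sum(ys) / n_points
    a = [2 * sum(y * math.cos(k * omega * t) for y, t in zip(ys, ts)) / n_points
         for k in range(1, n_harmonics + 1)]
    b = [2 * sum(y * math.sin(k * omega * t) for y, t in zip(ys, ts)) / n_points
         for k in range(1, n_harmonics + 1)]
    return a0, a, b

def synthesize(t, a0, a, b, n_terms, period=2 * math.pi):
    """Partial sum a_0/2 + sum over k <= n_terms of (a_k cos(k w t) + b_k sin(k w t))."""
    omega = 2 * math.pi / period
    return a0 / 2 + sum(a[k - 1] * math.cos(k * omega * t) + b[k - 1] * math.sin(k * omega * t)
                        for k in range(1, n_terms + 1))

def f(t):
    """Hypothetical function for the example, not one of the task options."""
    return abs(math.sin(t / 2))

a0, a, b = fourier_coefficients(f, 6)          # first six pairs (a_k, b_k)
approx = [synthesize(t, a0, a, b, 3) for t in (0.5, 1.0, 2.0, 3.0)]  # synthesis from harmonics 1-3
```

For this particular function the exact values are known ($a_0 = 4/\pi$, $a_k = -4/(\pi(4k^2 - 1))$, $b_k = 0$), which makes it convenient for checking the numerical integration before switching to the assigned function.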

Task options 1

Table columns: Option No., f(t). Only fragments of the function entries survive: cos, e, |sin 3t|.

Task 2. Perform classical spectral analysis and synthesis of the function f(t). Display graphically the amplitude and phase spectra and the result of the spectral synthesis of f(t).

Task 3. Perform numerical spectral analysis and synthesis of the function f(t). To do this, specify the original function f(t) discretely by 32 samples. Display graphically the amplitude and phase spectra and the result of the spectral synthesis of f(t).

Task 4. Perform spectral analysis and synthesis of the function f(t) using FFT. To do this:

· specify the initial function f(t) discretely by 128 samples;

· perform the forward FFT using the function fft and display graphically the amplitude and phase spectra of the first six harmonics;

· perform the inverse FFT using the function ifft and display graphically the result of the spectral synthesis of the function f(t).

Task 5. Filter the function f(t) using FFT:

· synthesize the function f(t) as a useful signal represented by a vector v of 128 samples;

· add noise to the useful signal v using the function rnd (rnd(2) - 1) and form a vector s of 128 samples of the noisy signal;

· convert the noisy signal s from the time domain to the frequency domain using the forward FFT (function fft); the result is a signal f of 64 frequency components;

· perform the filtering transformation using the Heaviside function (filtering parameter α = 2);

· using the function ifft, perform the inverse FFT and obtain the output vector h;

· plot the useful signal v and the signal obtained by filtering the noisy signal s.

Topic 1. “Propositional logic”

Exercise

1. Determine whether this formula is identically true.

2. Write this statement in the form of a propositional logic formula. Construct the negation of this statement in the form of a formula that does not contain external signs of negation. Translate into natural language.

3. Determine whether this reasoning is correct (check whether the conclusion follows from the conjunction of premises).
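For item 3, checking that a conclusion follows from the conjunction of premises amounts to verifying that no row of the truth table makes all premises true and the conclusion false. The sketch below shows this check in Python for two of the options; the helper names and the way the sentences are symbolized are assumptions (one possible reading of the natural-language text), not an official answer key.

```python
from itertools import product

def implies(p, q):
    """Material implication p -> q."""
    return (not p) or q

def is_valid(premises, conclusion, n_vars):
    """True if the conclusion holds in every row where all premises hold."""
    for values in product([False, True], repeat=n_vars):
        if all(premise(*values) for premise in premises) and not conclusion(*values):
            return False
    return True

# Option No. 27: a = overtired, s = sick, i = irritated.
option_27 = is_valid(
    premises=[
        lambda a, s, i: a or s,          # Anton is overtired or sick
        lambda a, s, i: implies(a, i),   # if overtired, he gets irritated
        lambda a, s, i: not i,           # he does not get irritated
    ],
    conclusion=lambda a, s, i: s,        # therefore he is sick
    n_vars=3,
)

# Option No. 18: b = go to bed, n = study at night, p = pass the exam.
option_18 = is_valid(
    premises=[
        lambda b, n, p: implies(b, not p),
        lambda b, n, p: implies(n, not p),
    ],
    conclusion=lambda b, n, p: not p,
    n_vars=3,
)

print(option_27, option_18)
```

Under this reading the first argument is valid and the second is not: nothing in the premises of Option No. 18 says that one of the two alternatives must happen. The same `is_valid` helper also covers item 1, since a formula is identically true exactly when it follows from an empty list of premises.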


Options for individual assignments on the LP topic

Option No. 1

3. If a person has made some decision, and he is brought up correctly, then he will overcome all competing desires. The man made a decision, but did not overcome competing desires. Therefore, he was brought up incorrectly.

Option No. 2

2. It is raining and snowing.

3. If this phenomenon is mental, then it is caused by an external influence on the body. If it is physiological, then it is also due to external influences on the body. This phenomenon is neither mental nor physiological. Consequently, it is not caused by external influences on the body.

Option No. 3

2. He is a good student or a good athlete.

3. If the suspect committed the theft, then either it was carefully prepared, or he had accomplices. If the theft had been carefully prepared, then if there were accomplices, a lot would have been stolen. Little was stolen. This means the suspect is innocent.

Option No. 4

2. If a steel wheel is heated, its diameter will increase.

3. If the price of securities rises or the interest rate decreases, then the price of shares falls. If the interest rate decreases, then either the stock price does not fall or the stock price does not rise. The stock price is going down. Consequently, the interest rate decreases.

Option No. 5

3. Either the witness was not intimidated, or, if Henry committed suicide, then the note was found. If the witness was intimidated, then Henry did not commit suicide. The note was found. Consequently, Henry committed suicide.

Option No. 6

2. He studies at the institute or takes foreign language courses.

3. If a philosopher is a dualist, then he is not a materialist. If he is not a materialist, then he is a dialectician or a metaphysician. He is not a metaphysician. Therefore he is a dialectician or dualist.

Option No. 7

2. He is capable and diligent.

3. If investment remains constant, government spending will increase or unemployment will occur. If government spending does not increase, taxes will be reduced. If taxes are reduced and investment remains constant, unemployment will not increase. Unemployment will not increase. Consequently, government spending will increase.

Option No. 8

2. This book is difficult and uninteresting.

3. If the source data is correct and the program works correctly, then the correct result is obtained. The result is incorrect. Consequently, the input data is incorrect or the program does not work correctly.

Option No. 9

2. He is a tailor, a reaper, and a piper.

3. If prices are high, then wages are high. Prices are high or price controls are applied. If price controls are applied, then there is no inflation. There is inflation. Therefore, wages are high.

Option No. 10

2. If water is cooled, its volume will decrease.

3. If I'm tired, I want to go home. If I'm hungry, I want to go home or go to a restaurant. I'm tired and hungry. That's why I want to go home.

Option No. 11

2. If a number ends in zero, it is divisible by 5.

3. If it is cold tomorrow, I will wear a warm jacket if the sleeve is repaired. Tomorrow it will be cold and the sleeve will not be repaired. So I won't wear a warm jacket.

Option No. 12

2. The body, deprived of support, falls to the ground.

3. If it snows, it will be difficult to drive the car. If it's difficult to drive, I'll be late if I don't leave early. It's snowing and I'll leave early. So I won't be late.

Option No. 13

2. Ivan and Peter know Fedor.

3. If a person tells a lie, then he is mistaken or deliberately misleads others. This man is telling a lie and is clearly not mistaken. This means that he deliberately misleads others.

Option No. 14

2. This book is useful and interesting.

3. If he were smart, he would have seen his mistake. If he had been sincere, he would have confessed to her. However, he is neither smart nor sincere. Consequently, he either will not see his mistake or will not admit it.

Option No. 15

2. This actor plays in the theater and does not play in films.

3. If a person is a materialist, then he recognizes the knowability of the world. If a person recognizes the knowability of the world, then he is not an agnostic. Therefore, if a person is not a consistent materialist, then he is an agnostic.

Option No. 16

2. If a dog is teased, it will bite.

3. If there is justice in the world, then evil people cannot be happy. If the world is the creation of an evil genius, then evil people can be happy. This means that if there is justice in the world, then the world cannot be the creation of an evil genius.

Option No. 17

2. If you know English, you can handle this job.

3. If Ivanov works, then he receives a salary. If Ivanov studies, then he receives a scholarship. But Ivanov does not receive a salary or does not receive a scholarship. Therefore, he does not work or does not study.

Option No. 18

2. If the function is odd, then its graph is symmetrical about the origin.

3. If I go to bed, I won’t pass the exam. If I study at night, I won't pass the exam either. Therefore, I will not pass the exam.

Option No. 19

2. If a number is divisible by 3, then the sum of its digits is divisible by 3.

3. If I go to the first lecture tomorrow, I will have to get up early. If I go to a disco in the evening, I will go to bed late. If I go to bed late and get up early, I will feel bad. Therefore, I must miss the first lecture or not go to the disco.

Option No. 20

2. If a word is placed at the beginning of a sentence, then it is written with a capital letter.

3. If x ≠ 0 and y ≠ 0, then x² + y² > 0. If x = 0 and y = 0, then the expression (xy):(x + y) does not make sense. It is not true that x² + y² > 0. Therefore, the expression (xy):(x + y) does not make sense.

Option No. 21

2. Ivan and Marya love each other.

3. If the book I am reading is useless, then it is not difficult. If a book is difficult, then it is not interesting. This book is difficult and interesting. So it is useful.

Option No. 22

2. A bad soldier is one who does not dream of becoming a general.

3. If it rains tomorrow, I will wear a raincoat. If there is wind, I will wear a jacket. Therefore, if there is no rain and wind, I will not wear a raincoat or jacket.

Option No. 23

2. If a series converges, then its common term tends to zero.

3. If he is not a coward, then he will act in accordance with his own convictions. If he is honest, then he is not a coward. If he is not honest, then he will not admit his mistake. He admitted his mistake. So he's not a coward.

Option No. 24

2. Neither Ivan nor Fedor are excellent students.

3. If he is stubborn, then he can make mistakes. If he is honest, he is not stubborn. If he is not stubborn, then he cannot not make mistakes and be honest at the same time. So he's not stubborn.

Option No. 25

2. Either Ivan or Peter knows Fedor.

3. If salaries are paid on time, then either elections or a protest are expected. The salary was paid on time. No elections are expected. This means a protest is expected.

Option No. 26

2. If you create an algorithm and write a program, you can solve this problem.

3. If a person plays sports, then he is healthy. If a person is healthy, then he is happy. This person plays sports. So he is happy.

Option No. 27

2. In the evening we will go to hockey or watch it on TV.

3. Anton is overtired or sick. If he is overtired, he gets irritated. He doesn't get annoyed. Therefore he is sick.

Option No. 28

2. If I don't get enough sleep or am hungry, I can't exercise.

3. If a company is focused on strengthening marketing, then it intends to make large profits by releasing new products. If a company plans to expand its distribution network, then it intends to make large profits from increased sales. The company plans to strengthen marketing or is going to expand its distribution network. Therefore, it intends to make large profits.

Option No. 29

2. If taxes are not reduced, small producers will go bankrupt and leave production.

3. The contract will be fulfilled if and only if the house is completed in February. If the house is finished in February, then we can move in March. The contract will be fulfilled, therefore we can move in March.

Option No. 30

2. If our team does not take first place, we will stay at home and train.

3. The intended program will succeed if the enemy is taken by surprise or if his positions are poorly defended. You can take him by surprise if he is careless. He will not be careless if his positions are poorly defended. This means the program will fail.


Topic 2. Linear pairwise regression

This topic includes six laboratory works devoted to the construction and study of a linear regression equation of the form $\hat{y}(x) = b_0 + b_1 x$.

Example 1.1.

To determine the relationship between the coal output per worker per shift (variable Y, measured in tons) and the thickness of the coal seam (variable X, measured in meters), studies were carried out at 10 mines; the results are presented in the table.

Table columns: $i$, $x_i$, $y_i$.

Laboratory work No. 1

Calculation of coefficients of the LR equation

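Goal of the work. Calculate the coefficients of the linear regression equation from a spatial sample.

Calculation ratios. The coefficients $b_0$, $b_1$ determined by the least squares method are the solution of the system of equations

$$b_0 + b_1\,\bar{x} = \bar{y}, \qquad b_0\,\bar{x} + b_1\,\overline{x^2} = \overline{xy}.$$

Solving this system of equations, we get

$$b_1 = \frac{m_{XY}}{s_X^2}, \qquad b_0 = \bar{y} - b_1\,\bar{x},$$

where $m_{XY}$ is the sample value of the correlation moment, determined by the formula

$$m_{XY} = \overline{xy} - \bar{x}\,\bar{y},$$

and $s_X^2$ is the sample value of the variance of the quantity $X$, determined by the formula

$$s_X^2 = \overline{x^2} - \bar{x}^2.$$

Here the overbar denotes the sample mean.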

Solution

Let's calculate these coefficients using the Excel spreadsheet processor. The figure shows a fragment of an Excel document in which:

a) the table data is entered;

b) the calculation of the coefficients of the system is programmed;

c) the calculation of $b_0$, $b_1$ by the formulas is programmed.

Note that the mean values are calculated with the Excel function AVERAGE(cell range).

As a result of performing the programmed calculations, we obtain

$b_0 = -2.75$; $b_1 = 1.016$,

and the regression equation itself takes the form

$$\hat{y}(x) = -2.75 + 1.016\,x.$$

Exercise. Using the obtained regression equation, determine the miner's labor productivity if the thickness of the coal seam is:

a) 8.5 meters (data interpolation);

b) 14 meters (data extrapolation).

Fig. 1. Calculation of linear regression coefficients
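The same calculation can be repeated outside Excel as a cross-check. The sketch below is a minimal Python illustration of the least squares formulas (sample means, correlation moment, variance of X); the function names are hypothetical, and the ten data pairs are illustrative values chosen because they reproduce the coefficients quoted above, since the original table did not survive in this text.

```python
def mean(values):
    """Sample mean (the counterpart of Excel's AVERAGE)."""
    return sum(values) / len(values)

def linear_regression(x, y):
    """Return (b0, b1) of the least squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    m_xy = mean([xi * yi for xi, yi in zip(x, y)]) - x_bar * y_bar  # correlation moment
    s2_x = mean([xi * xi for xi in x]) - x_bar ** 2                 # sample variance of X
    b1 = m_xy / s2_x
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Illustrative data (assumed, not copied from the lost table):
# x = seam thickness in meters, y = output per worker in tons.
x = [8, 11, 12, 9, 8, 8, 9, 9, 8, 12]
y = [5, 10, 10, 7, 5, 6, 6, 5, 6, 8]

b0, b1 = linear_regression(x, y)
interpolated = b0 + b1 * 8.5    # inside the observed range of x
extrapolated = b0 + b1 * 14     # outside the observed range: treat with caution
print(round(b0, 2), round(b1, 3))
```

The extrapolated value relies on the linear relationship continuing beyond the observed seam thicknesses, which the sample itself cannot confirm.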


Laboratory work No. 2

Calculation of sample correlation coefficient

Goal of the work. Calculation of the sample correlation coefficient from a spatial sample.

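Calculation ratios. The sample correlation coefficient is determined by the relation

$$r_{XY} = \frac{m_{XY}}{s_X\, s_Y},$$

where $m_{XY} = \overline{xy} - \bar{x}\,\bar{y}$, $s_X = \sqrt{\overline{x^2} - \bar{x}^2}$, $s_Y = \sqrt{\overline{y^2} - \bar{y}^2}$.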

Solution

A fragment of the Excel document that calculates the value of the correlation coefficient is shown in the figure.

Fig. 2. Calculation of the correlation coefficient


Laboratory work No. 3

Calculation of estimates of variances of paired LR

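Goal of the work. Calculate estimates of the variances of the coefficients $b_0$, $b_1$.

Calculation ratios. The estimates of the coefficient variances are determined by the formulas

$$s_{b_0}^2 = s^2\,\frac{\sum_{i=1}^{n} x_i^2}{n\sum_{i=1}^{n}(x_i-\bar{x})^2}, \qquad s_{b_1}^2 = \frac{s^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2},$$

where $s^2 = \frac{1}{n-2}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$ is the estimate of the disturbance variance.

Solution. Figure 3 shows a fragment of an Excel document in which the variance estimates were calculated. Note that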

· the values ​​of the coefficients are taken from laboratory work No. 1 and the cells (B1, B2) in which they are located have absolute addressing ($B$1, $B$2) in expressions that calculate regression values;

· value (cell B19) is taken from laboratory work No. 1. We obtain the following values:

.

Fig. 3. Calculation of estimates for variances of coefficients
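As a cross-check of the spreadsheet, the variance estimates can be computed in a few lines of Python. This is a sketch under assumptions: it uses the standard least squares expressions with $n - 2$ degrees of freedom for the disturbance variance, hypothetical function names, and the same illustrative data as an assumed stand-in for the lost table.

```python
def fit(x, y):
    """Least squares coefficients (b0, b1) of y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1

def coefficient_variances(x, y):
    """Return (s2, s2_b0, s2_b1): disturbance variance and variances of b0, b1."""
    n = len(x)
    b0, b1 = fit(x, y)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    q_e = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # residual sum of squares
    s2 = q_e / (n - 2)
    s2_b1 = s2 / s_xx
    s2_b0 = s2 * sum(xi ** 2 for xi in x) / (n * s_xx)
    return s2, s2_b0, s2_b1

# Illustrative data (assumed, not copied from the lost table).
x = [8, 11, 12, 9, 8, 8, 9, 9, 8, 12]
y = [5, 10, 10, 7, 5, 6, 6, 5, 6, 8]

s2, s2_b0, s2_b1 = coefficient_variances(x, y)
print(s2, s2_b0, s2_b1)
```

The square root of `s2` is the quantity that Excel's STEYX function returns for the same data, which gives a second way to check the result.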


Laboratory work No. 4

Excel functions for paired LR coefficients

Goal of the work. Calculate the coefficients of a linear regression equation from a spatial sample using Excel functions.

Here are some Excel statistical functions that are useful when building paired linear regression.

INTERCEPT function. Calculates the coefficient $b_0$; the call has the form

INTERCEPT(range_of_y_values; range_of_x_values).

SLOPE function. Calculates the coefficient $b_1$; the call has the form

SLOPE(range_of_y_values; range_of_x_values).

FORECAST function. Calculates the value of the linear pairwise regression for a given value $x$ of the independent variable; the call has the form

FORECAST(x; range_of_y_values; range_of_x_values).

STEYX function. Calculates an estimate of the standard deviation of the disturbances; the call has the form (YX are Latin letters):

STEYX(range_of_y_values; range_of_x_values).

Solution. A fragment of an Excel document calculating the required values ​​is given. Note the use of absolute addressing when calculating.

Fig. 4. Using Excel functions

Exercise. Compare the calculated values ​​with the values ​​obtained in labs #1 and #3.


Laboratory work No. 5

Construction of an interval estimate for the paired LR function

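Goal of the work. Construct an interval estimate for the regression function with reliability γ = 0.95, using the regression equation constructed in laboratory work No. 1.

Calculation ratios. The interval estimate (confidence interval) for the regression function at a given value $x$ with reliability (confidence probability) equal to γ is determined by the expression

$$\left(\hat{y}(x) - t_{\gamma}\, s_{\hat{y}}(x),\ \ \hat{y}(x) + t_{\gamma}\, s_{\hat{y}}(x)\right).$$

The estimate of the variance of the function $\hat{y}(x)$ has the form

$$s_{\hat{y}}^2(x) = s^2\left(\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right),$$

where $s^2$ is the estimate of the disturbance variance.

Thus, the interval is determined by two quantities: $s_{\hat{y}}(x)$ (which depends on $x$) and the quantile $t_{\gamma}$ of Student's distribution with $n - 2$ degrees of freedom, calculated using the Excel function:

TINV(1 − γ; n − 2).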

Solution. We will calculate the values ​​of the lower and upper boundaries of the interval for .

A fragment of the document performing these calculations is shown in the figure


Fig.5. Constructing an interval estimate for

The values in cells B16:B18 and the coefficients (B1:B2) are taken from the previous laboratory works. The quantile is $t_{\gamma}$ = TINV(0.05; 8) = 2.31.
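The interval construction can be sketched in Python as follows. The quantile 2.31 is taken from the text; everything else is an assumption made for illustration: the function name, the point $x_0 = 9$ at which the interval is evaluated, and the data, which are illustrative values standing in for the lost table.

```python
import math

def confidence_interval(x, y, x0, t_quantile):
    """Interval estimate for the regression function at x0: y_hat -/+ t * s_yhat."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    # Estimate of the disturbance variance with n - 2 degrees of freedom.
    s2 = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    y_hat = b0 + b1 * x0
    s_yhat = math.sqrt(s2 * (1 / n + (x0 - x_bar) ** 2 / s_xx))
    return y_hat - t_quantile * s_yhat, y_hat + t_quantile * s_yhat

# Illustrative data (assumed, not copied from the lost table).
x = [8, 11, 12, 9, 8, 8, 9, 9, 8, 12]
y = [5, 10, 10, 7, 5, 6, 6, 5, 6, 8]

lower, upper = confidence_interval(x, y, x0=9, t_quantile=2.31)
print(lower, upper)
```

The interval is narrowest at $x_0 = \bar{x}$ and widens as $x_0$ moves away from the sample mean, which is the practical reason extrapolated forecasts are less reliable.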


Laboratory work No. 6

Checking the significance of the LR equation using the Fisher criterion

Goal of the work. Using the table data, assess the significance, at the level α = 0.05, of the regression equation

$$\hat{y}(x) = -2.75 + 1.016\,x$$

built in laboratory work No. 1.

Calculation ratios. A pairwise regression equation is significant at significance level α if the following inequality holds:

$$F = \frac{Q_R}{Q_e/(n-2)} > F_{\gamma;\,1;\,n-2},$$

where $F_{\gamma;\,1;\,n-2}$ is the quantile of level $\gamma = 1 - \alpha$ of the F-distribution with numbers of degrees of freedom $k_1 = 1$ and $k_2 = n - 2$.

To calculate the quantile you can use the following expression

FINV(α; 1; n − 2).

The sums are determined by the expressions:

$$Q_R = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2, \qquad Q_e = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2.$$

The criterion is often called the Fisher criterion or F-test.

Solution. Here is a fragment of an Excel document that calculates the value of $Q_e$ and the criterion $F$. In column D the values are calculated using the formula. The coefficient values are taken from laboratory work No. 1.

The computed value of the criterion is $F = 24.04$. The quantile is $F_{0.95;\,1;\,8} = 5.32$. The inequality holds, since 24.04 > 5.32, and therefore the regression equation is significant at the significance level α = 0.05.

Fig. 6. Calculation of the F-criterion value
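The F value can be recomputed in Python as a check on the spreadsheet. This is a sketch with assumptions: the function name is hypothetical, the data are illustrative values standing in for the lost table, and the critical value 5.32 is simply taken from the text rather than computed. With these data the statistic should land close to the value reported above; small differences are to be expected from rounding of intermediate results.

```python
def f_statistic(x, y):
    """F = Q_R / (Q_e / (n - 2)) for a pairwise linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    q_r = sum((fi - y_bar) ** 2 for fi in fitted)            # explained sum of squares
    q_e = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual sum of squares
    return q_r / (q_e / (n - 2))

# Illustrative data (assumed, not copied from the lost table).
x = [8, 11, 12, 9, 8, 8, 9, 9, 8, 12]
y = [5, 10, 10, 7, 5, 6, 6, 5, 6, 8]

F = f_statistic(x, y)
F_CRITICAL = 5.32                      # quantile F(0.95; 1; 8) quoted in the text
is_significant = F > F_CRITICAL
print(F, is_significant)
```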


Topic 3 Nonlinear pairwise regression

This topic includes two labs that focus on constructing a nonlinear pairwise regression equation. The spatial sample for constructing the regression is taken from the following example.

Example. The table shows the values of the independent variable (family income in thousands of rubles) and the values of the dependent variable (share of expenditures on durable goods as a percentage of total expenditures).

13.4 15.4 16.5 18.6 19.1

Laboratory work No. 7

Building a nonlinear regression using the “Add trend line” command

Goal of the work. Using a spatial sample, construct a nonlinear regression equation with the “Add trend line” command and calculate the coefficient of determination.

“Add trend line” command. It is used to highlight trends (slow changes) in time series analysis.

However, this command can also be used to construct a nonlinear regression equation, with the independent variable taking the place of time.

This command allows you to build the following regression equations:

· linear;

· polynomial;

· logarithmic;

· power;

· exponential.

To build one of the listed regressions, you must perform the following steps:

Step 1. In the selected Excel sheet, enter the source data in columns .

Step 2. Using these data, construct a graph in the Cartesian coordinate system.

Step 3. Place the cursor on the plotted graph, right-click, and in the context menu that appears run the command “Add trend line”.

Step 4. In the dialog box that appears, activate the “Type” tab and select the desired regression equation.

Fig. 2.1. Plotting a graph based on source data

Fig. 2.2. Selecting the type of regression equation

Step 5. Activate the “Options” tab and enable the options we need:

· "Show equation on diagram" - the diagram will show the selected regression equation with the calculated coefficients;

Fig. 2.3. Setting information output options

· “Place the approximation reliability value (R^2) on the diagram”: the diagram will show the value of the coefficient of determination (for nonlinear regression, the determination index), calculated by the formula

$$R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2};$$

· If it is necessary to make a forecast based on the constructed regression equation, then you need to indicate the number of forecast periods.

The purpose of other options is clear from their names.

Step 6. After specifying all the listed options, click the “OK” button; the formula of the constructed regression equation and the value of the determination index (highlighted in shading) will appear on the diagram.

Fig. 2.4. Graph and equation of the constructed regression

Solution. We construct the equation using the steps described above. We get the equation

,

for which the coefficient of determination is equal to . This value indicates a good correspondence of the constructed equation to the original data.
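The trend types listed above can also be fitted outside the spreadsheet. The following is a minimal Python sketch, not the lab's own procedure: it assumes NumPy, fits the power and exponential forms by least squares on log-transformed data (an assumption about how such trend lines are usually computed), and uses hypothetical sample values. The names `fit_trend` and `r_squared` are illustrative.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_trend(x, y, kind, degree=2):
    """Return a prediction function for one of the five trend types."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if kind == "linear":                      # y = a + b*x
        b, a = np.polyfit(x, y, 1)
        return lambda t: a + b * np.asarray(t, dtype=float)
    if kind == "polynomial":                  # y = c0*x^d + ... + cd
        coeffs = np.polyfit(x, y, degree)
        return lambda t: np.polyval(coeffs, np.asarray(t, dtype=float))
    if kind == "logarithmic":                 # y = a + b*ln(x), needs x > 0
        b, a = np.polyfit(np.log(x), y, 1)
        return lambda t: a + b * np.log(np.asarray(t, dtype=float))
    if kind == "power":                       # y = a*x^b, fitted as ln y = ln a + b ln x
        b, ln_a = np.polyfit(np.log(x), np.log(y), 1)
        return lambda t: np.exp(ln_a) * np.asarray(t, dtype=float) ** b
    if kind == "exponential":                 # y = a*exp(b*x), fitted as ln y = ln a + b x
        b, ln_a = np.polyfit(x, np.log(y), 1)
        return lambda t: np.exp(ln_a + b * np.asarray(t, dtype=float))
    raise ValueError(f"unknown trend type: {kind}")

# Hypothetical data (not the lab's sample): an exact power law y = 2 * x^1.5.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = 2.0 * x ** 1.5

for kind in ("linear", "polynomial", "logarithmic", "power", "exponential"):
    model = fit_trend(x, y, kind)
    print(f"{kind:12s} R^2 = {r_squared(y, model(x)):.4f}")
```

Here R^2 is computed on the original scale of y, so for the power and exponential types it may differ slightly from the value a spreadsheet displays.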


Laboratory work No. 8

Selecting the Best Nonlinear Regression

Goal of the work. Using a spatial sample and the “Add trend line” command, construct six nonlinear regression equations (the polynomial equation is constructed for degrees 2 and 3), determine for each equation the coefficient of determination (this value is displayed) and the reduced coefficient of determination (this value is calculated), and find the best nonlinear regression equation by the maximum value of the reduced coefficient of determination.

Reduced coefficient of determination. The coefficient of determination $R^2$ characterizes the closeness of the constructed regression to the original data, which contain an “undesirable” random component. Obviously, by constructing a 5th order polynomial from the data, we obtain the “ideal” value $R^2 = 1$, but such an equation reproduces not only the dependence on the independent variable but also the random component, and this reduces the accuracy of forecasts made with the constructed equation.

Therefore, when choosing a regression equation, it is necessary to take into account not only the value of $R^2$, but also the “complexity” of the regression equation, determined by the number of coefficients of the equation.

Such accounting is successfully implemented in the so-called reduced (adjusted) coefficient of determination:

$\bar{R}^2 = 1 - (1 - R^2)\,\dfrac{n - 1}{n - m}$,   (2.1)

where $n$ is the sample size and $m$ is the number of calculated regression coefficients. It can be seen that, at constant values of $n$ and $R^2$, an increase in $m$ decreases the value of $\bar{R}^2$. If the number of coefficients of the compared regression equations is the same (for example, $m = 2$), then the best regression can be selected by the value of $R^2$. If the number of coefficients in the regression equations differs, then the selection should be made by the value of $\bar{R}^2$.

Solution. To construct each equation, we perform steps 2-6 (for the first equation, also step 1) and place in one document six windows that display the regression equations found and their coefficients of determination. Then we enter the formula of each equation into the table. Next, we calculate the reduced coefficient of determination and enter these values into the table.

As the “best” regression equation, we choose the one with the largest reduced coefficient of determination. This is the power function (in the table, the row with this function is highlighted in gray), whose reduced coefficient of determination is 0.9901.

No.  Equation                  R^2      Reduced R^2
1                              0.949    0.938
2                              0.9916   0.9895
3    (polynomial, degree 2)    0.9896   0.9827
4    (polynomial, degree 3)    0.9917   0.9792
5    (power function)          0.9921   0.9901
6                              0.9029   0.8786
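The calculation behind the second numeric column of the table can be sketched in Python. This is an illustration under stated assumptions: the reduced coefficient is taken in its usual form 1 - (1 - R^2)(n - 1)/(n - m), the sample size n = 6 is inferred from the tabulated values rather than stated in the text, and m = 2 is assumed for the rows whose equations are not shown. The function name `reduced_r2` is hypothetical.

```python
def reduced_r2(r2, n, m):
    """Reduced (adjusted) coefficient of determination.

    r2 -- ordinary coefficient of determination
    n  -- number of observations
    m  -- number of estimated regression coefficients
    """
    if n <= m:
        raise ValueError("need more observations than coefficients")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - m)

N_OBS = 6  # assumption: inferred from the table, not given in the text

# (row number, R^2 from the table, assumed number of coefficients m)
rows = [
    (1, 0.949, 2),
    (2, 0.9916, 2),
    (3, 0.9896, 3),   # polynomial of degree 2 -> 3 coefficients
    (4, 0.9917, 4),   # polynomial of degree 3 -> 4 coefficients
    (5, 0.9921, 2),   # power function
    (6, 0.9029, 2),
]

results = {no: reduced_r2(r2, N_OBS, m) for no, r2, m in rows}
for no, value in results.items():
    print(f"row {no}: reduced R^2 = {value:.4f}")

best_row = max(results, key=results.get)
print("best row:", best_row)
```

With these assumptions the computed values agree with the table to about three decimal places, and the largest value again falls on the power function row.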

Exercise. Determine the “worst” regression equation based on the value of its reduced coefficient of determination.


Topic 4. Linear multiple regression

This topic includes laboratory work devoted to the construction and study of a linear multiple regression equation of the form

$Y = b_0 + b_1 X_1 + \dots + b_k X_k + \varepsilon$.

The spatial sample used to construct this equation is taken from the following example.

Example. Data on shift coal production per worker (variable Y), seam thickness (variable X1) and the level of mechanization of work in the mine (variable X2), characterizing the process of coal mining in 10 mines, are given in the table. Assuming that there is a linear relationship between the variables Y, X1, X2, it is necessary to find an analytical expression for this relationship, i.e. to construct a linear regression equation.

Table columns: mine number i, x_i1, x_i2.

A matrix function in Excel can be called in one of two ways:

a) open the Function Wizard and choose the desired function category, then specify the function name and the corresponding cell ranges;

b) enter the name of the function from the keyboard and specify the corresponding cell ranges.

Matrix transposition is carried out using the TRANSPOSE function (ТРАНСП in the Russian-language version of Excel; function category: Lookup & Reference). The call to the function has the form:

TRANSPOSE(cell range),

where the parameter cell range specifies all elements of the matrix (or vector) to be transposed.

Matrix multiplication is carried out using the MMULT function (МУМНОЖ in the Russian-language version; function category: Math & Trig). The call to the function has the form:

MMULT(range_1; range_2),

where the parameter range_1 specifies the elements of the first of the multiplied matrices, and the parameter range_2 specifies the elements of the second matrix. The matrices being multiplied must have compatible sizes (if the first matrix is $n \times k$ and the second is $k \times m$, then the result is an $n \times m$ matrix).

Matrix inversion (calculation of the inverse matrix) is carried out using the MINVERSE function (МОБР in the Russian-language version; function category: Math & Trig). The call to the function has the form:

MINVERSE(cell range),

where the parameter cell range specifies all elements of the matrix to be inverted, which must be square and non-singular.

When using these functions, the following procedure must be followed:

· select the block of cells into which the result of the matrix functions will be entered (taking into account the sizes of the original matrices);

· enter the expression containing the calls to the Excel matrix functions;

· press Ctrl+Shift+Enter simultaneously. If this is not done, only one element of the resulting matrix or vector will be calculated.
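The three matrix operations above have direct counterparts in NumPy. The sketch below is an illustration, not part of the lab: it assumes that the operations are combined into the usual least-squares estimate b = (X'X)^(-1) X'y for a linear multiple regression, and it uses small hypothetical data constructed so that y = 2 + 3*x1 - x2 holds exactly.

```python
import numpy as np

# Hypothetical design matrix: a column of ones, then x1 and x2.
X = np.array([
    [1.0, 1.0, 2.0],
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 4.0],
    [1.0, 4.0, 3.0],
    [1.0, 5.0, 6.0],
])
# Hypothetical responses, exactly y = 2 + 3*x1 - x2.
y = np.array([3.0, 7.0, 7.0, 11.0, 11.0])

Xt = X.T                          # counterpart of TRANSPOSE
XtX = Xt @ X                      # counterpart of MMULT
XtX_inv = np.linalg.inv(XtX)      # counterpart of MINVERSE
b = XtX_inv @ (Xt @ y)            # b = (X'X)^(-1) X'y

print("coefficients b0, b1, b2:", np.round(b, 6))
```

Explicit inversion mirrors the spreadsheet formulas; for larger or ill-conditioned problems, `np.linalg.lstsq` is usually the safer choice.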

The Regression mode of the Data Analysis module. The Excel spreadsheet contains a Data Analysis module. This module allows you to perform statistical analysis of sample data (construction of histograms, calculation of numerical characteristics, etc.). The Regression mode of this module calculates the coefficients of a linear multiple regression with several independent variables, constructs confidence intervals and tests the significance of the regression equation.

To call the Regression mode of the Data Analysis module, you need to:

· open the Tools menu;

· in the menu that appears, execute the Data Analysis command;

· in the list of operating modes of the Data Analysis module, select Regression and click the OK button.

After the Regression mode is called, a dialog box appears on the screen in which the following parameters are set:

1. Input Y Range: the range of cell addresses containing the values of the dependent variable is entered (the cells must form one column).

Fig. 3.2. Regression mode dialog box

2. Input X Range: the range of cell addresses containing the values of the independent variables is entered. The values of each variable occupy one column. The number of variables is no more than 16.

3. Labels: enabled if the first row of the input range contains headings. If it is disabled, standard names are created automatically.

4. Confidence Level: enabling this option lets you specify the confidence level used when constructing confidence intervals.

5. Constant is Zero: when this parameter is enabled, the constant term (intercept) of the regression is set to zero.

6. Output Range: when enabled, a field is activated in which you must enter the address of the upper left cell of the output range that will contain the results of the Regression mode.

7. New Worksheet: when this parameter is enabled, a new sheet is opened into which the results of the Regression mode are inserted, starting from cell A1.

8. New Workbook: when this parameter is enabled, a new workbook is opened, and the results of the Regression mode are inserted into its first sheet, starting from cell A1.

9. Residuals: when enabled, a column containing the residuals is calculated.

10. Standardized Residuals: when enabled, a column containing the standardized residuals is calculated.

We then call the Regression mode and set the necessary parameters in the dialog box. Note that, because the tables in which the results of the Regression mode are displayed are very wide, some of the results are placed in other cells.

Let us give a brief interpretation of the indicators whose values are calculated in the Regression mode. First, let's look at the indicators grouped under the name Regression Statistics (see Fig. 3.3).

Multiple R: the square root of the coefficient of determination.

R Square: the coefficient of determination.

Fig. 3.3. Results of the Regression mode

Adjusted R Square: the reduced coefficient of determination (see formula (2.1)).

Standard Error: an estimate of the standard deviation.

Observations: the number of observations.
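The five indicators can be reproduced with a few lines of NumPy. This is a minimal sketch under assumptions: the data are randomly generated placeholders (not the mine data from the example), the standard error is taken as the square root of the residual sum of squares divided by n - m, and the function name `regression_statistics` is illustrative.

```python
import numpy as np

def regression_statistics(X, y):
    """Summary statistics of an ordinary least-squares fit with an intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape                     # n observations, k independent variables
    m = k + 1                          # estimated coefficients, incl. intercept
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {
        "Multiple R": np.sqrt(r2),
        "R Square": r2,
        "Adjusted R Square": 1.0 - (1.0 - r2) * (n - 1) / (n - m),
        "Standard Error": np.sqrt(ss_res / (n - m)),
        "Observations": n,
    }

# Hypothetical sample: 10 observations, two independent variables.
rng = np.random.default_rng(0)
x1 = rng.uniform(8.0, 12.0, size=10)
x2 = rng.uniform(4.0, 8.0, size=10)
y = 1.0 + 0.5 * x1 + 0.3 * x2 + rng.normal(0.0, 0.2, size=10)

stats = regression_statistics(np.column_stack([x1, x2]), y)
for name, value in stats.items():
    print(f"{name}: {value}")
```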

The Introductory Overview section discusses two very simple examples (taken from Shumway, 1988) that illustrate the nature of spectral analysis and the interpretation of its results. If you are unfamiliar with this method, it is recommended that you read that section first.

Overview and data file. The file Sunspot.sta contains part of the well-known sunspot numbers (Wolfer) from 1749 to 1924 (Anderson, 1971). Below is a list of the first few observations from the example file.

The number of sunspots is believed to influence the weather on Earth, as well as agriculture, telecommunications, etc. Using this analysis, one can try to find out whether sunspot activity is truly cyclical in nature (in fact, it is; these data are widely discussed in the literature, see, for example, Bloomfield, 1976, or Shumway, 1988).

Specifying the analysis. After starting the analysis module, open the Sunspot.sta data file. Click the Variables button and select the Spots variable (note that if Sunspot.sta is the currently open data file and Spots is the only variable in it, then Spots is selected automatically when the Time Series Analysis dialog box opens). Now click the Fourier (spectral) analysis button to open the Fourier (spectral) analysis dialog box.



Before applying spectral analysis, first plot the number of sunspots. Note that the Sunspot.sta file contains the corresponding years as case names. To use these names in line graphs, click the Series View tab and select Case Names in the Label Points section. Also select Set X-axis scale manually and set Min. = 1 and Step = 10. Then click the Graph button next to the View selected variable button.



The number of sunspots appears to follow a cyclical pattern. No trend is visible, so return to the Spectral Analysis window and deselect the Remove Linear Trend option in the Transform Source Series group.

It is obvious that the average of the series is greater than 0 (zero). Therefore, leave the Subtract mean option selected [otherwise the periodogram will be “clogged” with a very large peak at frequency 0 (zero)].
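The effect of subtracting the mean can be seen on a small synthetic example. The sketch below is an illustration only: it assumes NumPy, uses an artificial series (a constant level plus one sinusoid with an 11-step period) rather than the Sunspot.sta data, and scales the periodogram as |FFT|^2 / n, which may differ from the scaling used by a particular statistics package.

```python
import numpy as np

n = 176                                   # length of the synthetic series
t = np.arange(n)
# Synthetic series: level 50 plus a sinusoid completing 16 full cycles.
series = 50.0 + 40.0 * np.sin(2.0 * np.pi * t / 11.0)

def periodogram(x):
    """Periodogram values at the frequencies k/n, k = 0 .. n/2."""
    x = np.asarray(x, dtype=float)
    return np.abs(np.fft.rfft(x)) ** 2 / x.size

freqs = np.fft.rfftfreq(n, d=1.0)         # cycles per observation
raw = periodogram(series)                 # mean left in
centered = periodogram(series - series.mean())

print("largest peak, mean kept:      f =", freqs[np.argmax(raw)])
print("largest peak, mean subtracted: f =", round(freqs[np.argmax(centered)], 4))
```

With the mean left in, the largest value sits at frequency 0 and dwarfs the cyclical peak; after centering, the largest value moves to the frequency of the sinusoid.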

You are now ready to start the analysis. Click OK (One-Dimensional Fourier Analysis) to display the Fourier Spectral Analysis Results dialog box.



View results. The information section at the top of the dialog box shows some summary statistics for the series. It also shows the five largest peaks in the periodogram (by frequency). The three largest peaks are at frequencies 0.0852, 0.0909 and 0.0114. This information is often useful when analyzing very long series (for example, with more than 100,000 observations) that are not easily plotted on a single graph. In this case, however, the periodogram values are easy to inspect by clicking the Periodogram button in the Periodogram and spectral density plots section.



The periodogram graph shows two clear peaks. The maximum is at a frequency of approximately 0.09. Return to the Spectral Analysis Results window and click the Summary button to see all periodogram values (and other results) in the results table. Below is a portion of the results table with the largest peak identified from the periodogram.



As discussed in the Introductory Overview section, Frequency is the number of cycles per unit of time (where each observation is one unit of time). Thus, a Frequency of 0.0909 corresponds to a Period of 11 (the number of time units required for a complete cycle, i.e. 1/0.0909 ≈ 11). Since the sunspot data in Sunspot.sta are annual observations, it can be concluded that there is a distinct 11-year (perhaps slightly longer than 11-year) cycle in sunspot activity.
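The conversion from frequency to period can be checked with simple arithmetic. The sketch below assumes one observation per year from 1749 to 1924 with no gaps (176 values), so that the periodogram is evaluated at frequencies k/176; the three frequencies are those reported in the results dialog.

```python
# Assumption: one observation per year, 1749..1924 inclusive.
n_obs = 1924 - 1749 + 1

reported_frequencies = [0.0852, 0.0909, 0.0114]   # from the results dialog

periods = []
for f in reported_frequencies:
    k = round(f * n_obs)            # index of the nearest Fourier frequency
    period = n_obs / k              # period = 1 / frequency = n / k
    periods.append(period)
    print(f"f = {f:.4f}  ->  k = {k:2d}, period = {period:.2f} years")
```

Under these assumptions the three peaks correspond to periods of about 11.7, 11 and 88 years, consistent with an 11-year (or slightly longer) cycle and a much longer one.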

Spectral density. To calculate spectral density estimates, the periodogram is usually smoothed to remove random fluctuations. The type of weighted moving average and the window width can be selected in the Spectral Windows section. The Introductory Overview section discusses these options in detail. For our example, leave the default window selected (Hamming, width 5) and select the Spectral Density graph.
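The smoothing step can be sketched as a weighted moving average over the periodogram. The code below is an illustration under assumptions: the Hamming weights are taken as w_j = 0.54 + 0.46 cos(pi j / p) for j = -p..p with p = (width - 1)/2, normalized to sum to 1 (one common definition; a given package may differ), and the ends of the periodogram are handled by reflection, which is a choice made here for simplicity. The function names are hypothetical.

```python
import numpy as np

def hamming_weights(width):
    """Normalized Hamming weights for an odd window width."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    p = (width - 1) // 2
    if p == 0:
        return np.array([1.0])
    j = np.arange(-p, p + 1)
    w = 0.54 + 0.46 * np.cos(np.pi * j / p)
    return w / w.sum()

def smooth_periodogram(power, width=5):
    """Weighted moving average of periodogram values (reflected at the ends)."""
    power = np.asarray(power, dtype=float)
    w = hamming_weights(width)
    p = (width - 1) // 2
    padded = np.pad(power, p, mode="reflect")
    return np.convolve(padded, w, mode="valid")   # same length as `power`

weights = hamming_weights(5)
print("weights:", np.round(weights, 4))

# Hypothetical noisy periodogram with one isolated peak.
rng = np.random.default_rng(1)
power = rng.exponential(1.0, size=89)
power[16] += 50.0
density = smooth_periodogram(power, width=5)
print("peak stays at index:", int(np.argmax(density)))
```

A wider window gives a smoother estimate but blurs neighbouring peaks, which is why the window width is left as a parameter.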



The two peaks are now even more distinct. Let's look at the periodogram values by period. Select the Period field in the Plot section. Now select the Spectral Density graph.



Again, it can be seen that there is a pronounced 11-year cycle in sunspot activity; moreover, there are signs of a longer cycle of approximately 80-90 years.

Any wave of complex shape can be represented as a sum of simple waves.

Joseph Fourier was very keen to describe in mathematical terms how heat passes through solid objects (see Heat exchange). His interest in heat may have been sparked while he was in North Africa: Fourier accompanied Napoleon on the French expedition to Egypt and lived there for some time. To achieve his goal, Fourier had to develop new mathematical methods. The results of his research were published in 1822 in the work “Analytical Theory of Heat” (Théorie analytique de la chaleur), where he explained how to analyze complex physical problems by decomposing them into a number of simpler ones.

The analysis method was based on the so-called Fourier series. In accordance with the principle of interference, the series begins with the decomposition of a complex form into simple ones: for example, a change in the earth's surface is explained by an earthquake, a change in the orbit of a comet by the attraction of several planets, a change in the flow of heat by its passage through an irregularly shaped obstacle made of heat-insulating material. Fourier showed that a complex waveform can be represented as a sum of simple waves. As a rule, the equations describing classical systems can be easily solved for each of these simple waves. Fourier then showed how these simple solutions can be summed to obtain a solution to the complex problem as a whole. (Mathematically speaking, a Fourier series is a method of representing a function as a sum of harmonics, that is, sine and cosine waves, which is why Fourier analysis was also known as “harmonic analysis.”)
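The idea of building a complex waveform from simple waves can be illustrated numerically. The sketch below is an example chosen for illustration (it does not come from the text): it assumes NumPy and uses the standard Fourier series of a square wave, (4/pi) * sum of sin((2k-1)x)/(2k-1), to show that adding more harmonics brings the sum closer to the target shape.

```python
import numpy as np

def square_wave_partial_sum(x, n_terms):
    """Sum of the first n_terms odd harmonics of the square-wave series."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for k in range(1, n_terms + 1):
        h = 2 * k - 1                       # odd harmonic number
        total += np.sin(h * x) / h
    return 4.0 / np.pi * total

# Grid over one period, shifted by half a step to avoid the jump points.
n_points = 2000
x = (np.arange(n_points) + 0.5) * 2.0 * np.pi / n_points
target = np.sign(np.sin(x))                 # the square wave itself

errors = {}
for n_terms in (1, 5, 50):
    approx = square_wave_partial_sum(x, n_terms)
    errors[n_terms] = float(np.sqrt(np.mean((approx - target) ** 2)))
    print(f"{n_terms:2d} harmonics: RMS error = {errors[n_terms]:.4f}")
```

The root-mean-square error shrinks as harmonics are added, although the overshoot near the jumps never disappears entirely.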

Before the advent of computers in the mid-twentieth century, Fourier methods and similar techniques were the best weapon in the scientific arsenal for attacking the complexities of nature. With the advent of Fourier methods, scientists were able to solve not only simple problems that yield to direct application of Newton's laws of mechanics and other fundamental equations, but also far more complex ones. Many of the great achievements of Newtonian science in the 19th century would in fact have been impossible without the methods pioneered by Fourier. Subsequently, these methods were used to solve problems in various fields, from astronomy to mechanical engineering.

Jean-Baptiste Joseph Fourier, 1768-1830

French mathematician. Born in Auxerre; at the age of nine he was left an orphan. From a young age he showed an aptitude for mathematics. Fourier was educated at a church school and a military school, then worked as a mathematics teacher. Throughout his life he was actively involved in politics; he was arrested in 1794 for defending victims of the Terror. After Robespierre's death he was released from prison and took part in the creation of the famous Polytechnic School (École Polytechnique) in Paris; his position there provided him with a springboard for advancement under Napoleon's regime. He accompanied Napoleon to Egypt and was appointed governor of Lower Egypt. Upon returning to France in 1801, he was appointed governor of one of the provinces. In 1822 he became permanent secretary of the French Academy of Sciences, an influential position in the French scientific world.


