This article is from the Nonlinear Science FAQ, by James D. Meiss jdm@boulder.colorado.edu with numerous contributions by others.

(Thanks to Justin Lipton for contributing to this answer)

How can I tell if my data is deterministic? This is a very tricky problem. It is difficult because in practice no time series consists of pure 'signal.' There will always be some form of corrupting noise, even if it is present only as round-off or truncation error, or as a result of finite arithmetic or quantization. Thus any real time series, even if mostly deterministic, will be a stochastic process.

All methods for distinguishing deterministic and stochastic processes rely on the fact that a deterministic system will always evolve in the same way from a given starting point. Thus, given a time series that we are testing for determinism, we

(1) pick a test state,

(2) search the time series for a similar or 'nearby' state, and

(3) compare their respective time evolutions.

Define the error as the difference between the time evolution of the 'test' state and the time evolution of the nearby state. A deterministic system will have an error that either remains small (stable, regular solution) or increases exponentially with time (chaotic solution). A stochastic system will have a randomly distributed error.
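The three steps above can be sketched in Python with NumPy. This is a minimal illustration, not a production test: the embedding dimension, delay, prediction horizon, and the window excluding temporal neighbours are all arbitrary illustrative choices, and the test state is simply taken to be the first embedded state.

```python
import numpy as np

def determinism_error(series, m=3, tau=1, horizon=20, exclude=10):
    """Error between the evolution of a test state (the first embedded
    state) and that of its nearest 'nearby' state.  All parameter
    values here are illustrative choices, not prescriptions."""
    n = len(series) - (m - 1) * tau - horizon
    # Delay-embedded states: x_i = (s_i, s_{i+tau}, ..., s_{i+(m-1)tau})
    states = np.array([series[i : i + m * tau : tau] for i in range(n)])
    # (1)-(2) pick the test state and find the closest other state,
    # excluding states too close in time to the test state itself
    dists = np.linalg.norm(states - states[0], axis=1)
    dists[:exclude] = np.inf
    j = int(np.argmin(dists))
    # (3) compare the two time evolutions step by step
    return np.array([abs(series[k] - series[j + k]) for k in range(horizon)])

t = np.arange(2000)
err_det = determinism_error(np.sin(0.05 * t))                       # deterministic signal
err_sto = determinism_error(np.random.default_rng(0).random(2000))  # white noise
```

For the sine wave the error stays small at every step, while for white noise the nearby state's future is unrelated to the test state's future, so the error is large and irregular.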

Essentially all measures of determinism taken from time series rely upon finding the closest states to a given 'test' state (e.g., the correlation dimension, Lyapunov exponents, etc.). To define the state of a system one typically relies on phase space embedding methods; see [3.14].
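As a sketch, a delay-coordinate embedding of the kind referenced in [3.14] can be built from a scalar series as follows; the choice of dimension m and delay tau is left to the user, and nothing here prescribes how to choose them.

```python
import numpy as np

def delay_embed(series, m, tau):
    """Map a scalar series s_0, s_1, ... to m-dimensional states
    x_i = (s_i, s_{i+tau}, ..., s_{i+(m-1)tau})."""
    n = len(series) - (m - 1) * tau
    return np.array([series[i : i + m * tau : tau] for i in range(n)])

X = delay_embed(np.arange(10.0), m=3, tau=2)  # X[0] is [0., 2., 4.]
```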

Typically one chooses an embedding dimension and investigates the propagation of the error between two nearby states. If the error looks random, one increases the dimension. If you can increase the dimension until the error looks deterministic, then you are done. Though it may sound simple, it is not! One complication is that as the dimension increases, the search for a nearby state requires much more computation time and much more data (the amount of data required increases exponentially with embedding dimension) to find a suitably close candidate. If the embedding dimension (the number of measurements per state) is chosen too small (less than the 'true' value), deterministic data can appear to be random, but in theory there is no problem with choosing the dimension too large--the method will still work. In practice, anything approaching about 10 dimensions is considered so large that a stochastic description is probably more suitable and convenient anyway.
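The dimension sweep can be sketched as below, again as a self-contained illustration with arbitrary parameter choices. For white noise the nearest-neighbour error stays large at every embedding dimension, so no choice of m makes the data look deterministic; note that judging whether an error sequence 'looks random' in general requires a proper statistical test, which this sketch omits.

```python
import numpy as np

def mean_error(series, m, tau=1, horizon=30, exclude=10):
    # Mean divergence between a test state (the first embedded state)
    # and its nearest neighbour in an m-dimensional delay embedding.
    n = len(series) - (m - 1) * tau - horizon
    states = np.array([series[i : i + m * tau : tau] for i in range(n)])
    d = np.linalg.norm(states - states[0], axis=1)
    d[:exclude] = np.inf  # skip temporal neighbours of the test state
    j = int(np.argmin(d))
    return np.mean([abs(series[k] - series[j + k]) for k in range(horizon)])

noise = np.random.default_rng(1).random(3000)
errs = [mean_error(noise, m) for m in range(1, 6)]
# The error does not shrink as m grows: the series behaves stochastically.
```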

See e.g.,

Sugihara, G. and R. M. May (1990). "Nonlinear Forecasting as a Way of Distinguishing Chaos from Measurement Error in Time Series." Nature 344: 734-740.
