LeanMachineLearning.SequentialLearning.Deterministic

Deterministic algorithms and environments

A deterministic algorithm chooses its actions deterministically: the action is given by a measurable function of the history instead of a general Markov kernel. Similarly, a deterministic environment produces its feedback deterministically.

Main definitions

We introduce two typeclasses, IsDeterministicAlg and IsDeterministicEnv, to express that an algorithm or an environment is deterministic. We also define the initial action and the next action of a deterministic algorithm, and the feedback functions of a deterministic environment. Finally, we construct a deterministic algorithm and a deterministic environment from measurable functions.

IsDeterministicAlg alg: a typeclass expressing that the algorithm alg is deterministic.
IsDeterministicEnv env: a typeclass expressing that the environment env is deterministic.
actionZero alg: the initial action of a deterministic algorithm alg.
nextAction alg n: the function that gives the next action of a deterministic algorithm alg at step n, as a function of the history.
feedbackFunZero env: the function that gives the initial feedback of a deterministic environment env.
feedbackFun env n: the function that gives the feedback of a deterministic environment env at step n, as a function of the history and the current action.
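detAlgorithm nextA h_next action0: a deterministic algorithm that chooses its action according to the measurable functions nextA n (with proof of measurability h_next), with initial action action0.
detEnvironment f0 hf0 f hf: a deterministic environment whose initial feedback is given by the measurable function f0 (with proof of measurability hf0) and whose feedback at step n is given by the measurable function f n (with proof of measurability hf).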
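As a minimal sketch of the detAlgorithm construction listed above, the following builds the algorithm that plays one fixed action at every step. The name constAlgorithm is hypothetical and is not part of the library; the snippet assumes only the signature of detAlgorithm and Mathlib's measurable_const.

```lean
import LeanMachineLearning.SequentialLearning.Deterministic

open Learning

variable {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]

/-- Hypothetical example: the algorithm that ignores the history and always plays `a`. -/
noncomputable def constAlgorithm (a : 𝓐) : Algorithm 𝓐 𝓨 :=
  detAlgorithm (fun _ _ => a) (fun _ => measurable_const) a

-- An algorithm built with `detAlgorithm` is deterministic by instance search.
example (a : 𝓐) :
    IsDeterministicAlg
      (detAlgorithm (𝓨 := 𝓨) (fun _ _ => a) (fun _ => measurable_const) a) :=
  inferInstance
```

The instance is stated on the detAlgorithm expression itself rather than on constAlgorithm, since instance search is not expected to unfold an ordinary definition.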

class Learning.IsDeterministicAlg {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) :

An algorithm is deterministic if its initial action is a fixed element of 𝓐 and each subsequent action is given by a measurable function of the history (rather than by a possibly random kernel).

exists_action0 : ∃ (action0 : 𝓐), alg.p0 = MeasureTheory.Measure.dirac action0
exists_nextAction (n : ℕ) : ∃ (nextAction : (↥(Finset.Iic n) → 𝓐 × 𝓨) → 𝓐) (h_meas : Measurable nextAction), alg.policy n = ProbabilityTheory.Kernel.deterministic nextAction h_meas

noncomputable def Learning.actionZero {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) [h_det : IsDeterministicAlg alg] :

𝓐

The initial action of a deterministic algorithm.

Equations

Learning.actionZero alg = ⋯.choose

noncomputable def Learning.nextAction {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) [h_det : IsDeterministicAlg alg] (n : ℕ) :

(↥(Finset.Iic n) → 𝓐 × 𝓨) → 𝓐

The next action of a deterministic algorithm after step n.

Equations

Learning.nextAction alg n = ⋯.choose

theorem Learning.measurable_nextAction {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) [IsDeterministicAlg alg] (n : ℕ) :

Measurable (nextAction alg n)

theorem Learning.p0_eq_dirac {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) [h_det : IsDeterministicAlg alg] :

alg.p0 = MeasureTheory.Measure.dirac (actionZero alg)

theorem Learning.policy_eq_deterministic {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) [h_det : IsDeterministicAlg alg] (n : ℕ) :

alg.policy n = ProbabilityTheory.Kernel.deterministic (nextAction alg n) ⋯
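The three accessor lemmas above can be used as follows. This is a sketch that relies only on the signatures shown on this page; the map H standing for a history is a hypothetical name introduced for illustration.

```lean
import LeanMachineLearning.SequentialLearning.Deterministic

open Learning MeasureTheory ProbabilityTheory

variable {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]

-- The initial law of a deterministic algorithm is a Dirac mass at `actionZero alg`.
example (alg : Algorithm 𝓐 𝓨) [IsDeterministicAlg alg] :
    alg.p0 = Measure.dirac (actionZero alg) :=
  p0_eq_dirac alg

-- The policy at step `n` is the deterministic kernel of `nextAction alg n`.
example (alg : Algorithm 𝓐 𝓨) [IsDeterministicAlg alg] (n : ℕ) :
    alg.policy n = Kernel.deterministic (nextAction alg n) (measurable_nextAction alg n) :=
  policy_eq_deterministic alg n

-- Composing `nextAction alg n` with a measurable map `H` (hypothetical) stays measurable.
example (alg : Algorithm 𝓐 𝓨) [IsDeterministicAlg alg] (n : ℕ)
    {Ω : Type*} [MeasurableSpace Ω] {H : Ω → (Finset.Iic n → 𝓐 × 𝓨)}
    (hH : Measurable H) :
    Measurable (fun ω => nextAction alg n (H ω)) :=
  (measurable_nextAction alg n).comp hH
```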

theorem Learning.IsDeterministicAlg.hasLaw_action_zero_of_IsAlgEnvSeqUntil {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {N : ℕ} [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) :

ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac (actionZero alg)) P

theorem Learning.IsDeterministicAlg.action_zero_of_IsAlgEnvSeqUntil {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {N : ℕ} [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) :

A 0 =ᵐ[P] fun (x : Ω) => actionZero alg

theorem Learning.IsDeterministicAlg.action_ae_eq_of_IsAlgEnvSeqUntil {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {n N : ℕ} [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) (hn : n < N) :

A (n + 1) =ᵐ[P] fun (ω : Ω) => nextAction alg n (history A Y n ω)

theorem Learning.IsDeterministicAlg.hasLaw_action_zero {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) :

ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac (actionZero alg)) P

theorem Learning.IsDeterministicAlg.action_zero_ae_eq {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) :

A 0 =ᵐ[P] fun (x : Ω) => actionZero alg

theorem Learning.IsDeterministicAlg.action_ae_eq {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) (n : ℕ) :

A (n + 1) =ᵐ[P] fun (ω : Ω) => nextAction alg n (history A Y n ω)

theorem Learning.IsDeterministicAlg.action_ae_all_eq {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) :

∀ᵐ (ω : Ω) ∂P, A 0 ω = actionZero alg ∧ ∀ (n : ℕ), A (n + 1) ω = nextAction alg n (history A Y n ω)
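The statement action_ae_all_eq bundles the almost-sure identities for all steps into a single almost-sure event. As a sketch under the hypotheses listed in its signature, an individual step can be recovered from it with filter_upwards:

```lean
import LeanMachineLearning.SequentialLearning.Deterministic

open Learning MeasureTheory ProbabilityTheory

variable {𝓐 𝓨 Ω : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω]
  [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨]

example {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} [IsDeterministicAlg alg]
    {P : Measure Ω} [IsFiniteMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨}
    (h : IsAlgEnvSeq A Y alg env P) (n : ℕ) :
    ∀ᵐ ω ∂P, A (n + 1) ω = nextAction alg n (history A Y n ω) := by
  -- On the almost-sure event given by `action_ae_all_eq`, every step agrees.
  filter_upwards [IsDeterministicAlg.action_ae_all_eq h] with ω hω
  exact hω.2 n
```

The bundled form is convenient when an argument needs all steps to hold at the same ω, since the quantifier over n sits inside the almost-sure statement.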

class Learning.IsDeterministicEnv {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (env : Environment 𝓐 𝓨) :

An environment is deterministic if its initial feedback and its subsequent feedback are determined by measurable functions (rather than by possibly random kernels).

exists_f0 : ∃ (f0 : 𝓐 → 𝓨) (hf0 : Measurable f0), env.ν0 = ProbabilityTheory.Kernel.deterministic f0 hf0
exists_f (n : ℕ) : ∃ (f : (↥(Finset.Iic n) → 𝓐 × 𝓨) × 𝓐 → 𝓨) (hf : Measurable f), env.feedback n = ProbabilityTheory.Kernel.deterministic f hf

noncomputable def Learning.feedbackFunZero {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (env : Environment 𝓐 𝓨) [h_det : IsDeterministicEnv env] :

𝓐 → 𝓨

The initial feedback function of a deterministic environment.

Equations

Learning.feedbackFunZero env = ⋯.choose

theorem Learning.measurable_feedbackFunZero {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (env : Environment 𝓐 𝓨) [IsDeterministicEnv env] :

Measurable (feedbackFunZero env)

theorem Learning.ν0_eq_deterministic {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (env : Environment 𝓐 𝓨) [IsDeterministicEnv env] :

env.ν0 = ProbabilityTheory.Kernel.deterministic (feedbackFunZero env) ⋯

noncomputable def Learning.feedbackFun {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (env : Environment 𝓐 𝓨) [h_det : IsDeterministicEnv env] (n : ℕ) :

(↥(Finset.Iic n) → 𝓐 × 𝓨) × 𝓐 → 𝓨

The feedback function of a deterministic environment at step n.

Equations

Learning.feedbackFun env n = ⋯.choose

theorem Learning.measurable_feedbackFun {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (env : Environment 𝓐 𝓨) [IsDeterministicEnv env] (n : ℕ) :

Measurable (feedbackFun env n)

theorem Learning.feedback_eq_deterministic {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (env : Environment 𝓐 𝓨) [IsDeterministicEnv env] (n : ℕ) :

env.feedback n = ProbabilityTheory.Kernel.deterministic (feedbackFun env n) ⋯

theorem Learning.IsDeterministicEnv.hasCondDistrib_feedback_zero {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} [h_det : IsDeterministicEnv env] (h : IsAlgEnvSeq A Y alg env P) :

ProbabilityTheory.HasCondDistrib (Y 0) (A 0) (ProbabilityTheory.Kernel.deterministic (feedbackFunZero env) ⋯) P

theorem Learning.IsDeterministicEnv.hasCondDistrib_feedback {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} [h_det : IsDeterministicEnv env] (h : IsAlgEnvSeq A Y alg env P) (n : ℕ) :

ProbabilityTheory.HasCondDistrib (Y (n + 1)) (fun (ω : Ω) => (history A Y n ω, A (n + 1) ω)) (ProbabilityTheory.Kernel.deterministic (feedbackFun env n) ⋯) P

noncomputable def Learning.detAlgorithm {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (nextA : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) → 𝓐) (h_next : ∀ (n : ℕ), Measurable (nextA n)) (action0 : 𝓐) :

Algorithm 𝓐 𝓨

A deterministic algorithm, which starts with the action action0 and then chooses the action given by the function nextA.

Equations

Learning.detAlgorithm nextA h_next action0 = { policy := fun (n : ℕ) => ProbabilityTheory.Kernel.deterministic (nextA n) ⋯, h_policy := ⋯, p0 := MeasureTheory.Measure.dirac action0, hp0 := ⋯ }

@[simp]

theorem Learning.detAlgorithm_p0 {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (nextA : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) → 𝓐) (h_next : ∀ (n : ℕ), Measurable (nextA n)) (action0 : 𝓐) :

(detAlgorithm nextA h_next action0).p0 = MeasureTheory.Measure.dirac action0

@[simp]

theorem Learning.detAlgorithm_policy {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (nextA : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) → 𝓐) (h_next : ∀ (n : ℕ), Measurable (nextA n)) (action0 : 𝓐) (n : ℕ) :

(detAlgorithm nextA h_next action0).policy n = ProbabilityTheory.Kernel.deterministic (nextA n) ⋯

instance Learning.instIsDeterministicAlgDetAlgorithm {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {nextA : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) → 𝓐} {h_next : ∀ (n : ℕ), Measurable (nextA n)} {action0 : 𝓐} :

IsDeterministicAlg (detAlgorithm nextA h_next action0)

@[simp]

theorem Learning.actionZero_detAlgorithm {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {nextA : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) → 𝓐} {h_next : ∀ (n : ℕ), Measurable (nextA n)} {action0 : 𝓐} [MeasurableSpace.SeparatesPoints 𝓐] :

actionZero (detAlgorithm nextA h_next action0) = action0

@[simp]

theorem Learning.nextAction_detAlgorithm {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {nextA : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) → 𝓐} {h_next : ∀ (n : ℕ), Measurable (nextA n)} {action0 : 𝓐} [MeasurableSpace.SeparatesPoints 𝓐] (n : ℕ) :

nextAction (detAlgorithm nextA h_next action0) n = nextA n
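The two lemmas above recover the data passed to detAlgorithm. A short sketch of their use follows; it assumes only the signatures shown on this page. The SeparatesPoints hypothesis is presumably needed because actionZero and nextAction are extracted from existential statements, so identifying them with the original data requires that equal Dirac measures come from equal points.

```lean
import LeanMachineLearning.SequentialLearning.Deterministic

open Learning

variable {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]

example [MeasurableSpace.SeparatesPoints 𝓐]
    (nextA : (n : ℕ) → (Finset.Iic n → 𝓐 × 𝓨) → 𝓐)
    (h_next : ∀ n, Measurable (nextA n)) (a₀ : 𝓐) (n : ℕ) :
    actionZero (detAlgorithm nextA h_next a₀) = a₀ ∧
      nextAction (detAlgorithm nextA h_next a₀) n = nextA n :=
  -- Both components are exactly the lemmas stated above.
  ⟨actionZero_detAlgorithm, nextAction_detAlgorithm n⟩
```

Since both lemmas are tagged @[simp], a call to simp is expected to perform the same rewriting inside larger goals.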

noncomputable def Learning.detEnvironment {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (f0 : 𝓐 → 𝓨) (hf0 : Measurable f0) (f : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) × 𝓐 → 𝓨) (hf : ∀ (n : ℕ), Measurable (f n)) :

Environment 𝓐 𝓨

A deterministic environment, where the feedback is given by evaluating fixed measurable functions.

instance Learning.instIsDeterministicEnvDetEnvironment {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {f0 : 𝓐 → 𝓨} {hf0 : Measurable f0} {f : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) × 𝓐 → 𝓨} {hf : ∀ (n : ℕ), Measurable (f n)} :

IsDeterministicEnv (detEnvironment f0 hf0 f hf)

@[simp]

theorem Learning.feedbackFunZero_detEnvironment {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {f0 : 𝓐 → 𝓨} {hf0 : Measurable f0} {f : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) × 𝓐 → 𝓨} {hf : ∀ (n : ℕ), Measurable (f n)} [MeasurableSpace.SeparatesPoints 𝓨] :

feedbackFunZero (detEnvironment f0 hf0 f hf) = f0

@[simp]

theorem Learning.feedbackFun_detEnvironment {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {f0 : 𝓐 → 𝓨} {hf0 : Measurable f0} {f : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) × 𝓐 → 𝓨} {hf : ∀ (n : ℕ), Measurable (f n)} [MeasurableSpace.SeparatesPoints 𝓨] (n : ℕ) :

feedbackFun (detEnvironment f0 hf0 f hf) n = f n
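As a sketch of how detEnvironment might be used, the following builds an environment whose feedback depends only on the current action through a measurable function g. The name actionOnlyEnv and the function g are hypothetical; the snippet assumes only the signatures shown on this page and Mathlib's measurable_snd.

```lean
import LeanMachineLearning.SequentialLearning.Deterministic

open Learning

variable {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]

/-- Hypothetical example: the feedback is `g` applied to the current action,
ignoring the history. -/
noncomputable def actionOnlyEnv (g : 𝓐 → 𝓨) (hg : Measurable g) : Environment 𝓐 𝓨 :=
  detEnvironment g hg (fun _ p => g p.2) (fun _ => hg.comp measurable_snd)

-- When measurable sets separate the points of `𝓨`, the initial feedback function is `g`.
example [MeasurableSpace.SeparatesPoints 𝓨] (g : 𝓐 → 𝓨) (hg : Measurable g) :
    feedbackFunZero
      (detEnvironment g hg (fun _ p => g p.2) (fun _ => hg.comp measurable_snd)) = g :=
  feedbackFunZero_detEnvironment
```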

theorem Learning.IsAlgEnvSeq.hasLaw_action_zero_detAlgorithm {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {nextA : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) → 𝓐} {h_next : ∀ (n : ℕ), Measurable (nextA n)} {action0 : 𝓐} {env : Environment 𝓐 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) :

ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac action0) P

theorem Learning.IsAlgEnvSeq.action_zero_detAlgorithm {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {nextA : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) → 𝓐} {h_next : ∀ (n : ℕ), Measurable (nextA n)} {action0 : 𝓐} {env : Environment 𝓐 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) :

A 0 =ᵐ[P] fun (x : Ω) => action0

theorem Learning.IsAlgEnvSeq.action_detAlgorithm_ae_eq {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {nextA : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) → 𝓐} {h_next : ∀ (n : ℕ), Measurable (nextA n)} {action0 : 𝓐} {env : Environment 𝓐 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) (n : ℕ) :

A (n + 1) =ᵐ[P] fun (ω : Ω) => nextA n (history A Y n ω)

theorem Learning.IsAlgEnvSeq.action_detAlgorithm_ae_all_eq {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {nextA : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) → 𝓐} {h_next : ∀ (n : ℕ), Measurable (nextA n)} {action0 : 𝓐} {env : Environment 𝓐 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) :

∀ᵐ (ω : Ω) ∂P, A 0 ω = action0 ∧ ∀ (n : ℕ), A (n + 1) ω = nextA n (history A Y n ω)
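The lemmas for detAlgorithm avoid going through actionZero and nextAction, so they do not need SeparatesPoints. As an illustration under the hypotheses of action_detAlgorithm_ae_eq, an algorithm that always plays a fixed action a (a hypothetical choice of nextA) yields actions that are almost surely equal to a:

```lean
import LeanMachineLearning.SequentialLearning.Deterministic

open Learning MeasureTheory ProbabilityTheory

variable {𝓐 𝓨 Ω : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω]
  [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨]

example {env : Environment 𝓐 𝓨} {P : Measure Ω} [IsProbabilityMeasure P]
    {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (a : 𝓐)
    (h : IsAlgEnvSeq A Y
      (detAlgorithm (fun _ _ => a) (fun _ => measurable_const) a) env P)
    (n : ℕ) :
    A (n + 1) =ᵐ[P] fun _ => a :=
  -- `nextA n` is the constant function, so the right-hand side reduces to `a`.
  IsAlgEnvSeq.action_detAlgorithm_ae_eq h n
```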

theorem Learning.IsAlgEnvSeqUntil.hasLaw_action_zero_detAlgorithm {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {nextA : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) → 𝓐} {h_next : ∀ (n : ℕ), Measurable (nextA n)} {action0 : 𝓐} {env : Environment 𝓐 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {N : ℕ} (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) :

ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac action0) P

theorem Learning.IsAlgEnvSeqUntil.action_zero_detAlgorithm {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {nextA : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) → 𝓐} {h_next : ∀ (n : ℕ), Measurable (nextA n)} {action0 : 𝓐} {env : Environment 𝓐 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {N : ℕ} (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) :

A 0 =ᵐ[P] fun (x : Ω) => action0

theorem Learning.IsAlgEnvSeqUntil.action_detAlgorithm_ae_eq {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {nextA : (n : ℕ) → (↥(Finset.Iic n) → 𝓐 × 𝓨) → 𝓐} {h_next : ∀ (n : ℕ), Measurable (nextA n)} {action0 : 𝓐} {env : Environment 𝓐 𝓨} {Ω : Type u_3} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {N n : ℕ} (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) (hn : n < N) :

A (n + 1) =ᵐ[P] fun (ω : Ω) => nextA n (history A Y n ω)