Algorithms and environments #

We define structures for stochastic, sequential algorithms and environments, and the notion of an algorithm-environment sequence, which is a sequence of actions and feedbacks generated by an algorithm interacting with an environment.

Main definitions #

Algorithm 𝓐 𝓨: a stochastic, sequential algorithm.
Environment 𝓐 𝓨: a stochastic environment.
IsAlgEnvSeq A Y alg env P: an algorithm-environment sequence. That is, a sequence of actions A and feedback Y, defined on a probability space (Ω, P), that have the correct conditional distributions to be generated by an algorithm alg interacting with an environment env.
IsAlgEnvSeqUntil A Y alg env P N: A and Y form an algorithm-environment sequence until time N.
Algorithm.prodLeft 𝓧 alg: an Algorithm 𝓐 (𝓧 × 𝓨) obtained from an algorithm alg : Algorithm 𝓐 𝓨 by ignoring the 𝓧 component of each observation.

source

structure Learning.Algorithm (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] :

Type (max u_4 u_5)

A stochastic, sequential algorithm.

policy (n : ℕ) : ProbabilityTheory.Kernel (↥(Finset.Iic n) → 𝓐 × 𝓨) 𝓐
Policy or sampling rule: distribution of the next action given the history up to time n.
h_policy (n : ℕ) : ProbabilityTheory.IsMarkovKernel (self.policy n)
The policy is a Markov kernel.
p0 : MeasureTheory.Measure 𝓐
Distribution of the first action.
hp0 : MeasureTheory.IsProbabilityMeasure self.p0
The first action distribution is a probability measure.

Instances For

source

instance Learning.instIsMarkovKernelForallSubtypeNatMemFinsetIicProdPolicy {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) (n : ℕ) :

ProbabilityTheory.IsMarkovKernel (alg.policy n)

source

instance Learning.instIsProbabilityMeasureP0 {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) :

MeasureTheory.IsProbabilityMeasure alg.p0
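
As an illustration of how the four fields of Algorithm fit together, the following is a minimal sketch of an algorithm that ignores the history and always samples its action from a fixed probability measure. The name constAlgorithm and the measure μ are hypothetical and not part of the library; the import path is assumed to be the module name shown on this page, and the Markov-kernel instance for Kernel.const is assumed to come from Mathlib.

```lean
import LeanMachineLearning.SequentialLearning.Algorithm

open MeasureTheory ProbabilityTheory

variable {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]

/-- Hypothetical example: an algorithm that samples every action from `μ`,
regardless of the past actions and observations. -/
noncomputable def constAlgorithm (μ : Measure 𝓐) [IsProbabilityMeasure μ] :
    Learning.Algorithm 𝓐 𝓨 where
  -- the policy at every time is the constant kernel with value `μ`
  policy _ := Kernel.const _ μ
  -- a constant kernel at a probability measure is a Markov kernel
  h_policy _ := inferInstance
  -- the first action is also drawn from `μ`
  p0 := μ
  hp0 := inferInstance
```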

source

def Learning.Algorithm.prodLeft {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (𝓧 : Type u_4) [MeasurableSpace 𝓧] (alg : Algorithm 𝓐 𝓨) :

Algorithm 𝓐 (𝓧 × 𝓨)

An algorithm with observations in 𝓧 × 𝓨 obtained from an algorithm with observations in 𝓨 by ignoring the 𝓧 component of each observation.

Equations

One or more equations did not get rendered due to their size.

Instances For

source

@[simp]

theorem Learning.Algorithm.prodLeft_p0 {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (𝓧 : Type u_4) [MeasurableSpace 𝓧] (alg : Algorithm 𝓐 𝓨) :

(prodLeft 𝓧 alg).p0 = alg.p0

source

@[simp]

theorem Learning.Algorithm.prodLeft_policy {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (𝓧 : Type u_4) [MeasurableSpace 𝓧] (alg : Algorithm 𝓐 𝓨) (n : ℕ) :

(prodLeft 𝓧 alg).policy n = (alg.policy n).comap (fun (h : ↥(Finset.Iic n) → 𝓐 × 𝓧 × 𝓨) (i : ↥(Finset.Iic n)) => ((h i).1, (h i).2.2)) ⋯
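
The two lemmas above describe prodLeft completely: the initial distribution is unchanged, and each policy is the original policy precomposed with the map that drops the 𝓧 component. A small usage sketch, assuming the import path matches the module name shown on this page:

```lean
import LeanMachineLearning.SequentialLearning.Algorithm

variable {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]

-- Enlarging the observation space does not change the law of the first action.
-- Since `prodLeft_p0` is tagged `@[simp]`, `by simp` is expected to close this goal as well.
example (𝓧 : Type*) [MeasurableSpace 𝓧] (alg : Learning.Algorithm 𝓐 𝓨) :
    (Learning.Algorithm.prodLeft 𝓧 alg).p0 = alg.p0 :=
  Learning.Algorithm.prodLeft_p0 𝓧 alg
```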

source

structure Learning.Environment (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] :

Type (max u_4 u_5)

A stochastic environment.

feedback (n : ℕ) : ProbabilityTheory.Kernel ((↥(Finset.Iic n) → 𝓐 × 𝓨) × 𝓐) 𝓨
Distribution of the next observation as a function of the past history and the next action.
h_feedback (n : ℕ) : ProbabilityTheory.IsMarkovKernel (self.feedback n)
The feedback kernels are Markov kernels.
ν0 : ProbabilityTheory.Kernel 𝓐 𝓨
Distribution of the first observation given the first action.
hp0 : ProbabilityTheory.IsMarkovKernel self.ν0
The initial observation kernel is a Markov kernel.

Instances For

source

instance Learning.instIsMarkovKernelProdForallSubtypeNatMemFinsetIicFeedback {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (env : Environment 𝓐 𝓨) (n : ℕ) :

ProbabilityTheory.IsMarkovKernel (env.feedback n)

source

instance Learning.instIsMarkovKernelν0 {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (env : Environment 𝓐 𝓨) :

ProbabilityTheory.IsMarkovKernel env.ν0
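
To make the roles of feedback and ν0 concrete, here is a sketch of an environment whose observation depends only on the current action, through a single Markov kernel ν, and not on the history. The name statelessEnvironment and the kernel ν are hypothetical; the sketch assumes the import path shown on this page and Mathlib's Kernel.comap together with its Markov-kernel instance.

```lean
import LeanMachineLearning.SequentialLearning.Algorithm

open MeasureTheory ProbabilityTheory

variable {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]

/-- Hypothetical example: every observation is drawn from `ν` applied to the
current action; the history component of the input is discarded. -/
noncomputable def statelessEnvironment (ν : Kernel 𝓐 𝓨) [IsMarkovKernel ν] :
    Learning.Environment 𝓐 𝓨 where
  -- input is (history, action); keep only the action
  feedback _ := ν.comap Prod.snd measurable_snd
  h_feedback _ := inferInstance
  -- the first observation uses the same kernel
  ν0 := ν
  hp0 := inferInstance
```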

source

noncomputable def Learning.stepKernel {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) (n : ℕ) :

ProbabilityTheory.Kernel (↥(Finset.Iic n) → 𝓐 × 𝓨) (𝓐 × 𝓨)

Kernel describing the distribution of the next action-feedback pair given the history up to n.

Equations

Learning.stepKernel alg env n = (alg.policy n).compProd (env.feedback n)

Instances For

source

instance Learning.instIsMarkovKernelForallSubtypeNatMemFinsetIicProdStepKernel {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) (n : ℕ) :

ProbabilityTheory.IsMarkovKernel (stepKernel alg env n)

source

theorem Learning.stepKernel_def {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) (n : ℕ) :

stepKernel alg env n = (alg.policy n).compProd (env.feedback n)

source

@[simp]

theorem Learning.fst_stepKernel {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) (n : ℕ) :

(stepKernel alg env n).fst = alg.policy n
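
Because stepKernel is the composition-product of two Markov kernels, it assigns a probability measure on 𝓐 × 𝓨 to each history, and its first marginal is the policy. The sketch below checks both facts; it assumes the import path shown on this page and that Mathlib's measure_univ applies through the IsMarkovKernel instance listed above.

```lean
import LeanMachineLearning.SequentialLearning.Algorithm

open MeasureTheory ProbabilityTheory

variable {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]
  (alg : Learning.Algorithm 𝓐 𝓨) (env : Learning.Environment 𝓐 𝓨) (n : ℕ)

-- For every history `h`, the step kernel gives total mass 1.
example (h : Finset.Iic n → 𝓐 × 𝓨) :
    Learning.stepKernel alg env n h Set.univ = 1 :=
  measure_univ

-- Marginalising out the feedback recovers the policy.
example : (Learning.stepKernel alg env n).fst = alg.policy n :=
  Learning.fst_stepKernel alg env n
```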

source

def Learning.step {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) :

𝓐 × 𝓨

Step of the algorithm-environment sequence: the action-feedback pair at time n.

Equations

Learning.step A Y n ω = (A n ω, Y n ω)

Instances For

source

theorem Learning.measurable_step {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (n : ℕ) (hA : Measurable (A n)) (hY : Measurable (Y n)) :

Measurable (step A Y n)

source

def Learning.trajectory {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (ω : Ω) :

ℕ → 𝓐 × 𝓨

A random variable that gives the sequence of action-feedback pairs.

Equations

Learning.trajectory A Y ω n = (A n ω, Y n ω)

Instances For

source

theorem Learning.measurable_trajectory {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (hA : ∀ (n : ℕ), Measurable (A n)) (hR : ∀ (n : ℕ), Measurable (Y n)) :

Measurable (trajectory A Y)

source

def Learning.history {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) :

↥(Finset.Iic n) → 𝓐 × 𝓨

History of the algorithm-environment sequence up to time n.

Equations

Learning.history A Y n ω i = (A (↑i) ω, Y (↑i) ω)

Instances For

source

theorem Learning.measurable_history {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (hA : ∀ (n : ℕ), Measurable (A n)) (hY : ∀ (n : ℕ), Measurable (Y n)) (n : ℕ) :

Measurable (history A Y n)

source

theorem Learning.eval_comp_history {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (n : ℕ) :

(fun (x : ↥(Finset.Iic n) → 𝓐 × 𝓨) => x ⟨n, ⋯⟩) ∘ history A Y n = step A Y n

source

theorem Learning.fst_eval_comp_history {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (n : ℕ) :

(fun (x : ↥(Finset.Iic n) → 𝓐 × 𝓨) => (x ⟨n, ⋯⟩).1) ∘ history A Y n = A n

source

theorem Learning.snd_eval_comp_history {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (n : ℕ) :

(fun (x : ↥(Finset.Iic n) → 𝓐 × 𝓨) => (x ⟨n, ⋯⟩).2) ∘ history A Y n = Y n
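
The three lemmas above say that evaluating the history at its last index n returns the step, the action, or the feedback at time n. A pointwise version can be sketched as follows, assuming the import path shown on this page; the membership proof for the index n is built with Mathlib's Finset.mem_Iic.

```lean
import LeanMachineLearning.SequentialLearning.Algorithm

variable {𝓐 𝓨 Ω : Type*} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω)

-- The action stored at the last index of the history up to `n` is `A n ω`.
example :
    (Learning.history A Y n ω ⟨n, Finset.mem_Iic.mpr le_rfl⟩).1 = A n ω :=
  congrFun (Learning.fst_eval_comp_history (A := A) (Y := Y) n) ω
```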

source

structure Learning.IsAlgEnvSeq {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) (P : MeasureTheory.Measure Ω) [MeasureTheory.IsFiniteMeasure P] :

Prop

An algorithm-environment sequence: a sequence of actions and feedbacks generated by an algorithm interacting with an environment.

measurable_action (n : ℕ) : Measurable (A n)
The action sequence is measurable.
measurable_feedback (n : ℕ) : Measurable (Y n)
The feedback sequence is measurable.
hasLaw_action_zero : ProbabilityTheory.HasLaw (fun (ω : Ω) => A 0 ω) alg.p0 P
The first action has the correct law.
hasCondDistrib_feedback_zero : ProbabilityTheory.HasCondDistrib (Y 0) (A 0) env.ν0 P
The first feedback has the correct conditional distribution.
hasCondDistrib_action (n : ℕ) : ProbabilityTheory.HasCondDistrib (A (n + 1)) (history A Y n) (alg.policy n) P
The next action has the correct conditional distribution given the history.
hasCondDistrib_feedback (n : ℕ) : ProbabilityTheory.HasCondDistrib (Y (n + 1)) (fun (ω : Ω) => (history A Y n ω, A (n + 1) ω)) (env.feedback n) P
The next feedback has the correct conditional distribution given the history and next action.

Instances For

source

structure Learning.IsAlgEnvSeqUntil {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) (P : MeasureTheory.Measure Ω) [MeasureTheory.IsFiniteMeasure P] (N : ℕ) :

Prop

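An algorithm-environment sequence until time N: a sequence of actions and feedbacks whose initial step has the correct law, and whose step n + 1 has the correct conditional distributions for every n < N. No distributional constraint is imposed on later steps.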

measurable_action (n : ℕ) : Measurable (A n)
The action sequence is measurable.
measurable_feedback (n : ℕ) : Measurable (Y n)
The feedback sequence is measurable.
hasLaw_action_zero : ProbabilityTheory.HasLaw (fun (ω : Ω) => A 0 ω) alg.p0 P
The first action has the correct law.
hasCondDistrib_feedback_zero : ProbabilityTheory.HasCondDistrib (Y 0) (A 0) env.ν0 P
The first feedback has the correct conditional distribution.
hasCondDistrib_action (n : ℕ) (hn : n < N) : ProbabilityTheory.HasCondDistrib (A (n + 1)) (history A Y n) (alg.policy n) P
The next action has the correct conditional distribution given the history.
hasCondDistrib_feedback (n : ℕ) (hn : n < N) : ProbabilityTheory.HasCondDistrib (Y (n + 1)) (fun (ω : Ω) => (history A Y n ω, A (n + 1) ω)) (env.feedback n) P
The next feedback has the correct conditional distribution given the history and next action.

Instances For

source

theorem Learning.IsAlgEnvSeqUntil.mono {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] {N : ℕ} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] (h : IsAlgEnvSeqUntil A Y alg env P N) {N' : ℕ} (hN : N' ≤ N) :

IsAlgEnvSeqUntil A Y alg env P N'

source

theorem Learning.IsAlgEnvSeq.isAlgEnvSeqUntil {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] (h : IsAlgEnvSeq A Y alg env P) (N : ℕ) :

IsAlgEnvSeqUntil A Y alg env P N
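
The two lemmas above can be chained: a full algorithm-environment sequence is one until any horizon, and a sequence until N is also one until any smaller horizon. A usage sketch with arbitrary illustrative horizons 5 and 3, assuming the import path shown on this page:

```lean
import LeanMachineLearning.SequentialLearning.Algorithm

open MeasureTheory ProbabilityTheory

variable {𝓐 𝓨 Ω : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω]
  [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨]
  {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨}
  {alg : Learning.Algorithm 𝓐 𝓨} {env : Learning.Environment 𝓐 𝓨}
  {P : Measure Ω} [IsFiniteMeasure P]

-- Restrict a full sequence to horizon 5, then shrink the horizon to 3.
example (h : Learning.IsAlgEnvSeq A Y alg env P) :
    Learning.IsAlgEnvSeqUntil A Y alg env P 3 :=
  (h.isAlgEnvSeqUntil 5).mono (by norm_num)
```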

source

theorem Learning.IsAlgEnvSeq.measurable_step {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] (h : IsAlgEnvSeq A Y alg env P) (n : ℕ) :

Measurable (step A Y n)

source

theorem Learning.IsAlgEnvSeq.measurable_history {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] (h : IsAlgEnvSeq A Y alg env P) (n : ℕ) :

Measurable (history A Y n)

source

theorem Learning.IsAlgEnvSeq.hasLaw_step_zero {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] (h : IsAlgEnvSeq A Y alg env P) :

ProbabilityTheory.HasLaw (step A Y 0) (alg.p0.compProd env.ν0) P

source

theorem Learning.IsAlgEnvSeqUntil.hasLaw_step_zero {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] {N : ℕ} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] (h : IsAlgEnvSeqUntil A Y alg env P N) :

ProbabilityTheory.HasLaw (step A Y 0) (alg.p0.compProd env.ν0) P
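
In terms of pushforward measures, hasLaw_step_zero says that the law of the first action-feedback pair is the composition-product of p0 and ν0. The sketch below extracts that equality; it assumes the import path shown on this page and that Mathlib's HasLaw exposes the pushforward equality as the field map_eq.

```lean
import LeanMachineLearning.SequentialLearning.Algorithm

open MeasureTheory ProbabilityTheory

variable {𝓐 𝓨 Ω : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω]
  [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨]
  {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨}
  {alg : Learning.Algorithm 𝓐 𝓨} {env : Learning.Environment 𝓐 𝓨}
  {P : Measure Ω} [IsFiniteMeasure P]

-- The pushforward of `P` by the first step is `p0 ⊗ ν0`.
example (h : Learning.IsAlgEnvSeq A Y alg env P) :
    P.map (Learning.step A Y 0) = alg.p0.compProd env.ν0 :=
  h.hasLaw_step_zero.map_eq
```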

source

theorem Learning.IsAlgEnvSeq.hasCondDistrib_step {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] (h : IsAlgEnvSeq A Y alg env P) (n : ℕ) :

ProbabilityTheory.HasCondDistrib (step A Y (n + 1)) (history A Y n) (stepKernel alg env n) P

source

theorem Learning.IsAlgEnvSeqUntil.hasCondDistrib_step {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] {N : ℕ} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] (h : IsAlgEnvSeqUntil A Y alg env P N) (n : ℕ) (hn : n < N) :

ProbabilityTheory.HasCondDistrib (step A Y (n + 1)) (history A Y n) (stepKernel alg env n) P

source

theorem Learning.IsAlgEnvSeq.hasLaw_history_zero {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] (h : IsAlgEnvSeq A Y alg env P) :

ProbabilityTheory.HasLaw (history A Y 0) (MeasureTheory.Measure.map (⇑(MeasurableEquiv.piUnique fun (x : ↥(Finset.Iic 0)) => 𝓐 × 𝓨).symm) (MeasureTheory.Measure.map (step A Y 0) P)) P

source

theorem Learning.IsAlgEnvSeq.hasLaw_history_succ {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] (h : IsAlgEnvSeq A Y alg env P) (n : ℕ) :

ProbabilityTheory.HasLaw (history A Y (n + 1)) (MeasureTheory.Measure.map (⇑(MeasurableEquiv.IicSuccProd (fun (x : ℕ) => 𝓐 × 𝓨) n).symm) ((MeasureTheory.Measure.map (history A Y n) P).compProd 𝓛[step A Y (n + 1) | history A Y n; P])) P

source

def Learning.IsAlgEnvSeq.filtration {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (hA : ∀ (n : ℕ), Measurable (A n)) (hY : ∀ (n : ℕ), Measurable (Y n)) :

MeasureTheory.Filtration ℕ mΩ

Filtration generated by the history up to time n.

Equations

Learning.IsAlgEnvSeq.filtration hA hY = { seq := fun (i : ℕ) => MeasurableSpace.comap (Learning.history A Y i) inferInstance, mono' := ⋯, le' := ⋯ }

Instances For

source

theorem Learning.IsAlgEnvSeq.adapted_history {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (hA : ∀ (n : ℕ), Measurable (A n)) (hY : ∀ (n : ℕ), Measurable (Y n)) :

MeasureTheory.Adapted (filtration hA hY) (history A Y)

source

theorem Learning.IsAlgEnvSeq.adapted_step {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (hA : ∀ (n : ℕ), Measurable (A n)) (hY : ∀ (n : ℕ), Measurable (Y n)) :

MeasureTheory.Adapted (filtration hA hY) (step A Y)

source

theorem Learning.IsAlgEnvSeq.adapted_action {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (hA : ∀ (n : ℕ), Measurable (A n)) (hY : ∀ (n : ℕ), Measurable (Y n)) :

MeasureTheory.Adapted (filtration hA hY) A

source

theorem Learning.IsAlgEnvSeq.adapted_feedback {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (hA : ∀ (n : ℕ), Measurable (A n)) (hY : ∀ (n : ℕ), Measurable (Y n)) :

MeasureTheory.Adapted (filtration hA hY) Y
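
The filtration and the adaptedness lemmas take the measurability hypotheses hA and hY explicitly. When an IsAlgEnvSeq hypothesis is available, these can be supplied from its fields measurable_action and measurable_feedback, as in the following sketch (assuming the import path shown on this page):

```lean
import LeanMachineLearning.SequentialLearning.Algorithm

open MeasureTheory ProbabilityTheory

variable {𝓐 𝓨 Ω : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω]
  [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨]
  {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨}
  {alg : Learning.Algorithm 𝓐 𝓨} {env : Learning.Environment 𝓐 𝓨}
  {P : Measure Ω} [IsFiniteMeasure P]

-- The actions are adapted to the filtration generated by the history.
example (h : Learning.IsAlgEnvSeq A Y alg env P) :
    Adapted
      (Learning.IsAlgEnvSeq.filtration h.measurable_action h.measurable_feedback) A :=
  Learning.IsAlgEnvSeq.adapted_action h.measurable_action h.measurable_feedback
```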

source

def Learning.IsAlgEnvSeq.filtrationAction {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} (hA : ∀ (n : ℕ), Measurable (A n)) (hY : ∀ (n : ℕ), Measurable (Y n)) :

MeasureTheory.Filtration ℕ mΩ

Filtration generated by the history at time n-1 together with the action at time n.

Equations

One or more equations did not get rendered due to their size.

Instances For

source

theorem Learning.IsAlgEnvSeq.filtrationAction_zero_eq_comap {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {hA : ∀ (n : ℕ), Measurable (A n)} {hY : ∀ (n : ℕ), Measurable (Y n)} :

↑(filtrationAction hA hY) 0 = MeasurableSpace.comap (A 0) inferInstance

source

theorem Learning.IsAlgEnvSeq.filtrationAction_eq_comap {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {hA : ∀ (n : ℕ), Measurable (A n)} {hY : ∀ (n : ℕ), Measurable (Y n)} (n : ℕ) (hn : n ≠ 0) :

↑(filtrationAction hA hY) n = MeasurableSpace.comap (fun (ω : Ω) => (history A Y (n - 1) ω, A n ω)) inferInstance

Documentation

LeanMachineLearning.SequentialLearning.Algorithm
