3.2.Β SequentialLearning.Algorithm
Algorithms and environments
We define structures for stochastic, sequential algorithms and environments, and the notion of an algorithm-environment sequence, which is a sequence of actions and feedbacks generated by an algorithm interacting with an environment.
Main definitions
-
Algorithm π π¨: a stochastic, sequential algorithm. -
Environment π π¨: a stochastic environment. -
IsAlgEnvSeq A π¨ alg env P: an algorithm-environment sequence. That is, a sequence of actionsAand feedbackYthat have the correct conditional distributions to be generated by an algorithmalginteracting with an environmentenv, defined on a probability space(Ξ©, P). -
IsAlgEnvSeqUntil A Y alg env P N:AandYform an algorithm-environment sequence until timeN. -
prod_left alg: anAlgorithm π (π§ Γ π¨)obtained from an algorithmalg : Algorithm π π¨by ignoring theπ§component of each observation.
Module LeanMachineLearning.SequentialLearning.Algorithm contains 42 exposed declarations.
Algorithmπ
Learning.AlgorithmA stochastic, sequential algorithm.
Learning.Algorithm.{u_4, u_5} (π : Type u_4) (π¨ : Type u_5) [MeasurableSpace π] [MeasurableSpace π¨] : Type (max u_4 u_5)Learning.Algorithm.{u_4, u_5} (π : Type u_4) (π¨ : Type u_5) [MeasurableSpace π] [MeasurableSpace π¨] : Type (max u_4 u_5)
Code
structure Algorithm (π π¨ : Type*) [MeasurableSpace π] [MeasurableSpace π¨] where /-- Policy or sampling rule: distribution of the next action. -/ policy : (n : β) β Kernel (Iic n β π Γ π¨) π /-- The policy is a Markov kernel. -/ [h_policy : β n, IsMarkovKernel (policy n)] /-- Distribution of the first action. -/ p0 : Measure π /-- The first action distribution is a probability measure. -/ [hp0 : IsProbabilityMeasure p0]
Used by (216)
Actions: Source Β· Open Issue
instIsMarkovKernelForallSubtypeNatMemFinsetIicProdPolicyπ
Learning.instIsMarkovKernelForallSubtypeNatMemFinsetIicProdPolicyNo docstring.
Learning.instIsMarkovKernelForallSubtypeNatMemFinsetIicProdPolicy.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (alg : Algorithm π π¨) (n : β) : ProbabilityTheory.IsMarkovKernel (Algorithm.policy alg n)Learning.instIsMarkovKernelForallSubtypeNatMemFinsetIicProdPolicy.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (alg : Algorithm π π¨) (n : β) : ProbabilityTheory.IsMarkovKernel (Algorithm.policy alg n)
Code
instance (alg : Algorithm π π¨) (n : β) : IsMarkovKernel (alg.policy n)
Type uses (1)
Actions: Source Β· Open Issue
Proof
alg.h_policy n
instIsProbabilityMeasureP0π
Learning.instIsProbabilityMeasureP0No docstring.
Learning.instIsProbabilityMeasureP0.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (alg : Algorithm π π¨) : MeasureTheory.IsProbabilityMeasure (Algorithm.p0 alg)Learning.instIsProbabilityMeasureP0.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (alg : Algorithm π π¨) : MeasureTheory.IsProbabilityMeasure (Algorithm.p0 alg)
Code
instance (alg : Algorithm π π¨) : IsProbabilityMeasure alg.p0
Type uses (1)
Used by (13)
Actions: Source Β· Open Issue
Proof
alg.hp0
prodLeftπ
Learning.Algorithm.prodLeft
An algorithm with observations in π§ Γ π¨ obtained from an algorithm with observations in π¨
by ignoring the π§ component of each observation.
Learning.Algorithm.prodLeft.{u_1, u_2, u_4} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (π§ : Type u_4) [MeasurableSpace π§] (alg : Algorithm π π¨) : Algorithm π (π§ Γ π¨)Learning.Algorithm.prodLeft.{u_1, u_2, u_4} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (π§ : Type u_4) [MeasurableSpace π§] (alg : Algorithm π π¨) : Algorithm π (π§ Γ π¨)
Code
def Algorithm.prodLeft (π§ : Type*) [MeasurableSpace π§] (alg : Algorithm π π¨) :
Algorithm π (π§ Γ π¨) where
policy n := (alg.policy n).comap (fun h i β¦ ((h i).1, (h i).2.2)) (by fun_prop)
p0 := alg.p0Type uses (1)
Used by (6)
Actions: Source Β· Open Issue
prodLeft_p0π
Learning.Algorithm.prodLeft_p0No docstring.
Learning.Algorithm.prodLeft_p0.{u_1, u_2, u_4} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (π§ : Type u_4) [MeasurableSpace π§] (alg : Algorithm π π¨) : p0 (prodLeft π§ alg) = p0 algLearning.Algorithm.prodLeft_p0.{u_1, u_2, u_4} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (π§ : Type u_4) [MeasurableSpace π§] (alg : Algorithm π π¨) : p0 (prodLeft π§ alg) = p0 alg
Code
theorem prodLeft_p0 : β {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (π§ : Type u_4)
[inst : MeasurableSpace π§] (alg : Learning.Algorithm π π¨), (Learning.Algorithm.prodLeft π§ alg).p0 = alg.p0Actions: Source Β· Open Issue
Proof
@[simps]
prodLeft_policyπ
Learning.Algorithm.prodLeft_policyNo docstring.
Learning.Algorithm.prodLeft_policy.{u_1, u_2, u_4} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (π§ : Type u_4) [MeasurableSpace π§] (alg : Algorithm π π¨) (n : β) : policy (prodLeft π§ alg) n = ProbabilityTheory.Kernel.comap (policy alg n) (fun h i => (Prod.fst (h i), Prod.snd (Prod.snd (h i)))) β―Learning.Algorithm.prodLeft_policy.{u_1, u_2, u_4} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (π§ : Type u_4) [MeasurableSpace π§] (alg : Algorithm π π¨) (n : β) : policy (prodLeft π§ alg) n = ProbabilityTheory.Kernel.comap (policy alg n) (fun h i => (Prod.fst (h i), Prod.snd (Prod.snd (h i)))) β―
Code
theorem prodLeft_policy : β {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (π§ : Type u_4)
[inst : MeasurableSpace π§] (alg : Learning.Algorithm π π¨) (n : β),
(Learning.Algorithm.prodLeft π§ alg).policy n = (alg.policy n).comap (fun h i => ((h i).1, (h i).2.2)) β―Actions: Source Β· Open Issue
Proof
@[simps]
Environmentπ
Learning.EnvironmentA stochastic environment.
Learning.Environment.{u_4, u_5} (π : Type u_4) (π¨ : Type u_5) [MeasurableSpace π] [MeasurableSpace π¨] : Type (max u_4 u_5)Learning.Environment.{u_4, u_5} (π : Type u_4) (π¨ : Type u_5) [MeasurableSpace π] [MeasurableSpace π¨] : Type (max u_4 u_5)
Code
structure Environment (π π¨ : Type*) [MeasurableSpace π] [MeasurableSpace π¨] where /-- Distribution of the next observation as function of the past history. -/ feedback : (n : β) β Kernel ((Iic n β π Γ π¨) Γ π) π¨ /-- The feedback kernels are Markov kernels. -/ [h_feedback : β n, IsMarkovKernel (feedback n)] /-- Distribution of the first observation given the first action. -/ Ξ½0 : Kernel π π¨ /-- The initial observation kernel is a Markov kernel. -/ [hp0 : IsMarkovKernel Ξ½0]
Actions: Source Β· Open Issue
instIsMarkovKernelProdForallSubtypeNatMemFinsetIicFeedbackπ
Learning.instIsMarkovKernelProdForallSubtypeNatMemFinsetIicFeedbackNo docstring.
Learning.instIsMarkovKernelProdForallSubtypeNatMemFinsetIicFeedback.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (env : Environment π π¨) (n : β) : ProbabilityTheory.IsMarkovKernel (Environment.feedback env n)Learning.instIsMarkovKernelProdForallSubtypeNatMemFinsetIicFeedback.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (env : Environment π π¨) (n : β) : ProbabilityTheory.IsMarkovKernel (Environment.feedback env n)
Code
instance (env : Environment π π¨) (n : β) : IsMarkovKernel (env.feedback n)
Type uses (1)
Used by (5)
Actions: Source Β· Open Issue
Proof
env.h_feedback n
instIsMarkovKernelΞ½0π
Learning.instIsMarkovKernelΞ½0No docstring.
Learning.instIsMarkovKernelΞ½0.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (env : Environment π π¨) : ProbabilityTheory.IsMarkovKernel (Environment.Ξ½0 env)Learning.instIsMarkovKernelΞ½0.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (env : Environment π π¨) : ProbabilityTheory.IsMarkovKernel (Environment.Ξ½0 env)
Code
instance (env : Environment π π¨) : IsMarkovKernel env.Ξ½0
Type uses (1)
Used by (8)
Actions: Source Β· Open Issue
Proof
env.hp0
stepKernelπ
Learning.stepKernel
Kernel describing the distribution of the next action-feedback pair given the history
up to n.
Learning.stepKernel.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (alg : Algorithm π π¨) (env : Environment π π¨) (n : β) : ProbabilityTheory.Kernel (β₯(Finset.Iic n) β π Γ π¨) (π Γ π¨)Learning.stepKernel.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (alg : Algorithm π π¨) (env : Environment π π¨) (n : β) : ProbabilityTheory.Kernel (β₯(Finset.Iic n) β π Γ π¨) (π Γ π¨)
Code
noncomputable
def stepKernel (alg : Algorithm π π¨) (env : Environment π π¨) (n : β) :
Kernel (Iic n β π Γ π¨) (π Γ π¨) :=
alg.policy n ββ env.feedback n
deriving IsMarkovKernelType uses (2)
Actions: Source Β· Open Issue
instIsMarkovKernelForallSubtypeNatMemFinsetIicProdStepKernelπ
Learning.instIsMarkovKernelForallSubtypeNatMemFinsetIicProdStepKernelNo docstring.
Learning.instIsMarkovKernelForallSubtypeNatMemFinsetIicProdStepKernel.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (alg : Algorithm π π¨) (env : Environment π π¨) (n : β) : ProbabilityTheory.IsMarkovKernel (stepKernel alg env n)Learning.instIsMarkovKernelForallSubtypeNatMemFinsetIicProdStepKernel.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (alg : Algorithm π π¨) (env : Environment π π¨) (n : β) : ProbabilityTheory.IsMarkovKernel (stepKernel alg env n)
Code
deriving IsMarkovKernel
Type uses (3)
Body uses (2)
Used by (10)
Actions: Source Β· Open Issue
Proof
deriving IsMarkovKernel
stepKernel_defπ
Learning.stepKernel_defNo docstring.
Learning.stepKernel_def.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (alg : Algorithm π π¨) (env : Environment π π¨) (n : β) : stepKernel alg env n = ProbabilityTheory.Kernel.compProd (Algorithm.policy alg n) (Environment.feedback env n)Learning.stepKernel_def.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (alg : Algorithm π π¨) (env : Environment π π¨) (n : β) : stepKernel alg env n = ProbabilityTheory.Kernel.compProd (Algorithm.policy alg n) (Environment.feedback env n)
Code
lemma stepKernel_def (alg : Algorithm π π¨) (env : Environment π π¨) (n : β) :
stepKernel alg env n = alg.policy n ββ env.feedback nType uses (3)
Actions: Source Β· Open Issue
Proof
rfl
fst_stepKernelπ
Learning.fst_stepKernelNo docstring.
Learning.fst_stepKernel.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (alg : Algorithm π π¨) (env : Environment π π¨) (n : β) : ProbabilityTheory.Kernel.fst (stepKernel alg env n) = Algorithm.policy alg nLearning.fst_stepKernel.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} (alg : Algorithm π π¨) (env : Environment π π¨) (n : β) : ProbabilityTheory.Kernel.fst (stepKernel alg env n) = Algorithm.policy alg n
Code
lemma fst_stepKernel (alg : Algorithm π π¨) (env : Environment π π¨) (n : β) :
(stepKernel alg env n).fst = alg.policy nType uses (3)
Body uses (2)
Used by (1)
Actions: Source Β· Open Issue
Proof
by rw [stepKernel, Kernel.fst_compProd]
stepπ
Learning.step
Step of the algorithm-environment sequence: the action-feedback pair at time n.
Learning.step.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (n : β) (Ο : Ξ©) : π Γ π¨Learning.step.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (n : β) (Ο : Ξ©) : π Γ π¨
Code
def step (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (n : β) (Ο : Ξ©) : π Γ π¨ := (A n Ο, Y n Ο)
Used by (12)
Actions: Source Β· Open Issue
measurable_stepπ
Learning.measurable_stepNo docstring.
Learning.measurable_step.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (n : β) (hA : Measurable (A n)) (hY : Measurable (Y n)) : Measurable (step A Y n)Learning.measurable_step.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (n : β) (hA : Measurable (A n)) (hY : Measurable (Y n)) : Measurable (step A Y n)
Code
lemma measurable_step (n : β) (hA : Measurable (A n)) (hY : Measurable (Y n)) :
Measurable (step A Y n)Type uses (1)
Used by (2)
Actions: Source Β· Open Issue
Proof
by unfold step fun_prop
trajectoryπ
Learning.trajectoryA random variable that gives the sequence of action-feedback pairs.
Learning.trajectory.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (Ο : Ξ©) : β β π Γ π¨Learning.trajectory.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (Ο : Ξ©) : β β π Γ π¨
Code
def trajectory (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (Ο : Ξ©) : β β π Γ π¨ := fun n β¦ (A n Ο, Y n Ο)
Used by (18)
Actions: Source Β· Open Issue
measurable_trajectoryπ
Learning.measurable_trajectoryNo docstring.
Learning.measurable_trajectory.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hR : β (n : β), Measurable (Y n)) : Measurable (trajectory A Y)Learning.measurable_trajectory.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hR : β (n : β), Measurable (Y n)) : Measurable (trajectory A Y)
Code
lemma measurable_trajectory {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β n, Measurable (A n))
(hR : β n, Measurable (Y n)) : Measurable (trajectory A Y)Type uses (1)
Used by (8)
Actions: Source Β· Open Issue
Proof
by unfold trajectory fun_prop
historyπ
Learning.history
History of the algorithm-environment sequence up to time n.
Learning.history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (n : β) (Ο : Ξ©) : β₯(Finset.Iic n) β π Γ π¨Learning.history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (n : β) (Ο : Ξ©) : β₯(Finset.Iic n) β π Γ π¨
Code
def history (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (n : β) (Ο : Ξ©) : Iic n β π Γ π¨ := fun i β¦ (A i Ο, Y i Ο)
Actions: Source Β· Open Issue
measurable_historyπ
Learning.measurable_historyNo docstring.
Learning.measurable_history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hY : β (n : β), Measurable (Y n)) (n : β) : Measurable (history A Y n)Learning.measurable_history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hY : β (n : β), Measurable (Y n)) (n : β) : Measurable (history A Y n)
Code
lemma measurable_history (hA : β n, Measurable (A n))
(hY : β n, Measurable (Y n)) (n : β) :
Measurable (history A Y n)Type uses (1)
Used by (10)
Actions: Source Β· Open Issue
Proof
by unfold history fun_prop
eval_comp_historyπ
Learning.eval_comp_historyNo docstring.
Learning.eval_comp_history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (n : β) : (fun x => x β¨n, β―β©) β history A Y n = step A Y nLearning.eval_comp_history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (n : β) : (fun x => x β¨n, β―β©) β history A Y n = step A Y n
Code
lemma eval_comp_history (n : β) :
(fun x β¦ x β¨n, by simpβ©) β (history A Y n) = step A Y nActions: Source Β· Open Issue
Proof
rfl
fst_eval_comp_historyπ
Learning.fst_eval_comp_historyNo docstring.
Learning.fst_eval_comp_history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (n : β) : (fun x => Prod.fst (x β¨n, β―β©)) β history A Y n = A nLearning.fst_eval_comp_history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (n : β) : (fun x => Prod.fst (x β¨n, β―β©)) β history A Y n = A n
Code
lemma fst_eval_comp_history (n : β) :
(fun x β¦ (x β¨n, by simpβ©).1) β (history A Y n) = A nType uses (1)
Actions: Source Β· Open Issue
Proof
rfl
snd_eval_comp_historyπ
Learning.snd_eval_comp_historyNo docstring.
Learning.snd_eval_comp_history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (n : β) : (fun x => Prod.snd (x β¨n, β―β©)) β history A Y n = Y nLearning.snd_eval_comp_history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (n : β) : (fun x => Prod.snd (x β¨n, β―β©)) β history A Y n = Y n
Code
lemma snd_eval_comp_history (n : β) :
(fun x β¦ (x β¨n, by simpβ©).2) β (history A Y n) = Y nType uses (1)
Actions: Source Β· Open Issue
Proof
rfl
history_succπ
Learning.history_succNo docstring.
Learning.history_succ.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (n : β) : history A Y (n + 1) = β(MeasurableEquiv.symm (MeasurableEquiv.IicSuccProd (fun x => match x with | Nat => π Γ π¨) n)) β fun Ο => (history A Y n Ο, step A Y (n + 1) Ο)Learning.history_succ.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (n : β) : history A Y (n + 1) = β(MeasurableEquiv.symm (MeasurableEquiv.IicSuccProd (fun x => match x with | Nat => π Γ π¨) n)) β fun Ο => (history A Y n Ο, step A Y (n + 1) Ο)
Code
lemma history_succ (n : β) :
history A Y (n + 1) =
(MeasurableEquiv.IicSuccProd (fun β β¦ π Γ π¨) n).symm β
(fun Ο β¦ (history A Y n Ο, step A Y (n + 1) Ο))Type uses (3)
Actions: Source Β· Open Issue
Proof
by funext Ο symm exact (MeasurableEquiv.IicSuccProd (fun _ β¦ π Γ π¨) n).symm_apply_apply (history A Y (n + 1) Ο)
IsAlgEnvSeqπ
Learning.IsAlgEnvSeqAn algorithm-environment sequence: a sequence of actions and feedbacks generated by an algorithm interacting with an environment.
Learning.IsAlgEnvSeq.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (alg : Algorithm π π¨) (env : Environment π π¨) (P : MeasureTheory.Measure Ξ©) [MeasureTheory.IsFiniteMeasure P] : PropLearning.IsAlgEnvSeq.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (alg : Algorithm π π¨) (env : Environment π π¨) (P : MeasureTheory.Measure Ξ©) [MeasureTheory.IsFiniteMeasure P] : Prop
Code
structure IsAlgEnvSeq
(A : β β Ξ© β π) (Y : β β Ξ© β π¨) (alg : Algorithm π π¨) (env : Environment π π¨)
(P : Measure Ξ©) [IsFiniteMeasure P] : Prop where
/-- The action sequence is measurable. -/
measurable_action n : Measurable (A n) := by fun_prop
/-- The feedback sequence is measurable. -/
measurable_feedback n : Measurable (Y n) := by fun_prop
/-- The first action has the correct law. -/
hasLaw_action_zero : HasLaw (fun Ο β¦ (A 0 Ο)) alg.p0 P
/-- The first feedback has the correct conditional distribution. -/
hasCondDistrib_feedback_zero : HasCondDistrib (Y 0) (A 0) env.Ξ½0 P
/-- The next action has the correct conditional distribution given the history. -/
hasCondDistrib_action n :
HasCondDistrib (A (n + 1)) (history A Y n) (alg.policy n) P
/-- The next feedback has the correct conditional distribution given the history and
next action. -/
hasCondDistrib_feedback n :
HasCondDistrib (Y (n + 1)) (fun Ο β¦ (history A Y n Ο, A (n + 1) Ο))
(env.feedback n) PType uses (3)
Used by (111)
Actions: Source Β· Open Issue
IsAlgEnvSeqUntilπ
Learning.IsAlgEnvSeqUntilAn algorithm-environment sequence: a sequence of actions and feedbacks generated by an algorithm interacting with an environment.
Learning.IsAlgEnvSeqUntil.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (alg : Algorithm π π¨) (env : Environment π π¨) (P : MeasureTheory.Measure Ξ©) [MeasureTheory.IsFiniteMeasure P] (N : β) : PropLearning.IsAlgEnvSeqUntil.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (alg : Algorithm π π¨) (env : Environment π π¨) (P : MeasureTheory.Measure Ξ©) [MeasureTheory.IsFiniteMeasure P] (N : β) : Prop
Code
structure IsAlgEnvSeqUntil
(A : β β Ξ© β π) (Y : β β Ξ© β π¨) (alg : Algorithm π π¨) (env : Environment π π¨)
(P : Measure Ξ©) [IsFiniteMeasure P] (N : β) : Prop where
/-- The action sequence is measurable. -/
measurable_action n : Measurable (A n) := by fun_prop
/-- The feedback sequence is measurable. -/
measurable_feedback n : Measurable (Y n) := by fun_prop
/-- The first action has the correct law. -/
hasLaw_action_zero : HasLaw (fun Ο β¦ (A 0 Ο)) alg.p0 P
/-- The first feedback has the correct conditional distribution. -/
hasCondDistrib_feedback_zero : HasCondDistrib (Y 0) (A 0) env.Ξ½0 P
/-- The next action has the correct conditional distribution given the history. -/
hasCondDistrib_action n (hn : n < N) :
HasCondDistrib (A (n + 1)) (history A Y n) (alg.policy n) P
/-- The next feedback has the correct conditional distribution given the history and
next action. -/
hasCondDistrib_feedback n (hn : n < N) :
HasCondDistrib (Y (n + 1)) (fun Ο β¦ (history A Y n Ο, A (n + 1) Ο))
(env.feedback n) PType uses (3)
Used by (22)
Actions: Source Β· Open Issue
monoπ
Learning.IsAlgEnvSeqUntil.monoNo docstring.
Learning.IsAlgEnvSeqUntil.mono.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] {N : β} (h : IsAlgEnvSeqUntil A Y alg env P N) {N' : β} (hN : N' β€ N) : IsAlgEnvSeqUntil A Y alg env P N'Learning.IsAlgEnvSeqUntil.mono.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] {N : β} (h : IsAlgEnvSeqUntil A Y alg env P N) {N' : β} (hN : N' β€ N) : IsAlgEnvSeqUntil A Y alg env P N'
Code
lemma IsAlgEnvSeqUntil.mono (h : IsAlgEnvSeqUntil A Y alg env P N) {N' : β} (hN : N' β€ N) :
IsAlgEnvSeqUntil A Y alg env P N' where
measurable_actionType uses (3)
Body uses (1)
Used by (4)
Actions: Source Β· Open Issue
Proof
h.measurable_action measurable_feedback := h.measurable_feedback hasLaw_action_zero := h.hasLaw_action_zero hasCondDistrib_feedback_zero := h.hasCondDistrib_feedback_zero hasCondDistrib_action n hn := h.hasCondDistrib_action n (hn.trans_le hN) hasCondDistrib_feedback n hn := h.hasCondDistrib_feedback n (hn.trans_le hN)
isAlgEnvSeqUntilπ
Learning.IsAlgEnvSeq.isAlgEnvSeqUntilNo docstring.
Learning.IsAlgEnvSeq.isAlgEnvSeqUntil.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] (h : IsAlgEnvSeq A Y alg env P) (N : β) : IsAlgEnvSeqUntil A Y alg env P NLearning.IsAlgEnvSeq.isAlgEnvSeqUntil.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] (h : IsAlgEnvSeq A Y alg env P) (N : β) : IsAlgEnvSeqUntil A Y alg env P N
Code
lemma IsAlgEnvSeq.isAlgEnvSeqUntil (h : IsAlgEnvSeq A Y alg env P) (N : β) :
IsAlgEnvSeqUntil A Y alg env P N where
measurable_actionType uses (4)
Body uses (1)
Used by (2)
Actions: Source Β· Open Issue
Proof
h.measurable_action measurable_feedback := h.measurable_feedback hasLaw_action_zero := h.hasLaw_action_zero hasCondDistrib_feedback_zero := h.hasCondDistrib_feedback_zero hasCondDistrib_action n _ := h.hasCondDistrib_action n hasCondDistrib_feedback n _ := h.hasCondDistrib_feedback n
measurable_stepπ
Learning.IsAlgEnvSeq.measurable_stepNo docstring.
Learning.IsAlgEnvSeq.measurable_step.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] (h : IsAlgEnvSeq A Y alg env P) (n : β) : Measurable (step A Y n)Learning.IsAlgEnvSeq.measurable_step.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] (h : IsAlgEnvSeq A Y alg env P) (n : β) : Measurable (step A Y n)
Code
lemma IsAlgEnvSeq.measurable_step (h : IsAlgEnvSeq A Y alg env P) (n : β) :
Measurable (step A Y n)Type uses (4)
Body uses (1)
Actions: Source Β· Open Issue
Proof
by have hA := h.measurable_action have hY := h.measurable_feedback fun_prop
measurable_historyπ
Learning.IsAlgEnvSeq.measurable_historyNo docstring.
Learning.IsAlgEnvSeq.measurable_history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] (h : IsAlgEnvSeq A Y alg env P) (n : β) : Measurable (history A Y n)Learning.IsAlgEnvSeq.measurable_history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] (h : IsAlgEnvSeq A Y alg env P) (n : β) : Measurable (history A Y n)
Code
lemma IsAlgEnvSeq.measurable_history (h : IsAlgEnvSeq A Y alg env P) (n : β) :
Measurable (history A Y n)Type uses (4)
Body uses (1)
Used by (4)
Actions: Source Β· Open Issue
Proof
by have hA := h.measurable_action have hY := h.measurable_feedback fun_prop
hasLaw_step_zeroπ
Learning.IsAlgEnvSeq.hasLaw_step_zeroNo docstring.
Learning.IsAlgEnvSeq.hasLaw_step_zero.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] (h : IsAlgEnvSeq A Y alg env P) : ProbabilityTheory.HasLaw (step A Y 0) (MeasureTheory.Measure.compProd (Algorithm.p0 alg) (Environment.Ξ½0 env)) PLearning.IsAlgEnvSeq.hasLaw_step_zero.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] (h : IsAlgEnvSeq A Y alg env P) : ProbabilityTheory.HasLaw (step A Y 0) (MeasureTheory.Measure.compProd (Algorithm.p0 alg) (Environment.Ξ½0 env)) P
Code
lemma IsAlgEnvSeq.hasLaw_step_zero (h : IsAlgEnvSeq A Y alg env P) :
HasLaw (step A Y 0) (alg.p0 ββ env.Ξ½0) PType uses (4)
Body uses (1)
Actions: Source Β· Open Issue
Proof
HasLaw.prod_of_hasCondDistrib h.hasLaw_action_zero h.hasCondDistrib_feedback_zero
hasLaw_step_zeroπ
Learning.IsAlgEnvSeqUntil.hasLaw_step_zeroNo docstring.
Learning.IsAlgEnvSeqUntil.hasLaw_step_zero.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] {N : β} (h : IsAlgEnvSeqUntil A Y alg env P N) : ProbabilityTheory.HasLaw (step A Y 0) (MeasureTheory.Measure.compProd (Algorithm.p0 alg) (Environment.Ξ½0 env)) PLearning.IsAlgEnvSeqUntil.hasLaw_step_zero.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] {N : β} (h : IsAlgEnvSeqUntil A Y alg env P N) : ProbabilityTheory.HasLaw (step A Y 0) (MeasureTheory.Measure.compProd (Algorithm.p0 alg) (Environment.Ξ½0 env)) P
Code
lemma IsAlgEnvSeqUntil.hasLaw_step_zero (h : IsAlgEnvSeqUntil A Y alg env P N) :
HasLaw (step A Y 0) (alg.p0 ββ env.Ξ½0) PType uses (4)
Body uses (1)
Actions: Source Β· Open Issue
Proof
HasLaw.prod_of_hasCondDistrib h.hasLaw_action_zero h.hasCondDistrib_feedback_zero
hasCondDistrib_stepπ
Learning.IsAlgEnvSeq.hasCondDistrib_stepNo docstring.
Learning.IsAlgEnvSeq.hasCondDistrib_step.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] (h : IsAlgEnvSeq A Y alg env P) (n : β) : ProbabilityTheory.HasCondDistrib (step A Y (n + 1)) (history A Y n) (stepKernel alg env n) PLearning.IsAlgEnvSeq.hasCondDistrib_step.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] (h : IsAlgEnvSeq A Y alg env P) (n : β) : ProbabilityTheory.HasCondDistrib (step A Y (n + 1)) (history A Y n) (stepKernel alg env n) P
Code
lemma IsAlgEnvSeq.hasCondDistrib_step (h : IsAlgEnvSeq A Y alg env P) (n : β) :
HasCondDistrib (step A Y (n + 1)) (history A Y n) (stepKernel alg env n) PType uses (6)
Body uses (1)
Actions: Source Β· Open Issue
Proof
HasCondDistrib.prod (h.hasCondDistrib_action n) (h.hasCondDistrib_feedback n)
hasCondDistrib_stepπ
Learning.IsAlgEnvSeqUntil.hasCondDistrib_stepNo docstring.
Learning.IsAlgEnvSeqUntil.hasCondDistrib_step.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] {N : β} (h : IsAlgEnvSeqUntil A Y alg env P N) (n : β) (hn : n < N) : ProbabilityTheory.HasCondDistrib (step A Y (n + 1)) (history A Y n) (stepKernel alg env n) PLearning.IsAlgEnvSeqUntil.hasCondDistrib_step.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] {N : β} (h : IsAlgEnvSeqUntil A Y alg env P N) (n : β) (hn : n < N) : ProbabilityTheory.HasCondDistrib (step A Y (n + 1)) (history A Y n) (stepKernel alg env n) P
Code
lemma IsAlgEnvSeqUntil.hasCondDistrib_step (h : IsAlgEnvSeqUntil A Y alg env P N)
(n : β) (hn : n < N) :
HasCondDistrib (step A Y (n + 1)) (history A Y n)
(stepKernel alg env n) PType uses (6)
Body uses (1)
Actions: Source Β· Open Issue
Proof
HasCondDistrib.prod (h.hasCondDistrib_action n hn) (h.hasCondDistrib_feedback n hn)
hasLaw_history_zeroπ
Learning.IsAlgEnvSeq.hasLaw_history_zeroNo docstring.
Learning.IsAlgEnvSeq.hasLaw_history_zero.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] (h : IsAlgEnvSeq A Y alg env P) : ProbabilityTheory.HasLaw (history A Y 0) (MeasureTheory.Measure.map (β(MeasurableEquiv.symm (MeasurableEquiv.piUnique fun x => π Γ π¨))) (MeasureTheory.Measure.map (step A Y 0) P)) PLearning.IsAlgEnvSeq.hasLaw_history_zero.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {env : Environment π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] (h : IsAlgEnvSeq A Y alg env P) : ProbabilityTheory.HasLaw (history A Y 0) (MeasureTheory.Measure.map (β(MeasurableEquiv.symm (MeasurableEquiv.piUnique fun x => π Γ π¨))) (MeasureTheory.Measure.map (step A Y 0) P)) P
Code
lemma IsAlgEnvSeq.hasLaw_history_zero (h : IsAlgEnvSeq A Y alg env P) : HasLaw (history A Y 0)
((P.map (step A Y 0)).map (MeasurableEquiv.piUnique (fun _ : Iic 0 β¦ π Γ π¨)).symm) P where
aemeasurableType uses (5)
Body uses (2)
Actions: Source Β· Open Issue
Proof
(h.measurable_history 0).aemeasurable
map_eq := by
have he : (MeasurableEquiv.piUnique (fun _ : Iic 0 β¦ π Γ π¨)).symm β step A Y 0 =
history A Y 0 := by
funext _ β¨0, _β©
rfl
rw [β he]
have hA := h.measurable_action
have hY := h.measurable_feedback
exact (Measure.map_map (by fun_prop) (by fun_prop)).symm
filtrationπ
Learning.IsAlgEnvSeq.filtration
Filtration generated by the history up to time n.
Learning.IsAlgEnvSeq.filtration.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hY : β (n : β), Measurable (Y n)) : MeasureTheory.Filtration β mΞ©Learning.IsAlgEnvSeq.filtration.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hY : β (n : β), Measurable (Y n)) : MeasureTheory.Filtration β mΞ©
Code
def IsAlgEnvSeq.filtration (hA : β n, Measurable (A n)) (hY : β n, Measurable (Y n)) :
Filtration β mΞ© where
seq i := MeasurableSpace.comap (history A Y i) inferInstance
mono' i j hij := by
simp only
rw [β measurable_iff_comap_le]
have : history A Y i = (fun h k β¦ h β¨k.1, by grindβ©) β history A Y j := rfl
rw [this]
exact measurable_comp_comap _ (by fun_prop)
le' i := by
rw [β measurable_iff_comap_le]
exact Learning.measurable_history hA hY iBody uses (3)
Used by (18)
Actions: Source Β· Open Issue
adapted_historyπ
Learning.IsAlgEnvSeq.adapted_historyNo docstring.
Learning.IsAlgEnvSeq.adapted_history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hY : β (n : β), Measurable (Y n)) : MeasureTheory.Adapted (filtration hA hY) (history A Y)Learning.IsAlgEnvSeq.adapted_history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hY : β (n : β), Measurable (Y n)) : MeasureTheory.Adapted (filtration hA hY) (history A Y)
Code
lemma IsAlgEnvSeq.adapted_history
(hA : β n, Measurable (A n)) (hY : β n, Measurable (Y n)) :
Adapted (filtration hA hY) (history A Y)Type uses (2)
Actions: Source Β· Open Issue
Proof
fun _ β¦ measurable_iff_comap_le.mpr le_rfl
adapted_stepπ
Learning.IsAlgEnvSeq.adapted_stepNo docstring.
Learning.IsAlgEnvSeq.adapted_step.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hY : β (n : β), Measurable (Y n)) : MeasureTheory.Adapted (filtration hA hY) (step A Y)Learning.IsAlgEnvSeq.adapted_step.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hY : β (n : β), Measurable (Y n)) : MeasureTheory.Adapted (filtration hA hY) (step A Y)
Code
lemma IsAlgEnvSeq.adapted_step
(hA : β n, Measurable (A n)) (hY : β n, Measurable (Y n)) :
Adapted (filtration hA hY) (step A Y)Type uses (2)
Body uses (2)
Actions: Source Β· Open Issue
Proof
by
intro n
have : step A Y n = (fun h β¦ (h β¨n, by simpβ©)) β (history A Y n) := by
ext Ο : 1
simp [history, step]
rw [this]
exact measurable_comp_comap _ (by fun_prop)
adapted_actionπ
Learning.IsAlgEnvSeq.adapted_actionNo docstring.
Learning.IsAlgEnvSeq.adapted_action.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hY : β (n : β), Measurable (Y n)) : MeasureTheory.Adapted (filtration hA hY) ALearning.IsAlgEnvSeq.adapted_action.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hY : β (n : β), Measurable (Y n)) : MeasureTheory.Adapted (filtration hA hY) A
Code
lemma IsAlgEnvSeq.adapted_action
(hA : β n, Measurable (A n)) (hY : β n, Measurable (Y n)) :
Adapted (filtration hA hY) AType uses (1)
Body uses (2)
Actions: Source Β· Open Issue
Proof
by
intro n
have : A n = (fun h β¦ (h β¨n, by simpβ©).1) β (history A Y n) := by
ext Ο : 1
simp [history]
rw [this]
exact measurable_comp_comap _ (by fun_prop)
adapted_feedbackπ
Learning.IsAlgEnvSeq.adapted_feedbackNo docstring.
Learning.IsAlgEnvSeq.adapted_feedback.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hY : β (n : β), Measurable (Y n)) : MeasureTheory.Adapted (filtration hA hY) YLearning.IsAlgEnvSeq.adapted_feedback.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hY : β (n : β), Measurable (Y n)) : MeasureTheory.Adapted (filtration hA hY) Y
Code
lemma IsAlgEnvSeq.adapted_feedback
(hA : β n, Measurable (A n)) (hY : β n, Measurable (Y n)) :
Adapted (filtration hA hY) YType uses (1)
Body uses (2)
Used by (1)
Actions: Source Β· Open Issue
Proof
by
intro n
have : Y n = (fun h β¦ (h β¨n, by simpβ©).2) β (history A Y n) := by
ext Ο : 1
simp [history]
rw [this]
exact measurable_comp_comap _ (by fun_prop)
filtrationActionπ
Learning.IsAlgEnvSeq.filtrationAction
Filtration generated by the history at time n-1 together with the action at time n.
Learning.IsAlgEnvSeq.filtrationAction.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hY : β (n : β), Measurable (Y n)) : MeasureTheory.Filtration β mΞ©Learning.IsAlgEnvSeq.filtrationAction.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} (hA : β (n : β), Measurable (A n)) (hY : β (n : β), Measurable (Y n)) : MeasureTheory.Filtration β mΞ©
Code
def IsAlgEnvSeq.filtrationAction
(hA : β n, Measurable (A n)) (hY : β n, Measurable (Y n)) :
Filtration β mΞ© where
seq n := if n = 0 then MeasurableSpace.comap (A 0) inferInstance
else IsAlgEnvSeq.filtration hA hY (n - 1) β MeasurableSpace.comap (A n) inferInstance
mono' n m hnm := by
simp only
by_cases hn : n = 0
Β· by_cases hm : m = 0
Β· simp [hn, hm]
Β· simp only [hn, βreduceIte, hm]
refine le_sup_of_le_left ?_
rw [β measurable_iff_comap_le]
suffices Measurable[IsAlgEnvSeq.filtration hA hY 0] (A 0) from
this.mono ((IsAlgEnvSeq.filtration hA hY).mono zero_le) le_rfl
exact adapted_action hA hY 0
have hm : m β 0 := by grind
simp only [hn, hm, βreduceIte]
have hnm' : n - 1 β€ m - 1 := by grind
simp only [sup_le_iff]
constructor
Β· refine le_sup_of_le_left ?_
exact (IsAlgEnvSeq.filtration hA hY).mono hnm'
Β· rcases eq_or_lt_of_le hnm with rfl | hlt
Β· exact le_sup_of_le_right le_rfl
refine le_sup_of_le_left ?_
rw [β measurable_iff_comap_le]
have h_le : n β€ m - 1 := by grind
suffices Measurable[IsAlgEnvSeq.filtration hA hY n] (A n) from
this.mono ((IsAlgEnvSeq.filtration hA hY).mono h_le) le_rfl
exact adapted_action hA hY n
le' n := by
by_cases hn : n = 0
Β· simp only [hn, βreduceIte]
rw [β measurable_iff_comap_le]
fun_prop
simp only [hn, βreduceIte, sup_le_iff]
constructor
Β· exact (IsAlgEnvSeq.filtration hA hY).le _
Β· rw [β measurable_iff_comap_le]
fun_propBody uses (2)
Used by (3)
Actions: Source Β· Open Issue
filtrationAction_zero_eq_comapπ
Learning.IsAlgEnvSeq.filtrationAction_zero_eq_comapNo docstring.
Learning.IsAlgEnvSeq.filtrationAction_zero_eq_comap.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {hA : β (n : β), Measurable (A n)} {hY : β (n : β), Measurable (Y n)} : β(filtrationAction hA hY) 0 = MeasurableSpace.comap (A 0) inferInstanceLearning.IsAlgEnvSeq.filtrationAction_zero_eq_comap.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {hA : β (n : β), Measurable (A n)} {hY : β (n : β), Measurable (Y n)} : β(filtrationAction hA hY) 0 = MeasurableSpace.comap (A 0) inferInstance
Code
lemma IsAlgEnvSeq.filtrationAction_zero_eq_comap
{hA : β n, Measurable (A n)} {hY : β n, Measurable (Y n)} :
filtrationAction hA hY 0 = MeasurableSpace.comap (A 0) inferInstanceType uses (1)
Body uses (1)
Actions: Source Β· Open Issue
Proof
by simp [filtrationAction]
filtrationAction_eq_comapπ
Learning.IsAlgEnvSeq.filtrationAction_eq_comapNo docstring.
Learning.IsAlgEnvSeq.filtrationAction_eq_comap.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {hA : β (n : β), Measurable (A n)} {hY : β (n : β), Measurable (Y n)} (n : β) (hn : n β 0) : β(filtrationAction hA hY) n = MeasurableSpace.comap (fun Ο => (history A Y (n - 1) Ο, A n Ο)) inferInstanceLearning.IsAlgEnvSeq.filtrationAction_eq_comap.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} {mπ : MeasurableSpace π} {mπ¨ : MeasurableSpace π¨} {mΞ© : MeasurableSpace Ξ©} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {hA : β (n : β), Measurable (A n)} {hY : β (n : β), Measurable (Y n)} (n : β) (hn : n β 0) : β(filtrationAction hA hY) n = MeasurableSpace.comap (fun Ο => (history A Y (n - 1) Ο, A n Ο)) inferInstance
Code
lemma IsAlgEnvSeq.filtrationAction_eq_comap
{hA : β n, Measurable (A n)} {hY : β n, Measurable (Y n)} (n : β) (hn : n β 0) :
filtrationAction hA hY n =
MeasurableSpace.comap (fun Ο β¦ (history A Y (n - 1) Ο, A n Ο)) inferInstanceType uses (2)
Body uses (2)
Used by (1)
Actions: Source Β· Open Issue
Proof
by simp only [filtrationAction, filtration, β MeasurableSpace.comap_prodMk, hn, βreduceIte] rfl
-
Learning.Algorithm -
Learning.instIsMarkovKernelForallSubtypeNatMemFinsetIicProdPolicy -
Learning.instIsProbabilityMeasureP0 -
Learning.Algorithm.prodLeft -
Learning.Algorithm.prodLeft_p0 -
Learning.Algorithm.prodLeft_policy -
Learning.Environment -
Learning.instIsMarkovKernelProdForallSubtypeNatMemFinsetIicFeedback -
Learning.instIsMarkovKernelΞ½0 -
Learning.stepKernel -
Learning.instIsMarkovKernelForallSubtypeNatMemFinsetIicProdStepKernel -
Learning.stepKernel_def -
Learning.fst_stepKernel -
Learning.step -
Learning.measurable_step -
Learning.trajectory -
Learning.measurable_trajectory -
Learning.history -
Learning.measurable_history -
Learning.eval_comp_history -
Learning.fst_eval_comp_history -
Learning.snd_eval_comp_history -
Learning.history_succ -
Learning.IsAlgEnvSeq -
Learning.IsAlgEnvSeqUntil -
Learning.IsAlgEnvSeqUntil.mono -
Learning.IsAlgEnvSeq.isAlgEnvSeqUntil -
Learning.IsAlgEnvSeq.measurable_step -
Learning.IsAlgEnvSeq.measurable_history -
Learning.IsAlgEnvSeq.hasLaw_step_zero -
Learning.IsAlgEnvSeqUntil.hasLaw_step_zero -
Learning.IsAlgEnvSeq.hasCondDistrib_step -
Learning.IsAlgEnvSeqUntil.hasCondDistrib_step -
Learning.IsAlgEnvSeq.hasLaw_history_zero -
Learning.IsAlgEnvSeq.filtration -
Learning.IsAlgEnvSeq.adapted_history -
Learning.IsAlgEnvSeq.adapted_step -
Learning.IsAlgEnvSeq.adapted_action -
Learning.IsAlgEnvSeq.adapted_feedback -
Learning.IsAlgEnvSeq.filtrationAction -
Learning.IsAlgEnvSeq.filtrationAction_zero_eq_comap -
Learning.IsAlgEnvSeq.filtrationAction_eq_comap