LeanMachineLearning exposition

Learning.IsAlgEnvSeq.isBayesAlgEnvSeq

This page has the declaration's own card below, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards only follow its statement's dependencies (its proof is replaced by sorry, so what it proves doesn't depend on how); for everything else, both the type and the body/value are followed, since their content is part of what later declarations build on.

isBayesAlgEnvSeq

Lemma Learning.IsAlgEnvSeq.isBayesAlgEnvSeq

No docstring.

theorem
Learning.IsAlgEnvSeq.isBayesAlgEnvSeq.{u_1, u_2, u_3, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {𝓨 : Type u_3} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω] {Q : MeasureTheory.Measure 𝓔} [MeasureTheory.IsProbabilityMeasure Q] {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) 𝓨} [ProbabilityTheory.IsMarkovKernel κ] {alg : Algorithm 𝓐 𝓨} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓔 × 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] (h : IsAlgEnvSeq A Y (Algorithm.prodLeft 𝓔 alg) (bayesStationaryEnv Q κ) P) : IsBayesAlgEnvSeq Q κ alg (fun ω => Prod.fst (Y 0 ω)) A (fun n ω => Prod.snd (Y n ω)) P

Code

lemma IsAlgEnvSeq.isBayesAlgEnvSeq
    (h : IsAlgEnvSeq A Y (alg.prodLeft 𝓔) (bayesStationaryEnv Q κ) P) :
    IsBayesAlgEnvSeq Q κ alg (fun ω ↦ (Y 0 ω).1) A (fun n ω ↦ (Y n ω).2) P where

Proof
  measurable_param := (h.measurable_feedback 0).fst
  measurable_action := h.measurable_action
  measurable_feedback n := (h.measurable_feedback n).snd
  hasLaw_env := by
    apply HasCondDistrib.hasLaw_of_const
    simpa [bayesStationaryEnv] using h.hasCondDistrib_feedback_zero.fst
  hasCondDistrib_action_zero := by
    have hc : HasCondDistrib (fun ω ↦ (Y 0 ω).1) (A 0) (Kernel.const _ Q) P := by
      simpa [bayesStationaryEnv] using h.hasCondDistrib_feedback_zero.fst
    simpa [h.hasLaw_action_zero.map_eq, Algorithm.prodLeft] using hc.const_map_of_const
  hasCondDistrib_feedback_zero :=
    h.hasCondDistrib_feedback_zero.of_compProd.measurableEquiv_comp_right MeasurableEquiv.prodComm
  hasCondDistrib_action n := by
    let f : (Iic n → 𝓐 × 𝓔 × 𝓨) → 𝓔 × (Iic n → 𝓐 × 𝓨) :=
      fun h ↦ ((h ⟨0, by simp⟩).2.1, fun i ↦ ((h i).1, (h i).2.2))
    have hc : HasCondDistrib (A (n + 1)) (history A Y n)
        (((alg.policy n).comap Prod.snd (by fun_prop)).comap f (by fun_prop)) P :=
      h.hasCondDistrib_action n
    exact hc.comp_right (f := f)
  hasCondDistrib_feedback n := by
    let f : (Iic n → 𝓐 × 𝓔 × 𝓨) × 𝓐 → (Iic n → 𝓐 × 𝓨) × 𝓔 × 𝓐 :=
      fun p ↦ ((fun i ↦ ((p.1 i).1, (p.1 i).2.2)), (p.1 ⟨0, by simp⟩).2.1, p.2)
    have hc : HasCondDistrib (fun ω ↦ (Y (n + 1) ω).2)
        (fun ω ↦ (history A Y n ω, A (n + 1) ω))
        ((Kernel.prodMkLeft ((Iic n) → 𝓐 × 𝓨) κ).comap f (by fun_prop)) P := by
      simpa [bayesStationaryEnv, Kernel.prodMkLeft, ← Kernel.comap_comp_right, Function.comp_def]
        using (h.hasCondDistrib_feedback n).snd
    exact hc.comp_right
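
Usage sketch

The following is a hedged sketch of how the lemma might be used, not code taken from the library. It assumes the same `variable` block and `open` declarations as the library file, so that `HasLaw`, `IsAlgEnvSeq`, and the implicit `Q`, `κ`, `alg`, `A`, `Y`, `P` resolve exactly as in the statement above. Under that assumption, fields of the resulting `IsBayesAlgEnvSeq` can be read off by projection.

```lean
-- Sketch only: assumes the library's `variable` and `open` context is in scope.

-- The first component of the initial feedback has law `Q`.
example (h : IsAlgEnvSeq A Y (alg.prodLeft 𝓔) (bayesStationaryEnv Q κ) P) :
    HasLaw (fun ω ↦ (Y 0 ω).1) Q P :=
  h.isBayesAlgEnvSeq.hasLaw_env

-- The second component of each feedback is measurable.
example (h : IsAlgEnvSeq A Y (alg.prodLeft 𝓔) (bayesStationaryEnv Q κ) P) (n : ℕ) :
    Measurable (fun ω ↦ (Y n ω).2) :=
  h.isBayesAlgEnvSeq.measurable_feedback n
```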

Dependency graph

Type dependencies (5)

Algorithm

Structure Learning.Algorithm

A stochastic, sequential algorithm.

structure
Learning.Algorithm.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)

Code

structure Algorithm (𝓐 𝓨 : Type*) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] where
  /-- Policy or sampling rule: distribution of the next action. -/
  policy : (n : ℕ) → Kernel (Iic n → 𝓐 × 𝓨) 𝓐
  /-- The policy is a Markov kernel. -/
  [h_policy : ∀ n, IsMarkovKernel (policy n)]
  /-- Distribution of the first action. -/
  p0 : Measure 𝓐
  /-- The first action distribution is a probability measure. -/
  [hp0 : IsProbabilityMeasure p0]

IsAlgEnvSeq

Structure Learning.IsAlgEnvSeq

An algorithm-environment sequence: a sequence of actions and feedbacks generated by an algorithm interacting with an environment.

structure
Learning.IsAlgEnvSeq.{u_1, u_2, u_3} {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΩ : MeasurableSpace Ω} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) (P : MeasureTheory.Measure Ω) [MeasureTheory.IsFiniteMeasure P] : Prop

Code

structure IsAlgEnvSeq
    (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨)
    (P : Measure Ω) [IsFiniteMeasure P] : Prop where
  /-- The action sequence is measurable. -/
  measurable_action n : Measurable (A n) := by fun_prop
  /-- The feedback sequence is measurable. -/
  measurable_feedback n : Measurable (Y n) := by fun_prop
  /-- The first action has the correct law. -/
  hasLaw_action_zero : HasLaw (fun ω ↦ (A 0 ω)) alg.p0 P
  /-- The first feedback has the correct conditional distribution. -/
  hasCondDistrib_feedback_zero : HasCondDistrib (Y 0) (A 0) env.ν0 P
  /-- The next action has the correct conditional distribution given the history. -/
  hasCondDistrib_action n :
    HasCondDistrib (A (n + 1)) (history A Y n) (alg.policy n) P
  /-- The next feedback has the correct conditional distribution given the history and
  next action. -/
  hasCondDistrib_feedback n :
    HasCondDistrib (Y (n + 1)) (fun ω ↦ (history A Y n ω, A (n + 1) ω))
      (env.feedback n) P

prodLeft

Definition Learning.Algorithm.prodLeft

An algorithm with observations in 𝓧 × 𝓨 obtained from an algorithm with observations in 𝓨 by ignoring the 𝓧 component of each observation.

def
Learning.Algorithm.prodLeft.{u_1, u_2, u_4} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (𝓧 : Type u_4) [MeasurableSpace 𝓧] (alg : Algorithm 𝓐 𝓨) : Algorithm 𝓐 (𝓧 × 𝓨)

Code

def Algorithm.prodLeft (𝓧 : Type*) [MeasurableSpace 𝓧] (alg : Algorithm 𝓐 𝓨) :
    Algorithm 𝓐 (𝓧 × 𝓨) where
  policy n := (alg.policy n).comap (fun h i ↦ ((h i).1, (h i).2.2)) (by fun_prop)
  p0 := alg.p0
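
Illustration

To make "ignoring the left component of each observation" concrete, here is a minimal sketch of the history projection that the policy is pulled back along. It is a Mathlib-free stand-in on plain functions, with no measurability involved. The name `dropLeft` and the type variables are hypothetical and do not come from the library.

```lean
-- Hypothetical stand-in for the map `fun h i ↦ ((h i).1, (h i).2.2)` used above.
-- `dropLeft` is an illustrative name, not a library declaration.
def dropLeft {ι α χ β : Type} (h : ι → α × χ × β) : ι → α × β :=
  fun i => ((h i).1, (h i).2.2)

-- A one-step history with action 1, ignored component "x" and observation `true`.
example : dropLeft (fun _ : Unit => ((1 : Nat), "x", true)) () = ((1 : Nat), true) := rfl
```

The projected history keeps each action and the right-hand part of each observation, which is exactly the input type the original policy expects.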

bayesStationaryEnv

Definition Learning.bayesStationaryEnv

An environment with observations in 𝓔 × 𝓨. The first element e of an observation is sampled from Q once and remains constant. The second element of an observation is sampled from κ (e, a), where a is the corresponding action.

def
Learning.bayesStationaryEnv.{u_1, u_2, u_3} {𝓔 : Type u_1} {𝓐 : Type u_2} {𝓨 : Type u_3} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] (Q : MeasureTheory.Measure 𝓔) [MeasureTheory.IsProbabilityMeasure Q] (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) 𝓨) [ProbabilityTheory.IsMarkovKernel κ] : Environment 𝓐 (𝓔 × 𝓨)

Code

noncomputable
def bayesStationaryEnv (Q : Measure 𝓔) [IsProbabilityMeasure Q] (κ : Kernel (𝓔 × 𝓐) 𝓨)
    [IsMarkovKernel κ] : Environment 𝓐 (𝓔 × 𝓨) where
  feedback n :=
    let g : (Iic n → 𝓐 × 𝓔 × 𝓨) × 𝓐 → 𝓔 × 𝓐 := fun (h, a) => ((h ⟨0, by simp⟩).2.1, a)
    (Kernel.deterministic (Prod.fst ∘ g) (by fun_prop)) ×ₖ (κ.comap g (by fun_prop))
  ν0 := (Kernel.const _ Q) ⊗ₖ κ.swapLeft
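
Illustration

The docstring says the first element is sampled once and then stays constant. In the definition this shows up as the map `g`, which reads the parameter from the time-0 entry of the history and pairs it with the new action. The sketch below is a Mathlib-free analogue on plain functions, using `Fin (n + 1)` in place of `Iic n`. The name `paramAndAction` is hypothetical and not part of the library.

```lean
-- Hypothetical analogue of `g`: recover the parameter from the time-0 observation.
-- `Fin (n + 1)` stands in for `Iic n`; `paramAndAction` is an illustrative name.
def paramAndAction {α ε β : Type} {n : Nat}
    (p : (Fin (n + 1) → α × ε × β) × α) : ε × α :=
  ((p.1 ⟨0, Nat.succ_pos n⟩).2.1, p.2)

-- History with a single entry (action 1, parameter "e", observation `true`), next action 2.
example :
    paramAndAction (n := 0) ((fun _ => ((1 : Nat), "e", true)), (2 : Nat)) = ("e", (2 : Nat)) :=
  rfl
```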

IsBayesAlgEnvSeq

Structure Learning.IsBayesAlgEnvSeq

IsBayesAlgEnvSeq Q κ alg E A Y P states that there is a measure P : Measure Ω such that the parameter E : Ω → 𝓔 has law Q and that the sequences of actions A : ℕ → Ω → 𝓐 and feedbacks Y : ℕ → Ω → 𝓨 are generated by the algorithm alg : Algorithm 𝓐 𝓨 interacting with an underlying environment that depends on E and κ (stationaryEnv (κ.sectR (E ω))).

structure
Learning.IsBayesAlgEnvSeq.{u_1, u_2, u_3, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {𝓨 : Type u_3} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω] (Q : MeasureTheory.Measure 𝓔) (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) 𝓨) (alg : Algorithm 𝓐 𝓨) (E : Ω → 𝓔) (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (P : MeasureTheory.Measure Ω) [MeasureTheory.IsFiniteMeasure P] : Prop

Code

structure IsBayesAlgEnvSeq
    (Q : Measure 𝓔) (κ : Kernel (𝓔 × 𝓐) 𝓨) (alg : Algorithm 𝓐 𝓨)
    (E : Ω → 𝓔) (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨)
    (P : Measure Ω) [IsFiniteMeasure P] : Prop where
  measurable_param : Measurable E := by fun_prop
  measurable_action n : Measurable (A n) := by fun_prop
  measurable_feedback n : Measurable (Y n) := by fun_prop
  hasLaw_env : HasLaw E Q P
  hasCondDistrib_action_zero : HasCondDistrib (A 0) E (Kernel.const _ alg.p0) P
  hasCondDistrib_feedback_zero : HasCondDistrib (Y 0) (fun ω ↦ (E ω, A 0 ω)) κ P
  hasCondDistrib_action n :
    HasCondDistrib (A (n + 1)) (fun ω ↦ (E ω, history A Y n ω))
      ((alg.policy n).prodMkLeft _) P
  hasCondDistrib_feedback n :
    HasCondDistrib (Y (n + 1)) (fun ω ↦ (history A Y n ω, E ω, A (n + 1) ω))
      (κ.prodMkLeft _) P

All dependencies, transitively (4)

Environment

Structure Learning.Environment

A stochastic environment.

structure
Learning.Environment.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)

Code

structure Environment (𝓐 𝓨 : Type*) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] where
  /-- Distribution of the next observation as function of the past history. -/
  feedback : (n : ℕ) → Kernel ((Iic n → 𝓐 × 𝓨) × 𝓐) 𝓨
  /-- The feedback kernels are Markov kernels. -/
  [h_feedback : ∀ n, IsMarkovKernel (feedback n)]
  /-- Distribution of the first observation given the first action. -/
  ν0 : Kernel 𝓐 𝓨
  /-- The initial observation kernel is a Markov kernel. -/
  [hp0 : IsMarkovKernel ν0]

history

Definition Learning.history

History of the algorithm-environment sequence up to time n.

def
Learning.history.{u_1, u_2, u_3} {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) : ↥(Finset.Iic n) → 𝓐 × 𝓨

Code

def history (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) : Iic n → 𝓐 × 𝓨 :=
  fun i ↦ (A i ω, Y i ω)

instIsProbabilityMeasureP0

Instance Learning.instIsProbabilityMeasureP0

No docstring.

theorem
Learning.instIsProbabilityMeasureP0.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) : MeasureTheory.IsProbabilityMeasure (Algorithm.p0 alg)

Code

instance (alg : Algorithm 𝓐 𝓨) : IsProbabilityMeasure alg.p0

Proof
alg.hp0

instIsMarkovKernelForallSubtypeNatMemFinsetIicProdPolicy

Instance Learning.instIsMarkovKernelForallSubtypeNatMemFinsetIicProdPolicy

No docstring.

theorem
Learning.instIsMarkovKernelForallSubtypeNatMemFinsetIicProdPolicy.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) (n : ℕ) : ProbabilityTheory.IsMarkovKernel (Algorithm.policy alg n)

Code

instance (alg : Algorithm 𝓐 𝓨) (n : ℕ) : IsMarkovKernel (alg.policy n)

Proof
alg.h_policy n