LeanMachineLearning exposition

Learning.IsBayesAlgEnvSeq.hasCondDistrib_IT_feedback_zero

This page shows the declaration's own card first, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards follow only the dependencies of its statement: the proof is replaced by sorry, because what a theorem proves does not depend on how it is proved. For every other kind of declaration, both the type and the body or value are followed, since their content is part of what later declarations build on.

Minimal Lean file

hasCondDistrib_IT_feedback_zero

Lemma Learning.IsBayesAlgEnvSeq.hasCondDistrib_IT_feedback_zero

No docstring.

theorem
Learning.IsBayesAlgEnvSeq.hasCondDistrib_IT_feedback_zero.{u_1, u_2, u_3, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {𝓨 : Type u_3} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω] {Q : MeasureTheory.Measure 𝓔} {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) 𝓨} {alg : Algorithm 𝓐 𝓨} {E : Ω → 𝓔} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] [ProbabilityTheory.IsFiniteKernel κ] (h : IsBayesAlgEnvSeq Q κ alg E A Y P) : ∀ᵐ (e : 𝓔) ∂Q, ProbabilityTheory.HasCondDistrib (IT.feedback 0) (IT.action 0) (ProbabilityTheory.Kernel.sectR κ e) (𝓛[trajectory A Y | E; P] e)

Code

lemma hasCondDistrib_IT_feedback_zero [IsFiniteKernel κ] (h : IsBayesAlgEnvSeq Q κ alg E A Y P) :
    ∀ᵐ e ∂Q, HasCondDistrib (IT.feedback 0) (IT.action 0) (κ.sectR e)
      (condDistrib (trajectory A Y) E P e)
Type uses (5)
Body uses (4)
Used by (1)


Proof
by
  rw [← h.hasLaw_env.map_eq]
  exact h.hasCondDistrib_feedback_zero.hasCondDistrib_sectR
    (IT.measurable_action 0) (IT.measurable_feedback 0)
    (measurable_trajectory h.measurable_action h.measurable_feedback).aemeasurable
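
The statement is phrased through Kernel.sectR κ e, the kernel obtained from κ by fixing the environment parameter e. The following is a minimal sketch of what that section computes, assuming a recent Mathlib in which ProbabilityTheory.Kernel.sectR is defined as a comap, so that the identity holds by rfl. The type variables α, β, γ are placeholders introduced for this illustration, not names from the library.

```lean
import Mathlib

open ProbabilityTheory

-- Fixing the first coordinate `a` of a kernel on `α × β` yields a kernel on `β`.
-- Evaluating it at `b` gives the original kernel evaluated at the pair `(a, b)`.
-- (Assumption: `sectR` unfolds definitionally, as in current Mathlib.)
example {α β γ : Type*} [MeasurableSpace α] [MeasurableSpace β] [MeasurableSpace γ]
    (κ : Kernel (α × β) γ) (a : α) (b : β) :
    Kernel.sectR κ a b = κ (a, b) := rfl
```

Read this way, the lemma says that for Q-almost every e, under the conditional law of the trajectory given E = e, the first feedback given the first action is distributed according to κ (e, ·).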

Dependency graph

Type dependencies (5)

Algorithm

Structure Learning.Algorithm

A stochastic, sequential algorithm.

structure
Learning.Algorithm.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)

Code

structure Algorithm (𝓐 𝓨 : Type*) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] where
  /-- Policy or sampling rule: distribution of the next action. -/
  policy : (n : ℕ) → Kernel (Iic n → 𝓐 × 𝓨) 𝓐
  /-- The policy is a Markov kernel. -/
  [h_policy : ∀ n, IsMarkovKernel (policy n)]
  /-- Distribution of the first action. -/
  p0 : Measure 𝓐
  /-- The first action distribution is a probability measure. -/
  [hp0 : IsProbabilityMeasure p0]
Used by (216)


IsBayesAlgEnvSeq

Structure Learning.IsBayesAlgEnvSeq

IsBayesAlgEnvSeq Q κ alg E A Y P states that there is a measure P : Measure Ω such that the parameter E : Ω → 𝓔 has law Q and that the sequences of actions A : ℕ → Ω → 𝓐 and feedbacks Y : ℕ → Ω → 𝓨 are generated by the algorithm alg : Algorithm 𝓐 𝓨 interacting with an underlying environment that depends on E and κ (stationaryEnv (κ.sectR (E ω))).

structure
Learning.IsBayesAlgEnvSeq.{u_1, u_2, u_3, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {𝓨 : Type u_3} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω] (Q : MeasureTheory.Measure 𝓔) (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) 𝓨) (alg : Algorithm 𝓐 𝓨) (E : Ω → 𝓔) (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (P : MeasureTheory.Measure Ω) [MeasureTheory.IsFiniteMeasure P] : Prop

Code

structure IsBayesAlgEnvSeq
    (Q : Measure 𝓔) (κ : Kernel (𝓔 × 𝓐) 𝓨) (alg : Algorithm 𝓐 𝓨)
    (E : Ω → 𝓔) (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨)
    (P : Measure Ω) [IsFiniteMeasure P] : Prop where
  measurable_param : Measurable E := by fun_prop
  measurable_action n : Measurable (A n) := by fun_prop
  measurable_feedback n : Measurable (Y n) := by fun_prop
  hasLaw_env : HasLaw E Q P
  hasCondDistrib_action_zero : HasCondDistrib (A 0) E (Kernel.const _ alg.p0) P
  hasCondDistrib_feedback_zero : HasCondDistrib (Y 0) (fun ω ↦ (E ω, A 0 ω)) κ P
  hasCondDistrib_action n :
    HasCondDistrib (A (n + 1)) (fun ω ↦ (E ω, history A Y n ω))
      ((alg.policy n).prodMkLeft _) P
  hasCondDistrib_feedback n :
    HasCondDistrib (Y (n + 1)) (fun ω ↦ (history A Y n ω, E ω, A (n + 1) ω))
      (κ.prodMkLeft _) P
Type uses (2)
Used by (22)


feedback

Definition Learning.IT.feedback

feedback n is the feedback at time n. This is a random variable on the measurable space ℕ → 𝓐 × 𝓨.

def
Learning.IT.feedback.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} (n : ℕ) (h : ℕ → 𝓐 × 𝓨) : 𝓨

Code

def feedback (n : ℕ) (h : ℕ → 𝓐 × 𝓨) : 𝓨 := (h n).2
Used by (16)


action

Definition Learning.IT.action

action n is the action pulled at time n. This is a random variable on the measurable space ℕ → 𝓐 × 𝓨.

def
Learning.IT.action.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} (n : ℕ) (h : ℕ → 𝓐 × 𝓨) : 𝓐

Code

def action (n : ℕ) (h : ℕ → 𝓐 × 𝓨) : 𝓐 := (h n).1
Used by (31)
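
As a concrete illustration of the two coordinate projections, here is a small sketch in plain Lean 4 (no Mathlib needed). It uses local copies of action and feedback inside a hypothetical namespace ProjectionSketch, and a made-up trajectory toy. None of these names come from the library.

```lean
namespace ProjectionSketch

-- Local copy of `IT.action`: first component of the pair at time `n`.
def action {α β : Type} (n : Nat) (h : Nat → α × β) : α := (h n).1

-- Local copy of `IT.feedback`: second component of the pair at time `n`.
def feedback {α β : Type} (n : Nat) (h : Nat → α × β) : β := (h n).2

-- Hypothetical toy trajectory: at time `n` the action is `n` and the feedback is `10 * n`.
def toy : Nat → Nat × Nat := fun n => (n, 10 * n)

-- Both projections simply read off a coordinate of the pair at the given time.
example : action 3 toy = 3 := rfl
example : feedback 3 toy = 30 := rfl

end ProjectionSketch
```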


trajectory

Definition Learning.trajectory

A random variable that gives the sequence of action-feedback pairs.

def
Learning.trajectory.{u_1, u_2, u_3} {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (ω : Ω) : ℕ → 𝓐 × 𝓨

Code

def trajectory (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (ω : Ω) : ℕ → 𝓐 × 𝓨 := fun n ↦ (A n ω, Y n ω)
Used by (18)
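
The lemma above composes IT.action 0 and IT.feedback 0 with trajectory A Y, so it may help to see that these compositions recover A n and Y n. The sketch below checks this in plain Lean 4 with local copies of the three definitions. The namespace TrajectorySketch and the type variables are assumptions made for the illustration.

```lean
namespace TrajectorySketch

-- Local copies of the three definitions, over arbitrary types.
def action {α β : Type} (n : Nat) (h : Nat → α × β) : α := (h n).1
def feedback {α β : Type} (n : Nat) (h : Nat → α × β) : β := (h n).2
def trajectory {Ω α β : Type} (A : Nat → Ω → α) (Y : Nat → Ω → β) (ω : Ω) :
    Nat → α × β := fun n => (A n ω, Y n ω)

-- Reading the action at time `n` off the trajectory returns `A n ω`.
example {Ω α β : Type} (A : Nat → Ω → α) (Y : Nat → Ω → β) (n : Nat) (ω : Ω) :
    action n (trajectory A Y ω) = A n ω := rfl

-- Reading the feedback at time `n` off the trajectory returns `Y n ω`.
example {Ω α β : Type} (A : Nat → Ω → α) (Y : Nat → Ω → β) (n : Nat) (ω : Ω) :
    feedback n (trajectory A Y ω) = Y n ω := rfl

end TrajectorySketch
```

Both identities hold by unfolding definitions, which is why statements about A 0 and Y 0 under P can be transferred to statements about IT.action 0 and IT.feedback 0 under the law of the trajectory.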


All dependencies, transitively (1)

history

Definition Learning.history

History of the algorithm-environment sequence up to time n.

def
Learning.history.{u_1, u_2, u_3} {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) : ↥(Finset.Iic n) → 𝓐 × 𝓨

Code

def history (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) : Iic n → 𝓐 × 𝓨 :=
  fun i ↦ (A i ω, Y i ω)
Used by (72)
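
The history differs from the trajectory only in its index set: it keeps the pairs at times i ≤ n. The sketch below makes that relation explicit, assuming Mathlib is available for ℕ and Finset.Iic. The namespace HistorySketch and its local copies of trajectory and history are introduced here for illustration only.

```lean
import Mathlib

open Finset

namespace HistorySketch

variable {Ω α β : Type*}

-- Local copy of `trajectory`: all action-feedback pairs, indexed by `ℕ`.
def trajectory (A : ℕ → Ω → α) (Y : ℕ → Ω → β) (ω : Ω) : ℕ → α × β :=
  fun n ↦ (A n ω, Y n ω)

-- Local copy of `history`: the same pairs, indexed by the finite set `Iic n`.
def history (A : ℕ → Ω → α) (Y : ℕ → Ω → β) (n : ℕ) (ω : Ω) : Iic n → α × β :=
  fun i ↦ (A i ω, Y i ω)

-- The history is the trajectory restricted to indices in `Iic n`.
example (A : ℕ → Ω → α) (Y : ℕ → Ω → β) (n : ℕ) (ω : Ω) (i : Iic n) :
    history A Y n ω i = trajectory A Y ω i := rfl

-- Every index of the history is a time `i ≤ n`.
example (n : ℕ) (i : Iic n) : (i : ℕ) ≤ n := Finset.mem_Iic.mp i.2

end HistorySketch
```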
