LeanMachineLearning exposition

Learning.IsBayesAlgEnvSeq.hasLaw_IT_hist

This page shows the declaration's own card first, then its dependency graph, then one card per dependency (type dependencies first, followed by the rest of the transitive closure). For a theorem, the graph and the dependency cards follow only the dependencies of its statement: the proof is treated as sorry, because what a theorem proves does not depend on how it is proved. For every other kind of declaration, both the type and the body/value are followed, since later declarations build on their content.

hasLaw_IT_hist

Lemma Learning.IsBayesAlgEnvSeq.hasLaw_IT_hist

No docstring.

theorem
Learning.IsBayesAlgEnvSeq.hasLaw_IT_hist.{u_1, u_2, u_3, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {𝓨 : Type u_3} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω] {Q : MeasureTheory.Measure 𝓔} {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) 𝓨} {alg : Algorithm 𝓐 𝓨} {E : Ω → 𝓔} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] (h : IsBayesAlgEnvSeq Q κ alg E A Y P) (n : ℕ) : ∀ᵐ (e : 𝓔) ∂Q, ProbabilityTheory.HasLaw (IT.hist n) (𝓛[history A Y n | E; P] e) (𝓛[trajectory A Y | E; P] e)

Code

lemma hasLaw_IT_hist (h : IsBayesAlgEnvSeq Q κ alg E A Y P) (n : ℕ) :
    ∀ᵐ e ∂Q, HasLaw (IT.hist n) (condDistrib (history A Y n) E P e)
      (condDistrib (trajectory A Y) E P e)
Type uses (5)
Body uses (2)
Used by (1)
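
As a hedged usage sketch, the statement can be unpacked into an equality of measures: for Q-almost every e, pushing the conditional law of the trajectory forward through IT.hist n gives the conditional law of the history. The snippet assumes that Mathlib and the library are already imported (the module path is not shown on this page, so no import line is given), that `open Learning` brings the names into scope, and that HasLaw exposes the field map_eq in the way the proof of this lemma uses it. It has not been checked against a specific version of the library.

```lean
-- Sketch only: assumes Mathlib and the library module declaring these names
-- are already imported; the module path is not shown on this page.
open MeasureTheory ProbabilityTheory Learning

variable {𝓔 𝓐 𝓨 Ω : Type*}
  [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω]
  {Q : Measure 𝓔} {κ : Kernel (𝓔 × 𝓐) 𝓨} {alg : Algorithm 𝓐 𝓨}
  {E : Ω → 𝓔} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨}
  {P : Measure Ω} [IsFiniteMeasure P]
  [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨]

-- For Q-almost every e, the pushforward of the conditional law of the
-- trajectory under `IT.hist n` is the conditional law of the history.
example (h : IsBayesAlgEnvSeq Q κ alg E A Y P) (n : ℕ) :
    ∀ᵐ e ∂Q, Measure.map (IT.hist n) (condDistrib (trajectory A Y) E P e)
      = condDistrib (history A Y n) E P e := by
  filter_upwards [h.hasLaw_IT_hist n] with e he
  exact he.map_eq
```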

Proof
by
  rw [← h.hasLaw_env.map_eq, show history A Y n = IT.hist n ∘ trajectory A Y from rfl]
  filter_upwards [condDistrib_comp E
    (measurable_trajectory h.measurable_action h.measurable_feedback).aemeasurable
    (IT.measurable_hist n)] with _ he
  exact ⟨(IT.measurable_hist n).aemeasurable, by
    rw [← Kernel.map_apply _ (IT.measurable_hist n), he]⟩
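
The first rewrite in the proof relies on history A Y n being definitionally equal to IT.hist n ∘ trajectory A Y. A minimal sketch of that identity follows. To depend only on Mathlib, it uses local primed copies (trajectory', hist', history', all hypothetical names) of the three definitions from the dependency cards rather than the library's own declarations.

```lean
import Mathlib

open Finset

variable {𝓐 𝓨 Ω : Type*}

/-- Local copy of `Learning.trajectory`: the sequence of action-feedback pairs. -/
def trajectory' (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (ω : Ω) : ℕ → 𝓐 × 𝓨 :=
  fun n ↦ (A n ω, Y n ω)

/-- Local copy of `Learning.IT.hist`: restriction of a sequence to indices `≤ n`. -/
def hist' (n : ℕ) (h : ℕ → 𝓐 × 𝓨) : Iic n → 𝓐 × 𝓨 := fun i ↦ h i

/-- Local copy of `Learning.history`: the action-feedback pairs up to time `n`. -/
def history' (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) : Iic n → 𝓐 × 𝓨 :=
  fun i ↦ (A i ω, Y i ω)

-- Both sides unfold to `fun ω i ↦ (A i ω, Y i ω)`, so `rfl` closes the goal.
example (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) :
    history' A Y n = hist' n ∘ trajectory' A Y := rfl
```

Because the equality is definitional, the proof can rewrite the history as a measurable image of the trajectory and then apply a composition lemma for conditional distributions.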

Dependency graph

Type dependencies (5)

Algorithm

Structure Learning.Algorithm

A stochastic, sequential algorithm.

structure
Learning.Algorithm.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)

Code

structure Algorithm (𝓐 𝓨 : Type*) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] where
  /-- Policy or sampling rule: distribution of the next action. -/
  policy : (n : ℕ) → Kernel (Iic n → 𝓐 × 𝓨) 𝓐
  /-- The policy is a Markov kernel. -/
  [h_policy : ∀ n, IsMarkovKernel (policy n)]
  /-- Distribution of the first action. -/
  p0 : Measure 𝓐
  /-- The first action distribution is a probability measure. -/
  [hp0 : IsProbabilityMeasure p0]
Used by (216)
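
To make the fields concrete, here is a hedged sketch of the simplest possible instance: an algorithm that ignores the history and always draws the next action from a fixed probability measure μ. It works on a local copy of the structure (Algorithm', a hypothetical name) so that it depends only on Mathlib. The name constAlg' is likewise an illustration and is not claimed to exist in the library.

```lean
import Mathlib

open MeasureTheory ProbabilityTheory Finset

/-- Local copy of the `Algorithm` structure shown above. -/
structure Algorithm' (𝓐 𝓨 : Type*) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] where
  policy : (n : ℕ) → Kernel (Iic n → 𝓐 × 𝓨) 𝓐
  [h_policy : ∀ n, IsMarkovKernel (policy n)]
  p0 : Measure 𝓐
  [hp0 : IsProbabilityMeasure p0]

/-- Hypothetical example: every action is drawn from `μ`, whatever the history. -/
noncomputable def constAlg' {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]
    (μ : Measure 𝓐) [IsProbabilityMeasure μ] : Algorithm' 𝓐 𝓨 where
  -- The constant kernel maps every history to the same measure `μ`.
  policy _ := Kernel.const _ μ
  -- A constant kernel at a probability measure is a Markov kernel.
  h_policy _ := inferInstance
  -- The first action has the same law; `hp0` is found by instance search.
  p0 := μ
```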

IsBayesAlgEnvSeq

Structure Learning.IsBayesAlgEnvSeq

IsBayesAlgEnvSeq Q κ alg E A Y P states that there is a measure P : Measure Ω such that the parameter E : Ω → 𝓔 has law Q and that the sequences of actions A : ℕ → Ω → 𝓐 and feedbacks Y : ℕ → Ω → 𝓨 are generated by the algorithm alg : Algorithm 𝓐 𝓨 interacting with an underlying environment that depends on E and κ (stationaryEnv (κ.sectR (E ω))).

structure
Learning.IsBayesAlgEnvSeq.{u_1, u_2, u_3, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {𝓨 : Type u_3} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω] (Q : MeasureTheory.Measure 𝓔) (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) 𝓨) (alg : Algorithm 𝓐 𝓨) (E : Ω → 𝓔) (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (P : MeasureTheory.Measure Ω) [MeasureTheory.IsFiniteMeasure P] : Prop

Code

structure IsBayesAlgEnvSeq
    (Q : Measure 𝓔) (κ : Kernel (𝓔 × 𝓐) 𝓨) (alg : Algorithm 𝓐 𝓨)
    (E : Ω → 𝓔) (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨)
    (P : Measure Ω) [IsFiniteMeasure P] : Prop where
  measurable_param : Measurable E := by fun_prop
  measurable_action n : Measurable (A n) := by fun_prop
  measurable_feedback n : Measurable (Y n) := by fun_prop
  hasLaw_env : HasLaw E Q P
  hasCondDistrib_action_zero : HasCondDistrib (A 0) E (Kernel.const _ alg.p0) P
  hasCondDistrib_feedback_zero : HasCondDistrib (Y 0) (fun ω ↦ (E ω, A 0 ω)) κ P
  hasCondDistrib_action n :
    HasCondDistrib (A (n + 1)) (fun ω ↦ (E ω, history A Y n ω))
      ((alg.policy n).prodMkLeft _) P
  hasCondDistrib_feedback n :
    HasCondDistrib (Y (n + 1)) (fun ω ↦ (history A Y n ω, E ω, A (n + 1) ω))
      (κ.prodMkLeft _) P
Type uses (2)
Used by (22)
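
A short, hedged sketch of how the fields are consumed: given a hypothesis h of this structure, the prior law of the parameter and the measurability of each action can be read off directly. As with any snippet that uses the library's own names, it assumes that Mathlib and the library are already imported (the module path is not shown on this page) and that `open Learning` brings the names into scope.

```lean
-- Sketch only: assumes Mathlib and the library module declaring these names
-- are already imported; the module path is not shown on this page.
open MeasureTheory ProbabilityTheory Learning

variable {𝓔 𝓐 𝓨 Ω : Type*}
  [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω]
  {Q : Measure 𝓔} {κ : Kernel (𝓔 × 𝓐) 𝓨} {alg : Algorithm 𝓐 𝓨}
  {E : Ω → 𝓔} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨}
  {P : Measure Ω} [IsFiniteMeasure P]

-- `hasLaw_env` says that the pushforward of `P` under `E` is the prior `Q`.
example (h : IsBayesAlgEnvSeq Q κ alg E A Y P) : P.map E = Q :=
  h.hasLaw_env.map_eq

-- Measurability of each action is a field of the structure.
example (h : IsBayesAlgEnvSeq Q κ alg E A Y P) (n : ℕ) : Measurable (A n) :=
  h.measurable_action n
```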

hist

Definition Learning.IT.hist

hist n is the history up to time n. This is a random variable on the measurable space ℕ → 𝓐 × 𝓨.

def
Learning.IT.hist.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} (n : ℕ) (h : ℕ → 𝓐 × 𝓨) : ↥(Finset.Iic n) → 𝓐 × 𝓨

Code

def hist (n : ℕ) (h : ℕ → 𝓐 × 𝓨) : Iic n → 𝓐 × 𝓨 := fun i ↦ h i
Used by (23)

history

Definition Learning.history

History of the algorithm-environment sequence up to time n.

def
Learning.history.{u_1, u_2, u_3} {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) : ↥(Finset.Iic n) → 𝓐 × 𝓨

Code

def history (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) : Iic n → 𝓐 × 𝓨 :=
  fun i ↦ (A i ω, Y i ω)
Used by (72)

trajectory

Definition Learning.trajectory

A random variable that gives the sequence of action-feedback pairs.

def
Learning.trajectory.{u_1, u_2, u_3} {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (ω : Ω) : ℕ → 𝓐 × 𝓨

Code

def trajectory (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (ω : Ω) : ℕ → 𝓐 × 𝓨 := fun n ↦ (A n ω, Y n ω)
Used by (18)
