LeanMachineLearning exposition

Learning.IsAlgEnvSeq.hasLaw_history_withDensityπŸ”—

This page has the declaration's own card below, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards only follow its statement's dependencies (its proof is replaced by sorry, so what it proves doesn't depend on how); for everything else, both the type and the body/value are followed, since their content is part of what later declarations build on.

Minimal Lean file

hasLaw_history_withDensityπŸ”—

LemmaLearning.IsAlgEnvSeq.hasLaw_history_withDensity

No docstring.

πŸ”—theorem
Learning.IsAlgEnvSeq.hasLaw_history_withDensity.{u_1, u_2, u_3, u_4} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] {Ξ© : Type u_3} [MeasurableSpace Ξ©] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {A : β„• β†’ Ξ© β†’ 𝓐} {Y : β„• β†’ Ξ© β†’ 𝓨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] {Ξ©β‚€ : Type u_4} [MeasurableSpace Ξ©β‚€] {algβ‚€ : Algorithm 𝓐 𝓨} {Aβ‚€ : β„• β†’ Ξ©β‚€ β†’ 𝓐} {Yβ‚€ : β„• β†’ Ξ©β‚€ β†’ 𝓨} {Pβ‚€ : MeasureTheory.Measure Ξ©β‚€} [MeasureTheory.IsProbabilityMeasure Pβ‚€] [MeasurableSpace.CountablyGenerated 𝓐] (h : IsAlgEnvSeq A Y alg env P) (hβ‚€ : IsAlgEnvSeq Aβ‚€ Yβ‚€ algβ‚€ env Pβ‚€) (hc : Algorithm.AbsolutelyContinuous alg algβ‚€) (n : β„•) : ProbabilityTheory.HasLaw (history A Y n) (MeasureTheory.Measure.withDensity (MeasureTheory.Measure.map (history Aβ‚€ Yβ‚€ n) Pβ‚€) (Algorithm.density alg algβ‚€ n)) P
Learning.IsAlgEnvSeq.hasLaw_history_withDensity.{u_1, u_2, u_3, u_4} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] {Ξ© : Type u_3} [MeasurableSpace Ξ©] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {A : β„• β†’ Ξ© β†’ 𝓐} {Y : β„• β†’ Ξ© β†’ 𝓨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] {Ξ©β‚€ : Type u_4} [MeasurableSpace Ξ©β‚€] {algβ‚€ : Algorithm 𝓐 𝓨} {Aβ‚€ : β„• β†’ Ξ©β‚€ β†’ 𝓐} {Yβ‚€ : β„• β†’ Ξ©β‚€ β†’ 𝓨} {Pβ‚€ : MeasureTheory.Measure Ξ©β‚€} [MeasureTheory.IsProbabilityMeasure Pβ‚€] [MeasurableSpace.CountablyGenerated 𝓐] (h : IsAlgEnvSeq A Y alg env P) (hβ‚€ : IsAlgEnvSeq Aβ‚€ Yβ‚€ algβ‚€ env Pβ‚€) (hc : Algorithm.AbsolutelyContinuous alg algβ‚€) (n : β„•) : ProbabilityTheory.HasLaw (history A Y n) (MeasureTheory.Measure.withDensity (MeasureTheory.Measure.map (history Aβ‚€ Yβ‚€ n) Pβ‚€) (Algorithm.density alg algβ‚€ n)) P

Code

lemma hasLaw_history_withDensity (h : IsAlgEnvSeq A Y alg env P)
    (hβ‚€ : IsAlgEnvSeq Aβ‚€ Yβ‚€ algβ‚€ env Pβ‚€) (hc : alg β‰ͺₐ algβ‚€) (n : β„•) : HasLaw (history A Y n)
      ((Pβ‚€.map (history Aβ‚€ Yβ‚€ n)).withDensity (alg.density algβ‚€ n)) P where
  aemeasurable
Type uses (6)
Body uses (20)
Used by (1)

Actions: Source Β· Open Issue

Proof
(h.measurable_history n).aemeasurable
  map_eq := by
    induction n with
    | zero =>
      rw [h.hasLaw_history_zero.map_eq, hβ‚€.hasLaw_history_zero.map_eq, h.hasLaw_step_zero.map_eq,
        hβ‚€.hasLaw_step_zero.map_eq]
      rw [← Measure.withDensity_rnDeriv_eq _ _ hc.p0,
        Measure.compProd_withDensity_left (by fun_prop)]
      exact map_equiv_withDensity (by fun_prop)
    | succ n ih =>
      let ρ h' (ar : 𝓐 Γ— 𝓨) := Kernel.rnDeriv (alg.policy n) (algβ‚€.policy n) h' ar.1
      have hs : stepKernel alg env n = (stepKernel algβ‚€ env n).withDensity ρ := by
        rw [stepKernel, ← Kernel.withDensity_rnDeriv_eq' (hc.policy n)]
        exact Kernel.compProd_withDensity_left (Kernel.measurable_rnDeriv _ _)
      have : IsMarkovKernel ((stepKernel algβ‚€ env n).withDensity ρ) := by
        rw [← hs]
        infer_instance
      simp_rw [history_succ]
      rw [← Measure.map_map (by fun_prop), ← Measure.map_map (by fun_prop)]
      rotate_left
      Β· exact (hβ‚€.measurable_history n).prodMk (hβ‚€.measurable_step (n + 1))
      Β· exact (h.measurable_history n).prodMk (h.measurable_step (n + 1))
      rw [(h.hasCondDistrib_step n).map_eq, (hβ‚€.hasCondDistrib_step n).map_eq, ih, hs,
        Measure.compProd_withDensity_withDensity (by fun_prop) (by fun_prop)]
      exact map_equiv_withDensity (by fun_prop)

Dependency graph

Type dependencies (6)

AlgorithmπŸ”—

StructureLearning.Algorithm

A stochastic, sequential algorithm.

πŸ”—structure
Learning.Algorithm.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)
Learning.Algorithm.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)

Code

structure Algorithm (𝓐 𝓨 : Type*) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] where
  /-- Policy or sampling rule: distribution of the next action. -/
  policy : (n : β„•) β†’ Kernel (Iic n β†’ 𝓐 Γ— 𝓨) 𝓐
  /-- The policy is a Markov kernel. -/
  [h_policy : βˆ€ n, IsMarkovKernel (policy n)]
  /-- Distribution of the first action. -/
  p0 : Measure 𝓐
  /-- The first action distribution is a probability measure. -/
  [hp0 : IsProbabilityMeasure p0]
Used by (216)

Actions: Source Β· Open Issue

EnvironmentπŸ”—

StructureLearning.Environment

A stochastic environment.

πŸ”—structure
Learning.Environment.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)
Learning.Environment.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)

Code

structure Environment (𝓐 𝓨 : Type*) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] where
  /-- Distribution of the next observation as function of the past history. -/
  feedback : (n : β„•) β†’ Kernel ((Iic n β†’ 𝓐 Γ— 𝓨) Γ— 𝓐) 𝓨
  /-- The feedback kernels are Markov kernels. -/
  [h_feedback : βˆ€ n, IsMarkovKernel (feedback n)]
  /-- Distribution of the first observation given the first action. -/
  Ξ½0 : Kernel 𝓐 𝓨
  /-- The initial observation kernel is a Markov kernel. -/
  [hp0 : IsMarkovKernel Ξ½0]
Used by (128)

Actions: Source Β· Open Issue

IsAlgEnvSeqπŸ”—

StructureLearning.IsAlgEnvSeq

An algorithm-environment sequence: a sequence of actions and feedbacks generated by an algorithm interacting with an environment.

πŸ”—structure
Learning.IsAlgEnvSeq.{u_1, u_2, u_3} {𝓐 : Type u_1} {𝓨 : Type u_2} {Ξ© : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΞ© : MeasurableSpace Ξ©} (A : β„• β†’ Ξ© β†’ 𝓐) (Y : β„• β†’ Ξ© β†’ 𝓨) (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) (P : MeasureTheory.Measure Ξ©) [MeasureTheory.IsFiniteMeasure P] : Prop
Learning.IsAlgEnvSeq.{u_1, u_2, u_3} {𝓐 : Type u_1} {𝓨 : Type u_2} {Ξ© : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {mΞ© : MeasurableSpace Ξ©} (A : β„• β†’ Ξ© β†’ 𝓐) (Y : β„• β†’ Ξ© β†’ 𝓨) (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) (P : MeasureTheory.Measure Ξ©) [MeasureTheory.IsFiniteMeasure P] : Prop

Code

structure IsAlgEnvSeq
    (A : β„• β†’ Ξ© β†’ 𝓐) (Y : β„• β†’ Ξ© β†’ 𝓨) (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨)
    (P : Measure Ξ©) [IsFiniteMeasure P] : Prop where
  /-- The action sequence is measurable. -/
  measurable_action n : Measurable (A n) := by fun_prop
  /-- The feedback sequence is measurable. -/
  measurable_feedback n : Measurable (Y n) := by fun_prop
  /-- The first action has the correct law. -/
  hasLaw_action_zero : HasLaw (fun Ο‰ ↦ (A 0 Ο‰)) alg.p0 P
  /-- The first feedback has the correct conditional distribution. -/
  hasCondDistrib_feedback_zero : HasCondDistrib (Y 0) (A 0) env.Ξ½0 P
  /-- The next action has the correct conditional distribution given the history. -/
  hasCondDistrib_action n :
    HasCondDistrib (A (n + 1)) (history A Y n) (alg.policy n) P
  /-- The next feedback has the correct conditional distribution given the history and
  next action. -/
  hasCondDistrib_feedback n :
    HasCondDistrib (Y (n + 1)) (fun Ο‰ ↦ (history A Y n Ο‰, A (n + 1) Ο‰))
      (env.feedback n) P
Type uses (3)
Used by (111)

Actions: Source Β· Open Issue

AbsolutelyContinuousπŸ”—

StructureLearning.Algorithm.AbsolutelyContinuous

For every time and history, the distribution over actions according to alg is absolutely continuous with respect to the distribution over actions according to algβ‚€.

πŸ”—structure
Learning.Algorithm.AbsolutelyContinuous.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] (alg algβ‚€ : Algorithm 𝓐 𝓨) : Prop
Learning.Algorithm.AbsolutelyContinuous.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] (alg algβ‚€ : Algorithm 𝓐 𝓨) : Prop

Code

structure AbsolutelyContinuous (alg algβ‚€ : Algorithm 𝓐 𝓨) : Prop where
  p0 : alg.p0 β‰ͺ algβ‚€.p0
  policy n h : alg.policy n h β‰ͺ algβ‚€.policy n h
Type uses (1)
Used by (7)

Actions: Source Β· Open Issue

historyπŸ”—

DefinitionLearning.history

History of the algorithm-environment sequence up to time n.

πŸ”—def
Learning.history.{u_1, u_2, u_3} {𝓐 : Type u_1} {𝓨 : Type u_2} {Ξ© : Type u_3} (A : β„• β†’ Ξ© β†’ 𝓐) (Y : β„• β†’ Ξ© β†’ 𝓨) (n : β„•) (Ο‰ : Ξ©) : β†₯(Finset.Iic n) β†’ 𝓐 Γ— 𝓨
Learning.history.{u_1, u_2, u_3} {𝓐 : Type u_1} {𝓨 : Type u_2} {Ξ© : Type u_3} (A : β„• β†’ Ξ© β†’ 𝓐) (Y : β„• β†’ Ξ© β†’ 𝓨) (n : β„•) (Ο‰ : Ξ©) : β†₯(Finset.Iic n) β†’ 𝓐 Γ— 𝓨

Code

def history (A : β„• β†’ Ξ© β†’ 𝓐) (Y : β„• β†’ Ξ© β†’ 𝓨) (n : β„•) (Ο‰ : Ξ©) : Iic n β†’ 𝓐 Γ— 𝓨 :=
  fun i ↦ (A i Ο‰, Y i Ο‰)
Used by (72)

Actions: Source Β· Open Issue

densityπŸ”—

DefinitionLearning.Algorithm.density

If the algorithm alg is absolutely continuous with respect to the algorithm algβ‚€ and they are both interacting with the same environment, then the law of the history at time n under alg is the law of the history at time n under algβ‚€ with density alg.density algβ‚€ n.

πŸ”—def
Learning.Algorithm.density.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace.CountablyGenerated 𝓐] (alg algβ‚€ : Algorithm 𝓐 𝓨) (n : β„•) : (β†₯(Finset.Iic n) β†’ 𝓐 Γ— 𝓨) β†’ ENNReal
Learning.Algorithm.density.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace.CountablyGenerated 𝓐] (alg algβ‚€ : Algorithm 𝓐 𝓨) (n : β„•) : (β†₯(Finset.Iic n) β†’ 𝓐 Γ— 𝓨) β†’ ENNReal

Code

noncomputable
def density [MeasurableSpace.CountablyGenerated 𝓐] (alg algβ‚€ : Algorithm 𝓐 𝓨) :
    (n : β„•) β†’ (Iic n β†’ 𝓐 Γ— 𝓨) β†’ ℝβ‰₯0∞
  | 0, h => (alg.p0.rnDeriv algβ‚€.p0 (h ⟨0, by simp⟩).1)
  | n + 1, h =>
    let p := MeasurableEquiv.IicSuccProd (fun _ ↦ 𝓐 Γ— 𝓨) n h
    alg.density algβ‚€ n p.1 * (alg.policy n).rnDeriv (algβ‚€.policy n) p.1 p.2.1
Type uses (1)
Body uses (1)
Used by (5)

Actions: Source Β· Open Issue

All dependencies, transitively (1)

IicSuccProdπŸ”—

DefinitionMeasurableEquiv.IicSuccProd

Measurable equivalence between a product up to n + 1 and the pair of the product up to n and the space at n + 1.

πŸ”—def
MeasurableEquiv.IicSuccProd.{u_3} (X : β„• β†’ Type u_3) [(n : β„•) β†’ MeasurableSpace (X n)] (n : β„•) : ((i : β†₯(Finset.Iic (n + 1))) β†’ X ↑i) ≃ᡐ ((i : β†₯(Finset.Iic n)) β†’ X ↑i) Γ— X (n + 1)
MeasurableEquiv.IicSuccProd.{u_3} (X : β„• β†’ Type u_3) [(n : β„•) β†’ MeasurableSpace (X n)] (n : β„•) : ((i : β†₯(Finset.Iic (n + 1))) β†’ X ↑i) ≃ᡐ ((i : β†₯(Finset.Iic n)) β†’ X ↑i) Γ— X (n + 1)

Code

def _root_.MeasurableEquiv.IicSuccProd (X : β„• β†’ Type*) [βˆ€ n, MeasurableSpace (X n)] (n : β„•) :
    MeasurableEquiv (Ξ  i : Iic (n + 1), X i) ((Ξ  i : Iic n, X i) Γ— X (n + 1)) :=
  (MeasurableEquiv.IicProdIoc (Nat.le_succ n)).symm.trans
    (MeasurableEquiv.prodCongr (MeasurableEquiv.refl _) (MeasurableEquiv.piSingleton n).symm)
Used by (11)

Actions: Source Β· Open Issue