LeanMachineLearning exposition

3.9.Β SequentialLearning.AlgorithmDensityπŸ”—

Algorithm density

We define a density function that allows obtaining the law of the history under one algorithm from the law of the history under another algorithm when they are interacting with the same environment. This also requires one algorithm to be absolutely continuous with respect to another, a concept that we also introduce here.

Main definitions

  • AbsolutelyContinuous alg algβ‚€: alg is absolutely continuous with respect to algβ‚€ (also denoted alg β‰ͺₐ algβ‚€) when, in every situation, a set of actions with probability zero under algβ‚€ also has probability zero under alg. Intuitively, alg never acts in a way that algβ‚€ would never act.

  • density alg algβ‚€ n: a density function that allows obtaining the law of the history at time n under alg from the law of the history at time n under algβ‚€ when they are interacting with the same environment and alg β‰ͺₐ algβ‚€.

Main results

  • absolutelyContinuous_map_hist: the law of the history at time n under alg is absolutely continuous with respect to the law of the history at time n under algβ‚€ when they are interacting with the same environment and alg β‰ͺₐ algβ‚€.

  • hasLaw_history_withDensity: the law of the history at time n under alg is the law of the history at time n under algβ‚€ with density alg.density algβ‚€ n when they are interacting with the same environment and alg β‰ͺₐ algβ‚€.

Module LeanMachineLearning.SequentialLearning.AlgorithmDensity contains 6 exposed declarations.

AbsolutelyContinuousπŸ”—

StructureLearning.Algorithm.AbsolutelyContinuous

For every time and history, the distribution over actions according to alg is absolutely continuous with respect to the distribution over actions according to algβ‚€.

πŸ”—structure
Learning.Algorithm.AbsolutelyContinuous.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] (alg algβ‚€ : Algorithm 𝓐 𝓨) : Prop
Learning.Algorithm.AbsolutelyContinuous.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] (alg algβ‚€ : Algorithm 𝓐 𝓨) : Prop

Code

structure AbsolutelyContinuous (alg algβ‚€ : Algorithm 𝓐 𝓨) : Prop where
  p0 : alg.p0 β‰ͺ algβ‚€.p0
  policy n h : alg.policy n h β‰ͺ algβ‚€.policy n h
Type uses (1)
Used by (7)

Actions: Source Β· Open Issue

term_β‰ͺₐ_πŸ”—

DefinitionLearning.Algorithm.Β«term_β‰ͺₐ_Β»

For every time and history, the distribution over actions according to alg is absolutely continuous with respect to the distribution over actions according to algβ‚€.

πŸ”—def
Learning.Algorithm.Β«term_β‰ͺₐ_Β» : Lean.TrailingParserDescr
Learning.Algorithm.Β«term_β‰ͺₐ_Β» : Lean.TrailingParserDescr

Code

scoped notation:50 alg " β‰ͺₐ " algβ‚€ => AbsolutelyContinuous alg algβ‚€
Body uses (1)

Actions: Source Β· Open Issue

densityπŸ”—

DefinitionLearning.Algorithm.density

If the algorithm alg is absolutely continuous with respect to the algorithm algβ‚€ and they are both interacting with the same environment, then the law of the history at time n under alg is the law of the history at time n under algβ‚€ with density alg.density algβ‚€ n.

πŸ”—def
Learning.Algorithm.density.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace.CountablyGenerated 𝓐] (alg algβ‚€ : Algorithm 𝓐 𝓨) (n : β„•) : (β†₯(Finset.Iic n) β†’ 𝓐 Γ— 𝓨) β†’ ENNReal
Learning.Algorithm.density.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace.CountablyGenerated 𝓐] (alg algβ‚€ : Algorithm 𝓐 𝓨) (n : β„•) : (β†₯(Finset.Iic n) β†’ 𝓐 Γ— 𝓨) β†’ ENNReal

Code

noncomputable
def density [MeasurableSpace.CountablyGenerated 𝓐] (alg algβ‚€ : Algorithm 𝓐 𝓨) :
    (n : β„•) β†’ (Iic n β†’ 𝓐 Γ— 𝓨) β†’ ℝβ‰₯0∞
  | 0, h => (alg.p0.rnDeriv algβ‚€.p0 (h ⟨0, by simp⟩).1)
  | n + 1, h =>
    let p := MeasurableEquiv.IicSuccProd (fun _ ↦ 𝓐 Γ— 𝓨) n h
    alg.density algβ‚€ n p.1 * (alg.policy n).rnDeriv (algβ‚€.policy n) p.1 p.2.1
Type uses (1)
Body uses (1)
Used by (5)

Actions: Source Β· Open Issue

measurable_densityπŸ”—

LemmaLearning.Algorithm.measurable_density

No docstring.

πŸ”—theorem
Learning.Algorithm.measurable_density.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace.CountablyGenerated 𝓐] (alg algβ‚€ : Algorithm 𝓐 𝓨) (n : β„•) : Measurable (density alg algβ‚€ n)
Learning.Algorithm.measurable_density.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace.CountablyGenerated 𝓐] (alg algβ‚€ : Algorithm 𝓐 𝓨) (n : β„•) : Measurable (density alg algβ‚€ n)

Code

lemma measurable_density [MeasurableSpace.CountablyGenerated 𝓐] (alg algβ‚€ : Algorithm 𝓐 𝓨) (n : β„•) :
    Measurable (alg.density algβ‚€ n)
Type uses (2)
Body uses (1)
Used by (4)

Actions: Source Β· Open Issue

Proof
by
  induction n with
  | zero => simp_rw [density]; fun_prop
  | succ n ih => simp_rw [density]; fun_prop

absolutelyContinuous_map_historyπŸ”—

LemmaLearning.IsAlgEnvSeq.absolutelyContinuous_map_history

No docstring.

πŸ”—theorem
Learning.IsAlgEnvSeq.absolutelyContinuous_map_history.{u_1, u_2, u_3, u_4} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] {Ξ© : Type u_3} [MeasurableSpace Ξ©] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {A : β„• β†’ Ξ© β†’ 𝓐} {Y : β„• β†’ Ξ© β†’ 𝓨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] {Ξ©β‚€ : Type u_4} [MeasurableSpace Ξ©β‚€] {algβ‚€ : Algorithm 𝓐 𝓨} {Aβ‚€ : β„• β†’ Ξ©β‚€ β†’ 𝓐} {Yβ‚€ : β„• β†’ Ξ©β‚€ β†’ 𝓨} {Pβ‚€ : MeasureTheory.Measure Ξ©β‚€} [MeasureTheory.IsProbabilityMeasure Pβ‚€] (h : IsAlgEnvSeq A Y alg env P) (hβ‚€ : IsAlgEnvSeq Aβ‚€ Yβ‚€ algβ‚€ env Pβ‚€) (hc : Algorithm.AbsolutelyContinuous alg algβ‚€) (n : β„•) : MeasureTheory.Measure.AbsolutelyContinuous (MeasureTheory.Measure.map (history A Y n) P) (MeasureTheory.Measure.map (history Aβ‚€ Yβ‚€ n) Pβ‚€)
Learning.IsAlgEnvSeq.absolutelyContinuous_map_history.{u_1, u_2, u_3, u_4} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] {Ξ© : Type u_3} [MeasurableSpace Ξ©] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {A : β„• β†’ Ξ© β†’ 𝓐} {Y : β„• β†’ Ξ© β†’ 𝓨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] {Ξ©β‚€ : Type u_4} [MeasurableSpace Ξ©β‚€] {algβ‚€ : Algorithm 𝓐 𝓨} {Aβ‚€ : β„• β†’ Ξ©β‚€ β†’ 𝓐} {Yβ‚€ : β„• β†’ Ξ©β‚€ β†’ 𝓨} {Pβ‚€ : MeasureTheory.Measure Ξ©β‚€} [MeasureTheory.IsProbabilityMeasure Pβ‚€] (h : IsAlgEnvSeq A Y alg env P) (hβ‚€ : IsAlgEnvSeq Aβ‚€ Yβ‚€ algβ‚€ env Pβ‚€) (hc : Algorithm.AbsolutelyContinuous alg algβ‚€) (n : β„•) : MeasureTheory.Measure.AbsolutelyContinuous (MeasureTheory.Measure.map (history A Y n) P) (MeasureTheory.Measure.map (history Aβ‚€ Yβ‚€ n) Pβ‚€)

Code

lemma absolutelyContinuous_map_history (h : IsAlgEnvSeq A Y alg env P)
    (hβ‚€ : IsAlgEnvSeq Aβ‚€ Yβ‚€ algβ‚€ env Pβ‚€) (hc : alg β‰ͺₐ algβ‚€) (n : β„•) :
    P.map (history A Y n) β‰ͺ Pβ‚€.map (history Aβ‚€ Yβ‚€ n)
Type uses (5)
Body uses (13)

Actions: Source Β· Open Issue

Proof
by
  induction n with
  | zero =>
    rw [h.hasLaw_history_zero.map_eq, hβ‚€.hasLaw_history_zero.map_eq]
    apply Measure.AbsolutelyContinuous.map _ (by fun_prop)
    rw [h.hasLaw_step_zero.map_eq, hβ‚€.hasLaw_step_zero.map_eq]
    exact Measure.AbsolutelyContinuous.compProd_left hc.p0 _
  | succ n ih =>
    simp_rw [history_succ]
    rw [← Measure.map_map (by fun_prop), ← Measure.map_map (by fun_prop)]
    rotate_left
    Β· exact (hβ‚€.measurable_history n).prodMk (hβ‚€.measurable_step (n + 1))
    Β· exact (h.measurable_history n).prodMk (h.measurable_step (n + 1))
    apply Measure.AbsolutelyContinuous.map _ (by fun_prop)
    rw [(h.hasCondDistrib_step n).map_eq, (hβ‚€.hasCondDistrib_step n).map_eq]
    apply Measure.AbsolutelyContinuous.compProd ih
    filter_upwards with h' using Measure.AbsolutelyContinuous.compProd_left_apply (hc.policy n h') _

hasLaw_history_withDensityπŸ”—

LemmaLearning.IsAlgEnvSeq.hasLaw_history_withDensity

No docstring.

πŸ”—theorem
Learning.IsAlgEnvSeq.hasLaw_history_withDensity.{u_1, u_2, u_3, u_4} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] {Ξ© : Type u_3} [MeasurableSpace Ξ©] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {A : β„• β†’ Ξ© β†’ 𝓐} {Y : β„• β†’ Ξ© β†’ 𝓨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] {Ξ©β‚€ : Type u_4} [MeasurableSpace Ξ©β‚€] {algβ‚€ : Algorithm 𝓐 𝓨} {Aβ‚€ : β„• β†’ Ξ©β‚€ β†’ 𝓐} {Yβ‚€ : β„• β†’ Ξ©β‚€ β†’ 𝓨} {Pβ‚€ : MeasureTheory.Measure Ξ©β‚€} [MeasureTheory.IsProbabilityMeasure Pβ‚€] [MeasurableSpace.CountablyGenerated 𝓐] (h : IsAlgEnvSeq A Y alg env P) (hβ‚€ : IsAlgEnvSeq Aβ‚€ Yβ‚€ algβ‚€ env Pβ‚€) (hc : Algorithm.AbsolutelyContinuous alg algβ‚€) (n : β„•) : ProbabilityTheory.HasLaw (history A Y n) (MeasureTheory.Measure.withDensity (MeasureTheory.Measure.map (history Aβ‚€ Yβ‚€ n) Pβ‚€) (Algorithm.density alg algβ‚€ n)) P
Learning.IsAlgEnvSeq.hasLaw_history_withDensity.{u_1, u_2, u_3, u_4} {𝓐 : Type u_1} {𝓨 : Type u_2} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] {Ξ© : Type u_3} [MeasurableSpace Ξ©] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {A : β„• β†’ Ξ© β†’ 𝓐} {Y : β„• β†’ Ξ© β†’ 𝓨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsFiniteMeasure P] {Ξ©β‚€ : Type u_4} [MeasurableSpace Ξ©β‚€] {algβ‚€ : Algorithm 𝓐 𝓨} {Aβ‚€ : β„• β†’ Ξ©β‚€ β†’ 𝓐} {Yβ‚€ : β„• β†’ Ξ©β‚€ β†’ 𝓨} {Pβ‚€ : MeasureTheory.Measure Ξ©β‚€} [MeasureTheory.IsProbabilityMeasure Pβ‚€] [MeasurableSpace.CountablyGenerated 𝓐] (h : IsAlgEnvSeq A Y alg env P) (hβ‚€ : IsAlgEnvSeq Aβ‚€ Yβ‚€ algβ‚€ env Pβ‚€) (hc : Algorithm.AbsolutelyContinuous alg algβ‚€) (n : β„•) : ProbabilityTheory.HasLaw (history A Y n) (MeasureTheory.Measure.withDensity (MeasureTheory.Measure.map (history Aβ‚€ Yβ‚€ n) Pβ‚€) (Algorithm.density alg algβ‚€ n)) P

Code

lemma hasLaw_history_withDensity (h : IsAlgEnvSeq A Y alg env P)
    (hβ‚€ : IsAlgEnvSeq Aβ‚€ Yβ‚€ algβ‚€ env Pβ‚€) (hc : alg β‰ͺₐ algβ‚€) (n : β„•) : HasLaw (history A Y n)
      ((Pβ‚€.map (history Aβ‚€ Yβ‚€ n)).withDensity (alg.density algβ‚€ n)) P where
  aemeasurable
Type uses (6)
Body uses (20)
Used by (1)

Actions: Source Β· Open Issue

Proof
(h.measurable_history n).aemeasurable
  map_eq := by
    induction n with
    | zero =>
      rw [h.hasLaw_history_zero.map_eq, hβ‚€.hasLaw_history_zero.map_eq, h.hasLaw_step_zero.map_eq,
        hβ‚€.hasLaw_step_zero.map_eq]
      rw [← Measure.withDensity_rnDeriv_eq _ _ hc.p0,
        Measure.compProd_withDensity_left (by fun_prop)]
      exact map_equiv_withDensity (by fun_prop)
    | succ n ih =>
      let ρ h' (ar : 𝓐 Γ— 𝓨) := Kernel.rnDeriv (alg.policy n) (algβ‚€.policy n) h' ar.1
      have hs : stepKernel alg env n = (stepKernel algβ‚€ env n).withDensity ρ := by
        rw [stepKernel, ← Kernel.withDensity_rnDeriv_eq' (hc.policy n)]
        exact Kernel.compProd_withDensity_left (Kernel.measurable_rnDeriv _ _)
      have : IsMarkovKernel ((stepKernel algβ‚€ env n).withDensity ρ) := by
        rw [← hs]
        infer_instance
      simp_rw [history_succ]
      rw [← Measure.map_map (by fun_prop), ← Measure.map_map (by fun_prop)]
      rotate_left
      Β· exact (hβ‚€.measurable_history n).prodMk (hβ‚€.measurable_step (n + 1))
      Β· exact (h.measurable_history n).prodMk (h.measurable_step (n + 1))
      rw [(h.hasCondDistrib_step n).map_eq, (hβ‚€.hasCondDistrib_step n).map_eq, ih, hs,
        Measure.compProd_withDensity_withDensity (by fun_prop) (by fun_prop)]
      exact map_equiv_withDensity (by fun_prop)
  1. Learning.Algorithm.AbsolutelyContinuous
  2. Learning.Algorithm.Β«term_β‰ͺₐ_Β»
  3. Learning.Algorithm.density
  4. Learning.Algorithm.measurable_density
  5. Learning.IsAlgEnvSeq.absolutelyContinuous_map_history
  6. Learning.IsAlgEnvSeq.hasLaw_history_withDensity