LeanMachineLearning exposition

Learning.bayesStationaryEnvπŸ”—

This page has the declaration's own card below, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards only follow its statement's dependencies (its proof is replaced by sorry, so what it proves doesn't depend on how); for everything else, both the type and the body/value are followed, since their content is part of what later declarations build on.

Minimal Lean file

bayesStationaryEnvπŸ”—

DefinitionLearning.bayesStationaryEnv

An environment with observations in 𝓔 Γ— 𝓨. The first element e of an observation is sampled from Q once and remains constant. The second element of an observation is sampled from ΞΊ (e, a), where a is the corresponding action.

πŸ”—def
Learning.bayesStationaryEnv.{u_1, u_2, u_3} {𝓔 : Type u_1} {𝓐 : Type u_2} {𝓨 : Type u_3} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] (Q : MeasureTheory.Measure 𝓔) [MeasureTheory.IsProbabilityMeasure Q] (ΞΊ : ProbabilityTheory.Kernel (𝓔 Γ— 𝓐) 𝓨) [ProbabilityTheory.IsMarkovKernel ΞΊ] : Environment 𝓐 (𝓔 Γ— 𝓨)
Learning.bayesStationaryEnv.{u_1, u_2, u_3} {𝓔 : Type u_1} {𝓐 : Type u_2} {𝓨 : Type u_3} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] (Q : MeasureTheory.Measure 𝓔) [MeasureTheory.IsProbabilityMeasure Q] (ΞΊ : ProbabilityTheory.Kernel (𝓔 Γ— 𝓐) 𝓨) [ProbabilityTheory.IsMarkovKernel ΞΊ] : Environment 𝓐 (𝓔 Γ— 𝓨)

Code

noncomputable
def bayesStationaryEnv (Q : Measure 𝓔) [IsProbabilityMeasure Q] (ΞΊ : Kernel (𝓔 Γ— 𝓐) 𝓨)
    [IsMarkovKernel ΞΊ] : Environment 𝓐 (𝓔 Γ— 𝓨) where
  feedback n :=
    let g : (Iic n β†’ 𝓐 Γ— 𝓔 Γ— 𝓨) Γ— 𝓐 β†’ 𝓔 Γ— 𝓐 := fun (h, a) => ((h ⟨0, by simp⟩).2.1, a)
    (Kernel.deterministic (Prod.fst ∘ g) (by fun_prop)) Γ—β‚– (ΞΊ.comap g (by fun_prop))
  Ξ½0 := (Kernel.const _ Q) βŠ—β‚– ΞΊ.swapLeft
Type uses (1)
Used by (4)

Actions: Source Β· Open Issue

Dependency graph

Type dependencies (1)

EnvironmentπŸ”—

StructureLearning.Environment

A stochastic environment.

πŸ”—structure
Learning.Environment.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)
Learning.Environment.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)

Code

structure Environment (𝓐 𝓨 : Type*) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] where
  /-- Distribution of the next observation as function of the past history. -/
  feedback : (n : β„•) β†’ Kernel ((Iic n β†’ 𝓐 Γ— 𝓨) Γ— 𝓐) 𝓨
  /-- The feedback kernels are Markov kernels. -/
  [h_feedback : βˆ€ n, IsMarkovKernel (feedback n)]
  /-- Distribution of the first observation given the first action. -/
  Ξ½0 : Kernel 𝓐 𝓨
  /-- The initial observation kernel is a Markov kernel. -/
  [hp0 : IsMarkovKernel Ξ½0]
Used by (128)

Actions: Source Β· Open Issue