LeanMachineLearning exposition

Learning.feedbackCondAction_obliviousEnvπŸ”—

This page has the declaration's own card below, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards only follow its statement's dependencies (its proof is replaced by sorry, so what it proves doesn't depend on how); for everything else, both the type and the body/value are followed, since their content is part of what later declarations build on.

Minimal Lean file

feedbackCondAction_obliviousEnvπŸ”—

LemmaLearning.feedbackCondAction_obliviousEnv

No docstring.

πŸ”—theorem
Learning.feedbackCondAction_obliviousEnv.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (Ξ½ : β„• β†’ ProbabilityTheory.Kernel 𝓐 𝓨) [hΞ½ : βˆ€ (n : β„•), ProbabilityTheory.IsMarkovKernel (Ξ½ n)] (n : β„•) : feedbackCondAction (obliviousEnv Ξ½) n = Ξ½ n
Learning.feedbackCondAction_obliviousEnv.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (Ξ½ : β„• β†’ ProbabilityTheory.Kernel 𝓐 𝓨) [hΞ½ : βˆ€ (n : β„•), ProbabilityTheory.IsMarkovKernel (Ξ½ n)] (n : β„•) : feedbackCondAction (obliviousEnv Ξ½) n = Ξ½ n

Code

lemma feedbackCondAction_obliviousEnv (Ξ½ : β„• β†’ Kernel 𝓐 𝓨) [hΞ½ : βˆ€ n, IsMarkovKernel (Ξ½ n)]
    (n : β„•) :
    feedbackCondAction (obliviousEnv Ξ½) n = Ξ½ n
Type uses (3)
Body uses (6)
Used by (2)

Actions: Source Β· Open Issue

Proof
by
  rcases isEmpty_or_nonempty 𝓐 with h𝓐 | h𝓐
  Β· ext a : 1
    exact h𝓐.elim a
  rcases isEmpty_or_nonempty 𝓨 with hR | hR
  Β· refine absurd (hΞ½ 0) ?_
    simp only [Subsingleton.eq_zero Ξ½, Pi.zero_apply]
    exact Kernel.not_isMarkovKernel_zero
  have : Nonempty (Iic n β†’ 𝓐 Γ— 𝓨) := ⟨fun _ ↦ (h𝓐.some, hR.some)⟩
  have h_eq_zero := Ξ½0_eq_feedbackCondAction (obliviousEnv Ξ½)
  have h_eq := feedback_eq_feedbackCondAction (obliviousEnv Ξ½) (n - 1)
  cases n with
  | zero => exact h_eq_zero.symm
  | succ n =>
    simp only [Nat.add_one_sub_one, obliviousEnv_feedback, add_tsub_cancel_right] at h_eq
    rw [← Kernel.prodMkLeft_inj (Ξ³ := Iic n β†’ 𝓐 Γ— 𝓨)]
    exact h_eq.symm

Dependency graph

Type dependencies (3)

feedbackCondActionπŸ”—

DefinitionLearning.feedbackCondAction

The kernel representing the conditional distribution of the feedback given the action at time n in an oblivious environment.

πŸ”—def
Learning.feedbackCondAction.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (env : Environment 𝓐 𝓨) [h_obl : IsObliviousEnv env] (n : β„•) : ProbabilityTheory.Kernel 𝓐 𝓨
Learning.feedbackCondAction.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (env : Environment 𝓐 𝓨) [h_obl : IsObliviousEnv env] (n : β„•) : ProbabilityTheory.Kernel 𝓐 𝓨

Code

noncomputable
def feedbackCondAction (env : Environment 𝓐 𝓨) [h_obl : IsObliviousEnv env] (n : β„•) : Kernel 𝓐 𝓨 :=
  h_obl.exists_eq_prodMkLeft.choose n
Type uses (2)
Used by (12)

Actions: Source Β· Open Issue

obliviousEnvπŸ”—

DefinitionLearning.obliviousEnv

An oblivious environment, in which the distribution of the next feedback depends only on the last action, but in a possibly time-dependent manner.

πŸ”—def
Learning.obliviousEnv.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (Ξ½ : β„• β†’ ProbabilityTheory.Kernel 𝓐 𝓨) [βˆ€ (n : β„•), ProbabilityTheory.IsMarkovKernel (Ξ½ n)] : Environment 𝓐 𝓨
Learning.obliviousEnv.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (Ξ½ : β„• β†’ ProbabilityTheory.Kernel 𝓐 𝓨) [βˆ€ (n : β„•), ProbabilityTheory.IsMarkovKernel (Ξ½ n)] : Environment 𝓐 𝓨

Code

def obliviousEnv (Ξ½ : β„• β†’ Kernel 𝓐 𝓨) [βˆ€ n, IsMarkovKernel (Ξ½ n)] : Environment 𝓐 𝓨 where
  feedback n := (Ξ½ (n + 1)).prodMkLeft _
  Ξ½0 := Ξ½ 0
Type uses (1)
Used by (10)

Actions: Source Β· Open Issue

instIsObliviousEnvObliviousEnvπŸ”—

InstanceLearning.instIsObliviousEnvObliviousEnv

No docstring.

πŸ”—theorem
Learning.instIsObliviousEnvObliviousEnv.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (Ξ½ : β„• β†’ ProbabilityTheory.Kernel 𝓐 𝓨) [βˆ€ (n : β„•), ProbabilityTheory.IsMarkovKernel (Ξ½ n)] : IsObliviousEnv (obliviousEnv Ξ½)
Learning.instIsObliviousEnvObliviousEnv.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (Ξ½ : β„• β†’ ProbabilityTheory.Kernel 𝓐 𝓨) [βˆ€ (n : β„•), ProbabilityTheory.IsMarkovKernel (Ξ½ n)] : IsObliviousEnv (obliviousEnv Ξ½)

Code

instance (Ξ½ : β„• β†’ Kernel 𝓐 𝓨) [βˆ€ n, IsMarkovKernel (Ξ½ n)] :
    IsObliviousEnv (obliviousEnv Ξ½) where
  exists_eq_prodMkLeft
Type uses (2)
Body uses (1)
Used by (1)

Actions: Source Β· Open Issue

Proof
⟨fun n ↦ Ξ½ n, inferInstance,rfl, fun _ ↦ rfl⟩

All dependencies, transitively (2)

EnvironmentπŸ”—

StructureLearning.Environment

A stochastic environment.

πŸ”—structure
Learning.Environment.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)
Learning.Environment.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)

Code

structure Environment (𝓐 𝓨 : Type*) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] where
  /-- Distribution of the next observation as function of the past history. -/
  feedback : (n : β„•) β†’ Kernel ((Iic n β†’ 𝓐 Γ— 𝓨) Γ— 𝓐) 𝓨
  /-- The feedback kernels are Markov kernels. -/
  [h_feedback : βˆ€ n, IsMarkovKernel (feedback n)]
  /-- Distribution of the first observation given the first action. -/
  Ξ½0 : Kernel 𝓐 𝓨
  /-- The initial observation kernel is a Markov kernel. -/
  [hp0 : IsMarkovKernel Ξ½0]
Used by (128)

Actions: Source Β· Open Issue

IsObliviousEnvπŸ”—

Type ClassLearning.IsObliviousEnv

An environment is oblivious if the distribution of the next feedback depends only on the last action and not on the past history.

πŸ”—type class
Learning.IsObliviousEnv.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (env : Environment 𝓐 𝓨) : Prop
Learning.IsObliviousEnv.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (env : Environment 𝓐 𝓨) : Prop

Code

class IsObliviousEnv (env : Environment 𝓐 𝓨) : Prop where
  exists_eq_prodMkLeft : βˆƒ Ξ½ : β„• β†’ Kernel 𝓐 𝓨, (βˆ€ n, IsMarkovKernel (Ξ½ n)) ∧
    (env.Ξ½0 = Ξ½ 0) ∧ (βˆ€ n, env.feedback n = (Ξ½ (n + 1)).prodMkLeft _)
Type uses (1)
Used by (13)

Actions: Source Β· Open Issue