LeanMachineLearning exposition

Learning.instIsDeterministicEnvEvalEnv

This page shows the declaration's own card first, then its dependency graph, then one card per dependency: type dependencies first, followed by the rest of the transitive closure. For a theorem, the graph and the dependency cards follow only the dependencies of its statement; the proof is replaced by sorry, since what a theorem proves does not depend on how it is proved. For every other kind of declaration, both the type and the body (value) are followed, because later declarations build on their content.

instIsDeterministicEnvEvalEnv

Instance Learning.instIsDeterministicEnvEvalEnv

No docstring.

theorem
Learning.instIsDeterministicEnvEvalEnv.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {f : 𝓐 → 𝓨} {hf : Measurable f} : IsDeterministicEnv (evalEnv f hf)

Code

instance : IsDeterministicEnv (evalEnv f hf)

Proof
by unfold evalEnv; infer_instance
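As a usage sketch, typeclass resolution should find this instance on its own once the library is in scope. The snippet assumes that the LeanMachineLearning module defining Learning.evalEnv has been imported; its module path is not shown on this page, so no import line is given.

```lean
-- Assumption: the LeanMachineLearning module defining Learning.evalEnv is imported here.
-- Its module path is not given on this page, so no import line is shown.
open Learning

-- The instance is found by typeclass resolution; f and hf are read off the goal.
example {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]
    (f : 𝓐 → 𝓨) (hf : Measurable f) :
    IsDeterministicEnv (evalEnv f hf) :=
  inferInstance
```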

Dependency graph

Type dependencies (2)

IsDeterministicEnv

Type Class Learning.IsDeterministicEnv

An environment is deterministic if its initial feedbacks are determined by measurable functions (and not possibly random kernels).

type class
Learning.IsDeterministicEnv.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (env : Environment 𝓐 𝓨) : Prop

Code

class IsDeterministicEnv (env : Environment 𝓐 𝓨) : Prop where
  exists_f0 : ∃ (f0 : 𝓐 → 𝓨) (hf0 : Measurable f0), env.ν0 = Kernel.deterministic f0 hf0
  exists_f : ∀ n, ∃ (f : ((Iic n → 𝓐 × 𝓨) × 𝓐) → 𝓨) (hf : Measurable f),
    env.feedback n = Kernel.deterministic f hf
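To make "determined by measurable functions" concrete: in Mathlib, Kernel.deterministic f hf sends each point a to the Dirac measure at f a, so a kernel of this form carries no randomness. The check below is a minimal sketch that uses only Mathlib and assumes a recent version in which Kernel.deterministic_apply has this shape.

```lean
import Mathlib

open MeasureTheory ProbabilityTheory

-- A deterministic kernel evaluated at a is the Dirac mass at f a.
example {α β : Type*} [MeasurableSpace α] [MeasurableSpace β]
    (f : α → β) (hf : Measurable f) (a : α) :
    Kernel.deterministic f hf a = Measure.dirac (f a) :=
  Kernel.deterministic_apply hf a
```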

evalEnv

Definition Learning.evalEnv

The evaluation environment where the feedback is given by evaluating a fixed measurable function f at the chosen action.

def
Learning.evalEnv.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (f : 𝓐 → 𝓨) (hf : Measurable f) : Environment 𝓐 𝓨

Code

noncomputable def evalEnv (f : 𝓐 → 𝓨) (hf : Measurable f) := onlineEvalEnv (fun _ ↦ f) (fun _ ↦ hf)
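Since evalEnv is defined as onlineEvalEnv applied to constant families, the two should agree definitionally. The following sketch checks this with rfl; it assumes the LeanMachineLearning module defining these declarations is imported (module path not shown on this page).

```lean
-- Assumption: the LeanMachineLearning module defining evalEnv and onlineEvalEnv is imported here.
open Learning

-- evalEnv uses the same function f, with the same measurability proof, at every round.
example {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]
    (f : 𝓐 → 𝓨) (hf : Measurable f) :
    evalEnv f hf = onlineEvalEnv (fun _ ↦ f) (fun _ ↦ hf) :=
  rfl
```

This is why the instance for evalEnv can be proved by unfolding the definition and reusing the instance for onlineEvalEnv.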

All dependencies, transitively (4)

Environment

Structure Learning.Environment

A stochastic environment.

structure
Learning.Environment.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)

Code

structure Environment (𝓐 𝓨 : Type*) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] where
  /-- Distribution of the next observation as function of the past history. -/
  feedback : (n : ℕ) → Kernel ((Iic n → 𝓐 × 𝓨) × 𝓐) 𝓨
  /-- The feedback kernels are Markov kernels. -/
  [h_feedback : ∀ n, IsMarkovKernel (feedback n)]
  /-- Distribution of the first observation given the first action. -/
  ν0 : Kernel 𝓐 𝓨
  /-- The initial observation kernel is a Markov kernel. -/
  [hp0 : IsMarkovKernel ν0]

obliviousEnv

Definition Learning.obliviousEnv

An oblivious environment, in which the distribution of the next feedback depends only on the last action, but in a possibly time-dependent manner.

def
Learning.obliviousEnv.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (ν : ℕ → ProbabilityTheory.Kernel 𝓐 𝓨) [∀ (n : ℕ), ProbabilityTheory.IsMarkovKernel (ν n)] : Environment 𝓐 𝓨

Code

def obliviousEnv (ν : ℕ → Kernel 𝓐 𝓨) [∀ n, IsMarkovKernel (ν n)] : Environment 𝓐 𝓨 where
  feedback n := (ν (n + 1)).prodMkLeft _
  ν0 := ν 0
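The definition shifts indices by one: ν 0 serves as the initial kernel, and ν (n + 1) drives the feedback at step n while ignoring the history. A small sketch that restates both fields, assuming the LeanMachineLearning module defining obliviousEnv is imported (module path not shown on this page):

```lean
-- Assumption: the LeanMachineLearning module defining obliviousEnv is imported here.
open Learning ProbabilityTheory

-- The initial kernel is ν 0.
example {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]
    (ν : ℕ → Kernel 𝓐 𝓨) [∀ n, IsMarkovKernel (ν n)] :
    (obliviousEnv ν).ν0 = ν 0 :=
  rfl

-- The feedback at step n is ν (n + 1), lifted to ignore the history component.
example {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]
    (ν : ℕ → Kernel 𝓐 𝓨) [∀ n, IsMarkovKernel (ν n)] (n : ℕ) :
    (obliviousEnv ν).feedback n = (ν (n + 1)).prodMkLeft _ :=
  rfl
```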

onlineEvalEnv

Definition Learning.onlineEvalEnv

The evaluation environment where the feedback is given by evaluating a fixed measurable function f at the chosen action.

def
Learning.onlineEvalEnv.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (g : ℕ → 𝓐 → 𝓨) (hg : ∀ (n : ℕ), Measurable (g n)) : Environment 𝓐 𝓨

Code

noncomputable def onlineEvalEnv (g : ℕ → 𝓐 → 𝓨) (hg : ∀ n, Measurable (g n)) :=
  obliviousEnv (fun n ↦ Kernel.deterministic (g n) (hg n))
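Unfolding the definition, the initial kernel of onlineEvalEnv g hg should be the deterministic kernel of g 0, with g indexed by the round. The sketch below checks this with rfl, assuming the LeanMachineLearning module defining onlineEvalEnv is imported (module path not shown on this page).

```lean
-- Assumption: the LeanMachineLearning module defining onlineEvalEnv is imported here.
open Learning ProbabilityTheory

-- The first feedback is obtained by evaluating g 0 at the first action.
example {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]
    (g : ℕ → 𝓐 → 𝓨) (hg : ∀ n, Measurable (g n)) :
    (onlineEvalEnv g hg).ν0 = Kernel.deterministic (g 0) (hg 0) :=
  rfl
```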

instIsDeterministicEnvOnlineEvalEnv

Instance Learning.instIsDeterministicEnvOnlineEvalEnv

No docstring.

theorem
Learning.instIsDeterministicEnvOnlineEvalEnv.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {g : ℕ → 𝓐 → 𝓨} {hg : ∀ (n : ℕ), Measurable (g n)} : IsDeterministicEnv (onlineEvalEnv g hg)

Code

instance : IsDeterministicEnv (onlineEvalEnv g hg) where
  exists_f0 := ⟨g 0, hg 0, rfl⟩
  exists_f n := ⟨fun p ↦ g (n + 1) p.2, by fun_prop, rfl⟩
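The measurability goal that fun_prop discharges in exists_f can also be written out by hand: the witness is g (n + 1) composed with the projection onto the action component of the (history, action) pair. A minimal sketch that uses only Mathlib; the explicit term is an illustration, not the library's own proof.

```lean
import Mathlib

open Finset

-- Reading the action out of a (history, action) pair and applying g (n + 1) is measurable.
example {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]
    (g : ℕ → 𝓐 → 𝓨) (hg : ∀ n, Measurable (g n)) (n : ℕ) :
    Measurable (fun p : (Iic n → 𝓐 × 𝓨) × 𝓐 ↦ g (n + 1) p.2) :=
  (hg (n + 1)).comp measurable_snd
```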