Learning.instIsDeterministicEnvEvalEnv
This page has the declaration's own card below, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards only follow its statement's dependencies (its proof is replaced by sorry, so what it proves doesn't depend on how); for everything else, both the type and the body/value are followed, since their content is part of what later declarations build on.
instIsDeterministicEnvEvalEnv๐
Learning.instIsDeterministicEnvEvalEnvNo docstring.
Learning.instIsDeterministicEnvEvalEnv.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {f : ๐ โ ๐จ} {hf : Measurable f} : IsDeterministicEnv (evalEnv f hf)Learning.instIsDeterministicEnvEvalEnv.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {f : ๐ โ ๐จ} {hf : Measurable f} : IsDeterministicEnv (evalEnv f hf)
Code
instance : IsDeterministicEnv (evalEnv f hf)
Type uses (2)
Body uses (2)
Used by (2)
Actions: Source ยท Open Issue
Proof
by unfold evalEnv; infer_instance
Dependency graph
Type dependencies (2)
IsDeterministicEnv๐
Learning.IsDeterministicEnvAn environment is deterministic if its initial feedbacks are determined by measurable functions (and not possibly random kernels).
Learning.IsDeterministicEnv.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) : PropLearning.IsDeterministicEnv.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) : Prop
Code
class IsDeterministicEnv (env : Environment ๐ ๐จ) : Prop where
exists_f0 : โ (f0 : ๐ โ ๐จ) (hf0 : Measurable f0), env.ฮฝ0 = Kernel.deterministic f0 hf0
exists_f : โ n, โ (f : ((Iic n โ ๐ ร ๐จ) ร ๐) โ ๐จ) (hf : Measurable f),
env.feedback n = Kernel.deterministic f hfType uses (1)
Used by (11)
Actions: Source ยท Open Issue
evalEnv๐
Learning.evalEnv
The evaluation environment where the feedback is given by evaluating a fixed measurable function
f at the chosen action.
Learning.evalEnv.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (f : ๐ โ ๐จ) (hf : Measurable f) : Environment ๐ ๐จLearning.evalEnv.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (f : ๐ โ ๐จ) (hf : Measurable f) : Environment ๐ ๐จ
Code
noncomputable def evalEnv (f : ๐ โ ๐จ) (hf : Measurable f) := onlineEvalEnv (fun _ โฆ f) (fun _ โฆ hf)
Type uses (1)
Body uses (1)
Used by (9)
Actions: Source ยท Open Issue
All dependencies, transitively (4)
Environment๐
Learning.EnvironmentA stochastic environment.
Learning.Environment.{u_4, u_5} (๐ : Type u_4) (๐จ : Type u_5) [MeasurableSpace ๐] [MeasurableSpace ๐จ] : Type (max u_4 u_5)Learning.Environment.{u_4, u_5} (๐ : Type u_4) (๐จ : Type u_5) [MeasurableSpace ๐] [MeasurableSpace ๐จ] : Type (max u_4 u_5)
Code
structure Environment (๐ ๐จ : Type*) [MeasurableSpace ๐] [MeasurableSpace ๐จ] where /-- Distribution of the next observation as function of the past history. -/ feedback : (n : โ) โ Kernel ((Iic n โ ๐ ร ๐จ) ร ๐) ๐จ /-- The feedback kernels are Markov kernels. -/ [h_feedback : โ n, IsMarkovKernel (feedback n)] /-- Distribution of the first observation given the first action. -/ ฮฝ0 : Kernel ๐ ๐จ /-- The initial observation kernel is a Markov kernel. -/ [hp0 : IsMarkovKernel ฮฝ0]
Actions: Source ยท Open Issue
obliviousEnv๐
Learning.obliviousEnvAn oblivious environment, in which the distribution of the next feedback depends only on the last action, but in a possibly time-dependent manner.
Learning.obliviousEnv.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (ฮฝ : โ โ ProbabilityTheory.Kernel ๐ ๐จ) [โ (n : โ), ProbabilityTheory.IsMarkovKernel (ฮฝ n)] : Environment ๐ ๐จLearning.obliviousEnv.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (ฮฝ : โ โ ProbabilityTheory.Kernel ๐ ๐จ) [โ (n : โ), ProbabilityTheory.IsMarkovKernel (ฮฝ n)] : Environment ๐ ๐จ
Code
def obliviousEnv (ฮฝ : โ โ Kernel ๐ ๐จ) [โ n, IsMarkovKernel (ฮฝ n)] : Environment ๐ ๐จ where feedback n := (ฮฝ (n + 1)).prodMkLeft _ ฮฝ0 := ฮฝ 0
Type uses (1)
Used by (10)
Actions: Source ยท Open Issue
onlineEvalEnv๐
Learning.onlineEvalEnv
The evaluation environment where the feedback is given by evaluating a fixed measurable function
f at the chosen action.
Learning.onlineEvalEnv.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (g : โ โ ๐ โ ๐จ) (hg : โ (n : โ), Measurable (g n)) : Environment ๐ ๐จLearning.onlineEvalEnv.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (g : โ โ ๐ โ ๐จ) (hg : โ (n : โ), Measurable (g n)) : Environment ๐ ๐จ
Code
noncomputable def onlineEvalEnv (g : โ โ ๐ โ ๐จ) (hg : โ n, Measurable (g n)) := obliviousEnv (fun n โฆ Kernel.deterministic (g n) (hg n))
Type uses (1)
Body uses (1)
Used by (11)
Actions: Source ยท Open Issue
instIsDeterministicEnvOnlineEvalEnv๐
Learning.instIsDeterministicEnvOnlineEvalEnvNo docstring.
Learning.instIsDeterministicEnvOnlineEvalEnv.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {g : โ โ ๐ โ ๐จ} {hg : โ (n : โ), Measurable (g n)} : IsDeterministicEnv (onlineEvalEnv g hg)Learning.instIsDeterministicEnvOnlineEvalEnv.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {g : โ โ ๐ โ ๐จ} {hg : โ (n : โ), Measurable (g n)} : IsDeterministicEnv (onlineEvalEnv g hg)
Code
instance : IsDeterministicEnv (onlineEvalEnv g hg) where exists_f0
Type uses (2)
Body uses (1)
Actions: Source ยท Open Issue
Proof
โจg 0, hg 0, rflโฉ exists_f n := โจfun p โฆ g (n + 1) p.2, by fun_prop, rflโฉ