Bandits.TS.initialPolicy
This page shows the declaration's own card, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards follow only the dependencies of its statement: the proof is replaced by sorry, so what the theorem proves does not depend on how it is proved. For every other declaration, both the type and the body/value are followed, since their content is part of what later declarations build on.
initialPolicy
Bandits.TS.initialPolicy
The initial action is sampled according to its probability of being optimal under the prior over environments.
Bandits.TS.initialPolicy.{u_1} {K : ℕ} {𝓔 : Type u_1} [MeasurableSpace 𝓔] (hK : 0 < K) (Q : MeasureTheory.Measure 𝓔) (κ : ProbabilityTheory.Kernel (𝓔 × Fin K) ℝ) : MeasureTheory.Measure (Fin K)
Code
noncomputable def TS.initialPolicy (hK : 0 < K) (Q : Measure 𝓔) (κ : Kernel (𝓔 × Fin K) ℝ) :
    Measure (Fin K) :=
  have : Nonempty (Fin K) := Fin.pos_iff_nonempty.mp hK
  Q.map (bestAction κ id)
Body uses (1)
Used by (2)
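The body `Q.map (bestAction κ id)` is a pushforward: the prior `Q` over environments is transported along the map that sends each environment to its best action. The following is a minimal sketch of what a pushforward assigns to a set, stated for a generic measurable function `f` rather than for `bestAction κ id` itself, whose measurability is not shown on this page and is assumed here. `Measure.map_apply` is the Mathlib lemma; the variable names are illustrative.

```lean
import Mathlib

open MeasureTheory

-- For a measurable `f`, the pushforward `μ.map f` gives a measurable set `s`
-- the `μ`-mass of its preimage under `f`.
-- Reading `μ` as the prior `Q` and `f` as `bestAction κ id` (an assumption:
-- this needs `bestAction κ id` to be measurable), the mass of `{a}` is the
-- prior probability that `a` is the best action.
example {α β : Type*} [MeasurableSpace α] [MeasurableSpace β]
    (μ : Measure α) {f : α → β} (hf : Measurable f) {s : Set β} (hs : MeasurableSet s) :
    μ.map f s = μ (f ⁻¹' s) :=
  Measure.map_apply hf hs
```

Note that Mathlib's `Measure.map` returns the zero measure when the function is not almost everywhere measurable, so any statement about `initialPolicy` that uses this formula would need a measurability hypothesis or lemma for `bestAction`.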
Dependency graph
All dependencies, transitively (5)
max
Function.max
The maximum value of a tuple.
Function.max.{u_1, u_2} {ι : Type u_1} {α : Type u_2} [LinearOrder α] [Fintype ι] [Nonempty ι] (f : ι → α) : α
Code
abbrev max : α := univ.sup' univ_nonempty f
Used by (8)
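Since `Function.max f` abbreviates `univ.sup' univ_nonempty f`, the Mathlib lemmas about `Finset.sup'` apply to it directly. As a small sketch, written against the unfolded expression so that it relies only on Mathlib and not on this project's namespace, every entry of the tuple is bounded by the maximum:

```lean
import Mathlib

-- Each entry `f i` is at most the `sup'` of `f` over all indices.
-- `Finset.le_sup'` is the Mathlib lemma; the nonemptiness proof it produces
-- agrees with `Finset.univ_nonempty` by proof irrelevance.
example {ι α : Type*} [LinearOrder α] [Fintype ι] [Nonempty ι] (f : ι → α) (i : ι) :
    f i ≤ Finset.univ.sup' Finset.univ_nonempty f :=
  Finset.le_sup' f (Finset.mem_univ i)
```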
exists_argmax
exists_argmax
No docstring.
exists_argmax.{u_1, u_2} {ι : Type u_1} {α : Type u_2} [LinearOrder α] [Fintype ι] [Nonempty ι] (f : ι → α) : ∃ i, f i = Function.max f
Code
lemma exists_argmax : ∃ i, f i = f.max
Type uses (1)
Used by (3)
Proof
by
  obtain ⟨i, -, hi⟩ := Finset.exists_mem_eq_sup' (by simp : Finset.univ.Nonempty) f
  exact ⟨i, hi.symm⟩
argmax
argmax
The index of the maximum value of a tuple.
argmax.{u_1, u_2} {ι : Type u_1} {α : Type u_2} [LinearOrder α] [Fintype ι] [Nonempty ι] (f : ι → α) : ι
Code
noncomputable def argmax := (exists_argmax f).choose
Body uses (2)
Used by (17)
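Because `argmax` is obtained with `.choose` from an existence proof, the only fact available about it is the witness property of that proof; when several indices attain the maximum, which one is returned is unspecified. A minimal sketch of how that property is recovered, using a hypothesis `h` as a stand-in for `exists_argmax f` and the unfolded form of `Function.max` so that the example depends only on Mathlib:

```lean
import Mathlib

-- `h` plays the role of `exists_argmax f` (a stand-in, not the project's lemma).
-- `h.choose` is the chosen index and `h.choose_spec` states that it attains the maximum.
example {ι α : Type*} [LinearOrder α] [Fintype ι] [Nonempty ι] (f : ι → α)
    (h : ∃ i, f i = Finset.univ.sup' Finset.univ_nonempty f) :
    f h.choose = Finset.univ.sup' Finset.univ_nonempty f :=
  h.choose_spec
```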
actionMean
Learning.IsBayesAlgEnvSeq.actionMean
A random variable that gives the mean feedback of action a.
Learning.IsBayesAlgEnvSeq.actionMean.{u_1, u_2, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) ℝ) (E : Ω → 𝓔) (a : 𝓐) (ω : Ω) : ℝ
Code
noncomputable def actionMean (κ : Kernel (𝓔 × 𝓐) ℝ) (E : Ω → 𝓔) (a : 𝓐) (ω : Ω) : ℝ := (κ (E ω, a))[id]
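In conventional notation, the expression `(κ (E ω, a))[id]` is the expectation of the identity function under the feedback distribution `κ (E ω, a)`, that is, the mean of that distribution. Written out, with $\kappa(E(\omega), a)$ read as a measure on $\mathbb{R}$:

```latex
\[
  \operatorname{actionMean}\,\kappa\,E\,a\,\omega
  \;=\; \int_{\mathbb{R}} x \,\mathrm{d}\bigl(\kappa(E(\omega), a)\bigr)(x)
\]
```

This is a Bochner integral in Mathlib, so by the library's convention it evaluates to 0 when the identity is not integrable with respect to $\kappa(E(\omega), a)$; the formula above is the mean in the usual sense only under that integrability assumption, which is not stated on this page.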
bestAction
Learning.IsBayesAlgEnvSeq.bestAction
A random variable that gives the action with the highest mean feedback.
Learning.IsBayesAlgEnvSeq.bestAction.{u_1, u_2, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [Nonempty 𝓐] [Fintype 𝓐] (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) ℝ) (E : Ω → 𝓔) (ω : Ω) : 𝓐
Code
noncomputable def bestAction [Nonempty 𝓐] [Fintype 𝓐] (κ : Kernel (𝓔 × 𝓐) ℝ) (E : Ω → 𝓔) (ω : Ω) : 𝓐 := argmax (fun a ↦ actionMean κ E a ω)
Body uses (2)
Used by (12)