Learning.IsBayesAlgEnvSeq.gap_eq_sub
This page shows the declaration's own card, then its dependency graph, then a card for each dependency (type dependencies first, followed by the rest of the transitive closure). For a theorem, the graph and the dependency cards follow only the dependencies of its statement: the proof is treated as sorry, because what a theorem proves does not depend on how it is proved. For every other declaration, both the type and the body/value are followed, since their content is part of what later declarations build on.
gap_eq_sub
Learning.IsBayesAlgEnvSeq.gap_eq_sub
No docstring.
Learning.IsBayesAlgEnvSeq.gap_eq_sub.{u_1, u_2, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [Nonempty 𝓐] [Fintype 𝓐] {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) ℝ} {E : Ω → 𝓔} {A : ℕ → Ω → 𝓐} {n : ℕ} {ω : Ω} : gap κ E A n ω = actionMean κ E (bestAction κ E ω) ω - actionMean κ E (A n ω) ω
Code
lemma gap_eq_sub [Nonempty 𝓐] [Fintype 𝓐] {κ : Kernel (𝓔 × 𝓐) ℝ} {E : Ω → 𝓔} {A : ℕ → Ω → 𝓐}
    {n : ℕ} {ω : Ω} : gap κ E A n ω =
    actionMean κ E (bestAction κ E ω) ω - actionMean κ E (A n ω) ω
Type uses (3)
Body uses (2)
Used by (1)
Proof
by
  rw [gap, Bandits.gap]
  congr
  apply le_antisymm
  · exact ciSup_le <| isMaxOn_argmax (fun a ↦ actionMean κ E a ω)
  · exact Finite.le_ciSup (fun a ↦ actionMean κ E a ω) _
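The proof comes down to the fact that a supremum over a finite, nonempty action set is attained at the argmax. The chain of equalities below is a sketch in conventional notation, not part of the source. The shorthand $\mu_a(\omega)$ for `actionMean κ E a ω` and $a^*(\omega)$ for `bestAction κ E ω` is introduced here for illustration. The first line assumes that the section kernel `κ.sectR (E ω)` evaluated at `a` is `κ (E ω, a)`.

```latex
\begin{align*}
\operatorname{gap}(\kappa, E, A, n, \omega)
  &= \sup_{a} \int x \,\kappa\bigl((E(\omega), a), \mathrm{d}x\bigr)
     - \int x \,\kappa\bigl((E(\omega), A_n(\omega)), \mathrm{d}x\bigr) \\
  &= \sup_{a} \mu_a(\omega) - \mu_{A_n(\omega)}(\omega) \\
  &= \mu_{a^*(\omega)}(\omega) - \mu_{A_n(\omega)}(\omega)
\end{align*}
```

The last step is the one the two `le_antisymm` branches establish: the supremum is at most the value at the argmax, and the value at any single action is at most the supremum.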
Dependency graph
Type dependencies (3)
gap
Learning.IsBayesAlgEnvSeq.gap
A random variable that gives the gap at time n.
Learning.IsBayesAlgEnvSeq.gap.{u_1, u_2, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) ℝ) (E : Ω → 𝓔) (A : ℕ → Ω → 𝓐) (n : ℕ) (ω : Ω) : ℝ
Code
noncomputable def gap (κ : Kernel (𝓔 × 𝓐) ℝ) (E : Ω → 𝓔) (A : ℕ → Ω → 𝓐) (n : ℕ) (ω : Ω) : ℝ :=
  Bandits.gap (κ.sectR (E ω)) (A n ω)
Body uses (1)
Used by (10)
actionMean
Learning.IsBayesAlgEnvSeq.actionMean
A random variable that gives the mean feedback of action a.
Learning.IsBayesAlgEnvSeq.actionMean.{u_1, u_2, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) ℝ) (E : Ω → 𝓔) (a : 𝓐) (ω : Ω) : ℝ
Code
noncomputable def actionMean (κ : Kernel (𝓔 × 𝓐) ℝ) (E : Ω → 𝓔) (a : 𝓐) (ω : Ω) : ℝ :=
  (κ (E ω, a))[id]
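In Mathlib notation, `μ[f]` is the integral of `f` against the measure `μ`, so the definition above is the first moment of the feedback distribution selected by the environment `E ω` and the action `a`. The following is a transcription into conventional notation. The symbol $\mu_a(\omega)$ is shorthand introduced here, not a name from the source.

```latex
\mu_a(\omega) \;=\; \int_{\mathbb{R}} x \,\kappa\bigl((E(\omega), a), \mathrm{d}x\bigr)
```

When the identity function is not integrable under that measure, Mathlib's integral returns 0 by convention, so the value is always a well-defined real number.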
bestAction
Learning.IsBayesAlgEnvSeq.bestAction
A random variable that gives the action with the highest mean feedback.
Learning.IsBayesAlgEnvSeq.bestAction.{u_1, u_2, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [Nonempty 𝓐] [Fintype 𝓐] (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) ℝ) (E : Ω → 𝓔) (ω : Ω) : 𝓐
Code
noncomputable def bestAction [Nonempty 𝓐] [Fintype 𝓐] (κ : Kernel (𝓔 × 𝓐) ℝ) (E : Ω → 𝓔) (ω : Ω) : 𝓐 :=
  argmax (fun a ↦ actionMean κ E a ω)
Body uses (2)
Used by (12)
All dependencies, transitively (4)
gap
Bandits.gap
Gap of an action a: difference between the highest mean of the actions and the mean of a.
Bandits.gap.{u_1} {𝓐 : Type u_1} {m𝓐 : MeasurableSpace 𝓐} (ν : ProbabilityTheory.Kernel 𝓐 ℝ) (a : 𝓐) : ℝ
Code
noncomputable def gap (ν : Kernel 𝓐 ℝ) (a : 𝓐) : ℝ := (⨆ i, (ν i)[id]) - (ν a)[id]
Used by (27)
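As a small worked instance of this definition, consider a hypothetical three-action bandit. The means below are made up for illustration and do not come from the source; $\mu_i$ is shorthand for the mean of the distribution $\nu(i)$.

```latex
% Hypothetical means, chosen only for illustration.
\begin{align*}
\mu_1 &= 0.2, \quad \mu_2 = 0.5, \quad \mu_3 = 0.9,
  \qquad \sup_i \mu_i = 0.9 \\
\operatorname{gap}(\nu, 1) &= 0.9 - 0.2 = 0.7 \\
\operatorname{gap}(\nu, 2) &= 0.9 - 0.5 = 0.4 \\
\operatorname{gap}(\nu, 3) &= 0.9 - 0.9 = 0
\end{align*}
```

The definition uses a supremum rather than a maximum, so it is stated without any finiteness assumption on the action type. In the real numbers, Mathlib assigns the value 0 to the supremum of a family that is not bounded above, so the gap reads as "distance to the best mean" only when the means are bounded above.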
max
Function.max
The maximum value of a tuple.
Function.max.{u_1, u_2} {ι : Type u_1} {α : Type u_2} [LinearOrder α] [Fintype ι] [Nonempty ι] (f : ι → α) : α
Code
abbrev max : α := univ.sup' univ_nonempty f
Used by (8)
exists_argmax
exists_argmax
No docstring.
exists_argmax.{u_1, u_2} {ι : Type u_1} {α : Type u_2} [LinearOrder α] [Fintype ι] [Nonempty ι] (f : ι → α) : ∃ i, f i = Function.max f
Code
lemma exists_argmax : ∃ i, f i = f.max
Type uses (1)
Used by (3)
Proof
by
  obtain ⟨i, -, hi⟩ := Finset.exists_mem_eq_sup' (by simp : Finset.univ.Nonempty) f
  exact ⟨i, hi.symm⟩
argmax
argmax
The index of the maximum value of a tuple.
argmax.{u_1, u_2} {ι : Type u_1} {α : Type u_2} [LinearOrder α] [Fintype ι] [Nonempty ι] (f : ι → α) : ι
Code
noncomputable def argmax := (exists_argmax f).choose
Body uses (2)
Used by (17)
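Because `argmax` is extracted from `exists_argmax` with `.choose`, the only fact available about it is the witness property, which should be obtainable as `(exists_argmax f).choose_spec`. In conventional notation, as a sketch rather than a statement taken from the source:

```latex
f(\operatorname{argmax} f) \;=\; \max_{i} f(i) \;\ge\; f(j)
\quad \text{for every index } j
```

When several indices attain the maximum, the definition does not specify which one is returned, so downstream results can rely only on the value at the argmax and not on a tie-breaking rule.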