Bandits.probReal_sum_le_sum_streamMeasure
This page has the declaration's own card below, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards only follow its statement's dependencies (its proof is replaced by sorry, so what it proves doesn't depend on how); for everything else, both the type and the body/value are followed, since their content is part of what later declarations build on.
probReal_sum_le_sum_streamMeasure๐
Bandits.probReal_sum_le_sum_streamMeasureNo docstring.
Bandits.probReal_sum_le_sum_streamMeasure.{u_1} {๐ : Type u_1} {m๐ : MeasurableSpace ๐} {ฮฝ : ProbabilityTheory.Kernel ๐ โ} [ProbabilityTheory.IsMarkovKernel ฮฝ] [Nonempty ๐] [Fintype ๐] {c : NNReal} (hฮฝ : โ (a : ๐), ProbabilityTheory.HasSubgaussianMGF (fun x => x - โซ (x : โ), id x โฮฝ a) c (ฮฝ a)) (a : ๐) (m : โ) : MeasureTheory.Measure.real (streamMeasure ฮฝ) {ฯ | โ s โ Finset.range m, ฯ s (bestArm ฮฝ) โค โ s โ Finset.range m, ฯ s a} โค Real.exp (-โm * gap ฮฝ a ^ 2 / (4 * โc))Bandits.probReal_sum_le_sum_streamMeasure.{u_1} {๐ : Type u_1} {m๐ : MeasurableSpace ๐} {ฮฝ : ProbabilityTheory.Kernel ๐ โ} [ProbabilityTheory.IsMarkovKernel ฮฝ] [Nonempty ๐] [Fintype ๐] {c : NNReal} (hฮฝ : โ (a : ๐), ProbabilityTheory.HasSubgaussianMGF (fun x => x - โซ (x : โ), id x โฮฝ a) c (ฮฝ a)) (a : ๐) (m : โ) : MeasureTheory.Measure.real (streamMeasure ฮฝ) {ฯ | โ s โ Finset.range m, ฯ s (bestArm ฮฝ) โค โ s โ Finset.range m, ฯ s a} โค Real.exp (-โm * gap ฮฝ a ^ 2 / (4 * โc))
Code
lemma probReal_sum_le_sum_streamMeasure [Fintype ๐] {c : โโฅ0}
(hฮฝ : โ a, HasSubgaussianMGF (fun x โฆ x - (ฮฝ a)[id]) c (ฮฝ a)) (a : ๐) (m : โ) :
(streamMeasure ฮฝ).real
{ฯ | โ s โ range m, ฯ s (bestArm ฮฝ) โค โ s โ range m, ฯ s a} โค
Real.exp (-โm * gap ฮฝ a ^ 2 / (4 * c))Type uses (3)
Body uses (9)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
by_cases ha : a = bestArm ฮฝ
ยท simp [ha]
refine (HasSubgaussianMGF.measure_sum_le_sum_le' (cX := fun _ โฆ c) (cY := fun _ โฆ c)
?_ ?_ ?_ ?_ ?_ ?_).trans_eq ?_
ยท exact iIndepFun_eval_streamMeasure'' ฮฝ (bestArm ฮฝ)
ยท exact iIndepFun_eval_streamMeasure'' ฮฝ a
ยท intro i him
simp_rw [integral_eval_streamMeasure]
refine (hฮฝ (bestArm ฮฝ)).congr_identDistrib ?_
exact (identDistrib_eval_eval_id_streamMeasure _ _ _).symm.sub_const _
ยท intro i him
simp_rw [integral_eval_streamMeasure]
refine (hฮฝ a).congr_identDistrib ?_
exact (identDistrib_eval_eval_id_streamMeasure _ _ _).symm.sub_const _
ยท exact indepFun_eval_streamMeasure' ฮฝ (Ne.symm ha)
ยท gcongr 1 with i him
simp_rw [integral_eval_streamMeasure]
exact le_bestArm a
ยท congr 1
simp_rw [integral_eval_streamMeasure]
simp only [id_eq, sum_const, card_range, nsmul_eq_mul, NNReal.coe_mul, NNReal.coe_natCast,
gap_eq_bestArm_sub, neg_mul]
field_simp
ringDependency graph
Type dependencies (3)
streamMeasure๐
Bandits.streamMeasureMeasure of an infinite stream of rewards from each action.
Bandits.streamMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) : MeasureTheory.Measure (โ โ ๐ โ R)Bandits.streamMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) : MeasureTheory.Measure (โ โ ๐ โ R)
Code
noncomputable def streamMeasure (ฮฝ : Kernel ๐ R) : Measure (โ โ ๐ โ R) := Measure.infinitePi fun _ โฆ Measure.infinitePi ฮฝ
Used by (56)
Actions: Source ยท Open Issue
bestArm๐
Bandits.bestArmaction with the highest mean.
Bandits.bestArm.{u_1} {๐ : Type u_1} {m๐ : MeasurableSpace ๐} [Fintype ๐] [Nonempty ๐] (ฮฝ : ProbabilityTheory.Kernel ๐ โ) : ๐Bandits.bestArm.{u_1} {๐ : Type u_1} {m๐ : MeasurableSpace ๐} [Fintype ๐] [Nonempty ๐] (ฮฝ : ProbabilityTheory.Kernel ๐ โ) : ๐
Code
noncomputable def bestArm (ฮฝ : Kernel ๐ โ) : ๐ := (exists_max_image univ (fun a โฆ (ฮฝ a)[id]) (univ_nonempty_iff.mpr inferInstance)).choose
Used by (18)
Actions: Source ยท Open Issue
gap๐
Bandits.gap
Gap of an action a: difference between the highest mean of the actions and the mean of a.
Bandits.gap.{u_1} {๐ : Type u_1} {m๐ : MeasurableSpace ๐} (ฮฝ : ProbabilityTheory.Kernel ๐ โ) (a : ๐) : โBandits.gap.{u_1} {๐ : Type u_1} {m๐ : MeasurableSpace ๐} (ฮฝ : ProbabilityTheory.Kernel ๐ โ) (a : ๐) : โ
Code
noncomputable def gap (ฮฝ : Kernel ๐ โ) (a : ๐) : โ := (โจ i, (ฮฝ i)[id]) - (ฮฝ a)[id]
Used by (27)
Actions: Source ยท Open Issue