LeanMachineLearning exposition

Bandits.probReal_sum_le_sum_streamMeasure๐Ÿ”—

This page has the declaration's own card below, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards only follow its statement's dependencies (its proof is replaced by sorry, so what it proves doesn't depend on how); for everything else, both the type and the body/value are followed, since their content is part of what later declarations build on.

Minimal Lean file

probReal_sum_le_sum_streamMeasure๐Ÿ”—

LemmaBandits.probReal_sum_le_sum_streamMeasure

No docstring.

๐Ÿ”—theorem
Bandits.probReal_sum_le_sum_streamMeasure.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} {ฮฝ : ProbabilityTheory.Kernel ๐“ โ„} [ProbabilityTheory.IsMarkovKernel ฮฝ] [Nonempty ๐“] [Fintype ๐“] {c : NNReal} (hฮฝ : โˆ€ (a : ๐“), ProbabilityTheory.HasSubgaussianMGF (fun x => x - โˆซ (x : โ„), id x โˆ‚ฮฝ a) c (ฮฝ a)) (a : ๐“) (m : โ„•) : MeasureTheory.Measure.real (streamMeasure ฮฝ) {ฯ‰ | โˆ‘ s โˆˆ Finset.range m, ฯ‰ s (bestArm ฮฝ) โ‰ค โˆ‘ s โˆˆ Finset.range m, ฯ‰ s a} โ‰ค Real.exp (-โ†‘m * gap ฮฝ a ^ 2 / (4 * โ†‘c))
Bandits.probReal_sum_le_sum_streamMeasure.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} {ฮฝ : ProbabilityTheory.Kernel ๐“ โ„} [ProbabilityTheory.IsMarkovKernel ฮฝ] [Nonempty ๐“] [Fintype ๐“] {c : NNReal} (hฮฝ : โˆ€ (a : ๐“), ProbabilityTheory.HasSubgaussianMGF (fun x => x - โˆซ (x : โ„), id x โˆ‚ฮฝ a) c (ฮฝ a)) (a : ๐“) (m : โ„•) : MeasureTheory.Measure.real (streamMeasure ฮฝ) {ฯ‰ | โˆ‘ s โˆˆ Finset.range m, ฯ‰ s (bestArm ฮฝ) โ‰ค โˆ‘ s โˆˆ Finset.range m, ฯ‰ s a} โ‰ค Real.exp (-โ†‘m * gap ฮฝ a ^ 2 / (4 * โ†‘c))

Code

lemma probReal_sum_le_sum_streamMeasure [Fintype ๐“] {c : โ„โ‰ฅ0}
    (hฮฝ : โˆ€ a, HasSubgaussianMGF (fun x โ†ฆ x - (ฮฝ a)[id]) c (ฮฝ a)) (a : ๐“) (m : โ„•) :
    (streamMeasure ฮฝ).real
        {ฯ‰ | โˆ‘ s โˆˆ range m, ฯ‰ s (bestArm ฮฝ) โ‰ค โˆ‘ s โˆˆ range m, ฯ‰ s a} โ‰ค
      Real.exp (-โ†‘m * gap ฮฝ a ^ 2 / (4 * c))
Type uses (3)
Body uses (9)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  by_cases ha : a = bestArm ฮฝ
  ยท simp [ha]
  refine (HasSubgaussianMGF.measure_sum_le_sum_le' (cX := fun _ โ†ฆ c) (cY := fun _ โ†ฆ c)
    ?_ ?_ ?_ ?_ ?_ ?_).trans_eq ?_
  ยท exact iIndepFun_eval_streamMeasure'' ฮฝ (bestArm ฮฝ)
  ยท exact iIndepFun_eval_streamMeasure'' ฮฝ a
  ยท intro i him
    simp_rw [integral_eval_streamMeasure]
    refine (hฮฝ (bestArm ฮฝ)).congr_identDistrib ?_
    exact (identDistrib_eval_eval_id_streamMeasure _ _ _).symm.sub_const _
  ยท intro i him
    simp_rw [integral_eval_streamMeasure]
    refine (hฮฝ a).congr_identDistrib ?_
    exact (identDistrib_eval_eval_id_streamMeasure _ _ _).symm.sub_const _
  ยท exact indepFun_eval_streamMeasure' ฮฝ (Ne.symm ha)
  ยท gcongr 1 with i him
    simp_rw [integral_eval_streamMeasure]
    exact le_bestArm a
  ยท congr 1
    simp_rw [integral_eval_streamMeasure]
    simp only [id_eq, sum_const, card_range, nsmul_eq_mul, NNReal.coe_mul, NNReal.coe_natCast,
      gap_eq_bestArm_sub, neg_mul]
    field_simp
    ring

Dependency graph

Type dependencies (3)

streamMeasure๐Ÿ”—

DefinitionBandits.streamMeasure

Measure of an infinite stream of rewards from each action.

๐Ÿ”—def
Bandits.streamMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) : MeasureTheory.Measure (โ„• โ†’ ๐“ โ†’ R)
Bandits.streamMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) : MeasureTheory.Measure (โ„• โ†’ ๐“ โ†’ R)

Code

noncomputable
def streamMeasure (ฮฝ : Kernel ๐“ R) : Measure (โ„• โ†’ ๐“ โ†’ R) :=
  Measure.infinitePi fun _ โ†ฆ Measure.infinitePi ฮฝ
Used by (56)

Actions: Source ยท Open Issue

bestArm๐Ÿ”—

DefinitionBandits.bestArm

action with the highest mean.

๐Ÿ”—def
Bandits.bestArm.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} [Fintype ๐“] [Nonempty ๐“] (ฮฝ : ProbabilityTheory.Kernel ๐“ โ„) : ๐“
Bandits.bestArm.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} [Fintype ๐“] [Nonempty ๐“] (ฮฝ : ProbabilityTheory.Kernel ๐“ โ„) : ๐“

Code

noncomputable def bestArm (ฮฝ : Kernel ๐“ โ„) : ๐“ :=
  (exists_max_image univ (fun a โ†ฆ (ฮฝ a)[id]) (univ_nonempty_iff.mpr inferInstance)).choose
Used by (18)

Actions: Source ยท Open Issue

gap๐Ÿ”—

DefinitionBandits.gap

Gap of an action a: difference between the highest mean of the actions and the mean of a.

๐Ÿ”—def
Bandits.gap.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} (ฮฝ : ProbabilityTheory.Kernel ๐“ โ„) (a : ๐“) : โ„
Bandits.gap.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} (ฮฝ : ProbabilityTheory.Kernel ๐“ โ„) (a : ๐“) : โ„

Code

noncomputable
def gap (ฮฝ : Kernel ๐“ โ„) (a : ๐“) : โ„ := (โจ† i, (ฮฝ i)[id]) - (ฮฝ a)[id]
Used by (27)

Actions: Source ยท Open Issue