`Bandits.prob_sum_le_sqrt_log`

This page shows the declaration's own card first, then its dependency graph, then one card per dependency (type dependencies first, followed by the rest of the transitive closure). For a theorem, the graph and the dependency cards follow only the dependencies of its statement: the proof is replaced by `sorry`, so the statement does not depend on how it is proved. For every other kind of declaration, both the type and the body (or value) are followed, because their content is part of what later declarations build on.

Minimal Lean file

`prob_sum_le_sqrt_log`

Lemma Bandits.prob_sum_le_sqrt_log

Details

No docstring.

theorem

Bandits.prob_sum_le_sqrt_log.{u_1} {𝓐 : Type u_1}
  {m𝓐 : MeasurableSpace 𝓐} {ν : ProbabilityTheory.Kernel 𝓐 ℝ}
  [ProbabilityTheory.IsMarkovKernel ν] {n : ℕ} {σ2 : NNReal}
  (hν :
    ∀ (a : 𝓐),
      ProbabilityTheory.HasSubgaussianMGF
        (fun x => x - ∫ (x : ℝ), id x ∂ν a) σ2 (ν a))
  (hσ2 : σ2 ≠ 0) {c : ℝ} (hc : 0 ≤ c) (a : 𝓐) (k : ℕ) (hk : k ≠ 0) :
  (streamMeasure ν)
      {ω |
        ∑ s ∈ Finset.range k, (ω s a - ∫ (x : ℝ), id x ∂ν a) ≤
          -√(2 * c * ↑k * ↑σ2 * Real.log (↑n + 1))} ≤
    1 / (↑n + 1) ^ c

Code

lemma prob_sum_le_sqrt_log {σ2 : ℝ≥0}
    (hν : ∀ a, HasSubgaussianMGF (fun x ↦ x - (ν a)[id]) σ2 (ν a))
    (hσ2 : σ2 ≠ 0) {c : ℝ} (hc : 0 ≤ c) (a : 𝓐) (k : ℕ) (hk : k ≠ 0) :
    streamMeasure ν
        {ω | (∑ s ∈ range k, (ω s a - (ν a)[id])) ≤ - √(2 * c * k * σ2 * Real.log (n + 1))} ≤
      1 / (n + 1) ^ c
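
Since the declaration has no docstring, the statement can be read in conventional notation roughly as follows. This is an informal transcription, not part of the source: $X_{s,a}$ stands for `ω s a`, $\mu_a$ for `(ν a)[id]`, $\mathbb{P}$ for `streamMeasure ν`, and $\sigma_2$ for `σ2`, the sub-Gaussian parameter from `hν`.

```latex
\[
  \mathbb{P}\!\left( \sum_{s=0}^{k-1} \bigl(X_{s,a} - \mu_a\bigr)
    \le -\sqrt{2\,c\,k\,\sigma_2 \log(n+1)} \right)
  \;\le\; \frac{1}{(n+1)^{c}},
  \qquad c \ge 0,\quad k \ne 0,\quad \sigma_2 \ne 0 .
\]
```

The sum runs over `Finset.range k`, that is, over the first $k$ rewards of action $a$.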

Type uses (1)

streamMeasure

Body uses (4)

Used by (1)

prob_avg_add_sqrt_log_le


Proof

by
  calc
    streamMeasure ν
      {ω | (∑ s ∈ range k, (ω s a - (ν a)[id])) ≤ - √(2 * c * k * σ2 * Real.log (n + 1))}
  _ ≤ ENNReal.ofReal (Real.exp (-(√(2 * c * k * σ2 * Real.log (n + 1))) ^ 2 / (2 * k * σ2))) := by
    rw [← ofReal_measureReal]
    gcongr
    refine (HasSubgaussianMGF.measure_sum_range_le_le_of_iIndepFun (c := σ2) ?_ ?_ (by positivity))
    · exact (iIndepFun_eval_streamMeasure'' ν a).comp (fun i ω ↦ ω - (ν a)[id])
        (fun _ ↦ by fun_prop)
    · intro i him
      refine (hν a).congr_identDistrib ?_
      exact (identDistrib_eval_eval_id_streamMeasure _ _ _).symm.sub_const _
  _ = 1 / (n + 1) ^ c := by
    rw [Real.sq_sqrt]
    swap; · exact mul_nonneg (by positivity) (Real.log_nonneg (by simp))
    field_simp
    rw [← Real.log_rpow (by positivity), ← Real.log_inv,
      Real.exp_log (by positivity), one_div, ENNReal.ofReal_inv_of_pos (by positivity),
      ← ENNReal.ofReal_rpow_of_nonneg (by positivity) (by positivity)]
    norm_cast
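
The two `calc` steps appear to correspond to the computation below, sketched in conventional notation. The symbols $S_k$ and $\varepsilon$ are introduced only for this sketch, with $X_{s,a}$ standing for `ω s a` and $\mu_a$ for `(ν a)[id]`. The first inequality is the lower-tail bound for a sum of $k$ independent sub-Gaussian terms with parameter $\sigma_2$; the remaining equalities are algebra.

```latex
\begin{align*}
  S_k &:= \sum_{s=0}^{k-1} \bigl(X_{s,a} - \mu_a\bigr), \qquad
  \varepsilon := \sqrt{2\,c\,k\,\sigma_2 \log(n+1)}, \\
  \mathbb{P}(S_k \le -\varepsilon)
    &\le \exp\!\left(-\frac{\varepsilon^{2}}{2\,k\,\sigma_2}\right) \\
    &= \exp\!\left(-\frac{2\,c\,k\,\sigma_2 \log(n+1)}{2\,k\,\sigma_2}\right)
      && \text{since } 2\,c\,k\,\sigma_2 \log(n+1) \ge 0 \\
    &= \exp\bigl(-c \log(n+1)\bigr)
      && \text{since } k \ne 0,\ \sigma_2 \ne 0 \\
    &= \frac{1}{(n+1)^{c}} .
\end{align*}
```

The side conditions match the hypotheses: `hc` (together with $\log(n+1) \ge 0$) justifies `Real.sq_sqrt`, while `hk` and `hσ2` allow the cancellation performed by `field_simp`.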

Dependency graph

Type dependencies (1)

`streamMeasure`

Definition Bandits.streamMeasure

Details

Measure of an infinite stream of rewards from each action.

def

Bandits.streamMeasure.{u_1, u_2} {𝓐 : Type u_1} {R : Type u_2}
  {m𝓐 : MeasurableSpace 𝓐} {mR : MeasurableSpace R}
  (ν : ProbabilityTheory.Kernel 𝓐 R) : MeasureTheory.Measure (ℕ → 𝓐 → R)

Code

noncomputable
def streamMeasure (ν : Kernel 𝓐 R) : Measure (ℕ → 𝓐 → R) :=
  Measure.infinitePi fun _ ↦ Measure.infinitePi ν
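
Read informally, and assuming each `ν a` is a probability measure (as it is when `ν` is a Markov kernel), the nested `Measure.infinitePi` is a product over time steps of a product over actions. The notation below is an illustration, not taken from the source.

```latex
\[
  \texttt{streamMeasure}\ \nu
  \;=\; \bigotimes_{s \in \mathbb{N}} \ \bigotimes_{a \in \mathcal{A}} \nu(a)
  \qquad \text{as a measure on } \mathbb{N} \to \mathcal{A} \to R .
\]
```

Under this reading, the coordinates `ω s a` are mutually independent and each `ω s a` has law `ν a`, which is how the proof of `prob_sum_le_sqrt_log` appears to use the measure through `iIndepFun_eval_streamMeasure''` and `identDistrib_eval_eval_id_streamMeasure`.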

Used by (56)

