`Bandits.prob_avg_sub_sqrt_log_ge`

This page shows the declaration's own card, then its dependency graph, then a card for each dependency (type dependencies first, followed by the rest of the transitive closure). For a theorem, the graph and the dependency cards follow only the dependencies of its statement: the proof is replaced by `sorry`, because what a theorem proves does not depend on how it is proved. For every other kind of declaration, both the type and the body/value are followed, since their content is part of what later declarations build on.

Lemma Bandits.prob_avg_sub_sqrt_log_ge

No docstring.

theorem

Bandits.prob_avg_sub_sqrt_log_ge.{u_1} {𝓐 : Type u_1}
  {m𝓐 : MeasurableSpace 𝓐} {ν : ProbabilityTheory.Kernel 𝓐 ℝ}
  [ProbabilityTheory.IsMarkovKernel ν] {σ2 : NNReal} {c : ℝ}
  (hν :
    ∀ (a : 𝓐),
      ProbabilityTheory.HasSubgaussianMGF
        (fun x => x - ∫ (x : ℝ), id x ∂ν a) σ2 (ν a))
  (hσ2 : σ2 ≠ 0) (hc : 0 ≤ c) (a : 𝓐) (n k : ℕ) (hk : k ≠ 0) :
  (streamMeasure ν)
      {ω |
        ∫ (x : ℝ), id x ∂ν a ≤
          (∑ m ∈ Finset.range k, ω m a) / ↑k -
            √(2 * c * ↑σ2 * Real.log (↑n + 1) / ↑k)} ≤
    1 / (↑n + 1) ^ c

Code

lemma prob_avg_sub_sqrt_log_ge {σ2 : ℝ≥0} {c : ℝ}
    (hν : ∀ a, HasSubgaussianMGF (fun x ↦ x - (ν a)[id]) σ2 (ν a)) (hσ2 : σ2 ≠ 0)
    (hc : 0 ≤ c) (a : 𝓐) (n k : ℕ) (hk : k ≠ 0) :
    streamMeasure ν
        {ω | (ν a)[id] ≤ (∑ m ∈ range k, ω m a) / k - √(2 * c * σ2 * log (n + 1) / k)} ≤
      1 / (n + 1) ^ c
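
In conventional notation, the statement can be read as a one-sided concentration bound for the empirical mean of the first k rewards of action a. The rendering below is a paraphrase, not part of the source: the symbols $\mu_a$, $X_{m,a}$ and $\mathbb{P}$ are shorthand assumed here for `(ν a)[id]`, `ω m a` and `streamMeasure ν`, and $\sigma^2$ stands for the parameter `σ2`.

```latex
% Shorthand (assumed notation, not from the source):
%   \mu_a = \int x \, d\nu(a)        -- `(ν a)[id]`, the mean reward of action a
%   X_{m,a}(\omega) = \omega\, m\, a -- the m-th reward drawn for action a
%   \mathbb{P} = streamMeasure \nu
\[
  \mathbb{P}\!\left(
    \mu_a \;\le\; \frac{1}{k}\sum_{m=0}^{k-1} X_{m,a}
      \;-\; \sqrt{\frac{2\,c\,\sigma^2 \log(n+1)}{k}}
  \right)
  \;\le\; \frac{1}{(n+1)^{c}},
  \qquad c \ge 0,\quad \sigma^2 \ne 0,\quad k \ne 0 .
\]
```

Read this way, the lemma says that the empirical mean exceeds the true mean by the stated confidence width with probability at most $(n+1)^{-c}$.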

Type uses (1)

streamMeasure

Body uses (1)

prob_sum_ge_sqrt_log

Used by (1)

prob_ucbIndex_ge

Proof

by
  have h_log_nonneg : 0 ≤ log (n + 1) := log_nonneg (by simp)
  calc
    streamMeasure ν {ω | (ν a)[id] ≤ (∑ m ∈ range k, ω m a) / k - √(2 * c * σ2 * log (n + 1) / k)}
  _ = streamMeasure ν
      {ω | √(2 * c * σ2 * log (n + 1) / k) ≤ (∑ s ∈ range k, (ω s a - (ν a)[id])) / k} := by
    congr with ω
    field_simp
    rw [Finset.sum_sub_distrib]
    simp
    grind
  _ = streamMeasure ν
      {ω | √(2 * c * k * σ2 * log (n + 1)) ≤ (∑ s ∈ range k, (ω s a - (ν a)[id]))} := by
    congr with ω
    field_simp
    congr! 1
    rw [sqrt_div (by positivity), ← mul_div_assoc, mul_comm, mul_div_assoc, div_sqrt,
      mul_comm _ (k : ℝ), sqrt_mul (x := (k : ℝ)) (by positivity), mul_comm]
  _ ≤ 1 / (n + 1) ^ c := prob_sum_ge_sqrt_log hν hσ2 hc a k hk
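
The two set equalities in the `calc` block amount to the following rearrangement of the event. This is a sketch in shorthand assumed here, not taken from the source: $\mu_a$ for `(ν a)[id]`, $X_{s,a}$ for `ω s a`, and $\sigma^2$ for `σ2`. The last display is only a heuristic for why the final bound has this shape; the proof of `prob_sum_ge_sqrt_log` is not shown on this page, so the assumption that it rests on the usual sub-Gaussian tail bound is a guess.

```latex
% Steps 1 and 2 of the calc block: rewrite the event (valid for k > 0).
\begin{align*}
  \mu_a \le \frac{1}{k}\sum_{s<k} X_{s,a} - \sqrt{\frac{2c\sigma^2\log(n+1)}{k}}
  &\iff \sqrt{\frac{2c\sigma^2\log(n+1)}{k}}
        \le \frac{1}{k}\sum_{s<k}\bigl(X_{s,a}-\mu_a\bigr) \\
  &\iff \sqrt{2\,c\,k\,\sigma^2\log(n+1)}
        \le \sum_{s<k}\bigl(X_{s,a}-\mu_a\bigr),
\end{align*}
% using k \sqrt{A/k} = \sqrt{kA} for k > 0 and A \ge 0.

% Heuristic for step 3 (assumption: standard sub-Gaussian tail bound for a sum
% of k independent centered terms, each with variance proxy \sigma^2).
\[
  \mathbb{P}\Bigl(\sum_{s<k}(X_{s,a}-\mu_a) \ge t\Bigr)
  \le \exp\!\Bigl(-\frac{t^2}{2k\sigma^2}\Bigr)
  \quad\text{with}\quad t^2 = 2ck\sigma^2\log(n+1)
  \quad\Longrightarrow\quad
  \exp\bigl(-c\log(n+1)\bigr) = \frac{1}{(n+1)^{c}} .
\]
```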

Dependency graph

Type dependencies (1)

Definition Bandits.streamMeasure

Measure of an infinite stream of rewards from each action.

def

Bandits.streamMeasure.{u_1, u_2} {𝓐 : Type u_1} {R : Type u_2}
  {m𝓐 : MeasurableSpace 𝓐} {mR : MeasurableSpace R}
  (ν : ProbabilityTheory.Kernel 𝓐 R) : MeasureTheory.Measure (ℕ → 𝓐 → R)

Code

noncomputable
def streamMeasure (ν : Kernel 𝓐 R) : Measure (ℕ → 𝓐 → R) :=
  Measure.infinitePi fun _ ↦ Measure.infinitePi ν
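
Informally, the definition can be read as a doubly indexed product measure. The display below is a sketch under the assumption that `Measure.infinitePi` denotes the product of a family of probability measures, as in Mathlib, and that `ν` is used as the family $a \mapsto \nu(a)$; the symbol $\mathbb{P}_\nu$ is notation introduced here, not in the source.

```latex
% Assumed reading of `Measure.infinitePi fun _ ↦ Measure.infinitePi ν`:
% an outer product over time steps m of an inner product over actions a.
\[
  \mathbb{P}_\nu
  \;=\; \bigotimes_{m \in \mathbb{N}} \;\bigotimes_{a \in \mathcal{A}} \nu(a)
  \qquad\text{on}\qquad \mathbb{N} \to \mathcal{A} \to R .
\]
```

Under this reading, the coordinates $\omega \mapsto \omega\, m\, a$ are independent and each has law $\nu(a)$, which is what lets a stream of rewards for every action be fixed in advance of any algorithm's choices.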

Used by (56)
