LeanMachineLearning exposition

Bandits.UCB.measurable_ucbWidth

This page shows the declaration's own card first, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards follow only the dependencies of its statement: the proof is replaced by sorry, because what a theorem states does not depend on how it is proved. For every other kind of declaration, both the type and the body (or value) are followed, since their content is part of what later declarations build on.
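As an illustration of the statement-only convention described above, here is a rough sketch of what a minimal file for this lemma could look like. It is assembled from the cards on this page and is not the page's own generated file. The blanket `import Mathlib`, the `open` lines, and the `variable` declarations are assumptions made so that the sketch stands alone.

```lean
import Mathlib  -- assumption: the real project imports specific modules instead

open Finset Real

namespace Learning

variable {𝓐 : Type*} {Ω : Type*} [DecidableEq 𝓐]

/-- Number of times action `a` was chosen up to time `t` (excluding `t`). -/
noncomputable
def pullCount (A : ℕ → Ω → 𝓐) (a : 𝓐) (t : ℕ) (ω : Ω) : ℕ :=
  #(filter (fun s ↦ A s ω = a) (range t))

end Learning

namespace Bandits.UCB

open Learning

variable {K : ℕ} {Ω : Type*}

/-- The exploration bonus of the UCB algorithm. -/
noncomputable def ucbWidth (A : ℕ → Ω → Fin K) (c : ℝ) (a : Fin K) (n : ℕ) (ω : Ω) : ℝ :=
  √(2 * c * log (n + 1) / pullCount A a n ω)

variable {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → Fin K} {n : ℕ}

-- Statement only: the proof is replaced by `sorry`, so nothing used only
-- inside the proof needs to appear in this file.
lemma measurable_ucbWidth (hA : ∀ n, Measurable (A n)) (c : ℝ) (a : Fin K) :
    Measurable (ucbWidth A c a n) := by
  sorry

end Bandits.UCB
```

Under these assumptions the file only needs `pullCount` and `ucbWidth` to state the lemma, which matches the two dependency cards listed further down.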

Minimal Lean file

measurable_ucbWidth

Lemma Bandits.UCB.measurable_ucbWidth

No docstring.

theorem
Bandits.UCB.measurable_ucbWidth.{u_1} {K : ℕ} {Ω : Type u_1} {mΩ : MeasurableSpace Ω} {A : ℕ → Ω → Fin K} {n : ℕ} (hA : ∀ (n : ℕ), Measurable (A n)) (c : ℝ) (a : Fin K) : Measurable (ucbWidth A c a n)

Code

lemma measurable_ucbWidth (hA : ∀ n, Measurable (A n)) (c : ℝ) (a : Fin K) :
    Measurable (ucbWidth A c a n)
Type uses (1)
Body uses (2)
Used by (1)


Proof
by
  unfold ucbWidth
  fun_prop
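The page does not show what fun_prop does after the definition is unfolded. As a hedged illustration of why the statement holds, the sketch below proves the same shape of goal by hand, using only Mathlib. The function `N` is a hypothetical stand-in for the pull count, and its measurability is assumed as a hypothesis rather than taken from this project. The lemma names are the Mathlib names as I know them and may differ between versions.

```lean
import Mathlib

-- `N` is a hypothetical stand-in for `fun ω ↦ pullCount A a n ω`;
-- its measurability is an assumption of this example.
example {Ω : Type*} [MeasurableSpace Ω] (N : Ω → ℕ) (hN : Measurable N) (c : ℝ) (n : ℕ) :
    Measurable (fun ω ↦ Real.sqrt (2 * c * Real.log (n + 1) / N ω)) := by
  -- Every function out of `ℕ` is measurable, so the cast `ℕ → ℝ` composed with `N` is.
  have hcast : Measurable (fun ω ↦ (N ω : ℝ)) :=
    (measurable_from_nat (f := fun k : ℕ ↦ (k : ℝ))).comp hN
  -- The numerator does not depend on `ω`; a quotient of measurable real functions is measurable.
  have hdiv : Measurable (fun ω ↦ 2 * c * Real.log (n + 1) / (N ω : ℝ)) :=
    measurable_const.div hcast
  -- `Real.sqrt` is continuous, hence measurable; compose it with the quotient.
  exact Real.continuous_sqrt.measurable.comp hdiv
```

The three steps (cast, quotient, square root) mirror the structure of the unfolded definition, which is the kind of compositional goal fun_prop is designed to discharge.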

Dependency graph

Type dependencies (1)

ucbWidth

Definition Bandits.UCB.ucbWidth

The exploration bonus of the UCB algorithm, which corresponds to the width of a confidence interval.

def
Bandits.UCB.ucbWidth.{u_1} {K : ℕ} {Ω : Type u_1} (A : ℕ → Ω → Fin K) (c : ℝ) (a : Fin K) (n : ℕ) (ω : Ω) : ℝ

Code

noncomputable def ucbWidth (A : ℕ → Ω → Fin K) (c : ℝ) (a : Fin K) (n : ℕ) (ω : Ω) : ℝ :=
  √(2 * c * log (n + 1) / pullCount A a n ω)
Body uses (1)
Used by (16)
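One consequence of this formula is worth spelling out: the denominator is a pull count, which can be zero. In Mathlib, division by zero in ℝ is defined to be 0 and the square root of 0 is 0, so the expression evaluates to 0 for an arm that has not been pulled yet. The example below checks that arithmetic fact on its own, with a literal `(0 : ℕ)` standing in for the pull count. It uses only Mathlib and does not depend on this project's definitions.

```lean
import Mathlib

-- Mathlib conventions: `x / 0 = 0` in `ℝ`, and `Real.sqrt 0 = 0`.
-- `(0 : ℕ)` stands in for a pull count of zero, cast to `ℝ` as in `ucbWidth`.
example (x : ℝ) : Real.sqrt (x / ((0 : ℕ) : ℝ)) = 0 := by
  simp
```

In other words, the definition is total and never produces an undefined value. How the surrounding algorithm treats arms with zero pulls is not stated on this page.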


All dependencies, transitively (1)

pullCount

Definition Learning.pullCount

Number of times action a was chosen up to time t (excluding t).

def
Learning.pullCount.{u_1, u_3} {𝓐 : Type u_1} {Ω : Type u_3} [DecidableEq 𝓐] (A : ℕ → Ω → 𝓐) (a : 𝓐) (t : ℕ) (ω : Ω) : ℕ

Code

noncomputable
def pullCount (A : ℕ → Ω → 𝓐) (a : 𝓐) (t : ℕ) (ω : Ω) : ℕ :=
  #(filter (fun s ↦ A s ω = a) (range t))
Used by (146)
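To make the "excluding t" convention concrete, the sketch below evaluates the same counting expression on a small deterministic example. The sequence `alternating` is hypothetical and introduced only for illustration. The example uses the Mathlib finset expression from the definition above directly, so it does not depend on this project. The second check is intended to be discharged by `decide`, though this has not been verified against a specific Mathlib version.

```lean
import Mathlib

open Finset

/-- Hypothetical deterministic action sequence on two arms: 0, 1, 0, 1, ... -/
def alternating (s : ℕ) : Fin 2 := if s % 2 = 0 then 0 else 1

-- With `t = 0` the range is empty, so no round has been counted yet.
example : #(filter (fun s ↦ alternating s = 0) (range 0)) = 0 := by
  simp

-- With `t = 5` the rounds considered are 0, 1, 2, 3, 4 (round 5 itself is excluded);
-- arm 0 is chosen at rounds 0, 2 and 4.
example : #(filter (fun s ↦ alternating s = 0) (range 5)) = 3 := by
  decide
```

Because `range t` contains exactly the rounds strictly before `t`, the count at time `t` never includes the action chosen at round `t` itself.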
