LeanMachineLearning exposition

Bandits.UCB.ucbWidth

This page shows the declaration's own card first, then its dependency graph, then one card per dependency (type dependencies first, followed by the rest of the transitive closure). For a theorem, the graph and the dependency cards follow only the dependencies of its statement: the proof is replaced by sorry, so what the theorem states does not depend on how it is proved. For every other kind of declaration, both the type and the body (or value) are followed, because later declarations build on that content.

Minimal Lean file

ucbWidth

Definition: Bandits.UCB.ucbWidth

The exploration bonus of the UCB algorithm, which corresponds to the width of a confidence interval.

def
Bandits.UCB.ucbWidth.{u_1} {K : ℕ} {Ω : Type u_1} (A : ℕ → Ω → Fin K) (c : ℝ) (a : Fin K) (n : ℕ) (ω : Ω) : ℝ

Code

noncomputable def ucbWidth (A : ℕ → Ω → Fin K) (c : ℝ) (a : Fin K) (n : ℕ) (ω : Ω) : ℝ :=
  √(2 * c * log (n + 1) / pullCount A a n ω)
Body uses (1)
Used by (16)
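
As a hedged illustration of how the formula behaves at the boundary, the sketch below restates both definitions locally and checks that the width is 0 at time n = 0. The primed names pullCount' and ucbWidth' are hypothetical local copies written for this example, not declarations of the library, and the sketch assumes a recent Mathlib where `import Mathlib` is available. At n = 0 the numerator contains log 1 = 0 and the pull count is 0; since division by zero evaluates to 0 in Mathlib's real numbers, the expression is well defined and equals 0.

```lean
import Mathlib

open Finset

/-- Hypothetical local copy of `pullCount`, for illustration only.
`.card` is the function behind the `#(…)` notation used in the source. -/
def pullCount' {𝓐 Ω : Type*} [DecidableEq 𝓐]
    (A : ℕ → Ω → 𝓐) (a : 𝓐) (t : ℕ) (ω : Ω) : ℕ :=
  (filter (fun s ↦ A s ω = a) (range t)).card

/-- Hypothetical local copy of `ucbWidth`, with the casts to `ℝ` written out. -/
noncomputable def ucbWidth' {K : ℕ} {Ω : Type*}
    (A : ℕ → Ω → Fin K) (c : ℝ) (a : Fin K) (n : ℕ) (ω : Ω) : ℝ :=
  Real.sqrt (2 * c * Real.log ((n : ℝ) + 1) / (pullCount' A a n ω : ℝ))

/-- At time `0` no action has been pulled and `log 1 = 0`, so the width is `0`. -/
example {K : ℕ} {Ω : Type*} (A : ℕ → Ω → Fin K) (c : ℝ) (a : Fin K) (ω : Ω) :
    ucbWidth' A c a 0 ω = 0 := by
  simp [ucbWidth', pullCount']
```

The same division-by-zero convention applies at any later time n at which action a has not yet been chosen, so the definition needs no side condition on the pull count.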


Dependency graph

All dependencies, transitively (1)

pullCount

Definition: Learning.pullCount

Number of times action a was chosen strictly before time t, that is, at steps 0, …, t − 1 (step t itself is excluded).

def
Learning.pullCount.{u_1, u_3} {𝓐 : Type u_1} {Ω : Type u_3} [DecidableEq 𝓐] (A : ℕ → Ω → 𝓐) (a : 𝓐) (t : ℕ) (ω : Ω) : ℕ

Code

noncomputable
def pullCount (A : ℕ → Ω → 𝓐) (a : 𝓐) (t : ℕ) (ω : Ω) : ℕ :=
  #(filter (fun s ↦ A s ω = a) (range t))
Used by (146)
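
To make the "excluding t" convention concrete, here is a minimal sketch using a hypothetical local copy of the definition, named pullCount' and not part of the library. It assumes a recent Mathlib. The local copy is declared with a plain def so that it can be evaluated. The toy action sequence alternates between actions 0 and 1 and ignores the sample point ω, which is taken in Unit purely for illustration.

```lean
import Mathlib

open Finset

/-- Hypothetical local copy of `pullCount`, for illustration only.
`.card` is the function behind the `#(…)` notation used in the source. -/
def pullCount' {𝓐 Ω : Type*} [DecidableEq 𝓐]
    (A : ℕ → Ω → 𝓐) (a : 𝓐) (t : ℕ) (ω : Ω) : ℕ :=
  (filter (fun s ↦ A s ω = a) (range t)).card

/-- `range 0` is empty, so nothing has been pulled before time `0`. -/
example {𝓐 Ω : Type*} [DecidableEq 𝓐] (A : ℕ → Ω → 𝓐) (a : 𝓐) (ω : Ω) :
    pullCount' A a 0 ω = 0 := by
  simp [pullCount']

-- Toy sequence: action `s % 2` at step `s`, independent of `ω`.
-- With `t = 5` the count ranges over steps 0, 1, 2, 3, 4 (step 5 is excluded),
-- and action `0` is chosen at steps 0, 2 and 4.
#eval pullCount' (fun (s : ℕ) (_ : Unit) ↦ s % 2) 0 5 ()
```

Because the count uses range t, the choice made at step t itself never contributes to pullCount A a t ω.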
