Documentation

LeanMachineLearning.Online.Bandit.BayesRegret

Bayesian regret #

This file defines actionMean, bestAction, gap, and regret as random variables in a measurable space Ω. These definitions are useful when IsBayesAlgEnvSeq Q κ alg E A Y P.

Recall that IsBayesAlgEnvSeq Q κ alg E A Y P states that there is a measure P : Measure Ω such that the parameter E : Ω → 𝓔 has law Q and that the sequences of actions A : ℕ → Ω → 𝓐 and feedbacks Y : ℕ → Ω → 𝓨 are generated by the algorithm alg : Algorithm 𝓐 𝓨 interacting with an underlying environment that depends on E and κ (stationaryEnv (κ.sectR (E ω)))

Main definitions #

noncomputable def Learning.IsBayesAlgEnvSeq.actionMean {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) ) (E : Ω𝓔) (a : 𝓐) (ω : Ω) :

A random variable that gives the mean feedback of action a.

Equations
Instances For
    theorem Learning.IsBayesAlgEnvSeq.measurable_actionMean {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace Ω] {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) } {E : Ω𝓔} {a : 𝓐} (hE : Measurable E) :
    theorem Learning.IsBayesAlgEnvSeq.measurable_uncurry_actionMean_comp {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace Ω] [Countable 𝓐] [MeasurableSingletonClass 𝓐] {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) } {E : Ω𝓔} (hE : Measurable E) {f : Ω𝓐} (hf : Measurable f) :
    Measurable fun (ω : Ω) => actionMean κ E (f ω) ω
    theorem Learning.IsBayesAlgEnvSeq.integrable_uncurry_actionMean_comp {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace Ω] [Countable 𝓐] [MeasurableSingletonClass 𝓐] {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) } {E : Ω𝓔} (hE : Measurable E) {f : Ω𝓐} (hf : Measurable f) {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] {l u : } (hm : ∀ (e : 𝓔) (a : 𝓐), (x : ), id x κ (e, a) Set.Icc l u) :
    MeasureTheory.Integrable (fun (ω : Ω) => actionMean κ E (f ω) ω) P
    noncomputable def Learning.IsBayesAlgEnvSeq.bestAction {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [Nonempty 𝓐] [Fintype 𝓐] [Encodable 𝓐] [MeasurableSingletonClass 𝓐] (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) ) (E : Ω𝓔) (ω : Ω) :
    𝓐

    A random variable that gives the action with the highest mean feedback.

    Equations
    Instances For
      theorem Learning.IsBayesAlgEnvSeq.measurable_bestAction {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace Ω] [Nonempty 𝓐] [Fintype 𝓐] [Encodable 𝓐] [MeasurableSingletonClass 𝓐] {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) } {E : Ω𝓔} (hE : Measurable E) :
      noncomputable def Learning.IsBayesAlgEnvSeq.gap {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) ) (E : Ω𝓔) (A : Ω𝓐) (n : ) (ω : Ω) :

      A random variable that gives the gap at time n.

      Equations
      Instances For
        theorem Learning.IsBayesAlgEnvSeq.gap_nonneg_of_le {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) } {E : Ω𝓔} {A : Ω𝓐} {n : } {ω : Ω} {u : } (h : ∀ (e : 𝓔) (a : 𝓐), (x : ), id x κ (e, a) u) :
        0 gap κ E A n ω

        The gap is non-negative if the means are bounded by u : ℝ (even if 𝓐 is not Finite).

        theorem Learning.IsBayesAlgEnvSeq.gap_le_of_mem_Icc {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [Nonempty 𝓐] {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) } {E : Ω𝓔} {A : Ω𝓐} {n : } {ω : Ω} {l u : } (h : ∀ (e : 𝓔) (a : 𝓐), (x : ), id x κ (e, a) Set.Icc l u) :
        gap κ E A n ω u - l
        theorem Learning.IsBayesAlgEnvSeq.gap_eq_sub {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [Nonempty 𝓐] [Fintype 𝓐] [Encodable 𝓐] [MeasurableSingletonClass 𝓐] {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) } {E : Ω𝓔} {A : Ω𝓐} {n : } {ω : Ω} :
        gap κ E A n ω = actionMean κ E (bestAction κ E ω) ω - actionMean κ E (A n ω) ω
        theorem Learning.IsBayesAlgEnvSeq.measurable_gap {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace Ω] [Countable 𝓐] {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) } {E : Ω𝓔} {A : Ω𝓐} {n : } (hE : Measurable E) (hA : ∀ (t : ), Measurable (A t)) :
        Measurable (gap κ E A n)
        theorem Learning.IsBayesAlgEnvSeq.integrable_gap {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace Ω] [Countable 𝓐] [Nonempty 𝓐] {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) } {E : Ω𝓔} {A : Ω𝓐} {n : } {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] (hE : Measurable E) (hA : ∀ (t : ), Measurable (A t)) {l u : } (h : ∀ (e : 𝓔) (a : 𝓐), (x : ), id x κ (e, a) Set.Icc l u) :
        noncomputable def Learning.IsBayesAlgEnvSeq.regret {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) ) (E : Ω𝓔) (A : Ω𝓐) (n : ) (ω : Ω) :

        A random variable that gives the regret at time n.

        Equations
        Instances For
          theorem Learning.IsBayesAlgEnvSeq.regret_eq_sum_gap {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) } {E : Ω𝓔} {A : Ω𝓐} {n : } {ω : Ω} :
          regret κ E A n ω = sFinset.range n, gap κ E A s ω
          theorem Learning.IsBayesAlgEnvSeq.regret_eq_sum_gap' {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) } {E : Ω𝓔} {A : Ω𝓐} {n : } :
          regret κ E A n = fun (ω : Ω) => sFinset.range n, gap κ E A s ω
          theorem Learning.IsBayesAlgEnvSeq.measurable_regret {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace Ω] [Countable 𝓐] {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) } {E : Ω𝓔} {A : Ω𝓐} {n : } (hE : Measurable E) (hA : ∀ (t : ), Measurable (A t)) :
          Measurable (regret κ E A n)
          theorem Learning.IsBayesAlgEnvSeq.integrable_regret {𝓔 : Type u_1} {𝓐 : Type u_2} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace Ω] [Countable 𝓐] [Nonempty 𝓐] {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) } {E : Ω𝓔} {A : Ω𝓐} {n : } {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] (hE : Measurable E) (hA : ∀ (t : ), Measurable (A t)) {l u : } (h : ∀ (e : 𝓔) (a : 𝓐), (x : ), id x κ (e, a) Set.Icc l u) :