LeanMachineLearning exposition

2.3.ย Online.Bandit.ArrayProbSpace๐Ÿ”—

Array-of-rewards probability space for stochastic bandits

We build a particular probability space for stochastic bandits, called the "array model", in which an infinite array of i.i.d. rewards is first produced for all actions. When the algorithm chooses action a for the nth time, it receives the reward in the row a of the array and column n.

Some statements about bandit algorithms are easier to prove in this space, and can then be transfered to any other probability space using the fact that the conditinonal distributions of the arms and rewards specified in the bandit model determine their laws uniquely.

Main definitions

  • streamMeasure ฮฝ: probability measure on the space of infinite arrays of rewards, where the rewards in each row are i.i.d. according to ฮฝ.

  • probSpace ๐“ R: probability space for the array model of stochastic bandits with action space ๐“ and reward space R.

  • arrayMeasure ฮฝ: probability measure on probSpace ๐“ R for the array model of stochastic bandits with reward kernel ฮฝ.

Module LeanMachineLearning.Online.Bandit.ArrayProbSpace contains 77 exposed declarations.

streamMeasure๐Ÿ”—

DefinitionBandits.streamMeasure

Measure of an infinite stream of rewards from each action.

๐Ÿ”—def
Bandits.streamMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) : MeasureTheory.Measure (โ„• โ†’ ๐“ โ†’ R)
Bandits.streamMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) : MeasureTheory.Measure (โ„• โ†’ ๐“ โ†’ R)

Code

noncomputable
def streamMeasure (ฮฝ : Kernel ๐“ R) : Measure (โ„• โ†’ ๐“ โ†’ R) :=
  Measure.infinitePi fun _ โ†ฆ Measure.infinitePi ฮฝ
Used by (56)

Actions: Source ยท Open Issue

instIsProbabilityMeasureForallNatForallStreamMeasureOfIsMarkovKernel๐Ÿ”—

InstanceBandits.instIsProbabilityMeasureForallNatForallStreamMeasureOfIsMarkovKernel

No docstring.

๐Ÿ”—theorem
Bandits.instIsProbabilityMeasureForallNatForallStreamMeasureOfIsMarkovKernel.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : MeasureTheory.IsProbabilityMeasure (streamMeasure ฮฝ)
Bandits.instIsProbabilityMeasureForallNatForallStreamMeasureOfIsMarkovKernel.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : MeasureTheory.IsProbabilityMeasure (streamMeasure ฮฝ)

Code

instance (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] : IsProbabilityMeasure (streamMeasure ฮฝ)
Type uses (1)
Used by (22)

Actions: Source ยท Open Issue

Proof
by
  unfold streamMeasure
  infer_instance

hasLaw_eval_infinitePi๐Ÿ”—

LemmahasLaw_eval_infinitePi

No docstring.

๐Ÿ”—theorem
hasLaw_eval_infinitePi.{u_3, u_4} {ฮน : Type u_3} {X : ฮน โ†’ Type u_4} {mX : (i : ฮน) โ†’ MeasurableSpace (X i)} (ฮผ : (i : ฮน) โ†’ MeasureTheory.Measure (X i)) [hฮผ : โˆ€ (i : ฮน), MeasureTheory.IsProbabilityMeasure (ฮผ i)] (i : ฮน) : ProbabilityTheory.HasLaw (Function.eval i) (ฮผ i) (MeasureTheory.Measure.infinitePi ฮผ)
hasLaw_eval_infinitePi.{u_3, u_4} {ฮน : Type u_3} {X : ฮน โ†’ Type u_4} {mX : (i : ฮน) โ†’ MeasurableSpace (X i)} (ฮผ : (i : ฮน) โ†’ MeasureTheory.Measure (X i)) [hฮผ : โˆ€ (i : ฮน), MeasureTheory.IsProbabilityMeasure (ฮผ i)] (i : ฮน) : ProbabilityTheory.HasLaw (Function.eval i) (ฮผ i) (MeasureTheory.Measure.infinitePi ฮผ)

Code

lemma _root_.hasLaw_eval_infinitePi {ฮน : Type*} {X : ฮน โ†’ Type*} {mX : โˆ€ i, MeasurableSpace (X i)}
  (ฮผ : (i : ฮน) โ†’ Measure (X i)) [hฮผ : โˆ€ i, IsProbabilityMeasure (ฮผ i)] (i : ฮน) :
    HasLaw (Function.eval i) (ฮผ i) (Measure.infinitePi ฮผ) where
  aemeasurable
Used by (2)

Actions: Source ยท Open Issue

Proof
Measurable.aemeasurable (by fun_prop)
  map_eq := by exact (measurePreserving_eval_infinitePi ฮผ i).map_eq

hasLaw_eval_streamMeasure๐Ÿ”—

LemmaBandits.hasLaw_eval_streamMeasure

No docstring.

๐Ÿ”—theorem
Bandits.hasLaw_eval_streamMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.HasLaw (fun h => h n) (MeasureTheory.Measure.infinitePi โ‡‘ฮฝ) (streamMeasure ฮฝ)
Bandits.hasLaw_eval_streamMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.HasLaw (fun h => h n) (MeasureTheory.Measure.infinitePi โ‡‘ฮฝ) (streamMeasure ฮฝ)

Code

lemma hasLaw_eval_streamMeasure (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (n : โ„•) :
    HasLaw (fun h : โ„• โ†’ ๐“ โ†’ R โ†ฆ h n) (Measure.infinitePi ฮฝ) (streamMeasure ฮฝ)
Type uses (1)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
hasLaw_eval_infinitePi (fun _ โ†ฆ Measure.infinitePi ฮฝ) n

hasLaw_eval_eval_streamMeasure๐Ÿ”—

LemmaBandits.hasLaw_eval_eval_streamMeasure

No docstring.

๐Ÿ”—theorem
Bandits.hasLaw_eval_eval_streamMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) (a : ๐“) : ProbabilityTheory.HasLaw (fun h => h n a) (ฮฝ a) (streamMeasure ฮฝ)
Bandits.hasLaw_eval_eval_streamMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) (a : ๐“) : ProbabilityTheory.HasLaw (fun h => h n a) (ฮฝ a) (streamMeasure ฮฝ)

Code

lemma hasLaw_eval_eval_streamMeasure (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (n : โ„•) (a : ๐“) :
    HasLaw (fun h : โ„• โ†’ ๐“ โ†’ R โ†ฆ h n a) (ฮฝ a) (streamMeasure ฮฝ)
Type uses (1)
Body uses (2)
Used by (2)

Actions: Source ยท Open Issue

Proof
(hasLaw_eval_infinitePi ฮฝ a).comp (hasLaw_eval_streamMeasure ฮฝ n)

identDistrib_eval_eval_id_streamMeasure๐Ÿ”—

LemmaBandits.identDistrib_eval_eval_id_streamMeasure

No docstring.

๐Ÿ”—theorem
Bandits.identDistrib_eval_eval_id_streamMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) (a : ๐“) : ProbabilityTheory.IdentDistrib (fun h => h n a) id (streamMeasure ฮฝ) (ฮฝ a)
Bandits.identDistrib_eval_eval_id_streamMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) (a : ๐“) : ProbabilityTheory.IdentDistrib (fun h => h n a) id (streamMeasure ฮฝ) (ฮฝ a)

Code

lemma identDistrib_eval_eval_id_streamMeasure (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (n : โ„•) (a : ๐“) :
    IdentDistrib (fun h : โ„• โ†’ ๐“ โ†’ R โ†ฆ h n a) id (streamMeasure ฮฝ) (ฮฝ a) where
  aemeasurable_fst
Type uses (1)
Body uses (1)
Used by (7)

Actions: Source ยท Open Issue

Proof
Measurable.aemeasurable (by fun_prop)
  aemeasurable_snd := Measurable.aemeasurable (by fun_prop)
  map_eq := by
    rw [โ† (hasLaw_eval_eval_streamMeasure ฮฝ n a).map_eq,
      Measure.map_map (by fun_prop) (by fun_prop)]
    simp

integrable_eval_streamMeasure๐Ÿ”—

LemmaBandits.integrable_eval_streamMeasure

No docstring.

๐Ÿ”—theorem
Bandits.integrable_eval_streamMeasure.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} (ฮฝ : ProbabilityTheory.Kernel ๐“ โ„) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) (a : ๐“) (h_int : MeasureTheory.Integrable id (ฮฝ a)) : MeasureTheory.Integrable (fun h => h n a) (streamMeasure ฮฝ)
Bandits.integrable_eval_streamMeasure.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} (ฮฝ : ProbabilityTheory.Kernel ๐“ โ„) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) (a : ๐“) (h_int : MeasureTheory.Integrable id (ฮฝ a)) : MeasureTheory.Integrable (fun h => h n a) (streamMeasure ฮฝ)

Code

lemma integrable_eval_streamMeasure (ฮฝ : Kernel ๐“ โ„) [IsMarkovKernel ฮฝ] (n : โ„•) (a : ๐“)
    (h_int : Integrable id (ฮฝ a)) :
    Integrable (fun h : โ„• โ†’ ๐“ โ†’ โ„ โ†ฆ h n a) (streamMeasure ฮฝ)
Type uses (1)
Body uses (2)

Actions: Source ยท Open Issue

Proof
Integrable.congr_identDistrib h_int (identDistrib_eval_eval_id_streamMeasure ฮฝ n a).symm

integral_eval_streamMeasure๐Ÿ”—

LemmaBandits.integral_eval_streamMeasure

No docstring.

๐Ÿ”—theorem
Bandits.integral_eval_streamMeasure.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} (ฮฝ : ProbabilityTheory.Kernel ๐“ โ„) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) (a : ๐“) : โˆซ (h : โ„• โ†’ ๐“ โ†’ โ„), h n a โˆ‚streamMeasure ฮฝ = โˆซ (x : โ„), id x โˆ‚ฮฝ a
Bandits.integral_eval_streamMeasure.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} (ฮฝ : ProbabilityTheory.Kernel ๐“ โ„) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) (a : ๐“) : โˆซ (h : โ„• โ†’ ๐“ โ†’ โ„), h n a โˆ‚streamMeasure ฮฝ = โˆซ (x : โ„), id x โˆ‚ฮฝ a

Code

lemma integral_eval_streamMeasure (ฮฝ : Kernel ๐“ โ„) [IsMarkovKernel ฮฝ] (n : โ„•) (a : ๐“) :
    โˆซ h, h n a โˆ‚(streamMeasure ฮฝ) = (ฮฝ a)[id]
Type uses (1)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  calc โˆซ h, h n a โˆ‚(streamMeasure ฮฝ)
  _ = โˆซ x, x โˆ‚((streamMeasure ฮฝ).map (fun h โ†ฆ h n a)) := by
    rw [integral_map (Measurable.aemeasurable (by fun_prop)) (by fun_prop)]
  _ = (ฮฝ a)[id] := by simp [(hasLaw_eval_eval_streamMeasure ฮฝ n a).map_eq]

iIndepFun_eval_streamMeasure'๐Ÿ”—

LemmaBandits.iIndepFun_eval_streamMeasure'

No docstring.

๐Ÿ”—theorem
Bandits.iIndepFun_eval_streamMeasure'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.iIndepFun (fun n ฯ‰ => ฯ‰ n) (streamMeasure ฮฝ)
Bandits.iIndepFun_eval_streamMeasure'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.iIndepFun (fun n ฯ‰ => ฯ‰ n) (streamMeasure ฮฝ)

Code

lemma iIndepFun_eval_streamMeasure' (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] :
    iIndepFun (fun n ฯ‰ โ†ฆ ฯ‰ n) (streamMeasure ฮฝ)
Type uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
iIndepFun_infinitePi (P := fun (_ : โ„•) โ†ฆ Measure.infinitePi ฮฝ) (ฮฉ := fun _ โ†ฆ ๐“ โ†’ R)
    (X := fun i u โ†ฆ u) (fun i โ†ฆ by fun_prop)

iIndepFun_eval_streamMeasure''๐Ÿ”—

LemmaBandits.iIndepFun_eval_streamMeasure''

No docstring.

๐Ÿ”—theorem
Bandits.iIndepFun_eval_streamMeasure''.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐“) : ProbabilityTheory.iIndepFun (fun n ฯ‰ => ฯ‰ n a) (streamMeasure ฮฝ)
Bandits.iIndepFun_eval_streamMeasure''.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐“) : ProbabilityTheory.iIndepFun (fun n ฯ‰ => ฯ‰ n a) (streamMeasure ฮฝ)

Code

lemma iIndepFun_eval_streamMeasure'' (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (a : ๐“) :
    iIndepFun (fun n ฯ‰ โ†ฆ ฯ‰ n a) (streamMeasure ฮฝ)
Type uses (1)
Body uses (1)
Used by (5)

Actions: Source ยท Open Issue

Proof
(iIndepFun_eval_streamMeasure' ฮฝ).comp (g := fun i ฯ‰ โ†ฆ ฯ‰ a) (by fun_prop)

iIndepFun_eval_streamMeasure๐Ÿ”—

LemmaBandits.iIndepFun_eval_streamMeasure

No docstring.

๐Ÿ”—theorem
Bandits.iIndepFun_eval_streamMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.iIndepFun (fun p ฯ‰ => ฯ‰ (Prod.fst p) (Prod.snd p)) (streamMeasure ฮฝ)
Bandits.iIndepFun_eval_streamMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.iIndepFun (fun p ฯ‰ => ฯ‰ (Prod.fst p) (Prod.snd p)) (streamMeasure ฮฝ)

Code

lemma iIndepFun_eval_streamMeasure (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] :
    iIndepFun (fun (p : โ„• ร— ๐“) ฯ‰ โ†ฆ ฯ‰ p.1 p.2) (streamMeasure ฮฝ)
Type uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
iIndepFun_uncurry_infinitePi' (X := fun _ _ โ†ฆ id) (fun _ โ†ฆ ฮฝ) (by fun_prop)

indepFun_eval_streamMeasure๐Ÿ”—

LemmaBandits.indepFun_eval_streamMeasure

No docstring.

๐Ÿ”—theorem
Bandits.indepFun_eval_streamMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] {n m : โ„•} {a b : ๐“} (h : n โ‰  m โˆจ a โ‰  b) : ProbabilityTheory.IndepFun (fun ฯ‰ => ฯ‰ n a) (fun ฯ‰ => ฯ‰ m b) (streamMeasure ฮฝ)
Bandits.indepFun_eval_streamMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] {n m : โ„•} {a b : ๐“} (h : n โ‰  m โˆจ a โ‰  b) : ProbabilityTheory.IndepFun (fun ฯ‰ => ฯ‰ n a) (fun ฯ‰ => ฯ‰ m b) (streamMeasure ฮฝ)

Code

lemma indepFun_eval_streamMeasure (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] {n m : โ„•} {a b : ๐“}
    (h : n โ‰  m โˆจ a โ‰  b) :
    IndepFun (fun ฯ‰ โ†ฆ ฯ‰ n a) (fun ฯ‰ โ†ฆ ฯ‰ m b) (streamMeasure ฮฝ)
Type uses (1)
Body uses (1)

Actions: Source ยท Open Issue

Proof
by
  change IndepFun (fun ฯ‰ โ†ฆ ฯ‰ (n, a).1 (n, a).2) (fun ฯ‰ โ†ฆ ฯ‰ (m, b).1 (m, b).2)
    (streamMeasure ฮฝ)
  exact (iIndepFun_eval_streamMeasure ฮฝ).indepFun (by grind)

indepFun_eval_streamMeasure'๐Ÿ”—

LemmaBandits.indepFun_eval_streamMeasure'

No docstring.

๐Ÿ”—theorem
Bandits.indepFun_eval_streamMeasure'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] {a b : ๐“} (h : a โ‰  b) : ProbabilityTheory.IndepFun (fun ฯ‰ n => ฯ‰ n a) (fun ฯ‰ n => ฯ‰ n b) (streamMeasure ฮฝ)
Bandits.indepFun_eval_streamMeasure'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] {a b : ๐“} (h : a โ‰  b) : ProbabilityTheory.IndepFun (fun ฯ‰ n => ฯ‰ n a) (fun ฯ‰ n => ฯ‰ n b) (streamMeasure ฮฝ)

Code

lemma indepFun_eval_streamMeasure' (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] {a b : ๐“} (h : a โ‰  b) :
    IndepFun (fun ฯ‰ n โ†ฆ ฯ‰ n a) (fun ฯ‰ n โ†ฆ ฯ‰ n b) (streamMeasure ฮฝ)
Type uses (1)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
indepFun_proj_infinitePi_infinitePi h

probSpace๐Ÿ”—

DefinitionBandits.ArrayModel.probSpace

Probability space for the array model of stochastic bandits.

๐Ÿ”—def
Bandits.ArrayModel.probSpace.{u_1, u_2} (๐“ : Type u_1) (R : Type u_2) : Type (max u_1 u_2)
Bandits.ArrayModel.probSpace.{u_1, u_2} (๐“ : Type u_1) (R : Type u_2) : Type (max u_1 u_2)

Code

def probSpace : Type _ := (โ„• โ†’ I) ร— (โ„• โ†’ ๐“ โ†’ R)
Used by (64)

Actions: Source ยท Open Issue

instMeasurableSpaceProbSpace๐Ÿ”—

InstanceBandits.ArrayModel.instMeasurableSpaceProbSpace

No docstring.

๐Ÿ”—def
Bandits.ArrayModel.instMeasurableSpaceProbSpace.{u_3, u_4} {๐“ : Type u_3} {R : Type u_4} [MeasurableSpace R] : MeasurableSpace (probSpace ๐“ R)
Bandits.ArrayModel.instMeasurableSpaceProbSpace.{u_3, u_4} {๐“ : Type u_3} {R : Type u_4} [MeasurableSpace R] : MeasurableSpace (probSpace ๐“ R)

Code

instance {๐“ R : Type*} [MeasurableSpace R] : MeasurableSpace (probSpace ๐“ R)
Type uses (1)
Used by (41)

Actions: Source ยท Open Issue

Proof
inferInstanceAs (MeasurableSpace ((โ„• โ†’ I) ร— (โ„• โ†’ ๐“ โ†’ R)))

instStandardBorelSpaceProbSpaceOfCountable๐Ÿ”—

InstanceBandits.ArrayModel.instStandardBorelSpaceProbSpaceOfCountable

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.instStandardBorelSpaceProbSpaceOfCountable.{u_3, u_4} {๐“ : Type u_3} {R : Type u_4} [Countable ๐“] [MeasurableSpace R] [StandardBorelSpace R] : StandardBorelSpace (probSpace ๐“ R)
Bandits.ArrayModel.instStandardBorelSpaceProbSpaceOfCountable.{u_3, u_4} {๐“ : Type u_3} {R : Type u_4} [Countable ๐“] [MeasurableSpace R] [StandardBorelSpace R] : StandardBorelSpace (probSpace ๐“ R)

Code

instance {๐“ R : Type*} [Countable ๐“] [MeasurableSpace R] [StandardBorelSpace R] :
    StandardBorelSpace (probSpace ๐“ R)
Type uses (2)
Used by (2)

Actions: Source ยท Open Issue

Proof
inferInstanceAs (StandardBorelSpace ((โ„• โ†’ I) ร— (โ„• โ†’ ๐“ โ†’ R)))

arrayMeasure๐Ÿ”—

DefinitionBandits.ArrayModel.arrayMeasure

Probability measure for the array model of stochastic bandits.

๐Ÿ”—def
Bandits.ArrayModel.arrayMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) : MeasureTheory.Measure (probSpace ๐“ R)
Bandits.ArrayModel.arrayMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) : MeasureTheory.Measure (probSpace ๐“ R)

Code

noncomputable
def arrayMeasure (ฮฝ : Kernel ๐“ R) : Measure (probSpace ๐“ R) :=
  (Measure.infinitePi fun _ โ†ฆ volume).prod (streamMeasure ฮฝ)
Type uses (2)
Body uses (1)
Used by (29)

Actions: Source ยท Open Issue

instIsProbabilityMeasureProbSpaceArrayMeasureOfIsMarkovKernel๐Ÿ”—

InstanceBandits.ArrayModel.instIsProbabilityMeasureProbSpaceArrayMeasureOfIsMarkovKernel

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.instIsProbabilityMeasureProbSpaceArrayMeasureOfIsMarkovKernel.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : MeasureTheory.IsProbabilityMeasure (arrayMeasure ฮฝ)
Bandits.ArrayModel.instIsProbabilityMeasureProbSpaceArrayMeasureOfIsMarkovKernel.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : MeasureTheory.IsProbabilityMeasure (arrayMeasure ฮฝ)

Code

instance (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] : IsProbabilityMeasure (arrayMeasure ฮฝ)
Type uses (3)
Body uses (2)
Used by (10)

Actions: Source ยท Open Issue

Proof
Measure.prod.instIsProbabilityMeasure _ _

initAlgFunction๐Ÿ”—

DefinitionBandits.ArrayModel.initAlgFunction

The initial action is the image of a uniform random variable by this function.

๐Ÿ”—def
Bandits.ArrayModel.initAlgFunction.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] (alg : Learning.Algorithm ๐“ R) : โ†‘unitInterval โ†’ ๐“
Bandits.ArrayModel.initAlgFunction.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] (alg : Learning.Algorithm ๐“ R) : โ†‘unitInterval โ†’ ๐“

Code

noncomputable
def initAlgFunction (alg : Algorithm ๐“ R) : I โ†’ ๐“ :=
  (Measure.exists_measurable_map_eq alg.p0).choose
Type uses (1)
Body uses (1)
Used by (12)

Actions: Source ยท Open Issue

initAlgFunction_map๐Ÿ”—

LemmaBandits.ArrayModel.initAlgFunction_map

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.initAlgFunction_map.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] (alg : Learning.Algorithm ๐“ R) : MeasureTheory.Measure.map (initAlgFunction alg) MeasureTheory.volume = Learning.Algorithm.p0 alg
Bandits.ArrayModel.initAlgFunction_map.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] (alg : Learning.Algorithm ๐“ R) : MeasureTheory.Measure.map (initAlgFunction alg) MeasureTheory.volume = Learning.Algorithm.p0 alg

Code

lemma initAlgFunction_map (alg : Algorithm ๐“ R) : volume.map (initAlgFunction alg) = alg.p0
Type uses (2)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
(Measure.exists_measurable_map_eq alg.p0).choose_spec.2

measurable_initAlgFunction๐Ÿ”—

LemmaBandits.ArrayModel.measurable_initAlgFunction

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.measurable_initAlgFunction.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] (alg : Learning.Algorithm ๐“ R) : Measurable (initAlgFunction alg)
Bandits.ArrayModel.measurable_initAlgFunction.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] (alg : Learning.Algorithm ๐“ R) : Measurable (initAlgFunction alg)

Code

lemma measurable_initAlgFunction (alg : Algorithm ๐“ R) :
    Measurable (initAlgFunction alg)
Type uses (2)
Body uses (1)
Used by (3)

Actions: Source ยท Open Issue

Proof
(Measure.exists_measurable_map_eq alg.p0).choose_spec.1

algFunction๐Ÿ”—

DefinitionBandits.ArrayModel.algFunction

The next action is the image of the history and a uniform random variable by this function.

๐Ÿ”—def
Bandits.ArrayModel.algFunction.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— R) โ†’ โ†‘unitInterval โ†’ ๐“
Bandits.ArrayModel.algFunction.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— R) โ†’ โ†‘unitInterval โ†’ ๐“

Code

noncomputable
def algFunction (alg : Algorithm ๐“ R) (n : โ„•) :
    (Iic n โ†’ ๐“ ร— R) โ†’ I โ†’ ๐“ :=
  (Kernel.exists_measurable_map_eq_unitInterval (alg.policy n)).choose
Type uses (1)
Body uses (1)
Used by (17)

Actions: Source ยท Open Issue

algFunction_map๐Ÿ”—

LemmaBandits.ArrayModel.algFunction_map

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.algFunction_map.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) (h : โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— R) : MeasureTheory.Measure.map (algFunction alg n h) MeasureTheory.volume = (Learning.Algorithm.policy alg n) h
Bandits.ArrayModel.algFunction_map.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) (h : โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— R) : MeasureTheory.Measure.map (algFunction alg n h) MeasureTheory.volume = (Learning.Algorithm.policy alg n) h

Code

lemma algFunction_map (alg : Algorithm ๐“ R) (n : โ„•) (h : Iic n โ†’ ๐“ ร— R) :
      volume.map (algFunction alg n h) = alg.policy n h
Type uses (2)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
(Kernel.exists_measurable_map_eq_unitInterval (alg.policy n)).choose_spec.2 h

measurable_algFunction๐Ÿ”—

LemmaBandits.ArrayModel.measurable_algFunction

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.measurable_algFunction.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : Measurable (Function.uncurry (algFunction alg n))
Bandits.ArrayModel.measurable_algFunction.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : Measurable (Function.uncurry (algFunction alg n))

Code

lemma measurable_algFunction (alg : Algorithm ๐“ R) (n : โ„•) :
    Measurable (Function.uncurry (algFunction alg n))
Type uses (2)
Body uses (1)
Used by (4)

Actions: Source ยท Open Issue

Proof
(Kernel.exists_measurable_map_eq_unitInterval (alg.policy n)).choose_spec.1

hist๐Ÿ”—

DefinitionBandits.ArrayModel.hist

History of actions and rewards up to time n in the array model.

๐Ÿ”—def
Bandits.ArrayModel.hist.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) (n : โ„•) : โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— R
Bandits.ArrayModel.hist.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) (n : โ„•) : โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— R

Code

noncomputable
def hist [DecidableEq ๐“] (alg : Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) : (n : โ„•) โ†’ Iic n โ†’ ๐“ ร— R
| 0 => fun _ โ†ฆ (initAlgFunction alg (ฯ‰.1 0), ฯ‰.2 0 (initAlgFunction alg (ฯ‰.1 0)))
| n + 1 =>
  let hn : Iic n โ†’ ๐“ ร— R := hist alg ฯ‰ n
  let a : ๐“ := algFunction alg n hn (ฯ‰.1 (n + 1))
  fun i โ†ฆ if hin : i โ‰ค n then hn โŸจi, by simp [hin]โŸฉ else (a, ฯ‰.2 (pullCount' n hn a) a)
Type uses (2)
Body uses (3)
Used by (30)

Actions: Source ยท Open Issue

hist_zero๐Ÿ”—

LemmaBandits.ArrayModel.hist_zero

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.hist_zero.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) : hist alg ฯ‰ 0 = fun x => (initAlgFunction alg (Prod.fst ฯ‰ 0), Prod.snd ฯ‰ 0 (initAlgFunction alg (Prod.fst ฯ‰ 0)))
Bandits.ArrayModel.hist_zero.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) : hist alg ฯ‰ 0 = fun x => (initAlgFunction alg (Prod.fst ฯ‰ 0), Prod.snd ฯ‰ 0 (initAlgFunction alg (Prod.fst ฯ‰ 0)))

Code

lemma hist_zero [DecidableEq ๐“] (alg : Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) :
    hist alg ฯ‰ 0 = fun _ โ†ฆ (initAlgFunction alg (ฯ‰.1 0), ฯ‰.2 0 (initAlgFunction alg (ฯ‰.1 0)))
Type uses (4)

Actions: Source ยท Open Issue

Proof
rfl

hist_add_one๐Ÿ”—

LemmaBandits.ArrayModel.hist_add_one

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.hist_add_one.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) (n : โ„•) : have a := algFunction alg n (hist alg ฯ‰ n) (Prod.fst ฯ‰ (n + 1)); hist alg ฯ‰ (n + 1) = fun i => if hin : โ†‘i โ‰ค n then hist alg ฯ‰ n โŸจโ†‘i, โ‹ฏโŸฉ else (a, Prod.snd ฯ‰ (Learning.pullCount' n (hist alg ฯ‰ n) a) a)
Bandits.ArrayModel.hist_add_one.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) (n : โ„•) : have a := algFunction alg n (hist alg ฯ‰ n) (Prod.fst ฯ‰ (n + 1)); hist alg ฯ‰ (n + 1) = fun i => if hin : โ†‘i โ‰ค n then hist alg ฯ‰ n โŸจโ†‘i, โ‹ฏโŸฉ else (a, Prod.snd ฯ‰ (Learning.pullCount' n (hist alg ฯ‰ n) a) a)

Code

lemma hist_add_one [DecidableEq ๐“] (alg : Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) (n : โ„•) :
    let a : ๐“
Type uses (5)
Used by (5)

Actions: Source ยท Open Issue

Proof
algFunction alg n (hist alg ฯ‰ n) (ฯ‰.1 (n + 1))
    hist alg ฯ‰ (n + 1) =
      fun (i : Iic (n + 1)) โ†ฆ if hin : i โ‰ค n then hist alg ฯ‰ n โŸจi, by simp [hin]โŸฉ
        else (a, ฯ‰.2 (pullCount' n (hist alg ฯ‰ n) a) a) := rfl

hist_eq๐Ÿ”—

LemmaBandits.ArrayModel.hist_eq

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.hist_eq.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) (n : โ„•) : hist alg ฯ‰ n = fun i => hist alg ฯ‰ โ†‘i โŸจโ†‘i, โ‹ฏโŸฉ
Bandits.ArrayModel.hist_eq.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) (n : โ„•) : hist alg ฯ‰ n = fun i => hist alg ฯ‰ โ†‘i โŸจโ†‘i, โ‹ฏโŸฉ

Code

lemma hist_eq [DecidableEq ๐“] (alg : Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) (n : โ„•) :
    hist alg ฯ‰ n = fun i : Iic n โ†ฆ hist alg ฯ‰ i โŸจi.1, by simpโŸฉ
Type uses (3)
Body uses (4)
Used by (7)

Actions: Source ยท Open Issue

Proof
by
  induction n with
  | zero =>
    ext i : 1
    simp only [hist]
    rw [Unique.eq_default i]
    simp [coe_default_Iic_zero]
  | succ n hn =>
    ext i : 1
    by_cases hin : i โ‰ค n
    ยท rw [hist_add_one]
      simp only [hin, โ†“reduceDIte]
      rw [funext_iff] at hn
      simp_rw [hn]
    ยท grind

hist_add_one_eq_IicSuccProd'๐Ÿ”—

LemmaBandits.ArrayModel.hist_add_one_eq_IicSuccProd'

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.hist_add_one_eq_IicSuccProd'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) (n : โ„•) : have a := algFunction alg n (hist alg ฯ‰ n) (Prod.fst ฯ‰ (n + 1)); hist alg ฯ‰ (n + 1) = (MeasurableEquiv.symm (MeasurableEquiv.IicSuccProd (fun x => ๐“ ร— R) n)) (hist alg ฯ‰ n, a, Prod.snd ฯ‰ (Learning.pullCount' n (hist alg ฯ‰ n) a) a)
Bandits.ArrayModel.hist_add_one_eq_IicSuccProd'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) (n : โ„•) : have a := algFunction alg n (hist alg ฯ‰ n) (Prod.fst ฯ‰ (n + 1)); hist alg ฯ‰ (n + 1) = (MeasurableEquiv.symm (MeasurableEquiv.IicSuccProd (fun x => ๐“ ร— R) n)) (hist alg ฯ‰ n, a, Prod.snd ฯ‰ (Learning.pullCount' n (hist alg ฯ‰ n) a) a)

Code

lemma hist_add_one_eq_IicSuccProd' [DecidableEq ๐“] (alg : Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R)
    (n : โ„•) :
    let a : ๐“
Type uses (6)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
algFunction alg n (hist alg ฯ‰ n) (ฯ‰.1 (n + 1))
    hist alg ฯ‰ (n + 1) =
      (MeasurableEquiv.IicSuccProd (fun _ โ†ฆ ๐“ ร— R) n).symm
        (hist alg ฯ‰ n, (a, ฯ‰.2 (pullCount' n (hist alg ฯ‰ n) a) a)) := by
  intro a
  rw [hist_add_one]
  ext i : 1
  simp only [Kernel.symm_IicSuccProd, MeasurableEquiv.prodCongr, MeasurableEquiv.refl_toEquiv,
    MeasurableEquiv.piSingleton, eq_rec_constant, MeasurableEquiv.IicProdIoc,
    MeasurableEquiv.trans_apply, MeasurableEquiv.coe_mk, Equiv.prodCongr_apply, Equiv.coe_refl,
    Equiv.coe_fn_mk, Prod.map_apply, id_eq]
  rfl

action๐Ÿ”—

DefinitionBandits.ArrayModel.action

Action taken at time n in the array model.

๐Ÿ”—def
Bandits.ArrayModel.action.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) (ฯ‰ : probSpace ๐“ R) : ๐“
Bandits.ArrayModel.action.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) (ฯ‰ : probSpace ๐“ R) : ๐“

Code

noncomputable
def action [DecidableEq ๐“] (alg : Algorithm ๐“ R) (n : โ„•) (ฯ‰ : probSpace ๐“ R) : ๐“ :=
  (hist alg ฯ‰ n โŸจn, by simpโŸฉ).1
Type uses (2)
Body uses (1)
Used by (43)

Actions: Source ยท Open Issue

action_zero๐Ÿ”—

LemmaBandits.ArrayModel.action_zero

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.action_zero.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) : action alg 0 = fun ฯ‰ => initAlgFunction alg (Prod.fst ฯ‰ 0)
Bandits.ArrayModel.action_zero.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) : action alg 0 = fun ฯ‰ => initAlgFunction alg (Prod.fst ฯ‰ 0)

Code

lemma action_zero [DecidableEq ๐“] (alg : Algorithm ๐“ R) :
    action alg 0 = fun ฯ‰ โ†ฆ initAlgFunction alg (ฯ‰.1 0)
Type uses (4)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  ext
  simp [action, hist_zero]

action_add_one_eq๐Ÿ”—

LemmaBandits.ArrayModel.action_add_one_eq

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.action_add_one_eq.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : action alg (n + 1) = fun ฯ‰ => algFunction alg n (hist alg ฯ‰ n) (Prod.fst ฯ‰ (n + 1))
Bandits.ArrayModel.action_add_one_eq.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : action alg (n + 1) = fun ฯ‰ => algFunction alg n (hist alg ฯ‰ n) (Prod.fst ฯ‰ (n + 1))

Code

lemma action_add_one_eq [DecidableEq ๐“] (alg : Algorithm ๐“ R) (n : โ„•) :
    action alg (n + 1) = fun ฯ‰ โ†ฆ algFunction alg n (hist alg ฯ‰ n) (ฯ‰.1 (n + 1))
Type uses (5)
Body uses (2)
Used by (7)

Actions: Source ยท Open Issue

Proof
by
  ext ฯ‰
  rw [action, hist_add_one]
  simp only [add_le_iff_nonpos_right, nonpos_iff_eq_zero, one_ne_zero, โ†“reduceDIte]

reward๐Ÿ”—

DefinitionBandits.ArrayModel.reward

Reward received at time n in the array model.

๐Ÿ”—def
Bandits.ArrayModel.reward.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) (ฯ‰ : probSpace ๐“ R) : R
Bandits.ArrayModel.reward.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) (ฯ‰ : probSpace ๐“ R) : R

Code

noncomputable
def reward [DecidableEq ๐“] (alg : Algorithm ๐“ R) (n : โ„•) (ฯ‰ : probSpace ๐“ R) : R :=
  (hist alg ฯ‰ n โŸจn, by simpโŸฉ).2
Type uses (2)
Body uses (1)
Used by (24)

Actions: Source ยท Open Issue

reward_zero๐Ÿ”—

LemmaBandits.ArrayModel.reward_zero

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.reward_zero.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) : reward alg 0 = fun ฯ‰ => Prod.snd ฯ‰ 0 (action alg 0 ฯ‰)
Bandits.ArrayModel.reward_zero.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) : reward alg 0 = fun ฯ‰ => Prod.snd ฯ‰ 0 (action alg 0 ฯ‰)

Code

lemma reward_zero [DecidableEq ๐“] (alg : Algorithm ๐“ R) :
    reward alg 0 = fun ฯ‰ โ†ฆ ฯ‰.2 0 (action alg 0 ฯ‰)
Type uses (4)
Body uses (2)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  ext
  simp [reward, hist_zero, action_zero]

reward_add_one๐Ÿ”—

LemmaBandits.ArrayModel.reward_add_one

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.reward_add_one.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : reward alg (n + 1) = fun ฯ‰ => Prod.snd ฯ‰ (Learning.pullCount' n (hist alg ฯ‰ n) (action alg (n + 1) ฯ‰)) (action alg (n + 1) ฯ‰)
Bandits.ArrayModel.reward_add_one.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : reward alg (n + 1) = fun ฯ‰ => Prod.snd ฯ‰ (Learning.pullCount' n (hist alg ฯ‰ n) (action alg (n + 1) ฯ‰)) (action alg (n + 1) ฯ‰)

Code

lemma reward_add_one [DecidableEq ๐“] (alg : Algorithm ๐“ R) (n : โ„•) :
    reward alg (n + 1) =
      fun ฯ‰ โ†ฆ ฯ‰.2 (pullCount' n (hist alg ฯ‰ n) (action alg (n + 1) ฯ‰)) (action alg (n + 1) ฯ‰)
Type uses (6)
Body uses (3)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  ext ฯ‰
  simp [reward, hist_add_one, action_add_one_eq]

reward_eq๐Ÿ”—

LemmaBandits.ArrayModel.reward_eq

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.reward_eq.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : reward alg n = fun ฯ‰ => Prod.snd ฯ‰ (Learning.pullCount (action alg) (action alg n ฯ‰) n ฯ‰) (action alg n ฯ‰)
Bandits.ArrayModel.reward_eq.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : reward alg n = fun ฯ‰ => Prod.snd ฯ‰ (Learning.pullCount (action alg) (action alg n ฯ‰) n ฯ‰) (action alg n ฯ‰)

Code

lemma reward_eq [DecidableEq ๐“] (alg : Algorithm ๐“ R) (n : โ„•) :
    reward alg n = fun ฯ‰ โ†ฆ ฯ‰.2 (pullCount (action alg) (action alg n ฯ‰) n ฯ‰) (action alg n ฯ‰)
Type uses (5)
Body uses (11)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  cases n with
  | zero => ext; simp [reward_zero, action_zero]
  | succ n =>
    ext ฯ‰
    rw [reward, hist_add_one]
    simp only [add_le_iff_nonpos_right, nonpos_iff_eq_zero, one_ne_zero, โ†“reduceDIte]
    rw [action_add_one_eq, pullCount_eq_pullCount' (R' := reward alg) (by simp)]
    simp only [Nat.add_one_sub_one]
    rw [hist_eq]
    rfl

sumRewards_eq๐Ÿ”—

LemmaBandits.ArrayModel.sumRewards_eq

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.sumRewards_eq.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ โ„) (a : ๐“) (n : โ„•) (ฯ‰ : probSpace ๐“ โ„) : Learning.sumRewards (action alg) (reward alg) a n ฯ‰ = โˆ‘ i โˆˆ Finset.range (Learning.pullCount (action alg) a n ฯ‰), Prod.snd ฯ‰ i a
Bandits.ArrayModel.sumRewards_eq.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ โ„) (a : ๐“) (n : โ„•) (ฯ‰ : probSpace ๐“ โ„) : Learning.sumRewards (action alg) (reward alg) a n ฯ‰ = โˆ‘ i โˆˆ Finset.range (Learning.pullCount (action alg) a n ฯ‰), Prod.snd ฯ‰ i a

Code

lemma sumRewards_eq [DecidableEq ๐“] (alg : Algorithm ๐“ โ„) (a : ๐“) (n : โ„•) (ฯ‰ : probSpace ๐“ โ„) :
    sumRewards (action alg) (reward alg) a n ฯ‰ =
      โˆ‘ i โˆˆ range (pullCount (action alg) a n ฯ‰), ฯ‰.2 i a
Type uses (6)
Body uses (6)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  induction n with
  | zero => simp
  | succ n ih =>
    by_cases ha : action alg n ฯ‰ = a
    ยท simp [ha, sumRewards_add_one, pullCount_add_one, sum_range_succ, ih, reward_eq]
    ยท simp [ha, sumRewards_add_one, pullCount_eq_pullCount_of_action_ne, ih]

measurable_action_add_one'๐Ÿ”—

LemmaBandits.ArrayModel.measurable_action_add_one'

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.measurable_action_add_one'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] {alg : Learning.Algorithm ๐“ R} (n : โ„•) (h : Measurable fun x => hist alg x n) : Measurable fun x => algFunction alg n (hist alg x n) (Prod.fst x (n + 1))
Bandits.ArrayModel.measurable_action_add_one'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] {alg : Learning.Algorithm ๐“ R} (n : โ„•) (h : Measurable fun x => hist alg x n) : Measurable fun x => algFunction alg n (hist alg x n) (Prod.fst x (n + 1))

Code

lemma measurable_action_add_one' [DecidableEq ๐“] {alg : Algorithm ๐“ R}
    (n : โ„•) (h : Measurable (hist alg ยท n)) :
    Measurable (fun x โ†ฆ algFunction alg n (hist alg x n) (x.1 (n + 1)))
Type uses (5)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by fun_prop

measurable_pullCount'_action_add_one๐Ÿ”—

LemmaBandits.ArrayModel.measurable_pullCount'_action_add_one

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.measurable_pullCount'_action_add_one.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] {alg : Learning.Algorithm ๐“ R} (n : โ„•) (h_hist : Measurable fun x => hist alg x n) : Measurable fun x => Learning.pullCount' n (hist alg x n) (algFunction alg n (hist alg x n) (Prod.fst x (n + 1)))
Bandits.ArrayModel.measurable_pullCount'_action_add_one.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] {alg : Learning.Algorithm ๐“ R} (n : โ„•) (h_hist : Measurable fun x => hist alg x n) : Measurable fun x => Learning.pullCount' n (hist alg x n) (algFunction alg n (hist alg x n) (Prod.fst x (n + 1)))

Code

lemma measurable_pullCount'_action_add_one [DecidableEq ๐“] {alg : Algorithm ๐“ R}
    (n : โ„•) (h_hist : Measurable (hist alg ยท n)) :
    Measurable (fun x โ†ฆ
      pullCount' n (hist alg x n) (algFunction alg n (hist alg x n) (x.1 (n + 1))))
Type uses (6)
Body uses (3)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  have h_alg_meas : Measurable (fun x โ†ฆ algFunction alg n (hist alg x n) (x.1 (n + 1))) :=
    measurable_action_add_one' n h_hist
  exact (measurable_uncurry_pullCount' (๐“ := ๐“) n).comp (h_hist.prodMk h_alg_meas)

measurable_hist๐Ÿ”—

LemmaBandits.ArrayModel.measurable_hist

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.measurable_hist.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : Measurable fun ฯ‰ => hist alg ฯ‰ n
Bandits.ArrayModel.measurable_hist.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : Measurable fun ฯ‰ => hist alg ฯ‰ n

Code

lemma measurable_hist [DecidableEq ๐“] [Countable ๐“] (alg : Algorithm ๐“ R) (n : โ„•) :
    Measurable (fun ฯ‰ โ†ฆ hist alg ฯ‰ n)
Type uses (4)
Body uses (7)
Used by (8)

Actions: Source ยท Open Issue

Proof
by
  induction n with
  | zero =>
    simp_rw [hist_zero, measurable_pi_iff]
    refine fun _ โ†ฆ Measurable.prodMk (by fun_prop) ?_
    change Measurable ((fun x : ๐“ ร— ((โ„• โ†’ I) ร— (โ„• โ†’ ๐“ โ†’ R)) โ†ฆ x.2.2 0 x.1) โˆ˜
        (fun x : (โ„• โ†’ I) ร— (โ„• โ†’ ๐“ โ†’ R) โ†ฆ (initAlgFunction alg (x.1 0), x)))
    have : Measurable (fun x : ๐“ ร— ((โ„• โ†’ I) ร— (โ„• โ†’ ๐“ โ†’ R)) โ†ฆ x.2.2 0 x.1) :=
      measurable_from_prod_countable_right fun p โ†ฆ by simp only; fun_prop
    exact Measurable.comp (by fun_prop) (Measurable.prodMk (by fun_prop) (by fun_prop))
  | succ n hn =>
    refine measurable_pi_iff.mpr fun i โ†ฆ ?_
    by_cases hin : i โ‰ค n
    ยท simp only [hist, hin, โ†“reduceDIte]
      rw [measurable_pi_iff] at hn
      exact hn โŸจi.1, by simp [hin]โŸฉ
    ยท simp only [hist, hin, โ†“reduceDIte]
      refine Measurable.prodMk (by fun_prop) ?_
      change Measurable ((fun (x : (โ„• โ†’ ๐“ โ†’ R) ร— โ„• ร— ๐“) โ†ฆ x.1 x.2.1 x.2.2) โˆ˜
        (fun x โ†ฆ (x.2, pullCount' n (hist alg x n) (algFunction alg n (hist alg x n) (x.1 (n + 1))),
          (algFunction alg n (hist alg x n) (x.1 (n + 1))))))
      have h1 : Measurable (fun (x : (โ„• โ†’ ๐“ โ†’ R) ร— โ„• ร— ๐“) โ†ฆ x.1 x.2.1 x.2.2) :=
        measurable_from_prod_countable_left fun p : โ„• ร— ๐“ โ†ฆ (by simp only; fun_prop)
      refine Measurable.comp (by fun_prop) (Measurable.prodMk (by fun_prop) ?_)
      refine Measurable.prodMk ?_ (by fun_prop)
      exact measurable_pullCount'_action_add_one n hn

measurable_action๐Ÿ”—

LemmaBandits.ArrayModel.measurable_action

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.measurable_action.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : Measurable (action alg n)
Bandits.ArrayModel.measurable_action.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : Measurable (action alg n)

Code

lemma measurable_action [DecidableEq ๐“] [Countable ๐“] (alg : Algorithm ๐“ R) (n : โ„•) :
    Measurable (action alg n)
Type uses (4)
Body uses (2)
Used by (13)

Actions: Source ยท Open Issue

Proof
by unfold action; fun_prop

measurable_reward๐Ÿ”—

LemmaBandits.ArrayModel.measurable_reward

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.measurable_reward.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : Measurable (reward alg n)
Bandits.ArrayModel.measurable_reward.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : Measurable (reward alg n)

Code

lemma measurable_reward [DecidableEq ๐“] [Countable ๐“] (alg : Algorithm ๐“ R) (n : โ„•) :
    Measurable (reward alg n)
Type uses (4)
Body uses (2)
Used by (8)

Actions: Source ยท Open Issue

Proof
by unfold reward; fun_prop

hist_add_one_eq_IicSuccProd๐Ÿ”—

LemmaBandits.ArrayModel.hist_add_one_eq_IicSuccProd

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.hist_add_one_eq_IicSuccProd.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) (n : โ„•) : hist alg ฯ‰ (n + 1) = (MeasurableEquiv.symm (MeasurableEquiv.IicSuccProd (fun x => ๐“ ร— R) n)) (hist alg ฯ‰ n, action alg (n + 1) ฯ‰, reward alg (n + 1) ฯ‰)
Bandits.ArrayModel.hist_add_one_eq_IicSuccProd.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R) (n : โ„•) : hist alg ฯ‰ (n + 1) = (MeasurableEquiv.symm (MeasurableEquiv.IicSuccProd (fun x => ๐“ ร— R) n)) (hist alg ฯ‰ n, action alg (n + 1) ฯ‰, reward alg (n + 1) ฯ‰)

Code

lemma hist_add_one_eq_IicSuccProd [DecidableEq ๐“] (alg : Algorithm ๐“ R) (ฯ‰ : probSpace ๐“ R)
    (n : โ„•) :
    hist alg ฯ‰ (n + 1) =
      (MeasurableEquiv.IicSuccProd (fun _ โ†ฆ ๐“ ร— R) n).symm
        (hist alg ฯ‰ n, (action alg (n + 1) ฯ‰, reward alg (n + 1) ฯ‰))
Type uses (6)
Body uses (5)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  rw [hist_add_one_eq_IicSuccProd', reward_add_one, action_add_one_eq]

measurable_pullCount_action_add_one๐Ÿ”—

LemmaBandits.ArrayModel.measurable_pullCount_action_add_one

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.measurable_pullCount_action_add_one.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : Measurable fun ฯ‰ => Learning.pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰
Bandits.ArrayModel.measurable_pullCount_action_add_one.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : Measurable fun ฯ‰ => Learning.pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰

Code

lemma measurable_pullCount_action_add_one [DecidableEq ๐“] [Countable ๐“] (alg : Algorithm ๐“ R)
    (n : โ„•) :
    Measurable (fun ฯ‰ โ†ฆ pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)
Type uses (5)
Body uses (3)
Used by (6)

Actions: Source ยท Open Issue

Proof
by
  change Measurable ((fun p : (probSpace ๐“ R) ร— ๐“ โ†ฆ pullCount (action alg) p.2 (n + 1) p.1) โˆ˜
    (fun ฯ‰ : probSpace ๐“ R โ†ฆ (ฯ‰, action alg (n + 1) ฯ‰)))
  exact (measurable_uncurry_pullCount (by fun_prop) _).comp (by fun_prop)

hist_congr๐Ÿ”—

LemmaBandits.ArrayModel.hist_congr

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.hist_congr.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) {ฯ‰ ฯ‰' : probSpace ๐“ R} (hฯ‰1 : โˆ€ i โ‰ค n, Prod.fst ฯ‰ i = Prod.fst ฯ‰' i) (hฯ‰2 : โˆ€ (i : โ„•) (a : ๐“), i < Learning.pullCount (action alg) a (n + 1) ฯ‰ โ†’ Prod.snd ฯ‰ i a = Prod.snd ฯ‰' i a) : hist alg ฯ‰ n = hist alg ฯ‰' n
Bandits.ArrayModel.hist_congr.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) {ฯ‰ ฯ‰' : probSpace ๐“ R} (hฯ‰1 : โˆ€ i โ‰ค n, Prod.fst ฯ‰ i = Prod.fst ฯ‰' i) (hฯ‰2 : โˆ€ (i : โ„•) (a : ๐“), i < Learning.pullCount (action alg) a (n + 1) ฯ‰ โ†’ Prod.snd ฯ‰ i a = Prod.snd ฯ‰' i a) : hist alg ฯ‰ n = hist alg ฯ‰' n

Code

lemma hist_congr (alg : Algorithm ๐“ R) (n : โ„•) {ฯ‰ ฯ‰' : probSpace ๐“ R}
    (hฯ‰1 : โˆ€ i โ‰ค n, ฯ‰.1 i = ฯ‰'.1 i)
    (hฯ‰2 : โˆ€ i a, i < pullCount (action alg) a (n + 1) ฯ‰ โ†’ ฯ‰.2 i a = ฯ‰'.2 i a) :
    hist alg ฯ‰ n = hist alg ฯ‰' n
Type uses (5)
Body uses (13)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  induction n with
  | zero =>
    simp only [zero_add, pullCount_one] at hฯ‰2
    simp_rw [hist_zero]
    ext i : 1
    simp only [le_refl, hฯ‰1, Prod.mk.injEq, true_and]
    refine hฯ‰2 0 _ ?_
    simp [action, hฯ‰1]
  | succ n hn =>
    simp_rw [hist_add_one_eq_IicSuccProd]
    specialize hn fun i hin โ†ฆ hฯ‰1 i (by grind)
    have h_hist : hist alg ฯ‰ n = hist alg ฯ‰' n := by
      refine hn fun i a hi โ†ฆ hฯ‰2 i a (hi.trans_le ?_)
      exact pullCount_mono _ (by lia) _
    have h_action : action alg (n + 1) ฯ‰ = action alg (n + 1) ฯ‰' := by
      simp_rw [action_add_one_eq]
      rw [h_hist, hฯ‰1 _ le_rfl]
    congr 3
    simp only [reward_add_one, h_hist, h_action]
    refine hฯ‰2 _ _ ?_
    rw [pullCount_add_one, h_action]
    simp only [โ†“reduceIte]
    rw [pullCount_eq_pullCount' (R' := reward alg) (by simp)]
    simp only [Nat.add_one_sub_one]
    rw [โ† h_hist, hist_eq]
    change pullCount' n  (fun i โ†ฆ (action alg i ฯ‰, reward alg i ฯ‰)) (action alg (n + 1) ฯ‰') <
      pullCount' n (fun i โ†ฆ (action alg i ฯ‰, reward alg i ฯ‰)) (action alg (n + 1) ฯ‰') + 1
    grind

stepsUntil_congr_aux๐Ÿ”—

LemmaBandits.ArrayModel.stepsUntil_congr_aux

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.stepsUntil_congr_aux.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (m n : โ„•) {ฯ‰ ฯ‰' : probSpace ๐“ R} (hฯ‰1 : โˆ€ (i : โ„•), Prod.fst ฯ‰ i = Prod.fst ฯ‰' i) (hฯ‰2_ne : โˆ€ (i : โ„•) (b : ๐“), b โ‰  a โ†’ Prod.snd ฯ‰ i b = Prod.snd ฯ‰' i b) (hฯ‰2_eq : โˆ€ (i : โ„•), i + 1 โ‰ค m โ†’ Prod.snd ฯ‰ i a = Prod.snd ฯ‰' i a) (h_eq : action alg (n + 1) ฯ‰ = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰ = m) : action alg (n + 1) ฯ‰' = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰' = m
Bandits.ArrayModel.stepsUntil_congr_aux.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (m n : โ„•) {ฯ‰ ฯ‰' : probSpace ๐“ R} (hฯ‰1 : โˆ€ (i : โ„•), Prod.fst ฯ‰ i = Prod.fst ฯ‰' i) (hฯ‰2_ne : โˆ€ (i : โ„•) (b : ๐“), b โ‰  a โ†’ Prod.snd ฯ‰ i b = Prod.snd ฯ‰' i b) (hฯ‰2_eq : โˆ€ (i : โ„•), i + 1 โ‰ค m โ†’ Prod.snd ฯ‰ i a = Prod.snd ฯ‰' i a) (h_eq : action alg (n + 1) ฯ‰ = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰ = m) : action alg (n + 1) ฯ‰' = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰' = m

Code

lemma stepsUntil_congr_aux (alg : Algorithm ๐“ R)
    (a : ๐“) (m n : โ„•) {ฯ‰ ฯ‰' : probSpace ๐“ R}
    (hฯ‰1 : โˆ€ i, ฯ‰.1 i = ฯ‰'.1 i) (hฯ‰2_ne : โˆ€ i b, b โ‰  a โ†’ ฯ‰.2 i b = ฯ‰'.2 i b)
    (hฯ‰2_eq : โˆ€ i, i + 1 โ‰ค m โ†’ ฯ‰.2 i a = ฯ‰'.2 i a)
    (h_eq : action alg (n + 1) ฯ‰ = a โˆง pullCount (action alg) a (n + 1) ฯ‰ = m) :
    action alg (n + 1) ฯ‰' = a โˆง pullCount (action alg) a (n + 1) ฯ‰' = m
Type uses (4)
Body uses (6)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  obtain โŸจh_action, h_pcโŸฉ := h_eq
  have h_hist := hist_congr alg n (ฯ‰ := ฯ‰) (ฯ‰' := ฯ‰') (by grind) fun i b hi โ†ฆ ?_
  swap
  ยท rcases eq_or_ne b a with (rfl | hba)
    ยท refine hฯ‰2_eq i ?_
      rw [h_pc] at hi
      grind
    ยท grind
  constructor
  ยท rw [โ† h_action, action_add_one_eq]
    simp [h_hist, hฯ‰1]
  ยท simp_rw [โ† h_pc, pullCount_eq_sum]
    refine Finset.sum_congr rfl fun i hi โ†ฆ ?_
    congr 2
    rw [hist_eq _ _ n, hist_eq _ _ n, funext_iff] at h_hist
    unfold action
    specialize h_hist โŸจi, by grindโŸฉ
    simp only at h_hist
    rw [h_hist]

stepsUntil_congr๐Ÿ”—

LemmaBandits.ArrayModel.stepsUntil_congr

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.stepsUntil_congr.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (m n : โ„•) {ฯ‰ ฯ‰' : probSpace ๐“ R} (hฯ‰1 : โˆ€ (i : โ„•), Prod.fst ฯ‰ i = Prod.fst ฯ‰' i) (hฯ‰2_ne : โˆ€ (i : โ„•) (b : ๐“), b โ‰  a โ†’ Prod.snd ฯ‰ i b = Prod.snd ฯ‰' i b) (hฯ‰2_eq : โˆ€ (i : โ„•), i + 1 โ‰ค m โ†’ Prod.snd ฯ‰ i a = Prod.snd ฯ‰' i a) : action alg (n + 1) ฯ‰ = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰ = m โ†” action alg (n + 1) ฯ‰' = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰' = m
Bandits.ArrayModel.stepsUntil_congr.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (m n : โ„•) {ฯ‰ ฯ‰' : probSpace ๐“ R} (hฯ‰1 : โˆ€ (i : โ„•), Prod.fst ฯ‰ i = Prod.fst ฯ‰' i) (hฯ‰2_ne : โˆ€ (i : โ„•) (b : ๐“), b โ‰  a โ†’ Prod.snd ฯ‰ i b = Prod.snd ฯ‰' i b) (hฯ‰2_eq : โˆ€ (i : โ„•), i + 1 โ‰ค m โ†’ Prod.snd ฯ‰ i a = Prod.snd ฯ‰' i a) : action alg (n + 1) ฯ‰ = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰ = m โ†” action alg (n + 1) ฯ‰' = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰' = m

Code

lemma stepsUntil_congr (alg : Algorithm ๐“ R) (a : ๐“) (m n : โ„•) {ฯ‰ ฯ‰' : probSpace ๐“ R}
    (hฯ‰1 : โˆ€ i, ฯ‰.1 i = ฯ‰'.1 i) (hฯ‰2_ne : โˆ€ i b, b โ‰  a โ†’ ฯ‰.2 i b = ฯ‰'.2 i b)
    (hฯ‰2_eq : โˆ€ i, i + 1 โ‰ค m โ†’ ฯ‰.2 i a = ฯ‰'.2 i a) :
    (action alg (n + 1) ฯ‰ = a โˆง pullCount (action alg) a (n + 1) ฯ‰ = m) โ†”
      (action alg (n + 1) ฯ‰' = a โˆง pullCount (action alg) a (n + 1) ฯ‰' = m)
Type uses (4)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
โŸจstepsUntil_congr_aux alg a m n hฯ‰1 hฯ‰2_ne hฯ‰2_eq,
    stepsUntil_congr_aux alg a m n (by grind) (by grind) (by grind)โŸฉ

stepsUntil_indicator_congr๐Ÿ”—

LemmaBandits.ArrayModel.stepsUntil_indicator_congr

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.stepsUntil_indicator_congr.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (m n : โ„•) {ฯ‰ ฯ‰' : probSpace ๐“ R} (hฯ‰1 : โˆ€ (i : โ„•), Prod.fst ฯ‰ i = Prod.fst ฯ‰' i) (hฯ‰2_ne : โˆ€ (i : โ„•) (b : ๐“), b โ‰  a โ†’ Prod.snd ฯ‰ i b = Prod.snd ฯ‰' i b) (hฯ‰2_eq : โˆ€ (i : โ„•), i + 1 โ‰ค m โ†’ Prod.snd ฯ‰ i a = Prod.snd ฯ‰' i a) : Set.indicator {ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰ = m} (fun x => 1) ฯ‰ = Set.indicator {ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰ = m} (fun x => 1) ฯ‰'
Bandits.ArrayModel.stepsUntil_indicator_congr.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (m n : โ„•) {ฯ‰ ฯ‰' : probSpace ๐“ R} (hฯ‰1 : โˆ€ (i : โ„•), Prod.fst ฯ‰ i = Prod.fst ฯ‰' i) (hฯ‰2_ne : โˆ€ (i : โ„•) (b : ๐“), b โ‰  a โ†’ Prod.snd ฯ‰ i b = Prod.snd ฯ‰' i b) (hฯ‰2_eq : โˆ€ (i : โ„•), i + 1 โ‰ค m โ†’ Prod.snd ฯ‰ i a = Prod.snd ฯ‰' i a) : Set.indicator {ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰ = m} (fun x => 1) ฯ‰ = Set.indicator {ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰ = m} (fun x => 1) ฯ‰'

Code

lemma stepsUntil_indicator_congr (alg : Algorithm ๐“ R) (a : ๐“) (m n : โ„•) {ฯ‰ ฯ‰' : probSpace ๐“ R}
    (hฯ‰1 : โˆ€ i, ฯ‰.1 i = ฯ‰'.1 i) (hฯ‰2_ne : โˆ€ i b, b โ‰  a โ†’ ฯ‰.2 i b = ฯ‰'.2 i b)
    (hฯ‰2_eq : โˆ€ i, i + 1 โ‰ค m โ†’ ฯ‰.2 i a = ฯ‰'.2 i a) :
    {ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง pullCount (action alg) a (n + 1) ฯ‰ = m}.indicator (fun _ โ†ฆ 1)
        ฯ‰ =
      {ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง pullCount (action alg) a (n + 1) ฯ‰ = m}.indicator
        (fun _ โ†ฆ 1) ฯ‰'
Type uses (4)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  simp only [Set.indicator_apply, Set.mem_setOf_eq]
  simp_rw [stepsUntil_congr alg a m n hฯ‰1 hฯ‰2_ne hฯ‰2_eq]

measurable_hist_comap๐Ÿ”—

LemmaBandits.ArrayModel.measurable_hist_comap

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.measurable_hist_comap.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : Measurable fun x => hist alg x n
Bandits.ArrayModel.measurable_hist_comap.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : Measurable fun x => hist alg x n

Code

lemma measurable_hist_comap [Countable ๐“] (alg : Algorithm ๐“ R) (n : โ„•) :
    Measurable[MeasurableSpace.comap (fun ฯ‰ โ†ฆ (fun (i : Iic n) โ†ฆ ฯ‰.1 i, ฯ‰.2)) inferInstance]
      (hist alg ยท n)
Type uses (3)
Body uses (6)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  have h_eq : (hist alg ยท n) =
      ((hist alg ยท n) โˆ˜ (fun p โ†ฆ (fun i : โ„• โ†ฆ p.1 โŸจmin i n, by grindโŸฉ, p.2))) โˆ˜
        (fun ฯ‰ โ†ฆ (fun (i : Iic n) โ†ฆ ฯ‰.1 i, ฯ‰.2)) := by
    ext ฯ‰ : 1
    exact hist_congr alg n (by grind) (by simp)
  rw [h_eq]
  refine measurable_comp_comap _ (Measurable.comp (by fun_prop) ?_)
  refine Measurable.prodMk ?_ (by fun_prop)
  rw [measurable_pi_iff]
  intro i
  change Measurable ((fun p โ†ฆ p โŸจmin i n, by simpโŸฉ) โˆ˜ (fun x : (Iic n โ†’ I) ร— (โ„• โ†’ ๐“ โ†’ R) โ†ฆ x.1))
  exact Measurable.comp (by fun_prop) measurable_fst

truePast๐Ÿ”—

DefinitionBandits.ArrayModel.truePast

All random variables in the space, except for the unseen rewards for action a after time n.

๐Ÿ”—def
Bandits.ArrayModel.truePast.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (n : โ„•) (ฯ‰ : probSpace ๐“ R) : probSpace ๐“ R
Bandits.ArrayModel.truePast.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (n : โ„•) (ฯ‰ : probSpace ๐“ R) : probSpace ๐“ R

Code

noncomputable
def truePast (alg : Algorithm ๐“ R) (a : ๐“) (n : โ„•) (ฯ‰ : probSpace ๐“ R) :
    probSpace ๐“ R :=
  (ฯ‰.1, fun i b โ†ฆ if b = a then if pullCount (action alg) a (n + 1) ฯ‰ โ‰  0 then
      ฯ‰.2 (min i ((pullCount (action alg) a (n + 1) ฯ‰) - 1)) a else Nonempty.some inferInstance
    else ฯ‰.2 i b)
Type uses (2)
Body uses (2)
Used by (6)

Actions: Source ยท Open Issue

truePast_eq_of_pullCount_eq๐Ÿ”—

LemmaBandits.ArrayModel.truePast_eq_of_pullCount_eq

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.truePast_eq_of_pullCount_eq.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (n m : โ„•) (ฯ‰ : probSpace ๐“ R) (h_pc : Learning.pullCount (action alg) a (n + 1) ฯ‰ = m) : truePast alg a n ฯ‰ = (Prod.fst ฯ‰, fun i b => if b = a then if m โ‰  0 then Prod.snd ฯ‰ (min i (m - 1)) a else Nonempty.some โ‹ฏ else Prod.snd ฯ‰ i b)
Bandits.ArrayModel.truePast_eq_of_pullCount_eq.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (n m : โ„•) (ฯ‰ : probSpace ๐“ R) (h_pc : Learning.pullCount (action alg) a (n + 1) ฯ‰ = m) : truePast alg a n ฯ‰ = (Prod.fst ฯ‰, fun i b => if b = a then if m โ‰  0 then Prod.snd ฯ‰ (min i (m - 1)) a else Nonempty.some โ‹ฏ else Prod.snd ฯ‰ i b)

Code

lemma truePast_eq_of_pullCount_eq (alg : Algorithm ๐“ R)
    (a : ๐“) (n m : โ„•) (ฯ‰ : probSpace ๐“ R)
    (h_pc : pullCount (action alg) a (n + 1) ฯ‰ = m) :
    truePast alg a n ฯ‰ = (ฯ‰.1, fun i b โ†ฆ if b = a then if m โ‰  0 then
      ฯ‰.2 (min i (m - 1)) a else Nonempty.some inferInstance else ฯ‰.2 i b)
Type uses (5)

Actions: Source ยท Open Issue

Proof
by
  simp [truePast, h_pc]

truePast_eq_of_pullCount_eq_of_ne_zero๐Ÿ”—

LemmaBandits.ArrayModel.truePast_eq_of_pullCount_eq_of_ne_zero

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.truePast_eq_of_pullCount_eq_of_ne_zero.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (n m : โ„•) (ฯ‰ : probSpace ๐“ R) (h_pc : Learning.pullCount (action alg) a (n + 1) ฯ‰ = m) (hm : m โ‰  0) : truePast alg a n ฯ‰ = (Prod.fst ฯ‰, fun i b => if b = a then Prod.snd ฯ‰ (min i (m - 1)) a else Prod.snd ฯ‰ i b)
Bandits.ArrayModel.truePast_eq_of_pullCount_eq_of_ne_zero.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (n m : โ„•) (ฯ‰ : probSpace ๐“ R) (h_pc : Learning.pullCount (action alg) a (n + 1) ฯ‰ = m) (hm : m โ‰  0) : truePast alg a n ฯ‰ = (Prod.fst ฯ‰, fun i b => if b = a then Prod.snd ฯ‰ (min i (m - 1)) a else Prod.snd ฯ‰ i b)

Code

lemma truePast_eq_of_pullCount_eq_of_ne_zero (alg : Algorithm ๐“ R)
    (a : ๐“) (n m : โ„•) (ฯ‰ : probSpace ๐“ R)
    (h_pc : pullCount (action alg) a (n + 1) ฯ‰ = m) (hm : m โ‰  0) :
    truePast alg a n ฯ‰ = (ฯ‰.1, fun i b โ†ฆ if b = a then
      ฯ‰.2 (min i (m - 1)) a else ฯ‰.2 i b)
Type uses (5)

Actions: Source ยท Open Issue

Proof
by
  simp [truePast, h_pc, hm]

measurable_hist_truePast๐Ÿ”—

LemmaBandits.ArrayModel.measurable_hist_truePast

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.measurable_hist_truePast.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (n : โ„•) : Measurable fun x => hist alg x n
Bandits.ArrayModel.measurable_hist_truePast.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (n : โ„•) : Measurable fun x => hist alg x n

Code

lemma measurable_hist_truePast [Countable ๐“] (alg : Algorithm ๐“ R)
    (a : ๐“) (n : โ„•) :
    Measurable[MeasurableSpace.comap (truePast alg a n) inferInstance] (hist alg ยท n)
Type uses (5)
Body uses (4)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  have h_eq : (hist alg ยท n) = (hist alg ยท n) โˆ˜ (truePast alg a n) := by
    ext ฯ‰ : 1
    refine hist_congr alg n (fun _ _ โ†ฆ rfl) fun i b hi โ†ฆ ?_
    by_cases hb : b = a
    ยท subst hb
      simp only [truePast, โ†“reduceIte]
      rw [min_eq_left, if_pos (by grind)]
      grind
    ยท simp [truePast, hb]
  rw [h_eq]
  refine Measurable.comp ?_ (Measurable.of_comap_le le_rfl)
  fun_prop

measurable_action_add_one_truePast๐Ÿ”—

LemmaBandits.ArrayModel.measurable_action_add_one_truePast

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.measurable_action_add_one_truePast.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (n : โ„•) : Measurable (action alg (n + 1))
Bandits.ArrayModel.measurable_action_add_one_truePast.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (n : โ„•) : Measurable (action alg (n + 1))

Code

lemma measurable_action_add_one_truePast [Countable ๐“] (alg : Algorithm ๐“ R)
    (a : ๐“) (n : โ„•) :
    Measurable[MeasurableSpace.comap (truePast alg a n) inferInstance]
      (action alg (n + 1))
Type uses (5)
Body uses (5)

Actions: Source ยท Open Issue

Proof
by
  rw [action_add_one_eq]
  change Measurable[MeasurableSpace.comap (truePast alg a n) inferInstance]
    ((fun p โ†ฆ algFunction alg n p.1 p.2) โˆ˜ (fun ฯ‰ โ†ฆ (hist alg ฯ‰ n, ฯ‰.1 (n + 1))))
  refine (measurable_algFunction alg n).comp (Measurable.prodMk ?_ ?_)
  ยท exact measurable_hist_truePast alg a n
  ยท have : (fun ฯ‰ โ†ฆ ฯ‰.1 (n + 1)) =
      (fun (p : probSpace ๐“ R) โ†ฆ p.1 (n + 1)) โˆ˜ (truePast alg a n) := rfl
    rw [this]
    exact Measurable.comp (by fun_prop) (Measurable.of_comap_le le_rfl)

measurable_pullCount_add_one_truePast๐Ÿ”—

LemmaBandits.ArrayModel.measurable_pullCount_add_one_truePast

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.measurable_pullCount_add_one_truePast.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (n : โ„•) : Measurable (Learning.pullCount (action alg) a (n + 1))
Bandits.ArrayModel.measurable_pullCount_add_one_truePast.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (n : โ„•) : Measurable (Learning.pullCount (action alg) a (n + 1))

Code

lemma measurable_pullCount_add_one_truePast [Countable ๐“] (alg : Algorithm ๐“ R) (a : ๐“) (n : โ„•) :
    Measurable[MeasurableSpace.comap (truePast alg a n) inferInstance]
      (pullCount (action alg) a (n + 1))
Type uses (6)
Body uses (5)

Actions: Source ยท Open Issue

Proof
by
  change Measurable[MeasurableSpace.comap (truePast alg a n) inferInstance]
    (fun ฯ‰ โ†ฆ pullCount (action alg) a (n + 1) ฯ‰)
  simp_rw [pullCount_eq_sum]
  refine measurable_sum _ fun i hi โ†ฆ Measurable.ite ?_ (by fun_prop) (by fun_prop)
  refine (measurableSet_singleton _).preimage ?_
  have h_meas := measurable_hist_truePast alg a n
  simp_rw [hist_eq _ _ n, @measurable_pi_iff] at h_meas
  exact (h_meas โŸจi, by grindโŸฉ).fst

measurable_stepsUntil๐Ÿ”—

LemmaBandits.ArrayModel.measurable_stepsUntil

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.measurable_stepsUntil.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (m n : โ„•) : Measurable (Set.indicator {ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰ = m} fun x => 1)
Bandits.ArrayModel.measurable_stepsUntil.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (a : ๐“) (m n : โ„•) : Measurable (Set.indicator {ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰ = m} fun x => 1)

Code

lemma measurable_stepsUntil [Countable ๐“] (alg : Algorithm ๐“ R) (a : ๐“) (m n : โ„•) :
    Measurable[MeasurableSpace.comap
        (fun ฯ‰ โ†ฆ (ฯ‰.1, fun k b โ†ฆ if b = a then if m โ‰  0 then ฯ‰.2 (min k (m - 1)) b
          else Nonempty.some inferInstance else ฯ‰.2 k b)) inferInstance]
      (({ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง
        pullCount (action alg) a (n + 1) ฯ‰ = m}).indicator (fun _ โ†ฆ 1))
Type uses (4)
Body uses (4)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  let f := ({ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง pullCount (action alg) a (n + 1) ฯ‰ = m}).indicator
    (fun _ โ†ฆ 1)
  have h_eq : f = f โˆ˜
      fun ฯ‰ โ†ฆ (ฯ‰.1, fun k b โ†ฆ if b = a then if m โ‰  0 then ฯ‰.2 (min k (m - 1)) b
        else Nonempty.some inferInstance else ฯ‰.2 k b) := by
    ext ฯ‰
    exact stepsUntil_indicator_congr alg a m n (by grind) (by grind) (by grind)
  change Measurable[MeasurableSpace.comap
    (fun ฯ‰ โ†ฆ (ฯ‰.1, fun k b โ†ฆ if b = a then if m โ‰  0 then ฯ‰.2 (min k (m - 1)) b
      else Nonempty.some inferInstance else ฯ‰.2 k b)) inferInstance] f
  rw [h_eq]
  refine Measurable.comp ?_ (Measurable.of_comap_le le_rfl)
  refine Measurable.indicator (by fun_prop) ?_
  exact MeasurableSet.inter ((measurableSet_singleton _).preimage (by fun_prop))
    ((measurableSet_singleton _).preimage (by fun_prop))

measurable_pullCount_action_add_one_hist๐Ÿ”—

LemmaBandits.ArrayModel.measurable_pullCount_action_add_one_hist

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.measurable_pullCount_action_add_one_hist.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : Measurable fun ฯ‰ => Learning.pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰
Bandits.ArrayModel.measurable_pullCount_action_add_one_hist.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] (alg : Learning.Algorithm ๐“ R) (n : โ„•) : Measurable fun ฯ‰ => Learning.pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰

Code

lemma measurable_pullCount_action_add_one_hist (alg : Algorithm ๐“ R) (n : โ„•) :
    Measurable[MeasurableSpace.comap (fun ฯ‰ โ†ฆ (action alg (n + 1) ฯ‰, hist alg ฯ‰ n)) inferInstance]
      (fun ฯ‰ โ†ฆ pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)
Type uses (5)
Body uses (4)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  simp_rw [pullCount_eq_sum]
  refine measurable_sum _ fun i hi โ†ฆ Measurable.ite ?_ (by fun_prop) (by fun_prop)
  refine measurableSet_eq_fun ?_ (measurable_comp_comap _ measurable_fst)
  rw [measurable_iff_comap_le]
  simp_rw [hist_eq _ _ n]
  rw [โ† measurable_iff_comap_le]
  unfold action
  refine Measurable.fst (mฮณ := inferInstance) ?_
  have : (hist alg ยท i โŸจi, by grindโŸฉ) =
      (fun ฯ‰ : ๐“ ร— (Iic n โ†’ ๐“ ร— R) โ†ฆ ฯ‰.2 โŸจi, by grindโŸฉ) โˆ˜
        (fun ฯ‰ โ†ฆ (action alg (n + 1) ฯ‰, fun i : Iic n โ†ฆ hist alg ฯ‰ i โŸจi, by grindโŸฉ)) := rfl
  rw [this]
  exact measurable_comp_comap _ (Measurable.prodMk (by fun_prop) (by fun_prop))

map_snd_apply_arrayMeasure๐Ÿ”—

LemmaBandits.ArrayModel.map_snd_apply_arrayMeasure

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.map_snd_apply_arrayMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {ฮฝ : ProbabilityTheory.Kernel ๐“ R} [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) (a : ๐“) : MeasureTheory.Measure.map (fun ฯ‰ => Prod.snd ฯ‰ n a) (arrayMeasure ฮฝ) = ฮฝ a
Bandits.ArrayModel.map_snd_apply_arrayMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {ฮฝ : ProbabilityTheory.Kernel ๐“ R} [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) (a : ๐“) : MeasureTheory.Measure.map (fun ฯ‰ => Prod.snd ฯ‰ n a) (arrayMeasure ฮฝ) = ฮฝ a

Code

lemma map_snd_apply_arrayMeasure {ฮฝ : Kernel ๐“ R} [IsMarkovKernel ฮฝ] (n : โ„•) (a : ๐“) :
    (arrayMeasure ฮฝ).map (fun ฯ‰ โ†ฆ ฯ‰.2 n a) = ฮฝ a
Type uses (3)
Body uses (2)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  calc (arrayMeasure ฮฝ).map (fun ฯ‰ โ†ฆ ฯ‰.2 n a)
  _ = (arrayMeasure ฮฝ).snd.map (fun ฯ‰ โ†ฆ ฯ‰ n a) := by
    rw [Measure.snd, Measure.map_map (by fun_prop) (by fun_prop)]
    rfl
  _ = ฮฝ a := by
    rw [arrayMeasure, Measure.snd_prod, streamMeasure]
    have : (fun ฯ‰ โ†ฆ ฯ‰ n a) = (fun h : ๐“ โ†’ R โ†ฆ h a) โˆ˜ (fun ฯ‰ : โ„• โ†’ ๐“ โ†’ R โ†ฆ ฯ‰ n) := rfl
    rw [this, โ† Measure.map_map (by fun_prop) (by fun_prop), Measure.infinitePi_map_eval,
      Measure.infinitePi_map_eval]

indepFun_fst_snd๐Ÿ”—

LemmaBandits.ArrayModel.indepFun_fst_snd

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.indepFun_fst_snd.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.IndepFun Prod.fst Prod.snd (arrayMeasure ฮฝ)
Bandits.ArrayModel.indepFun_fst_snd.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.IndepFun Prod.fst Prod.snd (arrayMeasure ฮฝ)

Code

lemma indepFun_fst_snd (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] :
    IndepFun Prod.fst Prod.snd (arrayMeasure ฮฝ)
Type uses (2)
Body uses (2)

Actions: Source ยท Open Issue

Proof
indepFun_prod measurable_id measurable_id

indepFun_fst_zero_snd_zero_action๐Ÿ”—

LemmaBandits.ArrayModel.indepFun_fst_zero_snd_zero_action

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.indepFun_fst_zero_snd_zero_action.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐“) : ProbabilityTheory.IndepFun (fun ฯ‰ => Prod.fst ฯ‰ 0) (fun ฯ‰ => Prod.snd ฯ‰ 0 a) (arrayMeasure ฮฝ)
Bandits.ArrayModel.indepFun_fst_zero_snd_zero_action.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐“) : ProbabilityTheory.IndepFun (fun ฯ‰ => Prod.fst ฯ‰ 0) (fun ฯ‰ => Prod.snd ฯ‰ 0 a) (arrayMeasure ฮฝ)

Code

lemma indepFun_fst_zero_snd_zero_action (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (a : ๐“) :
    IndepFun (fun ฯ‰ โ†ฆ ฯ‰.1 0) (fun ฯ‰ โ†ฆ ฯ‰.2 0 a) (arrayMeasure ฮฝ)
Type uses (3)
Body uses (2)
Used by (1)

Actions: Source ยท Open Issue

Proof
indepFun_prod (X := fun ฯ‰ : โ„• โ†’ I โ†ฆ ฯ‰ 0) (Y := fun ฯ‰ : โ„• โ†’ ๐“ โ†’ R โ†ฆ ฯ‰ 0 a)
    (by fun_prop) (by fun_prop)

indepFun_fst_add_one_aux๐Ÿ”—

LemmaBandits.ArrayModel.indepFun_fst_add_one_aux

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.indepFun_fst_add_one_aux.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.IndepFun (fun ฯ‰ => Prod.fst ฯ‰ (n + 1)) (fun ฯ‰ => (fun i => Prod.fst ฯ‰ โ†‘i, Prod.snd ฯ‰)) (arrayMeasure ฮฝ)
Bandits.ArrayModel.indepFun_fst_add_one_aux.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.IndepFun (fun ฯ‰ => Prod.fst ฯ‰ (n + 1)) (fun ฯ‰ => (fun i => Prod.fst ฯ‰ โ†‘i, Prod.snd ฯ‰)) (arrayMeasure ฮฝ)

Code

lemma indepFun_fst_add_one_aux (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (n : โ„•) :
    (fun ฯ‰ โ†ฆ ฯ‰.1 (n + 1)) โŸ‚แตข[arrayMeasure ฮฝ] (fun ฯ‰ โ†ฆ (fun (i : Iic n) โ†ฆ ฯ‰.1 i, ฯ‰.2))
Type uses (3)
Body uses (2)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  let ฮผโ‚ : Measure (โ„• โ†’ I) := Measure.infinitePi fun _ โ†ฆ volume
  let ฮผโ‚‚ : Measure (โ„• โ†’ ๐“ โ†’ R) := streamMeasure ฮฝ
  -- Coordinates of ฮผโ‚ are independent
  have h_indep : iIndepFun (fun i (ฯ‰ : โ„• โ†’ I) โ†ฆ ฯ‰ i) ฮผโ‚ :=
    iIndepFun_infinitePi (fun _ โ†ฆ measurable_id)
  have h_indep_n : IndepFun (fun ฯ‰ โ†ฆ ฯ‰ (n + 1)) (fun ฯ‰ โ†ฆ fun i : Iic n โ†ฆ ฯ‰ i) ฮผโ‚ := by
    have h := h_indep.indepFun_finsetโ‚€ {n + 1} (Iic n) (by simp)
      (fun i โ†ฆ (measurable_pi_apply i).aemeasurable)
    convert h.comp (measurable_pi_apply โŸจn + 1, by simpโŸฉ) measurable_id using 1
    ยท rfl
    ยท rfl
  rw [indepFun_iff_measure_inter_preimage_eq_mul]
  intro s t hs ht
  let X : (โ„• โ†’ I) ร— (โ„• โ†’ ๐“ โ†’ R) โ†’ I := fun ฯ‰ โ†ฆ ฯ‰.1 (n + 1)
  let Y : (โ„• โ†’ I) ร— (โ„• โ†’ ๐“ โ†’ R) โ†’ (Iic n โ†’ I) ร— (โ„• โ†’ ๐“ โ†’ R) := fun ฯ‰ โ†ฆ (fun i โ†ฆ ฯ‰.1 i, ฯ‰.2)
  change (ฮผโ‚.prod ฮผโ‚‚) (X โปยน' s โˆฉ Y โปยน' t) = (ฮผโ‚.prod ฮผโ‚‚) (X โปยน' s) * (ฮผโ‚.prod ฮผโ‚‚) (Y โปยน' t)
  -- Rewrite using Fubini
  rw [Measure.prod_apply (hs.preimage (by fun_prop : Measurable X)),
    Measure.prod_apply (ht.preimage (by fun_prop : Measurable Y)),
    Measure.prod_apply ((hs.preimage (by fun_prop : Measurable X)).inter
      (ht.preimage (by fun_prop : Measurable Y)))]
  -- Compute fibers
  have hX_fst ฯ‰โ‚ : ฮผโ‚‚ (Prod.mk ฯ‰โ‚ โปยน' (X โปยน' s)) = s.indicator 1 (ฯ‰โ‚ (n + 1)) := by
    simp only [X, Set.preimage_preimage]
    by_cases h : ฯ‰โ‚ (n + 1) โˆˆ s <;> simp [h]
  have hY_fst ฯ‰โ‚ : ฮผโ‚‚ (Prod.mk ฯ‰โ‚ โปยน' (Y โปยน' t)) = ฮผโ‚‚ {y | ((fun i : Iic n โ†ฆ ฯ‰โ‚ i), y) โˆˆ t} := rfl
  have hXY ฯ‰โ‚ : ฮผโ‚‚ (Prod.mk ฯ‰โ‚ โปยน' (X โปยน' s โˆฉ Y โปยน' t)) =
      s.indicator 1 (ฯ‰โ‚ (n + 1)) * ฮผโ‚‚ {y | ((fun i : Iic n โ†ฆ ฯ‰โ‚ i), y) โˆˆ t} := by
    simp only [X, Y, Set.preimage_inter, Set.preimage_preimage]
    by_cases h : ฯ‰โ‚ (n + 1) โˆˆ s
    ยท simp [h]
      congr
    ยท simp [h]
  simp_rw [hY_fst, hX_fst, hXY]
  -- Factor the integral using independence
  let g : (Iic n โ†’ I) โ†’ ENNReal := fun x โ†ฆ ฮผโ‚‚ {y | (x, y) โˆˆ t}
  have hg_meas : Measurable g := measurable_measure_prodMk_left ht
  have hf_meas : Measurable (fun ฯ‰โ‚ : โ„• โ†’ I โ†ฆ s.indicator (1 : I โ†’ ENNReal) (ฯ‰โ‚ (n + 1))) :=
    (measurable_one.indicator hs).comp (measurable_pi_apply _)
  have hindep_fg : IndepFun (fun ฯ‰โ‚ โ†ฆ s.indicator (1 : I โ†’ ENNReal) (ฯ‰โ‚ (n + 1)))
      (fun ฯ‰โ‚ โ†ฆ g (fun i โ†ฆ ฯ‰โ‚ i)) ฮผโ‚ :=
    h_indep_n.comp (measurable_one.indicator hs) hg_meas
  have h_eq (ฯ‰โ‚ : โ„• โ†’ I) : ฮผโ‚‚ {y | ((fun i : Iic n โ†ฆ ฯ‰โ‚ i), y) โˆˆ t} = g (fun i โ†ฆ ฯ‰โ‚ i) := rfl
  simp_rw [h_eq]
  exact lintegral_mul_eq_lintegral_mul_lintegral_of_indepFun hf_meas (by fun_prop) hindep_fg

indepFun_fst_add_one_hist๐Ÿ”—

LemmaBandits.ArrayModel.indepFun_fst_add_one_hist

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.indepFun_fst_add_one_hist.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [StandardBorelSpace R] [Nonempty R] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.IndepFun (fun ฯ‰ => Prod.fst ฯ‰ (n + 1)) (fun x => hist alg x n) (arrayMeasure ฮฝ)
Bandits.ArrayModel.indepFun_fst_add_one_hist.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [StandardBorelSpace R] [Nonempty R] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.IndepFun (fun ฯ‰ => Prod.fst ฯ‰ (n + 1)) (fun x => hist alg x n) (arrayMeasure ฮฝ)

Code

lemma indepFun_fst_add_one_hist [Countable ๐“] (alg : Algorithm ๐“ R)
    (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (n : โ„•) :
    IndepFun (fun ฯ‰ โ†ฆ ฯ‰.1 (n + 1)) (hist alg ยท n) (arrayMeasure ฮฝ)
Type uses (5)
Body uses (3)
Used by (1)

Actions: Source ยท Open Issue

Proof
(indepFun_fst_add_one_aux ฮฝ n).of_measurable_right (measurable_hist_comap alg n)

indepFun_snd_apply_aux๐Ÿ”—

LemmaBandits.ArrayModel.indepFun_snd_apply_aux

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.indepFun_snd_apply_aux.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [DecidableEq ๐“] [Nonempty R] (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐“) (m : โ„•) : ProbabilityTheory.IndepFun (fun ฯ‰ => Prod.snd ฯ‰ m a) (fun ฯ‰ => (Prod.fst ฯ‰, fun k b => if b = a then if m โ‰  0 then Prod.snd ฯ‰ (min k (m - 1)) b else Nonempty.some โ‹ฏ else Prod.snd ฯ‰ k b)) (arrayMeasure ฮฝ)
Bandits.ArrayModel.indepFun_snd_apply_aux.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [DecidableEq ๐“] [Nonempty R] (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐“) (m : โ„•) : ProbabilityTheory.IndepFun (fun ฯ‰ => Prod.snd ฯ‰ m a) (fun ฯ‰ => (Prod.fst ฯ‰, fun k b => if b = a then if m โ‰  0 then Prod.snd ฯ‰ (min k (m - 1)) b else Nonempty.some โ‹ฏ else Prod.snd ฯ‰ k b)) (arrayMeasure ฮฝ)

Code

lemma indepFun_snd_apply_aux (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (a : ๐“) (m : โ„•) :
    (fun ฯ‰ โ†ฆ ฯ‰.2 m a) โŸ‚แตข[arrayMeasure ฮฝ]
      (fun ฯ‰ โ†ฆ (ฯ‰.1, fun k b โ†ฆ if b = a then if m โ‰  0 then ฯ‰.2 (min k (m - 1)) b
        else Nonempty.some inferInstance else ฯ‰.2 k b))
Type uses (3)
Body uses (2)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  unfold arrayMeasure
  let ฮผโ‚ : Measure (โ„• โ†’ I) := Measure.infinitePi fun _ โ†ฆ volume
  let ฮผโ‚‚ : Measure (โ„• โ†’ ๐“ โ†’ R) := Measure.infinitePi fun _ โ†ฆ Measure.infinitePi ฮฝ
  -- Independence within ฮผโ‚‚: coordinates ฯ‰ i are independent
  have h_indepโ‚‚ : iIndepFun (fun i (ฯ‰ : โ„• โ†’ ๐“ โ†’ R) โ†ฆ ฯ‰ i) ฮผโ‚‚ :=
    iIndepFun_infinitePi (fun _ โ†ฆ measurable_id)
  -- Independence within each infinitePi ฮฝ: coordinates f b are independent
  have h_indep_inner : iIndepFun (fun (b : ๐“) (f : ๐“ โ†’ R) โ†ฆ f b) (Measure.infinitePi ฮฝ) :=
    iIndepFun_infinitePi (fun _ โ†ฆ measurable_id)
  rw [indepFun_iff_measure_inter_preimage_eq_mul]
  intro s t hs ht
  let X : (โ„• โ†’ I) ร— (โ„• โ†’ ๐“ โ†’ R) โ†’ R := fun ฯ‰ โ†ฆ ฯ‰.2 m a
  let Y : (โ„• โ†’ I) ร— (โ„• โ†’ ๐“ โ†’ R) โ†’ (โ„• โ†’ I) ร— (โ„• โ†’ ๐“ โ†’ R) :=
    fun ฯ‰ โ†ฆ (ฯ‰.1, fun k b โ†ฆ if b = a then if m โ‰  0 then ฯ‰.2 (min k (m - 1)) b
      else Nonempty.some inferInstance else ฯ‰.2 k b)
  have hX_meas : Measurable X :=
    (measurable_pi_apply a).comp ((measurable_pi_apply m).comp measurable_snd)
  have hY_meas : Measurable Y := by
    change Measurable (fun ฯ‰ : (โ„• โ†’ I) ร— (โ„• โ†’ ๐“ โ†’ R) โ†ฆ
      (ฯ‰.1, fun k b โ†ฆ if b = a then if m โ‰  0 then ฯ‰.2 (min k (m - 1)) b
        else Nonempty.some inferInstance else ฯ‰.2 k b))
    refine Measurable.prod measurable_fst ?_
    refine measurable_pi_lambda _ (fun k โ†ฆ ?_)
    refine measurable_pi_lambda _ (fun b โ†ฆ ?_)
    by_cases hb : b = a
    ยท simp only [hb, โ†“reduceIte]
      by_cases hm : m โ‰  0
      ยท simp only [ne_eq, hm, not_false_eq_true, โ†“reduceIte]
        exact (measurable_pi_apply a).comp
          ((measurable_pi_apply (min k (m - 1))).comp measurable_snd)
      ยท simp only [hm, โ†“reduceIte]
        exact measurable_const
    ยท simp only [hb, โ†“reduceIte]
      exact (measurable_pi_apply b).comp ((measurable_pi_apply k).comp measurable_snd)
  change (ฮผโ‚.prod ฮผโ‚‚) (X โปยน' s โˆฉ Y โปยน' t) = (ฮผโ‚.prod ฮผโ‚‚) (X โปยน' s) * (ฮผโ‚.prod ฮผโ‚‚) (Y โปยน' t)
  -- Use Fubini on ฮผโ‚.prod ฮผโ‚‚
  rw [Measure.prod_apply (hs.preimage hX_meas),
    Measure.prod_apply (ht.preimage hY_meas),
    Measure.prod_apply ((hs.preimage hX_meas).inter (ht.preimage hY_meas))]
  -- X only depends on ฯ‰โ‚‚, so its fiber is constant in ฯ‰โ‚
  have hX_fst : โˆ€ ฯ‰โ‚, ฮผโ‚‚ (Prod.mk ฯ‰โ‚ โปยน' (X โปยน' s)) = ฮผโ‚‚ ((fun ฯ‰โ‚‚ โ†ฆ ฯ‰โ‚‚ m a) โปยน' s) := fun _ โ†ฆ rfl
  simp_rw [hX_fst]
  -- The LHS integral: fiber of X โˆฉ Y at ฯ‰โ‚
  -- Key: X depends only on ฯ‰โ‚‚ m a, while Y's dependence on ฯ‰โ‚‚ avoids (m, a)
  -- Define the "truncation" map on ฯ‰โ‚‚
  let trunc : (โ„• โ†’ ๐“ โ†’ R) โ†’ (โ„• โ†’ ๐“ โ†’ R) :=
    fun ฯ‰โ‚‚ k b โ†ฆ if b = a then if m โ‰  0 then ฯ‰โ‚‚ (min k (m - 1)) b
      else Nonempty.some inferInstance else ฯ‰โ‚‚ k b
  -- The fiber of Y at ฯ‰โ‚ only depends on trunc(ฯ‰โ‚‚)
  have hY_fiber : โˆ€ ฯ‰โ‚, Prod.mk ฯ‰โ‚ โปยน' (Y โปยน' t) = (fun ฯ‰โ‚‚ โ†ฆ (ฯ‰โ‚, trunc ฯ‰โ‚‚)) โปยน' t := fun _ โ†ฆ rfl
  -- The fiber of X โˆฉ Y factors
  have hXY_fiber : โˆ€ ฯ‰โ‚, Prod.mk ฯ‰โ‚ โปยน' (X โปยน' s โˆฉ Y โปยน' t) =
      ((fun ฯ‰โ‚‚ โ†ฆ ฯ‰โ‚‚ m a) โปยน' s) โˆฉ ((fun ฯ‰โ‚‚ โ†ฆ (ฯ‰โ‚, trunc ฯ‰โ‚‚)) โปยน' t) := fun _ โ†ฆ rfl
  simp_rw [hXY_fiber, hY_fiber]
  -- Now we use independence in ฮผโ‚‚: (ฯ‰โ‚‚ m a) is independent of (trunc ฯ‰โ‚‚)
  -- because trunc only uses indices (k, a) with k < m, and (k, b) with b โ‰  a
  have h_trunc_meas : Measurable trunc := by
    refine measurable_pi_lambda _ (fun k โ†ฆ ?_)
    refine measurable_pi_lambda _ (fun b โ†ฆ ?_)
    simp only [trunc]
    by_cases hb : b = a
    ยท simp only [hb, โ†“reduceIte]
      by_cases hm : m = 0
      ยท simp only [hm]
        exact measurable_const
      ยท simp only [ne_eq, hm, not_false_eq_true, โ†“reduceIte]
        exact (measurable_pi_apply a).comp (measurable_pi_apply (min k (m - 1)))
    ยท simp only [hb, โ†“reduceIte]
      exact (measurable_pi_apply b).comp (measurable_pi_apply k)
  -- Key independence: (ฯ‰โ‚‚ m a) โŸ‚ trunc because trunc only uses coordinates โ‰  (m, a)
  have h_indep_trunc : IndepFun (fun ฯ‰โ‚‚ โ†ฆ ฯ‰โ‚‚ m a) trunc ฮผโ‚‚ := by
    -- Factor trunc through proj which extracts the relevant coordinates
    let proj : (โ„• โ†’ ๐“ โ†’ R) โ†’ ((โ„• โ†’ R) ร— (โ„• โ†’ {b : ๐“ // b โ‰  a} โ†’ R)) := fun ฯ‰โ‚‚ โ†ฆ
      (fun k โ†ฆ if m โ‰  0 then ฯ‰โ‚‚ (min k (m - 1)) a else Nonempty.some inferInstance,
      fun k โŸจb, _โŸฉ โ†ฆ ฯ‰โ‚‚ k b)
    have h_trunc_proj : โˆ€ ฯ‰โ‚‚, trunc ฯ‰โ‚‚ = (fun p k b โ†ฆ
        if h : b = a then if m โ‰  0 then p.1 k
          else Nonempty.some inferInstance else p.2 k โŸจb, hโŸฉ) (proj ฯ‰โ‚‚) := by
      intro ฯ‰โ‚‚; ext k b; simp only [trunc, proj]; by_cases hb : b = a <;> simp [hb]; grind
    have h_proj_meas : Measurable proj := by
      refine Measurable.prod ?_ ?_
      ยท refine measurable_pi_lambda _ fun k โ†ฆ ?_
        by_cases hm : m โ‰  0
        ยท simp only [proj, ne_eq, hm, not_false_eq_true, โ†“reduceIte]
          exact (measurable_pi_apply a).comp (measurable_pi_apply (min k (m - 1)))
        ยท simp [proj, hm]
      ยท exact measurable_pi_lambda _ (fun k โ†ฆ measurable_pi_lambda _ (fun โŸจb, _โŸฉ โ†ฆ
          (measurable_pi_apply b).comp (measurable_pi_apply k)))
    have h_g_meas : Measurable (fun p : (โ„• โ†’ R) ร— (โ„• โ†’ {b : ๐“ // b โ‰  a} โ†’ R) โ†ฆ
        (fun k b โ†ฆ if h : b = a then if m โ‰  0 then p.1 k else Nonempty.some inferInstance
          else p.2 k โŸจb, hโŸฉ)) := by
      refine measurable_pi_lambda _ (fun k โ†ฆ measurable_pi_lambda _ (fun b โ†ฆ ?_))
      by_cases hb : b = a
      ยท simp only [hb, โ†“reduceDIte]
        by_cases hm : m โ‰  0
        ยท simp only [ne_eq, hm, not_false_eq_true]
          exact (measurable_pi_apply k).comp measurable_fst
        ยท simp [hm]
      ยท simp only [hb, โ†“reduceDIte]
        exact (measurable_pi_apply (โŸจb, hbโŸฉ : {b : ๐“ // b โ‰  a})).comp
          ((measurable_pi_apply k).comp measurable_snd)
    -- Show (ฯ‰โ‚‚ m a) โŸ‚ proj: proj uses coordinates disjoint from (m, a)
    have h_indep_proj : IndepFun (fun ฯ‰โ‚‚ โ†ฆ ฯ‰โ‚‚ m a) proj ฮผโ‚‚ := by
      have h_row_bound (hm : m โ‰  0) : โˆ€ k, min k (m - 1) < m := by
        intro k
        calc min k (m - 1) โ‰ค m - 1 := Nat.min_le_right k (m - 1)
          _ < m := Nat.sub_lt (by grind) Nat.one_pos
      rw [indepFun_iff_measure_inter_preimage_eq_mul]
      intro s t' hs ht'
      -- rows_lt_m extracts column a at rows < m, other_cols extracts columns โ‰  a
      let rows_lt_m : (โ„• โ†’ ๐“ โ†’ R) โ†’ (Iio m โ†’ R) := fun ฯ‰โ‚‚ โŸจj, _โŸฉ โ†ฆ ฯ‰โ‚‚ j a
      let other_cols : (โ„• โ†’ ๐“ โ†’ R) โ†’ (โ„• โ†’ {b : ๐“ // b โ‰  a} โ†’ R) := fun ฯ‰โ‚‚ k โŸจb, _โŸฉ โ†ฆ ฯ‰โ‚‚ k b
      have h_proj_factor : โˆ€ ฯ‰โ‚‚, proj ฯ‰โ‚‚ =
          ((fun r k โ†ฆ if hm : m โ‰  0 then r โŸจmin k (m - 1), Finset.mem_Iio.mpr (h_row_bound hm k)โŸฉ
            else Nonempty.some inferInstance) (rows_lt_m ฯ‰โ‚‚),
           other_cols ฯ‰โ‚‚) := by
        intro ฯ‰โ‚‚; ext1
        ยท ext k
          by_cases hm : m โ‰  0
          ยท simp [proj, rows_lt_m, hm]
          ยท simp [proj, hm]
        ยท rfl
      -- Use iIndepFun structure of the doubly-indexed infinite product
      have h_iindep : iIndepFun (fun (p : โ„• ร— ๐“) ฯ‰ โ†ฆ ฯ‰ p.1 p.2) ฮผโ‚‚ :=
        iIndepFun_uncurry_infinitePi' (X := fun _ _ โ†ฆ id) (fun _ โ†ฆ ฮฝ) (by fun_prop)
      have h_rows_meas : Measurable rows_lt_m :=
        measurable_pi_lambda _ (fun โŸจj, _โŸฉ โ†ฆ (measurable_pi_apply a).comp (measurable_pi_apply j))
      have h_other_meas : Measurable other_cols :=
        measurable_pi_lambda _ (fun k โ†ฆ measurable_pi_lambda _ (fun โŸจb, _โŸฉ โ†ฆ
          (measurable_pi_apply b).comp (measurable_pi_apply k)))
      -- Show (ฯ‰โ‚‚ m a) โŸ‚ (rows_lt_m, other_cols) via indep_iSup_of_disjoint
      have h_indep_combined : IndepFun (fun ฯ‰โ‚‚ โ†ฆ ฯ‰โ‚‚ m a)
          (fun ฯ‰โ‚‚ โ†ฆ (rows_lt_m ฯ‰โ‚‚, other_cols ฯ‰โ‚‚)) ฮผโ‚‚ := by
        rw [IndepFun_iff_Indep]
        have h_comap_le : (MeasurableSpace.pi.prod MeasurableSpace.pi).comap
            (fun ฯ‰โ‚‚ โ†ฆ (rows_lt_m ฯ‰โ‚‚, other_cols ฯ‰โ‚‚)) โ‰ค
            โจ† (p : {p : โ„• ร— ๐“ // p โ‰  (m, a)}), mR.comap (fun ฯ‰ โ†ฆ ฯ‰ p.val.1 p.val.2) := by
          rw [MeasurableSpace.comap_prodMk]
          refine sup_le ?_ ?_
          ยท rw [MeasurableSpace.comap_pi]
            refine iSup_le (fun โŸจj, hjโŸฉ โ†ฆ ?_)
            have h_ne : (j, a) โ‰  (m, a) := fun h โ†ฆ (Finset.mem_Iio.mp hj).ne (Prod.mk.inj h).1
            exact le_iSup_of_le โŸจ(j, a), h_neโŸฉ le_rfl
          ยท rw [MeasurableSpace.comap_pi]
            refine iSup_le (fun k โ†ฆ ?_)
            rw [MeasurableSpace.comap_pi]
            refine iSup_le (fun โŸจb, hbโŸฉ โ†ฆ ?_)
            have h_ne : (k, b) โ‰  (m, a) := fun h โ†ฆ hb (Prod.mk.inj h).2
            exact le_iSup_of_le โŸจ(k, b), h_neโŸฉ le_rfl
        refine indep_of_indep_of_le_right ?_ h_comap_le
        have h_disjoint : Disjoint ({(m, a)} : Set (โ„• ร— ๐“)) {p | p โ‰  (m, a)} := by simp
        have h_le : โˆ€ p : โ„• ร— ๐“, mR.comap (fun ฯ‰ : โ„• โ†’ ๐“ โ†’ R โ†ฆ ฯ‰ p.1 p.2) โ‰ค
            MeasurableSpace.pi (m := fun _ โ†ฆ MeasurableSpace.pi) := fun p โ†ฆ
          Measurable.comap_le ((measurable_pi_apply p.2).comp (measurable_pi_apply p.1))
        have h_iindep' : iIndep (fun p : โ„• ร— ๐“ โ†ฆ mR.comap (fun ฯ‰ : โ„• โ†’ ๐“ โ†’ R โ†ฆ ฯ‰ p.1 p.2)) ฮผโ‚‚ :=
          h_iindep.iIndep
        have h_indep := indep_iSup_of_disjoint h_le h_iindep' h_disjoint
        convert h_indep using 2
        ยท simp only [Set.mem_singleton_iff, iSup_iSup_eq_left]
        ยท simp only [ne_eq, Set.mem_setOf_eq, iSup_subtype']
      have h_proj_preimage : proj โปยน' t' = (fun ฯ‰โ‚‚ โ†ฆ (rows_lt_m ฯ‰โ‚‚, other_cols ฯ‰โ‚‚)) โปยน'
          {p | ((fun r k โ†ฆ if hm : m โ‰  0 then
            r โŸจmin k (m - 1), Finset.mem_Iio.mpr (h_row_bound hm k)โŸฉ
            else Nonempty.some inferInstance) p.1, p.2) โˆˆ t'}
          := by ext ฯ‰โ‚‚; simp only [Set.mem_preimage, Set.mem_setOf_eq, h_proj_factor]
      rw [indepFun_iff_measure_inter_preimage_eq_mul] at h_indep_combined
      rw [h_proj_preimage]
      let T : Set ((Iio m โ†’ R) ร— (โ„• โ†’ {b : ๐“ // b โ‰  a} โ†’ R)) :=
        {p | ((fun r k โ†ฆ if hm : m โ‰  0 then
          r โŸจmin k (m - 1), Finset.mem_Iio.mpr (h_row_bound hm k)โŸฉ
          else Nonempty.some inferInstance) p.1, p.2) โˆˆ t'}
      have hT_meas : MeasurableSet T := by
        refine ht'.preimage (Measurable.prod ?_ measurable_snd)
        refine measurable_pi_lambda _ (fun k โ†ฆ ?_)
        by_cases hm : m โ‰  0
        ยท simp only [ne_eq, hm, not_false_eq_true, โ†“reduceDIte]
          exact (measurable_pi_apply (โŸจmin k (m - 1), Finset.mem_Iio.mpr (h_row_bound hm k)โŸฉ :
            Iio m)).comp measurable_fst
        ยท simp [hm]
      change ฮผโ‚‚ ((fun ฯ‰โ‚‚ โ†ฆ ฯ‰โ‚‚ m a) โปยน' s โˆฉ (fun ฯ‰โ‚‚ โ†ฆ (rows_lt_m ฯ‰โ‚‚, other_cols ฯ‰โ‚‚)) โปยน' T) =
        ฮผโ‚‚ ((fun ฯ‰โ‚‚ โ†ฆ ฯ‰โ‚‚ m a) โปยน' s) *
        ฮผโ‚‚ ((fun ฯ‰โ‚‚ โ†ฆ (rows_lt_m ฯ‰โ‚‚, other_cols ฯ‰โ‚‚)) โปยน' T)
      exact h_indep_combined s T hs hT_meas
    have h_eq : trunc = (fun p k b โ†ฆ if h : b = a then if m โ‰  0 then p.1 k
        else Nonempty.some inferInstance else p.2 k โŸจb, hโŸฉ) โˆ˜ proj := by
      funext ฯ‰โ‚‚; exact h_trunc_proj ฯ‰โ‚‚
    rw [h_eq]
    exact h_indep_proj.comp measurable_id h_g_meas
  rw [indepFun_iff_measure_inter_preimage_eq_mul] at h_indep_trunc
  have h_const : โˆ€ ฯ‰โ‚, ฮผโ‚‚ (((fun ฯ‰โ‚‚ โ†ฆ ฯ‰โ‚‚ m a) โปยน' s) โˆฉ ((fun ฯ‰โ‚‚ โ†ฆ (ฯ‰โ‚, trunc ฯ‰โ‚‚)) โปยน' t)) =
      ฮผโ‚‚ ((fun ฯ‰โ‚‚ โ†ฆ ฯ‰โ‚‚ m a) โปยน' s) * ฮผโ‚‚ ((fun ฯ‰โ‚‚ โ†ฆ (ฯ‰โ‚, trunc ฯ‰โ‚‚)) โปยน' t) := fun ฯ‰โ‚ โ†ฆ
    h_indep_trunc s _ hs (ht.preimage (by fun_prop))
  simp_rw [h_const]
  let c := ฮผโ‚‚ ((fun ฯ‰โ‚‚ โ†ฆ ฯ‰โ‚‚ m a) โปยน' s)
  change โˆซโป x, c * ฮผโ‚‚ ((fun ฯ‰โ‚‚ โ†ฆ (x, trunc ฯ‰โ‚‚)) โปยน' t) โˆ‚ฮผโ‚ =
    (โˆซโป _, c โˆ‚ฮผโ‚) * โˆซโป x, ฮผโ‚‚ ((fun ฯ‰โ‚‚ โ†ฆ (x, trunc ฯ‰โ‚‚)) โปยน' t) โˆ‚ฮผโ‚
  have h_preimage : โˆ€ x, (fun ฯ‰โ‚‚ โ†ฆ (x, trunc ฯ‰โ‚‚)) โปยน' t = trunc โปยน' (Prod.mk x โปยน' t) := fun _ โ†ฆ rfl
  simp_rw [h_preimage]
  have h_map : โˆ€ x, ฮผโ‚‚ (trunc โปยน' (Prod.mk x โปยน' t)) = (ฮผโ‚‚.map trunc) (Prod.mk x โปยน' t) := by
    intro x; rw [Measure.map_apply h_trunc_meas (ht.preimage (by fun_prop))]
  simp_rw [h_map]
  rw [lintegral_const_mul _ (measurable_measure_prodMk_left_finite ht),
    lintegral_const, measure_univ, mul_one]

indepFun_snd_apply_pullCount_action๐Ÿ”—

LemmaBandits.ArrayModel.indepFun_snd_apply_pullCount_action

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.indepFun_snd_apply_pullCount_action.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐“) (m n : โ„•) : ProbabilityTheory.IndepFun (fun ฯ‰ => Prod.snd ฯ‰ m a) (Set.indicator {ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰ = m} fun x => 1) (arrayMeasure ฮฝ)
Bandits.ArrayModel.indepFun_snd_apply_pullCount_action.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Nonempty R] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐“) (m n : โ„•) : ProbabilityTheory.IndepFun (fun ฯ‰ => Prod.snd ฯ‰ m a) (Set.indicator {ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง Learning.pullCount (action alg) a (n + 1) ฯ‰ = m} fun x => 1) (arrayMeasure ฮฝ)

Code

lemma indepFun_snd_apply_pullCount_action [Countable ๐“] (alg : Algorithm ๐“ R)
    (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (a : ๐“) (m n : โ„•) :
    (fun ฯ‰ โ†ฆ ฯ‰.2 m a) โŸ‚แตข[arrayMeasure ฮฝ]
      ({ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง
        pullCount (action alg) a (n + 1) ฯ‰ = m}).indicator (fun _ โ†ฆ 1)
Type uses (6)
Body uses (3)
Used by (2)

Actions: Source ยท Open Issue

Proof
(indepFun_snd_apply_aux ฮฝ a m).of_measurable_right (measurable_stepsUntil alg a m n)

indepFun_cond_comp๐Ÿ”—

LemmaBandits.ArrayModel.indepFun_cond_comp

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.indepFun_cond_comp.{u_3, u_4, u_5, u_6} {๐“ : Type u_3} {ฮฒ : Type u_4} {ฮณ : Type u_5} {ฮด : Type u_6} {m๐“ : MeasurableSpace ๐“} {mฮฒ : MeasurableSpace ฮฒ} {mฮณ : MeasurableSpace ฮณ} {mฮด : MeasurableSpace ฮด} [MeasurableSingletonClass ฮด] {ฮผ : MeasureTheory.Measure ๐“} {X : ๐“ โ†’ ฮฒ} {Y : ๐“ โ†’ ฮณ} (hXY : ProbabilityTheory.IndepFun X Y ฮผ) (hY : Measurable Y) {Z : ฮณ โ†’ ฮด} (hZ : Measurable Z) (z : ฮด) : ProbabilityTheory.IndepFun X Y ฮผ[|Z โˆ˜ Y โปยน' {z}]
Bandits.ArrayModel.indepFun_cond_comp.{u_3, u_4, u_5, u_6} {๐“ : Type u_3} {ฮฒ : Type u_4} {ฮณ : Type u_5} {ฮด : Type u_6} {m๐“ : MeasurableSpace ๐“} {mฮฒ : MeasurableSpace ฮฒ} {mฮณ : MeasurableSpace ฮณ} {mฮด : MeasurableSpace ฮด} [MeasurableSingletonClass ฮด] {ฮผ : MeasureTheory.Measure ๐“} {X : ๐“ โ†’ ฮฒ} {Y : ๐“ โ†’ ฮณ} (hXY : ProbabilityTheory.IndepFun X Y ฮผ) (hY : Measurable Y) {Z : ฮณ โ†’ ฮด} (hZ : Measurable Z) (z : ฮด) : ProbabilityTheory.IndepFun X Y ฮผ[|Z โˆ˜ Y โปยน' {z}]

Code

lemma indepFun_cond_comp {๐“ ฮฒ ฮณ ฮด : Type*} {m๐“ : MeasurableSpace ๐“} {mฮฒ : MeasurableSpace ฮฒ}
    {mฮณ : MeasurableSpace ฮณ} {mฮด : MeasurableSpace ฮด} [MeasurableSingletonClass ฮด] {ฮผ : Measure ๐“}
    {X : ๐“ โ†’ ฮฒ} {Y : ๐“ โ†’ ฮณ} (hXY : X โŸ‚แตข[ฮผ] Y) (hY : Measurable Y)
    {Z : ฮณ โ†’ ฮด} (hZ : Measurable Z) (z : ฮด) :
    X โŸ‚แตข[ฮผ[|(Z โˆ˜ Y) โปยน' {z}]] Y
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  have h_preim : (Z โˆ˜ Y) โปยน' {z} = Y โปยน' (Z โปยน' {z}) := by grind
  simp_rw [h_preim]
  exact indepFun_cond_of_indepFun hXY hY (hZ (measurableSet_singleton z))

indepFun_snd_hist_cond๐Ÿ”—

LemmaBandits.ArrayModel.indepFun_snd_hist_cond

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.indepFun_snd_hist_cond.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [StandardBorelSpace R] [Nonempty R] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐“) (n m : โ„•) : ProbabilityTheory.IndepFun (fun ฯ‰ => Prod.snd ฯ‰ m a) (fun x => hist alg x n) (arrayMeasure ฮฝ)[|(fun ฯ‰ => (action alg (n + 1) ฯ‰, Learning.pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) โปยน' {(a, m)}]
Bandits.ArrayModel.indepFun_snd_hist_cond.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [StandardBorelSpace R] [Nonempty R] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐“) (n m : โ„•) : ProbabilityTheory.IndepFun (fun ฯ‰ => Prod.snd ฯ‰ m a) (fun x => hist alg x n) (arrayMeasure ฮฝ)[|(fun ฯ‰ => (action alg (n + 1) ฯ‰, Learning.pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) โปยน' {(a, m)}]

Code

lemma indepFun_snd_hist_cond [Countable ๐“] (alg : Algorithm ๐“ R)
    (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (a : ๐“) (n m : โ„•) :
    (fun ฯ‰ โ†ฆ ฯ‰.2 m a) โŸ‚แตข[(arrayMeasure ฮฝ)[|(fun ฯ‰ โ†ฆ (action alg (n + 1) ฯ‰,
      pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) โปยน' {(a, m)}]]
    (hist alg ยท n)
Type uses (7)
Body uses (10)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  have h_meas := measurable_hist_truePast alg a n
  refine IndepFun.of_measurable_right ?_ h_meas
  have h_ae_eq : truePast alg a n =แต[(arrayMeasure ฮฝ)[|(fun ฯ‰ โ†ฆ (action alg (n + 1) ฯ‰,
        pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) โปยน' {(a, m)}]]
      (fun ฯ‰ โ†ฆ (ฯ‰.1, fun k b โ†ฆ if b = a then if m โ‰  0 then ฯ‰.2 (min k (m - 1)) b
        else Nonempty.some inferInstance else ฯ‰.2 k b)) := by
    refine ae_cond_of_forall_mem ?_ fun x hx โ†ฆ ?_
    ยท refine (measurableSet_singleton _).preimage ?_
      have h_meas_pc : Measurable fun ฯ‰ โ†ฆ
          pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰ := by
        change Measurable ((fun p : (probSpace ๐“ R) ร— ๐“ โ†ฆ pullCount (action alg) p.2 (n + 1) p.1) โˆ˜
          (fun ฯ‰ : probSpace ๐“ R โ†ฆ (ฯ‰, action alg (n + 1) ฯ‰)))
        exact (measurable_uncurry_pullCount (by fun_prop) _).comp (by fun_prop)
      fun_prop
    simp only [Set.mem_preimage, Set.mem_singleton_iff, Prod.mk.injEq] at hx
    simp only [truePast]
    congr with i b
    by_cases hb : b = a
    ยท simp only [hb, โ†“reduceIte]
      simp only [hx.1, true_and] at hx
      congr!
    ยท simp [hb]
  refine IndepFun.congr ?_ EventuallyEq.rfl h_ae_eq.symm
  suffices (fun ฯ‰ โ†ฆ ฯ‰.2 m a) โŸ‚แตข[(arrayMeasure ฮฝ)[|(({ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง
        pullCount (action alg) a (n + 1) ฯ‰ = m}).indicator (fun _ โ†ฆ 1)) โปยน' {1}]]
      fun ฯ‰ โ†ฆ (ฯ‰.1, fun k b โ†ฆ if b = a then if m โ‰  0 then ฯ‰.2 (min k (m - 1)) b
        else Nonempty.some inferInstance else ฯ‰.2 k b) by
    convert this using 1
    ยท rfl
    ยท rfl
    ยท rfl
    congr with ฯ‰
    simp only [Set.mem_preimage, Set.mem_singleton_iff, Prod.mk.injEq, Set.indicator_apply,
      Set.mem_setOf_eq, ite_eq_left_iff, not_and, zero_ne_one, imp_false,
      Classical.not_imp, Decidable.not_not, and_congr_right_iff]
    intro ha
    simp [ha]
  have h_meas := measurable_stepsUntil alg a m n
  obtain โŸจf, hf, hf_eqโŸฉ := h_meas.exists_eq_measurable_comp
  simp_rw [hf_eq]
  refine indepFun_cond_comp (Z := f) (z := 1) ?_ ?_ hf
  ยท exact indepFun_snd_apply_aux ฮฝ a m
  ยท refine Measurable.prodMk (by fun_prop) ?_
    simp_rw [measurable_pi_iff]
    intro i b
    refine Measurable.ite (MeasurableSet.const _) ?_ (by fun_prop)
    refine Measurable.ite (MeasurableSet.const _) (by fun_prop) (by fun_prop)

hasLaw_action_zero๐Ÿ”—

LemmaBandits.ArrayModel.hasLaw_action_zero

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.hasLaw_action_zero.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.HasLaw (action alg 0) (Learning.Algorithm.p0 alg) (arrayMeasure ฮฝ)
Bandits.ArrayModel.hasLaw_action_zero.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.HasLaw (action alg 0) (Learning.Algorithm.p0 alg) (arrayMeasure ฮฝ)

Code

lemma hasLaw_action_zero (alg : Algorithm ๐“ R) (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] :
    HasLaw (action alg 0) alg.p0 (arrayMeasure ฮฝ) where
  map_eq
Type uses (5)
Body uses (6)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
    calc (arrayMeasure ฮฝ).map (fun ฯ‰ โ†ฆ initAlgFunction alg (ฯ‰.1 0))
    _ = ((arrayMeasure ฮฝ).fst.map (Function.eval 0)).map (initAlgFunction alg) := by
      rw [Measure.fst, Measure.map_map (by fun_prop) (by fun_prop),
        Measure.map_map (by fun_prop) (by fun_prop)]
      rfl
    _ = (volume : Measure I).map (initAlgFunction alg) := by
      simp only [arrayMeasure, Measure.fst_prod]
      rw [(measurePreserving_eval_infinitePi (fun _ โ†ฆ volume) 0).map_eq]
    _ = alg.p0 := initAlgFunction_map alg

hasCondDistrib_reward_zero๐Ÿ”—

LemmaBandits.ArrayModel.hasCondDistrib_reward_zero

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.hasCondDistrib_reward_zero.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.HasCondDistrib (reward alg 0) (action alg 0) ฮฝ (arrayMeasure ฮฝ)
Bandits.ArrayModel.hasCondDistrib_reward_zero.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.HasCondDistrib (reward alg 0) (action alg 0) ฮฝ (arrayMeasure ฮฝ)

Code

lemma hasCondDistrib_reward_zero (alg : Algorithm ๐“ R) (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] :
    HasCondDistrib (reward alg 0) (action alg 0) ฮฝ (arrayMeasure ฮฝ)
Type uses (6)
Body uses (13)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  refine hasCondDistrib_of_condDistrib_eq (by fun_prop) (by fun_prop) ?_
  refine (condDistrib_ae_eq_cond (by fun_prop) (by fun_prop)).trans ?_
  rw [Filter.EventuallyEq, ae_iff_of_countable]
  intro a ha
  simp only [reward_zero]
  calc ((arrayMeasure ฮฝ)[|action alg 0 โปยน' {a}]).map (fun ฯ‰ โ†ฆ ฯ‰.2 0 (action alg 0 ฯ‰))
  _ = ((arrayMeasure ฮฝ)[|action alg 0 โปยน' {a}]).map (fun ฯ‰ โ†ฆ ฯ‰.2 0 a) := by
    refine Measure.map_congr
      (ae_cond_of_forall_mem ((measurableSet_singleton _).preimage (by fun_prop)) ?_)
    intro x hx
    simp only [Set.mem_preimage, Set.mem_singleton_iff] at hx
    simp [hx]
  _ = ฮฝ a := by
    rw [cond_of_indepFun]
    ยท exact map_snd_apply_arrayMeasure 0 a
    ยท have : (fun ฯ‰ โ†ฆ ฯ‰.1 0) โŸ‚แตข[arrayMeasure ฮฝ] fun ฯ‰ โ†ฆ ฯ‰.2 0 a :=
        indepFun_fst_zero_snd_zero_action ฮฝ a
      rw [action_zero]
      exact this.comp (ฯ† := initAlgFunction alg) (by fun_prop) measurable_id
    ยท fun_prop
    ยท fun_prop
    ยท simp
    ยท rwa [Measure.map_apply (by fun_prop) (by simp)] at ha

hasCondDistrib_action'๐Ÿ”—

LemmaBandits.ArrayModel.hasCondDistrib_action'

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.hasCondDistrib_action'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.HasCondDistrib (action alg (n + 1)) (fun x => hist alg x n) (Learning.Algorithm.policy alg n) (arrayMeasure ฮฝ)
Bandits.ArrayModel.hasCondDistrib_action'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.HasCondDistrib (action alg (n + 1)) (fun x => hist alg x n) (Learning.Algorithm.policy alg n) (arrayMeasure ฮฝ)

Code

lemma hasCondDistrib_action' (alg : Algorithm ๐“ R) (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (n : โ„•) :
    HasCondDistrib (action alg (n + 1)) (hist alg ยท n) (alg.policy n) (arrayMeasure ฮฝ)
Type uses (6)
Body uses (11)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  rw [action_add_one_eq]
  have h_fun ฯ‰ := algFunction_map alg n (hist alg ฯ‰ n)
  refine โŸจby fun_prop, ?_โŸฉ
  have h_indep : (arrayMeasure ฮฝ).map (fun ฯ‰ โ†ฆ (ฯ‰.1 (n + 1), hist alg ฯ‰ n)) =
      (โ„™).prod ((arrayMeasure ฮฝ).map (hist alg ยท n)) := by
    have h_indep' := indepFun_fst_add_one_hist alg ฮฝ n
    rw [indepFun_iff_map_prod_eq_prod_map_map (by fun_prop) (by fun_prop)] at h_indep'
    rw [h_indep']
    congr
    simp only [arrayMeasure]
    calc ((Measure.infinitePi fun x โ†ฆ โ„™).prod (streamMeasure ฮฝ)).map (fun ฯ‰ โ†ฆ ฯ‰.1 (n + 1))
    _ = (Measure.infinitePi fun x โ†ฆ โ„™).map (Function.eval (n + 1)) := by
      nth_rw 2 [โ† Measure.fst_prod (ฮผ := Measure.infinitePi fun x โ†ฆ โ„™)
        (ฮฝ := streamMeasure ฮฝ)]
      rw [Measure.fst, Measure.map_map (by fun_prop) (by fun_prop)]
      rfl
    _ = โ„™ := by rw [Measure.infinitePi_map_eval]
  have : (fun x โ†ฆ (hist alg x n, algFunction alg n (hist alg x n) (x.1 (n + 1)))) =
      (fun p โ†ฆ (p.2, algFunction alg n (p.2) (p.1))) โˆ˜ (fun x โ†ฆ (x.1 (n + 1), hist alg x n)) := rfl
  rw [this, โ† Measure.map_map (by fun_prop) (by fun_prop), h_indep]
  have : (โ„™ : Measure I).prod ((arrayMeasure ฮฝ).map (hist alg ยท n)) =
      ((Kernel.const _ โ„™) ร—โ‚– Kernel.id) โˆ˜โ‚˜ ((arrayMeasure ฮฝ).map (hist alg ยท n)) := by
    have h := Measure.compProd_const (ฮผ := (arrayMeasure ฮฝ).map (hist alg ยท n))
      (ฮฝ := (โ„™ : Measure I))
    rw [Measure.compProd_eq_comp_prod] at h
    rw [โ† Measure.prod_swap, โ† h, โ† Measure.deterministic_comp_eq_map (by fun_prop),
      Measure.comp_assoc, โ† Kernel.swap, Kernel.swap_prod]
  rw [this, โ† Measure.deterministic_comp_eq_map (by fun_prop),
    โ† Measure.deterministic_comp_eq_map (by fun_prop), Measure.compProd_eq_comp_prod,
    Measure.comp_assoc, Measure.comp_assoc, Measure.comp_assoc]
  congr 2
  ext ฯ‰ : 1
  simp only [Kernel.deterministic_comp_eq_map, Kernel.comp_deterministic_eq_comap, Kernel.coe_comap,
    Function.comp_apply]
  rw [Kernel.map_apply _ (by fun_prop), Kernel.prod_apply, Kernel.const_apply, Kernel.id_apply,
    Kernel.prod_apply, Kernel.id_apply, โ† h_fun]
  calc (((โ„™).prod (Measure.dirac (hist alg ฯ‰ n)))).map (fun p โ†ฆ (p.2, algFunction alg n p.2 p.1))
  _ = (((โ„™).prod (Measure.dirac (hist alg ฯ‰ n))).map Prod.swap).map
      (fun p โ†ฆ (p.1, algFunction alg n p.1 p.2)) := by
    rw [Measure.map_map (by fun_prop) (by fun_prop)]
    rfl
  _ = ((Measure.dirac (hist alg ฯ‰ n)).prod โ„™).map (fun p โ†ฆ (p.1, algFunction alg n p.1 p.2)) := by
    rw [Measure.prod_swap]
  _ = (Measure.dirac (hist alg ฯ‰ n)).prod ((โ„™).map (algFunction alg n (hist alg ฯ‰ n))) := by
    ext s hs
    rw [Measure.map_apply (by fun_prop) hs, Measure.prod_apply, lintegral_dirac, Measure.prod_apply,
      lintegral_dirac, Measure.map_apply (by fun_prop)]
    ยท congr
    ยท exact hs.preimage (by fun_prop)
    ยท exact hs
    ยท exact hs.preimage (by fun_prop)

hasCondDistrib_reward_pullCount_action๐Ÿ”—

LemmaBandits.ArrayModel.hasCondDistrib_reward_pullCount_action

The conditional distribution of the reward at time n + 1, given the action at time n + 1 and the number of times that action has been pulled before time n + 1, is equal to the kernel ฮฝ.

๐Ÿ”—theorem
Bandits.ArrayModel.hasCondDistrib_reward_pullCount_action.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ‰ => (action alg (n + 1) ฯ‰, Learning.pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) (ProbabilityTheory.Kernel.prodMkRight โ„• ฮฝ) (arrayMeasure ฮฝ)
Bandits.ArrayModel.hasCondDistrib_reward_pullCount_action.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ‰ => (action alg (n + 1) ฯ‰, Learning.pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) (ProbabilityTheory.Kernel.prodMkRight โ„• ฮฝ) (arrayMeasure ฮฝ)

Code

lemma hasCondDistrib_reward_pullCount_action
    (alg : Algorithm ๐“ R) (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (n : โ„•) :
    HasCondDistrib (reward alg (n + 1))
      (fun ฯ‰ โ†ฆ (action alg (n + 1) ฯ‰, pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰))
      (ฮฝ.prodMkRight _) (arrayMeasure ฮฝ)
Type uses (7)
Body uses (14)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  have h_meas : Measurable fun ฯ‰ โ†ฆ pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰ := by
    change Measurable ((fun p : (probSpace ๐“ R) ร— ๐“ โ†ฆ pullCount (action alg) p.2 (n + 1) p.1) โˆ˜
      (fun ฯ‰ : probSpace ๐“ R โ†ฆ (ฯ‰, action alg (n + 1) ฯ‰)))
    exact (measurable_uncurry_pullCount (by fun_prop) _).comp (by fun_prop)
  refine hasCondDistrib_of_condDistrib_eq (by fun_prop) (by fun_prop) ?_
  refine (condDistrib_ae_eq_cond
    (Measurable.prodMk (by fun_prop) (by fun_prop)) (by fun_prop)).trans ?_
  rw [Filter.EventuallyEq, ae_iff_of_countable]
  intro โŸจa, mโŸฉ ham
  simp only [Kernel.prodMkRight_apply]
  calc
    Measure.map (reward alg (n + 1))
      (arrayMeasure ฮฝ)[|(fun ฯ‰ โ†ฆ (action alg (n + 1) ฯ‰,
        pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) โปยน' {(a, m)}]
  _ = Measure.map (fun ฯ‰ โ†ฆ ฯ‰.2 m a)
      (arrayMeasure ฮฝ)[|(fun ฯ‰ โ†ฆ (action alg (n + 1) ฯ‰,
        pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) โปยน' {(a, m)}] := by
    rw [reward_eq]
    refine Measure.map_congr
      (ae_cond_of_forall_mem ((measurableSet_singleton _).preimage (by fun_prop)) (fun x hx โ†ฆ ?_))
    simp only [Set.mem_preimage, Set.mem_singleton_iff, Prod.mk.injEq] at hx
    simp only [hx.1] at hx โŠข
    simp [hx.2]
  _ = Measure.map (fun ฯ‰ โ†ฆ ฯ‰.2 m a)
      (arrayMeasure ฮฝ)[|({ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง
        pullCount (action alg) a (n + 1) ฯ‰ = m}).indicator 1 โปยน' {1}] := by
    congr with ฯ‰
    simp only [Set.mem_preimage, Set.mem_singleton_iff, Prod.mk.injEq, Set.indicator_apply,
      Set.mem_setOf_eq, Pi.one_apply, ite_eq_left_iff, not_and, zero_ne_one, imp_false,
      Classical.not_imp, Decidable.not_not, and_congr_right_iff]
    intro ha
    simp [ha]
  _ = ฮฝ a := by
    rw [cond_of_indepFun, map_snd_apply_arrayMeasure m a]
    ยท exact (indepFun_snd_apply_pullCount_action alg ฮฝ a m n).symm
    ยท refine Measurable.indicator (by fun_prop) ?_
      exact MeasurableSet.inter ((measurableSet_singleton _).preimage (by fun_prop))
        ((measurableSet_singleton _).preimage (by fun_prop))
    ยท fun_prop
    ยท simp
    ยท rw [Measure.map_apply (by fun_prop) (by simp)] at ham
      convert ham
      ext ฯ‰
      simp only [Set.mem_preimage, Set.indicator_apply, Set.mem_setOf_eq, Pi.one_apply,
        Set.mem_singleton_iff, ite_eq_left_iff, not_and, zero_ne_one, imp_false, Classical.not_imp,
        Decidable.not_not, Prod.mk.injEq, and_congr_right_iff]
      intro ha
      simp [ha]

reward_ae_eq_cond๐Ÿ”—

LemmaBandits.ArrayModel.reward_ae_eq_cond

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.reward_ae_eq_cond.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) (a : ๐“) (n m : โ„•) : reward alg (n + 1) =แต[(arrayMeasure ฮฝ)[|(fun ฯ‰ => (action alg (n + 1) ฯ‰, Learning.pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) โปยน' {(a, m)}]] fun ฯ‰ => Prod.snd ฯ‰ m a
Bandits.ArrayModel.reward_ae_eq_cond.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) (a : ๐“) (n m : โ„•) : reward alg (n + 1) =แต[(arrayMeasure ฮฝ)[|(fun ฯ‰ => (action alg (n + 1) ฯ‰, Learning.pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) โปยน' {(a, m)}]] fun ฯ‰ => Prod.snd ฯ‰ m a

Code

lemma reward_ae_eq_cond (alg : Algorithm ๐“ R) (ฮฝ : Kernel ๐“ R) (a : ๐“) (n m : โ„•) :
    reward alg (n + 1) =แต[(arrayMeasure ฮฝ)[|(fun ฯ‰ โ†ฆ (action alg (n + 1) ฯ‰,
        pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) โปยน' {(a, m)}]]
      (fun ฯ‰ โ†ฆ ฯ‰.2 m a)
Type uses (7)
Body uses (5)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  rw [reward_eq]
  refine ae_cond_of_forall_mem ?_ ?_
  ยท have : Measurable fun ฯ‰ โ†ฆ pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰ := by
      change Measurable ((fun p : (probSpace ๐“ R) ร— ๐“ โ†ฆ pullCount (action alg) p.2 (n + 1) p.1) โˆ˜
        (fun ฯ‰ : probSpace ๐“ R โ†ฆ (ฯ‰, action alg (n + 1) ฯ‰)))
      exact (measurable_uncurry_pullCount (by fun_prop) _).comp (by fun_prop)
    exact (measurableSet_singleton _).preimage (by fun_prop)
  intro ฯ‰ hฯ‰
  simp only [Set.mem_preimage, Set.mem_singleton_iff, Prod.mk.injEq] at hฯ‰
  simp only [hฯ‰.2]
  simp [hฯ‰.1]

hasCondDistrib_reward_hist_action_pullCount๐Ÿ”—

LemmaBandits.ArrayModel.hasCondDistrib_reward_hist_action_pullCount

The conditional distribution of the reward at time n + 1, given the history up to time n, the action at time n + 1, and the number of times that action has been pulled before time n + 1, is equal to the kernel ฮฝ.

๐Ÿ”—theorem
Bandits.ArrayModel.hasCondDistrib_reward_hist_action_pullCount.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ‰ => (hist alg ฯ‰ n, action alg (n + 1) ฯ‰, Learning.pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) (ProbabilityTheory.Kernel.prodMkLeft (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— R) (ProbabilityTheory.Kernel.prodMkRight โ„• ฮฝ)) (arrayMeasure ฮฝ)
Bandits.ArrayModel.hasCondDistrib_reward_hist_action_pullCount.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ‰ => (hist alg ฯ‰ n, action alg (n + 1) ฯ‰, Learning.pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) (ProbabilityTheory.Kernel.prodMkLeft (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— R) (ProbabilityTheory.Kernel.prodMkRight โ„• ฮฝ)) (arrayMeasure ฮฝ)

Code

lemma hasCondDistrib_reward_hist_action_pullCount
    (alg : Algorithm ๐“ R) (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (n : โ„•) :
    HasCondDistrib (reward alg (n + 1))
      (fun ฯ‰ โ†ฆ (hist alg ฯ‰ n, action alg (n + 1) ฯ‰,
        pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰))
      ((ฮฝ.prodMkRight _).prodMkLeft _) (arrayMeasure ฮฝ)
Type uses (8)
Body uses (17)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  have h_meas : Measurable fun ฯ‰ โ†ฆ pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰ := by
    change Measurable ((fun p : (probSpace ๐“ R) ร— ๐“ โ†ฆ pullCount (action alg) p.2 (n + 1) p.1) โˆ˜
      (fun ฯ‰ : probSpace ๐“ R โ†ฆ (ฯ‰, action alg (n + 1) ฯ‰)))
    exact (measurable_uncurry_pullCount (by fun_prop) _).comp (by fun_prop)
  refine hasCondDistrib_of_condDistrib_eq (by fun_prop) (by fun_prop) ?_
  refine condDistrib_prod_of_forall_condDistrib_cond (by fun_prop) (by fun_prop) (by fun_prop) _ ?_
  intro (a, m) ham
  have h_eq : ((ฮฝ.prodMkRight _).prodMkLeft _).comap (fun ฯ‰ : (Iic n โ†’ ๐“ ร— R) โ†ฆ (ฯ‰, a, m))
        (by fun_prop) =
      Kernel.const _ (ฮฝ a) := by ext; simp
  rw [h_eq, condDistrib_congr_left (reward_ae_eq_cond alg ฮฝ a n m)]
  refine (condDistrib_of_indepFun ?_ (by fun_prop) (by fun_prop)).trans (ae_of_all _ fun ฯ‰ โ†ฆ ?_)
  ยท exact (indepFun_snd_hist_cond alg ฮฝ a n m).symm
  ยท simp only [Kernel.const_apply]
    have : (fun ฯ‰ โ†ฆ (action alg (n + 1) ฯ‰,
          pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) โปยน' {(a, m)} =
        ({ฯ‰ | action alg (n + 1) ฯ‰ = a โˆง
          pullCount (action alg) a (n + 1) ฯ‰ = m}).indicator 1 โปยน' {1} := by
      ext ฯ‰
      simp [Set.indicator_apply]
      grind
    rw [this, cond_of_indepFun, map_snd_apply_arrayMeasure m a]
    ยท exact (indepFun_snd_apply_pullCount_action alg ฮฝ a m n).symm
    ยท refine Measurable.indicator (by fun_prop) ?_
      exact MeasurableSet.inter ((measurableSet_singleton _).preimage (by fun_prop))
        ((measurableSet_singleton _).preimage (by fun_prop))
    ยท fun_prop
    ยท simp
    ยท convert ham
      ext ฯ‰
      simp only [Set.mem_preimage, Set.indicator_apply, Set.mem_setOf_eq, Pi.one_apply,
        Set.mem_singleton_iff, ite_eq_left_iff, not_and, zero_ne_one, imp_false, Classical.not_imp,
        Decidable.not_not, Prod.mk.injEq, and_congr_right_iff]
      intro ha
      simp [ha]

condIndepFun_reward_hist๐Ÿ”—

LemmaBandits.ArrayModel.condIndepFun_reward_hist

The reward at time n + 1 is conditionally independent of the history up to time n, given the action at time n + 1 and the number of times that action has been pulled before time n + 1.

๐Ÿ”—theorem
Bandits.ArrayModel.condIndepFun_reward_hist.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.CondIndepFun (MeasurableSpace.comap (fun ฯ‰ => (action alg (n + 1) ฯ‰, Learning.pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) inferInstance) โ‹ฏ (reward alg (n + 1)) (fun x => hist alg x n) (arrayMeasure ฮฝ)
Bandits.ArrayModel.condIndepFun_reward_hist.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.CondIndepFun (MeasurableSpace.comap (fun ฯ‰ => (action alg (n + 1) ฯ‰, Learning.pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)) inferInstance) โ‹ฏ (reward alg (n + 1)) (fun x => hist alg x n) (arrayMeasure ฮฝ)

Code

lemma condIndepFun_reward_hist (alg : Algorithm ๐“ R) (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (n : โ„•) :
    (reward alg (n + 1)) โŸ‚แตข[(fun ฯ‰ โ†ฆ (action alg (n + 1) ฯ‰,
          pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰)),
        Measurable.prodMk (by fun_prop) (measurable_pullCount_action_add_one alg n);
        arrayMeasure ฮฝ]
      (hist alg ยท n)
Type uses (12)
Body uses (5)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  have h_cond := hasCondDistrib_reward_hist_action_pullCount alg ฮฝ n
  refine condIndepFun_of_exists_condDistrib_prod_ae_eq_prodMkLeft (by fun_prop) (by fun_prop) ?_
    h_cond.condDistrib_eq
  exact Measurable.prodMk (by fun_prop) (measurable_pullCount_action_add_one alg n)

hasCondDistrib_reward'๐Ÿ”—

LemmaBandits.ArrayModel.hasCondDistrib_reward'

The conditional distribution of the reward at time n + 1, given the history up to time n and the action at time n + 1, is equal to the kernel ฮฝ.

๐Ÿ”—theorem
Bandits.ArrayModel.hasCondDistrib_reward'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ‰ => (hist alg ฯ‰ n, action alg (n + 1) ฯ‰)) (ProbabilityTheory.Kernel.prodMkLeft (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— R) ฮฝ) (arrayMeasure ฮฝ)
Bandits.ArrayModel.hasCondDistrib_reward'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ‰ => (hist alg ฯ‰ n, action alg (n + 1) ฯ‰)) (ProbabilityTheory.Kernel.prodMkLeft (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— R) ฮฝ) (arrayMeasure ฮฝ)

Code

lemma hasCondDistrib_reward' (alg : Algorithm ๐“ R) (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (n : โ„•) :
    HasCondDistrib (reward alg (n + 1)) (fun ฯ‰ โ†ฆ (hist alg ฯ‰ n, action alg (n + 1) ฯ‰))
      (ฮฝ.prodMkLeft _) (arrayMeasure ฮฝ)
Type uses (7)
Body uses (16)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  let R' := reward alg (n + 1)
  let H := (hist alg ยท n)
  let A := action alg (n + 1)
  let P := fun ฯ‰ โ†ฆ pullCount (action alg) (action alg (n + 1) ฯ‰) (n + 1) ฯ‰
  have hP : Measurable P := measurable_pullCount_action_add_one alg n
  change HasCondDistrib R' (fun ฯ‰ โ†ฆ (H ฯ‰, A ฯ‰)) (ฮฝ.prodMkLeft _) _
  suffices HasCondDistrib R' (fun ฯ‰ โ†ฆ (A ฯ‰, H ฯ‰)) (ฮฝ.prodMkRight _) (arrayMeasure ฮฝ) by
    have h_eq : (fun ฯ‰ โ†ฆ (H ฯ‰, A ฯ‰)) = MeasurableEquiv.prodComm โˆ˜ (fun ฯ‰ โ†ฆ (A ฯ‰, H ฯ‰)) := rfl
    rw [h_eq]
    exact this.measurableEquiv_comp_right (ฮบ := ฮฝ.prodMkRight _) _
  suffices HasCondDistrib R' (fun ฯ‰ โ†ฆ ((A ฯ‰, H ฯ‰), P ฯ‰))
      ((ฮฝ.prodMkRight _).prodMkRight _) (arrayMeasure ฮฝ) by
    -- use that `P` is measurable wrt `(A, H)` to drop it from the conditioning
    have hP_meas :
        Measurable[MeasurableSpace.comap (fun ฯ‰ โ†ฆ (A ฯ‰, H ฯ‰)) inferInstance] P :=
      measurable_pullCount_action_add_one_hist alg n
    obtain โŸจf, hf_meas, hf_eqโŸฉ := hP_meas.exists_eq_measurable_comp
    simp only [hf_eq, Function.comp_apply] at this
    rwa [hasCondDistrib_prod_right_iff _ _ hf_meas] at this
  suffices HasCondDistrib R' (fun ฯ‰ โ†ฆ ((A ฯ‰, P ฯ‰), H ฯ‰))
      ((ฮฝ.prodMkRight _).prodMkRight _) (arrayMeasure ฮฝ) by
    let e : ((๐“ ร— โ„•) ร— (Iic n โ†’ ๐“ ร— R)) โ‰ƒแต ((๐“ ร— (Iic n โ†’ ๐“ ร— R)) ร— โ„•) :=
    { toFun := fun x โ†ฆ ((x.1.1, x.2), x.1.2)
      invFun := fun x โ†ฆ ((x.1.1, x.2), x.1.2)
      measurable_toFun := by simp only [Equiv.coe_fn_mk]; fun_prop
      measurable_invFun := by simp only [Equiv.symm_mk, Equiv.coe_fn_mk]; fun_prop }
    exact this.measurableEquiv_comp_right e
  suffices HasCondDistrib R' (fun ฯ‰ โ†ฆ (A ฯ‰, P ฯ‰)) (ฮฝ.prodMkRight _) (arrayMeasure ฮฝ) by
    have h_indep : H โŸ‚แตข[(fun ฯ‰ โ†ฆ (A ฯ‰, P ฯ‰)), (by fun_prop); arrayMeasure ฮฝ] R' :=
      (condIndepFun_reward_hist alg ฮฝ n).symm
    have h_condDistrib := this.condDistrib_eq
    rw [condIndepFun_iff_condDistrib_prod_ae_eq_prodMkRight (by fun_prop) (by fun_prop)
      (by fun_prop)] at h_indep
    refine hasCondDistrib_of_condDistrib_eq (by fun_prop) (by fun_prop) ?_
    refine h_indep.trans ?_
    rw [Filter.EventuallyEq, ae_map_iff] at h_condDistrib โŠข
    ยท simpa only [Kernel.prodMkRight_apply]
    ยท fun_prop
    ยท exact Kernel.measurableSet_eq _ _
    ยท fun_prop
    ยท exact Kernel.measurableSet_eq _ _
  exact hasCondDistrib_reward_pullCount_action alg ฮฝ n

hasCondDistrib_action๐Ÿ”—

LemmaBandits.ArrayModel.hasCondDistrib_action

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.hasCondDistrib_action.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.HasCondDistrib (action alg (n + 1)) (fun ฯ‰ i => (action alg (โ†‘i) ฯ‰, reward alg (โ†‘i) ฯ‰)) (Learning.Algorithm.policy alg n) (arrayMeasure ฮฝ)
Bandits.ArrayModel.hasCondDistrib_action.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.HasCondDistrib (action alg (n + 1)) (fun ฯ‰ i => (action alg (โ†‘i) ฯ‰, reward alg (โ†‘i) ฯ‰)) (Learning.Algorithm.policy alg n) (arrayMeasure ฮฝ)

Code

lemma hasCondDistrib_action (alg : Algorithm ๐“ R) (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] (n : โ„•) :
    HasCondDistrib (action alg (n + 1))
      (fun ฯ‰ (i : Iic n) โ†ฆ (action alg i ฯ‰, reward alg i ฯ‰))
      (alg.policy n) (arrayMeasure ฮฝ)
Type uses (6)
Body uses (3)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  convert hasCondDistrib_action' alg ฮฝ n with ฯ‰ i
  ยท simp only [action]
    rw [hist_eq _ _ n]
  ยท simp only [reward]
    rw [hist_eq _ _ n]

hasCondDistrib_reward๐Ÿ”—

LemmaBandits.ArrayModel.hasCondDistrib_reward

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.hasCondDistrib_reward.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ‰ => (fun i => (action alg (โ†‘i) ฯ‰, reward alg (โ†‘i) ฯ‰), action alg (n + 1) ฯ‰)) (Learning.Environment.feedback (Learning.stationaryEnv ฮฝ) n) (arrayMeasure ฮฝ)
Bandits.ArrayModel.hasCondDistrib_reward.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ„•) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ‰ => (fun i => (action alg (โ†‘i) ฯ‰, reward alg (โ†‘i) ฯ‰), action alg (n + 1) ฯ‰)) (Learning.Environment.feedback (Learning.stationaryEnv ฮฝ) n) (arrayMeasure ฮฝ)

Code

lemma hasCondDistrib_reward (alg : Algorithm ๐“ R) (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ]
    (n : โ„•) :
    HasCondDistrib (reward alg (n + 1))
      (fun ฯ‰ โ†ฆ (fun (i : Iic n) โ†ฆ (action alg i ฯ‰, reward alg i ฯ‰), action alg (n + 1) ฯ‰))
      ((stationaryEnv ฮฝ).feedback n) (arrayMeasure ฮฝ)
Type uses (8)
Body uses (3)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  convert hasCondDistrib_reward' alg ฮฝ n with ฯ‰ i
  ยท simp only [action]
    rw [hist_eq _ _ n]
  ยท simp only [reward]
    rw [hist_eq _ _ n]
  ยท rfl

isAlgEnvSeq_arrayMeasure๐Ÿ”—

LemmaBandits.ArrayModel.isAlgEnvSeq_arrayMeasure

No docstring.

๐Ÿ”—theorem
Bandits.ArrayModel.isAlgEnvSeq_arrayMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : Learning.IsAlgEnvSeq (action alg) (reward alg) alg (Learning.stationaryEnv ฮฝ) (arrayMeasure ฮฝ)
Bandits.ArrayModel.isAlgEnvSeq_arrayMeasure.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [Nonempty ๐“] [StandardBorelSpace ๐“] [DecidableEq ๐“] [Countable ๐“] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐“ R) (ฮฝ : ProbabilityTheory.Kernel ๐“ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : Learning.IsAlgEnvSeq (action alg) (reward alg) alg (Learning.stationaryEnv ฮฝ) (arrayMeasure ฮฝ)

Code

lemma isAlgEnvSeq_arrayMeasure (alg : Algorithm ๐“ R) (ฮฝ : Kernel ๐“ R) [IsMarkovKernel ฮฝ] :
    IsAlgEnvSeq (action alg) (reward alg) alg (stationaryEnv ฮฝ) (arrayMeasure ฮฝ) where
  hasLaw_action_zero
Type uses (9)
Body uses (8)
Used by (3)

Actions: Source ยท Open Issue

Proof
hasLaw_action_zero alg ฮฝ
  hasCondDistrib_feedback_zero := hasCondDistrib_reward_zero alg ฮฝ
  hasCondDistrib_action := hasCondDistrib_action alg ฮฝ
  hasCondDistrib_feedback := hasCondDistrib_reward alg ฮฝ
  1. Bandits.streamMeasure
  2. Bandits.instIsProbabilityMeasureForallNatForallStreamMeasureOfIsMarkovKernel
  3. hasLaw_eval_infinitePi
  4. Bandits.hasLaw_eval_streamMeasure
  5. Bandits.hasLaw_eval_eval_streamMeasure
  6. Bandits.identDistrib_eval_eval_id_streamMeasure
  7. Bandits.integrable_eval_streamMeasure
  8. Bandits.integral_eval_streamMeasure
  9. Bandits.iIndepFun_eval_streamMeasure'
  10. Bandits.iIndepFun_eval_streamMeasure''
  11. Bandits.iIndepFun_eval_streamMeasure
  12. Bandits.indepFun_eval_streamMeasure
  13. Bandits.indepFun_eval_streamMeasure'
  14. Bandits.ArrayModel.probSpace
  15. Bandits.ArrayModel.instMeasurableSpaceProbSpace
  16. Bandits.ArrayModel.instStandardBorelSpaceProbSpaceOfCountable
  17. Bandits.ArrayModel.arrayMeasure
  18. Bandits.ArrayModel.instIsProbabilityMeasureProbSpaceArrayMeasureOfIsMarkovKernel
  19. Bandits.ArrayModel.initAlgFunction
  20. Bandits.ArrayModel.initAlgFunction_map
  21. Bandits.ArrayModel.measurable_initAlgFunction
  22. Bandits.ArrayModel.algFunction
  23. Bandits.ArrayModel.algFunction_map
  24. Bandits.ArrayModel.measurable_algFunction
  25. Bandits.ArrayModel.hist
  26. Bandits.ArrayModel.hist_zero
  27. Bandits.ArrayModel.hist_add_one
  28. Bandits.ArrayModel.hist_eq
  29. Bandits.ArrayModel.hist_add_one_eq_IicSuccProd'
  30. Bandits.ArrayModel.action
  31. Bandits.ArrayModel.action_zero
  32. Bandits.ArrayModel.action_add_one_eq
  33. Bandits.ArrayModel.reward
  34. Bandits.ArrayModel.reward_zero
  35. Bandits.ArrayModel.reward_add_one
  36. Bandits.ArrayModel.reward_eq
  37. Bandits.ArrayModel.sumRewards_eq
  38. Bandits.ArrayModel.measurable_action_add_one'
  39. Bandits.ArrayModel.measurable_pullCount'_action_add_one
  40. Bandits.ArrayModel.measurable_hist
  41. Bandits.ArrayModel.measurable_action
  42. Bandits.ArrayModel.measurable_reward
  43. Bandits.ArrayModel.hist_add_one_eq_IicSuccProd
  44. Bandits.ArrayModel.measurable_pullCount_action_add_one
  45. Bandits.ArrayModel.hist_congr
  46. Bandits.ArrayModel.stepsUntil_congr_aux
  47. Bandits.ArrayModel.stepsUntil_congr
  48. Bandits.ArrayModel.stepsUntil_indicator_congr
  49. Bandits.ArrayModel.measurable_hist_comap
  50. Bandits.ArrayModel.truePast
  51. Bandits.ArrayModel.truePast_eq_of_pullCount_eq
  52. Bandits.ArrayModel.truePast_eq_of_pullCount_eq_of_ne_zero
  53. Bandits.ArrayModel.measurable_hist_truePast
  54. Bandits.ArrayModel.measurable_action_add_one_truePast
  55. Bandits.ArrayModel.measurable_pullCount_add_one_truePast
  56. Bandits.ArrayModel.measurable_stepsUntil
  57. Bandits.ArrayModel.measurable_pullCount_action_add_one_hist
  58. Bandits.ArrayModel.map_snd_apply_arrayMeasure
  59. Bandits.ArrayModel.indepFun_fst_snd
  60. Bandits.ArrayModel.indepFun_fst_zero_snd_zero_action
  61. Bandits.ArrayModel.indepFun_fst_add_one_aux
  62. Bandits.ArrayModel.indepFun_fst_add_one_hist
  63. Bandits.ArrayModel.indepFun_snd_apply_aux
  64. Bandits.ArrayModel.indepFun_snd_apply_pullCount_action
  65. Bandits.ArrayModel.indepFun_cond_comp
  66. Bandits.ArrayModel.indepFun_snd_hist_cond
  67. Bandits.ArrayModel.hasLaw_action_zero
  68. Bandits.ArrayModel.hasCondDistrib_reward_zero
  69. Bandits.ArrayModel.hasCondDistrib_action'
  70. Bandits.ArrayModel.hasCondDistrib_reward_pullCount_action
  71. Bandits.ArrayModel.reward_ae_eq_cond
  72. Bandits.ArrayModel.hasCondDistrib_reward_hist_action_pullCount
  73. Bandits.ArrayModel.condIndepFun_reward_hist
  74. Bandits.ArrayModel.hasCondDistrib_reward'
  75. Bandits.ArrayModel.hasCondDistrib_action
  76. Bandits.ArrayModel.hasCondDistrib_reward
  77. Bandits.ArrayModel.isAlgEnvSeq_arrayMeasure