2.3.ย Online.Bandit.ArrayProbSpace
Array-of-rewards probability space for stochastic bandits
We build a particular probability space for stochastic bandits, called the "array model", in which
an infinite array of i.i.d. rewards is first produced for all actions. When the algorithm chooses
action a for the nth time, it receives the reward in the row a of the array and column n.
Some statements about bandit algorithms are easier to prove in this space, and can then be transfered to any other probability space using the fact that the conditinonal distributions of the arms and rewards specified in the bandit model determine their laws uniquely.
Main definitions
-
streamMeasure ฮฝ: probability measure on the space of infinite arrays of rewards, where the rewards in each row are i.i.d. according toฮฝ. -
probSpace ๐ R: probability space for the array model of stochastic bandits with action space๐and reward spaceR. -
arrayMeasure ฮฝ: probability measure onprobSpace ๐ Rfor the array model of stochastic bandits with reward kernelฮฝ.
Module LeanMachineLearning.Online.Bandit.ArrayProbSpace contains 77 exposed declarations.
streamMeasure๐
Bandits.streamMeasureMeasure of an infinite stream of rewards from each action.
Bandits.streamMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) : MeasureTheory.Measure (โ โ ๐ โ R)Bandits.streamMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) : MeasureTheory.Measure (โ โ ๐ โ R)
Code
noncomputable def streamMeasure (ฮฝ : Kernel ๐ R) : Measure (โ โ ๐ โ R) := Measure.infinitePi fun _ โฆ Measure.infinitePi ฮฝ
Used by (56)
Actions: Source ยท Open Issue
instIsProbabilityMeasureForallNatForallStreamMeasureOfIsMarkovKernel๐
Bandits.instIsProbabilityMeasureForallNatForallStreamMeasureOfIsMarkovKernelNo docstring.
Bandits.instIsProbabilityMeasureForallNatForallStreamMeasureOfIsMarkovKernel.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : MeasureTheory.IsProbabilityMeasure (streamMeasure ฮฝ)Bandits.instIsProbabilityMeasureForallNatForallStreamMeasureOfIsMarkovKernel.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : MeasureTheory.IsProbabilityMeasure (streamMeasure ฮฝ)
Code
instance (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] : IsProbabilityMeasure (streamMeasure ฮฝ)
Type uses (1)
Actions: Source ยท Open Issue
Proof
by unfold streamMeasure infer_instance
hasLaw_eval_infinitePi๐
hasLaw_eval_infinitePiNo docstring.
hasLaw_eval_infinitePi.{u_3, u_4} {ฮน : Type u_3} {X : ฮน โ Type u_4} {mX : (i : ฮน) โ MeasurableSpace (X i)} (ฮผ : (i : ฮน) โ MeasureTheory.Measure (X i)) [hฮผ : โ (i : ฮน), MeasureTheory.IsProbabilityMeasure (ฮผ i)] (i : ฮน) : ProbabilityTheory.HasLaw (Function.eval i) (ฮผ i) (MeasureTheory.Measure.infinitePi ฮผ)hasLaw_eval_infinitePi.{u_3, u_4} {ฮน : Type u_3} {X : ฮน โ Type u_4} {mX : (i : ฮน) โ MeasurableSpace (X i)} (ฮผ : (i : ฮน) โ MeasureTheory.Measure (X i)) [hฮผ : โ (i : ฮน), MeasureTheory.IsProbabilityMeasure (ฮผ i)] (i : ฮน) : ProbabilityTheory.HasLaw (Function.eval i) (ฮผ i) (MeasureTheory.Measure.infinitePi ฮผ)
Code
lemma _root_.hasLaw_eval_infinitePi {ฮน : Type*} {X : ฮน โ Type*} {mX : โ i, MeasurableSpace (X i)}
(ฮผ : (i : ฮน) โ Measure (X i)) [hฮผ : โ i, IsProbabilityMeasure (ฮผ i)] (i : ฮน) :
HasLaw (Function.eval i) (ฮผ i) (Measure.infinitePi ฮผ) where
aemeasurableActions: Source ยท Open Issue
Proof
Measurable.aemeasurable (by fun_prop) map_eq := by exact (measurePreserving_eval_infinitePi ฮผ i).map_eq
hasLaw_eval_streamMeasure๐
Bandits.hasLaw_eval_streamMeasureNo docstring.
Bandits.hasLaw_eval_streamMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.HasLaw (fun h => h n) (MeasureTheory.Measure.infinitePi โฮฝ) (streamMeasure ฮฝ)Bandits.hasLaw_eval_streamMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.HasLaw (fun h => h n) (MeasureTheory.Measure.infinitePi โฮฝ) (streamMeasure ฮฝ)
Code
lemma hasLaw_eval_streamMeasure (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (n : โ) :
HasLaw (fun h : โ โ ๐ โ R โฆ h n) (Measure.infinitePi ฮฝ) (streamMeasure ฮฝ)Type uses (1)
Body uses (1)
Used by (1)
Actions: Source ยท Open Issue
Proof
hasLaw_eval_infinitePi (fun _ โฆ Measure.infinitePi ฮฝ) n
hasLaw_eval_eval_streamMeasure๐
Bandits.hasLaw_eval_eval_streamMeasureNo docstring.
Bandits.hasLaw_eval_eval_streamMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) (a : ๐) : ProbabilityTheory.HasLaw (fun h => h n a) (ฮฝ a) (streamMeasure ฮฝ)Bandits.hasLaw_eval_eval_streamMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) (a : ๐) : ProbabilityTheory.HasLaw (fun h => h n a) (ฮฝ a) (streamMeasure ฮฝ)
Code
lemma hasLaw_eval_eval_streamMeasure (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (n : โ) (a : ๐) :
HasLaw (fun h : โ โ ๐ โ R โฆ h n a) (ฮฝ a) (streamMeasure ฮฝ)Type uses (1)
Body uses (2)
Actions: Source ยท Open Issue
Proof
(hasLaw_eval_infinitePi ฮฝ a).comp (hasLaw_eval_streamMeasure ฮฝ n)
identDistrib_eval_eval_id_streamMeasure๐
Bandits.identDistrib_eval_eval_id_streamMeasureNo docstring.
Bandits.identDistrib_eval_eval_id_streamMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) (a : ๐) : ProbabilityTheory.IdentDistrib (fun h => h n a) id (streamMeasure ฮฝ) (ฮฝ a)Bandits.identDistrib_eval_eval_id_streamMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) (a : ๐) : ProbabilityTheory.IdentDistrib (fun h => h n a) id (streamMeasure ฮฝ) (ฮฝ a)
Code
lemma identDistrib_eval_eval_id_streamMeasure (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (n : โ) (a : ๐) :
IdentDistrib (fun h : โ โ ๐ โ R โฆ h n a) id (streamMeasure ฮฝ) (ฮฝ a) where
aemeasurable_fstType uses (1)
Body uses (1)
Used by (7)
Actions: Source ยท Open Issue
Proof
Measurable.aemeasurable (by fun_prop)
aemeasurable_snd := Measurable.aemeasurable (by fun_prop)
map_eq := by
rw [โ (hasLaw_eval_eval_streamMeasure ฮฝ n a).map_eq,
Measure.map_map (by fun_prop) (by fun_prop)]
simp
integrable_eval_streamMeasure๐
Bandits.integrable_eval_streamMeasureNo docstring.
Bandits.integrable_eval_streamMeasure.{u_1} {๐ : Type u_1} {m๐ : MeasurableSpace ๐} (ฮฝ : ProbabilityTheory.Kernel ๐ โ) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) (a : ๐) (h_int : MeasureTheory.Integrable id (ฮฝ a)) : MeasureTheory.Integrable (fun h => h n a) (streamMeasure ฮฝ)Bandits.integrable_eval_streamMeasure.{u_1} {๐ : Type u_1} {m๐ : MeasurableSpace ๐} (ฮฝ : ProbabilityTheory.Kernel ๐ โ) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) (a : ๐) (h_int : MeasureTheory.Integrable id (ฮฝ a)) : MeasureTheory.Integrable (fun h => h n a) (streamMeasure ฮฝ)
Code
lemma integrable_eval_streamMeasure (ฮฝ : Kernel ๐ โ) [IsMarkovKernel ฮฝ] (n : โ) (a : ๐)
(h_int : Integrable id (ฮฝ a)) :
Integrable (fun h : โ โ ๐ โ โ โฆ h n a) (streamMeasure ฮฝ)Type uses (1)
Body uses (2)
Actions: Source ยท Open Issue
Proof
Integrable.congr_identDistrib h_int (identDistrib_eval_eval_id_streamMeasure ฮฝ n a).symm
integral_eval_streamMeasure๐
Bandits.integral_eval_streamMeasureNo docstring.
Bandits.integral_eval_streamMeasure.{u_1} {๐ : Type u_1} {m๐ : MeasurableSpace ๐} (ฮฝ : ProbabilityTheory.Kernel ๐ โ) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) (a : ๐) : โซ (h : โ โ ๐ โ โ), h n a โstreamMeasure ฮฝ = โซ (x : โ), id x โฮฝ aBandits.integral_eval_streamMeasure.{u_1} {๐ : Type u_1} {m๐ : MeasurableSpace ๐} (ฮฝ : ProbabilityTheory.Kernel ๐ โ) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) (a : ๐) : โซ (h : โ โ ๐ โ โ), h n a โstreamMeasure ฮฝ = โซ (x : โ), id x โฮฝ a
Code
lemma integral_eval_streamMeasure (ฮฝ : Kernel ๐ โ) [IsMarkovKernel ฮฝ] (n : โ) (a : ๐) :
โซ h, h n a โ(streamMeasure ฮฝ) = (ฮฝ a)[id]Type uses (1)
Body uses (1)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
calc โซ h, h n a โ(streamMeasure ฮฝ)
_ = โซ x, x โ((streamMeasure ฮฝ).map (fun h โฆ h n a)) := by
rw [integral_map (Measurable.aemeasurable (by fun_prop)) (by fun_prop)]
_ = (ฮฝ a)[id] := by simp [(hasLaw_eval_eval_streamMeasure ฮฝ n a).map_eq]
iIndepFun_eval_streamMeasure'๐
Bandits.iIndepFun_eval_streamMeasure'No docstring.
Bandits.iIndepFun_eval_streamMeasure'.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.iIndepFun (fun n ฯ => ฯ n) (streamMeasure ฮฝ)Bandits.iIndepFun_eval_streamMeasure'.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.iIndepFun (fun n ฯ => ฯ n) (streamMeasure ฮฝ)
Code
lemma iIndepFun_eval_streamMeasure' (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] :
iIndepFun (fun n ฯ โฆ ฯ n) (streamMeasure ฮฝ)Type uses (1)
Used by (1)
Actions: Source ยท Open Issue
Proof
iIndepFun_infinitePi (P := fun (_ : โ) โฆ Measure.infinitePi ฮฝ) (ฮฉ := fun _ โฆ ๐ โ R)
(X := fun i u โฆ u) (fun i โฆ by fun_prop)
iIndepFun_eval_streamMeasure''๐
Bandits.iIndepFun_eval_streamMeasure''No docstring.
Bandits.iIndepFun_eval_streamMeasure''.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐) : ProbabilityTheory.iIndepFun (fun n ฯ => ฯ n a) (streamMeasure ฮฝ)Bandits.iIndepFun_eval_streamMeasure''.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐) : ProbabilityTheory.iIndepFun (fun n ฯ => ฯ n a) (streamMeasure ฮฝ)
Code
lemma iIndepFun_eval_streamMeasure'' (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (a : ๐) :
iIndepFun (fun n ฯ โฆ ฯ n a) (streamMeasure ฮฝ)Type uses (1)
Body uses (1)
Used by (5)
Actions: Source ยท Open Issue
Proof
(iIndepFun_eval_streamMeasure' ฮฝ).comp (g := fun i ฯ โฆ ฯ a) (by fun_prop)
iIndepFun_eval_streamMeasure๐
Bandits.iIndepFun_eval_streamMeasureNo docstring.
Bandits.iIndepFun_eval_streamMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.iIndepFun (fun p ฯ => ฯ (Prod.fst p) (Prod.snd p)) (streamMeasure ฮฝ)Bandits.iIndepFun_eval_streamMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.iIndepFun (fun p ฯ => ฯ (Prod.fst p) (Prod.snd p)) (streamMeasure ฮฝ)
Code
lemma iIndepFun_eval_streamMeasure (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] :
iIndepFun (fun (p : โ ร ๐) ฯ โฆ ฯ p.1 p.2) (streamMeasure ฮฝ)Type uses (1)
Used by (1)
Actions: Source ยท Open Issue
Proof
iIndepFun_uncurry_infinitePi' (X := fun _ _ โฆ id) (fun _ โฆ ฮฝ) (by fun_prop)
indepFun_eval_streamMeasure๐
Bandits.indepFun_eval_streamMeasureNo docstring.
Bandits.indepFun_eval_streamMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] {n m : โ} {a b : ๐} (h : n โ m โจ a โ b) : ProbabilityTheory.IndepFun (fun ฯ => ฯ n a) (fun ฯ => ฯ m b) (streamMeasure ฮฝ)Bandits.indepFun_eval_streamMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] {n m : โ} {a b : ๐} (h : n โ m โจ a โ b) : ProbabilityTheory.IndepFun (fun ฯ => ฯ n a) (fun ฯ => ฯ m b) (streamMeasure ฮฝ)
Code
lemma indepFun_eval_streamMeasure (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] {n m : โ} {a b : ๐}
(h : n โ m โจ a โ b) :
IndepFun (fun ฯ โฆ ฯ n a) (fun ฯ โฆ ฯ m b) (streamMeasure ฮฝ)Type uses (1)
Body uses (1)
Actions: Source ยท Open Issue
Proof
by
change IndepFun (fun ฯ โฆ ฯ (n, a).1 (n, a).2) (fun ฯ โฆ ฯ (m, b).1 (m, b).2)
(streamMeasure ฮฝ)
exact (iIndepFun_eval_streamMeasure ฮฝ).indepFun (by grind)
indepFun_eval_streamMeasure'๐
Bandits.indepFun_eval_streamMeasure'No docstring.
Bandits.indepFun_eval_streamMeasure'.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] {a b : ๐} (h : a โ b) : ProbabilityTheory.IndepFun (fun ฯ n => ฯ n a) (fun ฯ n => ฯ n b) (streamMeasure ฮฝ)Bandits.indepFun_eval_streamMeasure'.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] {a b : ๐} (h : a โ b) : ProbabilityTheory.IndepFun (fun ฯ n => ฯ n a) (fun ฯ n => ฯ n b) (streamMeasure ฮฝ)
Code
lemma indepFun_eval_streamMeasure' (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] {a b : ๐} (h : a โ b) :
IndepFun (fun ฯ n โฆ ฯ n a) (fun ฯ n โฆ ฯ n b) (streamMeasure ฮฝ)Type uses (1)
Body uses (1)
Used by (1)
Actions: Source ยท Open Issue
Proof
indepFun_proj_infinitePi_infinitePi h
probSpace๐
Bandits.ArrayModel.probSpaceProbability space for the array model of stochastic bandits.
Bandits.ArrayModel.probSpace.{u_1, u_2} (๐ : Type u_1) (R : Type u_2) : Type (max u_1 u_2)Bandits.ArrayModel.probSpace.{u_1, u_2} (๐ : Type u_1) (R : Type u_2) : Type (max u_1 u_2)
Code
def probSpace : Type _ := (โ โ I) ร (โ โ ๐ โ R)
Actions: Source ยท Open Issue
instMeasurableSpaceProbSpace๐
Bandits.ArrayModel.instMeasurableSpaceProbSpaceNo docstring.
Bandits.ArrayModel.instMeasurableSpaceProbSpace.{u_3, u_4} {๐ : Type u_3} {R : Type u_4} [MeasurableSpace R] : MeasurableSpace (probSpace ๐ R)Bandits.ArrayModel.instMeasurableSpaceProbSpace.{u_3, u_4} {๐ : Type u_3} {R : Type u_4} [MeasurableSpace R] : MeasurableSpace (probSpace ๐ R)
Code
instance {๐ R : Type*} [MeasurableSpace R] : MeasurableSpace (probSpace ๐ R)Type uses (1)
Actions: Source ยท Open Issue
Proof
inferInstanceAs (MeasurableSpace ((โ โ I) ร (โ โ ๐ โ R)))
instStandardBorelSpaceProbSpaceOfCountable๐
Bandits.ArrayModel.instStandardBorelSpaceProbSpaceOfCountableNo docstring.
Bandits.ArrayModel.instStandardBorelSpaceProbSpaceOfCountable.{u_3, u_4} {๐ : Type u_3} {R : Type u_4} [Countable ๐] [MeasurableSpace R] [StandardBorelSpace R] : StandardBorelSpace (probSpace ๐ R)Bandits.ArrayModel.instStandardBorelSpaceProbSpaceOfCountable.{u_3, u_4} {๐ : Type u_3} {R : Type u_4} [Countable ๐] [MeasurableSpace R] [StandardBorelSpace R] : StandardBorelSpace (probSpace ๐ R)
Code
instance {๐ R : Type*} [Countable ๐] [MeasurableSpace R] [StandardBorelSpace R] :
StandardBorelSpace (probSpace ๐ R)Type uses (2)
Used by (2)
Actions: Source ยท Open Issue
Proof
inferInstanceAs (StandardBorelSpace ((โ โ I) ร (โ โ ๐ โ R)))
arrayMeasure๐
Bandits.ArrayModel.arrayMeasureProbability measure for the array model of stochastic bandits.
Bandits.ArrayModel.arrayMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) : MeasureTheory.Measure (probSpace ๐ R)Bandits.ArrayModel.arrayMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) : MeasureTheory.Measure (probSpace ๐ R)
Code
noncomputable def arrayMeasure (ฮฝ : Kernel ๐ R) : Measure (probSpace ๐ R) := (Measure.infinitePi fun _ โฆ volume).prod (streamMeasure ฮฝ)
Type uses (2)
Body uses (1)
Actions: Source ยท Open Issue
instIsProbabilityMeasureProbSpaceArrayMeasureOfIsMarkovKernel๐
Bandits.ArrayModel.instIsProbabilityMeasureProbSpaceArrayMeasureOfIsMarkovKernelNo docstring.
Bandits.ArrayModel.instIsProbabilityMeasureProbSpaceArrayMeasureOfIsMarkovKernel.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : MeasureTheory.IsProbabilityMeasure (arrayMeasure ฮฝ)Bandits.ArrayModel.instIsProbabilityMeasureProbSpaceArrayMeasureOfIsMarkovKernel.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : MeasureTheory.IsProbabilityMeasure (arrayMeasure ฮฝ)
Code
instance (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] : IsProbabilityMeasure (arrayMeasure ฮฝ)
Type uses (3)
Used by (10)
Actions: Source ยท Open Issue
Proof
Measure.prod.instIsProbabilityMeasure _ _
initAlgFunction๐
Bandits.ArrayModel.initAlgFunctionThe initial action is the image of a uniform random variable by this function.
Bandits.ArrayModel.initAlgFunction.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] (alg : Learning.Algorithm ๐ R) : โunitInterval โ ๐Bandits.ArrayModel.initAlgFunction.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] (alg : Learning.Algorithm ๐ R) : โunitInterval โ ๐
Code
noncomputable def initAlgFunction (alg : Algorithm ๐ R) : I โ ๐ := (Measure.exists_measurable_map_eq alg.p0).choose
Type uses (1)
Body uses (1)
Used by (12)
Actions: Source ยท Open Issue
initAlgFunction_map๐
Bandits.ArrayModel.initAlgFunction_mapNo docstring.
Bandits.ArrayModel.initAlgFunction_map.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] (alg : Learning.Algorithm ๐ R) : MeasureTheory.Measure.map (initAlgFunction alg) MeasureTheory.volume = Learning.Algorithm.p0 algBandits.ArrayModel.initAlgFunction_map.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] (alg : Learning.Algorithm ๐ R) : MeasureTheory.Measure.map (initAlgFunction alg) MeasureTheory.volume = Learning.Algorithm.p0 alg
Code
lemma initAlgFunction_map (alg : Algorithm ๐ R) : volume.map (initAlgFunction alg) = alg.p0
Type uses (2)
Body uses (1)
Used by (1)
Actions: Source ยท Open Issue
Proof
(Measure.exists_measurable_map_eq alg.p0).choose_spec.2
measurable_initAlgFunction๐
Bandits.ArrayModel.measurable_initAlgFunctionNo docstring.
Bandits.ArrayModel.measurable_initAlgFunction.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] (alg : Learning.Algorithm ๐ R) : Measurable (initAlgFunction alg)Bandits.ArrayModel.measurable_initAlgFunction.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] (alg : Learning.Algorithm ๐ R) : Measurable (initAlgFunction alg)
Code
lemma measurable_initAlgFunction (alg : Algorithm ๐ R) :
Measurable (initAlgFunction alg)Type uses (2)
Body uses (1)
Actions: Source ยท Open Issue
Proof
(Measure.exists_measurable_map_eq alg.p0).choose_spec.1
algFunction๐
Bandits.ArrayModel.algFunctionThe next action is the image of the history and a uniform random variable by this function.
Bandits.ArrayModel.algFunction.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : (โฅ(Finset.Iic n) โ ๐ ร R) โ โunitInterval โ ๐Bandits.ArrayModel.algFunction.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : (โฅ(Finset.Iic n) โ ๐ ร R) โ โunitInterval โ ๐
Code
noncomputable
def algFunction (alg : Algorithm ๐ R) (n : โ) :
(Iic n โ ๐ ร R) โ I โ ๐ :=
(Kernel.exists_measurable_map_eq_unitInterval (alg.policy n)).chooseType uses (1)
Body uses (1)
Used by (17)
Actions: Source ยท Open Issue
algFunction_map๐
Bandits.ArrayModel.algFunction_mapNo docstring.
Bandits.ArrayModel.algFunction_map.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] (alg : Learning.Algorithm ๐ R) (n : โ) (h : โฅ(Finset.Iic n) โ ๐ ร R) : MeasureTheory.Measure.map (algFunction alg n h) MeasureTheory.volume = (Learning.Algorithm.policy alg n) hBandits.ArrayModel.algFunction_map.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] (alg : Learning.Algorithm ๐ R) (n : โ) (h : โฅ(Finset.Iic n) โ ๐ ร R) : MeasureTheory.Measure.map (algFunction alg n h) MeasureTheory.volume = (Learning.Algorithm.policy alg n) h
Code
lemma algFunction_map (alg : Algorithm ๐ R) (n : โ) (h : Iic n โ ๐ ร R) :
volume.map (algFunction alg n h) = alg.policy n hType uses (2)
Body uses (1)
Used by (1)
Actions: Source ยท Open Issue
Proof
(Kernel.exists_measurable_map_eq_unitInterval (alg.policy n)).choose_spec.2 h
measurable_algFunction๐
Bandits.ArrayModel.measurable_algFunctionNo docstring.
Bandits.ArrayModel.measurable_algFunction.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : Measurable (Function.uncurry (algFunction alg n))Bandits.ArrayModel.measurable_algFunction.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : Measurable (Function.uncurry (algFunction alg n))
Code
lemma measurable_algFunction (alg : Algorithm ๐ R) (n : โ) :
Measurable (Function.uncurry (algFunction alg n))Type uses (2)
Body uses (1)
Used by (4)
Actions: Source ยท Open Issue
Proof
(Kernel.exists_measurable_map_eq_unitInterval (alg.policy n)).choose_spec.1
hist๐
Bandits.ArrayModel.hist
History of actions and rewards up to time n in the array model.
Bandits.ArrayModel.hist.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (ฯ : probSpace ๐ R) (n : โ) : โฅ(Finset.Iic n) โ ๐ ร RBandits.ArrayModel.hist.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (ฯ : probSpace ๐ R) (n : โ) : โฅ(Finset.Iic n) โ ๐ ร R
Code
noncomputable def hist [DecidableEq ๐] (alg : Algorithm ๐ R) (ฯ : probSpace ๐ R) : (n : โ) โ Iic n โ ๐ ร R | 0 => fun _ โฆ (initAlgFunction alg (ฯ.1 0), ฯ.2 0 (initAlgFunction alg (ฯ.1 0))) | n + 1 => let hn : Iic n โ ๐ ร R := hist alg ฯ n let a : ๐ := algFunction alg n hn (ฯ.1 (n + 1)) fun i โฆ if hin : i โค n then hn โจi, by simp [hin]โฉ else (a, ฯ.2 (pullCount' n hn a) a)
Body uses (3)
Used by (30)
Actions: Source ยท Open Issue
hist_zero๐
Bandits.ArrayModel.hist_zeroNo docstring.
Bandits.ArrayModel.hist_zero.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (ฯ : probSpace ๐ R) : hist alg ฯ 0 = fun x => (initAlgFunction alg (Prod.fst ฯ 0), Prod.snd ฯ 0 (initAlgFunction alg (Prod.fst ฯ 0)))Bandits.ArrayModel.hist_zero.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (ฯ : probSpace ๐ R) : hist alg ฯ 0 = fun x => (initAlgFunction alg (Prod.fst ฯ 0), Prod.snd ฯ 0 (initAlgFunction alg (Prod.fst ฯ 0)))
Code
lemma hist_zero [DecidableEq ๐] (alg : Algorithm ๐ R) (ฯ : probSpace ๐ R) :
hist alg ฯ 0 = fun _ โฆ (initAlgFunction alg (ฯ.1 0), ฯ.2 0 (initAlgFunction alg (ฯ.1 0)))Type uses (4)
Actions: Source ยท Open Issue
Proof
rfl
hist_add_one๐
Bandits.ArrayModel.hist_add_oneNo docstring.
Bandits.ArrayModel.hist_add_one.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (ฯ : probSpace ๐ R) (n : โ) : have a := algFunction alg n (hist alg ฯ n) (Prod.fst ฯ (n + 1)); hist alg ฯ (n + 1) = fun i => if hin : โi โค n then hist alg ฯ n โจโi, โฏโฉ else (a, Prod.snd ฯ (Learning.pullCount' n (hist alg ฯ n) a) a)Bandits.ArrayModel.hist_add_one.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (ฯ : probSpace ๐ R) (n : โ) : have a := algFunction alg n (hist alg ฯ n) (Prod.fst ฯ (n + 1)); hist alg ฯ (n + 1) = fun i => if hin : โi โค n then hist alg ฯ n โจโi, โฏโฉ else (a, Prod.snd ฯ (Learning.pullCount' n (hist alg ฯ n) a) a)
Code
lemma hist_add_one [DecidableEq ๐] (alg : Algorithm ๐ R) (ฯ : probSpace ๐ R) (n : โ) :
let a : ๐Type uses (5)
Actions: Source ยท Open Issue
Proof
algFunction alg n (hist alg ฯ n) (ฯ.1 (n + 1))
hist alg ฯ (n + 1) =
fun (i : Iic (n + 1)) โฆ if hin : i โค n then hist alg ฯ n โจi, by simp [hin]โฉ
else (a, ฯ.2 (pullCount' n (hist alg ฯ n) a) a) := rfl
hist_eq๐
Bandits.ArrayModel.hist_eqNo docstring.
Bandits.ArrayModel.hist_eq.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (ฯ : probSpace ๐ R) (n : โ) : hist alg ฯ n = fun i => hist alg ฯ โi โจโi, โฏโฉBandits.ArrayModel.hist_eq.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (ฯ : probSpace ๐ R) (n : โ) : hist alg ฯ n = fun i => hist alg ฯ โi โจโi, โฏโฉ
Code
lemma hist_eq [DecidableEq ๐] (alg : Algorithm ๐ R) (ฯ : probSpace ๐ R) (n : โ) :
hist alg ฯ n = fun i : Iic n โฆ hist alg ฯ i โจi.1, by simpโฉBody uses (4)
Used by (7)
Actions: Source ยท Open Issue
Proof
by
induction n with
| zero =>
ext i : 1
simp only [hist]
rw [Unique.eq_default i]
simp [coe_default_Iic_zero]
| succ n hn =>
ext i : 1
by_cases hin : i โค n
ยท rw [hist_add_one]
simp only [hin, โreduceDIte]
rw [funext_iff] at hn
simp_rw [hn]
ยท grind
hist_add_one_eq_IicSuccProd'๐
Bandits.ArrayModel.hist_add_one_eq_IicSuccProd'No docstring.
Bandits.ArrayModel.hist_add_one_eq_IicSuccProd'.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (ฯ : probSpace ๐ R) (n : โ) : have a := algFunction alg n (hist alg ฯ n) (Prod.fst ฯ (n + 1)); hist alg ฯ (n + 1) = (MeasurableEquiv.symm (MeasurableEquiv.IicSuccProd (fun x => ๐ ร R) n)) (hist alg ฯ n, a, Prod.snd ฯ (Learning.pullCount' n (hist alg ฯ n) a) a)Bandits.ArrayModel.hist_add_one_eq_IicSuccProd'.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (ฯ : probSpace ๐ R) (n : โ) : have a := algFunction alg n (hist alg ฯ n) (Prod.fst ฯ (n + 1)); hist alg ฯ (n + 1) = (MeasurableEquiv.symm (MeasurableEquiv.IicSuccProd (fun x => ๐ ร R) n)) (hist alg ฯ n, a, Prod.snd ฯ (Learning.pullCount' n (hist alg ฯ n) a) a)
Code
lemma hist_add_one_eq_IicSuccProd' [DecidableEq ๐] (alg : Algorithm ๐ R) (ฯ : probSpace ๐ R)
(n : โ) :
let a : ๐Type uses (6)
Body uses (1)
Used by (1)
Actions: Source ยท Open Issue
Proof
algFunction alg n (hist alg ฯ n) (ฯ.1 (n + 1))
hist alg ฯ (n + 1) =
(MeasurableEquiv.IicSuccProd (fun _ โฆ ๐ ร R) n).symm
(hist alg ฯ n, (a, ฯ.2 (pullCount' n (hist alg ฯ n) a) a)) := by
intro a
rw [hist_add_one]
ext i : 1
simp only [Kernel.symm_IicSuccProd, MeasurableEquiv.prodCongr, MeasurableEquiv.refl_toEquiv,
MeasurableEquiv.piSingleton, eq_rec_constant, MeasurableEquiv.IicProdIoc,
MeasurableEquiv.trans_apply, MeasurableEquiv.coe_mk, Equiv.prodCongr_apply, Equiv.coe_refl,
Equiv.coe_fn_mk, Prod.map_apply, id_eq]
rfl
action๐
Bandits.ArrayModel.action
Action taken at time n in the array model.
Bandits.ArrayModel.action.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (n : โ) (ฯ : probSpace ๐ R) : ๐Bandits.ArrayModel.action.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (n : โ) (ฯ : probSpace ๐ R) : ๐
Code
noncomputable def action [DecidableEq ๐] (alg : Algorithm ๐ R) (n : โ) (ฯ : probSpace ๐ R) : ๐ := (hist alg ฯ n โจn, by simpโฉ).1
Body uses (1)
Used by (43)
Actions: Source ยท Open Issue
action_zero๐
Bandits.ArrayModel.action_zeroNo docstring.
Bandits.ArrayModel.action_zero.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) : action alg 0 = fun ฯ => initAlgFunction alg (Prod.fst ฯ 0)Bandits.ArrayModel.action_zero.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) : action alg 0 = fun ฯ => initAlgFunction alg (Prod.fst ฯ 0)
Code
lemma action_zero [DecidableEq ๐] (alg : Algorithm ๐ R) :
action alg 0 = fun ฯ โฆ initAlgFunction alg (ฯ.1 0)Type uses (4)
Used by (3)
Actions: Source ยท Open Issue
Proof
by ext simp [action, hist_zero]
action_add_one_eq๐
Bandits.ArrayModel.action_add_one_eqNo docstring.
Bandits.ArrayModel.action_add_one_eq.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : action alg (n + 1) = fun ฯ => algFunction alg n (hist alg ฯ n) (Prod.fst ฯ (n + 1))Bandits.ArrayModel.action_add_one_eq.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : action alg (n + 1) = fun ฯ => algFunction alg n (hist alg ฯ n) (Prod.fst ฯ (n + 1))
Code
lemma action_add_one_eq [DecidableEq ๐] (alg : Algorithm ๐ R) (n : โ) :
action alg (n + 1) = fun ฯ โฆ algFunction alg n (hist alg ฯ n) (ฯ.1 (n + 1))Type uses (5)
Body uses (2)
Used by (7)
Actions: Source ยท Open Issue
Proof
by ext ฯ rw [action, hist_add_one] simp only [add_le_iff_nonpos_right, nonpos_iff_eq_zero, one_ne_zero, โreduceDIte]
reward๐
Bandits.ArrayModel.reward
Reward received at time n in the array model.
Bandits.ArrayModel.reward.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (n : โ) (ฯ : probSpace ๐ R) : RBandits.ArrayModel.reward.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (n : โ) (ฯ : probSpace ๐ R) : R
Code
noncomputable def reward [DecidableEq ๐] (alg : Algorithm ๐ R) (n : โ) (ฯ : probSpace ๐ R) : R := (hist alg ฯ n โจn, by simpโฉ).2
Body uses (1)
Used by (24)
Actions: Source ยท Open Issue
reward_zero๐
Bandits.ArrayModel.reward_zeroNo docstring.
Bandits.ArrayModel.reward_zero.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) : reward alg 0 = fun ฯ => Prod.snd ฯ 0 (action alg 0 ฯ)Bandits.ArrayModel.reward_zero.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) : reward alg 0 = fun ฯ => Prod.snd ฯ 0 (action alg 0 ฯ)
Code
lemma reward_zero [DecidableEq ๐] (alg : Algorithm ๐ R) :
reward alg 0 = fun ฯ โฆ ฯ.2 0 (action alg 0 ฯ)Body uses (2)
Used by (2)
Actions: Source ยท Open Issue
Proof
by ext simp [reward, hist_zero, action_zero]
reward_add_one๐
Bandits.ArrayModel.reward_add_oneNo docstring.
Bandits.ArrayModel.reward_add_one.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : reward alg (n + 1) = fun ฯ => Prod.snd ฯ (Learning.pullCount' n (hist alg ฯ n) (action alg (n + 1) ฯ)) (action alg (n + 1) ฯ)Bandits.ArrayModel.reward_add_one.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : reward alg (n + 1) = fun ฯ => Prod.snd ฯ (Learning.pullCount' n (hist alg ฯ n) (action alg (n + 1) ฯ)) (action alg (n + 1) ฯ)
Code
lemma reward_add_one [DecidableEq ๐] (alg : Algorithm ๐ R) (n : โ) :
reward alg (n + 1) =
fun ฯ โฆ ฯ.2 (pullCount' n (hist alg ฯ n) (action alg (n + 1) ฯ)) (action alg (n + 1) ฯ)Body uses (3)
Used by (2)
Actions: Source ยท Open Issue
Proof
by ext ฯ simp [reward, hist_add_one, action_add_one_eq]
reward_eq๐
Bandits.ArrayModel.reward_eqNo docstring.
Bandits.ArrayModel.reward_eq.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : reward alg n = fun ฯ => Prod.snd ฯ (Learning.pullCount (action alg) (action alg n ฯ) n ฯ) (action alg n ฯ)Bandits.ArrayModel.reward_eq.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : reward alg n = fun ฯ => Prod.snd ฯ (Learning.pullCount (action alg) (action alg n ฯ) n ฯ) (action alg n ฯ)
Code
lemma reward_eq [DecidableEq ๐] (alg : Algorithm ๐ R) (n : โ) :
reward alg n = fun ฯ โฆ ฯ.2 (pullCount (action alg) (action alg n ฯ) n ฯ) (action alg n ฯ)Body uses (11)
Actions: Source ยท Open Issue
Proof
by
cases n with
| zero => ext; simp [reward_zero, action_zero]
| succ n =>
ext ฯ
rw [reward, hist_add_one]
simp only [add_le_iff_nonpos_right, nonpos_iff_eq_zero, one_ne_zero, โreduceDIte]
rw [action_add_one_eq, pullCount_eq_pullCount' (R' := reward alg) (by simp)]
simp only [Nat.add_one_sub_one]
rw [hist_eq]
rfl
sumRewards_eq๐
Bandits.ArrayModel.sumRewards_eqNo docstring.
Bandits.ArrayModel.sumRewards_eq.{u_1} {๐ : Type u_1} {m๐ : MeasurableSpace ๐} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ โ) (a : ๐) (n : โ) (ฯ : probSpace ๐ โ) : Learning.sumRewards (action alg) (reward alg) a n ฯ = โ i โ Finset.range (Learning.pullCount (action alg) a n ฯ), Prod.snd ฯ i aBandits.ArrayModel.sumRewards_eq.{u_1} {๐ : Type u_1} {m๐ : MeasurableSpace ๐} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ โ) (a : ๐) (n : โ) (ฯ : probSpace ๐ โ) : Learning.sumRewards (action alg) (reward alg) a n ฯ = โ i โ Finset.range (Learning.pullCount (action alg) a n ฯ), Prod.snd ฯ i a
Code
lemma sumRewards_eq [DecidableEq ๐] (alg : Algorithm ๐ โ) (a : ๐) (n : โ) (ฯ : probSpace ๐ โ) :
sumRewards (action alg) (reward alg) a n ฯ =
โ i โ range (pullCount (action alg) a n ฯ), ฯ.2 i aBody uses (6)
Used by (3)
Actions: Source ยท Open Issue
Proof
by
induction n with
| zero => simp
| succ n ih =>
by_cases ha : action alg n ฯ = a
ยท simp [ha, sumRewards_add_one, pullCount_add_one, sum_range_succ, ih, reward_eq]
ยท simp [ha, sumRewards_add_one, pullCount_eq_pullCount_of_action_ne, ih]
measurable_action_add_one'๐
Bandits.ArrayModel.measurable_action_add_one'No docstring.
Bandits.ArrayModel.measurable_action_add_one'.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] {alg : Learning.Algorithm ๐ R} (n : โ) (h : Measurable fun x => hist alg x n) : Measurable fun x => algFunction alg n (hist alg x n) (Prod.fst x (n + 1))Bandits.ArrayModel.measurable_action_add_one'.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] {alg : Learning.Algorithm ๐ R} (n : โ) (h : Measurable fun x => hist alg x n) : Measurable fun x => algFunction alg n (hist alg x n) (Prod.fst x (n + 1))
Code
lemma measurable_action_add_one' [DecidableEq ๐] {alg : Algorithm ๐ R}
(n : โ) (h : Measurable (hist alg ยท n)) :
Measurable (fun x โฆ algFunction alg n (hist alg x n) (x.1 (n + 1)))Type uses (5)
Body uses (1)
Used by (1)
Actions: Source ยท Open Issue
Proof
by fun_prop
measurable_pullCount'_action_add_one๐
Bandits.ArrayModel.measurable_pullCount'_action_add_oneNo docstring.
Bandits.ArrayModel.measurable_pullCount'_action_add_one.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] {alg : Learning.Algorithm ๐ R} (n : โ) (h_hist : Measurable fun x => hist alg x n) : Measurable fun x => Learning.pullCount' n (hist alg x n) (algFunction alg n (hist alg x n) (Prod.fst x (n + 1)))Bandits.ArrayModel.measurable_pullCount'_action_add_one.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] {alg : Learning.Algorithm ๐ R} (n : โ) (h_hist : Measurable fun x => hist alg x n) : Measurable fun x => Learning.pullCount' n (hist alg x n) (algFunction alg n (hist alg x n) (Prod.fst x (n + 1)))
Code
lemma measurable_pullCount'_action_add_one [DecidableEq ๐] {alg : Algorithm ๐ R}
(n : โ) (h_hist : Measurable (hist alg ยท n)) :
Measurable (fun x โฆ
pullCount' n (hist alg x n) (algFunction alg n (hist alg x n) (x.1 (n + 1))))Type uses (6)
Body uses (3)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
have h_alg_meas : Measurable (fun x โฆ algFunction alg n (hist alg x n) (x.1 (n + 1))) :=
measurable_action_add_one' n h_hist
exact (measurable_uncurry_pullCount' (๐ := ๐) n).comp (h_hist.prodMk h_alg_meas)
measurable_hist๐
Bandits.ArrayModel.measurable_histNo docstring.
Bandits.ArrayModel.measurable_hist.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : Measurable fun ฯ => hist alg ฯ nBandits.ArrayModel.measurable_hist.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : Measurable fun ฯ => hist alg ฯ n
Code
lemma measurable_hist [DecidableEq ๐] [Countable ๐] (alg : Algorithm ๐ R) (n : โ) :
Measurable (fun ฯ โฆ hist alg ฯ n)Type uses (4)
Body uses (7)
Used by (8)
Actions: Source ยท Open Issue
Proof
by
induction n with
| zero =>
simp_rw [hist_zero, measurable_pi_iff]
refine fun _ โฆ Measurable.prodMk (by fun_prop) ?_
change Measurable ((fun x : ๐ ร ((โ โ I) ร (โ โ ๐ โ R)) โฆ x.2.2 0 x.1) โ
(fun x : (โ โ I) ร (โ โ ๐ โ R) โฆ (initAlgFunction alg (x.1 0), x)))
have : Measurable (fun x : ๐ ร ((โ โ I) ร (โ โ ๐ โ R)) โฆ x.2.2 0 x.1) :=
measurable_from_prod_countable_right fun p โฆ by simp only; fun_prop
exact Measurable.comp (by fun_prop) (Measurable.prodMk (by fun_prop) (by fun_prop))
| succ n hn =>
refine measurable_pi_iff.mpr fun i โฆ ?_
by_cases hin : i โค n
ยท simp only [hist, hin, โreduceDIte]
rw [measurable_pi_iff] at hn
exact hn โจi.1, by simp [hin]โฉ
ยท simp only [hist, hin, โreduceDIte]
refine Measurable.prodMk (by fun_prop) ?_
change Measurable ((fun (x : (โ โ ๐ โ R) ร โ ร ๐) โฆ x.1 x.2.1 x.2.2) โ
(fun x โฆ (x.2, pullCount' n (hist alg x n) (algFunction alg n (hist alg x n) (x.1 (n + 1))),
(algFunction alg n (hist alg x n) (x.1 (n + 1))))))
have h1 : Measurable (fun (x : (โ โ ๐ โ R) ร โ ร ๐) โฆ x.1 x.2.1 x.2.2) :=
measurable_from_prod_countable_left fun p : โ ร ๐ โฆ (by simp only; fun_prop)
refine Measurable.comp (by fun_prop) (Measurable.prodMk (by fun_prop) ?_)
refine Measurable.prodMk ?_ (by fun_prop)
exact measurable_pullCount'_action_add_one n hn
measurable_action๐
Bandits.ArrayModel.measurable_actionNo docstring.
Bandits.ArrayModel.measurable_action.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : Measurable (action alg n)Bandits.ArrayModel.measurable_action.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : Measurable (action alg n)
Code
lemma measurable_action [DecidableEq ๐] [Countable ๐] (alg : Algorithm ๐ R) (n : โ) :
Measurable (action alg n)Type uses (4)
Body uses (2)
Used by (13)
Actions: Source ยท Open Issue
Proof
by unfold action; fun_prop
measurable_reward๐
Bandits.ArrayModel.measurable_rewardNo docstring.
Bandits.ArrayModel.measurable_reward.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : Measurable (reward alg n)Bandits.ArrayModel.measurable_reward.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : Measurable (reward alg n)
Code
lemma measurable_reward [DecidableEq ๐] [Countable ๐] (alg : Algorithm ๐ R) (n : โ) :
Measurable (reward alg n)Type uses (4)
Body uses (2)
Used by (8)
Actions: Source ยท Open Issue
Proof
by unfold reward; fun_prop
hist_add_one_eq_IicSuccProd๐
Bandits.ArrayModel.hist_add_one_eq_IicSuccProdNo docstring.
Bandits.ArrayModel.hist_add_one_eq_IicSuccProd.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (ฯ : probSpace ๐ R) (n : โ) : hist alg ฯ (n + 1) = (MeasurableEquiv.symm (MeasurableEquiv.IicSuccProd (fun x => ๐ ร R) n)) (hist alg ฯ n, action alg (n + 1) ฯ, reward alg (n + 1) ฯ)Bandits.ArrayModel.hist_add_one_eq_IicSuccProd.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (ฯ : probSpace ๐ R) (n : โ) : hist alg ฯ (n + 1) = (MeasurableEquiv.symm (MeasurableEquiv.IicSuccProd (fun x => ๐ ร R) n)) (hist alg ฯ n, action alg (n + 1) ฯ, reward alg (n + 1) ฯ)
Code
lemma hist_add_one_eq_IicSuccProd [DecidableEq ๐] (alg : Algorithm ๐ R) (ฯ : probSpace ๐ R)
(n : โ) :
hist alg ฯ (n + 1) =
(MeasurableEquiv.IicSuccProd (fun _ โฆ ๐ ร R) n).symm
(hist alg ฯ n, (action alg (n + 1) ฯ, reward alg (n + 1) ฯ))Used by (1)
Actions: Source ยท Open Issue
Proof
by rw [hist_add_one_eq_IicSuccProd', reward_add_one, action_add_one_eq]
measurable_pullCount_action_add_one๐
Bandits.ArrayModel.measurable_pullCount_action_add_oneNo docstring.
Bandits.ArrayModel.measurable_pullCount_action_add_one.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : Measurable fun ฯ => Learning.pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯBandits.ArrayModel.measurable_pullCount_action_add_one.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : Measurable fun ฯ => Learning.pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ
Code
lemma measurable_pullCount_action_add_one [DecidableEq ๐] [Countable ๐] (alg : Algorithm ๐ R)
(n : โ) :
Measurable (fun ฯ โฆ pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)Type uses (5)
Body uses (3)
Used by (6)
Actions: Source ยท Open Issue
Proof
by
change Measurable ((fun p : (probSpace ๐ R) ร ๐ โฆ pullCount (action alg) p.2 (n + 1) p.1) โ
(fun ฯ : probSpace ๐ R โฆ (ฯ, action alg (n + 1) ฯ)))
exact (measurable_uncurry_pullCount (by fun_prop) _).comp (by fun_prop)
hist_congr๐
Bandits.ArrayModel.hist_congrNo docstring.
Bandits.ArrayModel.hist_congr.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (n : โ) {ฯ ฯ' : probSpace ๐ R} (hฯ1 : โ i โค n, Prod.fst ฯ i = Prod.fst ฯ' i) (hฯ2 : โ (i : โ) (a : ๐), i < Learning.pullCount (action alg) a (n + 1) ฯ โ Prod.snd ฯ i a = Prod.snd ฯ' i a) : hist alg ฯ n = hist alg ฯ' nBandits.ArrayModel.hist_congr.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (n : โ) {ฯ ฯ' : probSpace ๐ R} (hฯ1 : โ i โค n, Prod.fst ฯ i = Prod.fst ฯ' i) (hฯ2 : โ (i : โ) (a : ๐), i < Learning.pullCount (action alg) a (n + 1) ฯ โ Prod.snd ฯ i a = Prod.snd ฯ' i a) : hist alg ฯ n = hist alg ฯ' n
Code
lemma hist_congr (alg : Algorithm ๐ R) (n : โ) {ฯ ฯ' : probSpace ๐ R}
(hฯ1 : โ i โค n, ฯ.1 i = ฯ'.1 i)
(hฯ2 : โ i a, i < pullCount (action alg) a (n + 1) ฯ โ ฯ.2 i a = ฯ'.2 i a) :
hist alg ฯ n = hist alg ฯ' nBody uses (13)
Actions: Source ยท Open Issue
Proof
by
induction n with
| zero =>
simp only [zero_add, pullCount_one] at hฯ2
simp_rw [hist_zero]
ext i : 1
simp only [le_refl, hฯ1, Prod.mk.injEq, true_and]
refine hฯ2 0 _ ?_
simp [action, hฯ1]
| succ n hn =>
simp_rw [hist_add_one_eq_IicSuccProd]
specialize hn fun i hin โฆ hฯ1 i (by grind)
have h_hist : hist alg ฯ n = hist alg ฯ' n := by
refine hn fun i a hi โฆ hฯ2 i a (hi.trans_le ?_)
exact pullCount_mono _ (by lia) _
have h_action : action alg (n + 1) ฯ = action alg (n + 1) ฯ' := by
simp_rw [action_add_one_eq]
rw [h_hist, hฯ1 _ le_rfl]
congr 3
simp only [reward_add_one, h_hist, h_action]
refine hฯ2 _ _ ?_
rw [pullCount_add_one, h_action]
simp only [โreduceIte]
rw [pullCount_eq_pullCount' (R' := reward alg) (by simp)]
simp only [Nat.add_one_sub_one]
rw [โ h_hist, hist_eq]
change pullCount' n (fun i โฆ (action alg i ฯ, reward alg i ฯ)) (action alg (n + 1) ฯ') <
pullCount' n (fun i โฆ (action alg i ฯ, reward alg i ฯ)) (action alg (n + 1) ฯ') + 1
grind
stepsUntil_congr_aux๐
Bandits.ArrayModel.stepsUntil_congr_auxNo docstring.
Bandits.ArrayModel.stepsUntil_congr_aux.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (a : ๐) (m n : โ) {ฯ ฯ' : probSpace ๐ R} (hฯ1 : โ (i : โ), Prod.fst ฯ i = Prod.fst ฯ' i) (hฯ2_ne : โ (i : โ) (b : ๐), b โ a โ Prod.snd ฯ i b = Prod.snd ฯ' i b) (hฯ2_eq : โ (i : โ), i + 1 โค m โ Prod.snd ฯ i a = Prod.snd ฯ' i a) (h_eq : action alg (n + 1) ฯ = a โง Learning.pullCount (action alg) a (n + 1) ฯ = m) : action alg (n + 1) ฯ' = a โง Learning.pullCount (action alg) a (n + 1) ฯ' = mBandits.ArrayModel.stepsUntil_congr_aux.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (a : ๐) (m n : โ) {ฯ ฯ' : probSpace ๐ R} (hฯ1 : โ (i : โ), Prod.fst ฯ i = Prod.fst ฯ' i) (hฯ2_ne : โ (i : โ) (b : ๐), b โ a โ Prod.snd ฯ i b = Prod.snd ฯ' i b) (hฯ2_eq : โ (i : โ), i + 1 โค m โ Prod.snd ฯ i a = Prod.snd ฯ' i a) (h_eq : action alg (n + 1) ฯ = a โง Learning.pullCount (action alg) a (n + 1) ฯ = m) : action alg (n + 1) ฯ' = a โง Learning.pullCount (action alg) a (n + 1) ฯ' = m
Code
lemma stepsUntil_congr_aux (alg : Algorithm ๐ R)
(a : ๐) (m n : โ) {ฯ ฯ' : probSpace ๐ R}
(hฯ1 : โ i, ฯ.1 i = ฯ'.1 i) (hฯ2_ne : โ i b, b โ a โ ฯ.2 i b = ฯ'.2 i b)
(hฯ2_eq : โ i, i + 1 โค m โ ฯ.2 i a = ฯ'.2 i a)
(h_eq : action alg (n + 1) ฯ = a โง pullCount (action alg) a (n + 1) ฯ = m) :
action alg (n + 1) ฯ' = a โง pullCount (action alg) a (n + 1) ฯ' = mBody uses (6)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
obtain โจh_action, h_pcโฉ := h_eq
have h_hist := hist_congr alg n (ฯ := ฯ) (ฯ' := ฯ') (by grind) fun i b hi โฆ ?_
swap
ยท rcases eq_or_ne b a with (rfl | hba)
ยท refine hฯ2_eq i ?_
rw [h_pc] at hi
grind
ยท grind
constructor
ยท rw [โ h_action, action_add_one_eq]
simp [h_hist, hฯ1]
ยท simp_rw [โ h_pc, pullCount_eq_sum]
refine Finset.sum_congr rfl fun i hi โฆ ?_
congr 2
rw [hist_eq _ _ n, hist_eq _ _ n, funext_iff] at h_hist
unfold action
specialize h_hist โจi, by grindโฉ
simp only at h_hist
rw [h_hist]
stepsUntil_congr๐
Bandits.ArrayModel.stepsUntil_congrNo docstring.
Bandits.ArrayModel.stepsUntil_congr.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (a : ๐) (m n : โ) {ฯ ฯ' : probSpace ๐ R} (hฯ1 : โ (i : โ), Prod.fst ฯ i = Prod.fst ฯ' i) (hฯ2_ne : โ (i : โ) (b : ๐), b โ a โ Prod.snd ฯ i b = Prod.snd ฯ' i b) (hฯ2_eq : โ (i : โ), i + 1 โค m โ Prod.snd ฯ i a = Prod.snd ฯ' i a) : action alg (n + 1) ฯ = a โง Learning.pullCount (action alg) a (n + 1) ฯ = m โ action alg (n + 1) ฯ' = a โง Learning.pullCount (action alg) a (n + 1) ฯ' = mBandits.ArrayModel.stepsUntil_congr.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (a : ๐) (m n : โ) {ฯ ฯ' : probSpace ๐ R} (hฯ1 : โ (i : โ), Prod.fst ฯ i = Prod.fst ฯ' i) (hฯ2_ne : โ (i : โ) (b : ๐), b โ a โ Prod.snd ฯ i b = Prod.snd ฯ' i b) (hฯ2_eq : โ (i : โ), i + 1 โค m โ Prod.snd ฯ i a = Prod.snd ฯ' i a) : action alg (n + 1) ฯ = a โง Learning.pullCount (action alg) a (n + 1) ฯ = m โ action alg (n + 1) ฯ' = a โง Learning.pullCount (action alg) a (n + 1) ฯ' = m
Code
lemma stepsUntil_congr (alg : Algorithm ๐ R) (a : ๐) (m n : โ) {ฯ ฯ' : probSpace ๐ R}
(hฯ1 : โ i, ฯ.1 i = ฯ'.1 i) (hฯ2_ne : โ i b, b โ a โ ฯ.2 i b = ฯ'.2 i b)
(hฯ2_eq : โ i, i + 1 โค m โ ฯ.2 i a = ฯ'.2 i a) :
(action alg (n + 1) ฯ = a โง pullCount (action alg) a (n + 1) ฯ = m) โ
(action alg (n + 1) ฯ' = a โง pullCount (action alg) a (n + 1) ฯ' = m)Body uses (1)
Used by (1)
Actions: Source ยท Open Issue
Proof
โจstepsUntil_congr_aux alg a m n hฯ1 hฯ2_ne hฯ2_eq,
stepsUntil_congr_aux alg a m n (by grind) (by grind) (by grind)โฉ
stepsUntil_indicator_congr๐
Bandits.ArrayModel.stepsUntil_indicator_congrNo docstring.
Bandits.ArrayModel.stepsUntil_indicator_congr.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (a : ๐) (m n : โ) {ฯ ฯ' : probSpace ๐ R} (hฯ1 : โ (i : โ), Prod.fst ฯ i = Prod.fst ฯ' i) (hฯ2_ne : โ (i : โ) (b : ๐), b โ a โ Prod.snd ฯ i b = Prod.snd ฯ' i b) (hฯ2_eq : โ (i : โ), i + 1 โค m โ Prod.snd ฯ i a = Prod.snd ฯ' i a) : Set.indicator {ฯ | action alg (n + 1) ฯ = a โง Learning.pullCount (action alg) a (n + 1) ฯ = m} (fun x => 1) ฯ = Set.indicator {ฯ | action alg (n + 1) ฯ = a โง Learning.pullCount (action alg) a (n + 1) ฯ = m} (fun x => 1) ฯ'Bandits.ArrayModel.stepsUntil_indicator_congr.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (a : ๐) (m n : โ) {ฯ ฯ' : probSpace ๐ R} (hฯ1 : โ (i : โ), Prod.fst ฯ i = Prod.fst ฯ' i) (hฯ2_ne : โ (i : โ) (b : ๐), b โ a โ Prod.snd ฯ i b = Prod.snd ฯ' i b) (hฯ2_eq : โ (i : โ), i + 1 โค m โ Prod.snd ฯ i a = Prod.snd ฯ' i a) : Set.indicator {ฯ | action alg (n + 1) ฯ = a โง Learning.pullCount (action alg) a (n + 1) ฯ = m} (fun x => 1) ฯ = Set.indicator {ฯ | action alg (n + 1) ฯ = a โง Learning.pullCount (action alg) a (n + 1) ฯ = m} (fun x => 1) ฯ'
Code
lemma stepsUntil_indicator_congr (alg : Algorithm ๐ R) (a : ๐) (m n : โ) {ฯ ฯ' : probSpace ๐ R}
(hฯ1 : โ i, ฯ.1 i = ฯ'.1 i) (hฯ2_ne : โ i b, b โ a โ ฯ.2 i b = ฯ'.2 i b)
(hฯ2_eq : โ i, i + 1 โค m โ ฯ.2 i a = ฯ'.2 i a) :
{ฯ | action alg (n + 1) ฯ = a โง pullCount (action alg) a (n + 1) ฯ = m}.indicator (fun _ โฆ 1)
ฯ =
{ฯ | action alg (n + 1) ฯ = a โง pullCount (action alg) a (n + 1) ฯ = m}.indicator
(fun _ โฆ 1) ฯ'Body uses (1)
Used by (1)
Actions: Source ยท Open Issue
Proof
by simp only [Set.indicator_apply, Set.mem_setOf_eq] simp_rw [stepsUntil_congr alg a m n hฯ1 hฯ2_ne hฯ2_eq]
measurable_hist_comap๐
Bandits.ArrayModel.measurable_hist_comapNo docstring.
Bandits.ArrayModel.measurable_hist_comap.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : Measurable fun x => hist alg x nBandits.ArrayModel.measurable_hist_comap.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : Measurable fun x => hist alg x n
Code
lemma measurable_hist_comap [Countable ๐] (alg : Algorithm ๐ R) (n : โ) :
Measurable[MeasurableSpace.comap (fun ฯ โฆ (fun (i : Iic n) โฆ ฯ.1 i, ฯ.2)) inferInstance]
(hist alg ยท n)Body uses (6)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
have h_eq : (hist alg ยท n) =
((hist alg ยท n) โ (fun p โฆ (fun i : โ โฆ p.1 โจmin i n, by grindโฉ, p.2))) โ
(fun ฯ โฆ (fun (i : Iic n) โฆ ฯ.1 i, ฯ.2)) := by
ext ฯ : 1
exact hist_congr alg n (by grind) (by simp)
rw [h_eq]
refine measurable_comp_comap _ (Measurable.comp (by fun_prop) ?_)
refine Measurable.prodMk ?_ (by fun_prop)
rw [measurable_pi_iff]
intro i
change Measurable ((fun p โฆ p โจmin i n, by simpโฉ) โ (fun x : (Iic n โ I) ร (โ โ ๐ โ R) โฆ x.1))
exact Measurable.comp (by fun_prop) measurable_fst
truePast๐
Bandits.ArrayModel.truePast
All random variables in the space, except for the unseen rewards for action a after
time n.
Bandits.ArrayModel.truePast.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] (alg : Learning.Algorithm ๐ R) (a : ๐) (n : โ) (ฯ : probSpace ๐ R) : probSpace ๐ RBandits.ArrayModel.truePast.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] (alg : Learning.Algorithm ๐ R) (a : ๐) (n : โ) (ฯ : probSpace ๐ R) : probSpace ๐ R
Code
noncomputable
def truePast (alg : Algorithm ๐ R) (a : ๐) (n : โ) (ฯ : probSpace ๐ R) :
probSpace ๐ R :=
(ฯ.1, fun i b โฆ if b = a then if pullCount (action alg) a (n + 1) ฯ โ 0 then
ฯ.2 (min i ((pullCount (action alg) a (n + 1) ฯ) - 1)) a else Nonempty.some inferInstance
else ฯ.2 i b)Used by (6)
Actions: Source ยท Open Issue
truePast_eq_of_pullCount_eq๐
Bandits.ArrayModel.truePast_eq_of_pullCount_eqNo docstring.
Bandits.ArrayModel.truePast_eq_of_pullCount_eq.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] (alg : Learning.Algorithm ๐ R) (a : ๐) (n m : โ) (ฯ : probSpace ๐ R) (h_pc : Learning.pullCount (action alg) a (n + 1) ฯ = m) : truePast alg a n ฯ = (Prod.fst ฯ, fun i b => if b = a then if m โ 0 then Prod.snd ฯ (min i (m - 1)) a else Nonempty.some โฏ else Prod.snd ฯ i b)Bandits.ArrayModel.truePast_eq_of_pullCount_eq.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] (alg : Learning.Algorithm ๐ R) (a : ๐) (n m : โ) (ฯ : probSpace ๐ R) (h_pc : Learning.pullCount (action alg) a (n + 1) ฯ = m) : truePast alg a n ฯ = (Prod.fst ฯ, fun i b => if b = a then if m โ 0 then Prod.snd ฯ (min i (m - 1)) a else Nonempty.some โฏ else Prod.snd ฯ i b)
Code
lemma truePast_eq_of_pullCount_eq (alg : Algorithm ๐ R)
(a : ๐) (n m : โ) (ฯ : probSpace ๐ R)
(h_pc : pullCount (action alg) a (n + 1) ฯ = m) :
truePast alg a n ฯ = (ฯ.1, fun i b โฆ if b = a then if m โ 0 then
ฯ.2 (min i (m - 1)) a else Nonempty.some inferInstance else ฯ.2 i b)Actions: Source ยท Open Issue
Proof
by simp [truePast, h_pc]
truePast_eq_of_pullCount_eq_of_ne_zero๐
Bandits.ArrayModel.truePast_eq_of_pullCount_eq_of_ne_zeroNo docstring.
Bandits.ArrayModel.truePast_eq_of_pullCount_eq_of_ne_zero.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] (alg : Learning.Algorithm ๐ R) (a : ๐) (n m : โ) (ฯ : probSpace ๐ R) (h_pc : Learning.pullCount (action alg) a (n + 1) ฯ = m) (hm : m โ 0) : truePast alg a n ฯ = (Prod.fst ฯ, fun i b => if b = a then Prod.snd ฯ (min i (m - 1)) a else Prod.snd ฯ i b)Bandits.ArrayModel.truePast_eq_of_pullCount_eq_of_ne_zero.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] (alg : Learning.Algorithm ๐ R) (a : ๐) (n m : โ) (ฯ : probSpace ๐ R) (h_pc : Learning.pullCount (action alg) a (n + 1) ฯ = m) (hm : m โ 0) : truePast alg a n ฯ = (Prod.fst ฯ, fun i b => if b = a then Prod.snd ฯ (min i (m - 1)) a else Prod.snd ฯ i b)
Code
lemma truePast_eq_of_pullCount_eq_of_ne_zero (alg : Algorithm ๐ R)
(a : ๐) (n m : โ) (ฯ : probSpace ๐ R)
(h_pc : pullCount (action alg) a (n + 1) ฯ = m) (hm : m โ 0) :
truePast alg a n ฯ = (ฯ.1, fun i b โฆ if b = a then
ฯ.2 (min i (m - 1)) a else ฯ.2 i b)Actions: Source ยท Open Issue
Proof
by simp [truePast, h_pc, hm]
measurable_hist_truePast๐
Bandits.ArrayModel.measurable_hist_truePastNo docstring.
Bandits.ArrayModel.measurable_hist_truePast.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] [Countable ๐] (alg : Learning.Algorithm ๐ R) (a : ๐) (n : โ) : Measurable fun x => hist alg x nBandits.ArrayModel.measurable_hist_truePast.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] [Countable ๐] (alg : Learning.Algorithm ๐ R) (a : ๐) (n : โ) : Measurable fun x => hist alg x n
Code
lemma measurable_hist_truePast [Countable ๐] (alg : Algorithm ๐ R)
(a : ๐) (n : โ) :
Measurable[MeasurableSpace.comap (truePast alg a n) inferInstance] (hist alg ยท n)Type uses (5)
Body uses (4)
Used by (3)
Actions: Source ยท Open Issue
Proof
by
have h_eq : (hist alg ยท n) = (hist alg ยท n) โ (truePast alg a n) := by
ext ฯ : 1
refine hist_congr alg n (fun _ _ โฆ rfl) fun i b hi โฆ ?_
by_cases hb : b = a
ยท subst hb
simp only [truePast, โreduceIte]
rw [min_eq_left, if_pos (by grind)]
grind
ยท simp [truePast, hb]
rw [h_eq]
refine Measurable.comp ?_ (Measurable.of_comap_le le_rfl)
fun_prop
measurable_action_add_one_truePast๐
Bandits.ArrayModel.measurable_action_add_one_truePastNo docstring.
Bandits.ArrayModel.measurable_action_add_one_truePast.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] [Countable ๐] (alg : Learning.Algorithm ๐ R) (a : ๐) (n : โ) : Measurable (action alg (n + 1))Bandits.ArrayModel.measurable_action_add_one_truePast.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] [Countable ๐] (alg : Learning.Algorithm ๐ R) (a : ๐) (n : โ) : Measurable (action alg (n + 1))
Code
lemma measurable_action_add_one_truePast [Countable ๐] (alg : Algorithm ๐ R)
(a : ๐) (n : โ) :
Measurable[MeasurableSpace.comap (truePast alg a n) inferInstance]
(action alg (n + 1))Type uses (5)
Actions: Source ยท Open Issue
Proof
by
rw [action_add_one_eq]
change Measurable[MeasurableSpace.comap (truePast alg a n) inferInstance]
((fun p โฆ algFunction alg n p.1 p.2) โ (fun ฯ โฆ (hist alg ฯ n, ฯ.1 (n + 1))))
refine (measurable_algFunction alg n).comp (Measurable.prodMk ?_ ?_)
ยท exact measurable_hist_truePast alg a n
ยท have : (fun ฯ โฆ ฯ.1 (n + 1)) =
(fun (p : probSpace ๐ R) โฆ p.1 (n + 1)) โ (truePast alg a n) := rfl
rw [this]
exact Measurable.comp (by fun_prop) (Measurable.of_comap_le le_rfl)
measurable_pullCount_add_one_truePast๐
Bandits.ArrayModel.measurable_pullCount_add_one_truePastNo docstring.
Bandits.ArrayModel.measurable_pullCount_add_one_truePast.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] [Countable ๐] (alg : Learning.Algorithm ๐ R) (a : ๐) (n : โ) : Measurable (Learning.pullCount (action alg) a (n + 1))Bandits.ArrayModel.measurable_pullCount_add_one_truePast.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] [Countable ๐] (alg : Learning.Algorithm ๐ R) (a : ๐) (n : โ) : Measurable (Learning.pullCount (action alg) a (n + 1))
Code
lemma measurable_pullCount_add_one_truePast [Countable ๐] (alg : Algorithm ๐ R) (a : ๐) (n : โ) :
Measurable[MeasurableSpace.comap (truePast alg a n) inferInstance]
(pullCount (action alg) a (n + 1))Body uses (5)
Actions: Source ยท Open Issue
Proof
by
change Measurable[MeasurableSpace.comap (truePast alg a n) inferInstance]
(fun ฯ โฆ pullCount (action alg) a (n + 1) ฯ)
simp_rw [pullCount_eq_sum]
refine measurable_sum _ fun i hi โฆ Measurable.ite ?_ (by fun_prop) (by fun_prop)
refine (measurableSet_singleton _).preimage ?_
have h_meas := measurable_hist_truePast alg a n
simp_rw [hist_eq _ _ n, @measurable_pi_iff] at h_meas
exact (h_meas โจi, by grindโฉ).fst
measurable_stepsUntil๐
Bandits.ArrayModel.measurable_stepsUntilNo docstring.
Bandits.ArrayModel.measurable_stepsUntil.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] [Countable ๐] (alg : Learning.Algorithm ๐ R) (a : ๐) (m n : โ) : Measurable (Set.indicator {ฯ | action alg (n + 1) ฯ = a โง Learning.pullCount (action alg) a (n + 1) ฯ = m} fun x => 1)Bandits.ArrayModel.measurable_stepsUntil.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] [Countable ๐] (alg : Learning.Algorithm ๐ R) (a : ๐) (m n : โ) : Measurable (Set.indicator {ฯ | action alg (n + 1) ฯ = a โง Learning.pullCount (action alg) a (n + 1) ฯ = m} fun x => 1)
Code
lemma measurable_stepsUntil [Countable ๐] (alg : Algorithm ๐ R) (a : ๐) (m n : โ) :
Measurable[MeasurableSpace.comap
(fun ฯ โฆ (ฯ.1, fun k b โฆ if b = a then if m โ 0 then ฯ.2 (min k (m - 1)) b
else Nonempty.some inferInstance else ฯ.2 k b)) inferInstance]
(({ฯ | action alg (n + 1) ฯ = a โง
pullCount (action alg) a (n + 1) ฯ = m}).indicator (fun _ โฆ 1))Body uses (4)
Actions: Source ยท Open Issue
Proof
by
let f := ({ฯ | action alg (n + 1) ฯ = a โง pullCount (action alg) a (n + 1) ฯ = m}).indicator
(fun _ โฆ 1)
have h_eq : f = f โ
fun ฯ โฆ (ฯ.1, fun k b โฆ if b = a then if m โ 0 then ฯ.2 (min k (m - 1)) b
else Nonempty.some inferInstance else ฯ.2 k b) := by
ext ฯ
exact stepsUntil_indicator_congr alg a m n (by grind) (by grind) (by grind)
change Measurable[MeasurableSpace.comap
(fun ฯ โฆ (ฯ.1, fun k b โฆ if b = a then if m โ 0 then ฯ.2 (min k (m - 1)) b
else Nonempty.some inferInstance else ฯ.2 k b)) inferInstance] f
rw [h_eq]
refine Measurable.comp ?_ (Measurable.of_comap_le le_rfl)
refine Measurable.indicator (by fun_prop) ?_
exact MeasurableSet.inter ((measurableSet_singleton _).preimage (by fun_prop))
((measurableSet_singleton _).preimage (by fun_prop))
measurable_pullCount_action_add_one_hist๐
Bandits.ArrayModel.measurable_pullCount_action_add_one_histNo docstring.
Bandits.ArrayModel.measurable_pullCount_action_add_one_hist.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : Measurable fun ฯ => Learning.pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯBandits.ArrayModel.measurable_pullCount_action_add_one_hist.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] (alg : Learning.Algorithm ๐ R) (n : โ) : Measurable fun ฯ => Learning.pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ
Code
lemma measurable_pullCount_action_add_one_hist (alg : Algorithm ๐ R) (n : โ) :
Measurable[MeasurableSpace.comap (fun ฯ โฆ (action alg (n + 1) ฯ, hist alg ฯ n)) inferInstance]
(fun ฯ โฆ pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)Body uses (4)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
simp_rw [pullCount_eq_sum]
refine measurable_sum _ fun i hi โฆ Measurable.ite ?_ (by fun_prop) (by fun_prop)
refine measurableSet_eq_fun ?_ (measurable_comp_comap _ measurable_fst)
rw [measurable_iff_comap_le]
simp_rw [hist_eq _ _ n]
rw [โ measurable_iff_comap_le]
unfold action
refine Measurable.fst (mฮณ := inferInstance) ?_
have : (hist alg ยท i โจi, by grindโฉ) =
(fun ฯ : ๐ ร (Iic n โ ๐ ร R) โฆ ฯ.2 โจi, by grindโฉ) โ
(fun ฯ โฆ (action alg (n + 1) ฯ, fun i : Iic n โฆ hist alg ฯ i โจi, by grindโฉ)) := rfl
rw [this]
exact measurable_comp_comap _ (Measurable.prodMk (by fun_prop) (by fun_prop))
map_snd_apply_arrayMeasure๐
Bandits.ArrayModel.map_snd_apply_arrayMeasureNo docstring.
Bandits.ArrayModel.map_snd_apply_arrayMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} {ฮฝ : ProbabilityTheory.Kernel ๐ R} [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) (a : ๐) : MeasureTheory.Measure.map (fun ฯ => Prod.snd ฯ n a) (arrayMeasure ฮฝ) = ฮฝ aBandits.ArrayModel.map_snd_apply_arrayMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} {ฮฝ : ProbabilityTheory.Kernel ๐ R} [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) (a : ๐) : MeasureTheory.Measure.map (fun ฯ => Prod.snd ฯ n a) (arrayMeasure ฮฝ) = ฮฝ a
Code
lemma map_snd_apply_arrayMeasure {ฮฝ : Kernel ๐ R} [IsMarkovKernel ฮฝ] (n : โ) (a : ๐) :
(arrayMeasure ฮฝ).map (fun ฯ โฆ ฯ.2 n a) = ฮฝ aType uses (3)
Used by (3)
Actions: Source ยท Open Issue
Proof
by
calc (arrayMeasure ฮฝ).map (fun ฯ โฆ ฯ.2 n a)
_ = (arrayMeasure ฮฝ).snd.map (fun ฯ โฆ ฯ n a) := by
rw [Measure.snd, Measure.map_map (by fun_prop) (by fun_prop)]
rfl
_ = ฮฝ a := by
rw [arrayMeasure, Measure.snd_prod, streamMeasure]
have : (fun ฯ โฆ ฯ n a) = (fun h : ๐ โ R โฆ h a) โ (fun ฯ : โ โ ๐ โ R โฆ ฯ n) := rfl
rw [this, โ Measure.map_map (by fun_prop) (by fun_prop), Measure.infinitePi_map_eval,
Measure.infinitePi_map_eval]
indepFun_fst_snd๐
Bandits.ArrayModel.indepFun_fst_sndNo docstring.
Bandits.ArrayModel.indepFun_fst_snd.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.IndepFun Prod.fst Prod.snd (arrayMeasure ฮฝ)Bandits.ArrayModel.indepFun_fst_snd.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.IndepFun Prod.fst Prod.snd (arrayMeasure ฮฝ)
Code
lemma indepFun_fst_snd (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] :
IndepFun Prod.fst Prod.snd (arrayMeasure ฮฝ)Type uses (2)
Actions: Source ยท Open Issue
Proof
indepFun_prod measurable_id measurable_id
indepFun_fst_zero_snd_zero_action๐
Bandits.ArrayModel.indepFun_fst_zero_snd_zero_actionNo docstring.
Bandits.ArrayModel.indepFun_fst_zero_snd_zero_action.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐) : ProbabilityTheory.IndepFun (fun ฯ => Prod.fst ฯ 0) (fun ฯ => Prod.snd ฯ 0 a) (arrayMeasure ฮฝ)Bandits.ArrayModel.indepFun_fst_zero_snd_zero_action.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐) : ProbabilityTheory.IndepFun (fun ฯ => Prod.fst ฯ 0) (fun ฯ => Prod.snd ฯ 0 a) (arrayMeasure ฮฝ)
Code
lemma indepFun_fst_zero_snd_zero_action (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (a : ๐) :
IndepFun (fun ฯ โฆ ฯ.1 0) (fun ฯ โฆ ฯ.2 0 a) (arrayMeasure ฮฝ)Type uses (3)
Used by (1)
Actions: Source ยท Open Issue
Proof
indepFun_prod (X := fun ฯ : โ โ I โฆ ฯ 0) (Y := fun ฯ : โ โ ๐ โ R โฆ ฯ 0 a)
(by fun_prop) (by fun_prop)
indepFun_fst_add_one_aux๐
Bandits.ArrayModel.indepFun_fst_add_one_auxNo docstring.
Bandits.ArrayModel.indepFun_fst_add_one_aux.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.IndepFun (fun ฯ => Prod.fst ฯ (n + 1)) (fun ฯ => (fun i => Prod.fst ฯ โi, Prod.snd ฯ)) (arrayMeasure ฮฝ)Bandits.ArrayModel.indepFun_fst_add_one_aux.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.IndepFun (fun ฯ => Prod.fst ฯ (n + 1)) (fun ฯ => (fun i => Prod.fst ฯ โi, Prod.snd ฯ)) (arrayMeasure ฮฝ)
Code
lemma indepFun_fst_add_one_aux (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (n : โ) :
(fun ฯ โฆ ฯ.1 (n + 1)) โแตข[arrayMeasure ฮฝ] (fun ฯ โฆ (fun (i : Iic n) โฆ ฯ.1 i, ฯ.2))Type uses (3)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
let ฮผโ : Measure (โ โ I) := Measure.infinitePi fun _ โฆ volume
let ฮผโ : Measure (โ โ ๐ โ R) := streamMeasure ฮฝ
-- Coordinates of ฮผโ are independent
have h_indep : iIndepFun (fun i (ฯ : โ โ I) โฆ ฯ i) ฮผโ :=
iIndepFun_infinitePi (fun _ โฆ measurable_id)
have h_indep_n : IndepFun (fun ฯ โฆ ฯ (n + 1)) (fun ฯ โฆ fun i : Iic n โฆ ฯ i) ฮผโ := by
have h := h_indep.indepFun_finsetโ {n + 1} (Iic n) (by simp)
(fun i โฆ (measurable_pi_apply i).aemeasurable)
convert h.comp (measurable_pi_apply โจn + 1, by simpโฉ) measurable_id using 1
ยท rfl
ยท rfl
rw [indepFun_iff_measure_inter_preimage_eq_mul]
intro s t hs ht
let X : (โ โ I) ร (โ โ ๐ โ R) โ I := fun ฯ โฆ ฯ.1 (n + 1)
let Y : (โ โ I) ร (โ โ ๐ โ R) โ (Iic n โ I) ร (โ โ ๐ โ R) := fun ฯ โฆ (fun i โฆ ฯ.1 i, ฯ.2)
change (ฮผโ.prod ฮผโ) (X โปยน' s โฉ Y โปยน' t) = (ฮผโ.prod ฮผโ) (X โปยน' s) * (ฮผโ.prod ฮผโ) (Y โปยน' t)
-- Rewrite using Fubini
rw [Measure.prod_apply (hs.preimage (by fun_prop : Measurable X)),
Measure.prod_apply (ht.preimage (by fun_prop : Measurable Y)),
Measure.prod_apply ((hs.preimage (by fun_prop : Measurable X)).inter
(ht.preimage (by fun_prop : Measurable Y)))]
-- Compute fibers
have hX_fst ฯโ : ฮผโ (Prod.mk ฯโ โปยน' (X โปยน' s)) = s.indicator 1 (ฯโ (n + 1)) := by
simp only [X, Set.preimage_preimage]
by_cases h : ฯโ (n + 1) โ s <;> simp [h]
have hY_fst ฯโ : ฮผโ (Prod.mk ฯโ โปยน' (Y โปยน' t)) = ฮผโ {y | ((fun i : Iic n โฆ ฯโ i), y) โ t} := rfl
have hXY ฯโ : ฮผโ (Prod.mk ฯโ โปยน' (X โปยน' s โฉ Y โปยน' t)) =
s.indicator 1 (ฯโ (n + 1)) * ฮผโ {y | ((fun i : Iic n โฆ ฯโ i), y) โ t} := by
simp only [X, Y, Set.preimage_inter, Set.preimage_preimage]
by_cases h : ฯโ (n + 1) โ s
ยท simp [h]
congr
ยท simp [h]
simp_rw [hY_fst, hX_fst, hXY]
-- Factor the integral using independence
let g : (Iic n โ I) โ ENNReal := fun x โฆ ฮผโ {y | (x, y) โ t}
have hg_meas : Measurable g := measurable_measure_prodMk_left ht
have hf_meas : Measurable (fun ฯโ : โ โ I โฆ s.indicator (1 : I โ ENNReal) (ฯโ (n + 1))) :=
(measurable_one.indicator hs).comp (measurable_pi_apply _)
have hindep_fg : IndepFun (fun ฯโ โฆ s.indicator (1 : I โ ENNReal) (ฯโ (n + 1)))
(fun ฯโ โฆ g (fun i โฆ ฯโ i)) ฮผโ :=
h_indep_n.comp (measurable_one.indicator hs) hg_meas
have h_eq (ฯโ : โ โ I) : ฮผโ {y | ((fun i : Iic n โฆ ฯโ i), y) โ t} = g (fun i โฆ ฯโ i) := rfl
simp_rw [h_eq]
exact lintegral_mul_eq_lintegral_mul_lintegral_of_indepFun hf_meas (by fun_prop) hindep_fg
indepFun_fst_add_one_hist๐
Bandits.ArrayModel.indepFun_fst_add_one_histNo docstring.
Bandits.ArrayModel.indepFun_fst_add_one_hist.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [StandardBorelSpace R] [Nonempty R] [Countable ๐] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.IndepFun (fun ฯ => Prod.fst ฯ (n + 1)) (fun x => hist alg x n) (arrayMeasure ฮฝ)Bandits.ArrayModel.indepFun_fst_add_one_hist.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [StandardBorelSpace R] [Nonempty R] [Countable ๐] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.IndepFun (fun ฯ => Prod.fst ฯ (n + 1)) (fun x => hist alg x n) (arrayMeasure ฮฝ)
Code
lemma indepFun_fst_add_one_hist [Countable ๐] (alg : Algorithm ๐ R)
(ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (n : โ) :
IndepFun (fun ฯ โฆ ฯ.1 (n + 1)) (hist alg ยท n) (arrayMeasure ฮฝ)Type uses (5)
Body uses (3)
Used by (1)
Actions: Source ยท Open Issue
Proof
(indepFun_fst_add_one_aux ฮฝ n).of_measurable_right (measurable_hist_comap alg n)
indepFun_snd_apply_aux๐
Bandits.ArrayModel.indepFun_snd_apply_auxNo docstring.
Bandits.ArrayModel.indepFun_snd_apply_aux.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [DecidableEq ๐] [Nonempty R] (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐) (m : โ) : ProbabilityTheory.IndepFun (fun ฯ => Prod.snd ฯ m a) (fun ฯ => (Prod.fst ฯ, fun k b => if b = a then if m โ 0 then Prod.snd ฯ (min k (m - 1)) b else Nonempty.some โฏ else Prod.snd ฯ k b)) (arrayMeasure ฮฝ)Bandits.ArrayModel.indepFun_snd_apply_aux.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [DecidableEq ๐] [Nonempty R] (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐) (m : โ) : ProbabilityTheory.IndepFun (fun ฯ => Prod.snd ฯ m a) (fun ฯ => (Prod.fst ฯ, fun k b => if b = a then if m โ 0 then Prod.snd ฯ (min k (m - 1)) b else Nonempty.some โฏ else Prod.snd ฯ k b)) (arrayMeasure ฮฝ)
Code
lemma indepFun_snd_apply_aux (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (a : ๐) (m : โ) :
(fun ฯ โฆ ฯ.2 m a) โแตข[arrayMeasure ฮฝ]
(fun ฯ โฆ (ฯ.1, fun k b โฆ if b = a then if m โ 0 then ฯ.2 (min k (m - 1)) b
else Nonempty.some inferInstance else ฯ.2 k b))Type uses (3)
Body uses (2)
Actions: Source ยท Open Issue
Proof
by
unfold arrayMeasure
let ฮผโ : Measure (โ โ I) := Measure.infinitePi fun _ โฆ volume
let ฮผโ : Measure (โ โ ๐ โ R) := Measure.infinitePi fun _ โฆ Measure.infinitePi ฮฝ
-- Independence within ฮผโ: coordinates ฯ i are independent
have h_indepโ : iIndepFun (fun i (ฯ : โ โ ๐ โ R) โฆ ฯ i) ฮผโ :=
iIndepFun_infinitePi (fun _ โฆ measurable_id)
-- Independence within each infinitePi ฮฝ: coordinates f b are independent
have h_indep_inner : iIndepFun (fun (b : ๐) (f : ๐ โ R) โฆ f b) (Measure.infinitePi ฮฝ) :=
iIndepFun_infinitePi (fun _ โฆ measurable_id)
rw [indepFun_iff_measure_inter_preimage_eq_mul]
intro s t hs ht
let X : (โ โ I) ร (โ โ ๐ โ R) โ R := fun ฯ โฆ ฯ.2 m a
let Y : (โ โ I) ร (โ โ ๐ โ R) โ (โ โ I) ร (โ โ ๐ โ R) :=
fun ฯ โฆ (ฯ.1, fun k b โฆ if b = a then if m โ 0 then ฯ.2 (min k (m - 1)) b
else Nonempty.some inferInstance else ฯ.2 k b)
have hX_meas : Measurable X :=
(measurable_pi_apply a).comp ((measurable_pi_apply m).comp measurable_snd)
have hY_meas : Measurable Y := by
change Measurable (fun ฯ : (โ โ I) ร (โ โ ๐ โ R) โฆ
(ฯ.1, fun k b โฆ if b = a then if m โ 0 then ฯ.2 (min k (m - 1)) b
else Nonempty.some inferInstance else ฯ.2 k b))
refine Measurable.prod measurable_fst ?_
refine measurable_pi_lambda _ (fun k โฆ ?_)
refine measurable_pi_lambda _ (fun b โฆ ?_)
by_cases hb : b = a
ยท simp only [hb, โreduceIte]
by_cases hm : m โ 0
ยท simp only [ne_eq, hm, not_false_eq_true, โreduceIte]
exact (measurable_pi_apply a).comp
((measurable_pi_apply (min k (m - 1))).comp measurable_snd)
ยท simp only [hm, โreduceIte]
exact measurable_const
ยท simp only [hb, โreduceIte]
exact (measurable_pi_apply b).comp ((measurable_pi_apply k).comp measurable_snd)
change (ฮผโ.prod ฮผโ) (X โปยน' s โฉ Y โปยน' t) = (ฮผโ.prod ฮผโ) (X โปยน' s) * (ฮผโ.prod ฮผโ) (Y โปยน' t)
-- Use Fubini on ฮผโ.prod ฮผโ
rw [Measure.prod_apply (hs.preimage hX_meas),
Measure.prod_apply (ht.preimage hY_meas),
Measure.prod_apply ((hs.preimage hX_meas).inter (ht.preimage hY_meas))]
-- X only depends on ฯโ, so its fiber is constant in ฯโ
have hX_fst : โ ฯโ, ฮผโ (Prod.mk ฯโ โปยน' (X โปยน' s)) = ฮผโ ((fun ฯโ โฆ ฯโ m a) โปยน' s) := fun _ โฆ rfl
simp_rw [hX_fst]
-- The LHS integral: fiber of X โฉ Y at ฯโ
-- Key: X depends only on ฯโ m a, while Y's dependence on ฯโ avoids (m, a)
-- Define the "truncation" map on ฯโ
let trunc : (โ โ ๐ โ R) โ (โ โ ๐ โ R) :=
fun ฯโ k b โฆ if b = a then if m โ 0 then ฯโ (min k (m - 1)) b
else Nonempty.some inferInstance else ฯโ k b
-- The fiber of Y at ฯโ only depends on trunc(ฯโ)
have hY_fiber : โ ฯโ, Prod.mk ฯโ โปยน' (Y โปยน' t) = (fun ฯโ โฆ (ฯโ, trunc ฯโ)) โปยน' t := fun _ โฆ rfl
-- The fiber of X โฉ Y factors
have hXY_fiber : โ ฯโ, Prod.mk ฯโ โปยน' (X โปยน' s โฉ Y โปยน' t) =
((fun ฯโ โฆ ฯโ m a) โปยน' s) โฉ ((fun ฯโ โฆ (ฯโ, trunc ฯโ)) โปยน' t) := fun _ โฆ rfl
simp_rw [hXY_fiber, hY_fiber]
-- Now we use independence in ฮผโ: (ฯโ m a) is independent of (trunc ฯโ)
-- because trunc only uses indices (k, a) with k < m, and (k, b) with b โ a
have h_trunc_meas : Measurable trunc := by
refine measurable_pi_lambda _ (fun k โฆ ?_)
refine measurable_pi_lambda _ (fun b โฆ ?_)
simp only [trunc]
by_cases hb : b = a
ยท simp only [hb, โreduceIte]
by_cases hm : m = 0
ยท simp only [hm]
exact measurable_const
ยท simp only [ne_eq, hm, not_false_eq_true, โreduceIte]
exact (measurable_pi_apply a).comp (measurable_pi_apply (min k (m - 1)))
ยท simp only [hb, โreduceIte]
exact (measurable_pi_apply b).comp (measurable_pi_apply k)
-- Key independence: (ฯโ m a) โ trunc because trunc only uses coordinates โ (m, a)
have h_indep_trunc : IndepFun (fun ฯโ โฆ ฯโ m a) trunc ฮผโ := by
-- Factor trunc through proj which extracts the relevant coordinates
let proj : (โ โ ๐ โ R) โ ((โ โ R) ร (โ โ {b : ๐ // b โ a} โ R)) := fun ฯโ โฆ
(fun k โฆ if m โ 0 then ฯโ (min k (m - 1)) a else Nonempty.some inferInstance,
fun k โจb, _โฉ โฆ ฯโ k b)
have h_trunc_proj : โ ฯโ, trunc ฯโ = (fun p k b โฆ
if h : b = a then if m โ 0 then p.1 k
else Nonempty.some inferInstance else p.2 k โจb, hโฉ) (proj ฯโ) := by
intro ฯโ; ext k b; simp only [trunc, proj]; by_cases hb : b = a <;> simp [hb]; grind
have h_proj_meas : Measurable proj := by
refine Measurable.prod ?_ ?_
ยท refine measurable_pi_lambda _ fun k โฆ ?_
by_cases hm : m โ 0
ยท simp only [proj, ne_eq, hm, not_false_eq_true, โreduceIte]
exact (measurable_pi_apply a).comp (measurable_pi_apply (min k (m - 1)))
ยท simp [proj, hm]
ยท exact measurable_pi_lambda _ (fun k โฆ measurable_pi_lambda _ (fun โจb, _โฉ โฆ
(measurable_pi_apply b).comp (measurable_pi_apply k)))
have h_g_meas : Measurable (fun p : (โ โ R) ร (โ โ {b : ๐ // b โ a} โ R) โฆ
(fun k b โฆ if h : b = a then if m โ 0 then p.1 k else Nonempty.some inferInstance
else p.2 k โจb, hโฉ)) := by
refine measurable_pi_lambda _ (fun k โฆ measurable_pi_lambda _ (fun b โฆ ?_))
by_cases hb : b = a
ยท simp only [hb, โreduceDIte]
by_cases hm : m โ 0
ยท simp only [ne_eq, hm, not_false_eq_true]
exact (measurable_pi_apply k).comp measurable_fst
ยท simp [hm]
ยท simp only [hb, โreduceDIte]
exact (measurable_pi_apply (โจb, hbโฉ : {b : ๐ // b โ a})).comp
((measurable_pi_apply k).comp measurable_snd)
-- Show (ฯโ m a) โ proj: proj uses coordinates disjoint from (m, a)
have h_indep_proj : IndepFun (fun ฯโ โฆ ฯโ m a) proj ฮผโ := by
have h_row_bound (hm : m โ 0) : โ k, min k (m - 1) < m := by
intro k
calc min k (m - 1) โค m - 1 := Nat.min_le_right k (m - 1)
_ < m := Nat.sub_lt (by grind) Nat.one_pos
rw [indepFun_iff_measure_inter_preimage_eq_mul]
intro s t' hs ht'
-- rows_lt_m extracts column a at rows < m, other_cols extracts columns โ a
let rows_lt_m : (โ โ ๐ โ R) โ (Iio m โ R) := fun ฯโ โจj, _โฉ โฆ ฯโ j a
let other_cols : (โ โ ๐ โ R) โ (โ โ {b : ๐ // b โ a} โ R) := fun ฯโ k โจb, _โฉ โฆ ฯโ k b
have h_proj_factor : โ ฯโ, proj ฯโ =
((fun r k โฆ if hm : m โ 0 then r โจmin k (m - 1), Finset.mem_Iio.mpr (h_row_bound hm k)โฉ
else Nonempty.some inferInstance) (rows_lt_m ฯโ),
other_cols ฯโ) := by
intro ฯโ; ext1
ยท ext k
by_cases hm : m โ 0
ยท simp [proj, rows_lt_m, hm]
ยท simp [proj, hm]
ยท rfl
-- Use iIndepFun structure of the doubly-indexed infinite product
have h_iindep : iIndepFun (fun (p : โ ร ๐) ฯ โฆ ฯ p.1 p.2) ฮผโ :=
iIndepFun_uncurry_infinitePi' (X := fun _ _ โฆ id) (fun _ โฆ ฮฝ) (by fun_prop)
have h_rows_meas : Measurable rows_lt_m :=
measurable_pi_lambda _ (fun โจj, _โฉ โฆ (measurable_pi_apply a).comp (measurable_pi_apply j))
have h_other_meas : Measurable other_cols :=
measurable_pi_lambda _ (fun k โฆ measurable_pi_lambda _ (fun โจb, _โฉ โฆ
(measurable_pi_apply b).comp (measurable_pi_apply k)))
-- Show (ฯโ m a) โ (rows_lt_m, other_cols) via indep_iSup_of_disjoint
have h_indep_combined : IndepFun (fun ฯโ โฆ ฯโ m a)
(fun ฯโ โฆ (rows_lt_m ฯโ, other_cols ฯโ)) ฮผโ := by
rw [IndepFun_iff_Indep]
have h_comap_le : (MeasurableSpace.pi.prod MeasurableSpace.pi).comap
(fun ฯโ โฆ (rows_lt_m ฯโ, other_cols ฯโ)) โค
โจ (p : {p : โ ร ๐ // p โ (m, a)}), mR.comap (fun ฯ โฆ ฯ p.val.1 p.val.2) := by
rw [MeasurableSpace.comap_prodMk]
refine sup_le ?_ ?_
ยท rw [MeasurableSpace.comap_pi]
refine iSup_le (fun โจj, hjโฉ โฆ ?_)
have h_ne : (j, a) โ (m, a) := fun h โฆ (Finset.mem_Iio.mp hj).ne (Prod.mk.inj h).1
exact le_iSup_of_le โจ(j, a), h_neโฉ le_rfl
ยท rw [MeasurableSpace.comap_pi]
refine iSup_le (fun k โฆ ?_)
rw [MeasurableSpace.comap_pi]
refine iSup_le (fun โจb, hbโฉ โฆ ?_)
have h_ne : (k, b) โ (m, a) := fun h โฆ hb (Prod.mk.inj h).2
exact le_iSup_of_le โจ(k, b), h_neโฉ le_rfl
refine indep_of_indep_of_le_right ?_ h_comap_le
have h_disjoint : Disjoint ({(m, a)} : Set (โ ร ๐)) {p | p โ (m, a)} := by simp
have h_le : โ p : โ ร ๐, mR.comap (fun ฯ : โ โ ๐ โ R โฆ ฯ p.1 p.2) โค
MeasurableSpace.pi (m := fun _ โฆ MeasurableSpace.pi) := fun p โฆ
Measurable.comap_le ((measurable_pi_apply p.2).comp (measurable_pi_apply p.1))
have h_iindep' : iIndep (fun p : โ ร ๐ โฆ mR.comap (fun ฯ : โ โ ๐ โ R โฆ ฯ p.1 p.2)) ฮผโ :=
h_iindep.iIndep
have h_indep := indep_iSup_of_disjoint h_le h_iindep' h_disjoint
convert h_indep using 2
ยท simp only [Set.mem_singleton_iff, iSup_iSup_eq_left]
ยท simp only [ne_eq, Set.mem_setOf_eq, iSup_subtype']
have h_proj_preimage : proj โปยน' t' = (fun ฯโ โฆ (rows_lt_m ฯโ, other_cols ฯโ)) โปยน'
{p | ((fun r k โฆ if hm : m โ 0 then
r โจmin k (m - 1), Finset.mem_Iio.mpr (h_row_bound hm k)โฉ
else Nonempty.some inferInstance) p.1, p.2) โ t'}
:= by ext ฯโ; simp only [Set.mem_preimage, Set.mem_setOf_eq, h_proj_factor]
rw [indepFun_iff_measure_inter_preimage_eq_mul] at h_indep_combined
rw [h_proj_preimage]
let T : Set ((Iio m โ R) ร (โ โ {b : ๐ // b โ a} โ R)) :=
{p | ((fun r k โฆ if hm : m โ 0 then
r โจmin k (m - 1), Finset.mem_Iio.mpr (h_row_bound hm k)โฉ
else Nonempty.some inferInstance) p.1, p.2) โ t'}
have hT_meas : MeasurableSet T := by
refine ht'.preimage (Measurable.prod ?_ measurable_snd)
refine measurable_pi_lambda _ (fun k โฆ ?_)
by_cases hm : m โ 0
ยท simp only [ne_eq, hm, not_false_eq_true, โreduceDIte]
exact (measurable_pi_apply (โจmin k (m - 1), Finset.mem_Iio.mpr (h_row_bound hm k)โฉ :
Iio m)).comp measurable_fst
ยท simp [hm]
change ฮผโ ((fun ฯโ โฆ ฯโ m a) โปยน' s โฉ (fun ฯโ โฆ (rows_lt_m ฯโ, other_cols ฯโ)) โปยน' T) =
ฮผโ ((fun ฯโ โฆ ฯโ m a) โปยน' s) *
ฮผโ ((fun ฯโ โฆ (rows_lt_m ฯโ, other_cols ฯโ)) โปยน' T)
exact h_indep_combined s T hs hT_meas
have h_eq : trunc = (fun p k b โฆ if h : b = a then if m โ 0 then p.1 k
else Nonempty.some inferInstance else p.2 k โจb, hโฉ) โ proj := by
funext ฯโ; exact h_trunc_proj ฯโ
rw [h_eq]
exact h_indep_proj.comp measurable_id h_g_meas
rw [indepFun_iff_measure_inter_preimage_eq_mul] at h_indep_trunc
have h_const : โ ฯโ, ฮผโ (((fun ฯโ โฆ ฯโ m a) โปยน' s) โฉ ((fun ฯโ โฆ (ฯโ, trunc ฯโ)) โปยน' t)) =
ฮผโ ((fun ฯโ โฆ ฯโ m a) โปยน' s) * ฮผโ ((fun ฯโ โฆ (ฯโ, trunc ฯโ)) โปยน' t) := fun ฯโ โฆ
h_indep_trunc s _ hs (ht.preimage (by fun_prop))
simp_rw [h_const]
let c := ฮผโ ((fun ฯโ โฆ ฯโ m a) โปยน' s)
change โซโป x, c * ฮผโ ((fun ฯโ โฆ (x, trunc ฯโ)) โปยน' t) โฮผโ =
(โซโป _, c โฮผโ) * โซโป x, ฮผโ ((fun ฯโ โฆ (x, trunc ฯโ)) โปยน' t) โฮผโ
have h_preimage : โ x, (fun ฯโ โฆ (x, trunc ฯโ)) โปยน' t = trunc โปยน' (Prod.mk x โปยน' t) := fun _ โฆ rfl
simp_rw [h_preimage]
have h_map : โ x, ฮผโ (trunc โปยน' (Prod.mk x โปยน' t)) = (ฮผโ.map trunc) (Prod.mk x โปยน' t) := by
intro x; rw [Measure.map_apply h_trunc_meas (ht.preimage (by fun_prop))]
simp_rw [h_map]
rw [lintegral_const_mul _ (measurable_measure_prodMk_left_finite ht),
lintegral_const, measure_univ, mul_one]
indepFun_snd_apply_pullCount_action๐
Bandits.ArrayModel.indepFun_snd_apply_pullCount_actionNo docstring.
Bandits.ArrayModel.indepFun_snd_apply_pullCount_action.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] [Countable ๐] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐) (m n : โ) : ProbabilityTheory.IndepFun (fun ฯ => Prod.snd ฯ m a) (Set.indicator {ฯ | action alg (n + 1) ฯ = a โง Learning.pullCount (action alg) a (n + 1) ฯ = m} fun x => 1) (arrayMeasure ฮฝ)Bandits.ArrayModel.indepFun_snd_apply_pullCount_action.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Nonempty R] [Countable ๐] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐) (m n : โ) : ProbabilityTheory.IndepFun (fun ฯ => Prod.snd ฯ m a) (Set.indicator {ฯ | action alg (n + 1) ฯ = a โง Learning.pullCount (action alg) a (n + 1) ฯ = m} fun x => 1) (arrayMeasure ฮฝ)
Code
lemma indepFun_snd_apply_pullCount_action [Countable ๐] (alg : Algorithm ๐ R)
(ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (a : ๐) (m n : โ) :
(fun ฯ โฆ ฯ.2 m a) โแตข[arrayMeasure ฮฝ]
({ฯ | action alg (n + 1) ฯ = a โง
pullCount (action alg) a (n + 1) ฯ = m}).indicator (fun _ โฆ 1)Type uses (6)
Body uses (3)
Actions: Source ยท Open Issue
Proof
(indepFun_snd_apply_aux ฮฝ a m).of_measurable_right (measurable_stepsUntil alg a m n)
indepFun_cond_comp๐
Bandits.ArrayModel.indepFun_cond_compNo docstring.
Bandits.ArrayModel.indepFun_cond_comp.{u_3, u_4, u_5, u_6} {๐ : Type u_3} {ฮฒ : Type u_4} {ฮณ : Type u_5} {ฮด : Type u_6} {m๐ : MeasurableSpace ๐} {mฮฒ : MeasurableSpace ฮฒ} {mฮณ : MeasurableSpace ฮณ} {mฮด : MeasurableSpace ฮด} [MeasurableSingletonClass ฮด] {ฮผ : MeasureTheory.Measure ๐} {X : ๐ โ ฮฒ} {Y : ๐ โ ฮณ} (hXY : ProbabilityTheory.IndepFun X Y ฮผ) (hY : Measurable Y) {Z : ฮณ โ ฮด} (hZ : Measurable Z) (z : ฮด) : ProbabilityTheory.IndepFun X Y ฮผ[|Z โ Y โปยน' {z}]Bandits.ArrayModel.indepFun_cond_comp.{u_3, u_4, u_5, u_6} {๐ : Type u_3} {ฮฒ : Type u_4} {ฮณ : Type u_5} {ฮด : Type u_6} {m๐ : MeasurableSpace ๐} {mฮฒ : MeasurableSpace ฮฒ} {mฮณ : MeasurableSpace ฮณ} {mฮด : MeasurableSpace ฮด} [MeasurableSingletonClass ฮด] {ฮผ : MeasureTheory.Measure ๐} {X : ๐ โ ฮฒ} {Y : ๐ โ ฮณ} (hXY : ProbabilityTheory.IndepFun X Y ฮผ) (hY : Measurable Y) {Z : ฮณ โ ฮด} (hZ : Measurable Z) (z : ฮด) : ProbabilityTheory.IndepFun X Y ฮผ[|Z โ Y โปยน' {z}]
Code
lemma indepFun_cond_comp {๐ ฮฒ ฮณ ฮด : Type*} {m๐ : MeasurableSpace ๐} {mฮฒ : MeasurableSpace ฮฒ}
{mฮณ : MeasurableSpace ฮณ} {mฮด : MeasurableSpace ฮด} [MeasurableSingletonClass ฮด] {ฮผ : Measure ๐}
{X : ๐ โ ฮฒ} {Y : ๐ โ ฮณ} (hXY : X โแตข[ฮผ] Y) (hY : Measurable Y)
{Z : ฮณ โ ฮด} (hZ : Measurable Z) (z : ฮด) :
X โแตข[ฮผ[|(Z โ Y) โปยน' {z}]] YBody uses (1)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
have h_preim : (Z โ Y) โปยน' {z} = Y โปยน' (Z โปยน' {z}) := by grind
simp_rw [h_preim]
exact indepFun_cond_of_indepFun hXY hY (hZ (measurableSet_singleton z))
indepFun_snd_hist_cond๐
Bandits.ArrayModel.indepFun_snd_hist_condNo docstring.
Bandits.ArrayModel.indepFun_snd_hist_cond.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [StandardBorelSpace R] [Nonempty R] [Countable ๐] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐) (n m : โ) : ProbabilityTheory.IndepFun (fun ฯ => Prod.snd ฯ m a) (fun x => hist alg x n) (arrayMeasure ฮฝ)[|(fun ฯ => (action alg (n + 1) ฯ, Learning.pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) โปยน' {(a, m)}]Bandits.ArrayModel.indepFun_snd_hist_cond.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [StandardBorelSpace R] [Nonempty R] [Countable ๐] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (a : ๐) (n m : โ) : ProbabilityTheory.IndepFun (fun ฯ => Prod.snd ฯ m a) (fun x => hist alg x n) (arrayMeasure ฮฝ)[|(fun ฯ => (action alg (n + 1) ฯ, Learning.pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) โปยน' {(a, m)}]
Code
lemma indepFun_snd_hist_cond [Countable ๐] (alg : Algorithm ๐ R)
(ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (a : ๐) (n m : โ) :
(fun ฯ โฆ ฯ.2 m a) โแตข[(arrayMeasure ฮฝ)[|(fun ฯ โฆ (action alg (n + 1) ฯ,
pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) โปยน' {(a, m)}]]
(hist alg ยท n)Type uses (7)
Body uses (10)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
have h_meas := measurable_hist_truePast alg a n
refine IndepFun.of_measurable_right ?_ h_meas
have h_ae_eq : truePast alg a n =แต[(arrayMeasure ฮฝ)[|(fun ฯ โฆ (action alg (n + 1) ฯ,
pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) โปยน' {(a, m)}]]
(fun ฯ โฆ (ฯ.1, fun k b โฆ if b = a then if m โ 0 then ฯ.2 (min k (m - 1)) b
else Nonempty.some inferInstance else ฯ.2 k b)) := by
refine ae_cond_of_forall_mem ?_ fun x hx โฆ ?_
ยท refine (measurableSet_singleton _).preimage ?_
have h_meas_pc : Measurable fun ฯ โฆ
pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ := by
change Measurable ((fun p : (probSpace ๐ R) ร ๐ โฆ pullCount (action alg) p.2 (n + 1) p.1) โ
(fun ฯ : probSpace ๐ R โฆ (ฯ, action alg (n + 1) ฯ)))
exact (measurable_uncurry_pullCount (by fun_prop) _).comp (by fun_prop)
fun_prop
simp only [Set.mem_preimage, Set.mem_singleton_iff, Prod.mk.injEq] at hx
simp only [truePast]
congr with i b
by_cases hb : b = a
ยท simp only [hb, โreduceIte]
simp only [hx.1, true_and] at hx
congr!
ยท simp [hb]
refine IndepFun.congr ?_ EventuallyEq.rfl h_ae_eq.symm
suffices (fun ฯ โฆ ฯ.2 m a) โแตข[(arrayMeasure ฮฝ)[|(({ฯ | action alg (n + 1) ฯ = a โง
pullCount (action alg) a (n + 1) ฯ = m}).indicator (fun _ โฆ 1)) โปยน' {1}]]
fun ฯ โฆ (ฯ.1, fun k b โฆ if b = a then if m โ 0 then ฯ.2 (min k (m - 1)) b
else Nonempty.some inferInstance else ฯ.2 k b) by
convert this using 1
ยท rfl
ยท rfl
ยท rfl
congr with ฯ
simp only [Set.mem_preimage, Set.mem_singleton_iff, Prod.mk.injEq, Set.indicator_apply,
Set.mem_setOf_eq, ite_eq_left_iff, not_and, zero_ne_one, imp_false,
Classical.not_imp, Decidable.not_not, and_congr_right_iff]
intro ha
simp [ha]
have h_meas := measurable_stepsUntil alg a m n
obtain โจf, hf, hf_eqโฉ := h_meas.exists_eq_measurable_comp
simp_rw [hf_eq]
refine indepFun_cond_comp (Z := f) (z := 1) ?_ ?_ hf
ยท exact indepFun_snd_apply_aux ฮฝ a m
ยท refine Measurable.prodMk (by fun_prop) ?_
simp_rw [measurable_pi_iff]
intro i b
refine Measurable.ite (MeasurableSet.const _) ?_ (by fun_prop)
refine Measurable.ite (MeasurableSet.const _) (by fun_prop) (by fun_prop)
hasLaw_action_zero๐
Bandits.ArrayModel.hasLaw_action_zeroNo docstring.
Bandits.ArrayModel.hasLaw_action_zero.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.HasLaw (action alg 0) (Learning.Algorithm.p0 alg) (arrayMeasure ฮฝ)Bandits.ArrayModel.hasLaw_action_zero.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.HasLaw (action alg 0) (Learning.Algorithm.p0 alg) (arrayMeasure ฮฝ)
Code
lemma hasLaw_action_zero (alg : Algorithm ๐ R) (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] :
HasLaw (action alg 0) alg.p0 (arrayMeasure ฮฝ) where
map_eqType uses (5)
Body uses (6)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
calc (arrayMeasure ฮฝ).map (fun ฯ โฆ initAlgFunction alg (ฯ.1 0))
_ = ((arrayMeasure ฮฝ).fst.map (Function.eval 0)).map (initAlgFunction alg) := by
rw [Measure.fst, Measure.map_map (by fun_prop) (by fun_prop),
Measure.map_map (by fun_prop) (by fun_prop)]
rfl
_ = (volume : Measure I).map (initAlgFunction alg) := by
simp only [arrayMeasure, Measure.fst_prod]
rw [(measurePreserving_eval_infinitePi (fun _ โฆ volume) 0).map_eq]
_ = alg.p0 := initAlgFunction_map alg
hasCondDistrib_reward_zero๐
Bandits.ArrayModel.hasCondDistrib_reward_zeroNo docstring.
Bandits.ArrayModel.hasCondDistrib_reward_zero.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.HasCondDistrib (reward alg 0) (action alg 0) ฮฝ (arrayMeasure ฮฝ)Bandits.ArrayModel.hasCondDistrib_reward_zero.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : ProbabilityTheory.HasCondDistrib (reward alg 0) (action alg 0) ฮฝ (arrayMeasure ฮฝ)
Code
lemma hasCondDistrib_reward_zero (alg : Algorithm ๐ R) (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] :
HasCondDistrib (reward alg 0) (action alg 0) ฮฝ (arrayMeasure ฮฝ)Type uses (6)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
refine hasCondDistrib_of_condDistrib_eq (by fun_prop) (by fun_prop) ?_
refine (condDistrib_ae_eq_cond (by fun_prop) (by fun_prop)).trans ?_
rw [Filter.EventuallyEq, ae_iff_of_countable]
intro a ha
simp only [reward_zero]
calc ((arrayMeasure ฮฝ)[|action alg 0 โปยน' {a}]).map (fun ฯ โฆ ฯ.2 0 (action alg 0 ฯ))
_ = ((arrayMeasure ฮฝ)[|action alg 0 โปยน' {a}]).map (fun ฯ โฆ ฯ.2 0 a) := by
refine Measure.map_congr
(ae_cond_of_forall_mem ((measurableSet_singleton _).preimage (by fun_prop)) ?_)
intro x hx
simp only [Set.mem_preimage, Set.mem_singleton_iff] at hx
simp [hx]
_ = ฮฝ a := by
rw [cond_of_indepFun]
ยท exact map_snd_apply_arrayMeasure 0 a
ยท have : (fun ฯ โฆ ฯ.1 0) โแตข[arrayMeasure ฮฝ] fun ฯ โฆ ฯ.2 0 a :=
indepFun_fst_zero_snd_zero_action ฮฝ a
rw [action_zero]
exact this.comp (ฯ := initAlgFunction alg) (by fun_prop) measurable_id
ยท fun_prop
ยท fun_prop
ยท simp
ยท rwa [Measure.map_apply (by fun_prop) (by simp)] at ha
hasCondDistrib_action'๐
Bandits.ArrayModel.hasCondDistrib_action'No docstring.
Bandits.ArrayModel.hasCondDistrib_action'.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.HasCondDistrib (action alg (n + 1)) (fun x => hist alg x n) (Learning.Algorithm.policy alg n) (arrayMeasure ฮฝ)Bandits.ArrayModel.hasCondDistrib_action'.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.HasCondDistrib (action alg (n + 1)) (fun x => hist alg x n) (Learning.Algorithm.policy alg n) (arrayMeasure ฮฝ)
Code
lemma hasCondDistrib_action' (alg : Algorithm ๐ R) (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (n : โ) :
HasCondDistrib (action alg (n + 1)) (hist alg ยท n) (alg.policy n) (arrayMeasure ฮฝ)Type uses (6)
Body uses (11)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
rw [action_add_one_eq]
have h_fun ฯ := algFunction_map alg n (hist alg ฯ n)
refine โจby fun_prop, ?_โฉ
have h_indep : (arrayMeasure ฮฝ).map (fun ฯ โฆ (ฯ.1 (n + 1), hist alg ฯ n)) =
(โ).prod ((arrayMeasure ฮฝ).map (hist alg ยท n)) := by
have h_indep' := indepFun_fst_add_one_hist alg ฮฝ n
rw [indepFun_iff_map_prod_eq_prod_map_map (by fun_prop) (by fun_prop)] at h_indep'
rw [h_indep']
congr
simp only [arrayMeasure]
calc ((Measure.infinitePi fun x โฆ โ).prod (streamMeasure ฮฝ)).map (fun ฯ โฆ ฯ.1 (n + 1))
_ = (Measure.infinitePi fun x โฆ โ).map (Function.eval (n + 1)) := by
nth_rw 2 [โ Measure.fst_prod (ฮผ := Measure.infinitePi fun x โฆ โ)
(ฮฝ := streamMeasure ฮฝ)]
rw [Measure.fst, Measure.map_map (by fun_prop) (by fun_prop)]
rfl
_ = โ := by rw [Measure.infinitePi_map_eval]
have : (fun x โฆ (hist alg x n, algFunction alg n (hist alg x n) (x.1 (n + 1)))) =
(fun p โฆ (p.2, algFunction alg n (p.2) (p.1))) โ (fun x โฆ (x.1 (n + 1), hist alg x n)) := rfl
rw [this, โ Measure.map_map (by fun_prop) (by fun_prop), h_indep]
have : (โ : Measure I).prod ((arrayMeasure ฮฝ).map (hist alg ยท n)) =
((Kernel.const _ โ) รโ Kernel.id) โโ ((arrayMeasure ฮฝ).map (hist alg ยท n)) := by
have h := Measure.compProd_const (ฮผ := (arrayMeasure ฮฝ).map (hist alg ยท n))
(ฮฝ := (โ : Measure I))
rw [Measure.compProd_eq_comp_prod] at h
rw [โ Measure.prod_swap, โ h, โ Measure.deterministic_comp_eq_map (by fun_prop),
Measure.comp_assoc, โ Kernel.swap, Kernel.swap_prod]
rw [this, โ Measure.deterministic_comp_eq_map (by fun_prop),
โ Measure.deterministic_comp_eq_map (by fun_prop), Measure.compProd_eq_comp_prod,
Measure.comp_assoc, Measure.comp_assoc, Measure.comp_assoc]
congr 2
ext ฯ : 1
simp only [Kernel.deterministic_comp_eq_map, Kernel.comp_deterministic_eq_comap, Kernel.coe_comap,
Function.comp_apply]
rw [Kernel.map_apply _ (by fun_prop), Kernel.prod_apply, Kernel.const_apply, Kernel.id_apply,
Kernel.prod_apply, Kernel.id_apply, โ h_fun]
calc (((โ).prod (Measure.dirac (hist alg ฯ n)))).map (fun p โฆ (p.2, algFunction alg n p.2 p.1))
_ = (((โ).prod (Measure.dirac (hist alg ฯ n))).map Prod.swap).map
(fun p โฆ (p.1, algFunction alg n p.1 p.2)) := by
rw [Measure.map_map (by fun_prop) (by fun_prop)]
rfl
_ = ((Measure.dirac (hist alg ฯ n)).prod โ).map (fun p โฆ (p.1, algFunction alg n p.1 p.2)) := by
rw [Measure.prod_swap]
_ = (Measure.dirac (hist alg ฯ n)).prod ((โ).map (algFunction alg n (hist alg ฯ n))) := by
ext s hs
rw [Measure.map_apply (by fun_prop) hs, Measure.prod_apply, lintegral_dirac, Measure.prod_apply,
lintegral_dirac, Measure.map_apply (by fun_prop)]
ยท congr
ยท exact hs.preimage (by fun_prop)
ยท exact hs
ยท exact hs.preimage (by fun_prop)
hasCondDistrib_reward_pullCount_action๐
Bandits.ArrayModel.hasCondDistrib_reward_pullCount_action
The conditional distribution of the reward at time n + 1, given the action at time n + 1
and the number of times that action has been pulled before time n + 1, is equal to
the kernel ฮฝ.
Bandits.ArrayModel.hasCondDistrib_reward_pullCount_action.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ => (action alg (n + 1) ฯ, Learning.pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) (ProbabilityTheory.Kernel.prodMkRight โ ฮฝ) (arrayMeasure ฮฝ)Bandits.ArrayModel.hasCondDistrib_reward_pullCount_action.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ => (action alg (n + 1) ฯ, Learning.pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) (ProbabilityTheory.Kernel.prodMkRight โ ฮฝ) (arrayMeasure ฮฝ)
Code
lemma hasCondDistrib_reward_pullCount_action
(alg : Algorithm ๐ R) (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (n : โ) :
HasCondDistrib (reward alg (n + 1))
(fun ฯ โฆ (action alg (n + 1) ฯ, pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ))
(ฮฝ.prodMkRight _) (arrayMeasure ฮฝ)Type uses (7)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
have h_meas : Measurable fun ฯ โฆ pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ := by
change Measurable ((fun p : (probSpace ๐ R) ร ๐ โฆ pullCount (action alg) p.2 (n + 1) p.1) โ
(fun ฯ : probSpace ๐ R โฆ (ฯ, action alg (n + 1) ฯ)))
exact (measurable_uncurry_pullCount (by fun_prop) _).comp (by fun_prop)
refine hasCondDistrib_of_condDistrib_eq (by fun_prop) (by fun_prop) ?_
refine (condDistrib_ae_eq_cond
(Measurable.prodMk (by fun_prop) (by fun_prop)) (by fun_prop)).trans ?_
rw [Filter.EventuallyEq, ae_iff_of_countable]
intro โจa, mโฉ ham
simp only [Kernel.prodMkRight_apply]
calc
Measure.map (reward alg (n + 1))
(arrayMeasure ฮฝ)[|(fun ฯ โฆ (action alg (n + 1) ฯ,
pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) โปยน' {(a, m)}]
_ = Measure.map (fun ฯ โฆ ฯ.2 m a)
(arrayMeasure ฮฝ)[|(fun ฯ โฆ (action alg (n + 1) ฯ,
pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) โปยน' {(a, m)}] := by
rw [reward_eq]
refine Measure.map_congr
(ae_cond_of_forall_mem ((measurableSet_singleton _).preimage (by fun_prop)) (fun x hx โฆ ?_))
simp only [Set.mem_preimage, Set.mem_singleton_iff, Prod.mk.injEq] at hx
simp only [hx.1] at hx โข
simp [hx.2]
_ = Measure.map (fun ฯ โฆ ฯ.2 m a)
(arrayMeasure ฮฝ)[|({ฯ | action alg (n + 1) ฯ = a โง
pullCount (action alg) a (n + 1) ฯ = m}).indicator 1 โปยน' {1}] := by
congr with ฯ
simp only [Set.mem_preimage, Set.mem_singleton_iff, Prod.mk.injEq, Set.indicator_apply,
Set.mem_setOf_eq, Pi.one_apply, ite_eq_left_iff, not_and, zero_ne_one, imp_false,
Classical.not_imp, Decidable.not_not, and_congr_right_iff]
intro ha
simp [ha]
_ = ฮฝ a := by
rw [cond_of_indepFun, map_snd_apply_arrayMeasure m a]
ยท exact (indepFun_snd_apply_pullCount_action alg ฮฝ a m n).symm
ยท refine Measurable.indicator (by fun_prop) ?_
exact MeasurableSet.inter ((measurableSet_singleton _).preimage (by fun_prop))
((measurableSet_singleton _).preimage (by fun_prop))
ยท fun_prop
ยท simp
ยท rw [Measure.map_apply (by fun_prop) (by simp)] at ham
convert ham
ext ฯ
simp only [Set.mem_preimage, Set.indicator_apply, Set.mem_setOf_eq, Pi.one_apply,
Set.mem_singleton_iff, ite_eq_left_iff, not_and, zero_ne_one, imp_false, Classical.not_imp,
Decidable.not_not, Prod.mk.injEq, and_congr_right_iff]
intro ha
simp [ha]
reward_ae_eq_cond๐
Bandits.ArrayModel.reward_ae_eq_condNo docstring.
Bandits.ArrayModel.reward_ae_eq_cond.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) (a : ๐) (n m : โ) : reward alg (n + 1) =แต[(arrayMeasure ฮฝ)[|(fun ฯ => (action alg (n + 1) ฯ, Learning.pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) โปยน' {(a, m)}]] fun ฯ => Prod.snd ฯ m aBandits.ArrayModel.reward_ae_eq_cond.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) (a : ๐) (n m : โ) : reward alg (n + 1) =แต[(arrayMeasure ฮฝ)[|(fun ฯ => (action alg (n + 1) ฯ, Learning.pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) โปยน' {(a, m)}]] fun ฯ => Prod.snd ฯ m a
Code
lemma reward_ae_eq_cond (alg : Algorithm ๐ R) (ฮฝ : Kernel ๐ R) (a : ๐) (n m : โ) :
reward alg (n + 1) =แต[(arrayMeasure ฮฝ)[|(fun ฯ โฆ (action alg (n + 1) ฯ,
pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) โปยน' {(a, m)}]]
(fun ฯ โฆ ฯ.2 m a)Type uses (7)
Body uses (5)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
rw [reward_eq]
refine ae_cond_of_forall_mem ?_ ?_
ยท have : Measurable fun ฯ โฆ pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ := by
change Measurable ((fun p : (probSpace ๐ R) ร ๐ โฆ pullCount (action alg) p.2 (n + 1) p.1) โ
(fun ฯ : probSpace ๐ R โฆ (ฯ, action alg (n + 1) ฯ)))
exact (measurable_uncurry_pullCount (by fun_prop) _).comp (by fun_prop)
exact (measurableSet_singleton _).preimage (by fun_prop)
intro ฯ hฯ
simp only [Set.mem_preimage, Set.mem_singleton_iff, Prod.mk.injEq] at hฯ
simp only [hฯ.2]
simp [hฯ.1]
hasCondDistrib_reward_hist_action_pullCount๐
Bandits.ArrayModel.hasCondDistrib_reward_hist_action_pullCount
The conditional distribution of the reward at time n + 1, given the history up to time n,
the action at time n + 1, and the number of times that action has been pulled before time n + 1,
is equal to the kernel ฮฝ.
Bandits.ArrayModel.hasCondDistrib_reward_hist_action_pullCount.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ => (hist alg ฯ n, action alg (n + 1) ฯ, Learning.pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) (ProbabilityTheory.Kernel.prodMkLeft (โฅ(Finset.Iic n) โ ๐ ร R) (ProbabilityTheory.Kernel.prodMkRight โ ฮฝ)) (arrayMeasure ฮฝ)Bandits.ArrayModel.hasCondDistrib_reward_hist_action_pullCount.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ => (hist alg ฯ n, action alg (n + 1) ฯ, Learning.pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) (ProbabilityTheory.Kernel.prodMkLeft (โฅ(Finset.Iic n) โ ๐ ร R) (ProbabilityTheory.Kernel.prodMkRight โ ฮฝ)) (arrayMeasure ฮฝ)
Code
lemma hasCondDistrib_reward_hist_action_pullCount
(alg : Algorithm ๐ R) (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (n : โ) :
HasCondDistrib (reward alg (n + 1))
(fun ฯ โฆ (hist alg ฯ n, action alg (n + 1) ฯ,
pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ))
((ฮฝ.prodMkRight _).prodMkLeft _) (arrayMeasure ฮฝ)Type uses (8)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
have h_meas : Measurable fun ฯ โฆ pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ := by
change Measurable ((fun p : (probSpace ๐ R) ร ๐ โฆ pullCount (action alg) p.2 (n + 1) p.1) โ
(fun ฯ : probSpace ๐ R โฆ (ฯ, action alg (n + 1) ฯ)))
exact (measurable_uncurry_pullCount (by fun_prop) _).comp (by fun_prop)
refine hasCondDistrib_of_condDistrib_eq (by fun_prop) (by fun_prop) ?_
refine condDistrib_prod_of_forall_condDistrib_cond (by fun_prop) (by fun_prop) (by fun_prop) _ ?_
intro (a, m) ham
have h_eq : ((ฮฝ.prodMkRight _).prodMkLeft _).comap (fun ฯ : (Iic n โ ๐ ร R) โฆ (ฯ, a, m))
(by fun_prop) =
Kernel.const _ (ฮฝ a) := by ext; simp
rw [h_eq, condDistrib_congr_left (reward_ae_eq_cond alg ฮฝ a n m)]
refine (condDistrib_of_indepFun ?_ (by fun_prop) (by fun_prop)).trans (ae_of_all _ fun ฯ โฆ ?_)
ยท exact (indepFun_snd_hist_cond alg ฮฝ a n m).symm
ยท simp only [Kernel.const_apply]
have : (fun ฯ โฆ (action alg (n + 1) ฯ,
pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) โปยน' {(a, m)} =
({ฯ | action alg (n + 1) ฯ = a โง
pullCount (action alg) a (n + 1) ฯ = m}).indicator 1 โปยน' {1} := by
ext ฯ
simp [Set.indicator_apply]
grind
rw [this, cond_of_indepFun, map_snd_apply_arrayMeasure m a]
ยท exact (indepFun_snd_apply_pullCount_action alg ฮฝ a m n).symm
ยท refine Measurable.indicator (by fun_prop) ?_
exact MeasurableSet.inter ((measurableSet_singleton _).preimage (by fun_prop))
((measurableSet_singleton _).preimage (by fun_prop))
ยท fun_prop
ยท simp
ยท convert ham
ext ฯ
simp only [Set.mem_preimage, Set.indicator_apply, Set.mem_setOf_eq, Pi.one_apply,
Set.mem_singleton_iff, ite_eq_left_iff, not_and, zero_ne_one, imp_false, Classical.not_imp,
Decidable.not_not, Prod.mk.injEq, and_congr_right_iff]
intro ha
simp [ha]
condIndepFun_reward_hist๐
Bandits.ArrayModel.condIndepFun_reward_hist
The reward at time n + 1 is conditionally independent of the history up to time n,
given the action at time n + 1 and the number of times that action has been pulled before
time n + 1.
Bandits.ArrayModel.condIndepFun_reward_hist.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.CondIndepFun (MeasurableSpace.comap (fun ฯ => (action alg (n + 1) ฯ, Learning.pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) inferInstance) โฏ (reward alg (n + 1)) (fun x => hist alg x n) (arrayMeasure ฮฝ)Bandits.ArrayModel.condIndepFun_reward_hist.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.CondIndepFun (MeasurableSpace.comap (fun ฯ => (action alg (n + 1) ฯ, Learning.pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)) inferInstance) โฏ (reward alg (n + 1)) (fun x => hist alg x n) (arrayMeasure ฮฝ)
Code
lemma condIndepFun_reward_hist (alg : Algorithm ๐ R) (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (n : โ) :
(reward alg (n + 1)) โแตข[(fun ฯ โฆ (action alg (n + 1) ฯ,
pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ)),
Measurable.prodMk (by fun_prop) (measurable_pullCount_action_add_one alg n);
arrayMeasure ฮฝ]
(hist alg ยท n)Type uses (12)
Body uses (5)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
have h_cond := hasCondDistrib_reward_hist_action_pullCount alg ฮฝ n
refine condIndepFun_of_exists_condDistrib_prod_ae_eq_prodMkLeft (by fun_prop) (by fun_prop) ?_
h_cond.condDistrib_eq
exact Measurable.prodMk (by fun_prop) (measurable_pullCount_action_add_one alg n)
hasCondDistrib_reward'๐
Bandits.ArrayModel.hasCondDistrib_reward'
The conditional distribution of the reward at time n + 1, given the history up to time n
and the action at time n + 1, is equal to the kernel ฮฝ.
Bandits.ArrayModel.hasCondDistrib_reward'.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ => (hist alg ฯ n, action alg (n + 1) ฯ)) (ProbabilityTheory.Kernel.prodMkLeft (โฅ(Finset.Iic n) โ ๐ ร R) ฮฝ) (arrayMeasure ฮฝ)Bandits.ArrayModel.hasCondDistrib_reward'.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ => (hist alg ฯ n, action alg (n + 1) ฯ)) (ProbabilityTheory.Kernel.prodMkLeft (โฅ(Finset.Iic n) โ ๐ ร R) ฮฝ) (arrayMeasure ฮฝ)
Code
lemma hasCondDistrib_reward' (alg : Algorithm ๐ R) (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (n : โ) :
HasCondDistrib (reward alg (n + 1)) (fun ฯ โฆ (hist alg ฯ n, action alg (n + 1) ฯ))
(ฮฝ.prodMkLeft _) (arrayMeasure ฮฝ)Type uses (7)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
let R' := reward alg (n + 1)
let H := (hist alg ยท n)
let A := action alg (n + 1)
let P := fun ฯ โฆ pullCount (action alg) (action alg (n + 1) ฯ) (n + 1) ฯ
have hP : Measurable P := measurable_pullCount_action_add_one alg n
change HasCondDistrib R' (fun ฯ โฆ (H ฯ, A ฯ)) (ฮฝ.prodMkLeft _) _
suffices HasCondDistrib R' (fun ฯ โฆ (A ฯ, H ฯ)) (ฮฝ.prodMkRight _) (arrayMeasure ฮฝ) by
have h_eq : (fun ฯ โฆ (H ฯ, A ฯ)) = MeasurableEquiv.prodComm โ (fun ฯ โฆ (A ฯ, H ฯ)) := rfl
rw [h_eq]
exact this.measurableEquiv_comp_right (ฮบ := ฮฝ.prodMkRight _) _
suffices HasCondDistrib R' (fun ฯ โฆ ((A ฯ, H ฯ), P ฯ))
((ฮฝ.prodMkRight _).prodMkRight _) (arrayMeasure ฮฝ) by
-- use that `P` is measurable wrt `(A, H)` to drop it from the conditioning
have hP_meas :
Measurable[MeasurableSpace.comap (fun ฯ โฆ (A ฯ, H ฯ)) inferInstance] P :=
measurable_pullCount_action_add_one_hist alg n
obtain โจf, hf_meas, hf_eqโฉ := hP_meas.exists_eq_measurable_comp
simp only [hf_eq, Function.comp_apply] at this
rwa [hasCondDistrib_prod_right_iff _ _ hf_meas] at this
suffices HasCondDistrib R' (fun ฯ โฆ ((A ฯ, P ฯ), H ฯ))
((ฮฝ.prodMkRight _).prodMkRight _) (arrayMeasure ฮฝ) by
let e : ((๐ ร โ) ร (Iic n โ ๐ ร R)) โแต ((๐ ร (Iic n โ ๐ ร R)) ร โ) :=
{ toFun := fun x โฆ ((x.1.1, x.2), x.1.2)
invFun := fun x โฆ ((x.1.1, x.2), x.1.2)
measurable_toFun := by simp only [Equiv.coe_fn_mk]; fun_prop
measurable_invFun := by simp only [Equiv.symm_mk, Equiv.coe_fn_mk]; fun_prop }
exact this.measurableEquiv_comp_right e
suffices HasCondDistrib R' (fun ฯ โฆ (A ฯ, P ฯ)) (ฮฝ.prodMkRight _) (arrayMeasure ฮฝ) by
have h_indep : H โแตข[(fun ฯ โฆ (A ฯ, P ฯ)), (by fun_prop); arrayMeasure ฮฝ] R' :=
(condIndepFun_reward_hist alg ฮฝ n).symm
have h_condDistrib := this.condDistrib_eq
rw [condIndepFun_iff_condDistrib_prod_ae_eq_prodMkRight (by fun_prop) (by fun_prop)
(by fun_prop)] at h_indep
refine hasCondDistrib_of_condDistrib_eq (by fun_prop) (by fun_prop) ?_
refine h_indep.trans ?_
rw [Filter.EventuallyEq, ae_map_iff] at h_condDistrib โข
ยท simpa only [Kernel.prodMkRight_apply]
ยท fun_prop
ยท exact Kernel.measurableSet_eq _ _
ยท fun_prop
ยท exact Kernel.measurableSet_eq _ _
exact hasCondDistrib_reward_pullCount_action alg ฮฝ n
hasCondDistrib_action๐
Bandits.ArrayModel.hasCondDistrib_actionNo docstring.
Bandits.ArrayModel.hasCondDistrib_action.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.HasCondDistrib (action alg (n + 1)) (fun ฯ i => (action alg (โi) ฯ, reward alg (โi) ฯ)) (Learning.Algorithm.policy alg n) (arrayMeasure ฮฝ)Bandits.ArrayModel.hasCondDistrib_action.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.HasCondDistrib (action alg (n + 1)) (fun ฯ i => (action alg (โi) ฯ, reward alg (โi) ฯ)) (Learning.Algorithm.policy alg n) (arrayMeasure ฮฝ)
Code
lemma hasCondDistrib_action (alg : Algorithm ๐ R) (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] (n : โ) :
HasCondDistrib (action alg (n + 1))
(fun ฯ (i : Iic n) โฆ (action alg i ฯ, reward alg i ฯ))
(alg.policy n) (arrayMeasure ฮฝ)Type uses (6)
Body uses (3)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
convert hasCondDistrib_action' alg ฮฝ n with ฯ i
ยท simp only [action]
rw [hist_eq _ _ n]
ยท simp only [reward]
rw [hist_eq _ _ n]
hasCondDistrib_reward๐
Bandits.ArrayModel.hasCondDistrib_rewardNo docstring.
Bandits.ArrayModel.hasCondDistrib_reward.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ => (fun i => (action alg (โi) ฯ, reward alg (โi) ฯ), action alg (n + 1) ฯ)) (Learning.Environment.feedback (Learning.stationaryEnv ฮฝ) n) (arrayMeasure ฮฝ)Bandits.ArrayModel.hasCondDistrib_reward.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] (n : โ) : ProbabilityTheory.HasCondDistrib (reward alg (n + 1)) (fun ฯ => (fun i => (action alg (โi) ฯ, reward alg (โi) ฯ), action alg (n + 1) ฯ)) (Learning.Environment.feedback (Learning.stationaryEnv ฮฝ) n) (arrayMeasure ฮฝ)
Code
lemma hasCondDistrib_reward (alg : Algorithm ๐ R) (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ]
(n : โ) :
HasCondDistrib (reward alg (n + 1))
(fun ฯ โฆ (fun (i : Iic n) โฆ (action alg i ฯ, reward alg i ฯ), action alg (n + 1) ฯ))
((stationaryEnv ฮฝ).feedback n) (arrayMeasure ฮฝ)Type uses (8)
Body uses (3)
Used by (1)
Actions: Source ยท Open Issue
Proof
by
convert hasCondDistrib_reward' alg ฮฝ n with ฯ i
ยท simp only [action]
rw [hist_eq _ _ n]
ยท simp only [reward]
rw [hist_eq _ _ n]
ยท rfl
isAlgEnvSeq_arrayMeasure๐
Bandits.ArrayModel.isAlgEnvSeq_arrayMeasureNo docstring.
Bandits.ArrayModel.isAlgEnvSeq_arrayMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : Learning.IsAlgEnvSeq (action alg) (reward alg) alg (Learning.stationaryEnv ฮฝ) (arrayMeasure ฮฝ)Bandits.ArrayModel.isAlgEnvSeq_arrayMeasure.{u_1, u_2} {๐ : Type u_1} {R : Type u_2} {m๐ : MeasurableSpace ๐} {mR : MeasurableSpace R} [Nonempty ๐] [StandardBorelSpace ๐] [DecidableEq ๐] [Countable ๐] [StandardBorelSpace R] [Nonempty R] (alg : Learning.Algorithm ๐ R) (ฮฝ : ProbabilityTheory.Kernel ๐ R) [ProbabilityTheory.IsMarkovKernel ฮฝ] : Learning.IsAlgEnvSeq (action alg) (reward alg) alg (Learning.stationaryEnv ฮฝ) (arrayMeasure ฮฝ)
Code
lemma isAlgEnvSeq_arrayMeasure (alg : Algorithm ๐ R) (ฮฝ : Kernel ๐ R) [IsMarkovKernel ฮฝ] :
IsAlgEnvSeq (action alg) (reward alg) alg (stationaryEnv ฮฝ) (arrayMeasure ฮฝ) where
hasLaw_action_zeroType uses (9)
Body uses (8)
Used by (3)
Actions: Source ยท Open Issue
Proof
hasLaw_action_zero alg ฮฝ hasCondDistrib_feedback_zero := hasCondDistrib_reward_zero alg ฮฝ hasCondDistrib_action := hasCondDistrib_action alg ฮฝ hasCondDistrib_feedback := hasCondDistrib_reward alg ฮฝ
-
Bandits.streamMeasure -
Bandits.instIsProbabilityMeasureForallNatForallStreamMeasureOfIsMarkovKernel -
hasLaw_eval_infinitePi -
Bandits.hasLaw_eval_streamMeasure -
Bandits.hasLaw_eval_eval_streamMeasure -
Bandits.identDistrib_eval_eval_id_streamMeasure -
Bandits.integrable_eval_streamMeasure -
Bandits.integral_eval_streamMeasure -
Bandits.iIndepFun_eval_streamMeasure' -
Bandits.iIndepFun_eval_streamMeasure'' -
Bandits.iIndepFun_eval_streamMeasure -
Bandits.indepFun_eval_streamMeasure -
Bandits.indepFun_eval_streamMeasure' -
Bandits.ArrayModel.probSpace -
Bandits.ArrayModel.instMeasurableSpaceProbSpace -
Bandits.ArrayModel.instStandardBorelSpaceProbSpaceOfCountable -
Bandits.ArrayModel.arrayMeasure -
Bandits.ArrayModel.instIsProbabilityMeasureProbSpaceArrayMeasureOfIsMarkovKernel -
Bandits.ArrayModel.initAlgFunction -
Bandits.ArrayModel.initAlgFunction_map -
Bandits.ArrayModel.measurable_initAlgFunction -
Bandits.ArrayModel.algFunction -
Bandits.ArrayModel.algFunction_map -
Bandits.ArrayModel.measurable_algFunction -
Bandits.ArrayModel.hist -
Bandits.ArrayModel.hist_zero -
Bandits.ArrayModel.hist_add_one -
Bandits.ArrayModel.hist_eq -
Bandits.ArrayModel.hist_add_one_eq_IicSuccProd' -
Bandits.ArrayModel.action -
Bandits.ArrayModel.action_zero -
Bandits.ArrayModel.action_add_one_eq -
Bandits.ArrayModel.reward -
Bandits.ArrayModel.reward_zero -
Bandits.ArrayModel.reward_add_one -
Bandits.ArrayModel.reward_eq -
Bandits.ArrayModel.sumRewards_eq -
Bandits.ArrayModel.measurable_action_add_one' -
Bandits.ArrayModel.measurable_pullCount'_action_add_one -
Bandits.ArrayModel.measurable_hist -
Bandits.ArrayModel.measurable_action -
Bandits.ArrayModel.measurable_reward -
Bandits.ArrayModel.hist_add_one_eq_IicSuccProd -
Bandits.ArrayModel.measurable_pullCount_action_add_one -
Bandits.ArrayModel.hist_congr -
Bandits.ArrayModel.stepsUntil_congr_aux -
Bandits.ArrayModel.stepsUntil_congr -
Bandits.ArrayModel.stepsUntil_indicator_congr -
Bandits.ArrayModel.measurable_hist_comap -
Bandits.ArrayModel.truePast -
Bandits.ArrayModel.truePast_eq_of_pullCount_eq -
Bandits.ArrayModel.truePast_eq_of_pullCount_eq_of_ne_zero -
Bandits.ArrayModel.measurable_hist_truePast -
Bandits.ArrayModel.measurable_action_add_one_truePast -
Bandits.ArrayModel.measurable_pullCount_add_one_truePast -
Bandits.ArrayModel.measurable_stepsUntil -
Bandits.ArrayModel.measurable_pullCount_action_add_one_hist -
Bandits.ArrayModel.map_snd_apply_arrayMeasure -
Bandits.ArrayModel.indepFun_fst_snd -
Bandits.ArrayModel.indepFun_fst_zero_snd_zero_action -
Bandits.ArrayModel.indepFun_fst_add_one_aux -
Bandits.ArrayModel.indepFun_fst_add_one_hist -
Bandits.ArrayModel.indepFun_snd_apply_aux -
Bandits.ArrayModel.indepFun_snd_apply_pullCount_action -
Bandits.ArrayModel.indepFun_cond_comp -
Bandits.ArrayModel.indepFun_snd_hist_cond -
Bandits.ArrayModel.hasLaw_action_zero -
Bandits.ArrayModel.hasCondDistrib_reward_zero -
Bandits.ArrayModel.hasCondDistrib_action' -
Bandits.ArrayModel.hasCondDistrib_reward_pullCount_action -
Bandits.ArrayModel.reward_ae_eq_cond -
Bandits.ArrayModel.hasCondDistrib_reward_hist_action_pullCount -
Bandits.ArrayModel.condIndepFun_reward_hist -
Bandits.ArrayModel.hasCondDistrib_reward' -
Bandits.ArrayModel.hasCondDistrib_action -
Bandits.ArrayModel.hasCondDistrib_reward -
Bandits.ArrayModel.isAlgEnvSeq_arrayMeasure