LeanMachineLearning exposition

3.1.ย SequentialLearning.FiniteActions๐Ÿ”—

Bookkeeping definitions for finite action space sequential learning problems

If the number of actions is finite, it makes sense to define the number of times each action was chosen, the time at which an action was chosen for the nth time, the value of the reward at that time, the sum of rewards obtained for each action, the empirical mean reward for each action, etc.

For each definition that take as arguments a time t : โ„•, a history h : โ„• โ†’ ๐“ ร— R, and possibly other parameters, we put the time and history at the end in this order, so that the definition can be seen as a stochastic process indexed by time t on the measurable space โ„• โ†’ ๐“ ร— R.

Module LeanMachineLearning.SequentialLearning.FiniteActions contains 103 exposed declarations.

pullCount๐Ÿ”—

DefinitionLearning.pullCount

Number of times action a was chosen up to time t (excluding t).

๐Ÿ”—def
Learning.pullCount.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] (A : โ„• โ†’ ฮฉ โ†’ ๐“) (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ) : โ„•
Learning.pullCount.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] (A : โ„• โ†’ ฮฉ โ†’ ๐“) (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ) : โ„•

Code

noncomputable
def pullCount (A : โ„• โ†’ ฮฉ โ†’ ๐“) (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ) : โ„• :=
  #(filter (fun s โ†ฆ A s ฯ‰ = a) (range t))
Used by (146)

Actions: Source ยท Open Issue

pullCount'๐Ÿ”—

DefinitionLearning.pullCount'

Number of pulls of arm a up to (and including) time n. This is the number of entries in h in which the arm is a.

๐Ÿ”—def
Learning.pullCount'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} [DecidableEq ๐“] (n : โ„•) (h : โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— R) (a : ๐“) : โ„•
Learning.pullCount'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} [DecidableEq ๐“] (n : โ„•) (h : โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— R) (a : ๐“) : โ„•

Code

noncomputable
def pullCount' (n : โ„•) (h : Iic n โ†’ ๐“ ร— R) (a : ๐“) := #{s | (h s).1 = a}
Used by (29)

Actions: Source ยท Open Issue

pullCount_zero๐Ÿ”—

LemmaLearning.pullCount_zero

No docstring.

๐Ÿ”—theorem
Learning.pullCount_zero.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) : pullCount A a 0 = 0
Learning.pullCount_zero.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) : pullCount A a 0 = 0

Code

lemma pullCount_zero (a : ๐“) : pullCount A a 0 = 0
Type uses (1)
Used by (15)

Actions: Source ยท Open Issue

Proof
by ext; simp [pullCount]

pullCount_zero_apply๐Ÿ”—

LemmaLearning.pullCount_zero_apply

No docstring.

๐Ÿ”—theorem
Learning.pullCount_zero_apply.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) (ฯ‰ : ฮฉ) : pullCount A a 0 ฯ‰ = 0
Learning.pullCount_zero_apply.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) (ฯ‰ : ฮฉ) : pullCount A a 0 ฯ‰ = 0

Code

lemma pullCount_zero_apply (a : ๐“) (ฯ‰ : ฮฉ) : pullCount A a 0 ฯ‰ = 0
Type uses (1)
Body uses (1)

Actions: Source ยท Open Issue

Proof
by simp

pullCount_one๐Ÿ”—

LemmaLearning.pullCount_one

No docstring.

๐Ÿ”—theorem
Learning.pullCount_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {ฯ‰ : ฮฉ} : pullCount A a 1 ฯ‰ = if A 0 ฯ‰ = a then 1 else 0
Learning.pullCount_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {ฯ‰ : ฮฉ} : pullCount A a 1 ฯ‰ = if A 0 ฯ‰ = a then 1 else 0

Code

lemma pullCount_one : pullCount A a 1 ฯ‰ = if A 0 ฯ‰ = a then 1 else 0
Type uses (1)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  simp only [pullCount, range_one]
  split_ifs with h
  ยท rw [card_eq_one]
    refine โŸจ0, by simp [h]โŸฉ
  ยท simp [h]

monotone_pullCount๐Ÿ”—

LemmaLearning.monotone_pullCount

No docstring.

๐Ÿ”—theorem
Learning.monotone_pullCount.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) (ฯ‰ : ฮฉ) : Monotone fun x => pullCount A a x ฯ‰
Learning.monotone_pullCount.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) (ฯ‰ : ฮฉ) : Monotone fun x => pullCount A a x ฯ‰

Code

lemma monotone_pullCount (a : ๐“) (ฯ‰ : ฮฉ) : Monotone (pullCount A a ยท ฯ‰)
Type uses (1)
Used by (5)

Actions: Source ยท Open Issue

Proof
fun _ _ _ โ†ฆ card_le_card (filter_subset_filter _ (by simpa))

pullCount_mono๐Ÿ”—

LemmaLearning.pullCount_mono

No docstring.

๐Ÿ”—theorem
Learning.pullCount_mono.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) {n m : โ„•} (hnm : n โ‰ค m) (ฯ‰ : ฮฉ) : pullCount A a n ฯ‰ โ‰ค pullCount A a m ฯ‰
Learning.pullCount_mono.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) {n m : โ„•} (hnm : n โ‰ค m) (ฯ‰ : ฮฉ) : pullCount A a n ฯ‰ โ‰ค pullCount A a m ฯ‰

Code

lemma pullCount_mono (a : ๐“) {n m : โ„•} (hnm : n โ‰ค m) (ฯ‰ : ฮฉ) :
    pullCount A a n ฯ‰ โ‰ค pullCount A a m ฯ‰
Type uses (1)
Body uses (1)
Used by (6)

Actions: Source ยท Open Issue

Proof
monotone_pullCount a ฯ‰ hnm

pullCount_action_eq_pullCount_add_one๐Ÿ”—

LemmaLearning.pullCount_action_eq_pullCount_add_one

No docstring.

๐Ÿ”—theorem
Learning.pullCount_action_eq_pullCount_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (t : โ„•) (ฯ‰ : ฮฉ) : pullCount A (A t ฯ‰) (t + 1) ฯ‰ = pullCount A (A t ฯ‰) t ฯ‰ + 1
Learning.pullCount_action_eq_pullCount_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (t : โ„•) (ฯ‰ : ฮฉ) : pullCount A (A t ฯ‰) (t + 1) ฯ‰ = pullCount A (A t ฯ‰) t ฯ‰ + 1

Code

lemma pullCount_action_eq_pullCount_add_one (t : โ„•) (ฯ‰ : ฮฉ) :
    pullCount A (A t ฯ‰) (t + 1) ฯ‰ = pullCount A (A t ฯ‰) t ฯ‰ + 1
Type uses (1)
Used by (7)

Actions: Source ยท Open Issue

Proof
by
  simp [pullCount, range_add_one, filter_insert]

pullCount_eq_pullCount_of_action_ne๐Ÿ”—

LemmaLearning.pullCount_eq_pullCount_of_action_ne

No docstring.

๐Ÿ”—theorem
Learning.pullCount_eq_pullCount_of_action_ne.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {t : โ„•} {ฯ‰ : ฮฉ} (ha : A t ฯ‰ โ‰  a) : pullCount A a (t + 1) ฯ‰ = pullCount A a t ฯ‰
Learning.pullCount_eq_pullCount_of_action_ne.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {t : โ„•} {ฯ‰ : ฮฉ} (ha : A t ฯ‰ โ‰  a) : pullCount A a (t + 1) ฯ‰ = pullCount A a t ฯ‰

Code

lemma pullCount_eq_pullCount_of_action_ne (ha : A t ฯ‰ โ‰  a) :
    pullCount A a (t + 1) ฯ‰ = pullCount A a t ฯ‰
Type uses (1)
Used by (5)

Actions: Source ยท Open Issue

Proof
by
  simp [pullCount, range_add_one, filter_insert, ha]

pullCount_add_one๐Ÿ”—

LemmaLearning.pullCount_add_one

No docstring.

๐Ÿ”—theorem
Learning.pullCount_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {t : โ„•} {ฯ‰ : ฮฉ} : pullCount A a (t + 1) ฯ‰ = pullCount A a t ฯ‰ + if A t ฯ‰ = a then 1 else 0
Learning.pullCount_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {t : โ„•} {ฯ‰ : ฮฉ} : pullCount A a (t + 1) ฯ‰ = pullCount A a t ฯ‰ + if A t ฯ‰ = a then 1 else 0

Code

lemma pullCount_add_one :
    pullCount A a (t + 1) ฯ‰ = pullCount A a t ฯ‰ + if A t ฯ‰ = a then 1 else 0
Type uses (1)
Body uses (2)
Used by (7)

Actions: Source ยท Open Issue

Proof
by
  split_ifs with h
  ยท rw [โ† h, pullCount_action_eq_pullCount_add_one]
  ยท rw [pullCount_eq_pullCount_of_action_ne h, add_zero]

pullCount_eq_sum๐Ÿ”—

LemmaLearning.pullCount_eq_sum

No docstring.

๐Ÿ”—theorem
Learning.pullCount_eq_sum.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ) : pullCount A a t ฯ‰ = โˆ‘ s โˆˆ Finset.range t, if A s ฯ‰ = a then 1 else 0
Learning.pullCount_eq_sum.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ) : pullCount A a t ฯ‰ = โˆ‘ s โˆˆ Finset.range t, if A s ฯ‰ = a then 1 else 0

Code

lemma pullCount_eq_sum (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ) :
    pullCount A a t ฯ‰ = โˆ‘ s โˆˆ range t, if A s ฯ‰ = a then 1 else 0
Type uses (1)
Used by (8)

Actions: Source ยท Open Issue

Proof
by simp [pullCount]

pullCount'_eq_sum๐Ÿ”—

LemmaLearning.pullCount'_eq_sum

No docstring.

๐Ÿ”—theorem
Learning.pullCount'_eq_sum.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} [DecidableEq ๐“] (n : โ„•) (h : โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— R) (a : ๐“) : pullCount' n h a = โˆ‘ s, if Prod.fst (h s) = a then 1 else 0
Learning.pullCount'_eq_sum.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} [DecidableEq ๐“] (n : โ„•) (h : โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— R) (a : ๐“) : pullCount' n h a = โˆ‘ s, if Prod.fst (h s) = a then 1 else 0

Code

lemma pullCount'_eq_sum (n : โ„•) (h : Iic n โ†’ ๐“ ร— R) (a : ๐“) :
    pullCount' n h a = โˆ‘ s : Iic n, if (h s).1 = a then 1 else 0
Type uses (1)
Used by (3)

Actions: Source ยท Open Issue

Proof
by simp [pullCount']

pullCount_add_one_eq_pullCount'๐Ÿ”—

LemmaLearning.pullCount_add_one_eq_pullCount'

No docstring.

๐Ÿ”—theorem
Learning.pullCount_add_one_eq_pullCount'.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {a : ๐“} {n : โ„•} {ฯ‰ : ฮฉ} : pullCount A a (n + 1) ฯ‰ = pullCount' n (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a
Learning.pullCount_add_one_eq_pullCount'.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {a : ๐“} {n : โ„•} {ฯ‰ : ฮฉ} : pullCount A a (n + 1) ฯ‰ = pullCount' n (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a

Code

lemma pullCount_add_one_eq_pullCount' {n : โ„•} {ฯ‰ : ฮฉ} :
    pullCount A a (n + 1) ฯ‰ = pullCount' n (fun i โ†ฆ (A i ฯ‰, R' i ฯ‰)) a
Type uses (2)
Body uses (2)
Used by (5)

Actions: Source ยท Open Issue

Proof
by
  rw [pullCount_eq_sum, pullCount'_eq_sum]
  rw [Finset.sum_coe_sort (f := fun s โ†ฆ if A s ฯ‰ = a then 1 else 0) (Iic n)]
  congr with m
  simp only [mem_range, mem_Iic]
  grind

pullCount_eq_pullCount'๐Ÿ”—

LemmaLearning.pullCount_eq_pullCount'

No docstring.

๐Ÿ”—theorem
Learning.pullCount_eq_pullCount'.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {a : ๐“} {n : โ„•} {ฯ‰ : ฮฉ} (hn : n โ‰  0) : pullCount A a n ฯ‰ = pullCount' (n - 1) (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a
Learning.pullCount_eq_pullCount'.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {a : ๐“} {n : โ„•} {ฯ‰ : ฮฉ} (hn : n โ‰  0) : pullCount A a n ฯ‰ = pullCount' (n - 1) (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a

Code

lemma pullCount_eq_pullCount' {n : โ„•} {ฯ‰ : ฮฉ} (hn : n โ‰  0) :
    pullCount A a n ฯ‰ = pullCount' (n - 1) (fun i โ†ฆ (A i ฯ‰, R' i ฯ‰)) a
Type uses (2)
Body uses (1)
Used by (4)

Actions: Source ยท Open Issue

Proof
by
  cases n with
  | zero => exact absurd rfl hn
  | succ n => simp [pullCount_add_one_eq_pullCount' (R' := R')]

pullCount'_mono๐Ÿ”—

LemmaLearning.pullCount'_mono

No docstring.

๐Ÿ”—theorem
Learning.pullCount'_mono.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {a : ๐“} {ฯ‰ : ฮฉ} {n m : โ„•} (hnm : n โ‰ค m) : pullCount' n (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a โ‰ค pullCount' m (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a
Learning.pullCount'_mono.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {a : ๐“} {ฯ‰ : ฮฉ} {n m : โ„•} (hnm : n โ‰ค m) : pullCount' n (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a โ‰ค pullCount' m (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a

Code

lemma pullCount'_mono {n m : โ„•} (hnm : n โ‰ค m) :
    pullCount' n (fun i โ†ฆ (A i ฯ‰, R' i ฯ‰)) a โ‰ค pullCount' m (fun i โ†ฆ (A i ฯ‰, R' i ฯ‰)) a
Type uses (1)
Body uses (3)

Actions: Source ยท Open Issue

Proof
by
  rw [โ† pullCount_add_one_eq_pullCount', โ† pullCount_add_one_eq_pullCount']
  exact pullCount_mono a (by lia) _

pullCount_le๐Ÿ”—

LemmaLearning.pullCount_le

No docstring.

๐Ÿ”—theorem
Learning.pullCount_le.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ) : pullCount A a t ฯ‰ โ‰ค t
Learning.pullCount_le.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ) : pullCount A a t ฯ‰ โ‰ค t

Code

lemma pullCount_le (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ) : pullCount A a t ฯ‰ โ‰ค t
Type uses (1)
Used by (4)

Actions: Source ยท Open Issue

Proof
(card_filter_le _ _).trans_eq (by simp)

pullCount_congr๐Ÿ”—

LemmaLearning.pullCount_congr

No docstring.

๐Ÿ”—theorem
Learning.pullCount_congr.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {n : โ„•} {ฯ‰ ฯ‰' : ฮฉ} (h_eq : โˆ€ i โ‰ค n, A i ฯ‰ = A i ฯ‰') : pullCount A a (n + 1) ฯ‰ = pullCount A a (n + 1) ฯ‰'
Learning.pullCount_congr.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {n : โ„•} {ฯ‰ ฯ‰' : ฮฉ} (h_eq : โˆ€ i โ‰ค n, A i ฯ‰ = A i ฯ‰') : pullCount A a (n + 1) ฯ‰ = pullCount A a (n + 1) ฯ‰'

Code

lemma pullCount_congr {ฯ‰' : ฮฉ} (h_eq : โˆ€ i โ‰ค n, A i ฯ‰ = A i ฯ‰') :
    pullCount A a (n + 1) ฯ‰ = pullCount A a (n + 1) ฯ‰'
Type uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  unfold pullCount
  congr 1 with s
  simp only [mem_filter, mem_range, and_congr_right_iff]
  intro hs
  rw [Nat.lt_add_one_iff] at hs
  rw [h_eq s hs]

pullCount_lt_of_forall_ne๐Ÿ”—

LemmaLearning.pullCount_lt_of_forall_ne

No docstring.

๐Ÿ”—theorem
Learning.pullCount_lt_of_forall_ne.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {n t : โ„•} {ฯ‰ : ฮฉ} (h_lt : โˆ€ (s : โ„•), pullCount A a (s + 1) ฯ‰ โ‰  t) (ht : t โ‰  0) : pullCount A a n ฯ‰ < t
Learning.pullCount_lt_of_forall_ne.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {n t : โ„•} {ฯ‰ : ฮฉ} (h_lt : โˆ€ (s : โ„•), pullCount A a (s + 1) ฯ‰ โ‰  t) (ht : t โ‰  0) : pullCount A a n ฯ‰ < t

Code

lemma pullCount_lt_of_forall_ne (h_lt : โˆ€ s, pullCount A a (s + 1) ฯ‰ โ‰  t) (ht : t โ‰  0) :
    pullCount A a n ฯ‰ < t
Type uses (1)
Body uses (2)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  induction n with
  | zero => simpa using ht.bot_lt
  | succ n hn =>
    specialize h_lt n
    rw [pullCount_add_one] at h_lt โŠข
    grind

exists_pullCount_eq_of_le๐Ÿ”—

LemmaLearning.exists_pullCount_eq_of_le

No docstring.

๐Ÿ”—theorem
Learning.exists_pullCount_eq_of_le.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {n t : โ„•} {ฯ‰ : ฮฉ} (hnm : t โ‰ค pullCount A a (n + 1) ฯ‰) (ht : t โ‰  0) : โˆƒ s, pullCount A a (s + 1) ฯ‰ = t
Learning.exists_pullCount_eq_of_le.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {n t : โ„•} {ฯ‰ : ฮฉ} (hnm : t โ‰ค pullCount A a (n + 1) ฯ‰) (ht : t โ‰  0) : โˆƒ s, pullCount A a (s + 1) ฯ‰ = t

Code

lemma exists_pullCount_eq_of_le (hnm : t โ‰ค pullCount A a (n + 1) ฯ‰) (ht : t โ‰  0) :
    โˆƒ s, pullCount A a (s + 1) ฯ‰ = t
Type uses (1)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  by_contra! h_contra
  refine lt_irrefl (pullCount A a (n + 1) ฯ‰) ?_
  refine lt_of_lt_of_le ?_ hnm
  exact pullCount_lt_of_forall_ne h_contra ht

pullCount_le_add๐Ÿ”—

LemmaLearning.pullCount_le_add

No docstring.

๐Ÿ”—theorem
Learning.pullCount_le_add.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) (n C : โ„•) (ฯ‰ : ฮฉ) : pullCount A a n ฯ‰ โ‰ค C + 1 + โˆ‘ s โˆˆ Finset.range n, Set.indicator {s | A s ฯ‰ = a โˆง C < pullCount A a s ฯ‰} 1 s
Learning.pullCount_le_add.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) (n C : โ„•) (ฯ‰ : ฮฉ) : pullCount A a n ฯ‰ โ‰ค C + 1 + โˆ‘ s โˆˆ Finset.range n, Set.indicator {s | A s ฯ‰ = a โˆง C < pullCount A a s ฯ‰} 1 s

Code

lemma pullCount_le_add (a : ๐“) (n C : โ„•) (ฯ‰ : ฮฉ) :
    pullCount A a n ฯ‰ โ‰ค C + 1 +
      โˆ‘ s โˆˆ range n, {s | A s ฯ‰ = a โˆง C < pullCount A a s ฯ‰}.indicator 1 s
Type uses (1)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  rw [pullCount_eq_sum]
  calc โˆ‘ s โˆˆ range n, if A s ฯ‰ = a then 1 else 0
  _ โ‰ค โˆ‘ s โˆˆ range n, ({s | A s ฯ‰ = a โˆง pullCount A a s ฯ‰ โ‰ค C}.indicator 1 s +
      {s | A s ฯ‰ = a โˆง C < pullCount A a s ฯ‰}.indicator 1 s) := by
    gcongr with s hs
    simp [Set.indicator_apply]
    grind
  _ = โˆ‘ s โˆˆ range n, {s | A s ฯ‰ = a โˆง pullCount A a s ฯ‰ โ‰ค C}.indicator 1 s +
      โˆ‘ s โˆˆ range n, {s | A s ฯ‰ = a โˆง C < pullCount A a s ฯ‰}.indicator 1 s := by
    rw [Finset.sum_add_distrib]
  _ โ‰ค C + 1 + โˆ‘ s โˆˆ range n, {s | A s ฯ‰ = a โˆง C < pullCount A a s ฯ‰}.indicator 1 s := by
    gcongr
    have h_le n : โˆ‘ s โˆˆ range n, {s | A s ฯ‰ = a โˆง pullCount A a s ฯ‰ โ‰ค C}.indicator 1 s โ‰ค
        pullCount A a n ฯ‰ := by
      rw [pullCount_eq_sum]
      gcongr with s hs
      simp only [Set.indicator_apply, Set.mem_setOf_eq, Pi.one_apply]
      grind
    induction n with
    | zero => simp
    | succ n hn =>
      rw [Finset.sum_range_succ]
      rcases le_or_gt (pullCount A a n ฯ‰) C with h_pc | h_pc
      ยท have hn' : โˆ‘ s โˆˆ range n, {s | A s ฯ‰ = a โˆง pullCount A a s ฯ‰ โ‰ค C}.indicator 1 s โ‰ค C :=
          (h_le n).trans h_pc
        grw [hn']
        gcongr
        simp only [Set.indicator_apply, Set.mem_setOf_eq, Pi.one_apply]
        grind
      ยท refine le_trans ?_ hn
        simp [h_pc]

measurable_pullCount๐Ÿ”—

LemmaLearning.measurable_pullCount

No docstring.

๐Ÿ”—theorem
Learning.measurable_pullCount.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (a : ๐“) (t : โ„•) : Measurable fun ฯ‰ => pullCount A a t ฯ‰
Learning.measurable_pullCount.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (a : ๐“) (t : โ„•) : Measurable fun ฯ‰ => pullCount A a t ฯ‰

Code

lemma measurable_pullCount [MeasurableSingletonClass ๐“] (hA : โˆ€ n, Measurable (A n))
    (a : ๐“) (t : โ„•) :
    Measurable (fun ฯ‰ : ฮฉ โ†ฆ pullCount A a t ฯ‰)
Type uses (1)
Body uses (1)
Used by (15)

Actions: Source ยท Open Issue

Proof
by
  simp_rw [pullCount_eq_sum]
  have h_meas s : Measurable (fun ฯ‰ : ฮฉ โ†ฆ if A s ฯ‰ = a then 1 else 0) := by
    refine Measurable.ite ?_ (by fun_prop) (by fun_prop)
    exact (measurableSet_singleton _).preimage (by fun_prop)
  fun_prop

measurable_uncurry_pullCount๐Ÿ”—

LemmaLearning.measurable_uncurry_pullCount

No docstring.

๐Ÿ”—theorem
Learning.measurable_uncurry_pullCount.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableEq ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (t : โ„•) : Measurable fun p => pullCount A (Prod.snd p) t (Prod.fst p)
Learning.measurable_uncurry_pullCount.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableEq ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (t : โ„•) : Measurable fun p => pullCount A (Prod.snd p) t (Prod.fst p)

Code

lemma measurable_uncurry_pullCount [MeasurableEq ๐“]
    (hA : โˆ€ n, Measurable (A n)) (t : โ„•) :
    Measurable (fun p : ฮฉ ร— ๐“ โ†ฆ pullCount A p.2 t p.1)
Type uses (1)
Body uses (1)
Used by (5)

Actions: Source ยท Open Issue

Proof
by
  simp_rw [pullCount_eq_sum]
  have h_meas s : Measurable (fun h : ฮฉ ร— ๐“ โ†ฆ if A s h.1 = h.2 then 1 else 0) := by
    refine Measurable.ite ?_ (by fun_prop) (by fun_prop)
    exact measurableSet_eq_fun (by fun_prop) (by fun_prop)
  fun_prop

measurable_uncurry_pullCount_comp๐Ÿ”—

LemmaLearning.measurable_uncurry_pullCount_comp

No docstring.

๐Ÿ”—theorem
Learning.measurable_uncurry_pullCount_comp.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [Countable ๐“] [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) {f : ฮฉ โ†’ ๐“} (hf : Measurable f) {g : ฮฉ โ†’ โ„•} (hg : Measurable g) : Measurable fun ฯ‰ => pullCount A (f ฯ‰) (g ฯ‰) ฯ‰
Learning.measurable_uncurry_pullCount_comp.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [Countable ๐“] [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) {f : ฮฉ โ†’ ๐“} (hf : Measurable f) {g : ฮฉ โ†’ โ„•} (hg : Measurable g) : Measurable fun ฯ‰ => pullCount A (f ฯ‰) (g ฯ‰) ฯ‰

Code

lemma measurable_uncurry_pullCount_comp [Countable ๐“] [MeasurableSingletonClass ๐“]
    (hA : โˆ€ n, Measurable (A n)) {f : ฮฉ โ†’ ๐“} (hf : Measurable f) {g : ฮฉ โ†’ โ„•} (hg : Measurable g) :
    Measurable (fun ฯ‰ โ†ฆ pullCount A (f ฯ‰) (g ฯ‰) ฯ‰)
Type uses (1)
Body uses (1)
Used by (6)

Actions: Source ยท Open Issue

Proof
by
  change Measurable ((fun aฯ‰ โ†ฆ pullCount A aฯ‰.1 (g aฯ‰.2) aฯ‰.2) โˆ˜ fun ฯ‰ โ†ฆ (f ฯ‰, ฯ‰))
  apply Measurable.comp _ (by fun_prop)
  refine measurable_from_prod_countable_right fun a โ†ฆ ?_
  change Measurable ((fun tฯ‰ โ†ฆ pullCount A a tฯ‰.1 tฯ‰.2) โˆ˜ fun ฯ‰ โ†ฆ (g ฯ‰, ฯ‰))
  apply Measurable.comp _ (by fun_prop)
  exact measurable_from_prod_countable_right (fun t โ†ฆ measurable_pullCount hA a t)

measurable_pullCount'๐Ÿ”—

LemmaLearning.measurable_pullCount'

No docstring.

๐Ÿ”—theorem
Learning.measurable_pullCount'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [DecidableEq ๐“] [MeasurableSingletonClass ๐“] (n : โ„•) (a : ๐“) : Measurable fun h => pullCount' n h a
Learning.measurable_pullCount'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [DecidableEq ๐“] [MeasurableSingletonClass ๐“] (n : โ„•) (a : ๐“) : Measurable fun h => pullCount' n h a

Code

lemma measurable_pullCount' [MeasurableSingletonClass ๐“] (n : โ„•) (a : ๐“) :
    Measurable (fun h : Iic n โ†’ ๐“ ร— R โ†ฆ pullCount' n h a)
Type uses (1)
Body uses (1)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  simp_rw [pullCount'_eq_sum]
  have h_meas s : Measurable (fun (h : Iic n โ†’ ๐“ ร— R) โ†ฆ if (h s).1 = a then 1 else 0) := by
    refine Measurable.ite ?_ (by fun_prop) (by fun_prop)
    exact (measurableSet_singleton _).preimage (by fun_prop)
  fun_prop

measurable_uncurry_pullCount'๐Ÿ”—

LemmaLearning.measurable_uncurry_pullCount'

No docstring.

๐Ÿ”—theorem
Learning.measurable_uncurry_pullCount'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [DecidableEq ๐“] [MeasurableEq ๐“] (n : โ„•) : Measurable fun p => pullCount' n (Prod.fst p) (Prod.snd p)
Learning.measurable_uncurry_pullCount'.{u_1, u_2} {๐“ : Type u_1} {R : Type u_2} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} [DecidableEq ๐“] [MeasurableEq ๐“] (n : โ„•) : Measurable fun p => pullCount' n (Prod.fst p) (Prod.snd p)

Code

lemma measurable_uncurry_pullCount' [MeasurableEq ๐“] (n : โ„•) :
    Measurable (fun p : (Iic n โ†’ ๐“ ร— R) ร— ๐“ โ†ฆ pullCount' n p.1 p.2)
Type uses (1)
Body uses (1)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  simp_rw [pullCount'_eq_sum]
  have h_meas s : Measurable (fun h : (Iic n โ†’ ๐“ ร— R) ร— ๐“ โ†ฆ if (h.1 s).1 = h.2 then 1 else 0) := by
    refine Measurable.ite ?_ (by fun_prop) (by fun_prop)
    exact measurableSet_eq_fun (by fun_prop) (by fun_prop)
  fun_prop

adapted_pullCount_add_one๐Ÿ”—

LemmaLearning.adapted_pullCount_add_one

No docstring.

๐Ÿ”—theorem
Learning.adapted_pullCount_add_one.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) : MeasureTheory.Adapted (IsAlgEnvSeq.filtration hA hR') fun n => pullCount A a (n + 1)
Learning.adapted_pullCount_add_one.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) : MeasureTheory.Adapted (IsAlgEnvSeq.filtration hA hR') fun n => pullCount A a (n + 1)

Code

lemma adapted_pullCount_add_one [MeasurableSingletonClass ๐“]
    (hA : โˆ€ n, Measurable (A n)) (hR' : โˆ€ n, Measurable (R' n)) (a : ๐“) :
    Adapted (IsAlgEnvSeq.filtration hA hR') (fun n โ†ฆ pullCount A a (n + 1))
Type uses (2)
Body uses (5)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  intro n
  have : pullCount A a (n + 1) = (fun h : Iic n โ†’ ๐“ ร— R โ†ฆ pullCount' n h a) โˆ˜
      (history A R' n) := by
    ext
    exact pullCount_add_one_eq_pullCount'
  rw [measurable_iff_comap_le]
  simp_rw [IsAlgEnvSeq.filtration, this]
  rw [โ† measurable_iff_comap_le]
  exact measurable_comp_comap _ (measurable_pullCount' n a)

stronglyAdapted_pullCount_add_one๐Ÿ”—

LemmaLearning.stronglyAdapted_pullCount_add_one

No docstring.

๐Ÿ”—theorem
Learning.stronglyAdapted_pullCount_add_one.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) : MeasureTheory.StronglyAdapted (IsAlgEnvSeq.filtration hA hR') fun n => pullCount A a (n + 1)
Learning.stronglyAdapted_pullCount_add_one.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) : MeasureTheory.StronglyAdapted (IsAlgEnvSeq.filtration hA hR') fun n => pullCount A a (n + 1)

Code

lemma stronglyAdapted_pullCount_add_one [MeasurableSingletonClass ๐“]
    (hA : โˆ€ n, Measurable (A n)) (hR' : โˆ€ n, Measurable (R' n)) (a : ๐“) :
    StronglyAdapted (IsAlgEnvSeq.filtration hA hR') (fun n โ†ฆ pullCount A a (n + 1))
Type uses (2)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
(adapted_pullCount_add_one hA hR' a).stronglyAdapted

isPredictable_pullCount๐Ÿ”—

LemmaLearning.isPredictable_pullCount

No docstring.

๐Ÿ”—theorem
Learning.isPredictable_pullCount.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) : MeasureTheory.IsStronglyPredictable (IsAlgEnvSeq.filtration hA hR') (pullCount A a)
Learning.isPredictable_pullCount.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) : MeasureTheory.IsStronglyPredictable (IsAlgEnvSeq.filtration hA hR') (pullCount A a)

Code

lemma isPredictable_pullCount [MeasurableSingletonClass ๐“]
    (hA : โˆ€ n, Measurable (A n)) (hR' : โˆ€ n, Measurable (R' n)) (a : ๐“) :
    IsStronglyPredictable (IsAlgEnvSeq.filtration hA hR') (pullCount A a)
Type uses (2)
Body uses (2)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  rw [IsStronglyPredictable.iff_measurable_add_one]
  refine โŸจ?_, stronglyAdapted_pullCount_add_one hA hR' aโŸฉ
  simp only [pullCount_zero]
  fun_prop

integrable_pullCount๐Ÿ”—

LemmaLearning.integrable_pullCount

No docstring.

๐Ÿ”—theorem
Learning.integrable_pullCount.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (a : ๐“) (n : โ„•) : MeasureTheory.Integrable (fun ฯ‰ => โ†‘(pullCount A a n ฯ‰)) P
Learning.integrable_pullCount.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (a : ๐“) (n : โ„•) : MeasureTheory.Integrable (fun ฯ‰ => โ†‘(pullCount A a n ฯ‰)) P

Code

lemma integrable_pullCount [MeasurableSingletonClass ๐“]
    (hA : โˆ€ n, Measurable (A n)) (a : ๐“) (n : โ„•) :
    Integrable (fun ฯ‰ โ†ฆ (pullCount A a n ฯ‰ : โ„)) P
Type uses (1)
Body uses (2)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  refine integrable_of_le_of_le (gโ‚ := 0) (gโ‚‚ := fun _ โ†ฆ n) (by fun_prop)
    (ae_of_all _ fun ฯ‰ โ†ฆ by simp) (ae_of_all _ fun ฯ‰ โ†ฆ ?_) (integrable_const _) (integrable_const _)
  simp only [Nat.cast_le]
  exact pullCount_le a n ฯ‰

stepsUntil๐Ÿ”—

DefinitionLearning.stepsUntil

Number of steps until action a was pulled exactly m times.

๐Ÿ”—def
Learning.stepsUntil.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] (A : โ„• โ†’ ฮฉ โ†’ ๐“) (a : ๐“) (m : โ„•) (ฯ‰ : ฮฉ) : โ„•โˆž
Learning.stepsUntil.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] (A : โ„• โ†’ ฮฉ โ†’ ๐“) (a : ๐“) (m : โ„•) (ฯ‰ : ฮฉ) : โ„•โˆž

Code

noncomputable
def stepsUntil (A : โ„• โ†’ ฮฉ โ†’ ๐“) (a : ๐“) (m : โ„•) (ฯ‰ : ฮฉ) : โ„•โˆž :=
  sInf ((โ†‘) '' {s | pullCount A a (s + 1) ฯ‰ = m})
Body uses (1)
Used by (46)

Actions: Source ยท Open Issue

stepsUntil_eq_top_iff๐Ÿ”—

LemmaLearning.stepsUntil_eq_top_iff

No docstring.

๐Ÿ”—theorem
Learning.stepsUntil_eq_top_iff.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} : stepsUntil A a m ฯ‰ = โŠค โ†” โˆ€ (s : โ„•), pullCount A a (s + 1) ฯ‰ โ‰  m
Learning.stepsUntil_eq_top_iff.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} : stepsUntil A a m ฯ‰ = โŠค โ†” โˆ€ (s : โ„•), pullCount A a (s + 1) ฯ‰ โ‰  m

Code

lemma stepsUntil_eq_top_iff : stepsUntil A a m ฯ‰ = โŠค โ†” โˆ€ s, pullCount A a (s + 1) ฯ‰ โ‰  m
Type uses (2)
Used by (4)

Actions: Source ยท Open Issue

Proof
by
  simp [stepsUntil, sInf_eq_top]

stepsUntil_ne_top๐Ÿ”—

LemmaLearning.stepsUntil_ne_top

No docstring.

๐Ÿ”—theorem
Learning.stepsUntil_ne_top.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} (h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) : stepsUntil A a m ฯ‰ โ‰  โŠค
Learning.stepsUntil_ne_top.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} (h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) : stepsUntil A a m ฯ‰ โ‰  โŠค

Code

lemma stepsUntil_ne_top (h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) : stepsUntil A a m ฯ‰ โ‰  โŠค
Type uses (2)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  simpa [stepsUntil_eq_top_iff]

exists_pullCount_eq๐Ÿ”—

LemmaLearning.exists_pullCount_eq

No docstring.

๐Ÿ”—theorem
Learning.exists_pullCount_eq.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} (h' : stepsUntil A a m ฯ‰ โ‰  โŠค) : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m
Learning.exists_pullCount_eq.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} (h' : stepsUntil A a m ฯ‰ โ‰  โŠค) : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m

Code

lemma exists_pullCount_eq (h' : stepsUntil A a m ฯ‰ โ‰  โŠค) :
    โˆƒ s, pullCount A a (s + 1) ฯ‰ = m
Type uses (2)
Body uses (1)
Used by (5)

Actions: Source ยท Open Issue

Proof
by
  by_contra! h_contra
  rw [โ† stepsUntil_eq_top_iff] at h_contra
  simp [h_contra] at h'

stepsUntil_zero_of_ne๐Ÿ”—

LemmaLearning.stepsUntil_zero_of_ne

No docstring.

๐Ÿ”—theorem
Learning.stepsUntil_zero_of_ne.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {ฯ‰ : ฮฉ} (hka : A 0 ฯ‰ โ‰  a) : stepsUntil A a 0 ฯ‰ = 0
Learning.stepsUntil_zero_of_ne.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {ฯ‰ : ฮฉ} (hka : A 0 ฯ‰ โ‰  a) : stepsUntil A a 0 ฯ‰ = 0

Code

lemma stepsUntil_zero_of_ne (hka : A 0 ฯ‰ โ‰  a) : stepsUntil A a 0 ฯ‰ = 0
Type uses (1)
Body uses (3)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  unfold stepsUntil
  simp_rw [โ† bot_eq_zero, sInf_eq_bot, bot_eq_zero]
  intro n hn
  refine โŸจ0, ?_, hnโŸฉ
  simp only [Set.mem_image, Set.mem_setOf_eq, Nat.cast_eq_zero, exists_eq_right, zero_add]
  rw [โ† zero_add 1, pullCount_eq_pullCount_of_action_ne hka]
  simp

stepsUntil_zero_of_eq๐Ÿ”—

LemmaLearning.stepsUntil_zero_of_eq

No docstring.

๐Ÿ”—theorem
Learning.stepsUntil_zero_of_eq.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {ฯ‰ : ฮฉ} (hka : A 0 ฯ‰ = a) : stepsUntil A a 0 ฯ‰ = โŠค
Learning.stepsUntil_zero_of_eq.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {ฯ‰ : ฮฉ} (hka : A 0 ฯ‰ = a) : stepsUntil A a 0 ฯ‰ = โŠค

Code

lemma stepsUntil_zero_of_eq (hka : A 0 ฯ‰ = a) : stepsUntil A a 0 ฯ‰ = โŠค
Type uses (1)
Body uses (5)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  rw [stepsUntil_eq_top_iff]
  suffices 0 < pullCount A a 1 ฯ‰ from fun _ โ†ฆ (this.trans_le (monotone_pullCount _ _ (by lia))).ne'
  rw [โ† hka, โ† zero_add 1, pullCount_action_eq_pullCount_add_one]
  simp

stepsUntil_eq_dite๐Ÿ”—

LemmaLearning.stepsUntil_eq_dite

No docstring.

๐Ÿ”—theorem
Learning.stepsUntil_eq_dite.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) (m : โ„•) (ฯ‰ : ฮฉ) [Decidable (โˆƒ s, pullCount A a (s + 1) ฯ‰ = m)] : stepsUntil A a m ฯ‰ = if h : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m then โ†‘(Nat.find h) else โŠค
Learning.stepsUntil_eq_dite.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) (m : โ„•) (ฯ‰ : ฮฉ) [Decidable (โˆƒ s, pullCount A a (s + 1) ฯ‰ = m)] : stepsUntil A a m ฯ‰ = if h : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m then โ†‘(Nat.find h) else โŠค

Code

lemma stepsUntil_eq_dite (a : ๐“) (m : โ„•) (ฯ‰ : ฮฉ)
    [Decidable (โˆƒ s, pullCount A a (s + 1) ฯ‰ = m)] :
    stepsUntil A a m ฯ‰ =
      if h : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m then (Nat.find h : โ„•โˆž) else โŠค
Type uses (2)
Used by (7)

Actions: Source ยท Open Issue

Proof
by
  unfold stepsUntil
  split_ifs with h'
  ยท refine le_antisymm ?_ ?_
    ยท refine sInf_le ?_
      simpa using Nat.find_spec h'
    ยท simp only [le_sInf_iff, Set.mem_image, Set.mem_setOf_eq, forall_exists_index, and_imp,
        forall_apply_eq_imp_iffโ‚‚, Nat.cast_le, Nat.find_le_iff]
      exact fun n hn โ†ฆ โŸจn, le_rfl, hnโŸฉ
  ยท push Not at h'
    suffices {s | pullCount A a (s + 1) ฯ‰ = m} = โˆ… by simp [this]
    ext s
    simpa using (h' s)

stepsUntil_eq_leastGE๐Ÿ”—

LemmaLearning.stepsUntil_eq_leastGE

No docstring.

๐Ÿ”—theorem
Learning.stepsUntil_eq_leastGE.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {m : โ„•} (a : ๐“) (hm : m โ‰  0) : stepsUntil A a m = MeasureTheory.leastGE (fun n ฯ‰ => pullCount A a (n + 1) ฯ‰) m
Learning.stepsUntil_eq_leastGE.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {m : โ„•} (a : ๐“) (hm : m โ‰  0) : stepsUntil A a m = MeasureTheory.leastGE (fun n ฯ‰ => pullCount A a (n + 1) ฯ‰) m

Code

lemma stepsUntil_eq_leastGE (a : ๐“) (hm : m โ‰  0) :
    stepsUntil A a m = leastGE (fun n (ฯ‰ : ฮฉ) โ†ฆ pullCount A a (n + 1) ฯ‰) m
Type uses (2)
Body uses (3)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  classical
  ext ฯ‰
  rw [stepsUntil_eq_dite]
  unfold leastGE hittingAfter
  simp only [Nat.bot_eq_zero, zero_le, Set.mem_Ici, true_and, ENat.some_eq_coe]
  have h_iff : (โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) โ†” (โˆƒ s, m โ‰ค pullCount A a (s + 1) ฯ‰) := by
    refine โŸจfun โŸจs, hsโŸฉ โ†ฆ โŸจs, hs.geโŸฉ, fun โŸจs, hsโŸฉ โ†ฆ ?_โŸฉ
    exact exists_pullCount_eq_of_le hs hm
  by_cases h_exists : โˆƒ s, m โ‰ค pullCount A a (s + 1) ฯ‰
  swap; ยท simp_rw [h_iff]; simp [h_exists]
  rw [if_pos h_exists, dif_pos]
  swap; ยท rwa [h_iff]
  norm_cast
  rw [Nat.find_eq_iff]
  constructor
  ยท apply le_antisymm
    ยท by_contra! h_contra
      obtain โŸจs, hsโŸฉ : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m := exists_pullCount_eq_of_le h_contra.le hm
      rw [โ† hs] at h_contra
      refine h_contra.not_ge ?_
      gcongr
      exact csInf_le (by simp) (by simp)
    ยท exact Nat.sInf_mem (s := {j | m โ‰ค pullCount A a (j + 1) ฯ‰}) h_exists
  ยท intro n hn h_contra
    refine hn.not_ge ?_
    exact csInf_le (by simp) (by simp [h_contra])

stepsUntil_mono๐Ÿ”—

LemmaLearning.stepsUntil_mono

No docstring.

๐Ÿ”—theorem
Learning.stepsUntil_mono.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) (ฯ‰ : ฮฉ) {n m : โ„•} (hn : n โ‰  0) (hnm : n โ‰ค m) : stepsUntil A a n ฯ‰ โ‰ค stepsUntil A a m ฯ‰
Learning.stepsUntil_mono.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) (ฯ‰ : ฮฉ) {n m : โ„•} (hn : n โ‰  0) (hnm : n โ‰ค m) : stepsUntil A a n ฯ‰ โ‰ค stepsUntil A a m ฯ‰

Code

lemma stepsUntil_mono (a : ๐“) (ฯ‰ : ฮฉ) {n m : โ„•} (hn : n โ‰  0) (hnm : n โ‰ค m) :
    stepsUntil A a n ฯ‰ โ‰ค stepsUntil A a m ฯ‰
Type uses (1)
Body uses (2)

Actions: Source ยท Open Issue

Proof
by
  rw [stepsUntil_eq_leastGE a hn, stepsUntil_eq_leastGE a (by lia)]
  simp_rw [leastGE]
  exact hittingAfter_anti (fun n ฯ‰ โ†ฆ (pullCount A a (n + 1) ฯ‰)) 0 (fun x โ†ฆ by grind) ฯ‰

stepsUntil_pullCount_le๐Ÿ”—

LemmaLearning.stepsUntil_pullCount_le

No docstring.

๐Ÿ”—theorem
Learning.stepsUntil_pullCount_le.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (ฯ‰ : ฮฉ) (a : ๐“) (t : โ„•) : stepsUntil A a (pullCount A a (t + 1) ฯ‰) ฯ‰ โ‰ค โ†‘t
Learning.stepsUntil_pullCount_le.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (ฯ‰ : ฮฉ) (a : ๐“) (t : โ„•) : stepsUntil A a (pullCount A a (t + 1) ฯ‰) ฯ‰ โ‰ค โ†‘t

Code

lemma stepsUntil_pullCount_le (ฯ‰ : ฮฉ) (a : ๐“) (t : โ„•) :
    stepsUntil A a (pullCount A a (t + 1) ฯ‰) ฯ‰ โ‰ค t
Type uses (2)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  rw [stepsUntil]
  exact csInf_le (OrderBot.bddBelow _) โŸจt, rfl, rflโŸฉ

stepsUntil_pullCount_eq๐Ÿ”—

LemmaLearning.stepsUntil_pullCount_eq

No docstring.

๐Ÿ”—theorem
Learning.stepsUntil_pullCount_eq.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (ฯ‰ : ฮฉ) (t : โ„•) : stepsUntil A (A t ฯ‰) (pullCount A (A t ฯ‰) (t + 1) ฯ‰) ฯ‰ = โ†‘t
Learning.stepsUntil_pullCount_eq.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (ฯ‰ : ฮฉ) (t : โ„•) : stepsUntil A (A t ฯ‰) (pullCount A (A t ฯ‰) (t + 1) ฯ‰) ฯ‰ = โ†‘t

Code

lemma stepsUntil_pullCount_eq (ฯ‰ : ฮฉ) (t : โ„•) :
    stepsUntil A (A t ฯ‰) (pullCount A (A t ฯ‰) (t + 1) ฯ‰) ฯ‰ = t
Type uses (2)
Body uses (3)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  apply le_antisymm (stepsUntil_pullCount_le ฯ‰ (A t ฯ‰) t)
  suffices โˆ€ t', pullCount A (A t ฯ‰) (t' + 1) ฯ‰ = pullCount A (A t ฯ‰) t ฯ‰ + 1 โ†’ t โ‰ค t' by
    simpa [stepsUntil, pullCount_action_eq_pullCount_add_one]
  exact fun t' h' โ†ฆ Nat.le_of_lt_succ ((monotone_pullCount (A t ฯ‰) ฯ‰).reflect_lt
    (h' โ–ธ lt_add_one _))

stepsUntil_one_of_eq๐Ÿ”—

LemmaLearning.stepsUntil_one_of_eq

If we pull action a at time 0, the first time at which it is pulled once is 0.

๐Ÿ”—theorem
Learning.stepsUntil_one_of_eq.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {ฯ‰ : ฮฉ} (hka : A 0 ฯ‰ = a) : stepsUntil A a 1 ฯ‰ = 0
Learning.stepsUntil_one_of_eq.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {ฯ‰ : ฮฉ} (hka : A 0 ฯ‰ = a) : stepsUntil A a 1 ฯ‰ = 0

Code

lemma stepsUntil_one_of_eq (hka : A 0 ฯ‰ = a) : stepsUntil A a 1 ฯ‰ = 0
Type uses (1)
Body uses (3)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  classical
  have h_pull : pullCount A a 1 ฯ‰ = 1 := by simp [pullCount_one, hka]
  have h_le := stepsUntil_pullCount_le (A := A) ฯ‰ a 0
  simpa [h_pull] using h_le

stepsUntil_eq_zero_iff๐Ÿ”—

LemmaLearning.stepsUntil_eq_zero_iff

No docstring.

๐Ÿ”—theorem
Learning.stepsUntil_eq_zero_iff.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} : stepsUntil A a m ฯ‰ = 0 โ†” m = 0 โˆง A 0 ฯ‰ โ‰  a โˆจ m = 1 โˆง A 0 ฯ‰ = a
Learning.stepsUntil_eq_zero_iff.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} : stepsUntil A a m ฯ‰ = 0 โ†” m = 0 โˆง A 0 ฯ‰ โ‰  a โˆจ m = 1 โˆง A 0 ฯ‰ = a

Code

lemma stepsUntil_eq_zero_iff :
    stepsUntil A a m ฯ‰ = 0 โ†” (m = 0 โˆง A 0 ฯ‰ โ‰  a) โˆจ (m = 1 โˆง A 0 ฯ‰ = a)
Type uses (1)
Body uses (6)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  classical
  refine โŸจfun h' โ†ฆ ?_, fun h' โ†ฆ ?_โŸฉ
  ยท have h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m := exists_pullCount_eq (by simp [h'])
    simp only [stepsUntil_eq_dite, h_exists, โ†“reduceDIte, Nat.cast_eq_zero, Nat.find_eq_zero,
      zero_add] at h'
    rw [pullCount_one] at h'
    by_cases hka : A 0 ฯ‰ = a <;> simp_all
  ยท cases h' with
  | inl h =>
    rw [h.1, stepsUntil_zero_of_ne h.2]
  | inr h =>
    rw [h.1]
    exact stepsUntil_one_of_eq h.2

action_stepsUntil๐Ÿ”—

LemmaLearning.action_stepsUntil

No docstring.

๐Ÿ”—theorem
Learning.action_stepsUntil.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} (hm : m โ‰  0) (h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) : A (ENat.toNat (stepsUntil A a m ฯ‰)) ฯ‰ = a
Learning.action_stepsUntil.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} (hm : m โ‰  0) (h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) : A (ENat.toNat (stepsUntil A a m ฯ‰)) ฯ‰ = a

Code

lemma action_stepsUntil (hm : m โ‰  0) (h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) :
    A (stepsUntil A a m ฯ‰).toNat ฯ‰ = a
Type uses (2)
Body uses (3)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  classical
  simp only [stepsUntil_eq_dite, h_exists, โ†“reduceDIte, ENat.toNat_coe]
  have h_spec := Nat.find_spec h_exists
  have h_spec' n := Nat.find_min h_exists (m := n)
  by_cases h_zero : Nat.find h_exists = 0
  ยท simp only [h_zero, zero_add, not_lt_zero, IsEmpty.forall_iff, implies_true] at *
    by_contra h_ne
    rw [โ† zero_add 1, pullCount_eq_pullCount_of_action_ne h_ne] at h_spec
    simp only [pullCount_zero] at h_spec
    exact hm h_spec.symm
  have h_pos : 0 < Nat.find h_exists := Nat.pos_of_ne_zero h_zero
  by_contra h_ne
  refine h_spec' (Nat.find h_exists - 1) ?_ ?_
  ยท simp [h_pos]
  rw [Nat.sub_add_cancel (by omega)]
  rwa [โ† pullCount_eq_pullCount_of_action_ne]
  exact h_ne

action_eq_of_stepsUntil_eq_coe๐Ÿ”—

LemmaLearning.action_eq_of_stepsUntil_eq_coe

No docstring.

๐Ÿ”—theorem
Learning.action_eq_of_stepsUntil_eq_coe.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m n : โ„•} {ฯ‰ : ฮฉ} (hm : m โ‰  0) (h : stepsUntil A a m ฯ‰ = โ†‘n) : A n ฯ‰ = a
Learning.action_eq_of_stepsUntil_eq_coe.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m n : โ„•} {ฯ‰ : ฮฉ} (hm : m โ‰  0) (h : stepsUntil A a m ฯ‰ = โ†‘n) : A n ฯ‰ = a

Code

lemma action_eq_of_stepsUntil_eq_coe (hm : m โ‰  0) (h : stepsUntil A a m ฯ‰ = n) :
    A n ฯ‰ = a
Type uses (1)
Body uses (3)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  have : n = (stepsUntil A a m ฯ‰).toNat := by simp [h]
  rw [this]
  have h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m := exists_pullCount_eq (by simp [h])
  exact action_stepsUntil hm h_exists

pullCount_stepsUntil_add_one๐Ÿ”—

LemmaLearning.pullCount_stepsUntil_add_one

No docstring.

๐Ÿ”—theorem
Learning.pullCount_stepsUntil_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} (h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) : pullCount A a (ENat.toNat (stepsUntil A a m ฯ‰ + 1)) ฯ‰ = m
Learning.pullCount_stepsUntil_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} (h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) : pullCount A a (ENat.toNat (stepsUntil A a m ฯ‰ + 1)) ฯ‰ = m

Code

lemma pullCount_stepsUntil_add_one (h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) :
    pullCount A a (stepsUntil A a m ฯ‰ + 1).toNat ฯ‰ = m
Type uses (2)
Body uses (1)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  classical
  have h_eq := stepsUntil_eq_dite (A := A) a m ฯ‰
  simp only [h_exists, โ†“reduceDIte] at h_eq
  have h' := Nat.find_spec h_exists
  rw [h_eq]
  rw [ENat.toNat_add (by simp) (by simp)]
  simp only [ENat.toNat_coe, ENat.toNat_one]
  exact h'

pullCount_stepsUntil๐Ÿ”—

LemmaLearning.pullCount_stepsUntil

No docstring.

๐Ÿ”—theorem
Learning.pullCount_stepsUntil.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} (hm : m โ‰  0) (h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) : pullCount A a (ENat.toNat (stepsUntil A a m ฯ‰)) ฯ‰ = m - 1
Learning.pullCount_stepsUntil.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} (hm : m โ‰  0) (h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) : pullCount A a (ENat.toNat (stepsUntil A a m ฯ‰)) ฯ‰ = m - 1

Code

lemma pullCount_stepsUntil (hm : m โ‰  0) (h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) :
    pullCount A a (stepsUntil A a m ฯ‰).toNat ฯ‰ = m - 1
Type uses (2)
Body uses (4)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  have h_action := action_eq_of_stepsUntil_eq_coe (A := A) (n := (stepsUntil A a m ฯ‰).toNat)
    (a := a) (ฯ‰ := ฯ‰) hm ?_
  swap; ยท symm; simpa [stepsUntil_eq_top_iff]
  have h_add_one := pullCount_stepsUntil_add_one h_exists
  nth_rw 1 [โ† h_action] at h_add_one
  rw [ENat.toNat_add ?_ (by simp), ENat.toNat_one, pullCount_action_eq_pullCount_add_one]
    at h_add_one
  swap; ยท simpa [stepsUntil_eq_top_iff]
  grind

pullCount_lt_of_le_stepsUntil๐Ÿ”—

LemmaLearning.pullCount_lt_of_le_stepsUntil

No docstring.

๐Ÿ”—theorem
Learning.pullCount_lt_of_le_stepsUntil.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) {n m : โ„•} (ฯ‰ : ฮฉ) (h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) (hn : โ†‘n < stepsUntil A a m ฯ‰) : pullCount A a (n + 1) ฯ‰ < m
Learning.pullCount_lt_of_le_stepsUntil.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} (a : ๐“) {n m : โ„•} (ฯ‰ : ฮฉ) (h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) (hn : โ†‘n < stepsUntil A a m ฯ‰) : pullCount A a (n + 1) ฯ‰ < m

Code

lemma pullCount_lt_of_le_stepsUntil (a : ๐“) {n m : โ„•} (ฯ‰ : ฮฉ)
    (h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m) (hn : n < stepsUntil A a m ฯ‰) :
    pullCount A a (n + 1) ฯ‰ < m
Type uses (2)
Body uses (4)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  classical
  have h_eq := stepsUntil_eq_dite (A := A) a m ฯ‰
  simp only [h_exists, โ†“reduceDIte] at h_eq
  rw [โ† ENat.coe_toNat (stepsUntil_ne_top h_exists)] at hn
  refine lt_of_le_of_ne ?_ ?_
  ยท calc pullCount A a (n + 1) ฯ‰
    _ โ‰ค pullCount A a (stepsUntil A a m ฯ‰ + 1).toNat ฯ‰ := by
      refine monotone_pullCount a ฯ‰ ?_
      rw [ENat.toNat_add (stepsUntil_ne_top h_exists) (by simp)]
      simp only [ENat.toNat_one, add_le_add_iff_right]
      exact mod_cast hn.le
    _ = m := pullCount_stepsUntil_add_one h_exists
  ยท refine Nat.find_min h_exists (m := n) ?_
    suffices n < (stepsUntil A a m ฯ‰).toNat by
      rwa [h_eq, ENat.toNat_coe] at this
    exact mod_cast hn

pullCount_eq_of_stepsUntil_eq_coe๐Ÿ”—

LemmaLearning.pullCount_eq_of_stepsUntil_eq_coe

No docstring.

๐Ÿ”—theorem
Learning.pullCount_eq_of_stepsUntil_eq_coe.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m n : โ„•} {ฯ‰ : ฮฉ} (hm : m โ‰  0) (h : stepsUntil A a m ฯ‰ = โ†‘n) : pullCount A a n ฯ‰ = m - 1
Learning.pullCount_eq_of_stepsUntil_eq_coe.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m n : โ„•} {ฯ‰ : ฮฉ} (hm : m โ‰  0) (h : stepsUntil A a m ฯ‰ = โ†‘n) : pullCount A a n ฯ‰ = m - 1

Code

lemma pullCount_eq_of_stepsUntil_eq_coe {ฯ‰ : ฮฉ} (hm : m โ‰  0)
    (h : stepsUntil A a m ฯ‰ = n) :
    pullCount A a n ฯ‰ = m - 1
Type uses (2)
Body uses (2)

Actions: Source ยท Open Issue

Proof
by
  have : n = (stepsUntil A a m ฯ‰).toNat := by simp [h]
  rw [this, pullCount_stepsUntil hm]
  exact exists_pullCount_eq (by simp [h])

pullCount_add_one_eq_of_stepsUntil_eq_coe๐Ÿ”—

LemmaLearning.pullCount_add_one_eq_of_stepsUntil_eq_coe

No docstring.

๐Ÿ”—theorem
Learning.pullCount_add_one_eq_of_stepsUntil_eq_coe.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m n : โ„•} {ฯ‰ : ฮฉ} (h : stepsUntil A a m ฯ‰ = โ†‘n) : pullCount A a (n + 1) ฯ‰ = m
Learning.pullCount_add_one_eq_of_stepsUntil_eq_coe.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m n : โ„•} {ฯ‰ : ฮฉ} (h : stepsUntil A a m ฯ‰ = โ†‘n) : pullCount A a (n + 1) ฯ‰ = m

Code

lemma pullCount_add_one_eq_of_stepsUntil_eq_coe {ฯ‰ : ฮฉ}
    (h : stepsUntil A a m ฯ‰ = n) :
    pullCount A a (n + 1) ฯ‰ = m
Type uses (2)
Body uses (2)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  have : n + 1 = (stepsUntil A a m ฯ‰ + 1).toNat := by
    rw [ENat.toNat_add (by simp [h]) (by simp)]; simp [h]
  rw [this, pullCount_stepsUntil_add_one]
  exact exists_pullCount_eq (by simp [h])

stepsUntil_eq_iff๐Ÿ”—

LemmaLearning.stepsUntil_eq_iff

No docstring.

๐Ÿ”—theorem
Learning.stepsUntil_eq_iff.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} (n : โ„•) : stepsUntil A a m ฯ‰ = โ†‘n โ†” pullCount A a (n + 1) ฯ‰ = m โˆง โˆ€ k < n, pullCount A a (k + 1) ฯ‰ < m
Learning.stepsUntil_eq_iff.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} (n : โ„•) : stepsUntil A a m ฯ‰ = โ†‘n โ†” pullCount A a (n + 1) ฯ‰ = m โˆง โˆ€ k < n, pullCount A a (k + 1) ฯ‰ < m

Code

lemma stepsUntil_eq_iff {ฯ‰ : ฮฉ} (n : โ„•) :
    stepsUntil A a m ฯ‰ = n โ†”
      pullCount A a (n + 1) ฯ‰ = m โˆง (โˆ€ k < n, pullCount A a (k + 1) ฯ‰ < m)
Type uses (2)
Body uses (4)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  refine โŸจfun h โ†ฆ ?_, fun h โ†ฆ ?_โŸฉ
  ยท have h_exists : โˆƒ s, pullCount A a (s + 1) ฯ‰ = m := exists_pullCount_eq (by simp [h])
    refine โŸจpullCount_add_one_eq_of_stepsUntil_eq_coe h, fun k hk โ†ฆ ?_โŸฉ
    exact pullCount_lt_of_le_stepsUntil a ฯ‰ h_exists (by rw [h]; exact mod_cast hk)
  ยท classical
    rw [stepsUntil_eq_dite a m ฯ‰, dif_pos โŸจn, h.1โŸฉ]
    simp only [Nat.cast_inj]
    rw [Nat.find_eq_iff]
    exact โŸจh.1, fun k hk โ†ฆ (h.2 k hk).neโŸฉ

stepsUntil_eq_iff'๐Ÿ”—

LemmaLearning.stepsUntil_eq_iff'

No docstring.

๐Ÿ”—theorem
Learning.stepsUntil_eq_iff'.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} (hm : m โ‰  0) (n : โ„•) : stepsUntil A a m ฯ‰ = โ†‘n โ†” A n ฯ‰ = a โˆง pullCount A a n ฯ‰ = m - 1
Learning.stepsUntil_eq_iff'.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ} (hm : m โ‰  0) (n : โ„•) : stepsUntil A a m ฯ‰ = โ†‘n โ†” A n ฯ‰ = a โˆง pullCount A a n ฯ‰ = m - 1

Code

lemma stepsUntil_eq_iff' {ฯ‰ : ฮฉ} (hm : m โ‰  0) (n : โ„•) :
    stepsUntil A a m ฯ‰ = n โ†” A n ฯ‰ = a โˆง pullCount A a n ฯ‰ = m - 1
Type uses (2)
Body uses (5)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  by_cases hn : n = 0
  ยท simp [hn, stepsUntil_eq_zero_iff, hm]
    grind
  rw [stepsUntil_eq_iff n]
  refine โŸจfun โŸจh1, h2โŸฉ โ†ฆ โŸจ?_, ?_โŸฉ, fun โŸจh1, h2โŸฉ โ†ฆ โŸจ?_, fun k hk โ†ฆ ?_โŸฉโŸฉ
  ยท rw [pullCount_add_one] at h1
    specialize h2 (n - 1) (by lia)
    grind
  ยท rw [pullCount_add_one] at h1
    specialize h2 (n - 1) (by lia)
    grind
  ยท rw [pullCount_add_one, h1, h2]
    grind
  ยท rw [Nat.lt_iff_le_pred (by grind)]
    rw [โ† h2]
    refine monotone_pullCount a ฯ‰ ?_
    grind

stepsUntil_eq_congr๐Ÿ”—

LemmaLearning.stepsUntil_eq_congr

No docstring.

๐Ÿ”—theorem
Learning.stepsUntil_eq_congr.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m n : โ„•} {ฯ‰ ฯ‰' : ฮฉ} (h_eq : โˆ€ i โ‰ค n, A i ฯ‰ = A i ฯ‰') : stepsUntil A a m ฯ‰ = โ†‘n โ†” stepsUntil A a m ฯ‰' = โ†‘n
Learning.stepsUntil_eq_congr.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {m n : โ„•} {ฯ‰ ฯ‰' : ฮฉ} (h_eq : โˆ€ i โ‰ค n, A i ฯ‰ = A i ฯ‰') : stepsUntil A a m ฯ‰ = โ†‘n โ†” stepsUntil A a m ฯ‰' = โ†‘n

Code

lemma stepsUntil_eq_congr {ฯ‰' : ฮฉ} (h_eq : โˆ€ i โ‰ค n, A i ฯ‰ = A i ฯ‰') :
    stepsUntil A a m ฯ‰ = n โ†” stepsUntil A a m ฯ‰' = n
Type uses (1)
Body uses (3)

Actions: Source ยท Open Issue

Proof
by
  simp_rw [stepsUntil_eq_iff n]
  congr! 1
  ยท rw [pullCount_congr h_eq]
  ยท congr! 3 with k hk
    rw [pullCount_congr]
    grind

isStoppingTime_stepsUntil๐Ÿ”—

LemmaLearning.isStoppingTime_stepsUntil

No docstring.

๐Ÿ”—theorem
Learning.isStoppingTime_stepsUntil.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {m : โ„•} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (hm : m โ‰  0) : MeasureTheory.IsStoppingTime (IsAlgEnvSeq.filtration hA hR') (stepsUntil A a m)
Learning.isStoppingTime_stepsUntil.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {m : โ„•} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (hm : m โ‰  0) : MeasureTheory.IsStoppingTime (IsAlgEnvSeq.filtration hA hR') (stepsUntil A a m)

Code

lemma isStoppingTime_stepsUntil [MeasurableSingletonClass ๐“]
    (hA : โˆ€ n, Measurable (A n)) (hR' : โˆ€ n, Measurable (R' n)) (a : ๐“) (hm : m โ‰  0) :
    IsStoppingTime (IsAlgEnvSeq.filtration hA hR') (stepsUntil A a m)
Type uses (2)
Body uses (3)

Actions: Source ยท Open Issue

Proof
by
  rw [stepsUntil_eq_leastGE _ hm]
  refine StronglyAdapted.isStoppingTime_leastGE _ fun n โ†ฆ ?_
  suffices StronglyMeasurable[IsAlgEnvSeq.filtration hA hR' n] (pullCount A a (n + 1)) by
    fun_prop
  refine Measurable.stronglyMeasurable ?_
  exact adapted_pullCount_add_one hA hR' a n

measurable_stepsUntil๐Ÿ”—

LemmaLearning.measurable_stepsUntil

No docstring.

๐Ÿ”—theorem
Learning.measurable_stepsUntil.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (a : ๐“) (m : โ„•) : Measurable (stepsUntil A a m)
Learning.measurable_stepsUntil.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (a : ๐“) (m : โ„•) : Measurable (stepsUntil A a m)

Code

lemma measurable_stepsUntil [MeasurableSingletonClass ๐“]
    (hA : โˆ€ n, Measurable (A n)) (a : ๐“) (m : โ„•) :
    Measurable (stepsUntil A a m)
Type uses (1)
Body uses (4)
Used by (4)

Actions: Source ยท Open Issue

Proof
by
  classical
  have h_union : {h' : ฮฉ | โˆƒ s, pullCount A a (s + 1) h' = m}
      = โ‹ƒ s : โ„•, {h' | pullCount A a (s + 1) h' = m} := by ext; simp
  have h_meas_set : MeasurableSet {h' : ฮฉ | โˆƒ s, pullCount A a (s + 1) h' = m} := by
    rw [h_union]
    refine MeasurableSet.iUnion fun s โ†ฆ (measurableSet_singleton _).preimage ?_
    exact measurable_pullCount hA a (s + 1)
  suffices Measurable fun k โ†ฆ if h : k โˆˆ {k' | โˆƒ s, pullCount A a (s + 1) k' = m}
      then (Nat.find h : โ„•โˆž) else โŠค by
    convert this with ฯ‰
    rw [stepsUntil_eq_dite a m ฯ‰]
    rfl
  refine Measurable.dite (s := {k' : ฮฉ | โˆƒ s, pullCount A a (s + 1) k' = m})
    (f := fun x โ†ฆ (Nat.find x.2 : โ„•โˆž)) (g := fun _ โ†ฆ โŠค) ?_ (by fun_prop) h_meas_set
  refine Measurable.coe_nat_enat ?_
  refine measurable_find _ fun k โ†ฆ ?_
  suffices MeasurableSet {x : ฮฉ | pullCount A a (k + 1) x = m} by
    have : Subtype.val '' {x : {k' : ฮฉ |
          โˆƒ s, pullCount A a (s + 1) k' = m} | pullCount A a (k + 1) (x : ฮฉ) = m}
        = {x : ฮฉ | pullCount A a (k + 1) x = m} := by
      ext x
      simp only [Set.mem_setOf_eq, Set.coe_setOf, Set.mem_image, Subtype.exists, exists_and_left,
        exists_prop, exists_eq_right_right, and_iff_left_iff_imp]
      exact fun h โ†ฆ โŸจ_, hโŸฉ
    refine (MeasurableEmbedding.subtype_coe h_meas_set).measurableSet_image.mp ?_
    rw [this]
    exact (measurableSet_singleton _).preimage (by fun_prop)
  exact (measurableSet_singleton _).preimage (by fun_prop)

measurable_stepsUntil'๐Ÿ”—

LemmaLearning.measurable_stepsUntil'

No docstring.

๐Ÿ”—theorem
Learning.measurable_stepsUntil'.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (a : ๐“) (m : โ„•) : Measurable fun ฯ‰ => stepsUntil A a m (Prod.fst ฯ‰)
Learning.measurable_stepsUntil'.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (a : ๐“) (m : โ„•) : Measurable fun ฯ‰ => stepsUntil A a m (Prod.fst ฯ‰)

Code

lemma measurable_stepsUntil' [MeasurableSingletonClass ๐“]
    (hA : โˆ€ n, Measurable (A n)) (a : ๐“) (m : โ„•) :
    Measurable (fun ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R) โ†ฆ stepsUntil A a m ฯ‰.1)
Type uses (1)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
(measurable_stepsUntil hA a m).comp measurable_fst

measurable_comap_indicator_stepsUntil_eq๐Ÿ”—

LemmaLearning.measurable_comap_indicator_stepsUntil_eq

No docstring.

๐Ÿ”—theorem
Learning.measurable_comap_indicator_stepsUntil_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (m n : โ„•) : Measurable (Set.indicator {ฯ‰ | stepsUntil A a m ฯ‰ = โ†‘n} fun x => 1)
Learning.measurable_comap_indicator_stepsUntil_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (m n : โ„•) : Measurable (Set.indicator {ฯ‰ | stepsUntil A a m ฯ‰ = โ†‘n} fun x => 1)

Code

lemma measurable_comap_indicator_stepsUntil_eq [MeasurableSingletonClass ๐“]
    (hA : โˆ€ n, Measurable (A n)) (hR' : โˆ€ n, Measurable (R' n)) (a : ๐“) (m n : โ„•) :
    Measurable[MeasurableSpace.comap
        (fun ฯ‰ : ฮฉ โ†ฆ (history A R' (n-1) ฯ‰, A n ฯ‰)) inferInstance]
      ({ฯ‰ | stepsUntil A a m ฯ‰ = โ†‘n}.indicator fun _ โ†ฆ 1)
Type uses (2)
Body uses (8)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  by_cases hm : m = 0
  ยท simp only [hm]
    by_cases hn : n = 0
    ยท subst hn
      simp only [CharP.cast_eq_zero, stepsUntil_eq_zero_iff, ne_eq, true_and, zero_ne_one,
        false_and, or_false]
      refine Measurable.indicator measurable_const ?_
      refine (measurableSet_singleton _).compl.preimage ?_
      rw [measurable_iff_comap_le, Prod.instMeasurableSpace, MeasurableSpace.comap_prodMk]
      exact le_sup_of_le_right le_rfl
    ยท have : {ฯ‰ | stepsUntil A a 0 ฯ‰ = n} = โˆ… := by
        ext ฯ‰
        by_cases ha : A 0 ฯ‰ = a
        ยท simp [stepsUntil_zero_of_eq ha]
        ยท simp only [Set.mem_setOf_eq, stepsUntil_zero_of_ne ha, Set.mem_empty_iff_false,
            iff_false]
          norm_cast
          exact Ne.symm hn
      simp [this]
  simp_rw [stepsUntil_eq_iff' hm]
  refine Measurable.indicator measurable_const ?_
  refine ((measurableSet_singleton _).preimage ?_).inter ((measurableSet_singleton _).preimage ?_)
  ยท rw [measurable_iff_comap_le, Prod.instMeasurableSpace, MeasurableSpace.comap_prodMk]
    exact le_sup_of_le_right le_rfl
  ยท rw [measurable_iff_comap_le, Prod.instMeasurableSpace, MeasurableSpace.comap_prodMk]
    refine le_sup_of_le_left ?_
    rw [โ† measurable_iff_comap_le]
    by_cases hn : n = 0
    ยท simp only [hn, pullCount_zero]
      exact measurable_const
    have h_meas := adapted_pullCount_add_one hA hR' a (n - 1)
    have : 1 โ‰ค n := by grind
    convert h_meas using 1
    ยท rfl
    ยท simp [Nat.sub_add_cancel this]

measurable_indicator_stepsUntil_eq๐Ÿ”—

LemmaLearning.measurable_indicator_stepsUntil_eq

No docstring.

๐Ÿ”—theorem
Learning.measurable_indicator_stepsUntil_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (m n : โ„•) : Measurable (Set.indicator {ฯ‰ | stepsUntil A a m ฯ‰ = โ†‘n} fun x => 1)
Learning.measurable_indicator_stepsUntil_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (m n : โ„•) : Measurable (Set.indicator {ฯ‰ | stepsUntil A a m ฯ‰ = โ†‘n} fun x => 1)

Code

lemma measurable_indicator_stepsUntil_eq [MeasurableSingletonClass ๐“]
    (hA : โˆ€ n, Measurable (A n)) (hR' : โˆ€ n, Measurable (R' n)) (a : ๐“) (m n : โ„•) :
    Measurable ({ฯ‰ : ฮฉ | stepsUntil A a m ฯ‰ = โ†‘n}.indicator fun _ โ†ฆ 1)
Type uses (1)
Body uses (3)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  refine (measurable_comap_indicator_stepsUntil_eq hA hR' a m n).mono ?_ le_rfl
  refine Measurable.comap_le ?_
  fun_prop

measurableSet_stepsUntil_eq_zero๐Ÿ”—

LemmaLearning.measurableSet_stepsUntil_eq_zero

No docstring.

๐Ÿ”—theorem
Learning.measurableSet_stepsUntil_eq_zero.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] (a : ๐“) (m : โ„•) : MeasurableSet {ฯ‰ | stepsUntil A a m ฯ‰ = 0}
Learning.measurableSet_stepsUntil_eq_zero.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] (a : ๐“) (m : โ„•) : MeasurableSet {ฯ‰ | stepsUntil A a m ฯ‰ = 0}

Code

lemma measurableSet_stepsUntil_eq_zero [MeasurableSingletonClass ๐“] (a : ๐“) (m : โ„•) :
    MeasurableSet[MeasurableSpace.comap (A 0) inferInstance]
      {ฯ‰ : ฮฉ | stepsUntil A a m ฯ‰ = 0}
Type uses (1)
Body uses (1)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  simp only [stepsUntil_eq_zero_iff (a := a) (m := m), ne_eq]
  by_cases hm : m = 0
  ยท simp only [hm, true_and, zero_ne_one, false_and, or_false]
    refine (measurableSet_singleton _).compl.preimage ?_
    rw [measurable_iff_comap_le]
  by_cases hm1 : m = 1
  swap; ยท simp [hm, hm1]
  simp only [hm1, one_ne_zero, false_and, true_and, false_or]
  refine (measurableSet_singleton _).preimage ?_
  rw [measurable_iff_comap_le]

measurable_comap_indicator_stepsUntil_eq_zero๐Ÿ”—

LemmaLearning.measurable_comap_indicator_stepsUntil_eq_zero

No docstring.

๐Ÿ”—theorem
Learning.measurable_comap_indicator_stepsUntil_eq_zero.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] (a : ๐“) (m : โ„•) : Measurable (Set.indicator {ฯ‰ | stepsUntil A a m ฯ‰ = 0} fun x => 1)
Learning.measurable_comap_indicator_stepsUntil_eq_zero.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] (a : ๐“) (m : โ„•) : Measurable (Set.indicator {ฯ‰ | stepsUntil A a m ฯ‰ = 0} fun x => 1)

Code

lemma measurable_comap_indicator_stepsUntil_eq_zero [MeasurableSingletonClass ๐“] (a : ๐“) (m : โ„•) :
    Measurable[MeasurableSpace.comap (A 0) inferInstance]
      ({ฯ‰ | stepsUntil A a m ฯ‰ = 0}.indicator fun _ โ†ฆ 1)
Type uses (1)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  rw [measurable_indicator_const_iff]
  exact measurableSet_stepsUntil_eq_zero a m

measurableSet_stepsUntil_eq๐Ÿ”—

LemmaLearning.measurableSet_stepsUntil_eq

No docstring.

๐Ÿ”—theorem
Learning.measurableSet_stepsUntil_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (m n : โ„•) : MeasurableSet {ฯ‰ | stepsUntil A a m ฯ‰ = โ†‘n}
Learning.measurableSet_stepsUntil_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (m n : โ„•) : MeasurableSet {ฯ‰ | stepsUntil A a m ฯ‰ = โ†‘n}

Code

lemma measurableSet_stepsUntil_eq [MeasurableSingletonClass ๐“]
    (hA : โˆ€ n, Measurable (A n)) (hR' : โˆ€ n, Measurable (R' n)) (a : ๐“) (m n : โ„•) :
    MeasurableSet[MeasurableSpace.comap (fun ฯ‰ : ฮฉ โ†ฆ (history A R' (n-1) ฯ‰, A n ฯ‰))
        inferInstance]
      {ฯ‰ : ฮฉ | stepsUntil A a m ฯ‰ = โ†‘n}
Type uses (2)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  let mProd := MeasurableSpace.comap
    (fun ฯ‰ : ฮฉ โ†ฆ (history A R' (n-1) ฯ‰, A n ฯ‰)) inferInstance
  suffices Measurable[mProd] ({ฯ‰ | stepsUntil A a m ฯ‰ = โ†‘n}.indicator fun x โ†ฆ 1) by
    rwa [measurable_indicator_const_iff] at this
  exact measurable_comap_indicator_stepsUntil_eq hA hR' a m n

isStoppingTime_stepsUntil_filtrationAction๐Ÿ”—

LemmaLearning.isStoppingTime_stepsUntil_filtrationAction

stepsUntil a m is a stopping time with respect to the filtration filtrationAction.

๐Ÿ”—theorem
Learning.isStoppingTime_stepsUntil_filtrationAction.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (m : โ„•) : MeasureTheory.IsStoppingTime (IsAlgEnvSeq.filtrationAction hA hR') (stepsUntil A a m)
Learning.isStoppingTime_stepsUntil_filtrationAction.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (m : โ„•) : MeasureTheory.IsStoppingTime (IsAlgEnvSeq.filtrationAction hA hR') (stepsUntil A a m)

Code

lemma isStoppingTime_stepsUntil_filtrationAction [MeasurableSingletonClass ๐“]
    (hA : โˆ€ n, Measurable (A n)) (hR' : โˆ€ n, Measurable (R' n)) (a : ๐“) (m : โ„•) :
    IsStoppingTime (IsAlgEnvSeq.filtrationAction hA hR') (stepsUntil A a m)
Type uses (2)
Body uses (4)

Actions: Source ยท Open Issue

Proof
by
  refine isStoppingTime_of_measurableSet_eq fun n โ†ฆ ?_
  by_cases hn : n = 0
  ยท subst hn
    simp only [WithTop.coe_zero]
    exact measurableSet_stepsUntil_eq_zero a m
  ยท rw [IsAlgEnvSeq.filtrationAction_eq_comap _ hn]
    exact measurableSet_stepsUntil_eq hA hR' a m n

rewardByCount๐Ÿ”—

DefinitionLearning.rewardByCount

Reward obtained when pulling action a for the m-th time. If it is never pulled m times, the reward is given by the second component of ฯ‰, which in applications will be indepedent with same law.

๐Ÿ”—def
Learning.rewardByCount.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] (A : โ„• โ†’ ฮฉ โ†’ ๐“) (R' : โ„• โ†’ ฮฉ โ†’ R) (a : ๐“) (m : โ„•) (ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)) : R
Learning.rewardByCount.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] (A : โ„• โ†’ ฮฉ โ†’ ๐“) (R' : โ„• โ†’ ฮฉ โ†’ R) (a : ๐“) (m : โ„•) (ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)) : R

Code

noncomputable
def rewardByCount (A : โ„• โ†’ ฮฉ โ†’ ๐“) (R' : โ„• โ†’ ฮฉ โ†’ R) (a : ๐“) (m : โ„•) (ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)) : R :=
  match (stepsUntil A a m ฯ‰.1) with
  | โŠค => ฯ‰.2 m a
  | (n : โ„•) => R' n ฯ‰.1
Body uses (1)
Used by (15)

Actions: Source ยท Open Issue

rewardByCount_eq_ite๐Ÿ”—

LemmaLearning.rewardByCount_eq_ite

No docstring.

๐Ÿ”—theorem
Learning.rewardByCount_eq_ite.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} (a : ๐“) (m : โ„•) (ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)) : rewardByCount A R' a m ฯ‰ = if stepsUntil A a m (Prod.fst ฯ‰) = โŠค then Prod.snd ฯ‰ m a else R' (ENat.toNat (stepsUntil A a m (Prod.fst ฯ‰))) (Prod.fst ฯ‰)
Learning.rewardByCount_eq_ite.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} (a : ๐“) (m : โ„•) (ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)) : rewardByCount A R' a m ฯ‰ = if stepsUntil A a m (Prod.fst ฯ‰) = โŠค then Prod.snd ฯ‰ m a else R' (ENat.toNat (stepsUntil A a m (Prod.fst ฯ‰))) (Prod.fst ฯ‰)

Code

lemma rewardByCount_eq_ite (a : ๐“) (m : โ„•) (ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)) :
    rewardByCount A R' a m ฯ‰ =
      if (stepsUntil A a m ฯ‰.1) = โŠค then ฯ‰.2 m a else R' (stepsUntil A a m ฯ‰.1).toNat ฯ‰.1
Type uses (2)
Used by (6)

Actions: Source ยท Open Issue

Proof
by
  unfold rewardByCount
  cases stepsUntil A a m ฯ‰.1
  ยท simp; rfl
  ยท simp

rewardByCount_eq_add๐Ÿ”—

LemmaLearning.rewardByCount_eq_add

No docstring.

๐Ÿ”—theorem
Learning.rewardByCount_eq_add.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [AddMonoid R] (a : ๐“) (m : โ„•) : rewardByCount A R' a m = (Set.indicator {ฯ‰ | stepsUntil A a m (Prod.fst ฯ‰) โ‰  โŠค} fun ฯ‰ => R' (ENat.toNat (stepsUntil A a m (Prod.fst ฯ‰))) (Prod.fst ฯ‰)) + Set.indicator {ฯ‰ | stepsUntil A a m (Prod.fst ฯ‰) = โŠค} fun ฯ‰ => Prod.snd ฯ‰ m a
Learning.rewardByCount_eq_add.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [AddMonoid R] (a : ๐“) (m : โ„•) : rewardByCount A R' a m = (Set.indicator {ฯ‰ | stepsUntil A a m (Prod.fst ฯ‰) โ‰  โŠค} fun ฯ‰ => R' (ENat.toNat (stepsUntil A a m (Prod.fst ฯ‰))) (Prod.fst ฯ‰)) + Set.indicator {ฯ‰ | stepsUntil A a m (Prod.fst ฯ‰) = โŠค} fun ฯ‰ => Prod.snd ฯ‰ m a

Code

lemma rewardByCount_eq_add [AddMonoid R] (a : ๐“) (m : โ„•) :
    rewardByCount A R' a m =
      {ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R) | stepsUntil A a m ฯ‰.1 โ‰  โŠค}.indicator
          (fun ฯ‰ โ†ฆ R' (stepsUntil A a m ฯ‰.1).toNat ฯ‰.1)
        + {ฯ‰ | stepsUntil A a m ฯ‰.1 = โŠค}.indicator (fun ฯ‰ โ†ฆ ฯ‰.2 m a)
Type uses (2)
Body uses (1)

Actions: Source ยท Open Issue

Proof
by
  ext ฯ‰
  simp only [rewardByCount_eq_ite, ne_eq, Pi.add_apply, Set.indicator_apply, Set.mem_setOf_eq,
    ite_not]
  grind

rewardByCount_of_stepsUntil_eq_top๐Ÿ”—

LemmaLearning.rewardByCount_of_stepsUntil_eq_top

No docstring.

๐Ÿ”—theorem
Learning.rewardByCount_of_stepsUntil_eq_top.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)} (h : stepsUntil A a m (Prod.fst ฯ‰) = โŠค) : rewardByCount A R' a m ฯ‰ = Prod.snd ฯ‰ m a
Learning.rewardByCount_of_stepsUntil_eq_top.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)} (h : stepsUntil A a m (Prod.fst ฯ‰) = โŠค) : rewardByCount A R' a m ฯ‰ = Prod.snd ฯ‰ m a

Code

lemma rewardByCount_of_stepsUntil_eq_top (h : stepsUntil A a m ฯ‰.1 = โŠค) :
    rewardByCount A R' a m ฯ‰ = ฯ‰.2 m a
Type uses (2)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by simp [rewardByCount_eq_ite, h]

rewardByCount_of_stepsUntil_ne_top๐Ÿ”—

LemmaLearning.rewardByCount_of_stepsUntil_ne_top

No docstring.

๐Ÿ”—theorem
Learning.rewardByCount_of_stepsUntil_ne_top.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)} (h : stepsUntil A a m (Prod.fst ฯ‰) โ‰  โŠค) : rewardByCount A R' a m ฯ‰ = R' (ENat.toNat (stepsUntil A a m (Prod.fst ฯ‰))) (Prod.fst ฯ‰)
Learning.rewardByCount_of_stepsUntil_ne_top.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)} (h : stepsUntil A a m (Prod.fst ฯ‰) โ‰  โŠค) : rewardByCount A R' a m ฯ‰ = R' (ENat.toNat (stepsUntil A a m (Prod.fst ฯ‰))) (Prod.fst ฯ‰)

Code

lemma rewardByCount_of_stepsUntil_ne_top (h : stepsUntil A a m ฯ‰.1 โ‰  โŠค) :
    rewardByCount A R' a m ฯ‰ = R' (stepsUntil A a m ฯ‰.1).toNat ฯ‰.1
Type uses (2)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  simp [rewardByCount_eq_ite, h]

rewardByCount_eq_stoppedValue๐Ÿ”—

LemmaLearning.rewardByCount_eq_stoppedValue

No docstring.

๐Ÿ”—theorem
Learning.rewardByCount_eq_stoppedValue.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)} (h : stepsUntil A a m (Prod.fst ฯ‰) โ‰  โŠค) : rewardByCount A R' a m ฯ‰ = MeasureTheory.stoppedValue R' (stepsUntil A a m) (Prod.fst ฯ‰)
Learning.rewardByCount_eq_stoppedValue.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {a : ๐“} {m : โ„•} {ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)} (h : stepsUntil A a m (Prod.fst ฯ‰) โ‰  โŠค) : rewardByCount A R' a m ฯ‰ = MeasureTheory.stoppedValue R' (stepsUntil A a m) (Prod.fst ฯ‰)

Code

lemma rewardByCount_eq_stoppedValue (h : stepsUntil A a m ฯ‰.1 โ‰  โŠค) :
    rewardByCount A R' a m ฯ‰ = stoppedValue R' (stepsUntil A a m) ฯ‰.1
Type uses (2)
Body uses (1)

Actions: Source ยท Open Issue

Proof
by
  rw [rewardByCount_of_stepsUntil_ne_top h, stoppedValue]
  lift stepsUntil A a m ฯ‰.1 to โ„• using h with n
  simp

rewardByCount_of_stepsUntil_eq_coe๐Ÿ”—

LemmaLearning.rewardByCount_of_stepsUntil_eq_coe

No docstring.

๐Ÿ”—theorem
Learning.rewardByCount_of_stepsUntil_eq_coe.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {a : ๐“} {m n : โ„•} {ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)} (h : stepsUntil A a m (Prod.fst ฯ‰) = โ†‘n) : rewardByCount A R' a m ฯ‰ = R' n (Prod.fst ฯ‰)
Learning.rewardByCount_of_stepsUntil_eq_coe.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} {a : ๐“} {m n : โ„•} {ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)} (h : stepsUntil A a m (Prod.fst ฯ‰) = โ†‘n) : rewardByCount A R' a m ฯ‰ = R' n (Prod.fst ฯ‰)

Code

lemma rewardByCount_of_stepsUntil_eq_coe (h : stepsUntil A a m ฯ‰.1 = n) :
    rewardByCount A R' a m ฯ‰ = R' n ฯ‰.1
Type uses (2)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by simp [rewardByCount_eq_ite, h]

rewardByCount_zero๐Ÿ”—

LemmaLearning.rewardByCount_zero

The value at 0 does not matter (it would be the "zeroth" reward). It should be considered a junk value.

๐Ÿ”—theorem
Learning.rewardByCount_zero.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} (a : ๐“) (ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)) : rewardByCount A R' a 0 ฯ‰ = if A 0 (Prod.fst ฯ‰) = a then Prod.snd ฯ‰ 0 a else R' 0 (Prod.fst ฯ‰)
Learning.rewardByCount_zero.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} (a : ๐“) (ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)) : rewardByCount A R' a 0 ฯ‰ = if A 0 (Prod.fst ฯ‰) = a then Prod.snd ฯ‰ 0 a else R' 0 (Prod.fst ฯ‰)

Code

lemma rewardByCount_zero (a : ๐“) (ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)) :
    rewardByCount A R' a 0 ฯ‰ = if A 0 ฯ‰.1 = a then ฯ‰.2 0 a else R' 0 ฯ‰.1
Type uses (1)
Body uses (4)

Actions: Source ยท Open Issue

Proof
by
  rw [rewardByCount_eq_ite]
  by_cases ha : A 0 ฯ‰.1 = a
  ยท simp [ha, stepsUntil_zero_of_eq]
  ยท simp [stepsUntil_zero_of_ne, ha]

rewardByCount_pullCount_add_one_eq_reward๐Ÿ”—

LemmaLearning.rewardByCount_pullCount_add_one_eq_reward

No docstring.

๐Ÿ”—theorem
Learning.rewardByCount_pullCount_add_one_eq_reward.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} (t : โ„•) (ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)) : rewardByCount A R' (A t (Prod.fst ฯ‰)) (pullCount A (A t (Prod.fst ฯ‰)) t (Prod.fst ฯ‰) + 1) ฯ‰ = R' t (Prod.fst ฯ‰)
Learning.rewardByCount_pullCount_add_one_eq_reward.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} (t : โ„•) (ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)) : rewardByCount A R' (A t (Prod.fst ฯ‰)) (pullCount A (A t (Prod.fst ฯ‰)) t (Prod.fst ฯ‰) + 1) ฯ‰ = R' t (Prod.fst ฯ‰)

Code

lemma rewardByCount_pullCount_add_one_eq_reward (t : โ„•) (ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R)) :
    rewardByCount A R' (A t ฯ‰.1) (pullCount A (A t ฯ‰.1) t ฯ‰.1 + 1) ฯ‰ = R' t ฯ‰.1
Type uses (2)
Body uses (3)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  rw [rewardByCount, โ† pullCount_action_eq_pullCount_add_one, stepsUntil_pullCount_eq]

measurable_rewardByCount๐Ÿ”—

LemmaLearning.measurable_rewardByCount

No docstring.

๐Ÿ”—theorem
Learning.measurable_rewardByCount.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (m : โ„•) : Measurable fun ฯ‰ => rewardByCount A R' a m ฯ‰
Learning.measurable_rewardByCount.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mR : MeasurableSpace R} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ R} [MeasurableSingletonClass ๐“] (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (m : โ„•) : Measurable fun ฯ‰ => rewardByCount A R' a m ฯ‰

Code

lemma measurable_rewardByCount [MeasurableSingletonClass ๐“]
    (hA : โˆ€ n, Measurable (A n)) (hR' : โˆ€ n, Measurable (R' n)) (a : ๐“) (m : โ„•) :
    Measurable (fun ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R) โ†ฆ rewardByCount A R' a m ฯ‰)
Type uses (1)
Body uses (4)
Used by (4)

Actions: Source ยท Open Issue

Proof
by
  simp_rw [rewardByCount_eq_ite]
  refine Measurable.ite ?_ ?_ ?_
  ยท exact (measurableSet_singleton _).preimage <| measurable_stepsUntil' hA a m
  ยท fun_prop
  ยท change Measurable ((fun p : โ„• ร— ฮฉ โ†ฆ R' p.1 p.2)
      โˆ˜ (fun ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R) โ†ฆ ((stepsUntil A a m ฯ‰.1).toNat, ฯ‰.1)))
    have : Measurable fun ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ R) โ†ฆ ((stepsUntil A a m ฯ‰.1).toNat, ฯ‰.1) :=
      (measurable_stepsUntil' hA a m).toNat.prodMk (by fun_prop)
    refine Measurable.comp ?_ this
    refine measurable_from_prod_countable_right fun n โ†ฆ ?_
    simp only
    fun_prop

sum_pullCount_mul๐Ÿ”—

LemmaLearning.sum_pullCount_mul

No docstring.

๐Ÿ”—theorem
Learning.sum_pullCount_mul.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [Fintype ๐“] [Semiring R] (ฯ‰ : ฮฉ) (f : ๐“ โ†’ R) (t : โ„•) : โˆ‘ a, โ†‘(pullCount A a t ฯ‰) * f a = โˆ‘ s โˆˆ Finset.range t, f (A s ฯ‰)
Learning.sum_pullCount_mul.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [Fintype ๐“] [Semiring R] (ฯ‰ : ฮฉ) (f : ๐“ โ†’ R) (t : โ„•) : โˆ‘ a, โ†‘(pullCount A a t ฯ‰) * f a = โˆ‘ s โˆˆ Finset.range t, f (A s ฯ‰)

Code

lemma sum_pullCount_mul [Fintype ๐“] [Semiring R] (ฯ‰ : ฮฉ) (f : ๐“ โ†’ R) (t : โ„•) :
    โˆ‘ a, pullCount A a t ฯ‰ * f a = โˆ‘ s โˆˆ range t, f (A s ฯ‰)
Type uses (1)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  unfold pullCount
  classical
  simp_rw [card_eq_sum_ones]
  push_cast
  simp_rw [sum_mul, one_mul]
  exact sum_fiberwise' (range t) (A ยท ฯ‰) f

sum_pullCount๐Ÿ”—

LemmaLearning.sum_pullCount

No docstring.

๐Ÿ”—theorem
Learning.sum_pullCount.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {t : โ„•} [Fintype ๐“] {ฯ‰ : ฮฉ} : โˆ‘ a, pullCount A a t ฯ‰ = t
Learning.sum_pullCount.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {t : โ„•} [Fintype ๐“] {ฯ‰ : ฮฉ} : โˆ‘ a, pullCount A a t ฯ‰ = t

Code

lemma sum_pullCount [Fintype ๐“] {ฯ‰ : ฮฉ} : โˆ‘ a, pullCount A a t ฯ‰ = t
Type uses (1)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  suffices โˆ‘ a, pullCount A a t ฯ‰ * (1 : โ„) = t by norm_cast at this; simpa
  rw [sum_pullCount_mul]
  simp

sum_comp_pullCount๐Ÿ”—

LemmaLearning.sum_comp_pullCount

No docstring.

๐Ÿ”—theorem
Learning.sum_comp_pullCount.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [Fintype ๐“] [AddCommMonoid R] (f : โ„• โ†’ R) (t : โ„•) (ฯ‰ : ฮฉ) : โˆ‘ s โˆˆ Finset.range t, f (pullCount A (A s ฯ‰) s ฯ‰) = โˆ‘ a, โˆ‘ j โˆˆ Finset.range (pullCount A a t ฯ‰), f j
Learning.sum_comp_pullCount.{u_1, u_2, u_3} {๐“ : Type u_1} {R : Type u_2} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [Fintype ๐“] [AddCommMonoid R] (f : โ„• โ†’ R) (t : โ„•) (ฯ‰ : ฮฉ) : โˆ‘ s โˆˆ Finset.range t, f (pullCount A (A s ฯ‰) s ฯ‰) = โˆ‘ a, โˆ‘ j โˆˆ Finset.range (pullCount A a t ฯ‰), f j

Code

lemma sum_comp_pullCount [Fintype ๐“] [AddCommMonoid R] (f : โ„• โ†’ R) (t : โ„•) (ฯ‰ : ฮฉ) :
    โˆ‘ s โˆˆ range t, f (pullCount A (A s ฯ‰) s ฯ‰) = โˆ‘ a, โˆ‘ j โˆˆ range (pullCount A a t ฯ‰), f j
Type uses (1)
Body uses (2)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  induction t with
  | zero => simp
  | succ n ih =>
    have hf : f (pullCount A (A n ฯ‰) n ฯ‰) =
      โˆ‘ a, if A n ฯ‰ = a then f (pullCount A a n ฯ‰) else 0 := by simp
    simp_rw [sum_range_succ, ih, hf, โ† sum_add_distrib, pullCount_add_one]
    congr 1 with a
    split_ifs
    ยท simp [sum_range_succ]
    ยท simp

sumRewards๐Ÿ”—

DefinitionLearning.sumRewards

Sum of rewards obtained when pulling action a up to time t (exclusive).

๐Ÿ”—def
Learning.sumRewards.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] (A : โ„• โ†’ ฮฉ โ†’ ๐“) (R' : โ„• โ†’ ฮฉ โ†’ โ„) (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ) : โ„
Learning.sumRewards.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] (A : โ„• โ†’ ฮฉ โ†’ ๐“) (R' : โ„• โ†’ ฮฉ โ†’ โ„) (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ) : โ„

Code

def sumRewards (A : โ„• โ†’ ฮฉ โ†’ ๐“) (R' : โ„• โ†’ ฮฉ โ†’ โ„) (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ) : โ„ :=
  โˆ‘ s โˆˆ range t, if A s ฯ‰ = a then R' s ฯ‰ else 0
Used by (44)

Actions: Source ยท Open Issue

sumRewards'๐Ÿ”—

DefinitionLearning.sumRewards'

Sum of rewards of arm a up to (and including) time n.

๐Ÿ”—def
Learning.sumRewards'.{u_1} {๐“ : Type u_1} [DecidableEq ๐“] (n : โ„•) (h : โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— โ„) (a : ๐“) : โ„
Learning.sumRewards'.{u_1} {๐“ : Type u_1} [DecidableEq ๐“] (n : โ„•) (h : โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— โ„) (a : ๐“) : โ„

Code

noncomputable
def sumRewards' (n : โ„•) (h : Iic n โ†’ ๐“ ร— โ„) (a : ๐“) :=
  โˆ‘ s, if (h s).1 = a then (h s).2 else 0
Used by (9)

Actions: Source ยท Open Issue

empMean๐Ÿ”—

DefinitionLearning.empMean

Empirical mean reward obtained when pulling action a up to time t (exclusive).

๐Ÿ”—def
Learning.empMean.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] (A : โ„• โ†’ ฮฉ โ†’ ๐“) (R' : โ„• โ†’ ฮฉ โ†’ โ„) (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ) : โ„
Learning.empMean.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] (A : โ„• โ†’ ฮฉ โ†’ ๐“) (R' : โ„• โ†’ ฮฉ โ†’ โ„) (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ) : โ„

Code

noncomputable
def empMean (A : โ„• โ†’ ฮฉ โ†’ ๐“) (R' : โ„• โ†’ ฮฉ โ†’ โ„) (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ) : โ„ :=
  sumRewards A R' a t ฯ‰ / pullCount A a t ฯ‰
Body uses (2)
Used by (34)

Actions: Source ยท Open Issue

empMean'๐Ÿ”—

DefinitionLearning.empMean'

Empirical mean of arm a at time n.

๐Ÿ”—def
Learning.empMean'.{u_1} {๐“ : Type u_1} [DecidableEq ๐“] (n : โ„•) (h : โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— โ„) (a : ๐“) : โ„
Learning.empMean'.{u_1} {๐“ : Type u_1} [DecidableEq ๐“] (n : โ„•) (h : โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— โ„) (a : ๐“) : โ„

Code

noncomputable
def empMean' (n : โ„•) (h : Iic n โ†’ ๐“ ร— โ„) (a : ๐“) :=
  (sumRewards' n h a) / (pullCount' n h a)
Body uses (2)
Used by (18)

Actions: Source ยท Open Issue

sumRewards_zero๐Ÿ”—

LemmaLearning.sumRewards_zero

No docstring.

๐Ÿ”—theorem
Learning.sumRewards_zero.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {R' : โ„• โ†’ ฮฉ โ†’ โ„} : sumRewards A R' a 0 = 0
Learning.sumRewards_zero.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {R' : โ„• โ†’ ฮฉ โ†’ โ„} : sumRewards A R' a 0 = 0

Code

lemma sumRewards_zero {R' : โ„• โ†’ ฮฉ โ†’ โ„} : sumRewards A R' a 0 = 0
Type uses (1)
Used by (3)

Actions: Source ยท Open Issue

Proof
by ext; simp [sumRewards]

sumRewards_add_one๐Ÿ”—

LemmaLearning.sumRewards_add_one

No docstring.

๐Ÿ”—theorem
Learning.sumRewards_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {t : โ„•} {ฯ‰ : ฮฉ} {R' : โ„• โ†’ ฮฉ โ†’ โ„} : sumRewards A R' a (t + 1) ฯ‰ = sumRewards A R' a t ฯ‰ + if A t ฯ‰ = a then R' t ฯ‰ else 0
Learning.sumRewards_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {t : โ„•} {ฯ‰ : ฮฉ} {R' : โ„• โ†’ ฮฉ โ†’ โ„} : sumRewards A R' a (t + 1) ฯ‰ = sumRewards A R' a t ฯ‰ + if A t ฯ‰ = a then R' t ฯ‰ else 0

Code

lemma sumRewards_add_one {R' : โ„• โ†’ ฮฉ โ†’ โ„} :
    sumRewards A R' a (t + 1) ฯ‰ = sumRewards A R' a t ฯ‰ + if A t ฯ‰ = a then R' t ฯ‰ else 0
Type uses (1)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  unfold sumRewards
  rw [sum_range_succ]

sumRewards_eq_of_pullCount_eq๐Ÿ”—

LemmaLearning.sumRewards_eq_of_pullCount_eq

No docstring.

๐Ÿ”—theorem
Learning.sumRewards_eq_of_pullCount_eq.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {ฯ‰ : ฮฉ} {R' : โ„• โ†’ ฮฉ โ†’ โ„} {s t : โ„•} (h_eq : pullCount A a s ฯ‰ = pullCount A a t ฯ‰) : sumRewards A R' a s ฯ‰ = sumRewards A R' a t ฯ‰
Learning.sumRewards_eq_of_pullCount_eq.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {ฯ‰ : ฮฉ} {R' : โ„• โ†’ ฮฉ โ†’ โ„} {s t : โ„•} (h_eq : pullCount A a s ฯ‰ = pullCount A a t ฯ‰) : sumRewards A R' a s ฯ‰ = sumRewards A R' a t ฯ‰

Code

lemma sumRewards_eq_of_pullCount_eq {R' : โ„• โ†’ ฮฉ โ†’ โ„} {s t : โ„•}
    (h_eq : pullCount A a s ฯ‰ = pullCount A a t ฯ‰) :
    sumRewards A R' a s ฯ‰ = sumRewards A R' a t ฯ‰
Type uses (2)
Body uses (3)

Actions: Source ยท Open Issue

Proof
by
  wlog hst : s โ‰ค t
  ยท have hts : t โ‰ค s := by lia
    exact (this h_eq.symm hts).symm
  induction t, hst using Nat.le_induction with
  | base => rfl
  | succ t hst' ih =>
    have h_mono' : pullCount A a t ฯ‰ โ‰ค pullCount A a (t + 1) ฯ‰ := pullCount_mono a (Nat.le_succ t) ฯ‰
    have h_eq_t : pullCount A a s ฯ‰ = pullCount A a t ฯ‰ :=
      le_antisymm (pullCount_mono a hst' ฯ‰) (h_eq โ–ธ h_mono')
    have hne : A t ฯ‰ โ‰  a := by
      intro ha
      have h1 := ha โ–ธ pullCount_action_eq_pullCount_add_one (A := A) t ฯ‰
      lia
    rw [sumRewards_add_one, if_neg hne, add_zero, ih h_eq_t]

sumRewards_eq_pullCount_mul_empMean๐Ÿ”—

LemmaLearning.sumRewards_eq_pullCount_mul_empMean

No docstring.

๐Ÿ”—theorem
Learning.sumRewards_eq_pullCount_mul_empMean.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {t : โ„•} {R' : โ„• โ†’ ฮฉ โ†’ โ„} {ฯ‰ : ฮฉ} (h_pull : pullCount A a t ฯ‰ โ‰  0) : sumRewards A R' a t ฯ‰ = โ†‘(pullCount A a t ฯ‰) * empMean A R' a t ฯ‰
Learning.sumRewards_eq_pullCount_mul_empMean.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {t : โ„•} {R' : โ„• โ†’ ฮฉ โ†’ โ„} {ฯ‰ : ฮฉ} (h_pull : pullCount A a t ฯ‰ โ‰  0) : sumRewards A R' a t ฯ‰ = โ†‘(pullCount A a t ฯ‰) * empMean A R' a t ฯ‰

Code

lemma sumRewards_eq_pullCount_mul_empMean {R' : โ„• โ†’ ฮฉ โ†’ โ„} {ฯ‰ : ฮฉ}
    (h_pull : pullCount A a t ฯ‰ โ‰  0) :
    sumRewards A R' a t ฯ‰ = pullCount A a t ฯ‰ * empMean A R' a t ฯ‰
Type uses (3)
Used by (1)

Actions: Source ยท Open Issue

Proof
by unfold empMean; field_simp

sum_rewardByCount_eq_sumRewards๐Ÿ”—

LemmaLearning.sum_rewardByCount_eq_sumRewards

No docstring.

๐Ÿ”—theorem
Learning.sum_rewardByCount_eq_sumRewards.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ โ„} (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ โ„)) : โˆ‘ m โˆˆ Finset.Icc 1 (pullCount A a t (Prod.fst ฯ‰)), rewardByCount A R' a m ฯ‰ = sumRewards A R' a t (Prod.fst ฯ‰)
Learning.sum_rewardByCount_eq_sumRewards.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {R' : โ„• โ†’ ฮฉ โ†’ โ„} (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ โ„)) : โˆ‘ m โˆˆ Finset.Icc 1 (pullCount A a t (Prod.fst ฯ‰)), rewardByCount A R' a m ฯ‰ = sumRewards A R' a t (Prod.fst ฯ‰)

Code

lemma sum_rewardByCount_eq_sumRewards {R' : โ„• โ†’ ฮฉ โ†’ โ„} (a : ๐“) (t : โ„•) (ฯ‰ : ฮฉ ร— (โ„• โ†’ ๐“ โ†’ โ„)) :
    โˆ‘ m โˆˆ Icc 1 (pullCount A a t ฯ‰.1), rewardByCount A R' a m ฯ‰ = sumRewards A R' a t ฯ‰.1
Type uses (3)
Body uses (3)

Actions: Source ยท Open Issue

Proof
by
  induction t with
  | zero => simp [pullCount, sumRewards]
  | succ t ht =>
    by_cases hta : A t ฯ‰.1 = a
    ยท rw [โ† hta] at ht โŠข
      rw [pullCount_action_eq_pullCount_add_one, sum_Icc_succ_top (Nat.le_add_left 1 _), ht]
      unfold sumRewards
      rw [sum_range_succ, if_pos rfl, rewardByCount_pullCount_add_one_eq_reward]
    ยท unfold sumRewards
      rwa [pullCount_eq_pullCount_of_action_ne hta, sum_range_succ, if_neg hta, add_zero]

sumRewards_add_one_eq_sumRewards'๐Ÿ”—

LemmaLearning.sumRewards_add_one_eq_sumRewards'

No docstring.

๐Ÿ”—theorem
Learning.sumRewards_add_one_eq_sumRewards'.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {R' : โ„• โ†’ ฮฉ โ†’ โ„} {n : โ„•} {ฯ‰ : ฮฉ} : sumRewards A R' a (n + 1) ฯ‰ = sumRewards' n (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a
Learning.sumRewards_add_one_eq_sumRewards'.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {R' : โ„• โ†’ ฮฉ โ†’ โ„} {n : โ„•} {ฯ‰ : ฮฉ} : sumRewards A R' a (n + 1) ฯ‰ = sumRewards' n (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a

Code

lemma sumRewards_add_one_eq_sumRewards' {R' : โ„• โ†’ ฮฉ โ†’ โ„} {n : โ„•} {ฯ‰ : ฮฉ} :
    sumRewards A R' a (n + 1) ฯ‰ = sumRewards' n (fun i โ†ฆ (A i ฯ‰, R' i ฯ‰)) a
Type uses (2)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  unfold sumRewards sumRewards'
  rw [Finset.sum_coe_sort (f := fun s โ†ฆ if A s ฯ‰ = a then R' s ฯ‰ else 0) (Iic n)]
  congr with m
  simp only [mem_range, mem_Iic]
  grind

sumRewards_eq_sumRewards'๐Ÿ”—

LemmaLearning.sumRewards_eq_sumRewards'

No docstring.

๐Ÿ”—theorem
Learning.sumRewards_eq_sumRewards'.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {R' : โ„• โ†’ ฮฉ โ†’ โ„} {n : โ„•} {ฯ‰ : ฮฉ} (hn : n โ‰  0) : sumRewards A R' a n ฯ‰ = sumRewards' (n - 1) (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a
Learning.sumRewards_eq_sumRewards'.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {R' : โ„• โ†’ ฮฉ โ†’ โ„} {n : โ„•} {ฯ‰ : ฮฉ} (hn : n โ‰  0) : sumRewards A R' a n ฯ‰ = sumRewards' (n - 1) (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a

Code

lemma sumRewards_eq_sumRewards' {R' : โ„• โ†’ ฮฉ โ†’ โ„} {n : โ„•} {ฯ‰ : ฮฉ} (hn : n โ‰  0) :
    sumRewards A R' a n ฯ‰ = sumRewards' (n - 1) (fun i โ†ฆ (A i ฯ‰, R' i ฯ‰)) a
Type uses (2)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  cases n with
  | zero => exact absurd rfl hn
  | succ n => simp [sumRewards_add_one_eq_sumRewards']

empMean_add_one_eq_empMean'๐Ÿ”—

LemmaLearning.empMean_add_one_eq_empMean'

No docstring.

๐Ÿ”—theorem
Learning.empMean_add_one_eq_empMean'.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {R' : โ„• โ†’ ฮฉ โ†’ โ„} {n : โ„•} {ฯ‰ : ฮฉ} : empMean A R' a (n + 1) ฯ‰ = empMean' n (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a
Learning.empMean_add_one_eq_empMean'.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {R' : โ„• โ†’ ฮฉ โ†’ โ„} {n : โ„•} {ฯ‰ : ฮฉ} : empMean A R' a (n + 1) ฯ‰ = empMean' n (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a

Code

lemma empMean_add_one_eq_empMean' {R' : โ„• โ†’ ฮฉ โ†’ โ„} {n : โ„•} {ฯ‰ : ฮฉ} :
    empMean A R' a (n + 1) ฯ‰ = empMean' n (fun i โ†ฆ (A i ฯ‰, R' i ฯ‰)) a
Type uses (2)
Body uses (6)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  unfold empMean empMean'
  rw [sumRewards_add_one_eq_sumRewards', pullCount_add_one_eq_pullCount']

empMean_eq_empMean'๐Ÿ”—

LemmaLearning.empMean_eq_empMean'

No docstring.

๐Ÿ”—theorem
Learning.empMean_eq_empMean'.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {R' : โ„• โ†’ ฮฉ โ†’ โ„} {n : โ„•} {ฯ‰ : ฮฉ} (hn : n โ‰  0) : empMean A R' a n ฯ‰ = empMean' (n - 1) (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a
Learning.empMean_eq_empMean'.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {R' : โ„• โ†’ ฮฉ โ†’ โ„} {n : โ„•} {ฯ‰ : ฮฉ} (hn : n โ‰  0) : empMean A R' a n ฯ‰ = empMean' (n - 1) (fun i => (A (โ†‘i) ฯ‰, R' (โ†‘i) ฯ‰)) a

Code

lemma empMean_eq_empMean' {R' : โ„• โ†’ ฮฉ โ†’ โ„} {n : โ„•} {ฯ‰ : ฮฉ} (hn : n โ‰  0) :
    empMean A R' a n ฯ‰ = empMean' (n - 1) (fun i โ†ฆ (A i ฯ‰, R' i ฯ‰)) a
Type uses (2)
Body uses (6)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  unfold empMean empMean'
  rw [sumRewards_eq_sumRewards' hn, pullCount_eq_pullCount' hn]

sumRewards_sub_pullCount_mul_eq_sum๐Ÿ”—

LemmaLearning.sumRewards_sub_pullCount_mul_eq_sum

No docstring.

๐Ÿ”—theorem
Learning.sumRewards_sub_pullCount_mul_eq_sum.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {n : โ„•} {ฯ‰ : ฮฉ} {R' : โ„• โ†’ ฮฉ โ†’ โ„} (c : ๐“ โ†’ โ„) : sumRewards A R' a (n + 1) ฯ‰ - โ†‘(pullCount A a (n + 1) ฯ‰) * c a = โˆ‘ i โˆˆ Finset.range (n + 1), if A i ฯ‰ = a then R' i ฯ‰ - c a else 0
Learning.sumRewards_sub_pullCount_mul_eq_sum.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {a : ๐“} {n : โ„•} {ฯ‰ : ฮฉ} {R' : โ„• โ†’ ฮฉ โ†’ โ„} (c : ๐“ โ†’ โ„) : sumRewards A R' a (n + 1) ฯ‰ - โ†‘(pullCount A a (n + 1) ฯ‰) * c a = โˆ‘ i โˆˆ Finset.range (n + 1), if A i ฯ‰ = a then R' i ฯ‰ - c a else 0

Code

lemma sumRewards_sub_pullCount_mul_eq_sum {R' : โ„• โ†’ ฮฉ โ†’ โ„} (c : ๐“ โ†’ โ„) :
    sumRewards A R' a (n + 1) ฯ‰ - pullCount A a (n + 1) ฯ‰ * c a =
      โˆ‘ i โˆˆ range (n + 1), (if A i ฯ‰ = a then R' i ฯ‰ - c a else 0)
Type uses (2)
Body uses (4)

Actions: Source ยท Open Issue

Proof
by
  induction n with
  | zero =>
    simp_rw [sumRewards_add_one, pullCount_add_one]
    simp only [sumRewards_zero, Pi.zero_apply, zero_add, pullCount_zero, Nat.cast_ite, Nat.cast_one,
      CharP.cast_eq_zero, ite_mul, one_mul, zero_mul, range_one, sum_singleton]
    grind
  | succ n hn =>
    simp_rw [sumRewards_add_one (t := n + 1), pullCount_add_one (t := n + 1)]
    split_ifs with ha
    ยท conv_rhs => rw [sum_range_succ]
      simp only [Nat.cast_add, Nat.cast_one, ha, โ†“reduceIte, add_mul, one_mul]
      grind
    ยท simp only [add_zero, hn]
      conv_rhs => rw [sum_range_succ]
      simp [ha]

measurable_sumRewards๐Ÿ”—

LemmaLearning.measurable_sumRewards

No docstring.

๐Ÿ”—theorem
Learning.measurable_sumRewards.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (t : โ„•) : Measurable (sumRewards A R' a t)
Learning.measurable_sumRewards.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (t : โ„•) : Measurable (sumRewards A R' a t)

Code

lemma measurable_sumRewards [MeasurableSingletonClass ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„}
    (hA : โˆ€ n, Measurable (A n)) (hR' : โˆ€ n, Measurable (R' n)) (a : ๐“) (t : โ„•) :
    Measurable (sumRewards A R' a t)
Type uses (1)
Used by (6)

Actions: Source ยท Open Issue

Proof
by
  unfold sumRewards
  have h_meas s : Measurable (fun h : ฮฉ โ†ฆ if A s h = a then R' s h else 0) := by
    refine Measurable.ite ?_ (by fun_prop) (by fun_prop)
    exact (measurableSet_singleton _).preimage (by fun_prop)
  fun_prop

measurable_uncurry_sumRewards_comp๐Ÿ”—

LemmaLearning.measurable_uncurry_sumRewards_comp

No docstring.

๐Ÿ”—theorem
Learning.measurable_uncurry_sumRewards_comp.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [Countable ๐“] [MeasurableSingletonClass ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) {f : ฮฉ โ†’ ๐“} (hf : Measurable f) {g : ฮฉ โ†’ โ„•} (hg : Measurable g) : Measurable fun ฯ‰ => sumRewards A R' (f ฯ‰) (g ฯ‰) ฯ‰
Learning.measurable_uncurry_sumRewards_comp.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [Countable ๐“] [MeasurableSingletonClass ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) {f : ฮฉ โ†’ ๐“} (hf : Measurable f) {g : ฮฉ โ†’ โ„•} (hg : Measurable g) : Measurable fun ฯ‰ => sumRewards A R' (f ฯ‰) (g ฯ‰) ฯ‰

Code

lemma measurable_uncurry_sumRewards_comp [Countable ๐“] [MeasurableSingletonClass ๐“]
    {R' : โ„• โ†’ ฮฉ โ†’ โ„} (hA : โˆ€ n, Measurable (A n)) (hR' : โˆ€ n, Measurable (R' n)) {f : ฮฉ โ†’ ๐“}
    (hf : Measurable f) {g : ฮฉ โ†’ โ„•} (hg : Measurable g) :
    Measurable (fun ฯ‰ โ†ฆ sumRewards A R' (f ฯ‰) (g ฯ‰) ฯ‰)
Type uses (1)
Body uses (1)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  change Measurable ((fun aฯ‰ โ†ฆ sumRewards A R' aฯ‰.1 (g aฯ‰.2) aฯ‰.2) โˆ˜ fun ฯ‰ โ†ฆ (f ฯ‰, ฯ‰))
  apply Measurable.comp _ (by fun_prop)
  refine measurable_from_prod_countable_right fun a โ†ฆ ?_
  change Measurable ((fun tฯ‰ โ†ฆ sumRewards A R' a tฯ‰.1 tฯ‰.2) โˆ˜ fun ฯ‰ โ†ฆ (g ฯ‰, ฯ‰))
  apply Measurable.comp _ (by fun_prop)
  exact measurable_from_prod_countable_right (fun t โ†ฆ measurable_sumRewards hA hR' a t)

measurable_empMean๐Ÿ”—

LemmaLearning.measurable_empMean

No docstring.

๐Ÿ”—theorem
Learning.measurable_empMean.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (n : โ„•) : Measurable (empMean A R' a n)
Learning.measurable_empMean.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [MeasurableSingletonClass ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) (a : ๐“) (n : โ„•) : Measurable (empMean A R' a n)

Code

lemma measurable_empMean [MeasurableSingletonClass ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} (hA : โˆ€ n, Measurable (A n))
    (hR' : โˆ€ n, Measurable (R' n)) (a : ๐“) (n : โ„•) :
    Measurable (empMean A R' a n)
Type uses (1)
Body uses (4)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  unfold empMean
  fun_prop

measurable_uncurry_empMean_comp๐Ÿ”—

LemmaLearning.measurable_uncurry_empMean_comp

No docstring.

๐Ÿ”—theorem
Learning.measurable_uncurry_empMean_comp.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [Countable ๐“] [MeasurableSingletonClass ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) {f : ฮฉ โ†’ ๐“} (hf : Measurable f) {g : ฮฉ โ†’ โ„•} (hg : Measurable g) : Measurable fun ฯ‰ => empMean A R' (f ฯ‰) (g ฯ‰) ฯ‰
Learning.measurable_uncurry_empMean_comp.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [Countable ๐“] [MeasurableSingletonClass ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} (hA : โˆ€ (n : โ„•), Measurable (A n)) (hR' : โˆ€ (n : โ„•), Measurable (R' n)) {f : ฮฉ โ†’ ๐“} (hf : Measurable f) {g : ฮฉ โ†’ โ„•} (hg : Measurable g) : Measurable fun ฯ‰ => empMean A R' (f ฯ‰) (g ฯ‰) ฯ‰

Code

lemma measurable_uncurry_empMean_comp [Countable ๐“] [MeasurableSingletonClass ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„}
    (hA : โˆ€ n, Measurable (A n)) (hR' : โˆ€ n, Measurable (R' n)) {f : ฮฉ โ†’ ๐“} (hf : Measurable f)
    {g : ฮฉ โ†’ โ„•} (hg : Measurable g) :
    Measurable (fun ฯ‰ โ†ฆ empMean A R' (f ฯ‰) (g ฯ‰) ฯ‰)
Type uses (1)
Body uses (4)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  unfold empMean
  fun_prop

measurable_sumRewards'๐Ÿ”—

LemmaLearning.measurable_sumRewards'

No docstring.

๐Ÿ”—theorem
Learning.measurable_sumRewards'.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} [DecidableEq ๐“] [MeasurableSingletonClass ๐“] (n : โ„•) (a : ๐“) : Measurable fun h => sumRewards' n h a
Learning.measurable_sumRewards'.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} [DecidableEq ๐“] [MeasurableSingletonClass ๐“] (n : โ„•) (a : ๐“) : Measurable fun h => sumRewards' n h a

Code

lemma measurable_sumRewards' [MeasurableSingletonClass ๐“] (n : โ„•) (a : ๐“) :
    Measurable (fun h โ†ฆ sumRewards' n h a)
Type uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  simp_rw [sumRewards']
  have h_meas s : Measurable (fun (h : Iic n โ†’ ๐“ ร— โ„) โ†ฆ if (h s).1 = a then (h s).2 else 0) := by
    refine Measurable.ite ?_ (by fun_prop) (by fun_prop)
    exact (measurableSet_singleton _).preimage (by fun_prop)
  fun_prop

measurable_uncurry_sumRewards'๐Ÿ”—

LemmaLearning.measurable_uncurry_sumRewards'

No docstring.

๐Ÿ”—theorem
Learning.measurable_uncurry_sumRewards'.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} [DecidableEq ๐“] [MeasurableEq ๐“] (n : โ„•) : Measurable fun p => sumRewards' n (Prod.fst p) (Prod.snd p)
Learning.measurable_uncurry_sumRewards'.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} [DecidableEq ๐“] [MeasurableEq ๐“] (n : โ„•) : Measurable fun p => sumRewards' n (Prod.fst p) (Prod.snd p)

Code

lemma measurable_uncurry_sumRewards' [MeasurableEq ๐“] (n : โ„•) :
    Measurable (fun p : (Iic n โ†’ ๐“ ร— โ„) ร— ๐“ โ†ฆ sumRewards' n p.1 p.2)
Type uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  simp_rw [sumRewards']
  have h_meas s : Measurable (fun p : (Iic n โ†’ ๐“ ร— โ„) ร— ๐“ โ†ฆ
      if (p.1 s).1 = p.2 then (p.1 s).2 else 0) := by
    refine Measurable.ite ?_ (by fun_prop) (by fun_prop)
    exact measurableSet_eq_fun (by fun_prop) (by fun_prop)
  fun_prop

measurable_empMean'๐Ÿ”—

LemmaLearning.measurable_empMean'

No docstring.

๐Ÿ”—theorem
Learning.measurable_empMean'.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} [DecidableEq ๐“] [MeasurableSingletonClass ๐“] (n : โ„•) (a : ๐“) : Measurable fun h => empMean' n h a
Learning.measurable_empMean'.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} [DecidableEq ๐“] [MeasurableSingletonClass ๐“] (n : โ„•) (a : ๐“) : Measurable fun h => empMean' n h a

Code

lemma measurable_empMean' [MeasurableSingletonClass ๐“] (n : โ„•) (a : ๐“) :
    Measurable (fun h โ†ฆ empMean' n h a)
Type uses (1)
Body uses (4)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  unfold empMean'
  fun_prop

measurable_uncurry_empMean'๐Ÿ”—

LemmaLearning.measurable_uncurry_empMean'

No docstring.

๐Ÿ”—theorem
Learning.measurable_uncurry_empMean'.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} [DecidableEq ๐“] [MeasurableEq ๐“] (n : โ„•) : Measurable fun p => empMean' n (Prod.fst p) (Prod.snd p)
Learning.measurable_uncurry_empMean'.{u_1} {๐“ : Type u_1} {m๐“ : MeasurableSpace ๐“} [DecidableEq ๐“] [MeasurableEq ๐“] (n : โ„•) : Measurable fun p => empMean' n (Prod.fst p) (Prod.snd p)

Code

lemma measurable_uncurry_empMean' [MeasurableEq ๐“] (n : โ„•) :
    Measurable (fun p : (Iic n โ†’ ๐“ ร— โ„) ร— ๐“ โ†ฆ empMean' n p.1 p.2)
Type uses (1)
Body uses (4)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  unfold empMean'
  fun_prop

isPredictable_sumRewards๐Ÿ”—

LemmaLearning.IsAlgEnvSeq.isPredictable_sumRewards

No docstring.

๐Ÿ”—theorem
Learning.IsAlgEnvSeq.isPredictable_sumRewards.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„} (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) : MeasureTheory.IsStronglyPredictable (filtration โ‹ฏ โ‹ฏ) (sumRewards A R' a)
Learning.IsAlgEnvSeq.isPredictable_sumRewards.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„} (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) : MeasureTheory.IsStronglyPredictable (filtration โ‹ฏ โ‹ฏ) (sumRewards A R' a)

Code

lemma IsAlgEnvSeq.isPredictable_sumRewards [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„}
    {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„}
    (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) :
    IsStronglyPredictable (IsAlgEnvSeq.filtration h.measurable_action h.measurable_feedback)
      (sumRewards A R' a)
Type uses (5)
Body uses (3)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  rw [IsStronglyPredictable.iff_measurable_add_one]
  constructor
  ยท simp only [sumRewards_zero]
    fun_prop
  refine fun n โ†ฆ Measurable.stronglyMeasurable ?_
  refine measurable_fun_sum _ fun i hi โ†ฆ Measurable.ite ?_ ?_ (by fun_prop)
  ยท refine (measurableSet_singleton a).preimage ?_
    have h_meas_i := IsAlgEnvSeq.adapted_action h.measurable_action h.measurable_feedback i
    simp only [mem_range] at hi
    exact h_meas_i.mono ((IsAlgEnvSeq.filtration h.measurable_action h.measurable_feedback).mono
      (by lia)) le_rfl
  ยท have h_meas_i := IsAlgEnvSeq.adapted_feedback h.measurable_action h.measurable_feedback i
    simp only [mem_range] at hi
    exact h_meas_i.mono ((IsAlgEnvSeq.filtration h.measurable_action h.measurable_feedback).mono
      (by lia)) le_rfl

stronglyAdapted_sumRewards_add_one๐Ÿ”—

LemmaLearning.IsAlgEnvSeq.stronglyAdapted_sumRewards_add_one

No docstring.

๐Ÿ”—theorem
Learning.IsAlgEnvSeq.stronglyAdapted_sumRewards_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„} (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) : MeasureTheory.StronglyAdapted (filtration โ‹ฏ โ‹ฏ) fun n => sumRewards A R' a (n + 1)
Learning.IsAlgEnvSeq.stronglyAdapted_sumRewards_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„} (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) : MeasureTheory.StronglyAdapted (filtration โ‹ฏ โ‹ฏ) fun n => sumRewards A R' a (n + 1)

Code

lemma IsAlgEnvSeq.stronglyAdapted_sumRewards_add_one [StandardBorelSpace ๐“]
    {R' : โ„• โ†’ ฮฉ โ†’ โ„} {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„}
    (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) :
    StronglyAdapted (IsAlgEnvSeq.filtration h.measurable_action h.measurable_feedback)
      (fun n โ†ฆ sumRewards A R' a (n + 1))
Type uses (5)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  have h_predictable := h.isPredictable_sumRewards a
  rw [IsStronglyPredictable.iff_measurable_add_one] at h_predictable
  exact h_predictable.2

adapted_sumRewards_add_one๐Ÿ”—

LemmaLearning.IsAlgEnvSeq.adapted_sumRewards_add_one

No docstring.

๐Ÿ”—theorem
Learning.IsAlgEnvSeq.adapted_sumRewards_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„} (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) : MeasureTheory.Adapted (filtration โ‹ฏ โ‹ฏ) fun n => sumRewards A R' a (n + 1)
Learning.IsAlgEnvSeq.adapted_sumRewards_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„} (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) : MeasureTheory.Adapted (filtration โ‹ฏ โ‹ฏ) fun n => sumRewards A R' a (n + 1)

Code

lemma IsAlgEnvSeq.adapted_sumRewards_add_one [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„}
    {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„}
    (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) :
    Adapted (IsAlgEnvSeq.filtration h.measurable_action h.measurable_feedback)
      (fun n โ†ฆ sumRewards A R' a (n + 1))
Type uses (5)
Body uses (1)

Actions: Source ยท Open Issue

Proof
(h.stronglyAdapted_sumRewards_add_one a).adapted

divโ‚€'๐Ÿ”—

LemmaMeasureTheory.StronglyMeasurable.divโ‚€'

No docstring.

๐Ÿ”—theorem
MeasureTheory.StronglyMeasurable.divโ‚€'.{u_4, u_5} {๐“ : Type u_4} {ฮฒ : Type u_5} {m๐“ : MeasurableSpace ๐“} [TopologicalSpace ฮฒ] [GroupWithZero ฮฒ] [ContinuousMul ฮฒ] [ContinuousInvโ‚€ ฮฒ] [TopologicalSpace.PseudoMetrizableSpace ฮฒ] [MeasurableSpace ฮฒ] [BorelSpace ฮฒ] [MeasurableSingletonClass ฮฒ] {f g : ๐“ โ†’ ฮฒ} (hf : StronglyMeasurable f) (hg : StronglyMeasurable g) : StronglyMeasurable (f / g)
MeasureTheory.StronglyMeasurable.divโ‚€'.{u_4, u_5} {๐“ : Type u_4} {ฮฒ : Type u_5} {m๐“ : MeasurableSpace ๐“} [TopologicalSpace ฮฒ] [GroupWithZero ฮฒ] [ContinuousMul ฮฒ] [ContinuousInvโ‚€ ฮฒ] [TopologicalSpace.PseudoMetrizableSpace ฮฒ] [MeasurableSpace ฮฒ] [BorelSpace ฮฒ] [MeasurableSingletonClass ฮฒ] {f g : ๐“ โ†’ ฮฒ} (hf : StronglyMeasurable f) (hg : StronglyMeasurable g) : StronglyMeasurable (f / g)

Code

lemma _root_.MeasureTheory.StronglyMeasurable.divโ‚€' {๐“ ฮฒ : Type*}
    {m๐“ : MeasurableSpace ๐“} [TopologicalSpace ฮฒ]
    [GroupWithZero ฮฒ] [ContinuousMul ฮฒ] [ContinuousInvโ‚€ ฮฒ]
    [TopologicalSpace.PseudoMetrizableSpace ฮฒ]
    [MeasurableSpace ฮฒ] [BorelSpace ฮฒ] [MeasurableSingletonClass ฮฒ]
    {f g : ๐“ โ†’ ฮฒ} (hf : StronglyMeasurable f) (hg : StronglyMeasurable g) :
    StronglyMeasurable (f / g)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  refine โŸจfun n => hf.approx n / (hg.approx n).restrict {x | g x โ‰  0}, fun x => ?_โŸฉ
  have : MeasurableSet {x | g x โ‰  0} := ((MeasurableSet.singleton 0).preimage hg.measurable).compl
  by_cases h : g x = 0
  ยท simp_all only [ne_eq, SimpleFunc.coe_div, SimpleFunc.coe_restrict, Pi.div_apply, mem_setOf_eq,
      not_true_eq_false, not_false_eq_true, indicator_of_notMem, _root_.div_zero]
    exact tendsto_const_nhds
  ยท simp_all only [ne_eq, SimpleFunc.coe_div, SimpleFunc.coe_restrict,
      Pi.div_apply, mem_setOf_eq, not_false_eq_true, indicator_of_mem]
    exact (hf.tendsto_approx x).div (hg.tendsto_approx x) h

isPredictable_empMean๐Ÿ”—

LemmaLearning.IsAlgEnvSeq.isPredictable_empMean

No docstring.

๐Ÿ”—theorem
Learning.IsAlgEnvSeq.isPredictable_empMean.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„} (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) : MeasureTheory.IsStronglyPredictable (filtration โ‹ฏ โ‹ฏ) (empMean A R' a)
Learning.IsAlgEnvSeq.isPredictable_empMean.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„} (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) : MeasureTheory.IsStronglyPredictable (filtration โ‹ฏ โ‹ฏ) (empMean A R' a)

Code

lemma IsAlgEnvSeq.isPredictable_empMean [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„}
    {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„}
    (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) :
    IsStronglyPredictable (IsAlgEnvSeq.filtration h.measurable_action h.measurable_feedback)
      (empMean A R' a)
Type uses (5)
Body uses (5)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  unfold empMean
  refine StronglyMeasurable.divโ‚€' ?_ ?_
  ยท exact h.isPredictable_sumRewards a
  ยท have h_meas := (isPredictable_pullCount h.measurable_action h.measurable_feedback a).measurable
    fun_prop

stronglyAdapted_empMean_add_one๐Ÿ”—

LemmaLearning.IsAlgEnvSeq.stronglyAdapted_empMean_add_one

No docstring.

๐Ÿ”—theorem
Learning.IsAlgEnvSeq.stronglyAdapted_empMean_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„} (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) : MeasureTheory.StronglyAdapted (filtration โ‹ฏ โ‹ฏ) fun n => empMean A R' a (n + 1)
Learning.IsAlgEnvSeq.stronglyAdapted_empMean_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„} (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) : MeasureTheory.StronglyAdapted (filtration โ‹ฏ โ‹ฏ) fun n => empMean A R' a (n + 1)

Code

lemma IsAlgEnvSeq.stronglyAdapted_empMean_add_one [StandardBorelSpace ๐“]
    {R' : โ„• โ†’ ฮฉ โ†’ โ„} {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„}
    (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) :
    StronglyAdapted (IsAlgEnvSeq.filtration h.measurable_action h.measurable_feedback)
      (fun n โ†ฆ empMean A R' a (n + 1))
Type uses (5)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  have h_predictable := h.isPredictable_empMean a
  rw [IsStronglyPredictable.iff_measurable_add_one] at h_predictable
  exact h_predictable.2

adapted_empMean_add_one๐Ÿ”—

LemmaLearning.IsAlgEnvSeq.adapted_empMean_add_one

No docstring.

๐Ÿ”—theorem
Learning.IsAlgEnvSeq.adapted_empMean_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„} (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) : MeasureTheory.Adapted (filtration โ‹ฏ โ‹ฏ) fun n => empMean A R' a (n + 1)
Learning.IsAlgEnvSeq.adapted_empMean_add_one.{u_1, u_3} {๐“ : Type u_1} {ฮฉ : Type u_3} {m๐“ : MeasurableSpace ๐“} {mฮฉ : MeasurableSpace ฮฉ} [DecidableEq ๐“] {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„} {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„} (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) : MeasureTheory.Adapted (filtration โ‹ฏ โ‹ฏ) fun n => empMean A R' a (n + 1)

Code

lemma IsAlgEnvSeq.adapted_empMean_add_one [StandardBorelSpace ๐“] {R' : โ„• โ†’ ฮฉ โ†’ โ„}
    {alg : Algorithm ๐“ โ„} {env : Environment ๐“ โ„}
    (h : IsAlgEnvSeq A R' alg env P) (a : ๐“) :
    Adapted (IsAlgEnvSeq.filtration h.measurable_action h.measurable_feedback)
      (fun n โ†ฆ empMean A R' a (n + 1))
Type uses (5)
Body uses (1)

Actions: Source ยท Open Issue

Proof
(h.stronglyAdapted_empMean_add_one a).adapted
  1. Learning.pullCount
  2. Learning.pullCount'
  3. Learning.pullCount_zero
  4. Learning.pullCount_zero_apply
  5. Learning.pullCount_one
  6. Learning.monotone_pullCount
  7. Learning.pullCount_mono
  8. Learning.pullCount_action_eq_pullCount_add_one
  9. Learning.pullCount_eq_pullCount_of_action_ne
  10. Learning.pullCount_add_one
  11. Learning.pullCount_eq_sum
  12. Learning.pullCount'_eq_sum
  13. Learning.pullCount_add_one_eq_pullCount'
  14. Learning.pullCount_eq_pullCount'
  15. Learning.pullCount'_mono
  16. Learning.pullCount_le
  17. Learning.pullCount_congr
  18. Learning.pullCount_lt_of_forall_ne
  19. Learning.exists_pullCount_eq_of_le
  20. Learning.pullCount_le_add
  21. Learning.measurable_pullCount
  22. Learning.measurable_uncurry_pullCount
  23. Learning.measurable_uncurry_pullCount_comp
  24. Learning.measurable_pullCount'
  25. Learning.measurable_uncurry_pullCount'
  26. Learning.adapted_pullCount_add_one
  27. Learning.stronglyAdapted_pullCount_add_one
  28. Learning.isPredictable_pullCount
  29. Learning.integrable_pullCount
  30. Learning.stepsUntil
  31. Learning.stepsUntil_eq_top_iff
  32. Learning.stepsUntil_ne_top
  33. Learning.exists_pullCount_eq
  34. Learning.stepsUntil_zero_of_ne
  35. Learning.stepsUntil_zero_of_eq
  36. Learning.stepsUntil_eq_dite
  37. Learning.stepsUntil_eq_leastGE
  38. Learning.stepsUntil_mono
  39. Learning.stepsUntil_pullCount_le
  40. Learning.stepsUntil_pullCount_eq
  41. Learning.stepsUntil_one_of_eq
  42. Learning.stepsUntil_eq_zero_iff
  43. Learning.action_stepsUntil
  44. Learning.action_eq_of_stepsUntil_eq_coe
  45. Learning.pullCount_stepsUntil_add_one
  46. Learning.pullCount_stepsUntil
  47. Learning.pullCount_lt_of_le_stepsUntil
  48. Learning.pullCount_eq_of_stepsUntil_eq_coe
  49. Learning.pullCount_add_one_eq_of_stepsUntil_eq_coe
  50. Learning.stepsUntil_eq_iff
  51. Learning.stepsUntil_eq_iff'
  52. Learning.stepsUntil_eq_congr
  53. Learning.isStoppingTime_stepsUntil
  54. Learning.measurable_stepsUntil
  55. Learning.measurable_stepsUntil'
  56. Learning.measurable_comap_indicator_stepsUntil_eq
  57. Learning.measurable_indicator_stepsUntil_eq
  58. Learning.measurableSet_stepsUntil_eq_zero
  59. Learning.measurable_comap_indicator_stepsUntil_eq_zero
  60. Learning.measurableSet_stepsUntil_eq
  61. Learning.isStoppingTime_stepsUntil_filtrationAction
  62. Learning.rewardByCount
  63. Learning.rewardByCount_eq_ite
  64. Learning.rewardByCount_eq_add
  65. Learning.rewardByCount_of_stepsUntil_eq_top
  66. Learning.rewardByCount_of_stepsUntil_ne_top
  67. Learning.rewardByCount_eq_stoppedValue
  68. Learning.rewardByCount_of_stepsUntil_eq_coe
  69. Learning.rewardByCount_zero
  70. Learning.rewardByCount_pullCount_add_one_eq_reward
  71. Learning.measurable_rewardByCount
  72. Learning.sum_pullCount_mul
  73. Learning.sum_pullCount
  74. Learning.sum_comp_pullCount
  75. Learning.sumRewards
  76. Learning.sumRewards'
  77. Learning.empMean
  78. Learning.empMean'
  79. Learning.sumRewards_zero
  80. Learning.sumRewards_add_one
  81. Learning.sumRewards_eq_of_pullCount_eq
  82. Learning.sumRewards_eq_pullCount_mul_empMean
  83. Learning.sum_rewardByCount_eq_sumRewards
  84. Learning.sumRewards_add_one_eq_sumRewards'
  85. Learning.sumRewards_eq_sumRewards'
  86. Learning.empMean_add_one_eq_empMean'
  87. Learning.empMean_eq_empMean'
  88. Learning.sumRewards_sub_pullCount_mul_eq_sum
  89. Learning.measurable_sumRewards
  90. Learning.measurable_uncurry_sumRewards_comp
  91. Learning.measurable_empMean
  92. Learning.measurable_uncurry_empMean_comp
  93. Learning.measurable_sumRewards'
  94. Learning.measurable_uncurry_sumRewards'
  95. Learning.measurable_empMean'
  96. Learning.measurable_uncurry_empMean'
  97. Learning.IsAlgEnvSeq.isPredictable_sumRewards
  98. Learning.IsAlgEnvSeq.stronglyAdapted_sumRewards_add_one
  99. Learning.IsAlgEnvSeq.adapted_sumRewards_add_one
  100. MeasureTheory.StronglyMeasurable.divโ‚€'
  101. Learning.IsAlgEnvSeq.isPredictable_empMean
  102. Learning.IsAlgEnvSeq.stronglyAdapted_empMean_add_one
  103. Learning.IsAlgEnvSeq.adapted_empMean_add_one