Learning.measurable_sumRewards
This page has the declaration's own card below, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards only follow its statement's dependencies (its proof is replaced by sorry, so what it proves doesn't depend on how); for everything else, both the type and the body/value are followed, since their content is part of what later declarations build on.
measurable_sumRewards
Learning.measurable_sumRewards
No docstring.
Learning.measurable_sumRewards.{u_1, u_3} {𝓐 : Type u_1} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {mΩ : MeasurableSpace Ω} [DecidableEq 𝓐] {A : ℕ → Ω → 𝓐} [MeasurableSingletonClass 𝓐] {R' : ℕ → Ω → ℝ} (hA : ∀ (n : ℕ), Measurable (A n)) (hR' : ∀ (n : ℕ), Measurable (R' n)) (a : 𝓐) (t : ℕ) : Measurable (sumRewards A R' a t)
Code
lemma measurable_sumRewards [MeasurableSingletonClass 𝓐] {R' : ℕ → Ω → ℝ}
    (hA : ∀ n, Measurable (A n)) (hR' : ∀ n, Measurable (R' n)) (a : 𝓐) (t : ℕ) :
    Measurable (sumRewards A R' a t)
Type uses (1)
Used by (6)
Proof
by
  unfold sumRewards
  have h_meas s : Measurable (fun h : Ω ↦ if A s h = a then R' s h else 0) := by
    refine Measurable.ite ?_ (by fun_prop) (by fun_prop)
    exact (measurableSet_singleton _).preimage (by fun_prop)
  fun_prop
Dependency graph
Type dependencies (1)
sumRewards
Learning.sumRewards
Sum of rewards obtained when pulling action a up to time t (exclusive).
Learning.sumRewards.{u_1, u_3} {𝓐 : Type u_1} {Ω : Type u_3} [DecidableEq 𝓐] (A : ℕ → Ω → 𝓐) (R' : ℕ → Ω → ℝ) (a : 𝓐) (t : ℕ) (ω : Ω) : ℝ
Code
def sumRewards (A : ℕ → Ω → 𝓐) (R' : ℕ → Ω → ℝ) (a : 𝓐) (t : ℕ) (ω : Ω) : ℝ :=
  ∑ s ∈ range t, if A s ω = a then R' s ω else 0
Used by (44)
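The following is a minimal sketch, not part of the library, of how the definition unfolds and why each summand is measurable. It assumes only `import Mathlib`. The namespace `Sketch` and the local copy of `sumRewards` are hypothetical, introduced so the file does not depend on the project's module layout, and the names `f` and `g` in the last example stand in for `A s` and `R' s`. It has not been checked against a particular Mathlib version.

```lean
import Mathlib

open Finset

namespace Sketch

variable {𝓐 Ω : Type*} [DecidableEq 𝓐]

/-- Local copy of `Learning.sumRewards`, restated so this file only needs Mathlib.
`noncomputable` is added defensively here; the declaration shown above has no such marker. -/
noncomputable def sumRewards (A : ℕ → Ω → 𝓐) (R' : ℕ → Ω → ℝ) (a : 𝓐) (t : ℕ) (ω : Ω) : ℝ :=
  ∑ s ∈ range t, if A s ω = a then R' s ω else 0

-- At time `0` the sum ranges over `range 0 = ∅`, so it is `0`.
example (A : ℕ → Ω → 𝓐) (R' : ℕ → Ω → ℝ) (a : 𝓐) (ω : Ω) :
    sumRewards A R' a 0 ω = 0 := by
  simp [sumRewards]

-- "Exclusive" upper bound: step `t` is only counted in `sumRewards … (t + 1)`,
-- where it contributes `R' t ω` if action `a` was pulled at time `t`, and `0` otherwise.
example (A : ℕ → Ω → 𝓐) (R' : ℕ → Ω → ℝ) (a : 𝓐) (t : ℕ) (ω : Ω) :
    sumRewards A R' a (t + 1) ω
      = sumRewards A R' a t ω + (if A t ω = a then R' t ω else 0) := by
  simp [sumRewards, Finset.sum_range_succ]

-- The single-summand step of the measurability proof, with explicit hypotheses
-- instead of `fun_prop`: the set `{ω | f ω = a}` is the preimage of the
-- measurable singleton `{a}` under the measurable map `f`.
example [MeasurableSpace 𝓐] [MeasurableSpace Ω] [MeasurableSingletonClass 𝓐]
    {f : Ω → 𝓐} {g : Ω → ℝ} (hf : Measurable f) (hg : Measurable g) (a : 𝓐) :
    Measurable (fun ω ↦ if f ω = a then g ω else 0) := by
  refine Measurable.ite ?_ hg measurable_const
  exact (measurableSet_singleton a).preimage hf

end Sketch
```

The last example is where `MeasurableSingletonClass 𝓐` is used: without measurable singletons there is no reason for `{ω | A s ω = a}` to be a measurable set. The final `fun_prop` in the proof above then closes the goal for the finite sum over `range t`.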