Learning.measurable_uncurry_sumRewards_comp
This page shows the declaration's own card, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards follow only the dependencies of its statement: the proof is treated as sorry, because what a theorem states does not depend on how it is proved. For every other declaration, both the type and the body/value are followed, since their content is part of what later declarations build on.
measurable_uncurry_sumRewards_comp
Learning.measurable_uncurry_sumRewards_comp
No docstring.
Learning.measurable_uncurry_sumRewards_comp.{u_1, u_3} {𝓐 : Type u_1} {Ω : Type u_3} {m𝓐 : MeasurableSpace 𝓐} {mΩ : MeasurableSpace Ω} [DecidableEq 𝓐] {A : ℕ → Ω → 𝓐} [Countable 𝓐] [MeasurableSingletonClass 𝓐] {R' : ℕ → Ω → ℝ} (hA : ∀ (n : ℕ), Measurable (A n)) (hR' : ∀ (n : ℕ), Measurable (R' n)) {f : Ω → 𝓐} (hf : Measurable f) {g : Ω → ℕ} (hg : Measurable g) : Measurable fun ω => sumRewards A R' (f ω) (g ω) ω
Code
lemma measurable_uncurry_sumRewards_comp [Countable 𝓐] [MeasurableSingletonClass 𝓐]
    {R' : ℕ → Ω → ℝ} (hA : ∀ n, Measurable (A n)) (hR' : ∀ n, Measurable (R' n)) {f : Ω → 𝓐}
    (hf : Measurable f) {g : Ω → ℕ} (hg : Measurable g) :
    Measurable (fun ω ↦ sumRewards A R' (f ω) (g ω) ω)
Type uses (1)
Body uses (1)
Proof
by
  change Measurable ((fun aω ↦ sumRewards A R' aω.1 (g aω.2) aω.2) ∘ fun ω ↦ (f ω, ω))
  apply Measurable.comp _ (by fun_prop)
  refine measurable_from_prod_countable_right fun a ↦ ?_
  change Measurable ((fun tω ↦ sumRewards A R' a tω.1 tω.2) ∘ fun ω ↦ (g ω, ω))
  apply Measurable.comp _ (by fun_prop)
  exact measurable_from_prod_countable_right (fun t ↦ measurable_sumRewards hA hR' a t)
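The proof peels off one countable argument at a time, which may be easier to see written as two factorizations. The block below is only a reading aid: the names F and G_a are introduced here for illustration and do not appear in the source.

```latex
\begin{align*}
\omega \mapsto \mathrm{sumRewards}\,A\,R'\,(f\,\omega)\,(g\,\omega)\,\omega
  &= F \circ \bigl(\omega \mapsto (f\,\omega,\ \omega)\bigr),
  & F(a, \omega) &= \mathrm{sumRewards}\,A\,R'\,a\,(g\,\omega)\,\omega, \\
\omega \mapsto F(a, \omega)
  &= G_a \circ \bigl(\omega \mapsto (g\,\omega,\ \omega)\bigr),
  & G_a(t, \omega) &= \mathrm{sumRewards}\,A\,R'\,a\,t\,\omega.
\end{align*}
```

In both lines the pairing map is measurable by fun_prop (using hf, respectively hg). The outer map is handled by measurable_from_prod_countable_right, which reduces measurability on the product to measurability of each section with the countable argument (a, then t) held fixed. The last such section is exactly measurable_sumRewards hA hR' a t.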
Dependency graph
Type dependencies (1)
sumRewards
Learning.sumRewards
Sum of rewards obtained when pulling action a up to time t (exclusive).
Learning.sumRewards.{u_1, u_3} {𝓐 : Type u_1} {Ω : Type u_3} [DecidableEq 𝓐] (A : ℕ → Ω → 𝓐) (R' : ℕ → Ω → ℝ) (a : 𝓐) (t : ℕ) (ω : Ω) : ℝ
Code
def sumRewards (A : ℕ → Ω → 𝓐) (R' : ℕ → Ω → ℝ) (a : 𝓐) (t : ℕ) (ω : Ω) : ℝ :=
  ∑ s ∈ range t, if A s ω = a then R' s ω else 0
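To make the "up to time t (exclusive)" convention concrete, here is a minimal sketch that unfolds the sum at t = 0 and at t + 1. It assumes a recent Mathlib (for the `∑ s ∈ range t` notation) and works on a local copy named sumRewards', a hypothetical name chosen here so it does not clash with Learning.sumRewards. These examples are illustrations, not lemmas from the source project.

```lean
import Mathlib

open Finset

noncomputable section

-- Local copy of the definition above (hypothetical name `sumRewards'`).
def sumRewards' {𝓐 Ω : Type*} [DecidableEq 𝓐]
    (A : ℕ → Ω → 𝓐) (R' : ℕ → Ω → ℝ) (a : 𝓐) (t : ℕ) (ω : Ω) : ℝ :=
  ∑ s ∈ range t, if A s ω = a then R' s ω else 0

variable {𝓐 Ω : Type*} [DecidableEq 𝓐]
  (A : ℕ → Ω → 𝓐) (R' : ℕ → Ω → ℝ) (a : 𝓐) (ω : Ω)

-- `range 0` is empty, so no reward has been collected before time 0.
example : sumRewards' A R' a 0 ω = 0 := by
  simp [sumRewards']

-- Step `t` adds `R' t ω` exactly when action `a` is pulled at time `t`.
example (t : ℕ) :
    sumRewards' A R' a (t + 1) ω =
      sumRewards' A R' a t ω + if A t ω = a then R' t ω else 0 := by
  simp only [sumRewards', Finset.sum_range_succ]

end
```

The second example shows why the bound is exclusive: the reward at time t enters the sum only once the upper bound reaches t + 1.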
Used by (44)