Learning.IsBayesAlgEnvSeq.hasLaw_IT_action_zero
This page has the declaration's own card below, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards only follow its statement's dependencies (its proof is replaced by sorry, so what it proves doesn't depend on how); for everything else, both the type and the body/value are followed, since their content is part of what later declarations build on.
hasLaw_IT_action_zero
Learning.IsBayesAlgEnvSeq.hasLaw_IT_action_zero
No docstring.
Learning.IsBayesAlgEnvSeq.hasLaw_IT_action_zero.{u_1, u_2, u_3, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {𝓨 : Type u_3} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω] {Q : MeasureTheory.Measure 𝓔} {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) 𝓨} {alg : Algorithm 𝓐 𝓨} {E : Ω → 𝓔} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] (h : IsBayesAlgEnvSeq Q κ alg E A Y P) : ∀ᵐ (e : 𝓔) ∂Q, ProbabilityTheory.HasLaw (IT.action 0) (Algorithm.p0 alg) (𝓛[trajectory A Y | E; P] e)
Code
lemma hasLaw_IT_action_zero (h : IsBayesAlgEnvSeq Q κ alg E A Y P) :
    ∀ᵐ e ∂Q, HasLaw (IT.action 0) alg.p0 (condDistrib (trajectory A Y) E P e)
Type uses (4)
Used by (1)
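Read in measure-theoretic notation, the lemma says that the conditional law of the trajectory given the parameter, pushed forward by the time-0 action map, is the algorithm's initial action distribution. The display below is a sketch of that reading. It assumes Mathlib's meaning of `HasLaw X μ P` (X is a.e.-measurable under P and the pushforward of P by X equals μ). The symbols \pi_0 (for `IT.action 0`) and p_0 (for `alg.p0`) are shorthand introduced here, not names from the library.

```latex
\text{for } Q\text{-almost every } e:\qquad
(\pi_0)_{*}\,\mathcal{L}\big[(A_n, Y_n)_{n \in \mathbb{N}} \,\big|\, E ;\, P\big](e) \;=\; p_0,
\qquad\text{where } \pi_0(h) = \big(h(0)\big)_1 .
```

The proof on this page reaches this identity through the chain
(\pi_0)_* \mathcal{L}[\text{trajectory} \mid E](e) = \mathcal{L}[\pi_0 \circ \text{trajectory} \mid E](e) = \mathcal{L}[A_0 \mid E](e) = p_0, where the last step uses the field `hasCondDistrib_action_zero`.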
Proof
by
  rw [← h.hasLaw_env.map_eq]
  filter_upwards [condDistrib_comp E
    ((measurable_trajectory h.measurable_action h.measurable_feedback).aemeasurable)
    (IT.measurable_action (𝓐 := 𝓐) (𝓨 := 𝓨) 0),
    h.hasCondDistrib_action_zero.condDistrib_eq] with _ hc hcd
  exact ⟨(IT.measurable_action 0).aemeasurable, by
    rw [← Kernel.map_apply _ (IT.measurable_action 0), ← hc,
      show IT.action 0 ∘ trajectory A Y = A 0 from rfl, hcd, Kernel.const_apply]⟩
Dependency graph
Type dependencies (4)
Algorithm
Learning.Algorithm
A stochastic, sequential algorithm.
Learning.Algorithm.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)
Code
structure Algorithm (𝓐 𝓨 : Type*) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] where
  /-- Policy or sampling rule: distribution of the next action. -/
  policy : (n : ℕ) → Kernel (Iic n → 𝓐 × 𝓨) 𝓐
  /-- The policy is a Markov kernel. -/
  [h_policy : ∀ n, IsMarkovKernel (policy n)]
  /-- Distribution of the first action. -/
  p0 : Measure 𝓐
  /-- The first action distribution is a probability measure. -/
  [hp0 : IsProbabilityMeasure p0]
Used by (216)
IsBayesAlgEnvSeq
Learning.IsBayesAlgEnvSeq
IsBayesAlgEnvSeq Q κ alg E A Y P states that there is a measure P : Measure Ω such
that the parameter E : Ω → 𝓔 has law Q and that the sequences of actions A : ℕ → Ω → 𝓐
and feedbacks Y : ℕ → Ω → 𝓨 are generated by the algorithm alg : Algorithm 𝓐 𝓨 interacting
with an underlying environment that depends on E and κ (stationaryEnv (κ.sectR (E ω))).
Learning.IsBayesAlgEnvSeq.{u_1, u_2, u_3, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {𝓨 : Type u_3} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω] (Q : MeasureTheory.Measure 𝓔) (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) 𝓨) (alg : Algorithm 𝓐 𝓨) (E : Ω → 𝓔) (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (P : MeasureTheory.Measure Ω) [MeasureTheory.IsFiniteMeasure P] : Prop
Code
structure IsBayesAlgEnvSeq
    (Q : Measure 𝓔) (κ : Kernel (𝓔 × 𝓐) 𝓨) (alg : Algorithm 𝓐 𝓨)
    (E : Ω → 𝓔) (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨)
    (P : Measure Ω) [IsFiniteMeasure P] : Prop where
  measurable_param : Measurable E := by fun_prop
  measurable_action n : Measurable (A n) := by fun_prop
  measurable_feedback n : Measurable (Y n) := by fun_prop
  hasLaw_env : HasLaw E Q P
  hasCondDistrib_action_zero : HasCondDistrib (A 0) E (Kernel.const _ alg.p0) P
  hasCondDistrib_feedback_zero : HasCondDistrib (Y 0) (fun ω ↦ (E ω, A 0 ω)) κ P
  hasCondDistrib_action n :
    HasCondDistrib (A (n + 1)) (fun ω ↦ (E ω, history A Y n ω))
      ((alg.policy n).prodMkLeft _) P
  hasCondDistrib_feedback n :
    HasCondDistrib (Y (n + 1)) (fun ω ↦ (history A Y n ω, E ω, A (n + 1) ω))
      (κ.prodMkLeft _) P
action
Learning.IT.action
action n is the action pulled at time n. This is a random variable on the measurable space
ℕ → 𝓐 × 𝓨.
Learning.IT.action.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} (n : ℕ) (h : ℕ → 𝓐 × 𝓨) : 𝓐
Code
def action (n : ℕ) (h : ℕ → 𝓐 × 𝓨) : 𝓐 := (h n).1
trajectory
Learning.trajectory
A random variable that gives the sequence of action-feedback pairs.
Learning.trajectory.{u_1, u_2, u_3} {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (ω : Ω) : ℕ → 𝓐 × 𝓨
Code
def trajectory (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (ω : Ω) : ℕ → 𝓐 × 𝓨 := fun n ↦ (A n ω, Y n ω)
Used by (18)
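The proof of hasLaw_IT_action_zero relies on the fact that composing `IT.action 0` with `trajectory A Y` gives back `A 0`, and that this holds by `rfl`. The sketch below checks that step in isolation. It assumes simplified local copies of the two definitions: `Nat` in place of `ℕ`, plain `Type` universes, no measurability structure, and a hypothetical `Sketch` namespace. It is not the library code.

```lean
namespace Sketch

/-- Local copy of `trajectory`: the sequence of action-feedback pairs. -/
def trajectory {𝓐 𝓨 Ω : Type} (A : Nat → Ω → 𝓐) (Y : Nat → Ω → 𝓨) (ω : Ω) :
    Nat → 𝓐 × 𝓨 :=
  fun n => (A n ω, Y n ω)

/-- Local copy of `IT.action`: first component of the pair at time `n`. -/
def action {𝓐 𝓨 : Type} (n : Nat) (h : Nat → 𝓐 × 𝓨) : 𝓐 := (h n).1

/-- Reading the time-0 action off the trajectory recovers `A 0` definitionally. -/
example {𝓐 𝓨 Ω : Type} (A : Nat → Ω → 𝓐) (Y : Nat → Ω → 𝓨) :
    action 0 ∘ trajectory A Y = A 0 := rfl

end Sketch
```

Because the equality is definitional, the proof can rewrite along it without any measurability side conditions.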
All dependencies, transitively (1)
history
Learning.history
History of the algorithm-environment sequence up to time n.
Learning.history.{u_1, u_2, u_3} {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) : ↥(Finset.Iic n) → 𝓐 × 𝓨
Code
def history (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) : Iic n → 𝓐 × 𝓨 := fun i ↦ (A i ω, Y i ω)
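One way to read `history` is as the trajectory restricted to the time indices up to n. The sketch below illustrates that reading under stated assumptions: the subtype `{i : Nat // i ≤ n}` stands in for `↥(Finset.Iic n)`, the definitions are simplified local copies in a hypothetical `Sketch` namespace, and universes are fixed to `Type`. It is not the library code.

```lean
namespace Sketch

/-- Local copy of `trajectory`: the sequence of action-feedback pairs. -/
def trajectory {𝓐 𝓨 Ω : Type} (A : Nat → Ω → 𝓐) (Y : Nat → Ω → 𝓨) (ω : Ω) :
    Nat → 𝓐 × 𝓨 :=
  fun n => (A n ω, Y n ω)

/-- Stand-in for `history`, indexed by the subtype `{i // i ≤ n}`
    instead of `↥(Finset.Iic n)`. -/
def history {𝓐 𝓨 Ω : Type} (A : Nat → Ω → 𝓐) (Y : Nat → Ω → 𝓨) (n : Nat) (ω : Ω) :
    {i : Nat // i ≤ n} → 𝓐 × 𝓨 :=
  fun i => (A i.val ω, Y i.val ω)

/-- The history up to time `n` is the trajectory evaluated at indices `≤ n`. -/
example {𝓐 𝓨 Ω : Type} (A : Nat → Ω → 𝓐) (Y : Nat → Ω → 𝓨) (n : Nat) (ω : Ω) :
    history A Y n ω = fun (i : {i : Nat // i ≤ n}) => trajectory A Y ω i.val := rfl

end Sketch
```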