Learning.IsBayesAlgEnvSeq.hasCondDistrib_IT_feedback
This page has the declaration's own card below, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards only follow its statement's dependencies (its proof is replaced by sorry, so what it proves doesn't depend on how); for everything else, both the type and the body/value are followed, since their content is part of what later declarations build on.
hasCondDistrib_IT_feedback
Learning.IsBayesAlgEnvSeq.hasCondDistrib_IT_feedback
No docstring.
Learning.IsBayesAlgEnvSeq.hasCondDistrib_IT_feedback.{u_1, u_2, u_3, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {𝓨 : Type u_3} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω] {Q : MeasureTheory.Measure 𝓔} {κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) 𝓨} {alg : Algorithm 𝓐 𝓨} {E : Ω → 𝓔} {A : ℕ → Ω → 𝓐} {Y : ℕ → Ω → 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure P] [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] [ProbabilityTheory.IsFiniteKernel κ] (h : IsBayesAlgEnvSeq Q κ alg E A Y P) (n : ℕ) : ∀ᵐ (e : 𝓔) ∂Q, ProbabilityTheory.HasCondDistrib (IT.feedback (n + 1)) (fun ω => (IT.hist n ω, IT.action (n + 1) ω)) (ProbabilityTheory.Kernel.prodMkLeft (↥(Finset.Iic n) → 𝓐 × 𝓨) (ProbabilityTheory.Kernel.sectR κ e)) (𝓛[trajectory A Y | E; P] e)
Code
lemma hasCondDistrib_IT_feedback [IsFiniteKernel κ] (h : IsBayesAlgEnvSeq Q κ alg E A Y P)
    (n : ℕ) :
    ∀ᵐ e ∂Q, HasCondDistrib (IT.feedback (n + 1)) (fun ω ↦ (IT.hist n ω, IT.action (n + 1) ω))
      ((κ.sectR e).prodMkLeft _) (condDistrib (trajectory A Y) E P e)
Type uses (6)
Body uses (6)
Used by (1)
Proof
by
  rw [← h.hasLaw_env.map_eq]
  have hc : HasCondDistrib (Y (n + 1))
      (fun ω ↦ (E ω, history A Y n ω, A (n + 1) ω))
      (κ.comap (fun (e, _, a) ↦ (e, a)) (by fun_prop)) P :=
    (h.hasCondDistrib_feedback n).measurableEquiv_comp_right (MeasurableEquiv.prodAssoc.symm.trans
      ((MeasurableEquiv.prodCongr .prodComm (.refl _)).trans .prodAssoc))
  exact hc.hasCondDistrib_sectR ((IT.measurable_hist n).prodMk
    (IT.measurable_action (n + 1))) (IT.measurable_feedback (n + 1))
    (measurable_trajectory h.measurable_action h.measurable_feedback).aemeasurable
Dependency graph
Type dependencies (6)
Algorithm
Learning.Algorithm
A stochastic, sequential algorithm.
Learning.Algorithm.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)
Code
structure Algorithm (𝓐 𝓨 : Type*) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] where
  /-- Policy or sampling rule: distribution of the next action. -/
  policy : (n : ℕ) → Kernel (Iic n → 𝓐 × 𝓨) 𝓐
  /-- The policy is a Markov kernel. -/
  [h_policy : ∀ n, IsMarkovKernel (policy n)]
  /-- Distribution of the first action. -/
  p0 : Measure 𝓐
  /-- The first action distribution is a probability measure. -/
  [hp0 : IsProbabilityMeasure p0]
Used by (216)
IsBayesAlgEnvSeq
Learning.IsBayesAlgEnvSeq
IsBayesAlgEnvSeq Q κ alg E A Y P states that there is a measure P : Measure Ω such
that the parameter E : Ω → 𝓔 has law Q and that the sequences of actions A : ℕ → Ω → 𝓐
and feedbacks Y : ℕ → Ω → 𝓨 are generated by the algorithm alg : Algorithm 𝓐 𝓨 interacting
with an underlying environment that depends on E and κ (stationaryEnv (κ.sectR (E ω))).
Learning.IsBayesAlgEnvSeq.{u_1, u_2, u_3, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {𝓨 : Type u_3} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω] (Q : MeasureTheory.Measure 𝓔) (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) 𝓨) (alg : Algorithm 𝓐 𝓨) (E : Ω → 𝓔) (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (P : MeasureTheory.Measure Ω) [MeasureTheory.IsFiniteMeasure P] : Prop
Code
structure IsBayesAlgEnvSeq
    (Q : Measure 𝓔) (κ : Kernel (𝓔 × 𝓐) 𝓨) (alg : Algorithm 𝓐 𝓨)
    (E : Ω → 𝓔) (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨)
    (P : Measure Ω) [IsFiniteMeasure P] : Prop where
  measurable_param : Measurable E := by fun_prop
  measurable_action n : Measurable (A n) := by fun_prop
  measurable_feedback n : Measurable (Y n) := by fun_prop
  hasLaw_env : HasLaw E Q P
  hasCondDistrib_action_zero : HasCondDistrib (A 0) E (Kernel.const _ alg.p0) P
  hasCondDistrib_feedback_zero : HasCondDistrib (Y 0) (fun ω ↦ (E ω, A 0 ω)) κ P
  hasCondDistrib_action n :
    HasCondDistrib (A (n + 1)) (fun ω ↦ (E ω, history A Y n ω))
      ((alg.policy n).prodMkLeft _) P
  hasCondDistrib_feedback n :
    HasCondDistrib (Y (n + 1)) (fun ω ↦ (history A Y n ω, E ω, A (n + 1) ω))
      (κ.prodMkLeft _) P
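The field hasCondDistrib_feedback conditions on the triple (history, E, action), while the proof of hasCondDistrib_IT_feedback works with the order (E, history, action) and gets there by composing product equivalences. The following is a minimal sketch of that reordering. It uses plain Equiv from Mathlib rather than the MeasurableEquiv used in the proof, and the type names H, E, A are hypothetical placeholders.

```lean
import Mathlib

-- Hypothetical placeholder types: H (history), E (parameter), A (action).
-- Plain `Equiv` stands in for the `MeasurableEquiv` chain used in the actual proof.
example {H E A : Type} (h : H) (e : E) (a : A) :
    ((Equiv.prodAssoc H E A).symm.trans
      (((Equiv.prodComm H E).prodCongr (Equiv.refl A)).trans
        (Equiv.prodAssoc E H A))) (h, e, a) = (e, h, a) :=
  rfl
```

The three steps are: reassociate (h, (e, a)) to ((h, e), a), swap the first pair to ((e, h), a), then reassociate back to (e, (h, a)).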
feedback
Learning.IT.feedback
feedback n is the feedback at time n. This is a random variable on the measurable space
ℕ → 𝓐 × 𝓨.
Learning.IT.feedback.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} (n : ℕ) (h : ℕ → 𝓐 × 𝓨) : 𝓨
Code
def feedback (n : ℕ) (h : ℕ → 𝓐 × 𝓨) : 𝓨 := (h n).2
Used by (16)
hist
Learning.IT.hist
hist n is the history up to time n. This is a random variable on the measurable space
ℕ → 𝓐 × 𝓨.
Learning.IT.hist.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} (n : ℕ) (h : ℕ → 𝓐 × 𝓨) : ↥(Finset.Iic n) → 𝓐 × 𝓨
Code
def hist (n : ℕ) (h : ℕ → 𝓐 × 𝓨) : Iic n → 𝓐 × 𝓨 := fun i ↦ h i
Used by (23)
action
Learning.IT.action
action n is the action pulled at time n. This is a random variable on the measurable space
ℕ → 𝓐 × 𝓨.
Learning.IT.action.{u_1, u_2} {𝓐 : Type u_1} {𝓨 : Type u_2} (n : ℕ) (h : ℕ → 𝓐 × 𝓨) : 𝓐
Code
def action (n : ℕ) (h : ℕ → 𝓐 × 𝓨) : 𝓐 := (h n).1
trajectory
Learning.trajectory
A random variable that gives the sequence of action-feedback pairs.
Learning.trajectory.{u_1, u_2, u_3} {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (ω : Ω) : ℕ → 𝓐 × 𝓨
Code
def trajectory (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (ω : Ω) : ℕ → 𝓐 × 𝓨 := fun n ↦ (A n ω, Y n ω)
Used by (18)
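The coordinate maps IT.action and IT.feedback are designed to be evaluated on a trajectory. The sketch below restates the three definitions in plain Lean 4, with no imports, and checks that composing them recovers the original random variables. The primed names and the type variables α, β are hypothetical, chosen only to avoid clashing with the library.

```lean
-- Standalone copies of the definitions; primed names are hypothetical.
def action' {α β : Type} (n : Nat) (h : Nat → α × β) : α := (h n).1
def feedback' {α β : Type} (n : Nat) (h : Nat → α × β) : β := (h n).2
def trajectory' {α β Ω : Type} (A : Nat → Ω → α) (Y : Nat → Ω → β) (ω : Ω) :
    Nat → α × β := fun n => (A n ω, Y n ω)

-- Reading the action off a trajectory gives back `A n ω`, by unfolding definitions.
example {α β Ω : Type} (A : Nat → Ω → α) (Y : Nat → Ω → β) (n : Nat) (ω : Ω) :
    action' n (trajectory' A Y ω) = A n ω := rfl

-- Likewise, reading the feedback off a trajectory gives back `Y n ω`.
example {α β Ω : Type} (A : Nat → Ω → α) (Y : Nat → Ω → β) (n : Nat) (ω : Ω) :
    feedback' n (trajectory' A Y ω) = Y n ω := rfl
```

Both identities hold by rfl, which is why statements about A and Y on Ω can be transported to statements about IT.action and IT.feedback on the trajectory space.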
All dependencies, transitively (1)
history
Learning.history
History of the algorithm-environment sequence up to time n.
Learning.history.{u_1, u_2, u_3} {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) : ↥(Finset.Iic n) → 𝓐 × 𝓨
Code
def history (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) : Iic n → 𝓐 × 𝓨 := fun i ↦ (A i ω, Y i ω)
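Comparing the definitions of hist, trajectory, and history suggests that history A Y n ω is hist n applied to the trajectory at ω. Here is a minimal standalone check of that reading in plain Lean 4. The primed names are hypothetical, and the subtype {i : Nat // i ≤ n} is an assumed stand-in for the library's Iic n.

```lean
-- `{i : Nat // i ≤ n}` is a stand-in (an assumption) for `Iic n`; primed names are hypothetical.
def hist' {α β : Type} (n : Nat) (h : Nat → α × β) : {i : Nat // i ≤ n} → α × β :=
  fun i => h i.1

def history' {α β Ω : Type} (A : Nat → Ω → α) (Y : Nat → Ω → β) (n : Nat) (ω : Ω) :
    {i : Nat // i ≤ n} → α × β := fun i => (A i.1 ω, Y i.1 ω)

-- Restricting the trajectory `fun k => (A k ω, Y k ω)` to indices up to `n`
-- is definitionally the history up to time `n`.
example {α β Ω : Type} (A : Nat → Ω → α) (Y : Nat → Ω → β) (n : Nat) (ω : Ω) :
    hist' n (fun k => (A k ω, Y k ω)) = history' A Y n ω := rfl
```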