Learning.IsBayesAlgEnvSeq.hasLaw_history_withDensity
This page has the declaration's own card below, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards only follow its statement's dependencies (its proof is replaced by sorry, so what it proves doesn't depend on how); for everything else, both the type and the body/value are followed, since their content is part of what later declarations build on.
hasLaw_history_withDensityπ
Learning.IsBayesAlgEnvSeq.hasLaw_history_withDensityNo docstring.
Learning.IsBayesAlgEnvSeq.hasLaw_history_withDensity.{u_1, u_2, u_3, u_4, u_5} {π : Type u_1} {π¨ : Type u_2} [MeasurableSpace π] [MeasurableSpace π¨] {π : Type u_3} [MeasurableSpace π] [StandardBorelSpace π] [Nonempty π] [StandardBorelSpace π¨] [Nonempty π¨] {Q : MeasureTheory.Measure π} {ΞΊ : ProbabilityTheory.Kernel (π Γ π) π¨} [ProbabilityTheory.IsMarkovKernel ΞΊ] {Ξ© : Type u_4} [MeasurableSpace Ξ©] {E : Ξ© β π} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsProbabilityMeasure P] {Ξ©β : Type u_5} [MeasurableSpace Ξ©β] {Eβ : Ξ©β β π} {Aβ : β β Ξ©β β π} {Yβ : β β Ξ©β β π¨} {algβ : Algorithm π π¨} {Pβ : MeasureTheory.Measure Ξ©β} [MeasureTheory.IsProbabilityMeasure Pβ] (h : IsBayesAlgEnvSeq Q ΞΊ alg E A Y P) (hβ : IsBayesAlgEnvSeq Q ΞΊ algβ Eβ Aβ Yβ Pβ) (hc : Algorithm.AbsolutelyContinuous alg algβ) (n : β) : ProbabilityTheory.HasLaw (history A Y n) (MeasureTheory.Measure.withDensity (MeasureTheory.Measure.map (history Aβ Yβ n) Pβ) (Algorithm.density alg algβ n)) PLearning.IsBayesAlgEnvSeq.hasLaw_history_withDensity.{u_1, u_2, u_3, u_4, u_5} {π : Type u_1} {π¨ : Type u_2} [MeasurableSpace π] [MeasurableSpace π¨] {π : Type u_3} [MeasurableSpace π] [StandardBorelSpace π] [Nonempty π] [StandardBorelSpace π¨] [Nonempty π¨] {Q : MeasureTheory.Measure π} {ΞΊ : ProbabilityTheory.Kernel (π Γ π) π¨} [ProbabilityTheory.IsMarkovKernel ΞΊ] {Ξ© : Type u_4} [MeasurableSpace Ξ©] {E : Ξ© β π} {A : β β Ξ© β π} {Y : β β Ξ© β π¨} {alg : Algorithm π π¨} {P : MeasureTheory.Measure Ξ©} [MeasureTheory.IsProbabilityMeasure P] {Ξ©β : Type u_5} [MeasurableSpace Ξ©β] {Eβ : Ξ©β β π} {Aβ : β β Ξ©β β π} {Yβ : β β Ξ©β β π¨} {algβ : Algorithm π π¨} {Pβ : MeasureTheory.Measure Ξ©β} [MeasureTheory.IsProbabilityMeasure Pβ] (h : IsBayesAlgEnvSeq Q ΞΊ alg E A Y P) (hβ : IsBayesAlgEnvSeq Q ΞΊ algβ Eβ Aβ Yβ Pβ) (hc : Algorithm.AbsolutelyContinuous alg algβ) (n : β) : ProbabilityTheory.HasLaw (history A Y n) (MeasureTheory.Measure.withDensity (MeasureTheory.Measure.map (history Aβ Yβ n) Pβ) (Algorithm.density alg algβ n)) P
Code
lemma hasLaw_history_withDensity (h : IsBayesAlgEnvSeq Q ΞΊ alg E A Y P)
(hβ : IsBayesAlgEnvSeq Q ΞΊ algβ Eβ Aβ Yβ Pβ) (hc : alg βͺβ algβ) (n : β) :
HasLaw (history A Y n)
((Pβ.map (history Aβ Yβ n)).withDensity (alg.density algβ n)) P where
aemeasurableType uses (5)
Body uses (4)
Used by (1)
Actions: Source Β· Open Issue
Proof
(measurable_history h.measurable_action h.measurable_feedback n).aemeasurable
map_eq := by
have hA := h.measurable_action
have hY := h.measurable_feedback
have hAβ := hβ.measurable_action
have hYβ := hβ.measurable_feedback
have hE := h.measurable_param
have hEβ := hβ.measurable_param
rw [β condDistrib_comp_map hE.aemeasurable (by fun_prop), h.hasLaw_env.map_eq,
Measure.bind_congr_right (h.condDistrib_history_eq_condDistrib_hist_withDensity hβ hc n),
Kernel.comp_withDensity_eq_withDensity_comp (by fun_prop),
β hβ.hasLaw_env.map_eq, condDistrib_comp_map hEβ.aemeasurable (by fun_prop)]Dependency graph
Type dependencies (5)
Algorithmπ
Learning.AlgorithmA stochastic, sequential algorithm.
Learning.Algorithm.{u_4, u_5} (π : Type u_4) (π¨ : Type u_5) [MeasurableSpace π] [MeasurableSpace π¨] : Type (max u_4 u_5)Learning.Algorithm.{u_4, u_5} (π : Type u_4) (π¨ : Type u_5) [MeasurableSpace π] [MeasurableSpace π¨] : Type (max u_4 u_5)
Code
structure Algorithm (π π¨ : Type*) [MeasurableSpace π] [MeasurableSpace π¨] where /-- Policy or sampling rule: distribution of the next action. -/ policy : (n : β) β Kernel (Iic n β π Γ π¨) π /-- The policy is a Markov kernel. -/ [h_policy : β n, IsMarkovKernel (policy n)] /-- Distribution of the first action. -/ p0 : Measure π /-- The first action distribution is a probability measure. -/ [hp0 : IsProbabilityMeasure p0]
Used by (216)
Actions: Source Β· Open Issue
IsBayesAlgEnvSeqπ
Learning.IsBayesAlgEnvSeq
IsBayesAlgEnvSeq Q ΞΊ alg E A Y P states that there is a measure P : Measure Ξ© such
that the parameter E : Ξ© β π has law Q and that the sequences of actions A : β β Ξ© β π
and feedbacks Y : β β Ξ© β π¨ are generated by the algorithm alg : Algorithm π π¨ interacting
with an underlying environment that depends on E and ΞΊ (stationaryEnv (ΞΊ.sectR (E Ο))).
Learning.IsBayesAlgEnvSeq.{u_1, u_2, u_3, u_4} {π : Type u_1} {π : Type u_2} {π¨ : Type u_3} {Ξ© : Type u_4} [MeasurableSpace π] [MeasurableSpace π] [MeasurableSpace π¨] [MeasurableSpace Ξ©] (Q : MeasureTheory.Measure π) (ΞΊ : ProbabilityTheory.Kernel (π Γ π) π¨) (alg : Algorithm π π¨) (E : Ξ© β π) (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (P : MeasureTheory.Measure Ξ©) [MeasureTheory.IsFiniteMeasure P] : PropLearning.IsBayesAlgEnvSeq.{u_1, u_2, u_3, u_4} {π : Type u_1} {π : Type u_2} {π¨ : Type u_3} {Ξ© : Type u_4} [MeasurableSpace π] [MeasurableSpace π] [MeasurableSpace π¨] [MeasurableSpace Ξ©] (Q : MeasureTheory.Measure π) (ΞΊ : ProbabilityTheory.Kernel (π Γ π) π¨) (alg : Algorithm π π¨) (E : Ξ© β π) (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (P : MeasureTheory.Measure Ξ©) [MeasureTheory.IsFiniteMeasure P] : Prop
Code
structure IsBayesAlgEnvSeq
(Q : Measure π) (ΞΊ : Kernel (π Γ π) π¨) (alg : Algorithm π π¨)
(E : Ξ© β π) (A : β β Ξ© β π) (Y : β β Ξ© β π¨)
(P : Measure Ξ©) [IsFiniteMeasure P] : Prop where
measurable_param : Measurable E := by fun_prop
measurable_action n : Measurable (A n) := by fun_prop
measurable_feedback n : Measurable (Y n) := by fun_prop
hasLaw_env : HasLaw E Q P
hasCondDistrib_action_zero : HasCondDistrib (A 0) E (Kernel.const _ alg.p0) P
hasCondDistrib_feedback_zero : HasCondDistrib (Y 0) (fun Ο β¦ (E Ο, A 0 Ο)) ΞΊ P
hasCondDistrib_action n :
HasCondDistrib (A (n + 1)) (fun Ο β¦ (E Ο, history A Y n Ο))
((alg.policy n).prodMkLeft _) P
hasCondDistrib_feedback n :
HasCondDistrib (Y (n + 1)) (fun Ο β¦ (history A Y n Ο, E Ο, A (n + 1) Ο))
(ΞΊ.prodMkLeft _) PActions: Source Β· Open Issue
AbsolutelyContinuousπ
Learning.Algorithm.AbsolutelyContinuous
For every time and history, the distribution over actions according to alg is absolutely
continuous with respect to the distribution over actions according to algβ.
Learning.Algorithm.AbsolutelyContinuous.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} [MeasurableSpace π] [MeasurableSpace π¨] (alg algβ : Algorithm π π¨) : PropLearning.Algorithm.AbsolutelyContinuous.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} [MeasurableSpace π] [MeasurableSpace π¨] (alg algβ : Algorithm π π¨) : Prop
Code
structure AbsolutelyContinuous (alg algβ : Algorithm π π¨) : Prop where p0 : alg.p0 βͺ algβ.p0 policy n h : alg.policy n h βͺ algβ.policy n h
Type uses (1)
Used by (7)
Actions: Source Β· Open Issue
historyπ
Learning.history
History of the algorithm-environment sequence up to time n.
Learning.history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (n : β) (Ο : Ξ©) : β₯(Finset.Iic n) β π Γ π¨Learning.history.{u_1, u_2, u_3} {π : Type u_1} {π¨ : Type u_2} {Ξ© : Type u_3} (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (n : β) (Ο : Ξ©) : β₯(Finset.Iic n) β π Γ π¨
Code
def history (A : β β Ξ© β π) (Y : β β Ξ© β π¨) (n : β) (Ο : Ξ©) : Iic n β π Γ π¨ := fun i β¦ (A i Ο, Y i Ο)
Actions: Source Β· Open Issue
densityπ
Learning.Algorithm.density
If the algorithm alg is absolutely continuous with respect to the algorithm algβ and they
are both interacting with the same environment, then the law of the history at time n under alg
is the law of the history at time n under algβ with density alg.density algβ n.
Learning.Algorithm.density.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} [MeasurableSpace π] [MeasurableSpace π¨] [MeasurableSpace.CountablyGenerated π] (alg algβ : Algorithm π π¨) (n : β) : (β₯(Finset.Iic n) β π Γ π¨) β ENNRealLearning.Algorithm.density.{u_1, u_2} {π : Type u_1} {π¨ : Type u_2} [MeasurableSpace π] [MeasurableSpace π¨] [MeasurableSpace.CountablyGenerated π] (alg algβ : Algorithm π π¨) (n : β) : (β₯(Finset.Iic n) β π Γ π¨) β ENNReal
Code
noncomputable
def density [MeasurableSpace.CountablyGenerated π] (alg algβ : Algorithm π π¨) :
(n : β) β (Iic n β π Γ π¨) β ββ₯0β
| 0, h => (alg.p0.rnDeriv algβ.p0 (h β¨0, by simpβ©).1)
| n + 1, h =>
let p := MeasurableEquiv.IicSuccProd (fun _ β¦ π Γ π¨) n h
alg.density algβ n p.1 * (alg.policy n).rnDeriv (algβ.policy n) p.1 p.2.1Type uses (1)
Body uses (1)
Used by (5)
Actions: Source Β· Open Issue
All dependencies, transitively (1)
IicSuccProdπ
MeasurableEquiv.IicSuccProd
Measurable equivalence between a product up to n + 1 and the pair of the product up to n and
the space at n + 1.
MeasurableEquiv.IicSuccProd.{u_3} (X : β β Type u_3) [(n : β) β MeasurableSpace (X n)] (n : β) : ((i : β₯(Finset.Iic (n + 1))) β X βi) βα΅ ((i : β₯(Finset.Iic n)) β X βi) Γ X (n + 1)MeasurableEquiv.IicSuccProd.{u_3} (X : β β Type u_3) [(n : β) β MeasurableSpace (X n)] (n : β) : ((i : β₯(Finset.Iic (n + 1))) β X βi) βα΅ ((i : β₯(Finset.Iic n)) β X βi) Γ X (n + 1)
Code
def _root_.MeasurableEquiv.IicSuccProd (X : β β Type*) [β n, MeasurableSpace (X n)] (n : β) :
MeasurableEquiv (Ξ i : Iic (n + 1), X i) ((Ξ i : Iic n, X i) Γ X (n + 1)) :=
(MeasurableEquiv.IicProdIoc (Nat.le_succ n)).symm.trans
(MeasurableEquiv.prodCongr (MeasurableEquiv.refl _) (MeasurableEquiv.piSingleton n).symm)Used by (11)
Actions: Source Β· Open Issue