Learning.IsBayesAlgEnvSeq
This page has the declaration's own card below, then its dependency graph, then a card for each dependency (type dependencies first, then the rest of the transitive closure). For a theorem, the graph and the dependency cards follow only the dependencies of its statement (its proof is replaced by sorry, since what a theorem proves does not depend on how it is proved); for everything else, both the type and the body/value are followed, since their content is part of what later declarations build on.
IsBayesAlgEnvSeq
Learning.IsBayesAlgEnvSeq
IsBayesAlgEnvSeq Q κ alg E A Y P states that there is a measure P : Measure Ω such
that the parameter E : Ω → 𝓔 has law Q and that the sequences of actions A : ℕ → Ω → 𝓐
and feedbacks Y : ℕ → Ω → 𝓨 are generated by the algorithm alg : Algorithm 𝓐 𝓨 interacting
with an underlying environment that depends on E and κ (stationaryEnv (κ.sectR (E ω))).
Learning.IsBayesAlgEnvSeq.{u_1, u_2, u_3, u_4} {𝓔 : Type u_1} {𝓐 : Type u_2} {𝓨 : Type u_3} {Ω : Type u_4} [MeasurableSpace 𝓔] [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] [MeasurableSpace Ω] (Q : MeasureTheory.Measure 𝓔) (κ : ProbabilityTheory.Kernel (𝓔 × 𝓐) 𝓨) (alg : Algorithm 𝓐 𝓨) (E : Ω → 𝓔) (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (P : MeasureTheory.Measure Ω) [MeasureTheory.IsFiniteMeasure P] : Prop
Code
structure IsBayesAlgEnvSeq
    (Q : Measure 𝓔) (κ : Kernel (𝓔 × 𝓐) 𝓨) (alg : Algorithm 𝓐 𝓨)
    (E : Ω → 𝓔) (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨)
    (P : Measure Ω) [IsFiniteMeasure P] : Prop where
  measurable_param : Measurable E := by fun_prop
  measurable_action n : Measurable (A n) := by fun_prop
  measurable_feedback n : Measurable (Y n) := by fun_prop
  hasLaw_env : HasLaw E Q P
  hasCondDistrib_action_zero : HasCondDistrib (A 0) E (Kernel.const _ alg.p0) P
  hasCondDistrib_feedback_zero : HasCondDistrib (Y 0) (fun ω ↦ (E ω, A 0 ω)) κ P
  hasCondDistrib_action n :
    HasCondDistrib (A (n + 1)) (fun ω ↦ (E ω, history A Y n ω))
      ((alg.policy n).prodMkLeft _) P
  hasCondDistrib_feedback n :
    HasCondDistrib (Y (n + 1)) (fun ω ↦ (history A Y n ω, E ω, A (n + 1) ω))
      (κ.prodMkLeft _) P
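The five distributional fields of the structure can be summarised as a generative model. The display below is an informal reading, not a statement taken from the library: it assumes that `HasCondDistrib X Z η P` is read as "under P, the conditional distribution of X given Z is η", and that `prodMkLeft` turns a kernel into one that ignores an extra left component. The symbols p_0 (for `alg.p0`), π_n (for `alg.policy n`) and H_n (for `history A Y n`) are notation introduced here for illustration. Since P is only assumed to be a finite measure, "∼" should be understood loosely.

```latex
\begin{aligned}
E &\sim Q, \\
A_0 \mid E &\sim p_0, \\
Y_0 \mid (E, A_0) &\sim \kappa(E, A_0), \\
A_{n+1} \mid (E, H_n) &\sim \pi_n(H_n), \\
Y_{n+1} \mid (H_n, E, A_{n+1}) &\sim \kappa(E, A_{n+1}),
\end{aligned}
\qquad H_n = \big((A_i, Y_i)\big)_{i \le n}.
```

Under this reading, the next action depends on the parameter E only through the observed history, while each feedback depends on the past only through E and the current action.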
Dependency graph
Type dependencies (2)
Algorithm
Learning.Algorithm
A stochastic, sequential algorithm.
Learning.Algorithm.{u_4, u_5} (𝓐 : Type u_4) (𝓨 : Type u_5) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] : Type (max u_4 u_5)
Code
structure Algorithm (𝓐 𝓨 : Type*) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] where
  /-- Policy or sampling rule: distribution of the next action. -/
  policy : (n : ℕ) → Kernel (Iic n → 𝓐 × 𝓨) 𝓐
  /-- The policy is a Markov kernel. -/
  [h_policy : ∀ n, IsMarkovKernel (policy n)]
  /-- Distribution of the first action. -/
  p0 : Measure 𝓐
  /-- The first action distribution is a probability measure. -/
  [hp0 : IsProbabilityMeasure p0]
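To illustrate how the fields fit together, here is a minimal sketch of an algorithm that ignores the history and always samples from a fixed probability measure. It works on a local copy of the structure, `AlgorithmSketch`, so that it does not depend on the library's module layout; `AlgorithmSketch`, `constAlgSketch` and `μ` are hypothetical names introduced for this example, and the blanket `import Mathlib` is an assumption made for convenience.

```lean
import Mathlib

open MeasureTheory ProbabilityTheory Finset

/-- Local copy of the structure shown above, renamed so it cannot clash with the library. -/
structure AlgorithmSketch (𝓐 𝓨 : Type*) [MeasurableSpace 𝓐] [MeasurableSpace 𝓨] where
  policy : (n : ℕ) → Kernel (Iic n → 𝓐 × 𝓨) 𝓐
  [h_policy : ∀ n, IsMarkovKernel (policy n)]
  p0 : Measure 𝓐
  [hp0 : IsProbabilityMeasure p0]

/-- Hypothetical example: ignore the history and always sample the action from `μ`. -/
noncomputable def constAlgSketch {𝓐 𝓨 : Type*} [MeasurableSpace 𝓐] [MeasurableSpace 𝓨]
    (μ : Measure 𝓐) [IsProbabilityMeasure μ] : AlgorithmSketch 𝓐 𝓨 where
  -- The constant kernel returns `μ` whatever the history is.
  policy _ := Kernel.const _ μ
  -- A constant kernel at a probability measure is a Markov kernel.
  h_policy _ := inferInstance
  p0 := μ
```

The instance field `hp0` is left to typeclass resolution, which should find it from the `IsProbabilityMeasure μ` hypothesis.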
Used by (216)
history
Learning.history
History of the algorithm-environment sequence up to time n.
Learning.history.{u_1, u_2, u_3} {𝓐 : Type u_1} {𝓨 : Type u_2} {Ω : Type u_3} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) : ↥(Finset.Iic n) → 𝓐 × 𝓨
Code
def history (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) :
    Iic n → 𝓐 × 𝓨 := fun i ↦ (A i ω, Y i ω)
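As a small sanity check on this definition, the sketch below restates it under a different name and confirms that evaluating the history at an index i ≤ n returns the action-feedback pair at time i. `historySketch` is a hypothetical local copy used so that the example does not depend on the library's module path, and `import Mathlib` is an assumption made for convenience.

```lean
import Mathlib

open Finset

/-- Local copy of the definition above, renamed so it cannot clash with the library. -/
def historySketch {𝓐 𝓨 Ω : Type*} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) :
    Iic n → 𝓐 × 𝓨 := fun i ↦ (A i ω, Y i ω)

/-- Evaluating the history at `i : Iic n` unfolds to the pair `(A i ω, Y i ω)`. -/
example {𝓐 𝓨 Ω : Type*} (A : ℕ → Ω → 𝓐) (Y : ℕ → Ω → 𝓨) (n : ℕ) (ω : Ω) (i : Iic n) :
    historySketch A Y n ω i = (A i ω, Y i ω) := rfl
```

Indexing by the subtype `Iic n` rather than by all of ℕ means the history at time n carries exactly the n + 1 pairs observed so far, which is the input type expected by `alg.policy n`.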