Bayesian stationary environments #
This file defines the structure IsBayesAlgEnvSeq and provides its basic properties.
Main definitions #
IsBayesAlgEnvSeq Q ฮบ alg E A Y P: states that there is a measureP : Measure ฮฉsuch that the parameterE : ฮฉ โ ๐has lawQand that the sequences of actionsA : โ โ ฮฉ โ ๐and feedbacksY : โ โ ฮฉ โ ๐จare generated by the algorithmalg : Algorithm ๐ ๐จinteracting with an underlying environment that depends onEandฮบ(stationaryEnv (ฮบ.sectR (E ฯ))).bayesTrajMeasure Q ฮบ alg: for any choice of probability measureQ : Measure ๐, Markov kernelฮบ : Kernel (๐ ร ๐) ๐จ, and algorithmalg : Algorithm ๐ ๐จ, provides a probability measureP : Measure (โ โ ๐ ร ๐ ร ๐จ)on a space that carriesE,A, andYsuch thatIsBayesAlgEnvSeq Q ฮบ alg E A Y P.bayesTrajMeasurePosterior Q ฮบ alg n: aKernel (Iic n โ ๐ ร ๐จ) ๐that represents the posterior overEgiven the history up to timenunder the priorQand the algorithmalg, assuming that the kernelฮบspecifies howEgives rise to the underlying (stationary) environment. See alsoLeanMachineLearning/SequentialLearning/AlgorithmDensityBayes.lean.
Main results #
ae_IsAlgEnvSeq h: ifh : IsBayesAlgEnvSeq Q ฮบ alg E A Y P, forQ-almost everye : ๐,IsAlgEnvSeq A' Y' alg (stationaryEnv (ฮบ.sectR e)) (condDistrib (trajectory A Y) E P e)for some sequence of actionsA' : โ โ (โ โ ๐ ร ๐จ) โ ๐and sequence of feedbacksY' : โ โ (โ โ ๐ ร ๐จ) โ ๐จ. Intuitively, if the observable trajectory is generated by an underlying parametere : ๐, the measure that carries theIsBayesAlgEnvSeqstructure reveals a measure that carries anIsAlgEnvSeqstructure under the environmentstationaryEnv (ฮบ.sectR e)and the same algorithm. This allows transferring results from theIsAlgEnvSeqstructure to theIsBayesAlgEnvSeqstructure.
IsBayesAlgEnvSeq Q ฮบ alg E A Y P states that there is a measure P : Measure ฮฉ such
that the parameter E : ฮฉ โ ๐ has law Q and that the sequences of actions A : โ โ ฮฉ โ ๐
and feedbacks Y : โ โ ฮฉ โ ๐จ are generated by the algorithm alg : Algorithm ๐ ๐จ interacting
with an underlying environment that depends on E and ฮบ (stationaryEnv (ฮบ.sectR (E ฯ))).
- measurable_param : Measurable E
- measurable_action (n : โ) : Measurable (A n)
- measurable_feedback (n : โ) : Measurable (Y n)
- hasLaw_env : ProbabilityTheory.HasLaw E Q P
- hasCondDistrib_action_zero : ProbabilityTheory.HasCondDistrib (A 0) E (ProbabilityTheory.Kernel.const ๐ alg.p0) P
- hasCondDistrib_feedback_zero : ProbabilityTheory.HasCondDistrib (Y 0) (fun (ฯ : ฮฉ) => (E ฯ, A 0 ฯ)) ฮบ P
- hasCondDistrib_action (n : โ) : ProbabilityTheory.HasCondDistrib (A (n + 1)) (fun (ฯ : ฮฉ) => (E ฯ, history A Y n ฯ)) (ProbabilityTheory.Kernel.prodMkLeft ๐ (alg.policy n)) P
- hasCondDistrib_feedback (n : โ) : ProbabilityTheory.HasCondDistrib (Y (n + 1)) (fun (ฯ : ฮฉ) => (history A Y n ฯ, E ฯ, A (n + 1) ฯ)) (ProbabilityTheory.Kernel.prodMkLeft (โฅ(Finset.Iic n) โ ๐ ร ๐จ) ฮบ) P
Instances For
An environment with observations in ๐ ร ๐จ. The first element e of an observation is
sampled from Q once and remains constant. The second element of an observation is sampled from
ฮบ (e, a), where a is the corresponding action.
Equations
- One or more equations did not get rendered due to their size.
Instances For
A measure P on a measurable space that carries random variables E, A, and Y such that
IsBayesAlgEnvSeq Q ฮบ alg E A Y P.
Equations
- Learning.IT.bayesTrajMeasure Q ฮบ alg = Learning.trajMeasure (Learning.Algorithm.prodLeft ๐ alg) (Learning.bayesStationaryEnv Q ฮบ)
Instances For
A kernel that represents the posterior over E given the history up to time n.
Equations
- One or more equations did not get rendered due to their size.