Documentation

LeanMachineLearning.SequentialLearning.IonescuTulceaSpace

Probability space for algorithm-environment interactions #

For any algorithm and environment, we construct a probability space on which we can define a sequence of random variables representing the actions and feedback generated by the interaction of the algorithm and the environment. The main ingredient of the construction is the Ionescu-Tulcea theorem.

Main statements #

noncomputable def Learning.trajMeasure {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) :
MeasureTheory.Measure (𝓐 × 𝓨)

Measure on the sequence of actions and observations generated by the algorithm/environment.

Equations
Instances For
    instance Learning.instIsProbabilityMeasureForallNatProdTrajMeasure {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) :
    theorem Learning.eq_trajMeasure_of_isAlgEnvSeq {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {Ω : Type u_4} { : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A₁ : Ω𝓐} {R₁ : Ω𝓨} (h : IsAlgEnvSeq A₁ R₁ alg env P) :
    MeasureTheory.Measure.map (fun (ω : Ω) (n : ) => (A₁ n ω, R₁ n ω)) P = trajMeasure alg env
    theorem Learning.eq_trajMeasure_map_frestrictLe_of_isAlgEnvSeqUntil {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {Ω : Type u_4} { : MeasurableSpace Ω} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A₁ : Ω𝓐} {R₁ : Ω𝓨} {N : } (h : IsAlgEnvSeqUntil A₁ R₁ alg env P N) :
    MeasureTheory.Measure.map (fun (ω : Ω) (n : (Finset.Iic N)) => (A₁ (↑n) ω, R₁ (↑n) ω)) P = MeasureTheory.Measure.map (Preorder.frestrictLe N) (trajMeasure alg env)
    theorem Learning.isAlgEnvSeq_unique {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {Ω : Type u_4} {Ω' : Type u_5} { : MeasurableSpace Ω} {mΩ' : MeasurableSpace Ω'} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {P' : MeasureTheory.Measure Ω'} [MeasureTheory.IsProbabilityMeasure P'] {A₁ : Ω𝓐} {R₁ : Ω𝓨} {A₂ : Ω'𝓐} {R₂ : Ω'𝓨} (h1 : IsAlgEnvSeq A₁ R₁ alg env P) (h2 : IsAlgEnvSeq A₂ R₂ alg env P') :
    MeasureTheory.Measure.map (fun (ω : Ω) (n : ) => (A₁ n ω, R₁ n ω)) P = MeasureTheory.Measure.map (fun (ω : Ω') (n : ) => (A₂ n ω, R₂ n ω)) P'

    The law of the sequence of actions and observations generated by an algorithm-environment pair is unique: it does not depend on the probability space used.

    theorem Learning.IsAlgEnvSeq.identDistrib_trajectory {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {Ω : Type u_4} {Ω' : Type u_5} { : MeasurableSpace Ω} {mΩ' : MeasurableSpace Ω'} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {P' : MeasureTheory.Measure Ω'} [MeasureTheory.IsProbabilityMeasure P'] {A₁ : Ω𝓐} {R₁ : Ω𝓨} {A₂ : Ω'𝓐} {R₂ : Ω'𝓨} (h1 : IsAlgEnvSeq A₁ R₁ alg env P) (h2 : IsAlgEnvSeq A₂ R₂ alg env P') :
    ProbabilityTheory.IdentDistrib (fun (ω : Ω) (n : ) => (A₁ n ω, R₁ n ω)) (fun (ω' : Ω') (n : ) => (A₂ n ω', R₂ n ω')) P P'

    The law of the sequence of actions and observations generated by an algorithm-environment pair is unique: it does not depend on the probability space used.

    theorem Learning.isAlgEnvSeqUntil_unique {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {Ω : Type u_4} {Ω' : Type u_5} { : MeasurableSpace Ω} {mΩ' : MeasurableSpace Ω'} [StandardBorelSpace 𝓐] [Nonempty 𝓐] [StandardBorelSpace 𝓨] [Nonempty 𝓨] {alg : Algorithm 𝓐 𝓨} {env : Environment 𝓐 𝓨} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {P' : MeasureTheory.Measure Ω'} [MeasureTheory.IsProbabilityMeasure P'] {A₁ : Ω𝓐} {R₁ : Ω𝓨} {A₂ : Ω'𝓐} {R₂ : Ω'𝓨} {N : } (h1 : IsAlgEnvSeqUntil A₁ R₁ alg env P N) (h2 : IsAlgEnvSeqUntil A₂ R₂ alg env P' N) :
    MeasureTheory.Measure.map (fun (ω : Ω) (n : (Finset.Iic N)) => (A₁ (↑n) ω, R₁ (↑n) ω)) P = MeasureTheory.Measure.map (fun (ω : Ω') (n : (Finset.Iic N)) => (A₂ (↑n) ω, R₂ (↑n) ω)) P'
    def Learning.IT.step {𝓐 : Type u_1} {𝓨 : Type u_2} (n : ) (h : 𝓐 × 𝓨) :
    𝓐 × 𝓨

    Action and feedback at step n.

    Equations
    Instances For
      def Learning.IT.action {𝓐 : Type u_1} {𝓨 : Type u_2} (n : ) (h : 𝓐 × 𝓨) :
      𝓐

      action n is the action pulled at time n. This is a random variable on the measurable space ℕ → 𝓐 × 𝓨.

      Equations
      Instances For
        def Learning.IT.feedback {𝓐 : Type u_1} {𝓨 : Type u_2} (n : ) (h : 𝓐 × 𝓨) :
        𝓨

        feedback n is the feedback at time n. This is a random variable on the measurable space ℕ → 𝓐 × 𝓨.

        Equations
        Instances For
          def Learning.IT.hist {𝓐 : Type u_1} {𝓨 : Type u_2} (n : ) (h : 𝓐 × 𝓨) :
          (Finset.Iic n)𝓐 × 𝓨

          hist n is the history up to time n. This is a random variable on the measurable space ℕ → 𝓐 × 𝓨.

          Equations
          Instances For
            theorem Learning.IT.fst_comp_step {𝓐 : Type u_1} {𝓨 : Type u_2} (n : ) :
            theorem Learning.IT.measurable_step {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (n : ) :
            theorem Learning.IT.measurable_step_prod {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} :
            Measurable fun (p : × (𝓐 × 𝓨)) => step p.1 p.2
            theorem Learning.IT.measurable_action {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (n : ) :
            theorem Learning.IT.measurable_action_prod {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} :
            Measurable fun (p : × (𝓐 × 𝓨)) => action p.1 p.2
            theorem Learning.IT.measurable_feedback {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (n : ) :
            theorem Learning.IT.measurable_feedback_prod {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} :
            Measurable fun (p : × (𝓐 × 𝓨)) => feedback p.1 p.2
            theorem Learning.IT.measurable_hist {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (n : ) :

            Filtration of the algorithm Seq.

            Equations
            Instances For
              theorem Learning.IT.filtration_eq_comap {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (n : ) :
              theorem Learning.IT.step_eq_eval_comp_hist {𝓐 : Type u_1} {𝓨 : Type u_2} (n : ) :
              step n = (fun (x : (Finset.Iic n)𝓐 × 𝓨) => x n, ) hist n
              theorem Learning.IT.action_eq_eval_comp_hist {𝓐 : Type u_1} {𝓨 : Type u_2} (n : ) :
              action n = (fun (x : (Finset.Iic n)𝓐 × 𝓨) => (x n, ).1) hist n
              theorem Learning.IT.feedback_eq_eval_comp_hist {𝓐 : Type u_1} {𝓨 : Type u_2} (n : ) :
              feedback n = (fun (x : (Finset.Iic n)𝓐 × 𝓨) => (x n, ).2) hist n
              theorem Learning.IT.adapted_step {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} :
              theorem Learning.IT.adapted_hist {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} :
              theorem Learning.IT.adapted_action {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} :
              theorem Learning.IT.adapted_feedback {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} :

              Filtration generated by the history at time n-1 together with the action at time n.

              Equations
              • One or more equations did not get rendered due to their size.
              Instances For
                theorem Learning.IT.filtrationAction_eq_comap {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (n : ) (hn : n 0) :
                (filtrationAction 𝓐 𝓨) n = MeasurableSpace.comap (fun (ω : 𝓐 × 𝓨) => (hist (n - 1) ω, action n ω)) inferInstance
                theorem Learning.IT.filtration_le_filtrationAction_add_one {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (n : ) :
                (IT.filtration 𝓐 𝓨) n (filtrationAction 𝓐 𝓨) (n + 1)
                theorem Learning.IT.filtration_le_filtrationAction {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {m n : } (h : n < m) :
                (IT.filtration 𝓐 𝓨) n (filtrationAction 𝓐 𝓨) m
                theorem Learning.IT.filtrationAction_le_filtration_self {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (n : ) :
                (filtrationAction 𝓐 𝓨) n (IT.filtration 𝓐 𝓨) n
                theorem Learning.IT.filtrationAction_le_filtration {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} {m n : } (h : m n) :
                (filtrationAction 𝓐 𝓨) m (IT.filtration 𝓐 𝓨) n
                theorem Learning.IT.measurable_action_filtrationAction {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (n : ) :
                theorem Learning.IT.hasLaw_step_zero {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) :
                theorem Learning.IT.hasLaw_action_zero {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) :
                theorem Learning.IT.condDistrib_feedback_zero {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} [StandardBorelSpace 𝓨] [Nonempty 𝓨] (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) :
                theorem Learning.IT.condDistrib_step {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} [StandardBorelSpace 𝓨] [Nonempty 𝓨] [StandardBorelSpace 𝓐] [Nonempty 𝓐] (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) (n : ) :
                𝓛[step (n + 1) | hist n; trajMeasure alg env] =ᵐ[MeasureTheory.Measure.map (hist n) (trajMeasure alg env)] (stepKernel alg env n)
                theorem Learning.IT.condDistrib_action {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} [StandardBorelSpace 𝓨] [Nonempty 𝓨] [StandardBorelSpace 𝓐] [Nonempty 𝓐] (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) (n : ) :
                theorem Learning.IT.condDistrib_feedback {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} [StandardBorelSpace 𝓨] [Nonempty 𝓨] [StandardBorelSpace 𝓐] [Nonempty 𝓐] (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) (n : ) :
                𝓛[feedback (n + 1) | fun (ω : 𝓐 × 𝓨) => (hist n ω, action (n + 1) ω); trajMeasure alg env] =ᵐ[MeasureTheory.Measure.map (fun (ω : 𝓐 × 𝓨) => (hist n ω, action (n + 1) ω)) (trajMeasure alg env)] (env.feedback n)
                theorem Learning.IT.isAlgEnvSeq_trajMeasure {𝓐 : Type u_1} {𝓨 : Type u_2} {m𝓐 : MeasurableSpace 𝓐} {m𝓨 : MeasurableSpace 𝓨} [StandardBorelSpace 𝓨] [Nonempty 𝓨] [StandardBorelSpace 𝓐] [Nonempty 𝓐] (alg : Algorithm 𝓐 𝓨) (env : Environment 𝓐 𝓨) :