LeanMachineLearning exposition

3.7.ย SequentialLearning.Deterministic๐Ÿ”—

Deterministic algorithms and environments

A deterministic algorithm chooses its action in a deterministic way. That is, that action is given by a measurable function of the history instead of a general Markov kernel. Similarly, a deterministic environment gives feedback in a deterministic way.

Main definitions

We introduce two typeclasses IsDeterministicAlg and IsDeterministicEnv to express that an algorithm or an environment is deterministic. We also give definitions for the initial action and the next action of a deterministic algorithm, and for the feedback functions of a deterministic environment. Finally, we give a construction of a deterministic algorithm and environment from measurable functions.

  • IsDeterministicAlg alg: a typeclass expressing that the algorithm alg is deterministic.

  • IsDeterministicEnv env: a typeclass expressing that the environment env is deterministic.

  • actionZero alg: the initial action of a deterministic algorithm alg.

  • nextAction alg n: the function that gives the next action of a deterministic algorithm alg at step n, as a function of the history.

  • feedbackFunZero env: the function that gives the initial feedback of a deterministic environment env.

  • feedbackFun env n: the function that gives the feedback of a deterministic environment env at step n, as a function of the history and the current action.

  • detAlgorithm nextA h_next action0: a deterministic algorithm that chooses its action according to the measurable function nextA (with proof of measurability h_next), with initial action action0.

Module LeanMachineLearning.SequentialLearning.Deterministic contains 39 exposed declarations.

IsDeterministicAlg๐Ÿ”—

Type ClassLearning.IsDeterministicAlg

An algorithm is deterministic if its initial action and subsequent actions are determined by measurable functions (and not possibly random kernels).

๐Ÿ”—type class
Learning.IsDeterministicAlg.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (alg : Algorithm ๐“ ๐“จ) : Prop
Learning.IsDeterministicAlg.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (alg : Algorithm ๐“ ๐“จ) : Prop

Code

class IsDeterministicAlg (alg : Algorithm ๐“ ๐“จ) : Prop where
  exists_action0 : โˆƒ action0, alg.p0 = Measure.dirac action0
  exists_nextAction n : โˆƒ (nextAction : (Iic n โ†’ ๐“ ร— ๐“จ) โ†’ ๐“) (h_meas : Measurable nextAction),
    alg.policy n = Kernel.deterministic nextAction h_meas
Type uses (1)
Used by (14)

Actions: Source ยท Open Issue

actionZero๐Ÿ”—

DefinitionLearning.actionZero

The initial action of a deterministic algorithm.

๐Ÿ”—def
Learning.actionZero.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (alg : Algorithm ๐“ ๐“จ) [h_det : IsDeterministicAlg alg] : ๐“
Learning.actionZero.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (alg : Algorithm ๐“ ๐“จ) [h_det : IsDeterministicAlg alg] : ๐“

Code

noncomputable
def actionZero (alg : Algorithm ๐“ ๐“จ) [h_det : IsDeterministicAlg alg] : ๐“ :=
  h_det.exists_action0.choose
Type uses (2)
Used by (12)

Actions: Source ยท Open Issue

nextAction๐Ÿ”—

DefinitionLearning.nextAction

The next action of a deterministic algorithm after step n.

๐Ÿ”—def
Learning.nextAction.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (alg : Algorithm ๐“ ๐“จ) [h_det : IsDeterministicAlg alg] (n : โ„•) : (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“
Learning.nextAction.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (alg : Algorithm ๐“ ๐“จ) [h_det : IsDeterministicAlg alg] (n : โ„•) : (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“

Code

noncomputable
def nextAction (alg : Algorithm ๐“ ๐“จ) [h_det : IsDeterministicAlg alg] (n : โ„•) :
    (Iic n โ†’ ๐“ ร— ๐“จ) โ†’ ๐“ :=
  (h_det.exists_nextAction n).choose
Type uses (2)
Used by (9)

Actions: Source ยท Open Issue

measurable_nextAction๐Ÿ”—

LemmaLearning.measurable_nextAction

No docstring.

๐Ÿ”—theorem
Learning.measurable_nextAction.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (alg : Algorithm ๐“ ๐“จ) [IsDeterministicAlg alg] (n : โ„•) : Measurable (nextAction alg n)
Learning.measurable_nextAction.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (alg : Algorithm ๐“ ๐“จ) [IsDeterministicAlg alg] (n : โ„•) : Measurable (nextAction alg n)

Code

lemma measurable_nextAction (alg : Algorithm ๐“ ๐“จ) [IsDeterministicAlg alg] (n : โ„•) :
    Measurable (nextAction alg n)
Type uses (3)
Used by (3)

Actions: Source ยท Open Issue

Proof
(IsDeterministicAlg.exists_nextAction n).choose_spec.choose

p0_eq_dirac๐Ÿ”—

LemmaLearning.p0_eq_dirac

No docstring.

๐Ÿ”—theorem
Learning.p0_eq_dirac.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (alg : Algorithm ๐“ ๐“จ) [h_det : IsDeterministicAlg alg] : Algorithm.p0 alg = MeasureTheory.Measure.dirac (actionZero alg)
Learning.p0_eq_dirac.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (alg : Algorithm ๐“ ๐“จ) [h_det : IsDeterministicAlg alg] : Algorithm.p0 alg = MeasureTheory.Measure.dirac (actionZero alg)

Code

lemma p0_eq_dirac (alg : Algorithm ๐“ ๐“จ) [h_det : IsDeterministicAlg alg] :
    alg.p0 = Measure.dirac (actionZero alg)
Type uses (3)
Used by (3)

Actions: Source ยท Open Issue

Proof
h_det.exists_action0.choose_spec

policy_eq_deterministic๐Ÿ”—

LemmaLearning.policy_eq_deterministic

No docstring.

๐Ÿ”—theorem
Learning.policy_eq_deterministic.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (alg : Algorithm ๐“ ๐“จ) [h_det : IsDeterministicAlg alg] (n : โ„•) : Algorithm.policy alg n = ProbabilityTheory.Kernel.deterministic (nextAction alg n) โ‹ฏ
Learning.policy_eq_deterministic.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (alg : Algorithm ๐“ ๐“จ) [h_det : IsDeterministicAlg alg] (n : โ„•) : Algorithm.policy alg n = ProbabilityTheory.Kernel.deterministic (nextAction alg n) โ‹ฏ

Code

lemma policy_eq_deterministic (alg : Algorithm ๐“ ๐“จ) [h_det : IsDeterministicAlg alg] (n : โ„•) :
    alg.policy n = Kernel.deterministic (nextAction alg n) (measurable_nextAction alg n)
Type uses (4)
Used by (2)

Actions: Source ยท Open Issue

Proof
(IsDeterministicAlg.exists_nextAction n).choose_spec.choose_spec

hasLaw_action_zero_of_IsAlgEnvSeqUntil๐Ÿ”—

LemmaLearning.IsDeterministicAlg.hasLaw_action_zero_of_IsAlgEnvSeqUntil

No docstring.

๐Ÿ”—theorem
Learning.IsDeterministicAlg.hasLaw_action_zero_of_IsAlgEnvSeqUntil.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} {N : โ„•} [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac (actionZero alg)) P
Learning.IsDeterministicAlg.hasLaw_action_zero_of_IsAlgEnvSeqUntil.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} {N : โ„•} [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac (actionZero alg)) P

Code

lemma hasLaw_action_zero_of_IsAlgEnvSeqUntil [h_det : IsDeterministicAlg alg]
    (h : IsAlgEnvSeqUntil A Y alg env P N) :
    HasLaw (A 0) (Measure.dirac (actionZero alg)) P where
  aemeasurable
Type uses (5)
Body uses (1)
Used by (2)

Actions: Source ยท Open Issue

Proof
have hA := h.measurable_action; by fun_prop
  map_eq := (h.hasLaw_action_zero).map_eq.trans (p0_eq_dirac alg)

action_zero_of_IsAlgEnvSeqUntil๐Ÿ”—

LemmaLearning.IsDeterministicAlg.action_zero_of_IsAlgEnvSeqUntil

No docstring.

๐Ÿ”—theorem
Learning.IsDeterministicAlg.action_zero_of_IsAlgEnvSeqUntil.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} {N : โ„•} [MeasurableEq ๐“] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) : A 0 =แต[P] fun x => actionZero alg
Learning.IsDeterministicAlg.action_zero_of_IsAlgEnvSeqUntil.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} {N : โ„•} [MeasurableEq ๐“] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) : A 0 =แต[P] fun x => actionZero alg

Code

lemma action_zero_of_IsAlgEnvSeqUntil [MeasurableEq ๐“] [h_det : IsDeterministicAlg alg]
    (h : IsAlgEnvSeqUntil A Y alg env P N) :
    A 0 =แต[P] fun _ โ†ฆ actionZero alg
Type uses (5)
Body uses (1)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  have h_eq : โˆ€แต x โˆ‚(P.map (A 0)), x = actionZero alg := by
    simp [(hasLaw_action_zero_of_IsAlgEnvSeqUntil h).map_eq]
  have hA := h.measurable_action
  exact ae_of_ae_map (by fun_prop) h_eq

action_ae_eq_of_IsAlgEnvSeqUntil๐Ÿ”—

LemmaLearning.IsDeterministicAlg.action_ae_eq_of_IsAlgEnvSeqUntil

No docstring.

๐Ÿ”—theorem
Learning.IsDeterministicAlg.action_ae_eq_of_IsAlgEnvSeqUntil.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} {n N : โ„•} [MeasurableEq ๐“] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) (hn : n < N) : A (n + 1) =แต[P] fun ฯ‰ => nextAction alg n (history A Y n ฯ‰)
Learning.IsDeterministicAlg.action_ae_eq_of_IsAlgEnvSeqUntil.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} {n N : โ„•} [MeasurableEq ๐“] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) (hn : n < N) : A (n + 1) =แต[P] fun ฯ‰ => nextAction alg n (history A Y n ฯ‰)

Code

lemma action_ae_eq_of_IsAlgEnvSeqUntil [MeasurableEq ๐“]
    [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) (hn : n < N) :
    A (n + 1) =แต[P] fun ฯ‰ โ†ฆ nextAction alg n (history A Y n ฯ‰)
Type uses (6)
Body uses (3)
Used by (2)

Actions: Source ยท Open Issue

Proof
by
  have h_eq := (h.hasCondDistrib_action n hn)
  rw [policy_eq_deterministic alg n] at h_eq
  exact ae_eq_of_hasCondDistrib_deterministic (measurable_nextAction _ _) (by fun_prop)
    (by fun_prop) h_eq

hasLaw_action_zero๐Ÿ”—

LemmaLearning.IsDeterministicAlg.hasLaw_action_zero

No docstring.

๐Ÿ”—theorem
Learning.IsDeterministicAlg.hasLaw_action_zero.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac (actionZero alg)) P
Learning.IsDeterministicAlg.hasLaw_action_zero.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac (actionZero alg)) P

Code

lemma hasLaw_action_zero [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) :
    HasLaw (A 0) (Measure.dirac (actionZero alg)) P where
  aemeasurable
Type uses (5)
Body uses (1)
Used by (1)

Actions: Source ยท Open Issue

Proof
have hA := h.measurable_action; by fun_prop
  map_eq := (h.hasLaw_action_zero).map_eq.trans (p0_eq_dirac alg)

action_zero_ae_eq๐Ÿ”—

LemmaLearning.IsDeterministicAlg.action_zero_ae_eq

No docstring.

๐Ÿ”—theorem
Learning.IsDeterministicAlg.action_zero_ae_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [MeasurableEq ๐“] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) : A 0 =แต[P] fun x => actionZero alg
Learning.IsDeterministicAlg.action_zero_ae_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [MeasurableEq ๐“] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) : A 0 =แต[P] fun x => actionZero alg

Code

lemma action_zero_ae_eq [MeasurableEq ๐“] [h_det : IsDeterministicAlg alg]
    (h : IsAlgEnvSeq A Y alg env P) :
    A 0 =แต[P] fun _ โ†ฆ actionZero alg
Type uses (5)
Body uses (2)
Used by (2)

Actions: Source ยท Open Issue

Proof
action_zero_of_IsAlgEnvSeqUntil (h.isAlgEnvSeqUntil 0)

action_ae_eq๐Ÿ”—

LemmaLearning.IsDeterministicAlg.action_ae_eq

No docstring.

๐Ÿ”—theorem
Learning.IsDeterministicAlg.action_ae_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [MeasurableEq ๐“] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) (n : โ„•) : A (n + 1) =แต[P] fun ฯ‰ => nextAction alg n (history A Y n ฯ‰)
Learning.IsDeterministicAlg.action_ae_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [MeasurableEq ๐“] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) (n : โ„•) : A (n + 1) =แต[P] fun ฯ‰ => nextAction alg n (history A Y n ฯ‰)

Code

lemma action_ae_eq [MeasurableEq ๐“] [h_det : IsDeterministicAlg alg]
    (h : IsAlgEnvSeq A Y alg env P) (n : โ„•) :
    A (n + 1) =แต[P] fun ฯ‰ โ†ฆ nextAction alg n (history A Y n ฯ‰)
Type uses (6)
Body uses (2)
Used by (2)

Actions: Source ยท Open Issue

Proof
action_ae_eq_of_IsAlgEnvSeqUntil (h.isAlgEnvSeqUntil (n + 1)) (by simp)

action_ae_all_eq๐Ÿ”—

LemmaLearning.IsDeterministicAlg.action_ae_all_eq

No docstring.

๐Ÿ”—theorem
Learning.IsDeterministicAlg.action_ae_all_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [MeasurableEq ๐“] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) : โˆ€แต (ฯ‰ : ฮฉ) โˆ‚P, A 0 ฯ‰ = actionZero alg โˆง โˆ€ (n : โ„•), A (n + 1) ฯ‰ = nextAction alg n (history A Y n ฯ‰)
Learning.IsDeterministicAlg.action_ae_all_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [MeasurableEq ๐“] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) : โˆ€แต (ฯ‰ : ฮฉ) โˆ‚P, A 0 ฯ‰ = actionZero alg โˆง โˆ€ (n : โ„•), A (n + 1) ฯ‰ = nextAction alg n (history A Y n ฯ‰)

Code

lemma action_ae_all_eq [MeasurableEq ๐“] [h_det : IsDeterministicAlg alg]
    (h : IsAlgEnvSeq A Y alg env P) :
    โˆ€แต ฯ‰ โˆ‚P, A 0 ฯ‰ = actionZero alg โˆง โˆ€ n, A (n + 1) ฯ‰ = nextAction alg n (history A Y n ฯ‰)
Type uses (7)
Body uses (2)
Used by (1)

Actions: Source ยท Open Issue

Proof
by
  rw [eventually_and, ae_all_iff]
  exact โŸจaction_zero_ae_eq h, action_ae_eq hโŸฉ

IsDeterministicEnv๐Ÿ”—

Type ClassLearning.IsDeterministicEnv

An environment is deterministic if its initial feedbacks are determined by measurable functions (and not possibly random kernels).

๐Ÿ”—type class
Learning.IsDeterministicEnv.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (env : Environment ๐“ ๐“จ) : Prop
Learning.IsDeterministicEnv.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (env : Environment ๐“ ๐“จ) : Prop

Code

class IsDeterministicEnv (env : Environment ๐“ ๐“จ) : Prop where
  exists_f0 : โˆƒ (f0 : ๐“ โ†’ ๐“จ) (hf0 : Measurable f0), env.ฮฝ0 = Kernel.deterministic f0 hf0
  exists_f : โˆ€ n, โˆƒ (f : ((Iic n โ†’ ๐“ ร— ๐“จ) ร— ๐“) โ†’ ๐“จ) (hf : Measurable f),
    env.feedback n = Kernel.deterministic f hf
Type uses (1)
Used by (11)

Actions: Source ยท Open Issue

feedbackFunZero๐Ÿ”—

DefinitionLearning.feedbackFunZero

The initial feedback function of a deterministic environment.

๐Ÿ”—def
Learning.feedbackFunZero.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (env : Environment ๐“ ๐“จ) [h_det : IsDeterministicEnv env] : ๐“ โ†’ ๐“จ
Learning.feedbackFunZero.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (env : Environment ๐“ ๐“จ) [h_det : IsDeterministicEnv env] : ๐“ โ†’ ๐“จ

Code

noncomputable
def feedbackFunZero (env : Environment ๐“ ๐“จ) [h_det : IsDeterministicEnv env] : ๐“ โ†’ ๐“จ :=
  h_det.exists_f0.choose
Type uses (2)
Used by (6)

Actions: Source ยท Open Issue

measurable_feedbackFunZero๐Ÿ”—

LemmaLearning.measurable_feedbackFunZero

No docstring.

๐Ÿ”—theorem
Learning.measurable_feedbackFunZero.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (env : Environment ๐“ ๐“จ) [IsDeterministicEnv env] : Measurable (feedbackFunZero env)
Learning.measurable_feedbackFunZero.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (env : Environment ๐“ ๐“จ) [IsDeterministicEnv env] : Measurable (feedbackFunZero env)

Code

lemma measurable_feedbackFunZero (env : Environment ๐“ ๐“จ) [IsDeterministicEnv env] :
    Measurable (feedbackFunZero env)
Type uses (3)
Used by (4)

Actions: Source ยท Open Issue

Proof
(IsDeterministicEnv.exists_f0).choose_spec.choose

ฮฝ0_eq_deterministic๐Ÿ”—

LemmaLearning.ฮฝ0_eq_deterministic

No docstring.

๐Ÿ”—theorem
Learning.ฮฝ0_eq_deterministic.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (env : Environment ๐“ ๐“จ) [IsDeterministicEnv env] : Environment.ฮฝ0 env = ProbabilityTheory.Kernel.deterministic (feedbackFunZero env) โ‹ฏ
Learning.ฮฝ0_eq_deterministic.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (env : Environment ๐“ ๐“จ) [IsDeterministicEnv env] : Environment.ฮฝ0 env = ProbabilityTheory.Kernel.deterministic (feedbackFunZero env) โ‹ฏ

Code

lemma ฮฝ0_eq_deterministic (env : Environment ๐“ ๐“จ) [IsDeterministicEnv env] :
    env.ฮฝ0 = Kernel.deterministic (feedbackFunZero env) (measurable_feedbackFunZero env)
Type uses (4)
Used by (3)

Actions: Source ยท Open Issue

Proof
(IsDeterministicEnv.exists_f0).choose_spec.choose_spec

feedbackFun๐Ÿ”—

DefinitionLearning.feedbackFun

The feedback function of a deterministic environment at step n.

๐Ÿ”—def
Learning.feedbackFun.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (env : Environment ๐“ ๐“จ) [h_det : IsDeterministicEnv env] (n : โ„•) : (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) ร— ๐“ โ†’ ๐“จ
Learning.feedbackFun.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (env : Environment ๐“ ๐“จ) [h_det : IsDeterministicEnv env] (n : โ„•) : (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) ร— ๐“ โ†’ ๐“จ

Code

noncomputable
def feedbackFun (env : Environment ๐“ ๐“จ) [h_det : IsDeterministicEnv env] (n : โ„•) :
    ((Iic n โ†’ ๐“ ร— ๐“จ) ร— ๐“) โ†’ ๐“จ :=
  (h_det.exists_f n).choose
Type uses (2)
Used by (6)

Actions: Source ยท Open Issue

measurable_feedbackFun๐Ÿ”—

LemmaLearning.measurable_feedbackFun

No docstring.

๐Ÿ”—theorem
Learning.measurable_feedbackFun.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (env : Environment ๐“ ๐“จ) [IsDeterministicEnv env] (n : โ„•) : Measurable (feedbackFun env n)
Learning.measurable_feedbackFun.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (env : Environment ๐“ ๐“จ) [IsDeterministicEnv env] (n : โ„•) : Measurable (feedbackFun env n)

Code

lemma measurable_feedbackFun (env : Environment ๐“ ๐“จ) [IsDeterministicEnv env] (n : โ„•) :
    Measurable (feedbackFun env n)
Type uses (3)
Used by (4)

Actions: Source ยท Open Issue

Proof
(IsDeterministicEnv.exists_f n).choose_spec.choose

feedback_eq_deterministic๐Ÿ”—

LemmaLearning.feedback_eq_deterministic

No docstring.

๐Ÿ”—theorem
Learning.feedback_eq_deterministic.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (env : Environment ๐“ ๐“จ) [IsDeterministicEnv env] (n : โ„•) : Environment.feedback env n = ProbabilityTheory.Kernel.deterministic (feedbackFun env n) โ‹ฏ
Learning.feedback_eq_deterministic.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (env : Environment ๐“ ๐“จ) [IsDeterministicEnv env] (n : โ„•) : Environment.feedback env n = ProbabilityTheory.Kernel.deterministic (feedbackFun env n) โ‹ฏ

Code

lemma feedback_eq_deterministic (env : Environment ๐“ ๐“จ) [IsDeterministicEnv env] (n : โ„•) :
    env.feedback n = Kernel.deterministic (feedbackFun env n) (measurable_feedbackFun env n)
Type uses (4)
Used by (3)

Actions: Source ยท Open Issue

Proof
(IsDeterministicEnv.exists_f n).choose_spec.choose_spec

hasCondDistrib_feedback_zero๐Ÿ”—

LemmaLearning.IsDeterministicEnv.hasCondDistrib_feedback_zero

No docstring.

๐Ÿ”—theorem
Learning.IsDeterministicEnv.hasCondDistrib_feedback_zero.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [h_det : IsDeterministicEnv env] (h : IsAlgEnvSeq A Y alg env P) : ProbabilityTheory.HasCondDistrib (Y 0) (A 0) (ProbabilityTheory.Kernel.deterministic (feedbackFunZero env) โ‹ฏ) P
Learning.IsDeterministicEnv.hasCondDistrib_feedback_zero.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [h_det : IsDeterministicEnv env] (h : IsAlgEnvSeq A Y alg env P) : ProbabilityTheory.HasCondDistrib (Y 0) (A 0) (ProbabilityTheory.Kernel.deterministic (feedbackFunZero env) โ‹ฏ) P

Code

lemma hasCondDistrib_feedback_zero [h_det : IsDeterministicEnv env]
    (h : IsAlgEnvSeq A Y alg env P) :
    HasCondDistrib (Y 0) (A 0)
      (Kernel.deterministic (feedbackFunZero env) (measurable_feedbackFunZero env)) P
Type uses (6)
Body uses (1)

Actions: Source ยท Open Issue

Proof
by
  rw [โ† ฮฝ0_eq_deterministic]
  exact h.hasCondDistrib_feedback_zero

hasCondDistrib_feedback๐Ÿ”—

LemmaLearning.IsDeterministicEnv.hasCondDistrib_feedback

No docstring.

๐Ÿ”—theorem
Learning.IsDeterministicEnv.hasCondDistrib_feedback.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [h_det : IsDeterministicEnv env] (h : IsAlgEnvSeq A Y alg env P) (n : โ„•) : ProbabilityTheory.HasCondDistrib (Y (n + 1)) (fun ฯ‰ => (history A Y n ฯ‰, A (n + 1) ฯ‰)) (ProbabilityTheory.Kernel.deterministic (feedbackFun env n) โ‹ฏ) P
Learning.IsDeterministicEnv.hasCondDistrib_feedback.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐“ ๐“จ} {env : Environment ๐“ ๐“จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [h_det : IsDeterministicEnv env] (h : IsAlgEnvSeq A Y alg env P) (n : โ„•) : ProbabilityTheory.HasCondDistrib (Y (n + 1)) (fun ฯ‰ => (history A Y n ฯ‰, A (n + 1) ฯ‰)) (ProbabilityTheory.Kernel.deterministic (feedbackFun env n) โ‹ฏ) P

Code

lemma hasCondDistrib_feedback [h_det : IsDeterministicEnv env]
    (h : IsAlgEnvSeq A Y alg env P) (n : โ„•) :
    HasCondDistrib (Y (n + 1)) (fun ฯ‰ โ†ฆ (history A Y n ฯ‰, A (n + 1) ฯ‰))
      (Kernel.deterministic (feedbackFun env n) (measurable_feedbackFun env n)) P
Type uses (7)
Body uses (1)

Actions: Source ยท Open Issue

Proof
by
  rw [โ† feedback_eq_deterministic]
  exact h.hasCondDistrib_feedback n

detAlgorithm๐Ÿ”—

DefinitionLearning.detAlgorithm

A deterministic algorithm, which chooses the action given by the function nextAction.

๐Ÿ”—def
Learning.detAlgorithm.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“) (h_next : โˆ€ (n : โ„•), Measurable (nextA n)) (action0 : ๐“) : Algorithm ๐“ ๐“จ
Learning.detAlgorithm.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“) (h_next : โˆ€ (n : โ„•), Measurable (nextA n)) (action0 : ๐“) : Algorithm ๐“ ๐“จ

Code

noncomputable
def detAlgorithm (nextA : (n : โ„•) โ†’ (Iic n โ†’ ๐“ ร— ๐“จ) โ†’ ๐“)
    (h_next : โˆ€ n, Measurable (nextA n)) (action0 : ๐“) :
    Algorithm ๐“ ๐“จ where
  policy n := Kernel.deterministic (nextA n) (h_next n)
  p0 := Measure.dirac action0
Type uses (1)
Used by (15)

Actions: Source ยท Open Issue

detAlgorithm_p0๐Ÿ”—

LemmaLearning.detAlgorithm_p0

No docstring.

๐Ÿ”—theorem
Learning.detAlgorithm_p0.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“) (h_next : โˆ€ (n : โ„•), Measurable (nextA n)) (action0 : ๐“) : Algorithm.p0 (detAlgorithm nextA h_next action0) = MeasureTheory.Measure.dirac action0
Learning.detAlgorithm_p0.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“) (h_next : โˆ€ (n : โ„•), Measurable (nextA n)) (action0 : ๐“) : Algorithm.p0 (detAlgorithm nextA h_next action0) = MeasureTheory.Measure.dirac action0

Code

theorem detAlgorithm_p0 : โˆ€ {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ}
  (nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“) (h_next : โˆ€ (n : โ„•), Measurable (nextA n)) (action0 : ๐“),
  (Learning.detAlgorithm nextA h_next action0).p0 = MeasureTheory.Measure.dirac action0
Type uses (2)

Actions: Source ยท Open Issue

Proof
@[simps]

detAlgorithm_policy๐Ÿ”—

LemmaLearning.detAlgorithm_policy

No docstring.

๐Ÿ”—theorem
Learning.detAlgorithm_policy.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“) (h_next : โˆ€ (n : โ„•), Measurable (nextA n)) (action0 : ๐“) (n : โ„•) : Algorithm.policy (detAlgorithm nextA h_next action0) n = ProbabilityTheory.Kernel.deterministic (nextA n) โ‹ฏ
Learning.detAlgorithm_policy.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“) (h_next : โˆ€ (n : โ„•), Measurable (nextA n)) (action0 : ๐“) (n : โ„•) : Algorithm.policy (detAlgorithm nextA h_next action0) n = ProbabilityTheory.Kernel.deterministic (nextA n) โ‹ฏ

Code

theorem detAlgorithm_policy : โˆ€ {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ}
  (nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“) (h_next : โˆ€ (n : โ„•), Measurable (nextA n)) (action0 : ๐“) (n : โ„•),
  (Learning.detAlgorithm nextA h_next action0).policy n = ProbabilityTheory.Kernel.deterministic (nextA n) โ‹ฏ
Type uses (2)
Used by (2)

Actions: Source ยท Open Issue

Proof
@[simps]

instIsDeterministicAlgDetAlgorithm๐Ÿ”—

InstanceLearning.instIsDeterministicAlgDetAlgorithm

No docstring.

๐Ÿ”—theorem
Learning.instIsDeterministicAlgDetAlgorithm.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} : IsDeterministicAlg (detAlgorithm nextA h_next action0)
Learning.instIsDeterministicAlgDetAlgorithm.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} : IsDeterministicAlg (detAlgorithm nextA h_next action0)

Code

instance : IsDeterministicAlg (detAlgorithm nextA h_next action0) where
  exists_action0
Type uses (2)
Body uses (1)
Used by (9)

Actions: Source ยท Open Issue

Proof
โŸจaction0, rflโŸฉ
  exists_nextAction n := โŸจnextA n, h_next n, rflโŸฉ

actionZero_detAlgorithm๐Ÿ”—

LemmaLearning.actionZero_detAlgorithm

No docstring.

๐Ÿ”—theorem
Learning.actionZero_detAlgorithm.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} [MeasurableSpace.SeparatesPoints ๐“] : actionZero (detAlgorithm nextA h_next action0) = action0
Learning.actionZero_detAlgorithm.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} [MeasurableSpace.SeparatesPoints ๐“] : actionZero (detAlgorithm nextA h_next action0) = action0

Code

lemma actionZero_detAlgorithm [MeasurableSpace.SeparatesPoints ๐“] :
    actionZero (detAlgorithm nextA h_next action0) = action0
Type uses (3)
Body uses (3)
Used by (5)

Actions: Source ยท Open Issue

Proof
by
  have h_eq := p0_eq_dirac (detAlgorithm nextA h_next action0)
  simp only [detAlgorithm] at h_eq
  rw [dirac_eq_dirac_iff] at h_eq
  exact h_eq.symm

nextAction_detAlgorithm๐Ÿ”—

LemmaLearning.nextAction_detAlgorithm

No docstring.

๐Ÿ”—theorem
Learning.nextAction_detAlgorithm.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} [MeasurableSpace.SeparatesPoints ๐“] (n : โ„•) : nextAction (detAlgorithm nextA h_next action0) n = nextA n
Learning.nextAction_detAlgorithm.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} [MeasurableSpace.SeparatesPoints ๐“] (n : โ„•) : nextAction (detAlgorithm nextA h_next action0) n = nextA n

Code

lemma nextAction_detAlgorithm [MeasurableSpace.SeparatesPoints ๐“] (n : โ„•) :
    nextAction (detAlgorithm nextA h_next action0) n = nextA n
Type uses (3)
Body uses (4)
Used by (3)

Actions: Source ยท Open Issue

Proof
by
  have h_eq := policy_eq_deterministic (detAlgorithm nextA h_next action0) n
  simpa [detAlgorithm] using h_eq.symm

detEnvironment๐Ÿ”—

DefinitionLearning.detEnvironment

A deterministic environment, where the feedback is given by evaluating fixed measurable functions.

๐Ÿ”—def
Learning.detEnvironment.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (f0 : ๐“ โ†’ ๐“จ) (hf0 : Measurable f0) (f : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) ร— ๐“ โ†’ ๐“จ) (hf : โˆ€ (n : โ„•), Measurable (f n)) : Environment ๐“ ๐“จ
Learning.detEnvironment.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} (f0 : ๐“ โ†’ ๐“จ) (hf0 : Measurable f0) (f : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) ร— ๐“ โ†’ ๐“จ) (hf : โˆ€ (n : โ„•), Measurable (f n)) : Environment ๐“ ๐“จ

Code

noncomputable def detEnvironment
    (f0 : ๐“ โ†’ ๐“จ) (hf0 : Measurable f0)
    (f : (n : โ„•) โ†’ ((Iic n โ†’ ๐“ ร— ๐“จ) ร— ๐“) โ†’ ๐“จ) (hf : โˆ€ n, Measurable (f n)) :
    Environment ๐“ ๐“จ where
  feedback n := (Kernel.deterministic (f n) (hf n))
  ฮฝ0 := Kernel.deterministic f0 hf0
Type uses (1)
Used by (3)

Actions: Source ยท Open Issue

instIsDeterministicEnvDetEnvironment๐Ÿ”—

InstanceLearning.instIsDeterministicEnvDetEnvironment

No docstring.

๐Ÿ”—theorem
Learning.instIsDeterministicEnvDetEnvironment.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {f0 : ๐“ โ†’ ๐“จ} {hf0 : Measurable f0} {f : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) ร— ๐“ โ†’ ๐“จ} {hf : โˆ€ (n : โ„•), Measurable (f n)} : IsDeterministicEnv (detEnvironment f0 hf0 f hf)
Learning.instIsDeterministicEnvDetEnvironment.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {f0 : ๐“ โ†’ ๐“จ} {hf0 : Measurable f0} {f : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) ร— ๐“ โ†’ ๐“จ} {hf : โˆ€ (n : โ„•), Measurable (f n)} : IsDeterministicEnv (detEnvironment f0 hf0 f hf)

Code

instance : IsDeterministicEnv (detEnvironment f0 hf0 f hf) where
  exists_f0
Type uses (2)
Body uses (1)
Used by (2)

Actions: Source ยท Open Issue

Proof
โŸจf0, hf0, rflโŸฉ
  exists_f n := โŸจf n, hf n, rflโŸฉ

feedbackFunZero_detEnvironment๐Ÿ”—

LemmaLearning.feedbackFunZero_detEnvironment

No docstring.

๐Ÿ”—theorem
Learning.feedbackFunZero_detEnvironment.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {f0 : ๐“ โ†’ ๐“จ} {hf0 : Measurable f0} {f : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) ร— ๐“ โ†’ ๐“จ} {hf : โˆ€ (n : โ„•), Measurable (f n)} [MeasurableSpace.SeparatesPoints ๐“จ] : feedbackFunZero (detEnvironment f0 hf0 f hf) = f0
Learning.feedbackFunZero_detEnvironment.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {f0 : ๐“ โ†’ ๐“จ} {hf0 : Measurable f0} {f : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) ร— ๐“ โ†’ ๐“จ} {hf : โˆ€ (n : โ„•), Measurable (f n)} [MeasurableSpace.SeparatesPoints ๐“จ] : feedbackFunZero (detEnvironment f0 hf0 f hf) = f0

Code

lemma feedbackFunZero_detEnvironment [MeasurableSpace.SeparatesPoints ๐“จ] :
    feedbackFunZero (detEnvironment f0 hf0 f hf) = f0
Type uses (3)
Body uses (4)

Actions: Source ยท Open Issue

Proof
by
  simpa [detEnvironment] using (ฮฝ0_eq_deterministic (detEnvironment f0 hf0 f hf)).symm

feedbackFun_detEnvironment๐Ÿ”—

LemmaLearning.feedbackFun_detEnvironment

No docstring.

๐Ÿ”—theorem
Learning.feedbackFun_detEnvironment.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {f0 : ๐“ โ†’ ๐“จ} {hf0 : Measurable f0} {f : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) ร— ๐“ โ†’ ๐“จ} {hf : โˆ€ (n : โ„•), Measurable (f n)} [MeasurableSpace.SeparatesPoints ๐“จ] (n : โ„•) : feedbackFun (detEnvironment f0 hf0 f hf) n = f n
Learning.feedbackFun_detEnvironment.{u_1, u_2} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {f0 : ๐“ โ†’ ๐“จ} {hf0 : Measurable f0} {f : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) ร— ๐“ โ†’ ๐“จ} {hf : โˆ€ (n : โ„•), Measurable (f n)} [MeasurableSpace.SeparatesPoints ๐“จ] (n : โ„•) : feedbackFun (detEnvironment f0 hf0 f hf) n = f n

Code

lemma feedbackFun_detEnvironment [MeasurableSpace.SeparatesPoints ๐“จ] (n : โ„•) :
    feedbackFun (detEnvironment f0 hf0 f hf) n = f n
Type uses (3)
Body uses (4)

Actions: Source ยท Open Issue

Proof
by
  simpa [detEnvironment] using (feedback_eq_deterministic (detEnvironment f0 hf0 f hf) n).symm

hasLaw_action_zero_detAlgorithm๐Ÿ”—

LemmaLearning.IsAlgEnvSeq.hasLaw_action_zero_detAlgorithm

No docstring.

๐Ÿ”—theorem
Learning.IsAlgEnvSeq.hasLaw_action_zero_detAlgorithm.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} {env : Environment ๐“ ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [MeasurableEq ๐“] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac action0) P
Learning.IsAlgEnvSeq.hasLaw_action_zero_detAlgorithm.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} {env : Environment ๐“ ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [MeasurableEq ๐“] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac action0) P

Code

lemma hasLaw_action_zero_detAlgorithm [MeasurableEq ๐“]
    (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) :
    HasLaw (A 0) (Measure.dirac action0) P
Type uses (3)
Body uses (4)

Actions: Source ยท Open Issue

Proof
by
  simpa using IsDeterministicAlg.hasLaw_action_zero h

action_zero_detAlgorithm๐Ÿ”—

LemmaLearning.IsAlgEnvSeq.action_zero_detAlgorithm

No docstring.

๐Ÿ”—theorem
Learning.IsAlgEnvSeq.action_zero_detAlgorithm.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} {env : Environment ๐“ ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [MeasurableEq ๐“] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) : A 0 =แต[P] fun x => action0
Learning.IsAlgEnvSeq.action_zero_detAlgorithm.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} {env : Environment ๐“ ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [MeasurableEq ๐“] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) : A 0 =แต[P] fun x => action0

Code

lemma action_zero_detAlgorithm [MeasurableEq ๐“]
    (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) :
    A 0 =แต[P] fun _ โ†ฆ action0
Type uses (3)
Body uses (4)

Actions: Source ยท Open Issue

Proof
(IsDeterministicAlg.action_zero_ae_eq h).trans (by simp)

action_detAlgorithm_ae_eq๐Ÿ”—

LemmaLearning.IsAlgEnvSeq.action_detAlgorithm_ae_eq

No docstring.

๐Ÿ”—theorem
Learning.IsAlgEnvSeq.action_detAlgorithm_ae_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} {env : Environment ๐“ ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [MeasurableEq ๐“] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) (n : โ„•) : A (n + 1) =แต[P] fun ฯ‰ => nextA n (history A Y n ฯ‰)
Learning.IsAlgEnvSeq.action_detAlgorithm_ae_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} {env : Environment ๐“ ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [MeasurableEq ๐“] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) (n : โ„•) : A (n + 1) =แต[P] fun ฯ‰ => nextA n (history A Y n ฯ‰)

Code

lemma action_detAlgorithm_ae_eq [MeasurableEq ๐“]
    (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) (n : โ„•) :
    A (n + 1) =แต[P] fun ฯ‰ โ†ฆ nextA n (history A Y n ฯ‰)
Type uses (4)
Body uses (4)
Used by (2)

Actions: Source ยท Open Issue

Proof
(IsDeterministicAlg.action_ae_eq h n).trans (by simp)

action_detAlgorithm_ae_all_eq๐Ÿ”—

LemmaLearning.IsAlgEnvSeq.action_detAlgorithm_ae_all_eq

No docstring.

๐Ÿ”—theorem
Learning.IsAlgEnvSeq.action_detAlgorithm_ae_all_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} {env : Environment ๐“ ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [MeasurableEq ๐“] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) : โˆ€แต (ฯ‰ : ฮฉ) โˆ‚P, A 0 ฯ‰ = action0 โˆง โˆ€ (n : โ„•), A (n + 1) ฯ‰ = nextA n (history A Y n ฯ‰)
Learning.IsAlgEnvSeq.action_detAlgorithm_ae_all_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} {env : Environment ๐“ ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} [MeasurableEq ๐“] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) : โˆ€แต (ฯ‰ : ฮฉ) โˆ‚P, A 0 ฯ‰ = action0 โˆง โˆ€ (n : โ„•), A (n + 1) ฯ‰ = nextA n (history A Y n ฯ‰)

Code

lemma action_detAlgorithm_ae_all_eq [MeasurableEq ๐“]
    (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) :
    โˆ€แต ฯ‰ โˆ‚P, A 0 ฯ‰ = action0 โˆง โˆ€ n, A (n + 1) ฯ‰ = nextA n (history A Y n ฯ‰)
Type uses (4)
Body uses (6)

Actions: Source ยท Open Issue

Proof
by
  filter_upwards [IsDeterministicAlg.action_ae_all_eq h] with ฯ‰ hฯ‰ using by simp [hฯ‰]

hasLaw_action_zero_detAlgorithm๐Ÿ”—

LemmaLearning.IsAlgEnvSeqUntil.hasLaw_action_zero_detAlgorithm

No docstring.

๐Ÿ”—theorem
Learning.IsAlgEnvSeqUntil.hasLaw_action_zero_detAlgorithm.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} {env : Environment ๐“ ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} {N : โ„•} [MeasurableEq ๐“] (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac action0) P
Learning.IsAlgEnvSeqUntil.hasLaw_action_zero_detAlgorithm.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} {env : Environment ๐“ ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} {N : โ„•} [MeasurableEq ๐“] (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac action0) P

Code

lemma hasLaw_action_zero_detAlgorithm [MeasurableEq ๐“]
    (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) :
    HasLaw (A 0) (Measure.dirac action0) P
Type uses (3)
Body uses (4)

Actions: Source ยท Open Issue

Proof
by
  simpa using IsDeterministicAlg.hasLaw_action_zero_of_IsAlgEnvSeqUntil h

action_zero_detAlgorithm๐Ÿ”—

LemmaLearning.IsAlgEnvSeqUntil.action_zero_detAlgorithm

No docstring.

๐Ÿ”—theorem
Learning.IsAlgEnvSeqUntil.action_zero_detAlgorithm.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} {env : Environment ๐“ ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} {N : โ„•} [MeasurableEq ๐“] (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) : A 0 =แต[P] fun x => action0
Learning.IsAlgEnvSeqUntil.action_zero_detAlgorithm.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} {env : Environment ๐“ ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} {N : โ„•} [MeasurableEq ๐“] (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) : A 0 =แต[P] fun x => action0

Code

lemma action_zero_detAlgorithm [MeasurableEq ๐“]
    (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) :
    A 0 =แต[P] fun _ โ†ฆ action0
Type uses (3)
Body uses (4)
Used by (1)

Actions: Source ยท Open Issue

Proof
(IsDeterministicAlg.action_zero_of_IsAlgEnvSeqUntil h).trans (by simp)

action_detAlgorithm_ae_eq๐Ÿ”—

LemmaLearning.IsAlgEnvSeqUntil.action_detAlgorithm_ae_eq

No docstring.

๐Ÿ”—theorem
Learning.IsAlgEnvSeqUntil.action_detAlgorithm_ae_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} {env : Environment ๐“ ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} {N n : โ„•} [MeasurableEq ๐“] (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) (hn : n < N) : A (n + 1) =แต[P] fun ฯ‰ => nextA n (history A Y n ฯ‰)
Learning.IsAlgEnvSeqUntil.action_detAlgorithm_ae_eq.{u_1, u_2, u_3} {๐“ : Type u_1} {๐“จ : Type u_2} {m๐“ : MeasurableSpace ๐“} {m๐“จ : MeasurableSpace ๐“จ} {nextA : (n : โ„•) โ†’ (โ†ฅ(Finset.Iic n) โ†’ ๐“ ร— ๐“จ) โ†’ ๐“} {h_next : โˆ€ (n : โ„•), Measurable (nextA n)} {action0 : ๐“} {env : Environment ๐“ ๐“จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ„• โ†’ ฮฉ โ†’ ๐“} {Y : โ„• โ†’ ฮฉ โ†’ ๐“จ} {N n : โ„•} [MeasurableEq ๐“] (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) (hn : n < N) : A (n + 1) =แต[P] fun ฯ‰ => nextA n (history A Y n ฯ‰)

Code

lemma action_detAlgorithm_ae_eq [MeasurableEq ๐“]
    (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) (hn : n < N) :
    A (n + 1) =แต[P] fun ฯ‰ โ†ฆ nextA n (history A Y n ฯ‰)
Type uses (4)
Body uses (4)
Used by (1)

Actions: Source ยท Open Issue

Proof
(IsDeterministicAlg.action_ae_eq_of_IsAlgEnvSeqUntil h hn).trans (by simp)
  1. Learning.IsDeterministicAlg
  2. Learning.actionZero
  3. Learning.nextAction
  4. Learning.measurable_nextAction
  5. Learning.p0_eq_dirac
  6. Learning.policy_eq_deterministic
  7. Learning.IsDeterministicAlg.hasLaw_action_zero_of_IsAlgEnvSeqUntil
  8. Learning.IsDeterministicAlg.action_zero_of_IsAlgEnvSeqUntil
  9. Learning.IsDeterministicAlg.action_ae_eq_of_IsAlgEnvSeqUntil
  10. Learning.IsDeterministicAlg.hasLaw_action_zero
  11. Learning.IsDeterministicAlg.action_zero_ae_eq
  12. Learning.IsDeterministicAlg.action_ae_eq
  13. Learning.IsDeterministicAlg.action_ae_all_eq
  14. Learning.IsDeterministicEnv
  15. Learning.feedbackFunZero
  16. Learning.measurable_feedbackFunZero
  17. Learning.ฮฝ0_eq_deterministic
  18. Learning.feedbackFun
  19. Learning.measurable_feedbackFun
  20. Learning.feedback_eq_deterministic
  21. Learning.IsDeterministicEnv.hasCondDistrib_feedback_zero
  22. Learning.IsDeterministicEnv.hasCondDistrib_feedback
  23. Learning.detAlgorithm
  24. Learning.detAlgorithm_p0
  25. Learning.detAlgorithm_policy
  26. Learning.instIsDeterministicAlgDetAlgorithm
  27. Learning.actionZero_detAlgorithm
  28. Learning.nextAction_detAlgorithm
  29. Learning.detEnvironment
  30. Learning.instIsDeterministicEnvDetEnvironment
  31. Learning.feedbackFunZero_detEnvironment
  32. Learning.feedbackFun_detEnvironment
  33. Learning.IsAlgEnvSeq.hasLaw_action_zero_detAlgorithm
  34. Learning.IsAlgEnvSeq.action_zero_detAlgorithm
  35. Learning.IsAlgEnvSeq.action_detAlgorithm_ae_eq
  36. Learning.IsAlgEnvSeq.action_detAlgorithm_ae_all_eq
  37. Learning.IsAlgEnvSeqUntil.hasLaw_action_zero_detAlgorithm
  38. Learning.IsAlgEnvSeqUntil.action_zero_detAlgorithm
  39. Learning.IsAlgEnvSeqUntil.action_detAlgorithm_ae_eq