3.7.ย SequentialLearning.Deterministic
Deterministic algorithms and environments
A deterministic algorithm chooses its action in a deterministic way. That is, that action is given by a measurable function of the history instead of a general Markov kernel. Similarly, a deterministic environment gives feedback in a deterministic way.
Main definitions
We introduce two typeclasses IsDeterministicAlg and IsDeterministicEnv to express that
an algorithm or an environment is deterministic. We also give definitions for the initial action
and the next action of a deterministic algorithm, and for the feedback functions of a deterministic
environment. Finally, we give a construction of a deterministic algorithm and environment from
measurable functions.
-
IsDeterministicAlg alg: a typeclass expressing that the algorithmalgis deterministic. -
IsDeterministicEnv env: a typeclass expressing that the environmentenvis deterministic. -
actionZero alg: the initial action of a deterministic algorithmalg. -
nextAction alg n: the function that gives the next action of a deterministic algorithmalgat stepn, as a function of the history. -
feedbackFunZero env: the function that gives the initial feedback of a deterministic environmentenv. -
feedbackFun env n: the function that gives the feedback of a deterministic environmentenvat stepn, as a function of the history and the current action. -
detAlgorithm nextA h_next action0: a deterministic algorithm that chooses its action according to the measurable functionnextA(with proof of measurabilityh_next), with initial actionaction0.
Module LeanMachineLearning.SequentialLearning.Deterministic contains 39 exposed declarations.
IsDeterministicAlg๐
Learning.IsDeterministicAlgAn algorithm is deterministic if its initial action and subsequent actions are determined by measurable functions (and not possibly random kernels).
Learning.IsDeterministicAlg.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (alg : Algorithm ๐ ๐จ) : PropLearning.IsDeterministicAlg.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (alg : Algorithm ๐ ๐จ) : Prop
Code
class IsDeterministicAlg (alg : Algorithm ๐ ๐จ) : Prop where
exists_action0 : โ action0, alg.p0 = Measure.dirac action0
exists_nextAction n : โ (nextAction : (Iic n โ ๐ ร ๐จ) โ ๐) (h_meas : Measurable nextAction),
alg.policy n = Kernel.deterministic nextAction h_measType uses (1)
Used by (14)
Actions: Source ยท Open Issue
actionZero๐
Learning.actionZeroThe initial action of a deterministic algorithm.
Learning.actionZero.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (alg : Algorithm ๐ ๐จ) [h_det : IsDeterministicAlg alg] : ๐Learning.actionZero.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (alg : Algorithm ๐ ๐จ) [h_det : IsDeterministicAlg alg] : ๐
Code
noncomputable def actionZero (alg : Algorithm ๐ ๐จ) [h_det : IsDeterministicAlg alg] : ๐ := h_det.exists_action0.choose
Type uses (2)
Used by (12)
Actions: Source ยท Open Issue
nextAction๐
Learning.nextAction
The next action of a deterministic algorithm after step n.
Learning.nextAction.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (alg : Algorithm ๐ ๐จ) [h_det : IsDeterministicAlg alg] (n : โ) : (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐Learning.nextAction.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (alg : Algorithm ๐ ๐จ) [h_det : IsDeterministicAlg alg] (n : โ) : (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐
Code
noncomputable
def nextAction (alg : Algorithm ๐ ๐จ) [h_det : IsDeterministicAlg alg] (n : โ) :
(Iic n โ ๐ ร ๐จ) โ ๐ :=
(h_det.exists_nextAction n).chooseType uses (2)
Used by (9)
Actions: Source ยท Open Issue
measurable_nextAction๐
Learning.measurable_nextActionNo docstring.
Learning.measurable_nextAction.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (alg : Algorithm ๐ ๐จ) [IsDeterministicAlg alg] (n : โ) : Measurable (nextAction alg n)Learning.measurable_nextAction.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (alg : Algorithm ๐ ๐จ) [IsDeterministicAlg alg] (n : โ) : Measurable (nextAction alg n)
Code
lemma measurable_nextAction (alg : Algorithm ๐ ๐จ) [IsDeterministicAlg alg] (n : โ) :
Measurable (nextAction alg n)Type uses (3)
Actions: Source ยท Open Issue
Proof
(IsDeterministicAlg.exists_nextAction n).choose_spec.choose
p0_eq_dirac๐
Learning.p0_eq_diracNo docstring.
Learning.p0_eq_dirac.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (alg : Algorithm ๐ ๐จ) [h_det : IsDeterministicAlg alg] : Algorithm.p0 alg = MeasureTheory.Measure.dirac (actionZero alg)Learning.p0_eq_dirac.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (alg : Algorithm ๐ ๐จ) [h_det : IsDeterministicAlg alg] : Algorithm.p0 alg = MeasureTheory.Measure.dirac (actionZero alg)
Code
lemma p0_eq_dirac (alg : Algorithm ๐ ๐จ) [h_det : IsDeterministicAlg alg] :
alg.p0 = Measure.dirac (actionZero alg)Type uses (3)
Actions: Source ยท Open Issue
Proof
h_det.exists_action0.choose_spec
policy_eq_deterministic๐
Learning.policy_eq_deterministicNo docstring.
Learning.policy_eq_deterministic.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (alg : Algorithm ๐ ๐จ) [h_det : IsDeterministicAlg alg] (n : โ) : Algorithm.policy alg n = ProbabilityTheory.Kernel.deterministic (nextAction alg n) โฏLearning.policy_eq_deterministic.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (alg : Algorithm ๐ ๐จ) [h_det : IsDeterministicAlg alg] (n : โ) : Algorithm.policy alg n = ProbabilityTheory.Kernel.deterministic (nextAction alg n) โฏ
Code
lemma policy_eq_deterministic (alg : Algorithm ๐ ๐จ) [h_det : IsDeterministicAlg alg] (n : โ) :
alg.policy n = Kernel.deterministic (nextAction alg n) (measurable_nextAction alg n)Type uses (4)
Actions: Source ยท Open Issue
Proof
(IsDeterministicAlg.exists_nextAction n).choose_spec.choose_spec
hasLaw_action_zero_of_IsAlgEnvSeqUntil๐
Learning.IsDeterministicAlg.hasLaw_action_zero_of_IsAlgEnvSeqUntilNo docstring.
Learning.IsDeterministicAlg.hasLaw_action_zero_of_IsAlgEnvSeqUntil.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} {N : โ} [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac (actionZero alg)) PLearning.IsDeterministicAlg.hasLaw_action_zero_of_IsAlgEnvSeqUntil.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} {N : โ} [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac (actionZero alg)) P
Code
lemma hasLaw_action_zero_of_IsAlgEnvSeqUntil [h_det : IsDeterministicAlg alg]
(h : IsAlgEnvSeqUntil A Y alg env P N) :
HasLaw (A 0) (Measure.dirac (actionZero alg)) P where
aemeasurableType uses (5)
Body uses (1)
Actions: Source ยท Open Issue
Proof
have hA := h.measurable_action; by fun_prop map_eq := (h.hasLaw_action_zero).map_eq.trans (p0_eq_dirac alg)
action_zero_of_IsAlgEnvSeqUntil๐
Learning.IsDeterministicAlg.action_zero_of_IsAlgEnvSeqUntilNo docstring.
Learning.IsDeterministicAlg.action_zero_of_IsAlgEnvSeqUntil.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} {N : โ} [MeasurableEq ๐] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) : A 0 =แต[P] fun x => actionZero algLearning.IsDeterministicAlg.action_zero_of_IsAlgEnvSeqUntil.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} {N : โ} [MeasurableEq ๐] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) : A 0 =แต[P] fun x => actionZero alg
Code
lemma action_zero_of_IsAlgEnvSeqUntil [MeasurableEq ๐] [h_det : IsDeterministicAlg alg]
(h : IsAlgEnvSeqUntil A Y alg env P N) :
A 0 =แต[P] fun _ โฆ actionZero algType uses (5)
Body uses (1)
Used by (2)
Actions: Source ยท Open Issue
Proof
by
have h_eq : โแต x โ(P.map (A 0)), x = actionZero alg := by
simp [(hasLaw_action_zero_of_IsAlgEnvSeqUntil h).map_eq]
have hA := h.measurable_action
exact ae_of_ae_map (by fun_prop) h_eq
action_ae_eq_of_IsAlgEnvSeqUntil๐
Learning.IsDeterministicAlg.action_ae_eq_of_IsAlgEnvSeqUntilNo docstring.
Learning.IsDeterministicAlg.action_ae_eq_of_IsAlgEnvSeqUntil.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} {n N : โ} [MeasurableEq ๐] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) (hn : n < N) : A (n + 1) =แต[P] fun ฯ => nextAction alg n (history A Y n ฯ)Learning.IsDeterministicAlg.action_ae_eq_of_IsAlgEnvSeqUntil.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} {n N : โ} [MeasurableEq ๐] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) (hn : n < N) : A (n + 1) =แต[P] fun ฯ => nextAction alg n (history A Y n ฯ)
Code
lemma action_ae_eq_of_IsAlgEnvSeqUntil [MeasurableEq ๐]
[h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeqUntil A Y alg env P N) (hn : n < N) :
A (n + 1) =แต[P] fun ฯ โฆ nextAction alg n (history A Y n ฯ)Type uses (6)
Used by (2)
Actions: Source ยท Open Issue
Proof
by
have h_eq := (h.hasCondDistrib_action n hn)
rw [policy_eq_deterministic alg n] at h_eq
exact ae_eq_of_hasCondDistrib_deterministic (measurable_nextAction _ _) (by fun_prop)
(by fun_prop) h_eq
hasLaw_action_zero๐
Learning.IsDeterministicAlg.hasLaw_action_zeroNo docstring.
Learning.IsDeterministicAlg.hasLaw_action_zero.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac (actionZero alg)) PLearning.IsDeterministicAlg.hasLaw_action_zero.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac (actionZero alg)) P
Code
lemma hasLaw_action_zero [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) :
HasLaw (A 0) (Measure.dirac (actionZero alg)) P where
aemeasurableType uses (5)
Body uses (1)
Used by (1)
Actions: Source ยท Open Issue
Proof
have hA := h.measurable_action; by fun_prop map_eq := (h.hasLaw_action_zero).map_eq.trans (p0_eq_dirac alg)
action_zero_ae_eq๐
Learning.IsDeterministicAlg.action_zero_ae_eqNo docstring.
Learning.IsDeterministicAlg.action_zero_ae_eq.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [MeasurableEq ๐] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) : A 0 =แต[P] fun x => actionZero algLearning.IsDeterministicAlg.action_zero_ae_eq.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [MeasurableEq ๐] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) : A 0 =แต[P] fun x => actionZero alg
Code
lemma action_zero_ae_eq [MeasurableEq ๐] [h_det : IsDeterministicAlg alg]
(h : IsAlgEnvSeq A Y alg env P) :
A 0 =แต[P] fun _ โฆ actionZero algType uses (5)
Body uses (2)
Used by (2)
Actions: Source ยท Open Issue
Proof
action_zero_of_IsAlgEnvSeqUntil (h.isAlgEnvSeqUntil 0)
action_ae_eq๐
Learning.IsDeterministicAlg.action_ae_eqNo docstring.
Learning.IsDeterministicAlg.action_ae_eq.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [MeasurableEq ๐] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) (n : โ) : A (n + 1) =แต[P] fun ฯ => nextAction alg n (history A Y n ฯ)Learning.IsDeterministicAlg.action_ae_eq.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [MeasurableEq ๐] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) (n : โ) : A (n + 1) =แต[P] fun ฯ => nextAction alg n (history A Y n ฯ)
Code
lemma action_ae_eq [MeasurableEq ๐] [h_det : IsDeterministicAlg alg]
(h : IsAlgEnvSeq A Y alg env P) (n : โ) :
A (n + 1) =แต[P] fun ฯ โฆ nextAction alg n (history A Y n ฯ)Type uses (6)
Body uses (2)
Used by (2)
Actions: Source ยท Open Issue
Proof
action_ae_eq_of_IsAlgEnvSeqUntil (h.isAlgEnvSeqUntil (n + 1)) (by simp)
action_ae_all_eq๐
Learning.IsDeterministicAlg.action_ae_all_eqNo docstring.
Learning.IsDeterministicAlg.action_ae_all_eq.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [MeasurableEq ๐] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) : โแต (ฯ : ฮฉ) โP, A 0 ฯ = actionZero alg โง โ (n : โ), A (n + 1) ฯ = nextAction alg n (history A Y n ฯ)Learning.IsDeterministicAlg.action_ae_all_eq.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [MeasurableEq ๐] [h_det : IsDeterministicAlg alg] (h : IsAlgEnvSeq A Y alg env P) : โแต (ฯ : ฮฉ) โP, A 0 ฯ = actionZero alg โง โ (n : โ), A (n + 1) ฯ = nextAction alg n (history A Y n ฯ)
Code
lemma action_ae_all_eq [MeasurableEq ๐] [h_det : IsDeterministicAlg alg]
(h : IsAlgEnvSeq A Y alg env P) :
โแต ฯ โP, A 0 ฯ = actionZero alg โง โ n, A (n + 1) ฯ = nextAction alg n (history A Y n ฯ)Type uses (7)
Body uses (2)
Used by (1)
Actions: Source ยท Open Issue
Proof
by rw [eventually_and, ae_all_iff] exact โจaction_zero_ae_eq h, action_ae_eq hโฉ
IsDeterministicEnv๐
Learning.IsDeterministicEnvAn environment is deterministic if its initial feedbacks are determined by measurable functions (and not possibly random kernels).
Learning.IsDeterministicEnv.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) : PropLearning.IsDeterministicEnv.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) : Prop
Code
class IsDeterministicEnv (env : Environment ๐ ๐จ) : Prop where
exists_f0 : โ (f0 : ๐ โ ๐จ) (hf0 : Measurable f0), env.ฮฝ0 = Kernel.deterministic f0 hf0
exists_f : โ n, โ (f : ((Iic n โ ๐ ร ๐จ) ร ๐) โ ๐จ) (hf : Measurable f),
env.feedback n = Kernel.deterministic f hfType uses (1)
Used by (11)
Actions: Source ยท Open Issue
feedbackFunZero๐
Learning.feedbackFunZeroThe initial feedback function of a deterministic environment.
Learning.feedbackFunZero.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) [h_det : IsDeterministicEnv env] : ๐ โ ๐จLearning.feedbackFunZero.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) [h_det : IsDeterministicEnv env] : ๐ โ ๐จ
Code
noncomputable def feedbackFunZero (env : Environment ๐ ๐จ) [h_det : IsDeterministicEnv env] : ๐ โ ๐จ := h_det.exists_f0.choose
Type uses (2)
Used by (6)
Actions: Source ยท Open Issue
measurable_feedbackFunZero๐
Learning.measurable_feedbackFunZeroNo docstring.
Learning.measurable_feedbackFunZero.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) [IsDeterministicEnv env] : Measurable (feedbackFunZero env)Learning.measurable_feedbackFunZero.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) [IsDeterministicEnv env] : Measurable (feedbackFunZero env)
Code
lemma measurable_feedbackFunZero (env : Environment ๐ ๐จ) [IsDeterministicEnv env] :
Measurable (feedbackFunZero env)Type uses (3)
Used by (4)
Actions: Source ยท Open Issue
Proof
(IsDeterministicEnv.exists_f0).choose_spec.choose
ฮฝ0_eq_deterministic๐
Learning.ฮฝ0_eq_deterministicNo docstring.
Learning.ฮฝ0_eq_deterministic.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) [IsDeterministicEnv env] : Environment.ฮฝ0 env = ProbabilityTheory.Kernel.deterministic (feedbackFunZero env) โฏLearning.ฮฝ0_eq_deterministic.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) [IsDeterministicEnv env] : Environment.ฮฝ0 env = ProbabilityTheory.Kernel.deterministic (feedbackFunZero env) โฏ
Code
lemma ฮฝ0_eq_deterministic (env : Environment ๐ ๐จ) [IsDeterministicEnv env] :
env.ฮฝ0 = Kernel.deterministic (feedbackFunZero env) (measurable_feedbackFunZero env)Used by (3)
Actions: Source ยท Open Issue
Proof
(IsDeterministicEnv.exists_f0).choose_spec.choose_spec
feedbackFun๐
Learning.feedbackFun
The feedback function of a deterministic environment at step n.
Learning.feedbackFun.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) [h_det : IsDeterministicEnv env] (n : โ) : (โฅ(Finset.Iic n) โ ๐ ร ๐จ) ร ๐ โ ๐จLearning.feedbackFun.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) [h_det : IsDeterministicEnv env] (n : โ) : (โฅ(Finset.Iic n) โ ๐ ร ๐จ) ร ๐ โ ๐จ
Code
noncomputable
def feedbackFun (env : Environment ๐ ๐จ) [h_det : IsDeterministicEnv env] (n : โ) :
((Iic n โ ๐ ร ๐จ) ร ๐) โ ๐จ :=
(h_det.exists_f n).chooseType uses (2)
Used by (6)
Actions: Source ยท Open Issue
measurable_feedbackFun๐
Learning.measurable_feedbackFunNo docstring.
Learning.measurable_feedbackFun.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) [IsDeterministicEnv env] (n : โ) : Measurable (feedbackFun env n)Learning.measurable_feedbackFun.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) [IsDeterministicEnv env] (n : โ) : Measurable (feedbackFun env n)
Code
lemma measurable_feedbackFun (env : Environment ๐ ๐จ) [IsDeterministicEnv env] (n : โ) :
Measurable (feedbackFun env n)Type uses (3)
Used by (4)
Actions: Source ยท Open Issue
Proof
(IsDeterministicEnv.exists_f n).choose_spec.choose
feedback_eq_deterministic๐
Learning.feedback_eq_deterministicNo docstring.
Learning.feedback_eq_deterministic.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) [IsDeterministicEnv env] (n : โ) : Environment.feedback env n = ProbabilityTheory.Kernel.deterministic (feedbackFun env n) โฏLearning.feedback_eq_deterministic.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (env : Environment ๐ ๐จ) [IsDeterministicEnv env] (n : โ) : Environment.feedback env n = ProbabilityTheory.Kernel.deterministic (feedbackFun env n) โฏ
Code
lemma feedback_eq_deterministic (env : Environment ๐ ๐จ) [IsDeterministicEnv env] (n : โ) :
env.feedback n = Kernel.deterministic (feedbackFun env n) (measurable_feedbackFun env n)Type uses (4)
Actions: Source ยท Open Issue
Proof
(IsDeterministicEnv.exists_f n).choose_spec.choose_spec
hasCondDistrib_feedback_zero๐
Learning.IsDeterministicEnv.hasCondDistrib_feedback_zeroNo docstring.
Learning.IsDeterministicEnv.hasCondDistrib_feedback_zero.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [h_det : IsDeterministicEnv env] (h : IsAlgEnvSeq A Y alg env P) : ProbabilityTheory.HasCondDistrib (Y 0) (A 0) (ProbabilityTheory.Kernel.deterministic (feedbackFunZero env) โฏ) PLearning.IsDeterministicEnv.hasCondDistrib_feedback_zero.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [h_det : IsDeterministicEnv env] (h : IsAlgEnvSeq A Y alg env P) : ProbabilityTheory.HasCondDistrib (Y 0) (A 0) (ProbabilityTheory.Kernel.deterministic (feedbackFunZero env) โฏ) P
Code
lemma hasCondDistrib_feedback_zero [h_det : IsDeterministicEnv env]
(h : IsAlgEnvSeq A Y alg env P) :
HasCondDistrib (Y 0) (A 0)
(Kernel.deterministic (feedbackFunZero env) (measurable_feedbackFunZero env)) PType uses (6)
Body uses (1)
Actions: Source ยท Open Issue
Proof
by rw [โ ฮฝ0_eq_deterministic] exact h.hasCondDistrib_feedback_zero
hasCondDistrib_feedback๐
Learning.IsDeterministicEnv.hasCondDistrib_feedbackNo docstring.
Learning.IsDeterministicEnv.hasCondDistrib_feedback.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [h_det : IsDeterministicEnv env] (h : IsAlgEnvSeq A Y alg env P) (n : โ) : ProbabilityTheory.HasCondDistrib (Y (n + 1)) (fun ฯ => (history A Y n ฯ, A (n + 1) ฯ)) (ProbabilityTheory.Kernel.deterministic (feedbackFun env n) โฏ) PLearning.IsDeterministicEnv.hasCondDistrib_feedback.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {alg : Algorithm ๐ ๐จ} {env : Environment ๐ ๐จ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsFiniteMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [h_det : IsDeterministicEnv env] (h : IsAlgEnvSeq A Y alg env P) (n : โ) : ProbabilityTheory.HasCondDistrib (Y (n + 1)) (fun ฯ => (history A Y n ฯ, A (n + 1) ฯ)) (ProbabilityTheory.Kernel.deterministic (feedbackFun env n) โฏ) P
Code
lemma hasCondDistrib_feedback [h_det : IsDeterministicEnv env]
(h : IsAlgEnvSeq A Y alg env P) (n : โ) :
HasCondDistrib (Y (n + 1)) (fun ฯ โฆ (history A Y n ฯ, A (n + 1) ฯ))
(Kernel.deterministic (feedbackFun env n) (measurable_feedbackFun env n)) PType uses (7)
Body uses (1)
Actions: Source ยท Open Issue
Proof
by rw [โ feedback_eq_deterministic] exact h.hasCondDistrib_feedback n
detAlgorithm๐
Learning.detAlgorithm
A deterministic algorithm, which chooses the action given by the function nextAction.
Learning.detAlgorithm.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐) (h_next : โ (n : โ), Measurable (nextA n)) (action0 : ๐) : Algorithm ๐ ๐จLearning.detAlgorithm.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐) (h_next : โ (n : โ), Measurable (nextA n)) (action0 : ๐) : Algorithm ๐ ๐จ
Code
noncomputable
def detAlgorithm (nextA : (n : โ) โ (Iic n โ ๐ ร ๐จ) โ ๐)
(h_next : โ n, Measurable (nextA n)) (action0 : ๐) :
Algorithm ๐ ๐จ where
policy n := Kernel.deterministic (nextA n) (h_next n)
p0 := Measure.dirac action0Type uses (1)
Used by (15)
Actions: Source ยท Open Issue
detAlgorithm_p0๐
Learning.detAlgorithm_p0No docstring.
Learning.detAlgorithm_p0.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐) (h_next : โ (n : โ), Measurable (nextA n)) (action0 : ๐) : Algorithm.p0 (detAlgorithm nextA h_next action0) = MeasureTheory.Measure.dirac action0Learning.detAlgorithm_p0.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐) (h_next : โ (n : โ), Measurable (nextA n)) (action0 : ๐) : Algorithm.p0 (detAlgorithm nextA h_next action0) = MeasureTheory.Measure.dirac action0
Code
theorem detAlgorithm_p0 : โ {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ}
(nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐) (h_next : โ (n : โ), Measurable (nextA n)) (action0 : ๐),
(Learning.detAlgorithm nextA h_next action0).p0 = MeasureTheory.Measure.dirac action0Type uses (2)
Actions: Source ยท Open Issue
Proof
@[simps]
detAlgorithm_policy๐
Learning.detAlgorithm_policyNo docstring.
Learning.detAlgorithm_policy.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐) (h_next : โ (n : โ), Measurable (nextA n)) (action0 : ๐) (n : โ) : Algorithm.policy (detAlgorithm nextA h_next action0) n = ProbabilityTheory.Kernel.deterministic (nextA n) โฏLearning.detAlgorithm_policy.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐) (h_next : โ (n : โ), Measurable (nextA n)) (action0 : ๐) (n : โ) : Algorithm.policy (detAlgorithm nextA h_next action0) n = ProbabilityTheory.Kernel.deterministic (nextA n) โฏ
Code
theorem detAlgorithm_policy : โ {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ}
(nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐) (h_next : โ (n : โ), Measurable (nextA n)) (action0 : ๐) (n : โ),
(Learning.detAlgorithm nextA h_next action0).policy n = ProbabilityTheory.Kernel.deterministic (nextA n) โฏType uses (2)
Actions: Source ยท Open Issue
Proof
@[simps]
instIsDeterministicAlgDetAlgorithm๐
Learning.instIsDeterministicAlgDetAlgorithmNo docstring.
Learning.instIsDeterministicAlgDetAlgorithm.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} : IsDeterministicAlg (detAlgorithm nextA h_next action0)Learning.instIsDeterministicAlgDetAlgorithm.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} : IsDeterministicAlg (detAlgorithm nextA h_next action0)
Code
instance : IsDeterministicAlg (detAlgorithm nextA h_next action0) where exists_action0
Type uses (2)
Body uses (1)
Used by (9)
Actions: Source ยท Open Issue
Proof
โจaction0, rflโฉ exists_nextAction n := โจnextA n, h_next n, rflโฉ
actionZero_detAlgorithm๐
Learning.actionZero_detAlgorithmNo docstring.
Learning.actionZero_detAlgorithm.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} [MeasurableSpace.SeparatesPoints ๐] : actionZero (detAlgorithm nextA h_next action0) = action0Learning.actionZero_detAlgorithm.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} [MeasurableSpace.SeparatesPoints ๐] : actionZero (detAlgorithm nextA h_next action0) = action0
Code
lemma actionZero_detAlgorithm [MeasurableSpace.SeparatesPoints ๐] :
actionZero (detAlgorithm nextA h_next action0) = action0Type uses (3)
Body uses (3)
Used by (5)
Actions: Source ยท Open Issue
Proof
by have h_eq := p0_eq_dirac (detAlgorithm nextA h_next action0) simp only [detAlgorithm] at h_eq rw [dirac_eq_dirac_iff] at h_eq exact h_eq.symm
nextAction_detAlgorithm๐
Learning.nextAction_detAlgorithmNo docstring.
Learning.nextAction_detAlgorithm.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} [MeasurableSpace.SeparatesPoints ๐] (n : โ) : nextAction (detAlgorithm nextA h_next action0) n = nextA nLearning.nextAction_detAlgorithm.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} [MeasurableSpace.SeparatesPoints ๐] (n : โ) : nextAction (detAlgorithm nextA h_next action0) n = nextA n
Code
lemma nextAction_detAlgorithm [MeasurableSpace.SeparatesPoints ๐] (n : โ) :
nextAction (detAlgorithm nextA h_next action0) n = nextA nType uses (3)
Actions: Source ยท Open Issue
Proof
by have h_eq := policy_eq_deterministic (detAlgorithm nextA h_next action0) n simpa [detAlgorithm] using h_eq.symm
detEnvironment๐
Learning.detEnvironmentA deterministic environment, where the feedback is given by evaluating fixed measurable functions.
Learning.detEnvironment.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (f0 : ๐ โ ๐จ) (hf0 : Measurable f0) (f : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) ร ๐ โ ๐จ) (hf : โ (n : โ), Measurable (f n)) : Environment ๐ ๐จLearning.detEnvironment.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} (f0 : ๐ โ ๐จ) (hf0 : Measurable f0) (f : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) ร ๐ โ ๐จ) (hf : โ (n : โ), Measurable (f n)) : Environment ๐ ๐จ
Code
noncomputable def detEnvironment
(f0 : ๐ โ ๐จ) (hf0 : Measurable f0)
(f : (n : โ) โ ((Iic n โ ๐ ร ๐จ) ร ๐) โ ๐จ) (hf : โ n, Measurable (f n)) :
Environment ๐ ๐จ where
feedback n := (Kernel.deterministic (f n) (hf n))
ฮฝ0 := Kernel.deterministic f0 hf0Type uses (1)
Used by (3)
Actions: Source ยท Open Issue
instIsDeterministicEnvDetEnvironment๐
Learning.instIsDeterministicEnvDetEnvironmentNo docstring.
Learning.instIsDeterministicEnvDetEnvironment.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {f0 : ๐ โ ๐จ} {hf0 : Measurable f0} {f : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) ร ๐ โ ๐จ} {hf : โ (n : โ), Measurable (f n)} : IsDeterministicEnv (detEnvironment f0 hf0 f hf)Learning.instIsDeterministicEnvDetEnvironment.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {f0 : ๐ โ ๐จ} {hf0 : Measurable f0} {f : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) ร ๐ โ ๐จ} {hf : โ (n : โ), Measurable (f n)} : IsDeterministicEnv (detEnvironment f0 hf0 f hf)
Code
instance : IsDeterministicEnv (detEnvironment f0 hf0 f hf) where exists_f0
Type uses (2)
Body uses (1)
Actions: Source ยท Open Issue
Proof
โจf0, hf0, rflโฉ exists_f n := โจf n, hf n, rflโฉ
feedbackFunZero_detEnvironment๐
Learning.feedbackFunZero_detEnvironmentNo docstring.
Learning.feedbackFunZero_detEnvironment.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {f0 : ๐ โ ๐จ} {hf0 : Measurable f0} {f : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) ร ๐ โ ๐จ} {hf : โ (n : โ), Measurable (f n)} [MeasurableSpace.SeparatesPoints ๐จ] : feedbackFunZero (detEnvironment f0 hf0 f hf) = f0Learning.feedbackFunZero_detEnvironment.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {f0 : ๐ โ ๐จ} {hf0 : Measurable f0} {f : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) ร ๐ โ ๐จ} {hf : โ (n : โ), Measurable (f n)} [MeasurableSpace.SeparatesPoints ๐จ] : feedbackFunZero (detEnvironment f0 hf0 f hf) = f0
Code
lemma feedbackFunZero_detEnvironment [MeasurableSpace.SeparatesPoints ๐จ] :
feedbackFunZero (detEnvironment f0 hf0 f hf) = f0Actions: Source ยท Open Issue
Proof
by simpa [detEnvironment] using (ฮฝ0_eq_deterministic (detEnvironment f0 hf0 f hf)).symm
feedbackFun_detEnvironment๐
Learning.feedbackFun_detEnvironmentNo docstring.
Learning.feedbackFun_detEnvironment.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {f0 : ๐ โ ๐จ} {hf0 : Measurable f0} {f : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) ร ๐ โ ๐จ} {hf : โ (n : โ), Measurable (f n)} [MeasurableSpace.SeparatesPoints ๐จ] (n : โ) : feedbackFun (detEnvironment f0 hf0 f hf) n = f nLearning.feedbackFun_detEnvironment.{u_1, u_2} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {f0 : ๐ โ ๐จ} {hf0 : Measurable f0} {f : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) ร ๐ โ ๐จ} {hf : โ (n : โ), Measurable (f n)} [MeasurableSpace.SeparatesPoints ๐จ] (n : โ) : feedbackFun (detEnvironment f0 hf0 f hf) n = f n
Code
lemma feedbackFun_detEnvironment [MeasurableSpace.SeparatesPoints ๐จ] (n : โ) :
feedbackFun (detEnvironment f0 hf0 f hf) n = f nType uses (3)
Actions: Source ยท Open Issue
Proof
by simpa [detEnvironment] using (feedback_eq_deterministic (detEnvironment f0 hf0 f hf) n).symm
hasLaw_action_zero_detAlgorithm๐
Learning.IsAlgEnvSeq.hasLaw_action_zero_detAlgorithmNo docstring.
Learning.IsAlgEnvSeq.hasLaw_action_zero_detAlgorithm.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} {env : Environment ๐ ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [MeasurableEq ๐] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac action0) PLearning.IsAlgEnvSeq.hasLaw_action_zero_detAlgorithm.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} {env : Environment ๐ ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [MeasurableEq ๐] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac action0) P
Code
lemma hasLaw_action_zero_detAlgorithm [MeasurableEq ๐]
(h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) :
HasLaw (A 0) (Measure.dirac action0) PType uses (3)
Body uses (4)
Actions: Source ยท Open Issue
Proof
by simpa using IsDeterministicAlg.hasLaw_action_zero h
action_zero_detAlgorithm๐
Learning.IsAlgEnvSeq.action_zero_detAlgorithmNo docstring.
Learning.IsAlgEnvSeq.action_zero_detAlgorithm.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} {env : Environment ๐ ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [MeasurableEq ๐] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) : A 0 =แต[P] fun x => action0Learning.IsAlgEnvSeq.action_zero_detAlgorithm.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} {env : Environment ๐ ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [MeasurableEq ๐] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) : A 0 =แต[P] fun x => action0
Code
lemma action_zero_detAlgorithm [MeasurableEq ๐]
(h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) :
A 0 =แต[P] fun _ โฆ action0Type uses (3)
Body uses (4)
Actions: Source ยท Open Issue
Proof
(IsDeterministicAlg.action_zero_ae_eq h).trans (by simp)
action_detAlgorithm_ae_eq๐
Learning.IsAlgEnvSeq.action_detAlgorithm_ae_eqNo docstring.
Learning.IsAlgEnvSeq.action_detAlgorithm_ae_eq.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} {env : Environment ๐ ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [MeasurableEq ๐] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) (n : โ) : A (n + 1) =แต[P] fun ฯ => nextA n (history A Y n ฯ)Learning.IsAlgEnvSeq.action_detAlgorithm_ae_eq.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} {env : Environment ๐ ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [MeasurableEq ๐] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) (n : โ) : A (n + 1) =แต[P] fun ฯ => nextA n (history A Y n ฯ)
Code
lemma action_detAlgorithm_ae_eq [MeasurableEq ๐]
(h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) (n : โ) :
A (n + 1) =แต[P] fun ฯ โฆ nextA n (history A Y n ฯ)Type uses (4)
Used by (2)
Actions: Source ยท Open Issue
Proof
(IsDeterministicAlg.action_ae_eq h n).trans (by simp)
action_detAlgorithm_ae_all_eq๐
Learning.IsAlgEnvSeq.action_detAlgorithm_ae_all_eqNo docstring.
Learning.IsAlgEnvSeq.action_detAlgorithm_ae_all_eq.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} {env : Environment ๐ ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [MeasurableEq ๐] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) : โแต (ฯ : ฮฉ) โP, A 0 ฯ = action0 โง โ (n : โ), A (n + 1) ฯ = nextA n (history A Y n ฯ)Learning.IsAlgEnvSeq.action_detAlgorithm_ae_all_eq.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} {env : Environment ๐ ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} [MeasurableEq ๐] (h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) : โแต (ฯ : ฮฉ) โP, A 0 ฯ = action0 โง โ (n : โ), A (n + 1) ฯ = nextA n (history A Y n ฯ)
Code
lemma action_detAlgorithm_ae_all_eq [MeasurableEq ๐]
(h : IsAlgEnvSeq A Y (detAlgorithm nextA h_next action0) env P) :
โแต ฯ โP, A 0 ฯ = action0 โง โ n, A (n + 1) ฯ = nextA n (history A Y n ฯ)Type uses (4)
Body uses (6)
Actions: Source ยท Open Issue
Proof
by filter_upwards [IsDeterministicAlg.action_ae_all_eq h] with ฯ hฯ using by simp [hฯ]
hasLaw_action_zero_detAlgorithm๐
Learning.IsAlgEnvSeqUntil.hasLaw_action_zero_detAlgorithmNo docstring.
Learning.IsAlgEnvSeqUntil.hasLaw_action_zero_detAlgorithm.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} {env : Environment ๐ ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} {N : โ} [MeasurableEq ๐] (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac action0) PLearning.IsAlgEnvSeqUntil.hasLaw_action_zero_detAlgorithm.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} {env : Environment ๐ ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} {N : โ} [MeasurableEq ๐] (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) : ProbabilityTheory.HasLaw (A 0) (MeasureTheory.Measure.dirac action0) P
Code
lemma hasLaw_action_zero_detAlgorithm [MeasurableEq ๐]
(h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) :
HasLaw (A 0) (Measure.dirac action0) PType uses (3)
Body uses (4)
Actions: Source ยท Open Issue
Proof
by simpa using IsDeterministicAlg.hasLaw_action_zero_of_IsAlgEnvSeqUntil h
action_zero_detAlgorithm๐
Learning.IsAlgEnvSeqUntil.action_zero_detAlgorithmNo docstring.
Learning.IsAlgEnvSeqUntil.action_zero_detAlgorithm.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} {env : Environment ๐ ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} {N : โ} [MeasurableEq ๐] (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) : A 0 =แต[P] fun x => action0Learning.IsAlgEnvSeqUntil.action_zero_detAlgorithm.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} {env : Environment ๐ ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} {N : โ} [MeasurableEq ๐] (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) : A 0 =แต[P] fun x => action0
Code
lemma action_zero_detAlgorithm [MeasurableEq ๐]
(h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) :
A 0 =แต[P] fun _ โฆ action0Type uses (3)
Body uses (4)
Used by (1)
Actions: Source ยท Open Issue
Proof
(IsDeterministicAlg.action_zero_of_IsAlgEnvSeqUntil h).trans (by simp)
action_detAlgorithm_ae_eq๐
Learning.IsAlgEnvSeqUntil.action_detAlgorithm_ae_eqNo docstring.
Learning.IsAlgEnvSeqUntil.action_detAlgorithm_ae_eq.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} {env : Environment ๐ ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} {N n : โ} [MeasurableEq ๐] (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) (hn : n < N) : A (n + 1) =แต[P] fun ฯ => nextA n (history A Y n ฯ)Learning.IsAlgEnvSeqUntil.action_detAlgorithm_ae_eq.{u_1, u_2, u_3} {๐ : Type u_1} {๐จ : Type u_2} {m๐ : MeasurableSpace ๐} {m๐จ : MeasurableSpace ๐จ} {nextA : (n : โ) โ (โฅ(Finset.Iic n) โ ๐ ร ๐จ) โ ๐} {h_next : โ (n : โ), Measurable (nextA n)} {action0 : ๐} {env : Environment ๐ ๐จ} {ฮฉ : Type u_3} {mฮฉ : MeasurableSpace ฮฉ} {P : MeasureTheory.Measure ฮฉ} [MeasureTheory.IsProbabilityMeasure P] {A : โ โ ฮฉ โ ๐} {Y : โ โ ฮฉ โ ๐จ} {N n : โ} [MeasurableEq ๐] (h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) (hn : n < N) : A (n + 1) =แต[P] fun ฯ => nextA n (history A Y n ฯ)
Code
lemma action_detAlgorithm_ae_eq [MeasurableEq ๐]
(h : IsAlgEnvSeqUntil A Y (detAlgorithm nextA h_next action0) env P N) (hn : n < N) :
A (n + 1) =แต[P] fun ฯ โฆ nextA n (history A Y n ฯ)Type uses (4)
Body uses (4)
Used by (1)
Actions: Source ยท Open Issue
Proof
(IsDeterministicAlg.action_ae_eq_of_IsAlgEnvSeqUntil h hn).trans (by simp)
-
Learning.IsDeterministicAlg -
Learning.actionZero -
Learning.nextAction -
Learning.measurable_nextAction -
Learning.p0_eq_dirac -
Learning.policy_eq_deterministic -
Learning.IsDeterministicAlg.hasLaw_action_zero_of_IsAlgEnvSeqUntil -
Learning.IsDeterministicAlg.action_zero_of_IsAlgEnvSeqUntil -
Learning.IsDeterministicAlg.action_ae_eq_of_IsAlgEnvSeqUntil -
Learning.IsDeterministicAlg.hasLaw_action_zero -
Learning.IsDeterministicAlg.action_zero_ae_eq -
Learning.IsDeterministicAlg.action_ae_eq -
Learning.IsDeterministicAlg.action_ae_all_eq -
Learning.IsDeterministicEnv -
Learning.feedbackFunZero -
Learning.measurable_feedbackFunZero -
Learning.ฮฝ0_eq_deterministic -
Learning.feedbackFun -
Learning.measurable_feedbackFun -
Learning.feedback_eq_deterministic -
Learning.IsDeterministicEnv.hasCondDistrib_feedback_zero -
Learning.IsDeterministicEnv.hasCondDistrib_feedback -
Learning.detAlgorithm -
Learning.detAlgorithm_p0 -
Learning.detAlgorithm_policy -
Learning.instIsDeterministicAlgDetAlgorithm -
Learning.actionZero_detAlgorithm -
Learning.nextAction_detAlgorithm -
Learning.detEnvironment -
Learning.instIsDeterministicEnvDetEnvironment -
Learning.feedbackFunZero_detEnvironment -
Learning.feedbackFun_detEnvironment -
Learning.IsAlgEnvSeq.hasLaw_action_zero_detAlgorithm -
Learning.IsAlgEnvSeq.action_zero_detAlgorithm -
Learning.IsAlgEnvSeq.action_detAlgorithm_ae_eq -
Learning.IsAlgEnvSeq.action_detAlgorithm_ae_all_eq -
Learning.IsAlgEnvSeqUntil.hasLaw_action_zero_detAlgorithm -
Learning.IsAlgEnvSeqUntil.action_zero_detAlgorithm -
Learning.IsAlgEnvSeqUntil.action_detAlgorithm_ae_eq