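Cheat Sheet

Posted on 2021-05-18. Edited on 2021-08-20. In RL, ML.

Permutation and Combination

Number of ways to seat k of n people in k chairs, where order matters (permutations):

$$ {}_nP_k = \frac{n!}{(n-k)!} $$

Number of ways to choose k of n people, where order does not matter (combinations):

$$ {}_nC_k = \frac{{}_nP_k}{k!} = \frac{n!}{k!\,(n-k)!} $$

RL

Notation: r is the immediate reward, R_t is the return from time t, γ is the discount factor, s' is the next state, and a' ∼ π(·|s') is the next action.

Action-value function (Bellman expectation equation):

$$ Q^\pi(s,a) = \mathbb{E}_{s'}\big[\, r + \gamma\, \mathbb{E}_{a'}[\, Q^\pi(s',a') \mid s' \,] \,\big|\, s, a, \pi \,\big] $$

Optimal value functions:

$$ V^*(s) = \max_a Q^*(s,a) $$

$$ Q^*(s,a) = \max_\pi Q^\pi(s,a), \quad \forall (s,a) \in \mathcal{S} \times \mathcal{A} $$

State-value function in terms of the action-value function:

$$ V^\pi(s) = \mathbb{E}_{a_t \sim \pi(\cdot \mid s)}[\, R_t \mid s_t = s, \pi \,] = \mathbb{E}_{a_t \sim \pi(\cdot \mid s)}\big[\, \mathbb{E}[\, R_t \mid s_t, a_t, \pi \,] \,\big|\, s_t = s \,\big] = \mathbb{E}_{a_t \sim \pi(\cdot \mid s)}[\, Q^\pi(s,a_t) \,] $$

Advantage function:

$$ A^\pi(s,a) = Q^\pi(s,a) - V^\pi(s) $$

Its expectation under the policy is zero:

$$ \mathbb{E}_{a \sim \pi(\cdot \mid s)}[\, A^\pi(s,a) \mid s \,] = \mathbb{E}_{a \sim \pi(\cdot \mid s)}[\, Q^\pi(s,a) \mid s \,] - V^\pi(s) = 0 $$

The expected one-step TD error of V^π equals the advantage:

$$ \mathbb{E}_{s'}[\, r + \gamma V^\pi(s') - V^\pi(s) \mid s, a \,] = A^\pi(s,a) $$

Adding the TD error δ to Q gives the one-step target, where \hat{T}^\pi is the sampled Bellman operator:

$$ Q(s,a) + \delta(s,a;\pi) = Q(s,a) + \hat{T}^\pi Q(s,a) - Q(s,a) = r + \gamma Q(s',a') $$

ML

True Positive Rate (Sensitivity or Recall):

$$ \mathrm{TPR} = \frac{TP}{TP + FN} $$

Specificity:

$$ \mathrm{Specificity} = \frac{TN}{TN + FP} $$

False Positive Rate (1 - Specificity):

$$ \mathrm{FPR} = \frac{FP}{TN + FP} $$

Precision:

$$ \mathrm{Precision} = \frac{TP}{TP + FP} $$

Accuracy:

$$ \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} $$

F-score:

$$ F = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} $$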
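The counting and classification formulas in this cheat sheet are easy to check numerically. The Python sketch below is one possible way to do so. The function names `permutations_count`, `combinations_count`, and `classification_metrics` are made up for illustration, and the code assumes every denominator is nonzero (for example, at least one actual positive and one predicted positive).

```python
import math


def permutations_count(n: int, k: int) -> int:
    """nPk = n! / (n - k)!: ordered ways to seat k of n people."""
    return math.factorial(n) // math.factorial(n - k)


def combinations_count(n: int, k: int) -> int:
    """nCk = nPk / k!: unordered ways to choose k of n people."""
    return permutations_count(n, k) // math.factorial(k)


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute the confusion-matrix metrics listed in the ML section.

    Assumes all denominators are nonzero (an illustrative simplification).
    """
    recall = tp / (tp + fn)            # true positive rate / sensitivity
    specificity = tn / (tn + fp)
    fpr = fp / (tn + fp)               # equals 1 - specificity
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "recall": recall,
        "specificity": specificity,
        "fpr": fpr,
        "precision": precision,
        "accuracy": accuracy,
        "f_score": f_score,
    }


if __name__ == "__main__":
    print(permutations_count(5, 3))   # 60
    print(combinations_count(5, 3))   # 10
    # Hypothetical confusion-matrix counts, chosen only for illustration.
    print(classification_metrics(tp=40, fp=10, tn=45, fn=5))
```

On Python 3.8 or later, the standard library functions `math.perm` and `math.comb` compute the same two counts directly.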