Symbolic Machine Learning (Symbolické strojové učení)

  • Course pages: Symbolické strojové učení
  • Lecturers: Jiří Kléma, Filip Železný
  • Teaching assistants: Jáchym Barvínek, Ondřej Hubáček, Petr Ryšavý, Martin Svatoš

Tutorials

Exam

06.06.2018

  1. (5 pnts) Difference between a PAC-learning agent and a mistake-bound agent (a short definition recap follows this list).
    • (2 pnts) What does it mean for an agent to learn in each of the two frameworks?
    • (3 pnts) What does it mean for it to learn efficiently? Online?
  2. (10 pnts) Version-space agent. Two agents with different hypothesis spaces are given. The first uses all possible 3-conjunctions (non-negated variables only) of n variables. The second uses all n-conjunctions over positive and negative literals.
    • (3 pnts) For each agent: does it learn online?
    • (3 pnts) For each agent: does it learn efficiently?
    • (4 pnts) For the first agent: given the first negative observation (0,1,1,1,…,1), what will be the agent's decision on the next observation (0,1,0,1,…)?
  3. (15 pnts) Relative Least General Generalization (rlgg). Given background knowledge B = {half(4,2), half(2,1), int(2), int(1)}. What will be the rlgg of o1 = even(4) and o2 = even(2) relative to the background?
    • (10 pnts) Apply the algorithm, draw the tables and the theta substitutions (a small lgg sketch follows this list).
    • (5 pnts) Make a reduction step relative to B. Why is it needed?
  4. (10 pnts) Bayesian networks.
    • (2 pnts) Find an optimal, efficient, and complete network (something like Season → Temperature, where Temperature has two children: Ice Cream Sales and Heart Attack Rate).
    • (2 pnts) Then compute CPT (conditional probability tables).
    • (3 pnts) Compute Pr(Spring|Good Ice Cream Sales, No Heart Attack).
    • (3 pnts) Compute Pr(Heart Attack|Winter, Bad Sales).
  5. (5 pnts) Q-learning. Five short questions; answer True/False and justify your reasoning.
    • (1 pnt) Can Q-learning be extended to infinite states or action space? How would it handle this?
    • (1 pnt) Does Q-learning use on-policy update? What is the difference from off-policy update?
    • (1 pnt) Does Q-learning always converge? If so, is it conditioned by anything? By what?
    • (1 pnt) Is Q-learning just an instance of temporal difference learning? If not, what is different?
    • (1 pnt) What is the difference between Q-learning and direct utility estimation or adaptive dynamic programming? Which is better?
  6. (5 pnts) Q-learning representation.
    • A robot moves in a swimming pool. It can move along any of the 3 dimensions and has exactly one propeller per dimension; it can also move at two different speeds. There is a treasure at a specific place and depth, and there are mines at some places as well. If the robot hits a mine or a wall, it restarts at a random position.
    • (3 pnts) Describe the states, actions, and rewards of this specific game. You may provide two different representations.
    • (2 pnts) Describe the Q-learning representation, the update rule, and the gamma and alpha values. How are the Q-values defined? (An update-rule sketch follows this list.)
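
A short recap of the definitions behind question 1 (standard computational-learning-theory formulations, not necessarily the exact course wording):

<code latex>
% PAC learning: for every target c in C, every distribution D and every
% eps, delta in (0,1), the learner outputs a hypothesis h such that
\Pr_{S \sim D^m}\big[\operatorname{error}_D(h) \le \varepsilon\big] \ge 1 - \delta
% from m = poly(1/eps, 1/delta, n, size(c)) examples; "efficient" additionally
% requires polynomial running time.
%
% Mistake-bound (online) learning: for every target c in C and every sequence
% of examples, the number of prediction mistakes is bounded by a polynomial:
\#\text{mistakes} \le \operatorname{poly}\big(n, \operatorname{size}(c)\big)
</code>

For question 3, a minimal Python sketch of Plotkin's lgg on ground atoms, which is the core step of rlgg: rlgg(o1, o2 | B) is the lgg of the clauses (o1 :- B) and (o2 :- B), followed by reduction with respect to B. This is an illustrative sketch, not the official course algorithm or the expected table layout.

<code python>
# lgg of ground atoms; terms are plain constants here for simplicity.
def lgg_term(s, t, table):
    if s == t:
        return s
    # different constants generalise to a variable; the same pair of constants
    # must always map to the same variable (this is the "theta" table)
    if (s, t) not in table:
        table[(s, t)] = f"V{len(table)}"
    return table[(s, t)]

def lgg_atom(a, b, table):
    """lgg of two atoms with the same predicate symbol, otherwise None."""
    (p, args_a), (q, args_b) = a, b
    if p != q or len(args_a) != len(args_b):
        return None
    return (p, tuple(lgg_term(s, t, table) for s, t in zip(args_a, args_b)))

def rlgg(e1, e2, background):
    """lgg of the clauses (e1 :- background) and (e2 :- background)."""
    table = {}
    head = lgg_atom(e1, e2, table)
    body = [l for l in (lgg_atom(a, b, table)
                        for a in background for b in background) if l]
    return head, body, table

B = [("half", ("4", "2")), ("half", ("2", "1")), ("int", ("2",)), ("int", ("1",))]
head, body, theta = rlgg(("even", ("4",)), ("even", ("2",)), B)
print(head)   # ('even', ('V0',)) with V0 = lgg(4, 2)
print(theta)  # which pairs of constants became which variables
print(body)   # un-reduced body; redundant literals disappear in the reduction step
</code>

For questions 5 and 6, a minimal sketch of the tabular Q-learning update; the state/action encoding for the pool robot is one possible choice, not the exam's reference solution.

<code python>
from collections import defaultdict
import random

Q = defaultdict(float)            # Q[(state, action)], implicitly 0 at the start
alpha, gamma, epsilon = 0.1, 0.9, 0.1

def update(state, action, reward, next_state, actions):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))"""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

def policy(state, actions):
    """Epsilon-greedy action selection over the current Q table."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

# Example encoding: state = discretised (x, y, depth) cell,
# action = (axis, direction, speed) for one propeller step.
actions = [(axis, d, s) for axis in "xyz" for d in (-1, 1) for s in (1, 2)]
update((0, 0, 1), ("x", 1, 2), reward=-1, next_state=(1, 0, 1), actions=actions)
</code>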

31.5.2021

  1. (5 pnts) Probabilities
    • (1 pnt) Mathematically describe conditional independence (a one-line recap follows this list).
    • (2 pnts) If X and Y are conditionally independent given W, and Y and Z are conditionally independent given W, does this imply that X and Z are conditionally independent given W?
    • (1 pnt) Can the graph of a Bayesian network contain a cycle?
    • (1 pnt) Is it true that, given its parents, X is independent of all other variables in the graph? If not, correct the statement.
  2. (5 pnts) Factor computation - a graph is given; compute a conditional probability (an enumeration sketch follows this list)
  3. (10 pnts) Version-space agent. Two agents with different hypothesis spaces are given. The first uses all possible 3-conjunctions (non-negated variables only) of n variables. The second uses all n-conjunctions over positive and negative literals.
    • (3 pnts) For each agent: does it learn online?
    • (3 pnts) For each agent: does it learn efficiently?
    • (4 pnts) For the first agent: given the first negative observation (0,1,1,1,…,1), what will be the agent's decision on the next observation (0,1,0,1,…)?
  4. (15 pnts) RLGG relative to B
    • (10 pnts) Compute RLGG relative to B: RLGG(B→X1, B→X2).
    • (5 pnts) Make a reduction of the previous result.
  5. (5 pnts) A theoretical question: if a concept-learning agent learns, does the PAC agent learn as well? Something along these lines…
  6. (10 pnts) Reinforcement learning
    • (5 * 1 pnt) Short questions about ADP and TD learning: briefly describe them, compare their convergence and speed, and explain how they compute utilities…
    • (4 pnts) Three identical grids with different optimal policies are given. There are 2 terminal states, one with reward +1 and one with reward -1; the remaining states have negative rewards: r_a in grid A, r_b in grid B, r_c in grid C. Based on the policies, determine which of these rewards has the largest and which the smallest absolute value; justify your answer.
    • (1 pnt) In the grid the agent behaves according to the optimal policy, yet in two different episodes it visits different sequences of states. Why is this possible?
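
For question 1, conditional independence in one line (standard definition):

<code latex>
% X and Z are conditionally independent given W  (X \perp Z \mid W)  iff
P(X, Z \mid W) = P(X \mid W)\,P(Z \mid W)
\quad\Longleftrightarrow\quad
P(X \mid Z, W) = P(X \mid W)
</code>

For question 2, a toy sketch of computing a conditional probability by summing out hidden variables (enumeration). The chain A -> B -> C and all numbers below are made up for illustration; the exam used a different network.

<code python>
P_A = {1: 0.3, 0: 0.7}
P_B_given_A = {(1, 1): 0.8, (1, 0): 0.2, (0, 1): 0.1, (0, 0): 0.9}  # keyed by (a, b)
P_C_given_B = {(1, 1): 0.6, (1, 0): 0.4, (0, 1): 0.5, (0, 0): 0.5}  # keyed by (b, c)

def joint(a, b, c):
    """Joint probability from the chain factorisation P(A) P(B|A) P(C|B)."""
    return P_A[a] * P_B_given_A[(a, b)] * P_C_given_B[(b, c)]

# P(A = 1 | C = 1): sum out the hidden variable B, then normalise.
num = sum(joint(1, b, 1) for b in (0, 1))
den = sum(joint(a, b, 1) for a in (0, 1) for b in (0, 1))
print(num / den)
</code>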

2.6.2023 & 6.6.2023

  1. RL
    • TD learning / SARSA / Q-learning: calculate the value function, then decide how V/Q changes with different gamma/alpha parameters (an update-rule recap follows this list)
    • Convergence of TD
    • Definition of some hyperparameters, the value function
  2. BN
    • Given a network, count how many parameters we need
    • Given a network, determine the independent set R
    • Definition of conditional independence; prove that a causal chain gives conditional independence (a short derivation follows this list)
  3. NLP
    • Naive Bayes, bag of words (10 pnts)
    • Word embeddings - what they are, how to obtain them, etc. (4 pnts)
    • TF-IDF - formula
    • Transformers - main idea, diagram
    • Co-occurrence matrix, PPMI, truncated SVD (10 pnts)
    • Perplexity - formula
    • LDA, Dirichlet dist.
    • RNN diagram
  4. CoLT
    • Learning conjunctions; prove that the mistake bound is n+1; describe an algorithm for k-term DNF (an elimination-algorithm sketch follows this list)
    • Decide whether something is efficiently/properly PAC-learnable, given that it is a subset of something that is efficiently PAC-learnable
    • Adapt Winnow / the Halving algorithm; define |H|; define the mistake bound
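
The update rules asked about in item 1, side by side (learning rate alpha, discount gamma):

<code latex>
% TD(0) state-value update:
V(s) \leftarrow V(s) + \alpha\big(r + \gamma V(s') - V(s)\big)
% SARSA (on-policy):
Q(s,a) \leftarrow Q(s,a) + \alpha\big(r + \gamma Q(s',a') - Q(s,a)\big)
% Q-learning (off-policy):
Q(s,a) \leftarrow Q(s,a) + \alpha\big(r + \gamma \max_{a'} Q(s',a') - Q(s,a)\big)
</code>

The causal-chain derivation from item 2 (the standard argument):

<code latex>
% Causal chain X -> Y -> Z: the joint distribution factorises as
P(x, y, z) = P(x)\,P(y \mid x)\,P(z \mid y)
% hence
P(z \mid x, y) = \frac{P(x, y, z)}{P(x, y)}
              = \frac{P(x)\,P(y \mid x)\,P(z \mid y)}{P(x)\,P(y \mid x)}
              = P(z \mid y)
% i.e. X and Z are conditionally independent given Y.
</code>

For item 4, a sketch of the standard elimination (consistency) algorithm for online learning of conjunctions over literals x_i / not-x_i. Mistakes happen only on positive examples and each removes at least one literal; the first mistake removes n of the 2n initial literals, which gives the n+1 mistake bound. The target concept below is made up for illustration.

<code python>
def make_learner(n):
    # hypothesis = set of kept literals; (i, True) means x_i, (i, False) means not-x_i
    hypothesis = {(i, v) for i in range(n) for v in (True, False)}

    def predict(x):
        return all(x[i] == v for (i, v) in hypothesis)

    def update(x, label):
        nonlocal hypothesis
        if predict(x) != label and label:          # mistake on a positive example
            hypothesis = {(i, v) for (i, v) in hypothesis if x[i] == v}
        return hypothesis

    return predict, update

# Hypothetical target concept (unknown to the learner): x0 AND not-x2, with n = 3.
predict, update = make_learner(3)
for x in [(1, 0, 0), (1, 1, 0), (0, 1, 0)]:
    label = (x[0] == 1 and x[2] == 0)
    update(x, label)
print(predict((1, 0, 0)))   # True: consistent with the examples seen so far
</code>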

23.6.2023

  1. (~10-12 pnts) RL
    • (2 pnts) True/False. Two policies are given: 1) a not necessarily greedy policy, 2) a greedy policy. Q1 are the Q-values of the 1st policy at time t, Q2 the values of the 2nd policy at time t, and Q3 the values of the 2nd policy at time t+1. What is the relationship between these three sets of values?
    • (~3-4 pnts) Describe Monte Carlo methods and the differences between First-Visit and Every-Visit.
    • (1 pnt) Describe GLIE.
    • (~4-5 pnts) Something about bandits.
  2. (5 pnts) BN
    • (2 pnts) ?
    • (3 pnts) Calculate the conditional probability P(A=0|E=1). (There was a network with 8 nodes, but it could be reduced to 3 nodes by elimination, so the calculation was easy.)
  3. (~16-20 pnts) NLP
    • (10 pnts) Four sentences are given (as in the sample exam); each sentence is a document.
      • Create a doc-term matrix
      • Describe the TF-IDF formula and calculate TF-IDF for the doc-term matrix (a worked sketch follows this list)
      • Draw a diagram of truncated SVD
    • (4 pnts) Something about softmax and negative log-likelihood
    • (4 pnts) Something about CNNs; draw a diagram
  4. (14 pnts) CoLT
    • (5 pnts) Define the VC dimension and determine the VC dimension of a given hypothesis class (a definition recap follows this list)
    • (8 pnts) Some decision tree was given with depth = 4.
      • Express the tree as a 4-DNF
      • Express the tree as a 3-CNF.
      • How can we learn k-decision trees in the PAC learning model?
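
For item 3, a sketch of the doc-term / TF-IDF / truncated-SVD pipeline on made-up sentences (the exam used different documents, and TF-IDF weightings other than the tf * log(N/df) variant used here exist):

<code python>
import numpy as np

docs = ["the cat sat", "the dog sat", "the cat ran", "a dog ran"]
vocab = sorted({w for d in docs for w in d.split()})
counts = np.array([[d.split().count(w) for w in vocab] for d in docs], dtype=float)

N = len(docs)
df = (counts > 0).sum(axis=0)        # document frequency of each term
tfidf = counts * np.log(N / df)      # one common tf-idf variant: tf * log(N/df)

U, S, Vt = np.linalg.svd(tfidf, full_matrices=False)
k = 2
doc_embeddings = U[:, :k] * S[:k]    # rank-k (truncated SVD) document representation
print(vocab)
print(np.round(tfidf, 2))
print(np.round(doc_embeddings, 2))
</code>

For item 4, the VC-dimension definition in one line (standard formulation):

<code latex>
% VC dimension of a hypothesis class H: the size of the largest point set that
% H shatters, i.e. on which H realises all 2^m labelings:
\operatorname{VCdim}(H) = \max\{\, m : \exists\, x_1,\dots,x_m \text{ shattered by } H \,\}
% e.g. thresholds on the real line have VC dimension 1, intervals have 2.
</code>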

Sample problems from 2019 + solutions

Sample problems from 2023
