Large Language Models: Test Your Knowledge

  1. How many 2-grams (bigrams) are present in the following phrase:

    they visited New York last week

  2. Which attributes of large language models help them make better predictions than other types of language models? (Choose all that apply)

  3. True or False: A full Transformer consists of both an encoder and a decoder.

  4. An LLM is trained on a large corpus of data that includes the following example:

    My cousin's new fashion line is so cool!

    What mechanism helps the LLM learn that, in this sentence, "cool" most likely means "great" and does not refer to the temperature of the clothing?

  5. Which of the following statements about fine-tuning vs. distilling is true?