What best describes tokenization?

A.    Adding lexical relations to the raw text.
B.    Converting text into a list of terms.
C.    Converting text into a list of unique terms.
D.    Reducing variant forms of tokens to their base forms.

Answer: B
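
The difference between options B and C can be seen in a minimal sketch. It assumes a simple regex-based tokenizer; the `tokenize` function and the sample sentence are illustrative only, not part of any particular library. Tokenization keeps repeated terms, while building a vocabulary of unique terms is a separate step.

```python
import re

def tokenize(text):
    # Hypothetical helper: lowercase the text and pull out word-like runs.
    # Duplicates are kept and the original order is preserved.
    return re.findall(r"[a-z0-9']+", text.lower())

tokens = tokenize("The cat sat on the mat.")

# Deduplicating the tokens gives a vocabulary, which is what option C describes.
unique_terms = sorted(set(tokens))

print(tokens)
print(unique_terms)
```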

What does YARN provide over and above MapReduce?

A.    Separate cluster and resource management
B.    Parallelized processing
C.    Serialized processing
D.    Access to HDFS data

Answer: A

Consider the two sentences below:
– I mailed my credit card application to the bank.
– We walked along the river bank until we came to a waterwheel.
What type of NLP ambiguity might occur when interpreting the word “bank”?

A.    Discourse
B.    Syntactic
C.    Semantic
D.    Acoustic

Answer: C

What is a characteristic of lemmatization?

A.    Can be performed by calling the synset() function on a lemma in NLTK.
B.    Can be performed by calling the lemma() function on a synset in NLTK.
C.    Reduces words of variant forms to their base forms based on a set of heuristics.
D.    Reduces words of variant forms to their base forms based on a dictionary.

Answer: D
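
The contrast between options C and D can be sketched as follows. This is a toy illustration under stated assumptions: `LEMMA_DICT`, `lemmatize`, and `crude_stem` are hypothetical names, the dictionary holds only a few hand-picked entries, and the suffix rules are far simpler than those of a real stemmer.

```python
# Hypothetical, tiny lookup table standing in for a real lemma dictionary.
LEMMA_DICT = {"went": "go", "mice": "mouse", "better": "good", "running": "run"}

def lemmatize(word):
    # Dictionary-based: look the word up; fall back to the lowercased word.
    w = word.lower()
    return LEMMA_DICT.get(w, w)

def crude_stem(word):
    # Heuristic-based: strip a common suffix without consulting a dictionary.
    w = word.lower()
    for suffix in ("ing", "ed", "s"):
        if w.endswith(suffix) and len(w) > len(suffix) + 2:
            return w[: -len(suffix)]
    return w

for w in ("mice", "running", "went"):
    print(w, lemmatize(w), crude_stem(w))
```

The dictionary lookup handles irregular forms such as "mice" and "went", which suffix-stripping heuristics leave unchanged.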

What elements are needed to determine the time complexity of finding all the cliques of size k in social network analysis?

A.    Eigenvector centrality and betweenness.
B.    Clique size and total number of nodes in the network.
C.    Number of edges in the network and centrality measure of the cliques.
D.    Clique size and betweenness centrality.

Answer: B
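
A brute-force sketch shows why only the clique size k and the node count n drive the cost. It is an illustration, not an optimized algorithm: the `k_cliques` function and the sample graph are hypothetical. The loop examines every one of the C(n, k) node subsets and runs k(k-1)/2 edge checks on each, which gives roughly O(n^k · k^2) work.

```python
from itertools import combinations

def k_cliques(nodes, edges, k):
    # Store edges as unordered pairs for constant-time membership tests.
    edge_set = {frozenset(e) for e in edges}
    found = []
    # Enumerate all C(n, k) candidate subsets of nodes.
    for subset in combinations(nodes, k):
        # A subset is a clique if every pair of its nodes is connected.
        if all(frozenset(pair) in edge_set for pair in combinations(subset, 2)):
            found.append(subset)
    return found

nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
triangles = k_cliques(nodes, edges, 3)
print(triangles)
```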

