18CSE359T - NATURAL LANGUAGE PROCESSING UNIT 2 & 3 - 4M

 Differentiate deep parsing vs shallow parsing


DEEP PARSING

  • Assigns a complete syntactic structure to a sentence

  • Suitable for complex NLP applications

  • Eg: dialogue systems and summarization

  • Also called full parsing

SHALLOW PARSING

  • Parses only a limited part of the syntactic information in the given text

  • Used for less complex NLP applications

  • Eg: information extraction and text mining

  • Also called chunking
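Shallow parsing (chunking) can be sketched in a few lines of plain Python: a greedy noun-phrase chunker run over an already POS-tagged sentence. The tag set (DT, JJ, NN) and the example sentence are illustrative; a real system would use a trained chunker such as NLTK's.

```python
def chunk_noun_phrases(tagged):
    """Greedy NP chunking: an optional determiner (DT), then any number
    of adjectives (JJ), then one or more nouns (NN)."""
    words = [w for w, _ in tagged]
    tags = [t for _, t in tagged]
    chunks, i, n = [], 0, len(tags)
    while i < n:
        j = i
        if j < n and tags[j] == "DT":   # optional determiner
            j += 1
        while j < n and tags[j] == "JJ":  # adjectives
            j += 1
        start_nn = j
        while j < n and tags[j] == "NN":  # nouns
            j += 1
        if j > start_nn:                 # at least one noun -> a chunk
            chunks.append(" ".join(words[i:j]))
            i = j
        else:
            i += 1
    return chunks

sent = [("the", "DT"), ("quick", "JJ"), ("fox", "NN"),
        ("jumped", "VBD"), ("over", "IN"),
        ("the", "DT"), ("lazy", "JJ"), ("dog", "NN")]
print(chunk_noun_phrases(sent))  # ['the quick fox', 'the lazy dog']
```

Note that the chunker recovers only flat phrase boundaries, not a full parse tree, which is exactly the deep-vs-shallow distinction above.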


Explain with suitable examples of how semantics can be analyzed 

  1. Word-Level Semantics:

    • Homonyms: Words with the same spelling or pronunciation but different meanings. For example: “Bank” can refer to a financial institution or the side of a river.

    • Synonyms: Words with similar meanings. For instance: “Happy” and “joyful” are synonyms.

    • Antonyms: Words with opposite meanings. For example: “Hot” and “cold” are antonyms.

  2. Context-Level Semantics:

    • Sarcasm: When someone says, “What a wonderful day!” during a downpour, they convey the opposite meaning due to context and tone.

    • Metaphors: Describing a person as having a “heart of stone” doesn’t mean a literal rock heart; it’s a metaphor for emotional unresponsiveness.

  3. Sentence-Level Semantics:

    • Anaphora: Referring back to a previously mentioned noun or phrase. Example:

      • “John loves reading books. He spends hours at the library.”

    • Cataphora: Referring to a subsequent noun or phrase. Example:

      • “Before she arrived, Mary had already made dinner.”
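The word-level relations above can be modelled with a small hand-built lexicon; all entries below are illustrative, not drawn from a real resource such as WordNet.

```python
# Toy lexical-semantics lookup: homonym senses, synonyms, antonyms.
LEXICON = {
    "bank":  {"senses": ["financial institution", "side of a river"]},
    "happy": {"synonyms": ["joyful", "cheerful"], "antonyms": ["sad"]},
    "hot":   {"synonyms": ["warm"], "antonyms": ["cold"]},
}

def senses(word):
    return LEXICON.get(word, {}).get("senses", [])

def synonyms(word):
    return LEXICON.get(word, {}).get("synonyms", [])

def antonyms(word):
    return LEXICON.get(word, {}).get("antonyms", [])

print(senses("bank"))    # two homonymous senses
print(antonyms("hot"))   # ['cold']
```

In practice these lookups would go through a lexical database (e.g. WordNet via NLTK) rather than a hand-written dictionary.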


Explain the various levels of disambiguating text in NLP

  1. Phonetics:

    • At the sound level, disambiguation involves distinguishing between homophones such as “to” and “too.”

  2. Morphology:

    • At the word structure level, disambiguation deals with different word forms 

    • Example: “runs” may be the plural noun (as in “scored two runs”) or the third-person verb form of “run.”

  3. Syntax:

    • At the sentence structure level, disambiguation involves understanding the role of each word in a sentence.

    • Example: “The cat chased the dog” vs. “The dog chased the cat.”

  4. Semantics:

    • At the meaning level, disambiguation identifies the correct sense of a word in context.

    • Example: “Bank” (financial institution) vs. “bank” (river edge).

  5. Pragmatics:

    • At the contextual level, disambiguation considers implied meaning, speaker intentions, and shared knowledge.

    • Example: Understanding sarcasm or indirect requests.

  6. Discourse:

    • At the larger text unit level, disambiguation involves coherence across sentences and paragraphs.

    • Example: Resolving pronouns (“he,” “she”) to their antecedents.


Compare and contrast FrameNet, PropBank and VerbNet

  1. FrameNet:

    • FrameNet is primarily concerned with semantic frames. 

    • FrameNet is built from annotated corpora

    • Example: The frame “Communication” might include lexical units like “talk,” “speak,” and “converse.”

    • Used for semantic role labeling and word sense disambiguation.

  2. PropBank:

    • PropBank focuses on verb-specific predicate-argument structure, annotating the arguments (participants) of each verb.

    • PropBank annotations are derived from treebanks

    • Example: For the verb “give,” PropBank labels numbered arguments such as Arg0 (the giver), Arg1 (the thing given), and Arg2 (the recipient).

    • Used for semantic parsing and information extraction.

  3. VerbNet:

    • VerbNet classifies verbs into verb classes based on their syntactic and semantic properties.

    • VerbNet is manually constructed by linguists

    • Example: The verb class “Give” includes verbs like “donate,” “hand,” and “pass.”

    • Used in verb sense disambiguation, machine translation, and semantic role labeling.
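The contrast between the three resources can be summarised as a small data structure. The labels below mimic each resource's conventions (a named frame for FrameNet, numbered Arg roles for PropBank, a verb class for VerbNet), but the entries are hand-written for illustration, not extracted from the actual resources.

```python
# Schematic comparison of how each resource describes the verb "give".
resources = {
    "FrameNet": {"unit": "semantic frame",
                 "give": {"frame": "Giving",
                          "roles": ["Donor", "Theme", "Recipient"]}},
    "PropBank": {"unit": "verb-specific roleset",
                 "give": {"roleset": "give.01",
                          "roles": ["Arg0 (giver)",
                                    "Arg1 (thing given)",
                                    "Arg2 (entity given to)"]}},
    "VerbNet":  {"unit": "verb class",
                 "give": {"class": "give-13.1",
                          "members": ["give", "donate", "hand", "pass"]}},
}

for name, entry in resources.items():
    print(name, "->", entry["give"])
```

The key difference surfaces in the role labels: FrameNet uses frame-specific named roles, PropBank uses numbered verb-specific arguments, and VerbNet groups whole verbs into shared classes.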


What is an RDF tuple? Where is this used?

  1. RDF Tuple (Triple):

    • An RDF tuple, also known as an RDF triple, is the atomic data entity in the Resource Description Framework (RDF).

    • It consists of three components that codify a statement about semantic data:

      • Subject: Represents the resource being described (e.g., “Bob”).

      • Predicate: Specifies the relationship between the subject and the object (e.g., “knows”).

      • Object: Describes the related resource (e.g., “John”).

  2. Usage:

    • Semantic Web

    • Applications such as creating NPCs (non-player characters) in games, data integration, and semantic search

    • Web Ontology Language (OWL): OWL, as part of the Semantic Web, relies on RDF for representing ontologies.
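A subject-predicate-object store can be sketched in plain Python. In practice a library such as rdflib would be used; this stdlib-only sketch just shows the triple model and wildcard pattern matching.

```python
# Minimal in-memory RDF-style triple store.
triples = set()

def add(s, p, o):
    """Record one statement as a (subject, predicate, object) triple."""
    triples.add((s, p, o))

def query(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [(ts, tp, to) for ts, tp, to in triples
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)]

add("Bob", "knows", "John")
add("Bob", "age", "34")
print(query(s="Bob", p="knows"))  # [('Bob', 'knows', 'John')]
```

The wildcard query mirrors how SPARQL matches triple patterns against an RDF graph.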


What are the challenges in intent detection

  • Intent detection is the process of algorithmically identifying user intent from a given statement

Challenges: 

  • One of the biggest challenges in building successful intent detection is natural language processing itself. It is a vast field in its own right, requiring expertise and cooperation between computer science and linguistics.

  • A solid dataset is the backbone of every capable chatbot, and its learning capability is what makes or breaks intent detection. This is where machine learning comes in.

  • Machine learning brings its own challenges, such as preparing a solid initial training dataset, making sure the system recognizes colloquialisms in user input, and investing enough time and effort. It is an extensive process, but proper tooling makes it easier.
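At its simplest, intent detection can be sketched as keyword overlap between the utterance and each intent's vocabulary. The intent names and keyword sets below are illustrative; a real system would train a statistical classifier on a labelled dataset, which is exactly where the dataset and colloquialism challenges above come in.

```python
import re

# Illustrative intents with hand-picked keyword sets.
INTENTS = {
    "check_balance": {"balance", "account", "much", "money"},
    "transfer":      {"send", "transfer", "pay"},
    "greeting":      {"hello", "hi", "hey"},
}

def detect_intent(utterance):
    """Score each intent by keyword overlap; fall back to 'unknown'."""
    tokens = set(re.findall(r"[a-z]+", utterance.lower()))
    scores = {intent: len(tokens & kws) for intent, kws in INTENTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

print(detect_intent("How much money is in my account?"))  # check_balance
```

A keyword matcher like this fails on paraphrases and slang ("what's my cash situation?"), which illustrates why learned models are needed.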


With suitable examples, explain anaphora and cataphora

Anaphora: 

  • The use of a pronoun or word that refers back to a previously mentioned noun or phrase in a sentence or text

  • The word occurring before the anaphor is called the antecedent

  • Example: “John loves reading books. He spends hours at the library.” (“He” refers back to “John”)

Cataphora: 

  • The use of a pronoun or word that refers forward to a noun or phrase appearing later in a sentence or text

  • The word occurring after the cataphor is called the postcedent

  • Example: “Before she arrived, Mary had already made dinner.” (“she” refers forward to “Mary”)
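A naive anaphora resolver can be sketched as "resolve each pronoun to the most recent preceding proper or common noun." Tokens arrive pre-tagged; a real resolver would also check gender, number, and syntactic constraints. The tags and sentence are illustrative.

```python
PRONOUNS = {"he", "she", "it", "they"}

def resolve_anaphora(tagged):
    """Return {pronoun_index: antecedent_word} using recency only."""
    resolved, last_noun = {}, None
    for i, (word, tag) in enumerate(tagged):
        if tag in ("NN", "NNP"):          # track the most recent noun
            last_noun = word
        elif word.lower() in PRONOUNS and last_noun is not None:
            resolved[i] = last_noun       # bind pronoun to it
    return resolved

sent = [("John", "NNP"), ("loves", "VBZ"), ("books", "NNS"),
        (".", "."), ("He", "PRP"), ("reads", "VBZ"), ("daily", "RB")]
print(resolve_anaphora(sent))  # {4: 'John'}
```

Recency alone fails on sentences with multiple candidate antecedents, which is why practical coreference systems use richer features or learned models.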


Explain in detail about the various approaches and methods for word sense disambiguation

Word sense disambiguation:

  • WSD in NLP may be defined as the ability to determine which meaning of word is activated by the use of word in a particular context

  • It resolves semantic ambiguity

Approaches and methods to WSD:

  • Dictionary-based or knowledge-based methods

  • Supervised methods

  • Semi-supervised methods

  • Unsupervised methods

Applications:

  • Machine translation

  • Information retrieval

  • Text mining and information extraction

Difficulties: 

  • Differences between dictionaries

  • Different algorithms for different applications

  • Word sense discreteness

  • Inter judge variance
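The dictionary-based approach can be illustrated with a simplified Lesk algorithm: choose the sense whose dictionary gloss shares the most words with the surrounding context. The glosses below are hand-written stand-ins for real dictionary definitions.

```python
# Illustrative sense inventory for the ambiguous word "bank".
SENSES = {
    "bank": {
        "financial": "an institution that accepts deposits and lends money",
        "river":     "sloping land beside a body of water such as a river",
    }
}

def simplified_lesk(word, context):
    """Pick the sense whose gloss overlaps most with the context words."""
    context_words = set(context.lower().split())

    def overlap(sense):
        return len(context_words & set(SENSES[word][sense].split()))

    return max(SENSES[word], key=overlap)

print(simplified_lesk("bank", "he sat on the bank of the river fishing"))
# river
```

Supervised WSD methods replace the gloss-overlap score with a classifier trained on sense-annotated examples, but the task formulation is the same.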


What is RST? Explain and give an example of Evidence Relation

RST:

  • Rhetorical Structure Theory is a model of text organization that was originally proposed for the study of text generation

  • It is based on a set of 23 rhetorical relations that can hold between spans of text within a discourse

Evidence relation:

  • In the evidence relation, a satellite presents evidence for the proposition or situation expressed in the nucleus

  • Notation: N - nucleus, S - satellite, R - reader, W - writer

  • Example: “Kevin must be here. His car is parked outside.” The second span (S) provides evidence for the claim in the first span (N); the intended effect is that R’s belief in N is increased.


With suitable examples, give coherence to the entire discourse

  • Discourse is a structured, coherent group of sentences about a subject

  • Discourses are collections of coherent sentences, not arbitrary sets of sentences

Coherence to the discourse:

  • John went to the bank to deposit his paycheck. (S1)

  • He then took a train to Bill’s car dealership. (S2)

  • He needed to buy a car. (S3)

  • The company he works for now isn’t near any public transportation. (S4)

  • He also wanted to talk to Bill about their softball league. (S5)

  • The discourse is coherent because each sentence connects to the others: S3 gives the reason for the trip in S2, S4 explains the need stated in S3, and S5 adds a further motivation for the visit.


