Systems

69 - Relation Extraction

Extracting semantic relationships from a text, to make a connection between entities.

Relationship extraction is the task of extracting semantic relationships from a text. A relation can be defined as a connection between entities. There are different ways of extracting relations:

  • Simple deduction: use the presence of two entities in the same sentence (or paragraph) as an unnamed relation.
  • Predicate logic: use the Dependency tags to define semantic relation-queries like (subject, verb, object) where the verb defines the relation.
  • Hearst Patterns: use the POS-tags to extract Hearst Patterns, which are hierarchical relations based on semantic information. Hearst Patterns are used to extract hypernym relations. A hyponym (e.g. Shakespeare) is in a type-of relationship with its hypernym (e.g. author). These are important for extracting tuples for ontologies.

Examples of Hearts Patterns (source)

  • Word2vec similarity: use vector calculations to define relations, like in Gensim:
    import gensim 

    model = gensim.models.Word2Vec.load('model-01')  
    model.most_similar(positive=[' **father** ', ' **son** '], negative=[' **mother** ']) 

    >>> [(' **daughter** ', 0.8783684968948364)] 
  • Question Answering and Slot Filling: ask a question in a certain relationship-template and use the answer to fill the slot.
    Template: husband_of = ”Who is the husband of [PERSON]?”  
    Question: ”Who is the husband of Michelle?”  
    Answer  : ”Barack”  
    Relation: Barack --> husband_of --> Michelle 
  • Transformer for Relation Extraction: Use deeplearning for relation extraction. TACRED, with 106k sentence-level examples and 41 relation types, and DocRED, with 107k document-level examples and 96 relation types, are good relation extraction datasets to train models on.


This article is part of the project Periodic Table of NLP Tasks. Click to read more about the making of the Periodic Table and the project to systemize NLP tasks.