zjunlp/DeepKE

[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction

Top Related Projects

  • OpenNRE: An Open-Source Package for Neural Relation Extraction (NRE)
  • PaddleNLP: 👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.
  • DGL: Python package built to ease deep learning on graphs, on top of existing DL frameworks.
  • PyTorch-BigGraph: Generate embeddings from large-scale graph-structured data.
  • OpenNMT-py: Open Source Neural Machine Translation and (Large) Language Models in PyTorch
  • AllenNLP: An open-source NLP research library, built on PyTorch.

Quick Overview

DeepKE is an open-source knowledge extraction toolkit supporting low-resource, document-level, and multimodal scenarios. It provides a unified framework for various knowledge extraction tasks, including named entity recognition, relation extraction, and attribute extraction. DeepKE aims to make knowledge extraction more accessible and efficient for researchers and practitioners.

Pros

  • Supports multiple knowledge extraction tasks in a unified framework
  • Offers pre-trained models and easy-to-use interfaces for quick deployment
  • Provides support for low-resource scenarios, making it useful for languages or domains with limited data
  • Includes multimodal capabilities, allowing for knowledge extraction from text and images

Cons

  • May have a steeper learning curve for users unfamiliar with knowledge extraction concepts
  • Documentation could be more comprehensive, especially for advanced use cases
  • Limited support for languages other than English and Chinese
  • Performance may vary depending on the specific task and dataset

Code Examples

  1. Named Entity Recognition (NER):
from deepke.name_entity_re import *

# Load pre-trained NER model
model = NERModel("bert", "bert-base-chinese", labels=["PER", "ORG", "LOC"])

# Perform NER on a given text
text = "张三在北京大学工作"
result = model.predict(text)
print(result)
  2. Relation Extraction:
from deepke.relation_extraction import *

# Load pre-trained relation extraction model
model = REModel("bert", "bert-base-chinese")

# Extract relations from a sentence
sentence = "苹果公司的总部位于加利福尼亚州"
subject = "苹果公司"
obj = "加利福尼亚州"  # avoid shadowing the built-in name `object`
result = model.predict(sentence, subject, obj)
print(result)
  3. Attribute Extraction:
from deepke.attribute_extraction import *

# Load pre-trained attribute extraction model
model = AEModel("bert", "bert-base-chinese")

# Extract attributes from a given text
text = "这款手机的屏幕尺寸为6.1英寸,电池容量为3000mAh"
result = model.predict(text)
print(result)

Getting Started

To get started with DeepKE, follow these steps:

  1. Install DeepKE:
pip install deepke
  2. Import the desired module:
from deepke.name_entity_re import NERModel
from deepke.relation_extraction import REModel
from deepke.attribute_extraction import AEModel
  3. Load a pre-trained model and use it for prediction:
model = NERModel("bert", "bert-base-chinese", labels=["PER", "ORG", "LOC"])
result = model.predict("张三在北京大学工作")
print(result)

For more detailed instructions and advanced usage, refer to the official DeepKE documentation.

Competitor Comparisons

OpenNRE: An Open-Source Package for Neural Relation Extraction (NRE)

Pros of OpenNRE

  • Focuses specifically on neural relation extraction, providing a more specialized toolkit
  • Offers pre-trained models for quick deployment and testing
  • Includes a comprehensive evaluation module for model performance analysis

Cons of OpenNRE

  • Limited to relation extraction tasks, while DeepKE covers a broader range of knowledge extraction tasks
  • Less flexibility in terms of customization and integration with other NLP tasks
  • Smaller community and fewer updates compared to DeepKE

Code Comparison

OpenNRE:

from opennre import encoder, model, framework

# Define the model
re_model = model.SoftmaxNN(  # avoid shadowing the imported `model` module
    ckpt='bert-base-uncased',
    encoder=encoder.BERTEncoder(max_length=80)
)

# Load data and train
framework.train_model(re_model, train_path='train.txt', val_path='val.txt')

DeepKE:

from deepke import extraction

# Initialize and train the model
extraction.set_config(task='relation_extraction', model='bert')
extraction.train()

# Predict using the trained model
extraction.predict(text="Apple Inc. was founded by Steve Jobs.")

PaddleNLP: 👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.

Pros of PaddleNLP

  • Comprehensive NLP toolkit with a wide range of pre-trained models and datasets
  • Seamless integration with PaddlePaddle deep learning framework
  • Extensive documentation and tutorials for ease of use

Cons of PaddleNLP

  • Primarily focused on Chinese NLP tasks, which may limit its applicability for other languages
  • Steeper learning curve for users not familiar with PaddlePaddle ecosystem

Code Comparison

PaddleNLP:

from paddlenlp import Taskflow

ner = Taskflow("ner")
result = ner("华为是一家总部位于广东省深圳市的中国大型通信设备公司")
print(result)

DeepKE:

from deepke.name_entity_re import NER

ner = NER("bert")
result = ner.predict("华为是一家总部位于广东省深圳市的中国大型通信设备公司")
print(result)

Both repositories provide tools for natural language processing tasks, with a focus on named entity recognition in this example. PaddleNLP offers a more comprehensive toolkit within the PaddlePaddle ecosystem, while DeepKE provides a specialized framework for knowledge extraction tasks. The choice between them depends on the specific requirements of the project and familiarity with the respective ecosystems.

DGL: Python package built to ease deep learning on graphs, on top of existing DL frameworks.

Pros of DGL

  • Broader scope: Focuses on general graph neural networks, applicable to various domains
  • More mature project with larger community and extensive documentation
  • Supports multiple deep learning frameworks (PyTorch, MXNet, TensorFlow)

Cons of DGL

  • Steeper learning curve due to its more general-purpose nature
  • May require more code for specific knowledge extraction tasks
  • Less specialized for knowledge graph and relation extraction tasks

Code Comparison

DeepKE example (named entity recognition):

from deepke.name_entity_re.standard import *

model = NERModel('bert', 'bert-base-uncased', num_labels=9)
model.train_model(train_data)
predictions = model.predict(test_data)

DGL example (graph neural network):

import dgl
import torch.nn as nn
import torch.nn.functional as F

class GCN(nn.Module):
    def __init__(self, in_feats, h_feats, num_classes):
        super(GCN, self).__init__()
        self.conv1 = dgl.nn.GraphConv(in_feats, h_feats)
        self.conv2 = dgl.nn.GraphConv(h_feats, num_classes)

    def forward(self, g, in_feat):
        h = self.conv1(g, in_feat)
        h = F.relu(h)  # non-linearity between the two graph conv layers
        h = self.conv2(g, h)
        return h

PyTorch-BigGraph: Generate embeddings from large-scale graph-structured data.

Pros of PyTorch-BigGraph

  • Designed for large-scale graph embedding, capable of handling billions of nodes and edges
  • Supports distributed training across multiple machines for improved performance
  • Offers a variety of loss functions and edge sampling techniques

Cons of PyTorch-BigGraph

  • Focused primarily on graph embeddings, less versatile for other NLP tasks
  • Steeper learning curve due to its specialized nature and distributed computing features
  • Less active development and community support compared to DeepKE

Code Comparison

PyTorch-BigGraph:

import torchbiggraph.config

config = torchbiggraph.config.parse_config({
    'entities': {'all': {'num_partitions': 1}},
    'relations': [{'name': 'all', 'lhs': 'all', 'rhs': 'all'}],
    'dimension': 100,
    'max_epochs': 50,
    'num_batch_negs': 1000,
    'num_uniform_negs': 1000,
})

DeepKE:

config = {
    "model_name": "bert-base-uncased",
    "max_seq_len": 128,
    "batch_size": 32,
    "learning_rate": 2e-5,
    "num_train_epochs": 3,
}
model = NERModel("bert", "bert-base-uncased", args=config)

OpenNMT-py: Open Source Neural Machine Translation and (Large) Language Models in PyTorch

Pros of OpenNMT-py

  • More mature and widely adopted project with extensive documentation
  • Supports a broader range of neural machine translation architectures
  • Active community and regular updates

Cons of OpenNMT-py

  • Focused primarily on machine translation, less versatile for other NLP tasks
  • Steeper learning curve for beginners in NLP

Code Comparison

OpenNMT-py:

import onmt

# Define model parameters
model_opts = {"model_type": "transformer", "src_vocab": src_vocab, "tgt_vocab": tgt_vocab}

# Create and train the model
model = onmt.models.build_model(model_opts)
trainer = onmt.Trainer(model, train_data, valid_data, optim)
trainer.train()

DeepKE:

from deepke import NERModel

# Define model parameters
model_params = {"model_type": "bert", "num_labels": num_labels}

# Create and train the model
model = NERModel("bert", "bert-base-cased", args=model_params)
model.train_model(train_data)

OpenNMT-py is more specialized for machine translation tasks, while DeepKE offers a broader range of NLP functionalities, including named entity recognition, relation extraction, and attribute extraction. DeepKE provides a simpler API for various NLP tasks, making it more accessible for users new to NLP. However, OpenNMT-py's focus on translation allows for more advanced and customizable translation models.

AllenNLP: An open-source NLP research library, built on PyTorch.

Pros of AllenNLP

  • More comprehensive and general-purpose NLP toolkit
  • Larger community and more extensive documentation
  • Built on PyTorch, offering greater flexibility and ease of use

Cons of AllenNLP

  • Steeper learning curve for beginners
  • Less focused on specific knowledge extraction tasks
  • May require more setup and configuration for specialized use cases

Code Comparison

DeepKE example (entity extraction):

from deepke.name_entity_re.standard import *

model = NERModel('bert', 'bert-base-chinese', num_labels=len(label2id))
model.train_model(train_data)
predictions = model.predict(["我在北京大学学习"])

AllenNLP example (named entity recognition):

from allennlp.predictors import Predictor

predictor = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/ner-model-2020.02.10.tar.gz")
result = predictor.predict(sentence="The girl went to Harvard University.")

Both libraries offer streamlined APIs for NLP tasks, but DeepKE focuses more on knowledge extraction, while AllenNLP provides a broader range of NLP functionalities. DeepKE's API is more tailored for specific tasks, whereas AllenNLP's approach is more generalized and requires additional configuration for specialized use cases.

README

English | 简体中文

A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Graph Construction

DeepKE is a knowledge extraction toolkit for knowledge graph construction supporting cnSchema, low-resource, document-level and multimodal scenarios for entity, relation and attribute extraction. We provide documents, online demo, paper, slides and poster for beginners.

If you encounter any issues during the installation of DeepKE and DeepKE-LLM, please check Tips or promptly submit an issue, and we will assist you with resolving the problem!

What's New

  • April, 2024 We release a new bilingual (Chinese and English) schema-based information extraction model called OneKE based on Chinese-Alpaca-2-13B.
  • Feb, 2024 We release a large-scale (0.32B tokens) high-quality bilingual (Chinese and English) Information Extraction (IE) instruction dataset named IEPile, along with two models trained with IEPile, baichuan2-13b-iepile-lora and llama2-13b-iepile-lora.
  • Sep, 2023 We released a bilingual (Chinese and English) Information Extraction (IE) instruction dataset called InstructIE for the Instruction-based Knowledge Graph Construction task (Instruction-based KGC), as detailed here.
  • June, 2023 We update DeepKE-LLM to support knowledge extraction with KnowLM, ChatGLM, LLaMA-series, GPT-series etc.
  • Apr, 2023 We have added new models, including CP-NER(IJCAI'23), ASP(EMNLP'22), PRGC(ACL'21), PURE(NAACL'21), provided event extraction capabilities (Chinese and English), and offered compatibility with higher versions of Python packages (e.g., Transformers).
  • Feb, 2023 We have supported using LLMs (GPT-3) with in-context learning (based on EasyInstruct) and data generation, and added a NER model, W2NER (AAAI'22).
Previous News

Prediction Demo

Here is a demonstration of prediction (the GIF was created with Terminalizer). Get the code.


Model Framework

  • DeepKE contains a unified framework for named entity recognition, relation extraction and attribute extraction, the three knowledge extraction functions.
  • Each task can be implemented in different scenarios. For example, we can achieve relation extraction in standard, low-resource (few-shot), document-level and multimodal settings.
  • Each application scenario comprises three components: Data (including Tokenizer, Preprocessor and Loader), Model (including Module, Encoder and Forwarder) and Core (including Training, Evaluation and Prediction); a conceptual sketch follows.
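
The sketch below is a minimal conceptual rendering of that three-component split. The class and method names are hypothetical illustrations, not DeepKE's actual API.

# Conceptual sketch of one application scenario (hypothetical names,
# not DeepKE's actual classes).
class Data:
    """Tokenizer, Preprocessor and Loader: turn raw files into batches."""
    def load(self, path): ...

class Model:
    """Module, Encoder and Forwarder: encode a batch into scores."""
    def forward(self, batch): ...

class Core:
    """Training, Evaluation and Prediction: drive a Data/Model pair."""
    def train(self, data, model): ...
    def evaluate(self, data, model): ...
    def predict(self, text, model): ...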

Quick Start

DeepKE-LLM

In the era of large models, DeepKE-LLM uses a completely new set of environment dependencies.

conda create -n deepke-llm python=3.9
conda activate deepke-llm

cd example/llm
pip install -r requirements.txt

Please note that the requirements.txt file is located in the example/llm folder.

DeepKE

  • DeepKE supports pip install deepke.
    Below, we take fully supervised relation extraction as an example.
  • DeepKE supports both manual and Docker-image environment configuration; you can choose the appropriate way to build.
  • Installing DeepKE in a Linux environment is highly recommended.

🔧Manual Environment Configuration

Step1 Download the basic code

git clone --depth 1 https://github.com/zjunlp/DeepKE.git

Step2 Create a virtual environment using Anaconda and enter it.

conda create -n deepke python=3.8

conda activate deepke
  1. Install DeepKE from source code

    pip install -r requirements.txt
    
    python setup.py install
    
    python setup.py develop
    
  2. Install DeepKE with pip (NOT recommended!)

    pip install deepke
    

Step3 Enter the task directory

cd DeepKE/example/re/standard

Step4 Download the dataset, or follow the annotation instructions to obtain data

wget 120.27.214.45/Data/re/standard/data.tar.gz

tar -xzvf data.tar.gz

Many data formats are supported; details are given in each part.

Step5 Training (Parameters for training can be changed in the conf folder)

We support visual parameter tuning by using wandb.

python run.py

Step6 Prediction (Parameters for prediction can be changed in the conf folder)

Modify the path of the trained model in predict.yaml. The absolute path of the model is required, such as xxx/checkpoints/2019-12-03_17-35-30/cnn_epoch21.pth.
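
For instance, the relevant entry might look like the following (the key name fp is a hypothetical illustration; check the shipped predict.yaml for the actual key):

# hypothetical predict.yaml entry; the real key name may differ
fp: xxx/checkpoints/2019-12-03_17-35-30/cnn_epoch21.pth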

python predict.py
  • ❗NOTE: if you encounter any errors, please refer to the Tips or submit a GitHub issue.

🐳Building With Docker Images

Step1 Install the Docker client

Install Docker and start the Docker service.

Step2 Pull the docker image and run the container

docker pull zjunlp/deepke:latest
docker run -it zjunlp/deepke:latest /bin/bash

The remaining steps are the same as Step 3 and onwards in Manual Environment Configuration.

  • ❗NOTE: You can refer to the Tips to speed up installation

Requirements

DeepKE

python == 3.8

  • torch>=1.5,<=1.11
  • hydra-core==1.0.6
  • tensorboard==2.4.1
  • matplotlib==3.4.1
  • transformers==4.26.0
  • jieba==0.42.1
  • scikit-learn==0.24.1
  • seqeval==1.2.2
  • opt-einsum==3.3.0
  • wandb==0.12.7
  • ujson==5.6.0
  • huggingface_hub==0.11.0
  • tensorboardX==2.5.1
  • nltk==3.8
  • protobuf==3.20.1
  • numpy==1.21.0
  • ipdb==0.13.11
  • pytorch-crf==0.7.2
  • tqdm==4.66.1
  • openai==0.28.0
  • Jinja2==3.1.2
  • datasets==2.13.2
  • pyhocon==0.3.60

Introduction of Three Functions

1. Named Entity Recognition

  • Named entity recognition seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories such as person names, organizations and locations.

  • The data is stored in .txt files. Some instances are as follows (users can label data with the tools Doccano or MarkTool, or use Weak Supervision with DeepKE to obtain data automatically); a sketch of a typical file layout follows the table:

    | Sentence | Person | Location | Organization |
    | --- | --- | --- | --- |
    | 本报北京9月4日讯记者杨涌报道:部分省区人民日报宣传发行工作座谈会9月3日在京举行。 | 杨涌 | 北京 | 人民日报 |
    | 《红楼梦》由王扶林导演,周汝昌、王蒙、周岭等多位专家参与制作。 | 王扶林,周汝昌,王蒙,周岭 | | |
    | 秦始皇兵马俑位于陕西省西安市,是世界八大奇迹之一。 | 秦始皇 | 陕西省,西安市 | |
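
    As a rough illustration, such .txt corpora are commonly laid out one character per line with a BIO tag, with sentences separated by blank lines (a hedged sketch; check the task README for the exact format DeepKE expects):

    杨 B-PER
    涌 I-PER
    报 O
    道 O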
  • Read the detailed process in the specific README

    • STANDARD (Fully Supervised)

      We support LLM and provide the off-the-shelf model, DeepKE-cnSchema-NER, which will extract entities in cnSchema without training.

      Step1 Enter DeepKE/example/ner/standard. Download the dataset.

      wget 120.27.214.45/Data/ner/standard/data.tar.gz
      
      tar -xzvf data.tar.gz
      

      Step2 Training

      The dataset and parameters can be customized in the data folder and conf folder respectively.

      python run.py
      

      Step3 Prediction

      python predict.py
      
    • FEW-SHOT

      Step1 Enter DeepKE/example/ner/few-shot. Download the dataset.

      wget 120.27.214.45/Data/ner/few_shot/data.tar.gz
      
      tar -xzvf data.tar.gz
      

Step2 Training in the low-resource setting

The directory where the model is loaded and saved, and the configuration parameters, can be customized in the conf folder.

      python run.py +train=few_shot
      

Users can modify load_path in conf/train/few_shot.yaml to use an existing trained model.

Step3 Add - predict to conf/config.yaml, modify load_path in conf/predict.yaml as the model path and write_path as the path where the predicted results are saved, and then run python predict.py

      python predict.py
      
    • MULTIMODAL

      Step1 Enter DeepKE/example/ner/multimodal. Download the dataset.

      wget 120.27.214.45/Data/ner/multimodal/data.tar.gz
      
      tar -xzvf data.tar.gz
      

We use RCNN-detected objects and visual grounding objects from the original images as local visual information, where the RCNN objects are obtained via faster_rcnn and the visual grounding objects via onestage_grounding.

      Step2 Training in the multimodal setting

      • The dataset and parameters can be customized in the data folder and conf folder respectively.
      • Start with the model trained last time: modify load_path in conf/train.yaml as the path where the previously trained model was saved. The path for saving logs generated in training can be customized via log_dir.
      python run.py
      

      Step3 Prediction

      python predict.py
      

2. Relation Extraction

  • Relation extraction is the task of extracting semantic relations between entities from unstructured text.

  • The data is stored in .csv files. Some instances are as follows (users can label data with the tools Doccano or MarkTool, or use Weak Supervision with DeepKE to obtain data automatically); the offset columns are character positions in the sentence, as the sketch after the note below illustrates:

    | Sentence | Relation | Head | Head_offset | Tail | Tail_offset |
    | --- | --- | --- | --- | --- | --- |
    | 《岳父也是爹》是王军执导的电视剧,由马恩然、范明主演。 | 导演 | 岳父也是爹 | 1 | 王军 | 8 |
    | 《九玄珠》是在纵横中文网连载的一部小说,作者是龙马。 | 连载网站 | 九玄珠 | 1 | 纵横中文网 | 7 |
    | 提起杭州的美景,西湖总是第一个映入脑海的词语。 | 所在城市 | 西湖 | 8 | 杭州 | 2 |
  • ❗NOTE: If there are multiple entity types for one relation, entity types can be prefixed with the relation as inputs.
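
    To make the offset columns concrete, here is a quick check of the first row above, using plain Python and the table's own values:

    sentence = "《岳父也是爹》是王军执导的电视剧,由马恩然、范明主演。"
    print(sentence.find("岳父也是爹"))  # 1 -> Head_offset
    print(sentence.find("王军"))        # 8 -> Tail_offset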

  • Read the detailed process in the specific README

    • STANDARD (Fully Supervised)

      We support LLM and provide the off-the-shelf model, DeepKE-cnSchema-RE, which will extract relations in cnSchema without training.

      Step1 Enter the DeepKE/example/re/standard folder. Download the dataset.

      wget 120.27.214.45/Data/re/standard/data.tar.gz
      
      tar -xzvf data.tar.gz
      

      Step2 Training

      The dataset and parameters can be customized in the data folder and conf folder respectively.

      python run.py
      

      Step3 Prediction

      python predict.py
      
    • FEW-SHOT

      Step1 Enter DeepKE/example/re/few-shot. Download the dataset.

      wget 120.27.214.45/Data/re/few_shot/data.tar.gz
      
      tar -xzvf data.tar.gz
      

Step2 Training

      • The dataset and parameters can be customized in the data folder and conf folder respectively.
      • Start with the model trained last time: modify train_from_saved_model in conf/train.yaml as the path where the previously trained model was saved. The path for saving logs generated in training can be customized via log_dir.
      python run.py
      

      Step3 Prediction

      python predict.py
      
    • DOCUMENT

      Step1 Enter DeepKE/example/re/document. Download the dataset.

      wget 120.27.214.45/Data/re/document/data.tar.gz
      
      tar -xzvf data.tar.gz
      

      Step2 Training

      • The dataset and parameters can be customized in the data folder and conf folder respectively.
      • Start with the model trained last time: modify train_from_saved_model in conf/train.yaml as the path where the previously trained model was saved. The path for saving logs generated in training can be customized via log_dir.
      python run.py
      

      Step3 Prediction

      python predict.py
      
    • MULTIMODAL

      Step1 Enter DeepKE/example/re/multimodal. Download the dataset.

      wget 120.27.214.45/Data/re/multimodal/data.tar.gz
      
      tar -xzvf data.tar.gz
      

We use RCNN-detected objects and visual grounding objects from the original images as local visual information, where the RCNN objects are obtained via faster_rcnn and the visual grounding objects via onestage_grounding.

      Step2 Training

      • The dataset and parameters can be customized in the data folder and conf folder respectively.
      • Start with the model trained last time: modify load_path in conf/train.yaml as the path where the previously trained model was saved. The path for saving logs generated in training can be customized via log_dir.
      python run.py
      

      Step3 Prediction

      python predict.py
      

3. Attribute Extraction

  • Attribute extraction is the task of extracting attributes of entities from unstructured text.

  • The data is stored in .csv files. Some instances are as follows; Ent_offset and Val_offset are character offsets of the entity and the attribute value in the sentence, as in the relation extraction format above:

    | Sentence | Att | Ent | Ent_offset | Val | Val_offset |
    | --- | --- | --- | --- | --- | --- |
    | 张冬梅,女,汉族,1968年2月生,河南淇县人 | 民族 | 张冬梅 | 0 | 汉族 | 6 |
    | 诸葛亮,字孔明,三国时期杰出的军事家、文学家、发明家。 | 朝代 | 诸葛亮 | 0 | 三国时期 | 8 |
    | 2014年10月1日许鞍华执导的电影《黄金时代》上映 | 上映时间 | 黄金时代 | 19 | 2014年10月1日 | 0 |
  • Read the detailed process in the specific README

    • STANDARD (Fully Supervised)

      Step1 Enter the DeepKE/example/ae/standard folder. Download the dataset.

      wget 120.27.214.45/Data/ae/standard/data.tar.gz
      
      tar -xzvf data.tar.gz
      

      Step2 Training

      The dataset and parameters can be customized in the data folder and conf folder respectively.

      python run.py
      

      Step3 Prediction

      python predict.py
      

4. Event Extraction

  • Event extraction is the task of extracting event types, trigger words and event arguments from unstructured text.
  • The data is stored in .tsv files. Some instances are as follows (each argument role occupies its own row; see also the structured sketch after the table):

    | Sentence | Event type | Trigger | Role | Argument |
    | --- | --- | --- | --- | --- |
    | 据《欧洲时报》报道,当地时间27日,法国巴黎卢浮宫博物馆员工因不满工作条件恶化而罢工,导致该博物馆也因此闭门谢客一天。 | 组织行为-罢工 | 罢工 | 罢工人员 | 法国巴黎卢浮宫博物馆员工 |
    | | | | 时间 | 当地时间27日 |
    | | | | 所属组织 | 法国巴黎卢浮宫博物馆 |
    | 中国外运2019年上半年归母净利润增长17%:收购了少数股东股权 | 财经/交易-出售/收购 | 收购 | 出售方 | 少数股东 |
    | | | | 收购方 | 中国外运 |
    | | | | 交易物 | 股权 |
    | 美国亚特兰大航展13日发生一起表演机坠机事故,飞行员弹射出舱并安全着陆,事故没有造成人员伤亡。 | 灾害/意外-坠机 | 坠机 | 时间 | 13日 |
    | | | | 地点 | 美国亚特兰 |
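
    For clarity, the first example above corresponds to one structured event record, roughly like the following (an illustrative Python rendering, not DeepKE's internal representation):

    event = {
        "event_type": "组织行为-罢工",
        "trigger": "罢工",
        "arguments": [
            {"role": "罢工人员", "argument": "法国巴黎卢浮宫博物馆员工"},
            {"role": "时间", "argument": "当地时间27日"},
            {"role": "所属组织", "argument": "法国巴黎卢浮宫博物馆"},
        ],
    }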
  • Read the detailed process in the specific README

    • STANDARD (Fully Supervised)

      Step1 Enter the DeepKE/example/ee/standard folder. Download the dataset.

      wget 120.27.214.45/Data/ee/DuEE.zip
      unzip DuEE.zip
      

Step2 Training

      The dataset and parameters can be customized in the data folder and conf folder respectively.

      python run.py
      

Step3 Prediction

      python predict.py
      

Tips

1. Using the nearest mirror (THU in China) will speed up the installation of Anaconda; aliyun in China will speed up pip install XXX.

2. When encountering ModuleNotFoundError: No module named 'past', run pip install future.

3. Installing pretrained language models online can be slow; we recommend downloading pretrained models before use and saving them in the pretrained folder. Read README.md in every task directory to check the specific requirements for saving pretrained models.

4. The old version of DeepKE is in the deepke-v1.0 branch. Users can switch to that branch to use the old version. The old version has been fully migrated to the standard relation extraction (example/re/standard).

5. If you want to modify the source code, it's recommended to install DeepKE from source; otherwise, the modifications will not take effect. See issue

6. More related low-resource knowledge extraction works can be found in Knowledge Extraction in Low-Resource Scenarios: Survey and Perspective.

7. Make sure to use the exact versions of the packages specified in requirements.txt.

To do

In the next version, we plan to release a stronger LLM for KE.

Meanwhile, we will offer long-term maintenance to fix bugs, solve issues and meet new requirements. If you have any problems, please submit issues to us.

Reading Materials

Data-Efficient Knowledge Graph Construction, 高效知识图谱构建 (Tutorial on CCKS 2022) [slides]

Efficient and Robust Knowledge Graph Construction (Tutorial on AACL-IJCNLP 2022) [slides]

PromptKG Family: a Gallery of Prompt Learning & KG-related Research Works, Toolkits, and Paper-list [Resources]

Knowledge Extraction in Low-Resource Scenarios: Survey and Perspective [Survey][Paper-list]

Related Toolkit

Doccano、MarkTool、LabelStudio: Data Annotation Toolkits

LambdaKG: A library and benchmark for PLM-based KG embeddings

EasyInstruct: An easy-to-use framework to instruct Large Language Models

Citation

Please cite our paper if you use DeepKE in your work.

@inproceedings{EMNLP2022_Demo_DeepKE,
  author    = {Ningyu Zhang and
               Xin Xu and
               Liankuan Tao and
               Haiyang Yu and
               Hongbin Ye and
               Shuofei Qiao and
               Xin Xie and
               Xiang Chen and
               Zhoubo Li and
               Lei Li},
  editor    = {Wanxiang Che and
               Ekaterina Shutova},
  title     = {DeepKE: {A} Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population},
  booktitle = {{EMNLP} (Demos)},
  pages     = {98--108},
  publisher = {Association for Computational Linguistics},
  year      = {2022},
  url       = {https://aclanthology.org/2022.emnlp-demos.10}
}

Contributors

Ningyu Zhang, Haofen Wang, Fei Huang, Feiyu Xiong, Liankuan Tao, Xin Xu, Honghao Gui, Zhenru Zhang, Chuanqi Tan, Qiang Chen, Xiaohan Wang, Zekun Xi, Xinrong Li, Haiyang Yu, Hongbin Ye, Shuofei Qiao, Peng Wang, Yuqi Zhu, Xin Xie, Xiang Chen, Zhoubo Li, Lei Li, Xiaozhuan Liang, Yunzhi Yao, Jing Chen, Yuqi Zhu, Shumin Deng, Wen Zhang, Guozhou Zheng, Huajun Chen

Community Contributors: thredreams, eltociear, Ziwen Xu, Rui Huang, Xiaolong Weng

Other Knowledge Extraction Open-Source Projects