Top Related Projects
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
TensorFlow code and pre-trained models for BERT
Models and examples built with TensorFlow
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Deep Learning for humans
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Quick Overview
bert4keras is a lightweight reimplementation of BERT and related transformer models for Keras and TensorFlow. It aims to provide a flexible, easy-to-modify interface for fine-tuning pre-trained models on NLP tasks, and it works with both standalone Keras and tf.keras on TensorFlow 1.14+ and 2.x.
Pros
- Easy integration with existing Keras and TensorFlow projects
- Supports multiple BERT variants (e.g., BERT, RoBERTa, ALBERT)
- Flexible architecture allowing for custom model modifications
- Rich set of examples (the online documentation is still under construction)
Cons
- Primarily focused on Chinese NLP tasks, which may limit its applicability for other languages
- Requires some familiarity with Keras and TensorFlow
- May have a steeper learning curve compared to some other BERT implementations
Code Examples
- Loading a pre-trained BERT model:
from bert4keras.models import build_transformer_model
config_path = 'bert_config.json'
checkpoint_path = 'bert_model.ckpt'
model = build_transformer_model(config_path, checkpoint_path)
- Tokenizing text for BERT input:
from bert4keras.tokenizers import Tokenizer
dict_path = 'vocab.txt'
tokenizer = Tokenizer(dict_path)
tokens = tokenizer.tokenize('Hello, BERT!')
print(tokens)
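BERT-style tokenizers typically split each word into sub-word pieces by greedy longest-match-first lookup in the vocabulary (WordPiece) and wrap the result in [CLS] and [SEP]. The following standalone sketch illustrates that idea with a toy vocabulary; it is not the library's implementation, the names wordpiece, tokenize and toy_vocab are hypothetical, and punctuation splitting is left out for brevity.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one word into sub-word pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # nothing matched at this position: treat the word as unknown
        pieces.append(piece)
        start = end
    return pieces


def tokenize(text, vocab):
    """Lower-case, split on whitespace, apply WordPiece, add [CLS] and [SEP]."""
    tokens = ["[CLS]"]
    for word in text.lower().split():
        tokens.extend(wordpiece(word, vocab))
    tokens.append("[SEP]")
    return tokens


# Toy vocabulary (an assumption for illustration; real vocab.txt files are much larger)
toy_vocab = {"[CLS]", "[SEP]", "[UNK]", "hello", "token", "##izer", "##s"}
print(tokenize("Hello tokenizers", toy_vocab))
```

With a real vocab.txt, the pieces depend entirely on which entries the file contains, so the same sentence can split differently across checkpoints.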
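- Fine-tuning BERT for text classification:
from bert4keras.backend import keras
from bert4keras.models import build_transformer_model
from bert4keras.optimizers import Adam
num_classes = 2  # set to the number of labels in your task
model = build_transformer_model(config_path, checkpoint_path)
# model.output holds one vector per token; keep the [CLS] vector (position 0)
cls_vector = keras.layers.Lambda(lambda x: x[:, 0])(model.output)
output = keras.layers.Dense(num_classes, activation='softmax')(cls_vector)
model = keras.models.Model(model.input, output)
model.compile(
    loss='categorical_crossentropy',  # expects one-hot labels
    optimizer=Adam(2e-5),
    metrics=['accuracy']
)
# train_generator must yield ([token_ids, segment_ids], labels) batches
model.fit(train_generator, steps_per_epoch=1000, epochs=5)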
Getting Started
To get started with bert4keras, follow these steps:
- Install the library:
pip install bert4keras
- Download pre-trained BERT weights and configuration files.
- Import the necessary modules:
from bert4keras.models import build_transformer_model
from bert4keras.tokenizers import Tokenizer
from bert4keras.snippets import sequence_padding, DataGenerator
- Load a pre-trained model and tokenizer:
model = build_transformer_model(config_path, checkpoint_path)
tokenizer = Tokenizer(dict_path)
- Prepare your data and fine-tune the model for your specific NLP task.
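Preparing data mostly means turning texts into equal-length batches of token ids, which is the job of a padding helper such as the sequence_padding import above. The sketch below shows the idea in plain Python; pad_batch is a hypothetical name and this is not the library's implementation.

```python
def pad_batch(sequences, length=None, value=0):
    """Pad (or truncate) lists of token ids to a common length."""
    if length is None:
        # Default to the longest sequence in the batch
        length = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        seq = list(seq)[:length]  # truncate sequences that are too long
        padded.append(seq + [value] * (length - len(seq)))  # right-pad the rest
    return padded


batch = pad_batch([[101, 7592, 102], [101, 102]])
print(batch)
```

Padding per batch rather than to a global maximum keeps short batches small, which is why padding is usually done inside the data generator.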
Competitor Comparisons
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Pros of Transformers
- Extensive model support: Covers a wide range of transformer-based models
- Active community and frequent updates
- Comprehensive documentation and tutorials
Cons of Transformers
- Steeper learning curve for beginners
- Larger library size and potentially slower import times
Code Comparison
Transformers
from transformers import BertTokenizer, BertModel
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
bert4keras
from bert4keras.tokenizers import Tokenizer
from bert4keras.models import build_transformer_model
tokenizer = Tokenizer(dict_path)
model = build_transformer_model(config_path, checkpoint_path)
Summary
Transformers offers a more comprehensive solution with broader model support and active community involvement. However, it may be more complex for beginners. bert4keras provides a simpler, more focused approach specifically for BERT-like models in Keras, which can be advantageous for users primarily working with these architectures.
TensorFlow code and pre-trained models for BERT
Pros of BERT
- Official implementation by Google Research, ensuring high reliability and adherence to the original paper
- Extensive documentation and examples for various BERT applications
- Large community and wide use as the reference implementation
Cons of BERT
- Limited to TensorFlow 1.x, which may be outdated for some users
- Less flexibility in terms of customization and integration with other frameworks
- Steeper learning curve for users not familiar with TensorFlow
Code Comparison
BERT (TensorFlow):
import tensorflow as tf
from bert import modeling
bert_config = modeling.BertConfig.from_json_file("bert_config.json")
model = modeling.BertModel(config=bert_config, is_training=True, input_ids=input_ids)
bert4keras (Keras):
from bert4keras.models import build_transformer_model
model = build_transformer_model(
config_path='bert_config.json',
checkpoint_path='bert_model.ckpt',
model='bert',
)
Key Differences
- bert4keras offers a more user-friendly API with Keras integration
- BERT provides a lower-level implementation, allowing for more control but requiring more setup
- bert4keras runs on both standalone Keras and tf.keras, while BERT is written directly against TensorFlow 1.x
Models and examples built with TensorFlow
Pros of models
- Comprehensive collection of official TensorFlow models and examples
- Extensive documentation and community support
- Regular updates and maintenance by the TensorFlow team
Cons of models
- Large repository size, potentially overwhelming for beginners
- May include unnecessary components for specific BERT implementations
- Steeper learning curve due to broader scope
Code Comparison
models:
import tensorflow as tf
from official.nlp.modeling import networks
# BERT-base-sized encoder with randomly initialized weights
bert_encoder = networks.BertEncoder(vocab_size=30522, hidden_size=768, num_layers=12, num_attention_heads=12)
bert4keras:
from bert4keras.models import build_transformer_model
config_path = 'bert_config.json'
checkpoint_path = 'bert_model.ckpt'
model = build_transformer_model(config_path, checkpoint_path)
Summary
models offers a comprehensive suite of TensorFlow models and examples, including BERT implementations. It provides extensive documentation and regular updates but may be overwhelming for users focused solely on BERT. bert4keras, on the other hand, is a lightweight alternative specifically designed for BERT implementations in Keras, offering a simpler API and easier integration for BERT-specific tasks.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Pros of PyTorch
- Larger community and more extensive ecosystem
- Supports a wider range of deep learning models and applications
- More comprehensive documentation and tutorials
Cons of PyTorch
- Steeper learning curve for beginners
- Larger library size and potentially slower initial setup
Code Comparison
bert4keras:
from bert4keras.models import build_transformer_model
model = build_transformer_model(
config_path='bert_config.json',
checkpoint_path='bert_model.ckpt',
model='bert'
)
PyTorch (via Hugging Face Transformers):
from transformers import BertModel, BertConfig
config = BertConfig.from_json_file('bert_config.json')
model = BertModel.from_pretrained('bert-base-uncased', config=config)
Key Differences
- bert4keras is specifically designed for BERT and related models, while PyTorch is a general-purpose deep learning framework
- bert4keras offers a simpler API for working with BERT models, making it easier for beginners to get started
- PyTorch provides more flexibility and customization options for advanced users and researchers
Use Cases
- bert4keras: Ideal for projects focused on BERT and its variants, especially for Chinese NLP tasks
- PyTorch: Suitable for a wide range of deep learning projects, including computer vision, natural language processing, and reinforcement learning
Deep Learning for humans
Pros of Keras
- Broader scope and functionality, supporting a wide range of deep learning models
- Larger community and more extensive documentation
- Official support from TensorFlow and integration with other TensorFlow tools
Cons of Keras
- Less specialized for BERT and transformer models
- May require more setup and configuration for BERT-specific tasks
- Potentially steeper learning curve for BERT implementations
Code Comparison
bert4keras:
from bert4keras.models import build_transformer_model
model = build_transformer_model(
config_path='bert_config.json',
checkpoint_path='bert_model.ckpt',
model='bert'
)
Keras:
import tensorflow as tf
from tensorflow import keras
bert_model = keras.models.load_model('bert_model.h5')
bert4keras provides a more streamlined approach for building BERT models, while Keras requires additional setup but offers more flexibility for various model architectures.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Pros of DeepSpeed
- Offers advanced distributed training and optimization techniques for large-scale models
- Built for PyTorch, with ZeRO memory optimizations for models too large to train on a single GPU
- Provides extensive documentation and tutorials for various use cases
Cons of DeepSpeed
- Steeper learning curve due to its more complex architecture and features
- May be overkill for smaller projects or simpler model training tasks
Code Comparison
DeepSpeed:
import deepspeed
model_engine, optimizer, _, _ = deepspeed.initialize(args=args,
model=model,
model_parameters=params)
bert4keras:
from bert4keras.models import build_transformer_model
model = build_transformer_model(
config_path=config_path,
checkpoint_path=checkpoint_path,
model='bert',
)
Summary
DeepSpeed is a more comprehensive and powerful library for large-scale model training on PyTorch, offering advanced optimization techniques for distributed training. However, it is more complex to set up than bert4keras, and the two solve different problems: bert4keras builds and fine-tunes BERT-style models in Keras, while DeepSpeed scales the training of existing PyTorch models.
README
bert4keras
- Our light reimplementation of BERT for Keras
- A clearer, more lightweight Keras version of BERT
- Personal blog: https://kexue.fm/
- Online documentation: http://bert4keras.spaces.ac.cn/ (still under construction)
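Description
This is the author's reimplementation of a transformer model library for Keras, which aims to combine transformers and Keras in code that is as clean as possible.
The project was started to make modification and customization easy, so it may be updated frequently.
Stars are therefore welcome, but forking is not recommended, because the version you fork may quickly become outdated.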
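Features
Currently implemented:
- Loading pre-trained BERT/RoBERTa/ALBERT weights for fine-tuning;
- The attention masks needed for language models and seq2seq;
- A rich set of examples;
- Code for pre-training from scratch (supports TPU and multiple GPUs; see pretraining);
- Compatibility with both keras and tf.keras.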
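The attention masks for language models and seq2seq mentioned in this list can be illustrated with a small standalone sketch. It uses the standard formulations (a lower-triangular mask for left-to-right language models, and a UniLM-style mask derived from segment ids for seq2seq), written in plain Python rather than copied from the library; the function names are hypothetical. mask[i][j] == 1 means position i may attend to position j.

```python
from itertools import accumulate


def causal_mask(seq_len):
    """Language-model mask: each position sees itself and earlier positions."""
    return [[1 if j <= i else 0 for j in range(seq_len)] for i in range(seq_len)]


def unilm_mask(segment_ids):
    """Seq2seq (UniLM-style) mask built from segment ids.

    Source tokens (segment 0) attend to the whole source bidirectionally;
    target tokens (segment 1) attend to the source and to earlier targets.
    """
    # The cumulative sum is 0 over the source and strictly increasing over the target
    idxs = list(accumulate(segment_ids))
    n = len(idxs)
    return [[1 if idxs[j] <= idxs[i] else 0 for j in range(n)] for i in range(n)]


for row in unilm_mask([0, 0, 1, 1]):
    print(row)
```

The cumulative-sum trick means a single comparison produces both the bidirectional source block and the causal target block, so no separate masks need to be concatenated.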
Usage
Install the stable version:
pip install bert4keras
Install the latest version:
pip install git+https://www.github.com/bojone/bert4keras.git
For usage examples, see the examples directory.
Examples previously written for keras-bert still apply to this project; you only need to switch the loading of bert_model to this project's loader.
In theory the library is compatible with Python 2 and Python 3, and with TensorFlow 1.14+ and TensorFlow 2.x. The development environment is Python 2.7, TensorFlow 1.14+ and Keras 2.3.1 (tested with Keras 2.2.4, 2.3.0, 2.3.1 and tf.keras).
For the best experience, TensorFlow 1.14 + Keras 2.3.1 is the recommended combination.
About environment combinations
- Both tf + keras and tf + tf.keras are supported; the latter requires setting the environment variable TF_KERAS=1 in advance.
- With tf + keras, 2.2.4 <= keras <= 2.3.1 and 1.14 <= tf <= 2.2 are recommended; tf 2.3+ cannot be used.
- keras 2.4+ works, but keras 2.4.x is essentially equivalent to tf.keras, so if you want keras 2.4+ you may as well use tf.keras directly.
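The version ranges and the TF_KERAS=1 switch above can be checked in a few lines before importing the library. This is a minimal sketch under stated assumptions: parse_version, recommended_combo and use_tf_keras are hypothetical helpers, not part of bert4keras, and the ranges are simply the ones quoted in this list.

```python
import os
from itertools import takewhile


def parse_version(text):
    """'2.3.1' -> (2, 3, 1); stops at the first non-numeric component."""
    parts = []
    for part in text.split("."):
        digits = "".join(takewhile(str.isdigit, part))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def recommended_combo(keras_version, tf_version):
    """True if the pair is inside the range recommended for tf + keras."""
    k, t = parse_version(keras_version), parse_version(tf_version)
    keras_ok = (2, 2, 4) <= k <= (2, 3, 1)
    tf_ok = (1, 14) <= t[:2] <= (2, 2)  # compare major.minor only
    return keras_ok and tf_ok


def use_tf_keras():
    """Select tf.keras; must run before bert4keras is imported."""
    os.environ["TF_KERAS"] = "1"


print(recommended_combo("2.3.1", "1.14.0"))
print(recommended_combo("2.3.1", "2.3.0"))
```

If the check fails, calling use_tf_keras() at the very top of the script (before any bert4keras import) is the alternative route described in the first bullet.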
Of course, if you find a bug and are happy to contribute, you are welcome to report it, suggest a fix, or even open a Pull Request.
Weights
Weights that can currently be loaded:
- Google's original BERT: https://github.com/google-research/bert
- brightmart's RoBERTa: https://github.com/brightmart/roberta_zh
- Harbin Institute of Technology (HIT) RoBERTa: https://github.com/ymcui/Chinese-BERT-wwm
- Google's original ALBERT [example]: https://github.com/google-research/ALBERT
- brightmart's ALBERT: https://github.com/brightmart/albert_zh
- Converted ALBERT: https://github.com/bojone/albert_zh
- Huawei's NEZHA: https://github.com/huawei-noah/Pretrained-Language-Model/tree/master/NEZHA-TensorFlow
- Huawei's NEZHA-GEN: https://github.com/huawei-noah/Pretrained-Language-Model/tree/master/NEZHA-Gen-TensorFlow
- In-house language models: https://github.com/ZhuiyiTechnology/pretrained-models
- T5: https://github.com/google-research/text-to-text-transfer-transformer
- GPT_OpenAI: https://github.com/bojone/CDial-GPT-tf
- GPT2_ML: https://github.com/imcaspar/gpt2-ml
- Google's original ELECTRA: https://github.com/google-research/electra
- Harbin Institute of Technology (HIT) ELECTRA: https://github.com/ymcui/Chinese-ELECTRA
- CLUE's ELECTRA: https://github.com/CLUEbenchmark/ELECTRA
- LaBSE (multilingual BERT): https://github.com/bojone/labse
- Models from the Chinese-GEN project: https://github.com/bojone/chinese-gen
- T5.1.1: https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511
- Multilingual T5: https://github.com/google-research/multilingual-t5/
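Notes
- Note 1: brightmart's ALBERT was open-sourced earlier than Google's ALBERT, so the early brightmart ALBERT weights are not fully consistent with Google's; in other words, the two cannot be swapped directly. To reduce code redundancy, bert4keras 0.2.4 and later only support loading the Google version and the brightmart weights whose names include "Google". To load the earlier weights, use version 0.2.3, or consider the author's converted albert_zh.
- Note 2: If a downloaded ELECTRA checkpoint comes without a JSON config file, adapt an existing one yourself (the type_vocab_size field must be added).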
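Adding the type_vocab_size field to a config can be done with a few lines of JSON handling. The sketch below is only an illustration: ensure_type_vocab_size is a hypothetical helper, the example config keys are placeholders, and the default value of 2 is an assumption based on the usual two-segment BERT setup, so check it against the checkpoint you are actually loading.

```python
import json


def ensure_type_vocab_size(config, value=2):
    """Return a copy of the config dict that contains type_vocab_size.

    value=2 is an assumed default (two segment types, as in standard BERT);
    an existing type_vocab_size entry is left unchanged.
    """
    patched = dict(config)
    patched.setdefault("type_vocab_size", value)
    return patched


# Placeholder config for illustration; a real one has many more fields
example = {"hidden_size": 256, "num_hidden_layers": 12}
print(json.dumps(ensure_type_vocab_size(example), indent=2))
```

To patch a file on disk, load it with json.load, pass the dict through the helper, and write the result back with json.dump.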
Updates
- 2023.03.06: Changed infinity to np.inf and optimized GPU memory usage. Using np.inf for infinity makes computation more accurate and less error-prone in low-precision arithmetic; several mask operators were also merged, reducing memory usage. In tests on an A100, training base and large models was noticeably faster and used less GPU memory.
- 2022.03.20: Added RoFormerV2.
- 2022.02.28: Added GatedAttentionUnit.
- 2021.04.23: Added GlobalPointer.
- 2021.03.23: Added RoFormer.
- 2021.01.30: Released version 0.9.9; improved multi-GPU support and added a multi-GPU example: task_seq2seq_autotitle_multigpu.py.
- 2020.12.29: Added the residual_attention_scores parameter to implement RealFormer; enable it by passing residual_attention_scores=True to build_transformer_model.
- 2020.12.04: PositionEmbedding now supports hierarchical decomposition, which lets BERT process very long text directly; enable it by passing hierarchical_position=True to build_transformer_model.
- 2020.11.19: Added support for GPT2 models; see the CPM_LM_bert4keras project.
- 2020.11.14: Added parameter-wise learning rates via extend_with_parameter_wise_lr, which can be used to set a different learning rate for each layer.
- 2020.10.27: Added support for T5.1.1 and Multilingual T5.
- 2020.08.28: Added support for GPT_OpenAI.
- 2020.08.22: Added the WebServing class, which makes it easy to expose a model as a web API; see the class documentation for details.
- 2020.07.14: Added the prefix parameter to the Transformer class; added the to_array function to snippets.py; fixed a hidden bug in AutoRegressiveDecoder when rtype='logits'.
- 2020.06.06: Out of perfectionism, renamed the Tokenizer parameter max_length to maxlen while keeping backward compatibility; the new name is recommended.
- 2020.04.29: Added recomputation (see keras_recompute), which trades time for memory; enable it by setting the environment variable RECOMPUTE=1.
- 2020.04.25: Improved behavior under tf2.
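- 2020.04.16: All examples now work with TensorFlow 2.0.
- 2020.04.06: Added the UniLM pre-training mode (in testing).
- 2020.04.06: Improved the rematch method.
- 2020.04.01: Added the rematch method to Tokenizer, which gives the mapping between the tokenization result and the original sequence.
- 2020.03.30: Unified the coding style of the .py files as far as possible.
- 2020.03.25: Added support for ELECTRA.
- 2020.03.24: Further strengthened DataGenerator, allowing local shuffling when an iterator is passed in.
- 2020.03.23: Added an option to adjust the key_size of Attention.
- 2020.03.17: Enhanced DataGenerator; improved how models are written.
- 2020.03.15: Added support for GPT2_ML.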
- 2020.03.10: Added support for Google's T5 model.
- 2020.03.05: Renamed tokenizer.py to tokenizers.py.
- 2020.03.05: Renamed application='seq2seq' to application='unilm'.
- 2020.03.05: Renamed build_bert_model to build_transformer_model.
- 2020.03.05: Rewrote the structure of models.py.
- 2020.03.04: Renamed bert.py to models.py.
- 2020.03.02: Refactored the mask mechanism (back to Keras's built-in masking) to make more complex applications easier to write.
- 2020.02.22: Added the AutoRegressiveDecoder class to handle Seq2Seq decoding in a unified way.
- 2020.02.19: Changed the prefix of transformer blocks to Transformer (previously Encoder), making the name less restrictive.
- 2020.02.13: Optimized the load_vocab function; renamed the keep_words parameter of build_bert_model to keep_tokens. This change may affect some scripts.
- 2020.01.18: Adjusted text handling and removed the use of codecs.
- 2020.01.17: With the APIs stabilizing, the library was packaged on PyPI for convenience; the first packaged version is 0.4.6.
- 2020.01.10: Rewrote the model's mask scheme, making the code somewhat more concise and clear; backend optimizations.
- 2019.12.27: Refactored the pre-training code to reduce redundancy; RoBERTa and GPT pre-training are currently supported, see pretraining.
- 2019.12.17: Added support for Huawei's NEZHA weights: just pass model='nezha' to build_bert_model. The former way of loading ALBERT, albert=True, is now model='albert'.
- 2019.12.16: Brought layer-in-layer (nested layer) support to older versions using an approach similar to Keras 2.3+, restoring support for Keras versions below 2.3.0.
- 2019.12.14: Added Conditional Layer Normalization and a related demo.
- 2019.12.09: Standardized the data_generator of each example; fixed an error when application='lm'.
- 2019.12.05: Optimized the tokenizer's do_lower_case and made small adjustments to the examples.
- 2019.11.23: Renamed train.py to optimizers.py, updated a large number of optimizer implementations, and made everything compatible with both keras and tf.keras.
- 2019.11.19: Renamed utils.py to tokenizer.py.
- 2019.11.19: After some back and forth, decided to put snippets under bert4keras.snippets.
- 2019.11.18: Optimized the loading logic for pre-trained weights; added a method to save model weights in BERT's checkpoint format.
- 2019.11.17: Split some commonly used code snippets not directly related to BERT into python_snippets, for sharing with other projects.
- 2019.11.11: Added the NSP part.
- 2019.11.05: Adapted to Google's ALBERT; the non-Google albert_zh is no longer supported.
- 2019.11.05: Finished the pre-training code using RoBERTa as the example, with TPU and multi-GPU training support; see roberta. You are welcome to build more pre-training code on top of it.
- 2019.11.01: Gradually adding pre-training code; see pretraining.
- 2019.10.28: Added support for sentencepiece-based tokenizers.
- 2019.10.25: Introduced a native tokenizer.
- 2019.10.22: Introduced a gradient accumulation optimizer.
- 2019.10.21: To simplify the code, dropped support for Keras versions before 2.3.0; only Keras 2.3.0+ and tf.keras are supported for now.
- 2019.10.20: By user request, the model structure can now be saved directly with model.save and the whole model loaded with load_model (just run from bert4keras.layers import * before load_model; no extra custom_objects are needed).
- 2019.10.09: Now compatible with tf.keras, tested with tf.keras under tf 1.13 and tf 2.0; switch to tf.keras by setting the environment variable TF_KERAS=1.
- 2019.10.09: Now compatible with Keras 2.3.x, but only as a temporary solution; support for versions before 2.3 may be removed later.
- 2019.10.02: Adapted to ALBERT; albert_zh weights can be loaded successfully by passing albert=True to the load_pretrained_model function.
Background
I had long been using CyberZHG's keras-bert. If the goal is simply to call and fine-tune BERT under Keras, keras-bert is already quite satisfying.
However, if you want to modify BERT's internal structure while still loading the official pre-trained weights, keras-bert makes this difficult. For the sake of code reuse, keras-bert wraps almost every small module in a separate library: keras-bert depends on keras-transformer, keras-transformer depends on keras-multi-head, and keras-multi-head depends on keras-self-attention. With dependencies layered like this, making changes becomes quite a headache.
So I decided to rewrite a Keras version of BERT, implementing it completely within a few files, reducing these dependencies while keeping the ability to load the official pre-trained weights.
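Acknowledgements
Thanks to CyberZHG for implementing keras-bert. This implementation refers to the keras-bert source code in many places, and I sincerely thank the author for this selfless contribution.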
Related
bert4torch: a PyTorch-based transformer library whose style closely resembles bert4keras; readers who use PyTorch may want to try it.
Citation
@misc{bert4keras,
title={bert4keras},
author={Jianlin Su},
year={2020},
howpublished={\url{https://bert4keras.spaces.ac.cn}},
}