google-gemini/generative-ai-python

The official Python library for the Google Gemini API

Top Related Projects

  • openai-python: The official Python library for the OpenAI API
  • transformers: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
  • DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
  • Megatron-LM: Ongoing research training transformer models at scale
  • gpt-neox: An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
  • BERT: TensorFlow code and pre-trained models for BERT

Quick Overview

The google-gemini/generative-ai-python repository is an official Python library for interacting with Google's Gemini AI models. It provides a simple and efficient way to access Gemini's capabilities, including text generation, image analysis, and multimodal tasks, through a Python API.

Pros

  • Easy integration with Google's state-of-the-art Gemini AI models
  • Supports various tasks including text generation, image analysis, and multimodal interactions
  • Well-documented with clear examples and explanations
  • Regular updates and maintenance from Google's team

Cons

  • Requires a Google account and an API key created in Google AI Studio
  • Limited to Google's Gemini models; it cannot be used with other AI providers
  • API usage may incur costs, depending on the scale of implementation
  • May be subject to usage limits or quotas set by Google

Code Examples

  1. Text generation:

    import google.generativeai as genai

    # Authenticate, then select a text model
    genai.configure(api_key="YOUR_API_KEY")
    model = genai.GenerativeModel('gemini-pro')

    response = model.generate_content("Explain the theory of relativity in simple terms.")
    print(response.text)

  2. Image analysis:

    import google.generativeai as genai
    from PIL import Image

    # Authenticate, then select a model that accepts images
    genai.configure(api_key="YOUR_API_KEY")
    model = genai.GenerativeModel('gemini-pro-vision')

    # Pass the instruction and the image together as one prompt
    image = Image.open('path/to/your/image.jpg')
    response = model.generate_content(["Describe this image in detail", image])
    print(response.text)

  3. Multimodal interaction:

    import google.generativeai as genai
    from PIL import Image

    genai.configure(api_key="YOUR_API_KEY")
    model = genai.GenerativeModel('gemini-pro-vision')

    # Text and image parts can be interleaved in a single prompt
    image = Image.open('path/to/your/image.jpg')
    response = model.generate_content([
        "What's unusual about this image?",
        image,
        "Provide a creative story based on what you see."
    ])
    print(response.text)

Getting Started

To get started with the google-gemini/generative-ai-python library:

  1. Install the library:

    pip install google-generativeai
    
  2. Sign in to Google AI Studio with your Google account and create an API key.

  3. Configure the library with your API key:

    import google.generativeai as genai
    genai.configure(api_key="YOUR_API_KEY")
    
  4. Create a model instance and start generating content:

    model = genai.GenerativeModel('gemini-pro')
    response = model.generate_content("Your prompt here")
    print(response.text)
    

Remember to replace "YOUR_API_KEY" with your actual Gemini API key.
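
Rather than pasting the key into source code, it can be read from an environment variable, as the project README does with `GEMINI_API_KEY`. The sketch below is one possible way to do that with a clearer error when the variable is missing. `load_api_key` and `configure_from_env` are hypothetical helper names, not part of the library.

```python
import os


def load_api_key(env_var="GEMINI_API_KEY"):
    """Read the API key from the environment, failing with a clear message."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Gemini API key."
        )
    return key


def configure_from_env(env_var="GEMINI_API_KEY"):
    """Configure the SDK with a key taken from the environment."""
    # Imported lazily so load_api_key() works without the SDK installed.
    import google.generativeai as genai

    genai.configure(api_key=load_api_key(env_var))
```

Calling `configure_from_env()` once at startup would then replace the hard-coded `genai.configure(api_key="YOUR_API_KEY")` line.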

Competitor Comparisons

The official Python library for the OpenAI API

Pros of openai-python

  • More mature and widely adopted library with extensive documentation
  • Supports a broader range of OpenAI models and services
  • Active community and frequent updates

Cons of openai-python

  • Limited to OpenAI's ecosystem and pricing model
  • May require more setup and configuration for basic usage

Code Comparison

openai-python:

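import openai

# Legacy module-level interface from openai-python releases before 1.0;
# later releases use a client object instead.
openai.api_key = "your-api-key"
response = openai.Completion.create(
    engine="text-davinci-002",
    prompt="Translate the following English text to French: '{}'",
    max_tokens=60,  # upper bound on generated tokens
)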

generative-ai-python:

import google.generativeai as genai

genai.configure(api_key="your-api-key")
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content("Translate the following English text to French: '{}'")

Both libraries offer similar functionality for generating content, but generative-ai-python has a more streamlined API for basic usage. openai-python provides more granular control over model parameters, while generative-ai-python focuses on simplicity and ease of use for Google's Gemini models.
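
Both snippets leave the `'{}'` placeholder in the prompt unfilled, and only the openai-python call caps the output length. The sketch below shows how the Gemini side might be completed: the placeholder is filled with `str.format`, and a `generation_config` dictionary with `max_output_tokens` is passed to `generate_content`. The helper names `build_translation_request` and `translate` are hypothetical, and the `generation_config` key should be verified against the installed SDK version.

```python
PROMPT_TEMPLATE = "Translate the following English text to French: '{}'"


def build_translation_request(text, max_tokens=60):
    """Return the filled prompt and a generation config dictionary."""
    prompt = PROMPT_TEMPLATE.format(text)
    generation_config = {"max_output_tokens": max_tokens}
    return prompt, generation_config


def translate(model, text, max_tokens=60):
    """Send the request through any object exposing generate_content()."""
    prompt, config = build_translation_request(text, max_tokens)
    response = model.generate_content(prompt, generation_config=config)
    return response.text
```

With the real SDK, `model` would be a `genai.GenerativeModel` instance created after `genai.configure(...)`.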

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Pros of transformers

  • Extensive model support: Covers a wide range of transformer-based models
  • Active community: Large user base and frequent updates
  • Comprehensive documentation: Detailed guides and examples

Cons of transformers

  • Steeper learning curve: More complex API due to broader scope
  • Larger package size: Includes many features, which may be unnecessary for simple tasks

Code Comparison

transformers:

from transformers import pipeline

generator = pipeline('text-generation', model='gpt2')
result = generator("Hello, I'm a language model,", max_length=30)
print(result[0]['generated_text'])

generative-ai-python:

import google.generativeai as genai

genai.configure(api_key='YOUR_API_KEY')
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content("Hello, I'm a language model,")
print(response.text)

Both libraries provide easy-to-use interfaces for generating text, but transformers offers more flexibility in model selection, while generative-ai-python focuses on Google's Gemini models with a simpler API.

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Pros of DeepSpeed

  • More comprehensive and mature library for optimizing large-scale deep learning models
  • Offers advanced features like ZeRO optimizer, pipeline parallelism, and 3D parallelism
  • Supports a wider range of hardware configurations and distributed training scenarios

Cons of DeepSpeed

  • Steeper learning curve due to its complexity and extensive feature set
  • May be overkill for smaller projects or those not requiring extreme optimization
  • Requires more setup and configuration compared to simpler libraries

Code Comparison

DeepSpeed:

import deepspeed

# args, model, and params are assumed to be defined earlier in the training script
model_engine, optimizer, _, _ = deepspeed.initialize(args=args,
                                                     model=model,
                                                     model_parameters=params)

generative-ai-python:

import google.generativeai as genai
genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel('gemini-pro')

DeepSpeed focuses on optimizing and scaling deep learning models, while generative-ai-python provides a simpler interface for accessing Google's Gemini models. DeepSpeed is more suitable for advanced users and large-scale projects, whereas generative-ai-python is designed for quick integration of generative AI capabilities.

Ongoing research training transformer models at scale

Pros of Megatron-LM

  • Designed for large-scale distributed training of transformer models
  • Supports model parallelism for efficient use of multiple GPUs
  • Optimized for NVIDIA hardware, potentially offering better performance

Cons of Megatron-LM

  • More complex setup and configuration required
  • Primarily focused on training, less emphasis on inference and deployment
  • Steeper learning curve for beginners

Code Comparison

Megatron-LM (model initialization):

model = get_model(
    model_provider_func,
    wrap_with_ddp=True,
    virtual_pipeline_model_parallel_size=None
)

generative-ai-python (model usage):

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content("Hello, Gemini!")

The Megatron-LM code focuses on model initialization for distributed training, while the generative-ai-python code demonstrates simple API usage for generating content with pre-trained models. Megatron-LM is more suited for researchers and advanced users working on large-scale model training, whereas generative-ai-python provides an accessible interface for developers to integrate Gemini models into their applications.

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Pros of gpt-neox

  • Open-source implementation of GPT-3-like models
  • Supports distributed training across multiple GPUs and nodes
  • Highly customizable architecture and training parameters

Cons of gpt-neox

  • More complex setup and configuration compared to generative-ai-python
  • Requires significant computational resources for training large models
  • Less user-friendly for beginners or those unfamiliar with deep learning frameworks

Code Comparison

gpt-neox:

from megatron.neox_arguments import NeoXArgs
from megatron.global_vars import set_global_variables, get_tokenizer
from megatron.neox_model import GPTNeoX

args = NeoXArgs.from_ymls("configs/your_config.yml")
model = GPTNeoX(args)

generative-ai-python:

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content("Hello, Gemini!")

The gpt-neox code snippet demonstrates the setup for a custom GPT-like model, while the generative-ai-python code shows a simpler API-based approach for using pre-trained Gemini models. gpt-neox offers more flexibility and control over the model architecture, but requires more setup and a deeper understanding of the underlying framework. generative-ai-python provides an easier-to-use interface for quickly integrating AI capabilities into applications, but with fewer customization options.

TensorFlow code and pre-trained models for BERT

Pros of BERT

  • Established and widely adopted in NLP tasks
  • Extensive documentation and community support
  • Pre-trained models available for various languages

Cons of BERT

  • Focused primarily on text understanding, not generation
  • Requires more computational resources for fine-tuning
  • Less suitable for multi-modal or creative AI tasks

Code Comparison

BERT example:

from transformers import BertTokenizer, BertModel
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)

Generative AI Python example:

import google.generativeai as genai
genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content("Write a short poem about AI")
print(response.text)

The BERT code focuses on tokenization and model initialization for text understanding, while the Generative AI Python code demonstrates a simpler interface for generating creative content using Gemini models.

README

Google AI Python SDK for the Gemini API

The Google AI Python SDK is the easiest way for Python developers to build with the Gemini API. The Gemini API gives you access to Gemini models created by Google DeepMind. Gemini models are built from the ground up to be multimodal, so you can reason seamlessly across text, images, and code.

Get started with the Gemini API

  1. Go to Google AI Studio.
  2. Log in with your Google account.
  3. Create an API key.
  4. Try a Python SDK quickstart in the Gemini API Cookbook.
  5. For detailed instructions, try the Python SDK tutorial on ai.google.dev.

Usage example

See the Gemini API Cookbook or ai.google.dev for complete code.

  1. Install from PyPI.

    pip install -U google-generativeai

  2. Import the SDK and configure your API key.

    import google.generativeai as genai
    import os

    genai.configure(api_key=os.environ["GEMINI_API_KEY"])

  3. Create a model and run a prompt.

    model = genai.GenerativeModel('gemini-1.5-flash')
    response = model.generate_content("The opposite of hot is")
    print(response.text)
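
The steps above send a single prompt. As a rough sketch, the same call can be wrapped to run several prompts through one model. `run_prompts` is a hypothetical helper, not part of the SDK; it relies only on `generate_content` and `.text` as used in the README. `EchoModel` is a stand-in so the sketch can be tried without an API key.

```python
def run_prompts(model, prompts):
    """Run each prompt through the model and pair it with the reply text."""
    results = []
    for prompt in prompts:
        response = model.generate_content(prompt)
        results.append((prompt, response.text))
    return results


class EchoModel:
    """Stand-in model that echoes the prompt, for offline illustration only."""

    class _Response:
        def __init__(self, text):
            self.text = text

    def generate_content(self, prompt):
        return self._Response(f"echo: {prompt}")
```

With the real SDK, `EchoModel()` would be replaced by `genai.GenerativeModel('gemini-1.5-flash')` after the configuration step.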

Documentation

See the Gemini API Cookbook or ai.google.dev for complete documentation.

Contributing

See Contributing for more information on contributing to the Google AI Python SDK.

License

The contents of this repository are licensed under the Apache License, version 2.0.