jazzband/tablib

Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.

4,680 stars, 596 forks

Top Related Projects

  • Records (7,191 stars): SQL for Humans™
  • csvkit (6,116 stars): A suite of utilities for converting to and working with CSV, the king of tabular file formats.
  • pandas (45,255 stars): Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
  • xarray (3,775 stars): N-D labeled arrays and datasets in Python

Quick Overview

Tablib is a powerful, format-agnostic tabular dataset library for Python. It allows you to import, export, and manipulate tabular data with ease, supporting various formats such as CSV, JSON, YAML, and Excel.

Pros

  • Supports multiple data formats (CSV, JSON, YAML, Excel, etc.)
  • Easy-to-use API for data manipulation and transformation
  • Extensible architecture allowing custom format support
  • Seamless integration with other Python libraries and frameworks

Cons

  • Limited support for large datasets (performance may degrade with very large files)
  • Some advanced Excel features are not fully supported
  • Documentation could be more comprehensive for complex use cases
  • Dependency on external libraries for certain formats (e.g., openpyxl for Excel)

Code Examples

  1. Creating a dataset and adding rows:
import tablib

data = tablib.Dataset(headers=['Name', 'Age', 'Country'])
data.append(['John Doe', 30, 'USA'])
data.append(['Jane Smith', 25, 'Canada'])
print(data)
  2. Exporting data to different formats:
# Export to CSV
csv_data = data.export('csv')

# Export to JSON
json_data = data.export('json')

# Export to Excel (returns bytes; needs the optional openpyxl dependency)
excel_data = data.export('xlsx')
  3. Importing data from a file:
with open('data.csv', 'r') as f:
    imported_data = tablib.Dataset().load(f.read(), format='csv')
print(imported_data)

Getting Started

To get started with Tablib, first install it using pip:

pip install tablib

Then, you can create a simple dataset and manipulate it:

import tablib

# Create a dataset
data = tablib.Dataset(headers=['Name', 'Age'])
data.append(['Alice', 28])
data.append(['Bob', 32])

# Add a column
data.append_col([True, False], header='Active')

# Export to CSV
csv_output = data.export('csv')
print(csv_output)

This example creates a dataset, adds some rows and a column, and then exports it to CSV format.

Competitor Comparisons

Records (7,191 stars)

SQL for Humans™

Pros of Records

  • Simpler API for database operations
  • Built-in SQL query support
  • Automatic connection management

Cons of Records

  • Limited to database operations
  • Less flexible data manipulation
  • Export relies on Tablib rather than on its own code

Code Comparison

Records:

import records

db = records.Database('postgresql://...')
rows = db.query('SELECT * FROM users')
print(rows.export('csv'))

Tablib:

import tablib

data = tablib.Dataset()
data.append(['John', 'Doe', 30])
data.append(['Jane', 'Smith', 25])
print(data.export('csv'))

Key Differences

  • Records focuses on database operations, while Tablib is for general data manipulation
  • Records delegates export to Tablib, so both reach the same formats (JSON, YAML, XLS, etc.)
  • Records provides a more streamlined approach for working with databases
  • Tablib allows for more flexible data structuring and manipulation

Use Cases

Records is ideal for:

  • Quick database queries and exports
  • Simple data analysis from databases

Tablib is better for:

  • Working with various data formats
  • Creating and manipulating datasets from multiple sources
  • Exporting data to multiple formats

Both libraries have their strengths, and the choice depends on the specific requirements of your project.

csvkit (6,116 stars)

A suite of utilities for converting to and working with CSV, the king of tabular file formats.

Pros of csvkit

  • Comprehensive command-line toolkit for CSV operations
  • Supports a wide range of CSV manipulations and transformations
  • Integrates well with Unix-style pipelines and shell scripts

Cons of csvkit

  • Limited support for other data formats beyond CSV
  • Steeper learning curve for users unfamiliar with command-line tools
  • Less suitable for programmatic use within Python applications

Code Comparison

csvkit (command-line usage):

csvcut -c 1,3 data.csv | csvgrep -c 1 -m "pattern" | csvsort -c 2

tablib (Python code):

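import tablib
with open('data.csv') as f:
    data = tablib.Dataset().load(f.read(), format='csv')
# Dataset.filter() selects rows by tag, so filter with a comprehension instead
rows = [row for row in data if 'pattern' in row[0]]
filtered = tablib.Dataset(*rows, headers=data.headers)
sorted_data = filtered.sort('column3')  # 'column3' is a placeholder header name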

Summary

csvkit excels in command-line CSV processing, offering powerful tools for data manipulation. It's ideal for shell scripting and quick data transformations. tablib, on the other hand, provides a more Pythonic approach to working with tabular data, supporting multiple formats and offering an intuitive API for in-memory data manipulation. While csvkit is more specialized for CSV operations, tablib offers greater flexibility in handling various data formats and integrating with Python applications.

pandas (45,255 stars)

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

Pros of pandas

  • More comprehensive data manipulation and analysis capabilities
  • Highly optimized for performance with large datasets
  • Extensive documentation and community support

Cons of pandas

  • Steeper learning curve for beginners
  • Larger library size and memory footprint
  • Can be overkill for simple data tasks

Code Comparison

tablib example:

import tablib

data = tablib.Dataset()
data.headers = ['Name', 'Age']
data.append(['Alice', 30])
data.append(['Bob', 25])
print(data.export('csv'))

pandas example:

import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [30, 25]})
print(df.to_csv(index=False))

Both libraries allow for easy data manipulation and export, but pandas offers more advanced features for complex data analysis tasks. tablib is more lightweight and focused on simple data operations, making it easier to use for basic tasks. pandas excels in handling large datasets and provides powerful tools for data cleaning, transformation, and analysis, but may be excessive for simpler use cases.

xarray (3,775 stars)

N-D labeled arrays and datasets in Python

Pros of xarray

  • Designed for working with multi-dimensional labeled arrays and datasets
  • Powerful data analysis capabilities, especially for scientific and geospatial data
  • Integrates well with other scientific Python libraries like NumPy and pandas

Cons of xarray

  • Steeper learning curve due to more complex data structures and operations
  • May be overkill for simple tabular data tasks
  • Larger library size and potentially slower performance for basic operations

Code Comparison

xarray:

import xarray as xr

data = xr.DataArray(
    [[1, 2, 3], [4, 5, 6]],
    dims=("x", "y"),
    coords={"x": [10, 20], "y": [100, 200, 300]}
)
result = data.mean(dim="y")

tablib:

import tablib

data = tablib.Dataset()
data.append([1, 2, 3])
data.append([4, 5, 6])
data.headers = ['A', 'B', 'C']
result = data.export('csv')

xarray is more suitable for complex, multi-dimensional data analysis, while tablib excels at simple tabular data manipulation and export to various formats. xarray offers more advanced features but requires more setup, whereas tablib provides a straightforward interface for basic data handling tasks.

README

Tablib: format-agnostic tabular dataset library

_____         ______  ___________ ______
__  /_______ ____  /_ ___  /___(_)___  /_
_  __/_  __ `/__  __ \__  / __  / __  __ \
/ /_  / /_/ / _  /_/ /_  /  _  /  _  /_/ /
\__/  \__,_/  /_.___/ /_/   /_/   /_.___/

Tablib is a format-agnostic tabular dataset library, written in Python.

Output formats supported:

  • Excel (Sets + Books)
  • JSON (Sets + Books)
  • YAML (Sets + Books)
  • Pandas DataFrames (Sets)
  • HTML (Sets)
  • Jira (Sets)
  • LaTeX (Sets)
  • TSV (Sets)
  • ODS (Sets)
  • CSV (Sets)
  • DBF (Sets)

Note that tablib purposefully excludes XML support. It always will. (Note: This is a joke. Pull requests are welcome.)

Tablib documentation is graciously hosted on https://tablib.readthedocs.io

It is also available in the docs directory of the source distribution.

Make sure to check out Tablib on PyPI!

Contribute

Please see the contributing guide.