Top Related Projects
Cron for containers
a cron library for go
Python job scheduling for humans.
Easy and fluent Go cron scheduling. This is a fork from https://github.com/jasonlvhit/gocron
A Distributed, Fault-Tolerant Cron-Style Job System.
Quick Overview
pg_cron is an extension for PostgreSQL that enables database administrators to schedule PostgreSQL commands directly from the database. It allows for the creation of job schedules similar to cron, but with the added benefit of running them inside the database, providing access to all database objects.
Pros
- Runs jobs directly inside the PostgreSQL database, eliminating the need for external scheduling tools
- Supports flexible scheduling options, including cron-like syntax and custom intervals
- Integrates seamlessly with PostgreSQL, allowing access to all database objects and functions
- Provides a simple interface for managing scheduled jobs through SQL commands
Cons
- Limited to PostgreSQL databases only, not suitable for other database systems
- Requires superuser privileges to install and manage, which may pose security concerns in some environments
- May impact database performance if complex or resource-intensive jobs are scheduled frequently
- Limited built-in monitoring and alerting capabilities for job execution status
Code Examples
- Creating a simple scheduled job:
SELECT cron.schedule('My daily job', '0 0 * * *', 'DELETE FROM old_data WHERE created_at < NOW() - INTERVAL ''1 month''');
This example schedules a job to run daily at midnight, deleting old data from a table.
- Scheduling a job with a custom interval:
SELECT cron.schedule('My hourly job', '@hourly', 'REFRESH MATERIALIZED VIEW my_view');
This job refreshes a materialized view every hour.
- Listing all scheduled jobs:
SELECT * FROM cron.job;
This query displays all currently scheduled jobs in the database.
- Removing a scheduled job:
SELECT cron.unschedule('My daily job');
This command removes the previously scheduled 'My daily job' from the cron schedule.
Getting Started
- Install the pg_cron extension (requires superuser privileges):
CREATE EXTENSION pg_cron;
- Enable the extension in postgresql.conf:
shared_preload_libraries = 'pg_cron'
cron.database_name = 'your_database_name'
-
Restart the PostgreSQL server.
-
Schedule your first job:
SELECT cron.schedule('My first job', '*/5 * * * *', 'SELECT now()');
This creates a job that runs every 5 minutes, simply selecting the current timestamp.
Competitor Comparisons
Cron for containers
Pros of supercronic
- Language-agnostic: Can run any executable, not limited to PostgreSQL
- Lightweight and containerized: Easy to deploy in Docker environments
- Supports environment variable expansion in cron expressions
Cons of supercronic
- Lacks direct database integration: Cannot directly execute SQL queries
- No built-in logging to database: Requires additional setup for persistent logs
- Less seamless for PostgreSQL-specific tasks compared to pg_cron
Code Comparison
pg_cron:
SELECT cron.schedule('0 1 * * *', $$
DELETE FROM events WHERE event_time < now() - interval '1 month'
$$);
supercronic:
# In a crontab file
0 1 * * * /usr/local/bin/cleanup_events.sh
Key Differences
pg_cron is tightly integrated with PostgreSQL, allowing direct scheduling and execution of SQL queries within the database. It's ideal for database-centric tasks and benefits from PostgreSQL's security model.
supercronic is a standalone cron-like job scheduler designed for containerized environments. It's more versatile for general-purpose tasks but requires additional setup for database operations.
Choose pg_cron for PostgreSQL-specific scheduling needs, and supercronic for broader, container-friendly job scheduling across various applications and languages.
a cron library for go
Pros of cron
- Language-agnostic: Can be used with any Go application
- Flexible scheduling: Supports complex cron expressions
- In-memory execution: Doesn't require a separate database
Cons of cron
- Requires application-level implementation
- Less integrated with database operations
- May need additional setup for persistence and fault tolerance
Code Comparison
pg_cron:
SELECT cron.schedule('0 1 * * *', $$
DELETE FROM events WHERE event_time < now() - interval '1 month'
$$);
cron:
c := cron.New()
c.AddFunc("0 1 * * *", func() {
// Delete old events
})
c.Start()
Key Differences
- pg_cron is specifically designed for PostgreSQL, while cron is a Go library
- pg_cron runs jobs within the database, cron runs jobs in the application
- pg_cron uses SQL for job definitions, cron uses Go functions
- pg_cron leverages database features for persistence and scheduling, cron requires additional implementation for these features
Use Cases
- pg_cron: Ideal for database-centric tasks and PostgreSQL environments
- cron: Better suited for application-level scheduling in Go projects, especially when database independence is required
Both tools offer robust scheduling capabilities, but their integration and use cases differ significantly based on the project's architecture and requirements.
Python job scheduling for humans.
Pros of schedule
- Pure Python implementation, making it easy to integrate into existing Python projects
- Flexible scheduling options, including support for complex cron-like expressions
- Lightweight and doesn't require a database or external dependencies
Cons of schedule
- Limited to running jobs within the Python process, not system-wide
- Requires the Python script to be running continuously for scheduled tasks to execute
- No built-in persistence or job history tracking
Code Comparison
schedule:
import schedule
import time
def job():
print("I'm working...")
schedule.every(10).minutes.do(job)
while True:
schedule.run_pending()
time.sleep(1)
pg_cron:
SELECT cron.schedule('0 1 * * *', $$
DELETE FROM events WHERE event_time < now() - interval '1 week'
$$);
Key Differences
- pg_cron is specifically designed for PostgreSQL, while schedule is a general-purpose Python library
- pg_cron runs jobs within the database, whereas schedule runs jobs in a Python environment
- pg_cron uses SQL syntax for job definitions, while schedule uses Python function calls
- pg_cron provides database-level persistence and scheduling, while schedule requires a running Python process
Easy and fluent Go cron scheduling. This is a fork from https://github.com/jasonlvhit/gocron
Pros of gocron
- Language-agnostic: Can be used in any Go application, not limited to PostgreSQL
- More flexible scheduling options, including custom intervals and time zones
- Easier to integrate into existing Go applications without database dependencies
Cons of gocron
- Requires running a separate Go application alongside the database
- Lacks direct integration with PostgreSQL, potentially complicating data access
- May require additional setup for distributed environments
Code Comparison
pg_cron:
SELECT cron.schedule('0 1 * * *', $$
DELETE FROM events WHERE event_time < now() - interval '1 month'
$$);
gocron:
s := gocron.NewScheduler(time.UTC)
s.Cron("0 1 * * *").Do(func() {
db.Exec("DELETE FROM events WHERE event_time < now() - interval '1 month'")
})
s.StartAsync()
Both examples schedule a job to run at 1 AM daily, deleting old events. pg_cron uses SQL directly within PostgreSQL, while gocron requires Go code and a separate connection to the database.
A Distributed, Fault-Tolerant Cron-Style Job System.
Pros of cronsun
- Distributed system with web UI for managing jobs across multiple nodes
- Supports multiple programming languages and shell scripts
- Offers more advanced features like job dependencies and retries
Cons of cronsun
- More complex setup and maintenance compared to pg_cron
- Requires additional infrastructure (etcd, web server) to run
Code comparison
pg_cron:
SELECT cron.schedule('0 1 * * *', $$
DELETE FROM events WHERE event_time < now() - interval '1 month'
$$);
cronsun:
{
"name": "cleanup_events",
"command": "DELETE FROM events WHERE event_time < now() - interval '1 month'",
"schedule": "0 1 * * *",
"user": "postgres"
}
Summary
pg_cron is a PostgreSQL extension that provides simple, database-integrated job scheduling. It's easy to set up and use within PostgreSQL but limited to database operations.
cronsun is a distributed job scheduling system that offers more flexibility and features. It can run various types of jobs across multiple nodes but requires more complex setup and management.
Choose pg_cron for straightforward database-centric scheduling, and cronsun for more advanced, distributed job management needs.
Convert
designs to code with AI
Introducing Visual Copilot: A new AI model to turn Figma designs to high quality code using your components.
Try Visual CopilotREADME
What is pg_cron?
pg_cron is a simple cron-based job scheduler for PostgreSQL (10 or higher) that runs inside the database as an extension. It uses the same syntax as regular cron, but it allows you to schedule PostgreSQL commands directly from the database. You can also use '[1-59] seconds' to schedule a job based on an interval.
pg_cron also allows you using '$' to indicate last day of the month.
-- Delete old data on Saturday at 3:30am (GMT)
SELECT cron.schedule('30 3 * * 6', $$DELETE FROM events WHERE event_time < now() - interval '1 week'$$);
schedule
----------
42
-- Vacuum every day at 10:00am (GMT)
SELECT cron.schedule('nightly-vacuum', '0 10 * * *', 'VACUUM');
schedule
----------
43
-- Change to vacuum at 3:00am (GMT)
SELECT cron.schedule('nightly-vacuum', '0 3 * * *', 'VACUUM');
schedule
----------
43
-- Stop scheduling jobs
SELECT cron.unschedule('nightly-vacuum' );
unschedule
------------
t
SELECT cron.unschedule(42);
unschedule
------------
t
-- Vacuum every Sunday at 4:00am (GMT) in a database other than the one pg_cron is installed in
SELECT cron.schedule_in_database('weekly-vacuum', '0 4 * * 0', 'VACUUM', 'some_other_database');
schedule
----------
44
-- Call a stored procedure every 5 seconds
SELECT cron.schedule('process-updates', '5 seconds', 'CALL process_updates()');
-- Process payroll at 12:00 of the last day of each month
SELECT cron.schedule('process-payroll', '0 12 $ * *', 'CALL process_payroll()');
pg_cron can run multiple jobs in parallel, but it runs at most one instance of a job at a time. If a second run is supposed to start before the first one finishes, then the second run is queued and started as soon as the first run completes.
The schedule uses the standard cron syntax, in which * means "run every time period", and a specific number means "but only at this time":
ââââââââââââââ min (0 - 59)
â âââââââââââââââ hour (0 - 23)
â â ââââââââââââââââ day of month (1 - 31) or last day of the month ($)
â â â âââââââââââââââââ month (1 - 12)
â â â â ââââââââââââââââââ day of week (0 - 6) (0 to 6 are Sunday to
â â â â â Saturday, or use names; 7 is also Sunday)
â â â â â
â â â â â
* * * * *
An easy way to create a cron schedule is: crontab.guru.
The code in pg_cron that handles parsing and scheduling comes directly from the cron source code by Paul Vixie, hence the same options are supported.
Installing pg_cron
Install on Red Hat, CentOS, Fedora, Amazon Linux with PostgreSQL 16 using PGDG:
# Install the pg_cron extension
sudo yum install -y pg_cron_16
Install on Debian, Ubuntu with PostgreSQL 16 using apt.postgresql.org:
# Install the pg_cron extension
sudo apt-get -y install postgresql-16-cron
You can also install pg_cron by building it from source:
git clone https://github.com/citusdata/pg_cron.git
cd pg_cron
# Ensure pg_config is in your path, e.g.
export PATH=/usr/pgsql-16/bin:$PATH
make && sudo PATH=$PATH make install
Setting up pg_cron
To start the pg_cron background worker when PostgreSQL starts, you need to add pg_cron to shared_preload_libraries
in postgresql.conf. Note that pg_cron does not run any jobs as a long a server is in hot standby mode, but it automatically starts when the server is promoted.
# add to postgresql.conf
# required to load pg_cron background worker on start-up
shared_preload_libraries = 'pg_cron'
By default, the pg_cron background worker expects its metadata tables to be created in the "postgres" database. However, you can configure this by setting the cron.database_name
configuration parameter in postgresql.conf.
# add to postgresql.conf
# optionally, specify the database in which the pg_cron background worker should run (defaults to postgres)
cron.database_name = 'postgres'
pg_cron
may only be installed to one database in a cluster. If you need to run jobs in multiple databases, use cron.schedule_in_database()
.
Previously pg_cron could only use GMT time, but now you can adapt your time by setting cron.timezone
in postgresql.conf.
# add to postgresql.conf
# optionally, specify the timezone in which the pg_cron background worker should run (defaults to GMT). E.g:
cron.timezone = 'PRC'
After restarting PostgreSQL, you can create the pg_cron functions and metadata tables using CREATE EXTENSION pg_cron
.
-- run as superuser:
CREATE EXTENSION pg_cron;
-- optionally, grant usage to regular users:
GRANT USAGE ON SCHEMA cron TO marco;
Ensuring pg_cron can start jobs
Important: By default, pg_cron uses libpq to open a new connection to the local database, which needs to be allowed by pg_hba.conf.
It may be necessary to enable trust
authentication for connections coming from localhost in for the user running the cron job, or you can add the password to a .pgpass file, which libpq will use when opening a connection.
You can also use a unix domain socket directory as the hostname and enable trust
authentication for local connections in pg_hba.conf, which is normally safe:
# Connect via a unix domain socket:
cron.host = '/tmp'
# Can also be an empty string to look for the default directory:
cron.host = ''
Alternatively, pg_cron can be configured to use background workers. In that case, the number of concurrent jobs is limited by the max_worker_processes
setting, so you may need to raise that.
# Schedule jobs via background workers instead of localhost connections
cron.use_background_workers = on
# Increase the number of available background workers from the default of 8
max_worker_processes = 20
For security, jobs are executed in the database in which the cron.schedule
function is called with the same permissions as the current user. In addition, users are only able to see their own jobs in the cron.job
table.
-- View active jobs
select * from cron.job;
Viewing job run details
You can view the status of running and recently completed job runs in the cron.job_run_details
:
select * from cron.job_run_details order by start_time desc limit 5;
âââââââââ¬ââââââââ¬ââââââââââ¬âââââââââââ¬âââââââââââ¬ââââââââââââââââââââ¬ââââââââââââ¬âââââââââââââââââââ¬ââââââââââââââââââââââââââââââââ¬ââââââââââââââââââââââââââââââââ
â jobid â runid â job_pid â database â username â command â status â return_message â start_time â end_time â
âââââââââ¼ââââââââ¼ââââââââââ¼âââââââââââ¼âââââââââââ¼ââââââââââââââââââââ¼ââââââââââââ¼âââââââââââââââââââ¼ââââââââââââââââââââââââââââââââ¼ââââââââââââââââââââââââââââââââ¤
â 10 â 4328 â 2610 â postgres â marco â select process() â succeeded â SELECT 1 â 2023-02-07 09:30:00.098164+01 â 2023-02-07 09:30:00.130729+01 â
â 10 â 4327 â 2609 â postgres â marco â select process() â succeeded â SELECT 1 â 2023-02-07 09:29:00.015168+01 â 2023-02-07 09:29:00.832308+01 â
â 10 â 4321 â 2603 â postgres â marco â select process() â succeeded â SELECT 1 â 2023-02-07 09:28:00.011965+01 â 2023-02-07 09:28:01.420901+01 â
â 10 â 4320 â 2602 â postgres â marco â select process() â failed â server restarted â 2023-02-07 09:27:00.011833+01 â 2023-02-07 09:27:00.72121+01 â
â 9 â 4320 â 2602 â postgres â marco â select do_stuff() â failed â job canceled â 2023-02-07 09:26:00.011833+01 â 2023-02-07 09:26:00.22121+01 â
âââââââââ´ââââââââ´ââââââââââ´âââââââââââ´âââââââââââ´ââââââââââââââââââââ´ââââââââââââ´âââââââââââââââââââ´ââââââââââââââââââââââââââââââââ´ââââââââââââââââââââââââââââââââ
(10 rows)
The records in cron.job_run_details
are not cleaned automatically, but every user that can schedule cron jobs also has permission to delete their own cron.job_run_details
records.
Especially when you have jobs that run every few seconds, it can be a good idea to clean up regularly, which can easily be done using pg_cron itself:
-- Delete old cron.job_run_details records of the current user every day at noon
SELECT cron.schedule('delete-job-run-details', '0 12 * * *', $$DELETE FROM cron.job_run_details WHERE end_time < now() - interval '7 days'$$);
If you do not want to use cron.job_run_details
at all, then you can add cron.log_run = off
to postgresql.conf
.
Example use cases
Articles showing possible ways of using pg_cron:
- Auto-partitioning using pg_partman
- Computing rollups in an analytical dashboard
- Deleting old data, vacuum
- Feeding cats
- Routinely invoking a function
- Postgres as a cron server
Managed services
The following table keeps track of which of the major managed Postgres services support pg_cron.
Service | Supported |
---|---|
Aiven | :heavy_check_mark: |
Alibaba Cloud | :heavy_check_mark: |
Amazon RDS | :heavy_check_mark: |
Azure | :heavy_check_mark: |
Crunchy Bridge | :heavy_check_mark: |
DigitalOcean | :heavy_check_mark: |
Google Cloud | :heavy_check_mark: |
Heroku | :x: |
Instaclustr | :heavy_check_mark: |
ScaleGrid | :heavy_check_mark: |
Scaleway | :heavy_check_mark: |
Supabase | :heavy_check_mark: |
Tembo | :heavy_check_mark: |
YugabyteDB | :heavy_check_mark: |
Code of Conduct
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
Top Related Projects
Cron for containers
a cron library for go
Python job scheduling for humans.
Easy and fluent Go cron scheduling. This is a fork from https://github.com/jasonlvhit/gocron
A Distributed, Fault-Tolerant Cron-Style Job System.
Convert
designs to code with AI
Introducing Visual Copilot: A new AI model to turn Figma designs to high quality code using your components.
Try Visual Copilot