Scheduling ACE Diff Operations (Beta)
ACE supports automated scheduling of table-diff and repset-diff operations through configuration settings in `ace_config.py`. The job scheduler allows you to perform regular consistency checks without manual intervention.
Use properties in the ACE Background Service Options section of the `ace_config.py` file to specify general background service preferences:
```python
# ACE Background Service Options
LISTEN_ADDRESS = "0.0.0.0"
LISTEN_PORT = 5000

# Smallest interval that can be used for any ACE background service
MIN_RUN_FREQUENCY = timedelta(minutes=5)
```
- `LISTEN_ADDRESS` (default: `"0.0.0.0"`): The network address ACE binds to when started as a background process.
- `LISTEN_PORT` (default: `5000`): The port ACE listens on when started as a background process.
- `MIN_RUN_FREQUENCY` (default: `timedelta(minutes=5)`): The minimum interval between consecutive runs of a background job. This value can be set using any `timedelta` unit, such as seconds, minutes, or hours. For example, if `MIN_RUN_FREQUENCY` is set to 5 minutes, no job can be scheduled to run more frequently than once every 5 minutes.
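For instance, since `ace_config.py` already has `timedelta` in scope, you could change the floor by editing the variable directly (the value below is illustrative, not a recommendation):

```python
from datetime import timedelta  # already imported in ace_config.py

# Illustrative override: allow background jobs to run as often as
# every 30 seconds instead of the 5-minute default
MIN_RUN_FREQUENCY = timedelta(seconds=30)
```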
Additionally, use properties in the following sections to define jobs and schedules for their execution.
Scheduling a Job
The `ace_config.py` file (located by default in `$PGEDGE_HOME/hub/scripts/`) contains information about jobs and their schedules in two JSON-formatted sections. First, use the following property:value pairs in the `schedule_jobs` section to define jobs:
Job Configuration Options
Each job in `schedule_jobs` supports:
- `name` (required): Unique identifier for the job
- `cluster_name` (required): Name of the cluster
- `table_name` OR `repset_name` (required): Fully qualified table name or repset name
- `args` (optional): Dictionary of table-diff parameters:
  - `max_cpu_ratio`: Maximum CPU usage ratio
  - `batch_size`: Batch size for processing
  - `block_rows`: Number of rows per block
  - `table_filter`: SQL `WHERE` clause used to filter rows for comparison
  - `nodes`: Nodes to include
  - `output`: Output format (`"json"`, `"csv"`, or `"html"`)
  - `quiet`: Suppress output
  - `dbname`: Database name
For Example
```python
# Define the jobs
schedule_jobs = [
    {
        "name": "t1",
        "cluster_name": "my_cluster",
        "table_name": "public.users"
    },
    {
        "name": "t2",
        "cluster_name": "my_cluster",
        "table_name": "public.orders",
        "args": {
            "max_cpu_ratio": 0.7,
            "batch_size": 1000,
            "block_rows": 10000,
            "nodes": "all",
            "output": "json",
            "quiet": False,
            "dbname": "mydb"
        }
    }
]
```
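`repset_name` and `table_filter` follow the same pattern. For instance, a repset-diff job and a filtered table-diff job might look like this (the repset name, table name, and filter below are hypothetical):

```python
schedule_jobs = [
    {
        # Hypothetical repset-diff job: diffs every table in the repset
        "name": "rs1",
        "cluster_name": "my_cluster",
        "repset_name": "default_repset"
    },
    {
        # Hypothetical filtered table-diff job: compares only recent rows
        "name": "t3",
        "cluster_name": "my_cluster",
        "table_name": "public.events",
        "args": {
            "table_filter": "created_at > now() - interval '1 day'"
        }
    }
]
```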
Then, use the property:value pairs in the `schedule_config` section to define the schedule for each job:
Schedule Configuration Options
Each schedule in `schedule_config` supports:
- `job_name` (required): Name of the job to schedule (must match a job name in `schedule_jobs`)
- `crontab_schedule`: Cron-style schedule expression
  - Cron format: `* * * * *` (minute hour day_of_month month day_of_week)
  - Examples:
    - `0 0 * * *`: Daily at midnight
    - `0 */4 * * *`: Every 4 hours
    - `0 0 * * 0`: Weekly on Sunday
- `run_frequency`: Alternative to crontab, using time units (e.g., "30s", "5m", "1h")
  - Run frequency format: `<number><unit>`
  - Units: `"s"` (seconds), `"m"` (minutes), `"h"` (hours)
  - Minimum: 5 minutes (governed by `MIN_RUN_FREQUENCY`)
  - Examples:
    - `"30s"`: Every 30 seconds
    - `"5m"`: Every 5 minutes
    - `"1h"`: Every hour
- `enabled`: Whether the schedule is active (default: `False`)
- `rerun_after`: Time to wait before rerunning if differences are found
For Example
```python
schedule_config = [
    {
        "job_name": "t1",
        "crontab_schedule": "0 0 * * *",  # Run at midnight
        "run_frequency": "30s",           # Alternative to crontab
        "enabled": True,
        "rerun_after": "1h"               # Rerun 1 hour after differences are found
    },
    {
        "job_name": "t2",
        "crontab_schedule": "0 */4 * * *",  # Every 4 hours
        "run_frequency": "5m",              # Alternative to crontab
        "enabled": True,
        "rerun_after": "30m"
    }
]
```
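To illustrate how the `<number><unit>` strings relate to the `timedelta` floor, here is a minimal standalone sketch of a parser for this format (this is not ACE's internal implementation):

```python
from datetime import timedelta

# Map each unit suffix to a timedelta keyword argument
UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}
MIN_RUN_FREQUENCY = timedelta(minutes=5)

def parse_run_frequency(value: str) -> timedelta:
    """Parse a '<number><unit>' string such as '30s', '5m', or '1h'."""
    number, unit = int(value[:-1]), value[-1]
    if unit not in UNITS:
        raise ValueError(f"unknown unit in {value!r}")
    frequency = timedelta(**{UNITS[unit]: number})
    if frequency < MIN_RUN_FREQUENCY:
        raise ValueError(f"{value!r} is below MIN_RUN_FREQUENCY")
    return frequency

print(parse_run_frequency("5m"))  # 0:05:00
```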
Starting and Stopping the Scheduler
The scheduler starts automatically when ACE is started:

```sh
./pgedge ace start
```

To stop the scheduler:

```sh
./pgedge ace stop
```
Best Practices
- Resource Management:
  - Stagger schedules to avoid overlapping resource-intensive jobs (see the sketch below)
  - Set appropriate `max_cpu_ratio`, `block_rows`, and `batch_size` values based on the table size and expected load
- Frequency Selection:
  - Use `crontab_schedule` for specific times
  - Use `run_frequency` for regular intervals
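For instance, two resource-intensive nightly jobs could be offset by 30 minutes rather than both firing at midnight (the times below are illustrative):

```python
schedule_config = [
    {"job_name": "t1", "crontab_schedule": "0 1 * * *", "enabled": True},   # 1:00 AM
    {"job_name": "t2", "crontab_schedule": "30 1 * * *", "enabled": True},  # 1:30 AM
]
```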
Scheduling Auto-Repair Jobs (Beta)
The auto-repair module monitors the Spock exception log for `INSERT`-`INSERT` exceptions and repairs tables whose data has diverged. It runs as a background process, periodically checking for inconsistencies and applying repairs based on the configured settings.
To enable auto-repair, specify your auto-repair preferences in `ace_config.py`:
```python
auto_repair_config = {
    "enabled": False,
    "cluster_name": "eqn-t9da",
    "dbname": "demo",
    "poll_frequency": "10m",
    "repair_frequency": "15m"
}
```
Configuration Options
- `enabled`: Enable/disable auto-repair functionality (default: `False`)
- `cluster_name`: Name of the cluster to monitor
- `dbname`: Database name to monitor
- `poll_frequency`: How often the Spock exception log is polled to check for new exceptions
- `repair_frequency`: How often detected exceptions are repaired
Time Intervals
You can specify the time intervals for execution in either cron format or in a simple frequency format. Both `poll_frequency` and `repair_frequency` accept time strings in the following formats:
Cron format: `* * * * *` (minute hour day_of_month month day_of_week); for example:

- `0 0 * * *`: Daily at midnight
- `0 */4 * * *`: Every 4 hours
- `0 0 * * 0`: Weekly on Sunday
Run frequency format: `<number><unit>`; for example:

- Units: `"s"` (seconds), `"m"` (minutes), `"h"` (hours)
- Minimum: 5 minutes
- Examples:
  - `"30s"`: Every 30 seconds
  - `"5m"`: Every 5 minutes
  - `"1h"`: Every hour
Note: The minimum frequency allowed is 5 minutes. However, you can modify that time by editing the `MIN_RUN_FREQUENCY` variable in `ace_config.py`.
Controlling the Auto-Repair Daemon
The auto-repair daemon starts automatically when ACE is started:

```sh
./pgedge ace start
```

To stop the auto-repair daemon:

```sh
./pgedge ace stop
```
Common Use Cases
Auto-repair is a good candidate for use cases with a high probability of `INSERT`-`INSERT` conflicts. For example, on bidding and reservation servers, `INSERT`-`INSERT` conflicts are likely to arise across multiple nodes.
Limitations and Considerations
- The auto-repair daemon is currently limited to handling `INSERT`-`INSERT` conflicts only.