SLA and Guarantees

Our commitment to reliability, performance, and service quality for your sandbox environments.

At VaultSandbox, we understand that reliable infrastructure is critical for your development and testing workflows. Our Service Level Agreement (SLA) outlines our commitments to availability, performance, and support for your sandbox environments.

This SLA applies to all paid subscription tiers of the VaultSandbox platform. While we strive to maintain the same level of service for trial accounts, formal SLA commitments and service credits apply only to paid subscriptions.

SLA Summary

  • Control Plane Availability: 99.95%
  • Provisioning Success Rate: 99.9%
  • Support Response Time: 1 hour (critical)

Service Level Definitions

1. Control Plane Availability

The Control Plane includes our management API, web console, and core services that enable environment provisioning, management, and monitoring. This does not include the actual sandbox environments themselves, which are covered by separate guarantees.

Commitment:
  • 99.95% monthly uptime for the Control Plane
  • Measured as the percentage of successful API requests and web console availability
  • Excludes scheduled maintenance windows (announced at least 48 hours in advance)

Monthly Uptime Calculation:

(Total minutes in month - Downtime minutes) / Total minutes in month × 100%
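
As a worked illustration of the formula above, the following sketch computes monthly uptime and the downtime budget implied by the 99.95% target. The function names are illustrative and not part of any VaultSandbox API, and the 30-day month is an assumption chosen for the example.

```python
def monthly_uptime_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Apply the formula: (total - downtime) / total * 100."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    return (total_minutes - downtime_minutes) / total_minutes * 100


def downtime_budget_minutes(total_minutes: float, target_percent: float = 99.95) -> float:
    """Downtime allowed in the period before the target is missed."""
    return total_minutes * (100 - target_percent) / 100


minutes_in_30_day_month = 30 * 24 * 60  # 43,200 minutes (assumed month length)
print(round(downtime_budget_minutes(minutes_in_30_day_month), 1))     # 21.6
print(round(monthly_uptime_percent(minutes_in_30_day_month, 30), 3))  # 99.931
```

In this example, a 30-day month leaves roughly 21.6 minutes of unscheduled downtime before the 99.95% commitment is missed, so 30 minutes of downtime (99.931%) would fall short of it.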

2. Provisioning Success Rate

This metric measures the reliability of our environment provisioning system, which is critical for on-demand sandbox creation. It covers the success rate of environment creation requests that meet our documented specifications.

Commitment:
  • 99.9% success rate for environment provisioning requests
  • Measured as the percentage of successful environment creation operations
  • Excludes failures due to customer configuration errors or resource quota limitations

Provisioning Success Rate Calculation:

(Successful provisioning requests / Total valid provisioning requests) × 100%
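
The sketch below shows one way the exclusions could feed into the calculation: failures caused by customer configuration errors or quota limits are dropped from the denominator before the rate is computed. The outcome labels are hypothetical; the SLA does not define a record format.

```python
from collections import Counter

# Hypothetical outcome labels for excluded failures (not defined by the SLA).
EXCLUDED_OUTCOMES = {"customer_config_error", "quota_exceeded"}


def provisioning_success_rate(outcomes: list[str]) -> float:
    """Return successful / total valid requests * 100, ignoring excluded failures."""
    counts = Counter(outcomes)
    valid_total = sum(n for outcome, n in counts.items() if outcome not in EXCLUDED_OUTCOMES)
    if valid_total == 0:
        raise ValueError("no valid provisioning requests in the period")
    return counts["success"] / valid_total * 100


# 2,000 valid requests (3 platform failures) plus 10 excluded quota failures.
outcomes = ["success"] * 1997 + ["platform_failure"] * 3 + ["quota_exceeded"] * 10
print(f"{provisioning_success_rate(outcomes):.2f}%")  # 99.85%
```

Here the ten quota failures do not count against the platform, but the three platform failures bring the rate to 99.85%, just under the 99.9% commitment.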

3. Sandbox Environment Reliability

Once provisioned, sandbox environments should maintain stability and performance. This metric covers the reliability of running sandbox environments.

Commitment:
  • 99.5% availability for individual sandbox environments
  • Measured from the time of successful provisioning until scheduled termination
  • Excludes environments automatically terminated due to cost control policies

Environment Availability Calculation:

(Total running time - Unplanned downtime) / Total running time × 100%
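
For a single environment, the same calculation can be sketched from timestamps. The function below is an illustration only: the argument names are invented for the example, and it assumes the unplanned outage intervals do not overlap.

```python
from datetime import datetime, timedelta


def environment_availability(provisioned_at, terminated_at, outages):
    """(running time - unplanned downtime) / running time * 100.

    `outages` is a list of (start, end) datetime pairs for unplanned downtime,
    assumed not to overlap each other.
    """
    running = terminated_at - provisioned_at
    if running <= timedelta(0):
        raise ValueError("terminated_at must be after provisioned_at")
    downtime = sum((end - start for start, end in outages), timedelta(0))
    return (running - downtime) / running * 100


# A 72-hour environment with one 30-minute unplanned outage.
availability = environment_availability(
    datetime(2024, 3, 1, 9, 0),
    datetime(2024, 3, 4, 9, 0),
    [(datetime(2024, 3, 2, 14, 0), datetime(2024, 3, 2, 14, 30))],
)
print(round(availability, 2))  # 99.31
```

Because the measurement period is the environment's own lifetime, short-lived environments have a small downtime budget: at 99.5%, a 72-hour environment can absorb about 21.6 minutes of unplanned downtime.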

4. Snapshot and Restore Operations

The reliability and performance of snapshot and restore operations are critical for development workflows that depend on environment state management.

Commitments:
  • 99.9% success rate for snapshot operations
  • 99.9% success rate for restore operations
  • Maximum snapshot creation time: 5 minutes per 100GB
  • Maximum restore time: 10 minutes per 100GB

Success Rate Calculation:

(Successful operations / Total operations) × 100%
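
To make the time limits concrete, the sketch below scales the per-100GB figures to an arbitrary size. Linear scaling is an assumption; the SLA states the limits per 100GB but does not say how partial increments are treated.

```python
SNAPSHOT_MINUTES_PER_100GB = 5
RESTORE_MINUTES_PER_100GB = 10


def max_duration_minutes(size_gb: float, minutes_per_100gb: float) -> float:
    """Scale the per-100GB limit linearly with size (an assumption)."""
    return size_gb / 100 * minutes_per_100gb


size_gb = 250
print(max_duration_minutes(size_gb, SNAPSHOT_MINUTES_PER_100GB))  # 12.5
print(max_duration_minutes(size_gb, RESTORE_MINUTES_PER_100GB))   # 25.0
```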

5. API Response Time

For automation workflows, API performance is critical. This metric ensures that our API endpoints respond within acceptable timeframes.

Commitment:
  • 95% of API requests will complete within 500ms
  • 99% of API requests will complete within 1000ms
  • Excludes long-running operations (provisioning, snapshot, etc.) which have separate metrics

Performance Calculation:

Based on server-side response time measurements over a calendar month.
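
The two thresholds can be checked against a set of response-time samples as sketched below. This is an illustration under stated assumptions: the sample data is invented, and "within" is read as "at or below" the threshold.

```python
def fraction_within(latencies_ms: list[float], threshold_ms: float) -> float:
    """Share of requests whose response time is at or below the threshold."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    return sum(1 for ms in latencies_ms if ms <= threshold_ms) / len(latencies_ms)


def meets_latency_commitment(latencies_ms: list[float]) -> bool:
    """True when 95% finish within 500 ms and 99% within 1000 ms."""
    return (
        fraction_within(latencies_ms, 500) >= 0.95
        and fraction_within(latencies_ms, 1000) >= 0.99
    )


# Invented month of samples: 96% fast, 3.5% between 500 and 1000 ms, 0.5% slower.
samples = [120.0] * 960 + [700.0] * 35 + [1500.0] * 5
print(meets_latency_commitment(samples))  # True
```

Both conditions must hold for the same month; a month in which only 94% of requests finish within 500ms would miss the commitment even if every request finished within 1000ms.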

Service Level Objectives (SLOs) & Metrics

In addition to our SLA commitments, we maintain the following service level objectives that guide our engineering priorities and continuous improvement efforts.

Service Component | Metric | Target | Measurement Method
Environment Provisioning | Time to provision standard environment | < 3 minutes | Measured from API request to ready state
Environment Provisioning | Time to provision complex environment | < 10 minutes | Measured from API request to ready state
Snapshot Creation | Time to create environment snapshot | < 2 minutes per 50GB | Measured from API request to snapshot ready
Environment Restore | Time to restore from snapshot | < 5 minutes per 100GB | Measured from API request to environment ready
API Performance | Read operation latency (p95) | < 300ms | Server-side measurement
API Performance | Write operation latency (p95) | < 500ms | Server-side measurement
Cost Controls | Budget enforcement accuracy | 100% | No budget overruns without explicit override
Network Performance | Internal network throughput | > 10 Gbps | Between environments in same region

These targets represent our engineering goals and typical performance, but are not covered by the service credit system. We continuously monitor these metrics and strive to exceed these targets.

Incident Handling & Escalation

We take service disruptions seriously and have established clear procedures for incident response, communication, and resolution. Our incident management system ensures rapid response and transparent communication.

Incident Priority Levels

  • P1 (Critical): Service-wide outage or severe degradation affecting all customers. Response time: 15 minutes.
  • P2 (High): Partial service outage or significant performance degradation. Response time: 30 minutes.
  • P3 (Medium): Minor service degradation or isolated feature unavailability. Response time: 2 hours.
  • P4 (Low): Cosmetic issues or non-critical functionality affected. Response time: 1 business day.

Incident Communication

Initial Notification

For P1 and P2 incidents, we'll notify all affected customers within 30 minutes through our status page, email, and in-platform alerts.

Regular Updates

We provide hourly updates for P1 incidents and every 2 hours for P2 incidents until resolution or mitigation.

Resolution Notification

We'll notify you when the incident is resolved, including any necessary follow-up actions.

Post-Incident Report

For all P1 and P2 incidents, we publish a detailed post-mortem within 5 business days, including root cause analysis and preventive measures.
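
The notification and update cadence described above can be sketched as a simple schedule. This is an illustration, not a description of our incident tooling: the function name is invented, and it assumes the update clock starts at the initial-notification deadline, which the text does not specify.

```python
from datetime import datetime, timedelta

# Cadence taken from the text: P1 hourly, P2 every 2 hours.
UPDATE_INTERVAL = {"P1": timedelta(hours=1), "P2": timedelta(hours=2)}
INITIAL_NOTICE = timedelta(minutes=30)


def communication_schedule(priority: str, detected_at: datetime, resolved_at: datetime):
    """Return (initial-notice deadline, list of update times) for a P1/P2 incident.

    Assumes updates are counted from the initial-notice deadline.
    """
    if priority not in UPDATE_INTERVAL:
        raise ValueError("scheduled updates are only described for P1 and P2")
    first_notice = detected_at + INITIAL_NOTICE
    updates = []
    next_update = first_notice + UPDATE_INTERVAL[priority]
    while next_update < resolved_at:
        updates.append(next_update)
        next_update += UPDATE_INTERVAL[priority]
    return first_notice, updates


first_notice, updates = communication_schedule(
    "P1", datetime(2024, 3, 1, 10, 0), datetime(2024, 3, 1, 13, 15)
)
print(first_notice.time(), [u.time().isoformat("minutes") for u in updates])
```

For a P1 incident detected at 10:00 and resolved at 13:15, this yields an initial notice due by 10:30 and updates at 11:30 and 12:30, followed by the resolution notification.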

Credits & Remedies

If we fail to meet our SLA commitments, you are eligible for service credits according to the following schedule:

Control Plane Availability Credits

Monthly Uptime | Service Credit
99.0% - 99.94% | 10% of monthly fee
95.0% - 98.99% | 25% of monthly fee
< 95.0% | 50% of monthly fee

Provisioning Success Rate Credits

Success Rate | Service Credit
98.0% - 99.89% | 10% of monthly fee
95.0% - 97.99% | 25% of monthly fee
< 95.0% | 50% of monthly fee
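
Both credit schedules follow the same tiered pattern, which the sketch below expresses as a lookup. The boundary handling is an interpretation rather than contract language: any value at or above the target earns no credit, and a value that falls between two published bands (for example 99.945%) is placed in the band whose lower bound it meets.

```python
def service_credit_percent(
    measured: float,
    target: float,
    tiers: list[tuple[float, int]],
    floor_credit: int = 50,
) -> int:
    """Return the credit (% of monthly fee) for a measured uptime or success rate.

    `tiers` holds (lower bound, credit %) pairs ordered from best band to worst.
    """
    if measured >= target:
        return 0
    for lower_bound, credit in tiers:
        if measured >= lower_bound:
            return credit
    return floor_credit


CONTROL_PLANE_TIERS = [(99.0, 10), (95.0, 25)]  # target 99.95%
PROVISIONING_TIERS = [(98.0, 10), (95.0, 25)]   # target 99.9%

print(service_credit_percent(99.93, 99.95, CONTROL_PLANE_TIERS))  # 10
print(service_credit_percent(97.5, 99.9, PROVISIONING_TIERS))     # 25
print(service_credit_percent(94.0, 99.95, CONTROL_PLANE_TIERS))   # 50
```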

Credit Request Process

  1. Eligibility Period: Service credits must be requested within 30 days of the end of the billing cycle in which the SLA violation occurred.
  2. Request Process: Submit a credit request through your account dashboard or by contacting your account representative with details of the service disruption.
  3. Validation: We will review your request against our monitoring data to validate the SLA breach.
  4. Credit Application: Approved credits will be applied to an upcoming invoice, no later than 2 billing periods after approval.
  5. Credit Limitations: The maximum service credit for any billing month is 100% of the monthly fee. Credits cannot be exchanged for cash refunds.
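
The request window and the credit cap can be sketched as two small checks. Two assumptions are made here and are not stated in the process above: the 30-day window is treated as inclusive of the 30th day, and credits from different metrics in the same month are summed before the 100% cap is applied.

```python
from datetime import date, timedelta

REQUEST_WINDOW = timedelta(days=30)
MAX_CREDIT_PERCENT = 100


def request_is_timely(billing_cycle_end: date, requested_on: date) -> bool:
    """True if the request falls within 30 days of the billing cycle's end (inclusive)."""
    return requested_on <= billing_cycle_end + REQUEST_WINDOW


def credit_amount(monthly_fee: float, credit_percents: list[int]) -> float:
    """Sum credits for one billing month and cap the total at 100% of the fee.

    Summing across metrics is an assumption; the text only states the cap.
    """
    total_percent = min(sum(credit_percents), MAX_CREDIT_PERCENT)
    return monthly_fee * total_percent / 100


print(request_is_timely(date(2024, 3, 31), date(2024, 4, 30)))  # True
print(request_is_timely(date(2024, 3, 31), date(2024, 5, 1)))   # False
print(credit_amount(2000.0, [10, 25]))                          # 700.0
print(credit_amount(2000.0, [50, 50, 25]))                      # 2000.0
```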

Maintenance Windows

To ensure the reliability, security, and performance of our platform, we periodically perform scheduled maintenance. We strive to minimize disruption through careful planning and transparent communication.

Standard Maintenance Windows

Regular maintenance is scheduled during low-usage periods, typically Sundays between 01:00-05:00 UTC. We notify customers at least 48 hours in advance.
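
If you schedule automated jobs around maintenance, a check like the one below may help. It is a sketch that encodes the typical Sunday 01:00-05:00 UTC window and the 48-hour notice period from the text; actual windows are the ones announced in advance, and the function names are invented for the example. Timestamps are expected to be timezone-aware.

```python
from datetime import datetime, time, timedelta, timezone

WINDOW_START = time(1, 0)
WINDOW_END = time(5, 0)
MIN_NOTICE = timedelta(hours=48)


def in_standard_window(moment: datetime) -> bool:
    """True if a timezone-aware `moment` is on a Sunday between 01:00 and 05:00 UTC."""
    utc = moment.astimezone(timezone.utc)
    # datetime.weekday() numbers Monday as 0, so Sunday is 6.
    return utc.weekday() == 6 and WINDOW_START <= utc.time() < WINDOW_END


def notice_is_sufficient(announced_at: datetime, starts_at: datetime) -> bool:
    """True if the announcement came at least 48 hours before the start."""
    return starts_at - announced_at >= MIN_NOTICE


sunday_night = datetime(2024, 3, 10, 2, 30, tzinfo=timezone.utc)
print(in_standard_window(sunday_night))  # True
print(notice_is_sufficient(
    datetime(2024, 3, 8, 1, 0, tzinfo=timezone.utc),
    datetime(2024, 3, 10, 1, 0, tzinfo=timezone.utc),
))  # True
```

Converting to UTC first matters for teams in other time zones: 22:00 on Saturday at UTC-5 is 03:00 UTC on Sunday, which falls inside the typical window.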

Emergency Maintenance

In rare cases, we may need to perform emergency maintenance to address critical security vulnerabilities or prevent service degradation. We provide as much advance notice as possible.

Impact Minimization

We use rolling updates and redundant systems to minimize or eliminate service disruptions during maintenance whenever possible.

Notification Channels

Maintenance notifications are sent via email, displayed in the platform dashboard, and posted to our status page.

Maintenance Policies

Advance Notice

Standard maintenance: 48+ hours notice
Emergency maintenance: As soon as possible

Data Protection

All maintenance operations follow strict data protection protocols to ensure the integrity and security of customer data.

Support During Maintenance

Our support team remains available during all maintenance windows to address any customer concerns.

Maintenance Calendar

Customers can view upcoming scheduled maintenance in their account dashboard.

Guarantees Specific to Cost Controls and Data

Cost Control Guarantees

Our cost control mechanisms are designed to prevent unexpected charges and provide predictable infrastructure costs for your sandbox environments.

  • Budget Enforcement Guarantee: We guarantee that hard budget limits will never be exceeded without explicit override approval.
  • Auto-Shutdown Reliability: We guarantee 99.9% reliability for scheduled auto-shutdown operations to prevent runaway costs.
  • Budget Alert Timeliness: Budget threshold alerts will be delivered within 5 minutes of threshold breach.
  • Cost Reporting Accuracy: We guarantee 99.9% accuracy in cost reporting and resource usage metrics.
  • Cost Overrun Protection: In the event that our cost control systems fail to enforce a hard budget limit, we will credit you for any charges exceeding your defined limit.
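
The Cost Overrun Protection credit reduces to simple arithmetic, sketched below. The function and parameter names are illustrative, and the treatment of approved overrides (no credit) is an assumption based on the Budget Enforcement Guarantee.

```python
def overrun_credit(hard_limit: float, actual_charges: float, override_approved: bool = False) -> float:
    """Charges above the hard limit, credited when no override was approved."""
    if override_approved:
        return 0.0  # Assumed: spending above the limit was explicitly authorized.
    return max(0.0, actual_charges - hard_limit)


print(overrun_credit(500.0, 540.0))                          # 40.0
print(overrun_credit(500.0, 480.0))                          # 0.0
print(overrun_credit(500.0, 540.0, override_approved=True))  # 0.0
```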

Data Guarantees

We understand that even test data can be valuable. Our data management policies are designed to provide appropriate protections while enabling the flexibility of ephemeral environments.

  • Data Isolation: We guarantee complete data isolation between customer environments.
  • Snapshot Retention: We guarantee that snapshots will be retained according to your defined policies, up to the maximum retention period specified in your subscription.
  • Secure Deletion: We guarantee secure deletion of data when environments are destroyed, in compliance with industry standards.
  • Pre-Destruction Snapshot Reliability: If enabled, we guarantee 99.9% reliability for automatic pre-destruction snapshots.
  • Data Recovery Window: For ephemeral environments, we maintain a 24-hour recovery window for emergency data recovery requests, even after environment destruction.

Measurement & Transparency

We believe in transparent reporting and providing customers with visibility into our service performance. All SLA metrics are continuously monitored and made available to customers.

Real-time Dashboards

All customers have access to real-time service health dashboards showing current status and historical performance against SLA metrics.

Monthly SLA Reports

We provide detailed monthly reports showing our performance against all SLA commitments, with transparent calculation methodologies.

API Access to Metrics

All performance metrics are accessible through our API, allowing you to integrate service health monitoring into your own dashboards.

Historical Performance

We maintain 12 months of historical performance data, allowing you to track our reliability over time and plan accordingly.

SLA Exclusions

While we strive to maintain the highest levels of service availability and performance, certain circumstances fall outside the scope of our SLA:

  1. Force Majeure events beyond our reasonable control, including natural disasters, acts of war or terrorism, riots, labor disputes, or government actions.
  2. Scheduled maintenance windows that are announced at least 48 hours in advance.
  3. Customer-caused failures resulting from custom configurations, scripts, or code that do not follow our documented best practices.
  4. Exceeding resource quotas or service limits specified in your subscription plan.
  5. Third-party service integrations that are not directly controlled by VaultSandbox.
  6. Network issues outside our network boundary, including internet transit provider failures.
  7. Beta or preview features explicitly marked as such in our documentation.

Have Questions About Our SLA?

Our team is available to discuss our service level commitments and how they align with your specific requirements.