FME Flow Hosted Support Policy
The new FME Flow Hosted support plans explicitly define what FME Flow Hosted customers can expect from Safe Software when they run services on FME Flow Hosted. By default all customers are on the Included support plan. If you wish to receive additional support, then you need to purchase a Managed support plan from an FME Flow Hosted Managed Service Provider (also called an “MSP Partner”).
The main FME Flow Hosted pricing page outlines the basic differences between the Included support plan and the Managed support plan.
Support Channels
There are three primary ways for customers to get support: checking the FME Flow Hosted Status page, getting help from the community, or submitting a private support case.
FME Flow Hosted Status
The Safe Software status page contains information on current production status and will be updated during an outage. After an outage, a root-cause analysis is performed and made available via the page.
Please check the status page before submitting a case. If the status page contains a notification about an outage, our team will already be working on restoring service as quickly as possible. Once the issue is resolved, the status page will be updated to reflect that. There is a button on the page to subscribe to notifications of any changes. A root-cause analysis will be published to the status page once an investigation has been done.
Community Forums
Safe’s Knowledge Base contains a wealth of information in the form of articles, tutorials, demos, and FAQs. If you can’t find what you are looking for in the Knowledge Base then you can post to our Q&A Forum or to our Ideas Exchange. A question or idea posted in either place receives the same priority as one privately submitted, and in fact may receive a faster response since people from the community are online 24/7. Everything posted in our community is reviewed by at least one Safer, and many questions see responses from multiple individuals. These resources are available to all customers.
Support Center
You may ask a question via live chat or our Report a Problem form. Response times are limited to certain hours of coverage and may vary depending on the relative volume and complexity of requests already in the queue. If you have a Managed support plan with an MSP Partner, your contract with them defines how to contact them and what response times to expect.
Scope of Support
Helpdesk Support
FME Flow Hosted Helpdesk support requests cover development and production issues on FME Flow Hosted. If you have a support contract with an MSP Partner, then they will provide the first line of Helpdesk Support. Helpdesk support is limited to:
- Troubleshooting operational or systemic problems on both the FME Flow Hosted tier and FME Flow instances.
- Troubleshooting security concerns on both the FME Flow Hosted tier and FME Flow instances.
- Troubleshooting access issues with either the FME Flow Hosted tier or FME Flow instances.
- Proactive investigation into product regressions, deficiencies and security threats.
Helpdesk Support does not include:
- Proofs of concept
- Advice on leveraging third-party services that complement typical FME Flow Hosted deployments
- Performing system administration tasks
Advanced Support
Advanced support is offered by our MSP Partners. Advanced support builds on the Included level of support offered via Helpdesk Support. The specifics of the offering depend on the terms agreed to between you and your MSP Partner. Some examples of the expertise our partners offer:
- Architectural Review - Review of your current architecture and advice on how to migrate to the cloud to take advantage of the many opportunities it presents.
- Security - Going beyond securing FME Flow Hosted, and advising on leveraging other applications and data in the cloud so FME Flow Hosted can be deployed as part of a secure architecture.
Service Priorities
We treat all services FME Flow Hosted provides as mission-critical. There are two things we look at when prioritizing an issue: the severity level and the type of customer impacted. We have identified three severity levels: Critical, Urgent, and Normal. We always work on the highest severity issues first. If multiple customers are affected by the same issue, customers are prioritized based on their support plan and then the instance types as follows:
- Managed support plan
- Included support plan with Standard, Premium, Professional and Enterprise instance types.
We rely on this framework to help us triage issues internally and to set your expectations.
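The ordering described above can be sketched as a two-key sort. This is only an illustration of the stated rule, not Safe Software's internal tooling: the `Issue` class, the rank tables, and `triage_order` are hypothetical names, and the instance-type tiebreak is left out because the policy does not spell out that ordering.

```python
from dataclasses import dataclass

# Lower rank is handled first. The labels come from the policy text;
# the numeric ranks are an assumption made for this sketch.
SEVERITY_RANK = {"Critical": 0, "Urgent": 1, "Normal": 2}
PLAN_RANK = {"Managed": 0, "Included": 1}


@dataclass(frozen=True)
class Issue:
    customer: str
    severity: str  # "Critical", "Urgent", or "Normal"
    plan: str      # "Managed" or "Included"


def triage_order(issues):
    """Sort by severity first, then by support plan."""
    return sorted(
        issues,
        key=lambda issue: (SEVERITY_RANK[issue.severity], PLAN_RANK[issue.plan]),
    )


queue = [
    Issue("A", "Normal", "Managed"),
    Issue("B", "Critical", "Included"),
    Issue("C", "Critical", "Managed"),
]
ordered = [issue.customer for issue in triage_order(queue)]
```

In this sketch a Critical issue on the Included plan still sorts ahead of a Normal issue on the Managed plan, because severity is the first key.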
Severity Levels
Critical
These issues affect the foundational components of FME Flow Hosted and mean your production workflows will be unavailable.
Examples:
- FME Flow instances: Your instance is unresponsive and you are unable to access it.
- FME Flow instances: Data integrity may be at stake.
- FME Flow Hosted tier: You can’t pause or start your instances.
Urgent
These issues affect major functionality of FME Flow Hosted. In urgent issues, key production workflows are not affected, but you may experience a degraded level of performance. Data integrity is also not impacted.
Examples:
- FME Flow instances: Your FME Flow is operational but certain functionality is degraded.
- FME Flow Hosted tier: Non-critical functions such as resizing a disk are not available.
Normal
Normal priority can include technical questions, configuration issues, suggestions and defects that affect a small number of users. Typically acceptable workarounds exist.
Examples:
- FME Flow instances: Advice on configuring FME Flow workflows.
- FME Flow Hosted tier: Report missing or erroneous documentation.
Response Times
FME Flow Hosted customers across all support plans can expect the following initial response times during their plan’s hours of coverage, with priority given to customers on Managed support plans:
Severity | Initial Response Time |
---|---|
Critical | 2 hours |
Urgent | 1 business day |
Normal | 3 business days |
We strive to meet these target response times, but they are not guaranteed. These times do not indicate how long a final resolution may take. As well, these target response times are only applicable during the hours of coverage defined in your support plan. (Note: Hours of coverage for Included support plans are 8am-5pm PST, excluding statutory holidays in BC, Canada. If you wish to receive support outside of these hours, you need to purchase a Managed support plan from an MSP Partner.)
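As a rough illustration of the coverage window for Included support plans, the check below tests whether a given moment falls inside 8am-5pm PST. It rests on several assumptions that are not stated in the policy: "PST" is read literally as a fixed UTC-8 offset (no daylight saving), weekends are treated as outside coverage, and the list of BC statutory holidays is supplied by the caller. The function name `in_included_coverage` is hypothetical.

```python
from datetime import date, datetime, time, timedelta, timezone

# Assumption: "PST" is taken literally as a fixed UTC-8 offset.
PST = timezone(timedelta(hours=-8), "PST")
COVERAGE_START = time(8, 0)
COVERAGE_END = time(17, 0)


def in_included_coverage(moment, holidays=frozenset()):
    """Return True if a timezone-aware datetime is inside coverage hours."""
    local = moment.astimezone(PST)
    if local.weekday() >= 5:  # assumption: Saturday and Sunday are not covered
        return False
    if local.date() in holidays:  # caller supplies BC statutory holidays
        return False
    return COVERAGE_START <= local.time() < COVERAGE_END


# 18:00 UTC on Tuesday 5 March 2024 is 10:00 PST.
example = in_included_coverage(datetime(2024, 3, 5, 18, 0, tzinfo=timezone.utc))
```

A request that arrives outside this window would, under these assumptions, start its target response clock at the next covered hour.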
As defined in the FME Flow Hosted Shared Responsibility Support Model, if the cause of the issue originates from the FME Flow application tier, it is your responsibility to diagnose and manage the issue. If you require the FME Flow application tier to be proactively monitored, then you need to purchase a Managed support plan through an MSP Partner.
During a mass outage event, these response times will apply exclusively to customers on a Managed support plan. Our response time for customers on the Included support plan will be contingent upon the nature and priority of the outage. More widespread outages will likely result in longer response times.
Security Incidents
We take security very seriously at Safe Software, which you can read more about here. Continuous vulnerability scanning ensures new threats are identified quickly. On identifying a threat, we audit our infrastructure to see what is affected and, based on that, assess the security risk and assign a severity level.
If there is a vulnerability and it is high risk, we will immediately create a patch. Before patching, we will send an email out to the emergency contact of affected customers. If it is a lower severity issue, then we will prepare a patch and communicate the issue via the in-app notifications on the FME Flow Hosted dashboard. When we deploy lower severity patches, we aim to strike a balance between risk and ensuring the impact on your production workflows is minimal.
If it is a high-profile issue, we will post a debrief on our blog. For example, here is a post detailing our exposure to the Heartbleed issue.
Incident Types
We have identified two types of incidents that could impact your level of service: outages and isolated issues.
- Outages: An outage potentially impacts more than one customer and instance.
- Isolated: This incident is specific to one FME Flow instance and thus one customer.
We follow a specific process for each type of incident.
Outage Lifecycle
We have identified three levels of outage depending on its severity and how widespread it is. We follow a different process for each outage type.
Outage Type | Description |
---|---|
Mass Outage | Significant parts of the infrastructure are down: entire sets of customers are experiencing compromised performance. An example might be that FME Flow instances become unreachable because of a network issue, or that there is a problem in the FME Flow Hosted tier preventing instances from being started/stopped. |
Limited Outage | Compared to a mass outage, the issue must be limited in either its severity or the number of people it affects. An example might be a degraded level of service in a specific data center which only affects 10% of customers, or a bug that we introduced that affects all customers but does not compromise the key production workflows. |
Emerging Issue | We have received reports from a small number of customers (or our monitoring tools have picked it up), usually about edge-case issues with compromised results. |
In all cases, we report the issue on the Safe Software Status page. We continually update that status page until the issue has been resolved. It is recommended that you subscribe to receive email updates on this page as it is not tied to your FME Flow Hosted account.
During an outage, our target response times may be compromised, but we do our best to meet them.
Isolated Issue
If we identify an issue related to your specific FME Flow instance that we are responsible for (see FME Flow Hosted Shared Responsibility Model), we will contact you directly. To ensure we contact the correct person, we have created an Emergency Contact form on the FME Flow Hosted account settings page. Please fill this out and ensure it is kept up to date. If you do not have a named contact on file, it may impact your level of service as we sometimes need your permission to fix an issue.
Isolated Issue Lifecycle
- Safe's monitoring tools identify an issue with your instance that is causing a drop in the level of service. The monitoring tools are integrated with our incident management platform and will automatically trigger an issue.
- A support engineer will acknowledge the incident during the hours defined in your support plan.
- If the engineer can fix the incident without gaining SSH access, they will do so. If successful, we will email the emergency contact to explain the cause.
- If we need to gain SSH access to the instance, we will contact the emergency contact to request permission (via phone and/or email).
- On gaining permission, we will work to resolve the issue as quickly as possible.
- Once the issue is resolved, we will contact the emergency contact to debrief them on the cause and outline the steps we have taken to prevent the issue happening again.
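The lifecycle above can be read as a small state machine with two routes to resolution: one without SSH access and one that first requires permission. The sketch below models it that way. The stage names and the `is_valid_path` helper are hypothetical, chosen only to mirror the steps listed; they do not describe Safe Software's incident management platform.

```python
from enum import Enum, auto


class Stage(Enum):
    DETECTED = auto()              # monitoring tools trigger an issue
    ACKNOWLEDGED = auto()          # a support engineer picks it up
    PERMISSION_REQUESTED = auto()  # SSH access is needed
    PERMISSION_GRANTED = auto()
    RESOLVED = auto()
    DEBRIEFED = auto()             # emergency contact is told the cause


# Allowed next stages for each stage, following the steps listed above.
TRANSITIONS = {
    Stage.DETECTED: {Stage.ACKNOWLEDGED},
    Stage.ACKNOWLEDGED: {Stage.RESOLVED, Stage.PERMISSION_REQUESTED},
    Stage.PERMISSION_REQUESTED: {Stage.PERMISSION_GRANTED},
    Stage.PERMISSION_GRANTED: {Stage.RESOLVED},
    Stage.RESOLVED: {Stage.DEBRIEFED},
    Stage.DEBRIEFED: set(),
}


def is_valid_path(path):
    """Return True if every consecutive pair of stages is an allowed transition."""
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(path, path[1:]))


without_ssh = [Stage.DETECTED, Stage.ACKNOWLEDGED, Stage.RESOLVED, Stage.DEBRIEFED]
with_ssh = [
    Stage.DETECTED,
    Stage.ACKNOWLEDGED,
    Stage.PERMISSION_REQUESTED,
    Stage.PERMISSION_GRANTED,
    Stage.RESOLVED,
    Stage.DEBRIEFED,
]
```

The model makes one point explicit: there is no transition into `PERMISSION_GRANTED` except from `PERMISSION_REQUESTED`, which is why an up-to-date emergency contact matters.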
Key Production Workflows
These are the pieces of functionality that we have identified as being key for production workflows.
- API and web access to running FME Flow instances.
- Ability to pause an instance via the FME Flow Hosted dashboard, a schedule, or the API.
- Ability to start an instance via the FME Flow Hosted dashboard, a schedule, or the API.
FME Flow Hosted Shared Responsibility Model
FME Flow Hosted is a Platform as a Service (PaaS). Two components comprise FME Flow Hosted. The first component is the dashboard/API, herein referred to as the FME Flow Hosted tier. This is a multi-tenant application where FME Flow Hosted customers sign up, launch/manage FME Flow instances, and conduct billing and account management.
The second component is the FME Flow instances. These are where FME Flow Hosted customers publish their workspaces and associated data. Each FME Flow instance is a self-contained environment, isolated from other instances, and includes compute, storage, and database services.
Monitoring, securing and maintaining the FME Flow Hosted tier is the sole responsibility of Safe Software.
For the FME Flow instances, to ensure a high level of uptime, both the FME Flow Hosted customer and Safe Software are responsible for supporting the instance—a shared responsibility model. As a customer, you can purchase a Managed support plan from an MSP Partner who will then handle the customer-side responsibilities on your behalf.
Proactive Monitoring Of The FME Flow Hosted Tier
The FME Flow Hosted tier is monitored 24x7 by comprehensive automated systems. In the event of any issue affecting the health and operation of the infrastructure, core systems, or tools, our dedicated operations team is notified and will respond to diagnose and correct any issues. This 24x7 monitoring of the FME Flow Hosted tier benefits all FME Flow Hosted users.
FME Flow Instances
Delivering a high level of uptime for the customer’s FME Flow deployment on FME Flow Hosted is slightly different to on-premises data centres. When the FME Flow Hosted customer moves their FME Flow deployment up to the cloud, the responsibility of ensuring a high level of uptime for their instance is split between the FME Flow Hosted customer/MSP Partner and Safe Software. Safe Software is responsible for monitoring and maintaining the operating system down to the hardware powering the instance, and the FME Flow Hosted customer/MSP Partner is responsible for monitoring and maintaining the FME Flow application. This shared responsibility model can reduce the FME Flow Hosted customer’s operational burden in many ways.
Safe Software Support Responsibilities
Safe Software is responsible for monitoring and responding if there is an issue with the operating system, hardware or network. We monitor the health and operation of all these components and will be alerted immediately if there is an issue.
Operating System: FME Flow instances run on Ubuntu. Safe Software will fix any issues at the operating system (OS) level. Before gaining access to the instance, permission will be requested from the emergency contact on the account.
Hardware Failure: If there is an issue with the underlying hardware hosting the instance, Safe Software will be alerted and will work to either fix the issue or help the FME Flow Hosted customer migrate to another instance if the damage is irreparable.
Networking: If there is a network issue that causes connectivity to the machine to degrade, then Safe Software will be alerted and will work to fix the issue. If it is a global outage that affects all customers, Safe Software will communicate the issue as defined in our support policy.
Customer/MSP Partner Support Responsibilities
FME Flow Hosted is a Platform as a Service (PaaS), allowing the FME Flow Hosted customer to provision an instance with FME Flow installed in minutes instead of weeks. Once the instance is provisioned, Safe Software has no ability to access it through the FME Flow web interface or APIs. This means Safe Software cannot support FME Flow application uptime, because we have no access to, and thus no insight into, the FME Flow workloads being run. It is this application tier that the MSP Partner or FME Flow Hosted customer is responsible for supporting.
Monitoring and Automated Alerts
To help the FME Flow Hosted customers and MSP Partners manage this application tier, FME Flow Hosted provides a suite of tools (in addition to those FME Flow provides).
Disk Monitoring: If an FME Flow instance runs out of disk space, then it can cause a critical outage as FME Flow requires free disk space to function. FME Flow Hosted customers/MSP Partners can monitor disk usage and define alerts that will send a notification when the amount of remaining disk space goes below a certain value.
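The logic behind a disk alert is a simple threshold comparison. The sketch below is not the FME Flow Hosted alerting feature, which is configured in the dashboard; it only illustrates the rule using Python's standard `shutil.disk_usage`. The function names and the 10% default are assumptions made for the example.

```python
import shutil


def is_disk_low(free_bytes, total_bytes, min_free_fraction):
    """Return True when the free share of the disk is below the threshold."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes < min_free_fraction


def check_path(path, min_free_fraction=0.10):
    """Read real usage for `path` and apply the threshold (10% is an assumed default)."""
    usage = shutil.disk_usage(path)
    return is_disk_low(usage.free, usage.total, min_free_fraction)
```

Setting the threshold well above zero leaves time to free or add space before the instance stops functioning.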
Memory Monitoring: If an FME Flow instance is consistently running out of memory, then it can potentially cause a severe degradation of service. FME Flow Hosted customers/MSP Partners can monitor memory and define alerts that will send a notification when the memory usage is above a certain value for a period of time.
Web Server Responsiveness: If an FME Flow instance is overloaded, or experiences connectivity issues, one of the best indicators of a potential critical outage is whether the FME Flow web server is responsive. FME Flow Hosted customers/MSP Partners can define alerts on the server response time and there is a special alert which triggers when the server is non-responsive. Notifications can then be configured to ensure the correct people are instantly made aware of the issue.
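A responsiveness check of this kind can be approximated by timing a single HTTP request and separating "slow" from "no answer at all". The sketch below uses only the Python standard library and is an illustration, not the FME Flow Hosted implementation. The function names, the status labels, and the choice to count HTTP error statuses as non-responsive are all assumptions.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return the response time in seconds, or None if the request failed.

    In this sketch, timeouts, connection errors, and HTTP error statuses
    (raised by urlopen as HTTPError) all count as a failed request.
    """
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read(1)  # make sure at least one byte of the body arrives
    except (urllib.error.URLError, OSError):
        return None
    return time.monotonic() - started


def classify(response_seconds, slow_after):
    """Map a probe result to a status label for alerting."""
    if response_seconds is None:
        return "non-responsive"
    if response_seconds > slow_after:
        return "slow"
    return "ok"
```

A caller would pass the URL of its own instance to `probe` and feed the result to `classify`, for example `classify(probe(url), slow_after=2.0)`, where `url` and the two-second limit are placeholders.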
FME Flow Load: If an instance is constantly overloaded, then it can cause a degradation in service as all services (engines, web server, database, etc.) share the same compute. For example, if an FME Engine hogs all of the CPU, then it can cause the database and web server to crash. If the load is consistently high, then the FME Flow Hosted customer/MSP Partner may need to upgrade the instance type. FME Flow Hosted customers/MSP Partners can monitor server load and define alerts that will send a notification when the load is above a certain value for a period of time.
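The memory and load alerts described above share one rule: notify only when a value stays above a threshold for a period of time, so that short spikes are ignored. A minimal sketch of that rule follows. The class name, the sample values, and the five-minute hold period are hypothetical and are not taken from the FME Flow Hosted dashboard.

```python
class SustainedThresholdAlert:
    """Fire only after a value has stayed above a threshold for a set time."""

    def __init__(self, threshold, hold_seconds):
        self.threshold = threshold
        self.hold_seconds = hold_seconds
        self._breach_started = None  # time of the first sample in the current breach

    def update(self, value, now):
        """Feed one sample; return True once the breach has lasted long enough."""
        if value <= self.threshold:
            self._breach_started = None  # back to normal, so reset the timer
            return False
        if self._breach_started is None:
            self._breach_started = now
        return now - self._breach_started >= self.hold_seconds


# Samples are (seconds since start, load percentage); the values are made up.
alert = SustainedThresholdAlert(threshold=90.0, hold_seconds=300)
samples = [(0, 95.0), (120, 97.0), (200, 80.0), (260, 96.0), (600, 99.0)]
fired = [alert.update(value, now) for now, value in samples]
```

In this run the dip at 200 seconds resets the timer, so the alert fires only on the last sample, after the value has been high from 260 to 600 seconds.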
Security Update Management
Ensuring the FME Flow Hosted customer’s instance is secure is critical to ensuring a high level of uptime. If the operating system is not patched with the latest fixes, then the instance could be vulnerable to attack. FME Flow Hosted provides automated security patching which allows FME Flow Hosted customers/MSP Partners to ensure the instance is patched with a few clicks in the dashboard.