See 61511-1 188.8.131.52.4, Note 2:
– Stage 1 – After the hazard and risk assessment has been carried out, the required protection layers have been identified and the safety requirement specification has been developed.
– Stage 2 – After the safety instrumented system has been designed.
– Stage 3 – After the installation, pre-commissioning and final validation of the safety instrumented system has been completed and operation and maintenance procedures have been developed.
– Stage 4 – After gaining experience in operating and maintenance.
– Stage 5 – After modification and prior to decommissioning of a safety instrumented system.
Note that IEC 61511 §17.2.6 requires the Stage 5 FSA before modification activity begins on the system. In practice Stage 5 FSA starts before the modification begins and finishes when records are available to show that the modification has been successfully completed and validated.
λDU = 6 × 10⁻⁸ /h in a 2oo3 arrangement
PFDG ≈ 6 ((1 − βD) λDD + (1 − β) λDU)² × (λDU / λD)(T1 / 2) × (λDU / λD)(T1 / 3) + β λDU (T1 / 2)
PFDG ≈ λDU² T1² + β λDU (T1 / 2)
≈ (6 × 10⁻⁸ × 8760)² + 0.15 × 6 × 10⁻⁸ × (8760 / 2)
(note that β is multiplied by 1.5 for 2oo3 voting)
≈ 3 × 10⁻⁷ + 4 × 10⁻⁵
≈ 4 × 10⁻⁵ (notice that the β λDU T1 / 2 term strongly dominates the result, again)
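The arithmetic above can be checked with a short sketch. It assumes the simplified 2oo3 expression (λDU T1)² + β λDU T1 / 2, a proof test interval T1 of 8760 hours and β = 0.15 (taken here as 0.1 × 1.5 for 2oo3 voting); the function name is illustrative only.

```python
def pfd_2oo3_terms(lambda_du_per_hour, t1_hours, beta):
    """Return the two terms of the simplified 2oo3 average PFD.

    Assumes detected failures and repair times are negligible, so that
    PFD ~ (lambda_DU * T1)^2 + beta * lambda_DU * T1 / 2.
    """
    independent = (lambda_du_per_hour * t1_hours) ** 2
    common_cause = beta * lambda_du_per_hour * t1_hours / 2
    return independent, common_cause


independent, common_cause = pfd_2oo3_terms(6e-8, 8760, 0.15)
total = independent + common_cause

print(f"independent term  : {independent:.1e}")
print(f"common cause term : {common_cause:.1e}")
print(f"PFD (total)       : {total:.1e}")
print(f"common cause share: {common_cause / total:.0%}")
```

The common cause term accounts for nearly all of the total, which is the point made in the worked answer.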
The RRF ≈ 800 or ≈ 1,000 depending on how you choose to apply the rounding. I would class this as SIL 2. It is almost good enough for SIL 3 but close to the borderline.
We would need to improve the RRF to be confident of achieving SIL 3; for example, we could reduce the inspection and test interval.
It should be planned in advance, taking into account the systematic capability required, the degree of novelty, complexity and familiarity and also the degree of risk.
See IEC 61511-1 §7.2.6 NOTE 1:
Selection of techniques and measures for the verification process and the degree of independence depends upon a number of factors including degree of complexity, novelty of design, novelty of technology and required SIL.
IEC 61511-1 §12.5.2 requires the application program and documentation to be reviewed by a competent person not involved in its original development.
See 61511-1 5.2.7: the stage at which formal configuration control is to be implemented needs to be specified in planning
In principle change control should be applied as soon as items are released for use or released for testing.
Refer to IEC 61511 section 5.2.7
184.108.40.206.1 Procedures for configuration management of the SIS during the SIS and software safety life-cycle phases shall be available; in particular, the following should be specified:
• the stage at which formal configuration control is to be implemented;
• the procedures to be used for uniquely identifying all constituent parts of an item (hardware and software);
• the procedures for preventing unauthorized items from entering service.
Configuration management applies to any item that is subject to version changes, including hardware, software, application program, firmware, programming tools and utilities.
The RRF needed is the intermediate event frequency divided by tolerable frequency: RRF = 10⁻³ pa / 10⁻⁵ pa = 100, i.e. SIL 2 (or arguably SIL 1)
10⁻¹ pa × 0.01 × 1 × 1 × 1 = 10⁻³ pa
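As a minimal sketch of the two steps above, the intermediate event frequency is the initiating frequency multiplied by each modifier, and the required RRF is that frequency divided by the tolerable frequency. The function names are illustrative, and the meaning of each modifier (0.01, 1, 1, 1) is not stated in the source, so they are passed through as plain numbers.

```python
def intermediate_event_frequency(initiating_frequency_pa, *modifiers):
    """Multiply the initiating event frequency (per year) by each modifier."""
    frequency = initiating_frequency_pa
    for modifier in modifiers:
        frequency *= modifier
    return frequency


def required_rrf(ief_pa, tolerable_frequency_pa):
    """Risk reduction factor needed to bring the IEF down to the tolerable frequency."""
    return ief_pa / tolerable_frequency_pa


ief = intermediate_event_frequency(1e-1, 0.01, 1, 1, 1)
rrf = required_rrf(ief, 1e-5)

print(f"IEF = {ief:.0e} pa")
print(f"RRF = {rrf:.0f}")
```

An RRF of 100 sits on the boundary between the SIL 1 and SIL 2 bands, which is why the answer allows either reading.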
Nil because the IEF = tolerable frequency
2 or 3 fatalities = Extensive: 10⁻⁵ pa
Once in the past 6 years is closer to 1 in 10 years than to once every year.
It is not frequent enough for ‘Frequent’, 1 pa.
It is more appropriate to class it as ‘High’, 10⁻¹ pa
SFF = (λS + λDD) / (λS + λDD + λDU)
= (500 + 200) / (500 + 200 + 1500) = 700 / 2200 ≈ 32%
DO NOT INCLUDE ‘no effect’ failures
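A minimal sketch of the safe failure fraction calculation, using the failure rates from the answer above (the units cancel, so any consistent unit works). ‘No effect’ failures are deliberately not a parameter; the function name is illustrative.

```python
def safe_failure_fraction(lambda_s, lambda_dd, lambda_du):
    """SFF = (safe + dangerous detected) / (safe + dangerous detected + dangerous undetected).

    'No effect' failures are excluded from both numerator and denominator.
    """
    return (lambda_s + lambda_dd) / (lambda_s + lambda_dd + lambda_du)


sff = safe_failure_fraction(lambda_s=500, lambda_dd=200, lambda_du=1500)
print(f"SFF = {sff:.0%}")
```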
IEC 61511-1 does not say who is responsible. It needs to be planned:
220.127.116.11 Persons, departments, organizations or other units which are responsible for carrying out and reviewing each of the SIS safety life-cycle phases shall be identified and be informed of the responsibilities assigned to them.
Safety planning shall take place to define the activities that are required to be carried out along with the persons, departments, organizations or other units responsible to carry out these activities. This planning shall be updated as necessary throughout the entire SIS safety life-cycle (see Clause 6) and carried out to a detailed activity level commensurate with the role the individual or organization is performing in the SIS safety life-cycle.
The main objective is to make a judgement as to the functional safety and safety integrity achieved by every SIF of the SIS.
Refer to IEC 61511-1 §18.104.22.168.5
Prior to hazards being introduced confirm:
Further FSA planned
The FSA should also judge the systematic integrity. FSA should review evidence of appropriate functional safety management and evidence of sufficient systematic integrity, such as records of verification and compliance with appropriate techniques and measures.
According to IEC 61511-1:
22.214.171.124.2 The membership of the FSA team shall include at least one senior competent person not involved in the project design team (for stages 1, 2 and 3) or not involved in the operation and maintenance of the SIS (for stages 4 and 5).
NOTE When the assessment team is large, consideration should be given to having more than one senior competent individual on the team who is independent from the project team.
According to IEC 61508-1:
8.2.15 The minimum level of independence of those carrying out a functional safety assessment shall be as specified in Tables 4 and 5. Product and application sector international standards may specify, with respect to compliance to their standards, different levels of independence to those specified in Tables 4 and 5. The tables shall be interpreted as follows:
– X: the level of independence specified is the minimum for the specified consequence (Table 4) or safety integrity level/systematic capability (Table 5). If a lower level of independence is adopted, then the rationale for using it shall be detailed.
– X1 and X2: see 8.2.16.
– Y: the level of independence specified is considered insufficient for the specified consequence (Table 4) or safety integrity level/ systematic capability (Table 5).
8.2.16 In the context of Tables 4 and 5, only cells marked X, X1, X2 or Y shall be used as a basis for determining the level of independence. For cells marked X1 or X2, either X1 or X2 is applicable (not both), depending on a number of factors specific to the application. The rationale for choosing X1 or X2 should be detailed. Factors that will make X2 more appropriate than X1 are:
– lack of previous experience with a similar design;
– greater degree of complexity;
– greater degree of novelty of design;
– greater degree of novelty of technology.
NOTE 1 Depending upon the company organization and expertise within the company, the requirement for independent persons and departments may have to be met by using an external organization. Conversely, companies that have internal organizations skilled in risk assessment and the application of safety-related systems, that are independent of and separate (by ways of management and other resources) from those responsible for the main development, may be able to use their own resources to meet the requirements for an independent organization.
NOTE 2 See 3.8.11, 3.8.12 and 3.8.13 of IEC 61508-4 for definitions of independent person, independent department, and independent organization respectively.
NOTE 3 Those carrying out a functional safety assessment should be careful in offering advice on anything within the scope of the assessment, since this could compromise their independence. It is often appropriate to give advice on aspects that could incur a judgement of inadequate safety, such as a shortfall in evidence, but it is usually inappropriate to offer advice or give recommendations for specific remedies for these or other problems.
8.2.17 In the context of Table 4, the consequence values for the specified level of independence are:
– Consequence A: minor injury (for example temporary loss of function);
– Consequence B: serious permanent injury to one or more persons, death to one person;
– Consequence C: death to several people;
– Consequence D: very many people killed.
The consequences specified in Table 4 are those that would arise in the event of failure of all the risk reduction measures including the E/E/PE safety-related systems.
IEC 61511-1 §126.96.36.199.5
Prior to the identified hazards being present (i.e., Stage 3) the FSA team shall confirm that:
Due diligence in our duty of care means we need to:
Audit examines compliance with procedures, processes and practices.
Functional safety assessment takes into account audits but goes much further. It makes an overall judgement about the functional safety achieved by the system.
Most failures can be anticipated given the condition of the equipment, its age and its environment (process conditions and ambient conditions).
Both age and wear can usually be monitored by measuring and trending condition indicators. Typical indicators include:
• Actuator force or stem torque for a valve
• Leakage rate
• Voltage versus current for a transmitter (corrosion and increasing impedance in circuits)
• Transmitter response time, spectral characteristics of process signals (indicating changes in sensor systems such as contamination)
• Temperature of components, enclosures
• Moisture content
• Vibration levels and spectral characteristics
• Cracking, discolouration due to heat or radiation
• Spring tension
• Wall thickness
• Number, extent and duration of process perturbations
• Alarm frequency and duration
• Alarm suppression frequency and duration
All failures and all demands must be analysed.
Performance measurement validates the assumptions in the design:
• Demand rates
• Failure rates
Unexpected behaviour must be analysed
Those responsible for O&M also need to review assumptions regarding factors such as occupancy and corrosion.
IEC 61511-1 §19.2.9,
See also 12.4.2, 14.2.4, 15.2.6, 16.2.2 e and 16.3.3
• Details of equipment used for SIFs and SRS
• Organisation for maintaining FS
• Procedures to achieve and maintain FS
• Modification information
• Safety manuals
• Records of the design, implementation, test and validation
• Application program documentation
• Installation and commission records
• Records of proof test and inspection
The SRS and probability of failure (PF) quantification set benchmarks for proof test interval and proof test coverage.
Proof test and inspection plans should be based on FMEDA studies (or similar) to achieve the required coverage, given knowledge of the anticipated failure modes and the diagnostics that have been implemented.
The plans need to take into account accessibility for testing.
The likelihood or rate of ‘never detected’ failures should be minimised by design, ensuring that all anticipated failures can be revealed by diagnostics, inspection or testing.
Plans should consider staggered test intervals.
Independence of maintainers may be needed to improve systematic integrity and to reduce common cause failure.
Based on IEC 61511-1 Clause 5:
Management of recommendations
Assessment and audit
Revision and change control
Based on IEC 61511-1 Clause 6:
Safety lifecycle – definition of activities, outputs and responsibilities (including verification activities for each output)
Based on IEC 61511-1 Clause 7:
Based on IEC 61511-1 Clause 16:
Procedures for both routine and abnormal activities
Preventive and breakdown maintenance activities
Procedures, techniques and measures
Response to faults and failures
Operation of bypasses
Analysis of performance and unexpected behaviour
Collection of failure rate and demand data
Inspection and proof testing procedures
Records that need to be kept
Timing of activities
Based on IEC 61511-1 Clause 17:
Management of modifications
Based on IEC 61511-1 Clause 19:
Yes but unless it is a very limited language we may need to apply IEC 61508-3. Structured text is usually classed as FVL. Check the SIS safety manual.
Refer to IEC 61508-7 for further guidance on appropriate techniques and measures.
Use templates based on IEC 61511-1 clauses 11 and 12, and verify using checklists based on those clauses.
Demonstrate compliance to IEC 61508-2 and/or -3 and provide the information equivalent to a safety manual in accordance with the two Annex D lists,
OR provide evidence of prior use, IEC 61511-1 § 11.5.3 through to 11.5.6
Confirm that the devices are suitable for the operating environment.
Refer to IEC 61508-2 §188.8.131.52 and Annex D,
and/or for software IEC 61508-3 §184.108.40.206 and Annex D
Demonstrate compliance to IEC 61508-2 and/or -3 and provide a safety manual, OR provide evidence of prior use, IEC 61511-1 § 11.5.3 through to 11.5.6
The design should be traceable to the requirements of the SRS, the APSRS and to the requirements of IEC 61511.
A security risk assessment must be carried out and recorded to set the requirements for system security.
IEC 61511-1, 8.2.4
Countermeasures might affect the requirements for interfaces to the SIS, and interface requirements need to be in the SRS (see 10.3.2, bullet point 20).
IEC 61508-4, 3.6.20
process safety time: period of time between a failure, that has the potential to give rise to a hazardous event, occurring in the EUC or EUC control system and the time by which action has to be completed in the EUC to prevent the hazardous event occurring
10.3.2 bullet point 15 says it needs to be defined in the SRS:
11.2.11 has requirements for the detailed design:
The Application Program Safety Requirements Specification is derived from the SRS, adding sufficient detail to allow the software design and implementation to achieve the required safety integrity and to allow an assessment of functional safety to be carried out.
Requirements tracking is required to ensure that all of the safety requirements are addressed in the design and all of the requirements are demonstrated objectively through the validation inspection and testing process.
Forwards traceability is concerned with ensuring that every objective requirement is addressed in the subsequent detailed design documents and testing specifications and it enables users to find where requirements have been addressed so that impact of changes to requirements can be managed.
Backwards traceability is broadly concerned with checking that every implementation decision (interpreted in a broad context, and not confined to code implementation) is clearly justified by some requirement.
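Forwards and backwards traceability can be pictured as two set checks over a requirements-to-design mapping. The sketch below is only an illustration: the requirement IDs and design item names are hypothetical, and a real project would normally hold this mapping in a requirements management tool.

```python
def trace_gaps(requirement_ids, design_items):
    """Find traceability gaps.

    requirement_ids: set of requirement IDs from the specification.
    design_items: dict mapping each design item to the set of requirement IDs it addresses.
    Returns (unaddressed requirements, design items with no valid requirement).
    """
    covered = set().union(*design_items.values())
    # Forwards gap: a requirement that no design item addresses.
    unaddressed = sorted(requirement_ids - covered)
    # Backwards gap: a design item not justified by any known requirement.
    unjustified = sorted(
        item for item, reqs in design_items.items() if not (reqs & requirement_ids)
    )
    return unaddressed, unjustified


# Hypothetical example data
requirements = {"SRS-001", "SRS-002", "SRS-003"}
design = {
    "FB_TRIP_LOGIC": {"SRS-001"},
    "FB_VOTING": {"SRS-002"},
    "FB_SPARE_TIMER": set(),
}

unaddressed, unjustified = trace_gaps(requirements, design)
print("Requirements not addressed:", unaddressed)
print("Design items without a requirement:", unjustified)
```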
To provide a complete and consistent summary of the user’s safety requirements as a basis for the design, implementation, testing and maintenance of the system.
Modularity and encapsulation, cohesion and coupling
Nested architecture, though not more than 3 or 4 layers
Library of standard functions
Consistent application – clearly defined coding standards and interface standards
There is no such thing as the ‘best’ architecture.
The different architectures have different advantages and disadvantages.
The choice of architecture depends on the requirements of each individual end user.
Primarily for efficiency, to avoid wasted effort in a ‘journey of discovery’.
Very early in the design – when the concept is being developed.
Generally this should be as the P&IDs are being developed and well before the SRS is developed.
We should have a pretty good idea of which SIFs are likely to be SIL 2 or SIL 3.
We should have a pretty good idea of which SIFs are likely to need on-line maintenance access.
Balance between risk reduction and process sensitivity (cost of downtime, and cost of spurious trips):
Risk / SIL – what PFD or PFH must be achieved?
Process sensitivity – what is the target for spurious trip rate?
Cost to install
Operability and maintainability – can we easily get access for maintenance on-line without process downtime?
Response to detected failure:
• Compensating measures?
• Dependability of response?
See IEC 61511-1 11.3.1
When a dangerous fault in an SIS has been detected (by diagnostic tests, proof tests or by any other means) then
compensating measures shall be taken to maintain safe operation.
If safe operation cannot be maintained, a specified action to achieve or maintain a safe state of the process shall be taken.
Consider the process safety time and the ability of the operator to react promptly and dependably.
In each wellhead the pair of valves has HFT = 1, so with Type A SFF < 60% we can achieve SIL 2
The four parallel wellheads together are still good for SIL 2.
The 2 valves in series after them when taken as a pair also have HFT 1 and can achieve SIL 2. Therefore they add one extra level of fault tolerance to achieve SIL 3 overall.
You could argue that if the final 2 valves are completely independent of each other (different SIFs, maybe even different SIS, e.g. a HIPPS), then each one on its own brings one additional level of HFT, so SIL 2 + HFT 2 = SIL 4.
You would need to demonstrate sufficient independence between the valves. See IEC 61511-1 §9.2.6 and 9.2.7.
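The Route 1H reasoning used above (Type A, SFF < 60%, HFT = 1 gives SIL 2) is a table lookup. The sketch below reproduces the Type A architectural constraints as the editor recalls them from IEC 61508-2; the table values should be checked against the standard before being relied on, and the function names are illustrative.

```python
# Maximum SIL claimable for a Type A element, by SFF band and HFT (0, 1, 2).
# Assumption: values recalled from IEC 61508-2 (Type A table); verify before use.
TYPE_A_MAX_SIL = {
    "<60%": (1, 2, 3),
    "60%-<90%": (2, 3, 4),
    "90%-<99%": (3, 4, 4),
    ">=99%": (3, 4, 4),
}


def sff_band(sff):
    """Map a safe failure fraction (0..1) to its band label."""
    if sff < 0.60:
        return "<60%"
    if sff < 0.90:
        return "60%-<90%"
    if sff < 0.99:
        return "90%-<99%"
    return ">=99%"


def max_sil_type_a(sff, hft):
    """Maximum SIL for a Type A element; HFT above 2 is treated as 2."""
    return TYPE_A_MAX_SIL[sff_band(sff)][min(hft, 2)]


# A valve pair with SFF below 60% (0.5 is a hypothetical value) and HFT = 1
print("Max SIL:", max_sil_type_a(0.5, 1))
```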
In IEC 61511-1 Edition 1 sub-clause 11.4 included a note explaining that ‘The minimum hardware fault tolerance has been defined to alleviate potential shortcomings in SIF design that may result due to the number of assumptions made in the design of the SIF, along with uncertainty in the failure rate of components or subsystems used in various process applications.’
In Edition 2 the note has been moved into part 2, IEC 61511-2 §11.4.1
Fault tolerance is defined as:
Either trip on detected fault or apply a defined set of compensating measures
Alarm should be in the SIS
Operator response should be clearly defined
If the response is only to organise a repair, a time limit should be set and the necessary action defined for when the limit is reached
We cannot claim diagnostic coverage unless we can be sure that appropriate action will be taken when faults are detected.
Shared or common hardware and software elements shall conform to the highest safety integrity (and systematic capability) level.
The fundamental requirement is that non-safety functions must not compromise safety
Non-safety functions that are not separated must be treated as if they were safety functions – subjected to the same rigorous practices to eliminate faults.
Lower capital cost
Lower training costs
Easier data exchange, information management
Easier to manage
To avoid common cause, common mode and dependent failures
Temperature (specify conservatively, protect, insulate, relocate)
Vibration (isolate, relocate)
Contamination (appropriate process connection design, sensor type)
Corrosion (materials compatibility design)
EMI (source identification and risk assessment, shielding, segregation)
Power supply quality (specification, filtering, monitoring)
Air / hydraulic fluid quality (specification, filtering, monitoring)
Errors in design/selection/software/maintenance (appropriate checking, review, audit and inspection)
The PFD is usually dominated by the final elements.
Increase redundancy – e.g. from 1oo1 to 1oo2 but in this case we already have 1oo2. 1oo3 is often impracticable.
Reduce the β factor, actively eliminate common cause failures. Diverse sensors and diverse sensing techniques might be needed to reduce β. Environmental compatibility specification and testing can also reduce β.
Reduce the failure rate? That might be achievable through condition based maintenance if the equipment can be easily accessed for inspection, testing and reconditioning. Design the installation to allow ready access to the equipment during normal operation.
Reduce the proof test interval, consider partial stroke testing.
Increase proof test coverage. Find and fix incipient failures before they develop.
Ensure that all anticipated failure modes can be detected through diagnostics, inspection or testing.
IEC 61508 compliant safety manuals for the devices, or a dossier with the equivalent information.
Data may be available from certificates, from industry databases (exida, OREDA, SINTEF).
Data may be available from prior use.
We need to ensure that the failure rates are credible, traceable, achievable and dependable.
For the sensors HFT =1, and the confidence level is 90% so SIL 3 can be claimed.
The logic solver is certified SIL 3
We can’t apply Route 2H for the valves unless we have enough information to estimate the failure rate of the valves with 90% confidence level.
λ90% could be as much as 50% higher than λ70%, so the RRF might be closer to 1,000 (PFD ≈ 10⁻³), which is marginal for SIL 3.
If we apply Route 1H we can claim SIL 2.
Alternatively we could apply the IEC 61511 Table 6 HFT requirements and claim SIL 3 because HFT =1. Either way we need to show that we have credible and traceable failure rate data based on operating experience with that type of equipment.
If PFDG ≈ 7 × 10⁻⁴, RRF ≈ 1400, SIL 3,
If PFDG ≈ 10⁻³, RRF ≈ 1000, SIL 3 borderline,
≈ 3 × 10⁻⁵ (sensor) + 9 × 10⁻⁶ (logic solver, analog in) + 7 × 10⁻⁴ (valves)
≈ 7.4 × 10⁻⁴
≈ 7 × 10⁻⁴
Do not show two significant figures because that implies better precision than is credible. It would be very misleading to say that the answer is 7.36 × 10⁻⁴.
In reality the uncertainty in the failure rates is something like +/- 50% at best.
It would be just as valid to estimate PFDG ≈ 10⁻³
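A short sketch of the summation above: the subsystem PFDs are added, the total is rounded to one significant figure, and the RRF is the reciprocal. The helper `round_sig` is an illustrative name, not something from the source.

```python
from math import floor, log10


def round_sig(value, sig=1):
    """Round a positive number to the given number of significant figures."""
    return round(value, sig - 1 - floor(log10(abs(value))))


subsystem_pfd = {
    "sensor": 3e-5,
    "logic solver (analog in)": 9e-6,
    "valves": 7e-4,
}

pfd_total = sum(subsystem_pfd.values())
pfd_rounded = round_sig(pfd_total, 1)
rrf = 1 / pfd_rounded

print(f"PFD (unrounded) = {pfd_total:.2e}")
print(f"PFD (1 s.f.)    = {pfd_rounded:.0e}")
print(f"RRF             = {rrf:.0f}")
```

Given an uncertainty of around ±50% in the failure rates, the one-significant-figure result is the only one worth quoting.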
PFDG ≈ ((1 − β) λDU T1)² / 3 + β λDU T1 / 2
≈ (0.9 × 1.3 × 10⁻² × 1)² / 3 + 0.1 × 1.3 × 10⁻² × 1 / 2
≈ 4.5 × 10⁻⁵ + 6.5 × 10⁻⁴
≈ 7.0 × 10⁻⁴
You can ignore the (1 − β) term, it makes no difference:
PFDG ≈ (λDU T1)² / 3 + β λDU T1 / 2
≈ (1.3 × 10⁻² × 1)² / 3 + 0.1 × 1.3 × 10⁻² × 1 / 2
≈ 5.6 × 10⁻⁵ + 6.5 × 10⁻⁴
≈ 7.1 × 10⁻⁴
You can never ignore the β λDU T1 / 2 term.
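The comparison above can be reproduced with a small sketch of the simplified 1oo2 expression, with and without the (1 − β) factor. It uses λDU = 1.3 × 10⁻² pa, T1 = 1 year and β = 0.1 from the worked answer; the function name and the keyword flag are illustrative.

```python
def pfd_1oo2_terms(lambda_du_pa, t1_years, beta, keep_one_minus_beta=True):
    """Return (independent term, common cause term) of the simplified 1oo2 PFD."""
    factor = (1 - beta) if keep_one_minus_beta else 1.0
    independent = (factor * lambda_du_pa * t1_years) ** 2 / 3
    common_cause = beta * lambda_du_pa * t1_years / 2
    return independent, common_cause


with_factor = pfd_1oo2_terms(1.3e-2, 1, 0.1, keep_one_minus_beta=True)
without_factor = pfd_1oo2_terms(1.3e-2, 1, 0.1, keep_one_minus_beta=False)

print(f"with (1 - beta)   : {with_factor[0]:.1e} + {with_factor[1]:.1e} = {sum(with_factor):.1e}")
print(f"without (1 - beta): {without_factor[0]:.1e} + {without_factor[1]:.1e} = {sum(without_factor):.1e}")
```

Dropping the (1 − β) factor changes the total by under 2%, far less than the uncertainty in the failure rate data.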
If we approximated λDU ≈ 0.01 pa:
β λDU T1 / 2 ≈ 0.1 × 0.01 × 1 / 2 ≈ 5 × 10⁻⁴, which is of the order of 10⁻³
λDU = 200 × 10⁻⁹ per hour × 0.9 × 10⁴ hours per year ≈ 2 × 10² × 10⁻⁹ × 10⁴
≈ 2 × 10⁻³ pa
Ball valve and actuator:
λDU = 1230 × 10⁻⁹ per hour × 0.9 × 10⁴ hours per year ≈ 1.1 × 10³ × 10⁻⁹ × 10⁴
≈ 1.1 × 10⁻² pa
The combined failure rate λDU ≈ 0.013 pa or ≈ 0.01 pa
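The unit conversion above is easy to get wrong by a power of ten, so a quick check may help. This sketch uses 8760 hours per year rather than the rounded 0.9 × 10⁴, which is why the result differs slightly in the second significant figure; the device labels are placeholders because the source names only the ball valve and actuator.

```python
HOURS_PER_YEAR = 8760


def per_hour_to_per_year(rate_per_hour):
    """Convert a failure rate from per hour to per year."""
    return rate_per_hour * HOURS_PER_YEAR


rates_per_hour = {
    "first device": 200e-9,
    "ball valve and actuator": 1230e-9,
}

rates_per_year = {name: per_hour_to_per_year(r) for name, r in rates_per_hour.items()}
combined = sum(rates_per_year.values())

for name, rate in rates_per_year.items():
    print(f"{name}: {rate:.1e} pa")
print(f"combined lambda_DU = {combined:.3f} pa")
```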
The sensor and logic solver are both certified for SIL 3 in this configuration, so presumably they both have SFF high enough for SIL 3.
We need to analyse the HFT requirements for the final element subsystem.
We might arguably claim that the valves are Type A. SFF < 60% and HFT = 1 so only SIL 2 could be claimed if we apply Route 1H.
If we had enough information to establish a 90% confidence level in the valve failure rate data we could claim SIL 3 according to Route 2H.
Alternatively we could apply the IEC 61511 Table 6 HFT requirements and claim SIL 3 because HFT =1. We would need to show that we have credible and traceable failure rate data based on operating experience with that type of equipment.
However if we are told ‘No other information is available about failure rates for the valves and actuators’ then we have no evidence that the devices are suitable for use in SIS service. We cannot make a claim for prior use or for IEC 61508 compliance. We cannot claim any SIL at all because the systematic integrity (systematic capability) is not established.
PFD is improved by roughly × 0.5, i.e. ≈ 0.0007
RRF ≈ 1500, SIL 3
We should also think about how we can reduce λDU and β.
Logic solver has analog inputs (a multilevel coded mA signal) and a digital output so the PFD is 9 × 10⁻⁶
≈ 4 × 10⁻⁴ for the LSHH + 9 × 10⁻⁶ for the PLC + 9 × 10⁻⁴ for the valves
≈ 1.3 × 10⁻³
≈ 0.0013 or roughly ≈ 0.001
PFDG ≈ ((1 − β) λDU T1)² / 3 + β λDU T1 / 2
Neglecting the common cause term (β = 0):
≈ (0.02 pa × 1)² / 3
≈ 1 × 10⁻⁴
Including the common cause term (β = 0.1):
≈ (0.9 × 0.02 pa × 1)² / 3 + 0.1 × 0.02 pa × 1 / 2
≈ 1 × 10⁻⁴ + 1 × 10⁻³ ≈ 10⁻³
A whole order of magnitude difference! The common cause failures CANNOT be neglected.
If you insist on working with unwarranted precision you will get the same result:
≈ (0.9 × 0.017 pa × 1)² / 3 + 0.1 × 0.017 pa × 1 / 2
≈ 8 × 10⁻⁵ + 8.5 × 10⁻⁴
≈ 9 × 10⁻⁴, round up to 1 × 10⁻³
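To see how strongly the result depends on common cause, the sketch below evaluates the same simplified 1oo2 expression for a few values of β, with λDU = 0.02 pa and T1 = 1 year as in the answer. The intermediate β values (0.02 and 0.05) are illustrative choices, not figures from the source.

```python
def pfd_1oo2(lambda_du_pa, t1_years, beta):
    """Simplified 1oo2 average PFD including the common cause term."""
    independent = ((1 - beta) * lambda_du_pa * t1_years) ** 2 / 3
    common_cause = beta * lambda_du_pa * t1_years / 2
    return independent + common_cause


baseline = pfd_1oo2(0.02, 1, 0.0)
for beta in (0.0, 0.02, 0.05, 0.1):
    pfd = pfd_1oo2(0.02, 1, beta)
    print(f"beta = {beta:4.2f}: PFD = {pfd:.1e} ({pfd / baseline:.1f} x the beta = 0 value)")
```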
λDU = 230 × 10⁻⁹ per hour × 0.9 × 10⁴ hours per year ≈ 2 × 10² × 10⁻⁹ × 10⁴
≈ 2 × 10⁻³ pa
λDU = 1.7 × 10⁻⁶ per hour × 0.9 × 10⁴ hours per year ≈ 1.5 × 10⁻⁶ × 10⁴
≈ 1.5 × 10⁻² pa
The combined failure rate λDU ≈ 0.002 + 0.015 = 0.017 pa,
we should approximate that to 0.02 pa
DO NOT IMAGINE THE ANSWER IS PRECISE!
λDU = 90 FITS = 90 failures per 10⁹ hours, i.e. 90 × 10⁻⁹ failures per hour
Convert to failures per year by multiplying failures per hour × hours per year
≈ 90 × 10⁻⁹ per hour × 8760 hours per year ≈ 0.9 × 10² × 10⁻⁹ × 9 × 10³ pa
≈ 8 × 10⁻⁴ pa
PFDG = λDU × T1 / 2
≈ 8 × 10⁻⁴ × 1 / 2
≈ 4 × 10⁻⁴
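The same steps as a sketch: convert FITs to failures per year, then apply the single-channel (1oo1) approximation PFD ≈ λDU × T1 / 2. A proof test interval of one year is assumed, as in the answer; the function names are illustrative.

```python
HOURS_PER_YEAR = 8760


def fit_to_per_year(fit):
    """Convert FITs (failures per 1e9 hours) to failures per year."""
    return fit * 1e-9 * HOURS_PER_YEAR


def pfd_1oo1(lambda_du_pa, t1_years):
    """Average PFD of a single channel: lambda_DU * T1 / 2."""
    return lambda_du_pa * t1_years / 2


lambda_du = fit_to_per_year(90)
pfd = pfd_1oo1(lambda_du, t1_years=1)

print(f"lambda_DU = {lambda_du:.0e} pa")
print(f"PFD       = {pfd:.0e}")
```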
(Based on IEC 61511-1 §15.2)
• Definition of validation activities with respect to SRS
• Procedures for follow up and resolution of recommendations
• Consideration of all process operation modes
• Techniques and measures to be used (considering risk of hazards), technical strategies
• Timing and sequence of activities
• Responsibilities, levels of independence
• Information against which validation is to be carried out (traceability to specifications and SRS)
• Identification of items and application program subject to validation
• Test environment, tools, equipment
• Acceptance criteria
• Procedures for managing failures and discrepancies
• Calibration requirements
• Documentation to be produced
• Records to be kept
Keep all records of verification:
• What was checked
• How was it checked
• What basis was it checked against
• How were discrepancies identified and resolved
Verification records are essential for demonstrating systematic integrity and for demonstrating due diligence in complying with the appropriate standards and practices
Verification is about checking lifecycle phase outputs with respect to inputs. It involves analysis and/or tests to demonstrate that, for specific inputs, the outputs meet in all respects the objectives and requirements set for the specific phase. It applies to every output of every phase.
Validation is of the end product after installation with respect to requirements. Validation means demonstrating that the SIF(s) and SIS after installation meet the SRS in all respects.
IEC 61511-1 §220.127.116.11 says that all parties involved in SIS shall be competent to carry out the activities for which they are accountable.
IEC 61508-1 §6.2.13 says ‘all persons with responsibilities [for safety lifecycle activities] shall have the appropriate competence […] relevant to the specific duties that they have to perform.’
The IEC 61508 requirement is broader than the one in IEC 61511 because people can be responsible for something without being accountable. Accountability usually sits higher in the chain of command.
a) hazard analysis and risk assessment;
b) assurance activities;
c) verification activities;
d) validation activities;
f) functional safety audits;
g) post-incident and post-accident activities.
Sub-clause 18.104.22.168 addresses performance measurement and corrective actions related to failures and demands.
Refer to IEC 61511-1 §17.
Prior to any modification to a SIS, procedures must be in place for identifying and requesting the work, identifying hazards that may be affected and for authorising and controlling the changes. The concern is that a modification may increase hazard rate or consequence, or it may reduce effectiveness of risk reduction. Modifications or changes may have unintended consequences and may introduce new hazards.
Key elements in the modification process are:
• Identify and request the work to be done
• Assess the impact on safety
• Plan the change, update documentation
• Independent functional safety assessment before modification work begins
• Obtain authorisation
• Revalidate after implementation
• Notify personnel affected by the change
• Maintain records
22.214.171.124.3 Management of change procedures shall be in place to initiate, document, review, implement and approve changes to the SIS other than replacement in kind (i.e. like for like).
17 SIS modification
17.2.1 Prior to carrying out any modification to a SIS, procedures for authorizing and controlling changes shall be in place.
17.2.2 The procedures shall include a clear method of identifying and requesting the work to be done and the hazards that may be affected.
17.2.3 Prior to carrying out any modification to a SIS (including the application program) an analysis shall be carried out to determine the impact on functional safety as a result of the proposed modification. When the analysis shows that the proposed modification could impact safety then there shall be a return to the first phase of the SIS safety life-cycle affected by the modification.
17.2.4 Safety planning for the modification and re-verification shall be available. Modifications and re-verifications shall be carried out in accordance with the planning.
17.2.5 All documentation affected by the modification shall be updated.
17.2.6 Modification activity shall not begin until a FSA is completed in accordance with 126.96.36.199.9 and after proper authorisation.
17.2.7 Appropriate information shall be maintained for all changes to the SIS. The information shall include:
a) a description of the modification or change;
b) the reason for the change;
c) identified hazards and SIFs which may be affected;
d) an analysis of the impact of the modification activity on the SIS;
e) all approvals required for the changes;
f) tests used to verify that the change was properly implemented and the SIS performs as required;
g) details of all SIS modification activities (e.g., a modification log);
h) appropriate configuration history;
i) tests used to verify that the change has not adversely impacted parts of the SIS which were not modified.
17.2.8 Modification shall be performed with qualified personnel who have been properly trained. All affected and appropriate personnel should be notified of the change and trained with regard to the change.
Anybody with responsibility for one or more phases in a safety lifecycle is responsible for managing their own scope, the scope of their suppliers, and for managing interfaces with the client and other parties.
Ultimately the end user has to take responsibility for ensuring that management responsibilities are clearly defined and understood for each package of work and across all organisational boundaries.
The safety lifecycle plan outlines the phases for the SIS project, defining each phase with:
• inputs and outputs,
• verification activities
It provides clarity to the team regarding the necessary activities and each person’s responsibilities.
A safety lifecycle plan can take the form of a table of the activities and outputs for each phase.
Clarity of information, removal of distractions and uncertainty – e.g. implement alarm management and ‘ASM’ graphics
Drilling (i.e. regular repeated practice)
An increase of +1 is allowable provided that the system designer provides justification that there is sufficient independence between the elements through common cause failure analysis.
No, IEC 61511-1 §6.2.3 requires planning for the techniques, measures, procedures and responsible organisation for all safety lifecycle phases.
IEC 61511-1 §12.6.2 requires selection of methods, techniques and tools for each lifecycle phase for the application program.
0.1 at best, given sufficient information to recognise the hazards, familiarity with the scenario and enough time in which to respond.
Systematic capability needs to be considered if the devices are claimed to be compliant with IEC 61508. An increase of +1 in SC is allowable only if there is sufficient independence between the elements. As these two sensors are identical they cannot be sufficiently independent.
We have two options:
1. Replace at least one sensor with a device compliant with IEC 61508 and demonstrated to have SC 3 capability. Demonstrate that the other sensor is sufficiently independent with respect to common cause failures or else replace that one with a SC3 device too.
2. Instead of relying on compliance with IEC 61508 we might be able to demonstrate suitability for selection of the devices based on prior use in accordance with IEC 61511-1 §11.5.3 to §11.5.6. The user would need a sufficient volume of evidence from operating experience with this make and model of device.
Essentially the ‘prior use’ approach in IEC 61511 achieves the same aim as the IEC 61508-2 Route 2S ‘proven in use’ method of demonstrating systematic capability.
It can be quantified in terms of SC 1 to SC 4 by analysing the selection of techniques and measures and assessing their appropriateness and degree of effectiveness.
Reference can be made to Annexes A and B in IEC 61508-2 and -3
measure (expressed on a scale of SC 1 to SC 4) of the confidence that the systematic safety integrity of an element meets the requirements of the specified SIL, in respect of the specified element safety function, when the element is applied in accordance with the instructions specified in the compliant item safety manual for the element
systematic safety integrity
part of the safety integrity of a safety-related system relating to systematic failures in a dangerous mode of failure
NOTE Systematic safety integrity cannot usually be quantified (as distinct from hardware safety integrity which usually can).
measure (expressed on a scale of SC 1 to SC 4) of the confidence that the systematic safety integrity of a device meets the requirements of the specified SIL, in respect of the specified safety function, when the device is applied in accordance with the instructions specified in the device safety manual
systematic safety integrity
part of the safety integrity of the SIS relating to systematic failures in a dangerous mode of failure
Define the lifecycle phases by defining the specific outputs to be produced (e.g. documents, data, equipment items, software code modules) and define the inputs that the outputs are to be based on.
For each output define who is responsible for preparing, verifying and approving the output.
Define the method of verification and the verification records that are to be kept.
Define specific techniques, measures, guidelines or templates to be used.
The level of HFT required can be minimised through:
Policy and Strategy
Hazard and Risk Analysis
Follow up and resolution of recommendations
Assessment and Auditing
Management of Changes
And then in section 6, Life-cycle and document planning,
and in section 7, Verification planning
λT / 2
≈ 2 x 10⁻⁷ h⁻¹ x 8,760 h / 2
≈ 9 x 10⁻⁴ or about 10⁻³
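As a rough cross-check, the λT / 2 approximation above can be evaluated in a few lines of Python. This is only a sketch; the function and variable names are illustrative assumptions and are not taken from the standards.

```python
HOURS_PER_YEAR = 8760

def pfd_avg_single_channel(lambda_du_per_hour, test_interval_hours):
    """Simplified average PFD for one channel: lambda_DU x T / 2."""
    return lambda_du_per_hour * test_interval_hours / 2

# Values from the worked answer: 2 x 10^-7 per hour, tested once a year
pfd = pfd_avg_single_channel(2e-7, HOURS_PER_YEAR)
print(f"PFDavg is about {pfd:.1e}")
```

The unrounded figure is slightly below 9 x 10⁻⁴, which is why it is fair to describe it as about 10⁻³.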
The dangerous undetected failures are:
The total λDU is 200 + 180 + 50 = 430 FITS
λDU = 430 x 10⁻⁹ per hour = 4.3 x 10⁻⁷ per hour
8760 hours per year ≈ 0.9 x 10⁴ hours per year
λDU ≈ 0.9 x 10⁴ x 4.3 x 10⁻⁷ per year
≈ 4 x 10⁻³ per year
≈ 0.004 pa
This corresponds to a MTBFDU of about 250 years.
λ = 1000 FITS = λDU + λDD + λS
Therefore λDD + λS = 1000 – 430 FITS = 570 FITS
The SFF is (λDD + λS) / (λDU + λDD + λS) = 570/1000 = 57%
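The FIT arithmetic above can be reproduced with a short script. It is a minimal sketch using the figures from the answer; the variable names are assumptions made for illustration.

```python
FIT = 1e-9             # one failure per 10^9 hours
HOURS_PER_YEAR = 8760

lambda_du_fits = 200 + 180 + 50   # dangerous undetected contributions
lambda_total_fits = 1000          # total failure rate given in the question

lambda_du_per_hour = lambda_du_fits * FIT
lambda_du_per_year = lambda_du_per_hour * HOURS_PER_YEAR
mtbf_du_years = 1 / lambda_du_per_year

# Safe failure fraction: everything that is not dangerous undetected
sff = (lambda_total_fits - lambda_du_fits) / lambda_total_fits

print(f"lambda_DU = {lambda_du_per_year:.4f} per year")
print(f"MTBF_DU   = {mtbf_du_years:.0f} years")
print(f"SFF       = {sff:.0%}")
```

Without the intermediate rounding the MTBFDU comes out nearer 265 years than 250 years. Either figure is adequate given the uncertainty in the failure rate data.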
Random? Maybe, but unlikely to be purely random
It is acceptable to use generic data in Route 1H:
188.8.131.52 The estimated failure rates, due to random hardware failures, for elements (see 184.108.40.206 a) and c)) can be determined either
NOTE 1 Any failure rate data used should have a confidence level of at least 70 %. The statistical determination of confidence level is defined in reference  of the Bibliography. For an equivalent term: “significance level”, see reference .
NOTE 2 If site-specific failure data are available then this is preferred. If this is not the case then generic data may have to be used.
IEC 61508-4 clause 3.5.16 and IEC 61511-1 Ed 2 clause 3.2.39:
Low demand is ‘where the frequency of demands is no greater than one per year’
Refer to IEC 61508-2 §220.127.116.11.
The difference between Type A and Type B is essentially to do with whether or not:
a) the failure modes of all constituent components are well defined; and
b) the behaviour of the element under fault conditions can be completely determined;
For both Type A and Type B IEC 61508-2 §18.104.22.168 requires:
c) there is sufficient dependable failure data to show that the claimed rates of failure for detected and undetected dangerous failures are met.
‘Critical’ and ‘possible’ puts us in Risk Class II – the middle zone of the ALARP triangle, so we need to implement further risk reduction unless the cost is disproportionately high.
The risk exposure is approximately:
1 fatality in 20 years, i.e. 0.05 fatalities y⁻¹ and
$100M / 20 years which is $5M y⁻¹
Reducing the frequency to 1 in 200 reduces the risk to
0.005 fatalities y⁻¹ and
$0.5M y⁻¹
Over 20 years we can expect to save
about 1 life (20 x 0.045 = 0.9) and
20 x $4.5M = $90M.
We could justify spending something in the range $10M to $100M because of the value of the damage. It would be hard to justify spending much more than $100M.
Considering the loss of life alone we might be able to justify spending $1M to $3M to avert a fatality.
$10M could be justified if the risk were toward the top of Risk Class II
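The benefit side of this cost-benefit argument can be sketched in Python. The figures are those in the answer; the variable names are illustrative assumptions.

```python
years = 20                  # period over which the benefit is assessed
loss_per_event = 100e6      # dollars of damage per event
fatalities_per_event = 1

freq_before = 1 / 20        # events per year without the improvement
freq_after = 1 / 200        # events per year with the improvement

freq_reduction = freq_before - freq_after
lives_saved = years * freq_reduction * fatalities_per_event
damage_avoided = years * freq_reduction * loss_per_event

print(f"Lives saved over {years} years: {lives_saved:.1f}")
print(f"Damage avoided: ${damage_avoided / 1e6:.0f}M")
```

Comparing the spend against `damage_avoided` and against `lives_saved` separately keeps the two justifications distinct, as in the answer.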
Failure rates from different sources vary over a range of 1 or 2 orders of magnitude.
Additional 0.1 for alarm, therefore need only RRF 10, SIL 1
We would need to be confident that the alarm will be treated as a safety critical alarm. How can we be sure that we can depend on the operator responding correctly? How are safety critical alarms defined and managed?
Initiating frequency 10⁻¹ pa, process design factor 0.01, BPCS factor 1 because it is the BPCS that has failed so it cannot be counted in risk reduction, Alarm factor 1 because there is no independent alarm. 0.01 for the PSV can be claimed.
10⁻¹ pa x 0.01 x 1 x 1 x 0.01 = 10⁻⁵ pa
Vulnerability V = 0.5 for a large release, V = 0.5 x 1 or 2 = 0.5 or 1
-> CC, or maybe CD
With CC, FB, PB, W1 -> SIL 2
If CD, FB, PB, W1 then SIL 3
Vulnerability V = 0.01 for a small release. With 2 or 3 people exposed we would estimate 0.02 to 0.03 fatalities per annum, V = 0.01 x 3 = 0.03 -> CB,
Choose FB because people are frequently in the vicinity
Choose PB because we don’t have the possibility of avoiding the hazard if the SIF fails
Choose W2 for 0.1 x D per annum but it is marginal, on the borderline with W1,
-> SIL 2, or maybe SIL 1
Start with the initiating event of once in 10 years or 0.1 pa.
Multiply by the probability of failure of the two existing risk controls, x 0.1 for the operator failing to respond successfully to the alarm and x 0.01 for the PSV failing.
That gives us:
0.1 pa x 0.1 x 0.01 = 0.0001 pa.
It may be easier to work this out using scientific notation, simply add the exponents:
10⁻¹ pa x 10⁻¹ x 10⁻² = 10⁻⁴ pa
RRF = consequence frequency / tolerable frequency
= 10⁻⁴ pa / 10⁻⁵ pa = 10,
PFDAVG = tolerable frequency / consequence frequency
= 10⁻⁵ pa / 10⁻⁴ pa = 0.1,
Lower end of SIL 1 range
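The same layer-by-layer multiplication can be written as a small Python sketch. The dictionary keys and variable names are assumptions for illustration; the numbers are those used in the answer.

```python
import math

initiating_frequency = 0.1      # per annum (once in 10 years)
tolerable_frequency = 1e-5      # per annum

# Probability of failure on demand claimed for each existing risk control
layer_pfds = {
    "operator response to alarm": 0.1,
    "PSV": 0.01,
}

consequence_frequency = initiating_frequency * math.prod(layer_pfds.values())
rrf = consequence_frequency / tolerable_frequency
pfd_avg_target = tolerable_frequency / consequence_frequency

print(f"Consequence frequency: {consequence_frequency:.0e} pa")
print(f"Required RRF: {rrf:.0f}, target PFDavg: {pfd_avg_target:.1f}")
```

Adding or removing a protection layer is then a one-line change to `layer_pfds`, which makes it easy to see how much each layer contributes.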
We can work this all the way through in units of per annum, it is easier:
(1/10y + 1/200y + 1/10y) + (1/5y + 1/5y + 1/50y)
(0.1pa + 0.005pa + 0.1pa) + (0.2pa + 0.2pa + 0.02pa)
≈ 0.2pa + 0.42pa ≈ 0.6pa
If we want to make it harder for ourselves we can work it through in units of h⁻¹
(1 x 10⁻⁵/h + 6 x 10⁻⁷/h + 1 x 10⁻⁵/h) + (2 x 10⁻⁵/h + 2 x 10⁻⁵/h + 2 x 10⁻⁶/h)
≈ 2 x 10⁻⁵/h + 4 x 10⁻⁵/h
≈ 6 x 10⁻⁵ per hour x 8,760 hours per year ≈ 5 x 10⁻¹ pa
≈ 0.5pa, which is close enough to the same answer
The quick way to check the result is to look at causes that are most frequent. We can see that together the blocked outlet and cooling failure occur about once every 2.5 years, so the answer is going to be a bit more frequent than 0.4 pa.
The next most frequent causes are process control failures and valve failures, each once in 10 years. Together that is once in 5 years or 0.2 pa.
That gives us the same result of 0.6 pa. The less frequent causes (e.g. 1 in 50 years) can be ignored.
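A short Python sketch of the same sum shows how little the infrequent causes contribute. The list simply holds the six intervals from the calculation; the 0.1 pa cut-off used to pick out the dominant causes is an assumption for illustration.

```python
# Mean interval between occurrences, in years, for each initiating cause
cause_intervals_years = [10, 200, 10, 5, 5, 50]

frequencies_pa = [1 / years for years in cause_intervals_years]
total_pa = sum(frequencies_pa)

# Quick check: keep only the causes occurring at least once in 10 years
dominant_pa = sum(f for f in frequencies_pa if f >= 0.1)

print(f"All causes:      {total_pa:.3f} pa")
print(f"Dominant causes: {dominant_pa:.3f} pa")
```

The two totals differ by only a few percent, which supports ignoring the less frequent causes when making a quick estimate.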
The consequence frequency without the SIF is 0.1 pa x 10 % = 10⁻² pa.
The RRF needed is the consequence frequency divided by tolerable frequency:
RRF = 10⁻² pa / 10⁻⁵ pa = 1,000
i.e. we want to reduce the consequence frequency by a factor of 1,000 to reach a tolerable level.
This is on the border of SIL 2 and SIL 3. It would usually be classed as SIL 3.
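The borderline question can be made explicit in code. The sketch below maps a required RRF to a SIL for demand mode, with a flag for whether a value sitting exactly on a boundary is rounded up to the higher SIL. The function name and the flag are assumptions for illustration; for simplicity it returns 4 for anything above 10,000.

```python
def sil_band(rrf, round_up_at_boundary=True):
    """Return the SIL (1 to 4) for a required RRF, or 0 if no SIL is needed."""
    lower_limits = [10, 100, 1_000, 10_000]   # lower RRF limit of SIL 1 to SIL 4
    sil = 0
    for level, lower in enumerate(lower_limits, start=1):
        if rrf > lower or (round_up_at_boundary and rrf == lower):
            sil = level
    return sil

# Figures from the answer; rounding removes floating-point noise
rrf = round((0.1 * 0.10) / 1e-5, 6)

print(sil_band(rrf))                              # conservative convention
print(sil_band(rrf, round_up_at_boundary=False))  # less conservative convention
```

Whichever convention is chosen, the required RRF itself should still be stated alongside the SIL so that the target is unambiguous.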
= λDD / (λDD + λDU)
= 500 / (500 + 1500) = 500 / 2000 = 25%
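Written as code, the ratio above (the share of dangerous failures that are detected, which reads as diagnostic coverage) needs the denominator in brackets. This is a minimal sketch; the variable names are assumptions, and both rates are taken to be in the same unit.

```python
lambda_dd = 500    # dangerous detected failure rate
lambda_du = 1500   # dangerous undetected failure rate

# The brackets matter: lambda_dd / lambda_dd + lambda_du would give 1501
diagnostic_coverage = lambda_dd / (lambda_dd + lambda_du)

print(f"DC = {diagnostic_coverage:.0%}")
```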
Maybe, but unlikely. Can you find dependable data with a high confidence level? How could you justify it?
IEC 61508-2 §22.214.171.124.3
If Route 2H is selected, then the reliability data used when quantifying the effect of random hardware failures (see 7.4.5) shall be:
a) based on field feedback for elements in use in a similar application and environment; and,
b) based on data collected in accordance with international standards (e.g., IEC 60300-3-2 or ISO 14224:); and,
c) evaluated according to:
i) the amount of field feedback; and,
ii) the exercise of expert judgement; and where needed,
iii) the undertaking of specific tests;
in order to estimate the average and the uncertainty level (e.g., the 90 % confidence interval or the probability distribution (see Note 2)) of each reliability parameter (e.g., failure rate) used in the calculations.
NOTE 1 End-users are encouraged to organize relevant component reliability data collections as described in published standards.
NOTE 2 The 90 % confidence interval of a failure rate λ is the interval [λ5 %, λ95 %] in which its actual value has a probability of 90 % to belong to. λ has a probability of 5 % to be better than λ5 % and worse than λ95 %. On a pure statistical basis, the average of the failure rate may be estimated by using the “maximum likelihood estimate” and the confidence bounds (λ5 %, λ95 %) may be calculated by using the χ² function. The accuracy depends on the cumulated observation time and the number of failures observed. The Bayesian approach may be used to handle statistical observations, expert judgement and specific test results. This can be used to fit relevant probabilistic distribution functions for further use in Monte Carlo simulation.
If route 2H is selected, then the reliability data uncertainties shall be taken into account when calculating the target failure measure (i.e. PFDavg or PFH) and the system shall be improved until there is a confidence greater than 90 % that the target failure measure is achieved.
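To give a feel for the confidence bounds described in NOTE 2, here is a minimal sketch using only the Python standard library. Instead of calling a chi-squared library function it solves the equivalent Poisson relationship by bisection, which gives the same bounds for a fixed observation time. The function names and the field data (3 failures in 10⁷ device-hours) are hypothetical and chosen only for illustration.

```python
import math

def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson-distributed failure count with the given mean."""
    term = math.exp(-mean)
    total = term
    for i in range(1, k + 1):
        term *= mean / i
        total += term
    return total

def solve_mean(k, target_cdf):
    """Find the Poisson mean at which P(X <= k) equals target_cdf."""
    lo, hi = 0.0, 10.0 * (k + 1) + 50.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if poisson_cdf(k, mid) > target_cdf:
            lo = mid    # the CDF falls as the mean rises, so look higher
        else:
            hi = mid
    return (lo + hi) / 2

def failure_rate_interval(failures, hours, confidence=0.90):
    """Return (lower bound, maximum likelihood estimate, upper bound) per hour."""
    tail = (1 - confidence) / 2
    lower = solve_mean(failures - 1, 1 - tail) / hours if failures > 0 else 0.0
    upper = solve_mean(failures, tail) / hours
    return lower, failures / hours, upper

# Hypothetical field data: 3 dangerous failures in 1e7 device-hours
lower, mle, upper = failure_rate_interval(3, 1e7)
print(f"lambda_5% = {lower:.1e}, MLE = {mle:.1e}, lambda_95% = {upper:.1e} per hour")
```

With so few failures the upper bound is more than double the maximum likelihood estimate, which illustrates why Route 2H demands a large volume of field feedback.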
IEC 61511-1 §11.9.3 does allow generic data to be used:
Route 2H depends on the availability of dependable failure rate information with a data confidence level of 90%
Route 1H depends on safe failure fraction, which can be improved by increased diagnostic coverage. With Route 1H it is possible to compensate for lack of dependability in failure rate data by providing diagnostic coverage.
Under Route 1H it is possible to justify SIL 3 with no hardware fault tolerance if the diagnostic coverage is sufficiently high. Route 2H always requires fault tolerance for SIL 3 and for continuous mode SIL 2.
In practice almost all failures are mostly systematic in nature, not purely random. The failure rate depends very heavily on how much effort is put into prevention of failure.
Failure rates from different sources may also vary due to the size of the data sets and due to the decisions made regarding which failures should be excluded. Some of the independent certifying authorities exclude systematic failures, some do not.
HFT is required to compensate for uncertainty in design assumptions and uncertainty in failure rate data.
Speed control on a steam turbine with no other overspeed protection, or reactant ratio control in a process reactor so that if the function fails a hazardous situation could immediately occur.
A permissive interlock may be considered as a continuous mode SIF. The interlock maintains equipment in a safe state. If the interlock fails a hazard may immediately result.
Probability of failure per hour rather than probability of failure on demand. In a continuous mode SIF it is the failure of the SIF itself that is the cause of the hazardous event. That is why we characterise it by a failure rate.
It is on the borderline between SIL 1 and SIL 2. It could be classified as SIL 1 provided that the RRF is specified in the SRS.
In semi-quantitative methods and in qualitative methods a RRF of 100 (i.e. 2 orders of magnitude in risk reduction) would usually be classified as SIL 2.
SIL 2 functions provide RRF of at least 100.
The target for PFD will remain the same at 0.01 (1/100), regardless of whether we classify it as SIL 1 or SIL 2. But if we classify it as SIL 2 we need to demonstrate systematic capability SC 2. More attention will need to be paid to quality control. It can be argued that the systematic integrity is far more important than the PFD, so it is better to classify a RRF of 100 as SIL 2 rather than SIL 1 to be conservative.
Some organisations are conservative and round the RRF up, so a RRF of 90 would be classed SIL 2. Other organisations are less risk averse and would insist on RRF 100 being classed as SIL 1.
The standards were developed in response to increasing complexity of safety related systems.
The complexity increases the risk of systematic failure; more than 90% of the failures are systematic in nature and can be prevented or controlled through quality techniques, procedures and practices.
If a continuous mode SIF fails dangerously a potentially hazardous situation will occur unless action is taken to prevent it. A continuous mode SIF acts to maintain a safe state.
A demand mode SIF may fail dangerously but a potentially hazardous situation will not occur until there is a failure in the process or in the BPCS. A demand mode SIF takes no action until a demand is detected. It then acts to put the equipment into a safe state.
At least 4 orders of magnitude of risk reduction are needed to reduce the risk from ‘severe‘ to ‘medium’. We need to reduce both likelihood (through prevention) and consequence (through mitigation).
Reducing likelihood alone would only achieve ‘high’ risk at best, cell A+.0 is in the orange zone.
IEC 61508 is the more general standard that covers safety related systems in all industry sectors. IEC 61511 is a specific application of IEC 61508 to the process industry sector.
IEC 61508 covers the design and manufacture of equipment and components for safety systems. IEC 61511 is limited to the application of the equipment and components.
A good example of ‘wilful blindness’ is the tacit acceptance of a sub-standard situation because of a perceived lack of funding or due to inappropriate management priorities. ‘Learned helplessness’ afflicts employees who learn that the managers do not or cannot respond to issues that affect the employees’ safety.
Managers need to have clear policies and strategies in place to achieve safety and they need to communicate them effectively. They need to have the means to evaluate the achievement of their policies and strategies.
ICAF = Cost / Number of lives saved
$100k / [5 y x (5 x 10⁻⁴ pa – 1 x 10⁻⁵ pa)]
≈ $100k / [5 y x (5 x 10⁻⁴ pa)] ≈ $100k / (25 x 10⁻⁴)
= $4 x 10³ / 10⁻⁴
= $4 x 10⁷ = $40M per life, which is disproportionately high. No, the work cannot be easily justified on this basis.
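The ICAF calculation can be checked without the intermediate rounding. This is a minimal sketch using the figures from the answer; the variable names are assumptions.

```python
cost = 100_000          # dollars
years = 5               # period over which the benefit is counted
freq_before = 5e-4      # fatalities per annum before the change
freq_after = 1e-5       # fatalities per annum after the change

lives_saved = years * (freq_before - freq_after)
icaf = cost / lives_saved

print(f"Lives saved: {lives_saved:.2e}")
print(f"ICAF: ${icaf / 1e6:.0f}M per life")
```

The unrounded result is marginally above $40M per life, so the conclusion is unchanged.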
People should not usually be expected to take more risk in their workplace than in their private lives, but an increased level of harm may be tolerated if it is in proportion to the perceived benefit to society. For instance, deep sea divers, underground miners, firefighters, police and soldiers may be exposed to higher risk.
The risk needs to be identified, assessed and managed to a level that is as low as is reasonably practicable. This means that the cost of further risk reduction would be disproportionately high compared to the benefit gained in safety.
Identify appropriate standards or work practices
Take reasonable steps to apply the standards or practices
Owners, operators, designers and installers of plant must all identify and control hazards in the workplace.
The legislation has specific requirements for any person who can control or influence hazards in a workplace.
Functional Safety refers to the application of safety instrumented functions to provide a defined degree of risk reduction in a hazardous facility.
Demand mode safety functions take action on demand to achieve a safe state in response to detection of a developing hazard,
Continuous mode safety functions take continuous action to maintain a safe state, preventing a hazard from occurring.
Not necessarily, the requirement stems from duty of care because the standards are well established and widely applied.
Ignoring the standards could be deemed to be negligence – unless you can find and apply some similar well-established standard.
Application of IEC 61511 is required by some other standards – such as AS 3814, and these may be referenced in legislation.
In some jurisdictions codes of practice may specifically refer to the standards.