The reliability of a product is its ability to perform a specified function under specified conditions and within a specified time. The probability that it does so is called the reliability (degree of reliability) of the product.
Product reliability can be divided into inherent reliability and service reliability. Inherent reliability is built into the product through design and manufacturing, and is therefore under the control of the product developer. Service reliability describes how well the product retains its performance during actual use. It depends not only on the inherent reliability but also on factors such as installation, operation, use and maintenance support. Product reliability design and analysis are therefore the prerequisite for a product to achieve reliability and a strong guarantee that a reliable product can be produced.
From the above definition, we can see that product reliability is designed in, built in during production, and maintained through management. The developer's level of reliability design has a significant impact on the inherent reliability of the product, so reliability design and analysis occupy a very important position in the product development process. Below, we discuss product reliability design and analysis from four aspects: reliability design techniques; failure mode, effects and criticality analysis (FMECA); fault (failure) tree analysis; and maintainability design. We hope this helps those engaged in product design and quality work.
First, the main techniques of reliability design
1. Specify qualitative and quantitative reliability requirements. Reliability indicators give the reliability design a target and make it possible to evaluate the reliability of the developed product, which protects both the developer and the customer from losses caused by frequent failures in use. The most commonly used reliability indicators are mean time between failures (MTBF) and service life.
2. Establish a reliability model. System-level and subsystem-level reliability models of the product are used to quantitatively allocate, predict, and evaluate product reliability.
Reliability models include reliability block diagrams and reliability mathematical models. For one or more functional modes of a complex product, a block diagram shows how the failure of each component, or of a combination of components, leads to failure of the product. The basic block diagram structures are the series model and the parallel model.
The procedure is to draw the series and parallel block diagram of the designed product, use the corresponding mathematical formulas to calculate the reliability and failure rate of the product, and finally derive the reliability index.
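As a rough illustration of the mathematical model, the sketch below computes the reliability of series and parallel blocks. It assumes the units fail independently and, for the MTBF calculation, that each unit has a constant failure rate (exponential distribution). The function names and the failure rates are hypothetical and are not taken from any standard.

```python
import math

def series_reliability(reliabilities):
    """Series model: every block must work, so R = product of R_i."""
    return math.prod(reliabilities)

def parallel_reliability(reliabilities):
    """Parallel model: one working block is enough, so R = 1 - product of (1 - R_i)."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)

def exponential_reliability(failure_rate, hours):
    """R(t) = exp(-lambda * t); valid only for a constant failure rate."""
    return math.exp(-failure_rate * hours)

# Hypothetical failure rates (failures per hour) of three units in series.
rates = [2e-5, 5e-5, 3e-5]
mission_hours = 1000.0

unit_reliabilities = [exponential_reliability(lam, mission_hours) for lam in rates]
r_series = series_reliability(unit_reliabilities)

# For exponential units in series the failure rates add, so MTBF = 1 / sum(lambda_i).
mtbf_series = 1.0 / sum(rates)

# Duplicating the first unit in parallel raises the reliability of that block.
r_redundant_block = parallel_reliability([unit_reliabilities[0]] * 2)

print(f"series reliability over {mission_hours:.0f} h: {r_series:.4f}")
print(f"series MTBF: {mtbf_series:.0f} h")
print(f"redundant block reliability: {r_redundant_block:.6f}")
```

Real products usually mix series and parallel blocks, in which case the two functions are applied from the innermost group outward.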
3. Reliability allocation. Allocation distributes the quantitative requirement for the overall reliability of the product down to the specified product levels, so that the quantitative requirements for the whole and for its parts are consistent. It is a decomposition process from whole to part and from top to bottom. There are many reliability allocation methods, such as the score allocation method and the proportional allocation method. Below we use the score allocation method as an example:
The score allocation method is commonly used. When product reliability data are lacking, experts who are familiar with the product and have practical engineering experience are asked to score each subsystem (from 1 to 10 points) on several factors that affect product reliability: complexity, technical maturity, importance and environmental conditions.
Complexity: Scored according to the number of components that make up the subsystem and the difficulty of assembly and debugging. The most complex subsystem scores 10 points and the simplest scores 1 point.
Technical maturity: Scored according to the technical level and maturity of the subsystem. Low technical maturity scores 10 points and high technical maturity scores 1 point.
Importance: Scored according to the importance of the subsystem. The lowest importance scores 10 points and the highest importance scores 1 point.
Environmental conditions: Scored according to the environmental conditions the subsystem operates in. The harshest conditions score 10 points and the best environmental conditions score 1 point.
From these scores, the system-level reliability index, such as the mean time between failures (MTBF), can be apportioned quantitatively to each component with the score allocation method.
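The text does not give the allocation formulas, so the following is only a sketch of one common formulation. Each subsystem's four scores are multiplied into a weight, and the system failure rate is shared out in proportion to the weights. It assumes a series system with constant failure rates, so that failure rate = 1 / MTBF. The subsystem names, scores and the 1000-hour target are hypothetical.

```python
def allocate_by_score(system_mtbf, scores):
    """Split a system MTBF target among subsystems in proportion to their scores.

    scores maps a subsystem name to four scores from 1 to 10:
    (complexity, technical maturity, importance, environment).
    A higher score means a larger share of the allowed failure rate.
    """
    system_rate = 1.0 / system_mtbf
    weights = {}
    for name, (complexity, maturity, importance, environment) in scores.items():
        weights[name] = complexity * maturity * importance * environment
    total_weight = sum(weights.values())

    allocation = {}
    for name, weight in weights.items():
        share = weight / total_weight
        rate = share * system_rate
        allocation[name] = {"share": share, "failure_rate": rate, "mtbf": 1.0 / rate}
    return allocation

# Hypothetical expert scores for three subsystems.
scores = {
    "power supply": (5, 2, 4, 5),   # weight 200
    "controller":   (8, 5, 2, 5),   # weight 400
    "sensor":       (4, 5, 4, 5),   # weight 400
}
allocation = allocate_by_score(system_mtbf=1000.0, scores=scores)

for name, result in allocation.items():
    print(f"{name}: share {result['share']:.2f}, MTBF {result['mtbf']:.0f} h")
```

Because the allocated failure rates sum to the system failure rate, the subsystem targets remain consistent with the overall requirement.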
4. Reliability prediction. Reliability prediction is a quantitative estimate of system reliability made at the design stage. It estimates the reliability of the system and of its constituent units from the reliability data of similar products, the composition and structure of the system, and its working environment. The predicted reliability can be compared with the required reliability to judge whether the design meets the requirements. Prediction also reveals which units have high failure rates, so that weak links can be identified and improved. There are many reliability prediction methods, such as the parts count method, the stress analysis method, and the upper and lower bound method.
The parts count method is suitable for the early stage of product design and development. Its advantage is that the failure rate of the product can be estimated quickly without a detailed understanding of how each component is applied or of the logical relationships between components, but the prediction is relatively rough.
The stress analysis method is suitable for the detailed design stage of electronic products, when detailed component lists, electrical stress ratios, and ambient temperature information are already available. Its results are more accurate than those of the parts count method. The method proceeds in three steps: first, find the working failure rate of each component; second, find the working failure rate of the product; third, find the product reliability index, the mean time between failures (MTBF).
Note: The failure rates, environmental coefficients, and other data mentioned above can be looked up in the national military standard GJB299B.
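The three steps of the stress analysis method can be sketched as follows. The sketch assumes the form generally used by failure-rate handbooks: a component's working failure rate is a base failure rate multiplied by adjustment factors (for environment, quality, and so on), the product failure rate is the sum over all components (a series assumption), and failure rates are expressed per 10^6 hours. All numbers are hypothetical and are not taken from GJB299B.

```python
import math

def part_failure_rate(base_rate, factors):
    """Step 1: working failure rate = base rate x product of adjustment factors."""
    return base_rate * math.prod(factors)

def product_failure_rate(parts):
    """Step 2: sum quantity x working failure rate over all part types."""
    return sum(quantity * part_failure_rate(base_rate, factors)
               for quantity, base_rate, factors in parts)

def mtbf_hours(rate_per_million_hours):
    """Step 3: MTBF in hours for a failure rate given per 10^6 hours."""
    return 1e6 / rate_per_million_hours

# (quantity, base failure rate per 10^6 h, [environment factor, quality factor])
parts = [
    (10, 0.02, [2.0, 1.0]),   # resistors
    (4,  0.10, [2.0, 1.5]),   # capacitors
    (1,  0.80, [2.0, 1.0]),   # integrated circuit
]

total_rate = product_failure_rate(parts)
print(f"product failure rate: {total_rate:.2f} per 10^6 h")
print(f"MTBF: {mtbf_hours(total_rate):.0f} h")
```

A parts count estimate has the same shape, except that a generic failure rate for each part type replaces the stress-dependent rate computed in step 1.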
5. Reliability design criteria. These summarize the existing engineering experience with similar products in an organized, systematic and scientific form, and become the principles and requirements that designers follow in reliability design.
Reliability design criteria are generally written for a specific product, but the content common to the criteria of various products can also be consolidated into criteria for a product type, such as helicopter reliability design criteria. These common criteria can in turn be tailored and supplemented to form the reliability design criteria for a specific product.
Reliability design criteria should generally be formulated based on product type, importance, reliability requirements, use characteristics, reliability design experience of similar products, and related standards and specifications.
6. Environment-resistant design. The operating environment has an obvious impact on product reliability. Therefore, anti-vibration, anti-shock, anti-noise, moisture-proof, mildew-proof and anti-corrosion design, as well as thermal design, should be carried out during product development.
7. Selection and control of components. Electronic components are the basic, indivisible units of the circuits that perform the specified functions of the product, and they are the foundation of the reliability of electronic products. Strict control of the components used is therefore extremely important for ensuring product reliability. Formulating and implementing a component control program is an effective way to control the selection and use of components.
8. Electromagnetic compatibility design. For electronic products, electromagnetic compatibility design is very important. It covers electrostatic discharge immunity, surge and lightning immunity, immunity to power supply fluctuations and voltage dips, radiated radio-frequency electromagnetic field immunity, and so on.
9. Derating design and thermal design. The failure rate of parts and components is closely related to the stress they bear, and reducing that stress improves their reliability in use. Therefore, the working stress should be designed to be below the specified rated value, with a margin left over. An excessively high ambient temperature around a product, especially an electronic product, is an important cause of increased failure rate. Therefore, the principles of heat conduction, convection and heat radiation should be applied, together with natural ventilation, forced ventilation, water cooling or heat pipe technology as needed, to produce a reasonable thermal design that lowers the surrounding temperature.
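A derating review can be reduced to a simple check of stress ratios, as in the minimal sketch below. The part names, stress values and the 0.6 limit are hypothetical; real derating limits depend on the part type and on the applicable derating standard.

```python
def stress_ratio(working_stress, rated_stress):
    """Ratio of the applied stress to the rated value (both in the same units)."""
    if rated_stress <= 0:
        raise ValueError("rated stress must be positive")
    return working_stress / rated_stress

def derating_violations(parts, max_ratio):
    """Return the names of parts whose stress ratio exceeds the allowed ratio."""
    return [name for name, working, rated in parts
            if stress_ratio(working, rated) > max_ratio]

# (name, working stress, rated stress)
parts = [
    ("R1 power (W)",   0.10, 0.25),   # ratio 0.40
    ("C3 voltage (V)", 14.0, 16.0),   # ratio 0.875
    ("Q2 current (A)", 1.0,  2.0),    # ratio 0.50
]

violations = derating_violations(parts, max_ratio=0.6)
print(violations)
```

Parts flagged by such a check are candidates for a higher-rated replacement or for a circuit change that lowers the applied stress.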
The above are some of the main techniques of product reliability design. If the design is thorough, the reliability level of the developed product will be greatly improved.
Second, product failure mode, effects and criticality analysis (FMECA)
Failure mode, effects and criticality analysis (FMECA) considers all possible failures of the product. Based on an analysis of the failure modes, it determines the effect of each failure mode on the operation of the product, identifies single points of failure, and determines the criticality of each failure mode from its severity and its probability of occurrence. FMECA comprises failure mode and effects analysis (FMEA) and criticality analysis (CA). FMECA can be applied at any level, from the entire system down to parts and components, and is generally carried out at the specified product level as required and as far as possible.
The implementation steps of FMECA are usually:
(1) Master relevant information about product structure and function.
(2) Master the data on product start-up, running, operation and maintenance.
(3) Master the information about the environmental conditions of the product.
At the initial stage of design, not all of this information is usually available. At first, only certain assumptions can be made and some obvious failure modes determined. Even a preliminary FMECA can point out many single points of failure, some of which can be eliminated by rearranging the structure. As the design work progresses, the available information keeps increasing, so the FMECA should be repeated and extended to more detailed levels as needed and as far as possible.
(4) Define the product, its functions and its minimum working requirements. A complete definition of a system includes its primary and secondary functions, uses, expected performance, environmental requirements, system constraints, and the conditions that constitute failure. Since any given product has one or more working modes and may be in different working stages, the definition also covers each working mode and its functional description over the period of operation. Each product should have a functional block diagram showing how the product works and how its functional units relate to one another.
(5) Draw the reliability block diagram from the functional block diagram of the product.
(6) Determine the level of analysis according to the required structure and the amount of existing data.
(7) Identify the failure mode and analyze its causes and effects.
(8) Identify the fault detection methods.
(9) Identify possible preventive measures at the design stage to prevent particularly undesirable events.
(10) Determine the severity of the harm that each failure mode causes to the product.
(11) Determine the probability of occurrence of various failure modes.
(12) Fill in the FMEA form and draw the criticality matrix. If a quantitative FMECA is required, the CA form also needs to be filled in. If only FMEA is performed, step (11) and the criticality matrix can be omitted.
The above outlines the basic input information and steps required for FMECA. On this basis, the analysis is completed by following the relevant standards, which provide the forms for analysts to use. These standards include the national standard GB 7826-87 “System Reliability Analysis Technology: Failure Mode and Effects Analysis (FMEA) Procedure”, the International Electrotechnical Commission standard IEC 60812 Ed.2 (2003) 56/797, and the national military standard GJB1391 “Failure Mode, Effects and Criticality Analysis”.
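The quantitative part of the analysis, combining severity with probability of occurrence, can be sketched as follows. The text does not give a formula, so the sketch assumes the form commonly used in criticality analysis: failure mode criticality = β × α × λ × t, where α is the fraction of the item's failures that occur in the given mode, β is the conditional probability that the mode produces the stated effect, λ is the item failure rate and t is the operating time. The class, the four-level severity scale and all values are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class FailureMode:
    item: str
    mode: str
    severity: int        # hypothetical scale: 1 = most severe ... 4 = least severe
    alpha: float         # fraction of the item's failures occurring in this mode
    beta: float          # conditional probability that the mode causes the effect
    failure_rate: float  # item failure rate, failures per hour
    hours: float         # operating time

    def criticality(self):
        """Failure mode criticality: beta * alpha * lambda * t."""
        return self.beta * self.alpha * self.failure_rate * self.hours

def item_criticality(modes):
    """Sum the mode criticalities of each item within each severity category."""
    totals = defaultdict(float)
    for m in modes:
        totals[(m.item, m.severity)] += m.criticality()
    return dict(totals)

modes = [
    FailureMode("valve", "fails closed", 1, alpha=0.3, beta=1.0,
                failure_rate=1e-5, hours=1000.0),
    FailureMode("valve", "leaks", 3, alpha=0.7, beta=0.5,
                failure_rate=1e-5, hours=1000.0),
]

totals = item_criticality(modes)
# List the entries from most to least severe, as rows of a criticality matrix.
for (item, severity), value in sorted(totals.items(), key=lambda kv: kv[0][1]):
    print(f"{item}, severity {severity}: criticality {value:.4f}")
```

Plotting these values against the severity category gives the criticality matrix referred to in step (12).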
Third, fault (failure) tree analysis (FTA)
Like FMECA, fault tree analysis (FTA) is an important reliability analysis tool for analyzing the relationship between the causes and consequences of product failures. A fault tree is a logic diagram that shows which component failure modes, external events, or combinations of them lead to a given failure mode of the product. It uses a series of event symbols, logic gate symbols and transfer symbols to describe the causal relationships between events in the system.
(1) Preparation for fault tree analysis
Analysts must be familiar with the design specifications, design drawings, operating regulations, maintenance procedures, and other relevant information, and must understand the design intent, structure, function and environment of the system. Depending on the complexity and requirements of the system, an FMEA or FMECA should be carried out when necessary to help determine the top event and the fault events at each level. The purpose of the analysis is determined from the system's task requirements and an understanding of the system, and the system failure criteria are determined from the system's task functions.
(2) Construction of fault tree
After completing the preparation work in (1), start from the chosen top event and follow the basic rules and methods of fault tree construction to build the required fault tree.
(3) Qualitative analysis of fault tree
The qualitative analysis of the fault tree mainly includes the following: normalization of the fault tree into a standard form; simplification and modular decomposition of the fault tree; and calculation of the minimal cut sets of the fault tree.
(4) Quantitative analysis of fault tree
The probability of the top event is calculated from the occurrence probabilities of the bottom events in the fault tree.
(5) Compile fault tree analysis report
Because fault tree analysis has become an important tool for system reliability analysis and is widely used by reliability engineers, corresponding standards have been issued at home and abroad. These include the national standard GB/T 7829-1987 “Fault Tree Analysis Procedure”, the national military standard GJB768 “Fault Tree Analysis”, and the International Electrotechnical Commission standard IEC 61025 Ed.1-1990-10 “Fault Tree Analysis”.
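The qualitative and quantitative steps can be illustrated on a very small tree. The sketch below finds the minimal cut sets of a tree built only from AND and OR gates, then computes the exact top event probability by inclusion-exclusion over those cut sets. It assumes the bottom events are statistically independent. The tree representation (nested tuples), the event names and the probabilities are hypothetical. Inclusion-exclusion grows exponentially with the number of cut sets, so it is practical only for small trees.

```python
import math
from itertools import combinations, product

def minimal_cut_sets(node):
    """Return the minimal cut sets of a tree of AND/OR gates.

    A node is either a bottom event name (str) or a (gate, children) tuple.
    """
    if isinstance(node, str):
        return {frozenset([node])}
    gate, children = node
    child_sets = [minimal_cut_sets(child) for child in children]
    if gate == "OR":
        # A cut set of any one child is enough to trigger the gate.
        candidates = set().union(*child_sets)
    elif gate == "AND":
        # One cut set from every child must occur together.
        candidates = {frozenset().union(*combo) for combo in product(*child_sets)}
    else:
        raise ValueError(f"unknown gate: {gate}")
    # Absorption: drop any set that strictly contains another set.
    return {s for s in candidates if not any(other < s for other in candidates)}

def top_event_probability(cut_sets, probabilities):
    """Exact top event probability by inclusion-exclusion over the cut sets."""
    sets = list(cut_sets)
    total = 0.0
    for k in range(1, len(sets) + 1):
        sign = 1.0 if k % 2 else -1.0
        for combo in combinations(sets, k):
            events = frozenset().union(*combo)
            total += sign * math.prod(probabilities[e] for e in events)
    return total

# Top event: loss of cooling = (both pumps fail) OR (power is lost).
tree = ("OR", [("AND", ["pump_a_fails", "pump_b_fails"]), "power_lost"])
probabilities = {"pump_a_fails": 0.1, "pump_b_fails": 0.2, "power_lost": 0.01}

cut_sets = minimal_cut_sets(tree)
p_top = top_event_probability(cut_sets, probabilities)

print(sorted(sorted(s) for s in cut_sets))
print(f"top event probability: {p_top:.4f}")
```

A cut set containing a single event, such as the power loss here, is a single point of failure and is usually the first target for design improvement.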
Fourth, maintainability design
Maintainability is designed into the product. Only when maintainability design and analysis are carried out during product design and development can maintainability be built into the product. Maintainability design methods are either qualitative or quantitative. Qualitative maintainability design is the most important: as long as the designer has maintenance awareness and engineering experience, maintainability can be designed into the product. Qualitative maintainability design mainly includes simplified design, accessibility design, standardization, interchangeability and modular design, error-proofing and identification marking design, maintenance safety design, fault detection design, and maintenance human factors engineering design.
(1) Simplified design means adopting the simplest possible structure and shape while still satisfying performance and use requirements, so as to reduce the skill required of operating and maintenance personnel. The basic principles of simplified design are to simplify product functions as far as possible, merge product functions, and minimize the variety and number of components.
(2) Accessibility design makes it easy to reach the parts that need repair when the product fails. It requires that the part be “visible”, that is, visually reachable, and “within reach”, that is, physically reachable: a part of the body or a tool can get to the part being maintained, with enough space left for the maintenance operation. Sensible placement of maintenance windows and maintenance channels is an important way to make parts visible and within reach.
(3) Standardization, interchangeability and modular design. Standardized design is a characteristic of modern product design. Using standard parts as much as possible in the design helps with the supply, stocking and substitution of parts and components, and makes product maintenance easier.
Interchangeability means that products of the same kind can replace one another both physically and functionally. Interchangeable design simplifies maintenance operations, reduces spare parts costs, and improves product maintainability.
Modular design is an effective way to achieve universal interchangeability and quick replacement and repair of components. A module is a structure that can be separated from the product and has a relatively independent function. The modularization coefficient of advanced hardware designs exceeds 70% to 80%, and that of advanced software designs exceeds 50%.
(4) Error-proofing and identification marking design. Error-proofing design ensures that a part can be installed only in the correct way: it cannot be installed wrongly or reversed, or, if an error does occur, it can be found and corrected immediately. Identification marking design marks the parts to be repaired, spare parts, special tools, test equipment, and so on, so that they can be told apart, which prevents confusion, avoids accidents caused by errors, and improves work efficiency.
(5) Maintenance safety design. Maintenance safety design is design that prevents injury to maintenance personnel and damage to the product. For example, where danger may arise, auxiliary precautions such as conspicuous signs, warning lights, and audible warnings should be provided. For devices that store a large amount of energy, such as those containing high-pressure gas, springs or high voltage, and that must be dismantled during maintenance, a means of releasing the stored energy and safe, reliable disassembly and assembly equipment and tools should be provided. The design should also provide protection against mechanical injury, electric shock, fire, explosion, toxic substances, and so on, to ensure the safety of maintenance personnel.
(6) Fault detection design. Whether product faults can be detected and diagnosed accurately, quickly and simply has a significant impact on maintenance. Therefore, issues such as test methods, test equipment, and the placement of test points should be fully considered in the design to speed up fault location.
(7) Maintenance human factors engineering design. Maintenance human factors engineering studies the human factors in maintenance, including physiological factors, psychological factors, and the relationship between the dimensions of the human body and the product, in order to improve the efficiency of maintenance work and reduce maintenance personnel fatigue.