Design of Experiments (DOE): The Overview

Design of Experiments (DOE) is a systematic approach to planning, conducting, and analyzing scientific experiments efficiently. It is an indispensable tool in the optimization of complex processes, especially in engineering and manufacturing. Unlike traditional one-variable-at-a-time methods, DOE varies multiple factors simultaneously to efficiently assess their individual and interactive effects on the outcome. The most obvious benefit of DOE is that it can dramatically reduce the number of experiments needed to characterize a system (well, depending on the design you choose). But more importantly, it allows us to uncover interactions between variables, something that one-variable-at-a-time testing simply cannot do.

This note gives a high-level overview of DOE as a technique; a detailed look at various methods will come in subsequent posts.

The key features of DOE:

Efficiency: DOE allows for the investigation of multiple factors in a relatively small number of experiments. This efficiency is achieved through a carefully designed experimental plan.
Interactions: DOE explicitly accounts for interactions between variables, recognizing that the combined effect of factors may differ from their individual contributions. Conventional methods often overlook these interactions, leading to incomplete or inaccurate conclusions.
Statistical Rigor: DOE incorporates statistical principles from the outset, providing a robust framework for data analysis. This ensures that results are not merely anecdotal but are backed by statistical significance, enhancing the reliability of conclusions drawn from the experiments.
Optimization: One of the primary goals of DOE is to optimize processes or systems by identifying the ideal conditions for the desired outcome. This is achieved by systematically exploring the experimental space and determining the settings that yield the best results.
Prioritization: DOE provides a comprehensive understanding of the factors influencing a system and their relative importance. This allows us to prioritize which changes and improvements will have the most significant impact on the performance of a process.
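To make the point about interactions concrete, here is a minimal sketch comparing one-variable-at-a-time testing with a 2x2 factorial. The `response` function and its coefficients are entirely made up for illustration; factors are in coded units (-1 = low, +1 = high).

```python
from itertools import product


def response(a, b):
    """Hypothetical process response; the 4*a*b term is the interaction."""
    return 10 + 2 * a + 3 * b + 4 * a * b


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# One-variable-at-a-time: start at (low, low) and change one factor per run.
baseline = response(-1, -1)
ofat_a = response(+1, -1) - baseline  # apparent effect of raising A
ofat_b = response(-1, +1) - baseline  # apparent effect of raising B

# Full 2x2 factorial: run all four combinations.
runs = {(a, b): response(a, b) for a, b in product((-1, +1), repeat=2)}

# Main effect = mean response at the high level minus mean at the low level.
main_a = mean(y for (a, b), y in runs.items() if a == +1) - mean(
    y for (a, b), y in runs.items() if a == -1
)
main_b = mean(y for (a, b), y in runs.items() if b == +1) - mean(
    y for (a, b), y in runs.items() if b == -1
)
# Interaction effect uses the sign of the product a*b as the contrast.
interaction_ab = mean(y for (a, b), y in runs.items() if a * b == +1) - mean(
    y for (a, b), y in runs.items() if a * b == -1
)

best_setting = max(runs, key=runs.get)

print("OFAT effects:", ofat_a, ofat_b)
print("Factorial effects:", main_a, main_b, interaction_ab)
print("Best setting found by the factorial:", best_setting)
```

In this toy system, both one-at-a-time changes make the response worse, so that approach would keep the baseline. The factorial adds one extra run (high, high), which turns out to be the best setting, and it also gives a direct estimate of the interaction.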

Overview of DOE steps

Define the Objectives
Identify Factors and their Levels
Select a Design
Conduct Experiments and Analyze Data

Define the Objectives

Defining DOE objectives involves clearly articulating the specific goals or outcomes you aim to achieve through the experimental process. These objectives guide the entire DOE process. Objectives can vary based on the context of the experiment but generally include:

Optimization: Determine the optimal combination of factors and levels to achieve the best outcome or performance.
Understanding and Prioritizing Factors: Gain insights into the individual and interactive effects of factors on the outcome to better understand the system, and identify the most influential factors that significantly impact the response variable.
Process and Quality Improvement: Improve the efficiency, quality, or performance of a process, system, or product.
Cost Reduction and Resource Optimization: Identify factors that minimize costs or optimize resource utilization without sacrificing desired outcomes.
Robustness Testing and Troubleshooting: Assess the robustness of a process by understanding how variations in factors affect the outcome, and identify and resolve issues or challenges within a process.
Model Validation: Validate mathematical or predictive models by comparing their predictions with experimental results.

Identify Factors and their Levels

Factors are the independent variables that may influence the outcome (dependent variables) of your experiment, while levels represent the different values or settings each factor can take during the experiment.

Identifying and prioritizing experimental factors requires a good knowledge of the system. It may be useful to first conduct some pilot studies, or initial exploratory experiments, to gather additional information about the individual impact of each factor.

When identifying factors, consider both controllable and uncontrollable factors. Categorize factors as either independent variables (factors you can control) or nuisance variables (factors that may affect the outcome but are difficult to control). Prioritize factors based on their expected impact on the outcome and their controllability. Focus on factors that are most likely to influence the response variable. Choose a manageable number of factors to keep the study focused and efficient.

Lastly, for each identified factor, determine the different levels or settings that will be tested during the experiment. Levels should cover the practical range of values for each factor. Pilot studies are very helpful in determining the useful “design space” that the levels of each factor define.
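As a small illustration of writing down factors and levels, the sketch below stores them in a dictionary and converts the natural units to the coded -1 to +1 scale commonly used in DOE. The factor names and values are hypothetical, chosen only for the example.

```python
from math import prod

# Hypothetical factors and levels, for illustration only.
factors = {
    "temperature_C": [150, 175, 200],
    "pressure_bar": [1.0, 2.0],
    "catalyst_pct": [0.5, 1.0, 1.5],
}


def to_coded(value, low, high):
    """Map a natural-unit setting to coded units: low -> -1, high -> +1."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return (value - center) / half_range


# Coded levels for every factor, using its lowest and highest level as the range.
coded = {
    name: [to_coded(v, min(levels), max(levels)) for v in levels]
    for name, levels in factors.items()
}

# Number of runs a full factorial over these levels would need (no replication).
n_full_factorial_runs = prod(len(levels) for levels in factors.values())

for name, levels in coded.items():
    print(name, levels)
print("Full factorial runs:", n_full_factorial_runs)
```

Coding puts all factors on the same scale, so effect sizes estimated later can be compared directly regardless of the original units.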

Select a Design

There are several types of DOE designs, each suited to different experimental goals, stages of the study, and resource constraints. Here are some common DOE design types:

Full Factorial Design:
- Examines all possible combinations of factor levels. It provides a complete picture of how each factor and their interactions affect the response variable. Full factorial designs are often too costly to run, since the number of runs grows exponentially with the number of factors: k factors at two levels each require 2^k runs, before any replication.
Fractional Factorial Design:
- Investigates a fraction of the total possible combinations, significantly reducing the number of experimental runs. It is useful when the full factorial design is impractical due to resource constraints. Often used for screening and pilot studies.
Response Surface Methodology (RSM):
- Focuses on modeling and optimizing the relationship between independent variables (factors) and a response variable. It often involves a sequence of factorial or fractional factorial experiments. To appropriately characterize the curvature of the response surface, each factor needs to be tested at three or more levels. Some design subtypes that fall under the RSM umbrella include Central Composite Design (CCD) and Box-Behnken Design.
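The difference in run counts between full and fractional factorial designs can be sketched in a few lines. This is a minimal illustration using only the standard library; the function names are my own, and the half fraction shown uses the common generator C = A*B, which is one possible choice rather than the only one.

```python
from itertools import product


def full_factorial(levels_per_factor):
    """Return every combination of the given factor levels as a list of tuples."""
    return list(product(*levels_per_factor))


def half_fraction_three_factors():
    """2^(3-1) half fraction: a full 2^2 design in A and B, with C set to A*B."""
    return [(a, b, a * b) for a, b in product((-1, +1), repeat=2)]


two_level_full = full_factorial([(-1, +1)] * 3)       # 2^3 = 8 runs
two_level_half = half_fraction_three_factors()        # 2^(3-1) = 4 runs
three_level_full = full_factorial([(-1, 0, +1)] * 3)  # 3^3 = 27 runs

print("2-level full factorial:", len(two_level_full), "runs")
print("2-level half fraction: ", len(two_level_half), "runs")
print("3-level full factorial:", len(three_level_full), "runs")
for run in two_level_half:
    print(run)
```

The half fraction needs half the runs, but the saving has a price: because C is set equal to A*B, the main effect of C cannot be separated from the A-by-B interaction in this design. That trade-off is why fractional designs are mostly used for screening.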

There is a wide variety of more advanced and model-specific designs, which may be worth considering in the later stages of system characterization.

Conduct Experiments and Analyze Data

Before executing the experimental plan, make sure to use an effective randomization scheme and build in appropriate replication to minimize the impact of external variables and ensure the reliability of results.
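One simple way to combine replication and randomization is to repeat every design point and then shuffle the run order. The sketch below assumes a completely randomized design with no blocking; the function name and the default values are illustrative choices, not a standard API.

```python
import random
from itertools import product


def randomized_run_sheet(design, replicates=2, seed=0):
    """Replicate each design point, shuffle the order, and number the runs."""
    runs = [point for point in design for _ in range(replicates)]
    rng = random.Random(seed)  # fixed seed only so the sheet can be regenerated
    rng.shuffle(runs)
    return list(enumerate(runs, start=1))


# A 2x2 factorial in coded units, replicated three times.
design = list(product((-1, +1), repeat=2))
run_sheet = randomized_run_sheet(design, replicates=3, seed=42)

for run_number, settings in run_sheet:
    print(run_number, settings)
```

Running the experiments in this shuffled order helps keep slow drifts (ambient temperature, operator fatigue, tool wear) from lining up with any one factor, while the replicates give an estimate of run-to-run noise.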

Analysis of the data usually includes the following steps:

Review the data using basic descriptive statistics to gain a general understanding of the sources of variability, whether the data follow a normal distribution, and whether linear or non-linear models would be more appropriate. Popular tools include histograms, box plots, Q-Q plots, etc.
Use DOE-specific plots to rank the importance of the factors and to narrow down the most useful set of their levels. Such tools include the main effects mean plots, block plots, normal or half-normal plots of effects, and interaction plots. Sometimes, these visuals may provide clear answers to your experimental questions, allowing you to proceed directly to the final step.
Construct a model tailored to address your objectives. Use a linear model when possible, and simplify it using stepwise regression methods or by dropping terms whose p-values are not significant. Assess the model’s assumptions through residual plots. If the assumptions hold, proceed to examine the ANOVA and further simplify the model if necessary.
If model assumptions are violated, investigate potential causes, such as missing terms or the need for a response transformation.
Finally, use the newfound model to address your experimental objectives, whether that involves identifying significant factors or determining optimal settings.
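As a numeric companion to the effect plots mentioned above, the sketch below estimates every main effect and interaction of a two-level, three-factor full factorial and ranks them by magnitude. The response values are invented for illustration, and the helper `estimate_effects` is my own naming. It covers only the effect-ranking step; it does not test significance or check model assumptions.

```python
from itertools import combinations, product
from math import prod

factor_names = ("A", "B", "C")
design = list(product((-1, +1), repeat=3))  # C changes fastest, A slowest

# Made-up responses, one per run, in the same order as `design`.
responses = [20, 28, 22, 36, 40, 42, 50, 60]


def estimate_effects(design, responses, names):
    """Estimate main and interaction effects of a two-level full factorial.

    Each effect is the mean response where the contrast column is +1 minus
    the mean response where it is -1.
    """
    n_runs = len(design)
    effects = {}
    for order in range(1, len(names) + 1):
        for idx in combinations(range(len(names)), order):
            label = "".join(names[i] for i in idx)
            # The contrast column is the product of the involved factor columns.
            contrast = sum(
                prod(run[i] for i in idx) * y for run, y in zip(design, responses)
            )
            effects[label] = contrast / (n_runs / 2)
    return effects


effects = estimate_effects(design, responses, factor_names)
ranked = sorted(effects.items(), key=lambda item: abs(item[1]), reverse=True)

for label, value in ranked:
    print(f"{label:>3}: {value:+.1f}")
```

With this invented data set, factor A dominates, followed by B and C, while the three-way interaction is close to zero. In a real study the ranking would be read alongside a half-normal plot or an ANOVA before dropping any terms, since an unreplicated design gives no direct estimate of noise.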